When creating a container, don't worry about whether or not the base
image's top layer has the right ID mappings in cases where the base
image doesn't have a top layer.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
These interfaces can be used to set up a graphdriver mountpoint
of the source directory for use within a container.
The RemoveTemp interface umounts the mountpoint and then removes
all of the modified data in the graphdriver for this source directory.
The primary use case of these interfaces is for container engines that
want to mount a directory from the host system into the container. The
source directory can then be modified without actually changing the
directory on the host.
Containers will use these interfaces for sharing packaging cache directories
like /var/cache/dnf, to help speed up container builds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When you build an image with a parent layer in read-only stores
and the new image in read/write stores, the first time you try
to create a container based on the image, it fails, since it
cannot find the image in the same store.
This patch looks not only in the same store, but all of the stores
available.
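
A minimal sketch of the lookup described above. The layerStore type, its
in-memory map, and findLayer are hypothetical stand-ins for illustration,
not the names used in the actual patch.

```go
package main

import (
	"errors"
	"fmt"
)

// layerStore is a hypothetical stand-in for a read/write or read-only store.
type layerStore struct {
	name   string
	layers map[string]bool
}

var errLayerUnknown = errors.New("layer not known")

// findLayer checks the primary store first and then every additional
// store, instead of giving up after the first one.
func findLayer(id string, primary *layerStore, additional []*layerStore) (*layerStore, error) {
	for _, s := range append([]*layerStore{primary}, additional...) {
		if s.layers[id] {
			return s, nil
		}
	}
	return nil, errLayerUnknown
}

func main() {
	rw := &layerStore{name: "read-write", layers: map[string]bool{"child": true}}
	ro := &layerStore{name: "read-only", layers: map[string]bool{"parent": true}}

	s, err := findLayer("parent", rw, []*layerStore{ro})
	if err != nil {
		panic(err)
	}
	fmt.Println("found parent layer in store:", s.name)
}
```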
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a method to generate a lock file for a specific digest. Such a
digest-specific lock file is needed to synchronize threads and processes
when copying blobs from a registry to the containers-storage.
Whenever a layer is about to get copied, the lock must be acquired which
indicates to other processes and threads that the layer/blob is already
being copied.
To avoid leaking file descriptors for long-living users of
containers/storage, such as CRI-O, open and close the file on demand
during Lock() and Unlock(). The internal reference counter allows us to
determine if we are the first or last user.
Note: as deleting the lock files is subject to race conditions, we place
the lock files in a graph-specific directory in the runroot. Since the
runroot is a tmpfs, the files will be cleaned up during reboot.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
In an effort to remove cross vendoring, stop buildah from importing
from libpod. I believe these libraries make more sense in containers/storage
than in libpod.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Drop our dependency on the image library's manifest package by requiring
that callers pass its Digest() function to us as a callback. This makes
our CLI test/diagnostic tool calculate digests of s1 manifests
incorrectly, but that's not something that we were testing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Enable executing parallel `GetBlob()` executions in containers/image by
using reader-lock acquisitions in `ImageBigData()` and `Diff()`.
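
A rough sketch of the idea, using an in-process sync.RWMutex as a stand-in
for the store's own locks. The store type and its bigData/setBigData
methods are hypothetical and only illustrate why read-only paths can share
a reader lock.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in holding big-data items by key.
type store struct {
	mu   sync.RWMutex
	data map[string][]byte
}

// bigData only reads, so it takes the shared (reader) lock and can run
// concurrently with other readers.
func (s *store) bigData(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	b, ok := s.data[key]
	return b, ok
}

// setBigData modifies state, so it takes the exclusive (writer) lock.
func (s *store) setBigData(key string, b []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = b
}

func main() {
	s := &store{data: map[string][]byte{}}
	s.setBigData("manifest", []byte("{}"))

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.bigData("manifest") // readers do not block each other
		}()
	}
	wg.Wait()
	fmt.Println("all readers finished")
}
```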
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Deferring method calls on loop variables must be avoided by all means as
the calls will be invoked on the last item of the loop.
The intermediate fix used in this commit is to allocate a new variable
on the heap for each loop iteration. An example transformation is:
FROM:
  for _, x := range x_slice {
    x.Lock()
    defer x.Unlock()
  }

TO:
  for _, x_itr := range x_slice {
    x := x_itr
    x.Lock()
    defer x.Unlock()
  }
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When we removed all traces of override_kernel_check, we created a
situation where older configuration files would suddenly start causing
us to emit an error at startup. Soften that to a warning, for now at
least.
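
One way to picture the softened behaviour is the sketch below. The
parseOptions helper and the knownOptions set are hypothetical; only the
override_kernel_check key comes from this change.

```go
package main

import (
	"fmt"
	"os"
)

// knownOptions is an illustrative set; the real driver accepts other keys.
var knownOptions = map[string]bool{"mountopt": true, "size": true}

// retiredOptions lists keys that used to be valid and are now ignored.
var retiredOptions = map[string]bool{"override_kernel_check": true}

// parseOptions warns about retired keys instead of failing, and still
// rejects keys it has never heard of.
func parseOptions(opts map[string]string, warn func(msg string)) (map[string]string, error) {
	parsed := make(map[string]string)
	for key, val := range opts {
		switch {
		case retiredOptions[key]:
			warn(fmt.Sprintf("ignoring retired option %q", key))
		case knownOptions[key]:
			parsed[key] = val
		default:
			return nil, fmt.Errorf("unknown option %q", key)
		}
	}
	return parsed, nil
}

func main() {
	warn := func(msg string) { fmt.Fprintln(os.Stderr, "warning:", msg) }
	parsed, err := parseOptions(map[string]string{
		"override_kernel_check": "true",
		"mountopt":              "nodev",
	}, warn)
	if err != nil {
		panic(err)
	}
	fmt.Println(parsed)
}
```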
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Change how we compute digests for BigData items with names that start
with "manifest" so that we use the image library's manifest.Digest()
function, which knows how to preprocess schema1 manifests to get the
right value, instead of just trying to finesse it.
Track the digests of multiple manifest-named items for images.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Need to access the storage structs in the machine-config
operator code for container runtime configuration but
with it being in store.go, it is pulling in way too many
dependencies. Moving it out to a separate package cuts down
the dependencies by a huge amount.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
When removing an image, remove the image's mapped top layers before the
image's "main" top layer, in case the graph driver is hiding a
dependency between the mapped layers and the "real" one (as it's allowed
to do).
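
The ordering can be sketched as follows. The image struct and the
layersToRemove helper are hypothetical and only illustrate the sequence in
which layers are handed to the deletion code.

```go
package main

import "fmt"

// image is a hypothetical, trimmed-down record of an image's layers.
type image struct {
	topLayer        string
	mappedTopLayers []string
}

// layersToRemove returns the image's layers in deletion order: the
// ID-mapped copies first, the "main" top layer last.
func layersToRemove(img image) []string {
	order := make([]string, 0, len(img.mappedTopLayers)+1)
	order = append(order, img.mappedTopLayers...)
	if img.topLayer != "" {
		order = append(order, img.topLayer)
	}
	return order
}

func main() {
	img := image{topLayer: "main", mappedTopLayers: []string{"mapped-a", "mapped-b"}}
	for _, id := range layersToRemove(img) {
		fmt.Println("removing layer", id)
	}
}
```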
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add a CreateFromTemplate() method to graph drivers, and use it instead
of a driver-oblivious diff/put method when we want to create a copy of
an image's top layer that has the same parent and which differs from the
original only in its ID maps.
This lets drivers that can quickly make an independent layer based on
another layer do something smarter than we were doing with the
driver-oblivious method. For some drivers, a native method is
dramatically faster.
Note that the driver needs to be able to do this while still exposing
just one notional layer (i.e., one link in the chain of layers for a
given container) to the higher levels of the APIs, so if the new layer
is actually a child of the template layer, that needs to remain a detail
that's private to the driver.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The logic that depended on override_kernel_check was changed to test for
the feature at runtime, so we don't need to be suggesting to people that
they need to set this option, or that the option is even a thing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We need to expose this file path in podman info
to make it easier for users to discover where
the configuration file is.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If we fail to create an ID-mapped copy of an image's layer, report the
ID of the layer that we were attempting to create an ID-mapped copy of,
instead of attempting to log the ID of its parent, which might not
exist.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We have seen situations in buildah where a container is being
built and the user hits ^C, leaving them unable to delete the
container because its layer does not exist.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
I found that other projects tend to parse multiple maps at once. So, we may
want to allow the base library to do so in order to decrease complexity in the
upper layers.
This is a follow-up to the previous refactoring in 7b209d36fd; I didn't get
it right on the first try, sorry.
Signed-off-by: Šimon Lukašík <isimluk@fedoraproject.org>
We want to allow tools like podman/buildah to override default storage
container mount options on a container by container basis.
For example if the default mount options for containers/storage include
nodev or nosuid, we want to allow podman to turn these off if the user
specifies --privileged.
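
A small sketch of how a caller might build such an override. Both helpers
(effectiveMountOptions and without) are hypothetical and are not part of
the actual API; they assume that a per-container list, when given,
replaces the store-wide defaults.

```go
package main

import "fmt"

// effectiveMountOptions returns the per-container options when the
// caller supplied any, and the store-wide defaults otherwise.
func effectiveMountOptions(defaults, override []string) []string {
	if override != nil {
		return append([]string(nil), override...)
	}
	return append([]string(nil), defaults...)
}

// without returns opts minus the named flags; a caller handling
// --privileged could use it to build its override list.
func without(opts []string, drop ...string) []string {
	skip := make(map[string]bool, len(drop))
	for _, d := range drop {
		skip[d] = true
	}
	out := make([]string, 0, len(opts))
	for _, o := range opts {
		if !skip[o] {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	defaults := []string{"nodev", "nosuid", "noatime"}
	privileged := without(defaults, "nodev", "nosuid")
	fmt.Println("default container:   ", effectiveMountOptions(defaults, nil))
	fmt.Println("privileged container:", effectiveMountOptions(defaults, privileged))
}
```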
We also might want to turn off certain user namespace flags that will cause
buildah and podman build to work slower when creating container images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
where it belongs.
I have noticed that this parsing is spread across projects. Basically, the
very same method is present in the libpod, buildah, and cri-o projects. We had
better start reusing this code from a single place, or soon every project will
have its own version.
Signed-off-by: Šimon Lukašík <slukasik@redhat.com>
This patch adds a MountOpts field to the drivers so we can simplify
the interface to Get and allow additional options to be passed in the future.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
While generating a Diff, hold the lock on the layer store until after
we've completely finished building the diff.
There's an internal Mount/Unmount being done so that we can read the
layer's contents, and we don't update the mount counts properly if we're
not still holding the lock when the layer store's Unmount() method is
called, which doesn't happen until the ReadCloser that Diff() returns
gets closed.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
In CreateContainer(), don't use ROLayerStores() to get a list of the
read-only layer stores after we've acquired the lock on the writeable
layer store. ROLayerStores() acquires the graph lock, which we should
never try to acquire while we're holding the layer store lock.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
I believe we should be running container images mounted with nodev by default.
This would eliminate the risk of a device sneaking into the container without
being on the approved list. This would give us the same or potentially additional
security over the device cgroup.
It would be nice if this could be passed in on an image by image basis. So users
could also specify if they want nosuid images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
podman unmount wants to know if the image is only mounted 1 time
and refuse to unmount if the container state expects it to be mounted.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add force to umount to force the umount of a container image
Add an interface to indicate whether or not the layer is mounted
Add a boolean return from unmount to indicate when the layer is really unmounted
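
A minimal sketch of the mount-count bookkeeping these three changes imply.
The layer type and its methods are hypothetical, and the meaning of the
boolean follows the description above rather than any specific released
API.

```go
package main

import "fmt"

// layer is a hypothetical record tracking how many times it is mounted.
type layer struct {
	mountCount int
}

// mounted reports whether the layer is currently mounted at all.
func (l *layer) mounted() bool {
	return l.mountCount > 0
}

// unmount drops one reference, or all of them when force is set, and
// reports whether the layer is really unmounted afterwards.
func (l *layer) unmount(force bool) bool {
	if force {
		l.mountCount = 0
	} else if l.mountCount > 0 {
		l.mountCount--
	}
	return l.mountCount == 0
}

func main() {
	l := &layer{mountCount: 2}
	fmt.Println("really unmounted:", l.unmount(false), "still mounted:", l.mounted())
	fmt.Println("really unmounted:", l.unmount(true), "still mounted:", l.mounted())
}
```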
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Move the configuration file parsing to a new function
ReloadConfigurationFile so that it is possible to load a different
configuration file and override the current settings. This is useful
with rootless containers so a configuration file specific to the user
can be loaded.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
It is needed to use an OSTree repository (either directly or as a parent
repository) that is not under the storage home directory.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Add a ContainerSize() method, which knows how to compute the sizes of
containers, so that our callers don't need to all be updated when we make
changes to how we store them.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add an ImageSize() method, which knows how to compute the sizes of
images, so that our callers don't need to all be updated when we make
changes to how we store them.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Allow images to have multiple top layers which should only differ by
which UID/GID mappings are used in them, to make creating multiple
containers which use the same mappings faster.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We need to support thinpool options from containers/storage.
All of these options are currently available in containers/storage and
are used via Docker. I want to make them available in the configuration
file so that they can be used with CRI-O or other tools.
# Storage Options for thinpool
# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = 20
# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = 80
# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = 10G
# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize=64k
# directlvm_device specifies a custom block storage device to use for the
# thin pool.
# directlvm_device = ""
# directlvm_device_force wipes device even if device already has a filesystem
# directlvm_device_force = false
# fs specifies the filesystem type to use for the base device.
# fs="xfs"
# log_level sets the log level of devicemapper.
# LogLevelSuppress 0 - Default
# LogLevelFatal 2
# LogLevelErr 3
# LogLevelWarn 4
# LogLevelNotice 5
# LogLevelInfo 6
# LogLevelDebug 7
# log_level = 7
# min_free_space specifies the minimum free space percentage in a thin pool
# required for new device creation to succeed. Valid values are from 0% - 99%.
# A value of 0% disables the check.
# min_free_space=10%
# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg=""
# mountopt specifies extra mount options used when mounting the thin devices.
# mountopt=""
# use_deferred_removal marks the device for deferred removal.
# use_deferred_removal = false
# use_deferred_deletion marks the device for deferred deletion.
# use_deferred_deletion = false
# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries=0
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a note about needing to call reexec.Init() to the doc for the
GetStore() function. Also correct incorrect usage elsewhere.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add store methods for finding the list of UIDs and GIDs which probably
need to be mapped if a given layer, or a container's layer, is going to
be used for a container that is run with the configured ID mappings in a
separate user namespace. The layer has to have been mounted at least
once in order for us to know where it goes.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Tweak the order of arguments to LayerStore.Create()/CreateWithFlags()/Put()
so that the moreOptions struct is directly after the options map.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>