This solves a race condition where a mountpoint is created without the
home mount being present.
The cause is that another process could be calling the graph driver
cleanup as part of store.Shutdown() causing the unmount of the
driver home directory.
The unmount could happen between the time the rlstore is retrieved and
the actual mount, causing the driver mount to be done without a home
mount below it.
A third process would then re-create the home mount, shadowing
the previous mount.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757845
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Storage options are really driver specific, and when distributions set
defaults, they should not affect users who change the default driver.
By moving the storage options to be driver specific, we can make sure all
drivers only document and support their options.
With this patch we will continue to support the global mountopt but the driver
specific version will override the global mountopt.
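A minimal sketch of the precedence rule described above. The options type and
its field names are hypothetical and only illustrate "driver specific wins,
global is the fallback"; they are not the library's own types.

```go
package main

import "fmt"

// options is a hypothetical stand-in for parsed storage options; the field
// names are illustrative and not taken from the library.
type options struct {
	GlobalMountOpt string            // the global mountopt setting
	DriverMountOpt map[string]string // per-driver mountopt, keyed by driver name
}

// effectiveMountOpt returns the driver-specific mountopt when one is set and
// falls back to the global value otherwise.
func effectiveMountOpt(o options, driver string) string {
	if v, ok := o.DriverMountOpt[driver]; ok && v != "" {
		return v
	}
	return o.GlobalMountOpt
}

func main() {
	o := options{
		GlobalMountOpt: "nodev",
		DriverMountOpt: map[string]string{"overlay": "nodev,metacopy=on"},
	}
	fmt.Println(effectiveMountOpt(o, "overlay")) // driver-specific value
	fmt.Println(effectiveMountOpt(o, "vfs"))     // global fallback
}
```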
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
It was an attempt to use OSTree to deduplicate files; at the time we
already had a dependency on OSTree for system containers in
containers/image. Since the feature never really took off, let's just
drop it.
Closes: https://github.com/containers/storage/issues/419
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When reporting an error from store.ImageBigData(), distinguish between
cases where we can't find the specified image, and where we found the
image, but it didn't have a matching requested item.
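The distinction can be sketched with two separate errors. This is a toy
in-memory model, not the store's real implementation; the type, the method
name imageBigData, and both error variables are assumptions made for the
example.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors, one per failure case.
var (
	errImageUnknown = errors.New("image not known")
	errNoSuchItem   = errors.New("image has no item with that name")
)

// bigDataStore is a toy in-memory store: image ID -> item name -> contents.
type bigDataStore map[string]map[string][]byte

// imageBigData returns a different error depending on whether the image
// itself or only the requested item is missing.
func (s bigDataStore) imageBigData(id, key string) ([]byte, error) {
	items, ok := s[id]
	if !ok {
		return nil, errImageUnknown
	}
	data, ok := items[key]
	if !ok {
		return nil, errNoSuchItem
	}
	return data, nil
}

func main() {
	s := bigDataStore{"img1": {"manifest": []byte("{}")}}
	_, err := s.imageBigData("img2", "manifest")
	fmt.Println(errors.Is(err, errImageUnknown)) // image is missing
	_, err = s.imageBigData("img1", "signature")
	fmt.Println(errors.Is(err, errNoSuchItem)) // image found, item missing
}
```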
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Images don't have to have layers, so they don't have to have top layers,
and we shouldn't return an error when attempting to determine the size
of such an image.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Wrap the ID or the digest in ErrImageUnknown errors to avoid ambiguity
about which image is unknown. Consumers of the storage library may have
multiple subsequent calls to the storage API where it can be unclear
which image is unknown. Wrapping the ID and digest attempts to avoid
this ambiguity.
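A sketch of the idea using the standard library's %w wrapping; this is an
assumption made for illustration, and the patch itself may use a different
wrapping helper. errImageUnknown is a local stand-in for ErrImageUnknown and
lookup is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// errImageUnknown is a local stand-in for the library's ErrImageUnknown.
var errImageUnknown = errors.New("image not known")

// lookup is a hypothetical helper that fails for every ID except "known".
// The ID is wrapped into the error so callers can tell which lookup failed.
func lookup(id string) error {
	if id != "known" {
		return fmt.Errorf("locating image with ID %q: %w", id, errImageUnknown)
	}
	return nil
}

func main() {
	for _, id := range []string{"known", "abc123", "def456"} {
		if err := lookup(id); errors.Is(err, errImageUnknown) {
			fmt.Println(err) // the message names the image that was not found
		}
	}
}
```

Callers can still match the underlying error with errors.Is() while the
message identifies the image.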
Related-to: github.com/containers/libpod/issues/2979
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Modified version of Kevin Pelzel's patch.
Also changed ApplyDiff to take a new ApplyDiffOpts struct.
Signed-off-by: Kevin Pelzel <kevinpelzel22@gmail.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Use RLock() to lock stores that we know are read-only, and panic in
Lock() if we know that we're not a read-write lock.
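Roughly, the behaviour looks like the following in-process sketch. The
lockfile type here is hypothetical and is backed by a sync.RWMutex rather
than a real lock file; only the "panic on write-lock of a read-only lock"
rule is taken from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// lockfile is a hypothetical, in-process stand-in for a store lock.
type lockfile struct {
	mu sync.RWMutex
	ro bool // true when the lock guards a read-only store
}

// Lock takes the write lock; calling it on a read-only lock is a
// programming error, so it panics instead of returning.
func (l *lockfile) Lock() {
	if l.ro {
		panic("can't take write lock on read-only lock file")
	}
	l.mu.Lock()
}

func (l *lockfile) Unlock()  { l.mu.Unlock() }
func (l *lockfile) RLock()   { l.mu.RLock() }
func (l *lockfile) RUnlock() { l.mu.RUnlock() }

// lockPanics reports whether taking the write lock on l panics.
func lockPanics(l *lockfile) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	l.Lock()
	l.Unlock()
	return false
}

func main() {
	ro := &lockfile{ro: true}
	ro.RLock() // read-only stores are only ever read-locked
	ro.RUnlock()
	fmt.Println(lockPanics(ro))          // write-locking a read-only lock panics
	fmt.Println(lockPanics(&lockfile{})) // a read-write lock behaves normally
}
```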
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Pass the library-level RunRoot in as part of the Config struct that we
pass to lower-level driver initialization functions.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We have a bug in podman that uses the defaultGraphDriver options
for returning the MountOptions rather than the driver overrides
from the user.
This PR adds a new interface, GetMountOptions, which parses the caller's
graph driver options and returns the mount options.
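A sketch of that kind of parsing. The "<driver>.mountopt=<opts>" option
format and the helper name mountOptionsFor are assumptions made for
illustration; this is not the signature or behaviour of the real
GetMountOptions.

```go
package main

import (
	"fmt"
	"strings"
)

// mountOptionsFor scans the caller's graph driver options for an entry of
// the assumed form "<driver>.mountopt=<opt>,<opt>,..." and returns the
// individual mount options. If the entry appears more than once, the last
// one wins. It returns nil when no such entry exists.
func mountOptionsFor(driver string, graphDriverOptions []string) []string {
	prefix := driver + ".mountopt="
	var result []string
	for _, opt := range graphDriverOptions {
		if strings.HasPrefix(opt, prefix) {
			result = strings.Split(strings.TrimPrefix(opt, prefix), ",")
		}
	}
	return result
}

func main() {
	opts := []string{
		"overlay.mount_program=/usr/bin/fuse-overlayfs",
		"overlay.mountopt=nodev,metacopy=on",
	}
	fmt.Println(mountOptionsFor("overlay", opts))
	fmt.Println(mountOptionsFor("vfs", opts))
}
```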
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When creating a container, don't worry about whether or not the base
image's top layer has the right ID mappings in cases where the base
image doesn't have a top layer.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
These interfaces can be used to set up a graphdriver mountpoint
of the source directory for use within a container.
The RemoveTemp interface unmounts the mountpoint and then removes
all of the modified data in the graphdriver for this source directory.
The primary use case of these interfaces is for container engines that
want to mount a directory from the host system into the container. The
source directory can then be modified without actually changing the
directory on the host.
Containers will use these interfaces for sharing packaging cache directories
like /var/cache/dnf, to help speed up container builds.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When you build an image with a parent layer in read-only stores
and the new image in read/write stores, the first time you try
to create a container based on the image, it fails, since it
cannot find the image in the same store.
This patch looks not only in the same store, but in all of the available
stores.
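In outline, the lookup becomes a loop over every store instead of a check
against a single one. The layerStore type and the findLayer helper below are
hypothetical and only model the search order.

```go
package main

import "fmt"

// layerStore is a hypothetical, simplified store: a name plus the set of
// layer IDs it holds.
type layerStore struct {
	name   string
	layers map[string]bool
}

// findLayer searches every store in order (read/write first, then the
// read-only ones) and reports which store holds the layer.
func findLayer(id string, stores []layerStore) (string, bool) {
	for _, s := range stores {
		if s.layers[id] {
			return s.name, true
		}
	}
	return "", false
}

func main() {
	stores := []layerStore{
		{name: "read-write", layers: map[string]bool{"child": true}},
		{name: "read-only", layers: map[string]bool{"parent": true}},
	}
	fmt.Println(findLayer("parent", stores)) // found in the read-only store
	fmt.Println(findLayer("missing", stores))
}
```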
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a method to generate a lock file for a specific digest. Such a
digest-specific lock file is needed to synchronize threads and processes
when copying blobs from a registry to the containers-storage.
Whenever a layer is about to get copied, the lock must be acquired which
indicates to other processes and threads that the layer/blob is already
being copied.
To avoid leaking file descriptors for long-living users of
containers/storage, such as CRI-O, open and close the file on demand
during Lock() and Unlock(). The internal reference counters allow us to
determine whether we are the first or last user.
Note: as deleting the lock files is subject to race conditions, we place
the lock files in a graph-specific directory in the runroot. Since the
runroot is a tmpfs, the files will be cleaned up on reboot.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
In an effort to remove cross vendoring, we are trying to stop buildah from
importing from libpod. I believe these libraries make more sense in
containers/storage than in libpod.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Drop our dependency on the image library's manifest package by requiring
that callers pass its Digest() function to us as a callback. This makes
our CLI test/diagnostic tool calculate digests of s1 manifests
incorrectly, but that's not something that we were testing.
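The callback arrangement can be sketched like this. The store type, the
digestFunc type and the method names are hypothetical. plainSHA256 plays the
role of a caller that does not use the image library: it hashes the bytes as
they are, which is why schema1 manifests would get the wrong digest from it.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestFunc is a hypothetical callback type: the caller decides how a
// manifest is digested, so this package needs no manifest-specific import.
type digestFunc func(manifest []byte) (string, error)

// store is a toy stand-in that records the digest of each named item.
type store struct {
	digestManifest digestFunc
	digests        map[string]string
}

// setBigData computes the item's digest through the injected callback.
func (s *store) setBigData(name string, data []byte) error {
	d, err := s.digestManifest(data)
	if err != nil {
		return fmt.Errorf("digesting %q: %w", name, err)
	}
	s.digests[name] = d
	return nil
}

// plainSHA256 hashes the raw bytes without any manifest preprocessing.
func plainSHA256(b []byte) (string, error) {
	sum := sha256.Sum256(b)
	return "sha256:" + hex.EncodeToString(sum[:]), nil
}

func main() {
	s := &store{digestManifest: plainSHA256, digests: map[string]string{}}
	if err := s.setBigData("manifest", []byte(`{"schemaVersion":2}`)); err != nil {
		panic(err)
	}
	fmt.Println(s.digests["manifest"])
}
```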
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Enable parallel `GetBlob()` executions in containers/image by
using reader-lock acquisitions in `ImageBigData()` and `Diff()`.
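Why reader locks allow this can be shown with a plain sync.RWMutex, used here
as an in-process stand-in for the store locks. The concurrentReaders helper is
hypothetical. Every goroutine waits until all of them hold the read lock,
which would deadlock if the lock were exclusive.

```go
package main

import (
	"fmt"
	"sync"
)

// concurrentReaders starts n goroutines that all hold the read lock at the
// same time and returns how many of them got in.
func concurrentReaders(n int) int {
	var (
		mu      sync.RWMutex   // stand-in for the store lock
		barrier sync.WaitGroup // released once every reader holds the lock
		done    sync.WaitGroup
		countMu sync.Mutex
		count   int
	)
	barrier.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			mu.RLock()
			defer mu.RUnlock()
			countMu.Lock()
			count++
			countMu.Unlock()
			barrier.Done()
			barrier.Wait() // wait until all n readers hold the lock
		}()
	}
	done.Wait()
	return count
}

func main() {
	fmt.Println(concurrentReaders(4))
}
```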
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
Deferring method calls on loop variables must be avoided, as the deferred
calls can all end up being invoked on the last item of the loop.
The intermediate fix used in this commit is to declare a new variable
for each loop iteration. An example transformation is:
FROM:
    for _, x := range x_slice {
        x.Lock()
        defer x.Unlock()
    }
TO:
    for _, x_itr := range x_slice {
        x := x_itr
        x.Lock()
        defer x.Unlock()
    }
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
When we removed all traces of override_kernel_check, we created a
situation where older configuration files would suddenly start causing
us to emit an error at startup. Soften that to a warning, for now at
least.
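A sketch of the softened check. The set of known option names and the
checkOptions helper are hypothetical; only override_kernel_check comes from
the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// deprecated lists option keys that are still accepted but ignored.
var deprecated = map[string]bool{"override_kernel_check": true}

// known is a hypothetical subset of currently supported option keys.
var known = map[string]bool{"mountopt": true, "size": true}

// checkOptions returns a warning for each deprecated key and an error for
// the first key it does not recognize at all.
func checkOptions(opts []string) (warnings []string, err error) {
	for _, opt := range opts {
		key := strings.ToLower(strings.SplitN(opt, "=", 2)[0])
		switch {
		case known[key]:
			// supported option, nothing to report
		case deprecated[key]:
			warnings = append(warnings, fmt.Sprintf("ignoring deprecated option %q", key))
		default:
			return warnings, fmt.Errorf("unknown option %q", key)
		}
	}
	return warnings, nil
}

func main() {
	warnings, err := checkOptions([]string{"mountopt=nodev", "override_kernel_check=true"})
	fmt.Println(warnings, err)
}
```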
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Change how we compute digests for BigData items with names that start
with "manifest" so that we use the image library's manifest.Digest()
function, which knows how to preprocess schema1 manifests to get the
right value, instead of just trying to finesse it.
Track the digests of multiple manifest-named items for images.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We need to access the storage structs in the machine-config
operator code for container runtime configuration, but
with it being in store.go, it is pulling in way too many
dependencies. Moving it out to a separate package cuts down
the dependencies by a huge amount.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
When removing an image, remove the image's mapped top layers before the
image's "main" top layer, in case the graph driver is hiding a
dependency between the mapped layers and the "real" one (as it's allowed
to do).
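The removal order can be sketched as below. The image type is a simplified,
hypothetical model with only the two fields that matter here, and
layersToRemove is an illustrative helper.

```go
package main

import "fmt"

// image is a simplified, hypothetical model of an image record.
type image struct {
	TopLayer        string   // the "main" top layer, may be empty
	MappedTopLayers []string // ID-mapped copies of the top layer
}

// layersToRemove lists the image's layers in deletion order: mapped top
// layers first, then the main top layer, in case the driver has made the
// mapped layers depend on it.
func layersToRemove(img image) []string {
	order := append([]string{}, img.MappedTopLayers...)
	if img.TopLayer != "" {
		order = append(order, img.TopLayer)
	}
	return order
}

func main() {
	img := image{TopLayer: "top", MappedTopLayers: []string{"mapped-1", "mapped-2"}}
	fmt.Println(layersToRemove(img))
}
```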
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add a CreateFromTemplate() method to graph drivers, and use it instead
of a driver-oblivious diff/put method when we want to create a copy of
an image's top layer that has the same parent and which differs from the
original only in its ID maps.
This lets drivers that can quickly make an independent layer based on
another layer do something smarter than we were doing with the
driver-oblivious method. For some drivers, a native method is
dramatically faster.
Note that the driver needs to be able to do this while still exposing
just one notional layer (i.e., one link in the chain of layers for a
given container) to the higher levels of the APIs, so if the new layer
is actually a child of the template layer, that needs to remain a detail
that's private to the driver.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The logic that depended on override_kernel_check was changed to test for
the feature at runtime, so we don't need to be suggesting to people that
they need to set this option, or that the option is even a thing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We need to expose this file path in podman info
to make it easier for users to discover where
the configuration file is.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If we fail to create an ID-mapped copy of an image's layer, report the
ID of the layer that we were attempting to create an ID-mapped copy of,
instead of attempting to log the ID of its parent, which might not
exist.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
We have seen situations in buildah where a container is being
built and the user hits ^C, and then ends up in a situation where
the container cannot be deleted because its layer does not exist.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
I found that other projects tend to parse multiple maps at once. So, we may
want to allow the base library to do so in order to decrease complexity in the
upper layers.
This is a follow-up to the previous refactoring in 7b209d36fd; I didn't get
it right on the first try, sorry.
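Parsing several maps in one call could look like the sketch below. The
"container:host:size" triple format, the idMap type and the parseIDMaps
helper are assumptions made for illustration, not the library's actual
signature.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// idMap is a hypothetical representation of one ID mapping range.
type idMap struct {
	ContainerID, HostID, Size int
}

// parseIDMaps parses every "container:host:size" triple in specs and stops
// at the first malformed entry.
func parseIDMaps(specs []string) ([]idMap, error) {
	maps := make([]idMap, 0, len(specs))
	for _, spec := range specs {
		fields := strings.Split(spec, ":")
		if len(fields) != 3 {
			return nil, fmt.Errorf("invalid mapping %q: want container:host:size", spec)
		}
		var nums [3]int
		for i, f := range fields {
			n, err := strconv.Atoi(f)
			if err != nil {
				return nil, fmt.Errorf("invalid mapping %q: %w", spec, err)
			}
			nums[i] = n
		}
		maps = append(maps, idMap{ContainerID: nums[0], HostID: nums[1], Size: nums[2]})
	}
	return maps, nil
}

func main() {
	maps, err := parseIDMaps([]string{"0:1000:1", "1:100000:65536"})
	fmt.Println(maps, err)
}
```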
Signed-off-by: Šimon Lukašík <isimluk@fedoraproject.org>
We want to allow tools like podman/buildah to override default storage
container mount options on a container by container basis.
For example if the default mount options for containers/storage include
nodev or nosuid, we want to allow podman to turn these off if the user
specifies --privileged.
We might also want to turn off certain user namespace flags that cause
buildah and podman build to run more slowly when creating container images.
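A sketch of the --privileged case. Both helpers are hypothetical and only
illustrate a per-container override of the default mount options.

```go
package main

import "fmt"

// withoutOptions returns defaults minus every option named in drop.
func withoutOptions(defaults []string, drop ...string) []string {
	skip := make(map[string]bool, len(drop))
	for _, d := range drop {
		skip[d] = true
	}
	var out []string
	for _, o := range defaults {
		if !skip[o] {
			out = append(out, o)
		}
	}
	return out
}

// containerMountOptions is a hypothetical per-container decision: a
// privileged container gets the defaults without nodev and nosuid.
func containerMountOptions(defaults []string, privileged bool) []string {
	if privileged {
		return withoutOptions(defaults, "nodev", "nosuid")
	}
	return defaults
}

func main() {
	defaults := []string{"nodev", "nosuid", "noexec"}
	fmt.Println(containerMountOptions(defaults, false))
	fmt.Println(containerMountOptions(defaults, true))
}
```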
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
where it belongs.
I have noticed that this parsing gets spread across projects. Basically, the
very same method is present in the libpod, buildah, and cri-o projects. We had
better start reusing this code from a single place, or soon everyone will have
their own version.
Signed-off-by: Šimon Lukašík <slukasik@redhat.com>
This patch adds a MountOpts field to the drivers so we can simplify
the interface to Get and allow additional options to be passed in the future.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
While generating a Diff, hold the lock on the layer store until after
we've completely finished building the diff.
There's an internal Mount/Unmount being done so that we can read the
layer's contents, and we don't update the mount counts properly if we're
not still holding the lock when the layer store's Unmount() method is
called, which doesn't happen until the ReadCloser that Diff() returns
gets closed.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
In CreateContainer(), don't use ROLayerStores() to get a list of the
read-only layer stores after we've acquired the lock on the writeable
layer store. ROLayerStores() acquires the graph lock, which we should
never try to acquire while we're holding the layer store lock.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
I believe we should be running container images mounted with nodev by default.
This would eliminate the risk of a device sneaking into the container without
being on the approved list. This would give us the same or potentially additional
security over the device cgroup.
It would be nice if this could be passed in on an image by image basis, so users
could also specify whether they want nosuid images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>