The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
suposed to be called after the container is created but before it is
started. There is no such thing for restore, the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before which makes it not working with PostConfigureNetNS (userns) but
as this does not work anyway today so I don't see it as problem.
Fixes#22901
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
if the current user is not mapped into the new user namespace, use an
intermediate mount to allow the mount point to be accessible instead
of opening up all the parent directories for the mountpoint.
Closes: https://github.com/containers/podman/issues/23028
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
with the new mount API is available, the OCI runtime doesn't require
that each parent directory for a bind mount must be accessible.
Instead it is opened in the initial user namespace and passed down to
the container init process.
This requires that the kernel supports the new mount API and that the
OCI runtime uses it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount into a
dozen containers, but never add content, the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.
For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.
In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.
Fixes#22571
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We have to exclude the ips in the rootless netns as they are not the
host. Now that fix only works if there are more than one ip one the
host available, if there is only one we do not set the entry at all
which I consider better as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.
The test is bit more compilcated as I would like, however it must deal
with both cases one ip, more than one so there is no way around it I
think.
Fixes#22653
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Image volumes (the `--mount type=image,...` kind, not the
`podman volume create --driver image ...` kind - it's strange
that we have two) are needed for our automount scheme, but the
request is that we mount only specific subpaths from the image
into the container. To do that, we need image volume subpath
support. Not that difficult code-wise, mostly just plumbing.
Also, add support to the CLI; not strictly necessary, but it
doesn't hurt anything and will make testing easier.
Signed-off-by: Matt Heon <mheon@redhat.com>
This includes migrating from cdi.GetRegistry() to cdi.Configure() and
cdi.GetDefaultCache() as applicable.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
if the 'U' option is provided, do not chown the destination target to
the existing target in the image.
Closes: https://github.com/containers/podman/issues/22224
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if the volume is mounted with "idmap", there should not be any mapping
using the user namespace mappings since this is done at runtime using
the "idmap" kernel feature.
Closes: https://github.com/containers/podman/issues/22228
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
By default we just ignored any localhost reolvers, this is problematic
for anyone with more complicated dns setups, i.e. split dns with
systemd-reolved. To address this we now make use of the build in dns
proxy in pasta. As such we need to set the default nameserver ip now.
A second change is the option to exclude certain ips when generating the
host.containers.internal ip. With that we no longer set it to the same
ip as is used in the netns. The fix is not perfect as it could mean on a
system with a single ip we no longer add the entry, however given the
previous entry was incorrect anyway this seems like the better behavior.
Fixes#22044
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This effectively fix errors like "unable to upgrade to tcp, received
409" like #19930 in the special case where podman itself is running
rootful but inside a container which itself is rootless.
[NO NEW TESTS NEEDED]
Signed-off-by: Romain Geissler <romain.geissler@amadeus.com>
if the target mount path already exists and the container uses a user
namespace, correctly map the target UID/GID to the host values before
attempting a chown.
Closes: https://github.com/containers/podman/issues/21608
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
Conditional expression duplicates the
code above, therefore, remove it
Found by Linux Verification Center (linuxtesting.org) with SVACE.
[NO NEW TESTS NEEDED]
Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
* Add BaseHostsFile to container configuration
* Do not copy /etc/hosts file from host when creating a container using Docker API
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
Like stated in [PR for crun](https://github.com/containers/crun/pull/1372)
that HostID is what being mapped here, so we should be checking `HostID` instead of `ContainerID`. `v.ContainerID` here is the id of owner of files on filesystem, that can be totally unrelated to the uid maps.
Signed-off-by: Karuboniru <yanqiyu01@gmail.com>
If we get an error chowning a file or directory to a UID/GID pair
for something like ENOSUP or EPERM, then we should ignore as long as the UID/GID
pair on disk is correct.
Fixes: https://github.com/containers/podman/issues/20801
[NO NEW TESTS NEEDED]
Since this is difficult to test and existing tests should be sufficient
to ensure no regression.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
We were ignoreing relabel requests on certain unsupported
file systems and not on others, this changes to consistently
logrus.Debug ENOTSUP file systems.
Fixes: https://github.com/containers/podman/discussions/20745
Still needs some work on the Buildah side.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Add a new `no-dereference` mount option supported by crun 1.11+ to
re-create/copy a symlink if it's the source of a mount. By default the
kernel will resolve the symlink on the host and mount the target.
As reported in #20098, there are use cases where the symlink structure
must be preserved by all means.
Fixes: #20098
Fixes: issues.redhat.com/browse/RUN-1935
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
This updates the container-device-interface dependency to v0.6.2 and renames the import to
tags.cncf.io/container-device-interface to make use of the new vanity URL.
[NO NEW TESTS NEEDED]
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Docker allows the passing of -1 to indicate the maximum limit
allowed for the current process.
Fixes: https://github.com/containers/podman/issues/19319
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding at spec finalization as now).
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
commit 8b4a79a744 introduced
oom_score_adj clamping when the container oom_score_adj value is lower
than the current one in a rootless environment. Move the check to
init() time so it is performed every time the container starts and not
only when it is created. It is more robust if the oom_score_adj value
is changed for the current user session.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Running Linux
containers on FreeBSD hosts continue to use /run for compatibility.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
The intention of --read-only-tmpfs=fals when in --read-only mode was to
not allow any processes inside of the container to write content
anywhere, unless the caller also specified a volume or a tmpfs. Having
/dev and /dev/shm writable breaks this assumption.
Fixes: https://github.com/containers/podman/issues/12937
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
The change to use the custom dns server in aardvark-dns caused a
regression here because macvlan networks never returned the nameservers
in netavark and it also does not make sense to do so.
Instead check here if we got any network nameservers, if not we then use
the ones from the config if set otherwise fallback to host servers.
Fixes#19169
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
An empty range caused a panic as parseOptionIDs tried to check further
down for an @ at index 0 without taking into account that the splitted
out string could be empty.
Signed-off-by: Simon Brakhane <simon@brakhane.net>
For pods with bridged and slirp4netns networking we create /etc/hosts
entries to make it more convenient for the containers to address each
other. We omitted to do this for pasta networking, however. Add the
necessary code to do this.
Closes: https://github.com/containers/podman/issues/17922
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Most of the code moved there so if from there and remove it here.
Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.
[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
My PR[1] to remove PostConfigureNetNS is blocked on other things I want
to split this change out. It reduces the complexity when generating
/etc/hosts and /etc/resolv.conf as now we always write this file after
we setup the network. this means we can get the actual ip from the netns
which is important.
[NO NEW TESTS NEEDED] This is just a rework.
[1] https://github.com/containers/podman/pull/18468
Signed-off-by: Paul Holzinger <pholzing@redhat.com>