Commit Graph

103 Commits

Author SHA1 Message Date
Paul Holzinger def182d396
restore: fix missing network setup
The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
suposed to be called after the container is created but before it is
started. There is no such thing for restore, the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before which makes it not working with PostConfigureNetNS (userns) but
as this does not work anyway today so  I don't see it as problem.

Fixes #22901

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-24 18:52:02 +02:00
openshift-merge-bot[bot] bf2de4177b
Merge pull request #23064 from giuseppe/podman-pass-timeout-stop-to-systemd
container: pass StopTimeout to the systemd slice
2024-06-23 14:57:55 +00:00
Giuseppe Scrivano 49eb5af301
libpod: intermediate mount if UID not mapped into the userns
if the current user is not mapped into the new user namespace, use an
intermediate mount to allow the mount point to be accessible instead
of opening up all the parent directories for the mountpoint.

Closes: https://github.com/containers/podman/issues/23028

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano 08a8429459
libpod: avoid chowning the rundir to root in the userns
so it is possible to remove the code to make the entire directory
world accessible.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano c81f075f43
libpod: do not chmod bind mounts
with the new mount API is available, the OCI runtime doesn't require
that each parent directory for a bind mount must be accessible.
Instead it is opened in the initial user namespace and passed down to
the container init process.

This requires that the kernel supports the new mount API and that the
OCI runtime uses it.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano 7d22f04f56
container: pass KillSignal and StopTimeout to the systemd scope
so that they are honored when systemd terminates the scope.

Closes: https://issues.redhat.com/browse/RHEL-16375

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 13:46:08 +02:00
Matthew Heon 046c0e5fc2 Only stop chowning volumes once they're not empty
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount into a
dozen containers, but never add content, the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.

For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.

In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.

Fixes #22571

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2024-05-22 17:47:01 -04:00
Paul Holzinger fb2ab832a7
fix incorrect host.containers.internal entry for rootless bridge mode
We have to exclude the ips in the rootless netns as they are not the
host. Now that fix only works if there are more than one ip one the
host available, if there is only one we do not set the entry at all
which I consider better as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.

The test is bit more compilcated as I would like, however it must deal
with both cases one ip, more than one so there is no way around it I
think.

Fixes #22653

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-17 12:28:44 +02:00
Matt Heon 693ae0ebc6 Add support for image volume subpaths
Image volumes (the `--mount type=image,...` kind, not the
`podman volume create --driver image ...` kind - it's strange
that we have two) are needed for our automount scheme, but the
request is that we mount only specific subpaths from the image
into the container. To do that, we need image volume subpath
support. Not that difficult code-wise, mostly just plumbing.

Also, add support to the CLI; not strictly necessary, but it
doesn't hurt anything and will make testing easier.

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-04-25 14:12:27 -04:00
Paul Holzinger 83dbbc3a51
Replace golang.org/x/exp/slices with slices from std
Use "slices" from the standard library, this package was added in go
1.21 so we can use it now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-04-23 11:16:40 +02:00
Giuseppe Scrivano 5656ad40b1
libpod: use fileutils.(Le|E)xists
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 09:52:14 +02:00
Evan Lezar a40cf3195a Bump tags.cncf.io/container-device-interface to v0.7.1
This includes migrating from cdi.GetRegistry() to cdi.Configure() and
cdi.GetDefaultCache() as applicable.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2024-04-06 12:25:26 +02:00
Giuseppe Scrivano 519a66c6a9
container: do not chown to dest target with U
if the 'U' option is provided, do not chown the destination target to
the existing target in the image.

Closes: https://github.com/containers/podman/issues/22224

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-03 14:41:33 +02:00
Giuseppe Scrivano d81319eb71
libpod: use original IDs if idmap is provided
if the volume is mounted with "idmap", there should not be any mapping
using the user namespace mappings since this is done at runtime using
the "idmap" kernel feature.

Closes: https://github.com/containers/podman/issues/22228

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-03-31 23:46:17 +02:00
Paul Holzinger dc1795b4b2
use new c/common pasta2 setup logic to fix dns
By default we just ignored any localhost reolvers, this is problematic
for anyone with more complicated dns setups, i.e. split dns with
systemd-reolved. To address this we now make use of the build in dns
proxy in pasta. As such we need to set the default nameserver ip now.

A second change is the option to exclude certain ips when generating the
host.containers.internal ip. With that we no longer set it to the same
ip as is used in the netns. The fix is not perfect as it could mean on a
system with a single ip we no longer add the entry, however given the
previous entry was incorrect anyway this seems like the better behavior.

Fixes #22044

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-03-19 12:09:31 +01:00
Romain Geissler f59a5f1351
Fix running container from docker client with rootful in rootless podman.
This effectively fix errors like "unable to upgrade to tcp, received
409" like #19930 in the special case where podman itself is running
rootful but inside a container which itself is rootless.

[NO NEW TESTS NEEDED]

Signed-off-by: Romain Geissler <romain.geissler@amadeus.com>
2024-02-18 15:09:14 +00:00
Giuseppe Scrivano c29fde2656
libpod: correctly map UID/GID for existing dirs
if the target mount path already exists and the container uses a user
namespace, correctly map the target UID/GID to the host values before
attempting a chown.

Closes: https://github.com/containers/podman/issues/21608

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-02-12 23:04:24 +01:00
Matt Heon 72f1617fac Bump Go module to v5
Moving from Go module v4 to v5 prepares us for public releases.

Move done using gomove [1] as with the v3 and v4 moves.

[1] https://github.com/KSubedi/gomove

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-08 09:35:39 -05:00
Daniel J Walsh fcae702205
Reuse timezone code from containers/common
Replaces: https://github.com/containers/podman/pull/21077

[NO NEW TESTS NEEDED] Existing tests should handle this.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-02-06 07:09:16 -05:00
Matt Heon 9fb57d346f Cease using deprecated runc userlookup
Instead switch to github.com/moby/sys/user, which we already had
as an indirect dependency.

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-02-02 11:02:43 -05:00
openshift-merge-bot[bot] 83f89db6c8
Merge pull request #20961 from karuboniru/patch-1
fix checking of relative idmapped mount
2024-01-11 17:20:56 +00:00
Egor Makrushin 380fa1c836 Remove redundant code in generateSpec()
Conditional expression duplicates the
code above, therefore, remove it

Found by Linux Verification Center (linuxtesting.org) with SVACE.

[NO NEW TESTS NEEDED]

Signed-off-by: Egor Makrushin <emakrushin@astralinux.ru>
2024-01-09 17:26:03 +03:00
Oleksandr Redko 8bdf77aa20 Refactor: replace StringInSlice with slices.Contains
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-05 16:25:56 +02:00
Oleksandr Redko 2a2d0b0e18 chore: delete obsolete // +build lines
Signed-off-by: Oleksandr Redko <Oleksandr_Redko@epam.com>
2024-01-04 11:53:38 +02:00
Gavin Lam db68764d8b
Fix Docker API compatibility with network alias (#17167)
* Add BaseHostsFile to container configuration
* Do not copy /etc/hosts file from host when creating a container using Docker API

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
2023-12-14 23:31:44 -05:00
Karuboniru e7eb97b84a
fix checking of relative idmapped mount
Like stated in [PR for crun](https://github.com/containers/crun/pull/1372)

that HostID is what being mapped here, so we should be checking `HostID` instead of `ContainerID`. `v.ContainerID` here is the id of owner of files on filesystem, that can be totally unrelated to the uid maps.

Signed-off-by: Karuboniru <yanqiyu01@gmail.com>
2023-12-09 20:16:38 +00:00
Daniel J Walsh c8f262fec9
Use idtools.SafeChown and SafeLchown everywhere
If we get an error chowning a file or directory to a UID/GID pair
for something like ENOSUP or EPERM, then we should ignore as long as the UID/GID
pair on disk is correct.

Fixes: https://github.com/containers/podman/issues/20801

[NO NEW TESTS NEEDED]

Since this is difficult to test and existing tests should be sufficient
to ensure no regression.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-27 20:41:56 -05:00
Daniel J Walsh ddd6cdfd77
Ignore SELinux relabel on unsupported file systems
We were ignoreing relabel requests on certain unsupported
file systems and not on others, this changes to consistently
logrus.Debug ENOTSUP file systems.

Fixes: https://github.com/containers/podman/discussions/20745

Still needs some work on the Buildah side.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-22 09:25:38 -05:00
Valentin Rothberg e40d70cecc new 'no-dereference' mount option
Add a new `no-dereference` mount option supported by crun 1.11+ to
re-create/copy a symlink if it's the source of a mount.  By default the
kernel will resolve the symlink on the host and mount the target.
As reported in #20098, there are use cases where the symlink structure
must be preserved by all means.

Fixes: #20098
Fixes: issues.redhat.com/browse/RUN-1935
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-11-21 13:17:58 +01:00
openshift-merge-bot[bot] 7d107b9892
Merge pull request #19879 from rhatdan/ulimits
Support passing of Ulimits as -1 to mean max
2023-11-10 10:47:43 +00:00
renovate[bot] 942bcf34b8 Update container-device-interface (CDI) to v0.6.2
This updates the container-device-interface dependency to v0.6.2 and renames the import to
tags.cncf.io/container-device-interface to make use of the new vanity URL.

[NO NEW TESTS NEEDED]

Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-11-04 01:12:06 +01:00
Daniel J Walsh 18d6bb40d5
Support passing of Ulimits as -1 to mean max
Docker allows the passing of -1 to indicate the maximum limit
allowed for the current process.

Fixes: https://github.com/containers/podman/issues/19319

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-11-01 08:46:55 -04:00
openshift-ci[bot] 4f6a8f0d50
Merge pull request #20483 from vrothberg/RUN-1934
container.conf: support attributed string slices
2023-10-27 17:49:13 +00:00
Valentin Rothberg e966c86d98 container.conf: support attributed string slices
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-10-27 12:44:33 +02:00
Matthew Heon be7dd128ef Mask /sys/devices/virtual/powercap
I don't really like this solution because it can't be undone by
`--security-opt unmask=all` but I don't see another way to make
this retroactive. We can potentially change things up to do this
the right way with 5.0 (actually have it in the list of masked
paths, as opposed to adding at spec finalization as now).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-10-26 18:24:25 -04:00
Paul Holzinger bad25da92e
libpod: add !remote tag
This should never be pulled into the remote client.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-24 12:11:34 +02:00
Paul Holzinger 29273cda10
lint: fix warnings found by perfsprint
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-10-20 16:27:46 +02:00
Radostin Stoyanov 9b17d6cb06
vendor: update checkpointctl to v1.1.0
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2023-09-12 08:41:02 +01:00
Giuseppe Scrivano 19bd9b33dd
libpod: move oom_score_adj clamp to init
commit 8b4a79a744 introduced
oom_score_adj clamping when the container oom_score_adj value is lower
than the current one in a rootless environment.  Move the check to
init() time so it is performed every time the container starts and not
only when it is created.  It is more robust if the oom_score_adj value
is changed for the current user session.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-09-11 17:04:37 +02:00
Giuseppe Scrivano 702709a916
libpod: do not parse --hostuser in base 8
fix the parsing of --hostuser to treat the input in base 10.

Closes: https://github.com/containers/podman/issues/19800

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-08-31 12:34:58 +02:00
Daniel J Walsh 6f284dbd46
podman exec should set umask to match container
Fixes: https://github.com/containers/podman/issues/19713

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-08-24 13:20:06 -04:00
Doug Rabson 27b41f0877 libpod: use /var/run instead of /run on FreeBSD
This changes /run to /var/run for .containerenv and secrets in FreeBSD
containers for consistency with FreeBSD path conventions. Running Linux
containers on FreeBSD hosts continue to use /run for compatibility.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-08-17 14:04:53 +01:00
Daniel J Walsh 22a8b68866
make /dev & /dev/shm read/only when --read-only --read-only-tmpfs=false
The intention of --read-only-tmpfs=fals when in --read-only mode was to
not allow any processes inside of the container to write content
anywhere, unless the caller also specified a volume or a tmpfs. Having
/dev and /dev/shm writable breaks this assumption.

Fixes: https://github.com/containers/podman/issues/12937

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-30 06:09:30 -04:00
Daniel J Walsh a224ff7313
Should be checking tmpfs versus type not source
Paul found this in https://github.com/containers/podman/pull/19241

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-15 07:57:37 -04:00
Daniel J Walsh f256f4f954
Use constants for mount types
Inspired by https://github.com/containers/podman/pull/19238

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-14 07:17:21 -04:00
Paul Holzinger 2a9b9bb53f
netavark: macvlan networks keep custom nameservers
The change to use the custom dns server in aardvark-dns caused a
regression here because macvlan networks never returned the nameservers
in netavark and it also does not make sense to do so.

Instead check here if we got any network nameservers, if not we then use
the ones from the config if set otherwise fallback to host servers.

Fixes #19169

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-12 14:07:34 +02:00
Simon Brakhane dee94ea699 bugfix: do not try to parse empty ranges
An empty range caused a panic as parseOptionIDs tried to check further
down for an @ at index 0 without taking into account that the splitted
out string could be empty.

Signed-off-by: Simon Brakhane <simon@brakhane.net>
2023-07-06 11:16:34 +02:00
David Gibson 39624473b0 pasta: Create /etc/hosts entries for pods using pasta networking
For pods with bridged and slirp4netns networking we create /etc/hosts
entries to make it more convenient for the containers to address each
other.  We omitted to do this for pasta networking, however.  Add the
necessary code to do this.

Closes: https://github.com/containers/podman/issues/17922

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-30 13:04:02 +10:00
Paul Holzinger 614c962c23
use libnetwork/slirp4netns from c/common
Most of the code moved there so if from there and remove it here.

Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.

[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-22 11:16:13 +02:00
Paul Holzinger 810c97bd85
libpod: write /etc/{hosts,resolv.conf} once
My PR[1] to remove PostConfigureNetNS is blocked on other things I want
to split this change out. It reduces the complexity when generating
/etc/hosts and /etc/resolv.conf as now we always write this file after
we setup the network. this means we can get the actual ip from the netns
which is important.

[NO NEW TESTS NEEDED] This is just a rework.

[1] https://github.com/containers/podman/pull/18468

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-21 15:33:42 +02:00