This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.
In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.
Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.
Fixes#22852
Fixes RHEL-39513
Signed-off-by: Matt Heon <mheon@redhat.com>
When a users asks for specific devices we should still add them and not
ignore them just because privileged adds all of them.
Most notably if you set --device /dev/null:/dev/test you expect
/dev/test in the container, however as we ignored them this was not the
case. Another side effect is that the input was not validated at at all.
This leads to confusion as descriped in the issue.
Fixes#23132
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The reserved annotation io.podman.annotations.volumes-from is made public to let user define volumes-from to have one container mount volumes of other containers.
The annotation format is: io.podman.annotations.volumes-from/tgtCtr: "srcCtr1:mntOpts1;srcCtr2:mntOpts;..."
Fixes: containers#16819
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
The current field separator comma of the inspect annotation conflicts with the mount options of --volumes-from as the mount options itself can be comma separated.
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
SpecGen is our primary container creation abstraction, and is
used to connect our CLI to the Libpod container creation backend.
Because container creation has a million options (I exaggerate
only slightly), the struct is composed of several other structs,
many of which are quite large.
The core problem is that SpecGen is also an API type - it's used
in remote Podman. There, we have a client and a server, and we
want to respect the server's containers.conf. But how do we tell
what parts of SpecGen were set by the client explicitly, and what
parts were not? If we're not using nullable values, an explicit
empty string and a value never being set are identical - and we
can't tell if it's safe to grab a default from the server's
containers.conf.
Fortunately, we only really need to do this for booleans. An
empty string is sufficient to tell us that a string was unset
(even if the user explicitly gave us an empty string for an
option, filling in a default from the config file is acceptable).
This makes things a lot simpler. My initial attempt at this
changed everything, including strings, and it was far larger and
more painful.
Also, begin the first steps of removing all uses of
containers.conf defaults from client-side. Two are gone entirely,
the rest are marked as remove-when-possible.
[NO NEW TESTS NEEDED] This is just a refactor.
Signed-off-by: Matt Heon <mheon@redhat.com>
Docker allows the passing of -1 to indicate the maximum limit
allowed for the current process.
Fixes: https://github.com/containers/podman/issues/19319
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add --rdt-class=COS to the create and run command to enable the
assignment of a container to a Class of Service (COS). The COS
represents a part of the cache based on the Cache Allocation Technology
(CAT) feature that is part of Intel's Resource Director Technology
(Intel RDT) feature set. By assigning a container to a COS, all PID's of
the container have only access to the cache space defined for this COS.
The COS has to be pre-configured based on the resctrl kernel driver.
cat_l2 and cat_l3 flags in /proc/cpuinfo represent CAT support for cache
level 2 and 3 respectively.
Signed-off-by: Wolfgang Pross <wolfgang.pross@intel.com>
commit cf364703fc changed the way
/sys/fs/cgroup is mounted when there is not a netns and it now honors
the ro flag. The mount was created using a bind mount that is a
problem when using a cgroup namespace, fix that by mounting a fresh
cgroup file system.
Closes: https://github.com/containers/podman/issues/20073
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
These files should never be included on the remote client. There only
there to finalize the spec on the server side.
This makes sure it will not get reimported by accident and bloat the
remote client again.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
commit 8b4a79a744 introduced
oom_score_adj clamping when the container oom_score_adj value is lower
than the current one in a rootless environment. Move the check to
init() time so it is performed every time the container starts and not
only when it is created. It is more robust if the oom_score_adj value
is changed for the current user session.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when running rootless, if the specified oom_score_adj for the
container process is lower than the current value, clamp it to the
current value and print a warning.
Closes: https://github.com/containers/podman/issues/19829
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The intention of --read-only-tmpfs=fals when in --read-only mode was to
not allow any processes inside of the container to write content
anywhere, unless the caller also specified a volume or a tmpfs. Having
/dev and /dev/shm writable breaks this assumption.
Fixes: https://github.com/containers/podman/issues/12937
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
we were silently ignoring --device-cgroup-rule in rootless mode. Make
sure an error is returned if the user tries to use it.
Closes: https://github.com/containers/podman/issues/18698
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when using --userns=auto or --userns=pod, we should bind mount /sys
from the host instead of creating a new /sys in the container,
otherwise we rely on the fallback provided by crun, which might not be
available in other runtimes.
Also, in the last version of crun the fallback is stricter than it
used to be before and it uses a recursive bind mount through the new
mount API. That can be missing on old kernel.
Closes: https://github.com/containers/crun/issues/1131
[NO NEW TESTS NEEDED] to trigger the failure, we need a specific
combination of kernel, libc and OCI runtime.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if /sys is bind mounted from the host then also add an explicit mount
for /sys/fs/cgroup so that 'ro' is honored.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
False is the assumed value, and inspect and podman generate kube are
being cluttered with a ton of annotations that indicate nothing.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Conceptually equivalent to networking by means of slirp4netns(1),
with a few practical differences:
- pasta(1) forks to background once networking is configured in the
namespace and quits on its own once the namespace is deleted:
file descriptor synchronisation and PID tracking are not needed
- port forwarding is configured via command line options at start-up,
instead of an API socket: this is taken care of right away as we're
about to start pasta
- there's no need for further selection of port forwarding modes:
pasta behaves similarly to containers-rootlessport for local binds
(splice() instead of read()/write() pairs, without L2-L4
translation), and keeps the original source address for non-local
connections like slirp4netns does
- IPv6 is not an experimental feature, and enabled by default. IPv6
port forwarding is supported
- by default, addresses and routes are copied from the host, that is,
container users will see the same IP address and routes as if they
were in the init namespace context. The interface name is also
sourced from the host upstream interface with the first default
route in the routing table. This is also configurable as documented
- sandboxing and seccomp(2) policies cannot be disabled
- only rootless mode is supported.
See https://passt.top for more details about pasta.
Also add a link to the maintained build of pasta(1) manual as valid
in the man page cross-reference checks: that's where the man page
for the latest build actually is -- it's not on Github and it doesn't
match any existing pattern, so add it explicitly.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Almost all of SpecGenToOCI deals with linux-specific aspects of the
runtime spec. Rather than try to factor this out piecemeal, I think it
is cleaner to move the whole function along with its implementation
helper functions. This also meams we don't need non-linux stubs for
functions called from oci_linux.go
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>