add a new flag that allows overriding the pull options configured in
the storage.conf file.
e.g. --pull-option="enable_partial_images=false" can be passed to
Podman to disable partial pulls even if they are enabled in
storage.conf.
Leave it as a hidden configuration flag for now since the API itself
is marked as experimental in c/storage.
Currently c/storage doesn't honor the overrides; this is being fixed in
https://github.com/containers/storage/pull/1966
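To illustrate the intended precedence, a minimal sketch, assuming pull
options are modeled as a flat key/value map (names here are
illustrative, not the actual c/storage types):

    // CLI --pull-option values override the storage.conf defaults.
    func mergePullOptions(fromConf, fromCLI map[string]string) map[string]string {
        merged := make(map[string]string, len(fromConf)+len(fromCLI))
        for k, v := range fromConf {
            merged[k] = v
        }
        for k, v := range fromCLI { // flags given on the command line win
            merged[k] = v
        }
        return merged
    }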
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
If a container was stopped and we try to start it before cleanup was
called, we tried to reuse the network, which caused a panic as the
pasta code cannot deal with that. Reuse is also never correct, as the
netns must be created by the runtime when custom user namespaces are
used. As such, the proper thing is to clean up the netns first.
Also change an e2e test to report better errors. It is not directly
related to this change, but it failed on v1 of this patch and we
noticed the ugly error message it produced. Thanks to Ed for the fix.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This avoids dereferencing c.config.Spec.Linux if it is nil, which is the
case on FreeBSD.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
This fixes a regression added in commit 4fd84190b8: because the unit
name was overwritten by the createTimer() call, removeTransientFiles()
removed the new timer instead of the startup healthcheck timer. Then,
when the container was stopped, we leaked the timer because the wrong
unit name was stored in the state.
A new test has been added to ensure the logic works and we never leak
the systemd timers.
Fixes #22884
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Add a `podman system check` that performs consistency checks on local
storage, optionally removing damaged items so that they can be
recreated.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Debian's hostname(5) man page states: "The file should contain a
single newline-terminated hostname string."
[NO NEW TESTS NEEDED]
Fix #22729
Signed-off-by: Bo Wang <wangbob@uniontech.com>
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount it into a
dozen containers but never add content, and the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, even though the volume is no longer empty.
This PR changes our logic to (mostly) match Docker's.
For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.
In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.
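Roughly, the resulting decision looks like this (a sketch only; the
helper is hypothetical, and CopiedUp is the new field described above):

    // Chown the volume for this container only if it is empty, or if its
    // only contents came from a copy-up done for this same container.
    func needsChown(volumeIsEmpty bool, copiedUpBy, ctrID string) bool {
        if volumeIsEmpty {
            return true // empty volumes are always safe to chown
        }
        return copiedUpBy == ctrID // contents exist, but we put them there
    }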
Fixes #22571
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Now WaitForExit returns the exit code as stored in the db instead of
returning an error when the container was removed.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
wait for another interval when the container transitions to "stopped"
to give the healthcheck status more time to change.
Closes: https://github.com/containers/podman/issues/22760
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We have to exclude the IPs in the rootless netns as they are not the
host. The fix only works if more than one IP is available on the host;
if there is only one, we do not set the entry at all, which I consider
better, as failing to resolve this name is a much better error for
users than connecting to a wrong IP. It also matches what
--network pasta already does.
The test is a bit more complicated than I would like; however, it must
deal with both cases (one IP, more than one), so there is no way
around it I think.
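The selection logic, sketched (illustrative only; the real code deals
with more than plain net.IP slices):

    import "net"

    // Return the first host IP that is not bound inside the rootless
    // netns; if none qualifies, set no host.containers.internal entry.
    func hostEntryIP(hostIPs, netnsIPs []net.IP) (net.IP, bool) {
        for _, ip := range hostIPs {
            inNetns := false
            for _, nip := range netnsIPs {
                if ip.Equal(nip) {
                    inNetns = true
                    break
                }
            }
            if !inNetns {
                return ip, true
            }
        }
        return nil, false // only netns IPs available: add no entry
    }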
Fixes #22653
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
wait for the healthy status on the thread where the container lock is
held. Otherwise, if it is performed from a goroutine, a different
thread is used (since the runtime.LockOSThread() call doesn't have any
effect), causing pthread_mutex_unlock() to fail with EPERM.
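In other words, the fix keeps the wait on the locking thread (a
fragment; lockCtr/waitForHealthy stand in for the real calls):

    // runtime.LockOSThread() pins only the calling goroutine to its OS
    // thread; work started with `go fn()` may run on any other thread,
    // and a mutex locked here cannot be unlocked from there.
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()
    lockCtr(c)        // pthread-style lock, owned by this OS thread
    waitForHealthy(c) // must run here, not in a new goroutine
    unlockCtr(c)      // same thread, so the unlock is permitted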
Closes: https://github.com/containers/podman/issues/22651
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The v5 API made a breaking change for podman inspect, which means that
an old client can no longer parse the result from a new 5.X server.
The other way around, new client and old server, already worked.
As it turned out, several users ran into this; one way to hit it is
using an old 4.X podman machine which now pulls a newer CoreOS with
podman 5.0. But there are also other users running into it.
In order to keep the API working, we now have a version check and
return the old v4-compatible payload, so the old remote client can
still work against a newer server, thus removing any major breaking
change for an old client.
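Schematically, the handler now gates on the client's negotiated API
version (a sketch; the handler and payload builders are hypothetical,
assuming a blang/semver-style comparison):

    func inspectHandler(w http.ResponseWriter, r *http.Request) {
        if clientAPIVersion(r).LT(semver.MustParse("5.0.0")) {
            writeJSON(w, buildV4CompatPayload(ctr)) // old shape for old clients
            return
        }
        writeJSON(w, buildV5Payload(ctr))
    }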
Fixes #22657
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This reverts commit 909ab59419.
The workaround was added almost 5 years ago to work around an issue
with old conmon releases. It is safe to assume such ancient conmon
releases are no longer in use.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The scenario for inducing this is as follows:
1. Start a container with a long stop timeout and a PID1 that
ignores SIGTERM
2. Use `podman stop` to stop that container
3. Simultaneously, in another terminal, kill -9 `pidof podman`
(the container is now in ContainerStateStopping)
4. Now kill that container's Conmon with SIGKILL.
5. No commands are able to move the container from Stopping to
Stopped now.
The cause is a bug in our exit-file handling logic: Conmon
being dead without an exit file caused no change to the state.
Add handling for this case that tries to clean up, including
stopping the container if it still seems to be running.
Fixes #19629
Signed-off-by: Matt Heon <mheon@redhat.com>
Systemd dislikes it when we rapidly create and remove a transient
unit. Solution: If we change the name every time, it's different
enough that systemd is satisfied and we stop having errors trying
to restart the healthcheck.
Generate a random 32-bit integer, and add it (formatted as hex)
to the end of the unit name to do this. As a result, we now have
to store the unit name in the database, but it does make
backwards compat easy - if the unit name in the DB is empty, we
revert to the old behavior because the timer was created by old
Podman.
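Sketch of the name generation (function name is illustrative):

    import (
        "fmt"
        "math/rand"
    )

    // Append a random 32-bit hex suffix so each transient unit name is
    // unique; the generated name is stored in the DB for later removal.
    func transientUnitName(base string) string {
        return fmt.Sprintf("%s-%x", base, rand.Uint32())
    }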
Should resolve RHEL-26105
Signed-off-by: Matt Heon <mheon@redhat.com>
Effectively, this is an ability to take an image already pulled
to the system, and automatically mount it into one or more
containers defined in Kubernetes YAML accepted by `podman play`.
Requirements:
- The image must already exist in storage.
- The image must have at least 1 volume directive.
- The path given by the volume directive will be mounted from the
image into the container. For example, an image with a volume
at `/test/test_dir` will have `/test/test_dir` in the image
mounted to `/test/test_dir` in the container.
- Multiple images can be specified. If multiple images have a
volume at a specific path, the last image specified trumps.
- The images are always mounted read-only.
- Images to mount are defined in the annotation
"io.podman.annotations.kube.image.automount/$ctrname" as a
semicolon-separated list. They are mounted into a single
container in the pod, not the whole pod.
As we're using a nonstandard annotation, this is Podman-only; any
Kubernetes install will just ignore it.
Underneath, this compiles down to an image volume
(`podman run --mount type=image,...`) with subpaths to specify
what bits we want to mount into the container.
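Reading the annotation for a given container boils down to (sketch;
helper name is illustrative):

    import "strings"

    const automountPrefix = "io.podman.annotations.kube.image.automount/"

    // Images to automount into the named container, or nil if the
    // annotation is absent.
    func automountImages(annotations map[string]string, ctrName string) []string {
        val, ok := annotations[automountPrefix+ctrName]
        if !ok || val == "" {
            return nil
        }
        return strings.Split(val, ";") // semicolon-separated image list
    }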
Signed-off-by: Matt Heon <mheon@redhat.com>
Image volumes (the `--mount type=image,...` kind, not the
`podman volume create --driver image ...` kind - it's strange
that we have two) are needed for our automount scheme, but the
request is that we mount only specific subpaths from the image
into the container. To do that, we need image volume subpath
support. Not that difficult code-wise, mostly just plumbing.
Also, add support to the CLI; not strictly necessary, but it
doesn't hurt anything and will make testing easier.
Signed-off-by: Matt Heon <mheon@redhat.com>
Checking whether the file exists before opening it anyway is pointless:
it costs an extra syscall and is in theory racy, as the file might have
changed between the two calls. We can simply ignore the ENOENT error
from the ReadFile call.
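That is, the pattern becomes (sketch, using the stdlib os and errors
packages):

    // One ReadFile call instead of stat-then-read; a missing file is
    // expected here and not an error.
    data, err := os.ReadFile(path)
    if err != nil {
        if errors.Is(err, os.ErrNotExist) {
            return nil // nothing to do
        }
        return err
    }
    // ...use data...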
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When the field is set to false we should never log healthcheck events.
Fixes https://issues.redhat.com/browse/RHEL-18987
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We already know the status of the healthcheck in the caller, so calling
healthCheckStatus() just makes the event code sync the container state
and reread the healthcheck file for no reason.
It is much better to directly pass the status down to the event call.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
In cases where storage configuration fails, the error is returned as
is and may be missing useful context. Make sure we know the error
happened as part of the storage setup.
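For example (sketch; setupStorage stands in for the real call):

    if err := setupStorage(ctx); err != nil {
        return fmt.Errorf("failed to configure storage: %w", err)
    }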
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This is something Docker does, and we did not do until now. Most
difficult/annoying part was the REST API, where I did not really
want to modify the struct being sent, so I made the new restart
policy parameters query parameters instead.
Testing was also a bit annoying, because testing restart policy
always is.
Signed-off-by: Matt Heon <mheon@redhat.com>
The logic here is more complex than I would like, largely due to
the behavior of `podman inspect` for running containers. When a
container is running, `podman inspect` will source as much as
possible from the OCI spec used to run that container, to grab
up-to-date information on things like devices. We don't want to
change this, it's definitely the right behavior, but it does make
updating a running container inconvenient: we have to rewrite the
OCI spec as part of the update to make sure that `podman inspect`
will read the correct resource limits.
Also, make update emit events. Docker does it, we should as well.
Signed-off-by: Matt Heon <mheon@redhat.com>
This includes migrating from cdi.GetRegistry() to cdi.Configure() and
cdi.GetDefaultCache() as applicable.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Podman needs to be able to detect when a system reboot occurs to
do certain types of cleanup operation (for example, reset
container states, clean up IPAM allocations, etc.). Our current
method for this is a sentinel file on a tmpfs filesystem. The
problem emerges that there is no directory that is guaranteed to
be a tmpfs and is also guaranteed to be accessible to rootless
users in the FHS. If the user has a systemd user session, we can
depend on /run/user/$UID, but we can't reliably say that they do.
This code will detect the no-tmpfs-but-reboot-occurred case by
writing the current system boot ID to our tmpfs sentinel file
when it is created, and checking that file every time Podman
starts to make sure that the current boot ID matches the cached
one in the sentinel file. If they don't match, a reboot occurred
and the sentinel file was not on a tmpfs and thus survived. In
that case, throw an error telling the user to remove certain
directories (the ones that are supposed to be tmpfs), so we can
proceed as expected.
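A sketch of the check; /proc/sys/kernel/random/boot_id is the kernel's
per-boot UUID, while the helper name and mode bits are illustrative:

    import (
        "bytes"
        "errors"
        "os"
    )

    // Returns true if the sentinel file survived a reboot, meaning its
    // directory is not a tmpfs.
    func rebootSurvived(sentinel string) (bool, error) {
        current, err := os.ReadFile("/proc/sys/kernel/random/boot_id")
        if err != nil {
            return false, err
        }
        cached, err := os.ReadFile(sentinel)
        if errors.Is(err, os.ErrNotExist) {
            // First run since boot: cache the boot ID for next time.
            return false, os.WriteFile(sentinel, current, 0o600)
        }
        if err != nil {
            return false, err
        }
        return !bytes.Equal(bytes.TrimSpace(cached), bytes.TrimSpace(current)), nil
    }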
Signed-off-by: Matt Heon <mheon@redhat.com>
if the 'U' option is provided, do not chown the destination target to
match the ownership of the existing target in the image.
Closes: https://github.com/containers/podman/issues/22224
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
if the volume is mounted with "idmap", there should not be any mapping
using the user namespace mappings since this is done at runtime using
the "idmap" kernel feature.
Closes: https://github.com/containers/podman/issues/22228
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Useful to tell whether containers are being made with pasta or
slirp4netns by default. Info is bloated enough already that I
don't really have concerns about shoving more into it.
Fixes #22172
Signed-off-by: Matt Heon <mheon@redhat.com>
This factors out the check for cgroupsv2 unified mode into a
platform-specific file and stops podman from generating a (harmless)
warning every time it is run on FreeBSD.
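This follows the usual Go pattern of one file per platform selected by
build constraints (hypothetical file and function names):

    // unified_freebsd.go
    //go:build freebsd

    package utils

    // FreeBSD has no cgroups, so report non-unified without warning.
    func isCgroupsV2Unified() (bool, error) {
        return false, nil
    }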
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
I believe the previous code meant to use cmd.Run instead of cmd.Start.
The issue is that cmd.Start returns before the command has finished
executing, so the conditional body checking the stderr of the
command never gets executed.
Raise cmd.Start into its own conditional, which checks whether
the process could be started. Then we consume stderr, check for
some specific strings in the output, and finally continue with
the rest of the code.
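The intended shape (sketch; the matched string is a placeholder):

    var stderr bytes.Buffer
    cmd := exec.Command(binary, args...)
    cmd.Stderr = &stderr

    // Run = Start + Wait: stderr is only complete after Wait returns.
    if err := cmd.Run(); err != nil {
        if strings.Contains(stderr.String(), "known failure text") {
            // handle the specific failure mode
        }
        return fmt.Errorf("%s: %w (stderr: %s)", binary, err, stderr.String())
    }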
Signed-off-by: Keith Johnson <kj@ubergeek42.com>
Fix the following issues:
- create container API handler ignores Annotations from HostConfig
- inspect container API handler does not provide Annotations as
part of HostConfig
Signed-off-by: diplane <diplane3d@gmail.com>
Always tear down the network; trying to reuse the netns has caused
a significant number of bugs in this code. It also never worked
for containers with user namespaces. So, once and for all, simplify
this by never reusing the netns. Originally this was done to make
restarting containers faster, but with netavark we are now much
faster, so it shouldn't be noticeable in practice. It also makes more
sense to reconfigure the netns, as it is likely that the container
exited due to some broken network state, in which case reusing it
would just cause more harm than good.
The main motivation for this change was the pasta change to use
--dns-forward by default: the restarted container had no idea what
nameserver to use, as pasta just kept running.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
By default we just ignored any localhost resolvers, which is
problematic for anyone with a more complicated DNS setup, e.g. split
DNS with systemd-resolved. To address this we now make use of the
built-in DNS proxy in pasta. As such, we need to set the default
nameserver IP now.
A second change is the option to exclude certain IPs when generating
the host.containers.internal entry. With that we no longer set it to
the same IP as is used in the netns. The fix is not perfect, as it
could mean that on a system with a single IP we no longer add the
entry; however, given the previous entry was incorrect anyway, this
seems like the better behavior.
Fixes #22044
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The annotations should be maintained by CRI-O itself to decouple the
projects from a dependency perspective.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Commit 03f6589f3 added basic support for the pull-error event from
libimage, but it had several problems:
1. storing the error as an error type prevented it from being
   unmarshalled, so it is changed to a string
2. the error was never propagated from the libimage event to the podman
   event struct
3. the error message was not wired into the CLI and API
This commit fixes these problems.
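For point 1, the gist (sketch, not the exact struct): a Go error
interface value typically marshals to an empty JSON object and can
never be unmarshalled back, while a plain string round-trips cleanly:

    type Event struct {
        // ...other fields...
        Error string `json:"error,omitempty"` // was: Error error
    }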
Fixes #21458
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
when performing a system reset with containers running something that
a soft kill won't stop (like sleep), containers will wait 10 seconds
before terminating with a SIGKILL. But for a forceful action like
system reset, we should outright set no timeout, so containers stop
quickly and are not waiting on a timeout
Fixes #21874
Signed-off-by: Brent Baude <bbaude@redhat.com>
This vendors the latest c/common version, including making Pasta
the default rootless network provider. That broke a number of
tests, which have been fixed as part of this PR.
Also includes a change to network stats logic, which simplifies
the code a bit and makes it actually work with Pasta.
Signed-off-by: Matt Heon <mheon@redhat.com>
The ID field in the Event struct is duplicated for no reason: since the
Details struct is directly embedded in the Event, the ID field is
basically duplicated at the same level multiple times. Removing this
one should be safe and makes no change to the resulting JSON.
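The shape of the problem (sketch):

    type Details struct {
        ID string `json:"id"`
        // ...
    }

    type Event struct {
        // ID string   <- removed: Details is embedded below, so its ID
        //              already serializes under the same JSON key.
        Details
    }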
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This effectively fixes errors like "unable to upgrade to tcp, received
409" (as in #19930) in the special case where podman itself is running
rootful but inside a container which itself is rootless.
[NO NEW TESTS NEEDED]
Signed-off-by: Romain Geissler <romain.geissler@amadeus.com>
The reserved annotation io.podman.annotations.volumes-from is made
public to let users define volumes-from, so that one container can
mount the volumes of other containers.
The annotation format is: io.podman.annotations.volumes-from/tgtCtr:
"srcCtr1:mntOpts1;srcCtr2:mntOpts2;..."
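Parsing the value is a straightforward split (sketch; the helper name
is illustrative):

    import "strings"

    // "srcCtr1:mntOpts1;srcCtr2:mntOpts2;..." maps each source
    // container to the mount options used for the target container.
    func parseVolumesFrom(val string) map[string]string {
        mounts := make(map[string]string)
        for _, entry := range strings.Split(val, ";") {
            ctr, opts, _ := strings.Cut(entry, ":")
            mounts[ctr] = opts
        }
        return mounts
    }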
Fixes: containers#16819
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
if the target mount path already exists and the container uses a user
namespace, correctly map the target UID/GID to the host values before
attempting a chown.
Closes: https://github.com/containers/podman/issues/21608
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Conmon writes the exit file, and the oom file if the container
was oom-killed, to the persist directory. This directory
is retained across reboots as well.
Update podman to create a persist-dir/ctr-id directory per container,
where the exit and oom files for that container are written. The oom
state of a container is set after reading the files
from the persist-dir/ctr-id directory.
The exit code still continues to be read from the exit file in
the exits directory.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
We were preserving ContainerStateExited, which is better than
nothing, but definitely not correct. A container that ran at any
point during the last boot should be moved to Exited state to
preserve the fact that it was run at least once. This means we
have to convert Running, Stopped, Stopping, and Paused containers to
Exited as well.
Signed-off-by: Matt Heon <mheon@redhat.com>
When the interface_name attribute in the containers.conf file is set
to "device", set the interface names inside containers to the
network_interface names of the respective networks.
The change applies to macvlan and ipvlan networks only. The
interface_name attribute value has no impact on any other type of
network.
If the interface name is set in the user request, that takes
precedence.
Fixes: #21313
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
This mirrors how the Docker API handles things, allowing us to be
more compatible with Docker and more verbose on the Libpod API.
Stats are given as per network interface in the container, but
still aggregated for `podman stats` and `podman pod stats`
display (so the CLI does not change, only the Libpod and Compat
APIs).
Signed-off-by: Matt Heon <mheon@redhat.com>
During system shutdown, Podman should go down gracefully, meaning
that we have time to spawn cleanup processes which remove any
containers set to autoremove. Unfortunately, this isn't always
the case. If we get a SIGKILL because the system is going down
immediately, we can't recover from this, and the autoremove
containers are not removed.
However, we can pick up any leftover autoremove containers when
we refresh the DB state, which is the first thing Podman does
after a reboot. By detecting any autoremove containers that have
actually run (a container that was created but never run doesn't
need to be removed) at that point and removing them, we keep the
fresh boot clean, even if Podman was terminated abnormally.
Fixes #21482
[NO NEW TESTS NEEDED] This requires a reboot to realistically
test.
Signed-off-by: Matt Heon <mheon@redhat.com>
Currently we deadlock in the slirp4netns setup code as we try to
configure a nonexistent netns. The problem happens because, since
commit bbd6281ecc, we correctly tear down the netns in the userns
case, but that introduced this slirp4netns problem. The code does a
proper new network setup later, so we should only use the shortcut
when not in a userns.
Fixes #21477
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman v5 will not support cgroups-v1. This commit will print a warning
if it detects a cgroups-v1 system. The warning can be hidden by setting
envvar `PODMAN_CGROUPSV1_WARNING`.
This warning is patched out for RHEL 9 builds as cgroups-v1 will still
be supported on RHEL 9 systems.
Resolves: https://issues.redhat.com/browse/RUN-1957
[NO NEW TESTS NEEDED]
Co-authored-by: Ed Santiago <santiago@redhat.com>
Co-authored-by: Sascha Grunert <sgrunert@redhat.com>
Co-authored-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
The current field separator comma of the inspect annotation conflicts
with the mount options of --volumes-from, as the mount options
themselves can be comma-separated.
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>