Commit Graph

4179 Commits

Author SHA1 Message Date
Paul Holzinger 77081df8cd
libpod: bind ports before network setup
We bind ports to ensure there are no conflicts and we leak them into
conmon to keep them open. However we bound the ports after the network
was set up so it was possible for a second network setup to overwrite
the firewall configs of a previous container as it failed only later
when binding the port. As such we must ensure we bind before the network
is set up.

This is not so simple because we still have to take care of
PostConfigureNetNS bool in which case the network set up happens after
we launch conmon. Thus we end up with two different conditions.

Also it is possible that we "leak" the ports that are set on the
container until the garbage collector will close them. This is not
perfect but the alternative is adding special error handling on each
function exit after prepare until we start conmon which is a lot of work
to do correctly.

Fixes https://issues.redhat.com/browse/RHEL-50746

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-30 14:39:08 +02:00
Giuseppe Scrivano 61f0230c31
kube: record infra user namespace
if there is an annotation that specifies the user namespace for the
infra container, then make sure it is used for the entire pod.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-24 12:10:48 +02:00
Giuseppe Scrivano e97bb79b7a
kube: invert branches
it increases readability as it doesn't need the negation, and the
first branch is shorter.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-24 12:10:47 +02:00
Daniel J Walsh 7768cf235e
Run codespell on source
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-07-23 07:28:23 -04:00
openshift-merge-bot[bot] 249d042035
Merge pull request #23343 from Luap99/fix-hc-output
libpod: correctly capture healthcheck output
2024-07-22 12:18:34 +00:00
Paul Holzinger b6b61a6a49
libpod: add hidden env to set sqlite timeout
Some users want to experiment with different timeout values.

Fixes #23236

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-22 12:59:00 +02:00
Paul Holzinger 55b6e4c3e8
podman pod stats: fix race when ctr process exits
Like commit 55749af0c7 but for podman *pod* stats not the normal podman
stats. We must ignore ErrCtrStopped here as well as this will happen
when the container process exited.

While at it remove a useless argument from the function as it was always
nil and restructure the logic flow to make it easier to read.

Fixes #23334

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-22 10:30:42 +02:00
Paul Holzinger 5e8884ab0d
libpod: correctly capture healthcheck output
Using the scanner is just unnecessary complicated an buggy as it will
not read the final line with a newline. There is also the problem that
it happens in a separate goroutine so it could loose output if we read
the array before the scanner was done.

The API accepts a Writer so we can just directly use a bytes.Buffer
which captures all output in memory without the need of another
goroutine.

This also means that now we always include the final newline in the
output. I checked with docker and they do the same so this is good.

Fixes #23332

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-19 15:16:55 +02:00
openshift-merge-bot[bot] 88c68a4b58
Merge pull request #23271 from giuseppe/drop-unmount-for-overlay-storage
test: podman system service doesn't leak mount on termination
2024-07-15 12:20:11 +00:00
Giuseppe Scrivano fbc4768a00
libpod: shutdown Stop waits for handlers completion
wait for handlers currently being processed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-15 11:41:28 +02:00
Giuseppe Scrivano 6832a35f65
libpod: cleanup store at shutdown
shutdown the containers store so that the home directory mount is not
leaked when "podman system service" exits.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-15 11:41:28 +02:00
Paul Holzinger 3280da0500
fix race conditions in start/attach logic
The current code did something like this:
lock()
getState()
unlock()

if state != running
  lock()
  getState() == running -> error
  unlock()

This of course is wrong because between the first unlock() and second
lock() call another process could have modified the state. This meant
that sometimes you would get a weird error on start because the internal
setup errored as the container was already running.

In general any state check without holding the lock is incorrect and
will result in race conditions. As such refactor the code to combine
both StartAndAttach and Attach() into one function that can handle both.
With that we can move the running check into the locked code.

Also use typed error for this specific error case then the callers can
check and ignore the specific error when needed. This also allows us to
fix races in the compat API that did a similar racy state check.

This commit changes slightly how we output the result, previously a
start on already running container would never print the id/name of the
container which is confusing and sort of breaks idempotence. Now it will
include the output except when --all is used. Then it only reports the
ids that were actually started.

Fixes #23246

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-12 15:11:34 +02:00
openshift-merge-bot[bot] 04bd415c74
Merge pull request #23167 from mheon/fix_rhel_37948
Ignore result of EvalSymlinks on ENOENT
2024-07-11 20:13:02 +00:00
Matt Heon 830e550073 Ignore result of EvalSymlinks on ENOENT
When the path does not exist, filepath.EvalSymlinks returns an
empty string - so we can't just ignore ENOENT, we have to discard
the result if an ENOENT is returned.

Should fix Jira issue RHEL-37948

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-07-11 09:39:56 -04:00
Farya L. Maerten c819c7a973 create runtime's worker queue before queuing any job
It seems that if some background tasks are queued in libpod's Runtime before the worker's channel is set up (eg. in the refresh phase), they are not executed later on, but the workerGroup's counter is still ticked up. This leads podman to hang when the imageEngine is shutdown, since it waits for the workerGroup to be done.

fixes containers/podman#22984

Signed-off-by: Farya Maerten <me@ltow.me>
2024-07-09 11:15:29 +02:00
Paul Holzinger 62956ac192
libpod: first delete container then cidfile
I am seeing a weird flake in my parallel system test PR. The issue is
that system units generated by podman systemd generate leave a container
in the Removing state behind.

As far as I can tell the porblems seems to be that the cleanup process
is killed while it tries to remove the container from the db. Because
the cidfile was removed before the ExecStopPost=podman rm ... process no
longer had access to the cidfile and reported no error because it runs
with --ignore.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-07-05 10:27:42 +02:00
Paul Holzinger 6db8ff7f7b
libpod/container_top_linux.c: fix missing header
As this file uses open it needs to include fcntl.h.
This should fix the build error seen on epel9[1], not sure why it works
on the other platforms.

[1] https://download.copr.fedorainfracloud.org/results/packit/containers-podman-23113/epel-9-aarch64/07672197-podman/builder-live.log.gz

Fixes 65ed96585d ("podman top: join the container userns")

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-27 10:50:17 +02:00
Paul Holzinger 65ed96585d
podman top: join the container userns
When we execute ps(1) in the container and the container uses a userns
with a different id mapping the user id field will be wrong.

To fix this we must join the userns in such case.

Fixes #22293

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-26 11:10:56 +02:00
Paul Holzinger def182d396
restore: fix missing network setup
The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
suposed to be called after the container is created but before it is
started. There is no such thing for restore, the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before which makes it not working with PostConfigureNetNS (userns) but
as this does not work anyway today so  I don't see it as problem.

Fixes #22901

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-24 18:52:02 +02:00
openshift-merge-bot[bot] bf2de4177b
Merge pull request #23064 from giuseppe/podman-pass-timeout-stop-to-systemd
container: pass StopTimeout to the systemd slice
2024-06-23 14:57:55 +00:00
Giuseppe Scrivano 49eb5af301
libpod: intermediate mount if UID not mapped into the userns
if the current user is not mapped into the new user namespace, use an
intermediate mount to allow the mount point to be accessible instead
of opening up all the parent directories for the mountpoint.

Closes: https://github.com/containers/podman/issues/23028

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano 08a8429459
libpod: avoid chowning the rundir to root in the userns
so it is possible to remove the code to make the entire directory
world accessible.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano c81f075f43
libpod: do not chmod bind mounts
with the new mount API is available, the OCI runtime doesn't require
that each parent directory for a bind mount must be accessible.
Instead it is opened in the initial user namespace and passed down to
the container init process.

This requires that the kernel supports the new mount API and that the
OCI runtime uses it.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano 094bc673ef
libpod: unlock the thread if possible
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 18:01:26 +02:00
Giuseppe Scrivano 7d22f04f56
container: pass KillSignal and StopTimeout to the systemd scope
so that they are honored when systemd terminates the scope.

Closes: https://issues.redhat.com/browse/RHEL-16375

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 13:46:08 +02:00
Giuseppe Scrivano e48f3137c0
libpod: fix comment
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-21 10:07:55 +02:00
Marius Hoch 6dd9abf9ec sqlite_state: Fix RewriteVolumeConfig
The VolumeConfig table does not have an ID column, thus
use the Name column to update it.

Fixes #23052

Signed-off-by: Marius Hoch <mail@mariushoch.de>
2024-06-20 11:39:44 +02:00
openshift-merge-bot[bot] 00bcd9aa81
Merge pull request #22733 from nalind/system-check
Add `podman system check`
2024-06-13 10:35:56 +00:00
Giuseppe Scrivano 730a215025
podman: add new hidden flag --pull-option
add a new flag that allows to override the pull options configured in
the storage.conf file.

e.g.: --pull-option="enable_partial_images=false" can be specified to
Podman to disable partial pulls even if enabled.

Leave it as a hidden configuration flag for now since the API itself
is marked as experimental in c/storage.

Currently c/storage doesn't honor the overrides, being fixed with
https://github.com/containers/storage/pull/1966

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-12 15:48:36 +02:00
Paul Holzinger a9de888a15
libpod: do not resuse networking on start
If a container was stopped and we try to start it before we called
cleanup it tried to reuse the network which caused a panic as the pasta
code cannot deal with that. It is also never correct as the netns must
be created by the runtime in case of custom user namespaces used. As
such the proper thing is to clean the netns up first.

Also change a e2e test to report better errors. It is not directly
related to this chnage but it failed on v1 of this patch so we noticed
the ugly error message it produced. Thanks to Ed for the fix.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-07 17:50:28 +02:00
Doug Rabson ffc8522646 libpod: fix 'podman kube generate' on FreeBSD
This avoids dereferencing c.config.Spec.Linux if it is nil, which is the
case on FreeBSD.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2024-06-05 10:38:30 +01:00
openshift-merge-bot[bot] b63767866e
Merge pull request #22895 from Luap99/hc-startup-leak
libpod: do not leak systemd hc startup unit timer
2024-06-04 17:41:21 +00:00
Paul Holzinger e8ea1e7632
libpod: do not leak systemd hc startup unit timer
This fixes a regression added in commit 4fd84190b8, because the name was
overwritten by the createTimer() timer call the removeTransientFiles()
call removed the new timer and not the startup healthcheck. And then
when the container was stopped we leaked it as the wrong unit name was
in the state.

A new test has been added to ensure the logic works and we never leak
the system timers.

Fixes #22884

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 18:03:46 +02:00
Nalin Dahyabhai fec58a4571 Add `podman system check` for checking storage consistency
Add a `podman system check` that performs consistency checks on local
storage, optionally removing damaged items so that they can be
recreated.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2024-06-04 10:00:37 -04:00
Bo Wang 7243c7109c fix(libpod): add newline character to the end of container's hostname file
debian's man (5) hostname page states "The file should contain a single newline-terminated hostname
string."
[NO NEW TESTS NEEDED]

fix #22729

Signed-off-by: Bo Wang <wangbob@uniontech.com>
2024-06-04 15:20:04 +08:00
Giuseppe Scrivano 4ece83bdf9
libpod: cleanup default cache on system reset
Closes: https://github.com/containers/podman/issues/22825

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-29 11:10:55 +02:00
openshift-merge-bot[bot] af8fe2b75e
Merge pull request #22764 from giuseppe/give-more-time-to-healthcheck-status-change
libpod: wait another interval for healthcheck
2024-05-28 13:21:43 +00:00
openshift-merge-bot[bot] eee0dc256a
Merge pull request #22727 from mheon/chown_all_the_time
Always chown volumes when mounting into a container
2024-05-23 12:34:07 +00:00
Matthew Heon 046c0e5fc2 Only stop chowning volumes once they're not empty
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount into a
dozen containers, but never add content, the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.

For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.

In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.

Fixes #22571

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2024-05-22 17:47:01 -04:00
Giuseppe Scrivano d094a9f18e
podman: fix --sdnotify=healthy with --rm
Now WaitForExit returns the exit code as stored in the db instead of
returning an error when the container was removed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-22 21:34:38 +02:00
Giuseppe Scrivano e166f6bfe0
libpod: wait another interval for healthcheck
wait for another interval when the container transitioned to "stopped"
to give more time to the healthcheck status to change.

Closes: https://github.com/containers/podman/issues/22760

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-22 21:34:34 +02:00
openshift-merge-bot[bot] cc79d5e82e
Merge pull request #22700 from Luap99/libpod-inspect-API-v4
remote API: restore v4 payload in container inspect
2024-05-22 12:32:29 +00:00
openshift-merge-bot[bot] 4f31335fa4
Merge pull request #22404 from testwill/close_file
fix: close resource file
2024-05-21 11:47:36 +00:00
Paul Holzinger fb2ab832a7
fix incorrect host.containers.internal entry for rootless bridge mode
We have to exclude the ips in the rootless netns as they are not the
host. Now that fix only works if there are more than one ip one the
host available, if there is only one we do not set the entry at all
which I consider better as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.

The test is bit more compilcated as I would like, however it must deal
with both cases one ip, more than one so there is no way around it I
think.

Fixes #22653

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-17 12:28:44 +02:00
Giuseppe Scrivano 35375e0af8
container_api: do not wait for healtchecks if stopped
do not wait for the healthcheck status to change if the container is
stopped.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-15 09:34:08 +02:00
Giuseppe Scrivano b06c58b4a5
libpod: wait for healthy on main thread
wait for the healthy status on the thread where the container lock is
held.  Otherwise, if it is performed from a go routine, a different
thread is used (since the runtime.LockOSThread() call doesn't have any
effect), causing pthread_mutex_unlock() to fail with EPERM.

Closes: https://github.com/containers/podman/issues/22651

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-14 22:55:02 +02:00
Paul Holzinger bcb7edfded
remote API: restore v4 payload in container inspect
The v5 API made a breaking change for podman inspect, this means that
an old client could not longer parse the result from the new 5.X server.
The other way around new client and old server already worked.

As it turned out there were several users that run into this, one case
to hit this is using an old 4.X podman machine wich now pulls a newer
coreos with podman 5.0. But there are also other users running into it.
In order to keep the API working we now have a version check and return
the old v4 compatible payload so the old remote client can still work
against a newer server thus removing any major breaking change for an
old client.

Fixes #22657

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-14 17:56:20 +02:00
openshift-merge-bot[bot] 0c09421f85
Merge pull request #22641 from mheon/handle_stopping_loop
Ensure that containers do not get stuck in stopping
2024-05-13 12:32:40 +00:00
Giuseppe Scrivano 8433a01aa2
Revert "container stop: kill conmon"
This reverts commit 909ab59419.

The workaround was added almost 5 years ago to workaround an issue
with old conmon releases.  It is safe to assume such ancient conmon
releases are not used anymore.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-09 22:49:14 +02:00
Matt Heon 3fa8e98a31 Ensure that containers do not get stuck in stopping
The scenario for inducing this is as follows:
1. Start a container with a long stop timeout and a PID1 that
   ignores SIGTERM
2. Use `podman stop` to stop that container
3. Simultaneously, in another terminal, kill -9 `pidof podman`
   (the container is now in ContainerStateStopping)
4. Now kill that container's Conmon with SIGKILL.
5. No commands are able to move the container from Stopping to
   Stopped now.

The cause is a logic bug in our exit-file handling logic. Conmon
being dead without an exit file causes no change to the state.
Add handling for this case that tries to clean up, including
stopping the container if it still seems to be running.

Fixes #19629

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-05-09 11:17:24 -04:00