automation-tests

Commit Graph

Author	SHA1	Message	Date
Paul Holzinger	77081df8cd	libpod: bind ports before network setup We bind ports to ensure there are no conflicts and we leak them into conmon to keep them open. However we bound the ports after the network was set up so it was possible for a second network setup to overwrite the firewall configs of a previous container as it failed only later when binding the port. As such we must ensure we bind before the network is set up. This is not so simple because we still have to take care of PostConfigureNetNS bool in which case the network set up happens after we launch conmon. Thus we end up with two different conditions. Also it is possible that we "leak" the ports that are set on the container until the garbage collector will close them. This is not perfect but the alternative is adding special error handling on each function exit after prepare until we start conmon which is a lot of work to do correctly. Fixes https://issues.redhat.com/browse/RHEL-50746 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-30 14:39:08 +02:00
Giuseppe Scrivano	61f0230c31	kube: record infra user namespace if there is an annotation that specifies the user namespace for the infra container, then make sure it is used for the entire pod. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-07-24 12:10:48 +02:00
Giuseppe Scrivano	e97bb79b7a	kube: invert branches it increases readability as it doesn't need the negation, and the first branch is shorter. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-07-24 12:10:47 +02:00
Daniel J Walsh	7768cf235e	Run codespell on source Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2024-07-23 07:28:23 -04:00
openshift-merge-bot[bot]	249d042035	Merge pull request #23343 from Luap99/fix-hc-output libpod: correctly capture healthcheck output	2024-07-22 12:18:34 +00:00
Paul Holzinger	b6b61a6a49	libpod: add hidden env to set sqlite timeout Some users want to experiment with different timeout values. Fixes #23236 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-22 12:59:00 +02:00
Paul Holzinger	55b6e4c3e8	podman pod stats: fix race when ctr process exits Like commit `55749af0c7` but for podman pod stats not the normal podman stats. We must ignore ErrCtrStopped here as well as this will happen when the container process exited. While at it remove a useless argument from the function as it was always nil and restructure the logic flow to make it easier to read. Fixes #23334 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-22 10:30:42 +02:00
Paul Holzinger	5e8884ab0d	libpod: correctly capture healthcheck output Using the scanner is just unnecessarily complicated and buggy as it will not read the final line with a newline. There is also the problem that it happens in a separate goroutine so it could lose output if we read the array before the scanner was done. The API accepts a Writer so we can just directly use a bytes.Buffer which captures all output in memory without the need of another goroutine. This also means that now we always include the final newline in the output. I checked with docker and they do the same so this is good. Fixes #23332 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-19 15:16:55 +02:00
openshift-merge-bot[bot]	88c68a4b58	Merge pull request #23271 from giuseppe/drop-unmount-for-overlay-storage test: podman system service doesn't leak mount on termination	2024-07-15 12:20:11 +00:00
Giuseppe Scrivano	fbc4768a00	libpod: shutdown Stop waits for handlers completion wait for handlers currently being processed. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-07-15 11:41:28 +02:00
Giuseppe Scrivano	6832a35f65	libpod: cleanup store at shutdown shutdown the containers store so that the home directory mount is not leaked when "podman system service" exits. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-07-15 11:41:28 +02:00
Paul Holzinger	3280da0500	fix race conditions in start/attach logic The current code did something like this: lock() getState() unlock() if state != running lock() getState() == running -> error unlock() This of course is wrong because between the first unlock() and second lock() call another process could have modified the state. This meant that sometimes you would get a weird error on start because the internal setup errored as the container was already running. In general any state check without holding the lock is incorrect and will result in race conditions. As such refactor the code to combine both StartAndAttach and Attach() into one function that can handle both. With that we can move the running check into the locked code. Also use typed error for this specific error case then the callers can check and ignore the specific error when needed. This also allows us to fix races in the compat API that did a similar racy state check. This commit changes slightly how we output the result, previously a start on already running container would never print the id/name of the container which is confusing and sort of breaks idempotence. Now it will include the output except when --all is used. Then it only reports the ids that were actually started. Fixes #23246 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-12 15:11:34 +02:00
openshift-merge-bot[bot]	04bd415c74	Merge pull request #23167 from mheon/fix_rhel_37948 Ignore result of EvalSymlinks on ENOENT	2024-07-11 20:13:02 +00:00
Matt Heon	830e550073	Ignore result of EvalSymlinks on ENOENT When the path does not exist, filepath.EvalSymlinks returns an empty string - so we can't just ignore ENOENT, we have to discard the result if an ENOENT is returned. Should fix Jira issue RHEL-37948 Signed-off-by: Matt Heon <mheon@redhat.com>	2024-07-11 09:39:56 -04:00
Farya L. Maerten	c819c7a973	create runtime's worker queue before queuing any job It seems that if some background tasks are queued in libpod's Runtime before the worker's channel is set up (eg. in the refresh phase), they are not executed later on, but the workerGroup's counter is still ticked up. This leads podman to hang when the imageEngine is shutdown, since it waits for the workerGroup to be done. fixes containers/podman#22984 Signed-off-by: Farya Maerten <me@ltow.me>	2024-07-09 11:15:29 +02:00
Paul Holzinger	62956ac192	libpod: first delete container then cidfile I am seeing a weird flake in my parallel system test PR. The issue is that system units generated by podman systemd generate leave a container in the Removing state behind. As far as I can tell the problem seems to be that the cleanup process is killed while it tries to remove the container from the db. Because the cidfile was removed before the ExecStopPost=podman rm ... process no longer had access to the cidfile and reported no error because it runs with --ignore. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-07-05 10:27:42 +02:00
Paul Holzinger	6db8ff7f7b	libpod/container_top_linux.c: fix missing header As this file uses open it needs to include fcntl.h. This should fix the build error seen on epel9[1], not sure why it works on the other platforms. [1] https://download.copr.fedorainfracloud.org/results/packit/containers-podman-23113/epel-9-aarch64/07672197-podman/builder-live.log.gz Fixes `65ed96585d` ("podman top: join the container userns") Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-06-27 10:50:17 +02:00
Paul Holzinger	65ed96585d	podman top: join the container userns When we execute ps(1) in the container and the container uses a userns with a different id mapping the user id field will be wrong. To fix this we must join the userns in such case. Fixes #22293 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-06-26 11:10:56 +02:00
Paul Holzinger	def182d396	restore: fix missing network setup The restore code path never called completeNetworkSetup() and this means that hosts/resolv.conf files were not populated. This fix is simply to call this function. There is a big catch here. Technically this is supposed to be called after the container is created but before it is started. There is no such thing for restore, the container runs right away. This means that if we do the call afterwards there is a short interval where the file is still empty. Thus I decided to call it before which makes it not working with PostConfigureNetNS (userns) but as this does not work anyway today I don't see it as a problem. Fixes #22901 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-06-24 18:52:02 +02:00
openshift-merge-bot[bot]	bf2de4177b	Merge pull request #23064 from giuseppe/podman-pass-timeout-stop-to-systemd container: pass StopTimeout to the systemd slice	2024-06-23 14:57:55 +00:00
Giuseppe Scrivano	49eb5af301	libpod: intermediate mount if UID not mapped into the userns if the current user is not mapped into the new user namespace, use an intermediate mount to allow the mount point to be accessible instead of opening up all the parent directories for the mountpoint. Closes: https://github.com/containers/podman/issues/23028 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 18:01:26 +02:00
Giuseppe Scrivano	08a8429459	libpod: avoid chowning the rundir to root in the userns so it is possible to remove the code to make the entire directory world accessible. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 18:01:26 +02:00
Giuseppe Scrivano	c81f075f43	libpod: do not chmod bind mounts with the new mount API is available, the OCI runtime doesn't require that each parent directory for a bind mount must be accessible. Instead it is opened in the initial user namespace and passed down to the container init process. This requires that the kernel supports the new mount API and that the OCI runtime uses it. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 18:01:26 +02:00
Giuseppe Scrivano	094bc673ef	libpod: unlock the thread if possible Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 18:01:26 +02:00
Giuseppe Scrivano	7d22f04f56	container: pass KillSignal and StopTimeout to the systemd scope so that they are honored when systemd terminates the scope. Closes: https://issues.redhat.com/browse/RHEL-16375 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 13:46:08 +02:00
Giuseppe Scrivano	e48f3137c0	libpod: fix comment Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-21 10:07:55 +02:00
Marius Hoch	6dd9abf9ec	sqlite_state: Fix RewriteVolumeConfig The VolumeConfig table does not have an ID column, thus use the Name column to update it. Fixes #23052 Signed-off-by: Marius Hoch <mail@mariushoch.de>	2024-06-20 11:39:44 +02:00
openshift-merge-bot[bot]	00bcd9aa81	Merge pull request #22733 from nalind/system-check Add `podman system check`	2024-06-13 10:35:56 +00:00
Giuseppe Scrivano	730a215025	podman: add new hidden flag --pull-option add a new flag that allows to override the pull options configured in the storage.conf file. e.g.: --pull-option="enable_partial_images=false" can be specified to Podman to disable partial pulls even if enabled. Leave it as a hidden configuration flag for now since the API itself is marked as experimental in c/storage. Currently c/storage doesn't honor the overrides, being fixed with https://github.com/containers/storage/pull/1966 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-06-12 15:48:36 +02:00
Paul Holzinger	a9de888a15	libpod: do not reuse networking on start If a container was stopped and we try to start it before we called cleanup it tried to reuse the network which caused a panic as the pasta code cannot deal with that. It is also never correct as the netns must be created by the runtime in case of custom user namespaces used. As such the proper thing is to clean the netns up first. Also change a e2e test to report better errors. It is not directly related to this change but it failed on v1 of this patch so we noticed the ugly error message it produced. Thanks to Ed for the fix. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-06-07 17:50:28 +02:00
Doug Rabson	ffc8522646	libpod: fix 'podman kube generate' on FreeBSD This avoids dereferencing c.config.Spec.Linux if it is nil, which is the case on FreeBSD. [NO NEW TESTS NEEDED] Signed-off-by: Doug Rabson <dfr@rabson.org>	2024-06-05 10:38:30 +01:00
openshift-merge-bot[bot]	b63767866e	Merge pull request #22895 from Luap99/hc-startup-leak libpod: do not leak systemd hc startup unit timer	2024-06-04 17:41:21 +00:00
Paul Holzinger	e8ea1e7632	libpod: do not leak systemd hc startup unit timer This fixes a regression added in commit `4fd84190b8`, because the name was overwritten by the createTimer() timer call the removeTransientFiles() call removed the new timer and not the startup healthcheck. And then when the container was stopped we leaked it as the wrong unit name was in the state. A new test has been added to ensure the logic works and we never leak the system timers. Fixes #22884 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-06-04 18:03:46 +02:00
Nalin Dahyabhai	fec58a4571	Add `podman system check` for checking storage consistency Add a `podman system check` that performs consistency checks on local storage, optionally removing damaged items so that they can be recreated. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>	2024-06-04 10:00:37 -04:00
Bo Wang	7243c7109c	fix(libpod): add newline character to the end of container's hostname file debian's man (5) hostname page states "The file should contain a single newline-terminated hostname string." [NO NEW TESTS NEEDED] fix #22729 Signed-off-by: Bo Wang <wangbob@uniontech.com>	2024-06-04 15:20:04 +08:00
Giuseppe Scrivano	4ece83bdf9	libpod: cleanup default cache on system reset Closes: https://github.com/containers/podman/issues/22825 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-29 11:10:55 +02:00
openshift-merge-bot[bot]	af8fe2b75e	Merge pull request #22764 from giuseppe/give-more-time-to-healthcheck-status-change libpod: wait another interval for healthcheck	2024-05-28 13:21:43 +00:00
openshift-merge-bot[bot]	eee0dc256a	Merge pull request #22727 from mheon/chown_all_the_time Always chown volumes when mounting into a container	2024-05-23 12:34:07 +00:00
Matthew Heon	046c0e5fc2	Only stop chowning volumes once they're not empty When an empty volume is mounted into a container, Docker will chown that volume appropriately for use in the container. Podman does this as well, but there are differences in the details. In Podman, a chown is presently a one-and-done deal; in Docker, it will continue so long as the volume remains empty. Mount into a dozen containers, but never add content, the chown occurs every time. The chown is also linked to copy-up; it will always occur when a copy-up occurred, despite the volume now not being empty. This PR changes our logic to (mostly) match Docker's. For some reason, the chowning also stops if the volume is chowned to root at any point. This feels like a Docker bug, but as they say, bug for bug compatible. In retrospect, using bools for NeedsChown and NeedsCopyUp was a mistake. Docker isn't actually tracking this stuff; they're just doing a copy-up and permissions change unconditionally as long as the volume is empty. They also have the two linked as one operation, seemingly, despite happening at very different times during container init. Replicating that in our stateful system is nontrivial, hence the need for the new CopiedUp field. Basically, we never want to chown a volume with contents in it, except if that data is a result of a copy-up that resulted from mounting into the current container. Tracking who did the copy-up is the easiest way to do this. Fixes #22571 Signed-off-by: Matthew Heon <matthew.heon@pm.me>	2024-05-22 17:47:01 -04:00
Giuseppe Scrivano	d094a9f18e	podman: fix --sdnotify=healthy with --rm Now WaitForExit returns the exit code as stored in the db instead of returning an error when the container was removed. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-22 21:34:38 +02:00
Giuseppe Scrivano	e166f6bfe0	libpod: wait another interval for healthcheck wait for another interval when the container transitioned to "stopped" to give more time to the healthcheck status to change. Closes: https://github.com/containers/podman/issues/22760 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-22 21:34:34 +02:00
openshift-merge-bot[bot]	cc79d5e82e	Merge pull request #22700 from Luap99/libpod-inspect-API-v4 remote API: restore v4 payload in container inspect	2024-05-22 12:32:29 +00:00
openshift-merge-bot[bot]	4f31335fa4	Merge pull request #22404 from testwill/close_file fix: close resource file	2024-05-21 11:47:36 +00:00
Paul Holzinger	fb2ab832a7	fix incorrect host.containers.internal entry for rootless bridge mode We have to exclude the ips in the rootless netns as they are not the host. Now that fix only works if there is more than one ip on the host available, if there is only one we do not set the entry at all which I consider better as failing to resolve this name is a much better error for users than connecting to a wrong ip. It also matches what --network pasta already does. The test is a bit more complicated than I would like, however it must deal with both cases one ip, more than one so there is no way around it I think. Fixes #22653 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-05-17 12:28:44 +02:00
Giuseppe Scrivano	35375e0af8	container_api: do not wait for healthchecks if stopped do not wait for the healthcheck status to change if the container is stopped. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-15 09:34:08 +02:00
Giuseppe Scrivano	b06c58b4a5	libpod: wait for healthy on main thread wait for the healthy status on the thread where the container lock is held. Otherwise, if it is performed from a go routine, a different thread is used (since the runtime.LockOSThread() call doesn't have any effect), causing pthread_mutex_unlock() to fail with EPERM. Closes: https://github.com/containers/podman/issues/22651 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-14 22:55:02 +02:00
Paul Holzinger	bcb7edfded	remote API: restore v4 payload in container inspect The v5 API made a breaking change for podman inspect, this means that an old client could no longer parse the result from the new 5.X server. The other way around new client and old server already worked. As it turned out there were several users that run into this, one case to hit this is using an old 4.X podman machine which now pulls a newer coreos with podman 5.0. But there are also other users running into it. In order to keep the API working we now have a version check and return the old v4 compatible payload so the old remote client can still work against a newer server thus removing any major breaking change for an old client. Fixes #22657 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2024-05-14 17:56:20 +02:00
openshift-merge-bot[bot]	0c09421f85	Merge pull request #22641 from mheon/handle_stopping_loop Ensure that containers do not get stuck in stopping	2024-05-13 12:32:40 +00:00
Giuseppe Scrivano	8433a01aa2	Revert "container stop: kill conmon" This reverts commit `909ab59419`. The workaround was added almost 5 years ago to workaround an issue with old conmon releases. It is safe to assume such ancient conmon releases are not used anymore. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2024-05-09 22:49:14 +02:00
Matt Heon	3fa8e98a31	Ensure that containers do not get stuck in stopping The scenario for inducing this is as follows: 1. Start a container with a long stop timeout and a PID1 that ignores SIGTERM 2. Use `podman stop` to stop that container 3. Simultaneously, in another terminal, kill -9 `pidof podman` (the container is now in ContainerStateStopping) 4. Now kill that container's Conmon with SIGKILL. 5. No commands are able to move the container from Stopping to Stopped now. The cause is a logic bug in our exit-file handling logic. Conmon being dead without an exit file causes no change to the state. Add handling for this case that tries to clean up, including stopping the container if it still seems to be running. Fixes #19629 Signed-off-by: Matt Heon <mheon@redhat.com>	2024-05-09 11:17:24 -04:00


4179 Commits