Remove code duplication and use the new FilterID function from
c/common. Also remove the duplicated ComputeUntilTimestamp in podman and
use the one from c/common as well.
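A minimal sketch of how the shared helpers might be called; the exact signatures of FilterID and ComputeUntilTimestamp are assumptions based on this description, not verified against c/common:

```go
package main

import (
	"fmt"
	"time"

	"github.com/containers/common/pkg/filters"
)

func main() {
	// ComputeUntilTimestamp converts "until" filter values into a cutoff time.
	until, err := filters.ComputeUntilTimestamp([]string{"2h"})
	if err != nil {
		fmt.Println("bad until filter:", err)
		return
	}
	fmt.Println("filter cutoff:", until.Format(time.RFC3339))

	// FilterID matches a container/pod ID against user-supplied filter
	// values (assumed signature and matching semantics).
	fmt.Println("ID matched:", filters.FilterID("f0e1d2c3b4a5", []string{"f0e1"}))
}
```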
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When waiting for a container, there may be a time window where conmon
has already exited but the container hasn't been fully cleaned up.
In that case, we give the container at most 20 seconds to be fully
cleaned up. We cannot wait forever since conmon may have been killed or
something else went wrong.
After the timeout, we optimistically assume the container to be cleaned
up and its exit code to be present. If no exit code can be found, we
return an error.
Indicate in the error whether the timeout kicked in to help debug
(transient) errors and flakes (e.g., #18860).
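A rough sketch of the wait described above, with hypothetical helpers (isCleanedUp, exitCode) standing in for Podman's real internals:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const cleanupTimeout = 20 * time.Second

func waitForExitCode(isCleanedUp func() bool, exitCode func() (int, bool)) (int, error) {
	deadline := time.Now().Add(cleanupTimeout)
	timedOut := false
	for !isCleanedUp() {
		if time.Now().After(deadline) {
			// Conmon may have been killed or something else went wrong;
			// optimistically assume cleanup happened and look for the code.
			timedOut = true
			break
		}
		time.Sleep(100 * time.Millisecond)
	}
	if code, ok := exitCode(); ok {
		return code, nil
	}
	if timedOut {
		return -1, errors.New("no exit code found after timing out waiting for container cleanup")
	}
	return -1, errors.New("no exit code found for container")
}

func main() {
	polls := 0
	code, err := waitForExitCode(
		func() bool { polls++; return polls > 3 },
		func() (int, bool) { return 0, true },
	)
	fmt.Println(code, err)
}
```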
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
podman info prints the network information: binary path, package
version, program version, and DNS information.
Fixes: #18443
Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
There is a weird issue (#18856) which causes the version check to fail.
Return the underlying error in these cases so we can see it and debug
it.
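A tiny illustration of the idea, with a made-up helper name; the point is simply to wrap and return the underlying error rather than a generic failure message:

```go
package main

import (
	"fmt"
	"os/exec"
)

// binaryVersion is a hypothetical stand-in for the failing version check.
func binaryVersion(path string) (string, error) {
	out, err := exec.Command(path, "--version").Output()
	if err != nil {
		// Propagate the underlying error with %w so the real cause is
		// visible when debugging failures like #18856.
		return "", fmt.Errorf("checking version of %s: %w", path, err)
	}
	return string(out), nil
}

func main() {
	if _, err := binaryVersion("/does/not/exist"); err != nil {
		fmt.Println(err)
	}
}
```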
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There are certain messages logged by OCI runtimes when killing a
container that has already stopped that we really do not care
about when stopping a container. Due to our architecture, there
are inherent races around stopping containers, and so we cannot
guarantee that *we* are the people to kill it - but that doesn't
matter because Podman only cares that the container has stopped,
not who delivered the fatal signal.
Unfortunately, the OCI runtimes don't understand this, and log
various warning messages when the `kill` command is invoked on a
container that was already dead. These cause our tests to fail,
as we now check for clean STDERR when running Podman. To work
around this, capture STDERR for the OCI runtime in a buffer only
for stopping containers, and go through and discard any of the
warnings we identified as spurious.
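A sketch of that approach, with made-up warning strings standing in for the real runtime messages:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// spuriousWarnings lists substrings of runtime messages considered harmless
// when killing an already-stopped container (illustrative values only).
var spuriousWarnings = []string{
	"container not running",
	"already stopped",
}

func runKill(runtime string, args ...string) error {
	var stderr bytes.Buffer
	cmd := exec.Command(runtime, args...)
	cmd.Stderr = &stderr
	err := cmd.Run()

	var kept []string
	for _, line := range strings.Split(stderr.String(), "\n") {
		if line == "" {
			continue
		}
		spurious := false
		for _, w := range spuriousWarnings {
			if strings.Contains(line, w) {
				spurious = true
				break
			}
		}
		if !spurious {
			kept = append(kept, line)
		}
	}
	if len(kept) > 0 {
		fmt.Println("runtime stderr:", strings.Join(kept, "\n"))
	}
	return err
}

func main() {
	_ = runKill("true") // stand-in for `<runtime> kill <ctr> <signal>`
}
```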
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We use shared-memory pthread mutexes to handle mutual exclusion
in Libpod. It turns out that these have configurable options for
how to handle a recursive lock (i.e., a thread trying to lock a
lock that the same thread had previously locked). The mutex can
either deadlock, or allow the duplicate lock without deadlocking.
Default behavior is, helpfully, unspecified, so if not explicitly
set there is no clear indication of which of these behaviors will
be seen. Unfortunately, today is the first time I learned of this, so
our initial implementation did *not* explicitly set our preferred
behavior.
This turns out to be a major problem with a language like Golang,
where multiple goroutines can (and often do) use the same OS
thread. So we can have two goroutines trying to stop the same
container, and if the no-deadlock mutex behavior is in use, both
goroutines will successfully acquire the lock because the C library,
not knowing about Go's lightweight threads, sees the same PID
trying to lock a mutex twice, and allows it without question.
It appears that, at least on Fedora/RHEL/Debian libc, the default
(unspecified) behavior of the locks is the non-deadlocking
version - so, effectively, our locks have been of questionable
utility within the same Podman process for the last four years.
This is somewhat concerning.
What's even more concerning is that the Golang-native sync.Mutex
that was also in use did nothing to prevent the duplicate locking
(I don't know if I like the implications of this).
Anyways, this resolves the major issue of our locks not working
correctly by explicitly setting the correct pthread mutex
behavior.
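An illustrative cgo sketch (not the actual shm lock code) of setting the mutex type explicitly; using PTHREAD_MUTEX_ERRORCHECK here is an assumption for the example, chosen so a recursive lock fails visibly instead of quietly succeeding:

```go
package main

/*
#include <pthread.h>

static int init_mutex(pthread_mutex_t *m) {
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);
	if (ret != 0)
		return ret;
	// Set the type explicitly; PTHREAD_MUTEX_DEFAULT leaves recursive-lock
	// behavior unspecified.
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (ret != 0)
		return ret;
	ret = pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}
*/
import "C"

import (
	"fmt"
	"runtime"
)

func main() {
	// Pin to one OS thread so both lock attempts come from the same thread.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	var m C.pthread_mutex_t
	if ret := C.init_mutex(&m); ret != 0 {
		fmt.Println("mutex init failed:", ret)
		return
	}
	first := C.pthread_mutex_lock(&m)
	second := C.pthread_mutex_lock(&m) // returns EDEADLK instead of succeeding
	fmt.Println("first lock:", first, "second lock:", second)
	C.pthread_mutex_unlock(&m)
}
```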
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If the first container to get the pod lock is the infra container
it's going to want to remove the entire pod, which will also
remove every other container in the pod. Subsequent containers
will get the pod lock and try to access the pod, only to realize
it no longer exists - and that, actually, the container being
removed also no longer exists.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This was causing some CI flakes. I'm pretty sure that the pod
being removed already isn't a bug, but just the result of another
container in the pod removing it first - so there's no reason not to
ignore the errors.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This fixes a lint issue, but I'm keeping it in its own commit so
it can be reverted independently if necessary; I don't know what
side effects this may have. I don't *think* there are any
issues, but I'm not sure why it wasn't a pointer in the first
place, so there may have been a reason.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
For filter=id=XXX (containers, pods) and filter=ctr-ids=XXX (pods):
- if XXX is only hex characters, treat it as a PREFIX
- otherwise, treat it as a REGEX
Add tests. Update documentation. And fix an incorrect help message.
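A sketch of that rule; matchCtrID is a made-up helper name, not the function actually added here:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var hexOnly = regexp.MustCompile(`^[0-9a-fA-F]+$`)

func matchCtrID(id, filterValue string) (bool, error) {
	if hexOnly.MatchString(filterValue) {
		// Only hex characters: treat the value as an ID prefix.
		return strings.HasPrefix(id, strings.ToLower(filterValue)), nil
	}
	// Anything else: treat the value as a regular expression.
	re, err := regexp.Compile(filterValue)
	if err != nil {
		return false, err
	}
	return re.MatchString(id), nil
}

func main() {
	id := "f0e1d2c3b4a5968778695a4b3c2d1e0f"
	for _, f := range []string{"f0e1", "^f0e1.*0f$", "deadbeef"} {
		ok, err := matchCtrID(id, f)
		fmt.Printf("filter %q -> match=%v err=%v\n", f, ok, err)
	}
}
```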
Fixes: #18471
Signed-off-by: Ed Santiago <santiago@redhat.com>
To debug a deadlock, we really want to know what lock is actually
locked, so we can figure out what is using that lock. This PR
adds support for this, using trylock to check if every lock on
the system is free or in use. It will need to be run a few times in
quick succession to verify that a lock isn't just transiently held and
is actually stuck, but that's not a big deal.
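A pure-Go analogy of the check (the real implementation uses pthread_mutex_trylock on the shared-memory locks): try to take every lock and report the ones currently held:

```go
package main

import (
	"fmt"
	"sync"
)

func heldLocks(locks []*sync.Mutex) []int {
	var held []int
	for i, l := range locks {
		if l.TryLock() {
			// Lock was free; release it immediately.
			l.Unlock()
			continue
		}
		held = append(held, i)
	}
	return held
}

func main() {
	locks := make([]*sync.Mutex, 4)
	for i := range locks {
		locks[i] = &sync.Mutex{}
	}
	locks[2].Lock() // simulate an in-use lock

	fmt.Println("locks currently in use:", heldLocks(locks))
}
```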
Signed-off-by: Matt Heon <mheon@redhat.com>
This is a general debug command that identifies any lock
conflicts that could lead to a deadlock. It's only intended for
Libpod developers (while it does tell you if you need to run
`podman system renumber`, you should never have to do that
anyways, and the next commit will include a lot more technical
info in the output that no one except a Libpod dev will want).
Hence, it's a hidden command, and it's only implemented for the local
driver (we recommend just SSHing into a `podman machine` VM and running
it there in the unlikely case it's needed with remote Podman).
These conflicts should normally never happen, but having a
command like this is useful for debugging deadlock conditions
when they do occur.
Signed-off-by: Matt Heon <mheon@redhat.com>
This is a nice quality-of-life change that should help to debug
situations where someone runs out of locks (usually when a bunch
of unused volumes accumulate).
Signed-off-by: Matt Heon <mheon@redhat.com>
Being able to easily identify what lock has been allocated to a
given Libpod object is only somewhat useful for debugging lock
issues, but it's trivial to expose and I don't see any harm in
doing so.
Signed-off-by: Matt Heon <mheon@redhat.com>
setupPasta() has logic to handle forwarding of TCP or UDP ports. It has
what looks like logic to give an error if trying to forward ports of any
other protocol. However, there's a straightforward bug here: it will in
fact only give that error if you try to use a protocol literally called
"default". Other unknown protocols will fall through and result in a
nonsensical pasta command line which will almost certainly cause a cryptic
error later on.
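A sketch of the intended handling; the helper name and exact pasta arguments are illustrative, but the point is that the error belongs in the switch's default branch rather than behind a comparison with the literal string "default":

```go
package main

import "fmt"

func pastaPortArgs(proto string, hostPort, ctrPort uint16) ([]string, error) {
	switch proto {
	case "tcp":
		return []string{"-t", fmt.Sprintf("%d:%d", hostPort, ctrPort)}, nil
	case "udp":
		return []string{"-u", fmt.Sprintf("%d:%d", hostPort, ctrPort)}, nil
	default:
		// Any unknown protocol must be rejected here, not only a protocol
		// named "default".
		return nil, fmt.Errorf("can't forward protocol: %s", proto)
	}
}

func main() {
	for _, p := range []string{"tcp", "udp", "sctp"} {
		args, err := pastaPortArgs(p, 8080, 80)
		fmt.Println(p, args, err)
	}
}
```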
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We had something like 6 different boolean options (removing a
container turns out to be rather complicated, because there are a
million-odd things that want to do it), and the function
signature was getting unreasonably large. Change to a struct to
clean things up.
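A hypothetical before/after shape of the refactor; the field names are examples, not the actual Libpod options:

```go
package main

import "fmt"

// ctrRemoveOpts bundles the flags that used to be separate boolean
// parameters on the removal function.
type ctrRemoveOpts struct {
	Force         bool
	RemoveVolumes bool
	RemovePod     bool
	IgnoreNoSuch  bool
	NoLockPod     bool
	Timeout       *uint
}

func removeContainer(name string, opts ctrRemoveOpts) error {
	fmt.Printf("removing %s with %+v\n", name, opts)
	return nil
}

func main() {
	_ = removeContainer("ctr1", ctrRemoveOpts{Force: true, RemoveVolumes: true})
}
```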
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The infra container would try to remove the pod, despite the pod
already being in the process of being removed - oops. Add a check
to ensure we don't try and remove the pod when called by the
`podman pod rm` command.
Also, wire up noLockPod - it wasn't previously wired in, which is
concerning, and could be related?
Finally, make a few minor fixes to un-break lint.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This probably should have been in the API since the beginning,
but it's not too late to start now.
The extra information is returned (both via the REST API, and to
the CLI handler for `podman rm`) but is not yet printed - it
feels like adding it to the output could be a breaking change?
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This allows for accurate reporting of dependency removal, but the
work is still incomplete: pods can be removed, but do not report
the containers they removed as part of said removal. Will add
this in a subsequent commit.
Major note: I made ignoring no-such-container errors automatic
once it has been determined that a container did exist in the
first place. I can't think of any case where this would not be a
TOCTOU - i.e., there's no reason not to ignore them. The `--ignore` option
to `podman rm` should still retain meaning as it will ignore
errors from containers that didn't exist in the first place.
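A sketch of that rule, with a local sentinel error standing in for Libpod's real no-such-container error:

```go
package main

import (
	"errors"
	"fmt"
)

var errNoSuchCtr = errors.New("no such container")

func removeDependency(remove func() error) error {
	err := remove()
	if errors.Is(err, errNoSuchCtr) {
		// We already confirmed the container existed, so a no-such-container
		// error just means someone else removed it first (TOCTOU) - ignore it.
		return nil
	}
	return err
}

func main() {
	err := removeDependency(func() error { return fmt.Errorf("removing: %w", errNoSuchCtr) })
	fmt.Println("error after filtering:", err)
}
```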
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This is the initial stage of implementation. The current API
functions but does not report the additional containers and pods
removed. This is necessary to properly display results to the
user after `podman rm --all`.
The existing remove-dependencies code has been removed in favor
of this more native solution.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The current way of bind mounting the host timezone file has problems.
Because /etc/localtime in the image may exist and be a symlink into
/usr/share/zoneinfo, the bind mount overwrites the target file. That
confuses timezone parsers, especially Java, where this approach does not
work at all. So we end up with a link which does not reflect the actual
truth.
The better way is to just change the symlink in the image, like it is
done on the host. However, because not all images ship tzdata, we cannot
rely on that either. So now we do both: when tzdata is installed we use
the symlink, and if not we keep the current way of copying the host
timezone file into the container as /etc/localtime.
Also note that we need to rebuild the systemd image to include tzdata in
order to test this, as our images do not contain tzdata by default.
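A rough sketch of the decision; paths and helper names are illustrative, not the actual implementation:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// configureTimezone links /etc/localtime to the zoneinfo entry when the
// image ships tzdata, and otherwise falls back to copying the host file.
func configureTimezone(containerRoot, timezone string, copyHostFile func(dst string) error) error {
	zonePath := filepath.Join(containerRoot, "usr/share/zoneinfo", timezone)
	localtime := filepath.Join(containerRoot, "etc/localtime")

	if _, err := os.Stat(zonePath); err == nil {
		// tzdata is present in the image: recreate the symlink, just like
		// it is done on the host.
		_ = os.Remove(localtime)
		return os.Symlink(filepath.Join("../usr/share/zoneinfo", timezone), localtime)
	}

	// No tzdata in the image: keep the old behavior of copying the host
	// timezone file into the container.
	return copyHostFile(localtime)
}

func main() {
	err := configureTimezone("/tmp/ctr-root", "Europe/Berlin", func(dst string) error {
		fmt.Println("would copy host /etc/localtime to", dst)
		return nil
	})
	fmt.Println("result:", err)
}
```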
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2149876
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd. The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.
When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint. In that
case, before stopping the service container, Podman will look up the exit
codes of all non-infra containers. The service will then behave
according to the following three exit-code policies:
- `none`: exit 0 and ignore containers (default)
- `any`: exit non-zero if _any_ container did
- `all`: exit non-zero if _all_ containers did
The above values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.
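A sketch of how the three policies might be evaluated; only the policy names come from this change, the rest is illustrative:

```go
package main

import "fmt"

type exitCodePolicy string

const (
	policyNone exitCodePolicy = "none"
	policyAny  exitCodePolicy = "any"
	policyAll  exitCodePolicy = "all"
)

// serviceExitCode decides the exit code of the `kube play` main PID from
// the exit codes of all non-infra containers.
func serviceExitCode(policy exitCodePolicy, codes []int) int {
	nonZero := 0
	for _, c := range codes {
		if c != 0 {
			nonZero++
		}
	}
	switch policy {
	case policyAny:
		if nonZero > 0 {
			return 1
		}
	case policyAll:
		if len(codes) > 0 && nonZero == len(codes) {
			return 1
		}
	}
	// policyNone (default): ignore container exit codes.
	return 0
}

func main() {
	codes := []int{0, 3, 0}
	for _, p := range []exitCodePolicy{policyNone, policyAny, policyAll} {
		fmt.Printf("policy=%s -> exit %d\n", p, serviceExitCode(p, codes))
	}
}
```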
In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.
Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero. The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero. Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.
Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Make sure to prune container exit codes only when the associated
container does not exist anymore. This is needed when checking if any
container in kube-play exited non-zero, and it is a building block for
the Jira card linked below.
[NO NEW TESTS NEEDED] - there are no unit tests for exit code pruning.
Jira: https://issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Use a helper to handle the cleanupErr logic instead of
copy&pasting it EIGHT times.
Also modifies the returned errors to be wrapped with a context,
and changes the text of the logged errors a bit.
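A sketch of the general pattern such a helper follows (not the exact helper added here): keep the first cleanup error for the caller and log the rest so they aren't silently dropped:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

func cleanup(steps []func() error) error {
	var cleanupErr error
	reportError := func(err error, context string) {
		if err == nil {
			return
		}
		wrapped := fmt.Errorf("%s: %w", context, err)
		if cleanupErr == nil {
			cleanupErr = wrapped
			return
		}
		// Only the first error is returned; later ones are logged so they
		// are not lost.
		log.Printf("additional cleanup error: %v", wrapped)
	}

	for i, step := range steps {
		reportError(step(), fmt.Sprintf("cleanup step %d", i))
	}
	return cleanupErr
}

func main() {
	err := cleanup([]func() error{
		func() error { return nil },
		func() error { return errors.New("unmount failed") },
		func() error { return errors.New("network teardown failed") },
	})
	fmt.Println("returned:", err)
}
```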
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Use a shared helper instead of copy&pasting the handling
of cleanupErr EIGHT times.
This changes the wording of logged error text, and the error
in one case, a bit.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
[NO NEW TESTS NEEDED]
... because testing this would require us to intentionally
create an inconsistent state, which should ideally not be possible...
(and because at this point I don't even know what the reported failure
was.)
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Make sure to look for the container's exit code when it's in stopped
state. With `--restart=always`, the container seems to stay in the
stopped state, which led the wait logic to loop until the 20-second
timeout for the cleanup process to finish kicked in.
Also defensively make sure to loop when the container is in stopped
state but no exit code has been written yet.
Add a regression test to make sure Podman doesn't wait more than 20
seconds. Even on a CI machine under high load I expect it to take much
much much less than that, so I do not expect this test to flake in the
future.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
b25b330306 introduced this behaviour.
It was fine at the time because we didn't support "container update",
so the limit could not be changed at runtime. Since it is now
possible to change the memory limit at runtime, read the limit as
reported from the cgroup.
https://github.com/containers/crun/pull/1217 is required for crun.
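A simplified sketch of reading the current limit from the cgroup (cgroup v2 with a v1 fallback); the path handling is intentionally naive:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readMemoryLimit returns the limit in bytes, or 0 if unlimited.
func readMemoryLimit(cgroupPath string) (uint64, error) {
	// cgroup v2
	data, err := os.ReadFile(filepath.Join(cgroupPath, "memory.max"))
	if err != nil {
		// cgroup v1 fallback
		data, err = os.ReadFile(filepath.Join(cgroupPath, "memory.limit_in_bytes"))
		if err != nil {
			return 0, err
		}
	}
	val := strings.TrimSpace(string(data))
	if val == "max" {
		return 0, nil
	}
	return strconv.ParseUint(val, 10, 64)
}

func main() {
	limit, err := readMemoryLimit("/sys/fs/cgroup/system.slice")
	fmt.Println(limit, err)
}
```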
Closes: https://github.com/containers/podman/issues/18621
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
As shown in #17831, WAL mode plays a role in causing `database is locked`
errors. Those are errors that, in theory, should not happen, as the DB
should busy-wait. mattn/go-sqlite3/issues/274 has some comments indicating
that the busy handler behaves differently in WAL mode, which may be an
explanation for the error.
For now, let's disable WAL mode and only re-enable it when we have
clearer understanding of what's going on. The upstream issue, along with
the SQLite documentation, does not give me the clear guidance I would
need.
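A sketch of what disabling WAL mode can look like with the mattn/go-sqlite3 DSN options; the file path and busy-timeout value are illustrative:

```go
package main

import (
	"database/sql"
	"fmt"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// _journal_mode=DELETE keeps the default rollback journal instead of
	// WAL; _busy_timeout makes concurrent writers wait instead of failing
	// immediately with "database is locked".
	db, err := sql.Open("sqlite3", "file:/tmp/podman-db.sql?_journal_mode=DELETE&_busy_timeout=100000")
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer db.Close()

	var mode string
	if err := db.QueryRow("PRAGMA journal_mode;").Scan(&mode); err != nil {
		fmt.Println("pragma:", err)
		return
	}
	fmt.Println("journal mode:", mode)
}
```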
[NO NEW TESTS NEEDED] - flake is only reproducible in CI.
Fixes: #18356
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>