automation-tests

Commit Graph

Author	SHA1	Message	Date
Daniel J Walsh	b72bb11629	Add TERM iff TERM not defined in container when podman exec -t Fixes: https://github.com/containers/podman/issues/20334 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2023-10-18 15:40:52 -04:00
openshift-ci[bot]	62a81a166a	Merge pull request #20383 from Luap99/init-path use FindInitBinary() for init binary	2023-10-18 17:17:59 +00:00
openshift-ci[bot]	aabe5c8aa5	Merge pull request #20394 from giuseppe/cleanup-exec-session-on-errors exec: do not leak session IDs on errors	2023-10-18 13:52:12 +00:00
Paul Holzinger	caef657c5b	libpod: rename confusing import name The package is called slirp4netns and renaming it makes no sense, this was likely done by accident. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-18 15:14:23 +02:00
Giuseppe Scrivano	fa19e1baa2	exec: do not leak session IDs on errors always cleanup the exec session when the command specified to the "exec" is not found. Closes: https://github.com/containers/podman/issues/20392 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-10-18 15:02:22 +02:00
openshift-ci[bot]	a1982c043d	Merge pull request #20365 from p-fruck/fix/api-compat-network-connected fix(API): Catch ErrNetworkConnected for compat	2023-10-18 08:55:31 +00:00
openshift-ci[bot]	6624ccb4b1	Merge pull request #20384 from Luap99/double-netns libpod: restart+userns cleanup netns correctly	2023-10-18 07:47:02 +00:00
Philipp Fruck	ad53190253	fix(api): Ensure compatibility for network connect When trying to connect a container to a network and the connection already exists, an error should only be raised if the container is already running (or is in the `ContainerStateCreated` transition) to mimic the behavior of Docker as described here: https://github.com/containers/podman/pull/15516#issuecomment-1229265942 For running and connected containers 403 is returned which fixes #20365 Signed-off-by: Philipp Fruck <dev@p-fruck.de>	2023-10-17 22:56:32 +02:00
Paul Holzinger	bbd6281ecc	libpod: restart+userns cleanup netns correctly When a userns and netns is used we need to let the runtime create the netns otherwise the netns is not owned by the right userns and thus the capabilities would not be correct. The current restart logic tries to reuse the netns which is fine if no userns is used but when one is used we setup a new netns (which is correct) but forgot to cleanup the old netns. This resulted in leaked network namespaces and because no teardown was ever called leaked ipam assignments, thus a quickly restarting container will run out of ip space very fast. Fixes #18615 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-17 17:25:50 +02:00
Daniel J Walsh	9637fed2fd	Fix output of podman --remote top Allow users to specify podman-remote top $cid -eo "pid comm" or podman-remote top $cid -eo pid,comm Fixes: https://github.com/containers/podman/issues/19176 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2023-10-16 17:25:10 -04:00
Paul Holzinger	99a14332ef	healthcheck: make sure to always show health_status events This fixes a regression caused by commit `7e6e267329`, unfortunately this was not caught during review as for some reason this works fine rootless and only fails as root. Because we set the systemd log level to notice in order to hide the unit started/stopped messages to prevent spamming the journal the issue is that this now also causes systemd to ignore the events we write to journald as we also send them as info level. To fix this we simply send health_status events now on notice level. I decided against sending all events on notice as I think info is fine for them. Whether the notice level is right is of course debatable but given it may contain the unhealthy message I think having this as a notice should be ok. The main reason this made it through testing is because we do not rely on the systemd unit to fire healthchecks in the tests as this is flaky. There is one test where we rely on it though and I added a check there to make sure events are displayed correctly when triggered via systemd. Fixes #20342 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-12 15:02:32 +02:00
Paul Holzinger	3cc9db8626	libpod: fix deadlock while parallel container create When containers are created with a named volume it can deadlock because the create logic tried to lock all volumes in a loop, this is fine if it only ever creates a single container at any given time. However because multiple containers can be created at the same time they can cause a deadlock between the volumes. This is because the order of the loop is not stable, in fact it is based on the order of how the volumes were specified on the cli. So if you create two containers at the same time with `-v vol1:/dir1 -v vol2:/dir2` and the other one with `-v vol2:/dir2 -v vol1:/dir1` then there is a chance for a deadlock. Now one solution could be to order the volumes to prevent the issue but the reason for holding the lock is dubious. The goal was to prevent the volume from being removed in the meantime. However that could still have happened before we acquired the lock so it didn't protect against that. Both boltdb and sqlite already prevent us from adding a container with volumes that do not exist due to their internal consistency checks. Sqlite even uses FOREIGN KEY relationships so the schema will prevent us from doing anything wrong. The create code currently first checks if the volume exists and if not creates it. I have checked that the db will guarantee that this will not work: Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume` Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed` Keep in mind that this error is normally not seen, only if the volume is removed between the volume exists check and adding the container in the db will this message be seen, which is an acceptable race and a pre-existing condition anyway. [NO NEW TESTS NEEDED] Race condition, hard to test in CI. Fixes #20313 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-11 11:40:35 +02:00
Paul Holzinger	29ae516006	use sqlite as default database Use sqlite as default but for upgrades it will still use boltdb to avoid breaking anyone. This is done by checking if the boltdb file already exists and if it does then we have to use it. I added an e2e test to check the new logic and removed the system test for it, the problem with the system test is that we share the storage dir there so all following commands without --db-backend would try to use boltdb as a single --db-backend boltdb command will create the file and then all following commands will use it because of the backwards compat. In e2e tests each test uses their own --root so it is not an issue there. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-10 17:11:28 +02:00
Paul Holzinger	8a52e638e6	vendor latest c/common Includes the default db backend changes. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-10-10 17:08:04 +02:00
Giuseppe Scrivano	8ac2aa7938	container: always check if mountpoint is mounted when running as a service, the c.state.Mounted flag could get out of sync if the container is cleaned up through the cleanup process. To avoid this, always check if the mountpoint is really present before skipping the mount. [NO NEW TESTS NEEDED] Closes: https://github.com/containers/podman/issues/17042 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-10-09 17:20:22 +02:00
openshift-ci[bot]	3b39d4b082	Merge pull request #20132 from cgiradkar/Issue-17856 Change log level for health_check	2023-10-04 17:59:32 +00:00
Chetan Giradkar	7e6e267329	Filter health_check and exec events for logging in console Podman server logs are mostly full of healthcheck output, making them hard to navigate. Hence, made the healthcheck service run with LogLevelMax=notice, this would remove the normal output, including the started/stopped messages from systemd itself. Fixes #17856 Signed-off-by: Chetan Giradkar <cgiradka@redhat.com>	2023-10-04 14:50:15 +01:00
Ygal Blum	979c77f10e	Volume create - fast exit when ignore is set and volume exists Signed-off-by: Ygal Blum <ygal.blum@gmail.com>	2023-10-01 16:54:24 +03:00
Wolfgang Pross	bfbd0c8960	move IntelRdtClosID to HostConfig Signed-off-by: Wolfgang Pross <wolfgang.pross@intel.com>	2023-09-27 16:44:13 +00:00
Wolfgang Pross	40d3c3b9b0	Add Intel RDT support Add --rdt-class=COS to the create and run command to enable the assignment of a container to a Class of Service (COS). The COS represents a part of the cache based on the Cache Allocation Technology (CAT) feature that is part of Intel's Resource Director Technology (Intel RDT) feature set. By assigning a container to a COS, all PID's of the container have only access to the cache space defined for this COS. The COS has to be pre-configured based on the resctrl kernel driver. cat_l2 and cat_l3 flags in /proc/cpuinfo represent CAT support for cache level 2 and 3 respectively. Signed-off-by: Wolfgang Pross <wolfgang.pross@intel.com>	2023-09-27 16:44:13 +00:00
Valentin Rothberg	7ade972102	libpod: pass entire environment to conmon Pass the _entire_ environment to conmon instead of selectively enabling only specific variables. The main reasoning is to make sure that conmon and the podman-cleanup callback process operate in the exact same environment as the initial podman process. Some configuration files may be passed via environment variables. Podman not passing those down to conmon has led to subtle and hard to debug issues in the past, so passing all down will avoid such kinds of issues in the future. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>	2023-09-26 16:48:52 +02:00
Valentin Rothberg	6293ec2e2d	fix handling of static/volume dir The processing and setting of the static and volume directories was scattered across the code base (including c/common) leading to subtle errors that surfaced in #19938. There were multiple issues that I try to summarize below: - c/common loaded the graphroot from c/storage to set the defaults for static and volume dir. That ignored Podman's --root flag and surfaced in #19938 and other bugs. c/common does not set the defaults anymore which gives Podman the ability to detect when the user/admin configured a custom directory (not empty value). - When parsing the CLI, Podman (ab)uses containers.conf structures to set the defaults but also to override them in case the user specified a flag. The --root flag overrode the static dir which is wrong and broke a couple of use cases. Now there is a dedicated field for it in the "PodmanConfig" which also includes a containers.conf struct. - The defaults for static and volume dir are now being set correctly and adhere to --root. - The CONTAINERS_CONF_OVERRIDE env variable has not been passed to the cleanup process. I believe that _all_ env variables should be passed to conmon to avoid such subtle bugs. Overall I find that the code and logic is scattered and hard to understand and follow. I refrained from larger refactorings as I really just want to get #19938 fixed and then go back to other priorities. https://github.com/containers/common/pull/1659 broke three pkg/machine tests. Those have been commented out until getting fixed. Fixes: #19938 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>	2023-09-25 14:14:30 +02:00
Paul Holzinger	af2665c28a	pod rm: do not log error if anonymous volume is still used This is not really an error, if the anonymous volume is still used then this likely means it was transferred to another container with --volumes-from. This is what the user wants and it is not like the user can act on the logged error anyway. Once the last user of the volume is removed it will be removed correctly. see https://github.com/containers/podman/pull/19637 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-09-22 14:44:14 +02:00
OpenShift Merge Robot	639eb52c89	Merge pull request #20062 from vrothberg/syslog-fix pass --syslog to the cleanup process	2023-09-20 11:57:33 -04:00
OpenShift Merge Robot	c5be3aecde	Merge pull request #20056 from Luap99/fast-net-ls compat API: speed up network list	2023-09-20 16:43:00 +02:00
Valentin Rothberg	4652a2623f	pass --syslog to the cleanup process The --syslog flag has not been passed to the cleanup process (i.e., conmon's exit args) complicating debugging quite a bit. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>	2023-09-20 15:37:07 +02:00
Daniel J Walsh	73dc72f80d	vendor of containers/common Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2023-09-20 08:39:49 -04:00
Paul Holzinger	8e5adde0b3	compat API: speed up network list The network list compat API requires us to include all containers with their ip addresses for the selected networks. Because we have no network -> container mapping in the db we have to go through all containers every time. However the old code did it in the most inefficient way possible, it queried the containers from the db for each individual network. This of course is extremely expensive. Now the other expensive call is calling Inspect() on the container each time. Inspect does far more than we need. To fix this we first query containers only once for the API call, then replace the inspect call with directly accessing the network status. This will speed things up a lot! The reported scenario includes 100 containers and 25 networks, previously it took 1.5s for the API call, now it takes 24ms, that is more than a 62x improvement. (tested with curl) [NO NEW TESTS NEEDED] We have no timing tests. Fixes #20035 Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-09-20 13:08:42 +02:00
Paul Holzinger	befdb41995	libpod: remove unused ContainerState() function First that function claims to deep copy but then actually returns the original state so it does not work correctly, given that there are no users just remove it instead of fixing it. Signed-off-by: Paul Holzinger <pholzing@redhat.com>	2023-09-20 11:01:09 +02:00
OpenShift Merge Robot	1e43fae5ad	Merge pull request #19873 from rst0git/update-checkpointctl vendor: update github.com/checkpoint-restore/checkpointctl to 1.1.0	2023-09-14 15:22:02 +02:00
Daniel J Walsh	b1e3e8d972	Run codespell on code Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2023-09-14 06:13:23 -04:00
Radostin Stoyanov	9b17d6cb06	vendor: update checkpointctl to v1.1.0 Signed-off-by: Radostin Stoyanov <radostin@redhat.com>	2023-09-12 08:41:02 +01:00
danishprakash	cdcf18b862	kube: add DaemonSet support for generate Signed-off-by: danishprakash <danish.prakash@suse.com>	2023-09-12 10:30:57 +05:30
OpenShift Merge Robot	cbb955811c	Merge pull request #19245 from mheon/fix_19237 Ensure HC events fire after logs are written	2023-09-11 19:47:37 +02:00
OpenShift Merge Robot	325736fcb7	Merge pull request #19914 from umohnani8/term Add support for kube TerminationGracePeriodSeconds	2023-09-11 19:24:18 +02:00
Giuseppe Scrivano	19bd9b33dd	libpod: move oom_score_adj clamp to init commit `8b4a79a744` introduced oom_score_adj clamping when the container oom_score_adj value is lower than the current one in a rootless environment. Move the check to init() time so it is performed every time the container starts and not only when it is created. It is more robust if the oom_score_adj value is changed for the current user session. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-11 17:04:37 +02:00
Matt Heon	925794c6aa	Ensure HC events fire after logs are written HC events were firing as part of the `exec` call, before it had even been decided whether the HC succeeded or failed. As such, the status was not going to be correct any time there was a change (e.g. the first event after a container went healthy to unhealthy would still read healthy). Move the event into the actual Healthcheck function and throw it in a defer to make sure it happens at the very end, after logs are written. Ignores several conditions that did not log previously (container in question does not have a healthcheck, or an internal failure that should not really happen). Still not a perfect solution. This relies on the HC log being written, when instead we could just get the status straight from the function writing the event - so if we fail to write the log, we can still report a bad status. But if the log wasn't written, we're in bad shape regardless - `podman ps` would disagree with the event written, for example. Fixes #19237 Signed-off-by: Matt Heon <mheon@redhat.com>	2023-09-11 08:02:46 -04:00
Urvashi Mohnani	d9a85466a0	Add support for kube TerminationGracePeriodSeconds Add support to kube play to support the TerminationGracePeriodSeconds field by sending the value of that to podman's stopTimeout. Add support to kube generate to generate TerminationGracePeriodSeconds if stopTimeout is set for a container (will ignore podman's default). Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>	2023-09-10 16:41:24 -04:00
Giuseppe Scrivano	b8f6a12d01	libpod: create the cgroup pod before containers When a container is created and it is part of a pod, we ensure the pod cgroup exists so limits can be applied on the pod cgroup. Closes: https://github.com/containers/podman/issues/19175 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:48 +02:00
Giuseppe Scrivano	5de8f4aba0	libpod: allow cgroup path without infra container a pod can use cgroups without an infra container. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:48 +02:00
Giuseppe Scrivano	5121c9eb0e	libpod: check if cgroup exists before creating it do not create the pod cgroup if it already exists. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:48 +02:00
Giuseppe Scrivano	38209ef49d	libpod: refactor platformMakePod signature accept only the resources to be used by the pod, so that the function can more easily be used by a successive patch. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:48 +02:00
Giuseppe Scrivano	627ac1c96b	libpod: destroy pod cgroup on pod stop When the pod is stopped, we need to destroy the pod cgroup, otherwise it is leaked. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:48 +02:00
Giuseppe Scrivano	556db46a68	libpod: refactor code to new function move the code to remove the pod cgroup to a separate function. It is a preparation for the next patch. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-09-08 14:58:44 +02:00
Daniel J Walsh	6ee8f73d41	Merge pull request #19885 from rhatdan/kube Add support for kube securityContext.procMount	2023-09-08 06:56:05 -04:00
Ed Santiago	70cf9740f1	StopContainer: display signal num when name unknown Under some circumstances podman tries to kill a container using signal 37, for which unix.SignalName() returns "". Not helpful. So, when that happens, show "(signal number)". Signed-off-by: Ed Santiago <santiago@redhat.com>	2023-09-07 14:13:14 -06:00
Daniel J Walsh	b83485022d	Add support for kube securityContext.procMount Fixes: https://github.com/containers/podman/issues/19881 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2023-09-07 09:49:11 -04:00
Valentin Rothberg	589867d716	podman: don't restart after kill Also add a new `StoppedByUser` field to the container-inspect state which can be useful during debugging and is now also used in the regression test. Note that I moved the `false` check one test above such that we can compare the previous Podman version which should just be stuck in the `wait $ctr` command since it will continue restarting. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>	2023-09-07 15:18:02 +02:00
Matt Heon	71549c642f	Ignore spurious container-removal errors When removing a container's dependency, getting an error that the container has already been removed (ErrNoSuchCtr and ErrCtrRemoved) should not be fatal. We wanted the container gone, it's gone, no need to error out. [NO NEW TESTS NEEDED] This is a race and thus hard to test for. Fixes #18874 Signed-off-by: Matt Heon <mheon@redhat.com>	2023-09-05 14:35:37 -04:00
Giuseppe Scrivano	702709a916	libpod: do not parse --hostuser in base 8 fix the parsing of --hostuser to treat the input in base 10. Closes: https://github.com/containers/podman/issues/19800 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2023-08-31 12:34:58 +02:00
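
The last commit above (702709a916) changes the parsing of --hostuser from base 8 to base 10. As a rough illustration of why the base matters, the sketch below parses the same strings in both bases with Go's standard `strconv.ParseUint`. The helper `parseUID` is hypothetical and written only for this example; it is not Podman's actual code, and the 32-bit size is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUID is a hypothetical helper for illustration only (not Podman's code).
// It parses s as an unsigned 32-bit number in the given base.
func parseUID(s string, base int) (uint32, error) {
	v, err := strconv.ParseUint(s, base, 32)
	if err != nil {
		return 0, err
	}
	return uint32(v), nil
}

func main() {
	// "1000" is a valid octal string, so base 8 silently yields a different
	// number; "1009" contains the digit 9, so base 8 rejects it outright.
	for _, in := range []string{"1000", "1009"} {
		oct, octErr := parseUID(in, 8)
		dec, decErr := parseUID(in, 10)
		fmt.Printf("input=%q base8=%d (err=%v) base10=%d (err=%v)\n",
			in, oct, octErr, dec, decErr)
	}
}
```

With base 8, the input "1000" is read as 512 rather than 1000, and an input such as "1009" fails to parse at all; base 10 reads both as the user most likely intended.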


3973 Commits