Commit Graph

23615 Commits

Author SHA1 Message Date
renovate[bot] 6d4006b123
Update module github.com/docker/docker to v27.3.1+incompatible
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-09-20 19:56:39 +00:00
openshift-merge-bot[bot] 2f44b166e7
Merge pull request #24024 from Luap99/netns-dir
libpod: setupNetNS() correctly mount netns
2024-09-20 14:41:59 +00:00
Paul Holzinger 792796183f
libpod: setupNetNS() correctly mount netns
The netns dir has special logic to bind mount itself and make itself
shared. This code here didn't, which led to a catastrophic bug during
netns unmounting: we were unable to unmount the netns because the mount got
duplicated and had the wrong parent mount. This caused us to loop forever
trying to remove the file.

Fixes https://issues.redhat.com/browse/RHEL-59620
Fixes #23685

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-20 15:19:22 +02:00
Paul Holzinger f6bda786ed
vendor latest c/common
To include the pkg/netns changes.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-20 15:18:35 +02:00
openshift-merge-bot[bot] f7be7a365a
Merge pull request #24019 from edsantiago/quadlet-rootfs-fix
CI: Quadlet rootfs test: use container image as rootfs
2024-09-20 10:55:12 +00:00
openshift-merge-bot[bot] e38f86c024
Merge pull request #24020 from containers/renovate/github.com-docker-docker-27.x
Update module github.com/docker/docker to v27.3.0+incompatible
2024-09-20 10:22:14 +00:00
renovate[bot] 597773464c
Update module github.com/docker/docker to v27.3.0+incompatible
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-09-19 22:52:25 +00:00
Ed Santiago a08ae98161 CI: Quadlet rootfs test: use container image as rootfs
Test was written to use / (root). This is not parallel-safe.

Fixes: #23909

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-19 15:19:14 -06:00
openshift-merge-bot[bot] 217ecac740
Merge pull request #23996 from edsantiago/safename-200
CI: make 200-pod parallel-safe
2024-09-19 14:27:38 +00:00
openshift-merge-bot[bot] 80776fa5bb
Merge pull request #24007 from edsantiago/systest-cleanup
CI: system tests: various small cleanups
2024-09-19 14:05:36 +00:00
openshift-merge-bot[bot] eb18c41835
Merge pull request #24002 from edsantiago/systest-registry
CI: system test registry: use --net=host
2024-09-19 12:48:35 +00:00
Ed Santiago 9c51eead06 CI: system test registry: use --net=host
This removes the need for a tricky/fragile namespace workaround.

Huge thanks to Paul for discovering documentation on the
Registry container, and how to override config.yml settings:

   https://distribution.github.io/distribution/about/configuration/#override-specific-configuration-options

Drive-by: consistentize quotes in -eVAR="value". Minor, but
makes them all easier to read with emacs/vi syntax highlighting.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-19 05:17:15 -06:00
openshift-merge-bot[bot] 327c26af2f
Merge pull request #24008 from stilwelb/fix-typo
Fix typo in error message
2024-09-18 20:02:59 +00:00
openshift-merge-bot[bot] bb235fb9cc
Merge pull request #24006 from Luap99/vendor-common
vendor latest c/common
2024-09-18 19:46:28 +00:00
Ed Santiago e3af5a38d3 CI: rm system test: bump grace period
The "rm on stopping containers" test is flaking under high load,
probably because I bumped up two timeouts in the healthcheck
container that it relies on. Bump up this test's timeout as well.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:35:00 -06:00
Ed Santiago 3396dabdf3 CI: system tests: minor documentation on parallel
Only in 000-TEMPLATE. I know I need to write more thorough
documentation. I choose to defer that.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:32:36 -06:00
Brad Stilwell 31cdf1197b fix typo in error message
Fixes: containers/podman#24001

Signed-off-by: Brad Stilwell <stilwelb@us.ibm.com>
2024-09-18 13:24:34 -04:00
Ed Santiago 1d5c8ac18e CI: system tests: always create pause image
...not just when running parallel Bats, because Bats
does not provide any way to know if we're parallel.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:23:12 -06:00
Ed Santiago 5e5c68ffbe CI: quadlet system test: be more forgiving
...of high system load (such as when running parallel tests).
Allow time for services to reach desired state, by retrying
a few times in a loop.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 11:22:48 -06:00
Paul Holzinger 6dcda2196a
vendor latest c/common
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 19:21:50 +02:00
openshift-merge-bot[bot] 04d193daa9
Merge pull request #23987 from edsantiago/safename-090
CI: make 090-events parallel-safe
2024-09-18 16:06:31 +00:00
openshift-merge-bot[bot] bef0aabbdd
Merge pull request #23995 from Luap99/netns-leak
CI: netns leak checks for system and e2e
2024-09-18 15:49:59 +00:00
openshift-merge-bot[bot] 7fee222d52
Merge pull request #23997 from Luap99/expose-sctp
allow exposed sctp ports
2024-09-18 15:08:45 +00:00
openshift-merge-bot[bot] f580ae0d19
Merge pull request #23985 from Luap99/wait-hang
wait: fix handling of multiple conditions with exited
2024-09-18 12:26:28 +00:00
Ed Santiago 6fe832d5d6 CI: make 200-pod parallel-safe
...as much as possible. Not all tests can be parallelized.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-18 06:25:18 -06:00
Paul Holzinger d7335855d7
allow exposed sctp ports
There is no reason to disallow exposed sctp ports at all. As root we can
publish them fine and as rootless it should error later anyway.

And for the case mentioned in the issue it doesn't make sense, as the
port is not even published; thus it is just part of the metadata, which is
totally fine in all cases.

Fixes #23911

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 14:24:45 +02:00
Paul Holzinger 755a06aa44
test/e2e: add netns leak check
As we now do in system tests, check for netns leaks in e2e as well.
Because things run in parallel and this dir is shared, we cannot check
after each test, only once per suite. This will be a PITA to debug if
leaks happen, as the netns files do not contain the container ID and are
just random bytes (maybe we should change this?)

Fixes #23715

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 14:05:26 +02:00
Paul Holzinger 2d469e517d
test/system: netns leak check for rootless as well
This fixes the problem where even as root we check the netns files from
root. But in order to catch any rootless bugs we must check the rootless
files from $XDG_RUNTIME_DIR/netns.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-18 12:07:11 +02:00
Ed Santiago 5468718f22 CI: make 090-events parallel-safe
...or at least as much as possible. Some tests cannot
be run in parallel due to #23750: "--events-backend=file"
does not actually work the way a naïve user would intuit.
Stop/die events are asynchronous, and can be gathered
by *ANY OTHER* podman process running after it, and if
that process has the default events-backend=journal,
that's where the event will be logged. See #23987 for
further discussion.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 18:21:58 -06:00
openshift-merge-bot[bot] 62c101651f
Merge pull request #23857 from rhatdan/run
Remove containers/common/pkg/config from pkg/util
2024-09-17 20:31:28 +00:00
openshift-merge-bot[bot] 1e9464c9b4
Merge pull request #23937 from edsantiago/test-crun-17
New VMs: test crun 1.17
2024-09-17 20:28:43 +00:00
openshift-merge-bot[bot] 4dfff40840
Merge pull request #23989 from edsantiago/enable-bats-parallel
CI: system tests: enable parallel tests
2024-09-17 19:30:57 +00:00
openshift-merge-bot[bot] 75369fd283
Merge pull request #23986 from mheon/fix_23981
Match output of Compat Top API to Docker
2024-09-17 19:06:13 +00:00
openshift-merge-bot[bot] f29901ef1b
Merge pull request #23983 from nalind/manifest-remove-docs
podman-manifest-remove: update docs and help output
2024-09-17 18:52:30 +00:00
openshift-merge-bot[bot] d0642ca913
Merge pull request #23988 from edsantiago/safename-012
CI: make 012-manifest parallel-safe
2024-09-17 18:00:13 +00:00
Ed Santiago 8402b6535f Misc minor test fixes
...for dealing with flakes in parallel mode

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
Ed Santiago 7fcf94d7b5 Add network namespace leak check
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
Ed Santiago b3da5be2b1 Add workaround for buildah parallel bug
Need --layers=false in podman build, otherwise a buildah race
can trigger "layer not known" failures:

   https://github.com/containers/buildah/issues/5674

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
Ed Santiago 5fc3de5583 registry: lock start attempts
When running parallel, multiple tests could be trying to start
the registry at once. Make this parallel-safe.

Also, use a safer port range for the registry. Something
outside of /proc/sys/net/ipv4/ip_local_port_range

Sorry, I'm including a FIXME section that I haven't investigated
deeply enough.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
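
A rough Go sketch of the "safer port range" idea from the registry commit above: pick ports that fall outside the kernel's ephemeral range, so a parallel test's outgoing connection is unlikely to grab the registry port. The real change lives in the Bats helpers; the function names `parsePortRange` and `candidatePorts` are hypothetical, and the sample range string is only an assumed example of the two-number format of /proc/sys/net/ipv4/ip_local_port_range.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortRange parses the two-number format used by
// /proc/sys/net/ipv4/ip_local_port_range, e.g. "32768\t60999\n".
func parsePortRange(s string) (low, high int, err error) {
	f := strings.Fields(s)
	if len(f) != 2 {
		return 0, 0, fmt.Errorf("expected two fields, got %d", len(f))
	}
	if low, err = strconv.Atoi(f[0]); err != nil {
		return 0, 0, err
	}
	if high, err = strconv.Atoi(f[1]); err != nil {
		return 0, 0, err
	}
	return low, high, nil
}

// candidatePorts returns up to n ports, counting up from start,
// skipping everything inside the ephemeral range [low, high].
func candidatePorts(start, n, low, high int) []int {
	var ports []int
	for p := start; p <= 65535 && len(ports) < n; p++ {
		if p >= low && p <= high {
			p = high // the loop increment moves us to high+1
			continue
		}
		ports = append(ports, p)
	}
	return ports
}

func main() {
	// Assumed sample content; a real caller would read the proc file.
	low, high, err := parsePortRange("32768\t60999\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(candidatePorts(27000, 3, low, high))
}
```

Choosing a port this way only reduces collisions with ephemeral ports; it does not replace the start lock the commit describes.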
Ed Santiago bf6131780a Update system test template and README
Add a few best-practices examples, and add a whole section
describing the dos and don'ts of writing parallel-safe tests.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
Ed Santiago 6502e30cfd bats log: differentiate parallel tests from sequential
For tests run in parallel, show file number as |nnn| (vs [nnn])

Teach logformatter to distinguish the two, adding 'p' to anchors
in parallel tests. Necessary because in this scheme we run bats
twice, thus see 'ok 1' twice, and we want to differentiate them.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:37 -06:00
Ed Santiago 6b621d9571 ci: bump system tests to fastvm
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:36 -06:00
Ed Santiago bcffa9ce30 clean_setup: create pause image
Workaround for #23292, where simultaneous 'pod create' commands
will all start a podman-build of the pause image, but only
one of them will be tagged, and the others will leak <none>
images.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 11:19:36 -06:00
Ed Santiago 812c7e9436 CI: make 012-manifest parallel-safe
Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 10:35:01 -06:00
Nalin Dahyabhai 00c13afcb9 podman-manifest-remove: update docs and help output
* podman manifest remove doesn't accept references as descriptions of
  what to remove from a list or index; only use digests in the man page
* podman manifest remove only removes one thing at a time; correct the
  man page examples

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2024-09-17 11:36:12 -04:00
Paul Holzinger aa108924ea
test/system: remove wait workaround
The issue is closed and I recently fixed a number of races (bf74797c69)
in the remote attach API that sound exactly like the error
that was mentioned in issue #9597.

As such I think this works; if it starts flaking again we can revert this
or, better, fix the actual bug.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-17 17:35:18 +02:00
Paul Holzinger fbed3a01d2
wait: fix handling of multiple conditions with exited
As it turns out, things are not so simple after all...
In podman-py it was reported[1] that waiting might hang. Per our docs, a wait
on multiple conditions should exit once the first one is hit, not all
of them. However, because the new wait logic never checked if the context
was cancelled, the goroutine kept running until conmon exited, and because
we used a waitgroup to wait for all of them to finish, it blocked until
that happened.

First we can remove the waitgroup as we only need to wait for one of
them anyway via the channel. While this alone fixes the hang it would
still leak the other goroutine. As there is no way to cancel a goroutine,
all the code must check for a cancelled context in the wait loop to not
leak.

Fixes 8a943311db ("libpod: simplify WaitForExit()")
[1] https://github.com/containers/podman-py/issues/425

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-09-17 17:35:17 +02:00
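
The pattern described in the wait commit above (return on the first condition via a channel, and have every waiter watch for a cancelled context so none is leaked) can be sketched in Go roughly as follows. This is not the libpod code: `waitFor`, `waitAny`, `run`, the condition names, and the polling interval are all hypothetical, chosen only to illustrate the idea.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitFor polls check until it returns true or ctx is cancelled.
// Checking ctx.Done() in the loop is what prevents a goroutine leak.
func waitFor(ctx context.Context, name string, check func() bool, result chan<- string) {
	ticker := time.NewTicker(5 * time.Millisecond)
	defer ticker.Stop()
	for {
		if check() {
			result <- name // buffered channel, so this never blocks
			return
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// waitAny returns the name of the first condition that is met.
// There is no WaitGroup: one value from the channel is enough, and the
// deferred cancel tells the remaining waiters to stop.
func waitAny(ctx context.Context, conds map[string]func() bool) (string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	result := make(chan string, len(conds))
	for name, check := range conds {
		go waitFor(ctx, name, check, result)
	}
	select {
	case name := <-result:
		return name, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// run wires up two made-up conditions: one that becomes true after a
// short delay and one that never does.
func run() (string, error) {
	start := time.Now()
	conds := map[string]func() bool{
		"running": func() bool { return time.Since(start) > 20*time.Millisecond },
		"exited":  func() bool { return false },
	}
	return waitAny(context.Background(), conds)
}

func main() {
	name, err := run()
	fmt.Println(name, err)
}
```

With a WaitGroup in place of the channel read, `waitAny` would block until the "exited" waiter finished, which in this sketch is never.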
Matt Heon e04668c8ca Match output of Compat Top API to Docker
We were only splitting on tabs, not spaces, so we returned just a
single line most of the time, not an array of the fields in the
output of `ps`. Unfortunately, some of these fields are allowed
to contain spaces themselves, which makes things complicated, but
we got lucky in that Docker took the simplest possible solution
and just assumed that only one field would contain spaces and it
would always be the last one, which is easy enough to duplicate
on our end.

Fixes #23981

Signed-off-by: Matt Heon <mheon@redhat.com>
2024-09-17 11:34:22 -04:00
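
A minimal Go sketch of the splitting rule described in the Compat Top commit above: split on any whitespace, but assume that only the last column may contain spaces. The helper name `splitPSLine` and the sample header and row are hypothetical; the actual handler code may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPSLine splits one line of ps output into at most n fields.
// The first n-1 fields are delimited by spaces or tabs; the last field
// keeps any embedded spaces (typically the command column).
func splitPSLine(line string, n int) []string {
	fields := make([]string, 0, n)
	rest := strings.TrimSpace(line)
	for len(fields) < n-1 && rest != "" {
		i := strings.IndexAny(rest, " \t")
		if i < 0 {
			break
		}
		fields = append(fields, rest[:i])
		rest = strings.TrimLeft(rest[i:], " \t")
	}
	if rest != "" {
		fields = append(fields, rest)
	}
	return fields
}

func main() {
	// The number of header columns decides how many fields each row gets.
	header := strings.Fields("USER PID COMMAND")
	row := splitPSLine("root   1   sleep 100", len(header))
	fmt.Printf("%q\n", row) // ["root" "1" "sleep 100"]
}
```

Splitting on tabs alone would return this sample row as a single element, which matches the symptom the commit message describes.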
Ed Santiago d571ca6536 system test parallelization: enable two-pass approach
For the past two months we've been splitting system tests
into two categories: those that CAN be run in parallel,
and those that CANNOT. Much work has been done to replace
hardcoded names (mycontainer, mypod) with safename().
Hundreds of test runs, in CI and on Ed's laptop, have
proven this approach viable.

make {local,remote}system now runs in two steps: first
the serial ones, then the parallel ones. hack/bats will
now recognize the 'ci:parallel' tag and add --jobs (nprocs).

This requires some tweaking of leak_check, because there
can be umpteen tests running (affecting image/container/pod/etc
state) when any given test completes.

Rules for enabling parallelization in tests:

   * use unique container/pod/volume/network names (safename)
   * do not run 'podman rm -a' or 'rmi -a'
   * never use the -l (--latest) option
   * do not run 'podman ps/images' and expect precise output

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-09-17 09:25:02 -06:00
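
The first rule in the list above (unique container/pod/volume/network names) boils down to never hardcoding a name that a concurrently running test might also use. The system tests do this in Bats via safename(); the Go sketch below only illustrates the idea, and `uniqueName` and its random-suffix scheme are assumptions, not how safename() is actually implemented.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueName returns kind plus a random 8-hex-digit suffix, so two tests
// running at the same time are very unlikely to pick the same name.
func uniqueName(kind string) (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return fmt.Sprintf("%s-%s", kind, hex.EncodeToString(buf)), nil
}

func main() {
	ctr, err := uniqueName("c")
	if err != nil {
		panic(err)
	}
	pod, err := uniqueName("p")
	if err != nil {
		panic(err)
	}
	// A test would then create, inspect, and remove only these names,
	// never "all" containers or the "latest" one.
	fmt.Println(ctr, pod)
}
```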
openshift-merge-bot[bot] f4a08f46b7
Merge pull request #23959 from auyer/hide-secrets-from-container-inspect
Hide secrets from container inspect command
2024-09-17 13:00:18 +00:00