6771 Commits

Author SHA1 Message Date
renovate[bot] 5f66277138
chore(deps): update dependency setuptools to ~=75.3.0
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-10-29 10:33:01 +00:00
openshift-merge-bot[bot] a56cda18cf
Merge pull request #24388 from shenpengfeng/main
chore: fix some function names in comment
2024-10-29 10:32:12 +00:00
shenpengfeng 9abc17f1e1 chore: fix some function names in comment
Signed-off-by: shenpengfeng <xinhangzhou@icloud.com>
2024-10-29 17:57:31 +08:00
openshift-merge-bot[bot] 3a7e1deed4
Merge pull request #24390 from edsantiago/safename-070
CI: make 070-build.bats use safe image names
2024-10-28 14:41:28 +00:00
openshift-merge-bot[bot] 2cbb2e8c42
Merge pull request #24392 from edsantiago/parallelize-520
CI: parallelize 520-checkpoint tests
2024-10-28 13:49:13 +00:00
Ed Santiago 41a82c9a95 CI: parallelize 450-interactive system tests
This has been running reliably for weeks in #23275

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-28 07:03:29 -06:00
Ed Santiago 10d056cc5e CI: parallelize 520-checkpoint tests
This has been running reliably for weeks in #23275

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-28 07:02:51 -06:00
Ed Santiago e6b7e4ff84 CI: make 070-build.bats use safe image names
In preparation for maybe some day being able to run build tests
in parallel.

SUPER IMPORTANT NOTE! BUILD TESTS CANNOT BE PARALLELIZED YET!
buildah, when run in parallel, barfs with:

    race: parallel builds: copying...committing...creating... layer not known

Until this is fixed, podman-build can never be run in parallel.
See https://github.com/containers/buildah/issues/5674

This PR is simply cleaning things up so, if/when that day comes,
the ensuing parallelize PR will be short & sweet.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-28 06:58:26 -06:00
openshift-merge-bot[bot] 0962a1e1bf
Merge pull request #24352 from edsantiago/systemd-leak-cleanup
System tests: clean up unit file leaks
2024-10-28 12:07:27 +00:00
Paul Holzinger 64516e1b8f
test/system: add podman network reload test to distro gating
The recent fedora kernel 6.11.4 has a problem with ipv6 networks [1].
This is not a podman bug at all but rather a kernel regression. I can
reproduce the issue easily by running this test.

Given many users were hit by this add it to the distro level gating
which runs in the fedora openQA framework and then we should catch a
bad kernel like this hopefully in the future and prevent it from going
into stable.

[1] https://github.com/containers/podman/issues/24374

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-28 11:51:43 +01:00
Ed Santiago 743a0d49eb System tests: clean up unit file leaks
Quadlet tests and some systemd tests leak unit files, as
reported by 'systemctl list-units --failed'. Clean them up.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-28 04:45:04 -06:00
Paul Holzinger 6069cdda00
healthcheck: do not leak startup service
The startup service is special because we have to transition from
startup to the normal unit. And in order to do so we kill ourselves (as
we are run as part of the service). This means we always exited 1 which
causes systemd to keep us in the failed state and not remove the transient unit
unless "reset-failed" is called. As there is no process around to do
that we cannot really do this, thus make us exit(0) which makes more
sense.

Of course we could try to reset-failed the unit later but the code for
that seems more complicated than that.

Add a new test from Ed that ensures we check for all healthcheck units
not just the timer to avoid leaks. I slightly modified it to provide a
better error on leaks.

Fixes: 0bbef4b830 ("libpod: rework shutdown handler flow")
Fixes: #24351

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-25 13:47:59 +02:00
Jan Rodák afedb83917
Add Startup HealthCheck configuration to the podman inspect
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
2024-10-24 13:49:51 +02:00
Ed Santiago ee9c681f31 Buildah treadmill: improve wording in test-fail instructions
Clarify, expand, fix a typo. These are the instructions
shown when the **patching** step fails, typically when
buildah's helpers.bash is changed in a way that conflicts
with our make-it-work-in-podman patches.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-23 12:34:33 -06:00
Paul Holzinger 0cdb9b3b22
ps: fix display of exposed ports
This fixes two problems, first if a port is published and exposed it
should not be shown twice. It is enough to show the published one.

Second, if there is a huge range the ports were not grouped, causing the
output to be basically unreadable. Now we group exposed ports like we do
with the normal published ports.

Fixes #23317

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-23 15:03:30 +02:00
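The two display rules in the commit above (skip exposed ports that are already published, and collapse consecutive ports into ranges) can be sketched roughly as follows. This is an illustrative Python sketch, not podman's Go implementation; the function names and the output format are assumptions.

```python
def group_ports(ports):
    """Collapse port numbers into 'start-end' ranges of consecutive ports."""
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            # Extends the current run of consecutive ports.
            ranges[-1][1] = port
        else:
            # Starts a new run.
            ranges.append([port, port])
    return [str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges]


def exposed_only(published, exposed):
    """Exposed ports to display: those not already shown as published."""
    return group_ports(set(exposed) - set(published))


# Hypothetical container: port 80 is published and exposed,
# ports 8000-8002 are exposed only.
print(exposed_only({80}, {80, 8000, 8001, 8002}))
```

Grouping after sorting keeps the output short even for a very large exposed range, which is the readability problem the commit describes.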
David Gibson 5b131b8273 test/system: Fix spurious "duplicate tests" failures in pasta tests
As an internal consistency check, the pasta tests check for duplicated test
cases by grepping a log file for a parsed test id.  However it uses
grep -F for the purpose which will not perform an exact match, but a
substring match.  There are some tests which generate an id which is a
substring of the id for other tests, so when test order is randomised, this
can cause a spurious failure.  This can happen in practice when running
the test in parallel with very high concurrency (e.g. -j 100).

Fix this by adding the -x option to grep, which only checks for full line
exact matches.

Fixes: https://github.com/containers/podman/issues/24342

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2024-10-23 14:02:53 +11:00
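The difference between the substring match of grep -F and the whole-line match of grep -F -x can be sketched in Python. The test IDs below are hypothetical and only illustrate one ID being a substring of another.

```python
def substring_match(logged_ids, test_id):
    """Like `grep -F`: true if test_id occurs anywhere in any logged line."""
    return any(test_id in line for line in logged_ids)


def exact_line_match(logged_ids, test_id):
    """Like `grep -F -x`: true only if a whole line equals test_id."""
    return any(line == test_id for line in logged_ids)


# Hypothetical IDs: the candidate is a substring of an ID already logged,
# which can happen when the longer-named test runs first.
log = ["tcp-port-forward-large"]
candidate = "tcp-port-forward"

print(substring_match(log, candidate))   # True: spurious "duplicate"
print(exact_line_match(log, candidate))  # False: no real duplicate
```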
Miloslav Trmač 6fd0e227b4 Improve "podman load - from URL"
Don't assume that the loaded image will be deduplicated
with the server image.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-22 19:36:14 +02:00
Miloslav Trmač 77ef28c14f Try to repair c/storage after removing an additional image store
The additional image store feature assumes that images / layers
in the additional store never go away, while we do remove it after
this test. Try to repair the store.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-22 19:36:03 +02:00
Miloslav Trmač 1d7ec1ef5f Use the config digest to compare images loaded/pulled using different methods
Historically, non-schema1 images had a deterministic image ID == config digest.
With zstd:chunked, we don't want to deduplicate layers pulled by consuming the
full tarball and layers partially pulled based on TOC, because we can't cheaply
ensure equivalence; so, image IDs for images where a TOC was used differ.

To accommodate that, compare images using their configs digests, not using image IDs.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-22 19:36:02 +02:00
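A minimal sketch of comparing images by config digest rather than by image ID. The image records and their IDs are hypothetical; the only fact relied on is that a config digest is the SHA-256 of the config blob's bytes.

```python
import hashlib


def config_digest(config_bytes):
    """Digest of an image config blob, in the usual 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(config_bytes).hexdigest()


def same_image(a, b):
    """Compare two image records by config digest, ignoring their IDs."""
    return a["config_digest"] == b["config_digest"]


# Hypothetical records: the same config obtained two different ways,
# ending up with different image IDs.
config = b'{"architecture": "amd64", "os": "linux"}'
pulled = {"id": "id-from-toc-pull", "config_digest": config_digest(config)}
loaded = {"id": "id-from-tarball", "config_digest": config_digest(config)}

print(pulled["id"] == loaded["id"])  # False
print(same_image(pulled, loaded))    # True
```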
Miloslav Trmač bf8f2b5551 Simplify the additional store test
When looking up the current-store image ID, do that
from the same output where we verify that the ID is from the
current store, instead of listing images twice.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-22 19:15:46 +02:00
Miloslav Trmač 3bc6072142 Fix the store choice in "podman pull image with additional store"
The test got the stores RW status backwards.

Before zstd:chunked, both image IDs should be the same, so this used
to make no difference.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-22 19:15:46 +02:00
openshift-merge-bot[bot] 57095a9e62
Merge pull request #24335 from giuseppe/test-set-soft-ulimit
test: set soft ulimit
2024-10-22 11:09:41 +00:00
openshift-merge-bot[bot] f4227e887c
Merge pull request #24275 from Luap99/wait-condition
libpod API: only return exit code without conditions
2024-10-22 10:53:12 +00:00
Giuseppe Scrivano 94878af151
test: set soft ulimit
when the current soft limit is higher than the new value, ulimit fails
to set the hard limit as (tested on Rawhide):

[root@rawhide ~]# ulimit -n -H 1048575
-bash: ulimit: open files: cannot modify limit: Invalid argument

to avoid the problem, set also the soft limit:

[root@rawhide ~]# ulimit -n -H
12345678
[root@rawhide ~]# ulimit -n -H 1048575
-bash: ulimit: open files: cannot modify limit: Invalid argument
[root@rawhide ~]# ulimit -n -SH 1048575
[root@rawhide ~]# ulimit -n -H
1048575

commit 71d5ee0e04 introduced the issue.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-22 12:05:07 +02:00
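The failure shown in the transcript follows from the rule that a soft limit may never exceed the hard limit. A rough sketch, assuming (as the commit message states) that the current soft limit is higher than the new value; the helper names are hypothetical and RLIM_INFINITY is ignored for simplicity.

```python
def hard_only_request(current_soft, new_hard):
    """Pair requested by `ulimit -H N`: soft stays as is, hard changes."""
    return (current_soft, new_hard)


def soft_and_hard_request(new_value):
    """Pair requested by `ulimit -SH N`: both limits change together."""
    return (new_value, new_value)


def is_valid(pair):
    """setrlimit(2) rejects a soft limit above the hard limit (EINVAL)."""
    soft, hard = pair
    return soft <= hard


# Assumed starting point: the soft limit equals the old hard limit
# from the transcript.
current_soft = 12345678
print(is_valid(hard_only_request(current_soft, 1048575)))  # False
print(is_valid(soft_and_hard_request(1048575)))            # True
```

In a real process the pair would be passed to something like Python's resource.setrlimit(resource.RLIMIT_NOFILE, pair), which raises ValueError for an invalid pair.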
Miloslav Trmač fdc9feea0e Fix 330-corrupt-images.bats in composefs test runs
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-18 23:44:04 +02:00
openshift-merge-bot[bot] 290d94d3c0
Merge pull request #24300 from edsantiago/flake-fix-checkpoint-test
CI: e2e: fix checkpoint flake
2024-10-18 16:42:44 +00:00
Paul Holzinger 67e0fa8b89
quadlet: add default network dependencies to all units
There is no good reason for the special case, kube and pod units
definitely need it. Volume and network units maybe not but for
consistency we add it there as well. This makes the docs much easier to
write and understand for users as the behavior will not differ.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-18 14:01:22 +02:00
Paul Holzinger 57b022782b
quadlet: ensure user units wait for the network
As documented in the issue there is no way to wait for system units from
the user session[1]. This causes problems for rootless quadlet units as
they might be started before the network is fully up. While this was
always the case and thus was never really noticed, the main thing that
triggered a bunch of errors was the switch to pasta.

Pasta requires the network to be fully up in order to correctly select
the right "template" interface based on the routes. If it cannot find a
suitable interface it just fails and we cannot start the container,
understandably leading to a lot of frustration from users.

As there is no sign of any movement on the systemd issue we work around
it here by using our own user unit that checks if the system session
network-online.target is ready.

Now for testing it is a bit complicated. While we do now correctly test
the root and rootless generator since commit ada75c0bb8 the resulting
Wants/After= lines differ between them and there is no logic in the
testfiles themselves to say if root/rootless to match specifics. One idea
was to use `assert-key-is-rootless/root` but that seemed like more
duplication for little reason so use a regex and allow both to make it
pass always. To still have some test coverage add a check in the system
test to ask systemd if we did indeed have the right dependencies where
we can check for exact root/rootless name match.

[1] https://github.com/systemd/systemd/issues/3312

Fixes #22197

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-18 11:43:48 +02:00
Paul Holzinger ada75c0bb8
test/e2e: test quadlet with and without --user
This seems to be a testing gap, we need to test both for full coverage.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-17 15:53:10 +02:00
Ed Santiago fa920f54c7 CI: e2e: fix checkpoint flake
Two flakes seen in the last three months. One of them was in
August, so it's not related to ongoing criu-4.0 problems.

Suspected cause: race waiting for "podman run --rm" container
to transition from stopped to removed.

Solution: allow a 5-second grace period, retrying every second.

Also: add explanations to the Expect()s, remove unnecessary
code, and tighten up the CID check.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-17 06:40:33 -06:00
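The "5-second grace period, retrying every second" approach can be sketched as a small polling helper. This is an illustrative Python sketch, not the Go test code; wait_until and container_is_gone are hypothetical names.

```python
import time


def wait_until(condition, grace_seconds=5, interval=1.0, sleep=time.sleep):
    """Poll condition() until it is true or the grace period has elapsed.

    Returns True if the condition was met, False on timeout.
    """
    waited = 0.0
    while True:
        if condition():
            return True
        if waited >= grace_seconds:
            return False
        sleep(interval)
        waited += interval


# Hypothetical stand-in for "the --rm container has been removed":
# it reports success on the third check.
checks = iter([False, False, True])


def container_is_gone():
    return next(checks)


# Short interval here only so the demo finishes quickly.
print(wait_until(container_is_gone, grace_seconds=0.05, interval=0.01))
```

Passing the sleep function as a parameter keeps the helper easy to test without real delays.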
Ed Santiago fe96c843bf APIv2 test fix: image history
I'm assuming this was buildah#5595: the COMMENT field moved around.
Deal with it, and add a few more checks while we're at it.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-16 10:15:19 -06:00
Ed Santiago 67e39c1ec5 pasta udp tests: new bytecheck helper
...for debugging #24147, because "md5sum mismatch" is not
the best way to troubleshoot bytestream differences.

socat is run on the container, so this requires building a
new testimage (20241011). Bump to new CI VMs[1] which include it.

 [1] https://github.com/containers/automation_images/pull/389

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-16 10:15:19 -06:00
renovate[bot] 927cb7624c
Update dependency setuptools to ~=75.2.0
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-10-16 13:48:10 +00:00
Ed Santiago 1ddb15c81f System tests: safer pause-image creation
The current mypod hack breaks down when running individual tests:

    $ hack/bats 010   <<< barfs because it does not want pause-image!

Reason: Bats does not provide any official way to tell if tests
are being run in parallel.

Workaround: use an undocumented way.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-16 06:02:23 -06:00
openshift-merge-bot[bot] a2eb5429b3
Merge pull request #24264 from edsantiago/try-try-again
CI: fix changing-rootFsSize flake
2024-10-15 22:05:42 +00:00
openshift-merge-bot[bot] d5be88e0c2
Merge pull request #24228 from giuseppe/do-not-lower-rlimits
podman: do not set rlimits to the default value
2024-10-15 22:02:52 +00:00
Paul Holzinger 768aaadca1
libpod API: only return exit code without conditions
The special handling to return the exit code after the container has
been removed should only be done if there are no special conditions
requested. If a user asked for running or any other state returning the
exit code immediately with a success response is just wrong. We only
want to allow that so the remote client can fetch the exit code without
races.

Fixes b3829a2932 ("libpod API: make wait endpoint better against rm races")

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-15 18:31:36 +02:00
Paul Holzinger b0f2ebbe9d
test/e2e: fix default signal exit code test
By default golang programs exit 2 on special exit signals that can be
caught and produce a stack trace. However this is behavior that can be
modified via GOTRACEBACK=crash[1], in that case it does not exit(2) but
rather sends itself SIGABRT so the parent sees the signal exit and our
test sees that as exit code 134, 128 + 6 (SIGABRT), like most shells do.

As it turns out GOTRACEBACK=crash is the default mode on all fedora and
RHEL rpm builds as they patch the build with a special
"rpm_crashtraceback" go build tag.

While that change is old and has existed for a very long time it was never
caught until commit 5e240ab1f5, which switched the old ExitWithError()
check that accepted anything > 0, to just accept 2. And as CI only tests
upstream builds that are built without rpm_crashtraceback we did not
catch it in CI either. Only once a user actually ran a distro build against
the source e2e test did it fail.

I'd like to highlight that running distro builds against upstream e2e
tests is not something we really support or plan to support but given
this is an easy fix I decided to just fix it here as any user with
GOTRACEBACK=crash set would face the same issue.

While I touch this test, remove the unnecessary RestoreArtifact() call,
which is not needed at all as we do nothing with the image and it just
slows the test down for no reason.

[1] https://pkg.go.dev/runtime#section-sourcefiles

Fixes #24213

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-15 15:17:50 +02:00
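The "128 + 6" arithmetic in the commit above is the convention most shells use to report a process killed by a signal. A minimal sketch; the function name is hypothetical and the signal number 6 for SIGABRT is the Linux value.

```python
def shell_exit_status(signal_number):
    """Exit status most shells report for a process killed by a signal."""
    return 128 + signal_number


SIGABRT = 6  # signal number on Linux

print(shell_exit_status(SIGABRT))  # 134
```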
Ed Santiago 1b57dcab61 CI: fix changing-rootFsSize flake
(Second try). Use an airgapped image in the inspect-data tests.

Fixes: #23756

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-15 05:14:49 -06:00
Giuseppe Scrivano 71d5ee0e04
podman: do not set rlimits to the default value
since the effect would be to lower the rlimits when their definition
is higher than the default value.

The test doesn't fail on the previous version, unless the system is
configured with a nofile ulimit higher than the default value.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2317721

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-11 23:04:27 +02:00
openshift-merge-bot[bot] d512e44147
Merge pull request #24227 from Luap99/ci-image-update
cirrus: update CI images
2024-10-10 17:25:39 +00:00
Paul Holzinger 4e3a03795d
test/e2e: skip some Containerized checkpoint tests
They no longer work in the latest image update, it is not clear why and
I do not have the time to debug that stuff. I opened #24230 to track it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-10 17:44:09 +02:00
Paul Holzinger fe404959ed
test: update timezone checks
In debian EST and MST7MDT are gone by default and moved to a special
package[1]. Instead of also installing that in the images, let's use
different timezones in the test.

[1] 42c0008f86

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-10 17:44:08 +02:00
Paul Holzinger f517e52167
test/e2e: try debug potential pasta issue
Run pasta with --trace and a log file to see if the hangs are caused by
pasta not correctly closing connections as assumed in #24219.

As the log is super verbose do not log it by default so I added some
extra logic to make sure it is only logged when the test fails.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-10 12:00:25 +02:00
Ed Santiago 38803713d6 CI: quadlet system tests: use airgapped testimage
This command sequence causes SizeRootFs to change on foo:

   podman tag foo newimagename
   podman save ... newimagename
   podman load ...

Solution: get foo completely out of the picture. Use an
airgapped image: new image, new digest, new everything.

Fixes: #23756

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-09 14:11:00 -06:00
openshift-merge-bot[bot] 5890190c59
Merge pull request #24194 from lambinoo/quadlet-disable-default-dependencies
Allow removing implicit quadlet systemd dependencies
2024-10-09 16:23:31 +00:00
Farya L. M bac655a6b1 Allow removing implicit quadlet systemd dependencies
Quadlet inserts network-online.target Wants/After dependencies to ensure pulling works.
Those systemd statements cannot be subsequently reset.

In the cases where those dependencies are not wanted, we add a new
configuration item called `DefaultDependencies=` in a new section called
[Quadlet]. This section is shared between different unit types.

fixes #24193

Signed-off-by: Farya L. Maerten <me@ltow.me>
2024-10-09 14:48:05 +02:00
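A rough sketch of what a unit file using the new option might look like, and how a reader could check it. The [Quadlet] section and DefaultDependencies= key come from the commit message; the value false, the image name, and the use of Python's configparser (which is not Quadlet's parser and only handles simple unit files) are assumptions for illustration.

```python
import configparser

# Hypothetical .container unit; the image name is a placeholder.
UNIT_TEXT = """\
[Container]
Image=example.invalid/app:latest

[Quadlet]
DefaultDependencies=false
"""


def default_dependencies_enabled(unit_text):
    """Return False only if [Quadlet] DefaultDependencies= turns them off."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(unit_text)
    # Assumed default: implicit dependencies stay on when the key is absent.
    return parser.getboolean("Quadlet", "DefaultDependencies", fallback=True)


print(default_dependencies_enabled(UNIT_TEXT))
```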
Ed Santiago e7833d52cf 055-rm test: clean up a test, and document
There's an important reason why the healthcheck container in 055-rm
test uses 'sleep infinity' and not 'top'. Document it.

And, the test itself wasn't actually working as intended. Make
it safer by confirming that the container actually enters
the "stopping" state.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-10-07 15:22:49 -06:00
openshift-merge-bot[bot] 6b0ad8269c
Merge pull request #24182 from containers/renovate/golang.org-x-tools-0.x
fix(deps): update module golang.org/x/tools to v0.26.0
2024-10-07 16:59:17 +00:00
Paul Holzinger 45df394072
server: fix url parsing in info
When we are activated by systemd the code assumed that we had a valid
URL which was not the case so it failed to parse the URL which causes
the info call to fail all the time.
This fixes two problems: first, add the scheme to the systemd activated
listener URL so it can be parsed correctly; second, simply do not
parse it as a URL, as all we care about in the info call is whether it is
unix and the file path exists.

Fixes #24152

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-10-07 12:03:56 +02:00
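Both halves of the fix can be sketched in Python: prefixing a bare socket path with a scheme so it parses as a URL, and checking for a unix listener without URL parsing at all. This is not the Go server code; the helper names are hypothetical and the socket path is only an example.

```python
from urllib.parse import urlparse


def with_unix_scheme(listener_address):
    """Prefix a bare socket path with unix:// so it parses as a URL."""
    if "://" in listener_address:
        return listener_address
    return "unix://" + listener_address


def unix_socket_path(address):
    """Return the socket path for a unix listener, else None.

    Uses a plain prefix check instead of URL parsing.
    """
    prefix = "unix://"
    if address.startswith(prefix):
        return address[len(prefix):]
    return None


bare = "/run/podman/podman.sock"  # example path
print(urlparse(bare).scheme)                    # empty: no scheme present
print(urlparse(with_unix_scheme(bare)).scheme)  # unix
print(unix_socket_path(with_unix_scheme(bare)))
```

A caller that also needs to know whether the socket file exists could follow up with os.path.exists on the returned path.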