Commit Graph

1622 Commits

Author SHA1 Message Date
openshift-merge-bot[bot] f79ede86c6
Merge pull request #22914 from Luap99/start-stopped
libpod: do not reuse networking on start
2024-06-11 19:18:55 +00:00
Daniel J Walsh ad8fc6a74b
--squash --layers=false should be allowed
This is the same as what --squash-all is doing, and we already support
--squash with --layers=true since this is the default.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2024-06-10 13:24:05 -04:00
Paul Holzinger a9de888a15
libpod: do not resuse networking on start
If a container was stopped and we try to start it before we called
cleanup it tried to reuse the network which caused a panic as the pasta
code cannot deal with that. It is also never correct as the netns must
be created by the runtime in case of custom user namespaces used. As
such the proper thing is to clean the netns up first.

Also change a e2e test to report better errors. It is not directly
related to this chnage but it failed on v1 of this patch so we noticed
the ugly error message it produced. Thanks to Ed for the fix.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-07 17:50:28 +02:00
openshift-merge-bot[bot] 42ffa4db43
Merge pull request #22886 from Luap99/fast-system-test-3
test/system: make some tests faster part 3
2024-06-05 13:19:00 +00:00
Paul Holzinger e8ea1e7632
libpod: do not leak systemd hc startup unit timer
This fixes a regression added in commit 4fd84190b8, because the name was
overwritten by the createTimer() timer call the removeTransientFiles()
call removed the new timer and not the startup healthcheck. And then
when the container was stopped we leaked it as the wrong unit name was
in the state.

A new test has been added to ensure the logic works and we never leak
the system timers.

Fixes #22884

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 18:03:46 +02:00
Paul Holzinger 350dfabf66
test/system: speed up podman ps --external
The buildah buil kill trick is bad as we have to sleep and wait to aboid
flakes which takes time. Instead it is possible to redo this build part
manually with buildah commands. It is not trival and harder to
understand but it safes 2-3s so I think it is worth it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:01 +02:00
Paul Holzinger 8fa1ffbbec
test/system: speed up podman network connect/disconnect
Combine multiple inspect --format into one, it is not much but is makes
it faster by a few 100 ms.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:01 +02:00
Paul Holzinger 8640ce998c
test/system: speed up podman network reload
First, as root don't wait 5s for the timeout, 1s is enough. Also switch
to use the curl --max-time option instead, that way we know we do not
kill curl before it had the chance to do anything possibly.

Second, combine podman inspect commands into one. This makes the test
faster by over one second as we safe a bunch of podman commands.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 609146fb75
test/system: speed up quadlet - pod simple
Another case of contianer does not exit with SIGTERM so we waste 10s.
Now because our contianer reacts to sigterm and exits 0 the systemd unit
status changed to inactive from failed.
And most importantly add Notify=yes because the socat call always failed
as the default is to not leak the notify socket into the container.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 7f3bb2d238
test/system: speed up podman parallel build should not race
It is not clear at all why the count of 30 was choosen, this seems a
lot and of course takes quite a while. The test takes over 16s in CI.
To speed it up reduce the count to 10. I think this should still be good
enough to ensure there are no races IMO.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 8852614792
test/system: speed up podman cp dir from host to container
It makes the test a bit uglier but I cannot see a good way to sped this
up otherwise. I chnaged the created test to only start/stop the
contianer once instead of every test case iteration. This makes it about
2s faster locally.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 8d3f65b026
test/system: speed up podman build - workdir, cmd, env, label
Overall just combine several container runs into one. Every RUN
instruction will run a new container which is quite expensive so chain
the commands together. The same for podman run's.
I could have combined a bit more but I think this leaves it still
readable. This speeds up the test about 4s locally from 8s before.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 471e001c7f
test/system: speed up podman --log-level recognizes log levels
Use podman version over podman info because info has to query a lot of
internal state, e.g. contianer and image count, so it is slower than a
simple info. This speeds the test up by about 600ms locally.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:24:00 +02:00
Paul Holzinger 26bdb5d110
test/system: remove obsolete debug in net connect/disconnect test
Issue #11825 was fixed a long time ago. Also we no longer test
cni/dnsname so there is really no point in having this.
Speeds up the test by 1 second.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:23:59 +02:00
Paul Holzinger c466377013
test/system: speed up quadlet - basic
Another case of contianer does not exit with SIGTERM so we waste 10s.
Now because our contianer reacts to sigterm and exits 0 the systemd
unit status changed to inactive from failed.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:23:59 +02:00
Paul Holzinger 6b021dd4ba
test/system: speed up user namespace preserved root ownership
We don't have two loop twice for the stat call we can just stat both
dirs at once. This means we only have to create half of the containers
so the test is twice as fast.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-06-04 16:23:59 +02:00
Giuseppe Scrivano 900e29549a
libpod: do not move podman with --cgroups=disabled
The expectation with --cgroups=disabled is that the current cgroup is
used by the container.

Currently the --cgroups=disabled is passed directly to the OCI
runtime, but it doesn't stop Podman from creating a new cgroup when it
doesn't own the current one.

Closes: https://github.com/containers/podman/issues/20910

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-30 16:59:30 +02:00
openshift-merge-bot[bot] 846d717c0b
Merge pull request #22826 from Luap99/fast-system-test-2
test/system: make some tests faster part 2
2024-05-29 12:59:09 +00:00
Paul Holzinger ad661b5b31
test/system: speed up kube generate tmpfs on /tmp
The command does not react on sigterm, so kube down needs to wait 10s.
To fix it first use a command that does but also write the yaml
directly instead of doing the podman create && kube generate dance.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:01:16 +02:00
Paul Holzinger bff0697de8
test/system: speed up podman kube play tests
use a command that stops on SIGTERM not sleep, that way the tests can
continue to use podman kube down without waiting for the full stop
timeout every time.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:01:16 +02:00
Paul Holzinger 67356a71b3
test/system: speed up podman shell completion test
This test is by far the slowest one taking over minute, the reason is
that it is checking every single podman command for shell completions.
The test is useful but it does not need to check the "..." argument 3
times. Test a second time to make sure not only the first arg is
completed. This change makes it about 15 seconds faster.

Long term we should get this test out of the main system tests together
with other cli only tests as they do not need to run on each OS, etc...

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:01:15 +02:00
Paul Holzinger 01642c64ea
test/system: simplify test signal handling in containers
The current logic used podman logs I don't understand way, all we care
about is the container output and we can just read the same with a
attached podman run, of course we have to move it into the background
but it did the some with logs.

This also allows us to remove the extra log-driver checks and because
podman logs seems to be much slower than the extra run we safe over 10s
with this change.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:01:15 +02:00
Paul Holzinger 6fa064f991
test/system: speed up podman container rm ...
Use only one retry and a short stop timeout to speed them up. I am not
sure if this will cause flakes, I have not seen any after trying for
some time so I think this works just as well. And is about 2-3 seconds
faster for both tests.

If it does start to flake we can revert this commit again or write the
test differently.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:00:51 +02:00
Paul Holzinger 37120bbe80
test/system: speed up podman ps - basic tests
Do not wait 5 seconds, just stop the container directly.
This speeds up the test by more than 4 seconds.

One could make the case here that we want to check podman wait but
there are so many other podman wait tests that it should not matter.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:00:39 +02:00
Paul Holzinger 4f3c691087
test/system: speed up read-only from containers.conf
Instead of iterating over all tmp dirs and creating test containers for
each one we can just pass all files to one touch call. With that we have
to create much less containers while still checking the same thing. This
speeds up the test by about 4 seconds.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:00:39 +02:00
Paul Holzinger edf6f1814e
test/system: speed up podman logs - multi ...
The test used sleep to synchronize log output between both containers
which is slow. There is actually no way to guarantee the ordering on
the reading side so just remove the sleep's and check the the lines
within the same container are in the right order.

Trying to preserve the orignal ordering is just not possible if we speed
up the test as it would flake to often.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 11:00:30 +02:00
Paul Holzinger fe05e25edf
test/system: speed up podman run --name
There is no reason for this check to wait 4 seconds for the container to
run, instead make sure to have a running process and then stop it
directly with -t0 not have any delay.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-29 10:39:51 +02:00
Ed Santiago 1ae05473c1 Debian: switch to crun
As agreed in Planning meeting of 2024-03-20, Podman 5.x will
drop support for cgroups v1 and for runc. Make it so.

CI images built in https://github.com/containers/automation_images/pull/338

Signed-off-by: Ed Santiago <santiago@redhat.com>
2024-05-28 16:34:39 -06:00
openshift-merge-bot[bot] aca5a7b036
Merge pull request #22821 from Luap99/fast-system-test
test/system: make some tests faster part 1
2024-05-28 14:44:40 +00:00
openshift-merge-bot[bot] af8fe2b75e
Merge pull request #22764 from giuseppe/give-more-time-to-healthcheck-status-change
libpod: wait another interval for healthcheck
2024-05-28 13:21:43 +00:00
Paul Holzinger 1093ebb72b
test/system: speed up podman generate systemd - envar
This container did not react to sigterm thus we always waited 10s for it
to stop. Also do not wait 2s for the logs instead use a retry loop.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-28 13:53:55 +02:00
Paul Holzinger 15606148e5
test/system: speed up podman-kube@.service template
The test does a normal stop on a command that does not react to sigterm.
As I cannot fix the system stop logic use a command which does. This
safes us 10s as it no longer waits for the timeout.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 18:37:18 +02:00
Paul Holzinger 42f43fb3a3
test/system: speed up kube play healthcheck initialDelaySeconds
Both tests take 10s longer than they need to because they run the sleep
command int he container which does not react to sigterm, as such podman
waits 10s before killing it with sigkill.

To fix it just stop them with podman rm -fa -t0 to avoid the wait and do
not use podman kube down as we cannot set a timeout there. podman kube
down is still covered in many other tests so this is not an issue.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 18:23:25 +02:00
Paul Holzinger 9e321aafda
test/system: speed up exit-code propagation test
IMO it is not important to cover each case with each sdnotify policy, to
speed them up we run all the exit code cases only once just twice for
each policy while switching the sdnotify policy between each case. This
way we safe 50% of runs and should still have sufficient coverage.

Before it took around 24 seconds, with this it is around 12 seconds now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 18:10:07 +02:00
Paul Holzinger 94ba2cf1a1
test/system: speed up "podman run --timeout"
There is really no point in waiting 10s for the kill, let's use 2 this
should be good enough to observe the timing.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 17:51:59 +02:00
Paul Holzinger 82bffb9c50
test/system: fix slow kube play --wait with siginterrupt
This test waits 15 seconds to send sigterm for no good reason, we can
just make the timeout shorter. Also make sure the podman command quit on
sigterm by looking for the output message.

While at it fix the tests to use $PODMAN_TMPDIR not /tmp and define the
yaml in the test instead of using the podman create && podman kube
generate && podman rm way to create the yaml as it is a bit slower as we
have to call three podman commands for it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 17:37:20 +02:00
Paul Holzinger 9a7ffaa077
test/system: speed up podman events tests
Merge two podman event tests into one to speed them up as they did
mostly the same anyway. This way we only have to do the setup/teardown
once and only run one container.

Second, add the --since option because reading the journal can be slow
if you have thousands of event entries. This is not so critical in CI as
we run on fresh systems but on local dev machines I have almost 100k
events in the journal so parsing all of them makes this test slow (like
30s), with this change I can get it under 1s.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 17:14:28 +02:00
Paul Holzinger 9de1d4f653
test/system: speed up "podman auto-update using systemd"
Defining a timer with a fixed interval is not a good idea as we first
have to wait until the timer triggers, while the interval was every two
seconds it means that we have to wait at least 2s for it to start.
However much worse it means it triggers the unit over and over, this
seems to cause some soft of race with the output check. I have seen
this test run 10-60s which does not make much sense.

Switching the timer to trgger once on start seem to make the test run
consistently in 7s locally for me so this is much better.

There still is the question if we really have to test this at all on
each upstream PR but I left it for now.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 16:20:05 +02:00
Paul Holzinger a09152ab28
test/system: remove podman wait test
It takes over 10 seconds for this test as it uses --wait 5 twice which
runs into the timeout. IMO this tests is just redundant as it is already
covered in the e2e tests much better. Thus remove it here.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-27 15:54:56 +02:00
Giuseppe Scrivano 7f567a4e51
tests: disable tests affected by a race condition
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-27 13:02:26 +02:00
openshift-merge-bot[bot] eee0dc256a
Merge pull request #22727 from mheon/chown_all_the_time
Always chown volumes when mounting into a container
2024-05-23 12:34:07 +00:00
Matthew Heon 046c0e5fc2 Only stop chowning volumes once they're not empty
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount into a
dozen containers, but never add content, the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.

For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.

In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.

Fixes #22571

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2024-05-22 17:47:01 -04:00
Giuseppe Scrivano d094a9f18e
podman: fix --sdnotify=healthy with --rm
Now WaitForExit returns the exit code as stored in the db instead of
returning an error when the container was removed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-22 21:34:38 +02:00
David Gibson d418391ce6 test, pasta: Ignore deprecated addresses in tests
The default_addr shell function in test/system/helpers.network is used to
get the host's default address, which is used in a number of pasta
networking tests.  However, in certain circumstances it can incorrectly
pick a deprecated address as the primary address.  Correct it to exclude
those.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2024-05-22 17:36:33 +10:00
Paul Holzinger fb2ab832a7
fix incorrect host.containers.internal entry for rootless bridge mode
We have to exclude the ips in the rootless netns as they are not the
host. Now that fix only works if there are more than one ip one the
host available, if there is only one we do not set the entry at all
which I consider better as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.

The test is bit more compilcated as I would like, however it must deal
with both cases one ip, more than one so there is no way around it I
think.

Fixes #22653

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-17 12:28:44 +02:00
openshift-merge-bot[bot] f7a30461e0
Merge pull request #22658 from giuseppe/libpod-wait-for-healthy-on-main-thread
libpod: wait for healthy on main thread
2024-05-16 15:59:54 +00:00
Paul Holzinger cb905f59ea
test/system: fix documentation
First, point users to hack/bats for running them locally. Second, remove
TODO.md as it doesn't contain any helpful information. Basically all the
missing tests there have been added so this does not serve any purpose
and is missleading.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-15 13:08:39 +02:00
Giuseppe Scrivano b06c58b4a5
libpod: wait for healthy on main thread
wait for the healthy status on the thread where the container lock is
held.  Otherwise, if it is performed from a go routine, a different
thread is used (since the runtime.LockOSThread() call doesn't have any
effect), causing pthread_mutex_unlock() to fail with EPERM.

Closes: https://github.com/containers/podman/issues/22651

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-14 22:55:02 +02:00
Nalin Dahyabhai c46884aa93 `podman events`: check for an error after we finish reading events
The function that's handing us events will return an error after closing
the channel over which it's sending events, and its caller (in its own
goroutine) will then send that error over another channel.

The logic that started the goroutine is likely to notice that the events
channel is closed before noticing that the error channel has a result
for it to read, so any error that would have been communicated would be
lost.

When we finish reading events, check if the reader returned an error
before telling our caller that there was no error.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2024-05-14 13:18:51 -04:00
Paul Holzinger 2a609b0f74
rootless: fix reexec to use /proc/self/exe
Under some circumstances podman might be executed with a different argv0
than the actual path to the podman binary. This breaks the reexec logic
as it tried to exec argv0 which failed.

This is visible when using podmansh as login shell which get's the
special -podmansh on argv0 to signal the shell it is a login shell.

To fix this we can simply use /proc/self/exe as command path which is
much more robust and the argv array is still passed correctly.

Fixes #22672

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2024-05-14 12:02:19 +02:00