Where possible, use safename and add ci:parallel tags.
One test runs "podman kill -a", which would be unwise to run
in parallel with other tests.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Add 'ci:parallel' tags to a few easy places. And, two
small easily-reviewed safename or random-port additions.
These have been working fine in #23275. I want to stop
carrying them there so I can work on simplifying my PR.
Signed-off-by: Ed Santiago <santiago@redhat.com>
- replace random_string with safename in container/network names
- add ci:parallel tags where possible.
- where not possible, add explanations
- fix a userns leak
Signed-off-by: Ed Santiago <santiago@redhat.com>
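As an illustration of the safename pattern above (run_podman and safename are the helpers from test/system/helpers.bash; the test body and ci:parallel tag placement are a sketch, not code from the PR):
```
# bats test_tags=ci:parallel
@test "network create with a parallel-safe name" {
    # safename yields a unique token per call, so parallel test runs
    # cannot collide on object names the way hardcoded names could.
    local netname="net-$(safename)"
    run_podman network create "$netname"
    run_podman network rm "$netname"
}
```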
Workaround (NOT A FIX) for pasta issue #23482, wherein
podman logs includes a waitpid: ESRCH warning. Consensus
seems to be that this is a bug in socat.
Signed-off-by: Ed Santiago <santiago@redhat.com>
- fix a few missing safenames
- eliminate 'container rm -a'
- when running ps, do substring match, not exact
- where possible, add ci:parallel tags
- when not possible, explain
Also, fix a completely broken inspect test
Signed-off-by: Ed Santiago <santiago@redhat.com>
This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.
In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.
Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.
Fixes #22852
Fixes RHEL-39513
Signed-off-by: Matt Heon <mheon@redhat.com>
By default wait only waits for the exit of a container; there is really
no way to make it also wait for the removal when the container was
created with --rm. I thought I found a clever way in 8a943311db, but it
is not race free. While it works most of the time, any other
parallel process might call syncContainer() before the cleanup process
takes the lock and removes the container. As such the wait hack of only
updating the state and not syncing the exit file did not work, so we can
drop it.
However, the test wants to wait for the removal done by the cleanup
process, and we can already pass --condition=removing to do this; but
that throws an error if the ctr was already removed instead of counting
it as success, so fix that as well.
Fixes #23640
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The usual, safename instead of hardcoded names or random_string.
And remove some rmi statements: we no longer clean up pause_image.
Been working great in #23275 all week.
Signed-off-by: Ed Santiago <santiago@redhat.com>
...by using a crude port lock-and-reserve mechanism. This is
a small cherrypick from code that has been working in #23275
over dozens of CI runs. Am separating out into a small PR
because it's stable, harmless to serial runs, and will
simplify the eventual review of #23275.
Closes: #23488
Signed-off-by: Ed Santiago <santiago@redhat.com>
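The message doesn't show the mechanism itself; one crude way to lock and reserve a port rests on an atomic mkdir (function name, port range, and lock directory here are hypothetical):
```
# Pick a free port and atomically claim it so no parallel test reuses it.
reserve_port() {
    local lockdir=${BATS_SUITE_TMPDIR:-/tmp}/port-locks
    mkdir -p "$lockdir"
    local port
    for port in $(shuf -i 20000-30000); do
        # mkdir is atomic: exactly one caller can create each claim dir.
        if mkdir "$lockdir/$port" 2>/dev/null; then
            echo "$port"
            return 0
        fi
    done
    return 1
}
```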
Use safename instead of hardcoded object names. Requires moving
a test table down, into the function itself instead of global,
because the table needs to know object names.
Also: sneak in a workaround for dealing with quay flakes (in
image search). The local registry is allowing almost all tests
to pass even when quay is down, but this one test still needs
to hit quay.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Now that on-failure exits right away, the test is racy: the
RestartCount is not yet at the value we expect because the container is
still restarting in the background. As such, add a timer-based approach.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The current code did several complicated state checks that simply do not
work properly on a fast-restarting container. It used a special case for
--restart=always but forgot to take care of --restart=on-failure, which
always hung for 20s until it ran into the timeout.
The old logic also used to call CheckConmonRunning() but synced the
state first, which means it may check a new conmon every time and thus
miss exits.
The new code is much simpler: check the conmon pid, and if it is
no longer running, check the exit file and get the exit code.
This is related to #23473 but I am not sure if this fixes it because we
cannot reproduce.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When will I learn not to dismiss something as "easy"?
Anyhow, this doesn't actually change anything parallel-wise
but it does reduce a race condition seen on heavily-loaded
slow systems, wherein a container goes into unhealthy before
we want it to. This version isn't perfect; I don't think
there's an ideal fix for this.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Only one test can be parallelized. Do so, and add a comment
to the other one explaining why it can't be.
Also, add some missing error-message checks.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Very few changes needed, all of them simple.
It is impossible to parallelize this entire file, because of "stop -a".
Add tags to tests that can be parallelized, and comments to those
that can't.
Signed-off-by: Ed Santiago <santiago@redhat.com>
When the cidfile does not exist and --ignore is set, the cli parser skips
the file without error and we call into the backend code without any
names at all. This should logically be a NOP, but on remote it caused all
containers to be returned which caused podman stop to stop everything in
this case.
Fixes #23554
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Do not rely on an arbitrary delay in order to ensure the port was bound
in the container. Instead this approach checks if the port is bound in
the netns and only then starts the client. This speeds up the entire
test file by 50% but more importantly in parallel testing it solves
hangs as the timeout there was unreliable.
Fixes #23471
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
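A sketch of the bound-before-connect idea (wait_for_bound_port is a made-up helper; it assumes the container image provides cat, and run_podman/die are the usual bats helpers):
```
# Poll /proc/net/tcp inside the container until the port (hex) shows up.
wait_for_bound_port() {
    local cid=$1 port=$2
    local hexport=$(printf "%04X" "$port")
    local i
    for i in $(seq 1 50); do
        run_podman exec "$cid" cat /proc/net/tcp
        if grep -q ":$hexport " <<<"$output"; then
            return 0
        fi
        sleep 0.1
    done
    die "port $port never bound inside container $cid"
}
```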
Add support for the ServiceName key for all unit types
Extend the PodInfo struct into UnitInfo to consolidate all prepopulated data into a single map
Use the NodesInfo map instead of the resourceName
Update the UnitInfo in the convert function instead of returning it
No need to replace extension anymore just remove it
All e2e tests with dependencies on other Quadlet files moved to a separate section
Add the capability of overriding the service name in the test
Add e2e tests for the new functionality
Adjust integration tests
Update the MAN page
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
Use safename for containers, volumes, images.
Build a temporary scratch image for podman image mount, so
we can safely mount/umount it (instead of $IMAGE) without
risk of other parallel tests umounting it.
Fixed some oopsies ("$vol1" is empty string, so, NOP test)
And... an experiment. I'm leaving in my 'ci:parallel' tags
and notes, so I don't have to carry them in #23275. This
is harmless, basically just noisy comments. The drawback
is, if for some reason #23275 does not pan out, I'll have
to go back and remove those tags. Right now I'm feeling
pretty comfortable about this parallelization approach tho.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Use safename instead of hardcoded "test"
Start registry once, in setup_file(), instead of requiring
individual tests to do so.
Add explicit --authfile arg to a bunch of places that now need it
Minor cleanup and improvements in test descriptions. I may have
gotten a little carried away here, but if this test ever fails
these additions will make someone's life much easier.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Strictly speaking we don't need the path yet, but it existing
prevents a lot of strangeness in our path-checking logic to
validate the current Podman configuration, as it was the only
path that might not exist this early in init.
Fixes #23515
Signed-off-by: Matt Heon <mheon@redhat.com>
if idmap is specified for a volume, reverse the mappings when copying
up from the container, so that the original permissions are maintained.
Closes: https://github.com/containers/podman/issues/23467
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
BATS teardown logs are unreadable, making it almost impossible
to see tiny "Leaked this-or-that" messages.
Solution: new _run_podman_quiet() helper, replaces run_podman
in a small number of cases within teardown. Clunky, and
duplicative, sorry.
New helper for leak_check, basically spits out warnings (and
bumps error count) if it sees any output whatsoever from
individual "podman XXX ls" commands.
Signed-off-by: Ed Santiago <santiago@redhat.com>
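Roughly, the leak check amounts to: any output from an object-listing command after the suite is a leak (the command list and message format here are illustrative; the default 'podman' network would need filtering and is omitted):
```
leak_check() {
    local rc=0 cmd
    for cmd in "ps -a" "volume ls" "pod ps" "secret ls"; do
        local output
        output=$(podman $cmd --noheading 2>&1)
        if [[ -n "$output" ]]; then
            # bats convention: fd 3 output is always shown to the user.
            echo "# Leaked: podman $cmd" >&3
            echo "$output" >&3
            rc=1
        fi
    done
    return $rc
}
```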
We bind ports to ensure there are no conflicts and we leak them into
conmon to keep them open. However we bound the ports after the network
was set up so it was possible for a second network setup to overwrite
the firewall configs of a previous container as it failed only later
when binding the port. As such we must ensure we bind before the network
is set up.
This is not so simple because we still have to take care of
PostConfigureNetNS bool in which case the network set up happens after
we launch conmon. Thus we end up with two different conditions.
Also it is possible that we "leak" the ports that are set on the
container until the garbage collector closes them. This is not
perfect but the alternative is adding special error handling on each
function exit after prepare until we start conmon which is a lot of work
to do correctly.
Fixes https://issues.redhat.com/browse/RHEL-50746
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
I broke the kube external storage test in the course of my
safename PR: _write_test_yaml() with no command generated
a pod that did not trigger the conditions required for
this test.
Solution: run a container (top). Add new checks to prevent
this gap from happening again.
Signed-off-by: Ed Santiago <santiago@redhat.com>
These test steps check the automount feature with multiple images for
the following items:
1. Multiple images can be automounted with a yaml file.
2. If the same path exists in several images, the last one
should trump.
3. The volume is mounted readonly in the container.
4. The volumes are only mounted in the specific container, but
not the whole pod.
Signed-off-by: Yiqiao Pu <ypu@redhat.com>
validate that a "podman generate" and "podman play" cycle restores the
specified user namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The tests didn't actually check anything, because default_ifname requires
an ip version argument to work. Thus pasta_iface was empty; add new
checks to prevent this kind of error again.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The test assumes that if more than 1 ip on the host we should be able to
set host.containers.internal. This however is not how the logic works in
the code. What it actually does is to check all ips in the
rootless-netns and then it knows that it cannot use any of these ips.
This includes any podman bridge ips.
You can reproduce the error when you have only one ipv4 on the host then
run a container as root in the background and run the test:
hack/bats --rootless 505:host.containers.internal
So the failure here was that there was already a podman container
running as root on the default bridge, thus the test saw 2 ips; but the
rootless run also uses the same subnet for its bridge, so the code
knew that ip would not work either. I could have added another special
condition in the test, but the better way to work around it is to create a
new network. A new network will make sure there are no conflicting
subnets assigned so the test will pass.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Continuing efforts on making system tests parallel-safe by
using unique names for containers and pods.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Two tests failing in gating but never CI; add some debug
instrumentation to make it possible to find out what
is going on
Signed-off-by: Ed Santiago <santiago@redhat.com>
Using the scanner is just unnecessarily complicated and buggy, as it
will not read the final line with its newline. There is also the problem
that it happens in a separate goroutine, so it could lose output if we
read the array before the scanner was done.
The API accepts a Writer so we can just directly use a bytes.Buffer
which captures all output in memory without the need of another
goroutine.
This also means that now we always include the final newline in the
output. I checked with docker and they do the same so this is good.
Fixes #23332
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When the file is empty it is possible our code panics as bar.ProxyReader
returns nil when the bar is finished which is the case for 0 size as it
doesn't have to read anything from there. However, as this happens on
different goroutines, it is racy and most of the time still works.
To fix this simply skip the progress bar setup for empty files.
While at it fix the deprecated argument in the tests.
Fixes #23281
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
nmap-ncat has been downgraded on Fedora, to 7.92.
nc -l -p PORT requires 7.95. Switch to nc -l ADDR PORT.
Signed-off-by: Ed Santiago <santiago@redhat.com>
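The incompatibility in a nutshell (port is arbitrary):
```
# Accepted by nmap-ncat 7.95:
nc -l -p 12345
# Works on 7.92 (current Fedora) as well, with an explicit address:
nc -l 127.0.0.1 12345
```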
The end goal is making this test file parallel-safe, by:
1) Having all tests use unique names for all objects; and
2) Not doing "rm -a" or "expect ps to be empty".
This commit is not enough to make tests parallel-safe. The
rest of the changes are not relevant for now. This set of
changes is _necessary_ for parallelizing, and is _meaningful_
(good practice) for current linear-testing podman without
introducing any unnecessary cruft.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Make safename() invocations consistent within the same
test. This puts the onus on the caller to add a unique
element when calling multiple times, e.g. "ctr1-$(safename)".
This is not too much of a burden. Major benefit is making
it easy for a reader to associate containers, pods, volumes,
images within a given test.
And, use dashes, not underscores. "podman generate kube"
removes underscores, making it very difficult to do
things like "podman inspect $podname" (because we need
to generate "$podname_with_underscores_removed")
Signed-off-by: Ed Santiago <santiago@redhat.com>
Many instances. Simplify by having _write_test_yaml() define
the variable TESTYAML and make it available to callers.
Global replace, with care taken to undo any instances
where _write_test_yaml() is not invoked first.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Get rid of the last two instances of the clunky $testYaml
writing, by adding a 'volume=' arg to _write_test_yaml()
Signed-off-by: Ed Santiago <santiago@redhat.com>
Remnant from the very early days of this test file. There's
a boilerplate $testYaml string used in many tests; each
use requires three clunky lines of prep. Most of those
were not needed; we can (and now do) use _write_test_yaml()
instead.
There are still two instances that could not be fixed in
this commit. I will do those next. This commit is kept
relatively simple for ease of review.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Many system tests use hardcoded names for containers, images,
and everything. This has worked because system tests run
serially. It will not work if we ever run in parallel.
Create a new safename() helper, and use it as follows:
myctr=c_$(safename)
myvol1=v1_$(safename)
...
Find current instances of hardcoded names, and replace
with safe ones.
Whether or not we ever end up parallelizing system tests,
this is simply good practice.
There are far too many instances to fix in one (reviewable) PR.
This is commit 1 of N.
Signed-off-by: Ed Santiago <santiago@redhat.com>
shut down the containers store so that the home directory mount is not
leaked when "podman system service" exits.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
netavark can use iptables or nftables as its firewall driver; thus, if
we try to flush rules, make sure we try both to keep the test working
when we switch the default to nftables.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
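Illustratively, a test helper can try both backends and treat a missing tool as a no-op (the exact tables/chains the real test flushes are not shown in the message):
```
# Flush test firewall state regardless of which backend netavark used.
if command -v iptables >/dev/null; then
    iptables -t nat -F 2>/dev/null || true
fi
if command -v nft >/dev/null; then
    nft flush ruleset 2>/dev/null || true
fi
```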
This commit gets tests working under the new local-registry system:
* amend a few image names, mostly just sticking to a consistent
list of those images in our registry cache. Mostly minor
tag updates.
* trickier: pull_test: change some error messages, and remove
a test that's now a NOP. Basically, with a local (unprotected)
registry we always get "404 manifest unknown"; with a real
registry we'll get "403 I can't tell you".
* trickiest: seccomp_test: build our own images at run time,
with our desired labels. Until now we've been pulling
prebuilt images, but those will not copy to the local
cache registry. Something about v1? Anyhow, I gave up
trying to cache them, and the workaround is straightforward.
Also took the liberty of strengthening a few error-message checks
Signed-off-by: Ed Santiago <santiago@redhat.com>
Run root e2e & system tests using composefs on rawhide.
Write magic settings to storage.conf. That part is easy.
e2e tests, however, ignore storage.conf. They require everything
to be specified on the command line. And "everything", in the
case of composefs, includes a long complicated --pull-options
string which in turn requires containers-storage PR 1966
which, as of this writing, is finally vendored into podman.
Signed-off-by: Ed Santiago <santiago@redhat.com>
When a system has one ipv4 and one ipv6 address, hostname -I will show
both, causing a failure in the case where the test expects only one
address. To fix this, stop using hostname -I and use ip -4 to list only
v4 addresses, and then use jq to filter the output accordingly.
Fixes #23227
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
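The ip + jq approach looks roughly like this (interface selection simplified away):
```
# Every global IPv4 address on the host, one per line.
ip -j -4 addr show scope global | jq -r '.[].addr_info[].local'
```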
The code currently tried to avoid joining the userns from conmon
directly, and instead joined only to read the pid file and then send it
back to us so we could join the userns. From the comment this was done
because we could not read the pid file. However this is no longer true
as of commit 49eb5af301, and the file is now always owned by the real user.
This means we can just remove this special logic and join the namespace
directly there. A test has been added to check the rejoin logic with a
custom uidmapping.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
- fix test name to reflect that it's not pasta-only
(followup from #21563)
- in one podman-update test run in OpenQA, defer assertion
failures so we can gather better data on regressions.
This would've been helpful in diagnosing bz2281805.
- add an error-message check to one test that needed it
(found by accident)
- add distro-integration test tag to a handful of new tests,
so they run in OpenQA. Found via 'git diff 33891e8 test/system'
and scanning for '^\+@test '. I only added tests that IMO
have some risk of interacting poorly with kernel or systemd
updates, e.g. quadlet, modules, tmpfs+noswap.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The test checks that the default volume is not on tmpfs; however, what it
should really check is that the volume is on our container storage fs. It
is possible that users run the storage on top of tmpfs, so this test
always failed there.
The better check is to compare the fs from the graphroot and the volume.
Unfortunately, for unknown reasons stat -f -c %T returns UNKNOWN and not
the actual fs. I have no idea why; to work around that we now parse
/proc/mounts manually for the fs. Not nice but at least it works
correctly.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
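One way to read the fs type from /proc/mounts (fs_type_of is a made-up helper; $graphroot and $volpath stand in for the two paths being compared):
```
# Print the fs type of the mount containing $1, via longest-prefix match.
fs_type_of() {
    awk -v p="$1" '
        index(p, $2) == 1 && length($2) > best { best = length($2); fs = $3 }
        END { print fs }
    ' /proc/mounts
}
# Compare filesystems instead of hardcoding "tmpfs":
# [ "$(fs_type_of "$graphroot")" = "$(fs_type_of "$volpath")" ]
```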
Add some test steps into quadlet - ContainerName. These steps are
used to ensure the default configuration for quadlet-generated
service files sends stdout/stderr/syslog to journald.
Signed-off-by: Yiqiao Pu <ypu@redhat.com>
The restore code path never called completeNetworkSetup() and this means
that hosts/resolv.conf files were not populated. This fix is simply to
call this function. There is a big catch here. Technically this is
supposed to be called after the container is created but before it is
started. There is no such thing for restore; the container runs right
away. This means that if we do the call afterwards there is a short
interval where the file is still empty. Thus I decided to call it
before, which means it does not work with PostConfigureNetNS (userns);
but as that does not work today anyway, I don't see it as a problem.
Fixes #22901
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
With (esp. Debian) CI VM images built by
https://github.com/containers/automation_images/pull/338, CI no longer
tests with runc nor cgroups v1. Add logic to fail under these
conditions. Prune back high-level YAML/script envars and logic formerly
required to support these things.
Signed-off-by: Chris Evich <cevich@redhat.com>
When a user specifies an invalid connection in CONTAINER_CONNECTION,
podman should return a proper error saying so. Currently it ignored the
error and in rootFlags() just exited early without defining any flags.
This then caused a panic when trying to use the flags later.
In order to address this first store the connection error in the
PodmanConfig struct and not abort right away during flag setup. This is
important as the user might have specified a flag with a valid remote
connection. As such we check all flags, and only when none were given
do we return the connection error.
Also while at it I noticed that the default connection reported via
podman --help was wrong as it only used the old containers.conf field
for it and did not consider the podman-connections.json default.
New regression tests have been added to make sure it behaves correctly.
This fixes the problem reported in the PR #22997.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
PR #22821 (CI speedup) was overly aggressive in one kube test.
It's flaking. Bump up timeout from 3s to 4.
Signed-off-by: Ed Santiago <santiago@redhat.com>
At the end of all tests always check for leaks. That should make us more
robust against adding tests at the end that would leak stuff otherwise.
TODO: something seems wrong with bats when returning an error in
teardown_suite(), it prints a warning:
bats warning: Executed <NUM+1> instead of expected <NUM> tests
And also the output is formatted weirdly in this case where the podman
args are split over multiple lines.
But the test fails as expected so I don't think it is a problem.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
While these are not really slow, they still take about 100-250ms when I
time them locally. Given they are run for every test, this adds up
quickly. Looking at CI logs I can see the timings for skipped
tests are all in the 600ms range, so I think it is safe to assume that
these functions need to get faster.
We have over 670 test cases currently, so we are talking about over 400s
spent in these functions in CI. This allows for big gains.
Now overall this is a tricky trade-off: while all tests should clean up
after themselves, there is no guarantee of that, and as such errors can
be leaked into other tests, making debugging much harder. To work at
least a bit against this, teardown checks whether the test was
successful and only skips the podman commands based on that. Without
that, a single flake could cause all following tests to fail.
As such, this commit does the proper setup once on suite start and then
only after a test failed.
In order for this to work at all we have to fix all leaks first, see
previous commits. And then for the future keep a very strong eye on
this during reviews.
Also add a PODMAN_BATS_LEAK_CHECK option
By default tests must clean up after themselves, and to speed up CI we
no longer do any cleanup in teardown by default. However there are still
many cases where we might have to debug a leak, so add a new
PODMAN_BATS_LEAK_CHECK env option that can be set and should cause
teardown to fail if the test did not clean up properly.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
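Local usage might look like this (the test selector is just an example, in the same NNN:name form used elsewhere in these messages):
```
# Make teardown fail loudly if a test leaves objects behind.
PODMAN_BATS_LEAK_CHECK=1 hack/bats 160:volume
```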
Remove leaking containers and remove unnecessary push/pull args. For
push it tried to push an image as the argument, which makes no sense,
and for pull we tried to pull the argument as an image, which is also
wrong.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This is the same as what --squash-all is doing, and we already support
--squash with --layers=true since this is the default.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If a container was stopped and we try to start it before we called
cleanup, it tried to reuse the network, which caused a panic, as the pasta
code cannot deal with that. It is also never correct as the netns must
be created by the runtime in case of custom user namespaces used. As
such the proper thing is to clean the netns up first.
Also change an e2e test to report better errors. It is not directly
related to this change, but it failed on v1 of this patch, so we noticed
the ugly error message it produced. Thanks to Ed for the fix.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This fixes a regression added in commit 4fd84190b8: because the name was
overwritten by the createTimer() timer call, the removeTransientFiles()
call removed the new timer and not the startup healthcheck one. And then
when the container was stopped we leaked it, as the wrong unit name was
in the state.
A new test has been added to ensure the logic works and we never leak
the system timers.
Fixes #22884
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The buildah build kill trick is bad, as we have to sleep and wait to
avoid flakes, which takes time. Instead it is possible to redo this
build part manually with buildah commands. It is not trivial and is
harder to understand, but it saves 2-3s so I think it is worth it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
First, as root, don't wait 5s for the timeout; 1s is enough. Also switch
to the curl --max-time option instead; that way we know we do not
kill curl before it possibly had the chance to do anything.
Second, combine podman inspect commands into one. This makes the test
faster by over one second, as we save a bunch of podman commands.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
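The curl switch, roughly (URL is a placeholder):
```
# Bound the whole transfer instead of sleeping and then killing curl.
curl --max-time 1 -s -o /dev/null http://127.0.0.1:8080/ || true
```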
Another case of a container that does not exit on SIGTERM, so we waste
10s. Now, because our container reacts to SIGTERM and exits 0, the
systemd unit status changed to inactive from failed.
And most importantly add Notify=yes because the socat call always failed
as the default is to not leak the notify socket into the container.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It is not clear at all why a count of 30 was chosen; it seems like a
lot and of course takes quite a while. The test takes over 16s in CI.
To speed it up, reduce the count to 10. This should still be good
enough to ensure there are no races, IMO.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It makes the test a bit uglier, but I cannot see a good way to speed
this up otherwise. I changed the created test to only start/stop the
container once instead of on every test case iteration. This makes it
about 2s faster locally.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Overall, just combine several container runs into one. Every RUN
instruction will run a new container, which is quite expensive, so chain
the commands together. The same goes for podman runs.
I could have combined a bit more but I think this leaves it still
readable. This speeds up the test about 4s locally from 8s before.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Use podman version over podman info, because info has to query a lot of
internal state, e.g. container and image count, so it is slower than a
simple version. This speeds the test up by about 600ms locally.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Issue #11825 was fixed a long time ago. Also we no longer test
cni/dnsname so there is really no point in having this.
Speeds up the test by 1 second.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Another case of a container that does not exit on SIGTERM, so we waste
10s. Now, because our container reacts to SIGTERM and exits 0, the
systemd unit status changed to inactive from failed.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We don't have to loop twice for the stat call; we can just stat both
dirs at once. This means we only have to create half of the containers,
so the test is twice as fast.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Testing `podman system check` requires that we have a way to
intentionally introduce storage corruptions. Add a hidden `podman
testing` command that provides the necessary internal logic in
subcommands. Stub out the tunnel implementation for now.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The expectation with --cgroups=disabled is that the current cgroup is
used by the container.
Currently the --cgroups=disabled is passed directly to the OCI
runtime, but it doesn't stop Podman from creating a new cgroup when it
doesn't own the current one.
Closes: https://github.com/containers/podman/issues/20910
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The command does not react to SIGTERM, so kube down needs to wait 10s.
To fix it first use a command that does but also write the yaml
directly instead of doing the podman create && kube generate dance.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
use a command that stops on SIGTERM, not sleep; that way the tests can
continue to use podman kube down without waiting for the full stop
timeout every time.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This test is by far the slowest one, taking over a minute; the reason is
that it is checking every single podman command for shell completions.
The test is useful but it does not need to check the "..." argument 3
times. Test a second time to make sure not only the first arg is
completed. This change makes it about 15 seconds faster.
Long term we should get this test out of the main system tests together
with other cli only tests as they do not need to run on each OS, etc...
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The current logic used podman logs, I don't understand why; all we care
about is the container output, and we can just read the same with an
attached podman run. Of course we have to move it into the background,
but it did the same with logs.
This also allows us to remove the extra log-driver checks, and because
podman logs seems to be much slower than the extra run, we save over 10s
with this change.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Use only one retry and a short stop timeout to speed them up. I am not
sure if this will cause flakes; I have not seen any after trying for
some time, so I think this works just as well, and it is about 2-3
seconds faster for both tests.
If it does start to flake we can revert this commit again or write the
test differently.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Do not wait 5 seconds, just stop the container directly.
This speeds up the test by more than 4 seconds.
One could make the case here that we want to check podman wait but
there are so many other podman wait tests that it should not matter.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Instead of iterating over all tmp dirs and creating test containers for
each one, we can just pass all files to one touch call. With that we have
to create far fewer containers while still checking the same thing. This
speeds up the test by about 4 seconds.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The test used sleep to synchronize log output between both containers
which is slow. There is actually no way to guarantee the ordering on
the reading side, so just remove the sleeps and check that the lines
within the same container are in the right order.
Trying to preserve the original ordering is just not possible if we speed
up the test, as it would flake too often.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There is no reason for this check to wait 4 seconds for the container to
run; instead, make sure to have a running process and then stop it
directly with -t0 so there is no delay.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As agreed in Planning meeting of 2024-03-20, Podman 5.x will
drop support for cgroups v1 and for runc. Make it so.
CI images built in https://github.com/containers/automation_images/pull/338
Signed-off-by: Ed Santiago <santiago@redhat.com>
This container did not react to SIGTERM, thus we always waited 10s for
it to stop. Also, do not wait 2s for the logs; instead use a retry loop.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The test does a normal stop on a command that does not react to SIGTERM.
As I cannot fix the system stop logic, use a command which does. This
saves us 10s, as it no longer waits for the timeout.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Both tests take 10s longer than they need to because they run the sleep
command in the container, which does not react to SIGTERM; as such,
podman waits 10s before killing it with SIGKILL.
To fix it just stop them with podman rm -fa -t0 to avoid the wait and do
not use podman kube down as we cannot set a timeout there. podman kube
down is still covered in many other tests so this is not an issue.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
IMO it is not important to cover each case with each sdnotify policy. To
speed them up, we run all the exit-code cases only once, not twice (once
per policy), switching the sdnotify policy between cases. This way we
save 50% of the runs and should still have sufficient coverage.
Before it took around 24 seconds, with this it is around 12 seconds now.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There is really no point in waiting 10s for the kill; let's use 2, which
should be good enough to observe the timing.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This test waits 15 seconds to send SIGTERM for no good reason; we can
just make the timeout shorter. Also make sure the podman command quits
on SIGTERM by looking for the output message.
While at it, fix the tests to use $PODMAN_TMPDIR, not /tmp, and define
the yaml in the test instead of using the podman create && podman kube
generate && podman rm way of creating the yaml, as that is a bit slower
since we have to call three podman commands for it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Merge two podman event tests into one to speed them up as they did
mostly the same anyway. This way we only have to do the setup/teardown
once and only run one container.
Second, add the --since option because reading the journal can be slow
if you have thousands of event entries. This is not so critical in CI as
we run on fresh systems but on local dev machines I have almost 100k
events in the journal so parsing all of them makes this test slow (like
30s), with this change I can get it under 1s.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Defining a timer with a fixed interval is not a good idea as we first
have to wait until the timer triggers; since the interval was every two
seconds, it means we have to wait at least 2s for it to start.
However, much worse, it means the timer triggers the unit over and over;
this seems to cause some sort of race with the output check. I have seen
this test run 10-60s, which does not make much sense.
Switching the timer to trigger once on start seems to make the test run
consistently in 7s locally for me, so this is much better.
There still is the question if we really have to test this at all on
each upstream PR but I left it for now.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It takes over 10 seconds for this test as it uses --wait 5 twice, which
runs into the timeout. IMO this test is just redundant, as it is already
covered much better in the e2e tests. Thus remove it here.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When an empty volume is mounted into a container, Docker will
chown that volume appropriately for use in the container. Podman
does this as well, but there are differences in the details. In
Podman, a chown is presently a one-and-done deal; in Docker, it
will continue so long as the volume remains empty. Mount it into a
dozen containers but never add content, and the chown occurs every
time. The chown is also linked to copy-up; it will always occur
when a copy-up occurred, despite the volume now not being empty.
This PR changes our logic to (mostly) match Docker's.
For some reason, the chowning also stops if the volume is chowned
to root at any point. This feels like a Docker bug, but as they
say, bug for bug compatible.
In retrospect, using bools for NeedsChown and NeedsCopyUp was a
mistake. Docker isn't actually tracking this stuff; they're just
doing a copy-up and permissions change unconditionally as long as
the volume is empty. They also have the two linked as one
operation, seemingly, despite happening at very different times
during container init. Replicating that in our stateful system is
nontrivial, hence the need for the new CopiedUp field. Basically,
we never want to chown a volume with contents in it, except if
that data is a result of a copy-up that resulted from mounting
into the current container. Tracking who did the copy-up is the
easiest way to do this.
Fixes #22571
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Now WaitForExit returns the exit code as stored in the db instead of
returning an error when the container was removed.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The default_addr shell function in test/system/helpers.network is used to
get the host's default address, which is used in a number of pasta
networking tests. However, in certain circumstances it can incorrectly
pick a deprecated address as the primary address. Correct it to exclude
those.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We have to exclude the ips in the rootless netns, as they are not the
host. Now, that fix only works if there is more than one ip available on
the host; if there is only one, we do not set the entry at all, which I
consider better, as failing to resolve this name is a much better
error for users than connecting to a wrong ip. It also matches what
--network pasta already does.
The test is a bit more complicated than I would like; however, it must
deal with both cases (one ip, more than one), so there is no way around
it, I think.
Fixes #22653
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
First, point users to hack/bats for running them locally. Second, remove
TODO.md as it doesn't contain any helpful information. Basically all the
missing tests there have been added, so it no longer serves any purpose
and is misleading.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
wait for the healthy status on the thread where the container lock is
held. Otherwise, if it is performed from a go routine, a different
thread is used (since the runtime.LockOSThread() call doesn't have any
effect), causing pthread_mutex_unlock() to fail with EPERM.
Closes: https://github.com/containers/podman/issues/22651
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The function that's handing us events will return an error after closing
the channel over which it's sending events, and its caller (in its own
goroutine) will then send that error over another channel.
The logic that started the goroutine is likely to notice that the events
channel is closed before noticing that the error channel has a result
for it to read, so any error that would have been communicated would be
lost.
When we finish reading events, check if the reader returned an error
before telling our caller that there was no error.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Under some circumstances podman might be executed with a different argv0
than the actual path to the podman binary. This breaks the reexec logic
as it tried to exec argv0 which failed.
This is visible when using podmansh as a login shell, which gets the
special -podmansh as argv0 to signal the shell that it is a login shell.
To fix this we can simply use /proc/self/exe as command path which is
much more robust and the argv array is still passed correctly.
Fixes #22672
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The scenario for inducing this is as follows:
1. Start a container with a long stop timeout and a PID1 that
ignores SIGTERM
2. Use `podman stop` to stop that container
3. Simultaneously, in another terminal, kill -9 `pidof podman`
(the container is now in ContainerStateStopping)
4. Now kill that container's Conmon with SIGKILL.
5. No commands are able to move the container from Stopping to
Stopped now.
The cause is a logic bug in our exit-file handling logic. Conmon
being dead without an exit file causes no change to the state.
Add handling for this case that tries to clean up, including
stopping the container if it still seems to be running.
Fixes #19629
Signed-off-by: Matt Heon <mheon@redhat.com>
This never tested what it said it did; the command line was wrong, so
`,ro=false` was taken as the image, causing an error. What this actually
should care about is that a glob is taken as-is and not evaluated.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Some programs have their configuration files relative to the user's
home. It would be convenient being able to mount these into the container, but
that requires expansion of `~` or `$HOME` in a label. This commit adds support
for that for the `runlabel` command.
Signed-off-by: Dan Čermák <dcermak@suse.com>
Effectively, this is an ability to take an image already pulled
to the system, and automatically mount it into one or more
containers defined in Kubernetes YAML accepted by `podman play`.
Requirements:
- The image must already exist in storage.
- The image must have at least 1 volume directive.
- The path given by the volume directive will be mounted from the
image into the container. For example, an image with a volume
at `/test/test_dir` will have `/test/test_dir` in the image
mounted to `/test/test_dir` in the container.
- Multiple images can be specified. If multiple images have a
volume at a specific path, the last image specified trumps.
- The images are always mounted read-only.
- Images to mount are defined in the annotation
"io.podman.annotations.kube.image.automount/$ctrname" as a
semicolon-separated list. They are mounted into a single
container in the pod, not the whole pod.
As we're using a nonstandard annotation, this is Podman only, any
Kubernetes install will just ignore this.
Underneath, this compiles down to an image volume
(`podman run --mount type=image,...`) with subpaths to specify
what bits we want to mount into the container.
Signed-off-by: Matt Heon <mheon@redhat.com>
The change in healthcheck_run_test.go, depends on the
containers/image change:
commit b6afa8ca7b324aca8fd5a7b5b206fc05c0c04874
Author: Mikhail Sokolov <msokolov@evolution.com>
Date: Fri Mar 15 13:37:44 2024 +0200
Add support for Docker HealthConfig.StartInterval (v25.0.0+)
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This is something Docker does, and we did not do until now. Most
difficult/annoying part was the REST API, where I did not really
want to modify the struct being sent, so I made the new restart
policy parameters query parameters instead.
Testing was also a bit annoying, because testing restart policy
always is.
Signed-off-by: Matt Heon <mheon@redhat.com>
Two system tests were relying on $SYSTEMD_IMAGE but were not
running _prefetch. This led to baffling flakes that wasted
my time. (Quay flakes, of course. New manifestation.)
Signed-off-by: Ed Santiago <santiago@redhat.com>
This was added ages ago in commit c65b3599cc, however in the meantime
both podman and conmon can support longer socket paths as they use a
workaround to open the path via /proc/self/fd, see openUnixSocket() in
libpod/oci_conmon_attach_linux.go
Thus this restriction is not needed anymore and we can drop a workaround
in the tests.
Fixes #22272
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When we remove with --force, we do not return an error if the input does
not exist; however, if we get more than one input, we must try to remove
them all, and not just NOP out and remove nothing because one arg
did not exist.
Also make the code simpler for commands that do have the --ignore option
and just make --force imply --ignore which reduces the ugly error
handling.
Fixes #21529
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As of podman 5.0, slirp4netns is a soft dependency. It might
not be installed on a host (and, in gating tests, is not).
Deal with it.
Use podman itself, not 'which', to tell us if slirp4netns
is available. We don't want to duplicate podman's path-check
logic. Since this check is expensive, cache the result.
(Change the has_pasta check similarly)
Signed-off-by: Ed Santiago <santiago@redhat.com>
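A cached check along those lines (the info format-string field is my best guess at podman's JSON layout, so verify before relying on it):
```
# Ask podman itself, not 'which', whether slirp4netns is usable; cache it.
has_slirp4netns() {
    if [[ -z "$_cached_has_slirp4netns" ]]; then
        _cached_has_slirp4netns=n
        local exe
        exe=$(podman info --format '{{.Host.Slirp4NetNS.Executable}}' 2>/dev/null)
        [[ -n "$exe" ]] && _cached_has_slirp4netns=y
    fi
    [[ $_cached_has_slirp4netns == y ]]
}
```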
Three infrequent flakes. Add debug code to help track
down if/when they happen again.
And, in one of them, fix a logic bug that will save us 8-10s
on system test runs.
Signed-off-by: Ed Santiago <santiago@redhat.com>
if the volume is mounted with "idmap", there should not be any mapping
using the user namespace mappings since this is done at runtime using
the "idmap" kernel feature.
Closes: https://github.com/containers/podman/issues/22228
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Useful to tell whether containers are being made with pasta or
slirp4netns by default. Info is bloated enough already that I
don't really have concerns about shoving more into it.
Fixes #22172
Signed-off-by: Matt Heon <mheon@redhat.com>
By default we just ignored any localhost resolvers; this is problematic
for anyone with more complicated dns setups, e.g. split dns with
systemd-resolved. To address this we now make use of the built-in dns
proxy in pasta. As such we need to set the default nameserver ip now.
A second change is the option to exclude certain ips when generating the
host.containers.internal ip. With that we no longer set it to the same
ip as is used in the netns. The fix is not perfect, as it could mean that
on a system with a single ip we no longer add the entry; however, given
the previous entry was incorrect anyway, this seems like the better
behavior.
Fixes #22044
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
In some environments, such as the one described in
https://github.com/containers/podman/issues/20927, the default route
is given as nexthop gateways. That is, it's a multipath route with
multiple gateways.
That means that pasta(1), after commit 6c7623d07bbd ("netlink: Add
support to fetch default gateway from multipath routes"), can start
and use a default gateway from that route.
Just like in pasta(1), in these tests, the default route indicates
which upstream interface we should pick. If we ignore multipath
routes, IPv6 addresses and gateway addresses themselves won't be
available, so, while pasta is now able to configure the container,
IPv6 tests will expect to find no address and no gateway, hence fail
due to the mismatch.
Try to get routes, including gateway addresses and interface names,
from nexthop objects, in case the selection of a regular default
route yields no results.
Link: https://github.com/containers/podman/issues/20927
Closes: #20927
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Most of them look like our usual "assume too much about run -d".
One of them is just an unexpected warning, a push retry. Remove
the ExitCleanly() from that test, just rely on Exit(0).
The other two have to do with podman logs, which we know can lag.
Add a short 1-second retry loop.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Commit 03f6589f3 added basic support for pull-error event from libimage
but it contains several problems:
1. storing the error as error type prevents it from being unmarshalled,
thus change it to a string
2. the error was never propagated from the libimage event to the podman
event struct
3. the error message was not wired into the cli and API
This commit fixes these problems.
Fixes #21458
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This vendors the latest c/common version, including making Pasta
the default rootless network provider. That broke a number of
tests, which have been fixed as part of this PR.
Also includes a change to network stats logic, which simplifies
the code a bit and makes it actually work with Pasta.
Signed-off-by: Matt Heon <mheon@redhat.com>
Add a --artifact flag to `podman manifest add` which can be used to
create an artifact manifest for one or more files and attach it to a
manifest list. Corresponding --artifact-type, --artifact-config-type,
--artifact-config, --artifact-layer-type, --artifact-subject, and
--artifact-exclude-titles options can be used to fine-tune the fields in
the artifact manifest that don't refer to the files themselves.
Add a --index option to `podman manifest annotate` that will cause
values passed to the --annotation flag to be applied to the manifest
list as a whole instead of to an entry in the list.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Checking for the mountdir is not relevant; a recent c/storage change[1]
no longer deletes the mount point directory, so the check will cause a
false positive. findmnt exits 1 when the given path is not a mountpoint,
so let's use that to check.
[1] 3f2e81abb3
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Like docker, podman network inspect should output the information of
running containers with their ip/mac addresses on this network.
However, the output format is not docker compatible, as the docker
format cannot include all the info we have; the previous output was
already not compatible, so this is not new.
New example output:
```
[
  {
    ...
    "containers": {
      "7c0d295779cee4a6db7adc07a99e635909413a390eeab9f951edbc4aac406bf1": {
        "name": "c2",
        "interfaces": {
          "eth0": {
            "subnets": [
              {
                "ipnet": "10.89.0.4/24",
                "gateway": "10.89.0.1"
              },
              {
                "ipnet": "fda3:b4da:da1e:7e9d::4/64",
                "gateway": "fda3:b4da:da1e:7e9d::1"
              }
            ],
            "mac_address": "1a:bd:ca:ea:4b:3a"
          }
        }
      },
      "b17c6651ae6d9cc7d5825968e01d6b1e67f44460bb0c140bcc32bd9d436ac11d": {
        "name": "c1",
        "interfaces": {
          "eth0": {
            "subnets": [
              {
                "ipnet": "10.89.0.3/24",
                "gateway": "10.89.0.1"
              },
              {
                "ipnet": "fda3:b4da:da1e:7e9d::3/64",
                "gateway": "fda3:b4da:da1e:7e9d::1"
              }
            ],
            "mac_address": "f6:50:e6:22:d9:55"
          }
        }
      }
    }
  }
]
```
Fixes #14126
Fixes https://issues.redhat.com/browse/RHEL-3153
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
I'm tired of this flake, it's hitting us ~once/day. Root cause
still unknown.
Workaround: add a READY file to the http server, and run 'curl'
until we get it. Tested in #17831 for the last two weeks, flake
has not been seen even once since then.
Closes: #21649
Signed-off-by: Ed Santiago <santiago@redhat.com>
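The shape of the workaround, approximately (image, port, and paths are stand-ins; busybox httpd serves READY from the same process, so a successful fetch proves the server is answering):
```
# Server: create READY, then serve it and everything else via httpd.
run_podman run -d -p 8080:80 $IMAGE \
    sh -c 'touch /tmp/READY; exec httpd -f -p 80 -h /tmp'
# Client: poll for READY instead of assuming the server came up in time.
for i in $(seq 1 20); do
    curl -s -f -o /dev/null http://127.0.0.1:8080/READY && break
    sleep 0.5
done
```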
For a unix socket we should not trim this at all. The problem exists for
ssh only, so make sure we only do this when an ssh URL is given.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Currently, if a user specifies a negative time to stop a container, the
code ends up passing the negative time to time.Duration, which treats
it as 0. By setting the default to math.MaxUint32 we end up with a
positive number which indicates > 68 years, which is probably close
enough to infinity for our use case.
Fixes: https://github.com/containers/podman/issues/21811
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
When we want the original image to be gzip, explicitly ask for that
instead of assuming the containers.conf defaults do that.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Add new event type in cmd/podman to better match the docker format.
Signed-off-by: AhmedGrati <ahmedgrati1999@gmail.com>
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
- use PODMAN_TMPDIR, not BATS_TMPDIR, for temp file
- in teardown, do not assume that SNAME_FILE will exist
(test could fail before that file gets created)
- remove "?" ("ignore exit status") from rmi & prune.
Probably holdovers from the days before -f. If
these commands fail even with -f, we need to know.
Signed-off-by: Ed Santiago <santiago@redhat.com>
There's currently no way to inspect failures of the
parallel-remove test (#21742). Add debugging ability.
Also, clean up nasty red warnings
Signed-off-by: Ed Santiago <santiago@redhat.com>
Continuing to see CI failures of the form "StopSignal SIGTERM
failed to stop container in 10 seconds". Work around those,
either by adding "-t0" to podman stop, or by using Expect(Exit(0))
instead of ExitCleanly().
Addresses, but does not close, #20196
Signed-off-by: Ed Santiago <santiago@redhat.com>
For some reason this started to flake on f38. I don't think the issue is
in podman; rather, the test starts nc -l in the background, so it may not
yet have bound the port in the container when we try to connect.
To fix this simply add some retry logic to nc.
While at it also add pasta to this test and make it use
defer-assertion-failures to run all loop iterations before reporting the
errors.
Fixes #21561 (hopefully)
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
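The retry loop is conceptually simple ($HOST_PORT and the payload are illustrative):
```
# The in-container "nc -l" may not have bound the port yet when the
# client fires; retry a few times before giving up.
for i in $(seq 1 10); do
    reply=$(echo hello | nc -w 1 127.0.0.1 $HOST_PORT) && break
    sleep 0.5
done
```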
Simply because it's been a while since the last testimage
build, and I want to confirm that our image build process
still works.
Added /home/podman/healthcheck. This saves us having to
podman-build on each healthcheck test. Removed the now-unneeded
_build_health_check_image helper.
testimage: bump alpine 3.16.2 to 3.19.0
systemd-image: f38 to f39
- tzdata now requires dnf **install**, not reinstall
(this is exactly the sort of thing I was looking for)
PROBLEMS DISCOVERED:
- in e2e, fedoraMinimal is now == SYSTEMD_IMAGE. This
screws up some of the image-count tests (CACHE_IMAGES).
- "alter tarball" system test now barfs with tar < 1.35.
TODO: completely replace fedoraMinimal with SYSTEMD_IMAGE
in all tests.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Currently we deadlock in the slirp4netns setup code as we try to
configure a non-existing netns. The problem happens because we correctly
tear down the netns in the userns case since commit bbd6281ecc, but
that introduces this slirp4netns problem. The code does a proper new
network setup later, so we should only use the shortcut when not in a
userns.
Fixes #21477
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman v5 will not support cgroups-v1. This commit will print a warning
if it detects a cgroups-v1 system. The warning can be hidden by setting
envvar `PODMAN_CGROUPSV1_WARNING`.
This warning is patched out for RHEL 9 builds as cgroups-v1 will still
be supported on RHEL 9 systems.
Resolves: https://issues.redhat.com/browse/RUN-1957
[NO NEW TESTS NEEDED]
Co-authored-by: Ed Santiago <santiago@redhat.com>
Co-authored-by: Sascha Grunert <sgrunert@redhat.com>
Co-authored-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Just like all the other inspect commands that accept multiple args we
should just make podman pod inspect output a json array.
This makes the code more consistent and removes the extra workaround
which was needed before to support this.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
CNI is deprecated and is build tagged out for 5.0. Don't test it in our CI.
This commit also disables upgrade tests for now - those need more work since the old version of Podman only uses CNI. Upgrade tests will be re-vamped in a later commit.
Signed-off-by: Ashley Cui <acui@redhat.com>
We now no longer write containers.conf, instead system connections and
farms are written to a new file called podman-connections.conf.
This is a major rework and I had to change a lot of things to get this
to compile again with my c/common changes.
It is a breaking change for users as connections/farms added before this
commit can now no longer be removed or modified directly. However because
the logic keeps reading from containers.conf the old connections can
still be used to connect to a remote host.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
SpecGen is our primary container creation abstraction, and is
used to connect our CLI to the Libpod container creation backend.
Because container creation has a million options (I exaggerate
only slightly), the struct is composed of several other structs,
many of which are quite large.
The core problem is that SpecGen is also an API type - it's used
in remote Podman. There, we have a client and a server, and we
want to respect the server's containers.conf. But how do we tell
what parts of SpecGen were set by the client explicitly, and what
parts were not? If we're not using nullable values, an explicit
empty string and a value never being set are identical - and we
can't tell if it's safe to grab a default from the server's
containers.conf.
Fortunately, we only really need to do this for booleans. An
empty string is sufficient to tell us that a string was unset
(even if the user explicitly gave us an empty string for an
option, filling in a default from the config file is acceptable).
This makes things a lot simpler. My initial attempt at this
changed everything, including strings, and it was far larger and
more painful.
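To make the boolean problem concrete, a hedged CLI-level illustration
(the flag choice here is only an example):

    # with a plain bool, these two remote invocations look identical
    # to the server, so it cannot tell whether it is safe to apply
    # its own containers.conf default:
    $ podman-remote run --privileged=false alpine true
    $ podman-remote run alpine true
    # with a nullable boolean, the second arrives as "unset" and the
    # server may safely fill in its default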
Also, begin the first steps of removing all uses of
containers.conf defaults from client-side. Two are gone entirely,
the rest are marked as remove-when-possible.
[NO NEW TESTS NEEDED] This is just a refactor.
Signed-off-by: Matt Heon <mheon@redhat.com>
Given that we can have multiple image digests,
fix the inspect test to check whether the digest
given matches one of the digests of the image.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
This is one of the breaking changes in Podman 5.0: removing the
ability to create new instances of the old Bolt database. This
does not remove support for the database entirely, as existing
Bolt databases will still be usable, but all new installs will
use SQLite after this point - if Bolt is forced by config, we'll
just error.
We don't have plans to outright remove the Bolt code. If that
were to happen, it'd be Podman 6.0 at least, and a significant
enough change it'd warrant a lot of discussion and planning. We
do intend to start winding down support of BoltDB, though, and
new features may be added only to SQLite from here on.
I have added an escape hatch via an undocumented environment
variable that allows us to continue testing BoltDB in CI (and, if
necessary, locally) but I don't want this to be used for any
purpose except continued testing of the old DB to ensure we don't
break it.
Signed-off-by: Matt Heon <mheon@redhat.com>
Some OCI runtimes (cf. [1]) may tolerate container images that don't
specify an entrypoint even if no entrypoint is given on the command
line. In those cases, it's annoying for the user to have to pass a ""
argument to podman.
If no entrypoint is given, make the behavior the same as if an empty ""
entrypoint was given.
[1] https://github.com/containers/crun-vm
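Concretely (image name is hypothetical), these two invocations now
behave the same:

    $ podman run --entrypoint "" quay.io/example/no-entrypoint-image
    $ podman run quay.io/example/no-entrypoint-image
    # before this change, only the first form worked for images that
    # declare no entrypoint when run under such a runtime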
Signed-off-by: Alberto Faria <afaria@redhat.com>
Currently, if container creation fails with either run or create and
you've used --pod with new:, the pod is created nonetheless. This
change ensures the just-created pod is also cleaned up in case of
container creation failure.
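For instance (the failing option is just one example of a creation
error):

    $ podman run --pod new:mypod --device /no/such/device alpine true
    Error: ...
    $ podman pod exists mypod; echo $?
    1
    # before this fix, mypod would have been left behind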
Fixes #21228
Signed-off-by: danishprakash <danish.prakash@suse.com>
- #15074 ("subtree_control" flake). The flake is NOT FIXED; I
saw it six months ago on my (non-aarch64) laptop. However,
it looks like the frequent-flake-on-aarch64 bug is resolved.
I've been testing in #17831 and have not seen it. So,
tentatively remove the skip and see what happens.
- Closes: #19407 (broken tar, "duplicates of file paths")
All Fedoras now have a fixed tar. Debian DOES NOT, but
we're handling that in our build-ci-vm code. I.e., the
Debian VM we're using has a working tar even though there's
currently a broken tar out in the wild.
Added distro-integration tag so we can catch future problems
like this in OpenQA.
- Closes: #19471 (brq / blkio / loopbackfs in rawhide)
Bug appears to be fixed in rawhide, at least in the VMs we're
using now.
Added distro-integration tag because this test obviously
relies on other system stuff.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Add a wait_for_ready() to one kube-play test, to make sure
container output has made it to the journal.
Probably does not fix #18501, but I think it might fix its
most common presentation.
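The pattern, roughly (wait_for_ready is the existing system-test
helper that waits for the container to emit READY; other names here
are illustrative):

    run_podman kube play $TESTYAML
    # the container's command ends with "echo READY"; block until
    # that line shows up, i.e. output has reached the journal
    wait_for_ready $ctrname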
Signed-off-by: Ed Santiago <santiago@redhat.com>
For Docker compatibility, support the --config option by setting the
DOCKER_CONFIG environment variable instead of ignoring the option, so
it can be used to locate config.json as the authentication file.
Also add a test case for this change, and remove the deprecated one.
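For instance (paths are examples):

    $ ls /tmp/dockercfg
    config.json
    $ podman --config /tmp/dockercfg pull registry.example.com/private/image
    # podman now sets DOCKER_CONFIG=/tmp/dockercfg internally, so
    # config.json is found and used as the authentication file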
Signed-off-by: Ming Liu <liu.ming50@gmail.com>
- tmpfs + noswap test: requires noswap feature in kernel.
Check for it, and skip if unimplemented. (Root only.
Rootless test works regardless of kernel).
- podman generate systemd tests: always use --files option,
because otherwise the "DEPRECATED" warning gets written
to the systemd unit file.
- kube play tests: yikes. Fix longstanding bugs when checking
for containers running. This revealed a longstanding bug
in one test: multi-pod YAML never actually worked. Fixed now.
- run_podman(): that new check-for-warnings code we added
in #19878, duh, I skipped it on Debian but should've skipped
under *runc*. Do so now and update the comment (see the
sketch below). Requires minor surgery to the podman_runtime()
helper to avoid infinite recursion.
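Roughly, the guard looks like this (sketch only; assert_no_warnings
is a hypothetical stand-in for the #19878 check):

    # inside run_podman(): skip the warning check under runc,
    # which emits warnings we cannot control
    if [[ "$(podman_runtime)" != "runc" ]]; then
        assert_no_warnings "$output"   # hypothetical helper name
    fi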
Signed-off-by: Ed Santiago <santiago@redhat.com>
- Create a buildah SystemContext from the existing CLI arguments
- Pass the SystemContext to the build
- Add a system test
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
A number of tests start a container then immediately run podman stop.
This frequently flakes with:
StopSignal SIGTERM failed to stop [...] in 10 seconds, resorting to SIGKILL
Likely reason: container is still initializing, and its process
has not yet set up its signal handlers.
Solution: if possible (containers running "top"), wait for "Mem:"
to indicate that top is running. If not possible (pods / catatonit),
sleep half a second.
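A minimal sketch of the idea in BATS form, assuming the existing
wait_for_output helper:

    run_podman run -d $IMAGE top
    cid="$output"
    # top prints a "Mem:" header once it is up and has installed
    # its signal handlers; wait for that before stopping
    wait_for_output "Mem:" $cid
    run_podman stop $cid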
Intended to fix some of the flakes cataloged in #20196 but I'm
leaving that open in case we see more. These are hard to identify
just by looking in the code.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Our test registry (used for login & local registry tests)
was being run using the standard podman tmpdir, and hence the
standard podman database. This was then getting clobbered
in the 330-corrupt-images test, which runs "system reset".
We just didn't know this was happening until we added
a registry test after the system reset. Oops.
Solution: new helper function podman_isolation_opts()
sets --root, --runroot, *and --tmpdir*. Refactor all
existing --root/--runroot usages. Document.
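A hedged sketch of the helper's shape (the real implementation lives
in the system-test helpers; names and argument handling are
simplified):

    podman_isolation_opts() {
        local path=$1
        echo "--root $path/root --runroot $path/runroot --tmpdir $path/tmpdir"
    }

    # callers then run the registry with fully isolated state, e.g.:
    # $PODMAN $(podman_isolation_opts "$PODMAN_LOGIN_WORKDIR") run ... registry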
Next problem: the "network reload" test in 500-networking.bats
did not (could not) know about our registry port, so its
"iptables -F" reverted that port's rule to DROP, and the
subsequent podman-auth in 700-play timed out.
Solution: add a podman-isolated "network reload" to start_registry().
Final problem, because, really, those weren't enough: a BATS
bug where running with --filter-tags would set IFS=',' in
setup_suite, which in turn has catastrophic consequences:
https://github.com/bats-core/bats-core/issues/812
See #20966 for details of the failure and further conversation.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Also add support for podman pod ps --format '{{ .Label label }}'.
Finally, fix support for --format '{{ .Podname }}':
when a user specifies .Podname, this implies --pod was passed.
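For example:

    $ podman ps --format '{{.Podname}}'
    # using .Podname now implies --pod, so the pod-name column is
    # populated without passing --pod explicitly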
Fixes: https://github.com/containers/podman/issues/20957
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
If a user did not include an equal sign in the annotation, the old
code would panic when accessing the second element in the slice.
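Illustrative CLI behavior after the fix (the exact error text is
elided):

    $ podman run --annotation foo=bar alpine true   # valid
    $ podman run --annotation foo alpine true
    Error: ...
    # a missing '=' now yields a clean error instead of a panic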
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
New VMs have netavark 1.9, which fixes the "cannot talk to syslog"
warning when running containerized, so we can reenable clean-output
checks in containerized e2e tests.
pasta: some new VMs have passt >= 2023-11-10, but f38 does not,
and f39 is unclear (my version extractor could not tell). So
I'm leaving the #20170 skip.
Debian runc now supports umask in *run*, but not *exec*. Even
with runc 1.1.10. And we don't even know what the situation is
on RHEL... so, run the podman-run umask tests but not exec.
Fixes: #19809
Signed-off-by: Ed Santiago <santiago@redhat.com>