The format test flakes when quay is down, because we've
been doing 'podman search $IMAGE', which is a quay image.
Solution: check if local registry is running, and use it.
We don't need a real image.
Signed-off-by: Ed Santiago <santiago@redhat.com>
(where possible. Not all tests are parallelizable).
And, refactor two complicated tests into one. This one
is hard to review, sorry.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The netns dir has a special logic to bind mout itself and make itslef
shared. This code here didn't which lead to catastrophic bug during
netns unmounting as we were unable to unmount the netns as the mount got
duplicated and had the wrong parent mount. This caused us to loop forever
trying to remove the file.
Fixes https://issues.redhat.com/browse/RHEL-59620Fixes#23685
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This removes the need for a tricky/fragile namespace workaround.
Huge thanks to Paul for discovering documentation on the
Registry container, and how to override config.yml settings:
https://distribution.github.io/distribution/about/configuration/#override-specific-configuration-options
Drive-by: consistentize quotes in -eVAR="value". Minor, but
makes them all easier to read with emacs/vi syntax highlighting.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The "rm on stopping containers" test is flaking under high load,
probably because I bumped up two timeouts in the healthcheck
container that it relies on. Bump up this test's timeout as well.
Signed-off-by: Ed Santiago <santiago@redhat.com>
...not just when running parallel Bats, because Bats
does not provide any way to know if we're parallel.
Signed-off-by: Ed Santiago <santiago@redhat.com>
...of high system load (such as when running parallel tests).
Allow time for services to reach desired state, by retrying
a few times in a loop.
Signed-off-by: Ed Santiago <santiago@redhat.com>
This fixes the problem where even as root we check the netns files from
root. But in order to catch any rootless bugs we must check the rootless
files from $XDG_RUNTIME_DIR/netns.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This test is currently disabled due to several issues, only some of which
are described in the existing comments. Add some more details to clarify
the situation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This name for the tests is misleading, since in the default configuration
podman will already configure a forwarding addres, which could forward
to either another local forwarder or an external nameserver on the host
side. What this test is really about is explicitly configuring the pasta
DNS forwarding address. Rename accordingly.
The IPv4 version of the test doesn't use the podman --dns option, only
the pasta --dns-forward option. This exercises the podman behaviour that
pasta --dns-forward options are added to /etc/resolv.conf automatically.
However there could also be other things in /etc/resolv.conf, so the
nslookup might not use the custom forwarding address for the lookup.
To fix that, split the test into two parts: one verifying that the custom
address is in /etc/resolv.conf and another performing the nslookup with an
explicit server address to make sure we exercise the pasta side as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In both the "Basic nameserver lookup" and "Local forwarder, IPv4" pasta
tests, we check whether DNS resolution is working by running "nslookup
127.0.0.1" in the container and checking if 1.0.0.127.in-addr.arpa is in
the output.
1.0.0.127.in-addr.arpa isn't the expected result of the resolution though,
it's just the DNS name that nslookup will tranlated 127.0.0.1 into. The
test mostly works, because nslookup echoes that on successful lookups.
However, it could also echo it in certain sorts of failure, so it's not a
very reliable test.
Furthermore, resolving 127.0.0.1 from a nameserver is a rather strange
thing to do. It's done that way because RFC1912[0] suggests it should
always resolve, even for nameservers on a disconnected network. But, this
doesn't really appear to be true in practice: a number of resolvers return
NXDOMAIN. That works by accident because nslookup seems to echo the
name above as part of the error message.
Change to instead looking up one of the root servers by name. This does
now rely on access to the global DNS during tests, but other podman tests
attempt to resolve google.com, so that should be ok. One of the root
servers is about as close to universal resolvability as it's possible to
get
[0] https://datatracker.ietf.org/doc/html/rfc1912#section-4.1
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The idea behind the "External resolver" tests is simply to check that we
can contact a nameserver, regardless of this configuration. To this end
the "IPv4" version looks up 127.0.0.1 which RFC1912[0] suggests should
always be resolvable.
The IPv6 version instead looks up [::1]. While it makes sense for
that to be resolvable in a similar way, there appear to be quite a few
nameservers which do not resolve it, making this test flaky.
Furthermore the idea behind resolving [::1] is that it should make
nslookup prefer to resolve over IPv6. That appears to be very
unreliable at best. Since making a different query doesn't actually
exercise anything different in pasta, drop the test.
The remaining IPv4 test isn't really specific to an "external" resolver,
it's simply checking that we can contact some sort of resolver with the
default podman configuration. Rename accordingly, and run it regardless of
IPv4 connectivity on the host: we can still query a nameserver about an
IPv4 address, even if we only have IPv6 connectivity ourselves.
[0] https://datatracker.ietf.org/doc/html/rfc1912#section-4.1
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "Local forwarder, IPv4" pasta test, amongst other things, checks that
podman's default DNS forwarding address - 169.254.0.1 - appears in the
container's /etc/resolv.conf. That's not really related to anything else
going on in that test (which is about _changing_ that default address).
So, move it into its own test case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
...or at least as much as possible. Some tests cannot
be run in parallel due to #23750: "--events-backend=file"
does not actually work the way a naïve user would intuit.
Stop/die events are asynchronous, and can be gathered
by *ANY OTHER* podman process running after it, and if
that process has the default events-backend=journal,
that's where the event will be logged. See #23987 for
further discussion.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Need --layers=false in podman build, otherwise a buildah race
can trigger "layer not known" failures:
https://github.com/containers/buildah/issues/5674
Signed-off-by: Ed Santiago <santiago@redhat.com>
When running parallel, multiple tests could be trying to start
the registry at once. Make this parallel-safe.
Also, use a safer port range for the registry. Something
outside of /proc/sys/net/ipv4/ip_local_port_range
Sorry, I'm including a FIXME section that I haven't investigated
deeply enough.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Add a few best-practices examples, and add a whole section
describing the dos and donts of writing parallel-safe tests.
Signed-off-by: Ed Santiago <santiago@redhat.com>
For tests run in parallel, show file number as |nnn| (vs [nnn])
Teach logformatter to distinguish the two, adding 'p' to anchors
in parallel tests. Necessary because in this scheme we run bats
twice, thus see 'ok 1' twice, and we want to differentiate them.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Workaround for #23292, where simultaneous 'pod create' commands
will all start a podman-build of the pause image, but only
one of them will be tagged, and the others will leak <none>
images.
Signed-off-by: Ed Santiago <santiago@redhat.com>
The issue is closed and I recently fixed a number of races (bf74797c69)
in the remote attach API that sound like exactly like the same error
that was mentioned in issue #9597.
As such I think this works, if it start flaking again we can revert this
or better fix the actual bug.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
For the past two months we've been splitting system tests
into two categories: those that CAN be run in parallel,
and those that CANNOT. Much work has been done to replace
hardcoded names (mycontainer, mypod) with safename().
Hundreds of test runs, in CI and on Ed's laptop, have
proven this approach viable.
make {local,remote}system now runs in two steps: first
the serial ones, then the parallel ones. hack/bats will
now recognize the 'ci:parallel' tag and add --jobs (nprocs).
This requires some tweaking of leak_check, because there
can be umpteen tests running (affecting image/container/pod/etc
state) when any given test completes.
Rules for enabling parallelization in tests:
* use unique container/pod/volume/network names (safename)
* do not run 'podman rm -a' or 'rmi -a'
* never use the -l (--latest) option
* do not run 'podman ps/images' and expect precise output
Signed-off-by: Ed Santiago <santiago@redhat.com>
...and remove one old skip() for older debian, but leave
two others in place and mark that they're still a problem.
Signed-off-by: Ed Santiago <santiago@redhat.com>
convert the owner UID and GID into the user namespace only when
":idmap" mount is used.
This changes the behaviour of :idmap with an empty volume. Now the
existing directory ownership is copied up as in the other case.
Closes: https://github.com/containers/podman/issues/23347
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Use safename. Add ci:parallel tags. Use a random port, not
hardcoded 9999. Do not remove pause image. And especially
do not "rm -a" anything.
Signed-off-by: Ed Santiago <santiago@redhat.com>
...because it requires 100% control and knowledge of the
state of all images, containers, and volumes.
Use safename anyway, just in case we ever have a leak from here.
I'm finding safename sooooooo helpful when reading journal.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Add ci:parallel tags; move one non-parallel-safe test to
another networking-test file; and a few drive-by fixes
Signed-off-by: Ed Santiago <santiago@redhat.com>