Add a new program based on bpftrace[1] to trace all podman processes
with arguments and exit code/signals. Additionally this captures stderr
from all podman container cleanup processes spawned by conmon which
otherwise go to /dev/null and are never seen in any CI logs.
Hopefull this allows us to debug strange network cleanup error seen in
CI, my plan is to add this to the cirrus setup and upload the logs so we
can check them when the flakes happen.
[1] https://github.com/bpftrace/bpftrace
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
For the past two months we've been splitting system tests
into two categories: those that CAN be run in parallel,
and those that CANNOT. Much work has been done to replace
hardcoded names (mycontainer, mypod) with safename().
Hundreds of test runs, in CI and on Ed's laptop, have
proven this approach viable.
make {local,remote}system now runs in two steps: first
the serial ones, then the parallel ones. hack/bats will
now recognize the 'ci:parallel' tag and add --jobs (nprocs).
This requires some tweaking of leak_check, because there
can be umpteen tests running (affecting image/container/pod/etc
state) when any given test completes.
Rules for enabling parallelization in tests:
* use unique container/pod/volume/network names (safename)
* do not run 'podman rm -a' or 'rmi -a'
* never use the -l (--latest) option
* do not run 'podman ps/images' and expect precise output
Signed-off-by: Ed Santiago <santiago@redhat.com>
Generated at build time from troubleshooting.md. Purpose is
to ship an actual man page to end users.
Much more complicated than initial guess, because there was
a bug in my Makefile man page filtering, the sed expression
that cleans up markdown that does not translate to roff.
All I've done here is reorder some of the expressions,
stripping off https links *before* we process
podman man page links.
Signed-off-by: Ed Santiago <santiago@redhat.com>
It qemu cannot be compiled anyway so make sure we do not try to compile
parts where the typechecker complains about on windows.
Also all the e2e test files are only used on linux as well.
pkg/machine/wsl also reports some error but to many for me to fix them
now. One minor problem was fixed in pkg/machine/machine_windows.go.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Now that we have propert !remote tags set everywhere we can just rely on
that and do not need to skip any dirs.
Also on linux do not lint three times, one remote run is enough.
We still have to skip the test dir for windows/macos though or we need
to add linux build tags there everywhere as well. This seems simpler.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This commit gets tests working under the new local-registry system:
* amend a few image names, mostly just sticking to a consistent
list of those images in our registry cache. Mostly minor
tag updates.
* trickier: pull_test: change some error messages, and remove
a test that's now a NOP. Basically, with a local (unprotected)
registry we always get "404 manifest unknown"; with a real
registry we'll get "403 I can't tell you".
* trickiest: seccomp_test: build our own images at run time,
with our desired labels. Until now we've been pulling
prebuilt images, but those will not copy to the local
cache registry. Something about v1? Anyhow, I gave up
trying to cache them, and the workaround is straightforward.
Also took the liberty of strengthening a few error-message checks
Signed-off-by: Ed Santiago <santiago@redhat.com>
While these are not really slow they still take about 100-250ms if I
time this locally. Given they are run for every test this adds up
quickly. Looking at CI logs I can see the timings for skipped
tests are all in 600ms range. So I think it is safe to assume that these
functions need to get faster.
We have over 670 test cases currently so we talk about over 400s spend
in these functions in CI. This allows for big gains.
Now overall this is a tricky trade of, while all tests should cleanup
after themselves there is no guarantee for that as such errors can be
leaked into other tests making debugging much harder. To work at least a
bit against this teardown checks if the test was successful and only
skips the podman commands bases on that. Without it a single flake could
cause all following tets to fail.
As such this commit does the proper setup once one suite start then only
after a test failed.
In order for this to work at all we have to fix all leaks first, see
previous commits. And then for the future keep a very strong eye on
this during reviews.
Also add a PODMAN_BATS_LEAK_CHECK option
By default test must cleanup themselves and to speed up CI we no longer
do any cleanup in teardown by default. However there is still many cases
where we might have to debug a leak so add a new PODMAN_BATS_LEAK_CHECK
env option that can be set and should cause teardown to fail if the test
did not cleanup properly.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Testing `podman system check` requires that we have a way to
intentionally introduce storage corruptions. Add a hidden `podman
testing` command that provides the necessary internal logic in
subcommands. Stub out the tunnel implementation for now.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
First of all this removes the need for a network connection, second
renovate can update the version as it is tracked in go.mod.
However the real important part is that the binary downloads are
broken[1]. For some reason the swagger created with them does not
include all the type information for the examples. However when building
from source the same thing works fine.
[1] https://github.com/go-swagger/go-swagger/issues/2842
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
when listing images through the restful service, consumers want to know
if the image they are listing is a manifest or not because the libpod
endpoint returns both images and manifest lists.
in addition, we now add `arch` and `os` as fields in the libpod endpoint
for image listing as well.
Fixes: #22184Fixes: #22185
Signed-off-by: Brent Baude <bbaude@redhat.com>
Belated followup to #21981. (Looks like I started to add this
functionality back in 2020 but left it unfinished. Tsk tsk.)
docs/source/Commands.rst is unnecessary duplication. It _should_
be autogenerated, but I can't figure out how to cleanly add
that to our Make process. This PR is an interim cross-check
until we get that resolved:
- everything in podman --help must have a matching entry
in Commands.rst (top-level commands only)
- check for dups and out-of-sequence in Commands.rst
- also for anything in Commands.rst that is not in --help
Fix existing mismatches in Commands.rst.
Also, #21784 removed a format specifier that I was using in
regression tests. Switch to using something else, to get
test passing again. Given the fact the correct solution
is autogenerating Commands.rst, I choose not to add new
tests for the rst xref.
Also, executive decision, remove volume.rst. It is not referenced
from anywhere, it looks like a lonely orphan remnant from days
of yore.
Signed-off-by: Ed Santiago <santiago@redhat.com>
pasta only works when we run as container_runtime_exec_t, now that pasta
is the default this means that the current binary will not work when
doing local dev without manually fixing the label.
There are also other parts where the correct label is important. So as a
simple fix always set the proper label in the bin/podman target.
This also means we can drop this line from the hack/bats script.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Followup to:
- #21060, where I added new struct checks (but did not make them fatal)
- #21534, which added per-interface stats and a .Network field,
but its documentation was slightly off
Signed-off-by: Ed Santiago <santiago@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
Specifically, Darwin's bash is very old so it doesn't support newer
features like `declare -A`. Reduce the complexity of the script
so that it can be used for all platforms. Comment heavily regarding the
scripts various execution contexts to prevent developers relying on
advanced features for any future modifications.
Signed-off-by: Chris Evich <cevich@redhat.com>
There are darwin-specific code paths which were not being linted prior
to this commit. Fix this with a new, darwin-specific section of the lint
runner script.
Signed-off-by: Chris Evich <cevich@redhat.com>
New CI validation check: all keys in quadlet.go must be
documented at least once in podman-systemd.unit.5.md.
Adding '// deprecated' next to an enum definition will
exclude said key from the documentation cross-checks.
And, because the md file lists keys in both table and block
form, make sure those all match.
And make sure everything is sorted in lexical order, in
both .go source and in man page.
And add a validation check to make sure it stays that way.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Initial impetus was #20958 (ps --format .Label abc). This is
a complicated solution to a simple-seeming problem.
The problem: .Label is a cobra *function*, something I did not
know about nor handle.
Solution: recognize cobra functions. Switch to __complete,
not __completeNoDesc, so we can see the number of arguments
required. Invent new man-page format for documenting functions.
And, finally, start enforcing how functions (and cobra structs)
are documented.
This discovered a never-used completion function, .Recycle(),
in podman-events. Remove it.
[NO NEW TESTS NEEDED] - the .go change is an excision of dead code.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Followup to #21055: regression tests for the code that
reads man pages. These are not xref-related at all, just
simple consistency checks on the man page content.
In the process of writing these tests, I also fixed a
longstanding bug where warning messages could be emitted
multiple times, once for each time we read a man page file
(as happens with command aliases).
Signed-off-by: Ed Santiago <santiago@redhat.com>
In the process of adding new functionality to the xref script,
I realized it is much too fragile. It's too easy to make some
minor change that could break the crossrefs, giving us the
illusion of testing.
Solution: add a test suite for the script. Still incomplete,
but an important step toward building confidence.
Requires minor surgery to the script itself
Signed-off-by: Ed Santiago <santiago@redhat.com>
There's a stanza in .cirrus.yml that only "runs" in
the treadmill cron job ... but that job is long gone.
The task actually runs in the buildah treadmill PR, #13808,
but that's not obvious to someone reading .cirrus.yml.
This is a maintenance burden. Remove it.
Because rootless bud tests are still important, and we
still want to run them in the treadmill PR, modify the
treadmill script itself so it (ugh) injects rootless jobs
into the buildah_bud test matrix. This is super fragile
but acceptable because I am the only one who ever runs
the treadmill script. I will notice if this breaks.
Signed-off-by: Ed Santiago <santiago@redhat.com>
I'm not sure about apparmor tag. Atleast runc isn't using it anymore.
"apparmor (since runc v1.0.0-rc93 the feature is always enabled)" from https://github.com/opencontainers/runc
containers-common still seems to check for apparmor, so not touching it for now.
Signed-off-by: Rahil Bhimjiani <rahil3108@gmail.com>
Some --filter descriptions listed the filters with asterisks,
i.e. markdown italics. There were 60+ of those, 250+ without
asterisks, so I choose to de-asterisk them all. Update the
xref script to remove the allow-asterisk exception. (Except
for the column title, which is sometimes written with two
asterisks--boldface--and sometimes plain).
Signed-off-by: Ed Santiago <santiago@redhat.com>
For all commands with a --filter option, cross-reference
against man pages, and vice-versa.
I'm sorry. I know this script has gone off the deep end.
[NO NEW TESTS NEEDED] although actually I would like to test some broken completions
Signed-off-by: Ed Santiago <santiago@redhat.com>
First do not lint pkg/domain/infra/abi with the remote tag as this is
only local code.
Then mark the cacheLibImage field as unused, this should be an unused
stub for the remote client so that we do not leak libimage.
The linter sees that with the remote tag so we need to silence that
warning.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Shortcuts like unix:path and unix:/path do not work everywhere,
so make sure to use unix://path when quoting the url (or address)
Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
The performance issue in #19467 drove me to add a benchmark for
system-df to avoid regressing on it in the future.
Comparing current HEAD to v4.6.0 yields
```
/home/vrothberg/containers/podman/bin/podman system df ran
201.47 times faster than /usr/bin/podman system df
```
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Check for duplicate subcommands, flags, and format specifiers.
I assumed this would never be necessary, that code review would
catch dups, but it happened (#19462). Prevent future ones.
Also, make it a fatal error for a --format to be undocumented,
except for 'podman inspect'. So many exceptions ... :(
Signed-off-by: Ed Santiago <santiago@redhat.com>
BATS 1.8.0 introduces tags: metadata that can be applied to
a single test or one entire file, then used for filtering
in a test run.
Issue #19299 introduces the possibility of using OpenQA
for podman reverse dependency testing: continuous CI on
all packages that can affect podman, so we don't go two
months with no bodhi builds then get caught by surprise
when systemd or kernel or crun change in ways that break us.
This PR introduces one bats tag, "distro-integration".
The intention is for OpenQA (or other) tests to install
the podman-tests package and run:
bats --filter-tags distro-integration /usr/share/podman/test/system
Goal is to keep the test list short and sweet: we do not
need to test command-line option parsing. We *DO* need to
test interactions with systemd, kernel, nethack, and other
critical components.
Signed-off-by: Ed Santiago <santiago@redhat.com>
**podman compose** is a thin wrapper around an external compose provider
such as docker-compose or podman-compose. This means that `podman
compose` is executing another tool that implements the compose
functionality but sets up the environment in a way to let the compose
provider communicate transparently with the local Podman socket. The
specified options as well the command and argument are passed directly
to the compose provider.
The default compose providers are `docker-compose` and `podman-compose`.
If installed, `docker-compose` takes precedence since it is the original
implementation of the Compose specification and is widely used on the
supported platforms (i.e., Linux, Mac OS, Windows).
If you want to change the default behavior or have a custom installation
path for your provider of choice, please change the `compose_provider`
field in `containers.conf(5)`. You may also set the
`PODMAN_COMPOSE_PROVIDER` environment variable.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The podman-login tests have accumulated much cruft over the
years, because that's the only place where we run a local
registry, and the process was crufty: we actually start/stopped
the registry as the first & last tests of the file. Meaning,
you couldn't do 'hack/bats 150:just-one-test' because that
would skip the registry start. And just now, a completely
unrelated test has had to be shoved into the login file.
This PR revamps the whole thing, by adding a new registry helper
module that can be used anywhere. And, once the registry is
started, it just stays running until the end of tests. (This
requires BATS 1.7 or greater).
Signed-off-by: Ed Santiago <santiago@redhat.com>
First: fix podman-registry script so it preserves the initial $PODMAN,
so all subsequent invocations of ps, logs, and stop will use the
same binary and arguments. Until now we've handled this by requiring
that our caller manage $PODMAN (and keep it the same), but that's
just wrong.
Next, simplify the golang interface: move the $PODMAN setting into
registry.go, instead of requiring e2e callers to set it. (This
could use some work: the local/remote conditional is icky).
IMPORTANT: To prevent registry.go from using the wrong podman binary,
the Start() call is gone. Only StartWithOptions() is valid now.
And, minor cleanup: comments, and add an actual error-message check
Reason for this PR is a recurring flake, #18355, whose multiple
failure modes I truly can't understand. I don't think this PR
is going to fix it, but this is still necessary work.
Signed-off-by: Ed Santiago <santiago@redhat.com>
hack/podman-registry --help option does not exist.
We need to use -h option when we want to see the usage message.
[NO NEW TESTS NEEDED]
Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
Work around a go-md2man bug, and add a check script to make sure
this doesn't hit us again.
Background: go-md2man can't deal with a left-hand column > 31 chars.
It produces man pages that look like:
| Something With >31 Character | |
| | ..description |
(should be all on one row). It also has trouble when the vertical
bars are misaligned: it completely removes the right-hand side.
There's almost certainly a better solution: fix go-md2man, or
use a different conversion tool, or maybe even pre/postprocess.
But this is a quick interim solution.
Sorry for the perl. This could be done in bash/sed/awk/grep,
but not with any sort of sane error messages.
Signed-off-by: Ed Santiago <santiago@redhat.com>
- treadmill script: run root & rootless in parallel, not
sequentially. It's only four jobs, and it seems dumb
to fix root tests, repush, then discover a rootless failure.
- apply-podman-deltas: implement skip_if_rootless(), and
use it to skip a nasty longstanding flake
- bud-tests-in-podman diffs: ugly code to fix a rootless hang.
background: rootless remote tests hang
cause: stray podman server process
root cause: no idea. No clue at all. I just gave up
workaround: seek out and kill stray server processes
Rootless buildah-bud tests are not run in regular CI,
only in the buildah treadmill.
Signed-off-by: Ed Santiago <santiago@redhat.com>
There are days when I really, really, really hate GNU. Remember
when someone decided that 'head -1' would no longer work, and
that it was OK to break an infinite number of legacy production
scripts? Someone now decided that egrep/fgrep are deprecated,
and our CI logs (especially pr-should-include-tests) are now
filled with hundreds of warning lines, making it difficult
to find actual errors.
I expect that those warnings will be removed quickly after
furious community backlash, just like the 'head -1' fiasco
was quietly reverted, but ITM the warnings are annoying
so I capitulate.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Porting them over to v2 requires a full rewrite.
IT is not clear who actually uses these benchmarks, Valentin who wrote
them originally is in favor of removing them. He recommends to use
script from hack/perf instead.
This commit also drop the CI integration, it is not clear who actually
uses this data. If it is needed for something please speak up.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
As long as podman uses a fork/exec model this eBPF program is able to trace the performance of each podman command and the resulting child processes from start to finish. This is an improvement to the already existing podmansnoop eBPF program which only looks at sched_process_exit and enter/exit sys_execve tracepoints.
Signed-off-by: Paul Wallrabe <54737071+raballew@users.noreply.github.com>
It flakes once or twice a day:
VERSION=1.51.1 ./hack/install_golangci.sh
Installing golangci-lint v1.51.1 into ./bin/golangci-lint
golangci/golangci-lint info checking GitHub for tag 'v1.51.1'
golangci/golangci-lint crit unable to find 'v1.51.1' - use 'latest'
or see https://github.com/golangci/golangci-lint/releases for details
No visibility into why, and no special reason to believe that
retrying five seconds later will work, but it seems worth a try.
Signed-off-by: Ed Santiago <santiago@redhat.com>
In February we started running rootless bud tests in cron (#17608).
That's nice, but nobody ever looks at cron results. The idea behind
adding a rootless task was to run it in the manual treadmill, too.
This PR enables that, and more clearly documents the how and why.
Signed-off-by: Ed Santiago <santiago@redhat.com>