Systemd doesn't support `never` and logs a warning, systemd uses no as
default so we do not have to specify it at all.
Check systemd.service(5) for the systemd docs.
Fixes#18743
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Reduce sleep-loop time in logs test, from 1s to 0.1s,
to make 'podman stop' take effect more quickly. With 1s,
and testing with 1s resolution, we get flakes.
Fixes: #17826
Signed-off-by: Ed Santiago <santiago@redhat.com>
The new exit-code propagation test is racy: 'podman wait' can
fail if the service container has already been cleaned up by
systemd.
Solution: run the inspect and wait tests opportunistically, i.e.,
only if those commands succeed. If they fail, confirm that they
fail with ENOSUCHCONTAINER. This may silently lose us some
coverage ... but none of it is important. The important
test, systemctl final status, remains.
Also, as drive-bys:
- add a FIXME comment documenting another race condition
that I'm not bothering to fix right now
- give distinct names to unit files, for readability in
test failures
Fixes: #18732
Signed-off-by: Ed Santiago <santiago@redhat.com>
"image rm concurrent" test is still failing, even after #18664:
Error: no contents in "/tmp/podman_test967723851/Dockerfile"
Probable cause: the images are built in parallel, and p.BuildImage()
writes one single Dockerfile. (This almost certainly renders the
test less effective than intended, since the generated images
might end up being identical).
Solution: write and use a uniquely-named Dockerfile
Signed-off-by: Ed Santiago <santiago@redhat.com>
When we do path completion in images a user could try to complete a
simple relative path, e.g. podman run $IMAGE e... should complete to etc
if this path exists in the image. Right now we panic in this case as the
current check didn't account for an empty string in simplePathJoinUnix().
In such a case return the path directly because we can not alter what
the user typed on the cli and must return a path without slash as well
in order for the shell to suggest the completion.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2209809
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There are quite a lot of places in podman were we have some signal
handlers, most notably libpod/shutdown/handler.go.
However when we rexec we do not want any of that and just send all
signals we get down to the child obviously. So before we install our
signal handler we must first reset all others with signal.Reset().
Also while at it fix a problem were the joinUserAndMountNS() code path
would not forward signals at all. This code path is used when you have
running containers but the pause process was killed.
Fixes#16091
Given that signal handlers run in different goroutines parallel it would
explain why it flakes sometimes in CI. However to my understanding this
flake can only happen when the pause process is dead before we run the
podman command. So the question still is what kills the pause process?
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Use ExecStopPost instead of ExecStop to make sure containers, pods, etc.
are all cleaned up even in case of an error.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add a new field `ExitCodePropagation` field to allow for configuring the
newly added functionality of controlling how the main PID of a kube
service exits.
Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd. The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.
When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint. In that
case, before stopping the service container, Podman will lookup the exit
codes of all non-infra containers. The service will then behave
according to the following three exit-code policies:
- `none`: exit 0 and ignore containers (default)
- `any`: exit non-zero if _any_ container did
- `all`: exit non-zero if _all_ containers did
The upper values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.
In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.
Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero. The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero. Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.
Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
RHEL gating tests failing, because (sigh) journalctl doesn't
work rootless on RHEL.
I think the flake is fixed anyway, so we don't need this.
This reverts commit ba141adce4.
Signed-off-by: Ed Santiago <santiago@redhat.com>
This test is intended to test concurrent removals, so don't
risk a removal breaking a build.
Fixes#18659 .
(The sitaution that removals can break a build WIP is a real
problem that should be fixed, but that's not a target of this test.)
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Instrument system tests in hopes of tracking down #17216,
the unlinkat-ebusy-hosed flake.
Oh, also, timestamp.awk: timestamps have always been UTC, but
add a 'Z' to make it unambiguous.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Fixes: https://github.com/containers/podman/issues/18239
[NO NEW TESTS NEEDED]
@test "podman build -f test" in test/system/070-build.bats
Will test this. This was passing when run on a local system since
the remote end was using the clients path to read the Containerfile
The issue is it would not work in a podman machine since the
Containerfile would/should be a different path.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Make sure to look for the container's exit code when it's in stopped
state. With `--restart=always`, the container seems to stay in the
stopped state which led the wait logic to loop until the 20 seconds
timeout for the cleanup process to have finished kicks in.
Also defensively make sure to loop when the container is in stopped
state but no exit code has been written yet.
Add a regression test to make sure Podman doesn't wait more than 20
seconds. Even on a CI machine under high load I expect it to take much
much much less than that, so I do not expect this test to flake in the
future.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The examples show that --dns-add 8.8.8.8,1.1.1.1 is valid but it fails,
fix this by using StringSliceVar which splits at commas.
Added tests to ensure it is working.
Fixes#18632
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
b25b330306 introduced this behaviour.
It was fine at the time because we didn't support "container update",
so the limit could not be changed at runtime. Since it is not
possible to change the memory limit at runtime, read the limit as
reported from the cgroup.
https://github.com/containers/crun/pull/1217 is required for crun.
Closes: https://github.com/containers/podman/issues/18621
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We should not change selinux, in a parallel context this can change the
behavior of other tests and we should never disable selinux anyway.
Lets see if this passes CI or not.
Fixes#18564
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
the combination --pod and --userns is already blocked. Ignore the
PODMAN_USERNS variable when a pod is used, since it would cause to
create a new user namespace for the container.
Ideally a container should be able to do that, but its user namespace
must be a child of the pod user namespace, not a sibling. Since
nested user namespaces are not allowed in the OCI runtime specs,
disallow this case, since the end result is just confusing for the
user.
Closes: https://github.com/containers/podman/issues/18580
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Enable the --configmap flag for the remote case of podman
kube play. Users can pass in the paths to the configmap files
for kube play to use when creating the pods and containers from
a kube yaml file. The configmap file is read and the contents are
appended to the contents of the main yaml file before passed to the
remote client.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Accept a tag in the compat api endpoint. For the fromImage param we
already parse it but for fromSrc we did not.
Fixes#18597
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Init containers are currently not properly supported in
generate-systemd and there are no plans to do so since
all focus lies on Quadlet going forward.
Hence, generate systemd should through an error.
Closes: #18585
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
In run_podman(), display a nanosecond-level timestamp next to
each command and its output.
Because this clutters the results, teach logformatter to grok
these new timestamps, strip them, and display a more human-readable
time delta in the left-hand timestamp column. logformatter started off
as a mess and is now, well, 🤮. I'm sorry. I just hope its results
make it worthwhile.
Signed-off-by: Ed Santiago <santiago@redhat.com>
When running ginkgo tests locally we often only want to test a small
subset. I think most people just add the `FIt` block but then you need
to remember to undo that before pushing the changes.
With this change you can just run:
```
make localintegration FOCUS="test name here"
make localintegration FOCUS_FILE="some_test.go"
```
I updated the test Readme to use this new syntax.
The options just map to the ginkgo options, see the upstream docs
linked in the readme for more information about syntax.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Followup to #18578: move Serial to Describe(), in case new
tests get added to this module. And, explain the reasoning.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Reason: gpg tests all run with a different GNUPGHOME, and gpg-agent
does not like that, and there's no longer any way to run gpg
without the agent. So, do not run these tests in parallel, and
clean up agent after each test.
Fixes: #17966 (I hope)
May also fix#18358 but it will take some time to be sure.
Signed-off-by: Ed Santiago <santiago@redhat.com>
...in three kube tests. And, missing error-message checks.
And, reverse the sense of a confusing Expect(), plus add
a description to the test failure. And, set never-restart,
otherwise our "podman wait" will spin for an indeterminate
time.
Signed-off-by: Ed Santiago <santiago@redhat.com>
There is no reason to define the same code every time in each file, just
use global nodes. This diff should speak for itself.
CleanupSecrets()/Volume() no longer call Cleanup() directly, as the
global AfterEach node will always call Cleanup() this is no longer
necessary. If one AfterEach() node fails it will still run the others.
Also always unset the CONTAINERS_CONF env vars. This prevents people
from forgetting to unset it. And fix the special CONTAINERS_CONF logic
in the system connection tests, we do not want to preserve
CONTAINERS_CONF anyway so just remove this logic.
Ginkgo orders the BeforeEach and AfterEach nodes. They will be executed
from the outer-most defined to inner-most. This means our global
BeforeEach is always first. Only then the inner one (in the Describe()
function in each file). For AfterEach it is inverted, from the inner to
the outer.
Also see https://onsi.github.io/ginkgo/#organizing-specs-with-container-nodes
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Make sure that the directory formats are not just substituted with their
archive counterparts but actually tar'ed up directories. Also make sure
that the clients don't get chown errors by setting rootless user and
group ID instead of O when running in the user namespace.
Fixes: #15897
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
These annotations can have security implications - crun, for
example, allows rootless containers to preserve the user's groups
through an annotation. We absolutely should not include
annotations from an untrusted image off the internet by default.
We may consider whitelisting some annotations (e.g. the legacy
WASM annotations), but given that there is now a more explicit
way of specifying an image uses the WASM runtime in the OCI image
spec, I'm just tearing this out entirely for now.
Signed-off-by: Matt Heon <mheon@redhat.com>
Read the entire YAML file in case of a multi-doc file
Adjust the unit test
Add a system test
Add comment in the man page
Signed-off-by: Ygal Blum <ygal.blum@gmail.com>
Several tweaks to see if we can track down #17216, the unlinkat-ebusy
flake:
- teardown(): if a cleanup command fails, display it and its
output to the debug channel. This should never happen, but
it can and does (see #18180, dependent containers). We
need to know about it.
- selinux tests: use unique pod names. This should help when
scanning journal logs.
- many tests: add "-f -t0" to "pod rm"
And, several unrelated changes caught by accident:
- images-commit-with-comment test: was leaving a stray image
behind. Clean it up, and make a few more readability tweaks
- podman-remote-group-add test: add an explicit skip()
when not remote. (Otherwise, test passes cleanly on
podman local, which is misleading)
- lots of container cleanup and/or adding "--rm" to run commands,
to avoid leaving stray containers
Signed-off-by: Ed Santiago <santiago@redhat.com>
Yet another case of missing podman-wait. In these two, I see
no reason to run containers detached, so I just removed "-d"
Signed-off-by: Ed Santiago <santiago@redhat.com>
Another low-hanging fruit: test flake because podman-remote
trying to contact a server that hadn't come up.
Fixes: #17940
Signed-off-by: Ed Santiago <santiago@redhat.com>
Run $QUADLET and all systemctl/journalctl commands using 'timeout'.
Nothing should ever, ever take more than the default 2 minutes.
Followup to #18514, in which quadlet tests are found to be
taking 9-10 minutes.
Signed-off-by: Ed Santiago <santiago@redhat.com>
- document env vars that can be used
- list up to date dependencies
- remove unnecessary GOPATH mention, no longer needed with gomodules
- use make targets to tests everything (much faster due `-p` option)
- remove tests in container section as make shell is not a valid target
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
In rootFsSize(), instead of calculating the size of the diff for every
layer of the container's base image, ask the storage library for the sum
of the values it recorded when it first wrote those layers.
In a similar fashion, teach rwSize() to use the library's
ContainerSize() method instead of trying to roll its own.
Replace calls to pkg/util.SizeOfPath() with calls to
github.com/containers/storage/pkg/directory.Size(), which does the same
thing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Yet another case where tests expect play-kube to be synchronous.
There are probably dozens more of these.
Signed-off-by: Ed Santiago <santiago@redhat.com>
Fixes: https://github.com/containers/podman/issues/16354
Currently we check on the server side, which ends up generating a bad
error message.
$ podman --remote build foo/
ERRO[0000] While reading directory /home/dwalsh/go/src/github.com/containers/podman/foo: EOF
Error: stat /var/tmp/libpod_builder1249622306/build/Dockerfile: no such file or directory
With this change you will get
./bin/podman --remote build foo/
Error: Containerfile not specified and no Containerfile or Dockerfile found in context directory, /home/dwalsh/podman/foo
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
- treadmill script: run root & rootless in parallel, not
sequentially. It's only four jobs, and it seems dumb
to fix root tests, repush, then discover a rootless failure.
- apply-podman-deltas: implement skip_if_rootless(), and
use it to skip a nasty longstanding flake
- bud-tests-in-podman diffs: ugly code to fix a rootless hang.
background: rootless remote tests hang
cause: stray podman server process
root cause: no idea. No clue at all. I just gave up
workaround: seek out and kill stray server processes
Rootless buildah-bud tests are not run in regular CI,
only in the buildah treadmill.
Signed-off-by: Ed Santiago <santiago@redhat.com>