Being able to easily identify which lock has been allocated to a
given Libpod object is only somewhat useful for debugging lock
issues, but it's trivial to expose and I don't see any harm in
doing so.
Signed-off-by: Matt Heon <mheon@redhat.com>
setupPasta() has logic to handle forwarding of TCP or UDP ports. It also
has what looks like logic to give an error when asked to forward ports of
any other protocol. However, a straightforward bug means it will in fact
only give that error if you try to use a protocol literally called
"default". Other unknown protocols fall through and result in a
nonsensical pasta command line, which will almost certainly cause a
cryptic error later on.
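For illustration, the fix amounts to restoring a real `default:` branch; a minimal sketch, not the actual setupPasta() code, and the flag names are illustrative:
```go
package main

import (
	"fmt"
	"strings"
)

// forwardArg sketches the corrected handling: the broken version used
// `case "default":` where a plain `default:` branch was intended, so
// unknown protocols fell through without an error.
func forwardArg(protocol string, ports []string) (string, error) {
	switch protocol {
	case "tcp":
		return "--tcp-ports=" + strings.Join(ports, ","), nil
	case "udp":
		return "--udp-ports=" + strings.Join(ports, ","), nil
	default: // previously `case "default":`, which never matched real input
		return "", fmt.Errorf("can't forward protocol: %s", protocol)
	}
}
```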
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current way of bind mounting the host timezone file has problems.
Because /etc/localtime in the image may exist as a symlink into
/usr/share/zoneinfo, the bind mount ends up overwriting the symlink's
target file. That confuses timezone parsers, especially Java, where this
approach does not work at all, and we end up with a link that does not
reflect the actual timezone.
The better way is to change the symlink in the image, just like it is
done on the host. However, because not all images ship tzdata, we cannot
rely on that alone. So now we do both: when tzdata is installed in the
image, update the symlink; if not, keep the current way of copying the
host timezone file into the container at /etc/localtime.
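A minimal sketch of the new decision, with hypothetical helper and path names rather than the actual Podman implementation:
```go
package main

import (
	"os"
	"path/filepath"
)

// configureLocaltime is a sketch: if the image ships tzdata, point
// /etc/localtime at the zone via a symlink, just as done on the host;
// otherwise fall back to copying the host's timezone file.
func configureLocaltime(containerRoot, zone string) error {
	zoneinfo := filepath.Join(containerRoot, "usr/share/zoneinfo", zone)
	localtime := filepath.Join(containerRoot, "etc/localtime")
	if _, err := os.Stat(zoneinfo); err == nil {
		// tzdata is installed in the image: mirror the host's symlink.
		os.Remove(localtime)
		return os.Symlink(filepath.Join("/usr/share/zoneinfo", zone), localtime)
	}
	// No tzdata in the image: copy the host's /etc/localtime instead.
	data, err := os.ReadFile("/etc/localtime")
	if err != nil {
		return err
	}
	return os.WriteFile(localtime, data, 0o644)
}
```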
Also note that we need to rebuild the systemd image to include tzdata in
order to test this, as our images do not contain tzdata by default.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2149876
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd. The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.
When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint. In that
case, before stopping the service container, Podman will look up the exit
codes of all non-infra containers. The service will then behave
according to the following three exit-code policies:
- `none`: exit 0 and ignore containers (default)
- `any`: exit non-zero if _any_ container did
- `all`: exit non-zero if _all_ containers did
These values can be passed via a hidden `kube play
--service-exit-code-propagation` flag, which can be used by tests and
later on by Quadlet.
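Conceptually, the policy evaluation boils down to something like the following sketch (not the actual Podman code; the concrete non-zero exit value is illustrative):
```go
package main

// serviceExitCode sketches how the three policies could be evaluated
// over the exit codes of all non-infra containers.
func serviceExitCode(policy string, exitCodes []int) int {
	nonZero := 0
	for _, code := range exitCodes {
		if code != 0 {
			nonZero++
		}
	}
	switch policy {
	case "any":
		if nonZero > 0 {
			return 125 // at least one container failed -> fail the service
		}
	case "all":
		if len(exitCodes) > 0 && nonZero == len(exitCodes) {
			return 125 // every container failed -> fail the service
		}
	}
	return 0 // "none" (default): ignore container exit codes
}
```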
In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.
Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero. The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero. Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.
Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Make sure to prune container exit codes only when the associated
container no longer exists. This is needed when checking whether any
container in kube-play exited non-zero and is a building block for the
Jira card linked below.
[NO NEW TESTS NEEDED] - there are no unit tests for exit code pruning.
Jira: https://issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Use a helper to handle the cleanupErr logic instead of
copy&pasting it EIGHT times.
This also wraps the returned errors with context and
changes the text of the logged errors a bit.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Use a shared helper instead of copy&pasting the handling
of cleanupErr EIGHT times.
This slightly changes the wording of the logged error text and,
in one case, the returned error.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
[NO NEW TESTS NEEDED]
... because testing this would require us to intentionally
create an inconsistent state, which should ideally not be possible...
(and because at this point I don't even know what the reported failure
was.)
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Make sure to look for the container's exit code when it's in the stopped
state. With `--restart=always`, the container seems to stay in the
stopped state, which caused the wait logic to loop until the 20-second
timeout for the cleanup process kicked in.
Also, defensively make sure to keep looping when the container is in the
stopped state but no exit code has been written yet.
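The waiting behaviour described above is essentially a bounded polling loop; a standalone sketch, not Podman's actual wait code:
```go
package main

import (
	"errors"
	"time"
)

// waitForExitCode keeps polling until the container is stopped *and*
// an exit code has been written, or gives up after the timeout.
func waitForExitCode(getState func() (stopped bool, code *int32), timeout time.Duration) (int32, error) {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		stopped, code := getState()
		if stopped && code != nil {
			return *code, nil
		}
		time.Sleep(100 * time.Millisecond)
	}
	return -1, errors.New("timed out waiting for container exit code")
}
```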
Add a regression test to make sure Podman doesn't wait more than 20
seconds. Even on a CI machine under high load I expect it to take much
much much less than that, so I do not expect this test to flake in the
future.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
b25b330306 introduced this behaviour.
It was fine at the time because we didn't support "container update",
so the limit could not be changed at runtime. Since it is now
possible to change the memory limit at runtime, read the limit as
reported from the cgroup.
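On a cgroup v2 host that boils down to reading the controller file; a minimal sketch with an illustrative path layout (cgroup v1 handling omitted):
```go
package main

import (
	"math"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readMemoryLimit reports the limit the kernel currently enforces
// rather than the value stored in the container spec.
// cgroupPath is e.g. "/sys/fs/cgroup/machine.slice/libpod-<id>.scope".
func readMemoryLimit(cgroupPath string) (uint64, error) {
	data, err := os.ReadFile(filepath.Join(cgroupPath, "memory.max"))
	if err != nil {
		return 0, err
	}
	limit := strings.TrimSpace(string(data))
	if limit == "max" { // no limit configured
		return math.MaxUint64, nil
	}
	return strconv.ParseUint(limit, 10, 64)
}
```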
https://github.com/containers/crun/pull/1217 is required for crun.
Closes: https://github.com/containers/podman/issues/18621
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
As shown in #17831, WAL mode plays a role in causing `database is locked`
errors. Those are errors that, in theory, should not happen, as the DB
should busy-wait. mattn/go-sqlite3/issues/274 has some comments indicating
that the busy handler behaves differently in WAL mode, which may explain
the error.
For now, let's disable WAL mode and only re-enable it once we have a
clearer understanding of what's going on. Neither the upstream issue nor
the SQLite documentation gives me the clear guidance I would need.
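With the mattn/go-sqlite3 driver the journal mode is selected via DSN parameters; disabling WAL looks roughly like this (the connection string is illustrative, not Podman's exact one):
```go
package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3"
)

// openDB opens the database with WAL disabled (journal mode DELETE)
// while keeping a busy timeout so concurrent writers wait instead of
// failing with "database is locked".
func openDB(path string) (*sql.DB, error) {
	return sql.Open("sqlite3", "file:"+path+"?_journal_mode=DELETE&_busy_timeout=100000")
}
```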
[NO NEW TESTS NEEDED] - flake is only reproducible in CI.
Fixes: #18356
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
In rootFsSize(), instead of calculating the size of the diff for every
layer of the container's base image, ask the storage library for the sum
of the values it recorded when it first wrote those layers.
In a similar fashion, teach rwSize() to use the library's
ContainerSize() method instead of trying to roll its own.
Replace calls to pkg/util.SizeOfPath() with calls to
github.com/containers/storage/pkg/directory.Size(), which does the same
thing.
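A minimal sketch of the replacement calls, assuming an already-initialized storage.Store and the c/storage signatures vendored at the time:
```go
package main

import (
	"github.com/containers/storage"
	"github.com/containers/storage/pkg/directory"
)

// containerSizes asks the storage library for sizes instead of
// re-walking layers by hand.
func containerSizes(store storage.Store, containerID, dir string) (rw int64, dirSize int64, err error) {
	// Size of the container's writable layer, as recorded by the library.
	rw, err = store.ContainerSize(containerID)
	if err != nil {
		return 0, 0, err
	}
	// Drop-in replacement for pkg/util.SizeOfPath().
	dirSize, err = directory.Size(dir)
	if err != nil {
		return 0, 0, err
	}
	return rw, dirSize, nil
}
```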
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
If the container was already cleaned up, we should not try to do it
again. `podman stop` will always try to call Cleanup(): if you look at
the podman event log and keep calling `podman stop --all`, you see a
cleanup event every time. This is not wanted. Also, in the case of the
host pidns we report an error every single time; see the linked issue.
Fixes #18460
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
The remote API will wait 300s by default before conmon calls the
cleanup. In the meantime, if you inspect an exec session that was started
with ExecStart() (i.e., not attached) and has already exited, we do not
know that, so a caller inspecting it will think it is still running. To
prevent this, we should sync the session based on the exec PID and update
the state accordingly.
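The core of such a sync is a PID liveness probe; a minimal sketch (the real sync also has to record the exit code once the session is known to be gone):
```go
package main

import "syscall"

// execSessionAlive probes the process with signal 0, which checks for
// existence without affecting it. EPERM means the process exists but
// belongs to another user.
func execSessionAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	err := syscall.Kill(pid, 0)
	return err == nil || err == syscall.EPERM
}
```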
For a reproducer see the test in this commit or the issue.
Fixes #18424
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Podman kube generate now uses the pod's restart policy
when generating the kube yaml. If generating from containers
only, the restart policy of the first non-init container is used.
Podman kube play applies the pod restart policy from the yaml
file to the pod. The containers within a pod inherit this restart
policy.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Add a Restarts column to the podman ps output to show how many times a
container was restarted based on its restart policy. This column is
displayed when `--format={{.Restarts}}` is used.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Add a --restart flag to pod create to allow users to set the
restart policy for the pod, which applies to all the containers
in the pod. This reuses the restart policy already available for
containers and offers the same restart policy options.
Add "never" to the restart policy options to match k8s syntax.
It is a synonym for "no" and does exactly the same thing: the
containers are not restarted once exited.
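As a sketch, the flag validation could fold "never" into "no" like this (the accepted values shown are illustrative and mirror the container restart policies):
```go
package main

import "fmt"

// normalizeRestartPolicy treats the k8s-style "never" as a synonym for
// "no" and rejects anything it does not recognize.
func normalizeRestartPolicy(policy string) (string, error) {
	switch policy {
	case "never":
		return "no", nil // same behaviour as "no"
	case "", "no", "on-failure", "always", "unless-stopped":
		return policy, nil
	default:
		return "", fmt.Errorf("invalid restart policy %q", policy)
	}
}
```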
Only containers that have exited will be restarted based on the
restart policy; running containers will not be restarted when an exited
container in the same pod is restarted (same as is done in k8s).
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Fix some issues with the handling of errors: print an error only
when there is already another one set to be returned. The first error
is not printed, since it is reported back to the caller of the function.
Improve some messages with more context that can be helpful when
things go wrong.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
go-swagger has a flat namespace, so it doesn't handle name conflicts at
all. As a result, the libpod info response uses the Info struct from some
docker dependency instead. Because we cannot change the docker dependency,
simply rename the Info struct, but only via the swagger comment, not the
actual Go struct.
I verified locally that this works.
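A sketch of what such a swagger-comment-only rename looks like (the model name and field shown here are illustrative, not the actual ones):
```go
package main

// The Go type keeps its name; only the name in the generated swagger
// spec changes, so nothing outside the spec is affected.

// swagger:model LibpodInfo
type Info struct {
	Host string `json:"host"`
}
```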
Fixes #18228
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We can now use the new API for creating files and directories without
setting the umask, which allows those methods to be used in parallel.
This patch also bumps c/common for that.
[NO NEW TESTS NEEDED]
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Ref: https://pkg.go.dev/math/rand@go1.20#Seed
Note: For `runtime_test.go`, this test case was never actually doing
what its apparent intent was. Fixing it to work as intended would
require incredibly libpod-invasive changes. Do the least-worst thing and
simply confirm that consecutive generated names are different.
Signed-off-by: Chris Evich <cevich@redhat.com>
According to an old upstream issue [1]: "If the first statement after
BEGIN DEFERRED is a SELECT, then a read transaction is started.
Subsequent write statements will upgrade the transaction to a write
transaction if possible, or return SQLITE_BUSY."
So let's move the first SELECT under the same transaction as the table
initialization.
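In database/sql terms the change amounts to something like this simplified sketch (table and column names are illustrative):
```go
package main

import "database/sql"

// initDB runs the table creation and the first read in one transaction,
// so the transaction starts with a write statement instead of a SELECT
// that later needs upgrading.
func initDB(db *sql.DB) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	if _, err := tx.Exec(`CREATE TABLE IF NOT EXISTS DBConfig (ID INTEGER PRIMARY KEY, SchemaVersion INTEGER)`); err != nil {
		return err
	}
	var schemaVersion int
	err = tx.QueryRow(`SELECT SchemaVersion FROM DBConfig WHERE ID = 1`).Scan(&schemaVersion)
	if err != nil && err != sql.ErrNoRows {
		return err
	}
	return tx.Commit()
}
```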
[NO NEW TESTS NEEDED] as it's a hard-to-cause race.
[1] https://github.com/mattn/go-sqlite3/issues/274#issuecomment-1429054597
Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
If the user specifies commit --format, we were not setting it before the
commit; this caused warning messages to be printed that made no sense.
Fixes: https://github.com/containers/podman/issues/17773
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Commit 1ab833fb73 improved the situation but it is still not enough.
If you run short-lived containers with --restart=always, podman is
basically permanently restarting them. The only way to stop this is
podman stop. However, podman stop does not do anything when the
container is already in a non-running state. While this makes sense, we
should still mark the container as explicitly stopped by the user.
Together with the change in shouldRestart(), which now checks for
StoppedByUser, this makes sure the cleanup process is not going to start
it back up again.
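A simplified sketch of that interaction (field and method names condensed from the real code):
```go
package main

// podman stop now records the user's intent even when the container is
// not running, and the restart check honours it.
type container struct {
	running       bool
	restartPolicy string
	stoppedByUser bool
}

func (c *container) stop() {
	// Previously this returned early for non-running containers without
	// recording anything, so the cleanup process restarted them again.
	c.stoppedByUser = true
	if !c.running {
		return
	}
	// ... actually stop the container ...
	c.running = false
}

func (c *container) shouldRestart() bool {
	return c.restartPolicy == "always" && !c.stoppedByUser
}
```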
A simple reproducer is:
```
podman run --restart=always --name test -d alpine true
podman stop test
```
then check if the container is still running. The behavior is very
flaky; it took me about 20 podman stop tries before I finally hit the
correct window where it was stopped permanently.
With this patch it worked on the first try.
Fixes #18259
[NO NEW TESTS NEEDED] This is super flaky and hard to correctly test
in CI. My ginkgo v2 work seems to trigger this in the play kube tests,
so that should catch at least some regressions. Also, this may be
something that should be tested by users at podman test days (#17912).
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It really doesn't make sense to match the version one to one;
this just requires us to update it manually every time.
Use a regex instead.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Make sure to tear down the netns again on errors. This is needed when a
later call fails and we have not yet stored the netns in the
container state.
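The fix follows the usual defer-on-error pattern; a sketch with hypothetical helpers passed in as parameters rather than Podman APIs:
```go
package main

import "fmt"

// setupNetNS tears the netns down again if any later step fails,
// because at that point nothing else tracks it yet.
func setupNetNS(create func() (string, error), configure func(string) error, teardown func(string) error) (retErr error) {
	nsPath, err := create()
	if err != nil {
		return err
	}
	defer func() {
		if retErr != nil {
			if terr := teardown(nsPath); terr != nil {
				fmt.Printf("tearing down netns after failed setup: %v\n", terr)
			}
		}
	}()
	return configure(nsPath)
}
```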
[NO NEW TESTS NEEDED] My ginkgo-v2 PR will catch problems like this once
merged.
Fixes #18205
Signed-off-by: Paul Holzinger <pholzing@redhat.com>