When the new mount API is available, the OCI runtime doesn't require
that each parent directory of a bind mount be accessible.
Instead, the mount source is opened in the initial user namespace and
passed down to the container init process.
This requires that the kernel supports the new mount API and that the
OCI runtime uses it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
I believe the previous code meant to use cmd.Run instead of cmd.Start.
The issue is that cmd.Start returns before the command has finished
executing, so the conditional body checking for the stderr of the
command never gets executed.
Raise the cmd.Start up into its own conditional, which checks whether
the process could be started. Then we consume stderr, check for
some specific strings in the output, and then finally continue on with
the rest of the code.
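
A minimal sketch of the Start/Wait split described above, not the actual
Podman code. The helper name stderrContains and its marker parameter are
hypothetical, and the example assumes an `sh` binary is available:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// stderrContains runs the command to completion and reports whether its
// stderr output contains marker. The function name and the marker parameter
// are hypothetical; they are not taken from the Podman source.
func stderrContains(marker, name string, args ...string) (bool, error) {
	var stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &stderr

	// Start only reports whether the process could be launched.
	if err := cmd.Start(); err != nil {
		return false, fmt.Errorf("starting %s: %w", name, err)
	}

	// Wait blocks until the process has exited and stderr has been copied
	// into the buffer, so the check below sees the complete output.
	waitErr := cmd.Wait()
	return strings.Contains(stderr.String(), marker), waitErr
}

func main() {
	found, err := stderrContains("boom", "sh", "-c", "echo boom >&2; exit 1")
	fmt.Println("marker found:", found, "error:", err)
}
```

Inspecting the buffer right after Start would race with the child process,
which is the bug the message describes.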
Signed-off-by: Keith Johnson <kj@ubergeek42.com>
Conmon writes the exit file and oom file (if container
was oom killed) to the persist directory. This directory
is retained across reboots as well.
Update podman to create a persist-dir/ctr-id directory for each
container, to which the exit and oom files are written. The oom
state of the container is set after reading the files
from the persist-dir/ctr-id directory.
The exit code is still read from the exit file in
the exits directory.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
Moving from Go module v4 to v5 prepares us for public releases.
Move done using gomove [1] as with the v3 and v4 moves.
[1] https://github.com/KSubedi/gomove
Signed-off-by: Matt Heon <mheon@redhat.com>
Use the new rootlessnetns logic from c/common, drop the podman code
here and make use of the new much simpler API.
ref: https://github.com/containers/common/pull/1761
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
add a new option --preserve-fd that allows specifying a list of FDs to
pass down to the container.
It is similar to --preserve-fds, but it takes a list of FDs
instead of the maximum FD number to preserve.
--preserve-fd and --preserve-fds are mutually exclusive.
It requires crun since runc would complain if any fd below
--preserve-fds is not preserved.
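
A rough sketch of how such a list and the mutual-exclusion rule could be
validated. The function parsePreserveFD and the comma-separated input
format are assumptions made for illustration, not the actual Podman
implementation:

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parsePreserveFD turns a comma-separated list such as "3,5,9" into FD
// numbers. preserveFDs stands for the value of --preserve-fds; both the
// function and the list format are hypothetical.
func parsePreserveFD(list string, preserveFDs uint) ([]uint, error) {
	if list == "" {
		return nil, nil
	}
	if preserveFDs != 0 {
		return nil, errors.New("--preserve-fd and --preserve-fds are mutually exclusive")
	}
	var fds []uint
	for _, field := range strings.Split(list, ",") {
		fd, err := strconv.ParseUint(strings.TrimSpace(field), 10, 32)
		if err != nil {
			return nil, fmt.Errorf("invalid file descriptor %q: %w", field, err)
		}
		fds = append(fds, uint(fd))
	}
	return fds, nil
}

func main() {
	fds, err := parsePreserveFD("3,5,9", 0)
	fmt.Println(fds, err)
}
```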
Closes: https://github.com/containers/podman/issues/20844
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
All `[]string`s in containers.conf have now been migrated to attributed
string slices which require some adjustments in Buildah and Podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
commit 7ade972102 introduced the change
that caused an issue in crun since it forces the root user session
instead of the system one when DBUS_SESSION_BUS_ADDRESS is set.
I am addressing it in crun, but for the time being, let's also not
pass the variable down to conmon since the assumption is that when
running as root the containers must be created on the system bus.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Pass the _entire_ environment to conmon instead of selectively enabling
only specific variables. The main reasoning is to make sure that conmon
and the podman-cleanup callback process operate in the exact same
environment as the initial podman process. Some configuration files
may be passed via environment variables. Podman not passing those down
to conmon has led to subtle and hard to debug issues in the past, so
passing all down will avoid such kinds of issues in the future.
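
To illustrate the difference between the two approaches, here is a small
sketch. The helper names allowlistedEnv and fullEnv are hypothetical and
are not the functions used in Podman:

```go
package main

import (
	"fmt"
	"os"
)

// allowlistedEnv mimics the selective approach: only the named variables
// are forwarded, so anything not listed is silently dropped.
func allowlistedEnv(keys ...string) []string {
	env := make([]string, 0, len(keys))
	for _, key := range keys {
		if value, ok := os.LookupEnv(key); ok {
			env = append(env, key+"="+value)
		}
	}
	return env
}

// fullEnv forwards the entire environment of the current process and
// appends any extra KEY=VALUE entries the child needs on top.
func fullEnv(extra ...string) []string {
	return append(os.Environ(), extra...)
}

func main() {
	fmt.Println(len(allowlistedEnv("PATH", "TMPDIR")), "variables with the allowlist")
	fmt.Println(len(fullEnv()), "variables with the full environment")
}
```

The returned slice could be assigned to the Env field of an exec.Cmd
before starting the child process.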
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The processing and setting of the static and volume directories was
scattered across the code base (including c/common) leading to subtle
errors that surfaced in #19938.
There were multiple issues that I try to summarize below:
- c/common loaded the graphroot from c/storage to set the defaults for
static and volume dir. That ignored Podman's --root flag and
surfaced in #19938 and other bugs. c/common does not set the
defaults anymore which gives Podman the ability to detect when the
user/admin configured a custom directory (not empty value).
- When parsing the CLI, Podman (ab)uses containers.conf structures to
set the defaults but also to override them in case the user specified
a flag. The --root flag overrode the static dir which is wrong and
broke a couple of use cases. Now there is a dedicated field for in
the "PodmanConfig" which also includes a containers.conf struct.
- The defaults for static and volume dir and now being set correctly
and adhere to --root.
- The CONTAINERS_CONF_OVERRIDE env variable has not been passed to the
cleanup process. I believe that _all_ env variables should be passed
to conmon to avoid such subtle bugs.
Overall I find that the code and logic are scattered and hard to
understand and follow. I refrained from larger refactorings as I really
just want to get #19938 fixed and then go back to other priorities.
https://github.com/containers/common/pull/1659 broke three pkg/machine
tests. Those have been commented out until getting fixed.
Fixes: #19938
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The --syslog flag has not been passed to the cleanup process (i.e.,
conmon's exit args) complicating debugging quite a bit.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Under some circumstances podman tries to kill a container
using signal 37, for which unix.SignalName() returns "".
Not helpful. So, when that happens, show "(signal number)".
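
A sketch of the fallback, using only the standard library. The map
knownSignalNames is a tiny stand-in for unix.SignalName, and the "(37)"
rendering is one plausible reading of "(signal number)"; both are
assumptions:

```go
package main

import (
	"fmt"
	"syscall"
)

// knownSignalNames is a small stand-in for unix.SignalName, which returns
// "" for signals it has no name for. The table is illustrative only.
var knownSignalNames = map[syscall.Signal]string{
	syscall.SIGKILL: "SIGKILL",
	syscall.SIGTERM: "SIGTERM",
}

// signalLabel returns the signal name, or the number in parentheses when
// no name is known.
func signalLabel(sig int) string {
	if name := knownSignalNames[syscall.Signal(sig)]; name != "" {
		return name
	}
	return fmt.Sprintf("(%d)", sig)
}

func main() {
	fmt.Println(signalLabel(9), signalLabel(37))
}
```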
Signed-off-by: Ed Santiago <santiago@redhat.com>
when the "kill" command fails, print the stderr from the OCI runtime
only after we check the container state.
It also simplifies the code since we don't have to hard code the error
messages we want to ignore.
Closes: https://github.com/containers/podman/issues/18452
[NO NEW TESTS NEEDED] it fixes a flake.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The attach API used to always return the Content-Type
`vnd.docker.raw-stream`, however docker api v1.42 added the
`vnd.docker.multiplexed-stream` type when no tty was used.
Follow suit and return the same header for docker api v1.42 and libpod
v4.7.0. This technically allows clients to make a small optimization as
they no longer need to inspect the container to see if they get a raw or
multiplexed stream.
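
The selection logic can be sketched as follows. The function
attachContentType and its multiplexedSupported parameter (true when the
client speaks docker api v1.42 or libpod v4.7.0 and later) are
hypothetical; the full media type strings use the `application/` prefix
from the Docker API:

```go
package main

import "fmt"

const (
	rawStream         = "application/vnd.docker.raw-stream"
	multiplexedStream = "application/vnd.docker.multiplexed-stream"
)

// attachContentType picks the Content-Type for an attach response.
// Without a tty, stdout and stderr are multiplexed, so newer clients are
// told so explicitly; older clients keep getting the raw-stream type.
func attachContentType(tty, multiplexedSupported bool) string {
	if !tty && multiplexedSupported {
		return multiplexedStream
	}
	return rawStream
}

func main() {
	fmt.Println(attachContentType(false, true))
	fmt.Println(attachContentType(true, true))
}
```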
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When conmon is started it blocks and waits for us to signal it to start
via pipe. This works but when conmon exits before it waits for the start
message it causes podman to fail with `write child: broken pipe`. This
error is meaningless to podman users.
The real error is that conmon failed, so we should not return early if
we fail to send the start message to conmon. Instead, ignore the EPIPE
error case, as it is safe to assume that conmon died. For other errors,
make sure to kill conmon so that the following wait() call does not hang
forever. This also fixes problems with having conmon zombie processes
leaked as wait() was never called.
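
A minimal sketch of ignoring EPIPE when sending the start message. The
names signalStart and brokenPipe are hypothetical, and the single byte
written is a placeholder rather than the real message format:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"syscall"
)

// signalStart writes a placeholder start message to w. EPIPE is ignored
// because it only means the reader has already exited; the caller learns
// the real failure from the subsequent wait. Any other error is returned
// so the caller can kill the child before waiting on it.
func signalStart(w io.Writer) error {
	_, err := w.Write([]byte{0})
	if err == nil || errors.Is(err, syscall.EPIPE) {
		return nil
	}
	return err
}

// brokenPipe returns the write end of a pipe whose read end is already
// closed, simulating a child that died before reading.
func brokenPipe() (*os.File, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return nil, err
	}
	r.Close()
	return w, nil
}

func main() {
	w, err := brokenPipe()
	if err != nil {
		fmt.Println("pipe:", err)
		return
	}
	defer w.Close()
	fmt.Println("error after ignoring EPIPE:", signalStart(w))
}
```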
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We need to actually check the output not just exit codes. While doing
this it was clear that the first test was not checking what it should
be so I had to remove the quotes from the arg.
Also, this check did not work with remote testing at all: we must set
the env and then restart the server, as the env for conmon must be set
on the server.
Also we can only match the conmon error messages on the local client.
Lastly this test requires the journald driver but we cannot use the in
container tests so skip it there.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Stop the timer when the function has completed; otherwise the unstopped timer causes a memory leak.
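
The usual pattern is a deferred Stop right after creating the timer. This
is a generic sketch; the function waitOrTimeout is hypothetical and not
the function touched by this change:

```go
package main

import (
	"fmt"
	"time"
)

// waitOrTimeout reports whether done was closed before the timeout
// elapsed. The deferred Stop releases the timer on every return path,
// including the one where done wins the race.
func waitOrTimeout(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	done := make(chan struct{})
	close(done)
	fmt.Println(waitOrTimeout(done, time.Second))
}
```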
[NO NEW TESTS NEEDED]
Signed-off-by: hang.jiang <hang.jiang@daocloud.io>
Compat api for containers/stop should take -1 value
Add support for `podman stop --time -1`
Add support for `podman restart --time -1`
Add support for `podman rm --time -1`
Add support for `podman pod stop --time -1`
Add support for `podman pod rm --time -1`
Add support for `podman volume rm --time -1`
Add support for `podman network rm --time -1`
Fixes: https://github.com/containers/podman/issues/17542
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Most of the code moved there, so use it from there and remove it here.
Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.
[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Conmon very early dups the std streams with /dev/null, therefore all
errors it reports go nowhere. When you run podman with debug level we
set --syslog and we can see the error in the journal. This should be
the default. We have a lot of weird failures in CI that could be caused
by conmon and we have access to the journal in the cirrus tasks so that
should make debugging much easier.
Conmon still uses the same logging level as podman, so it will not spam
the journal and only logs warnings and errors by default.
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
podman info prints the network information about binary path,
package version, program version and DNS information.
Fixes: #18443
Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
There are certain messages logged by OCI runtimes when killing a
container that has already stopped that we really do not care
about when stopping a container. Due to our architecture, there
are inherent races around stopping containers, and so we cannot
guarantee that *we* are the people to kill it - but that doesn't
matter because Podman only cares that the container has stopped,
not who delivered the fatal signal.
Unfortunately, the OCI runtimes don't understand this, and log
various warning messages when the `kill` command is invoked on a
container that was already dead. These cause our tests to fail,
as we now check for clean STDERR when running Podman. To work
around this, capture STDERR for the OCI runtime in a buffer only
for stopping containers, and go through and discard any of the
warnings we identified as spurious.
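
The filtering step might look roughly like this. The function
filterStderr and the sample warning text are made up for illustration;
the real list of spurious messages is not reproduced here:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// filterStderr drops every line of stderr that contains one of the
// spurious substrings and returns the remaining lines, each terminated
// by a newline.
func filterStderr(stderr string, spurious []string) string {
	var kept strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(stderr))
	for scanner.Scan() {
		line := scanner.Text()
		skip := false
		for _, s := range spurious {
			if strings.Contains(line, s) {
				skip = true
				break
			}
		}
		if !skip {
			kept.WriteString(line)
			kept.WriteByte('\n')
		}
	}
	return kept.String()
}

func main() {
	// The warning text below is a made-up placeholder.
	out := filterStderr("starting\nwarning: container not running\ndone\n",
		[]string{"container not running"})
	fmt.Print(out)
}
```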
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
The documentation says
> The new Buffer takes ownership of buf, and the
> caller should not use buf after this call.
so use the more directly applicable, and simpler, bytes.Reader instead, to avoid this potentially risky use.
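
As a small illustration (the countLines helper is hypothetical),
bytes.NewReader only reads from the slice, so the caller may keep using
it afterwards:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// countLines counts the lines in data. bytes.NewReader provides a
// read-only view of the slice, so data remains safe to use after the
// call, unlike a slice handed over to bytes.NewBuffer.
func countLines(data []byte) int {
	scanner := bufio.NewScanner(bytes.NewReader(data))
	n := 0
	for scanner.Scan() {
		n++
	}
	return n
}

func main() {
	data := []byte("first\nsecond\n")
	fmt.Println(countLines(data), len(data))
}
```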
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Kill is a fast syscall, so we can reduce the sleep time from 100ms to
10ms in the hope of speeding things up a bit.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Commit 067442b570 improved stopping/killing a container by detecting
whether the cleanup process has already fired and changed the state of
the container. Further improve on that by returning early instead of
trying to wait for the PID to finish. At that point we know that the
container has exited but the previous PID may have been recycled
already by the kernel.
[NO NEW TESTS NEEDED] - the absence of the two flaking tests recorded
in #17142 will tell.
Fixes: #17142
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add a comment when SIGKILL is being used. It may help future readers
better comprehend what's going on and why.
[NO NEW TESTS NEEDED] - cannot test a comment :^)
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Move the stopSignal decl into the branch where it's actually used.
[NO NEW TESTS NEEDED] as it's just a small refactor.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The code can be simplified by using a timer directly.
[NO NEW TESTS NEEDED] - should not change behavior.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The container lock is released before stopping/killing which implies
certain race conditions with, for instance, the cleanup process changing
the container state to stopped, exited or other states.
The (remaining) flakes seen in #16142 and #15367 strongly indicate a
race between stopping/killing a container and the cleanup
process. To fix the flake, make sure to ignore invalid-state errors.
An alternative fix would be to change `KillContainer` to not return such
errors at all but commit c77691f06f indicates an explicit desire to
have these errors being reported in the sig proxy.
[NO NEW TESTS NEEDED] as it's a race already covered by the system
tests.
Fixes: #16142
Fixes: #15367
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
check that the container has a valid pid before attempting to use
kill($PID, 0) on it. If the PID==0, it means the container is already
stopped.
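
A sketch of the guard, assuming a Unix system. The function processAlive
is hypothetical. Note that kill(2) with a PID of 0 targets the caller's
own process group, which is why the PID has to be validated first:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// processAlive reports whether a process with the given PID exists.
// A PID of 0 (or less) is treated as "already stopped" and never reaches
// kill, since those values address process groups rather than a process.
func processAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	// Signal 0 performs the existence and permission checks only.
	err := syscall.Kill(pid, 0)
	// EPERM means the process exists but we may not signal it.
	return err == nil || errors.Is(err, syscall.EPERM)
}

func main() {
	fmt.Println(processAlive(0), processAlive(os.Getpid()))
}
```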
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When we read logs there can be full or partial lines. When a line is
full, we need to append a newline, so the message length must be
incremented by one.
Fixes: #16856
Signed-off-by: Paul Holzinger <pholzing@redhat.com>