Adds any required "wiring" to ensure the reserved annotations are supported by
`podman kube play`.
Additionally fixes a bug where, when inspected, containers created using
the `--publish-all` flag had a field `.HostConfig.PublishAllPorts` whose
value was only evaluated as `false`.
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Adds a `--podman-only` flag to `podman generate kube` to allow for
reserved annotations to be included in the generated YAML file.
Associated with: #19102
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
This fixes the "podman cp file from host to container mount" system test
on FreeBSD where binding host paths into containers uses the nullfs
mount type.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
The change to use the custom dns server in aardvark-dns caused a
regression here because macvlan networks never returned the nameservers
in netavark and it also does not make sense to do so.
Instead, check here whether we got any network nameservers; if not, use
the ones from the config if set, otherwise fall back to the host servers.
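
The fallback order described above can be sketched roughly as follows.
This is only an illustration, not the actual podman code: the helper
name `pickNameservers` and the sample addresses are assumptions made
for the example.

```go
package main

import "fmt"

// pickNameservers is a hypothetical helper, not podman's actual code.
// It returns the first non-empty list in priority order: servers
// reported by the network, then servers from the config, then the
// host's servers.
func pickNameservers(network, config, host []string) []string {
	if len(network) > 0 {
		return network
	}
	if len(config) > 0 {
		return config
	}
	return host
}

func main() {
	// A macvlan network reports no nameservers, so the config value is used.
	fmt.Println(pickNameservers(nil, []string{"10.0.0.53"}, []string{"192.168.1.1"}))
	// Nothing from the network or the config: fall back to the host servers.
	fmt.Println(pickNameservers(nil, nil, []string{"192.168.1.1"}))
}
```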
Fixes #19169
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
We use the name as alias but using the hostname also makes sense and
this is what docker does. We have to keep the short id as well for
docker compat.
While adding some tests I removed some duplicated tests that were
executed twice for nv for no reason.
Fixes #17370
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Since we have sqlite there is no point in duplicating this across two db
backends. Just set it earlier when we validate the networks anyway.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Adds a `--no-trunc` flag to `podman kube generate` preventing the
annotations from being trimmed at 63 characters. Because the annotations
are not trimmed, any annotation longer than 63 characters makes the
generated YAML incompatible with Kubernetes. These YAML files can still
be used with `podman kube play` thanks to the new flag described below.
Adds a `--no-trunc` flag to `podman kube play` supporting YAML files with
annotations that were not truncated to the Kubernetes maximum length of
63 characters.
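
A minimal sketch of the truncation behavior, assuming a byte-based cut
at 63 characters. The helper `truncateAnnotation` and its `noTrunc`
parameter are hypothetical names used for illustration and are not
taken from the podman source.

```go
package main

import (
	"fmt"
	"strings"
)

// maxAnnotationLen is the 63-character limit mentioned above.
const maxAnnotationLen = 63

// truncateAnnotation is a hypothetical helper. It trims value to the
// limit unless noTrunc is set. The cut is byte-based, so this sketch
// assumes ASCII annotation values.
func truncateAnnotation(value string, noTrunc bool) string {
	if noTrunc || len(value) <= maxAnnotationLen {
		return value
	}
	return value[:maxAnnotationLen]
}

func main() {
	long := strings.Repeat("a", 80)
	// Default behavior: the value is trimmed to the limit.
	fmt.Println(len(truncateAnnotation(long, false)))
	// With noTrunc set, the value is kept as-is.
	fmt.Println(len(truncateAnnotation(long, true)))
}
```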
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
This file has not been present in BSD systems since 2.9.1 BSD and as far
as I remember /proc/mounts has never existed on BSD systems.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
We never close the stream, so we do not need the Close() function in
the backend. The caller should close it when required, which may not be
the case, e.g. when os.Stdout/err is used.
This should not be a breaking change as io.Writer is a subset of
io.WriteCloser, therefore all code should still compile while allowing
callers to pass in Writers without Close().
This is useful for podman top where we exec ps in the container via
podman exec.
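
To illustrate why accepting io.Writer is the less restrictive choice,
here is a small sketch. The function `writeOutput` and the `collector`
type are made up for this example and do not correspond to the real
backend code.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// writeOutput is a hypothetical stand-in for a backend function that
// previously required an io.WriteCloser. Taking an io.Writer means it
// never closes the stream; the caller decides if and when to close.
func writeOutput(w io.Writer, msg string) error {
	_, err := io.WriteString(w, msg)
	return err
}

// collector is a writer without a Close method. It could not be
// passed to a function that demands an io.WriteCloser.
type collector struct{ data []byte }

func (c *collector) Write(p []byte) (int, error) {
	c.data = append(c.data, p...)
	return len(p), nil
}

func main() {
	// os.Stdout also has Close, but the callee must not close it.
	if err := writeOutput(os.Stdout, "written to stdout\n"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	c := &collector{}
	if err := writeOutput(c, "captured"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Printf("collector holds %d bytes\n", len(c.data))
}
```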
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
This ended up more complicated than expected. Let's start with the
problem to show why I am doing this:
Currently we simply execute ps(1) in the container. This has some
drawbacks. First, obviously you need to have ps(1) in the container
image. That is not always the case, especially in small images. Second,
even if you do, it will often be only busybox's ps which supports far
fewer options.
Now we also have psgo which is used by default but that only supports a
small subset of ps(1) options. Implementing all options there is way too
much work.
Docker on the other hand executes ps(1) directly on the host and tries
to filter pids with `-q`, an option which is not supported by busybox's
ps and conflicts with other ps(1) arguments. That means they fall back
to full ps(1) on the host and then filter based on the pid in the
output. This is kinda ugly and falls short because users can modify the
ps output and it may not even include the pid in the output which causes
an error.
So every solution has a different drawback, but what if we can combine
them somehow?! This commit tries exactly that.
We use ps(1) from the host and execute that in the container's pid
namespace.
There are some security concerns that must be addressed:
- mount the executable paths for ps and podman itself readonly to
prevent the container from overwriting it via /proc/self/exe.
- set NO_NEW_PRIVS, SET_DUMPABLE and PDEATHSIG
- close all non std fds to prevent leaking files that the caller had
open
- unset all environment variables to not leak any into the container
Technically this could be a breaking change if somebody does not
have ps on the host and only in the container but I find that very
unlikely, we still have the exec in container fallback.
Because this can be insecure when the container has CAP_SYS_PTRACE we
still only use the podman exec version in that case.
This updates the docs accordingly, note that podman pod top never falls
back to executing ps in the container as this makes no sense with
multiple containers so I fixed the docs there as well.
Fixes #19001
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2215572
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
An empty range caused a panic as parseOptionIDs tried to check further
down for an @ at index 0 without taking into account that the split-out
string could be empty.
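
The class of bug can be shown with a small sketch. This is not the
real parseOptionIDs; `startsWithAt` is a hypothetical helper that only
demonstrates the length check needed before indexing.

```go
package main

import (
	"fmt"
	"strings"
)

// startsWithAt is a hypothetical helper. Indexing s[0] on an empty
// string panics, so the length is checked first.
func startsWithAt(s string) bool {
	return len(s) > 0 && s[0] == '@'
}

func main() {
	// Splitting an empty range yields a single empty element, which is
	// exactly the input that needs the guard.
	for _, part := range strings.Split("", "-") {
		fmt.Printf("%q starts with @: %t\n", part, startsWithAt(part))
	}
	fmt.Printf("%q starts with @: %t\n", "@1000", startsWithAt("@1000"))
}
```

In everyday code `strings.HasPrefix(s, "@")` gives the same result
without manual indexing.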
Signed-off-by: Simon Brakhane <simon@brakhane.net>
For pods with bridged and slirp4netns networking we create /etc/hosts
entries to make it more convenient for the containers to address each
other. We omitted to do this for pasta networking, however. Add the
necessary code to do this.
Closes: https://github.com/containers/podman/issues/17922
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This adds define.BindOptions to declare the mount options for bind-like
mounts (nullfs on FreeBSD). Note: this mirrors identical declarations in
buildah and it may be preferable to use buildah's copies throughout
podman.
[NO NEW TESTS NEEDED]
Signed-off-by: Doug Rabson <dfr@rabson.org>
Handle more TOCTOUs operating on listed images. Also pull in
containers/common/pull/1520 and containers/common/pull/1522 which do the
same on the internal layer tree.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2216700
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Support two new wait conditions, "healthy" and "unhealthy". This
further paves the way for integrating sdnotify with health checks which
is currently being tracked in #6160.
Fixes: #13627
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Massage the internal APIs to use a string slice instead of a state slice
for passing wait conditions. This paves the way for waiting on
non-state conditions such as "healthy".
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Most of the code moved there, so use it from there and remove it here.
Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.
[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
My PR[1] to remove PostConfigureNetNS is blocked on other things, so I
want to split this change out. It reduces the complexity when generating
/etc/hosts and /etc/resolv.conf as now we always write these files after
we set up the network. This means we can get the actual ip from the netns
which is important.
[NO NEW TESTS NEEDED] This is just a rework.
[1] https://github.com/containers/podman/pull/18468
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
It turns out, after iterating over rows, we need to check for errors. It
also turns out that we did not do that at all.
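
As an illustration of the pattern, the sketch below checks Err() after
the iteration loop. To stay runnable without a database driver it uses
a hypothetical `rowIter` interface that mirrors the Next, Scan and Err
methods of database/sql's *sql.Rows, plus a fake implementation. None
of these names come from the podman code.

```go
package main

import (
	"errors"
	"fmt"
)

// rowIter is a hypothetical subset of the *sql.Rows method set.
type rowIter interface {
	Next() bool
	Scan(dest ...any) error
	Err() error
}

// collectNames reads one string per row. Next returning false can mean
// "no more rows" or "iteration failed", so Err must be checked after
// the loop to tell the two apart.
func collectNames(rows rowIter) ([]string, error) {
	var names []string
	for rows.Next() {
		var name string
		if err := rows.Scan(&name); err != nil {
			return nil, err
		}
		names = append(names, name)
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	return names, nil
}

// fakeRows is a test double that stops after its values and then
// reports err, simulating an iteration that was cut short.
type fakeRows struct {
	values []string
	pos    int
	err    error
}

func (f *fakeRows) Next() bool {
	if f.pos >= len(f.values) {
		return false
	}
	f.pos++
	return true
}

func (f *fakeRows) Scan(dest ...any) error {
	if len(dest) != 1 {
		return errors.New("fakeRows: expected exactly one destination")
	}
	p, ok := dest[0].(*string)
	if !ok {
		return errors.New("fakeRows: destination must be *string")
	}
	*p = f.values[f.pos-1]
	return nil
}

func (f *fakeRows) Err() error { return f.err }

func main() {
	names, err := collectNames(&fakeRows{values: []string{"a", "b"}})
	fmt.Println(names, err)

	// Without the Err check this would look like a successful short read.
	names, err = collectNames(&fakeRows{values: []string{"a"}, err: errors.New("connection lost")})
	fmt.Println(names, err)
}
```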
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Add support for `--imagestore` in podman, which allows users to split the
filesystem of containers from the image store. If configured, imagestore
pulls images into the image storage instead of the graphRoot, while
keeping the other parts in the originally configured graphRoot.
This is an implementation of
https://github.com/containers/storage/pull/1549 in podman.
Signed-off-by: Aditya R <arajan@redhat.com>
The code was moved to c/common so use that instead. Also add tests for
the new pasta_options config field. However there is one outstanding
problem[1]: pasta rejects most options when set more than once. Thus it is
impossible to overwrite most of them on the cli. If we cannot fix this
in pasta I need to make further changes in c/common to dedup the
options.
[1] https://archives.passt.top/passt-dev/895dae7d-3e61-4ef7-829a-87966ab0bb3a@redhat.com/
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Conmon very early dups the std streams with /dev/null, therefore all
errors it reports go nowhere. When you run podman with debug level we
set --syslog and we can see the error in the journal. This should be
the default. We have a lot of weird failures in CI that could be caused
by conmon and we have access to the journal in the cirrus tasks so that
should make debugging much easier.
Conmon still uses the same logging level as podman so it will not spam
the journal and only log warnings and errors by default.
[NO NEW TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Remove code duplication and use the new FilterID function from
c/common. Also remove the duplicated ComputeUntilTimestamp in podman and
use the one from c/common as well.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
When waiting for a container, there may be a time window where conmon
has already exited but the container hasn't been fully cleaned up.
In that case, we give the container at most 20 seconds to be fully
cleaned up. We cannot wait forever since conmon may have been killed or
something else went wrong.
After the timeout, we optimistically assume the container to be cleaned
up and its exit code to be present. If no exit code can be found, we
return an error.
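
The wait-then-look-up flow described above might look roughly like the
following sketch. It is not the libpod implementation; the helpers
`waitForCleanup` and `exitCodeAfterWait`, the callbacks they take, and
the polling approach are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// waitForCleanup polls cleanedUp until it returns true or the timeout
// elapses. It reports whether the timeout kicked in. Hypothetical helper.
func waitForCleanup(cleanedUp func() bool, timeout, interval time.Duration) (timedOut bool) {
	deadline := time.Now().Add(timeout)
	for {
		if cleanedUp() {
			return false
		}
		if !time.Now().Before(deadline) {
			return true
		}
		time.Sleep(interval)
	}
}

// exitCodeAfterWait waits for cleanup, then looks up the exit code
// regardless of the outcome. If no exit code exists, the error states
// whether the timeout was hit, which helps when debugging flakes.
func exitCodeAfterWait(cleanedUp func() bool, lookup func() (int, bool), timeout, interval time.Duration) (int, error) {
	timedOut := waitForCleanup(cleanedUp, timeout, interval)
	code, ok := lookup()
	if !ok {
		return -1, fmt.Errorf("no exit code found (cleanup wait timed out: %t)", timedOut)
	}
	return code, nil
}

func main() {
	never := func() bool { return false }
	missing := func() (int, bool) { return 0, false }

	// The text above uses 20 seconds; a short timeout keeps the demo quick.
	_, err := exitCodeAfterWait(never, missing, 50*time.Millisecond, 10*time.Millisecond)
	fmt.Println(err)
}
```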
Indicate in the error whether the timeout kicked in to help debug
(transient) errors and flakes (e.g., #18860).
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
podman info prints the network information about binary path,
package version, program version and DNS information.
Fixes: #18443
Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
There is a weird issue, #18856, which causes the version check to fail.
Return the underlying error in these cases so we can see it and debug
it.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
There are certain messages logged by OCI runtimes when killing a
container that has already stopped that we really do not care
about when stopping a container. Due to our architecture, there
are inherent races around stopping containers, and so we cannot
guarantee that *we* are the people to kill it - but that doesn't
matter because Podman only cares that the container has stopped,
not who delivered the fatal signal.
Unfortunately, the OCI runtimes don't understand this, and log
various warning messages when the `kill` command is invoked on a
container that was already dead. These cause our tests to fail,
as we now check for clean STDERR when running Podman. To work
around this, capture STDERR for the OCI runtime in a buffer only
for stopping containers, and go through and discard any of the
warnings we identified as spurious.
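
A rough sketch of the filtering step, assuming the runtime's STDERR
has already been captured as a string. The fragments in
`spuriousFragments` are placeholders and not real runtime messages,
and `filterStderr` is a hypothetical helper, not the libpod function.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// spuriousFragments lists substrings that mark a line as ignorable.
// These are placeholders for illustration only.
var spuriousFragments = []string{
	"example: container already stopped",
	"example: no such process",
}

// isSpurious reports whether line contains any known fragment.
func isSpurious(line string) bool {
	for _, frag := range spuriousFragments {
		if strings.Contains(line, frag) {
			return true
		}
	}
	return false
}

// filterStderr drops spurious lines from the captured output and
// returns the remaining lines joined by newlines.
func filterStderr(captured string) string {
	var kept []string
	scanner := bufio.NewScanner(strings.NewReader(captured))
	for scanner.Scan() {
		line := scanner.Text()
		if isSpurious(line) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	captured := "warning: example: no such process\nerror: something real\n"
	fmt.Println(filterStderr(captured))
}
```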
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
We use shared-memory pthread mutexes to handle mutual exclusion
in Libpod. It turns out that these have configurable options for
how to handle a recursive lock (i.e., a thread trying to lock a
lock that the same thread had previously locked). The mutex can
either deadlock, or allow the duplicate lock without deadlocking.
Default behavior is, helpfully, unspecified, so if not explicitly
set there is no clear indication of which of these behaviors will
be seen. Unfortunately, today is the first I learned of this, so
our initial implementation did *not* explicitly set our preferred
behavior.
This turns out to be a major problem with a language like Golang,
where multiple goroutines can (and often do) use the same OS
thread. So we can have two goroutines trying to stop the same
container, and if the no-deadlock mutex behavior is in use, both
threads will successfully acquire the lock because the C library,
not knowing about Go's lightweight threads, sees the same PID
trying to lock a mutex twice, and allows it without question.
It appears that, at least on Fedora/RHEL/Debian libc, the default
(unspecified) behavior of the locks is the non-deadlocking
version - so, effectively, our locks have been of questionable
utility within the same Podman process for the last four years.
This is somewhat concerning.
What's even more concerning is that the Golang-native sync.Mutex
that was also in use did nothing to prevent the duplicate locking
(I don't know if I like the implications of this).
Anyways, this resolves the major issue of our locks not working
correctly by explicitly setting the correct pthread mutex
behavior.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
If the first container to get the pod lock is the infra container
it's going to want to remove the entire pod, which will also
remove every other container in the pod. Subsequent containers
will get the pod lock and try to access the pod, only to realize
it no longer exists - and that, actually, the container being
removed also no longer exists.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This was causing some CI flakes. I'm pretty sure that the pods
being removed already isn't a bug, but just the result of another
container in the pod removing it first - so no reason not to
ignore the errors.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
This fixes a lint issue, but I'm keeping it in its own commit so
it can be reverted independently if necessary; I don't know what
side effects this may have. I don't *think* there are any
issues, but I'm not sure why it wasn't a pointer in the first
place, so there may have been a reason.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
For filter=id=XXX (containers, pods) and =ctr-ids=XXX (pods):
if XXX is only hex characters, treat it as a PREFIX
otherwise, treat it as a REGEX
Add tests. Update documentation. And fix an incorrect help message.
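
The matching rule above can be sketched as follows. `matchID` is a
hypothetical helper written for this example, and accepting uppercase
hex digits is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hexOnly matches filters made up solely of hex characters.
var hexOnly = regexp.MustCompile(`^[0-9a-fA-F]+$`)

// matchID is a hypothetical helper. A hex-only filter is treated as a
// prefix of the ID; anything else is compiled and used as a regex.
func matchID(id, filter string) (bool, error) {
	if hexOnly.MatchString(filter) {
		return strings.HasPrefix(id, filter), nil
	}
	re, err := regexp.Compile(filter)
	if err != nil {
		return false, err
	}
	return re.MatchString(id), nil
}

func main() {
	id := "abc123def"
	for _, filter := range []string{"abc", "123", "c1.*f$"} {
		ok, err := matchID(id, filter)
		fmt.Printf("filter %q matches %q: %t (err: %v)\n", filter, id, ok, err)
	}
}
```

Note that a hex-only filter such as `123` does not match in the middle
of an ID under this rule, even though it would if it were treated as a
regex.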
Fixes: #18471
Signed-off-by: Ed Santiago <santiago@redhat.com>