Commit Graph

3884 Commits

Author SHA1 Message Date
Jake Correnti d7e25e14aa Add missing reserved annotation support to `play`
Adds any required "wiring" to ensure the reserved annotations are supported by
`podman kube play`.

Addtionally fixes a bug where, when inspected, containers created using
the `--publish-all` flag had a field `.HostConfig.PublishAllPorts` whose
value was only evaluated as `false`.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-17 14:06:23 -04:00
OpenShift Merge Robot 52af8a2293
Merge pull request #19250 from rhatdan/tmpfs
Should be checking tmpfs versus type not source
2023-07-17 10:32:54 +02:00
OpenShift Merge Robot 49a924cf39
Merge pull request #19211 from jakecorrenti/add-reserved-flag-generate
Add `--podman-only` flag to `podman generate kube`
2023-07-16 17:34:35 +02:00
Daniel J Walsh a224ff7313
Should be checking tmpfs versus type not source
Paul found this in https://github.com/containers/podman/pull/19241

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-15 07:57:37 -04:00
OpenShift Merge Robot d1ddd03a64
Merge pull request #19241 from rhatdan/bind
Use constants for mount types
2023-07-14 16:05:30 +02:00
Jake Correnti d0602e8f75 Add `--podman-only` flag to `podman generate kube`
Adds an `--podman-only` flag to `podman generate kube` to allow for
reserved annotations to be included in the generated YAML file.

Associated with: #19102

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-14 09:35:59 -04:00
OpenShift Merge Robot bb72016f58
Merge pull request #19066 from Luap99/ps
top: do not depend on ps(1) in container
2023-07-14 13:17:59 +02:00
Daniel J Walsh f256f4f954
Use constants for mount types
Inspired by https://github.com/containers/podman/pull/19238

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-07-14 07:17:21 -04:00
Doug Rabson 310a8f1038 libpod: use define.TypeBind when resolving container paths
This fixes the "podman cp file from host to container mount" system test
on FreeBSD where binding host paths into containers uses the nullfs
mount type.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-07-14 08:14:11 +01:00
Paul Holzinger 2a9b9bb53f
netavark: macvlan networks keep custom nameservers
The change to use the custom dns server in aardvark-dns caused a
regression here because macvlan networks never returned the nameservers
in netavark and it also does not make sense to do so.

Instead check here if we got any network nameservers, if not we then use
the ones from the config if set otherwise fallback to host servers.

Fixes #19169

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-12 14:07:34 +02:00
OpenShift Merge Robot 1be2ec1d4f
Merge pull request #19193 from Luap99/hostname-alias
add hostname to network alias
2023-07-11 12:19:02 -04:00
OpenShift Merge Robot b1dd0a3350
Merge pull request #19189 from pjannesen/issue/19175
Fix: cgroup is not set: internal libpod error after os reboot
2023-07-11 10:43:22 -04:00
Paul Holzinger f1c68b79eb
add hostname to network alias
We use the name as alias but using the hostname makes also sense and
this is what docker does. We have to keep the short id as well for
docker compat.

While adding some tests I removed some duplicated tests that were
executed twice for nv for no reason.

Fixes #17370

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-11 15:38:24 +02:00
Paul Holzinger b6ec2127b8
libpod: set cid network alias in setupContainer()
Since we have sqlite there is no point in duplicating this acroos two db
backends. Just set earlier when we validate the networks anyway.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-11 15:16:11 +02:00
Jake Correnti 7b54fd84ec Add `--no-trunc` flag to maintain original annotation length
Adds a `--no-trunc` flag to `podman kube generate` preventing the
annotations from being trimmed at 63 characters. However, due to
the fact the annotations will not be trimmed, any annotation that is
longer than 63 characters means this YAML will no longer be Kubernetes
compatible. However, these YAML files can still be used with `podman
kube play` due to the addition of the new flag below.

Adds a `--no-trunc` flag to `podman kube play` supporting YAML files with
annotations that were not truncated to the Kubernetes maximum length of
63 characters.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-10 18:02:53 -04:00
Peter Jannesen 4494cefbca Fix: cgroup is not set: internal libpod error after os reboot
[NO NEW TESTS NEEDED]
Closes #19175

Signed-off-by: Peter Jannesen <peter@jannesen.com>
2023-07-10 22:37:43 +02:00
Doug Rabson f8213a6d53 libpod: don't make a broken symlink for /etc/mtab on FreeBSD
This file has not been present in BSD systems since 2.9.1 BSD and as far
as I remember /proc/mounts has never existed on BSD systems.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-07-10 12:41:41 +01:00
Paul Holzinger 42ea0bf9c7
libpod: use io.Writer vs io.WriteCloser for attach streams
We never ever close the stream so we do not need the Close() function in
th ebackend, the caller should close when required which may no be the
case, i.e. when os.Stdout/err is used.
This should not be a breaking change as the io.Writer is a subset of
io.WriteCloser, therfore all code should still compile while allowing to
pass in Writers without Close().

This is useful for podman top where we exec ps in the container via
podman exec.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-10 13:33:03 +02:00
Paul Holzinger 597ebeb60f
top: do not depend on ps(1) in container
This ended up more complicated then expected. Lets start first with the
problem to show why I am doing this:

Currently we simply execute ps(1) in the container. This has some
drawbacks. First, obviously you need to have ps(1) in the container
image. That is no always the case especially in small images. Second,
even if you do it will often be only busybox's ps which supports far
less options.

Now we also have psgo which is used by default but that only supports a
small subset of ps(1) options. Implementing all options there is way to
much work.

Docker on the other hand executes ps(1) directly on the host and tries
to filter pids with `-q` an option which is not supported by busybox's
ps and conflicts with other ps(1) arguments. That means they fall back
to full ps(1) on the host and then filter based on the pid in the
output. This is kinda ugly and fails short because users can modify the
ps output and it may not even include the pid in the output which causes
an error.

So every solution has a different drawback, but what if we can combine
them somehow?! This commit tries exactly that.

We use ps(1) from the host and execute that in the container's pid
namespace.
There are some security concerns that must be addressed:
- mount the executable paths for ps and podman itself readonly to
  prevent the container from overwriting it via /proc/self/exe.
- set NO_NEW_PRIVS, SET_DUMPABLE and PDEATHSIG
- close all non std fds to prevent leaking files in that the caller had
  open
- unset all environment variables to not leak any into the contianer

Technically this could be a breaking change if somebody does not
have ps on the host and only in the container but I find that very
unlikely, we still have the exec in container fallback.

Because this can be insecure when the contianer has CAP_SYS_PTRACE we
still only use the podman exec version in that case.

This updates the docs accordingly, note that podman pod top never falls
back to executing ps in the container as this makes no sense with
multiple containers so I fixed the docs there as well.

Fixes #19001
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2215572

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-07-10 13:32:55 +02:00
Simon Brakhane dee94ea699 bugfix: do not try to parse empty ranges
An empty range caused a panic as parseOptionIDs tried to check further
down for an @ at index 0 without taking into account that the splitted
out string could be empty.

Signed-off-by: Simon Brakhane <simon@brakhane.net>
2023-07-06 11:16:34 +02:00
Peter Hunt a3a62275c8 libpod: use new libcontainer BlockIO constructors
[NO NEW TESTS NEEDED]

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-07-03 15:11:35 -04:00
David Gibson 39624473b0 pasta: Create /etc/hosts entries for pods using pasta networking
For pods with bridged and slirp4netns networking we create /etc/hosts
entries to make it more convenient for the containers to address each
other.  We omitted to do this for pasta networking, however.  Add the
necessary code to do this.

Closes: https://github.com/containers/podman/issues/17922

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-30 13:04:02 +10:00
Doug Rabson 865d77e942 pkg/specgen: add support for 'podman run --init' on FreeBSD
This adds define.BindOptions to declare the mount options for bind-like
mounts (nullfs on FreeBSD). Note: this mirrors identical declarations in
buildah and it may be preferable to use buildah's copies throughout
podman.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2023-06-28 14:43:50 +01:00
Fang-Pen Lin dd81f7ac61
Pass in correct cwd value for hooks exe
Signed-off-by: Fang-Pen Lin <hello@fangpenlin.com>
2023-06-26 23:49:08 -07:00
Valentin Rothberg db37d66cd1 make image listing more resilient
Handle more TOCTOUs operating on listed images.  Also pull in
containers/common/pull/1520 and containers/common/pull/1522 which do the
same on the internal layer tree.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2216700
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-26 16:34:26 +02:00
Valentin Rothberg 1398cbce8a container wait: support health states
Support two new wait conditions, "healthy" and "unhealthy".  This
further paves the way for integrating sdnotify with health checks which
is currently being tracked in #6160.

Fixes: #13627
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-23 14:16:32 +02:00
Valentin Rothberg 811867249b container wait API: use string slice instead of state slice
Massage the internal APIs to use a string slice instead of a state slice
for passing wait conditions.  This paves the way for waiting on
non-state conditions such as "healthy".

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-23 09:26:30 +02:00
OpenShift Merge Robot 49e0bde2bf
Merge pull request #18946 from Luap99/slirp4netns
use slirp4netns code from c/common
2023-06-22 16:15:18 +02:00
Ed Santiago a699ed0ebf StopContainer(): ignore one more conmon warning
Resolves: #18865

[NO NEW TESTS NEEDED] -- it's a flake

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-22 05:02:19 -06:00
Paul Holzinger 614c962c23
use libnetwork/slirp4netns from c/common
Most of the code moved there so if from there and remove it here.

Some extra changes are required here. This is a bit of a mess. The pipe
handling makes this a bit more difficult.

[NO NEW TESTS NEEDED] This is just a rework, existing tests must pass.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-22 11:16:13 +02:00
Paul Holzinger 810c97bd85
libpod: write /etc/{hosts,resolv.conf} once
My PR[1] to remove PostConfigureNetNS is blocked on other things I want
to split this change out. It reduces the complexity when generating
/etc/hosts and /etc/resolv.conf as now we always write this file after
we setup the network. this means we can get the actual ip from the netns
which is important.

[NO NEW TESTS NEEDED] This is just a rework.

[1] https://github.com/containers/podman/pull/18468

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-21 15:33:42 +02:00
Valentin Rothberg 2efa7c3fa1 make lint: enable rowserrcheck
It turns out, after iterating over rows, we need to check for errors. It
also turns out that we did not do that at all.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-19 14:31:40 +02:00
Valentin Rothberg f07aa1bfdc make lint: enable wastedassign
Because we shouldn't waste assigns.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-19 14:14:48 +02:00
Valentin Rothberg 60a5a59475 make lint: enable mirror
Helpful reports to avoid unnecessary allocations.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-19 14:11:12 +02:00
Aditya R 3829fbd35a
podman: add support for splitting imagestore
Add support for `--imagestore` in podman which allows users to split the filesystem of containers vs image store, imagestore if configured will pull images in image storage instead of the graphRoot while keeping the other parts still in the originally configured graphRoot.

This is an implementation of
https://github.com/containers/storage/pull/1549 in podman.

Signed-off-by: Aditya R <arajan@redhat.com>
2023-06-17 08:51:08 +05:30
Paul Holzinger 5ffbfd937d
pasta: use code from c/common
The code was moved to c/common so use that instead. Also add tests for
the new pasta_options config field. However there is one outstanding
problem[1]: pasta rejects most options when set more than once. Thus it is
impossible to overwrite most of them on the cli. If we cannot fix this
in pasta I need to make further changes in c/common to dedup the
options.

[1] https://archives.passt.top/passt-dev/895dae7d-3e61-4ef7-829a-87966ab0bb3a@redhat.com/

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-15 16:14:49 +02:00
Paul Holzinger 13c2aca219
libpod: make conmon always log to syslog
Conmon very early dups the std streams with /dev/null, therefore all
errors it reports go nowhere. When you run podman with debug level we
set --syslog and we can see the error in the journal. This should be
the default. We have a lot of weird failures in CI that could be caused
by conmon and we have access to the journal in the cirrus tasks so that
should make debugging much easier.

Conmon still uses the same logging level as podman so it will not spam
the journal and only log warning and errors by default.

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-14 13:54:57 +02:00
Paul Holzinger 8a90765b90
filters: use new FilterID function from c/common
Remove code duplication and use the new FilterID function from
c/common. Also remove the duplicated ComputeUntilTimestamp in podman use
the one from c/common as well.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-13 17:49:41 +02:00
OpenShift Merge Robot 2a947c2f4b
Merge pull request #18869 from vrothberg/debug-18860
container wait: indicate timeout in error
2023-06-13 09:38:52 -04:00
Valentin Rothberg c0ab293131 container wait: indicate timeout in error
When waiting for a container, there may be a time window where conmon
has already exited but the container hasn't been fully cleaned up.
In that case, we give the container at most 20 seconds to be fully
cleaned up.  We cannot wait forever since conmon may have been killed or
something else went wrong.

After the timeout, we optimistically assume the container to be cleaned
up and its exit code to present.  If no exit code can be found, we
return an error.

Indicate in the error whether the timeout kicked in to help debug
(transient) errors and flakes (e.g., #18860).

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-13 13:48:29 +02:00
Toshiki Sonoda 6f821634ad libpod: Podman info output more network information
podman info prints the network information about binary path,
package version, program version and DNS information.

Fixes: #18443

Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
2023-06-13 11:19:29 +09:00
OpenShift Merge Robot 3cae574ab2
Merge pull request #18507 from mheon/fix_rm_depends
Fix `podman rm -fa` with dependencies
2023-06-12 13:27:34 -04:00
Paul Holzinger ab502fc5c4
criu: return error when checking for min version
There is weird issue #18856 which causes the version check to fail.
Return the underlying error in these cases so we can see it and debug
it.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-12 15:29:21 +02:00
Matthew Heon 2ebc9004f4 Ignore spurious warnings when killing containers
There are certain messages logged by OCI runtimes when killing a
container that has already stopped that we really do not care
about when stopping a container. Due to our architecture, there
are inherent races around stopping containers, and so we cannot
guarantee that *we* are the people to kill it - but that doesn't
matter because Podman only cares that the container has stopped,
not who delivered the fatal signal.

Unfortunately, the OCI runtimes don't understand this, and log
various warning messages when the `kill` command is invoked on a
container that was already dead. These cause our tests to fail,
as we now check for clean STDERR when running Podman. To work
around this, capture STDERR for the OCI runtime in a buffer only
for stopping containers, and go through and discard any of the
warnings we identified as spurious.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-08 09:19:47 -04:00
Matthew Heon f1ecdca4b6 Ensure our mutexes handle recursive locking properly
We use shared-memory pthread mutexes to handle mutual exclusion
in Libpod. It turns out that these have configurable options for
how to handle a recursive lock (IE, a thread trying to lock a
lock that the same thread had previously locked). The mutex can
either deadlock, or allow the duplicate lock without deadlocking.
Default behavior is, helpfully, unspecified, so if not explicitly
set there is no clear indication of which of these behaviors will
be seen. Unfortunately, today is the first I learned of this, so
our initial implementation did *not* explicitly set our preferred
behavior.

This turns out to be a major problem with a language like Golang,
where multiple goroutines can (and often do) use the same OS
thread. So we can have two goroutines trying to stop the same
container, and if the no-deadlock mutex behavior is in use, both
threads will successfully acquire the lock because the C library,
not knowing about Go's lightweight threads, sees the same PID
trying to lock a mutex twice, and allows it without question.

It appears that, at least on Fedora/RHEL/Debian libc, the default
(unspecified) behavior of the locks is the non-deadlocking
version - so, effectively, our locks have been of questionable
utility within the same Podman process for the last four years.
This is somewhat concerning.

What's even more concerning is that the Golang-native sync.Mutex
that was also in use did nothing to prevent the duplicate locking
(I don't know if I like the implications of this).

Anyways, this resolves the major issue of our locks not working
correctly by explicitly setting the correct pthread mutex
behavior.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
Matthew Heon a750cd9876 Fix a race removing multiple containers in the same pod
If the first container to get the pod lock is the infra container
it's going to want to remove the entire pod, which will also
remove every other container in the pod. Subsequent containers
will get the pod lock and try to access the pod, only to realize
it no longer exists - and that, actually, the container being
removed also no longer exists.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
Matthew Heon 0e47465e4a Discard errors when a pod is already removed
This was causing some CI flakes. I'm pretty sure that the pods
being removed already isn't a bug, but just the result of another
container in the pod removing it first - so no reason not to
ignore the errors.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:12 -04:00
Matthew Heon 398e48a24a Change Inherit to use a pointer to a container
This fixes a lint issue, but I'm keeping it in its own commit so
it can be reverted independently if necessary; I don't know what
side effects this may have. I don't *think* there are any
issues, but I'm not sure why it wasn't a pointer in the first
place, so there may have been a reason.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-06-07 14:09:07 -04:00
OpenShift Merge Robot c99d42b8e4
Merge pull request #18798 from edsantiago/fix_filters
filters: better handling of id=
2023-06-07 12:31:11 -04:00
Ed Santiago 992093ae91 filters: better handling of id=
For filter=id=XXX (containers, pods) and =ctr-ids=XXX (pods):

  if XXX is only hex characters, treat it as a PREFIX
  otherwise, treat it as a REGEX

Add tests. Update documentation. And fix an incorrect help message.

Fixes: #18471

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-07 05:29:06 -06:00