Commit Graph

3541 Commits

Author SHA1 Message Date
Doug Rabson 9e6b37ec1d libpod: Add support for 'podman top' on FreeBSD
This simply runs ps(1) on the host and filters for processes inside the
container.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-17 16:43:53 +01:00
Doug Rabson 21081355a7 libpod: Factor out jail name construction from stats_freebsd.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-17 15:33:18 +01:00
OpenShift Merge Robot 80a6017e2e
Merge pull request #16182 from dfr/freebsd-pidfd
libpod: Factor out the call to PidFdOpen from (*Container).WaitForExit
2022-10-17 09:55:43 -04:00
Valentin Rothberg 44bac51fca bump golangci-lint to v1.49.0
Motivated to have a working `make lint` on Fedora 37 (beta).
Most changes come from the new `gofmt` standards.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-10-17 09:19:41 +02:00
Doug Rabson 51c376c8a1 libpod: Factor out the call to PidFdOpen from (*Container).WaitForExit
This allows us to add a simple stub for FreeBSD which returns -1,
leading WaitForExit to fall back to the sleep loop approach.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-14 13:24:32 +01:00
Valentin Rothberg b35fab6f1c kill: wait for the container
Make sure to wait for the container to exit after kill. While the
cleanup process will take care eventually of transitioning the state, we
need to give a guarantee to the user to leave the container in the
expected state once the (kill) command has finished.

The issue could be observed in a flaking test (#16142) where
`podman rm -f -t0` failed because the preceding `podman kill`
left the container in "running" state which ultimately confused
the "stop" backend.

Note that we should only wait for the container to exit when SIGKILL is
being used.  Other signals have different semantics.

[NO NEW TESTS NEEDED] as I do not know how to reliably reproduce the
issue.  If #16142 stops flaking, we are good.

Fixes: #16142
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-10-14 13:21:52 +02:00
Valentin Rothberg 1d18dc2671 KillContainer: improve error message
To improve the error message reported in #16142 where the container is
reported to be in the wrong state but we do not know which.  This is not
a fix for #16142 but will hopefully aid in better understanding what's
going on if it flakes again.

[NO NEW TESTS NEEDED] as hitting the condition is inherently racy.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-10-13 12:22:34 +02:00
Alexander Larsson 55191ecc20 Add and use Container.LinuxResource() helper
This gets c.config.Spec.Linux.Resources, with some nil checks.
Using this means less open coding of the nil-checks, but also the
existing user of this field in moveConmonToCgroupAndSignal() was
using ctr.Spec().Linux.Resources instead, and the Spec() call
is very expensive.

[NO NEW TESTS NEEDED] Just minor performance effects

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-10-12 13:45:49 +02:00
OpenShift Merge Robot e5ace19aa2
Merge pull request #16117 from alexlarsson/container-terminal-helper
Add and use libpod/Container.Terminal() helper
2022-10-11 16:18:02 -04:00
Alexander Larsson 1038f063e0 Add and use libpod/Container.Terminal() helper
This just gets ctr.config.Spec.Process.Terminal with some null checks,
allowing several places that open-coded this to use the helper.

In particular, this helps the code in
pkg/domain/infra/abi/terminal.StartAttachCtr(), that used to do:
`ctr.Spec().Process.Terminal`, which looks fine, but actually causes
a deep json copy in the `ctr.Spec()` call that takes over 3 msec.

[NO NEW TESTS NEEDED] Just minor performance effects

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-10-11 17:17:11 +02:00
OpenShift Merge Robot 619366de93
Merge pull request #16088 from alexlarsson/wait-on-conmon-without-sleep
Avoid unnecessary timeout of 250msec when waiting on container shutdown
2022-10-11 10:17:11 -04:00
Valentin Rothberg 4dd67272ed healthcheck: fix --on-failure=stop
Fix the "stop" on-failure action by not removing the transient systemd
timer and service during container stop.  Removing the service will
in turn cause systemd to terminate the Podman process attempting to
stop the container and hence leave it in the "stopping" state.

Instead move the removal into the restart sequence.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-10-11 13:01:07 +02:00
OpenShift Merge Robot bb0b1849d7
Merge pull request #16083 from dfr/freebsd-pod
Add support for 'podman pod' on FreeBSD
2022-10-10 15:30:01 +02:00
Alexander Larsson c34b5be990 Avoid unnecessary timeout of 250msec when waiting on container shutdown
When you run "podman run foo" we attach to the container, which essentially
blocks until the container process exits. When that happens podman immediately
calls Container.WaitForExit(), but at this point the exit value has not
yet been written to the db by conmon. This means that we almost always
hit the "check for exit state; sleep 250msec" loop in WaitForExit(),
delaying the exit of podman run by 250 msec.

More recent kernels (>= 5.3) supports the pidfd_open() syscall, that
lets you open a fd representing a pid and then poll on it to wait
until the process exits. We can use this to have the first sleep
be exactly as long as is needed for conmon to exit (if we know its pid).
If for whatever reason there is still issues we use the old sleep loop
on later iterations.

This makes "time podman run fedora true" about 200msec faster.

[NO NEW TESTS NEEDED]

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-10-10 11:43:02 +02:00
OpenShift Merge Robot 5908996738
Merge pull request #16084 from vrothberg/health-check-fix
health checks: make on-failure action retry aware
2022-10-07 19:27:42 +02:00
Valentin Rothberg 02040089a6 health checks: make on-failure action retry aware
Make sure that the on-failure actions only kick in once the health check
has passed its retries.  Also fix race conditions on reading/writing the
log.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-10-07 15:43:48 +02:00
Alexander Larsson 5b71070e42 libpod: Remove 100msec delay during shutdown
When shutting down the image engine we always wait for the image
even goroutine to finish writing any outstanding events. However,
the loop for that always waits 100msec every iteration. This means
that (depending on the phase) shutdown is always delayed up to 100msec.

This is delaying "podman run" extra much because podman is run twice
(once for the run and once as cleanup via a conmon callback).

Changing the image loop to exit immediately when a libimageEventsShutdown
(but first checking for any outstanding events to write) improves podman
run times by about 100msec on average.

Note: We can't just block on the event loop reading the shutdown event
anymore, we need to wait until it read and processed any outstanding
events, so we now send the shutdown event and then block waiting for the
channel to be closed by the event loop.

[NO NEW TESTS NEEDED]

Signed-off-by: Alexander Larsson <alexl@redhat.com>
2022-10-07 10:13:26 +02:00
Doug Rabson b4b7011392 libpod: Add support for 'podman pod' on FreeBSD
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-07 08:27:27 +01:00
Doug Rabson 7f8964a78f libpod: Factor out cgroup validation from (*Runtime).NewPod
This moves the code to runtime_pod_linux.go since cgroups are
platform-specific.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-07 08:27:27 +01:00
Doug Rabson d71160539d libpod: Move runtime_pod_linux.go to runtime_pod_common.go
Most of the code can be shared with other unix-like platforms.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-07 08:27:08 +01:00
Doug Rabson e187b9711d libpod: Factor out cgroups handling from (*Pod).refresh
This moves the cgroup code to pod_internal_linux.go and adds a no-op
stub for FreeBSD.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-10-07 08:26:10 +01:00
Paul Holzinger eb7f54ef6f
cleanup: always delete netns mount
We should not keep the netns if there was a cleanup problem. Deleting
the netns will also delete the virtual links inside and thus make the IPs
available again for the next use.

context: https://github.com/containers/netavark/issues/302

[NO NEW TESTS NEEDED] This is very hard to trigger reliable and it would
need to work with cni and netavark. This mostly happens because of
specic bugs but those will be fixed and then this test would fail.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-09-30 20:38:20 +02:00
Valentin Rothberg 02b0f9fc39 container inspect: include image digest
Include the digest of the image in `podman container inspect`. The image
digest is a key information for auditing as it defines the identify of
an image.  This way, it can be determined whether a container used an
image with a given CVE etc.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-09-30 14:10:21 +02:00
Doug Rabson 359e397443 libpod: Add volume support for FreeBSD
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-27 16:31:40 +01:00
Doug Rabson abe8dad344 libpod: Factor out usage of unix.MNT_DETACH from (*Volume).unmount
There is an existing wrapper for unix.Unmount(..., MNT_DETACH) in
util_linux.go but that filters all errors and for volumes, we only want
to filter EINVAL. The existing libpod.Unmount seems to only have one
call site so perhaps these can be merged.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-27 16:31:40 +01:00
Doug Rabson 9de2a5ff79 libpod: Move volume_internal_linux.go to volume_internal_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-27 16:31:40 +01:00
Doug Rabson 065474755b libpod: Move runtime_volume_linux.go to runtime_volume_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-27 16:31:40 +01:00
Valentin Rothberg 7bc36602f6 auto-update: validate container image
Auto updates using the "registry" policy require container to be created
with a fully-qualified image reference.  Short names are not supported
due the ambiguity of their source registry.  Initially, container
creation errored out for non FQN images but it seems that Podman has
regressed.

Fixes: #15879
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-09-26 17:07:05 +02:00
OpenShift Merge Robot f658bbd230
Merge pull request #15463 from mheon/fix_15408
Events for containers in pods now include the pod's ID
2022-09-23 20:39:56 +02:00
Matthew Heon 6ee582315b Events for containers in pods now include the pod's ID
This allows tools like Cockpit to know that the pod in question
has also been updated, so they can refresh the list of containers
in the pod.

Fixes #15408

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-22 14:18:56 -04:00
Matthew Heon fc6dcd12b3 Add support for 'image' volume driver
We added the concept of image volumes in 2.2.0, to support
inspecting an image from within a container. However, this is a
strictly read-only mount, with no modification allowed.

By contrast, the new `image` volume driver creates a c/storage
container as its underlying storage, so we have a read/write
layer. This, in and of itself, is not especially interesting, but
what it will enable in the future is. If we add a new command to
allow these image volumes to be committed, we can now distribute
volumes - and changes to them - via a standard OCI image registry
(which is rather new and quite exciting).

Future work in this area:
- Add support for `podman volume push` (commit volume changes and
  push resulting image to OCI registry).
- Add support for `podman volume pull` (currently, we require
  that the image a volume is created from be already pulled; it
  would be simpler if we had a dedicated command that did the
  pull and made a volume from it)
- Add support for scratch images (make an empty image on demand
  to use as the base of the volume)
- Add UOR support to `podman volume push` and
  `podman volume pull` to enable both with non-image volume
  drivers

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-22 13:07:40 -04:00
Dan Čermák 5a2405ae1b
Don't mount /dev/tty* inside privileged containers running systemd
According to https://systemd.io/CONTAINER_INTERFACE/, systemd will try take
control over /dev/ttyN if exported, which can cause conflicts with the host's tty
in privileged containers. Thus we will not expose these to privileged containers
in systemd mode, as this is a bad idea according to systemd's maintainers.

Additionally, this commit adds a bats regression test to check that no /dev/ttyN
are present in a privileged container in systemd mode

This fixes https://github.com/containers/podman/issues/15878

Signed-off-by: Dan Čermák <dcermak@suse.com>
2022-09-22 16:44:26 +02:00
OpenShift Merge Robot 12655484e3
Merge pull request #15871 from cevich/replace_ioutil
Replace deprecated ioutil
2022-09-21 16:12:25 +02:00
Chris Evich d968f3fe09
Replace deprecated ioutil
Package `io/ioutil` was deprecated in golang 1.16, preventing podman from
building under Fedora 37.  Fortunately, functionality identical
replacements are provided by the packages `io` and `os`.  Replace all
usage of all `io/ioutil` symbols with appropriate substitutions
according to the golang docs.

Signed-off-by: Chris Evich <cevich@redhat.com>
2022-09-20 15:34:27 -04:00
Doug Rabson bb160be12b libpod: Implement 'podman cp' for FreeBSD
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-20 08:36:23 +01:00
Doug Rabson 88f82ceab2 libpod: Move jointMountAndExec to container_copy_linux.go
This also moves the logic for resolving paths in running and stopped
containers tp container_copy_linux.go.

On FreeBSD, we can execute the function argument to joinMountAndExec
directly using host-relative paths since the host mount namespace
includes all the container mounts.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-20 08:36:23 +01:00
Doug Rabson 75d6e7bae5 libpod: Move part of (*Container).stat to container_stat_linux.go
The logic that treats running containers differently from stopped
containers is not needed on FreeBSD where the container mounts live in
a global mount namespace.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-20 08:36:23 +01:00
Doug Rabson 0b02d4cee6 libpod: Move container_copy_linux.go to container_copy_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-20 08:36:23 +01:00
Doug Rabson 24d1290710 libpod: Move container_stat_linux.go to container_stat_common.go
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-20 08:36:23 +01:00
Daniel J Walsh dc8fdb46c5
label.Relabel third option is shared not recurse
There is no option in Selinux labeling to only relabel the top level of
a directory. The option is to either label the path shared or not
shared. Changing to make sure future engineers do not assume that
recurse can work.

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2022-09-17 07:14:31 -04:00
OpenShift Merge Robot df73f606ef
Merge pull request #15757 from mheon/fix_15526
Introduce graph-based pod container removal
2022-09-15 21:01:23 +02:00
Doug Rabson 45d09e65b8 libpod: Make getContainerInspectData work on FreeBSD
This makes setting EffectiveCaps and BoundingCaps conditional on whether
the capabilites field in the spec is non-nil.  This allows 'podman inspect'
to work on FreeBSD.

[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-15 08:10:35 +01:00
Doug Rabson 2e4db139ac libpod: Factor out platform-specfic code from generateInspectContainerHostConfig
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-15 08:10:34 +01:00
OpenShift Merge Robot 4fc18d07a6
Merge pull request #15799 from mheon/fix_2126697
Ensure that a broken OCI spec does not break inspect
2022-09-14 21:55:13 +02:00
Matthew Heon e19e0de5fa Introduce graph-based pod container removal
Originally, during pod removal, we locked every container in the
pod at once, did a number of validity checks to ensure everything
was safe, and then removed all the containers in the pod.

A deadlock was recently discovered with this approach. In brief,
we cannot lock the entire pod (or much more than a single
container at a time) without causing a deadlock. As such, we
converted to an approach where we just looped over each container
in the pod, removing them individually. Unfortunately, this
removed a lot of the validity checking of the earlier approach,
allowing for a lot of unintended bad things. Infra containers
could be removed while containers in the pod still depended on
them, for example.

There's no easy way to do validity checks while in a simple loop,
so I implemented a version of our graph-traversal logic that
currently handles pod start. This version acts in the reverse
order of startup: startup starts from containers which depend on
nothing and moves outwards, while removal acts on containers which
have nothing depend on them and moves inwards. By doing graph
traversal, we can guarantee that nothing is removed while
something that depends on it still exists - so the infra
container should be the last thing in a pod that is removed, for
example.

In the (unlikely) case that a graph of the pod's containers
cannot be built (most likely impossible without database editing)
the old method of pod removal has been retained to ensure that
even misbehaving pods can be forcibly evicted from the state.

I'm fairly confident that this resolves the problem, but there
are a lot of assumptions around dependency structure built into
the original pod removal code and I am not 100% sure I have
captured all of them.

Fixes #15526

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-09-14 13:44:48 -04:00
Giuseppe Scrivano 14e5d1c15d
libpod: fix lookup for subpath in volumes
a subdirectory that is below a mount destination is detected as a
subpath.

Closes: https://github.com/containers/podman/issues/15789

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-09-14 17:09:04 +02:00
Giuseppe Scrivano 92dc61d5ed
libpod: rename function
the function checks if a path is under any mount, not just bind
mounts.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-09-14 17:08:57 +02:00
Matthew Heon 42937cd9a8 Ensure that a broken OCI spec does not break inspect
The process of saving the OCI spec is not particularly
reboot-safe. Normally, this doesn't matter, because we recreate
the spec every time a container starts, but if one was to reboot
(or SIGKILL, or otherwise fatally interrupt) Podman in the middle
of writing the spec to disk, we can end up with a malformed spec
that sticks around until the container is next started. Some
Podman commands want to read the latest version of the spec off
disk (to get information only populated after a container is
started), and will break in the case that a partially populated
spec is present. Swap to just ignoring these errors (with a
logged warning, to let folks know something went wrong) so we
don't break important commands like `podman inspect` in these
cases.

[NO NEW TESTS NEEDED] Provided reproducer involves repeatedly
rebooting the system

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-09-14 11:02:35 -04:00
OpenShift Merge Robot 017d81ddd0
Merge pull request #15788 from kolyshkin/non-crypto-id
all: stop using deprecated GenerateNonCryptoID
2022-09-14 16:17:37 +02:00
Doug Rabson b47ce9a6e0 libpod: Add support for 'podman stats' on FreeBSD
[NO NEW TESTS NEEDED]

Signed-off-by: Doug Rabson <dfr@rabson.org>
2022-09-14 08:29:26 +01:00