Commit Graph

254 Commits

Author SHA1 Message Date
Paul Holzinger cc6faddfaa
use c/common code for resize and CopyDetachable
Since conmon-rs also uses this code we moved it to c/common. Now podman
should has this also to prevent duplication.

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-07-06 16:57:07 +02:00
Sascha Grunert 251d91699d
libpod: switch to golang native error wrapping
We now use the golang error wrapping format specifier `%w` instead of
the deprecated github.com/pkg/errors package.

[NO NEW TESTS NEEDED]

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-07-05 16:06:32 +02:00
Valentin Rothberg b9aa475555 Sync: handle exit file
Make sure `Sync()` handles state transitions and exit codes correctly.
The function was only being called when batching which could render
containers in an unusable state when running concurrently with other
state-altering functions/commands since the state must be re-read from
the database before acting upon it.

Fixes: #14761
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-07-05 12:32:02 +02:00
openshift-ci[bot] 2cc3f127f4
Merge pull request #14720 from sstosh/rm-option
Fix: Prevent OCI runtime directory remain
2022-06-29 19:51:53 +00:00
Toshiki Sonoda 3619f0be95 Fix: Prevent OCI runtime directory remain
This bug was introduced in https://github.com/containers/podman/pull/8906.

When we use 'podman rm/restart/stop/kill etc...' command to
the container running with --rm, the OCI runtime directory
remains at /run/<runtime name> (root user) or
/run/user/<user id>/<runtime name> (rootless user).

This bug could cause other bugs.
For example, when we checkpoint the container running with
--rm (podman checkpoint --export) and restore it
(podman restore --import) with crun, error message
"Error: OCI runtime error: crun: container `<container id>`
already exists" is outputted.
This error is caused by an attempt to restore the container with
the same container ID as the remaining OCI runtime's container ID.

Therefore, I fix that the cleanupRuntime() function runs to
remove the OCI runtime directory,
even if the container has already been removed by --rm option.

Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
2022-06-24 09:29:24 +09:00
Valentin Rothberg 30e7cbccc1 libpod: fix wait and exit-code logic
This commit addresses three intertwined bugs to fix an issue when using
Gitlab runner on Podman.  The three bug fixes are not split into
separate commits as tests won't pass otherwise; avoidable noise when
bisecting future issues.

1) Podman conflated states: even when asking to wait for the `exited`
   state, Podman returned as soon as a container transitioned to
   `stopped`.  The issues surfaced in Gitlab tests to fail [1] as
   `conmon`'s buffers have not (yet) been emptied when attaching to a
   container right after a wait.  The race window was extremely narrow,
   and I only managed to reproduce with the Gitlab runner [1] unit
   tests.

2) The clearer separation between `exited` and `stopped` revealed a race
   condition predating the changes.  If a container is configured for
   autoremoval (e.g., via `run --rm`), the "run" process competes with
   the "cleanup" process running in the background.  The window of the
   race condition was sufficiently large that the "cleanup" process has
   already removed the container and storage before the "run" process
   could read the exit code and hence waited indefinitely.

   Address the exit-code race condition by recording exit codes in the
   main libpod database.  Exit codes can now be read from a database.
   When waiting for a container to exit, Podman first waits for the
   container to transition to `exited` and will then query the database
   for its exit code. Outdated exit codes are pruned during cleanup
   (i.e., non-performance critical) and when refreshing the database
   after a reboot.  An exit code is considered outdated when it is older
   than 5 minutes.

   While the race condition predates this change, the waiting process
   has apparently always been fast enough in catching the exit code due
   to issue 1): `exited` and `stopped` were conflated.  The waiting
   process hence caught the exit code after the container transitioned
   to `stopped` but before it `exited` and got removed.

3) With 1) and 2), Podman is now waiting for a container to properly
   transition to the `exited` state.  Some tests did not pass after 1)
   and 2) which revealed the third bug: `conmon` was executed with its
   working directory pointing to the OCI runtime bundle of the
   container.  The changed working directory broke resolving relative
   paths in the "cleanup" process.  The "cleanup" process error'ed
   before actually cleaning up the container and waiting "main" process
   ran indefinitely - or until hitting a timeout.  Fix the issue by
   executing `conmon` with the same working directory as Podman.

Note that fixing 3) *may* address a number of issues we have seen in the
past where for *some* reason cleanup processes did not fire.

[1] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119#note_970712864

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>

[MH: Minor reword of commit message]

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-06-23 09:11:57 -04:00
OpenShift Merge Robot 95888735e3
Merge pull request #14384 from mheon/move_attach
Move Attach under the OCI Runtime interface
2022-06-02 14:20:25 -04:00
Jhon Honce 8efdbf5c4c Add API support for NoOverwriteDirNonDir
Update method signatures and structs to pass option to buildah code

```release-note
NONE
```

[NO NEW TESTS NEEDED]

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2022-05-26 16:31:15 -07:00
Matthew Heon ea1a8e2432 Move Attach under the OCI Runtime interface
With conmon-rs on the horizon, we need to disentangle Libpod from
legacy Conmon to the greatest extent possible. There are
definitely opportunities for codesharing between the two, but we
have to assume the implementations will be largely disjoint given
the different architectures.

Fortunately, most of the work has already been done in the past.
The conmon-managed OCI runtime mostly sits behind an interface,
with a few exceptions - the most notable of those being attach.
This PR thus moves Attach behind the interface, to ensure that we
can have attach implementations that don't use our existing unix
socket streaming if necessary.

Still to-do is conmon cleanup. There's a lot of code that removes
Conmon-specific files, or kills the Conmon PID, and all of it
will need to be refactored behind the interface.

[NO NEW TESTS NEEDED] Just moving some things around.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2022-05-26 14:57:08 -04:00
Matthew Heon 9fcfea7643 First batch of resolutions to FIXMEs
Most of these are no longer relevant, just drop the comments.

Most notable change: allow `podman kill` on paused containers.
Works just fine when I test it.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2022-05-25 13:28:04 -04:00
Paul Holzinger 51fbf3da9e
enable gocritic linter
The linter ensures a common code style.
- use switch/case instead of else if
- use if instead of switch/case for single case statement
- add space between comment and text
- detect the use of defer with os.Exit()
- use short form var += "..." instead of var = var + "..."
- detect problems with append()
```
newSlice := append(orgSlice, val)
```
  This could lead to nasty bugs because the orgSlice will be changed in
  place if it has enough capacity too hold the new elements. Thus we
  newSlice might not be a copy.

Of course most of the changes are just cosmetic and do not cause any
logic errors but I think it is a good idea to enforce a common style.
This should help maintainability.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-26 18:12:22 +02:00
Paul Holzinger c7b16645af
enable unparam linter
The unparam linter is useful to detect unused function parameters and
return values.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2022-04-25 13:23:20 +02:00
Radostin Stoyanov 756ecd5400
Add support for checkpoint image
This is an enhancement proposal for the checkpoint / restore feature of
Podman that enables container migration across multiple systems with
standard image distribution infrastructure.

A new option `--create-image <image>` has been added to the
`podman container checkpoint` command. This option tells Podman to
create a container image.  This is a standard image with a single layer,
tar archive, that that contains all checkpoint files. This is similar to
the current approach with checkpoint `--export`/`--import`.

This image can be pushed to a container registry and pulled on a
different system.  It can also be exported locally with `podman image
save` and inspected with `podman inspect`. Inspecting the image would
display additional information about the host and the versions of
Podman, criu, crun/runc, kernel, etc.

`podman container restore` has also been extended to support image
name or ID as input.

Suggested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-20 18:55:39 +01:00
Valentin Rothberg 06dd9136a2 fix a number of errcheck issues
Numerous issues remain, especially in tests/e2e.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2022-03-22 13:15:28 +01:00
Valentin Rothberg bd09b7aa79 bump go module to version 4
Automated for .go files via gomove [1]:
`gomove github.com/containers/podman/v3 github.com/containers/podman/v4`

Remaining files via vgrep [2]:
`vgrep github.com/containers/podman/v3`

[1] https://github.com/KSubedi/gomove
[2] https://github.com/vrothberg/vgrep

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2022-01-18 12:47:07 +01:00
Radostin Stoyanov 6d23ea60d2
Add --file-locks checkpoint/restore option
CRIU supports checkpoint/restore of file locks. This feature is
required to checkpoint/restore containers running applications
such as MySQL.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-11-18 19:23:25 +00:00
Adrian Reber 80e56fa12b
Added optional container restore statistics
This adds the parameter '--print-stats' to 'podman container restore'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU requires to restore a checkpoint and print out these
information. CRIU already creates process restore statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the restored container, Podman will now print
out JSON:

 # podman container restore --latest --print-stats
 {
     "podman_restore_duration": 305871,
     "container_statistics": [
         {
             "Id": "47b02e1d474b5d5fe917825e91ac653efa757c91e5a81a368d771a78f6b5ed20",
             "runtime_restore_duration": 140614,
             "criu_statistics": {
                 "forking_time": 5,
                 "restore_time": 67672,
                 "pages_restored": 14
             }
         }
     ]
 }

The output contains 'podman_restore_duration' which contains the
number of microseconds Podman required to restore the checkpoint. The
output also includes 'runtime_restore_duration' which is the time
the runtime needed to restore that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-11-15 11:50:25 +00:00
Adrian Reber 6202e8102b
Added optional container checkpointing statistics
This adds the parameter '--print-stats' to 'podman container checkpoint'.
With '--print-stats' Podman will measure how long Podman itself, the OCI
runtime and CRIU requires to create a checkpoint and print out these
information. CRIU already creates checkpointing statistics which are
just read in addition to the added measurements. In contrast to just
printing out the ID of the checkpointed container, Podman will now print
out JSON:

 # podman container checkpoint --latest --print-stats
 {
     "podman_checkpoint_duration": 360749,
     "container_statistics": [
         {
             "Id": "25244244bf2efbef30fb6857ddea8cb2e5489f07eb6659e20dda117f0c466808",
             "runtime_checkpoint_duration": 177222,
             "criu_statistics": {
                 "freezing_time": 100657,
                 "frozen_time": 60700,
                 "memdump_time": 8162,
                 "memwrite_time": 4224,
                 "pages_scanned": 20561,
                 "pages_written": 2129
             }
         }
     ]
 }

The output contains 'podman_checkpoint_duration' which contains the
number of microseconds Podman required to create the checkpoint. The
output also includes 'runtime_checkpoint_duration' which is the time
the runtime needed to checkpoint that specific container. Each container
also includes 'criu_statistics' which displays the timing information
collected by CRIU.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-11-15 11:50:24 +00:00
Matthew Heon 8bd9f58d1d Ensure `podman ps --sync` functions
The backend for `ps --sync` has been nonfunctional for a long
while now - probably since v2.0. It's questionable how useful the
flag is in modern Podman (the original case it was intended to
catch, Conmon gone via SIGKILL, should be handled now via pinging
the process with a signal to ensure it's still alive) but having
the ability to force a refresh of container state from the OCI
runtime is still useful.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-10-06 11:19:32 -04:00
Matthew Heon e1089e89d7 Allow `podman stop` to be run on Stopping containers
This allows you to stop a container after a `podman stop` process
started, but did not finish, stopping the container (probably an
ignored stop signal, with no time to SIGKILL?). This is a very
narrow case, but once you're in it the only way to recover is a
`podman rm -f` of the container or extensive manual remediation
(you'd have to kill the container yourself, manually, and then
force a `podman ps --all --sync` to update its status from the
OCI runtime).

[NO NEW TESTS NEEDED] I have no idea how to verify this one -
we need to test that it actually started *during* the other stop
command, and that's nontrivial.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-10-06 11:19:32 -04:00
Giuseppe Scrivano 3ce98a5ec2
logging: new mode -l passthrough
it allows to pass the current std streams down to the container.

conmon support: https://github.com/containers/conmon/pull/289

[NO TESTS NEEDED] it needs a new conmon.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-09-27 12:07:01 +02:00
Daniel J Walsh 1c4e6d8624
standardize logrus messages to upper case
Remove ERROR: Error stutter from logrus messages also.

[ NO TESTS NEEDED] This is just code cleanup.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-09-22 15:29:34 -04:00
OpenShift Merge Robot 6c5966cf3c
Merge pull request #10910 from adrianreber/2021-07-12-checkpoint-restore-into-pod
Add support for checkpoint/restore into and out of pods
2021-07-28 14:48:28 +02:00
Adrian Reber eb94467780
Support checkpoint/restore with pods
This adds support to checkpoint containers out of pods and restore
container into pods.

It is only possible to restore a container into a pod if it has been
checkpointed out of pod. It is also not possible to restore a non pod
container into a pod.

The main reason this does not work is the PID namespace. If a non pod
container is being restored in a pod with a shared PID namespace, at
least one process in the restored container uses PID 1 which is already
in use by the infrastructure container. If someone tries to restore
container from a pod with a shared PID namespace without a shared PID
namespace it will also fail because the resulting PID namespace will not
have a PID 1.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-07-27 16:10:44 +02:00
Mehul Arora 6fe03b25ab support container to container copy
Implement container to container copy.  Previously data could only be
copied from/to the host.

Fixes: #7370
Co-authored-by: Mehul Arora <aroram18@mcmaster.ca>
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-07-27 15:32:23 +02:00
Matej Vasek 86c6014145 Implement --archive flag for podman cp
Signed-off-by: Matej Vasek <mvasek@redhat.com>
2021-07-01 12:01:46 +02:00
Adrian Reber 8aa5340ade
Add parameter to specify checkpoint archive compression
The checkpoint archive compression was hardcoded to `archive.Gzip`.

There have been requests to make the used compression algorithm
selectable. There was especially the request to not compress the
checkpoint archive to be able to create faster checkpoints when not
compressing it.

This also changes the default from `gzip` to `zstd`. This change should
not break anything as the restore code path automatically handles
whatever compression the user provides during restore.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-06-07 08:07:15 +02:00
Valentin Rothberg d0d084dd8c turn hidden --trace into a NOP
The --trace has helped in early stages analyze Podman code.  However,
it's contributing to dependency and binary bloat.  The standard go
tooling can also help in profiling, so let's turn `--trace` into a NOP.

[NO TESTS NEEDED]

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-03-08 09:22:42 +01:00
Valentin Rothberg a090301bbb podman cp: support copying on tmpfs mounts
Traditionally, the path resolution for containers has been resolved on
the *host*; relative to the container's mount point or relative to
specified bind mounts or volumes.

While this works nicely for non-running containers, it poses a problem
for running ones.  In that case, certain kinds of mounts (e.g., tmpfs)
will not resolve correctly.  A tmpfs is held in memory and hence cannot
be resolved relatively to the container's mount point.  A copy operation
will succeed but the data will not show up inside the container.

To support these kinds of mounts, we need to join the *running*
container's mount namespace (and PID namespace) when copying.

Note that this change implies moving the copy and stat logic into
`libpod` since we need to keep the container locked to avoid race
conditions.  The immediate benefit is that all logic is now inside
`libpod`; the code isn't scattered anymore.

Further note that Docker does not support copying to tmpfs mounts.

Tests have been extended to cover *both* path resolutions for running
and created containers.  New tests have been added to exercise the
tmpfs-mount case.

For the record: Some tests could be improved by using `start -a` instead
of a start-exec sequence.  Unfortunately, `start -a` is flaky in the CI
which forced me to use the more expensive start-exec option.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-03-04 15:43:12 +01:00
baude 24d9bda7ff prune remotecommand dependency
prune a dependency that was only being used for a simple struct.  Should
correct checksum issue on tarballs

[NO TESTS NEEDED]

Fixes: #9355

Signed-off-by: baude <bbaude@redhat.com>
2021-02-25 10:02:41 -06:00
Valentin Rothberg 5dded6fae7 bump go module to v3
We missed bumping the go module, so let's do it now :)

* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-02-22 09:03:51 +01:00
Matej Vasek 05444cb2cc Fix per review request
Signed-off-by: Matej Vasek <mvasek@redhat.com>
2021-02-04 18:30:07 +01:00
Matej Vasek 570e1587dd Improve container libpod.Wait*() functions
Signed-off-by: Matej Vasek <mvasek@redhat.com>
2021-02-03 21:49:09 +01:00
Valentin Rothberg 7b186dcb9e libpod: add (*Container).ResolvePath()
Add an API to libpod to resolve a path on the container.  We can
refactor the code that was originally written for copy.  Other
functions are requiring a proper path resolution, so libpod seems
like a reasonable home for sharing that code.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-01-26 09:01:33 +01:00
OpenShift Merge Robot a1b49749af
Merge pull request #8906 from vrothberg/fix-8501
container stop: release lock before calling the runtime
2021-01-14 13:37:16 -05:00
Valentin Rothberg d54478d8ea container stop: release lock before calling the runtime
Podman defers stopping the container to the runtime, which can take some
time.  Keeping the lock while waiting for the runtime to complete the
stop procedure, prevents other commands from acquiring the lock as shown
in #8501.

To improve the user experience, release the lock before invoking the
runtime, and re-acquire the lock when the runtime is finished.  Also
introduce an intermediate "stopping" to properly distinguish from
"stopped" containers etc.

Fixes: #8501
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2021-01-14 17:45:30 +01:00
unknown 2aa381f2d0 add pre checkpoint
Signed-off-by: Zhuohan Chen <chen_zhuohan@163.com>
2021-01-10 21:38:28 +08:00
Radostin Stoyanov 288ccc4c84 Include named volumes in container migration
When migrating a container with associated volumes, the content of
these volumes should be made available on the destination machine.

This patch enables container checkpoint/restore with named volumes
by including the content of volumes in checkpoint file. On restore,
volumes associated with container are created and their content is
restored.

The --ignore-volumes option is introduced to disable this feature.

Example:

 # podman container checkpoint --export checkpoint.tar.gz <container>

The content of all volumes associated with the container are included
in `checkpoint.tar.gz`

 # podman container checkpoint --export checkpoint.tar.gz --ignore-volumes <container>

The content of volumes is not included in `checkpoint.tar.gz`. This is
useful, for example, when the checkpoint/restore is performed on the
same machine.

 # podman container restore --import checkpoint.tar.gz

The associated volumes will be created and their content will be
restored. Podman will exit with an error if volumes with the same
name already exist on the system or the content of volumes is not
included in checkpoint.tar.gz

 # podman container restore --ignore-volumes --import checkpoint.tar.gz

Volumes associated with container must already exist. Podman will not
create them or restore their content.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-01-07 07:51:19 +00:00
Josh Soref 4fa1fce930 Spelling
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-12-22 13:34:31 -05:00
Matthew Heon b0286d6b43 Implement pod-network-reload
This adds a new command, 'podman network reload', to reload the
networks of existing containers, forcing recreation of firewall
rules after e.g. `firewall-cmd --reload` wipes them out.

Under the hood, this works by calling CNI to tear down the
existing network, then recreate it using identical settings. We
request that CNI preserve the old IP and MAC address in most
cases (where the container only had 1 IP/MAC), but there will be
some downtime inherent to the teardown/bring-up approach. The
architecture of CNI doesn't really make doing this without
downtime easy (or maybe even possible...).

At present, this only works for root Podman, and only locally.
I don't think there is much of a point to adding remote support
(this is very much a local debugging command), but I think adding
rootless support (to kill/recreate slirp4netns) could be
valuable.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
2020-12-07 19:26:23 +01:00
Daniel J Walsh dc8996ec84
Allow containers to --restart on-failure with --rm
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-11-20 13:55:19 -05:00
Matthew Heon 2cf443fd41 Ensure that attach ready channel does not block
We only use this channel in terminal attach, and it was not a
buffered channel originally, so it would block on trying to send
unless a receiver was ready. In the non-terminal case, there was
no receiver, so attach blocked forever. Buffer the channel for a
single bool so that it will never block, even if unused.

Fixes #8154

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-10-28 11:32:31 -04:00
Matthew Heon 4c155d36cb Force Attach() to send a SIGWINCH and redraw
Basically, we want to force the application in the container to
(iff the container was made with a terminal) redraw said terminal
immediately after an attach completes, so the fresh Attach
session will be able to see what's going on (e.g. will have a
shell prompt). Our current attach functions are unfortunately
geared more towards `podman run` than `podman attach` and will
start forwarding resize events *immediately* instead of waiting
until the attach session is alive (much safer for short-lived
`podman run` sessions, but broken for the `podman attach` case).
To avoid a major rewrite, let's just manually send a SIGWINCH
after attach succeeds to force a redraw.

Fixes #6253

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-09-10 17:54:47 -04:00
Matthew Heon 2ea9dac5e1 Send HTTP Hijack headers after successful attach
Our previous flow was to perform a hijack before passing a
connection into Libpod, and then Libpod would attach to the
container's attach socket and begin forwarding traffic.

A problem emerges: we write the attach header as soon as the
attach complete. As soon as we write the header, the client
assumes that all is ready, and sends a Start request. This Start
may be processed *before* we successfully finish attaching,
causing us to lose output.

The solution is to handle hijacking inside Libpod. Unfortunately,
this requires a downright extensive refactor of the Attach and
HTTP Exec StartAndAttach code. I think the result is an
improvement in some places (a lot more errors will be handled
with a proper HTTP error code, before the hijack occurs) but
other parts, like the relocation of printing container logs, are
just *bad*. Still, we need this fixed now to get CI back into
good shape...

Fixes #7195

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-08-27 12:50:22 -04:00
Daniel J Walsh a5e37ad280
Switch all references to github.com/containers/libpod -> podman
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-28 08:23:45 -04:00
Matthew Heon 4b784b377c Remove all instances of named return "err" from Libpod
This was inspired by https://github.com/cri-o/cri-o/pull/3934 and
much of the logic for it is contained there. However, in brief,
a named return called "err" can cause lots of code confusion and
encourages using the wrong err variable in defer statements,
which can make them work incorrectly. Using a separate name which
is not used elsewhere makes it very clear what the defer should
be doing.

As part of this, remove a large number of named returns that were
not used anywhere. Most of them were once needed, but are no
longer necessary after previous refactors (but were accidentally
retained).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-07-09 13:54:47 -04:00
Valentin Rothberg 09dc77aedf log API: add context to allow for cancelling
Add a `context.Context` to the log APIs to allow for cancelling
streaming (e.g., via `podman logs -f`).  This fixes issues for
the remote API where some go routines of the server will continue
writing and produce nothing but heat and waste CPU cycles.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-07-09 15:13:07 +02:00
Valentin Rothberg 8489dc4345 move go module to v2
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules.  While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.

Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`.  The renaming of the imports
was done via `gomove` [1].

[1] https://github.com/KSubedi/gomove

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-07-06 15:50:12 +02:00
Matthew Heon b20619e5b0 Allow recursive dependency start with Init()
As part of APIv2 Attach, we need to be able to attach to freshly
created containers (in ContainerStateConfigured). This isn't
something Libpod is interested in supporting, so we use Init() to
get the container into ContainerStateCreated, in which attach is
possible. Problem: Init() will fail if dependencies are not
started, so a fresh container in a fresh pod will fail. The
simplest solution is to extend the existing recursive start code
from Start() to Init(), allowing dependency containers to be
started when we initialize the container (optionally, controlled
via bool).

Also, update some comments in container_api.go to make it more
clear how some of our major API calls work.

Fixes #6646

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-06-18 09:34:04 -04:00
Daniel J Walsh 200cfa41a4
Turn on More linters
- misspell
    - prealloc
    - unparam
    - nakedret

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-06-15 07:05:56 -04:00
Jhon Honce b6113e2b9e WIP V2 attach bindings and test
* Add ErrLostSync to report lost of sync when de-mux'ing stream
* Add logus.SetLevel(logrus.DebugLevel) when `go test -v` given
* Add context to debugging messages

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2020-05-13 11:49:17 -07:00
Matthew Heon 71f14bd792 Improve APIv2 support for Attach
A few major fixes here:
- Support for attaching to Configured containers, to match Docker
  behavior.
- Support for stream parameter has been improved (we now properly
  handle cases where it is not set).
- Initial support for logs parameter has been added.
- Setting attach streams when the container has a terminal is now
  supported.
- Errors are properly reported once the hijack has begun.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-04-13 14:08:01 -04:00
Brent Baude 4d895dcb54 v2podman attach and exec
add the ability to attach to a running container.  the tunnel side of this is not enabled yet as we have work on the endpoints and plumbing to do yet.

add the ability to exec a command in a running container.  the tunnel side is also being deferred for same reason.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-04-05 15:54:51 -05:00
Matthew Heon 118e78c5d6 Add structure for new exec session tracking to DB
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.

This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.

Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-18 11:02:14 -04:00
Matthew Heon 521ff14d83 Revert "exec: get the exit code from sync pipe instead of file"
This reverts commit 4b72f9e401.

Continues what began with revert of
d3d97a25e8 in previous commit.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:55 -04:00
Matthew Heon ffce869daa Revert "Exec: use ErrorConmonRead"
This reverts commit d3d97a25e8.

This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:40 -04:00
Matthew Heon 6be87b2186 Revert "exec: fix error code when conmon fails"
This reverts commit 4632b81c81.

We are reverting #5373 as well, which lays the foundation for
this commit, so it has to go as well.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:25:56 -04:00
Daniel J Walsh ac354ac94a
Fix spelling mistakes in code found by codespell
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-07 10:30:44 -05:00
Peter Hunt 4632b81c81 exec: fix error code when conmon fails
this is a cosmetic change that makes sure podman returns a sane error code when conmon dies underneath it

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-04 17:10:14 -05:00
Peter Hunt d3d97a25e8 Exec: use ErrorConmonRead
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.

instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:43:31 -05:00
Peter Hunt 4b72f9e401 exec: get the exit code from sync pipe instead of file
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there

Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:35:35 -05:00
Daniel J Walsh b163640c61
Allow devs to set labels in container images for default capabilities.
This patch allows users to specify the list of capabilities required
to run their container image.

Setting a image/container label "io.containers.capabilities=setuid,setgid"
tells podman that the contained image should work fine with just these two
capabilties, instead of running with the default capabilities, podman will
launch the container with just these capabilties.

If the user or image specified capabilities that are not in the default set,
the container will print an error message and will continue to run with the
default capabilities.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-02 16:37:32 -05:00
OpenShift Merge Robot 47c4ea3919
Merge pull request #5347 from baude/apiv2wait
rework apiv2 wait endpoint|binding
2020-03-02 20:23:26 +01:00
Matthew Heon b41c864d56 Ensure that exec sessions inherit supplemental groups
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-28 11:32:56 -05:00
Brent Baude 0904873100 rework apiv2 wait endpoint|binding
added the ability to wait on a condition (stopped, running, paused...) for a container.  if a condition is not provided, wait will default to the stopped condition which uses the original wait code paths.  if the condition is stopped, the container exit code will be returned.

also, correct a mux issue we discovered.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-02-28 09:36:53 -06:00
Valentin Rothberg 156ce5cd7d add pkg/capabilities
Add pkg/capabibilities to deal with capabilities.  The code has been
copied from Docker (and attributed with the copyright) but changed
significantly to only do what we really need.  The code has also been
simplified and will perform better due to removed redundancy.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-02-14 12:00:45 +01:00
Matthew Heon ac47e80b07 Add an API for Attach over HTTP API
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.

To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.

At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.

I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-16 13:49:21 -05:00
Daniel J Walsh 123b8c627d
if container is not in a pid namespace, stop all processes
When a container is in a PID namespace, it is enought to send
the stop signal to the PID 1 of the namespace, only send signals
to all processes in the container when the container is not in
a pid namespace.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-12-19 13:33:17 -05:00
Matthew Heon bd44fd5c81 Reap exec sessions on cleanup and removal
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).

To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.

Fixes #4666

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-12 16:35:37 -05:00
Matthew Heon 25cc43c376 Add ContainerStateRemoving
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.

When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.

The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.

We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.

Fixes #3906

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-11-19 15:38:03 -05:00
Jakub Filak 2497b6c77b
podman: add support for specifying MAC
I basically copied and adapted the statements for setting IP.

Closes #1136

Signed-off-by: Jakub Filak <jakub.filak@sap.com>
2019-11-06 16:22:19 +01:00
Peter Hunt 1df4dba0a0 Switch to bufio Reader for exec streams
There were many situations that made exec act funky with input. pipes didn't work as expected, as well as sending input before the shell opened.
Thinking about it, it seemed as though the issues were because of how os.Stdin buffers (it doesn't). Dropping this input had some weird consequences.
Instead, read from os.Stdin as bufio.Reader, allowing the input to buffer before passing it to the container.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-10-31 11:20:12 -04:00
Matthew Heon 5f8bf3d07d Add ensureState helper for checking container state
We have a lot of checks for container state scattered throughout
libpod. Many of these need to ensure the container is in one of a
given set of states so an operation may safely proceed.
Previously there was no set way of doing this, so we'd use unique
boolean logic for each one. Introduce a helper to standardize
state checks.

Note that this is only intended to replace checks for multiple
states. A simple check for one state (ContainerStateRunning, for
example) should remain a straight equality, and not use this new
helper.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-10-28 13:09:01 -04:00
Matthew Heon cab7bfbb21 Add a MissingRuntime implementation
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).

Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-15 15:59:20 -04:00
Matthew Heon 6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00
Daniel J Walsh 82ac0d8925
Podman-remote run should wait for exit code
This change matches what is happening on the podman local side
and should eliminate a race condition.

Also exit commands on the server side should start to return to client.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-09-12 16:20:01 -04:00
Daniel J Walsh 535111b5d5
Use exit code constants
We have leaked the exit number codess all over the code, this patch
removes the numbers to constants.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-09-12 16:20:01 -04:00
Giuseppe Scrivano 8e337aff5a
libpod: avoid polling container status
use the inotify backend to be notified on the container exit instead
of polling continuosly the runtime.  Polling the runtime slowns
significantly down the podman execution time for short lived
processes:

$ time bin/podman run --rm -ti fedora true

real	0m0.324s
user	0m0.088s
sys	0m0.064s

from:

$ time podman run --rm -ti fedora true

real	0m4.199s
user	0m5.339s
sys	0m0.344s

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-09-04 19:55:54 +02:00
Peter Hunt cc3d8da968 exec: run with user specified on container start
Before, if the container was run with a specified user that wasn't root, exec would fail because it always set to root unless respecified by user.
instead, inherit the user from the container start.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-20 11:44:27 -04:00
OpenShift Merge Robot 337358ae63
Merge pull request #3690 from adrianreber/ignore-static-ip
restore: added --ignore-static-ip option
2019-08-05 16:11:50 +02:00
Adrian Reber c23b92b409
restore: added --ignore-static-ip option
If a container is restored multiple times from an exported checkpoint
with the help of '--import --name', the restore will fail if during
'podman run' a static container IP was set with '--ip'. The user can
tell the restore process to ignore the static IP with
'--ignore-static-ip'.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-02 10:10:54 +02:00
Matthew Heon 9dcd76e369 Ensure we generate a 'stopped' event on force-remove
When forcibly removing a container, we are initiating an explicit
stop of the container, which is not reflected in 'podman events'.
Swap to using our standard 'stop()' function instead of a custom
one for force-remove, and move the event into the internal stop
function (so internal calls also register it).

This does add one more database save() to `podman remove`. This
should not be a terribly serious performance hit, and does have
the desirable side effect of making things generally safer.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
Peter Hunt 479eeac62c move editing of exitCode to runtime
There's no way to get the error if we successfully get an exit code (as it's just printed to stderr instead).
instead of relying on the error to be passed to podman, and edit based on the error code, process it on the varlink side instead

Also move error codes to define package

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-07-23 13:29:33 -04:00
Peter Hunt a1a79c08b7 Implement conmon exec
This includes:
	Implement exec -i and fix some typos in description of -i docs
	pass failed runtime status to caller
	Add resize handling for a terminal connection
	Customize exec systemd-cgroup slice
	fix healthcheck
	fix top
	add --detach-keys
	Implement podman-remote exec (jhonce)
	* Cleanup some orphaned code (jhonce)
	adapt remote exec for conmon exec (pehunt)
	Fix healthcheck and exec to match docs
		Introduce two new OCIRuntime errors to more comprehensively describe situations in which the runtime can error
		Use these different errors in branching for exit code in healthcheck and exec
	Set conmon to use new api version

Signed-off-by: Jhon Honce <jhonce@redhat.com>

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-07-22 15:57:23 -04:00
baude db826d5d75 golangci-lint round #3
this is the third round of preparing to use the golangci-lint on our
code base.

Signed-off-by: baude <bbaude@redhat.com>
2019-07-21 14:22:39 -05:00
OpenShift Merge Robot deb087d7b1
Merge pull request #3443 from adrianreber/rootfs-changes-migration
Include changes to the container's root file-system in the checkpoint archive
2019-07-19 02:38:26 +02:00
Matthew Heon 5bbede9d9f Remove exec PID files after use to prevent memory leaks
We have another patch running to do the same for exit files, with
a much more in-depth explanation of why it's necessary. Suffice
to say that persistent files in tmpfs tied to container CGroups
lead to significant memory allocations that last for the lifetime
of the file.

Based on a patch by Andrea Arcangeli (aarcange@redhat.com).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-18 09:06:11 -04:00
baude a78c885397 golangci-lint pass number 2
clean up and prepare to migrate to the golangci-linter

Signed-off-by: baude <bbaude@redhat.com>
2019-07-11 09:13:06 -05:00
Adrian Reber 05549e8b29
Add --ignore-rootfs option for checkpoint/restore
The newly added functionality to include the container's root
file-system changes into the checkpoint archive can now be explicitly
disabled. Either during checkpoint or during restore.

If a container changes a lot of files during its runtime it might be
more effective to migrated the root file-system changes in some other
way and to not needlessly increase the size of the checkpoint archive.

If a checkpoint archive does not contain the root file-system changes
information it will automatically be skipped. If the root file-system
changes are part of the checkpoint archive it is also possible to tell
Podman to ignore these changes.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-11 14:43:35 +02:00
Adrian Reber 1a32074884
Fix typo in checkpoint/restore related texts
Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-11 14:43:35 +02:00
OpenShift Merge Robot 150778820f
Merge pull request #3324 from marcov/detach-keys-configurable
libpod: specify a detach keys sequence in libpod.conf
2019-07-01 15:54:27 +02:00
baude 8561b99644 libpod removal from main (phase 2)
this is phase 2 for the removal of libpod from main.

Signed-off-by: baude <bbaude@redhat.com>
2019-06-27 07:56:24 -05:00
Marco Vedovati 4f56964d55 libpod: fix hang on container start and attach
When a container is attached upon start, the WaitGroup counter may
never be decremented if an error is raised before start, causing
the caller to hang.
Synchronize with the start & attach goroutine using a channel, to be
able to detect failures before start.

Signed-off-by: Marco Vedovati <mvedovati@suse.com>
2019-06-26 10:17:29 +02:00
baude dd81a44ccf remove libpod from main
the compilation demands of having libpod in main is a burden for the
remote client compilations.  to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.

this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).

Signed-off-by: baude <bbaude@redhat.com>
2019-06-25 13:51:24 -05:00
Matthew Heon de75b1a277 Fix a segfault in 'podman ps --sync'
We weren't properly populating the container's OCI Runtime in
Batch(), causing segfaults on attempting to access it. Add a test
to make sure we actually catch cases like this in the future.

Fixes #3411

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-24 09:26:03 -04:00
Matthew Heon 92bae8d308 Begin adding support for multiple OCI runtimes
Allow Podman containers to request to use a specific OCI runtime
if multiple runtimes are configured. This is the first step to
properly supporting containers in a multi-runtime environment.

The biggest changes are that all OCI runtimes are now initialized
when Podman creates its runtime, and containers now use the
runtime requested in their configuration (instead of always the
default runtime).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-19 17:08:43 -04:00
Valentin Rothberg 04858a218f stop/kill: inproper state errors: s/in state/is in state/
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 14:31:55 +02:00
Valentin Rothberg 0f75410e1c kill: print ID and state for non-running containers
Extend kill's error message to include the container's ID and state.
This address cases where error messages caused by other containers
may confuse users.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-06-17 10:55:54 +02:00
Daniel J Walsh 3bbb692d80
If container is not in correct state podman exec should exit with 126
This way a tool can determine if the container exists or not, but is in the
wrong state.

Since 126 is documeted as:
**_126_** if the **_contained command_** cannot be invoked

It makes sense that the container would exit with this state.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-06-12 05:15:58 -04:00
OpenShift Merge Robot 39f5ea4c04
Merge pull request #3180 from mheon/inspect_volumes
Begin to break up pkg/inspect
2019-06-08 14:45:24 +02:00