Commit Graph

186 Commits

Author SHA1 Message Date
Adrian Reber 7660330ae2
checkpoint: change runtime checkpoint support test
Podman was checking if the runtime support checkpointing by running
'runtime checkpoint -h'. That works for runc.

crun, however, does not use '-h, --help' for help output but, '-?,
--help'.

This commit switches both checkpoint support detection from
 'runtime checkpoint -h'
to
 'runtime checkpoint --help'.

Podman can now correctly detect if 'crun' also support checkpointing.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-04-03 18:00:57 +02:00
Daniel J Walsh 84aa81fabe
Pass path environment down to the OCI runtime
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-04-03 11:45:55 -04:00
Giuseppe Scrivano 4c02aa46c2
attach: fix hang if control path is deleted
if the control path file is deleted, libpod hangs waiting for a reader
to open it.  Attempt to open it as non blocking until it returns an
error different than EINTR or EAGAIN.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-02 09:15:56 +02:00
Daniel J Walsh 4352d58549
Add support for containers.conf
vendor in c/common config pkg for containers.conf

Signed-off-by: Qi Wang qiwan@redhat.com
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-27 14:36:03 -04:00
Matthew Heon 1313f8a450 Ensure that exec sends resize events
We previously tried to send resize events only after the exec
session successfully started, which makes sense (we might drop an
event or two that came in before the exec session started
otherwise). However, the start function blocks, so waiting
actually means we send no resize events at all, which is
obviously worse than losing a few.. Sending resizes before attach
starts seems to work fine in my testing, so let's do that until we
get bug reports that it doesn't work.

Fixes #5584

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-03-25 15:33:52 -04:00
Matthew Heon 118e78c5d6 Add structure for new exec session tracking to DB
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.

This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.

Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-18 11:02:14 -04:00
Giuseppe Scrivano a6f5b6a485
podman: avoid conmon zombie on exec
conmon forks itself, so make sure we reap the first process and not
leave a zombie process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-03-18 12:58:14 +01:00
Valentin Rothberg 450361fc64 update systemd & dbus dependencies
Update the outdated systemd and dbus dependencies which are now provided
as go modules.  This will further tighten our dependencies and releases
and pave the way for the upcoming auto-update feature.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-03-10 18:34:55 +01:00
Matthew Heon 521ff14d83 Revert "exec: get the exit code from sync pipe instead of file"
This reverts commit 4b72f9e401.

Continues what began with revert of
d3d97a25e8 in previous commit.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:55 -04:00
Matthew Heon ffce869daa Revert "Exec: use ErrorConmonRead"
This reverts commit d3d97a25e8.

This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:40 -04:00
Peter Hunt d3d97a25e8 Exec: use ErrorConmonRead
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.

instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:43:31 -05:00
Peter Hunt 4b72f9e401 exec: get the exit code from sync pipe instead of file
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there

Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:35:35 -05:00
Matthew Heon b41c864d56 Ensure that exec sessions inherit supplemental groups
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-28 11:32:56 -05:00
Giuseppe Scrivano 170fd7b038
rootless: fix a regression when using -d
when using -d and port mapping, make sure the correct fd is injected
into conmon.

Move the pipe creation earlier as the fd must be known at the time we
create the container through conmon.

Closes: https://github.com/containers/libpod/issues/5167

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-02-18 15:33:38 +01:00
OpenShift Merge Robot 9f146b1b54
Merge pull request #4861 from giuseppe/add-cgroups-disabled-conmon
oci_conmon: do not create a cgroup under systemd
2020-01-22 17:00:48 +01:00
Giuseppe Scrivano 1951ff168a
oci_conmon: do not create a cgroup under systemd
Detect whether we are running under systemd (if the INVOCATION_ID is
set).  If Podman is running under a systemd service, we do not need to
create a cgroup for conmon.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 20:29:28 +01:00
Matthew Heon ac47e80b07 Add an API for Attach over HTTP API
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.

To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.

At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.

I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-16 13:49:21 -05:00
Giuseppe Scrivano ba0a6f34e3
podman: add new option --cgroups=no-conmon
it allows to disable cgroups creation only for the conmon process.

A new cgroup is created for the container payload.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 18:56:51 +01:00
Giuseppe Scrivano 68185048cf
oci_conmon: not make accessible dirs if not needed
do not change the permissions mask for the rundir and the tmpdir when
running a container with a user namespace and the current user is
mapped inside the user namespace.

The change was introduced with
849548ffb8, that dropped the
intermediate mount namespace in favor of allowing root into the user
namespace to access these directories.

Closes: https://github.com/containers/libpod/issues/4846

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-14 14:45:14 +01:00
Valentin Rothberg 67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
OpenShift Merge Robot 0e9c208d3f
Merge pull request #4805 from giuseppe/log-tag
log: support --log-opt tag=
2020-01-10 23:55:45 +01:00
Giuseppe Scrivano 71341a1948
log: support --log-opt tag=
support a custom tag to add to each log for the container.

It is currently supported only by the journald backend.

Closes: https://github.com/containers/libpod/issues/3653

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-10 10:35:19 +01:00
OpenShift Merge Robot 154b5ca85d
Merge pull request #4818 from haircommander/piped-exec-fix
exec: fix pipes
2020-01-09 14:01:42 +01:00
Peter Hunt 776eb64ab2 exec: fix pipes
In a largely anticlimatic solution to the saga of piped input from conmon, we come to this solution.

When we pass the Stdin stream to the exec.Command structure, it's immediately consumed and lost, instead of being consumed through CopyDetachable().

When we don't pass -i in, conmon is not told to create a masterfd_stdin, and won't pass anything to the container.

With both, we can do

echo hi | podman exec -til cat

and get the expected hi

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-01-08 11:09:08 -05:00
Valentin Rothberg 8ae672632b fix lint: correct func identifier in comment
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-08 13:52:52 +01:00
Akihiro Suda da7595a69f rootless: use RootlessKit port forwarder
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:

* Very high throughput.
  Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
  (https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)

* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
  No UDP issue (#4586)

* No tcp_rmem issue (#4537)

* Probably works with IPv6. Even if not, it is trivial to support IPv6.  (#4311)

* Easily extensible for future support of SCTP

* Easily extensible for future support of `lxc-user-nic` SUID network

RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.

As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.

Fix #4586
May-fix #4559
Fix #4537
May-fix #4311

See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-08 19:35:17 +09:00
Matthew Heon bd44fd5c81 Reap exec sessions on cleanup and removal
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).

To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.

Fixes #4666

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-12 16:35:37 -05:00
OpenShift Merge Robot 3e2d9f8662
Merge pull request #4352 from vrothberg/config-package
refactor libpod config into libpod/config
2019-10-31 19:21:46 +01:00
Valentin Rothberg 11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
OpenShift Merge Robot 381fa4df87
Merge pull request #4380 from giuseppe/rootless-create-cgroup-for-conmon
libpod, rootless: create cgroup for conmon
2019-10-30 21:42:47 +01:00
Giuseppe Scrivano 78e2a31943
libpod, rootless: create cgroup for conmon
always create a new cgroup for conmon also when running as rootless.
We were previously creating one only when necessary, but that behaves
differently than root containers.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-30 17:04:05 +01:00
Daniel J Walsh 0b9e07f7f2
Processes execed into container should match container label
Processes execed into a container were not being run with the correct label.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-10-29 16:05:42 -04:00
Peter Hunt 06850ea2c0 exec: remove unused var
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-10-21 17:04:27 -04:00
Matthew Heon cab7bfbb21 Add a MissingRuntime implementation
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).

Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-15 15:59:20 -04:00
Valentin Rothberg 2d2646883f change error wording when conmon fails without logs
In some cases, conmon can fail without writing logs.  Change the wording
of the error message from

	"error reading container (probably exited) json message"
to
	"container create failed (no logs from conmon)"

to have a more helpful error message that is more consistent with other
errors at that stage of execution.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-14 13:46:10 +02:00
Matthew Heon 6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00