Commit Graph

186 Commits

Author SHA1 Message Date
Giuseppe Scrivano ae9dab9445
oci: keep LC_ env variables to conmon
it is necessary for conmon to deal with the correct locale, otherwise
it uses C as a fallback.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1893567
Requires: https://github.com/containers/conmon/pull/215

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-11 13:48:04 +01:00
unknown 2aa381f2d0 add pre checkpoint
Signed-off-by: Zhuohan Chen <chen_zhuohan@163.com>
2021-01-10 21:38:28 +08:00
Daniel J Walsh db71759b1a
Handle podman exec capabilities correctly
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2021-01-07 05:53:50 -05:00
OpenShift Merge Robot 618c35570d
Merge pull request #8878 from mheon/no_edit_config
Ensure we do not edit container config in Exec
2021-01-04 21:11:27 -05:00
Matthew Heon 960607a4cd Ensure we do not edit container config in Exec
The existing code grabs the base container's process, and then
modifies it for use with the exec session. This could cause
errors in `podman inspect` or similar on the container, as the
definition of its OCI spec has been changed by the exec session.
The change never propagates to the DB, so it's limited to a
single process, but we should still avoid it when possible - so
deep-copy it before use.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2021-01-04 14:36:41 -05:00
Giuseppe Scrivano 898f57c4c1
systemd: make rundir always accessible
so that the PIDFile can be accessed also without being in the rootless
user namespace.

Closes: https://github.com/containers/podman/issues/8506

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2021-01-04 14:19:58 +01:00
Giuseppe Scrivano 2a39a6195a
exec: honor --privileged
write the capabilities to the configuration passed to the OCI
runtime.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-24 22:11:14 +01:00
Giuseppe Scrivano 2a97639263
libpod: change function to accept ExecOptions
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-24 22:01:38 +01:00
OpenShift Merge Robot 9ac5ed1e08
Merge pull request #8806 from rhatdan/keyring
Pass down EnableKeyring from containers.conf to conmon
2020-12-23 21:41:25 +01:00
Josh Soref 4fa1fce930 Spelling
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-12-22 13:34:31 -05:00
Daniel J Walsh b0a738ce79
Pass down EnableKeyring from containers.conf to conmon
We have a new field in containers.conf that tells whether
or not we want to generate a new keyring in a container.

This field was being ignored.  It now will be followed and
passed down to conmon.

Fixes: https://github.com/containers/podman/issues/8384

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-12-22 13:08:41 -05:00
Giuseppe Scrivano 08f76bf7a5
libpod, conmon: change log level for rootless
Change the log level when running as rootless when moving conmon to a
different cgroup.

Closes: https://github.com/containers/podman/issues/8721

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-12-15 18:55:51 +01:00
Daniel J Walsh f00cc25a7c
Drop default log-level from error to warn
Our users are missing certain warning messages that would
make debugging issues with Podman easier.

For example if you do a podman build with a Containerfile
that contains the SHELL directive, the Derective is silently
ignored.

If you run with the log-level warn you get a warning message explainging
what happened.

$ podman build --no-cache -f /tmp/Containerfile1 /tmp/
STEP 1: FROM ubi8
STEP 2: SHELL ["/bin/bash", "-c"]
STEP 3: COMMIT
--> 7a207be102a
7a207be102aa8993eceb32802e6ceb9d2603ceed9dee0fee341df63e6300882e

$ podman --log-level=warn build --no-cache -f /tmp/Containerfile1 /tmp/
STEP 1: FROM ubi8
STEP 2: SHELL ["/bin/bash", "-c"]
STEP 3: COMMIT
WARN[0000] SHELL is not supported for OCI image format, [/bin/bash -c] will be ignored. Must use `docker` format
--> 7bd96fd25b9
7bd96fd25b9f755d8a045e31187e406cf889dcf3799357ec906e90767613e95f

These messages will no longer be lost, when we default to WARNing level.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-12-03 06:28:09 -05:00
Daniel J Walsh 5a032acff6
Only use container/storage/pkg/homedir.Get()
We are resolving the homedir of the user in many different
places.  This Patch consolodates them to use container/storage
version.

This PR also fixes a failure mode when the homedir does not
exists, and the user sets a root path.  In this situation
podman should continue to work. Podman does not require a users
homedir to exist in order to run.

Finally the rootlessConfigHomeDirOnce and rootlessRuntimeDirOnce
were broken, because if an error ever happened, they would not be recorded
the second time, and "" would be returned as the path.

Fixes: https://github.com/containers/podman/issues/8131

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-11-04 14:47:54 -05:00
Daniel J Walsh 831d7fb0d7
Stop excessive wrapping of errors
Most of the builtin golang functions like os.Stat and
os.Open report errors including the file system object
path. We should not wrap these errors and put the file path
in a second time, causing stuttering of errors when they
get presented to the user.

This patch tries to cleanup a bunch of these errors.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-10-30 05:34:04 -04:00
Matthew Heon 4d800a5f45 Store cgroup manager on a per-container basis
When we create a container, we assign a cgroup parent based on
the current cgroup manager in use. This parent is only usable
with the cgroup manager the container is created with, so if the
default cgroup manager is later changed or overridden, the
container will not be able to start.

To solve this, store the cgroup manager that created the
container in container configuration, so we can guarantee a
container with a systemd cgroup parent will always be started
with systemd cgroups.

Unfortunately, this is very difficult to test in CI, due to the
fact that we hard-code cgroup manager on all invocations of
Podman in CI.

Fixes #7830

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-10-08 15:25:06 -04:00
OpenShift Merge Robot 80a2317ca2
Merge pull request #7929 from kolyshkin/nits-err
Nits
2020-10-06 10:15:04 +02:00
Kir Kolyshkin 684d0079d2 Lowercase some errors
This commit is courtesy of

```
for f in $(git ls-files *.go | grep -v ^vendor/); do \
	sed -i 's/\(errors\..*\)"Error /\1"error /' $f;
done

for f in $(git ls-files *.go | grep -v ^vendor/); do \
	sed -i 's/\(errors\..*\)"Failed to /\1"failed to /' $f;
done

```

etc.

Self-reviewed using `git diff --word-diff`, found no issues.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-05 15:56:44 -07:00
Kir Kolyshkin 4878dff3e2 Remove excessive error wrapping
In case os.Open[File], os.Mkdir[All], ioutil.ReadFile and the like
fails, the error message already contains the file name and the
operation that fails, so there is no need to wrap the error with
something like "open %s failed".

While at it

 - replace a few places with os.Open, ioutil.ReadAll with
   ioutil.ReadFile.

 - replace errors.Wrapf with errors.Wrap for cases where there
   are no %-style arguments.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-05 15:30:37 -07:00
Daniel J Walsh 348f2df0c0
Support max_size logoptions
Docker supports log-opt max_size and so does conmon (ALthough poorly).
Adding support for this allows users to at least make sure their containers
logs do not become a DOS vector.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-10-05 17:51:45 -04:00
Matthew Heon 00cca405d2 HTTP Attach: Wait until both STDIN and STDOUT finish
In the old code, there was a chance that we could return when
only one of STDIN or STDOUT had finished - this could lead to us
dropping either input to the container, or output from it, in the
case that one stream terminated early.

To resolve this, use separate channels to return STDOUT and STDIN
errors, and track which ones have returned cleanly to ensure that
we need bith in order to return from the HTTP attach function and
pass control back to the HTTP handler (which would assume we
exited cleanly and close the client's attach connection).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-09-24 14:48:26 -04:00
Matthew Heon bbbc0655b9 Fix a bug where log-driver json-file was made no logs
When we added the None log driver, it was accidentally added in
the middle of a set of Fallthrough stanzas which all should have
led to k8s-file, so that JSON file logging accidentally caused
no logging to be selected instead of k8s-file.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-09-23 13:28:55 -04:00
OpenShift Merge Robot 881f2dfe92
Merge pull request #7403 from QiWang19/runtime-flag
Add global options --runtime-flags
2020-09-11 11:00:11 -04:00
Akihiro Suda f82abc774a
rootless: support `podman network create` (CNI-in-slirp4netns)
Usage:
```
$ podman network create foo
$ podman run -d --name web --hostname web --network foo nginx:alpine
$ podman run --rm --network foo alpine wget -O - http://web.dns.podman
Connecting to web.dns.podman (10.88.4.6:80)
...
<h1>Welcome to nginx!</h1>
...
```

See contrib/rootless-cni-infra for the design.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-09-09 15:47:38 +09:00
Qi Wang 6b0864434a Add global options --runtime-flags
Add global options --runtime-flags for setting options to container runtime.

Signed-off-by: Qi Wang <qiwan@redhat.com>
2020-09-04 15:04:36 -04:00
Matthew Heon 2ac37f10b4 Fix up some error messages
We have a lot of 'cannot stat %s' errors in our codebase. These
are terrible and confusing and utterly useless without context.
Add some context to a few of them so we actually know what part
of the code is failing.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-08-27 15:05:12 -04:00
Matthew Heon 2ea9dac5e1 Send HTTP Hijack headers after successful attach
Our previous flow was to perform a hijack before passing a
connection into Libpod, and then Libpod would attach to the
container's attach socket and begin forwarding traffic.

A problem emerges: we write the attach header as soon as the
attach complete. As soon as we write the header, the client
assumes that all is ready, and sends a Start request. This Start
may be processed *before* we successfully finish attaching,
causing us to lose output.

The solution is to handle hijacking inside Libpod. Unfortunately,
this requires a downright extensive refactor of the Attach and
HTTP Exec StartAndAttach code. I think the result is an
improvement in some places (a lot more errors will be handled
with a proper HTTP error code, before the hijack occurs) but
other parts, like the relocation of printing container logs, are
just *bad*. Still, we need this fixed now to get CI back into
good shape...

Fixes #7195

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-08-27 12:50:22 -04:00
Qi Wang 57967414ae fix close fds of run --preserve-fds
Test flakes mentioned in #6987 might be caused by uncorrect closing of file descriptor.
Fix the code to close file descriptors for podman run since it may close those used by other processes.

Signed-off-by: Qi Wang <qiwan@redhat.com>
2020-07-30 15:32:39 -04:00
Daniel J Walsh a5e37ad280
Switch all references to github.com/containers/libpod -> podman
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-07-28 08:23:45 -04:00
Matthew Heon 4b784b377c Remove all instances of named return "err" from Libpod
This was inspired by https://github.com/cri-o/cri-o/pull/3934 and
much of the logic for it is contained there. However, in brief,
a named return called "err" can cause lots of code confusion and
encourages using the wrong err variable in defer statements,
which can make them work incorrectly. Using a separate name which
is not used elsewhere makes it very clear what the defer should
be doing.

As part of this, remove a large number of named returns that were
not used anywhere. Most of them were once needed, but are no
longer necessary after previous refactors (but were accidentally
retained).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-07-09 13:54:47 -04:00
Joseph Gooch 0b1c1ef461 Implement --sdnotify cmdline option to control sd-notify behavior
--sdnotify container|conmon|ignore
With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI
runtime doesn't pass it into the container. We also advertise "ready" when the
OCI runtime finishes to advertise the service as ready.

With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI
runtime passes it into the container for initialization, and let the container advertise further metadata.
This is the default, which is closest to the behavior podman has done in the past.

The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor
any child processes will talk to systemd.

This removes the need for hardcoded CID and PID files in the command line, and
the PIDFile directive, as the pid is advertised directly through sd-notify.

Signed-off-by: Joseph Gooch <mrwizard@dok.org>
2020-07-06 17:47:18 +00:00
Valentin Rothberg 8489dc4345 move go module to v2
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules.  While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.

Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`.  The renaming of the imports
was done via `gomove` [1].

[1] https://github.com/KSubedi/gomove

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-07-06 15:50:12 +02:00
Giuseppe Scrivano 6ee5f740a4
podman: add new cgroup mode split
When running under systemd there is no need to create yet another
cgroup for the container.

With conmon-delegated the current cgroup will be split in two sub
cgroups:

- supervisor
- container

The supervisor cgroup will hold conmon and the podman process, while
the container cgroup is used by the OCI runtime (using the cgroupfs
backend).

Closes: https://github.com/containers/libpod/issues/6400

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-06-25 17:16:12 +02:00
Qi Wang f61a7f25a8 Add --preservefds to podman run
Add --preservefds to podman run. close https://github.com/containers/libpod/issues/6458

Signed-off-by: Qi Wang <qiwan@redhat.com>
2020-06-19 09:40:13 -04:00
Matthew Heon 0e171b7b33 Do not share container log driver for exec
When the container uses journald logging, we don't want to
automatically use the same driver for its exec sessions. If we do
we will pollute the journal (particularly in the case of
healthchecks) with large amounts of undesired logs. Instead,
force exec sessions logs to file for now; we can add a log-driver
flag later (we'll probably want to add a `podman logs` command
that reads exec session logs at the same time).

As part of this, add support for the new 'none' logs driver in
Conmon. It will be the default log driver for exec sessions, and
can be optionally selected for containers.

Great thanks to Joe Gooch (mrwizard@dok.org) for adding support
to Conmon for a null log driver, and wiring it in here.

Fixes #6555

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-06-17 11:11:46 -04:00
Matthew Heon 9d964ffb9f Ensure Conmon is alive before waiting for exit file
This came out of a conversation with Valentin about
systemd-managed Podman. He discovered that unit files did not
properly handle cases where Conmon was dead - the ExecStopPost
`podman rm --force` line was not actually removing the container,
but interestingly, adding a `podman cleanup --rm` line would
remove it. Both of these commands do the same thing (minus the
`podman cleanup --rm` command not force-removing running
containers).

Without a running Conmon instance, the container process is still
running (assuming you killed Conmon with SIGKILL and it had no
chance to kill the container it managed), but you can still kill
the container itself with `podman stop` - Conmon is not involved,
only the OCI Runtime. (`podman rm --force` and `podman stop` use
the same code to kill the container). The problem comes when we
want to get the container's exit code - we expect Conmon to make
us an exit file, which it's obviously not going to do, being
dead. The first `podman rm` would fail because of this, but
importantly, it would (after failing to retrieve the exit code
correctly) set container status to Exited, so that the second
`podman cleanup` process would succeed.

To make sure the first `podman rm --force` succeeds, we need to
catch the case where Conmon is already dead, and instead of
waiting for an exit file that will never come, immediately set
the Stopped state and remove an error that can be caught and
handled.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-06-08 13:48:29 -04:00
Matthew Heon e7f4e98c45 Add exit commands to exec sessions
These are required for detached exec, where they will be used to
clean up and remove exec sessions when they exit.

As part of this, move all Exec related functionality for the
Conmon OCI runtime into a separate file; the existing one was
around 2000 lines.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-05-20 16:11:05 -04:00
Peter Hunt 92acb3676c oci conmon: tell conmon to log container name
specifying `-n=ctr-name` tells conmon to log CONTAINER_NAME=name if the log driver is journald

add this, and a test!

also, refactor the args slice creation to not append() unnecessarily.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-05-20 10:07:54 -04:00
Matthew Heon ab25f70dad Drop a debug line which could print very large messages
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-05-15 14:35:10 -04:00
Matthew Heon 50ed292aee Remove duplicated exec handling code
During the initial workup of HTTP exec, I duplicated most of the
existing exec handling code so I could work on it without
breaking normal exec (and compare what I was doing to the nroaml
version). Now that it's done and working, we can switch over to
the refactored version and ditch the original, removing a lot of
duplicated code.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-05-14 17:32:44 -04:00
Matthew Heon 2b08359faf Fix start order for APIv2 exec start endpoint
This makes the endpoint (mostly) functional.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-05-14 16:56:02 -04:00
Matthew Heon 50cc56bc4a Add an initial implementation of HTTP-forwarded exec
This is heavily based off the existing exec implementation, but
does not presently share code with it, to try and ensure we don't
break anything.

Still to do:
- Add code sharing with existing exec implementation
- Wire in the frontend (exec HTTP endpoint)
- Move all exec-related code in oci_conmon_linux.go into a new
  file
- Investigate code sharing between HTTP attach and HTTP exec.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-05-14 16:51:57 -04:00
Jhon Honce b6113e2b9e WIP V2 attach bindings and test
* Add ErrLostSync to report lost of sync when de-mux'ing stream
* Add logus.SetLevel(logrus.DebugLevel) when `go test -v` given
* Add context to debugging messages

Signed-off-by: Jhon Honce <jhonce@redhat.com>
2020-05-13 11:49:17 -07:00
Daniel J Walsh 8173e83054
Fix errors found in coverity scan
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-05-01 13:26:50 -04:00
Brent Baude ba430bfe5e podman v2 remove bloat v2
rid ourseleves of libpod references in v2 client

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-04-16 12:04:46 -05:00
Daniel J Walsh c4ca3c71ff
Add support for selecting kvm and systemd labels
In order to better support kata containers and systemd containers
container-selinux has added new types. Podman should execute the
container with an SELinux process label to match the container type.

Traditional Container process : container_t
KVM Container Process: containre_kvm_t
PID 1 Init process: container_init_t

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-04-15 16:52:16 -04:00
Matthew Heon 71f14bd792 Improve APIv2 support for Attach
A few major fixes here:
- Support for attaching to Configured containers, to match Docker
  behavior.
- Support for stream parameter has been improved (we now properly
  handle cases where it is not set).
- Initial support for logs parameter has been added.
- Setting attach streams when the container has a terminal is now
  supported.
- Errors are properly reported once the hijack has begun.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-04-13 14:08:01 -04:00
Brent Baude e20ecc733c refactor info
the current implementation of info, while typed, is very loosely done so.  we need stronger types for our apiv2 implmentation and bindings.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-04-06 12:45:42 -05:00
OpenShift Merge Robot 3542700d6e
Merge pull request #5707 from adrianreber/crun-checkpoint-1
Prepare for crun checkpoint support
2020-04-03 19:56:03 +02:00
Adrian Reber 001fe983df
checkpoint: handle XDG_RUNTIME_DIR
For (almost) all commands which podman passes on to a OCI runtime
XDG_RUNTIME_DIR is set to the same value. This does not happen for the
checkpoint command.

Using crun to checkpoint a container without this change will lead to
crun using XDG_RUNTIME_DIR of the currently logged in user and so it
will not find the container Podman wants to checkpoint.

This bascially just copies a few lines from on of the other commands to
handle 'checkpoint' as all the other commands.

Thanks to Giuseppe for helping me with this.

For 'restore' it is not needed as restore goes through conmon and for
calling conmon Podman already configures XDG_RUNTIME_DIR correctly.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-04-03 18:00:57 +02:00
Adrian Reber 7660330ae2
checkpoint: change runtime checkpoint support test
Podman was checking if the runtime support checkpointing by running
'runtime checkpoint -h'. That works for runc.

crun, however, does not use '-h, --help' for help output but, '-?,
--help'.

This commit switches both checkpoint support detection from
 'runtime checkpoint -h'
to
 'runtime checkpoint --help'.

Podman can now correctly detect if 'crun' also support checkpointing.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-04-03 18:00:57 +02:00
Daniel J Walsh 84aa81fabe
Pass path environment down to the OCI runtime
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-04-03 11:45:55 -04:00
Giuseppe Scrivano 4c02aa46c2
attach: fix hang if control path is deleted
if the control path file is deleted, libpod hangs waiting for a reader
to open it.  Attempt to open it as non blocking until it returns an
error different than EINTR or EAGAIN.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-02 09:15:56 +02:00
Daniel J Walsh 4352d58549
Add support for containers.conf
vendor in c/common config pkg for containers.conf

Signed-off-by: Qi Wang qiwan@redhat.com
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2020-03-27 14:36:03 -04:00
Matthew Heon 1313f8a450 Ensure that exec sends resize events
We previously tried to send resize events only after the exec
session successfully started, which makes sense (we might drop an
event or two that came in before the exec session started
otherwise). However, the start function blocks, so waiting
actually means we send no resize events at all, which is
obviously worse than losing a few.. Sending resizes before attach
starts seems to work fine in my testing, so let's do that until we
get bug reports that it doesn't work.

Fixes #5584

Signed-off-by: Matthew Heon <mheon@redhat.com>
2020-03-25 15:33:52 -04:00
Matthew Heon 118e78c5d6 Add structure for new exec session tracking to DB
As part of the rework of exec sessions, we need to address them
independently of containers. In the new API, we need to be able
to fetch them by their ID, regardless of what container they are
associated with. Unfortunately, our existing exec sessions are
tied to individual containers; there's no way to tell what
container a session belongs to and retrieve it without getting
every exec session for every container.

This adds a pointer to the container an exec session is
associated with to the database. The sessions themselves are
still stored in the container.

Exec-related APIs have been restructured to work with the new
database representation. The originally monolithic API has been
split into a number of smaller calls to allow more fine-grained
control of lifecycle. Support for legacy exec sessions has been
retained, but in a deprecated fashion; we should remove this in
a few releases.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-18 11:02:14 -04:00
Giuseppe Scrivano a6f5b6a485
podman: avoid conmon zombie on exec
conmon forks itself, so make sure we reap the first process and not
leave a zombie process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-03-18 12:58:14 +01:00
Valentin Rothberg 450361fc64 update systemd & dbus dependencies
Update the outdated systemd and dbus dependencies which are now provided
as go modules.  This will further tighten our dependencies and releases
and pave the way for the upcoming auto-update feature.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-03-10 18:34:55 +01:00
Matthew Heon 521ff14d83 Revert "exec: get the exit code from sync pipe instead of file"
This reverts commit 4b72f9e401.

Continues what began with revert of
d3d97a25e8 in previous commit.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:55 -04:00
Matthew Heon ffce869daa Revert "Exec: use ErrorConmonRead"
This reverts commit d3d97a25e8.

This does not resolve the issues we expected it would, and has
some unexpected side effects with the upcoming exec rework.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-09 09:50:40 -04:00
Peter Hunt d3d97a25e8 Exec: use ErrorConmonRead
Before, we were using -1 as a bogus value in podman to signify something went wrong when reading from a conmon pipe. However, conmon uses negative values to indicate the runtime failed, and return the runtime's exit code.

instead, we should use a bogus value that is actually bogus. Define that value in the define package as MinInt32 (-1<< 31 - 1), which is outside of the range of possible pids (-1 << 31)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:43:31 -05:00
Peter Hunt 4b72f9e401 exec: get the exit code from sync pipe instead of file
Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there

Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-03-03 15:35:35 -05:00
Matthew Heon b41c864d56 Ensure that exec sessions inherit supplemental groups
This corrects a regression from Podman 1.4.x where container exec
sessions inherited supplemental groups from the container, iff
the exec session did not specify a user.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-28 11:32:56 -05:00
Giuseppe Scrivano 170fd7b038
rootless: fix a regression when using -d
when using -d and port mapping, make sure the correct fd is injected
into conmon.

Move the pipe creation earlier as the fd must be known at the time we
create the container through conmon.

Closes: https://github.com/containers/libpod/issues/5167

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-02-18 15:33:38 +01:00
OpenShift Merge Robot 9f146b1b54
Merge pull request #4861 from giuseppe/add-cgroups-disabled-conmon
oci_conmon: do not create a cgroup under systemd
2020-01-22 17:00:48 +01:00
Giuseppe Scrivano 1951ff168a
oci_conmon: do not create a cgroup under systemd
Detect whether we are running under systemd (if the INVOCATION_ID is
set).  If Podman is running under a systemd service, we do not need to
create a cgroup for conmon.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 20:29:28 +01:00
Matthew Heon ac47e80b07 Add an API for Attach over HTTP API
The new APIv2 branch provides an HTTP-based remote API to Podman.
The requirements of this are, unfortunately, incompatible with
the existing Attach API. For non-terminal attach, we need append
a header to what was copied from the container, to multiplex
STDOUT and STDERR; to do this with the old API, we'd need to copy
into an intermediate buffer first, to handle the headers.

To avoid this, provide a new API to handle all aspects of
terminal and non-terminal attach, including closing the hijacked
HTTP connection. This might be a bit too specific, but for now,
it seems to be the simplest approach.

At the same time, add a Resize endpoint. This needs to be a
separate endpoint, so our existing channel approach does not work
here.

I wanted to rework the rest of attach at the same time (some
parts of it, particularly how we start the Attach session and how
we do resizing, are (in my opinion) handled much better here.
That may still be on the table, but I wanted to avoid breaking
existing APIs in this already massive change.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-16 13:49:21 -05:00
Giuseppe Scrivano ba0a6f34e3
podman: add new option --cgroups=no-conmon
it allows to disable cgroups creation only for the conmon process.

A new cgroup is created for the container payload.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-16 18:56:51 +01:00
Giuseppe Scrivano 68185048cf
oci_conmon: not make accessible dirs if not needed
do not change the permissions mask for the rundir and the tmpdir when
running a container with a user namespace and the current user is
mapped inside the user namespace.

The change was introduced with
849548ffb8, that dropped the
intermediate mount namespace in favor of allowing root into the user
namespace to access these directories.

Closes: https://github.com/containers/libpod/issues/4846

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-14 14:45:14 +01:00
Valentin Rothberg 67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
OpenShift Merge Robot 0e9c208d3f
Merge pull request #4805 from giuseppe/log-tag
log: support --log-opt tag=
2020-01-10 23:55:45 +01:00
Giuseppe Scrivano 71341a1948
log: support --log-opt tag=
support a custom tag to add to each log for the container.

It is currently supported only by the journald backend.

Closes: https://github.com/containers/libpod/issues/3653

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-01-10 10:35:19 +01:00
OpenShift Merge Robot 154b5ca85d
Merge pull request #4818 from haircommander/piped-exec-fix
exec: fix pipes
2020-01-09 14:01:42 +01:00
Peter Hunt 776eb64ab2 exec: fix pipes
In a largely anticlimatic solution to the saga of piped input from conmon, we come to this solution.

When we pass the Stdin stream to the exec.Command structure, it's immediately consumed and lost, instead of being consumed through CopyDetachable().

When we don't pass -i in, conmon is not told to create a masterfd_stdin, and won't pass anything to the container.

With both, we can do

echo hi | podman exec -til cat

and get the expected hi

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2020-01-08 11:09:08 -05:00
Valentin Rothberg 8ae672632b fix lint: correct func identifier in comment
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-08 13:52:52 +01:00
Akihiro Suda da7595a69f rootless: use RootlessKit port forwarder
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:

* Very high throughput.
  Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
  (https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)

* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
  No UDP issue (#4586)

* No tcp_rmem issue (#4537)

* Probably works with IPv6. Even if not, it is trivial to support IPv6.  (#4311)

* Easily extensible for future support of SCTP

* Easily extensible for future support of `lxc-user-nic` SUID network

RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.

As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.

Fix #4586
May-fix #4559
Fix #4537
May-fix #4311

See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-08 19:35:17 +09:00
Matthew Heon bd44fd5c81 Reap exec sessions on cleanup and removal
We currently rely on exec sessions being removed from the state
by the Exec() API itself, on detecting the session stopping. This
is not a reliable method, though. The Podman frontend for exec
could be killed before the session ended, or another Podman
process could be holding the lock and prevent update (most
notable in `run --rm`, when a container with an active exec
session is stopped).

To resolve this, add a function to reap active exec sessions from
the state, and use it on cleanup (to clear sessions after the
container stops) and remove (to do the same when --rm is passed).
This is a bit more complicated than it ought to be because Kata
and company exist, and we can't guarantee the exec session has a
PID on the host, so we have to plumb this through to the OCI
runtime.

Fixes #4666

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-12 16:35:37 -05:00
OpenShift Merge Robot 3e2d9f8662
Merge pull request #4352 from vrothberg/config-package
refactor libpod config into libpod/config
2019-10-31 19:21:46 +01:00
Valentin Rothberg 11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
OpenShift Merge Robot 381fa4df87
Merge pull request #4380 from giuseppe/rootless-create-cgroup-for-conmon
libpod, rootless: create cgroup for conmon
2019-10-30 21:42:47 +01:00
Giuseppe Scrivano 78e2a31943
libpod, rootless: create cgroup for conmon
always create a new cgroup for conmon also when running as rootless.
We were previously creating one only when necessary, but that behaves
differently than root containers.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-30 17:04:05 +01:00
Daniel J Walsh 0b9e07f7f2
Processes execed into container should match container label
Processes execed into a container were not being run with the correct label.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-10-29 16:05:42 -04:00
Peter Hunt 06850ea2c0 exec: remove unused var
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-10-21 17:04:27 -04:00
Matthew Heon cab7bfbb21 Add a MissingRuntime implementation
When a container is created with a given OCI runtime, but then it
is uninstalled or removed from the configuration file, Libpod
presently reacts very poorly. The EvictContainer code can
potentially remove these containers, but we still can't see them
in `podman ps` (aside from the massive logrus.Errorf messages
they create).

Providing a minimal OCI runtime implementation for missing
runtimes allows us to behave better. We'll be able to retrieve
containers from the database, though we still pop up an error for
each missing runtime. For containers which are stopped, we can
remove them as normal.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-15 15:59:20 -04:00
Valentin Rothberg 2d2646883f change error wording when conmon fails without logs
In some cases, conmon can fail without writing logs.  Change the wording
of the error message from

	"error reading container (probably exited) json message"
to
	"container create failed (no logs from conmon)"

to have a more helpful error message that is more consistent with other
errors at that stage of execution.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-14 13:46:10 +02:00
Matthew Heon 6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00