Commit Graph

182 Commits

Author SHA1 Message Date
Valentin Rothberg f4e873c4e1 auto updates
Add support to auto-update containers running in systemd units as
generated with `podman generate systemd --new`.

`podman auto-update` looks up containers with a specified
"io.containers.autoupdate" label (i.e., the auto-update policy).

If the label is present and set to "image", Podman reaches out to the
corresponding registry to check if the image has been updated.  We
consider an image to be updated if the digest in the local storage is
different than the one of the remote image.  If an image must be
updated, Podman pulls it down and restarts the container.  Note that the
restarting sequence relies on systemd.

At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT"
environment variables and stores it verbatim in the container's label.
This variable is now set by all systemd units generated by
`podman-generate-systemd` and is set to `%n` (i.e., the name of systemd
unit starting the container).  This data is then being used in the
auto-update sequence to instruct systemd (via DBUS) to restart the unit
and hence to restart the container.

Note that this implementation of auto-updates relies on systemd and
requires a fully-qualified image reference to be used to create the
container.  This enforcement is necessary to know which image to
actually check and pull.  If we used an image ID, we would not know
which image to check/pull anymore.

Fixes: #3575
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-03-17 17:18:56 +01:00
Brent Baude 8b5e2a6297 add default network for apiv2 create
during container creation, if no network is provided, we need to add a default value so the container can be later started.

use apiv2 container creation for RunTopContainer instead of an exec to the system podman. RunTopContainer now also returns the container id and an error.

added a libpod commit endpoint.

also, changed the use of the connections and bindings slightly to make it more convenient to write tests.

Fixes: 5366
Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-03-06 14:31:45 -06:00
Matthew Heon e45456223c Add validate() for containers
Until now, we've been validating every part of container
configuration through the With... functions that set the options.
This if fine when we are just validating the options to an
individual function, but things get complicated once we need to
validate conflicts between different options. We don't know the
order in which things were passed, so we need the validation on
both of the potential options that can conflict, resulting in
significant code duplication. To solve this, add a validate()
function for containers, and use this to check whether everything
is in a good state.

We can probably move more into this function (there are other
parts of container creation that also do validation of a sort)
but this is a good start to simplifying our options.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-03-02 10:58:11 -05:00
Matthew Heon 4004f646cd Add basic deadlock detection for container start/remove
We can easily tell if we're going to deadlock by comparing lock
IDs before actually taking the lock. Add a few checks for this in
common places where deadlocks might occur.

This does not yet cover pod operations, where detection is more
difficult (and costly) due to the number of locks being involved
being higher than 2.

Also, add some error wrapping on the Podman side, so we can tell
people to use `system renumber` when it occurs.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-02-24 09:29:34 -05:00
Matthew Heon 5d93c731af Deprecate & remove IsCtrSpecific in favor of IsAnon
In Podman 1.6.3, we added support for anonymous volumes - fixing
our old, broken support for named volumes that were created with
containers. Unfortunately, this reused the database field we used
for the old implementation, and toggled volume removal on for
`podman run --rm` - so now, we were removing *named* volumes
created with older versions of Podman.

We can't modify these old volumes in the DB, so the next-safest
thing to do is swap to a new field to indicate volumes should be
removed. Problem: Volumes created with 1.6.3 and up until this
lands, even anonymous volumes, will not be removed. However, this
is safer than removing too many volumes, as we were doing before.

Fixes #5009

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2020-01-29 14:04:51 -05:00
Brent Baude cf7be58b2c Review corrections pass #2
Add API review comments to correct documentation and endpoints.  Also, add a libpode prune method to reduce code duplication.  Only used right now for the API but when the remote client is wired, we will switch over there too.

Signed-off-by: Brent Baude <bbaude@redhat.com>
2020-01-23 11:58:26 -06:00
Valentin Rothberg 67165b7675 make lint: enable gocritic
`gocritic` is a powerful linter that helps in preventing certain kinds
of errors as well as enforcing a coding style.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2020-01-13 14:27:02 +01:00
Neville Cain 8bc394ce6e Add the pod name when we use `podman ps -p`
The pod name does not appear when doing `podman ps -p`.
It is missing as the documentation says:
-p, --pod              Print the ID and name of the pod the containers are associated with

The pod name is added in the ps output and checked in unit tests.

Closes #4703

Signed-off-by: NevilleC <neville.cain@qonto.eu>
2019-12-28 00:03:57 +01:00
OpenShift Merge Robot e33d7e9fab
Merge pull request #4727 from rhatdan/pidns
if container is not in a pid namespace, stop all processes
2019-12-20 12:13:22 +01:00
Daniel J Walsh 123b8c627d
if container is not in a pid namespace, stop all processes
When a container is in a PID namespace, it is enought to send
the stop signal to the PID 1 of the namespace, only send signals
to all processes in the container when the container is not in
a pid namespace.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-12-19 13:33:17 -05:00
Matthew Heon 88917e4a93 Remove volumes after containers in pod remove
When trying to reproduce #4704 I noticed that the named volumes
from the Postgres containers in the reproducer weren't being
removed by `podman pod rm -f` saying that the container they were
attached to was still in use. This was rather odd, considering
they were only in use by one container, and that container was in
the process of being removed with the pod.

After a bit of tracing, I realized that the cause is the ordering
of container removal when we remove a pod. Normally, it's done
in removeContainer() before volume removal (which is the last
thing in that function). However, when we are removing a pod, we
remove containers all at once, after removeContainer has already
finished - meaning the container still exists when we try to
remove its volumes, and thus the volume can't be removed.

Solution: collect a list of all named volumes in use by the pod,
and remove them all at once after every container in the pod is
gone. This ensures that there are no dependency issues.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-12-17 21:41:31 -05:00
Matthew Heon 25cc43c376 Add ContainerStateRemoving
When Libpod removes a container, there is the possibility that
removal will not fully succeed. The most notable problems are
storage issues, where the container cannot be removed from
c/storage.

When this occurs, we were faced with a choice. We can keep the
container in the state, appearing in `podman ps` and available for
other API operations, but likely unable to do any of them as it's
been partially removed. Or we can remove it very early and clean
up after it's already gone. We have, until now, used the second
approach.

The problem that arises is intermittent problems removing
storage. We end up removing a container, failing to remove its
storage, and ending up with a container permanently stuck in
c/storage that we can't remove with the normal Podman CLI, can't
use the name of, and generally can't interact with. A notable
cause is when Podman is hit by a SIGKILL midway through removal,
which can consistently cause `podman rm` to fail to remove
storage.

We now add a new state for containers that are in the process of
being removed, ContainerStateRemoving. We set this at the
beginning of the removal process. It notifies Podman that the
container cannot be used anymore, but preserves it in the DB
until it is fully removed. This will allow Remove to be run on
these containers again, which should successfully remove storage
if it fails.

Fixes #3906

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-11-19 15:38:03 -05:00
Valentin Rothberg 11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
Matthew Heon 0d623914d0 Add support for anonymous volumes to `podman run -v`
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.

We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.

This change also affects how named volumes created by the
container but given names are treated by `podman run --rm` and
`podman rm -v`. Previously, they would be removed with the
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.

The sum total of these changes let us turn on volume removal in
`--rm` by default.

Fixes: #4276

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-17 13:18:17 -04:00
Matthew Heon b6a7d88397 When restoring containers, reset cgroup path
Previously, `podman checkport restore` with exported containers,
when told to create a new container based on the exported
checkpoint, would create a new container, with a new container
ID, but not reset CGroup path - which contained the ID of the
original container.

If this was done multiple times, the result was two containers
with the same cgroup paths. Operations on these containers would
this have a chance of crossing over to affect the other one; the
most notable was `podman rm` once it was changed to use the --all
flag when stopping the container; all processes in the cgroup,
including the ones in the other container, would be stopped.

Reset cgroups on restore to ensure that the path matches the ID
of the container actually being run.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 14:53:29 -04:00
Matthew Heon 6f630bc09b Move OCI runtime implementation behind an interface
For future work, we need multiple implementations of the OCI
runtime, not just a Conmon-wrapped runtime matching the runc CLI.

As part of this, do some refactoring on the interface for exec
(move to a struct, not a massive list of arguments). Also, add
'all' support to Kill and Stop (supported by runc and used a bit
internally for removing containers).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-10 10:19:32 -04:00
Matthew Heon bb803b8f7a When evicting containers, perform a normal remove first
This ensures that containers that didn't require an evict will be
dealt with normally, and we only break out evict for containers
that refuse to be removed by normal means.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-04 11:04:43 -04:00
Marco Vedovati dacbc5beb2 rm: add containers eviction with `rm --force`
Add ability to evict a container when it becomes unusable. This may
happen when the host setup changes after a container creation, making it
impossible for that container to be used or removed.
Evicting a container is done using the `rm --force` command.

Signed-off-by: Marco Vedovati <mvedovati@suse.com>
2019-09-25 19:44:38 +02:00
OpenShift Merge Robot 7ac6ed3b4b
Merge pull request #3581 from mheon/no_cgroups
Support running containers without CGroups
2019-09-11 00:58:46 +02:00
Matthew Heon c2284962c7 Add support for launching containers without CGroups
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-10 10:52:37 -04:00
Matthew Heon b6106341fb When first mounting any named volume, copy up
Previously, we only did this for volumes created at the same time
as the container. However, this is not correct behavior - Docker
does so for all named volumes, even those made with
'podman volume create' and mounted into a container later.

Fixes #3945

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-09 17:17:39 -04:00
Matthew Heon e563f41116 Re-add locks to volumes.
This will require a 'podman system renumber' after being applied
to get lock numbers for existing volumes.

Add the DB backend code for rewriting volume configs and use it
for updating lock numbers as part of 'system renumber'.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 11:35:00 -04:00
Adrian Reber a891b84528
Fix up ConmonPidFile after restore
After restoring a container with a different name (ID) the ConmonPidFile
was still pointing to the path of the original container.

This means that the last restored container will overwrite the
ConmonPidFile of the original container. It was also not possible to
restore a container with a new name (ID) if the original container was
not running.

The ConmonPidFile is only changed if the ConmonPidFile starts with the
value of RunRoot. This assumes that if RunRoot is part of ConmonPidFile
the user did not specify --conmon-pidfile' during run or create.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-09 19:26:56 +02:00
Adrian Reber 82b586349c
restore: correctly set StartedTime
A container restored from an exported checkpoint did not have its
StartedTime set. Which resulted in a status like 'Up 292 years ago'
after the restore.

This just sets the StartedTime to time.Now() if a container is restored
from an exported checkpoint.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-08-05 14:29:07 +02:00
Matthew Heon 9dcd76e369 Ensure we generate a 'stopped' event on force-remove
When forcibly removing a container, we are initiating an explicit
stop of the container, which is not reflected in 'podman events'.
Swap to using our standard 'stop()' function instead of a custom
one for force-remove, and move the event into the internal stop
function (so internal calls also register it).

This does add one more database save() to `podman remove`. This
should not be a terribly serious performance hit, and does have
the desirable side effect of making things generally safer.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
baude db826d5d75 golangci-lint round #3
this is the third round of preparing to use the golangci-lint on our
code base.

Signed-off-by: baude <bbaude@redhat.com>
2019-07-21 14:22:39 -05:00
Matthew Heon 8713483362 Fix a bug where ctrs could not be removed from pods
Using pod removal worked, but container removal was missing the
most critical step - the actual removal. Must have been
accidentally removed during a refactor.

Fixes #3556

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-11 10:17:33 -04:00
Giuseppe Scrivano 18c4d73867
runtime: drop spurious message log
fix a regression introduced by 1d36501f96

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-10 15:47:38 +02:00
OpenShift Merge Robot fce2e6577e
Merge pull request #3497 from QazerLab/bugfix/systemd-generate-pidfile
Use conmon pidfile in generated systemd unit as PIDFile.
2019-07-08 23:39:42 +02:00
OpenShift Merge Robot edc7f52c95
Merge pull request #3425 from adrianreber/restore-mount-label
Set correct SELinux label on restored containers
2019-07-08 20:31:59 +02:00
baude 1d36501f96 code cleanup
clean up code identified as problematic by golands inspection

Signed-off-by: baude <bbaude@redhat.com>
2019-07-08 09:18:11 -05:00
Danila Kiver 37b134054e Use default conmon pidfile location for root containers.
The conmon pidfile is crucial for podman-generated systemd units, because
these units rely on it for determining service's main process ID.

With this change, every container has ConmonPidFile set (at least to
default value).

Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-07-04 21:08:06 +03:00
Matthew Heon e92de11a69 Ensure locks are freed when ctr/pod creation fails
If we don't do this, we can leak locks on every failure, and that
is very, very bad - can render Podman unusable without a 'system
renumber' being run.

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-02 12:51:39 -04:00
baude 8561b99644 libpod removal from main (phase 2)
this is phase 2 for the removal of libpod from main.

Signed-off-by: baude <bbaude@redhat.com>
2019-06-27 07:56:24 -05:00
Giuseppe Scrivano e27fef335a
stats: fix cgroup path for rootless containers
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-26 13:17:06 +02:00
baude dd81a44ccf remove libpod from main
the compilation demands of having libpod in main is a burden for the
remote client compilations.  to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.

this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).

Signed-off-by: baude <bbaude@redhat.com>
2019-06-25 13:51:24 -05:00
Adrian Reber 94e2a0cd63
Track if a container is restored from an exported checkpoint
Instead of only tracking that a container is restored from
a checkpoint locally in runtime_ctr.go this adds a flag to the
Container structure.

Upcoming patches to correctly label the root file-system mount-point
need also to know if a container is restored from a checkpoint.

Instead of passing a parameter around a lot of functions, this
adds that information to the Container structure.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-25 14:55:11 +02:00
Matthew Heon 2ee2404683 Properly initialize container OCI runtime
Use name of the default runtime, instead of the OCIRuntime config
option, which may include a full path.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-20 16:19:49 -04:00
Matthew Heon 92bae8d308 Begin adding support for multiple OCI runtimes
Allow Podman containers to request to use a specific OCI runtime
if multiple runtimes are configured. This is the first step to
properly supporting containers in a multi-runtime environment.

The biggest changes are that all OCI runtimes are now initialized
when Podman creates its runtime, and containers now use the
runtime requested in their configuration (instead of always the
default runtime).

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-19 17:08:43 -04:00
Matthew Heon 49e696642d Add --storage flag to 'podman rm' (local only)
This flag switches to removing containers directly from c/storage
and is mostly used to remove orphan containers.

It's a superior solution to our former one, which attempted
removal from storage under certain circumstances and could, under
some conditions, not trigger.

Also contains the beginning of support for storage in `ps` but
wiring that in is going to be a much bigger pain.

Fixes #3329.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-13 17:02:20 -04:00
Adrian Reber bef83c42ea
migration: add possibility to restore a container with a new name
The option to restore a container from an external checkpoint archive
(podman container restore -i /tmp/checkpoint.tar.gz) restores a
container with the same name and same ID as id had before checkpointing.

This commit adds the option '--name,-n' to 'podman container restore'.
With this option the restored container gets the name specified after
'--name,-n' and a new ID. This way it is possible to restore one
container multiple times.

If a container is restored with a new name Podman will not try to
request the same IP address for the container as it had during
checkpointing. This implicitly assumes that if a container is restored
from a checkpoint archive with a different name, that it will be
restored multiple times and restoring a container multiple times with
the same IP address will fail as each IP address can only be used once.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-04 14:02:51 +02:00
Adrian Reber 0028578b43
Added support to migrate containers
This commit adds an option to the checkpoint command to export a
checkpoint into a tar.gz file as well as importing a checkpoint tar.gz
file during restore. With all checkpoint artifacts in one file it is
possible to easily transfer a checkpoint and thus enabling container
migration in Podman. With the following steps it is possible to migrate
a running container from one system (source) to another (destination).

 Source system:
  * podman container checkpoint -l -e /tmp/checkpoint.tar.gz
  * scp /tmp/checkpoint.tar.gz destination:/tmp

 Destination system:
  * podman pull 'container-image-as-on-source-system'
  * podman container restore -i /tmp/checkpoint.tar.gz

The exported tar.gz file contains the checkpoint image as created by
CRIU and a few additional JSON files describing the state of the
checkpointed container.

Now the container is running on the destination system with the same
state just as during checkpointing. If the container is kept running
on the source system with the checkpoint flag '-R', the result will be
that the same container is running on two different hosts.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-06-03 22:05:12 +02:00
OpenShift Merge Robot 294448c2ea
Merge pull request #2709 from haircommander/journald
Add libpod journald logging
2019-05-29 17:51:27 +02:00
Peter Hunt 51bdf29f04 Address comments
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-05-28 11:10:57 -04:00
Peter Hunt f61fa28d39 Added --log-driver and journald logging
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-05-28 11:10:57 -04:00
Giuseppe Scrivano c4dedd3021
Revert "rootless: change default path for conmon.pid"
since we now enter the user namespace prior to read the conmon.pid, we
can write the conmon.pid file again to the runtime dir.

This reverts commit 6c6a865436.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-05-25 13:47:59 +02:00
Matthew Heon 5cbb3e7e9d Use standard remove functions for removing pod ctrs
Instead of rewriting the logic, reuse the standard logic we use
for removing containers, which is much better tested.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-05-10 14:14:29 -04:00
OpenShift Merge Robot 61fa40b256
Merge pull request #2913 from mheon/get_instead_of_lookup
Use GetContainer instead of LookupContainer for full ID
2019-04-12 09:38:48 -07:00
Matthew Heon f7951c8776 Use GetContainer instead of LookupContainer for full ID
All IDs in libpod are stored as a full container ID. We can get a
container by full ID faster with GetContainer (which directly
retrieves) than LookupContainer (which finds a match, then
retrieves). No reason to use Lookup when we have full IDs present
and available.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-12 10:59:00 -04:00
Matthew Heon 27d56c7f15 Expand debugging for container cleanup errors
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-11 11:05:00 -04:00
Matthew Heon 42c95eed2c Major rework of --volumes-from flag
The flag should be substantially more durable, and no longer
relies on the create artifact.

This should allow it to properly handle our new named volume
implementation.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:27:20 -04:00
Matthew Heon 7309e38ddd Add handling for new named volumes code in pkg/spec
Now that named volumes must be explicitly enumerated rather than
passed in with all other volumes, we need to split normal and
named volumes up before passing them into libpod. This PR does
this.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:26:29 -04:00
Matthew Heon ee770ad5b5 Create non-existing named volumes at container create
Replaces old functionality we used for handling image volumes.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:26:29 -04:00
Matthew Heon d245c6df29 Switch Libpod over to new explicit named volumes
This swaps the previous handling (parse all volume mounts on the
container and look for ones that might refer to named volumes)
for the new, explicit named volume lists stored per-container.

It also deprecates force-removing volumes that are in use. I
don't know how we want to handle this yet, but leaving containers
that depend on a volume that no longer exists is definitely not
correct.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-04-04 12:26:29 -04:00
Giuseppe Scrivano 72382a12a7
rootless: use a single user namespace
simplify the rootless implementation to use a single user namespace
for all the running containers.

This makes the rootless implementation behave more like root Podman,
where each container is created in the host environment.

There are multiple advantages to it: 1) much simpler implementation as
there is only one namespace to join.  2) we can join namespaces owned
by different containers.  3) commands like ps won't be limited to what
container they can access as previously we either had access to the
storage from a new namespace or access to /proc when running from the
host.  4) rootless varlink works.  5) there are only two ways to enter
in a namespace, either by creating a new one if no containers are
running or joining the existing one from any container.

Containers created by older Podman versions must be restarted.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-04-01 15:32:58 +02:00
Giuseppe Scrivano 849548ffb8
userns: do not use an intermediate mount namespace
We have an issue in the current implementation where the cleanup
process is not able to umount the storage as it is running in a
separate namespace.

Simplify the implementation for user namespaces by not using an
intermediate mount namespace.  For doing it, we need to relax the
permissions on the parent directories and allow browsing
them. Containers that are running without a user namespace, will still
maintain mode 0700 on their directory.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-29 14:04:44 +01:00
Giuseppe Scrivano f7e72bc86a
volumes: push the chown logic to runtime_volume_linux.go
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-29 14:04:44 +01:00
Matthew Heon 5ed62991dc Remove ulele/deepcopier in favor of JSON deep copy
We have a very high performance JSON library that doesn't need to
perform code generation. Let's use it instead of our questionably
performant, reflection-dependent deep copy library.

Most changes because some functions can now return errors.

Also converts cmd/podman to use jsoniter, instead of pkg/json,
for increased performance.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-03-27 20:00:31 -04:00
Giuseppe Scrivano bf10fac193
volume: create new volumes with right ownership
when we create a new volume we must be sure it is owned by root in the
container.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-21 20:14:41 +01:00
Giuseppe Scrivano 6c6a865436
rootless: change default path for conmon.pid
We cannot use the RunDir for writing the conmon.pid file as we might
not be able to read it before we join a namespace, since it is owned
by the root in the container which can be a different uid when using
uidmap.  To avoid completely the issue, we will just write it to the
static dir which is always readable by the unprivileged user.

Closes: https://github.com/containers/libpod/issues/2673

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-03-15 22:53:23 +01:00
baude ca1e76ff63 Add event logging to libpod, even display to podman
In lipod, we now log major events that occurr.  These events
can be displayed using the `podman events` command. Each
event contains:

* Type (container, image, volume, pod...)
* Status (create, rm, stop, kill, ....)
* Timestamp in RFC3339Nano format
* Name (if applicable)
* Image (if applicable)

The format of the event and the varlink endpoint are to not
be considered stable until cockpit has done its enablement.

Signed-off-by: baude <bbaude@redhat.com>
2019-03-11 15:08:59 -05:00
Matthew Heon 83db80ce17 Only remove image volumes when removing containers
When removing volumes with rm --volumes we want to only remove
volumes that were created with the container. Volumes created
separately via 'podman volume create' should not be removed.

Also ensure that --rm implies volumes will be removed.

Fixes #2441

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-02-26 12:16:58 -05:00
Matthew Heon ca8ae877c1 Remove locks from volumes
I was looking into why we have locks in volumes, and I'm fairly
convinced they're unnecessary.

We don't have a state whose accesses we need to guard with locks
and syncs. The only real purpose for the lock was to prevent
concurrent removal of the same volume.

Looking at the code, concurrent removal ought to be fine with a
bit of reordering - one or the other might fail, but we will
successfully evict the volume from the state.

Also, remove the 'prune' bool from RemoveVolume. None of our
other API functions accept it, and it only served to toggle off
more verbose error messages.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-02-21 10:51:42 -05:00
Sebastian Jug 7141f97270 OpenTracing support added to start, stop, run, create, pull, and ps
Drop context.Context field from cli.Context

Signed-off-by: Sebastian Jug <sejug@redhat.com>
2019-02-18 09:57:08 -05:00
Daniel J Walsh 52df1fa7e0
Fix volume handling in podman
iFix builtin volumes to work with podman volume

Currently builtin volumes are not recored in podman volumes when
they are created automatically. This patch fixes this.

Remove container volumes when requested

Currently the --volume option on podman remove does nothing.
This will implement the changes needed to remove the volumes
if the user requests it.

When removing a volume make sure that no container uses the volume.

Signed-off-by: Daniel J Walsh dwalsh@redhat.com
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-02-14 13:21:52 -05:00
Daniel J Walsh 233ba5bd89
Remove container from storage on --force
Currently we can get into a state where a container exists in
storage but does not exist in libpod.  If the user forces a
removal of this container, then we should remove it from storage
even if the container is owned by another tool.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-02-09 05:33:14 -07:00
baude 64c8fb7c24 podman-remote import|export
addition of import and export for the podman-remote client.  This includes
the ability to send and receive files between the remote-client and the
"podman" host using an upgraded varlink connection.

Signed-off-by: baude <bbaude@redhat.com>
2019-02-05 10:05:41 -06:00
baude eadaa5fb42 podman-remote inspect
base enablement of the inspect command.

Signed-off-by: baude <bbaude@redhat.com>
2019-01-18 15:43:11 -06:00
Giuseppe Scrivano 3b37101c6e
config: store the runtime used to create each container
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-01-14 10:22:18 +01:00
Matthew Heon 5ed23327a9 Rename libpod.Config back to ContainerConfig
During an earlier bugfix, we swapped all instances of
ContainerConfig to Config, which was meant to fix some data we
were returning from Inspect. This unfortunately also renamed a
libpod internal struct for container configs. Undo the rename
here.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-01-07 14:37:51 -05:00
Matthew Heon 97681a5f2b Move lock init after tmp dir is populated properly
Don't initialize the lock manager until almost the end of libpod
init, so we can guarantee our tmp dir is properly set up and
exists. This wasn't an issue on systems that had previously run
Podman, but CI caught it.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-01-04 09:51:09 -05:00
Matthew Heon d4b2f11601 Convert pods to SHM locks
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2019-01-04 09:51:09 -05:00
Matthew Heon 3de560053f Convert containers to SHM locking
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2019-01-04 09:51:09 -05:00
Daniel J Walsh df99522c67
Fixes to handle /dev/shm correctly.
We had two problems with /dev/shm, first, you mount the
container read/only then /dev/shm was mounted read/only.
This is a bug a tmpfs directory should be read/write within
a read-only container.

The second problem is we were ignoring users mounted /dev/shm
from the host.

If user specified

podman run -d -v /dev/shm:/dev/shm ...

We were dropping this mount and still using the internal mount.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-12-24 09:03:53 -05:00
Daniel J Walsh c657dc4fdb
Switch all referencs to image.ContainerConfig to image.Config
This will more closely match what Docker is doing.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-12-21 15:59:34 -05:00
Giuseppe Scrivano f65eafa6ba
libpod: always store the conmon pid file
we need this information to later be able to join
the conmon process.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-12-21 09:46:05 +01:00
umohnani8 4c70b8a94b Add "podman volume" command
Add support for podman volume and its subcommands.
The commands supported are:
	podman volume create
	podman volume inspect
	podman volume ls
	podman volume rm
	podman volume prune

This is a tool to manage volumes used by podman. For now it only handle
named volumes, but eventually it will handle all volumes used by podman.

Signed-off-by: umohnani8 <umohnani@redhat.com>
2018-12-06 10:17:16 +00:00
baude 2dd9cae37c rm -f now removes a paused container
We now can remove a paused container by sending it a kill signal while it
is paused.  We then unpause the container and it is immediately killed.

Also, reworked how the parallelWorker results are handled to provide a
more consistent approach to how each subcommand implements it. It also
fixes a bug where if one container errors, the error message is duplicated
when printed out.

Signed-off-by: baude <bbaude@redhat.com>
2018-11-08 15:18:11 -06:00
Matthew Heon 140f87c474 EXPERIMENTAL: Do not call out to runc for sync
When syncing container state, we normally call out to runc to see
the container's status. This does have significant performance
implications, though, and we've seen issues with large amounts of
runc processes being spawned.

This patch attempts to use stat calls on the container exit file
created by Conmon instead to sync state. This massively decreases
the cost of calling updateContainer (it has gone from an
almost-unconditional fork/exec of runc to a single stat call that
can be avoided in most states).

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-11-07 11:36:01 -05:00
Daniel J Walsh a95d71f113
Allow containers/storage to handle on SELinux labeling
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-23 10:57:23 -04:00
Matthew Heon 39d7c869ea Fix bug with exited state and container remove
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 12:07:23 -04:00
Matthew Heon 29dbab6440 Address review comments and fix ps output
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 12:05:22 -04:00
Matthew Heon 2c7f97d5a7 Add ContainerStateExited and OCI delete() in cleanup()
To work better with Kata containers, we need to delete() from the
OCI runtime as a part of cleanup, to ensure resources aren't
retained longer than they need to be.

To enable this, we need to add a new state to containers,
ContainerStateExited. Containers transition from
ContainerStateStopped to ContainerStateExited via cleanupRuntime
which is invoked as part of cleanup(). A container in the Exited
state is identical to Stopped, except it has been removed from
the OCI runtime and thus will be handled differently when
initializing the container.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-10-02 12:05:22 -04:00
Daniel J Walsh fbfcc7842e Add new field to libpod to indicate whether or not to use labelling
Also update some missing fields libpod.conf obtions in man pages.

Fix sort order of security options and add a note about disabling
labeling.

When a process requests a new label.  libpod needs to reserve all
labels to make sure that their are no conflicts.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #1406
Approved by: mheon
2018-09-20 16:01:29 +00:00
haircommander 0e6266858a Fixing network ns segfault
As well as small style corrections, update pod_top_test to use CreatePod, and move handling of adding a container to the pod's namespace from container_internal_linux to libpod/option.

Signed-off-by: haircommander <pehunt@redhat.com>

Closes: #1187
Approved by: mheon
2018-08-23 18:16:28 +00:00
haircommander 2a7449362f Change pause container to infra container
Signed-off-by: haircommander <pehunt@redhat.com>

Closes: #1187
Approved by: mheon
2018-08-23 18:16:28 +00:00
haircommander d5e690914d Added option to share kernel namespaces in libpod and podman
A pause container is added to the pod if the user opts in. The default pause image and command can be overridden. Pause containers are ignored in ps unless the -a option is present. Pod inspect and pod ps show shared namespaces and pause container. A pause container can't be removed with podman rm, and a pod can be removed if it only has a pause container.

Signed-off-by: haircommander <pehunt@redhat.com>

Closes: #1187
Approved by: mheon
2018-08-23 18:16:28 +00:00
Valentin Rothberg c27b7cdc93 removeContainer: fix deadlock
When checking if the container has already been removed, use
c.state.HasContainer() instead of the runtime's API to avoid
trying to take the already acquired lock.

Fixes: #1245
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>

Closes: #1251
Approved by: baude
2018-08-10 13:26:58 +00:00
Matthew Heon 9bd991f477 Fix CGroupFS cgroup manager cgroup creation for pods
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #1237
Approved by: rhatdan
2018-08-08 21:03:20 +00:00
Matthew Heon fc95f68247 Set namespace for new pods/containers based on runtime
New containers and pods will default to the namespace of the
runtime, but this can be overridden by With... options if
desired.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
2018-07-24 16:12:31 -04:00
Daniel J Walsh 3a90b5224d Returning joining namespace error should not be fatal
I got my database state in a bad way by killing a hanging container.

It did not setup the network namespace correctly

listing/remove bad containers becomes impossible.

podman run alpine/nginx
^c
got me in this state.

I got into a state in the database where
podman ps -a
was returning errors and I could not get out of it,  Makeing joining the network
namespace a non fatal error fixes the issue.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #918
Approved by: mheon
2018-06-29 14:32:57 +00:00
Matthew Heon 80131339b7 Mark containers invalid earlier during removal
Fixes a bug where we might try saving back to the database during
cleanup, which would fail as the container was already removed
from the database.

Signed-off-by: Matthew Heon <mheon@redhat.com>

Closes: #1001
Approved by: rhatdan
2018-06-27 13:42:20 +00:00
Daniel J Walsh 1e9e530714 Remove container from state before cleaning up.
Attempt to cleanup as much of the container as possible, even if one
of the cleanup stages fails.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Closes: #895
Approved by: mheon
2018-06-10 11:10:11 +00:00
W. Trevor King c9f763456c libpod: Execute poststop hooks locally
Instead of delegating to the runtime, since some runtimes do not seem
to handle these reliably [1].

[1]: https://github.com/projectatomic/libpod/issues/730#issuecomment-392959938

Signed-off-by: W. Trevor King <wking@tremily.us>

Closes: #864
Approved by: rhatdan
2018-06-04 18:36:40 +00:00
Matthew Heon 7e1ea9d26d Add per-pod CGroups
Pods can now create their own (cgroupfs) cgroups which containers
in them can (optionally) use.

This presently only works with CGroupFS, systemd cgroups are
still WIP

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #784
Approved by: rhatdan
2018-05-17 23:10:12 +00:00
Matthew Heon 20bceb787d Use container cleanup() functions when removing
Instead of manually calling the individual functions that cleanup
uses to tear down a container's resources, just call the cleanup
function to make sure that cleanup only needs to happen in one
place.

Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #790
Approved by: rhatdan
2018-05-17 18:55:59 +00:00
Matthew Heon c4c5c1a3e1 Remove parent cgroup we create with cgroupfs
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #507
Approved by: baude
2018-05-11 14:43:57 +00:00
Matthew Heon 15ca5f2687 Add validation for CGroup parents. Pass CGroups path into runc
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>

Closes: #507
Approved by: baude
2018-05-11 14:43:57 +00:00
Giuseppe Scrivano 522a7197a8 podman, userNS: configure an intermediate mount namespace
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

Closes: #690
Approved by: mheon
2018-05-04 17:15:55 +00:00
umohnani8 27107fdac1 Vendor in latest containers/image and contaners/storage
Made necessary changes to functions to include contex.Context wherever needed

Signed-off-by: umohnani8 <umohnani@redhat.com>

Closes: #640
Approved by: baude
2018-04-19 14:08:47 +00:00