Commit Graph

4840 Commits

Author SHA1 Message Date
OpenShift Merge Robot 626dfdb613
Merge pull request #3691 from baude/infoeventlogger
add eventlogger to info
2019-08-05 15:23:05 +02:00
OpenShift Merge Robot e2f38cdaa4
Merge pull request #3310 from gabibeyer/rootlessKata
rootless: Rearrange setup of rootless containers ***CIRRUS: TEST IMAGES***
2019-08-05 14:26:04 +02:00
OpenShift Merge Robot b609de2e3d
Merge pull request #3673 from TomSweeneyRedHat/dev/tsweeney/trubs2
Add rootless NFS and OverlayFS warnings to docs
2019-08-05 10:20:03 +02:00
OpenShift Merge Robot 389a7b79c2
Merge pull request #3720 from baude/honorconfiginuser
honor libpod.conf in /usr/share/containers
2019-08-05 00:24:53 +02:00
baude 577b37b716 honor libpod.conf in /usr/share/containers
we should be looking for the libpod.conf file in /usr/share/containers
and not in /usr/local.  packages of podman should drop the default
libpod.conf in /usr/share.  the override remains /etc/containers/ as
well.

Fixes: #3702

Signed-off-by: baude <bbaude@redhat.com>
2019-08-04 14:04:18 -05:00
OpenShift Merge Robot d9ea4db396
Merge pull request #3717 from rhatdan/errors
Don't log errors to the screen when XDG_RUNTIME_DIR is not set
2019-08-04 16:22:48 +02:00
Daniel J Walsh 66485c80fc
Don't log errors to the screen when XDG_RUNTIME_DIR is not set
Drop errors to debug when trying to setup the runtimetmpdir.  If the tool
can not setup a runtime dir, it will error out with a correct message
no need to put errors on the screen, when the tool actually succeeds.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-04 06:50:47 -04:00
baude 63eef5a234 add eventlogger to info
to help with future debugging, we now display the type of event logger
being used inside podman info -> host.

Signed-off-by: baude <bbaude@redhat.com>
2019-08-02 20:05:27 -05:00
OpenShift Merge Robot 140e08ef64
Merge pull request #3707 from haircommander/no-errorf
Add handling for empty LogDriver
2019-08-03 00:50:36 +02:00
Peter Hunt 2110422a61 Add handling for empty LogDriver
There are two cases logdriver can be empty, if it wasn't set by libpod, or if the user did --log-driver ""
The latter case is an odd one, and the former is very possible and already handled for LogPath.
Instead of printing an error for an entirely reasonable codepath, let's supress the error

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-02 14:03:27 -04:00
TomSweeneyRedHat a0f9dbe007 Add rootless NFS and OverlayFS warnings to docs
Add warnings/work arounds about NFS and OverlayFS to the troubleshooting guide
and also the main podman page.  Verified that these warnings are on the rootless
page already.

Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2019-08-02 13:57:43 -04:00
OpenShift Merge Robot 3cc9ab8992
Merge pull request #3695 from edsantiago/bats_hang_fix
System tests: resolve hang in rawhide rootless
2019-08-02 15:37:47 +02:00
OpenShift Merge Robot 5370c53c9c
Merge pull request #3692 from haircommander/play-caps
Add Capability support to play kube
2019-08-02 10:42:46 +02:00
Valentin Rothberg 2cc5913bed
Merge pull request #3676 from fzoske/fix-typo
Fix typo
2019-08-02 10:19:24 +02:00
Ed Santiago 6eee9ab080 System tests: resolve hang in rawhide rootless
Fedora CI tests are failing on rawhide under kernel
5.3.0-0.rc1.git3.1.fc31 (rhbz#1736758). But there's
another insidious failure, a 4-hour hang in the
rootless tests on the same CI system. The culprit
line is in the podman build test, but it's actually
BATS itself that hangs, not the build command -- which
suggests that it's the usual FD 3 problem (see BATS README).
It would seem that podman is forking a process that
inherits fd 3 but that process is not getting cleaned
up when podman crashes upon encountering the kernel bug.

Today it's podman build, tomorrow it might be something
else. Let's just run all podman invocations in run_podman
with a non-bats FD 3.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2019-08-01 20:19:54 -06:00
OpenShift Merge Robot e3240daa47
Merge pull request #3551 from mheon/fix_memory_leak
Fix memory leak with exit files
2019-08-02 03:44:43 +02:00
OpenShift Merge Robot e48dc506d1
Merge pull request #3693 from QiWang19/search
fix search output limit
2019-08-02 01:22:44 +02:00
OpenShift Merge Robot 1bbcb2fc56
Merge pull request #3458 from rhatdan/volume
Use buildah/pkg/parse volume parsing rather then internal version
2019-08-01 23:24:03 +02:00
Qi Wang 619a39f7bb fix search output limit
close https://bugzilla.redhat.com/show_bug.cgi?id=1732280
From the bug Podman search returns 25 results even when limit option `--limit` is larger than 25(maxQueries). They want Podman to return `--limit` results.

This PR fixes the number of output result.
if --limit not set, return MIN(maxQueries, len(res))
if --limit is set, return MIN(option, len(res))

Signed-off-by: Qi Wang <qiwan@redhat.com>
2019-08-01 16:15:15 -04:00
Peter Hunt 834107c82e Add capability functionality to play kube
Take capabilities written in a kube and add to a container
adapt test suite and write cap-add/drop tests

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-01 15:47:45 -04:00
Matthew Heon 8da24f2f7d Use "none" instead of "null" for the null eventer
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-01 15:01:54 -04:00
OpenShift Merge Robot e1a099ed44
Merge pull request #3688 from mheon/print_pod
Print Pod ID in `podman inspect` output
2019-08-01 20:30:53 +02:00
Peter Hunt 3acfcb3062 Deduplicate capabilities in generate kube
capabilities that were added and dropped were several times duplicated. Fix this

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-01 14:12:36 -04:00
Fabian Zoske 944a68cb4d Fix typo
Signed-off-by: Fabian Zoske <git@fzoske.de>
2019-08-01 20:09:44 +02:00
Matthew Heon 6bbeda6da5 Pass on events-backend config to cleanup processes
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-01 12:37:24 -04:00
Matthew Heon ea02c11cc1 Print Pod ID in `podman inspect` output
Somehow this managed to slip through the cracks, but this is
definitely something inspect should print.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-01 11:34:36 -04:00
OpenShift Merge Robot afb493ae9b
Merge pull request #3686 from vrothberg/rawhide-builds
go build: use `-mod=vendor` for go >= 1.11.x
2019-08-01 15:50:54 +02:00
Valentin Rothberg 39a9099b3b go build: use `-mod=vendor` for go >= 1.11.x
Go 1.13.x isn't sensitive to the GO111MODULE environment variable
causing builds to not use the vendored sources in ./vendor. Force builds
of module-supporting go versions to use the vendored sources by setting
-mod=vendor.

Verified in a fedora:rawhide container.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-08-01 14:04:17 +02:00
OpenShift Merge Robot 6f62dac163
Merge pull request #3341 from rhatdan/exit
Add new exit codes to rm & rmi for running containers & dependencies
2019-08-01 13:37:19 +02:00
OpenShift Merge Robot ee15e76da0
Merge pull request #3675 from rhatdan/storage
Vendor in containers/storage v1.12.16
2019-08-01 12:55:19 +02:00
OpenShift Merge Robot 5056964d09
Merge pull request #3677 from giuseppe/systemd-cgroupsv2
systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemd
2019-08-01 11:35:20 +02:00
Daniel J Walsh 3215ea694d
Merge pull request #3681 from vrothberg/tests-check-errors
e2e test: check exit codes for pull, save, inspect
2019-08-01 05:23:36 -04:00
OpenShift Merge Robot ccf4ec295b
Merge pull request #3671 from openSUSE/runtime-path-discovery
Add runtime and conmon path discovery
2019-08-01 10:04:19 +02:00
Daniel J Walsh e7aca5568a
Use buildah/pkg/parse volume parsing rather then internal version
We share this code with buildah, so we should eliminate the podman
version.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-01 03:48:16 -04:00
Daniel J Walsh 9d6dce1199
github.com/containers/storage v1.12.13
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-01 03:46:14 -04:00
Daniel J Walsh 39de184b8b
Merge pull request #3573 from rhatdan/vendor
Vendor in latest buildah code
2019-08-01 03:41:27 -04:00
Daniel J Walsh 5370d9cb76
Add new exit codes to rm & rmi for running containers & dependencies
This enables programs and scripts wrapping the podman command to handle
'podman rm' and 'podman rmi' failures caused by paused or running
containers or due to images having other child images or dependent
containers. These errors are common enough that it makes sense to have
a more machine readable way of detecting them than parsing the standard
error output.

Signed-off-by: Ondrej Zoder <ozoder@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-01 03:40:29 -04:00
Sascha Grunert 7dfaef7766
Add runtime and conmon path discovery
The `$PATH` environment variable will now used as fallback if no valid
runtime or conmon path matches. The debug logs has been updated to state
the used executable.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-08-01 08:32:25 +02:00
Giuseppe Scrivano 223fe64dc0
systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemd
when running on a cgroups v2 system, do not bind mount
the named hierarchy /sys/fs/cgroup/systemd as it doesn't exist
anymore.  Instead bind mount the entire /sys/fs/cgroup.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-08-01 07:31:06 +02:00
Matthew Heon 9dcd76e369 Ensure we generate a 'stopped' event on force-remove
When forcibly removing a container, we are initiating an explicit
stop of the container, which is not reflected in 'podman events'.
Swap to using our standard 'stop()' function instead of a custom
one for force-remove, and move the event into the internal stop
function (so internal calls also register it).

This does add one more database save() to `podman remove`. This
should not be a terribly serious performance hit, and does have
the desirable side effect of making things generally safer.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
Matthew Heon ef2d96a7a8 Fix Dockerfile - a dependency's name was changed
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:29:14 -04:00
Matthew Heon cc63aff571 System events are valid, don't error on them
The logfile driver was not aware that system events existed.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon 318438fcb3 Do not use an events backend when restoring images
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon cdd5639d56 Expose Null eventer and allow its use in the Podman CLI
We need this specifically for tests, but others may find it
useful if they don't explicitly need events and don't want the
performance implications of using them.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon fd73075cbe Force tests to use file backend for events
Podman-in-podman (and possibly ubuntu) have "issues" with
journald. Let's just use file instead to be safe.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon 8e8d1ac193 Add a flag to set events logger type
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon 6619c073bd Fix test suite
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon 7dd1df4323 Retrieve exit codes for containers via events
As we previously removed our exit code retrieval code to stop a
memory leak, we need a new way of doing this. Fortunately, events
is able to do the job for us.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
Matthew Heon ebacfbd091 podman: fix memleak caused by renaming and not deleting
the exit file

If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.

This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgropups, unrechable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.

No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup which is a O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.

A cronjob running this command can act as workaround and will allow
all slab cache to be released, not just the single tmpfs pages.

    rm -f /run/libpod/exits/*

This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem
and selinux were disabled for testing of this fix, is because kmem
greatly decreases the kernel effectiveness in reusing partial slab
objects. cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.

The upstream podman commit used for testing is
1fe2965e4f (v1.4.4).

The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).

Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
OpenShift Merge Robot a622f8d345
Merge pull request #3682 from cevich/fix_release_rerun
Cirrus: Fix re-run of release task into no-op.
2019-07-31 20:10:03 +02:00