Commit Graph

4788 Commits

Author SHA1 Message Date
Matthew Heon ebacfbd091 podman: fix memleak caused by renaming and not deleting
the exit file

If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.

This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgropups, unrechable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.

No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup which is a O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.

A cronjob running this command can act as workaround and will allow
all slab cache to be released, not just the single tmpfs pages.

    rm -f /run/libpod/exits/*

This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem
and selinux were disabled for testing of this fix, is because kmem
greatly decreases the kernel effectiveness in reusing partial slab
objects. cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.

The upstream podman commit used for testing is
1fe2965e4f (v1.4.4).

The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).

Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>

<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-31 17:28:42 -04:00
OpenShift Merge Robot a622f8d345
Merge pull request #3682 from cevich/fix_release_rerun
Cirrus: Fix re-run of release task into no-op.
2019-07-31 20:10:03 +02:00
Chris Evich 3e3afb942a
Cirrus: Fix release dependencies
The release-task ***must*** always execute last, in order to guarantee a
consistent cache of release archives from dependent tasks.  It
accomplishes this by verifying it's task-number matches one-less than
the total number of tasks.  Previous to this commit, a YAML anchor/alias
was used to avoid duplication of the dependency list between 'success'
and 'release'

However, it's been observed that this opens the possibility for
'release' and 'success' tasks to race when running on a PR.  Because
YAML anchor/aliases cannot be used to modify lists, duplication is
required to make 'release' actually depend upon 'success'.

This duplication will introduce an additional maintenance burden.
Though when adding a new task, it's already very easy to forget to
update the 'depends_on' list.  Assist both cases by the addition
unit-tests to verify ``.cirrus.yml`` dependency contents and structure.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-31 11:49:53 -04:00
Chris Evich cb2ea1a27b
Cirrus: Fix re-run of release task into no-op.
This task depends upon other tasks caching their binaries.  If for
whatever reason the `release` task is re-run and/or is out-of-order
with it's dependents, the state of cache will be undefined. Previously
this would result in an error, and failing of the release task.
This commit alters this behavior to issue a warning instead.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-31 09:42:52 -04:00
OpenShift Merge Robot 680a383874
Merge pull request #3672 from petejohanson/32bit-build-fixes
Build fix for 32-bit systems.
2019-07-30 22:07:32 +02:00
OpenShift Merge Robot e84ed3c1bc
Merge pull request #3665 from QiWang19/env
Set -env variables as appropriate
2019-07-30 21:20:34 +02:00
Pete Johanson 32aaf8da56 Build fix for 32-bit systems.
* Fixes #3664.

Signed-off-by: Pete Johanson <peter@peterjohanson.com>
2019-07-30 12:25:36 -04:00
Qi Wang 2da86bdc3a Set -env variables as appropriate
close #3648

podman create and podman run do not set --env variable if the environment is not present with a value

Signed-off-by: Qi Wang <qiwan@redhat.com>
2019-07-30 12:02:18 -04:00
OpenShift Merge Robot 1a008958d4
Merge pull request #3661 from openSUSE/nixos-friendly-config
Update libpod.conf to be more friendly to NixOS
2019-07-30 16:33:48 +02:00
OpenShift Merge Robot 4196a59452
Merge pull request #3668 from TomSweeneyRedHat/dev/tsweeney/adderror
Touch up input argument error on create
2019-07-30 15:59:59 +02:00
TomSweeneyRedHat 0b14e53590 Touch up input argument error on create
Add an error when there are not enough input arguments for remote
create.  Addresses comments in #3656

Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2019-07-30 09:05:48 -04:00
Sascha Grunert 52ae51c79f
Update libpod.conf to be NixOS friendly
NixOS links the current system state to `/run/current-system`, so we
have to add these paths to the configuration files as well to work out
of the box.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-07-30 12:59:11 +02:00
OpenShift Merge Robot 040355d450
Merge pull request #3667 from major/test-with-username-has-dash
Allow info test to work with usernames w/dash
2019-07-30 02:19:19 +02:00
Major Hayden 9822f54ac3
Allow info test to work with usernames w/dash
The regular expression used in the `info` test does not allow for
usernames that have a dash, such as `test-user`. This patch adjusts
the regex to allow for a dash.

Fixes #3666.

Signed-off-by: Major Hayden <major@redhat.com>
2019-07-29 16:08:51 -05:00
OpenShift Merge Robot 7d635ac1c5
Merge pull request #3656 from jwhonce/wip/env
Fix commit --changes env=X=Y
2019-07-29 21:57:08 +02:00
OpenShift Merge Robot 71bb2889ff
Merge pull request #3660 from LaszloGombos/master
Fix the syntax in the podman export documentation example
2019-07-29 21:43:56 +02:00
OpenShift Merge Robot 5343d79e6c
Merge pull request #3663 from adrianreber/random-test-ip
Move random IP code for tests from checkpoint to common
2019-07-29 21:31:13 +02:00
OpenShift Merge Robot c3c45f3ba5
Merge pull request #3646 from vrothberg/hi-scott
fix `podman -v` regression
2019-07-29 19:54:49 +02:00
Laszlo Gombos e46d9b2e45 Fix the syntax in the podman export documentation example
Signed-off-by: Laszlo Gombos <laszlo.gombos@gmail.com>
2019-07-29 11:14:31 -04:00
OpenShift Merge Robot 6665269ab8
Merge pull request #3233 from wking/fatal-requested-hook-directory-does-not-exist
libpod/container_internal: Make all errors loading explicitly configured hook dirs fatal
2019-07-29 16:39:08 +02:00
OpenShift Merge Robot 2ca7861b8e
Merge pull request #3650 from cevich/fix_clone_depth
Cirrus: Remove fixed clone depth
2019-07-29 16:24:58 +02:00
Valentin Rothberg 6065070bae fix `podman -v` regression
Re-add the shortflag for --version and add e2e tests to avoid regressing
in the future.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-07-29 14:47:21 +02:00
Adrian Reber 90ffba92e9
Move random IP code for tests from checkpoint to common
The function to generate random IP addresses during ginkgo tests in
the checkpoint test code is moved to common and all tests using
hardcoded IP addresses have been changed to use random IP addresses to
reduce test errors when running the tests in parallel.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-07-29 14:24:08 +02:00
OpenShift Merge Robot 2c98bd5398
Merge pull request #3654 from TomSweeneyRedHat/dev/tsweeney/commandpause
Update pause/unpause video links and demo
2019-07-28 17:06:17 +02:00
Jhon Honce 40bf0649af Fix commit --changes env=X=Y
Signed-off-by: Jhon Honce <jhonce@redhat.com>
2019-07-26 16:04:17 -07:00
TomSweeneyRedHat 1bedf04416 Update pause/unpause video links and demo
Update the links for the asciinema casts and the demo for the
`podman pause` and `podman unpause` commands on the commands.md
page.

Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
2019-07-26 17:04:23 -04:00
Chris Evich 6146a3ad0f
Cirrus: Remove fixed clone depth
It's been observed on several occasions, some tests fail in git clones
with a "cannot find ref" type error.  Especially in the depth=1 cases.
Since there's really only one place where limiting the depth makes sense
(build-each-commit), simply remove all the other limits.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-26 11:50:44 -04:00
OpenShift Merge Robot 0c4dfcfe57
Merge pull request #3639 from giuseppe/user-ns-container
podman: support --userns=ns|container
2019-07-26 15:06:06 +02:00
OpenShift Merge Robot b212daa92f
Merge pull request #3632 from cevich/small_cirrus_fixes
Small cirrus and image-build fixes
2019-07-26 14:55:12 +02:00
OpenShift Merge Robot eca157fb54
Merge pull request #3627 from ashley-cui/rmdocs
Documenation & make tar.gz for remote
2019-07-26 11:37:10 +02:00
OpenShift Merge Robot 1910d68e4d
Merge pull request #3645 from mheon/systemd_ubuntu
Use systemd cgroups for Ubuntu
2019-07-26 10:41:04 +02:00
OpenShift Merge Robot 4674d00f46
Merge pull request #3580 from samc24/hook
Improved hooks monitoring
2019-07-26 10:25:05 +02:00
Giuseppe Scrivano 1d72f651e4
podman: support --userns=ns|container
allow to join the user namespace of another container.

Closes: https://github.com/containers/libpod/issues/3629

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-25 23:04:55 +02:00
Giuseppe Scrivano ba5741e398
pods: do not to join a userns if there is not any
do not attempt to join the user namespace if the pod is running in the
host user namespace.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-25 23:04:54 +02:00
Ashley Cui ce0132a45e Documenation & build automation for remote darwin
Created shell script to automatically compile remote-only docs & rename
Added make brew-pkg to automatically package files needed for homebrew
Add missing docs

Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
2019-07-25 15:36:39 -04:00
Chris Evich ac5ad9acbf
Cirrus: Bypass release during image-building
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 15:20:57 -04:00
Matthew Heon 0934949220 Use systemd cgroups for Ubuntu
It seems like our VM images now support systemd CGroups with the
Ubuntu LTS images. No reason to keep testing CGroupfs as such,
systemd is much less racy (and CGroupfs on systemd-enabled
systems can be iffy).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-25 14:57:58 -04:00
Chris Evich 07b1e331c2
Cirrus: Ubuntu: Set + Test for $RUNC_BINARY
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 14:02:12 -04:00
Chris Evich f55288c96f
Cirrus: Simplify evil-unit check in image
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
Chris Evich ceb3d76298
Cirrus: Silence systemd-banish noise
It's somewhat hard to predict which units are certinly present
for any given base-image.  Therefore, at image-build time, it's
distracting and unhelpful to see all the errors about units that
don't exist, on every platform.  Simply ignore them and rely on
the `check_image.sh` test to confirm none are enabled.

Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
Chris Evich e3082762fe
Cirrus: Fix image build metadata update
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
Chris Evich 6942d3275d
Cirrus: Fix missing -n on CentOS
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
Chris Evich 23722e644e
Cirrus: Remove disused COMMIT variables
Signed-off-by: Chris Evich <cevich@redhat.com>
2019-07-25 13:51:33 -04:00
OpenShift Merge Robot dff82d940e
Merge pull request #3643 from openSUSE/history-panic
Fix possible runtime panic if image history len is zero
2019-07-25 19:26:47 +02:00
OpenShift Merge Robot 5763618ce5
Merge pull request #3631 from TristanCacqueray/master
Document SELinux label requirements for the rootfs argument
2019-07-25 18:03:35 +02:00
OpenShift Merge Robot 1b95ed9a71
Merge pull request #3622 from QiWang19/checkurl
fix import not ignoring url path
2019-07-25 17:49:08 +02:00
samc24 d6ea4b4139 Improved hooks monitoring
...to work for specific edge cases with a simpler solution.
Re-reads hooks directories after any changes are detected by the watchers.
Added monitoring test for adding a different invalid hook to primary directory.
Some issues with prior code:
- ReadDir would stop when it encounters an invalid hook, rather than registering an error but continuing to read the valid hook.
- Wouldn’t account for Rename and Chmod events.
- After doing a mv of the hooks file instead of rm, it would still think the hooks file is in the directory, but it has been moved to another location.
- If a hook file was renamed, it would register the renamed file as a separate hook and not delete the original, so it would then execute the hook twice - once for the renamed file, and once for the original name which it did not delete.

Signed-off-by: samc24 <sam.chaturvedi24@gmail.com>
2019-07-25 09:52:45 -04:00
Sascha Grunert 7630f1b52e
Fix possible runtime panic if image history len is zero
We now return an empty string for the `Comment` field if an OCI v1 image
contains no history.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-07-25 12:45:08 +02:00
OpenShift Merge Robot 7c9095ea1d
Merge pull request #3641 from mheon/no_fuzzy_volume_lookup
When retrieving volumes, only use exact names
2019-07-25 11:24:29 +02:00
Matthew Heon f747a06d53 When retrieving volumes, only use exact names
We should not be fuzzy matching on volume names. Docker doesn't
do  it, and it doesn't make much sense. Everything requires exact
matches for names - only IDs allow partial matches.

Fixes #3635

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-24 22:30:16 -04:00