Merge pull request #1451 from dashpole/memcg

Propose solution to make memory cgroup events effective.
k8s-ci-robot 2018-03-08 10:48:55 -08:00 committed by GitHub
commit 9bf594e91a
1 changed files with 12 additions and 6 deletions

@@ -191,6 +191,18 @@ signal. If that signal is observed as being satisfied for longer than the
specified period, the `kubelet` will initiate eviction to attempt to
reclaim the resource that has met its eviction threshold.
### Memory CGroup Notifications
When the `kubelet` is started with `--experimental-kernel-memcg-notification=true`,
it uses cgroup events on the `memory.usage_in_bytes` file to trigger the eviction manager.
Combined with on-demand metrics, this lets the `kubelet` trigger the eviction manager,
collect metrics, and respond with evictions much more quickly than with the sync loop alone.
To do this, the eviction manager periodically measures `total_inactive_file` and sets the
`usage_in_bytes` threshold to `mem_capacity - eviction_hard + total_inactive_file`.
The threshold is therefore crossed when
`usage_in_bytes - total_inactive_file = mem_capacity - eviction_hard`.
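The threshold arithmetic above can be sketched as follows. This is only an illustration of the formula, not the `kubelet` implementation: the function names and the sample values (10 GiB capacity, 1 GiB hard eviction threshold, 512 MiB of inactive file cache) are assumptions chosen for the example.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def memcg_threshold(mem_capacity: int, eviction_hard: int, total_inactive_file: int) -> int:
    """Return the usage_in_bytes value at which the notification should fire.

    Implements: mem_capacity - eviction_hard + total_inactive_file
    (hypothetical helper name; all values are in bytes).
    """
    return mem_capacity - eviction_hard + total_inactive_file


def threshold_crossed(usage_in_bytes: int, threshold: int) -> bool:
    """True once usage has reached the configured threshold."""
    return usage_in_bytes >= threshold


# Sample values, assumed for illustration only.
mem_capacity = 10 * GiB
eviction_hard = 1 * GiB
total_inactive_file = 512 * MiB

threshold = memcg_threshold(mem_capacity, eviction_hard, total_inactive_file)

# At the threshold, usage minus inactive file cache equals
# mem_capacity - eviction_hard, matching the identity in the text.
usage_minus_inactive = threshold - total_inactive_file
print(threshold, usage_minus_inactive == mem_capacity - eviction_hard)
```

Because `total_inactive_file` changes over time, the threshold has to be recomputed on each periodic measurement; a stale value would move the point at which the notification fires.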
### Disk
Let's assume the operator started the `kubelet` with the following:
@@ -457,9 +469,3 @@ In general, it should be strongly recommended that `DaemonSet` not
create `BestEffort` pods to avoid being identified as a candidate pod
for eviction. Instead, a `DaemonSet` should ideally create `Guaranteed` pods only.
## Known issues
### kubelet may evict more pods than needed
Pod eviction may evict more pods than needed because of a timing gap in stats collection. This can be mitigated in the
future by adding the ability to get root container stats on demand (https://github.com/google/cadvisor/issues/1247).