When /sys/kernel/mm/transparent_hugepage/enabled is set to always, the shim
process is backed by transparent huge pages (THP), which noticeably increases
its memory usage. For example:
ps -efo pid,rss,comm | grep shim
PID RSS COMMAND
2614 7464 containerd-shim
I don't think the shim needs huge pages, and turning them off for the shim
saves a noticeable amount of memory.
After setting THP_DISABLE=true, RSS drops from 7464 KB to 5648 KB (about 24% lower):
ps -efo pid,comm,rss
PID COMMAND RSS
1629841 containerd-shim 5648
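
Whether a host is affected depends on the active THP mode in the sysfs file named above. The kernel lists all modes on one line and marks the active one with square brackets, for example `[always] madvise never`. The following is a minimal sketch for checking it; the function names `active_thp_mode` and `read_thp_mode` are hypothetical and not part of the shim.

```rust
use std::fs;
use std::io;

/// System-wide THP policy file named in the text.
const THP_ENABLED_PATH: &str = "/sys/kernel/mm/transparent_hugepage/enabled";

/// Extracts the active mode from a line such as "always [madvise] never".
/// Returns None if no bracketed word is present.
fn active_thp_mode(contents: &str) -> Option<&str> {
    contents
        .split_whitespace()
        .find_map(|word| word.strip_prefix('[')?.strip_suffix(']'))
}

/// Reads the sysfs file and returns the active mode, if it can be parsed.
/// Hypothetical helper: it fails with an I/O error on kernels without THP support.
fn read_thp_mode() -> io::Result<Option<String>> {
    let contents = fs::read_to_string(THP_ENABLED_PATH)?;
    Ok(active_thp_mode(&contents).map(str::to_owned))
}
```

If `read_thp_mode()` returns `Some("always")`, the shim is subject to the memory growth described above.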
The process hierarchy is:

    containerd
      |
      |--shim1 --start
           |
           |--shim2 (this shim keeps running on the host)
                |
                |--runc create (when containerd sends a create request over ttrpc)
                     |
                     |--runc init (this is PID 1 in the container)

We should set thp_disabled=1 in shim1 (--start). Setting it in shim2 is too
late: by the time main() runs, shim2's memory is already backed by huge pages,
and setting thp_disabled afterwards does not undo pages that are already huge.
Setting thp_disabled=1 in shim1 means shim2 inherits the flag from its parent
and starts with huge pages already disabled.
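
The inheritance described above can be sketched with the Linux prctl(2) call, whose "THP disable" flag is inherited across fork() and preserved across execve(). This is a minimal sketch under stated assumptions, not the shim's actual code: the helper names `set_thp_disable` and `get_thp_disable` are hypothetical, `prctl` is declared by hand to keep the example free of dependencies, and 41 and 42 are the `PR_SET_THP_DISABLE` and `PR_GET_THP_DISABLE` values from `linux/prctl.h`.

```rust
use std::io;
use std::os::raw::{c_int, c_ulong};

// Option numbers from <linux/prctl.h>.
const PR_SET_THP_DISABLE: c_int = 41;
const PR_GET_THP_DISABLE: c_int = 42;

// Hand-written declaration of the C library function; the `libc` crate
// exposes the same symbol. `unsafe extern` needs Rust 1.82 or newer
// (older toolchains would write plain `extern "C"`).
unsafe extern "C" {
    fn prctl(option: c_int, ...) -> c_int;
}

/// Sets or clears the "THP disable" flag of the calling process.
/// The kernel requires the unused arguments to be zero.
fn set_thp_disable(disable: bool) -> io::Result<()> {
    let zero: c_ulong = 0;
    let ret = unsafe { prctl(PR_SET_THP_DISABLE, disable as c_ulong, zero, zero, zero) };
    if ret == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(())
    }
}

/// Returns the current value of the "THP disable" flag.
fn get_thp_disable() -> io::Result<bool> {
    let zero: c_ulong = 0;
    let ret = unsafe { prctl(PR_GET_THP_DISABLE, zero, zero, zero, zero) };
    if ret == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(ret != 0)
    }
}
```

In this sketch, shim1 would call `set_thp_disable(true)` right before spawning shim2, so that shim2 starts with the flag already set.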
For the runc processes, thp_disabled='before' has to be applied in shim2 after
fork() and before execve(), so we use cmd.pre_exec to do this.
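
The fork()/execve() window mentioned above is what `std::os::unix::process::CommandExt::pre_exec` provides: its closure runs in the child after fork() and before execve(). The following is a minimal sketch, assuming the desired flag value is passed in as a parameter; `command_with_thp_flag` is a hypothetical name, `program` stands in for the runc binary, and `prctl` is declared by hand rather than taken from the shim's own dependencies.

```rust
use std::io;
use std::os::raw::{c_int, c_ulong};
use std::os::unix::process::CommandExt;
use std::process::Command;

// PR_SET_THP_DISABLE from <linux/prctl.h>.
const PR_SET_THP_DISABLE: c_int = 41;

// Hand-written declaration; `unsafe extern` needs Rust 1.82 or newer.
unsafe extern "C" {
    fn prctl(option: c_int, ...) -> c_int;
}

/// Builds a command whose child applies `thp_disabled` after fork()
/// and before execve(). Nothing is spawned here.
fn command_with_thp_flag(program: &str, thp_disabled: bool) -> Command {
    let mut cmd = Command::new(program);
    // SAFETY: the closure only issues the prctl system call and reads errno;
    // it does not allocate or take locks in the forked child.
    unsafe {
        cmd.pre_exec(move || {
            let zero: c_ulong = 0;
            if prctl(PR_SET_THP_DISABLE, thp_disabled as c_ulong, zero, zero, zero) == -1 {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd
}
```

Because the hook runs only in the child, the flag of the long-lived shim2 process itself is left unchanged.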
Rust containerd shim v2 for runc container
By default containerd relies on runc shim v2 runtime (written in Go) to launch containers.
This crate is an alternative Rust implementation of the shim runtime.
It passes containerd's integration tests and can be used interchangeably with the original Go runtime.
Usage
To build the binary, run:
cargo build --release --bin containerd-shim-runc-v2-rs
Install the binary at the containerd shim path: /usr/local/bin/containerd-shim-runc-v2-rs
To use it from containerd, run:
$ sudo ctr run --rm --runtime io.containerd.runc.v2-rs -t docker.io/library/hello-world:latest hello
You can run a container with ctr, crictl, or the Kubernetes API.
Performance test
Memory overhead
Three shim binaries are compared for memory overhead: the first is containerd-shim-runc-v2,
written in Go; the second is our sync containerd-shim-runc-v2-rs; and the third is our async
containerd-shim-runc-v2-rs, limited to 2 worker threads.
We run a busybox container inside a pod on a 16U32G Ubuntu 20.04 machine with containerd v1.6.8 and runc v1.1.4. To measure the memory used by a shim process, we parse its smaps file and add up the RSS of all segments. We also run 100 pods and collect the total memory overhead.
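
The measurement described above can be sketched as follows. This is an illustration of the approach rather than the script used for the benchmark: the function names `sum_rss_kb` and `shim_rss_kb` are hypothetical, and the code assumes the usual `/proc/<pid>/smaps` layout in which each mapping has an `Rss:` line reported in kB.

```rust
use std::fs;
use std::io;

/// Sums every "Rss:" field in smaps-formatted text and returns kilobytes.
/// Lines that do not start with "Rss:" or do not parse are skipped.
fn sum_rss_kb(smaps: &str) -> u64 {
    smaps
        .lines()
        .filter_map(|line| line.strip_prefix("Rss:"))
        .filter_map(|rest| rest.split_whitespace().next()?.parse::<u64>().ok())
        .sum()
}

/// Reads /proc/<pid>/smaps for one process and returns its total RSS in kB.
fn shim_rss_kb(pid: u32) -> io::Result<u64> {
    let text = fs::read_to_string(format!("/proc/{}/smaps", pid))?;
    Ok(sum_rss_kb(&text))
}
```

For the 100-pod figure, the same function would be applied to each shim PID and the results added together.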
| | Single Process RSS | 100 Processes RSS |
|---|---|---|
| containerd-shim-runc-v2 | 11.02MB | 1106.52MB |
| containerd-shim-runc-v2-rs(sync) | 3.45MB | 345.39MB |
| containerd-shim-runc-v2-rs(async, limited to 2 work threads) | 3.90MB | 396.83MB |
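
To put the table in relative terms, the saving against the Go shim can be computed from the single-process figures. The helper below is a hypothetical illustration, not part of the crate; with the values above it gives roughly 69% for the sync shim and roughly 65% for the async shim.

```rust
/// Percentage of memory saved relative to a baseline; both values
/// must use the same unit (MB in the table above).
fn saving_percent(baseline_mb: f64, candidate_mb: f64) -> f64 {
    (1.0 - candidate_mb / baseline_mb) * 100.0
}
```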