# ProcMount/ProcMountType Option

## Background

Currently, Docker and most other container runtimes mask certain paths in
`/proc` and set others as read-only. This prevents data that should not be
visible from being exposed to a container. However, there are certain use
cases where it is necessary to turn this off.

## Motivation

For end-users who would like to run unprivileged containers using user namespaces
_nested inside_ CRI containers, we need an option to have a `ProcMount`. That is,
we need an option to explicitly turn off the masking and read-only setting of
paths in `/proc` so that we can mount `/proc` in the nested container as an
unprivileged user.

Please see the following filed issues for more information:
- [opencontainers/runc#1658](https://github.com/opencontainers/runc/issues/1658#issuecomment-373122073)
- [moby/moby#36597](https://github.com/moby/moby/issues/36597)
- [moby/moby#36644](https://github.com/moby/moby/pull/36644)

Please also see the [use case for building images securely in Kubernetes](https://github.com/jessfraz/blog/blob/master/content/post/building-container-images-securely-on-kubernetes.md).

Unmasking the paths in `/proc` really only makes sense when a user is nesting
unprivileged containers with user namespaces, because it exposes more
information than is necessary to the program running in the container spawned
by Kubernetes.

The main use case for this option is to run
[genuinetools/img](https://github.com/genuinetools/img) inside a Kubernetes
container. That program then launches sub-containers that take advantage of
user namespaces, re-mask `/proc`, and set `/proc` as read-only. Therefore,
there is no concern with having an unmasked `/proc` open in the top-level
container.

It should be noted that this is different from the host `/proc`. It is still
a newly mounted `/proc`; the container runtime simply does not mask the paths.

Since the only use case for this option is to run unprivileged nested
containers, this option should only be allowed or used if the user in the
container is not `root`. This can be easily enforced with `MustRunAs`.
Since the user inside is still unprivileged, privileged operations on `/proc`
would be off limits regardless, since Linux's existing user permission checks
already prevent this.

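
The non-root restriction described above could be checked along the following lines. This is only a sketch under stated assumptions: `checkUnmaskedAllowed` is a hypothetical helper, not an existing Kubernetes function, and treating a missing `runAsUser` as "possibly root" is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ProcMountType mirrors the type proposed in this document.
type ProcMountType string

const (
	DefaultProcMount  ProcMountType = "Default"
	UnmaskedProcMount ProcMountType = "Unmasked"
)

// checkUnmaskedAllowed is a hypothetical helper. It rejects an unmasked
// /proc unless the container is known to run as a non-root user, that is,
// runAsUser is set and is not 0.
func checkUnmaskedAllowed(procMount ProcMountType, runAsUser *int64) error {
	if procMount != UnmaskedProcMount {
		// The default (masked) behavior needs no extra restriction.
		return nil
	}
	if runAsUser == nil {
		// Assumption: without an explicit UID the container may run as root.
		return errors.New("unmasked /proc requires an explicit non-root runAsUser")
	}
	if *runAsUser == 0 {
		return errors.New("unmasked /proc is not allowed for root (UID 0)")
	}
	return nil
}

func main() {
	uid := int64(1000)
	root := int64(0)
	fmt.Println("uid 1000:", checkUnmaskedAllowed(UnmaskedProcMount, &uid))
	fmt.Println("uid 0:   ", checkUnmaskedAllowed(UnmaskedProcMount, &root))
}
```

In a real cluster this rule would more likely be expressed through policy (for example a `MustRunAs` range that excludes UID 0) than through ad hoc code.
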
## Existing SecurityContext objects

Kubernetes defines `SecurityContext` for `Container` and `PodSecurityContext`
for `PodSpec`. `SecurityContext` objects define the related security options
for Kubernetes containers, e.g. SELinux options.

To support `ProcMount` options in Kubernetes, it is proposed to make
the following changes:

## Changes of SecurityContext objects

Add a new `string` type named `ProcMountType`, which holds the viable
options for `procMount`, and a new `procMount` field of that type to the
`SecurityContext` definition.

By default, `procMount` is `Default`, i.e. the same behavior as today, and the
paths are masked.

This will look like the following in the spec:

```go
type ProcMountType string

const (
	// DefaultProcMount uses the container runtime's default /proc mount
	// behavior. Most container runtimes mask certain paths in /proc to avoid
	// accidental security exposure of special devices or information.
	DefaultProcMount ProcMountType = "Default"

	// UnmaskedProcMount bypasses the default masking behavior of the container
	// runtime and ensures the newly created /proc in the container stays intact
	// with no modifications.
	UnmaskedProcMount ProcMountType = "Unmasked"
)

// New field in the existing SecurityContext type (other fields omitted).
type SecurityContext struct {
	ProcMount *ProcMountType `json:"procMount,omitempty"`
}
```

This requires changes to the CRI runtime integrations so that the kubelet
passes the corresponding option (`unmasked`, or whatever it ends up being
named) to the container runtime.

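
As a rough illustration of what a runtime integration might do with this option, the sketch below maps a `ProcMountType` to the masked and read-only path lists that a runtime would place in its OCI config (`linux.maskedPaths` and `linux.readonlyPaths`). The helper `procPaths` is hypothetical, and the path lists are an illustrative subset chosen for the example, not an authoritative copy of any runtime's defaults.

```go
package main

import "fmt"

// ProcMountType mirrors the type proposed in this document.
type ProcMountType string

const (
	DefaultProcMount  ProcMountType = "Default"
	UnmaskedProcMount ProcMountType = "Unmasked"
)

// Illustrative subsets only (assumption); real runtimes keep their own lists.
var (
	exampleMaskedPaths   = []string{"/proc/kcore", "/proc/keys", "/proc/timer_list"}
	exampleReadonlyPaths = []string{"/proc/bus", "/proc/sys", "/proc/sysrq-trigger"}
)

// procPaths is a hypothetical helper. It returns the masked and read-only
// paths to apply for the given option. A nil option is treated as Default.
func procPaths(pm *ProcMountType) (masked, readonly []string) {
	if pm != nil && *pm == UnmaskedProcMount {
		// Unmasked: leave the freshly mounted /proc untouched.
		return nil, nil
	}
	return exampleMaskedPaths, exampleReadonlyPaths
}

func main() {
	unmasked := UnmaskedProcMount
	m, r := procPaths(&unmasked)
	fmt.Println("unmasked:", len(m), "masked,", len(r), "read-only")

	m, r = procPaths(nil)
	fmt.Println("default: ", len(m), "masked,", len(r), "read-only")
}
```

Treating a nil pointer as `Default` keeps existing pods, which never set the field, on today's behavior.
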
## Pod Security Policy changes

A new field of type `[]ProcMountType` named `allowedProcMounts` will be added
to the Pod Security Policy to gate the `ProcMountType` values a user is
allowed to set. This field will default to `[]ProcMountType{DefaultProcMount}`.
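
The gating described above might be evaluated roughly as follows. This is a minimal sketch, not the actual admission code: `procMountAllowed` is a hypothetical name, and treating an empty `allowedProcMounts` list and an unset `procMount` as `Default` follows the defaults stated in this proposal.

```go
package main

import "fmt"

// ProcMountType mirrors the type proposed in this document.
type ProcMountType string

const (
	DefaultProcMount  ProcMountType = "Default"
	UnmaskedProcMount ProcMountType = "Unmasked"
)

// procMountAllowed is a hypothetical helper. It reports whether the
// requested procMount is permitted by the policy's allowedProcMounts.
func procMountAllowed(allowed []ProcMountType, requested *ProcMountType) bool {
	// An unset policy field defaults to allowing only DefaultProcMount.
	if len(allowed) == 0 {
		allowed = []ProcMountType{DefaultProcMount}
	}
	// An unset container field means DefaultProcMount.
	want := DefaultProcMount
	if requested != nil {
		want = *requested
	}
	for _, a := range allowed {
		if a == want {
			return true
		}
	}
	return false
}

func main() {
	unmasked := UnmaskedProcMount
	fmt.Println(procMountAllowed(nil, &unmasked))
	fmt.Println(procMountAllowed([]ProcMountType{DefaultProcMount, UnmaskedProcMount}, &unmasked))
}
```

With this shape, a policy has to opt in to `Unmasked` explicitly; pods that never set `procMount` pass any policy that still allows `Default`.
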