---
reviewers:
- tallclair
title: Pod Security Standards
content_type: concept
weight: 10
---

<!-- overview -->

Security settings for Pods are typically applied by using [security
contexts](/docs/tasks/configure-pod-container/security-context/). Security contexts define
privilege and access controls on a per-Pod basis.

Cluster-wide requirements for security contexts have previously been defined and enforced through
policy using [Pod Security Policy](/docs/concepts/policy/pod-security-policy/). A
_Pod Security Policy_ is a cluster-level resource that controls security-sensitive aspects of the
Pod specification.

However, numerous means of policy enforcement have arisen that augment or replace the use of
PodSecurityPolicy. The intent of this page is to detail recommended Pod security profiles, decoupled
from any specific instantiation.

<!-- body -->

## Policy Types

There is an immediate need for base policy definitions to broadly cover the security spectrum. These
should range from highly restricted to highly flexible:

- **_Privileged_** - Unrestricted policy, providing the widest possible level of permissions. This
  policy allows for known privilege escalations.
- **_Baseline_** - Minimally restrictive policy while preventing known privilege
  escalations. Allows the default (minimally specified) Pod configuration.
- **_Restricted_** - Heavily restricted policy, following current Pod hardening best practices.

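The ordering of the three profiles can be made concrete with a small sketch. The Python below is illustrative only: the `Profile` enum and the `at_least_as_strict` helper are hypothetical names introduced here, not part of any Kubernetes API.

```python
from enum import IntEnum


class Profile(IntEnum):
    """Hypothetical ordering of the three profiles, least to most restrictive."""

    PRIVILEGED = 0
    BASELINE = 1
    RESTRICTED = 2


def at_least_as_strict(actual: Profile, required: Profile) -> bool:
    """Return True if `actual` is at least as restrictive as `required`."""
    return actual >= required
```

For example, a namespace that requires Baseline would accept a workload that already meets Restricted, but not one that needs Privileged.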
## Policies

### Privileged

The Privileged policy is purposely open and entirely unrestricted. This type of policy is typically
aimed at system- and infrastructure-level workloads managed by privileged, trusted users.

The Privileged policy is defined by an absence of restrictions. For allow-by-default enforcement
mechanisms (such as Gatekeeper), the Privileged profile may be an absence of applied constraints
rather than an instantiated policy. In contrast, for a deny-by-default mechanism (such as Pod
Security Policy), the Privileged policy should enable all controls (disable all restrictions).

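The difference between the two enforcement styles can be sketched in a few lines. This is a simplified model, not how Gatekeeper or PodSecurityPolicy are implemented: the function names, the `Constraint` type, and the feature names in `ALL_FEATURES` are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Set

# A constraint inspects a Pod spec (a plain dict here) and returns True to reject it.
Constraint = Callable[[dict], bool]


def admit_allow_by_default(pod_spec: dict, constraints: Iterable[Constraint]) -> bool:
    """Admit the Pod unless some constraint rejects it."""
    return not any(rejects(pod_spec) for rejects in constraints)


def admit_deny_by_default(requested: Set[str], permitted: Set[str]) -> bool:
    """Admit the Pod only if every requested feature is explicitly permitted."""
    return requested <= permitted


def forbids_host_network(pod_spec: dict) -> bool:
    """Example constraint: reject Pods that share the host network namespace."""
    return bool(pod_spec.get("hostNetwork", False))


# Hypothetical feature names, for illustration only.
ALL_FEATURES = {"hostNetwork", "hostPID", "privileged"}
```

In this model the Privileged profile is an empty constraint list for the allow-by-default function, and the full `ALL_FEATURES` set for the deny-by-default function.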
### Baseline

The Baseline policy is aimed at ease of adoption for common containerized workloads while
preventing known privilege escalations. This policy is targeted at application operators and
developers of non-critical applications. The following controls should be enforced or disallowed
as described:

<table>
	<caption style="display:none">Baseline policy specification</caption>
	<tbody>
		<tr>
			<td><strong>Control</strong></td>
			<td><strong>Policy</strong></td>
		</tr>
		<tr>
			<td>Host Namespaces</td>
			<td>
				Sharing the host namespaces must be disallowed.<br>
				<br><b>Restricted Fields:</b><br>
				spec.hostNetwork<br>
				spec.hostPID<br>
				spec.hostIPC<br>
				<br><b>Allowed Values:</b> false<br>
			</td>
		</tr>
		<tr>
			<td>Privileged Containers</td>
			<td>
				Privileged Pods disable most security mechanisms and must be disallowed.<br>
				<br><b>Restricted Fields:</b><br>
				spec.containers[*].securityContext.privileged<br>
				spec.initContainers[*].securityContext.privileged<br>
				<br><b>Allowed Values:</b> false, undefined/nil<br>
			</td>
		</tr>
		<tr>
			<td>Capabilities</td>
			<td>
				Adding additional capabilities beyond the <a href="https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities">default set</a> must be disallowed.<br>
				<br><b>Restricted Fields:</b><br>
				spec.containers[*].securityContext.capabilities.add<br>
				spec.initContainers[*].securityContext.capabilities.add<br>
				<br><b>Allowed Values:</b> empty (or restricted to a known list)<br>
			</td>
		</tr>
		<tr>
			<td>HostPath Volumes</td>
			<td>
				HostPath volumes must be forbidden.<br>
				<br><b>Restricted Fields:</b><br>
				spec.volumes[*].hostPath<br>
				<br><b>Allowed Values:</b> undefined/nil<br>
			</td>
		</tr>
		<tr>
			<td>Host Ports</td>
			<td>
				HostPorts should be disallowed, or at minimum restricted to a known list.<br>
				<br><b>Restricted Fields:</b><br>
				spec.containers[*].ports[*].hostPort<br>
				spec.initContainers[*].ports[*].hostPort<br>
				<br><b>Allowed Values:</b> 0, undefined (or restricted to a known list)<br>
			</td>
		</tr>
		<tr>
			<td>AppArmor</td>
			<td>
				On supported hosts, the 'runtime/default' AppArmor profile is applied by default.
				The baseline policy should prevent overriding or disabling the default AppArmor
				profile, or restrict overrides to an allowed set of profiles.<br>
				<br><b>Restricted Fields:</b><br>
				metadata.annotations['container.apparmor.security.beta.kubernetes.io/*']<br>
				<br><b>Allowed Values:</b> 'runtime/default', undefined<br>
			</td>
		</tr>
		<tr>
			<td>SELinux</td>
			<td>
				Setting the SELinux type is restricted, and setting a custom SELinux user or role option is forbidden.<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.seLinuxOptions.type<br>
				spec.containers[*].securityContext.seLinuxOptions.type<br>
				spec.initContainers[*].securityContext.seLinuxOptions.type<br>
				<br><b>Allowed Values:</b><br>
				undefined/empty<br>
				container_t<br>
				container_init_t<br>
				container_kvm_t<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.seLinuxOptions.user<br>
				spec.containers[*].securityContext.seLinuxOptions.user<br>
				spec.initContainers[*].securityContext.seLinuxOptions.user<br>
				spec.securityContext.seLinuxOptions.role<br>
				spec.containers[*].securityContext.seLinuxOptions.role<br>
				spec.initContainers[*].securityContext.seLinuxOptions.role<br>
				<br><b>Allowed Values:</b> undefined/empty<br>
			</td>
		</tr>
		<tr>
			<td>/proc Mount Type</td>
			<td>
				The default /proc masks are set up to reduce attack surface, and should be required.<br>
				<br><b>Restricted Fields:</b><br>
				spec.containers[*].securityContext.procMount<br>
				spec.initContainers[*].securityContext.procMount<br>
				<br><b>Allowed Values:</b> undefined/nil, 'Default'<br>
			</td>
		</tr>
		<tr>
			<td>Sysctls</td>
			<td>
				Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed "safe" subset.
				A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node.<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.sysctls<br>
				<br><b>Allowed Values:</b><br>
				kernel.shm_rmid_forced<br>
				net.ipv4.ip_local_port_range<br>
				net.ipv4.tcp_syncookies<br>
				net.ipv4.ping_group_range<br>
				undefined/empty<br>
			</td>
		</tr>
	</tbody>
</table>

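As an illustration of how the Baseline table might be applied to a Pod manifest, here is a minimal Python sketch. It assumes the manifest has already been parsed into a plain dict (for example from YAML), and `baseline_violations` is a hypothetical helper, not part of any Kubernetes tool. It covers the strictest reading of each control (no "known list" exceptions) and omits the AppArmor and SELinux rows for brevity.

```python
from typing import List

# Safe sysctl names, copied from the Sysctls row of the table above.
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
    "net.ipv4.ping_group_range",
}


def baseline_violations(pod: dict) -> List[str]:
    """Return human-readable Baseline violations for a Pod manifest dict."""
    spec = pod.get("spec") or {}
    found = []

    # Host namespaces must not be shared.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field):
            found.append(f"spec.{field} must be false")

    # HostPath volumes are forbidden.
    for volume in spec.get("volumes") or []:
        if volume.get("hostPath") is not None:
            found.append(f"volume {volume.get('name')!r} uses hostPath")

    # Only the safe subset of sysctls may be set.
    pod_sc = spec.get("securityContext") or {}
    for sysctl in pod_sc.get("sysctls") or []:
        if sysctl.get("name") not in SAFE_SYSCTLS:
            found.append(f"sysctl {sysctl.get('name')!r} is not in the safe set")

    # Container-level checks apply to both containers and initContainers.
    for kind in ("containers", "initContainers"):
        for container in spec.get(kind) or []:
            name = container.get("name")
            sc = container.get("securityContext") or {}
            if sc.get("privileged"):
                found.append(f"{kind} {name!r} is privileged")
            if (sc.get("capabilities") or {}).get("add"):
                found.append(f"{kind} {name!r} adds capabilities")
            if sc.get("procMount") not in (None, "Default"):
                found.append(f"{kind} {name!r} sets a non-default procMount")
            for port in container.get("ports") or []:
                if port.get("hostPort"):  # 0 and undefined are allowed
                    found.append(f"{kind} {name!r} uses hostPort {port['hostPort']}")
    return found
```

An empty result means the sketch found nothing that the covered Baseline controls disallow. A real enforcement mechanism would also need to handle the "restricted to a known list" variants that the table permits.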
### Restricted

The Restricted policy is aimed at enforcing current Pod hardening best practices, at the expense of
some compatibility. It is targeted at operators and developers of security-critical applications, as
well as lower-trust users. The following controls should be enforced or disallowed as described:

<table>
	<caption style="display:none">Restricted policy specification</caption>
	<tbody>
		<tr>
			<td><strong>Control</strong></td>
			<td><strong>Policy</strong></td>
		</tr>
		<tr>
			<td colspan="2"><em>Everything from the baseline profile.</em></td>
		</tr>
		<tr>
			<td>Volume Types</td>
			<td>
				In addition to restricting HostPath volumes, the restricted profile limits usage of non-core volume types to those defined through PersistentVolumes.<br>
				<br><b>Restricted Fields:</b><br>
				spec.volumes[*].hostPath<br>
				spec.volumes[*].gcePersistentDisk<br>
				spec.volumes[*].awsElasticBlockStore<br>
				spec.volumes[*].gitRepo<br>
				spec.volumes[*].nfs<br>
				spec.volumes[*].iscsi<br>
				spec.volumes[*].glusterfs<br>
				spec.volumes[*].rbd<br>
				spec.volumes[*].flexVolume<br>
				spec.volumes[*].cinder<br>
				spec.volumes[*].cephFS<br>
				spec.volumes[*].flocker<br>
				spec.volumes[*].fc<br>
				spec.volumes[*].azureFile<br>
				spec.volumes[*].vsphereVolume<br>
				spec.volumes[*].quobyte<br>
				spec.volumes[*].azureDisk<br>
				spec.volumes[*].portworxVolume<br>
				spec.volumes[*].scaleIO<br>
				spec.volumes[*].storageos<br>
				spec.volumes[*].csi<br>
				<br><b>Allowed Values:</b> undefined/nil<br>
			</td>
		</tr>
		<tr>
			<td>Privilege Escalation</td>
			<td>
				Privilege escalation (such as via set-user-ID or set-group-ID file mode) should not be allowed.<br>
				<br><b>Restricted Fields:</b><br>
				spec.containers[*].securityContext.allowPrivilegeEscalation<br>
				spec.initContainers[*].securityContext.allowPrivilegeEscalation<br>
				<br><b>Allowed Values:</b> false<br>
			</td>
		</tr>
		<tr>
			<td>Running as Non-root</td>
			<td>
				Containers must be required to run as non-root users.<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.runAsNonRoot<br>
				spec.containers[*].securityContext.runAsNonRoot<br>
				spec.initContainers[*].securityContext.runAsNonRoot<br>
				<br><b>Allowed Values:</b> true<br>
			</td>
		</tr>
		<tr>
			<td>Non-root groups <em>(optional)</em></td>
			<td>
				Containers should be forbidden from running with a root primary or supplementary GID.<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.runAsGroup<br>
				spec.securityContext.supplementalGroups[*]<br>
				spec.securityContext.fsGroup<br>
				spec.containers[*].securityContext.runAsGroup<br>
				spec.initContainers[*].securityContext.runAsGroup<br>
				<br><b>Allowed Values:</b><br>
				non-zero<br>
				undefined / nil (except for <code>*.runAsGroup</code>)<br>
			</td>
		</tr>
		<tr>
			<td>Seccomp</td>
			<td>
				The RuntimeDefault seccomp profile must be required, or specific additional profiles allowed.<br>
				<br><b>Restricted Fields:</b><br>
				spec.securityContext.seccompProfile.type<br>
				spec.containers[*].securityContext.seccompProfile.type<br>
				spec.initContainers[*].securityContext.seccompProfile.type<br>
				<br><b>Allowed Values:</b><br>
				'RuntimeDefault'<br>
				undefined / nil<br>
			</td>
		</tr>
	</tbody>
</table>

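Some of the additional Restricted controls can be sketched the same way. The Python below is a hedged illustration: `restricted_violations` is a hypothetical helper operating on a manifest parsed into a dict, and it covers only the Volume Types, Privilege Escalation, and Running as Non-root rows. The Baseline controls, the optional group check, and the seccomp check are left out. It also assumes that a container-level `runAsNonRoot`, when set, takes precedence over the Pod-level value.

```python
from typing import List

# Volume fields that the Restricted table above lists as restricted.
RESTRICTED_VOLUME_FIELDS = {
    "hostPath", "gcePersistentDisk", "awsElasticBlockStore", "gitRepo", "nfs",
    "iscsi", "glusterfs", "rbd", "flexVolume", "cinder", "cephFS", "flocker",
    "fc", "azureFile", "vsphereVolume", "quobyte", "azureDisk",
    "portworxVolume", "scaleIO", "storageos", "csi",
}


def restricted_violations(pod: dict) -> List[str]:
    """Return violations of a subset of the Restricted controls."""
    spec = pod.get("spec") or {}
    found = []

    # Restricted volume types must be left undefined.
    for volume in spec.get("volumes") or []:
        for field in sorted(RESTRICTED_VOLUME_FIELDS.intersection(volume)):
            found.append(f"volume {volume.get('name')!r} uses {field}")

    pod_sc = spec.get("securityContext") or {}
    pod_non_root = pod_sc.get("runAsNonRoot")

    for kind in ("containers", "initContainers"):
        for container in spec.get(kind) or []:
            name = container.get("name")
            sc = container.get("securityContext") or {}

            # allowPrivilegeEscalation must be explicitly false.
            if sc.get("allowPrivilegeEscalation") is not False:
                found.append(f"{kind} {name!r} must set allowPrivilegeEscalation=false")

            # Assumption: the container-level value overrides the Pod-level one.
            non_root = sc.get("runAsNonRoot", pod_non_root)
            if non_root is not True:
                found.append(f"{kind} {name!r} must run as non-root")
    return found
```

Because the Restricted profile includes everything from the Baseline profile, a complete check would run the Baseline controls as well as these.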
## Policy Instantiation

Decoupling policy definition from policy instantiation allows for a common understanding and
consistent language of policies across clusters, independent of the underlying enforcement
mechanism.

As mechanisms mature, they will be defined below on a per-policy basis. The methods of enforcement
of individual policies are not defined here.

[**PodSecurityPolicy**](/docs/concepts/policy/pod-security-policy/)

- [Privileged](https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/policy/privileged-psp.yaml)
- [Baseline](https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/policy/baseline-psp.yaml)
- [Restricted](https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/policy/restricted-psp.yaml)

## FAQ

### Why isn't there a profile between privileged and baseline?

The three profiles defined here have a clear linear progression from most secure (restricted) to least
secure (privileged), and cover a broad set of workloads. Privileges required above the baseline
policy are typically very application-specific, so we do not offer a standard profile in this
niche. This is not to say that the privileged profile should always be used in this case, but that
policies in this space need to be defined on a case-by-case basis.

SIG Auth may reconsider this position in the future, should a clear need for other profiles arise.

### What's the difference between a security policy and a security context?

[Security Contexts](/docs/tasks/configure-pod-container/security-context/) configure Pods and
Containers at runtime. Security contexts are defined as part of the Pod and container specifications
in the Pod manifest, and represent parameters to the container runtime.

Security policies are control plane mechanisms to enforce specific settings in the Security Context,
as well as other parameters outside the Security Context. As of February 2020, the current native
solution for enforcing these security policies is [Pod Security
Policy](/docs/concepts/policy/pod-security-policy/) - a mechanism for centrally enforcing security
policy on Pods across a cluster. Other alternatives for enforcing security policy are being
developed in the Kubernetes ecosystem, such as [OPA
Gatekeeper](https://github.com/open-policy-agent/gatekeeper).

### What profiles should I apply to my Windows Pods?

Windows in Kubernetes has some limitations and differentiators from standard Linux-based
workloads. Specifically, the Pod SecurityContext fields [have no effect on
Windows](/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#v1-podsecuritycontext). As
such, no standardized Pod Security profiles currently exist.

### What about sandboxed Pods?

There is not currently an API standard that controls whether a Pod is considered sandboxed or
not. Sandboxed Pods may be identified by the use of a sandboxed runtime (such as gVisor or Kata
Containers), but there is no standard definition of what a sandboxed runtime is.

The protections necessary for sandboxed workloads can differ from others. For example, the need to
restrict privileged permissions is lessened when the workload is isolated from the underlying
kernel. This allows for workloads requiring heightened permissions to still be isolated.

Additionally, the protection of sandboxed workloads is highly dependent on the method of
sandboxing. As such, no single policy is recommended for all sandboxed workloads.