feature: Pod Security Standards documentation

Samuel Roth 2021-07-11 19:40:09 +00:00
parent 4c6a904242
commit 37dd90d81a
3 changed files with 492 additions and 184 deletions


@ -2,8 +2,10 @@
reviewers:
- zparnold
title: Overview of Cloud Native Security
description: >
A model for thinking about Kubernetes security in the context of Cloud Native security.
content_type: concept
weight: 10
weight: 1
---
<!-- overview -->


@ -2,58 +2,39 @@
reviewers:
- tallclair
title: Pod Security Standards
description: >
A detailed look at the different policy levels defined in the Pod Security Standards.
content_type: concept
weight: 10
---
<!-- overview -->
Security settings for Pods are typically applied by using [security
contexts](/docs/tasks/configure-pod-container/security-context/). Security Contexts allow for the
definition of privilege and access controls on a per-Pod basis.
The enforcement and policy-based definition of cluster requirements of security contexts has
previously been achieved using [Pod Security Policy](/docs/concepts/policy/pod-security-policy/). A
_Pod Security Policy_ is a cluster-level resource that controls security sensitive aspects of the
Pod specification.
However, numerous means of policy enforcement have arisen that augment or replace the use of
PodSecurityPolicy. The intent of this page is to detail recommended Pod security profiles, decoupled
from any specific instantiation.
The Pod Security Standards define three different _policies_ to broadly cover the security spectrum. These policies are _cumulative_ and range from highly-permissive to highly-restrictive. This guide outlines the requirements of each policy.
| Profile | Description |
| ------ | ----------- |
| <strong style="white-space: nowrap">Privileged</strong> | Unrestricted policy, providing the widest possible level of permissions. This policy allows for known privilege escalations. |
| <strong style="white-space: nowrap">Baseline</strong> | Minimally restrictive policy which prevents known privilege escalations. Allows the default (minimally specified) Pod configuration. |
| <strong style="white-space: nowrap">Restricted</strong> | Heavily restricted policy, following current Pod hardening best practices. |
<!-- body -->
## Policy Types
There is an immediate need for base policy definitions to broadly cover the security spectrum. These
should range from highly restricted to highly flexible:
- **_Privileged_** - Unrestricted policy, providing the widest possible level of permissions. This
policy allows for known privilege escalations.
- **_Baseline_** - Minimally restrictive policy while preventing known privilege
escalations. Allows the default (minimally specified) Pod configuration.
- **_Restricted_** - Heavily restricted policy, following current Pod hardening best practices.
## Policies
## Profile Details
### Privileged
The Privileged policy is purposely-open, and entirely unrestricted. This type of policy is typically
aimed at system- and infrastructure-level workloads managed by privileged, trusted users.
**The _Privileged_ policy is purposely-open, and entirely unrestricted.** This type of policy is typically aimed at system- and infrastructure-level workloads managed by privileged, trusted users.
The privileged policy is defined by an absence of restrictions. For allow-by-default enforcement
mechanisms (such as gatekeeper), the privileged profile may be an absence of applied constraints
rather than an instantiated policy. In contrast, for a deny-by-default mechanism (such as Pod
Security Policy) the privileged policy should enable all controls (disable all restrictions).
The Privileged policy is defined by an absence of restrictions. For allow-by-default enforcement mechanisms (such as gatekeeper), the Privileged policy may be an absence of applied constraints rather than an instantiated profile. In contrast, for a deny-by-default mechanism (such as Pod Security Policy) the Privileged policy should enable all controls (disable all restrictions).
### Baseline
The Baseline policy is aimed at ease of adoption for common containerized workloads while
preventing known privilege escalations. This policy is targeted at application operators and
developers of non-critical applications. The following listed controls should be
enforced/disallowed:
**The _Baseline_ policy is aimed at ease of adoption for common containerized workloads while preventing known privilege escalations.** This policy is targeted at application operators and developers of non-critical applications. The following listed controls should be enforced/disallowed:
{{< note >}}
In this table, wildcards (`*`) indicate all elements in a list. For example, `spec.containers[*].securityContext` refers to the Security Context object for _all defined containers_. If any of the listed containers fails to meet the requirements, the entire pod will fail validation.
{{< /note >}}
<table>
<caption style="display:none">Baseline policy specification</caption>
@ -63,112 +44,196 @@ enforced/disallowed:
<td><strong>Policy</strong></td>
</tr>
<tr>
<td>Host Namespaces</td>
<td style="white-space: nowrap">Host Namespaces</td>
<td>
Sharing the host namespaces must be disallowed.<br>
<br><b>Restricted Fields:</b><br>
spec.hostNetwork<br>
spec.hostPID<br>
spec.hostIPC<br>
<br><b>Allowed Values:</b> false<br>
<p>Sharing the host namespaces must be disallowed.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.hostNetwork</code></li>
<li><code>spec.hostPID</code></li>
<li><code>spec.hostIPC</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li><code>false</code></li>
</ul>
</td>
</tr>
<tr>
<td>Privileged Containers</td>
<td style="white-space: nowrap">Privileged Containers</td>
<td>
Privileged Pods disable most security mechanisms and must be disallowed.<br>
<br><b>Restricted Fields:</b><br>
spec.containers[*].securityContext.privileged<br>
spec.initContainers[*].securityContext.privileged<br>
<br><b>Allowed Values:</b> false, undefined/nil<br>
<p>Privileged Pods disable most security mechanisms and must be disallowed.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.containers[*].securityContext.privileged</code></li>
<li><code>spec.initContainers[*].securityContext.privileged</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>false</code></li>
</ul>
</td>
</tr>
<tr>
<td>Capabilities</td>
<td style="white-space: nowrap">Capabilities</td>
<td>
Adding additional capabilities beyond the <a href="https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities">default set</a> must be disallowed.<br>
<br><b>Restricted Fields:</b><br>
spec.containers[*].securityContext.capabilities.add<br>
spec.initContainers[*].securityContext.capabilities.add<br>
<br><b>Allowed Values:</b> empty (or restricted to a known list)<br>
<p>Adding additional capabilities beyond the <a href="https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities">default set</a> must be disallowed.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.containers[*].securityContext.capabilities.add</code></li>
<li><code>spec.initContainers[*].securityContext.capabilities.add</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>AUDIT_WRITE</code></li>
<li><code>CHOWN</code></li>
<li><code>DAC_OVERRIDE</code></li>
<li><code>FOWNER</code></li>
<li><code>FSETID</code></li>
<li><code>KILL</code></li>
<li><code>MKNOD</code></li>
<li><code>NET_BIND_SERVICE</code></li>
<li><code>SETFCAP</code></li>
<li><code>SETGID</code></li>
<li><code>SETPCAP</code></li>
<li><code>SETUID</code></li>
<li><code>SYS_CHROOT</code></li>
</ul>
</td>
</tr>
<tr>
<td>HostPath Volumes</td>
<td style="white-space: nowrap">HostPath Volumes</td>
<td>
HostPath volumes must be forbidden.<br>
<br><b>Restricted Fields:</b><br>
spec.volumes[*].hostPath<br>
<br><b>Allowed Values:</b> undefined/nil<br>
<p>HostPath volumes must be forbidden.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.volumes[*].hostPath</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
</ul>
</td>
</tr>
<tr>
<td>Host Ports</td>
<td style="white-space: nowrap">Host Ports</td>
<td>
HostPorts should be disallowed, or at minimum restricted to a known list.<br>
<br><b>Restricted Fields:</b><br>
spec.containers[*].ports[*].hostPort<br>
spec.initContainers[*].ports[*].hostPort<br>
<br><b>Allowed Values:</b> 0, undefined (or restricted to a known list)<br>
<p>HostPorts should be disallowed, or at minimum restricted to a known list.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.containers[*].ports[*].hostPort</code></li>
<li><code>spec.initContainers[*].ports[*].hostPort</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li>Known list</li>
<li><code>0</code></li>
</ul>
</td>
</tr>
<tr>
<td>AppArmor</td>
<td style="white-space: nowrap">AppArmor</td>
<td>
On supported hosts, the 'runtime/default' AppArmor profile is applied by default.
The baseline policy should prevent overriding or disabling the default AppArmor
profile, or restrict overrides to an allowed set of profiles.<br>
<br><b>Restricted Fields:</b><br>
metadata.annotations['container.apparmor.security.beta.kubernetes.io/*']<br>
<br><b>Allowed Values:</b> 'runtime/default', undefined<br>
<p>On supported hosts, the <code>runtime/default</code> AppArmor profile is applied by default. The baseline policy should prevent overriding or disabling the default AppArmor profile, or restrict overrides to an allowed set of profiles.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>metadata.annotations["container.apparmor.security.beta.kubernetes.io/*"]</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>runtime/default</code></li>
</ul>
</td>
</tr>
<tr>
<td>SELinux</td>
<td style="white-space: nowrap">SELinux</td>
<td>
Setting the SELinux type is restricted, and setting a custom SELinux user or role option is forbidden.<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.seLinuxOptions.type<br>
spec.containers[*].securityContext.seLinuxOptions.type<br>
spec.initContainers[*].securityContext.seLinuxOptions.type<br>
<br><b>Allowed Values:</b><br>
undefined/empty<br>
container_t<br>
container_init_t<br>
container_kvm_t<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.seLinuxOptions.user<br>
spec.containers[*].securityContext.seLinuxOptions.user<br>
spec.initContainers[*].securityContext.seLinuxOptions.user<br>
spec.securityContext.seLinuxOptions.role<br>
spec.containers[*].securityContext.seLinuxOptions.role<br>
spec.initContainers[*].securityContext.seLinuxOptions.role<br>
<br><b>Allowed Values:</b> undefined/empty<br>
<p>Setting the SELinux type is restricted, and setting a custom SELinux user or role option is forbidden.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.seLinuxOptions.type</code></li>
<li><code>spec.containers[*].securityContext.seLinuxOptions.type</code></li>
<li><code>spec.initContainers[*].securityContext.seLinuxOptions.type</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>container_t</code></li>
<li><code>container_init_t</code></li>
<li><code>container_kvm_t</code></li>
</ul>
<hr />
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.seLinuxOptions.user</code></li>
<li><code>spec.containers[*].securityContext.seLinuxOptions.user</code></li>
<li><code>spec.initContainers[*].securityContext.seLinuxOptions.user</code></li>
<li><code>spec.securityContext.seLinuxOptions.role</code></li>
<li><code>spec.containers[*].securityContext.seLinuxOptions.role</code></li>
<li><code>spec.initContainers[*].securityContext.seLinuxOptions.role</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
</ul>
</td>
</tr>
<tr>
<td>/proc Mount Type</td>
<td style="white-space: nowrap"><code>/proc</code> Mount Type</td>
<td>
The default /proc masks are set up to reduce attack surface, and should be required.<br>
<br><b>Restricted Fields:</b><br>
spec.containers[*].securityContext.procMount<br>
spec.initContainers[*].securityContext.procMount<br>
<br><b>Allowed Values:</b> undefined/nil, 'Default'<br>
<p>The default <code>/proc</code> masks are set up to reduce attack surface, and should be required.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.containers[*].securityContext.procMount</code></li>
<li><code>spec.initContainers[*].securityContext.procMount</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>Default</code></li>
</ul>
</td>
</tr>
<tr>
<td>Sysctls</td>
<td>Seccomp</td>
<td>
<p>Seccomp profile must not be explicitly set to <code>Unconfined</code>.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.seccompProfile.type</code></li>
<li><code>spec.containers[*].securityContext.seccompProfile.type</code></li>
<li><code>spec.initContainers[*].securityContext.seccompProfile.type</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>RuntimeDefault</code></li>
<li><code>Localhost</code>*</li>
</ul>
<small>* must also set <code>securityContext.seccompProfile.localhostProfile</code></small>
</td>
</tr>
<tr>
<td style="white-space: nowrap">Sysctls</td>
<td>
Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed "safe" subset.
A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node.<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.sysctls<br>
<br><b>Allowed Values:</b><br>
kernel.shm_rmid_forced<br>
net.ipv4.ip_local_port_range<br>
net.ipv4.tcp_syncookies<br>
net.ipv4.ping_group_range<br>
undefined/empty<br>
<p>Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed "safe" subset. A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.sysctls</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
<li><code>kernel.shm_rmid_forced</code></li>
<li><code>net.ipv4.ip_local_port_range</code></li>
<li><code>net.ipv4.ip_unprivileged_port_start</code></li>
<li><code>net.ipv4.tcp_syncookies</code></li>
<li><code>net.ipv4.ping_group_range</code></li>
</ul>
</td>
</tr>
</tbody>
@ -176,10 +241,11 @@ enforced/disallowed:
### Restricted
The Restricted policy is aimed at enforcing current Pod hardening best practices, at the expense of
some compatibility. It is targeted at operators and developers of security-critical applications, as
well as lower-trust users.The following listed controls should be enforced/disallowed:
**The _Restricted_ policy is aimed at enforcing current Pod hardening best practices, at the expense of some compatibility.** It is targeted at operators and developers of security-critical applications, as well as lower-trust users. The following listed controls should be enforced/disallowed:
{{< note >}}
In this table, wildcards (`*`) indicate all elements in a list. For example, `spec.containers[*].securityContext` refers to the Security Context object for _all defined containers_. If any of the listed containers fails to meet the requirements, the entire pod will fail validation.
{{< /note >}}
<table>
<caption style="display:none">Restricted policy specification</caption>
@ -192,83 +258,107 @@ well as lower-trust users.The following listed controls should be enforced/disal
<td colspan="2"><em>Everything from the baseline profile.</em></td>
</tr>
<tr>
<td>Volume Types</td>
<td style="white-space: nowrap">Volume Types</td>
<td>
In addition to restricting HostPath volumes, the restricted profile limits usage of non-core volume types to those defined through PersistentVolumes.<br>
<br><b>Restricted Fields:</b><br>
spec.volumes[*].hostPath<br>
spec.volumes[*].gcePersistentDisk<br>
spec.volumes[*].awsElasticBlockStore<br>
spec.volumes[*].gitRepo<br>
spec.volumes[*].nfs<br>
spec.volumes[*].iscsi<br>
spec.volumes[*].glusterfs<br>
spec.volumes[*].rbd<br>
spec.volumes[*].flexVolume<br>
spec.volumes[*].cinder<br>
spec.volumes[*].cephFS<br>
spec.volumes[*].flocker<br>
spec.volumes[*].fc<br>
spec.volumes[*].azureFile<br>
spec.volumes[*].vsphereVolume<br>
spec.volumes[*].quobyte<br>
spec.volumes[*].azureDisk<br>
spec.volumes[*].portworxVolume<br>
spec.volumes[*].scaleIO<br>
spec.volumes[*].storageos<br>
spec.volumes[*].csi<br>
<br><b>Allowed Values:</b> undefined/nil<br>
<p>In addition to restricting HostPath volumes, the restricted policy limits usage of non-core volume types to those defined through PersistentVolumes.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.volumes[*].hostPath</code></li>
<li><code>spec.volumes[*].gcePersistentDisk</code></li>
<li><code>spec.volumes[*].awsElasticBlockStore</code></li>
<li><code>spec.volumes[*].gitRepo</code></li>
<li><code>spec.volumes[*].nfs</code></li>
<li><code>spec.volumes[*].iscsi</code></li>
<li><code>spec.volumes[*].glusterfs</code></li>
<li><code>spec.volumes[*].rbd</code></li>
<li><code>spec.volumes[*].flexVolume</code></li>
<li><code>spec.volumes[*].cinder</code></li>
<li><code>spec.volumes[*].cephfs</code></li>
<li><code>spec.volumes[*].flocker</code></li>
<li><code>spec.volumes[*].fc</code></li>
<li><code>spec.volumes[*].azureFile</code></li>
<li><code>spec.volumes[*].vsphereVolume</code></li>
<li><code>spec.volumes[*].quobyte</code></li>
<li><code>spec.volumes[*].azureDisk</code></li>
<li><code>spec.volumes[*].portworxVolume</code></li>
<li><code>spec.volumes[*].scaleIO</code></li>
<li><code>spec.volumes[*].storageos</code></li>
<li><code>spec.volumes[*].csi</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil</li>
</ul>
</td>
</tr>
<tr>
<td>Privilege Escalation</td>
<td style="white-space: nowrap">Privilege Escalation</td>
<td>
Privilege escalation (such as via set-user-ID or set-group-ID file mode) should not be allowed.<br>
<br><b>Restricted Fields:</b><br>
spec.containers[*].securityContext.allowPrivilegeEscalation<br>
spec.initContainers[*].securityContext.allowPrivilegeEscalation<br>
<br><b>Allowed Values:</b> false<br>
<p>Privilege escalation (such as via set-user-ID or set-group-ID file mode) should not be allowed.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.containers[*].securityContext.allowPrivilegeEscalation</code></li>
<li><code>spec.initContainers[*].securityContext.allowPrivilegeEscalation</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li><code>false</code></li>
</ul>
</td>
</tr>
<tr>
<td>Running as Non-root</td>
<td style="white-space: nowrap">Running as Non-root</td>
<td>
Containers must be required to run as non-root users.<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.runAsNonRoot<br>
spec.containers[*].securityContext.runAsNonRoot<br>
spec.initContainers[*].securityContext.runAsNonRoot<br>
<br><b>Allowed Values:</b> true<br>
<p>Containers must be required to run as non-root users.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.runAsNonRoot</code></li>
<li><code>spec.containers[*].securityContext.runAsNonRoot</code></li>
<li><code>spec.initContainers[*].securityContext.runAsNonRoot</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li><code>true</code></li>
</ul>
</td>
</tr>
<tr>
<td>Non-root groups <em>(optional)</em></td>
<td style="white-space: nowrap">Non-root groups <em>(optional)</em></td>
<td>
Containers should be forbidden from running with a root primary or supplementary GID.<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.runAsGroup<br>
spec.securityContext.supplementalGroups[*]<br>
spec.securityContext.fsGroup<br>
spec.containers[*].securityContext.runAsGroup<br>
spec.initContainers[*].securityContext.runAsGroup<br>
<br><b>Allowed Values:</b><br>
non-zero<br>
undefined / nil (except for `*.runAsGroup`)<br>
<p>Containers should be forbidden from running with a root primary or supplementary GID.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.runAsGroup</code></li>
<li><code>spec.securityContext.supplementalGroups[*]</code></li>
<li><code>spec.securityContext.fsGroup</code></li>
<li><code>spec.containers[*].securityContext.runAsGroup</code></li>
<li><code>spec.initContainers[*].securityContext.runAsGroup</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li>Undefined/nil (except for <code>*.runAsGroup</code>)</li>
<li>Non-zero</li>
</ul>
</td>
</tr>
<tr>
<td>Seccomp</td>
<td>
The RuntimeDefault seccomp profile must be required, or allow specific additional profiles.<br>
<br><b>Restricted Fields:</b><br>
spec.securityContext.seccompProfile.type<br>
spec.containers[*].securityContext.seccompProfile<br>
spec.initContainers[*].securityContext.seccompProfile<br>
<br><b>Allowed Values:</b><br>
'runtime/default'<br>
undefined / nil<br>
</td>
</tr>
<td>Seccomp</td>
<td>
<p>Seccomp profile must be explicitly set to one of the allowed values. Both the <code>Unconfined</code> profile and the <em>absence</em> of a profile are prohibited.</p>
<p><strong>Restricted Fields</strong></p>
<ul>
<li><code>spec.securityContext.seccompProfile.type</code></li>
<li><code>spec.containers[*].securityContext.seccompProfile.type</code></li>
<li><code>spec.initContainers[*].securityContext.seccompProfile.type</code></li>
</ul>
<p><strong>Allowed Values</strong></p>
<ul>
<li><code>RuntimeDefault</code></li>
<li><code>Localhost</code>*</li>
</ul>
<small>* must also set <code>securityContext.seccompProfile.localhostProfile</code></small>
</td>
</tr>
</tbody>
</table>
@ -281,20 +371,20 @@ mechanism.
As mechanisms mature, they will be defined below on a per-policy basis. The methods of enforcement
of individual policies are not defined here.
[**PodSecurityPolicy**](/docs/concepts/policy/pod-security-policy/)
[**PodSecurityPolicy**](/docs/concepts/profile/pod-security-profile/)
- {{< example file="policy/privileged-psp.yaml" >}}Privileged{{< /example >}}
- {{< example file="policy/baseline-psp.yaml" >}}Baseline{{< /example >}}
- {{< example file="policy/restricted-psp.yaml" >}}Restricted{{< /example >}}
- {{< example file="profile/privileged-psp.yaml" >}}Privileged{{< /example >}}
- {{< example file="profile/baseline-psp.yaml" >}}Baseline{{< /example >}}
- {{< example file="profile/restricted-psp.yaml" >}}Restricted{{< /example >}}
## FAQ
### Why isn't there a profile between privileged and baseline?
### Why isn't there a policy between privileged and baseline?
The three profiles defined here have a clear linear progression from most secure (restricted) to least
secure (privileged), and cover a broad set of workloads. Privileges required above the baseline
policy are typically very application specific, so we do not offer a standard profile in this
niche. This is not to say that the privileged profile should always be used in this case, but that
policy are typically very application specific, so we do not offer a standard policy in this
niche. This is not to say that the privileged policy should always be used in this case, but that
policies in this space need to be defined on a case-by-case basis.
SIG Auth may reconsider this position in the future, should a clear need for other profiles arise.
@ -308,10 +398,10 @@ in the Pod manifest, and represent parameters to the container runtime.
Security policies are control plane mechanisms to enforce specific settings in the Security Context,
as well as other parameters outside the Security Context. As of February 2020, the current native
solution for enforcing these security policies is [Pod Security
Policy](/docs/concepts/policy/pod-security-policy/) - a mechanism for centrally enforcing security
policy on Pods across a cluster. Other alternatives for enforcing security policy are being
Policy](/docs/concepts/profile/pod-security-profile/) - a mechanism for centrally enforcing security
profile on Pods across a cluster. Other alternatives for enforcing security profile are being
developed in the Kubernetes ecosystem, such as [OPA
Gatekeeper](https://github.com/open-policy-agent/gatekeeper).
Gatekeeper](https://github.com/open-profile-agent/gatekeeper).
### What profiles should I apply to my Windows Pods?
@ -331,6 +421,6 @@ restrict privileged permissions is lessened when the workload is isolated from t
kernel. This allows for workloads requiring heightened permissions to still be isolated.
Additionally, the protection of sandboxed workloads is highly dependent on the method of
sandboxing. As such, no single recommended policy is recommended for all sandboxed workloads.
sandboxing. As such, no single profile is recommended for all sandboxed workloads.


@ -0,0 +1,216 @@
---
reviewers:
- tallclair
title: Securing Pods with the Pod Security Standards
content_type: concept
weight: 20
---
<!-- overview -->
{{< feature-state for_k8s_version="v1.22" state="alpha" >}}
Pod Security Standards enable cluster operators to restrict the permissions of pods in a clear, consistent fashion. They are the successor to [Pod Security Policies](/docs/concepts/policy/pod-security-policy), which will be removed from Kubernetes in v1.25.
<!-- body -->
## Getting Started
### Enabling the Alpha feature
Pod Security Standards are currently in Alpha and behind a [Feature Gate](/docs/reference/command-line-tools-reference/feature-gates/). This means they must be explicitly enabled before they can be used.
```shell
--feature-gates="...,PodSecurity=true"
```
### Configuring Namespaces
Once the feature is enabled, policies can then be specified by adding labels to namespaces. These labels correspond to different _modes_, which determine how the admission controller will respond to violating pods.
| Mode | Description |
| ---- | ----------- |
| **`enforce`** | Policy violations will cause the pod to be rejected. |
| **`audit`** | Policy violations will trigger the addition of an audit annotation, but are otherwise allowed. |
| **`warn`** | Policy violations will trigger a user-facing warning, but are otherwise allowed. |
For each mode, there are two labels:
```yaml
# The per-mode level label indicates which policy level to apply for the mode.
#
# MODE must be one of `enforce`, `audit`, or `warn`.
# LEVEL must be one of `privileged`, `baseline`, or `restricted`.
pod-security.kubernetes.io/<MODE>: <LEVEL>
# Optional: per-mode version label that can be used to pin the policy to the
# version that shipped with a given Kubernetes minor version (e.g. v1.22).
#
# MODE must be one of `enforce`, `audit`, or `warn`.
# VERSION must be a valid Kubernetes version label.
pod-security.kubernetes.io/<MODE>-version: <VERSION>
```
Check out the [examples](/docs/concepts/security/securing-pods/#examples) to see example usage of these labels.
#### Configuring the Admission Controller
The PodSecurity Admission Controller can be statically configured to set defaults and [exemptions](/docs/concepts/security/securing-pods/#exemptions).
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    defaults:  # Defaults applied when a mode label is not set.
      enforce: <default enforce policy level>
      enforce-version: <default enforce policy version>
      audit: <default audit policy level>
      audit-version: <default audit policy version>
      warn: <default warn policy level>
      warn-version: <default warn policy version>
    exemptions:
      usernames: [ <array of authenticated usernames to exempt> ]
      runtimeClassNames: [ <array of runtime class names to exempt> ]
      namespaces: [ <array of namespaces to exempt> ]
```
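As an illustration, the template above might be filled in as follows. This is a sketch only: the chosen levels and the v1.22 pin are assumptions made for the example, and because the feature is Alpha, the exact configuration schema may differ between releases.

```yaml
# Illustrative values only; adjust levels and versions for your cluster.
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    defaults:
      # Namespaces without an `enforce` label fall back to `baseline`.
      enforce: baseline
      enforce-version: v1.22
      # Namespaces without `audit`/`warn` labels report against `restricted`.
      audit: restricted
      audit-version: v1.22
      warn: restricted
      warn-version: v1.22
    exemptions:
      # No exemptions in this sketch.
      usernames: []
      runtimeClassNames: []
      namespaces: []
```

With defaults like these, a namespace that carries no labels would still be enforced at `baseline`, while stricter violations would only surface as warnings and audit annotations.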
#### Configuring Pods
Different policy levels (e.g. `baseline`, `restricted`) have different requirements for [Security Context](/docs/tasks/configure-pod-container/security-context/) objects and other related fields. Check out the [Pod Security Standards](/docs/concepts/security/pod-security-standards) page for an in-depth look at those requirements.
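For orientation, here is a minimal sketch of a Pod that sets the Security Context fields the `restricted` level asks for on the Pod Security Standards page (non-root user, no privilege escalation, an explicit seccomp profile). The Pod name, container name, and image are hypothetical, and the image is assumed to run as a non-root user.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-example            # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true                # containers must not run as root
    seccompProfile:
      type: RuntimeDefault            # seccomp profile set explicitly
  containers:
  - name: app                         # hypothetical container name
    image: registry.example/app:1.0   # hypothetical image built to run as non-root
    securityContext:
      allowPrivilegeEscalation: false # block setuid/setgid style escalation
```

The Pod uses no volumes, so none of the restricted volume types apply. Treat the Pod Security Standards page as the authoritative list of requirements for the policy version you pin.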
## Exemptions
Exemptions permit the creation of pods that would have otherwise been prohibited due to the policy associated with a given namespace. Exemptions can be statically configured in the [Admission Controller configuration](/docs/concepts/security/securing-pods/#configuring-the-admission-controller).
Exemptions must be explicitly enumerated, and do not support indirection such as label or group selectors. Requests meeting exemption criteria are _ignored_ by the Admission Controller (all `enforce`, `audit` and `warn` behaviors), except to record an audit annotation. Exemption dimensions include:
- **Usernames:** requests from users with an exempt authenticated (or impersonated) username are ignored.
- **RuntimeClassNames:** pods and templated pods specifying an exempt runtime class name are ignored.
- **Namespaces:** pods and templated pods in an exempt namespace are ignored.
The username exemption is special in that the creating user is not persisted on the Pod object, and the Pod may be modified by different non-exempt users in the future. See the [Updates section of the Pod Security Standards KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/2579-psp-replacement#updates) for details on how non-exempt updates of a previously exempted pod are handled. Use cases for username exemptions include:
- Trusted controllers that create pods in tenant namespaces with additional 3rd party enforcement on the privileged pods.
- Break-glass operations roles, for example for debugging workloads in a restricted namespace.
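The three exemption dimensions could be expressed in the admission configuration along these lines. This is a sketch under assumptions: the username and runtime class name are hypothetical, and exempting `kube-system` is shown only as an example, not as a recommendation.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    exemptions:
      # Hypothetical break-glass identity; exemptions must be enumerated explicitly.
      usernames: ["break-glass-operator"]
      # Hypothetical runtime class for sandboxed workloads.
      runtimeClassNames: ["sandboxed"]
      # Example namespace whose pods are ignored by the admission controller.
      namespaces: ["kube-system"]
```

Because exemptions do not support label or group selectors, every exempt username, runtime class, and namespace has to appear in these lists by name.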
## Webhook
To enable policy enforcement on older clusters, we provide a standalone webhook implementation that utilizes the same underlying library.
TODO: Update this as instructions for setting up webhook become more clear. [See related issue.](https://github.com/kubernetes/kubernetes/issues/103559)
## Best Practices
### Configure All Cluster Namespaces
Even if the initial configurations are most permissive (i.e. everything is `privileged`), any Namespaces that lack any configuration at all will become significant gaps in your cluster security model.
We provide an [example](/docs/concepts/security/securing-pods/#applying-to-all-namespaces) that illustrates how you can do this.
### Enforce the Principle of Least Privilege
In an ideal world, every pod in every namespace would meet the requirements of the `restricted` policy. However, this is neither possible nor practical, as some workloads will require elevated privileges for legitimate reasons.
- Namespaces allowing `privileged` workloads should establish and enforce appropriate access controls.
- For workloads running in those permissive namespaces, maintain documentation about their unique security requirements. If at all possible, consider how those requirements could be further constrained.
### Adopt a Multi-Mode Strategy
The `audit` and `warn` modes of the Pod Security Standards admission controller make it easy to collect important security insights about your pods without breaking existing workloads.
It is good practice to enable these modes for all namespaces, except maybe for those that will _absolutely_ always require `privileged` access or those that enforce the `restricted` policy. Further, these modes should usually be set to policies at least one level higher than the policy for the `enforce` mode.
- Namespaces enforcing the `privileged` policy (i.e. no restrictions) should set audit and warn labels to `baseline`.
- Namespaces enforcing the `baseline` policy should set audit and warn labels to `restricted`.
Monitoring these annotations and warnings over time will help you identify workloads that can be migrated to namespaces with stricter policies.
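A declarative sketch of the first pairing described above, assuming a hypothetical namespace named `infra-agents` that must keep `privileged` enforcement for now:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: infra-agents   # hypothetical namespace name
  labels:
    # Nothing is rejected in this namespace...
    pod-security.kubernetes.io/enforce: privileged
    # ...but pods that would fail `baseline` are annotated and warned about.
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
```

For a namespace that already enforces `baseline`, the same pattern would apply with `audit` and `warn` set to `restricted`.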
## Examples
### Adding Labels in YAML
This YAML file creates a Namespace `my-secure-namespace` that:
- Blocks any pods that don't satisfy the `baseline` policy requirements.
- Generates a user-facing warning and adds an audit annotation to any created pod that does not meet the `restricted` policy requirements.
- Pins the versions of the `baseline` and `restricted` policies to v1.22.
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-secure-namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.22
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.22
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.22
```
### Adding Labels to Existing Namespaces with `kubectl label`
{{< note >}}
When an `enforce` policy (or version) label is added or changed, the admission plugin will test each pod in the namespace against the new policy. Violations are returned to the user as warnings.
{{< /note >}}
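
Because those warnings are produced when the label change is evaluated, a server-side dry run can be used to preview them without modifying any Namespace. This is a sketch under the assumption that the admission plugin is enabled on the API server; `baseline` is only an example level:

```shell
# Preview which existing pods would violate "baseline" if it were enforced.
# --dry-run=server sends the request to the API server but persists nothing.
kubectl label --dry-run=server --overwrite ns --all \
  pod-security.kubernetes.io/enforce=baseline
```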
#### Applying to All Namespaces
If you're just getting started with the Pod Security Standards, a suitable first step would be to configure all namespaces as `privileged` but set up audit annotations and warnings for a stricter level such as `baseline`:
```shell
kubectl label --overwrite ns --all \
  pod-security.kubernetes.io/enforce=privileged \
  pod-security.kubernetes.io/audit=baseline \
  pod-security.kubernetes.io/warn=baseline
```
#### Applying to a Single Namespace
You can update a specific namespace as well. This command adds the `enforce=restricted` policy to `my-existing-namespace`, pinning the restricted policy version to v1.22.
```shell
kubectl label --overwrite ns my-existing-namespace \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=v1.22
```
## Migrating from Pod Security Policies
Migrating to the replacement policy from PodSecurityPolicies can be done effectively using a combination of dry-run and `audit` and `warn` modes (although this becomes harder if mutating PSPs are used).
{{< note >}}
**We are working on tooling to automate this migration.** In the meantime, this guide should give you an idea of the general process.
{{< /note >}}
- [Enable the Pod Security Standards feature gate.](/docs/concepts/security/securing-pods#enabling-the-alpha-feature)
- **Temporarily set the default `enforce` mode to `privileged` for all Namespaces.** This can be done by [configuring the Admission Controller](/docs/concepts/security/securing-pods#configuring-the-admission-controller) or with [`kubectl label`](/docs/concepts/security/securing-pods#applying-to-all-namespaces).
- **Eliminate mutating Pod Security Policies, if your cluster has any set up.**
  - Clone all mutating PSPs into a non-mutating version.
  - Update all ClusterRoles authorizing use of those mutating PSPs to also authorize use of the non-mutating variant.
  - Watch for Pods using the mutating PSPs and work with code owners to migrate to valid, non-mutating resources.
  - Delete mutating PSPs.
- **Select a compatible policy level for each namespace.** Analyze existing resources in the namespace to drive this decision; strive for the `restricted` and `baseline` levels.
  - Review the requirements of the different [Pod Security Standards](/docs/concepts/security/pod-security-standards).
  - Evaluate the difference in privileges that would come from disabling the PSP controller.
- **Apply the selected profiles in `warn` and `audit` mode.** This will give you an idea of how your Pods will respond to the new policies, without breaking existing workloads. Iterate on your [Pods' configuration](/docs/concepts/security/securing-pods#configuring-pods) until they are in compliance with the selected profiles.
- Apply the profiles in enforce mode.
- Disable PodSecurityPolicy!
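
When watching for Pods that still use mutating PSPs, it helps to know which policies exist and which ones actually admitted running Pods. The sketch below assumes the PodSecurityPolicy admission controller is still enabled, since it is that controller which records the admitting policy in the `kubernetes.io/psp` annotation on each Pod:

```shell
# List the PodSecurityPolicies defined in the cluster.
kubectl get podsecuritypolicies

# List the distinct PSP names that admitted the currently running pods.
kubectl get pods --all-namespaces \
  -o jsonpath="{.items[*].metadata.annotations.kubernetes\.io\/psp}" \
  | tr " " "\n" | sort -u
```

A mutating PSP that no longer appears in the second list is a candidate for deletion.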
## FAQ
### Why were the Pod Security Policies deprecated?
There were numerous problems with Pod Security Policies, which led to the decision to deprecate them. The [KEP for Pod Security Standards](https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/2579-psp-replacement#motivation) goes into these issues in more detail.
### When should I pin a policy version?
The Pod Security Standards will continue to evolve over time, even after the feature leaves the Alpha phase. This is because the details of the policies are based on current Pod hardening best practices, which must adapt to new threats as they arise.
**Pinning policies to specific versions will prevent you from detecting and enforcing protections against future threats.** _Only_ consider pinning versions if the default value `latest` would result in your workloads drifting toward less-restrictive configurations in ways that cannot be resolved.
If you find yourself in that situation, consider refactoring workloads to isolate privileged operations as much as possible.
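
Once the affected workloads have been refactored, the pin can be removed so that the namespace follows `latest` again. A minimal sketch, reusing the `my-existing-namespace` name from the example above:

```shell
# A trailing "-" after the label key removes that label from the Namespace.
kubectl label ns my-existing-namespace \
  pod-security.kubernetes.io/enforce-version-
```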