Merge branch 'kubernetes:main' into Issue38146

This commit is contained in:
Saumya 2023-03-24 09:51:01 +05:30 committed by GitHub
commit 48eee13cf2
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
754 changed files with 23043 additions and 8574 deletions

View File

@ -32,7 +32,7 @@ aliases:
- bradtopol
- divya-mohan0209
- kbhawkey
- krol3
- mickeyboxell
- natalisucks
- nate-double-u
- onlydole
@ -138,6 +138,7 @@ aliases:
- Sea-n
- tanjunchen
- tengqm
- windsonsea
- xichengliudui
sig-docs-zh-reviews: # PR reviews for Chinese content
- chenrui333
@ -168,6 +169,7 @@ aliases:
- edsoncelio
- femrtnz
- jcjesus
- mrerlison
- rikatz
- stormqueen1990
- yagonobre

View File

@ -60,14 +60,34 @@ cd website
<!--
The Kubernetes website uses the [Docsy Hugo theme](https://github.com/google/docsy#readme). Even if you plan to run the website in a container, we strongly recommend pulling in the submodule and other development dependencies by running the following:
-->
Kubernetes 网站使用的是 [Docsy Hugo 主题](https://github.com/google/docsy#readme)。
即使你打算在容器中运行网站,我们也强烈建议你通过运行以下命令来引入子模块和其他开发依赖项:
```bash
# 引入 Docsy 子模块
git submodule update --init --recursive --depth 1
```
<!--
### Windows
```powershell
# fetch submodule dependencies
git submodule update --init --recursive --depth 1
```
-->
### Windows
```powershell
# 获取子模块依赖
git submodule update --init --recursive --depth 1
```
<!--
### Linux / other Unix
```bash
# fetch submodule dependencies
make module-init
```
-->
### Linux / 其它 Unix
```bash
# 获取子模块依赖
make module-init
```
<!--
## Running the website using a container

View File

@ -9,7 +9,7 @@ This repository contains the assets required to build the [Kubernetes website an
## Using this repository
You can run the website locally using Hugo (Extended version), or you can run it in a container runtime. We strongly recommend using the container runtime, as it gives deployment consistency with the live website.
You can run the website locally using [Hugo (Extended version)](https://gohugo.io/), or you can run it in a container runtime. We strongly recommend using the container runtime, as it gives deployment consistency with the live website.
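As a hedged sketch of that container-based workflow (the `make` targets below are assumed from this repository's Makefile; verify them locally before relying on this):

```bash
# Build the container image for the local development environment
make container-image

# Build and serve the site inside the container, with live reload
make container-serve
```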
## Prerequisites
@ -70,7 +70,7 @@ This will start the local Hugo server on port 1313. Open up your browser to <htt
## Building the API reference pages
The API reference pages located in `content/en/docs/reference/kubernetes-api` are built from the Swagger specification, using <https://github.com/kubernetes-sigs/reference-docs/tree/master/gen-resourcesdocs>.
The API reference pages located in `content/en/docs/reference/kubernetes-api` are built from the Swagger specification, also known as OpenAPI specification, using <https://github.com/kubernetes-sigs/reference-docs/tree/master/gen-resourcesdocs>.
To update the reference pages for a new Kubernetes release follow these steps:

View File

@ -42,12 +42,12 @@ Kubernetes ist Open Source und bietet Dir die Freiheit, die Infrastruktur vor Or
<button id="desktopShowVideoButton" onclick="kub.showVideo()">Video ansehen</button>
<br>
<br>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/" button id="desktopKCButton">Besuchen die KubeCon North America vom 24. bis 28. Oktober 2022</a>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/" button id="desktopKCButton">Besuche die KubeCon Europe vom 18. bis 21. April 2023</a>
<br>
<br>
<br>
<br>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/" button id="desktopKCButton">Besuche die KubeCon Europe vom 17. bis 21. April 2023</a>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/" button id="desktopKCButton">Besuche die KubeCon North America vom 6. bis 9. November 2023</a>
</div>
<div id="videoPlayer">
<iframe data-url="https://www.youtube.com/embed/H06qrNmGqyE?autoplay=1" frameborder="0" allowfullscreen></iframe>

View File

@ -0,0 +1,48 @@
---
layout: blog
title: "k8s.gcr.io Image Registry Will Be Frozen From the 3rd of April 2023"
date: 2023-03-10
slug: k8s-gcr-io-freeze-announcement
---
**Authors**: Michael Mueller (Giant Swarm)
Das Kubernetes-Projekt betreibt eine zur Community gehörende Container-Image-Registry namens `registry.k8s.io`, um die zum Projekt gehörenden Container-Images zu hosten. Am 3. April 2023 wird die alte Container-Image-Registry `k8s.gcr.io` eingefroren und es werden keine weiteren Container-Images für Kubernetes und Teilprojekte in die alte Registry gepusht.
Die Container-Image-Registry `registry.k8s.io` ist bereits seit einigen Monaten verfügbar und wird die alte Registry ersetzen. Wir haben einen [Blogbeitrag](/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/) über die Vorteile für die Community und das Kubernetes-Projekt veröffentlicht. In diesem Beitrag wurde auch angekündigt, dass zukünftige Versionen von Kubernetes nicht mehr in der alten Registry veröffentlicht werden.
Was bedeutet dies für Contributors:
- Wenn Du ein Maintainer eines Teilprojekts bist, musst du die Manifeste und Helm-Charts entsprechend anpassen, um die neue Container-Registry zu verwenden.
Was bedeutet diese Änderung für Endanwender:
- Das Kubernetes Release 1.27 wird nicht auf der alten Registry veröffentlicht.
- Patchreleases für 1.24, 1.25 und 1.26 werden ab April nicht mehr in der alten Container-Image-Registry veröffentlicht. Bitte beachte den untenstehenden Zeitplan für die Details zu Patchreleases in der alten Container-Registry.
- Beginnend mit dem Release 1.25 wurde die Standardeinstellung der Container-Image-Registry auf `registry.k8s.io` geändert. Diese Einstellung kann in `kubeadm` und dem `kubelet` abgeändert werden (siehe die Skizze unter dieser Liste). Sollte der Wert jedoch auf `k8s.gcr.io` gesetzt werden, wird dies für neue Releases ab April fehlschlagen, da diese nicht in die alte Container-Image-Registry gepusht werden.
- Solltest Du die Zuverlässigkeit der Cluster erhöhen und Abhängigkeiten zu der zur Community gehörenden Container-Image-Registry auflösen wollen, oder betreibst Du Cluster in einer Umgebung mit eingeschränktem externen Netzwerkzugriff, solltest Du in Betracht ziehen, eine lokale Container-Image-Registry als Mirror zu betreiben. Einige Cloud-Anbieter haben hierfür entsprechende Angebote.
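Als Skizze, wie sich die Registry für `kubeadm` explizit setzen lässt (Annahme: kubeadm ab 1.25 mit der API-Version `v1beta3`; der Dateiname ist frei gewählt):

```shell
# Registry über eine kubeadm-ClusterConfiguration festlegen (Skizze)
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.k8s.io
EOF

# Vorab prüfen, welche Images mit dieser Konfiguration verwendet würden
kubeadm config images list --config kubeadm-config.yaml
```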
## Zeitplan der Änderungen
- `k8s.gcr.io` wird zum 3. April 2023 eingefroren
- Das 1.27 Release wird zum 12. April 2023 erwartet
- Das letzte 1.23 Release auf `k8s.gcr.io` wird 1.23.18 sein (1.23 wird end-of-life vor dem Einfrieren erreichen)
- Das letzte 1.24 Release auf `k8s.gcr.io` wird 1.24.12 sein
- Das letzte 1.25 Release auf `k8s.gcr.io` wird 1.25.8 sein
- Das letzte 1.26 Release auf `k8s.gcr.io` wird 1.26.3 sein
## Was geschieht nun
Bitte stelle sicher, dass die Cluster keine Abhängigkeiten zu der alten Container-Image-Registry haben. Dies kann zum Beispiel folgendermaßen überprüft werden: Durch Ausführung des folgenden Kommandos erhält man eine Liste der Container-Images, die von den Pods verwendet werden:
```shell
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
```
Es können durchaus weitere Abhängigkeiten zu der alten Container-Image-Registry bestehen, stelle also sicher, dass du alle möglichen Abhängigkeiten überprüfst, um die Cluster funktional und auf dem neuesten Stand zu halten.
## Acknowledgments
__Change is hard__, die Weiterentwicklung unserer Container-Image-Registry ist notwendig, um eine nachhaltige Zukunft für das Projekt zu gewährleisten. Wir bemühen uns, Dinge für alle, die Kubernetes nutzen, zu verbessern. Viele Mitwirkende aus allen Ecken unserer Community haben lange und hart daran gearbeitet, sicherzustellen, dass wir die bestmöglichen Entscheidungen treffen, Pläne umsetzen und unser Bestes tun, um diese Pläne zu kommunizieren.
Dank geht an Aaron Crickenberger, Arnaud Meukam, Benjamin Elder, Caleb Woodbine, Davanum Srinivas, Mahamed Ali, und Tim Hockin von SIG K8s Infra, Brian McQueen, und Sergey Kanzhelev von SIG Node, Lubomir Ivanov von SIG Cluster Lifecycle, Adolfo García Veytia, Jeremy Rickard, Sascha Grunert, und Stephen Augustus von SIG Release, Bob Killen und Kaslin Fields von SIG Contribex, Tim Allclair vom Security Response Committee. Also a big thank you to our friends acting as liaisons with our cloud provider partners: Jay Pipes von Amazon und Jon Johnson Jr. von Google.

View File

@ -277,7 +277,7 @@ Pods können nur eigene Image Pull Secret in ihrem eigenen Namespace referenzier
#### Referenzierung eines imagePullSecrets bei einem Pod
Nun können Sie Pods erstellen, die dieses Secret referenzieren, indem Sie einen Aschnitt `imagePullSecrets` zu ihrer Pod - Definition hinzufügen.
Nun können Sie Pods erstellen, die dieses Secret referenzieren, indem Sie einen Abschnitt `imagePullSecrets` zu ihrer Pod - Definition hinzufügen.
```shell
cat <<EOF > pod.yaml

View File

@ -81,9 +81,9 @@ Du kannst auch einen Slack-Kanal für deine Lokalisierung im `kubernetes/communi
### Ändere die Website-Konfiguration
Die Kubernetes-Website verwendet Hugo als Web-Framework. Die Hugo-Konfiguration der Website befindet sich in der Datei [`config.toml`](https://github.com/kubernetes/website/tree/master/config.toml). Um eine neue Lokalisierung zu unterstützen, musst du die Datei `config.toml` modifizieren.
Die Kubernetes-Website verwendet Hugo als Web-Framework. Die Hugo-Konfiguration der Website befindet sich in der Datei [`hugo.toml`](https://github.com/kubernetes/website/tree/master/hugo.toml). Um eine neue Lokalisierung zu unterstützen, musst du die Datei `hugo.toml` modifizieren.
Dazu fügst du einen neuen Block für die neue Sprache unter den bereits existierenden `[languages]` Block in das `config.toml` ein, wie folgendes Beispiel zeigt:
Dazu fügst du einen neuen Block für die neue Sprache unter den bereits existierenden `[languages]` Block in das `hugo.toml` ein, wie folgendes Beispiel zeigt:
```toml
[languages.de]

View File

@ -34,7 +34,7 @@ Benutzen Sie eine Docker-basierende Lösung, wenn Sie Kubernetes erlernen wollen
| | [IBM Cloud Private-CE (Community Edition)](https://github.com/IBM/deploy-ibm-cloud-private) |
| | [IBM Cloud Private-CE (Community Edition) on Linux Containers](https://github.com/HSBawa/icp-ce-on-linux-containers)|
| | [k3s](https://k3s.io)|
{{< /table >}}
## Produktionsumgebung
@ -98,5 +98,6 @@ Die folgende Tabelle für Produktionsumgebungs-Lösungen listet Anbieter und der
| [VEXXHOST](https://vexxhost.com/) | &#x2714; | &#x2714; | | | |
| [VMware](https://cloud.vmware.com/) | [VMware Cloud PKS](https://cloud.vmware.com/vmware-cloud-pks) |[VMware Enterprise PKS](https://cloud.vmware.com/vmware-enterprise-pks) | [VMware Enterprise PKS](https://cloud.vmware.com/vmware-enterprise-pks) | [VMware Essential PKS](https://cloud.vmware.com/vmware-essential-pks) | |[VMware Essential PKS](https://cloud.vmware.com/vmware-essential-pks)
| [Z.A.R.V.I.S.](https://zarvis.ai/) | &#x2714; | | | | | |
{{< /table >}}

View File

@ -52,7 +52,7 @@ Creating machine...
Starting local Kubernetes cluster...
```
```shell
kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.10
kubectl create deployment hello-minikube --image=registry.k8s.io/echoserver:1.10
```
```
deployment.apps/hello-minikube created

View File

@ -1,5 +0,0 @@
---
title: "Service Catalog installieren"
weight: 150
---

View File

@ -77,7 +77,7 @@ Deployments sind die empfohlene Methode zum Verwalten der Erstellung und Skalier
Der Pod führt einen Container basierend auf dem bereitgestellten Docker-Image aus.
```shell
kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.4
kubectl create deployment hello-node --image=registry.k8s.io/echoserver:1.4
```
2. Anzeigen des Deployments:

View File

@ -33,7 +33,7 @@ General Availability means different things for different projects. For kubeadm,
We now consider kubeadm to have achieved GA-level maturity in each of these important domains:
* **Stable command-line UX** --- The kubeadm CLI conforms to [#5a GA rule of the Kubernetes Deprecation Policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-flag-or-cli), which states that a command or flag that exists in a GA version must be kept for at least 12 months after deprecation.
* **Stable underlying implementation** --- kubeadm now creates a new Kubernetes cluster using methods that shouldn't change any time soon. The control plane, for example, is run as a set of static Pods, bootstrap tokens are used for the [`kubeadm join`](/docs/reference/setup-tools/kubeadm/kubeadm-join/) flow, and [ComponentConfig](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/wgs/0014-20180707-componentconfig-api-types-to-staging.md) is used for configuring the [kubelet](/docs/reference/command-line-tools-reference/kubelet/).
* **Stable underlying implementation** --- kubeadm now creates a new Kubernetes cluster using methods that shouldn't change any time soon. The control plane, for example, is run as a set of static Pods, bootstrap tokens are used for the [`kubeadm join`](/docs/reference/setup-tools/kubeadm/kubeadm-join/) flow, and [ComponentConfig](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/wgs/115-componentconfig) is used for configuring the [kubelet](/docs/reference/command-line-tools-reference/kubelet/).
* **Configuration file schema** --- With the new **v1beta1** API version, you can now tune almost every part of the cluster declaratively and thus build a "GitOps" flow around kubeadm-built clusters. In future versions, we plan to graduate the API to version **v1** with minimal changes (and perhaps none).
* **The "toolbox" interface of kubeadm** --- Also known as **phases**. If you don't want to perform all [`kubeadm init`](/docs/reference/setup-tools/kubeadm/kubeadm-init/) tasks, you can instead apply more fine-grained actions using the `kubeadm init phase` command (for example generating certificates or control plane [Static Pod](/docs/tasks/administer-cluster/static-pod/) manifests).
* **Upgrades between minor versions** --- The [`kubeadm upgrade`](/docs/reference/setup-tools/kubeadm/kubeadm-upgrade/) command is now fully GA. It handles control plane upgrades for you, which includes upgrades to [etcd](https://etcd.io), the [API Server](/docs/reference/using-api/api-overview/), the [Controller Manager](/docs/reference/command-line-tools-reference/kube-controller-manager/), and the [Scheduler](/docs/reference/command-line-tools-reference/kube-scheduler/). You can seamlessly upgrade your cluster between minor or patch versions (e.g. v1.12.2 -> v1.13.1 or v1.13.1 -> v1.13.3).
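As a brief sketch of the phase and upgrade commands described above (the version number is illustrative only):

```shell
# Run individual init phases instead of the full `kubeadm init` flow
kubeadm init phase certs all           # generate the cluster certificates only
kubeadm init phase control-plane all   # write the control plane static Pod manifests

# Upgrade an existing kubeadm-built cluster
kubeadm upgrade plan                   # list the versions you can upgrade to
kubeadm upgrade apply v1.13.1          # apply the upgrade (illustrative version)
```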

View File

@ -83,6 +83,7 @@ Adopting a common convention for annotations ensures consistency and understanda
| `a8r.io/uptime` | Link to external uptime dashboard. |
| `a8r.io/performance` | Link to external performance dashboard. |
| `a8r.io/dependencies` | Unstructured text describing the service dependencies for humans. |
{{< /table >}}
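For example, these annotations can be attached to an existing Service with `kubectl`; the Service name and the URL below are placeholders:

```shell
# Attach human-oriented metadata to a Service using the a8r.io convention
kubectl annotate service my-service \
  a8r.io/performance="https://grafana.example.com/d/my-service" \
  a8r.io/dependencies="Depends on the auth and billing services."
```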
## Visualizing annotations: Service Catalogs

View File

@ -11,7 +11,7 @@ Starting with Kubernetes 1.25, our container image registry has changed from k8s
## TL;DR: What you need to know about this change
* Container images for Kubernetes releases from 1.25 onward are no longer published to k8s.gcr.io, only to registry.k8s.io.
* Container images for Kubernetes releases from <del>1.25</del> 1.27 onward are not published to k8s.gcr.io, only to registry.k8s.io.
* In the upcoming December patch releases, the new registry domain default will be backported to all branches still in support (1.22, 1.23, 1.24).
* If you run in a restricted environment and apply strict domain/IP address access policies limited to k8s.gcr.io, the __image pulls will not function__ after the migration to this new registry. For these users, the recommended method is to mirror the release images to a private registry.
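One hedged way to do that mirroring is with a registry copy tool such as `crane`; the tool choice and the target registry below are examples, not a recommendation from this article:

```shell
# Copy a release image from the community registry into a private mirror
# (registry.example.com is a placeholder for your own registry)
crane copy registry.k8s.io/pause:3.9 registry.example.com/k8s/pause:3.9
```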
@ -68,8 +68,15 @@ The image used by kubelet for the pod sandbox (`pause`) can be overridden by set
kubelet --pod-infra-container-image=k8s.gcr.io/pause:3.5
```
## Legacy container registry freeze {#registry-freeze}
[k8s.gcr.io Image Registry Will Be Frozen From the 3rd of April 2023](/blog/2023/02/06/k8s-gcr-io-freeze-announcement/) announces the freeze of the
legacy k8s.gcr.io image registry. Read that article for more details.
## Acknowledgments
__Change is hard__, and evolving our image-serving platform is needed to ensure a sustainable future for the project. We strive to make things better for everyone using Kubernetes. Many contributors from all corners of our community have been working long and hard to ensure we are making the best decisions possible, executing plans, and doing our best to communicate those plans.
Thanks to Aaron Crickenberger, Arnaud Meukam, Benjamin Elder, Caleb Woodbine, Davanum Srinivas, Mahamed Ali, and Tim Hockin from SIG K8s Infra, Brian McQueen, and Sergey Kanzhelev from SIG Node, Lubomir Ivanov from SIG Cluster Lifecycle, Adolfo García Veytia, Jeremy Rickard, Sascha Grunert, and Stephen Augustus from SIG Release, Bob Killen and Kaslin Fields from SIG Contribex, Tim Allclair from the Security Response Committee. Also a big thank you to our friends acting as liaisons with our cloud provider partners: Jay Pipes from Amazon and Jon Johnson Jr. from Google.
_This article was updated on the 28th of February 2023._

View File

@ -207,3 +207,11 @@ and without losing the state of the containers in that Pod.
You can reach SIG Node by several means:
- Slack: [#sig-node](https://kubernetes.slack.com/messages/sig-node)
- [Mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node)
## Further reading
Please see the follow-up article [Forensic container
analysis][forensic-container-analysis] for details on how a container checkpoint
can be analyzed.
[forensic-container-analysis]: /blog/2023/03/10/forensic-container-analysis/

View File

@ -31,8 +31,8 @@ files side by side to the artifacts for verifying their integrity.
[tarballs]: https://github.com/kubernetes/kubernetes/blob/release-1.26/CHANGELOG/CHANGELOG-1.26.md#downloads-for-v1260
[binaries]: https://gcsweb.k8s.io/gcs/kubernetes-release/release/v1.26.0/bin
[sboms]: https://storage.googleapis.com/kubernetes-release/release/v1.26.0/kubernetes-release.spdx
[provenance]: https://storage.googleapis.com/kubernetes-release/release/v1.26.0/provenance.json
[sboms]: https://dl.k8s.io/release/v1.26.0/kubernetes-release.spdx
[provenance]: https://dl.k8s.io/kubernetes-release/release/v1.26.0/provenance.json
[cosign]: https://github.com/sigstore/cosign
To verify an artifact, for example `kubectl`, you can download the

View File

@ -8,39 +8,99 @@ slug: security-behavior-analysis
**Author:**
David Hadas (IBM Research Labs)
_This post warns Devops from a false sense of security. Following security best practices when developing and configuring microservices do not result in non-vulnerable microservices. The post shows that although all deployed microservices are vulnerable, there is much that can be done to ensure microservices are not exploited. It explains how analyzing the behavior of clients and services from a security standpoint, named here **"Security-Behavior Analysis"**, can protect the deployed vulnerable microservices. It points to [Guard](http://knative.dev/security-guard), an open source project offering security-behavior monitoring and control of Kubernetes microservices presumed vulnerable._
_This post warns DevOps against a false sense of security. Following security
best practices when developing and configuring microservices does not result
in non-vulnerable microservices. The post shows that although all deployed
microservices are vulnerable, there is much that can be done to ensure
microservices are not exploited. It explains how analyzing the behavior of
clients and services from a security standpoint, named here
**"Security-Behavior Analytics"**, can protect the deployed vulnerable microservices.
It points to [Guard](http://knative.dev/security-guard), an open source project offering
security-behavior monitoring and control of Kubernetes microservices presumed vulnerable._
As cyber attacks continue to intensify in sophistication, organizations deploying cloud services continue to grow their cyber investments aiming to produce safe and non-vulnerable services. However, the year-by-year growth in cyber investments does not result in a parallel reduction in cyber incidents. Instead, the number of cyber incidents continues to grow annually. Evidently, organizations are doomed to fail in this struggle - no matter how much effort is made to detect and remove cyber weaknesses from deployed services, it seems offenders always have the upper hand.
As cyber attacks continue to intensify in sophistication, organizations deploying
cloud services continue to grow their cyber investments aiming to produce safe and
non-vulnerable services. However, the year-by-year growth in cyber investments does
not result in a parallel reduction in cyber incidents. Instead, the number of cyber
incidents continues to grow annually. Evidently, organizations are doomed to fail in
this struggle - no matter how much effort is made to detect and remove cyber weaknesses
from deployed services, it seems offenders always have the upper hand.
Considering the current spread of offensive tools, sophistication of offensive players, and ever-growing cyber financial gains to offenders, any cyber strategy that relies on constructing a non-vulnerable, weakness-free service in 2023 is clearly too naïve. It seems the only viable strategy is to:
Considering the current spread of offensive tools, sophistication of offensive players,
and ever-growing cyber financial gains to offenders, any cyber strategy that relies on
constructing a non-vulnerable, weakness-free service in 2023 is clearly too naïve.
It seems the only viable strategy is to:
&#x27A5; **Admit that your services are vulnerable!**
In other words, consciously accept that you will never create completely invulnerable services. If your opponents find even a single weakness as an entry-point, you lose! Admitting that in spite of your best efforts, all your services are still vulnerable is an important first step. Next, this post discusses what you can do about it...
In other words, consciously accept that you will never create completely invulnerable
services. If your opponents find even a single weakness as an entry-point, you lose!
Admitting that in spite of your best efforts, all your services are still vulnerable
is an important first step. Next, this post discusses what you can do about it...
## How to protect microservices from being exploited
Being vulnerable does not necessarily mean that your service will be exploited. Though your services are vulnerable in some ways unknown to you, offenders still need to identify these vulnerabilities and then exploit them. If offenders fail to exploit your service vulnerabilities, you win! In other words, having a vulnerability that cant be exploited, represents a risk that cant be realized.
Being vulnerable does not necessarily mean that your service will be exploited.
Though your services are vulnerable in some ways unknown to you, offenders still
need to identify these vulnerabilities and then exploit them. If offenders fail
to exploit your service vulnerabilities, you win! In other words, having a
vulnerability that can't be exploited represents a risk that can't be realized.
{{< figure src="security_behavior_figure_1.svg" alt="Image of an example of offender gaining foothold in a service" class="diagram-large" caption="Figure 1. An Offender gaining foothold in a vulnerable service" >}}
The above diagram shows an example in which the offender does not yet have a foothold in the service; that is, it is assumed that your service does not run code controlled by the offender on day 1. In our example the service has vulnerabilities in the API exposed to clients. To gain an initial foothold the offender uses a malicious client to try and exploit one of the service API vulnerabilities. The malicious client sends an exploit that triggers some unplanned behavior of the service.
The above diagram shows an example in which the offender does not yet have a
foothold in the service; that is, it is assumed that your service does not run
code controlled by the offender on day 1. In our example the service has
vulnerabilities in the API exposed to clients. To gain an initial foothold the
offender uses a malicious client to try and exploit one of the service API
vulnerabilities. The malicious client sends an exploit that triggers some
unplanned behavior of the service.
More specifically, lets assume the service is vulnerable to an SQL injection. The developer failed to sanitize the user input properly, thereby allowing clients to send values that would change the intended behavior. In our example, if a client sends a query string with key “username” and value of _“tom or 1=1”_, the client will receive the data of all users. Exploiting this vulnerability requires the client to send an irregular string as the value. Note that benign users will not be sending a string with spaces or with the equal sign character as a username, instead they will normally send legal usernames which for example may be defined as a short sequence of characters a-z. No legal username can trigger service unplanned behavior.
More specifically, let's assume the service is vulnerable to an SQL injection.
The developer failed to sanitize the user input properly, thereby allowing clients
to send values that would change the intended behavior. In our example, if a client
sends a query string with key “username” and value of _“tom or 1=1”_, the client will
receive the data of all users. Exploiting this vulnerability requires the client to
send an irregular string as the value. Note that benign users will not be sending a
string with spaces or with the equal sign character as a username; instead they will
normally send legal usernames which, for example, may be defined as a short sequence of
characters a-z. No legal username can trigger service unplanned behavior.
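To make the difference in client behavior concrete, here is a minimal sketch; the service URL and query parameter are hypothetical and only illustrate the regular versus irregular requests described above:

```shell
# Benign client: a regular, short username made of characters a-z
curl "http://vulnerable-service.example/users?username=tom"

# Malicious client: an irregular value carrying the SQL injection ("tom or 1=1")
curl "http://vulnerable-service.example/users?username=tom%20or%201%3D1"
```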
In this simple example, one can already identify several opportunities to detect and block an attempt to exploit the vulnerability (un)intentionally left behind by the developer, making the vulnerability unexploitable. First, the malicious client behavior differs from the behavior of benign clients, as it sends irregular requests. If such a change in behavior is detected and blocked, the exploit will never reach the service. Second, the service behavior in response to the exploit differs from the service behavior in response to a regular request. Such behavior may include making subsequent irregular calls to other services such as a data store, taking irregular time to respond, and/or responding to the malicious client with an irregular response (for example, containing much more data than normally sent in case of benign clients making regular requests). Service behavioral changes, if detected, will also allow blocking the exploit in different stages of the exploitation attempt.
In this simple example, one can already identify several opportunities to detect and
block an attempt to exploit the vulnerability (un)intentionally left behind by the
developer, making the vulnerability unexploitable. First, the malicious client behavior
differs from the behavior of benign clients, as it sends irregular requests. If such a
change in behavior is detected and blocked, the exploit will never reach the service.
Second, the service behavior in response to the exploit differs from the service behavior
in response to a regular request. Such behavior may include making subsequent irregular
calls to other services such as a data store, taking irregular time to respond, and/or
responding to the malicious client with an irregular response (for example, containing
much more data than normally sent in case of benign clients making regular requests).
Service behavioral changes, if detected, will also allow blocking the exploit in
different stages of the exploitation attempt.
More generally:
- Monitoring the behavior of clients can help detect and block exploits against service API vulnerabilities. In fact, deploying efficient client behavior monitoring makes many vulnerabilities unexploitable and others very hard to achieve. To succeed, the offender needs to create an exploit undetectable from regular requests.
- Monitoring the behavior of clients can help detect and block exploits against
service API vulnerabilities. In fact, deploying efficient client behavior
monitoring makes many vulnerabilities unexploitable and others very hard to achieve.
To succeed, the offender needs to create an exploit undetectable from regular requests.
- Monitoring the behavior of services can help detect services as they are being exploited regardless of the attack vector used. Efficient service behavior monitoring limits what an attacker may be able to achieve as the offender needs to ensure the service behavior is undetectable from regular service behavior.
- Monitoring the behavior of services can help detect services as they are being
exploited regardless of the attack vector used. Efficient service behavior
monitoring limits what an attacker may be able to achieve as the offender needs
to ensure the service behavior is undetectable from regular service behavior.
Combining both approaches may add a protection layer to the deployed vulnerable services, drastically decreasing the probability for anyone to successfully exploit any of the deployed vulnerable services. Next, let us identify four use cases where you need to use security-behavior monitoring.
Combining both approaches may add a protection layer to the deployed vulnerable services,
drastically decreasing the probability for anyone to successfully exploit any of the
deployed vulnerable services. Next, let us identify four use cases where you need to
use security-behavior monitoring.
## Use cases
One can identify the following four different stages in the life of any service from a security standpoint. In each stage, security-behavior monitoring is required to meet different challenges:
One can identify the following four different stages in the life of any service
from a security standpoint. In each stage, security-behavior monitoring is required
to meet different challenges:
Service State | Use case | What do you need in order to cope with this use case?
------------- | ------------- | -----------------------------------------
@ -53,25 +113,57 @@ Fortunately, microservice architecture is well suited to security-behavior monit
## Security-Behavior of microservices versus monoliths {#microservices-vs-monoliths}
Kubernetes is often used to support workloads designed with microservice architecture. By design, microservices aim to follow the UNIX philosophy of "Do One Thing And Do It Well". Each microservice has a bounded context and a clear interface. In other words, you can expect the microservice clients to send relatively regular requests and the microservice to present a relatively regular behavior as a response to these requests. Consequently, a microservice architecture is an excellent candidate for security-behavior monitoring.
Kubernetes is often used to support workloads designed with microservice architecture.
By design, microservices aim to follow the UNIX philosophy of "Do One Thing And Do It Well".
Each microservice has a bounded context and a clear interface. In other words, you can expect
the microservice clients to send relatively regular requests and the microservice to present
a relatively regular behavior as a response to these requests. Consequently, a microservice
architecture is an excellent candidate for security-behavior monitoring.
{{< figure src="security_behavior_figure_2.svg" alt="Image showing why microservices are well suited for security-behavior monitoring" class="diagram-large" caption="Figure 2. Microservices are well suited for security-behavior monitoring" >}}
The diagram above clarifies how dividing a monolithic service to a set of microservices improves our ability to perform security-behavior monitoring and control. In a monolithic service approach, different client requests are intertwined, resulting in a diminished ability to identify irregular client behaviors. Without prior knowledge, an observer of the intertwined client requests will find it hard to distinguish between types of requests and their related characteristics. Further, internal client requests are not exposed to the observer. Lastly, the aggregated behavior of the monolithic service is a compound of the many different internal behaviors of its components, making it hard to identify irregular service behavior.
The diagram above clarifies how dividing a monolithic service into a set of
microservices improves our ability to perform security-behavior monitoring
and control. In a monolithic service approach, different client requests are
intertwined, resulting in a diminished ability to identify irregular client
behaviors. Without prior knowledge, an observer of the intertwined client
requests will find it hard to distinguish between types of requests and their
related characteristics. Further, internal client requests are not exposed to
the observer. Lastly, the aggregated behavior of the monolithic service is a
compound of the many different internal behaviors of its components, making
it hard to identify irregular service behavior.
In a microservice environment, each microservice is expected by design to offer a more well-defined service and serve better defined type of requests. This makes it easier for an observer to identify irregular client behavior and irregular service behavior. Further, a microservice design exposes the internal requests and internal services which offer more security-behavior data to identify irregularities by an observer. Overall, this makes the microservice design pattern better suited for security-behavior monitoring and control.
In a microservice environment, each microservice is expected by design to offer
a more well-defined service and serve better defined type of requests. This makes
it easier for an observer to identify irregular client behavior and irregular
service behavior. Further, a microservice design exposes the internal requests
and internal services which offer more security-behavior data to identify
irregularities by an observer. Overall, this makes the microservice design
pattern better suited for security-behavior monitoring and control.
## Security-Behavior monitoring on Kubernetes
Kubernetes deployments seeking to add Security-Behavior may use [Guard](http://knative.dev/security-guard), developed under the CNCF project Knative. Guard is integrated into the full Knative automation suite that runs on top of Kubernetes. Alternatively, **you can deploy Guard as a standalone tool** to protect any HTTP-based workload on Kubernetes.
Kubernetes deployments seeking to add Security-Behavior may use
[Guard](http://knative.dev/security-guard), developed under the CNCF project Knative.
Guard is integrated into the full Knative automation suite that runs on top of Kubernetes.
Alternatively, **you can deploy Guard as a standalone tool** to protect any HTTP-based workload on Kubernetes.
See:
- [Guard](https://github.com/knative-sandbox/security-guard) on Github, for using Guard as a standalone tool.
- The Knative automation suite - Read about Knative, in the blog post [Opinionated Kubernetes](https://davidhadas.wordpress.com/2022/08/29/knative-an-opinionated-kubernetes) which describes how Knative simplifies and unifies the way web services are deployed on Kubernetes.
- You may contact Guard maintainers on the [SIG Security](https://kubernetes.slack.com/archives/C019LFTGNQ3) Slack channel or on the Knative community [security](https://knative.slack.com/archives/CBYV1E0TG) Slack channel. The Knative community channel will move soon to the [CNCF Slack](https://communityinviter.com/apps/cloud-native/cncf) under the name `#knative-security`.
- [Guard](https://github.com/knative-sandbox/security-guard) on Github,
for using Guard as a standalone tool.
- The Knative automation suite - Read about Knative, in the blog post
[Opinionated Kubernetes](https://davidhadas.wordpress.com/2022/08/29/knative-an-opinionated-kubernetes)
which describes how Knative simplifies and unifies the way web services are deployed on Kubernetes.
- You may contact Guard maintainers on the
[SIG Security](https://kubernetes.slack.com/archives/C019LFTGNQ3) Slack channel
or on the Knative community [security](https://knative.slack.com/archives/CBYV1E0TG)
Slack channel. The Knative community channel will move soon to the
[CNCF Slack](https://communityinviter.com/apps/cloud-native/cncf) under the name `#knative-security`.
The goal of this post is to invite the Kubernetes community to action and introduce Security-Behavior monitoring and control to help secure Kubernetes based deployments. Hopefully, the community as a follow up will:
The goal of this post is to invite the Kubernetes community to action and introduce
Security-Behavior monitoring and control to help secure Kubernetes-based deployments.
Hopefully, the community as a follow-up will:
1. Analyze the cyber challenges presented for different Kubernetes use cases
1. Add appropriate security documentation for users on how to introduce Security-Behavior monitoring and control.

View File

@ -0,0 +1,76 @@
---
layout: blog
title: "Introducing KWOK: Kubernetes WithOut Kubelet"
date: 2023-03-01
slug: introducing-kwok
canonicalUrl: https://kubernetes.dev/blog/2023/03/01/introducing-kwok/
---
**Authors:** Shiming Zhang (DaoCloud), Wei Huang (Apple), Yibo Zhuang (Apple)
<img style="float: right; display: inline-block; margin-left: 2em; max-width: 15em;" src="/blog/2023/03/01/introducing-kwok/kwok.svg" alt="KWOK logo" />
Have you ever wondered how to set up a cluster of thousands of nodes in just seconds, how to simulate real nodes with a low resource footprint, and how to test your Kubernetes controller at scale without spending much on infrastructure?
If you answered "yes" to any of these questions, then you might be interested in KWOK, a toolkit that enables you to create a cluster of thousands of nodes in seconds.
## What is KWOK?
KWOK stands for Kubernetes WithOut Kubelet. So far, it provides two tools:
`kwok`
: `kwok` is the cornerstone of this project, responsible for simulating the lifecycle of fake nodes, pods, and other Kubernetes API resources.
`kwokctl`
: `kwokctl` is a CLI tool designed to streamline the creation and management of clusters, with nodes simulated by `kwok`.
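As a hedged sketch of the basic workflow (the cluster name is arbitrary, and the `kwok-<cluster-name>` kubeconfig context follows kwokctl's convention as I understand it, so verify against the KWOK documentation):

```shell
# Create a simulated cluster named "demo" (no real kubelets are started)
kwokctl create cluster --name=demo

# kwokctl registers a kubeconfig context named "kwok-<cluster-name>"
kubectl --context=kwok-demo get nodes

# Tear the simulated cluster down again
kwokctl delete cluster --name=demo
```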
## Why use KWOK?
KWOK has several advantages:
- **Speed**: You can create and delete clusters and nodes almost instantly, without waiting for boot or provisioning.
- **Compatibility**: KWOK works with any tools or clients that are compliant with Kubernetes APIs, such as kubectl, helm, kui, etc.
- **Portability**: KWOK has no specific hardware or software requirements. You can run it using pre-built images, once Docker or Nerdctl is installed. Alternatively, binaries are also available for all platforms and can be easily installed.
- **Flexibility**: You can configure different node types, labels, taints, capacities, conditions, etc., and you can configure different pod behaviors, status, etc. to test different scenarios and edge cases; a node sketch follows this list.
- **Performance**: You can simulate thousands of nodes on your laptop without significant consumption of CPU or memory resources.
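As a hedged sketch of that flexibility, a simulated node can be created as a plain Node object; the annotation and label values below follow the upstream KWOK examples and should be verified against the KWOK documentation:

```shell
# Create a simulated node that kwok will maintain
# (annotation and label values are assumptions taken from upstream examples)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Node
metadata:
  name: kwok-node-0
  annotations:
    kwok.x-k8s.io/node: fake
  labels:
    type: kwok
EOF
```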
## What are the use cases?
KWOK can be used for various purposes:
- **Learning**: You can use KWOK to learn about Kubernetes concepts and features without worrying about resource waste or other consequences.
- **Development**: You can use KWOK to develop new features or tools for Kubernetes without needing access to a real cluster or other components.
- **Testing**:
- You can measure how well your application or controller scales with different numbers of nodes and/or pods.
- You can generate high loads on your cluster by creating many pods or services with different resource requests or limits.
- You can simulate node failures or network partitions by changing node conditions or randomly deleting nodes.
- You can test how your controller interacts with other components or features of Kubernetes by enabling different feature gates or API versions.
## What are the limitations?
KWOK is not intended to completely replace other tools. It has some limitations that you should be aware of:
- **Functionality**: KWOK is not a kubelet and may exhibit different behaviors in areas such as pod lifecycle management, volume mounting, and device plugins. Its primary function is to simulate updates of node and pod status.
- **Accuracy**: It's important to note that KWOK doesn't accurately reflect the performance or behavior of real nodes under various workloads or environments. Instead, it approximates some behaviors using simple formulas.
- **Security**: KWOK does not enforce any security policies or mechanisms on simulated nodes. It assumes that all requests from the kube-apiserver are authorized and valid.
## Getting started
If you are interested in trying out KWOK, please check its [documents] for more details.
{{< figure src="/blog/2023/03/01/introducing-kwok/manage-clusters.svg" alt="Animation of a terminal showing kwokctl in use" caption="Using kwokctl to manage simulated clusters" >}}
## Getting Involved
If you're interested in participating in future discussions or development related to KWOK, there are several ways to get involved:
- Slack: [#kwok] for general usage discussion, [#kwok-dev] for development discussion. (visit [slack.k8s.io] for a workspace invitation)
- Open Issues/PRs/Discussions in [sigs.k8s.io/kwok]
We welcome feedback and contributions from anyone who wants to join us in this exciting project.
[documents]: https://kwok.sigs.k8s.io/
[sigs.k8s.io/kwok]: https://sigs.k8s.io/kwok/
[#kwok]: https://kubernetes.slack.com/messages/kwok/
[#kwok-dev]: https://kubernetes.slack.com/messages/kwok-dev/
[slack.k8s.io]: https://slack.k8s.io/

View File

@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 70 70"><path fill="#326CE5" d="m7.907 20.247-.325 1.656-.033.006-1.56 6.763-.415 1.961-.063.086c0 .348-.26 1.093-.324 1.451-.089.508-.223.952-.34 1.465-.223.977-.481 1.984-.67 2.938-.192.968-.517 1.974-.671 2.909-.148.897-.6 2.053-.6 3.01v.03c0 1.74.865 2.698 1.68 3.645.407.474.797.986 1.186 1.478.398.505.811.975 1.2 1.491.76 1.007 1.607 1.973 2.39 2.965.773.978 1.576 2.031 2.384 2.97.41.476.796.983 1.186 1.478.39.494.819.974 1.189 1.474.364.492.802.982 1.188 1.473.36.457.815 1.083 1.201 1.492.408.432.784 1.015 1.184 1.479.42.488.785.997 1.198 1.493.5.6.71.843 1.39 1.274.48.305 1.318.598 2.094.598l24.56.02 1.063-.138c.075-.051.33-.117.455-.167.2-.08.28-.126.452-.198.322-.135.536-.31.797-.505.376-.28.94-.992 1.225-1.378.714-.972 1.595-1.9 2.308-2.87.715-.973 1.597-1.9 2.308-2.87.706-.962 1.55-1.922 2.302-2.874 1.524-1.93 3.09-3.807 4.616-5.739a50.63 50.63 0 0 1 1.151-1.422c.395-.464.752-1.006 1.154-1.45.362-.398.828-.993 1.127-1.447.261-.398.626-1.336.626-1.977v-.71c0-.004-.15-.618-.165-.692-.051-.237-.12-.508-.154-.704-.06-.336-.228-1.127-.332-1.414-.045-.125-.117-.53-.16-.698-.058-.233-.103-.466-.17-.688-.12-.403-.207-.995-.308-1.436-.114-.496-.22-.887-.32-1.397-.05-.263-.122-.4-.166-.691a8.807 8.807 0 0 0-.16-.728c-.13-.469-.639-2.428-.634-2.826l-.042-.09-.373-1.714-.04-.09-.733-3.072c0-.363-.235-.842-.266-1.272-.006-.076-.12-.494-.146-.623-.048-.241-.116-.389-.163-.636-.063-.333-.18-.988-.284-1.255-.11-.28-.196-.925-.3-1.268-.112-.376-.166-.853-.28-1.258-.105-.375-.213-.9-.296-1.272-.242-1.087-.408-1.402-.993-2.143-.49-.621-1.077-.932-1.83-1.277-.156-.072-.346-.176-.519-.25a8.253 8.253 0 0 1-.522-.247c-.312-.195-.732-.322-1.057-.51-.3-.173-.716-.326-1.047-.492a101 101 0 0 0-3.18-1.524 53.67 53.67 0 0 1-2.096-1.01c-.703-.355-1.398-.654-2.1-1.006-.704-.352-1.4-.66-2.101-1.007-.34-.168-.73-.36-1.066-.501-.315-.132-.776-.413-1.069-.5-.19-.056-.799-.385-1.042-.496a30.09 30.09 0 0 1-1.065-.503c-.696-.353-1.412-.66-2.12-1.016-.703-.353-1.395-.653-2.1-1.006-.719-.36-1.368-.7-2.437-.7h-.06c-.958 0-1.415.316-2.082.58a7.04 7.04 0 0 0-.899.432c-.227.142-.668.295-.934.427-1.205.595-2.415 1.134-3.619 1.736-2.398 1.2-4.844 2.27-7.24 3.47-1.207.607-2.41 1.142-3.618 1.737-.606.298-1.202.572-1.811.852-.225.104-.688.369-.898.434-.115.035-.812.368-.923.437-.214.133-.656.318-.901.43-.205.095-.73.384-.898.434-.502.149-1.192.813-1.51 1.182a4.13 4.13 0 0 0-.854 1.839c-.036.224-.471 2.074-.53 2.162z"/><path fill="#cfe3ef" stroke="#355386" d="M7.469 60.078c4.095-8.48 3.708-26.191 7.407-34.164 3.7-7.973 9.248-11.668 19.422-12.64 10.532-.436 18.728 4.667 22.196 13.807s3.326 26.11 6.759 32.462c-3.598-2.245-5.738-4.683-9.648-4.572-3.91.111-5.378 4.094-8.837 4.267-3.46.173-6.42-3.994-10.224-3.819-3.803.175-4.92 3.84-9.266 3.794-4.347-.047-5.702-3.82-9.14-3.693-3.437.127-5.757 1.425-8.669 4.558Z" style="stroke-width:1.00157;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;paint-order:normal"/><ellipse cx="41.148" cy="32.56" fill="#355386" rx="4.3" ry="6.67"/><ellipse cx="24.369" cy="32.56" fill="#355386" rx="4.3" ry="6.67"/><circle cx="41.559" cy="31.23" r="3" fill="#fff"/><circle cx="24.744" cy="31.23" r="3" fill="#fff"/><path d="M34.162 53.377V47.65h1.156v2.543l2.336-2.543h1.555l-2.156 2.23 2.273 3.497H37.83l-1.574-2.688-.938.957v1.73zM40.736 53.377 39.37 47.65h1.184l.863 3.934 1.047-3.934h1.375l1.004 4 .879-4h1.164l-1.39 5.727h-1.227l-1.141-4.282-1.137 4.282zM47.24 50.549q0-.875.262-1.47.195-.437.531-.784.34-.348.742-.516.535-.227 1.235-.227 1.265 
0 2.023.786.762.785.762 2.183 0 1.387-.754 2.172-.754.781-2.016.781-1.277 0-2.03-.777-.755-.781-.755-2.148zm1.192-.04q0 .973.449 1.477.449.5 1.14.5.692 0 1.133-.496.446-.5.446-1.496 0-.985-.434-1.469-.43-.484-1.145-.484-.714 0-1.152.492-.437.488-.437 1.476zM53.713 53.377V47.65h1.156v2.543l2.336-2.543h1.555l-2.157 2.23 2.274 3.497H57.38l-1.574-2.688-.938.957v1.73z" style="font-weight:700;font-size:8px;font-family:Sans,Arial;text-anchor:middle;fill:#355386"/></svg>


File diff suppressed because one or more lines are too long


View File

@ -0,0 +1,373 @@
---
layout: blog
title: "Forensic container analysis"
date: 2023-03-10
slug: forensic-container-analysis
---
**Authors:** Adrian Reber (Red Hat)
In my previous article, [Forensic container checkpointing in
Kubernetes][forensic-blog], I introduced checkpointing in Kubernetes,
how it has to be set up, and how it can be used. The name of the
feature is Forensic container checkpointing, but I did not go into
any details on how to do the actual analysis of the checkpoint created by
Kubernetes. In this article I want to provide details on how the
checkpoint can be analyzed.
Checkpointing is still an alpha feature in Kubernetes and this article
aims to provide a preview of how the feature might work in the future.
## Preparation
Details about how to configure Kubernetes and the underlying CRI implementation
to enable checkpointing support can be found in my [Forensic container
checkpointing in Kubernetes][forensic-blog] article.
As an example I prepared a container image (`quay.io/adrianreber/counter:blog`)
which I want to checkpoint and then analyze in this article. This container allows
me to create files in the container and also store information in memory which
I later want to find in the checkpoint.
To run that container I need a pod, and for this example I am using the following Pod manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: counters
spec:
containers:
- name: counter
image: quay.io/adrianreber/counter:blog
```
This results in a container called `counter` running in a pod called `counters`.
Once the container is running I am performing the following actions with that
container:
```console
$ kubectl get pod counters --template '{{.status.podIP}}'
10.88.0.25
$ curl 10.88.0.25:8088/create?test-file
$ curl 10.88.0.25:8088/secret?RANDOM_1432_KEY
$ curl 10.88.0.25:8088
```
The first access creates a file called `test-file` with the content `test-file`
in the container and the second access stores my secret information
(`RANDOM_1432_KEY`) somewhere in the container's memory. The last access just
adds an additional line to the internal log file.
The last step before I can analyze the checkpoint is to tell Kubernetes to create
the checkpoint. As described in the previous article, this requires access to the
*kubelet*-only `checkpoint` API endpoint.
For a container named *counter* in a pod named *counters* in a namespace named
*default* the *kubelet* API endpoint is reachable at:
```shell
# run this on the node where that Pod is executing
curl -X POST "https://localhost:10250/checkpoint/default/counters/counter"
```
For completeness, the following `curl` command-line options are necessary to
have `curl` accept the *kubelet*'s self-signed certificate and authorize the
use of the *kubelet* `checkpoint` API:
```shell
--insecure --cert /var/run/kubernetes/client-admin.crt --key /var/run/kubernetes/client-admin.key
```
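Putting the endpoint and those options together, the complete request might look like this (the certificate paths are the same as in the snippet above and will differ on other setups):

```shell
# Complete checkpoint request, authenticating with the admin client certificate
curl -X POST "https://localhost:10250/checkpoint/default/counters/counter" \
  --insecure \
  --cert /var/run/kubernetes/client-admin.crt \
  --key /var/run/kubernetes/client-admin.key
```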
Once the checkpointing has finished the checkpoint should be available at
`/var/lib/kubelet/checkpoints/checkpoint-<pod-name>_<namespace-name>-<container-name>-<timestamp>.tar`
In the following steps of this article I will use the name `checkpoint.tar`
when analyzing the checkpoint archive.
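For example, the newest archive on the node could be copied to that shorter working name like this (a sketch; the timestamped file name pattern follows the path above):

```shell
# On the node: copy the most recent checkpoint archive to a working name
cp "$(ls -t /var/lib/kubelet/checkpoints/checkpoint-counters_default-counter-*.tar | head -n 1)" checkpoint.tar
```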
## Checkpoint archive analysis using `checkpointctl`
To get some initial information about the checkpointed container I am using the
tool [checkpointctl][checkpointctl] like this:
```console
$ checkpointctl show checkpoint.tar --print-stats
+-----------+----------------------------------+--------------+---------+---------------------+--------+------------+------------+-------------------+
| CONTAINER | IMAGE | ID | RUNTIME | CREATED | ENGINE | IP | CHKPT SIZE | ROOT FS DIFF SIZE |
+-----------+----------------------------------+--------------+---------+---------------------+--------+------------+------------+-------------------+
| counter | quay.io/adrianreber/counter:blog | 059a219a22e5 | runc | 2023-03-02T06:06:49 | CRI-O | 10.88.0.23 | 8.6 MiB | 3.0 KiB |
+-----------+----------------------------------+--------------+---------+---------------------+--------+------------+------------+-------------------+
CRIU dump statistics
+---------------+-------------+--------------+---------------+---------------+---------------+
| FREEZING TIME | FROZEN TIME | MEMDUMP TIME | MEMWRITE TIME | PAGES SCANNED | PAGES WRITTEN |
+---------------+-------------+--------------+---------------+---------------+---------------+
| 100809 us | 119627 us | 11602 us | 7379 us | 7800 | 2198 |
+---------------+-------------+--------------+---------------+---------------+---------------+
```
This gives me already some information about the checkpoint in that checkpoint
archive. I can see the name of the container, information about the container
runtime and container engine. It also lists the size of the checkpoint (`CHKPT
SIZE`). This is mainly the size of the memory pages included in the checkpoint,
but there is also information about the size of all changed files in the
container (`ROOT FS DIFF SIZE`).
The additional parameter `--print-stats` decodes information in the checkpoint
archive and displays it in the second table (*CRIU dump statistics*). This
information is collected during checkpoint creation and gives an overview of how much
time CRIU needed to checkpoint the processes in the container and how many
memory pages were analyzed and written during checkpoint creation.
## Digging deeper
With the help of `checkpointctl` I am able to get some high level information
about the checkpoint archive. To be able to analyze the checkpoint archive
further I have to extract it. The checkpoint archive is a *tar* archive and can
be extracted with the help of `tar xf checkpoint.tar`.
Extracting the checkpoint archive will result in the following files and directories:
* `bind.mounts` - this file contains information about bind mounts and is needed
during restore to mount all external files and directories at the right location
* `checkpoint/` - this directory contains the actual checkpoint as created by
CRIU
* `config.dump` and `spec.dump` - these files contain metadata about the container
which is needed during restore
* `dump.log` - this file contains the debug output of CRIU created during
checkpointing
* `stats-dump` - this file contains the data which is used by `checkpointctl`
to display dump statistics (`--print-stats`)
* `rootfs-diff.tar` - this file contains all changed files on the container's
file-system
### File-system changes - `rootfs-diff.tar`
The first step to analyze the container's checkpoint further is to look at
the files that have changed in my container. This can be done by looking at the
file `rootfs-diff.tar`:
```console
$ tar xvf rootfs-diff.tar
home/counter/logfile
home/counter/test-file
```
Now the files that changed in the container can be studied:
```console
$ cat home/counter/logfile
10.88.0.1 - - [02/Mar/2023 06:07:29] "GET /create?test-file HTTP/1.1" 200 -
10.88.0.1 - - [02/Mar/2023 06:07:40] "GET /secret?RANDOM_1432_KEY HTTP/1.1" 200 -
10.88.0.1 - - [02/Mar/2023 06:07:43] "GET / HTTP/1.1" 200 -
$ cat home/counter/test-file
test-file 
```
Compared to the container image (`quay.io/adrianreber/counter:blog`) this
container is based on, I can see that the file `logfile` contains information
about all access to the service the container provides and the file `test-file`
was created just as expected.
With the help of `rootfs-diff.tar` it is possible to inspect all files that
were created or changed compared to the base image of the container.
### Analyzing the checkpointed processes - `checkpoint/`
The directory `checkpoint/` contains data created by CRIU while checkpointing
the processes in the container. The content in the directory `checkpoint/`
consists of different [image files][image-files] which can be analyzed with the
help of the tool [CRIT][crit] which is distributed as part of CRIU.
First, let's get an overview of the processes inside of the container:
```console
$ crit show checkpoint/pstree.img | jq .entries[].pid
1
7
8
```
This output means that I have three processes inside of the container's PID
namespace with the PIDs: 1, 7, 8
This is only the view from the inside of the container's PID namespace. During
restore exactly these PIDs will be recreated. From the outside of the
container's PID namespace the PIDs will change after restore.
The next step is to get some additional information about these three processes:
```console
$ crit show checkpoint/core-1.img | jq .entries[0].tc.comm
"bash"
$ crit show checkpoint/core-7.img | jq .entries[0].tc.comm
"counter.py"
$ crit show checkpoint/core-8.img | jq .entries[0].tc.comm
"tee"
```
This means the three processes in my container are `bash`, `counter.py` (a Python
interpreter) and `tee`. For details about the parent-child relations of these processes, there
is more data to be analyzed in `checkpoint/pstree.img`.
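For example, the parent PIDs can be pulled out of that same image (a sketch; the `ppid` field name follows CRIU's pstree image format, so verify it against your own `crit show` output):

```shell
# List pid/ppid pairs for all processes in the checkpoint
crit show checkpoint/pstree.img | jq '.entries[] | {pid: .pid, ppid: .ppid}'
```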
Let's compare the information collected so far to the still running container:
```console
$ crictl inspect --output go-template --template "{{(index .info.pid)}}" 059a219a22e56
722520
$ ps auxf | grep -A 2 722520
fedora 722520 \_ bash -c /home/counter/counter.py 2>&1 | tee /home/counter/logfile
fedora 722541 \_ /usr/bin/python3 /home/counter/counter.py
fedora 722542 \_ /usr/bin/coreutils --coreutils-prog-shebang=tee /usr/bin/tee /home/counter/logfile
$ cat /proc/722520/comm
bash
$ cat /proc/722541/comm
counter.py
$ cat /proc/722542/comm
tee
```
In this output I am first retrieving the PID of the first process in the
container and then I am looking for that PID and child processes on the system
where the container is running. I am seeing three processes and the first one is
"bash" which is PID 1 inside of the containers PID namespace. Then I am looking
at `/proc/<PID>/comm` and I can find the exact same value
as in the checkpoint image.
It is important to remember that the checkpoint contains the view from within the
container's PID namespace, because that information is needed to restore the
processes.
One last example of what `crit` can tell us about the container is the information
about the UTS namespace:
```console
$ crit show checkpoint/utsns-12.img
{
"magic": "UTSNS",
"entries": [
{
"nodename": "counters",
"domainname": "(none)"
}
]
}
```
This tells me that the hostname inside of the UTS namespace is `counters`.
For every resource CRIU collected during checkpointing the `checkpoint/`
directory contains corresponding image files which can be analyzed with the help
of `crit`.
#### Looking at the memory pages
In addition to the information from CRIU that can be decoded with the help
of CRIT, there are also files containing the raw memory pages written by
CRIU to disk:
```console
$ ls checkpoint/pages-*
checkpoint/pages-1.img checkpoint/pages-2.img checkpoint/pages-3.img
```
When I initially used the container I stored a random key (`RANDOM_1432_KEY`)
somewhere in memory. Let's see if I can find it:
```console
$ grep -ao RANDOM_1432_KEY checkpoint/pages-*
checkpoint/pages-2.img:RANDOM_1432_KEY
```
And indeed, there is my data. This way I can easily look at the content
of all memory pages of the processes in the container, but it is also
important to remember that anyone who can access the checkpoint
archive has access to all information that was stored in the memory of the
container's processes.
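Because of that, it can be worth scanning a checkpoint archive for obviously
sensitive strings before sharing it. A simple sketch (the search patterns are
only placeholders):
```console
$ strings checkpoint/pages-* | grep -iE 'password|secret|token'
```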
#### Using gdb for further analysis
Another way to look at the checkpoint images is with `gdb`. The CRIU repository
contains the [coredump][criu-coredump] script, which can convert a checkpoint
into a coredump file:
```console
$ /home/criu/coredump/coredump-python3
$ ls -al core*
core.1 core.7 core.8
```
Running the `coredump-python3` script will convert the checkpoint images into
one coredump file for each process in the container. Using `gdb` I can also look
at the details of the processes:
```console
$ echo info registers | gdb --core checkpoint/core.1 -q
[New LWP 1]
Core was generated by `bash -c /home/counter/counter.py 2>&1 | tee /home/counter/logfile'.
#0 0x00007fefba110198 in ?? ()
(gdb)
rax 0x3d 61
rbx 0x8 8
rcx 0x7fefba11019a 140667595587994
rdx 0x0 0
rsi 0x7fffed9c1110 140737179816208
rdi 0xffffffff 4294967295
rbp 0x1 0x1
rsp 0x7fffed9c10e8 0x7fffed9c10e8
r8 0x1 1
r9 0x0 0
r10 0x0 0
r11 0x246 582
r12 0x0 0
r13 0x7fffed9c1170 140737179816304
r14 0x0 0
r15 0x0 0
rip 0x7fefba110198 0x7fefba110198
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
```
In this example I can see the value of all registers as they were during
checkpointing and I can also see the complete command-line of my container's PID
1 process: `bash -c /home/counter/counter.py 2>&1 | tee /home/counter/logfile`
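Other standard `gdb` commands can be used on these converted coredumps in the
same way; for example, a quick way (output omitted) to list the threads captured
for the container's PID 1 process, assuming the same coredump file as in the
previous example:
```console
$ echo "info threads" | gdb --core checkpoint/core.1 -q
```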
## Summary
With the help of container checkpointing, it is possible to create a
checkpoint of a running container without stopping the container and without the
container knowing that it was checkpointed. The result of checkpointing a
container in Kubernetes is a checkpoint archive; using different tools like
`checkpointctl`, `tar`, `crit` and `gdb` the checkpoint can be analyzed. Even
with simple tools like `grep` it is possible to find information in the
checkpoint archive.
The different examples I have shown in this article of how to analyze a checkpoint
are just the starting point. Depending on your requirements, it is possible to
look at certain things in much more detail, but this article should give you an
introduction to how to start the analysis of your checkpoint.
## How do I get involved?
You can reach SIG Node by several means:
* Slack: [#sig-node][slack-sig-node]
* Slack: [#sig-security][slack-sig-security]
* [Mailing list][sig-node-ml]
[forensic-blog]: https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
[checkpointctl]: https://github.com/checkpoint-restore/checkpointctl
[image-files]: https://criu.org/Images
[crit]: https://criu.org/CRIT
[slack-sig-node]: https://kubernetes.slack.com/messages/sig-node
[slack-sig-security]: https://kubernetes.slack.com/messages/sig-security
[sig-node-ml]: https://groups.google.com/forum/#!forum/kubernetes-sig-node
[criu-coredump]: https://github.com/checkpoint-restore/criu/tree/criu-dev/coredump

View File

@ -0,0 +1,187 @@
---
layout: blog
title: "k8s.gcr.io Redirect to registry.k8s.io - What You Need to Know"
date: 2023-03-10T17:00:00.000Z
slug: image-registry-redirect
---
**Authors**: Bob Killen (Google), Davanum Srinivas (AWS), Chris Short (AWS), Frederico Muñoz (SAS
Institute), Tim Bannister (The Scale Factory), Ricky Sadowski (AWS), Grace Nguyen (Expo), Mahamed
Ali (Rackspace Technology), Mars Toktonaliev (independent), Laura Santamaria (Dell), Kat Cosgrove
(Dell)
On Monday, March 20th, the k8s.gcr.io registry [will be redirected to the community owned
registry](https://kubernetes.io/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/),
**registry.k8s.io** .
## TL;DR: What you need to know about this change
- On Monday, March 20th, traffic from the older k8s.gcr.io registry will be redirected to
registry.k8s.io with the eventual goal of sunsetting k8s.gcr.io.
- If you run in a restricted environment, and apply strict domain name or IP address access policies
limited to k8s.gcr.io, **the image pulls will not function** after k8s.gcr.io starts redirecting
to the new registry. 
- A small subset of non-standard clients do not handle HTTP redirects by image registries, and will
need to be pointed directly at registry.k8s.io.
- The redirect is a stopgap to assist users in making the switch. The deprecated k8s.gcr.io registry
will be phased out at some point. **Please update your manifests as soon as possible to point to
registry.k8s.io**.
- If you host your own image registry, you can copy images you need there as well to reduce traffic
to community owned registries.
If you think you may be impacted, or would like to know more about this change, please keep reading.
## How can I check if I am impacted?
To test connectivity to registry.k8s.io and being able to pull images from there, here is a sample
command that can be executed in the namespace of your choosing:
```
kubectl run hello-world -ti --rm --image=registry.k8s.io/busybox:latest --restart=Never -- date
```
When you run the command above, here's what to expect when things work correctly:
```
$ kubectl run hello-world -ti --rm --image=registry.k8s.io/busybox:latest --restart=Never -- date
Fri Feb 31 07:07:07 UTC 2023
pod "hello-world" deleted
```
## What kind of errors will I see if I'm impacted?
Errors may depend on what kind of container runtime you are using, and what endpoint you are routed
to, but they typically present as `ErrImagePull`, `ImagePullBackOff`, or a container failing to be
created with the warning `FailedCreatePodSandBox`.
Below is an example error message showing a proxied deployment failing to pull due to an unknown
certificate:
```
FailedCreatePodSandBox: Failed to create pod sandbox: rpc error: code = Unknown desc = Error response from daemon: Head “https://us-west1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.8”: x509: certificate signed by unknown authority
```
## What images will be impacted?
**ALL** images on k8s.gcr.io will be impacted by this change. k8s.gcr.io hosts many images beyond
Kubernetes releases. A large number of Kubernetes subprojects host their images there as well. Some
examples include the `dns/k8s-dns-node-cache`, `ingress-nginx/controller`, and
`node-problem-detector/node-problem-detector` images.
## I am impacted. What should I do?
For impacted users that run in a restricted environment, the best option is to copy over the
required images to a private registry or configure a pull-through cache in their registry.
There are several tools to copy images between registries;
[crane](https://github.com/google/go-containerregistry/blob/main/cmd/crane/doc/crane_copy.md) is one
of those tools, and images can be copied to a private registry by using `crane copy SRC DST`. There
are also vendor-specific tools, such as Google's
[gcrane](https://cloud.google.com/container-registry/docs/migrate-external-containers#copy), that
perform a similar function but are streamlined for their platform.
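As an illustration, a hypothetical copy of the `pause` image into a private registry could look like
the following; the destination registry name and repository path are placeholders:
```
crane copy registry.k8s.io/pause:3.9 registry.example.com/kubernetes/pause:3.9
```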
## How can I find which images are using the legacy registry, and fix them?
**Option 1**: See the one line kubectl command in our [earlier blog
post](https://kubernetes.io/blog/2023/02/06/k8s-gcr-io-freeze-announcement/#what-s-next):
```
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
```
**Option 2**: A `kubectl` [krew](https://krew.sigs.k8s.io/) plugin has been developed called
[`community-images`](https://github.com/kubernetes-sigs/community-images#kubectl-community-images),
that will scan and report any images using the k8s.gcr.io endpoint.
If you have krew installed, you can install it with:
```
kubectl krew install community-images
```
and generate a report with:
```
kubectl community-images
```
For alternate methods of install and example output, check out the repo:
[kubernetes-sigs/community-images](https://github.com/kubernetes-sigs/community-images).
**Option 3**: If you do not have access to a cluster directly, or manage many clusters - the best
way is to run a search over your manifests and charts for _"k8s.gcr.io"_.
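A simple sketch of such a search over a local checkout of manifests and charts (adjust the path and
file extensions to match your repository layout):
```
grep -r --include='*.yaml' --include='*.yml' --include='*.tpl' 'k8s.gcr.io' .
```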
**Option 4**: If you wish to prevent k8s.gcr.io based images from running in your cluster, example
policies for [Gatekeeper](https://open-policy-agent.github.io/gatekeeper-library/website/) and
[Kyverno](https://kyverno.io/) are available in the [AWS EKS Best Practices
repository](https://github.com/aws/aws-eks-best-practices/tree/master/policies/k8s-registry-deprecation)
that will block them from being pulled. You can use these third-party policies with any Kubernetes
cluster.
**Option 5**: As a **LAST** possible option, you can use a [Mutating
Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#what-are-admission-webhooks)
to change the image address dynamically. This should only be
considered a stopgap until your manifests have been updated. You can
find a (third party) Mutating Webhook and Kyverno policy in
[k8s-gcr-quickfix](https://github.com/abstractinfrastructure/k8s-gcr-quickfix).
## Why did Kubernetes change to a different image registry?
k8s.gcr.io is hosted on a custom [Google Container Registry
(GCR)](https://cloud.google.com/container-registry) domain that was set up solely for the Kubernetes
project. This has worked well since the inception of the project, and we thank Google for providing
these resources, but today, there are other cloud providers and vendors that would like to host
images to provide a better experience for the people on their platforms. In addition to Google's
[renewed commitment to donate $3
million](https://www.cncf.io/google-cloud-recommits-3m-to-kubernetes/) to support the project's
infrastructure last year, Amazon Web Services announced a matching donation [during their KubeCon NA
2022 keynote in Detroit](https://youtu.be/PPdimejomWo?t=236). This will provide a better experience
for users (closer servers = faster downloads) and will reduce the egress bandwidth and costs from
GCR at the same time.
For more details on this change, check out [registry.k8s.io: faster, cheaper and Generally Available
(GA)](/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/).
## Why is a redirect being put in place?
The project switched to [registry.k8s.io last year with the 1.25
release](https://kubernetes.io/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/); however, most of
the image pull traffic is still directed at the old endpoint k8s.gcr.io. This has not been
sustainable for us as a project, as it is not utilizing the resources that have been donated to the
project by other providers, and we are in danger of running out of funds due to the cost of
serving this traffic.
A redirect will enable the project to take advantage of these new resources, significantly reducing
our egress bandwidth costs. We only expect this change to impact a small subset of users running in
restricted environments or using very old clients that do not respect redirects properly.
## What will happen to k8s.gcr.io?
Separate from the redirect, k8s.gcr.io will be frozen [and will not be updated with new images
after April 3rd, 2023](https://kubernetes.io/blog/2023/02/06/k8s-gcr-io-freeze-announcement/). `k8s.gcr.io`
will not get any new releases, patches, or security updates. It will continue to remain available to
help people migrate, but it **WILL** be phased out entirely in the future.
## I still have questions, where should I go?
For more information on registry.k8s.io and why it was developed, see [registry.k8s.io: faster,
cheaper and Generally Available](/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/).
If you would like to know more about the image freeze and the last images that will be available
there, see the blog post: [k8s.gcr.io Image Registry Will Be Frozen From the 3rd of April
2023](/blog/2023/02/06/k8s-gcr-io-freeze-announcement/).
Information on the architecture of registry.k8s.io and its [request handling decision
tree](https://github.com/kubernetes/registry.k8s.io/blob/8408d0501a88b3d2531ff54b14eeb0e3c900a4f3/cmd/archeio/docs/request-handling.md)
can be found in the [kubernetes/registry.k8s.io
repo](https://github.com/kubernetes/registry.k8s.io).
If you believe you have encountered a bug with the new registry or the redirect, please open an
issue in the [kubernetes/registry.k8s.io
repo](https://github.com/kubernetes/registry.k8s.io/issues/new/choose). **Please check if there is an issue already
open similar to what you are seeing before you create a new issue**.

View File

@ -0,0 +1,241 @@
---
layout: blog
title: "Kubernetes Removals and Major Changes In v1.27"
date: 2023-03-17T14:00:00+0000
slug: upcoming-changes-in-kubernetes-v1-27
---
**Author**: Harshita Sao
As Kubernetes develops and matures, features may be deprecated, removed, or replaced
with better ones for the project's overall health. Based on the information available
at this point in the v1.27 release process, which is still ongoing and can introduce
additional changes, this article identifies and describes some of the planned changes
for the Kubernetes v1.27 release.
## A note about the k8s.gcr.io redirect to registry.k8s.io
To host its container images, the Kubernetes project uses a community-owned image
registry called registry.k8s.io. **On March 20th, all traffic from the out-of-date
[k8s.gcr.io](https://cloud.google.com/container-registry/) registry will be redirected
to [registry.k8s.io](https://github.com/kubernetes/registry.k8s.io)**. The deprecated
k8s.gcr.io registry will eventually be phased out.
### What does this change mean?
- If you are a subproject maintainer, you must update your manifests and Helm
charts to use the new registry.
- The v1.27 Kubernetes release will not be published to the old registry.
- From April, patch releases for v1.24, v1.25, and v1.26 will no longer be
published to the old registry.
We have a [blog post](/blog/2023/03/10/image-registry-redirect/) with all
the information about this change and what to do if it impacts you.
## The Kubernetes API Removal and Deprecation process
The Kubernetes project has a well-documented
[deprecation policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/)
for features. This policy states that stable APIs may only be deprecated when
a newer, stable version of that same API is available and that APIs have a
minimum lifetime for each stability level. A deprecated API is one that has been marked
for removal in a future Kubernetes release; it will continue to function until
removal (at least one year from the deprecation), but usage will result in a
warning being displayed. Removed APIs are no longer available in the current
version, at which point you must migrate to using the replacement.
- Generally available (GA) or stable API versions may be marked as deprecated
but must not be removed within a major version of Kubernetes.
- Beta or pre-release API versions must be supported for 3 releases after the deprecation.
- Alpha or experimental API versions may be removed in any release without prior deprecation notice.
Whether an API is removed as a result of a feature graduating from beta to stable
or because that API simply did not succeed, all removals comply with this
deprecation policy. Whenever an API is removed, migration options are communicated
in the documentation.
## API removals, and other changes for Kubernetes v1.27
### Removal of `storage.k8s.io/v1beta1` from `CSIStorageCapacity`
The [CSIStorageCapacity](/docs/reference/kubernetes-api/config-and-storage-resources/csi-storage-capacity-v1/)
API supports exposing currently available storage capacity via CSIStorageCapacity
objects and enhances the scheduling of pods that use CSI volumes with late binding.
The `storage.k8s.io/v1beta1` API version of CSIStorageCapacity was deprecated in v1.24,
and it will no longer be served in v1.27.
Migrate manifests and API clients to use the `storage.k8s.io/v1` API version,
available since v1.24. All existing persisted objects are accessible via the new API.
Refer to the
[Storage Capacity Constraints for Pod Scheduling KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking)
for more information.
Kubernetes v1.27 is not removing any other APIs; however, several other aspects are going
to be removed. Read on for details.
### Support for deprecated seccomp annotations
In Kubernetes v1.19, the
[seccomp](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/135-seccomp)
(secure computing mode) support graduated to General Availability (GA).
This feature can be used to increase the workload security by restricting
the system calls for a Pod (applies to all containers) or single containers.
Support for the alpha seccomp annotations `seccomp.security.alpha.kubernetes.io/pod`
and `container.seccomp.security.alpha.kubernetes.io`, deprecated since v1.19, has now
been completely removed. The seccomp fields are no longer auto-populated when pods
with seccomp annotations are created. Pods should use the corresponding pod or container
`securityContext.seccompProfile` field instead, as sketched below.
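For reference, a minimal sketch of a Pod that uses the `securityContext.seccompProfile`
field; the Pod name, container name, and image are placeholders:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo              # placeholder name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault        # applies to all containers in the Pod
  containers:
  - name: app                     # placeholder container name
    image: registry.k8s.io/pause:3.9
```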
### Removal of several feature gates for volume expansion
The following feature gates for
[volume expansion](https://github.com/kubernetes/enhancements/issues/284) GA features
will be removed and must no longer be referenced in `--feature-gates` flags:
`ExpandCSIVolumes`
: Enable expanding of CSI volumes.
`ExpandInUsePersistentVolumes`
: Enable expanding in-use PVCs.
`ExpandPersistentVolumes`
: Enable expanding of persistent volumes.
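If you are not sure whether any of these gates are still set, one way to check is to search your
component configuration; this sketch assumes a kubeadm-style layout, so adjust the paths for your
distribution:
```
grep -rnE 'ExpandCSIVolumes|ExpandInUsePersistentVolumes|ExpandPersistentVolumes' \
  /etc/kubernetes/manifests/ /var/lib/kubelet/config.yaml
```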
### Removal of `--master-service-namespace` command line argument
The kube-apiserver accepts a deprecated command line argument, `--master-service-namespace`,
that specified where to create the Service named `kubernetes` to represent the API server.
Kubernetes v1.27 will remove that argument, which has been deprecated since the v1.26 release.
### Removal of the `ControllerManagerLeaderMigration` feature gate
[Leader Migration](https://github.com/kubernetes/enhancements/issues/2436) provides
a mechanism in which HA clusters can safely migrate "cloud-specific" controllers
between the `kube-controller-manager` and the `cloud-controller-manager` via a shared
resource lock between the two components while upgrading the replicated control plane.
The `ControllerManagerLeaderMigration` feature, GA since v1.24, is unconditionally
enabled and for the v1.27 release the feature gate option will be removed. If you're
setting this feature gate explicitly, you'll need to remove that from command line
arguments or configuration files.
### Removal of `--enable-taint-manager` command line argument
The kube-controller-manager command line argument `--enable-taint-manager` is
deprecated, and will be removed in Kubernetes v1.27. The feature that it supports,
[taint based eviction](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions),
is already enabled by default and will continue to be implicitly enabled when the flag is removed.
### Removal of `--pod-eviction-timeout` command line argument
The deprecated command line argument `--pod-eviction-timeout` will be removed from the
kube-controller-manager.
### Removal of the `CSI Migration` feature gate
The [CSI migration](https://github.com/kubernetes/enhancements/issues/625)
programme allows moving from in-tree volume plugins to out-of-tree CSI drivers.
CSI migration is generally available since Kubernetes v1.16, and the associated
`CSIMigration` feature gate will be removed in v1.27.
### Removal of `CSIInlineVolume` feature gate
The [CSI Ephemeral Volume](https://github.com/kubernetes/kubernetes/pull/111258)
feature allows CSI volumes to be specified directly in the pod specification for
ephemeral use cases. They can be used to inject arbitrary states, such as
configuration, secrets, identity, variables or similar information, directly
inside pods using a mounted volume. This feature graduated to GA in v1.25.
Hence, the feature gate `CSIInlineVolume` will be removed in the v1.27 release.
### Removal of `EphemeralContainers` feature gate
[Ephemeral containers](/docs/concepts/workloads/pods/ephemeral-containers/)
graduated to GA in v1.25. These are containers with a temporary duration that
execute within the namespaces of an existing pod. Ephemeral containers are
typically initiated by a user in order to observe the state of other pods
and containers for troubleshooting and debugging purposes. For Kubernetes v1.27,
API support for ephemeral containers is unconditionally enabled; the
`EphemeralContainers` feature gate will be removed.
### Removal of `LocalStorageCapacityIsolation` feature gate
The [Local Ephemeral Storage Capacity Isolation](https://github.com/kubernetes/kubernetes/pull/111513)
feature moved to GA in v1.25. The feature provides support for capacity isolation
of local ephemeral storage between pods, such as `emptyDir` volumes, so that a pod
can be hard limited in its consumption of shared resources. The kubelet will
evict Pods if consumption of local ephemeral storage exceeds the configured limit.
The feature gate, `LocalStorageCapacityIsolation`, will be removed in the v1.27 release.
### Removal of `NetworkPolicyEndPort` feature gate
The v1.25 release of Kubernetes promoted `endPort` in NetworkPolicy to GA.
NetworkPolicy providers that support the `endPort` field can use it to
specify a range of ports to which a NetworkPolicy applies. Previously, each NetworkPolicy
could only target a single port. Accordingly, the feature gate `NetworkPolicyEndPort`
will be removed in this release.
Please be aware that the `endPort` field must be supported by the Network Policy
provider. If your provider does not support `endPort`, and this field is
specified in a Network Policy, the Network Policy will be created covering
only the port field (single port).
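As an illustration, a NetworkPolicy using the now-stable `endPort` field to allow an egress port
range might look like this; the name, selector, and port values are placeholders:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-port-range         # placeholder name
spec:
  podSelector:
    matchLabels:
      app: example                # placeholder selector
  policyTypes:
  - Egress
  egress:
  - ports:
    - protocol: TCP
      port: 32000
      endPort: 32768              # allows the whole TCP range 32000-32768
```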
### Removal of `StatefulSetMinReadySeconds` feature gate
For a pod that is part of a StatefulSet, Kubernetes can mark the Pod ready only
if the Pod is available (and passing checks) for at least the period you specify in
[`minReadySeconds`](/docs/concepts/workloads/controllers/statefulset/#minimum-ready-seconds).
The feature became generally available in Kubernetes v1.25, and the `StatefulSetMinReadySeconds`
feature gate will be locked to true and removed in the v1.27 release.
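For reference, a minimal sketch of where `minReadySeconds` sits in a StatefulSet spec; all names
and the image are placeholders:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                       # placeholder name
spec:
  minReadySeconds: 10             # Pods must be Ready for 10 seconds to count as available
  serviceName: web                # placeholder headless Service name
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.k8s.io/pause:3.9   # placeholder image
```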
### Removal of `IdentifyPodOS` feature gate
You can specify the operating system for a Pod, and the feature support for that
is stable since the v1.25 release. The `IdentifyPodOS` feature gate will be
removed for Kubernetes v1.27.
### Removal of `DaemonSetUpdateSurge` feature gate
The v1.25 release of Kubernetes also stabilised surge support for DaemonSet pods,
implemented in order to minimize DaemonSet downtime during rollouts.
The `DaemonSetUpdateSurge` feature gate will be removed in Kubernetes v1.27.
### Removal of `--container-runtime` command line argument
The kubelet accepts a deprecated command line argument, `--container-runtime`, and the only
valid value is `remote` now that the dockershim code has been removed. Kubernetes v1.27 will remove
that argument, which has been deprecated since the v1.24 release.
## Looking ahead
The official list of
[API removals](https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-29)
planned for Kubernetes v1.29 includes:
- The `flowcontrol.apiserver.k8s.io/v1beta2` API version of FlowSchema and
PriorityLevelConfiguration will no longer be served in v1.29.
## Want to know more?
Deprecations are announced in the Kubernetes release notes. You can see the
announcements of pending deprecations in the release notes for:
- [Kubernetes v1.23](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#deprecation)
- [Kubernetes v1.24](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#deprecation)
- [Kubernetes v1.25](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.25.md#deprecation)
- [Kubernetes v1.26](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.26.md#deprecation)
We will formally announce the deprecations that come with
[Kubernetes v1.27](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#deprecation)
as part of the CHANGELOG for that release.
For information on the process of deprecation and removal, check out the official Kubernetes
[deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-parts-of-the-api) document.

View File

@ -103,6 +103,8 @@ updated to newer versions that support cgroup v2. For example:
* If you run [cAdvisor](https://github.com/google/cadvisor) as a stand-alone
DaemonSet for monitoring pods and containers, update it to v0.43.0 or later.
* If you use JDK, prefer to use JDK 11.0.16 and later or JDK 15 and later, which [fully support cgroup v2](https://bugs.openjdk.org/browse/JDK-8230305).
* If you are using the [uber-go/automaxprocs](https://github.com/uber-go/automaxprocs) package, make sure
the version you use is v1.5.1 or higher.
## Identify the cgroup version on Linux Nodes {#check-cgroup-version}

View File

@ -17,8 +17,6 @@ components.
The cloud-controller-manager is structured using a plugin
mechanism that allows different cloud providers to integrate their platforms with Kubernetes.
<!-- body -->
## Design
@ -48,10 +46,10 @@ when new servers are created in your cloud infrastructure. The node controller o
hosts running inside your tenancy with the cloud provider. The node controller performs the following functions:
1. Update a Node object with the corresponding server's unique identifier obtained from the cloud provider API.
2. Annotating and labelling the Node object with cloud-specific information, such as the region the node
1. Annotating and labelling the Node object with cloud-specific information, such as the region the node
is deployed into and the resources (CPU, memory, etc) that it has available.
3. Obtain the node's hostname and network addresses.
4. Verifying the node's health. In case a node becomes unresponsive, this controller checks with
1. Obtain the node's hostname and network addresses.
1. Verifying the node's health. In case a node becomes unresponsive, this controller checks with
your cloud provider's API to see if the server has been deactivated / deleted / terminated.
If the node has been deleted from the cloud, the controller deletes the Node object from your Kubernetes
cluster.
@ -88,13 +86,13 @@ to read and modify Node objects.
`v1/Node`:
- Get
- List
- Create
- Update
- Patch
- Watch
- Delete
- get
- list
- create
- update
- patch
- watch
- delete
### Route controller {#authorization-route-controller}
@ -103,37 +101,42 @@ routes appropriately. It requires Get access to Node objects.
`v1/Node`:
- Get
- get
### Service controller {#authorization-service-controller}
The service controller listens to Service object Create, Update and Delete events and then configures Endpoints for those Services appropriately (for EndpointSlices, the kube-controller-manager manages these on demand).
The service controller watches for Service object **create**, **update** and **delete** events and then
configures Endpoints for those Services appropriately (for EndpointSlices, the
kube-controller-manager manages these on demand).
To access Services, it requires List, and Watch access. To update Services, it requires Patch and Update access.
To access Services, it requires **list**, and **watch** access. To update Services, it requires
**patch** and **update** access.
To set up Endpoints resources for the Services, it requires access to Create, List, Get, Watch, and Update.
To set up Endpoints resources for the Services, it requires access to **create**, **list**,
**get**, **watch**, and **update**.
`v1/Service`:
- List
- Get
- Watch
- Patch
- Update
- list
- get
- watch
- patch
- update
### Others {#authorization-miscellaneous}
The implementation of the core of the cloud controller manager requires access to create Event objects, and to ensure secure operation, it requires access to create ServiceAccounts.
The implementation of the core of the cloud controller manager requires access to create Event
objects, and to ensure secure operation, it requires access to create ServiceAccounts.
`v1/Event`:
- Create
- Patch
- Update
- create
- patch
- update
`v1/ServiceAccount`:
- Create
- create
The {{< glossary_tooltip term_id="rbac" text="RBAC" >}} ClusterRole for the cloud
controller manager looks like:
@ -203,15 +206,21 @@ rules:
## {{% heading "whatsnext" %}}
[Cloud Controller Manager Administration](/docs/tasks/administer-cluster/running-cloud-controller/#cloud-controller-manager)
has instructions on running and managing the cloud controller manager.
* [Cloud Controller Manager Administration](/docs/tasks/administer-cluster/running-cloud-controller/#cloud-controller-manager)
has instructions on running and managing the cloud controller manager.
To upgrade a HA control plane to use the cloud controller manager, see [Migrate Replicated Control Plane To Use Cloud Controller Manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/).
* To upgrade a HA control plane to use the cloud controller manager, see
[Migrate Replicated Control Plane To Use Cloud Controller Manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/).
Want to know how to implement your own cloud controller manager, or extend an existing project?
* Want to know how to implement your own cloud controller manager, or extend an existing project?
The cloud controller manager uses Go interfaces to allow implementations from any cloud to be plugged in. Specifically, it uses the `CloudProvider` interface defined in [`cloud.go`](https://github.com/kubernetes/cloud-provider/blob/release-1.21/cloud.go#L42-L69) from [kubernetes/cloud-provider](https://github.com/kubernetes/cloud-provider).
The implementation of the shared controllers highlighted in this document (Node, Route, and Service), and some scaffolding along with the shared cloudprovider interface, is part of the Kubernetes core. Implementations specific to cloud providers are outside the core of Kubernetes and implement the `CloudProvider` interface.
For more information about developing plugins, see [Developing Cloud Controller Manager](/docs/tasks/administer-cluster/developing-cloud-controller-manager/).
- The cloud controller manager uses Go interfaces, specifically, `CloudProvider` interface defined in
[`cloud.go`](https://github.com/kubernetes/cloud-provider/blob/release-1.21/cloud.go#L42-L69)
from [kubernetes/cloud-provider](https://github.com/kubernetes/cloud-provider) to allow
implementations from any cloud to be plugged in.
- The implementation of the shared controllers highlighted in this document (Node, Route, and Service),
and some scaffolding along with the shared cloudprovider interface, is part of the Kubernetes core.
Implementations specific to cloud providers are outside the core of Kubernetes and implement
the `CloudProvider` interface.
- For more information about developing plugins,
see [Developing Cloud Controller Manager](/docs/tasks/administer-cluster/developing-cloud-controller-manager/).

View File

@ -11,7 +11,8 @@ aliases:
<!-- overview -->
This document catalogs the communication paths between the API server and the Kubernetes cluster.
This document catalogs the communication paths between the {{< glossary_tooltip term_id="kube-apiserver" text="API server" >}}
and the Kubernetes {{< glossary_tooltip text="cluster" term_id="cluster" length="all" >}}.
The intent is to allow users to customize their installation to harden the network configuration
such that the cluster can be run on an untrusted network (or on fully public IPs on a cloud
provider).
@ -30,28 +31,28 @@ enabled, especially if [anonymous requests](/docs/reference/access-authn-authz/a
or [service account tokens](/docs/reference/access-authn-authz/authentication/#service-account-tokens)
are allowed.
Nodes should be provisioned with the public root certificate for the cluster such that they can
Nodes should be provisioned with the public root {{< glossary_tooltip text="certificate" term_id="certificate" >}} for the cluster such that they can
connect securely to the API server along with valid client credentials. A good approach is that the
client credentials provided to the kubelet are in the form of a client certificate. See
[kubelet TLS bootstrapping](/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/)
for automated provisioning of kubelet client certificates.
Pods that wish to connect to the API server can do so securely by leveraging a service account so
{{< glossary_tooltip text="Pods" term_id="pod" >}} that wish to connect to the API server can do so securely by leveraging a service account so
that Kubernetes will automatically inject the public root certificate and a valid bearer token
into the pod when it is instantiated.
The `kubernetes` service (in `default` namespace) is configured with a virtual IP address that is
redirected (via `kube-proxy`) to the HTTPS endpoint on the API server.
redirected (via `{{< glossary_tooltip text="kube-proxy" term_id="kube-proxy" >}}`) to the HTTPS endpoint on the API server.
The control plane components also communicate with the API server over the secure port.
As a result, the default operating mode for connections from the nodes and pods running on the
As a result, the default operating mode for connections from the nodes and pod running on the
nodes to the control plane is secured by default and can run over untrusted and/or public
networks.
## Control plane to node
There are two primary communication paths from the control plane (the API server) to the nodes.
The first is from the API server to the kubelet process which runs on each node in the cluster.
The first is from the API server to the {{< glossary_tooltip text="kubelet" term_id="kubelet" >}} process which runs on each node in the cluster.
The second is from the API server to any node, pod, or service through the API server's _proxy_
functionality.
@ -89,7 +90,7 @@ connections **are not currently safe** to run over untrusted or public networks.
### SSH tunnels
Kubernetes supports SSH tunnels to protect the control plane to nodes communication paths. In this
Kubernetes supports [SSH tunnels](https://www.ssh.com/academy/ssh/tunneling) to protect the control plane to nodes communication paths. In this
configuration, the API server initiates an SSH tunnel to each node in the cluster (connecting to
the SSH server listening on port 22) and passes all traffic destined for a kubelet, node, pod, or
service through the tunnel.
@ -117,3 +118,12 @@ connections.
Follow the [Konnectivity service task](/docs/tasks/extend-kubernetes/setup-konnectivity/) to set
up the Konnectivity service in your cluster.
## {{% heading "whatsnext" %}}
* Read about the [Kubernetes control plane components](/docs/concepts/overview/components/#control-plane-components)
* Learn more about [Hubs and Spoke model](https://book.kubebuilder.io/multiversion-tutorial/conversion-concepts.html#hubs-spokes-and-other-wheel-metaphors)
* Learn how to [Secure a Cluster](/docs/tasks/administer-cluster/securing-a-cluster/)
* Learn more about the [Kubernetes API](/docs/concepts/overview/kubernetes-api/)
* [Set up Konnectivity service](/docs/tasks/extend-kubernetes/setup-konnectivity/)
* [Use Port Forwarding to Access Applications in a Cluster](/docs/tasks/access-application-cluster/port-forward-access-application-cluster/)
* Learn how to [Fetch logs for Pods](/docs/tasks/debug/debug-application/debug-running-pod/#examine-pod-logs), [use kubectl port-forward](/docs/tasks/access-application-cluster/port-forward-access-application-cluster/#forward-a-local-port-to-a-port-on-the-pod)

View File

@ -18,7 +18,7 @@ This page lists some of the available add-ons and links to their respective inst
* [ACI](https://www.github.com/noironetworks/aci-containers) provides integrated container networking and network security with Cisco ACI.
* [Antrea](https://antrea.io/) operates at Layer 3/4 to provide networking and security services for Kubernetes, leveraging Open vSwitch as the networking data plane. Antrea is a [CNCF project at the Sandbox level](https://www.cncf.io/projects/antrea/).
* [Calico](https://docs.projectcalico.org/latest/introduction/) is a networking and network policy provider. Calico supports a flexible set of networking options so you can choose the most efficient option for your situation, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio & Envoy) applications at the service mesh layer.
* [Calico](https://www.tigera.io/project-calico/) is a networking and network policy provider. Calico supports a flexible set of networking options so you can choose the most efficient option for your situation, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio & Envoy) applications at the service mesh layer.
* [Canal](https://projectcalico.docs.tigera.io/getting-started/kubernetes/flannel/flannel) unites Flannel and Calico, providing networking and network policy.
* [Cilium](https://github.com/cilium/cilium) is a networking, observability, and security solution with an eBPF-based data plane. Cilium provides a simple flat Layer 3 network with the ability to span multiple clusters in either a native routing or overlay/encapsulation mode, and can enforce network policies on L3-L7 using an identity-based security model that is decoupled from network addressing. Cilium can act as a replacement for kube-proxy; it also offers additional, opt-in observability and security features. Cilium is a [CNCF project at the Incubation level](https://www.cncf.io/projects/cilium/).
* [CNI-Genie](https://github.com/cni-genie/CNI-Genie) enables Kubernetes to seamlessly connect to a choice of CNI plugins, such as Calico, Canal, Flannel, or Weave. CNI-Genie is a [CNCF project at the Sandbox level](https://www.cncf.io/projects/cni-genie/).

View File

@ -705,14 +705,15 @@ serves the following additional paths at its HTTP[S] ports.
The output is similar to this:
```none
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests,
workload-low, 0, true, false, 0, 0,
global-default, 0, true, false, 0, 0,
exempt, <none>, <none>, <none>, <none>, <none>,
catch-all, 0, true, false, 0, 0,
system, 0, true, false, 0, 0,
leader-election, 0, true, false, 0, 0,
workload-high, 0, true, false, 0, 0,
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests, DispatchedRequests, RejectedRequests, TimedoutRequests, CancelledRequests
catch-all, 0, true, false, 0, 0, 1, 0, 0, 0
exempt, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>
global-default, 0, true, false, 0, 0, 46, 0, 0, 0
leader-election, 0, true, false, 0, 0, 4, 0, 0, 0
node-high, 0, true, false, 0, 0, 34, 0, 0, 0
system, 0, true, false, 0, 0, 48, 0, 0, 0
workload-high, 0, true, false, 0, 0, 500, 0, 0, 0
workload-low, 0, true, false, 0, 0, 0, 0, 0, 0
```
- `/debug/api_priority_and_fairness/dump_queues` - a listing of all the

View File

@ -201,25 +201,8 @@ If you want to access data from a Secret in a Pod, one way to do that is to
have Kubernetes make the value of that Secret be available as a file inside
the filesystem of one or more of the Pod's containers.
{{< note >}}
Versions of Kubernetes before v1.22 automatically created credentials for accessing
the Kubernetes API. This older mechanism was based on creating token Secrets that
could then be mounted into running Pods.
In more recent versions, including Kubernetes v{{< skew currentVersion >}}, API credentials
are obtained directly by using the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/) API,
and are mounted into Pods using a [projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume).
The tokens obtained using this method have bounded lifetimes, and are automatically
invalidated when the Pod they are mounted into is deleted.
You can still [manually create](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
a service account token Secret; for example, if you need a token that never expires.
However, using the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
subresource to obtain a token to access the API is recommended instead.
You can use the [`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
command to obtain a token from the `TokenRequest` API.
{{< /note >}}
#### Mounted Secrets are updated automatically
For instructions, refer to
[Distribute credentials securely using Secrets](/docs/tasks/inject-data-application/distribute-credentials-secure/#create-a-pod-that-has-access-to-the-secret-data-through-a-volume).
When a volume contains data from a Secret, and that Secret is updated, Kubernetes tracks
this and updates the data in the volume, using an eventually-consistent approach.
@ -638,13 +621,28 @@ A `kubernetes.io/service-account-token` type of Secret is used to store a
token credential that identifies a
{{< glossary_tooltip text="service account" term_id="service-account" >}}.
Since 1.22, this type of Secret is no longer used to mount credentials into Pods,
and obtaining tokens via the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
API is recommended instead of using service account token Secret objects.
Tokens obtained from the `TokenRequest` API are more secure than ones stored in Secret objects,
because they have a bounded lifetime and are not readable by other API clients.
You can use the [`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
{{< note >}}
Versions of Kubernetes before v1.22 automatically created credentials for
accessing the Kubernetes API. This older mechanism was based on creating token
Secrets that could then be mounted into running Pods.
In more recent versions, including Kubernetes v{{< skew currentVersion >}}, API
credentials are obtained directly by using the
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
API, and are mounted into Pods using a
[projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume).
The tokens obtained using this method have bounded lifetimes, and are
automatically invalidated when the Pod they are mounted into is deleted.
You can still
[manually create](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
a service account token Secret; for example, if you need a token that never
expires. However, using the
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
subresource to obtain a token to access the API is recommended instead.
You can use the
[`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
command to obtain a token from the `TokenRequest` API.
{{< /note >}}
You should only create a service account token Secret object
if you can't use the `TokenRequest` API to obtain a token,

View File

@ -88,56 +88,56 @@ spec:
The general workflow of a device plugin includes the following steps:
1. Initialization. During this phase, the device plugin performs vendor-specific
initialization and setup to make sure the devices are in a ready state.
initialization and setup to make sure the devices are in a ready state.
1. The plugin starts a gRPC service, with a Unix socket under the host path
`/var/lib/kubelet/device-plugins/`, that implements the following interfaces:
`/var/lib/kubelet/device-plugins/`, that implements the following interfaces:
```gRPC
service DevicePlugin {
// GetDevicePluginOptions returns options to be communicated with Device Manager.
rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
```gRPC
service DevicePlugin {
// GetDevicePluginOptions returns options to be communicated with Device Manager.
rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
// ListAndWatch returns a stream of List of Devices
// Whenever a Device state change or a Device disappears, ListAndWatch
// returns the new list
rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
// ListAndWatch returns a stream of List of Devices
// Whenever a Device state change or a Device disappears, ListAndWatch
// returns the new list
rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
// Allocate is called during container creation so that the Device
// Plugin can run device specific operations and instruct Kubelet
// of the steps to make the Device available in the container
rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
// Allocate is called during container creation so that the Device
// Plugin can run device specific operations and instruct Kubelet
// of the steps to make the Device available in the container
rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
// GetPreferredAllocation returns a preferred set of devices to allocate
// from a list of available ones. The resulting preferred allocation is not
// guaranteed to be the allocation ultimately performed by the
// devicemanager. It is only designed to help the devicemanager make a more
// informed allocation decision when possible.
rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
// GetPreferredAllocation returns a preferred set of devices to allocate
// from a list of available ones. The resulting preferred allocation is not
// guaranteed to be the allocation ultimately performed by the
// devicemanager. It is only designed to help the devicemanager make a more
// informed allocation decision when possible.
rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
// before each container start. Device plugin can run device specific operations
// such as resetting the device before making devices available to the container.
rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
}
```
// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
// before each container start. Device plugin can run device specific operations
// such as resetting the device before making devices available to the container.
rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
}
```
{{< note >}}
Plugins are not required to provide useful implementations for
`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating
the availability of these calls, if any, should be set in the `DevicePluginOptions`
message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
always call `GetDevicePluginOptions()` to see which optional functions are
available, before calling any of them directly.
{{< /note >}}
{{< note >}}
Plugins are not required to provide useful implementations for
`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating
the availability of these calls, if any, should be set in the `DevicePluginOptions`
message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
always call `GetDevicePluginOptions()` to see which optional functions are
available, before calling any of them directly.
{{< /note >}}
1. The plugin registers itself with the kubelet through the Unix socket at host
path `/var/lib/kubelet/device-plugins/kubelet.sock`.
path `/var/lib/kubelet/device-plugins/kubelet.sock`.
{{< note >}}
The ordering of the workflow is important. A plugin MUST start serving gRPC
service before registering itself with kubelet for successful registration.
{{< /note >}}
{{< note >}}
The ordering of the workflow is important. A plugin MUST start serving gRPC
service before registering itself with kubelet for successful registration.
{{< /note >}}
1. After successfully registering itself, the device plugin runs in serving mode, during which it keeps
monitoring device health and reports back to the kubelet upon any device state changes.
@ -297,7 +297,6 @@ However, calling `GetAllocatableResources` endpoint is not sufficient in case of
update and Kubelet needs to be restarted to reflect the correct resource capacity and allocatable.
{{< /note >}}
```gRPC
// AllocatableResourcesResponses contains informations about all the devices known by the kubelet
message AllocatableResourcesResponse {
@ -318,14 +317,14 @@ Preceding Kubernetes v1.23, to enable this feature `kubelet` must be started wit
```
`ContainerDevices` do expose the topology information declaring to which NUMA cells the device is
affine. The NUMA cells are identified using a opaque integer ID, which value is consistent to
affine. The NUMA cells are identified using a opaque integer ID, which value is consistent to
what device plugins report
[when they register themselves to the kubelet](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#device-plugin-integration-with-the-topology-manager).
The gRPC service is served over a unix socket at `/var/lib/kubelet/pod-resources/kubelet.sock`.
Monitoring agents for device plugin resources can be deployed as a daemon, or as a DaemonSet.
The canonical directory `/var/lib/kubelet/pod-resources` requires privileged access, so monitoring
agents must run in a privileged security context. If a device monitoring agent is running as a
agents must run in a privileged security context. If a device monitoring agent is running as a
DaemonSet, `/var/lib/kubelet/pod-resources` must be mounted as a
{{< glossary_tooltip term_id="volume" >}} in the device monitoring agent's
[PodSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#podspec-v1-core).
@ -360,7 +359,7 @@ resource assignment decisions.
`TopologyInfo` supports setting a `nodes` field to either `nil` or a list of NUMA nodes. This
allows the Device Plugin to advertise a device that spans multiple NUMA nodes.
Setting `TopologyInfo` to `nil` or providing an empty list of NUMA nodes for a given device
Setting `TopologyInfo` to `nil` or providing an empty list of NUMA nodes for a given device
indicates that the Device Plugin does not have a NUMA affinity preference for that device.
An example `TopologyInfo` struct populated for a device by a Device Plugin:
@ -396,4 +395,3 @@ Here are some examples of device plugin implementations:
* Learn about the [Topology Manager](/docs/tasks/administer-cluster/topology-manager/)
* Read about using [hardware acceleration for TLS ingress](/blog/2019/04/24/hardware-accelerated-ssl/tls-termination-in-ingress-controllers-using-kubernetes-device-plugins-and-runtimeclass/)
with Kubernetes

View File

@ -8,7 +8,6 @@ weight: 60
You can use Kubernetes annotations to attach arbitrary non-identifying metadata
to objects. Clients such as tools and libraries can retrieve this metadata.
<!-- body -->
## Attaching metadata to objects
@ -74,10 +73,9 @@ If the prefix is omitted, the annotation Key is presumed to be private to the us
The `kubernetes.io/` and `k8s.io/` prefixes are reserved for Kubernetes core components.
For example, here's the configuration file for a Pod that has the annotation `imageregistry: https://hub.docker.com/` :
For example, here's a manifest for a Pod that has the annotation `imageregistry: https://hub.docker.com/` :
```yaml
apiVersion: v1
kind: Pod
metadata:
@ -90,14 +88,8 @@ spec:
image: nginx:1.14.2
ports:
- containerPort: 80
```
## {{% heading "whatsnext" %}}
Learn more about [Labels and Selectors](/docs/concepts/overview/working-with-objects/labels/).

View File

@ -79,7 +79,7 @@ Valid label value:
* unless empty, must begin and end with an alphanumeric character (`[a-z0-9A-Z]`),
* could contain dashes (`-`), underscores (`_`), dots (`.`), and alphanumerics between.
For example, here's the configuration file for a Pod that has two labels
For example, here's a manifest for a Pod that has two labels
`environment: production` and `app: nginx`:
```yaml
@ -259,7 +259,7 @@ or
```yaml
selector:
component: redis
component: redis
```
This selector (respectively in `json` or `yaml` format) is equivalent to
@ -278,8 +278,8 @@ selector:
matchLabels:
component: redis
matchExpressions:
- {key: tier, operator: In, values: [cache]}
- {key: environment, operator: NotIn, values: [dev]}
- { key: tier, operator: In, values: [cache] }
- { key: environment, operator: NotIn, values: [dev] }
```
`matchLabels` is a map of `{key,value}` pairs. A single `{key,value}` in the

View File

@ -24,6 +24,10 @@ For non-unique user-provided attributes, Kubernetes provides [labels](/docs/conc
{{< glossary_definition term_id="name" length="all" >}}
**Names must be unique across all [API versions](/docs/concepts/overview/kubernetes-api/#api-groups-and-versioning)
of the same resource. API resources are distinguished by their API group, resource type, namespace
(for namespaced resources), and name. In other words, API version is irrelevant in this context.**
{{< note >}}
In cases when objects represent a physical entity, like a Node representing a physical host, when the host is re-created under the same name without deleting and re-creating the Node, Kubernetes treats the new host as the old one, which may lead to inconsistencies.
{{< /note >}}

View File

@ -122,7 +122,7 @@ In addition, you can limit consumption of storage resources based on associated
| `requests.storage` | Across all persistent volume claims, the sum of storage requests cannot exceed this value. |
| `persistentvolumeclaims` | The total number of [PersistentVolumeClaims](/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) that can exist in the namespace. |
| `<storage-class-name>.storageclass.storage.k8s.io/requests.storage` | Across all persistent volume claims associated with the `<storage-class-name>`, the sum of storage requests cannot exceed this value. |
| `<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims` | Across all persistent volume claims associated with the storage-class-name, the total number of [persistent volume claims](/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) that can exist in the namespace. |
| `<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims` | Across all persistent volume claims associated with the `<storage-class-name>`, the total number of [persistent volume claims](/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) that can exist in the namespace. |
For example, if an operator wants to quota storage with `gold` storage class separate from `bronze` storage class, the operator can
define a quota as follows:

View File

@ -8,16 +8,15 @@ content_type: concept
weight: 20
---
<!-- overview -->
You can constrain a {{< glossary_tooltip text="Pod" term_id="pod" >}} so that it is
You can constrain a {{< glossary_tooltip text="Pod" term_id="pod" >}} so that it is
_restricted_ to run on particular {{< glossary_tooltip text="node(s)" term_id="node" >}},
or to _prefer_ to run on particular nodes.
There are several ways to do this and the recommended approaches all use
[label selectors](/docs/concepts/overview/working-with-objects/labels/) to facilitate the selection.
Often, you do not need to set any such constraints; the
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}} will automatically do a reasonable placement
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}} will automatically do a reasonable placement
(for example, spreading your Pods across nodes so as not place Pods on a node with insufficient free resources).
However, there are some circumstances where you may want to control which node
the Pod deploys to, for example, to ensure that a Pod ends up on a node with an SSD attached to it,
@ -28,10 +27,10 @@ or to co-locate Pods from two different services that communicate a lot into the
You can use any of the following methods to choose where Kubernetes schedules
specific Pods:
* [nodeSelector](#nodeselector) field matching against [node labels](#built-in-node-labels)
* [Affinity and anti-affinity](#affinity-and-anti-affinity)
* [nodeName](#nodename) field
* [Pod topology spread constraints](#pod-topology-spread-constraints)
- [nodeSelector](#nodeselector) field matching against [node labels](#built-in-node-labels)
- [Affinity and anti-affinity](#affinity-and-anti-affinity)
- [nodeName](#nodename) field
- [Pod topology spread constraints](#pod-topology-spread-constraints)
## Node labels {#built-in-node-labels}
@ -51,7 +50,7 @@ and a different value in other environments.
Adding labels to nodes allows you to target Pods for scheduling on specific
nodes or groups of nodes. You can use this functionality to ensure that specific
Pods only run on nodes with certain isolation, security, or regulatory
properties.
properties.
If you use labels for node isolation, choose label keys that the {{<glossary_tooltip text="kubelet" term_id="kubelet">}}
cannot modify. This prevents a compromised node from setting those labels on
@ -59,7 +58,7 @@ itself so that the scheduler schedules workloads onto the compromised node.
The [`NodeRestriction` admission plugin](/docs/reference/access-authn-authz/admission-controllers/#noderestriction)
prevents the kubelet from setting or modifying labels with a
`node-restriction.kubernetes.io/` prefix.
`node-restriction.kubernetes.io/` prefix.
To make use of that label prefix for node isolation:
@ -73,7 +72,7 @@ To make use of that label prefix for node isolation:
You can add the `nodeSelector` field to your Pod specification and specify the
[node labels](#built-in-node-labels) you want the target node to have.
Kubernetes only schedules the Pod onto nodes that have each of the labels you
specify.
specify.
See [Assign Pods to Nodes](/docs/tasks/configure-pod-container/assign-pods-nodes) for more
information.
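As a minimal sketch, assuming you have labelled a node with `disktype: ssd`, a Pod that must land on such a node could look like this (the Pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-on-ssd   # placeholder name
spec:
  nodeSelector:
    disktype: ssd      # the Pod is only scheduled onto nodes carrying this label
  containers:
    - name: nginx
      image: nginx:1.25
```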
@ -84,20 +83,20 @@ information.
labels. Affinity and anti-affinity expands the types of constraints you can
define. Some of the benefits of affinity and anti-affinity include:
* The affinity/anti-affinity language is more expressive. `nodeSelector` only
- The affinity/anti-affinity language is more expressive. `nodeSelector` only
selects nodes with all the specified labels. Affinity/anti-affinity gives you
more control over the selection logic.
* You can indicate that a rule is *soft* or *preferred*, so that the scheduler
- You can indicate that a rule is *soft* or *preferred*, so that the scheduler
still schedules the Pod even if it can't find a matching node.
* You can constrain a Pod using labels on other Pods running on the node (or other topological domain),
- You can constrain a Pod using labels on other Pods running on the node (or other topological domain),
instead of just node labels, which allows you to define rules for which Pods
can be co-located on a node.
The affinity feature consists of two types of affinity:
* *Node affinity* functions like the `nodeSelector` field but is more expressive and
- *Node affinity* functions like the `nodeSelector` field but is more expressive and
allows you to specify soft rules.
* *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels
- *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels
on other Pods.
### Node affinity
@ -106,12 +105,12 @@ Node affinity is conceptually similar to `nodeSelector`, allowing you to constra
Pod can be scheduled on based on node labels. There are two types of node
affinity:
* `requiredDuringSchedulingIgnoredDuringExecution`: The scheduler can't
schedule the Pod unless the rule is met. This functions like `nodeSelector`,
but with a more expressive syntax.
* `preferredDuringSchedulingIgnoredDuringExecution`: The scheduler tries to
find a node that meets the rule. If a matching node is not available, the
scheduler still schedules the Pod.
- `requiredDuringSchedulingIgnoredDuringExecution`: The scheduler can't
schedule the Pod unless the rule is met. This functions like `nodeSelector`,
but with a more expressive syntax.
- `preferredDuringSchedulingIgnoredDuringExecution`: The scheduler tries to
find a node that meets the rule. If a matching node is not available, the
scheduler still schedules the Pod.
{{<note>}}
In the preceding types, `IgnoredDuringExecution` means that if the node labels
@ -127,17 +126,17 @@ For example, consider the following Pod spec:
In this example, the following rules apply:
* The node *must* have a label with the key `topology.kubernetes.io/zone` and
the value of that label *must* be either `antarctica-east1` or `antarctica-west1`.
* The node *preferably* has a label with the key `another-node-label-key` and
the value `another-node-label-value`.
- The node *must* have a label with the key `topology.kubernetes.io/zone` and
the value of that label *must* be either `antarctica-east1` or `antarctica-west1`.
- The node *preferably* has a label with the key `another-node-label-key` and
the value `another-node-label-value`.
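A sketch of a Pod spec expressing those two rules might look like the following; the Pod name and image are placeholders, not the exact manifest referenced by this page:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity   # placeholder name
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes in one of the two listed zones qualify.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - antarctica-east1
                  - antarctica-west1
      # Soft preference: favour nodes carrying this extra label if any exist.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: another-node-label-key
                operator: In
                values:
                  - another-node-label-value
  containers:
    - name: with-node-affinity
      image: registry.k8s.io/pause:3.8   # placeholder image
```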
You can use the `operator` field to specify a logical operator for Kubernetes to use when
interpreting the rules. You can use `In`, `NotIn`, `Exists`, `DoesNotExist`,
`Gt` and `Lt`.
`NotIn` and `DoesNotExist` allow you to define node anti-affinity behavior.
Alternatively, you can use [node taints](/docs/concepts/scheduling-eviction/taint-and-toleration/)
Alternatively, you can use [node taints](/docs/concepts/scheduling-eviction/taint-and-toleration/)
to repel Pods from specific nodes.
{{<note>}}
@ -168,7 +167,7 @@ The final sum is added to the score of other priority functions for the node.
Nodes with the highest total score are prioritized when the scheduler makes a
scheduling decision for the Pod.
For example, consider the following Pod spec:
For example, consider the following Pod spec:
{{< codenew file="pods/pod-with-affinity-anti-affinity.yaml" >}}
@ -222,7 +221,7 @@ unexpected to them. Use node labels that have a clear correlation to the
scheduler profile name.
{{< note >}}
The DaemonSet controller, which [creates Pods for DaemonSets](/docs/concepts/workloads/controllers/daemonset/#scheduled-by-default-scheduler),
The DaemonSet controller, which [creates Pods for DaemonSets](/docs/concepts/workloads/controllers/daemonset/#how-daemon-pods-are-scheduled),
does not support scheduling profiles. When the DaemonSet controller creates
Pods, the default Kubernetes scheduler places those Pods and honors any
`nodeAffinity` rules in the DaemonSet controller.
@ -268,8 +267,8 @@ to unintended behavior.
Similar to [node affinity](#node-affinity) are two types of Pod affinity and
anti-affinity as follows:
* `requiredDuringSchedulingIgnoredDuringExecution`
* `preferredDuringSchedulingIgnoredDuringExecution`
- `requiredDuringSchedulingIgnoredDuringExecution`
- `preferredDuringSchedulingIgnoredDuringExecution`
For example, you could use
`requiredDuringSchedulingIgnoredDuringExecution` affinity to tell the scheduler to
@ -297,7 +296,7 @@ The affinity rule says that the scheduler can only schedule a Pod onto a node if
the node is in the same zone as one or more existing Pods with the label
`security=S1`. More precisely, the scheduler must place the Pod on a node that has the
`topology.kubernetes.io/zone=V` label, as long as there is at least one node in
that zone that currently has one or more Pods with the Pod label `security=S1`.
that zone that currently has one or more Pods with the Pod label `security=S1`.
The anti-affinity rule says that the scheduler should try to avoid scheduling
the Pod onto a node that is in the same zone as one or more Pods with the label
@ -314,9 +313,9 @@ You can use the `In`, `NotIn`, `Exists` and `DoesNotExist` values in the
In principle, the `topologyKey` can be any allowed label key with the following
exceptions for performance and security reasons:
* For Pod affinity and anti-affinity, an empty `topologyKey` field is not allowed in both `requiredDuringSchedulingIgnoredDuringExecution`
- For Pod affinity and anti-affinity, an empty `topologyKey` field is not allowed in both `requiredDuringSchedulingIgnoredDuringExecution`
and `preferredDuringSchedulingIgnoredDuringExecution`.
* For `requiredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rules,
- For `requiredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rules,
the admission controller `LimitPodHardAntiAffinityTopology` limits
`topologyKey` to `kubernetes.io/hostname`. You can modify or disable the
admission controller if you want to allow custom topologies.
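As a hedged illustration of `topologyKey` in practice, the following sketch spreads replicas of a hypothetical `web` workload so that no two of them share a node; the name, labels, and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-replica          # illustrative name
  labels:
    app.kubernetes.io/name: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: web
          # kubernetes.io/hostname makes each node its own topology domain,
          # so two Pods matching the selector never land on the same node.
          topologyKey: kubernetes.io/hostname
  containers:
    - name: web
      image: registry.k8s.io/pause:3.8   # placeholder image
```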
@ -328,17 +327,18 @@ If omitted or empty, `namespaces` defaults to the namespace of the Pod where the
affinity/anti-affinity definition appears.
#### Namespace selector
{{< feature-state for_k8s_version="v1.24" state="stable" >}}
You can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to namespaces selected by both `namespaceSelector` and the `namespaces` field.
Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
null `namespaceSelector` matches the namespace of the Pod where the rule is defined.
#### More practical use-cases
Inter-pod affinity and anti-affinity can be even more useful when they are used with higher
level collections such as ReplicaSets, StatefulSets, Deployments, etc. These
level collections such as ReplicaSets, StatefulSets, Deployments, etc. These
rules allow you to configure that a set of workloads should
be co-located in the same defined topology; for example, preferring to place two related
Pods onto the same node.
@ -430,10 +430,10 @@ spec:
Creating the two preceding Deployments results in the following cluster layout,
where each web server is co-located with a cache, on three separate nodes.
| node-1 | node-2 | node-3 |
|:--------------------:|:-------------------:|:------------------:|
| *webserver-1* | *webserver-2* | *webserver-3* |
| *cache-1* | *cache-2* | *cache-3* |
| node-1 | node-2 | node-3 |
| :-----------: | :-----------: | :-----------: |
| *webserver-1* | *webserver-2* | *webserver-3* |
| *cache-1* | *cache-2* | *cache-3* |
The overall effect is that each cache instance is likely to be accessed by a single client, that
is running on the same node. This approach aims to minimize both skew (imbalanced load) and latency.
@ -453,13 +453,12 @@ tries to place the Pod on that node. Using `nodeName` overrules using
Some of the limitations of using `nodeName` to select nodes are:
- If the named node does not exist, the Pod will not run, and in
some cases may be automatically deleted.
- If the named node does not have the resources to accommodate the
Pod, the Pod will fail and its reason will indicate why,
for example OutOfmemory or OutOfcpu.
- Node names in cloud environments are not always predictable or
stable.
- If the named node does not exist, the Pod will not run, and in
some cases may be automatically deleted.
- If the named node does not have the resources to accommodate the
Pod, the Pod will fail and its reason will indicate why,
for example OutOfmemory or OutOfcpu.
- Node names in cloud environments are not always predictable or stable.
{{< note >}}
`nodeName` is intended for use by custom schedulers or advanced use cases where
@ -495,12 +494,10 @@ to learn more about how these work.
## {{% heading "whatsnext" %}}
* Read more about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/) .
* Read the design docs for [node affinity](https://git.k8s.io/design-proposals-archive/scheduling/nodeaffinity.md)
- Read more about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/) .
- Read the design docs for [node affinity](https://git.k8s.io/design-proposals-archive/scheduling/nodeaffinity.md)
and for [inter-pod affinity/anti-affinity](https://git.k8s.io/design-proposals-archive/scheduling/podaffinity.md).
* Learn about how the [topology manager](/docs/tasks/administer-cluster/topology-manager/) takes part in node-level
resource allocation decisions.
* Learn how to use [nodeSelector](/docs/tasks/configure-pod-container/assign-pods-nodes/).
* Learn how to use [affinity and anti-affinity](/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).
- Learn about how the [topology manager](/docs/tasks/administer-cluster/topology-manager/) takes part in node-level
resource allocation decisions.
- Learn how to use [nodeSelector](/docs/tasks/configure-pod-container/assign-pods-nodes/).
- Learn how to use [affinity and anti-affinity](/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).

View File

@ -63,9 +63,10 @@ The name of a PriorityClass object must be a valid
and it cannot be prefixed with `system-`.
A PriorityClass object can have any 32-bit integer value smaller than or equal
to 1 billion. Larger numbers are reserved for critical system Pods that should
not normally be preempted or evicted. A cluster admin should create one
PriorityClass object for each such mapping that they want.
to 1 billion. This means that the range of values for a PriorityClass object is
from -2147483648 to 1000000000 inclusive. Larger numbers are reserved for
built-in PriorityClasses that represent critical system Pods. A cluster
admin should create one PriorityClass object for each such mapping that they want.
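A sketch of such a PriorityClass follows; the name, value, and description are illustrative, and the optional `globalDefault` and `description` fields shown here are explained next:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority        # illustrative name
value: 1000000               # user-defined classes must not exceed 1 billion
globalDefault: false         # do not make this the default for Pods without a priorityClassName
description: "Use this class for workloads that should preempt lower-priority Pods."
```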
PriorityClass also has two optional fields: `globalDefault` and `description`.
The `globalDefault` field indicates that the value of this PriorityClass should

View File

@ -178,14 +178,25 @@ following methods:
rotates the token before it expires.
* [Service Account Token Secrets](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
(not recommended): You can mount service account tokens as Kubernetes
Secrets in Pods. These tokens don't expire and don't rotate. This method
is not recommended, especially at scale, because of the risks associated
Secrets in Pods. These tokens don't expire and don't rotate.
This method is not recommended, especially at scale, because of the risks associated
with static, long-lived credentials. In Kubernetes v1.24 and later, the
[LegacyServiceAccountTokenNoAutoGeneration feature gate](/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-graduated-or-deprecated-features)
prevents Kubernetes from automatically creating these tokens for
ServiceAccounts. `LegacyServiceAccountTokenNoAutoGeneration` is enabled
by default; in other words, Kubernetes does not create these tokens.
{{< note >}}
For applications running outside your Kubernetes cluster, you might be considering
creating a long-lived ServiceAccount token that is stored in a Secret. This allows authentication, but the Kubernetes project recommends you avoid this approach.
Long-lived bearer tokens represent a security risk as, once disclosed, the token
can be misused. Instead, consider using an alternative. For example, your external
application can authenticate using a well-protected private key and a certificate,
or using a custom mechanism such as an [authentication webhook](/docs/reference/access-authn-authz/authentication/#webhook-token-authentication) that you implement yourself.
You can also use TokenRequest to obtain short-lived tokens for your external application.
{{< /note >}}
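As a quick illustration of the TokenRequest route, you can ask the API server for a short-lived token with `kubectl`; the ServiceAccount name `build-robot` below is hypothetical:

```shell
# Request a token for the "build-robot" ServiceAccount that expires after one hour.
kubectl create token build-robot --namespace default --duration 1h
```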
## Authenticating service account credentials {#authenticating-credentials}
ServiceAccounts use signed

View File

@ -96,10 +96,10 @@ Services will always have the `ready` condition set to `true`.
#### Serving
{{< feature-state for_k8s_version="v1.22" state="beta" >}}
{{< feature-state for_k8s_version="v1.26" state="stable" >}}
`serving` is identical to the `ready` condition, except it does not account for terminating states.
Consumers of the EndpointSlice API should check this condition if they care about pod readiness while
The `serving` condition is almost identical to the `ready` condition. The difference is that
consumers of the EndpointSlice API should check the `serving` condition if they care about pod readiness while
the pod is also terminating.
{{< note >}}
@ -235,7 +235,7 @@ at different times.
{{< note >}}
Clients of the EndpointSlice API must iterate through all the existing EndpointSlices
associated to a Service and build a complete list of unique network endpoints. It is
important to mention that endpoints may be duplicated in different EndointSlices.
important to mention that endpoints may be duplicated in different EndpointSlices.
You can find a reference implementation for how to perform this endpoint aggregation
and deduplication as part of the `EndpointSliceCache` code within `kube-proxy`.

View File

@ -16,24 +16,25 @@ weight: 10
<!-- overview -->
{{< glossary_definition term_id="service" length="short" >}}
{{< glossary_definition term_id="service" length="short" prepend="In Kubernetes, a Service is" >}}
With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism.
Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods,
and can load-balance across them.
A key aim of Services in Kubernetes is that you don't need to modify your existing
application to use an unfamiliar service discovery mechanism.
You can run code in Pods, whether this is a code designed for a cloud-native world, or
an older app you've containerized. You use a Service to make that set of Pods available
on the network so that clients can interact with it.
<!-- body -->
## Motivation
Kubernetes {{< glossary_tooltip term_id="pod" text="Pods" >}} are created and destroyed
to match the desired state of your cluster. Pods are nonpermanent resources.
If you use a {{< glossary_tooltip term_id="deployment" >}} to run your app,
it can create and destroy Pods dynamically.
that Deployment can create and destroy Pods dynamically. From one moment to the next,
you don't know how many of those Pods are working and healthy; you might not even know
what those healthy Pods are named.
Kubernetes {{< glossary_tooltip term_id="pod" text="Pods" >}} are created and destroyed
to match the desired state of your cluster. Pods are ephemeral resources (you should not
expect that an individual Pod is reliable and durable).
Each Pod gets its own IP address, however in a Deployment, the set of Pods
running in one moment in time could be different from
the set of Pods running that application a moment later.
Each Pod gets its own IP address (Kubernetes expects network plugins to ensure this).
For a given Deployment in your cluster, the set of Pods running in one moment in
time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them "backends") provides
functionality to other Pods (call them "frontends") inside your cluster,
@ -42,14 +43,13 @@ to, so that the frontend can use the backend part of the workload?
Enter _Services_.
## Service resources {#service-resource}
<!-- body -->
In Kubernetes, a Service is an abstraction which defines a logical set of Pods
and a policy by which to access them (sometimes this pattern is called
a micro-service). The set of Pods targeted by a Service is usually determined
by a {{< glossary_tooltip text="selector" term_id="selector" >}}.
To learn about other ways to define Service endpoints,
see [Services _without_ selectors](#services-without-selectors).
## Services in Kubernetes
The Service API, part of Kubernetes, is an abstraction to help you expose groups of
Pods over a network. Each Service object defines a logical set of endpoints (usually
these endpoints are Pods) along with a policy about how to make those pods accessible.
For example, consider a stateless image-processing backend which is running with
3 replicas. Those replicas are fungible&mdash;frontends do not care which backend
@ -59,6 +59,26 @@ track of the set of backends themselves.
The Service abstraction enables this decoupling.
The set of Pods targeted by a Service is usually determined
by a {{< glossary_tooltip text="selector" term_id="selector" >}} that you
define.
To learn about other ways to define Service endpoints,
see [Services _without_ selectors](#services-without-selectors).
If your workload speaks HTTP, you might choose to use an
[Ingress](/docs/concepts/services-networking/ingress/) to control how web traffic
reaches that workload.
Ingress is not a Service type, but it acts as the entry point for your
cluster. An Ingress lets you consolidate your routing rules into a single resource, so
that you can expose multiple components of your workload, running separately in your
cluster, behind a single listener.
The [Gateway](https://gateway-api.sigs.k8s.io/#what-is-the-gateway-api) API for Kubernetes
provides extra capabilities beyond Ingress and Service. You can add Gateway to your cluster -
it is a family of extension APIs, implemented using
{{< glossary_tooltip term_id="CustomResourceDefinition" text="CustomResourceDefinitions" >}} -
and then use these to configure access to network services that are running in your cluster.
### Cloud-native service discovery
If you're able to use Kubernetes APIs for service discovery in your application,
@ -69,16 +89,20 @@ whenever the set of Pods in a Service changes.
For non-native applications, Kubernetes offers ways to place a network port or load
balancer in between your application and the backend Pods.
Either way, your workload can use these [service discovery](#discovering-services)
mechanisms to find the target it wants to connect to.
## Defining a Service
A Service in Kubernetes is a REST object, similar to a Pod. Like all of the
REST objects, you can `POST` a Service definition to the API server to create
a new instance.
The name of a Service object must be a valid
[RFC 1035 label name](/docs/concepts/overview/working-with-objects/names#rfc-1035-label-names).
A Service in Kubernetes is an
{{< glossary_tooltip text="object" term_id="object" >}}
(the same way that a Pod or a ConfigMap is an object). You can create,
view or modify Service definitions using the Kubernetes API. Usually
you use a tool such as `kubectl` to make those API calls for you.
For example, suppose you have a set of Pods where each listens on TCP port 9376
and contains a label `app.kubernetes.io/name=MyApp`:
For example, suppose you have a set of Pods that each listen on TCP port 9376
and are labelled as `app.kubernetes.io/name=MyApp`. You can define a Service to
publish that TCP listener:
```yaml
apiVersion: v1
@ -94,16 +118,20 @@ spec:
targetPort: 9376
```
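Assuming you saved the manifest above to a file named `my-service.yaml` (an illustrative filename), you could submit it with:

```shell
kubectl apply -f my-service.yaml
```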
This specification creates a new Service object named "my-service", which
targets TCP port 9376 on any Pod with the `app.kubernetes.io/name=MyApp` label.
Applying this manifest creates a new Service named "my-service", which
targets TCP port 9376 on any Pod with the `app.kubernetes.io/name: MyApp` label.
Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"),
which is used by the Service proxies
(see [Virtual IP addressing mechanism](#virtual-ip-addressing-mechanism) below).
Kubernetes assigns this Service an IP address (the _cluster IP_),
that is used by the virtual IP address mechanism. For more details on that mechanism,
read [Virtual IPs and Service Proxies](/docs/reference/networking/virtual-ips/).
The controller for that Service continuously scans for Pods that
match its selector, and then makes any necessary updates to the set of
EndpointSlices for the Service.
The name of a Service object must be a valid
[RFC 1035 label name](/docs/concepts/overview/working-with-objects/names#rfc-1035-label-names).
The controller for the Service selector continuously scans for Pods that
match its selector, and then POSTs any updates to an Endpoint object
also named "my-service".
{{< note >}}
A Service can map _any_ incoming `port` to a `targetPort`. By default and
@ -177,8 +205,8 @@ For example:
* You are migrating a workload to Kubernetes. While evaluating the approach,
you run only a portion of your backends in Kubernetes.
In any of these scenarios you can define a Service _without_ a Pod selector.
For example:
In any of these scenarios you can define a Service _without_ specifying a
selector to match Pods. For example:
```yaml
apiVersion: v1
@ -262,9 +290,9 @@ selector will fail due to this constraint. This prevents the Kubernetes API serv
from being used as a proxy to endpoints the caller may not be authorized to access.
{{< /note >}}
An ExternalName Service is a special case of Service that does not have
An `ExternalName` Service is a special case of Service that does not have
selectors and uses DNS names instead. For more information, see the
[ExternalName](#externalname) section later in this document.
[ExternalName](#externalname) section.
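As a brief sketch (the Service name and target DNS name are illustrative), an `ExternalName` Service maps a cluster-internal name to an external hostname:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-database              # illustrative name
spec:
  type: ExternalName
  externalName: db.example.com   # illustrative external DNS name
```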
### EndpointSlices
@ -628,7 +656,7 @@ by making the changes that are equivalent to you requesting a Service of
`type: NodePort`. The cloud-controller-manager component then configures the external load balancer to
forward traffic to that assigned node port.
_As an alpha feature_, you can configure a load balanced Service to
You can configure a load balanced Service to
[omit](#load-balancer-nodeport-allocation) assigning a node port, provided that the
cloud provider implementation supports this.
@ -704,7 +732,7 @@ In a split-horizon DNS environment you would need two Services to be able to rou
and internal traffic to your endpoints.
To set an internal load balancer, add one of the following annotations to your Service
depending on the cloud Service provider you're using.
depending on the cloud service provider you're using:
{{< tabs name="service_tabs" >}}
{{% tab name="Default" %}}
@ -1137,7 +1165,7 @@ will be routed to one of the Service endpoints. `externalIPs` are not managed by
of the cluster administrator.
In the Service spec, `externalIPs` can be specified along with any of the `ServiceTypes`.
In the example below, "`my-service`" can be accessed by clients on "`80.11.12.10:80`" (`externalIP:port`)
In the example below, "`my-service`" can be accessed by clients on "`198.51.100.32:80`" (`externalIP:port`)
```yaml
apiVersion: v1
@ -1151,9 +1179,9 @@ spec:
- name: http
protocol: TCP
port: 80
targetPort: 9376
targetPort: 49152
externalIPs:
- 80.11.12.10
- 198.51.100.32
```
## Session stickiness
@ -1178,13 +1206,17 @@ mechanism Kubernetes provides to expose a Service with a virtual IP address.
## {{% heading "whatsnext" %}}
Learn more about the following:
* Follow the [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/) tutorial
* [Ingress](/docs/concepts/services-networking/ingress/) exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
* [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/)
Learn more about Services and how they fit into Kubernetes:
* Follow the [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/) tutorial.
* Read about [Ingress](/docs/concepts/services-networking/ingress/), which
exposes HTTP and HTTPS routes from outside the cluster to Services within
your cluster.
* Read about [Gateway](https://gateway-api.sigs.k8s.io/), an extension to
Kubernetes that provides more flexibility than Ingress.
For more context:
For more context, read the following:
* [Virtual IPs and Service Proxies](/docs/reference/networking/virtual-ips/)
* [API reference](/docs/reference/kubernetes-api/service-resources/service-v1/) for the Service API
* [API reference](/docs/reference/kubernetes-api/service-resources/endpoints-v1/) for the Endpoints API
* [API reference](/docs/reference/kubernetes-api/service-resources/endpoint-slice-v1/) for the EndpointSlice API
* [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/)
* [Service API reference](/docs/reference/kubernetes-api/service-resources/service-v1/)
* [EndpointSlice API reference](/docs/reference/kubernetes-api/service-resources/endpoint-slice-v1/)
* [Endpoint API reference (legacy)](/docs/reference/kubernetes-api/service-resources/endpoints-v1/)

View File

@ -126,7 +126,7 @@ zone.
5. **A zone is not represented in hints:** If the kube-proxy is unable to find
at least one endpoint with a hint targeting the zone it is running in, it falls
to using endpoints from all zones. This is most likely to happen as you add
back to using endpoints from all zones. This is most likely to happen as you add
a new zone into your existing cluster.
## Constraints

View File

@ -16,25 +16,45 @@ weight: 20
<!-- overview -->
This document describes _persistent volumes_ in Kubernetes. Familiarity with [volumes](/docs/concepts/storage/volumes/) is suggested.
This document describes _persistent volumes_ in Kubernetes. Familiarity with
[volumes](/docs/concepts/storage/volumes/) is suggested.
<!-- body -->
## Introduction
Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
Managing storage is a distinct problem from managing compute instances.
The PersistentVolume subsystem provides an API for users and administrators
that abstracts details of how storage is provided from how it is consumed.
To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
A _PersistentVolume_ (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using [Storage Classes](/docs/concepts/storage/storage-classes/). It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A _PersistentVolume_ (PV) is a piece of storage in the cluster that has been
provisioned by an administrator or dynamically provisioned using
[Storage Classes](/docs/concepts/storage/storage-classes/). It is a resource in
the cluster just like a node is a cluster resource. PVs are volume plugins like
Volumes, but have a lifecycle independent of any individual Pod that uses the PV.
This API object captures the details of the implementation of the storage, be that
NFS, iSCSI, or a cloud-provider-specific storage system.
A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see [AccessModes](#access-modes)).
A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. It is similar
to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can
request specific levels of resources (CPU and Memory). Claims can request specific
size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or
ReadWriteMany, see [AccessModes](#access-modes)).
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than size and access modes, without exposing users to the details of how those volumes are implemented. For these needs, there is the _StorageClass_ resource.
While PersistentVolumeClaims allow a user to consume abstract storage resources,
it is common that users need PersistentVolumes with varying properties, such as
performance, for different problems. Cluster administrators need to be able to
offer a variety of PersistentVolumes that differ in more ways than size and access
modes, without exposing users to the details of how those volumes are implemented.
For these needs, there is the _StorageClass_ resource.
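To make the relationship concrete, here is a rough sketch of a statically provisioned PV and a claim that could bind to it; the NFS server, path, names, and sizes are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv              # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:                          # illustrative backend; any supported plugin works
    server: nfs.example.internal
    path: /exports/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc             # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi             # a 10Gi PV like the one above can satisfy this claim
```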
See the [detailed walkthrough with working examples](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/).
## Lifecycle of a volume and claim
PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
PVs are resources in the cluster. PVCs are requests for those resources and also act
as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
### Provisioning
@ -42,7 +62,9 @@ There are two ways PVs may be provisioned: statically or dynamically.
#### Static
A cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
A cluster administrator creates a number of PVs. They carry the details of the
real storage, which is available for use by cluster users. They exist in the
Kubernetes API and are available for consumption.
#### Dynamic
@ -55,34 +77,60 @@ provisioning to occur. Claims that request the class `""` effectively disable
dynamic provisioning for themselves.
To enable dynamic storage provisioning based on storage class, the cluster administrator
needs to enable the `DefaultStorageClass` [admission controller](/docs/reference/access-authn-authz/admission-controllers/#defaultstorageclass)
needs to enable the `DefaultStorageClass`
[admission controller](/docs/reference/access-authn-authz/admission-controllers/#defaultstorageclass)
on the API server. This can be done, for example, by ensuring that `DefaultStorageClass` is
among the comma-delimited, ordered list of values for the `--enable-admission-plugins` flag of
the API server component. For more information on API server command-line flags,
check [kube-apiserver](/docs/admin/kube-apiserver/) documentation.
check [kube-apiserver](/docs/reference/command-line-tools-reference/kube-apiserver/) documentation.
### Binding
A user creates, or in the case of dynamic provisioning, has already created, a PersistentVolumeClaim with a specific amount of storage requested and with certain access modes. A control loop in the master watches for new PVCs, finds a matching PV (if possible), and binds them together. If a PV was dynamically provisioned for a new PVC, the loop will always bind that PV to the PVC. Otherwise, the user will always get at least what they asked for, but the volume may be in excess of what was requested. Once bound, PersistentVolumeClaim binds are exclusive, regardless of how they were bound. A PVC to PV binding is a one-to-one mapping, using a ClaimRef which is a bi-directional binding between the PersistentVolume and the PersistentVolumeClaim.
A user creates, or in the case of dynamic provisioning, has already created,
a PersistentVolumeClaim with a specific amount of storage requested and with
certain access modes. A control loop in the master watches for new PVCs, finds
a matching PV (if possible), and binds them together. If a PV was dynamically
provisioned for a new PVC, the loop will always bind that PV to the PVC. Otherwise,
the user will always get at least what they asked for, but the volume may be in
excess of what was requested. Once bound, PersistentVolumeClaim binds are exclusive,
regardless of how they were bound. A PVC to PV binding is a one-to-one mapping,
using a ClaimRef which is a bi-directional binding between the PersistentVolume
and the PersistentVolumeClaim.
Claims will remain unbound indefinitely if a matching volume does not exist. Claims will be bound as matching volumes become available. For example, a cluster provisioned with many 50Gi PVs would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.
Claims will remain unbound indefinitely if a matching volume does not exist.
Claims will be bound as matching volumes become available. For example, a
cluster provisioned with many 50Gi PVs would not match a PVC requesting 100Gi.
The PVC can be bound when a 100Gi PV is added to the cluster.
### Using
Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a Pod. For volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a Pod.
Pods use claims as volumes. The cluster inspects the claim to find the bound
volume and mounts that volume for a Pod. For volumes that support multiple
access modes, the user specifies which mode is desired when using their claim
as a volume in a Pod.
Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule Pods and access their claimed PVs by including a `persistentVolumeClaim` section in a Pod's `volumes` block. See [Claims As Volumes](#claims-as-volumes) for more details on this.
Once a user has a claim and that claim is bound, the bound PV belongs to the
user for as long as they need it. Users schedule Pods and access their claimed
PVs by including a `persistentVolumeClaim` section in a Pod's `volumes` block.
See [Claims As Volumes](#claims-as-volumes) for more details on this.
### Storage Object in Use Protection
The purpose of the Storage Object in Use Protection feature is to ensure that PersistentVolumeClaims (PVCs) in active use by a Pod and PersistentVolume (PVs) that are bound to PVCs are not removed from the system, as this may result in data loss.
The purpose of the Storage Object in Use Protection feature is to ensure that
PersistentVolumeClaims (PVCs) in active use by a Pod and PersistentVolume (PVs)
that are bound to PVCs are not removed from the system, as this may result in data loss.
{{< note >}}
PVC is in active use by a Pod when a Pod object exists that is using the PVC.
{{< /note >}}
If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any Pods. Also, if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately. PV removal is postponed until the PV is no longer bound to a PVC.
If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately.
PVC removal is postponed until the PVC is no longer actively used by any Pods. Also,
if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately.
PV removal is postponed until the PV is no longer bound to a PVC.
You can see that a PVC is protected when the PVC's status is `Terminating` and the `Finalizers` list includes `kubernetes.io/pvc-protection`:
You can see that a PVC is protected when the PVC's status is `Terminating` and the
`Finalizers` list includes `kubernetes.io/pvc-protection`:
```shell
kubectl describe pvc hostpath
@ -98,7 +146,8 @@ Finalizers: [kubernetes.io/pvc-protection]
...
```
You can see that a PV is protected when the PV's status is `Terminating` and the `Finalizers` list includes `kubernetes.io/pv-protection` too:
You can see that a PV is protected when the PV's status is `Terminating` and
the `Finalizers` list includes `kubernetes.io/pv-protection` too:
```shell
kubectl describe pv task-pv-volume
@ -122,29 +171,48 @@ Events: <none>
### Reclaiming
When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
When a user is done with their volume, they can delete the PVC objects from the
API that allows reclamation of the resource. The reclaim policy for a PersistentVolume
tells the cluster what to do with the volume after it has been released of its claim.
Currently, volumes can either be Retained, Recycled, or Deleted.
#### Retain
The `Retain` reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume. An administrator can manually reclaim the volume with the following steps.
The `Retain` reclaim policy allows for manual reclamation of the resource.
When the PersistentVolumeClaim is deleted, the PersistentVolume still exists
and the volume is considered "released". But it is not yet available for
another claim because the previous claimant's data remains on the volume.
An administrator can manually reclaim the volume with the following steps.
1. Delete the PersistentVolume. The associated storage asset in external infrastructure (such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume) still exists after the PV is deleted.
1. Delete the PersistentVolume. The associated storage asset in external infrastructure
(such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume) still exists after the PV is deleted.
1. Manually clean up the data on the associated storage asset accordingly.
1. Manually delete the associated storage asset.
If you want to reuse the same storage asset, create a new PersistentVolume with the same storage asset definition.
If you want to reuse the same storage asset, create a new PersistentVolume with
the same storage asset definition.
#### Delete
For volume plugins that support the `Delete` reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume. Volumes that were dynamically provisioned inherit the [reclaim policy of their StorageClass](#reclaim-policy), which defaults to `Delete`. The administrator should configure the StorageClass according to users' expectations; otherwise, the PV must be edited or patched after it is created. See [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/).
For volume plugins that support the `Delete` reclaim policy, deletion removes
both the PersistentVolume object from Kubernetes, as well as the associated
storage asset in the external infrastructure, such as an AWS EBS, GCE PD,
Azure Disk, or Cinder volume. Volumes that were dynamically provisioned
inherit the [reclaim policy of their StorageClass](#reclaim-policy), which
defaults to `Delete`. The administrator should configure the StorageClass
according to users' expectations; otherwise, the PV must be edited or
patched after it is created. See
[Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/).
#### Recycle
{{< warning >}}
The `Recycle` reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.
The `Recycle` reclaim policy is deprecated. Instead, the recommended approach
is to use dynamic provisioning.
{{< /warning >}}
If supported by the underlying volume plugin, the `Recycle` reclaim policy performs a basic scrub (`rm -rf /thevolume/*`) on the volume and makes it available again for a new claim.
If supported by the underlying volume plugin, the `Recycle` reclaim policy performs
a basic scrub (`rm -rf /thevolume/*`) on the volume and makes it available again for a new claim.
However, an administrator can configure a custom recycler Pod template using
the Kubernetes controller manager command line arguments as described in the
@ -173,7 +241,8 @@ spec:
mountPath: /scrub
```
However, the particular path specified in the custom recycler Pod template in the `volumes` part is replaced with the particular path of the volume that is being recycled.
However, the particular path specified in the custom recycler Pod template in the
`volumes` part is replaced with the particular path of the volume that is being recycled.
### PersistentVolume deletion protection finalizer
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
@ -181,10 +250,12 @@ However, the particular path specified in the custom recycler Pod template in th
Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes
having `Delete` reclaim policy are deleted only after the backing storage are deleted.
The newly introduced finalizers `kubernetes.io/pv-controller` and `external-provisioner.volume.kubernetes.io/finalizer`
The newly introduced finalizers `kubernetes.io/pv-controller` and
`external-provisioner.volume.kubernetes.io/finalizer`
are only added to dynamically provisioned volumes.
The finalizer `kubernetes.io/pv-controller` is added to in-tree plugin volumes. The following is an example
The finalizer `kubernetes.io/pv-controller` is added to in-tree plugin volumes.
The following is an example
```shell
kubectl describe pv pvc-74a498d6-3929-47e8-8c02-078c1ece4d78
@ -213,6 +284,7 @@ Events: <none>
The finalizer `external-provisioner.volume.kubernetes.io/finalizer` is added for CSI volumes.
The following is an example:
```shell
Name: pvc-2f0bab97-85a8-4552-8044-eb8be45cf48d
Labels: <none>
@ -244,14 +316,17 @@ the `kubernetes.io/pv-controller` finalizer is replaced by the
### Reserving a PersistentVolume
The control plane can [bind PersistentVolumeClaims to matching PersistentVolumes](#binding) in the
cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
The control plane can [bind PersistentVolumeClaims to matching PersistentVolumes](#binding)
in the cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC.
If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its `claimRef` field, then the PersistentVolume and PersistentVolumeClaim will be bound.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding
between that specific PV and PVC. If the PersistentVolume exists and has not reserved
PersistentVolumeClaims through its `claimRef` field, then the PersistentVolume and
PersistentVolumeClaim will be bound.
The binding happens regardless of some volume matching criteria, including node affinity.
The control plane still checks that [storage class](/docs/concepts/storage/storage-classes/), access modes, and requested storage size are valid.
The control plane still checks that [storage class](/docs/concepts/storage/storage-classes/),
access modes, and requested storage size are valid.
```yaml
apiVersion: v1
@ -265,7 +340,10 @@ spec:
...
```
This method does not guarantee any binding privileges to the PersistentVolume. If other PersistentVolumeClaims could use the PV that you specify, you first need to reserve that storage volume. Specify the relevant PersistentVolumeClaim in the `claimRef` field of the PV so that other PVCs can not bind to it.
This method does not guarantee any binding privileges to the PersistentVolume.
If other PersistentVolumeClaims could use the PV that you specify, you first
need to reserve that storage volume. Specify the relevant PersistentVolumeClaim
in the `claimRef` field of the PV so that other PVCs can not bind to it.
```yaml
apiVersion: v1
@ -334,8 +412,9 @@ increased and that no resize is necessary.
{{< feature-state for_k8s_version="v1.24" state="stable" >}}
Support for expanding CSI volumes is enabled by default but it also requires a specific CSI driver to support volume expansion. Refer to documentation of the specific CSI driver for more information.
Support for expanding CSI volumes is enabled by default but it also requires a
specific CSI driver to support volume expansion. Refer to documentation of the
specific CSI driver for more information.
#### Resizing a volume containing a file system
@ -364,22 +443,33 @@ FlexVolume resize is possible only when the underlying driver supports resize.
{{< /note >}}
{{< note >}}
Expanding EBS volumes is a time-consuming operation. Also, there is a per-volume quota of one modification every 6 hours.
Expanding EBS volumes is a time-consuming operation.
Also, there is a per-volume quota of one modification every 6 hours.
{{< /note >}}
#### Recovering from Failure when Expanding Volumes
If a user specifies a new size that is too big to be satisfied by underlying storage system, expansion of PVC will be continuously retried until user or cluster administrator takes some action. This can be undesirable and hence Kubernetes provides following methods of recovering from such failures.
If a user specifies a new size that is too big to be satisfied by the underlying
storage system, expansion of the PVC is continuously retried until the user or
cluster administrator takes some action. This can be undesirable, and hence
Kubernetes provides the following methods of recovering from such failures.
{{< tabs name="recovery_methods" >}}
{{% tab name="Manually with Cluster Administrator access" %}}
If expanding underlying storage fails, the cluster administrator can manually recover the Persistent Volume Claim (PVC) state and cancel the resize requests. Otherwise, the resize requests are continuously retried by the controller without administrator intervention.
If expanding underlying storage fails, the cluster administrator can manually
recover the Persistent Volume Claim (PVC) state and cancel the resize requests.
Otherwise, the resize requests are continuously retried by the controller without
administrator intervention.
1. Mark the PersistentVolume(PV) that is bound to the PersistentVolumeClaim(PVC) with `Retain` reclaim policy.
2. Delete the PVC. Since PV has `Retain` reclaim policy - we will not lose any data when we recreate the PVC.
3. Delete the `claimRef` entry from PV specs, so as new PVC can bind to it. This should make the PV `Available`.
4. Re-create the PVC with smaller size than PV and set `volumeName` field of the PVC to the name of the PV. This should bind new PVC to existing PV.
1. Mark the PersistentVolume (PV) that is bound to the PersistentVolumeClaim (PVC)
   with the `Retain` reclaim policy.
2. Delete the PVC. Because the PV has the `Retain` reclaim policy, no data is lost
   when the PVC is recreated.
3. Delete the `claimRef` entry from the PV spec, so that a new PVC can bind to it.
   This should make the PV `Available`.
4. Re-create the PVC with a smaller size than the PV and set the `volumeName` field of
   the PVC to the name of the PV. This should bind the new PVC to the existing PV.
5. Don't forget to restore the reclaim policy of the PV.
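A sketch of those steps as `kubectl` commands, assuming a PV named `task-pv` bound to a PVC named `task-pvc` (both names are illustrative):

```shell
# 1. Keep the data while the PVC is recreated.
kubectl patch pv task-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# 2. Delete the PVC; the Retain policy preserves the underlying data.
kubectl delete pvc task-pvc

# 3. Clear the claimRef so that a new PVC can bind to this PV.
kubectl patch pv task-pv --type json -p '[{"op":"remove","path":"/spec/claimRef"}]'

# 4. Re-create the PVC with a smaller size and spec.volumeName set to "task-pv",
#    then restore the PV's original reclaim policy.
```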
{{% /tab %}}
@ -387,7 +477,11 @@ If expanding underlying storage fails, the cluster administrator can manually re
{{% feature-state for_k8s_version="v1.23" state="alpha" %}}
{{< note >}}
Recovery from failing PVC expansion by users is available as an alpha feature since Kubernetes 1.23. The `RecoverVolumeExpansionFailure` feature must be enabled for this feature to work. Refer to the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) documentation for more information.
Recovery from failing PVC expansion by users is available as an alpha feature
since Kubernetes 1.23. The `RecoverVolumeExpansionFailure` feature must be
enabled for this feature to work. Refer to the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
documentation for more information.
{{< /note >}}
If the feature gates `RecoverVolumeExpansionFailure` is
@ -397,7 +491,8 @@ smaller proposed size, edit `.spec.resources` for that PVC and choose a value th
value you previously tried.
This is useful if expansion to a higher value did not succeed because of capacity constraint.
If that has happened, or you suspect that it might have, you can retry expansion by specifying a
size that is within the capacity limits of underlying storage provider. You can monitor status of resize operation by watching `.status.resizeStatus` and events on the PVC.
size that is within the capacity limits of the underlying storage provider. You can monitor
the status of the resize operation by watching `.status.resizeStatus` and events on the PVC.
Note that,
although you can specify a lower amount of storage than what was requested previously,
@ -406,7 +501,6 @@ Kubernetes does not support shrinking a PVC to less than its current size.
{{% /tab %}}
{{% /tabs %}}
## Types of Persistent Volumes
PersistentVolume types are implemented as plugins. Kubernetes currently supports the following plugins:
@ -423,7 +517,8 @@ PersistentVolume types are implemented as plugins. Kubernetes currently supports
* [`nfs`](/docs/concepts/storage/volumes/#nfs) - Network File System (NFS) storage
* [`rbd`](/docs/concepts/storage/volumes/#rbd) - Rados Block Device (RBD) volume
The following types of PersistentVolume are deprecated. This means that support is still available but will be removed in a future Kubernetes release.
The following types of PersistentVolume are deprecated.
This means that support is still available but will be removed in a future Kubernetes release.
* [`awsElasticBlockStore`](/docs/concepts/storage/volumes/#awselasticblockstore) - AWS Elastic Block Store (EBS)
(**deprecated** in v1.17)
@ -483,14 +578,21 @@ spec:
```
{{< note >}}
Helper programs relating to the volume type may be required for consumption of a PersistentVolume within a cluster. In this example, the PersistentVolume is of type NFS and the helper program /sbin/mount.nfs is required to support the mounting of NFS filesystems.
Helper programs relating to the volume type may be required for consumption of
a PersistentVolume within a cluster. In this example, the PersistentVolume is
of type NFS and the helper program /sbin/mount.nfs is required to support the
mounting of NFS filesystems.
{{< /note >}}
### Capacity
Generally, a PV will have a specific storage capacity. This is set using the PV's `capacity` attribute. Read the glossary term [Quantity](/docs/reference/glossary/?all=true#term-quantity) to understand the units expected by `capacity`.
Generally, a PV will have a specific storage capacity. This is set using the PV's
`capacity` attribute. Read the glossary term
[Quantity](/docs/reference/glossary/?all=true#term-quantity) to understand the units
expected by `capacity`.
Currently, storage size is the only resource that can be set or requested. Future attributes may include IOPS, throughput, etc.
Currently, storage size is the only resource that can be set or requested.
Future attributes may include IOPS, throughput, etc.
### Volume Mode
@ -515,12 +617,18 @@ for an example on how to use a volume with `volumeMode: Block` in a Pod.
### Access Modes
A PersistentVolume can be mounted on a host in any way supported by the resource provider. As shown in the table below, providers will have different capabilities and each PV's access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV's capabilities.
A PersistentVolume can be mounted on a host in any way supported by the resource
provider. As shown in the table below, providers will have different capabilities
and each PV's access modes are set to the specific modes supported by that particular
volume. For example, NFS can support multiple read/write clients, but a specific
NFS PV might be exported on the server as read-only. Each PV gets its own set of
access modes describing that specific PV's capabilities.
The access modes are:
`ReadWriteOnce`
: the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.
`ReadWriteOnce`
: the volume can be mounted as read-write by a single node. The ReadWriteOnce access
mode can still allow multiple pods to access the volume when the pods are running on the same node.
`ReadOnlyMany`
: the volume can be mounted as read-only by many nodes.
@ -529,12 +637,14 @@ The access modes are:
: the volume can be mounted as read-write by many nodes.
`ReadWriteOncePod`
: the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across whole cluster can read that PVC or write to it. This is only supported for CSI volumes and Kubernetes version 1.22+.
: the volume can be mounted as read-write by a single Pod. Use the ReadWriteOncePod
access mode if you want to ensure that only one pod across the whole cluster can
read that PVC or write to it. This is only supported for CSI volumes and
Kubernetes version 1.22+.
The blog article
[Introducing Single Pod Access Mode for PersistentVolumes](/blog/2021/09/13/read-write-once-pod-access-mode-alpha/)
covers this in more detail.
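
As a rough sketch (the claim name and size below are placeholders), a
PersistentVolumeClaim that requests the `ReadWriteOncePod` access mode could look
like this; remember that it only works with CSI-provisioned volumes:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-claim   # placeholder name
spec:
  accessModes:
    - ReadWriteOncePod        # only one Pod in the whole cluster may use the volume
  resources:
    requests:
      storage: 1Gi            # placeholder size
```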
In the CLI, the access modes are abbreviated to:
@ -547,13 +657,15 @@ In the CLI, the access modes are abbreviated to:
Kubernetes uses volume access modes to match PersistentVolumeClaims and PersistentVolumes.
In some cases, the volume access modes also constrain where the PersistentVolume can be mounted.
Volume access modes do **not** enforce write protection once the storage has been mounted.
Even if the access modes are specified as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany,
they don't set any constraints on the volume. For example, even if a PersistentVolume is
created as ReadOnlyMany, it is no guarantee that it will be read-only. If the access modes
are specified as ReadWriteOncePod, the volume is constrained and can be mounted on only a single Pod.
{{< /note >}}
> __Important!__ A volume can only be mounted using one access mode at a time,
> even if it supports many. For example, a GCEPersistentDisk can be mounted as
> ReadWriteOnce by a single node or ReadOnlyMany by many nodes, but not at the same time.
| Volume Plugin | ReadWriteOnce | ReadOnlyMany | ReadWriteMany | ReadWriteOncePod |
| :--- | :---: | :---: | :---: | - |
@ -593,13 +705,16 @@ Current reclaim policies are:
* Retain -- manual reclamation
* Recycle -- basic scrub (`rm -rf /thevolume/*`)
* Delete -- associated storage asset such as AWS EBS, GCE PD, Azure Disk,
or OpenStack Cinder volume is deleted
Currently, only NFS and HostPath support recycling. AWS EBS, GCE PD, Azure Disk,
and Cinder volumes support deletion.
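
As an illustrative sketch (the NFS server, path, and name are placeholders), a
PersistentVolume that opts into the `Retain` reclaim policy might be written as:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: retained-pv                        # placeholder name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain    # keep the underlying data after the claim is deleted
  nfs:
    server: nfs.example.com                # placeholder server
    path: /exports/data                    # placeholder export path
```

With `Retain`, an administrator has to clean up and re-create the PV manually before it can be reused.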
### Mount Options
A Kubernetes administrator can specify additional mount options for when a
Persistent Volume is mounted on a node.
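
For example, a PersistentVolume sketch that passes NFS mount options could look
like the following (the server, path, and option values are placeholders; check
what your storage actually supports):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-with-options        # placeholder name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - hard                      # mount options are passed through to the mount helper
    - nfsvers=4.1
  nfs:
    server: nfs.example.com     # placeholder server
    path: /exports/share        # placeholder export path
```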
{{< note >}}
Not all Persistent Volume types support mount options.
@ -627,10 +742,19 @@ it will become fully deprecated in a future Kubernetes release.
### Node Affinity
{{< note >}}
For most volume types, you do not need to set this field. It is automatically
populated for [AWS EBS](/docs/concepts/storage/volumes/#awselasticblockstore),
[GCE PD](/docs/concepts/storage/volumes/#gcepersistentdisk) and
[Azure Disk](/docs/concepts/storage/volumes/#azuredisk) volume block types. You
need to explicitly set this for [local](/docs/concepts/storage/volumes/#local) volumes.
{{< /note >}}
A PV can specify node affinity to define constraints that limit what nodes this
volume can be accessed from. Pods that use a PV will only be scheduled to nodes
that are selected by the node affinity. To specify node affinity, set
`nodeAffinity` in the `.spec` of a PV. The
[PersistentVolume](/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-v1/#PersistentVolumeSpec)
API reference has more details on this field.
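
As an illustration (the node name, device path, and StorageClass are placeholders),
a `local` PersistentVolume that pins itself to one node with `nodeAffinity` might
look like:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv                        # placeholder name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage       # assumed StorageClass name
  local:
    path: /mnt/disks/ssd1               # placeholder path on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1                # placeholder node name
```

Pods that use this PV can only be scheduled onto the node(s) selected by this affinity.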
### Phase
@ -671,24 +795,35 @@ spec:
### Access Modes
Claims use [the same conventions as volumes](#access-modes) when requesting
storage with specific access modes.
### Volume Modes
Claims use [the same convention as volumes](#volume-mode) to indicate the
consumption of the volume as either a filesystem or block device.
### Resources
Claims, like Pods, can request specific quantities of a resource. In this case,
the request is for storage. The same
[resource model](https://git.k8s.io/design-proposals-archive/scheduling/resources.md)
applies to both volumes and claims.
### Selector
Claims can specify a
[label selector](/docs/concepts/overview/working-with-objects/labels/#label-selectors)
to further filter the set of volumes. Only the volumes whose labels match the selector
can be bound to the claim. The selector can consist of two fields:
* `matchLabels` - the volume must have a label with this value
* `matchExpressions` - a list of requirements made by specifying key, list of values,
and operator that relates the key and values. Valid operators include In, NotIn,
Exists, and DoesNotExist.
All of the requirements, from both `matchLabels` and `matchExpressions`, are
ANDed together; they must all be satisfied in order to match.
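
A hedged sketch of a claim that uses both selector fields (the label keys and
values are placeholders) could look like this; only PVs carrying the matching
labels are candidates for binding:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: selective-claim        # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  selector:
    matchLabels:
      release: stable          # the PV must carry this exact label
    matchExpressions:
      - key: environment       # and satisfy this expression as well (ANDed)
        operator: In
        values:
          - dev
```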
### Class
@ -738,22 +873,38 @@ In the past, the annotation `volume.beta.kubernetes.io/storage-class` was used i
of `storageClassName` attribute. This annotation is still working; however,
it won't be supported in a future Kubernetes release.
#### Retroactive default StorageClass assignment
{{< feature-state for_k8s_version="v1.26" state="beta" >}}
You can create a PersistentVolumeClaim without specifying a `storageClassName`
for the new PVC, and you can do so even when no default StorageClass exists
in your cluster. In this case, the new PVC is created as you defined it, and the
`storageClassName` of that PVC remains unset until a default becomes available.
When a default StorageClass becomes available, the control plane identifies any
existing PVCs without `storageClassName`. For the PVCs that either have an empty
value for `storageClassName` or do not have this key, the control plane then
updates those PVCs to set `storageClassName` to match the new default StorageClass.
If you have an existing PVC where the `storageClassName` is `""`, and you configure
a default StorageClass, then this PVC will not get updated.
In order to keep binding to PVs with `storageClassName` set to `""` (while a default StorageClass is present), you need to set the `storageClassName` of the associated PVC to `""`.
In order to keep binding to PVs with `storageClassName` set to `""`
(while a default StorageClass is present), you need to set the `storageClassName`
of the associated PVC to `""`.
This behavior helps administrators change default StorageClass by removing the
old one first and then creating or setting another one. This brief window while
there is no default causes PVCs without `storageClassName` created at that time
to not have any default, but due to the retroactive default StorageClass
assignment this way of changing defaults is safe.
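
For illustration, a PVC that deliberately opts out of any default StorageClass by
setting `storageClassName: ""` might be written as follows (the name and size are
placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-without-class    # placeholder name
spec:
  storageClassName: ""         # bind only to PVs whose storageClassName is also ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```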
## Claims As Volumes
Pods access storage by using the claim as a volume. Claims must exist in the
same namespace as the Pod using the claim. The cluster finds the claim in the
Pod's namespace and uses it to get the PersistentVolume backing the claim.
The volume is then mounted to the host and into the Pod.
```yaml
apiVersion: v1
@ -775,12 +926,15 @@ spec:
### A Note on Namespaces
PersistentVolume binds are exclusive, and since PersistentVolumeClaims are
namespaced objects, mounting claims with "Many" modes (`ROX`, `RWX`) is only
possible within one namespace.
### PersistentVolumes typed `hostPath`
A `hostPath` PersistentVolume uses a file or directory on the Node to emulate
network-attached storage. See
[an example of `hostPath` typed volume](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume).
## Raw Block Volume Support
@ -819,6 +973,7 @@ spec:
lun: 0
readOnly: false
```
### PersistentVolumeClaim requesting a Raw Block Volume {#persistent-volume-claim-requesting-a-raw-block-volume}
```yaml
@ -858,14 +1013,18 @@ spec:
```
{{< note >}}
When adding a raw block device for a Pod, you specify the device path in the
container instead of a mount path.
{{< /note >}}
### Binding Block Volumes
If a user requests a raw block volume by indicating this using the `volumeMode`
field in the PersistentVolumeClaim spec, the binding rules differ slightly from
previous releases that didn't consider this mode as part of the spec.

The table below lists the possible combinations that a user and an administrator
might specify when requesting a raw block device, and indicates whether the volume
will be bound for each combination. This is the volume binding matrix for
statically provisioned volumes:
| PV volumeMode | PVC volumeMode | Result |
| --------------|:---------------:| ----------------:|
@ -880,15 +1039,19 @@ Volume binding matrix for statically provisioned volumes:
| Filesystem | unspecified | BIND |
{{< note >}}
Only statically provisioned volumes are supported for the alpha release.
Administrators should take care to consider these values when working with
raw block devices.
{{< /note >}}
## Volume Snapshot and Restore Volume from Snapshot Support
{{< feature-state for_k8s_version="v1.20" state="stable" >}}
Volume snapshots only support the out-of-tree CSI volume plugins.
For details, see [Volume Snapshots](/docs/concepts/storage/volume-snapshots/).
In-tree volume plugins are deprecated. You can read about the deprecated volume
plugins in the
[Volume Plugin FAQ](https://github.com/kubernetes/community/blob/master/sig-storage/volume-plugin-faq.md).
### Create a PersistentVolumeClaim from a Volume Snapshot {#create-persistent-volume-claim-from-volume-snapshot}
@ -912,7 +1075,8 @@ spec:
## Volume Cloning
[Volume Cloning](/docs/concepts/storage/volume-pvc-datasource/) is
only available for CSI volume plugins.
### Create PersistentVolumeClaim from an existing PVC {#create-persistent-volume-claim-from-an-existing-pvc}
@ -949,27 +1113,32 @@ same namespace, except for core objects other than PVCs. For clusters that have
gate enabled, use of the `dataSourceRef` is preferred over `dataSource`.
## Cross namespace data sources
{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
Kubernetes supports cross namespace volume data sources.
To use cross namespace volume data sources, you must enable the `AnyVolumeDataSource`
and `CrossNamespaceVolumeDataSource`
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/) for
the kube-apiserver and kube-controller-manager.
Also, you must enable the `CrossNamespaceVolumeDataSource` feature gate for the csi-provisioner.
Enabling the `CrossNamespaceVolumeDataSource` feature gate allows you to specify
a namespace in the dataSourceRef field.
{{< note >}}
When you specify a namespace for a volume data source, Kubernetes checks for a
ReferenceGrant in the other namespace before accepting the reference.
ReferenceGrant is part of the `gateway.networking.k8s.io` extension APIs.
See [ReferenceGrant](https://gateway-api.sigs.k8s.io/api-types/referencegrant/)
in the Gateway API documentation for details.
This means that you must extend your Kubernetes cluster with at least ReferenceGrant from the
Gateway API before you can use this mechanism.
{{< /note >}}
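
As a rough sketch (the namespaces, names, and StorageClass below are placeholders),
a PVC in namespace `dev` that populates itself from a VolumeSnapshot in namespace
`prod` could look like this; a matching ReferenceGrant must already exist in `prod`:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-claim          # placeholder name
  namespace: dev                # placeholder consuming namespace
spec:
  storageClassName: example-sc  # assumed CSI-backed StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: source-snapshot       # placeholder snapshot name
    namespace: prod             # source namespace; must grant access via ReferenceGrant
```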
## Data source references
The `dataSourceRef` field behaves almost the same as the `dataSource` field. If one is
specified while the other is not, the API server will give both fields the same value. Neither
field can be changed after creation, and attempting to specify different values for the two
fields will result in a validation error. Therefore the two fields will always have the same
@ -986,7 +1155,8 @@ users should be aware of:
When the `CrossNamespaceVolumeDataSource` feature is enabled, there are additional differences:
* The `dataSource` field only allows local objects, while the `dataSourceRef` field allows
  objects in any namespace.
* When namespace is specified, `dataSource` and `dataSourceRef` are not synced.
Users should always use `dataSourceRef` on clusters that have the feature gate enabled, and
@ -1030,10 +1200,13 @@ responsibility of that populator controller to report Events that relate to volu
the process.
### Using a cross-namespace volume data source
{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
Create a ReferenceGrant to allow the namespace owner to accept the reference.
You define a populated volume by specifying a cross namespace volume data source
using the `dataSourceRef` field. You must already have a valid ReferenceGrant
in the source namespace:
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
@ -238,11 +238,11 @@ work between Windows and Linux:
The following list documents differences between how Pod specifications work between Windows and Linux:
* `hostIPC` and `hostPID` - host namespace sharing is not possible on Windows
* `hostNetwork` - [see below](/docs/concepts/windows/intro#compatibility-v1-pod-spec-containers-hostnetwork)
* `hostNetwork` - [see below](#compatibility-v1-pod-spec-containers-hostnetwork)
* `dnsPolicy` - setting the Pod `dnsPolicy` to `ClusterFirstWithHostNet` is
not supported on Windows because host networking is not provided. Pods always
run with a container network.
* `podSecurityContext` [see below](/docs/concepts/windows/intro#compatibility-v1-pod-spec-containers-securitycontext)
* `podSecurityContext` [see below](#compatibility-v1-pod-spec-containers-securitycontext)
* `shareProcessNamespace` - this is a beta feature, and depends on Linux namespaces
which are not implemented on Windows. Windows cannot share process namespaces or
the container's root filesystem. Only the network can be shared.
@ -202,7 +202,7 @@ That is, the CronJob does _not_ update existing Jobs, even if those remain runni
A CronJob creates a Job object approximately once per execution time of its schedule.
The scheduling is approximate because there
are certain circumstances where two Jobs might be created, or no Job might be created.
Kubernetes tries to avoid those situations, but does not completely prevent them. Therefore,
the Jobs that you define should be _idempotent_.
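
For illustration only (the name, image, and schedule are placeholders), a CronJob
whose workload is harmless to re-run might look like the sketch below;
`concurrencyPolicy: Forbid` avoids overlapping runs, but it does not remove the
need for the Job itself to be idempotent:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-report            # placeholder name
spec:
  schedule: "*/5 * * * *"         # placeholder schedule: every five minutes
  concurrencyPolicy: Forbid       # skip a run rather than start a second concurrent Job
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox:1.36                         # placeholder image
              command: ["sh", "-c", "date; echo report"]  # safe to run twice or not at all
```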
If `startingDeadlineSeconds` is set to a large value or left unset (the default)
@ -105,30 +105,24 @@ If you do not specify either, then the DaemonSet controller will create Pods on
## How Daemon Pods are scheduled
### Scheduled by default scheduler
A DaemonSet ensures that all eligible nodes run a copy of a Pod. The DaemonSet
controller creates a Pod for each eligible node and adds the
`spec.affinity.nodeAffinity` field of the Pod to match the target host. After
the Pod is created, the default scheduler typically takes over and then binds
the Pod to the target host by setting the `.spec.nodeName` field. If the new
Pod cannot fit on the node, the default scheduler may preempt (evict) some of
the existing Pods based on the
[priority](/docs/concepts/scheduling-eviction/pod-priority-preemption/#pod-priority)
of the new Pod.
{{< feature-state for_k8s_version="1.17" state="stable" >}}
The user can specify a different scheduler for the Pods of the DaemonSet, by
setting the `.spec.template.spec.schedulerName` field of the DaemonSet.
A DaemonSet ensures that all eligible nodes run a copy of a Pod. Normally, the
node that a Pod runs on is selected by the Kubernetes scheduler. However,
DaemonSet pods are created and scheduled by the DaemonSet controller instead.
That introduces the following issues:
* Inconsistent Pod behavior: Normal Pods waiting to be scheduled are created
and in `Pending` state, but DaemonSet pods are not created in `Pending`
state. This is confusing to the user.
* [Pod preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/)
is handled by default scheduler. When preemption is enabled, the DaemonSet controller
will make scheduling decisions without considering pod priority and preemption.
`ScheduleDaemonSetPods` allows you to schedule DaemonSets using the default
scheduler instead of the DaemonSet controller, by adding the `NodeAffinity` term
to the DaemonSet pods, instead of the `.spec.nodeName` term. The default
scheduler is then used to bind the pod to the target host. If node affinity of
the DaemonSet pod already exists, it is replaced (the original node affinity was
taken into account before selecting the target host). The DaemonSet controller only
performs these operations when creating or modifying DaemonSet pods, and no
changes are made to the `spec.template` of the DaemonSet.
The original node affinity specified at the
`.spec.template.spec.affinity.nodeAffinity` field (if specified) is taken into
consideration by the DaemonSet controller when evaluating the eligible nodes,
but is replaced on the created Pod with the node affinity that matches the name
of the eligible node.
```yaml
nodeAffinity:
@ -141,25 +135,40 @@ nodeAffinity:
- target-host-name
```
In addition, `node.kubernetes.io/unschedulable:NoSchedule` toleration is added
automatically to DaemonSet Pods. The default scheduler ignores
`unschedulable` Nodes when scheduling DaemonSet Pods.
### Taints and tolerations
Although Daemon Pods respect
[taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/),
the following tolerations are added to DaemonSet Pods automatically according to
the related features.
The DaemonSet controller automatically adds a set of {{< glossary_tooltip
text="tolerations" term_id="toleration" >}} to DaemonSet Pods:
| Toleration Key | Effect | Version | Description |
| ---------------------------------------- | ---------- | ------- | ----------- |
| `node.kubernetes.io/not-ready` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
| `node.kubernetes.io/unreachable` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
| `node.kubernetes.io/disk-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate disk-pressure attributes by default scheduler. |
| `node.kubernetes.io/memory-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate memory-pressure attributes by default scheduler. |
| `node.kubernetes.io/unschedulable` | NoSchedule | 1.12+ | DaemonSet pods tolerate unschedulable attributes by default scheduler. |
| `node.kubernetes.io/network-unavailable` | NoSchedule | 1.12+ | DaemonSet pods, who uses host network, tolerate network-unavailable attributes by default scheduler. |
{{< table caption="Tolerations for DaemonSet pods" >}}
| Toleration key | Effect | Details |
| --------------------------------------------------------------------------------------------------------------------- | ------------ | --------------------------------------------------------------------------------------------------------------------------------------------- |
| [`node.kubernetes.io/not-ready`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-not-ready) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are not healthy or ready to accept Pods. Any DaemonSet Pods running on such nodes will not be evicted. |
| [`node.kubernetes.io/unreachable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unreachable) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are unreachable from the node controller. Any DaemonSet Pods running on such nodes will not be evicted. |
| [`node.kubernetes.io/disk-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-disk-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with disk pressure issues. |
| [`node.kubernetes.io/memory-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-memory-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with memory pressure issues. |
| [`node.kubernetes.io/pid-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-pid-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with process pressure issues. |
| [`node.kubernetes.io/unschedulable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unschedulable) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes that are unschedulable. |
| [`node.kubernetes.io/network-unavailable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-network-unavailable) | `NoSchedule` | **Only added for DaemonSet Pods that request host networking**, i.e., Pods having `spec.hostNetwork: true`. Such DaemonSet Pods can be scheduled onto nodes with unavailable network.|
{{< /table >}}
You can add your own tolerations to the Pods of a DaemonSet as well, by
defining these in the Pod template of the DaemonSet.
Because the DaemonSet controller sets the
`node.kubernetes.io/unschedulable:NoSchedule` toleration automatically,
Kubernetes can run DaemonSet Pods on nodes that are marked as _unschedulable_.
If you use a DaemonSet to provide an important node-level function, such as
[cluster networking](/docs/concepts/cluster-administration/networking/), it is
helpful that Kubernetes places DaemonSet Pods on nodes before they are ready.
For example, without that special toleration, you could end up in a deadlock
situation where the node is not marked as ready because the network plugin is
not running there, and at the same time the network plugin is not running on
that node because the node is not yet ready.
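
As an illustrative sketch (the name, labels, and image are placeholders), a
DaemonSet that adds its own toleration in the Pod template, on top of the
automatic ones described above, could look like:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent                  # placeholder name
spec:
  selector:
    matchLabels:
      name: node-agent
  template:
    metadata:
      labels:
        name: node-agent
    spec:
      tolerations:
        # user-defined toleration: also run on control plane nodes
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: agent
          image: registry.k8s.io/pause:3.9   # placeholder image
```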
## Communicating with Daemon Pods
@ -45,8 +45,8 @@ The following is an example of a Deployment. It creates a ReplicaSet to bring up
In this example:
* A Deployment named `nginx-deployment` is created, indicated by the
`.metadata.name` field. This name will become the basis for the ReplicaSets
and Pods which are created later. See [Writing a Deployment Spec](#writing-a-deployment-spec)
for more details.
* The Deployment creates a ReplicaSet that creates three replicated Pods, indicated by the `.spec.replicas` field.
* The `.spec.selector` field defines how the created ReplicaSet finds which Pods to manage.
@ -71,14 +71,12 @@ In this example:
Before you begin, make sure your Kubernetes cluster is up and running.
Follow the steps given below to create the above Deployment:
1. Create the Deployment by running the following command:
```shell
kubectl apply -f https://k8s.io/examples/controllers/nginx-deployment.yaml
```
2. Run `kubectl get deployments` to check if the Deployment was created.
If the Deployment is still being created, the output is similar to the following:
@ -125,7 +123,7 @@ Follow the steps given below to create the above Deployment:
* `AGE` displays the amount of time that the application has been running.
Notice that the name of the ReplicaSet is always formatted as
`[DEPLOYMENT-NAME]-[HASH]`. This name will become the basis for the Pods
which are created.
The `HASH` string is the same as the `pod-template-hash` label on the ReplicaSet.
@ -169,56 +167,56 @@ Follow the steps given below to update your Deployment:
1. Let's update the nginx Pods to use the `nginx:1.16.1` image instead of the `nginx:1.14.2` image.
```shell
kubectl set image deployment.v1.apps/nginx-deployment nginx=nginx:1.16.1
```
or use the following command:
```shell
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
```
The output is similar to:
```
deployment.apps/nginx-deployment image updated
```
Alternatively, you can `edit` the Deployment and change `.spec.template.spec.containers[0].image` from `nginx:1.14.2` to `nginx:1.16.1`:
```shell
kubectl edit deployment/nginx-deployment
```
The output is similar to:
```
deployment.apps/nginx-deployment edited
```
2. To see the rollout status, run:
```shell
kubectl rollout status deployment/nginx-deployment
```
The output is similar to this:
```
Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
```
or
```
deployment "nginx-deployment" successfully rolled out
```
Get more details on your updated Deployment:
* After the rollout succeeds, you can view the Deployment by running `kubectl get deployments`.
The output is similar to this:
```ini
NAME READY UP-TO-DATE AVAILABLE AGE
@ -228,44 +226,44 @@ Get more details on your updated Deployment:
* Run `kubectl get rs` to see that the Deployment updated the Pods by creating a new ReplicaSet and scaling it
up to 3 replicas, as well as scaling down the old ReplicaSet to 0 replicas.
```shell
kubectl get rs
```
The output is similar to this:
```
NAME DESIRED CURRENT READY AGE
nginx-deployment-1564180365 3 3 3 6s
nginx-deployment-2035384211 0 0 0 36s
```
* Running `get pods` should now show only the new Pods:
```shell
kubectl get pods
```
The output is similar to this:
```
NAME READY STATUS RESTARTS AGE
nginx-deployment-1564180365-khku8 1/1 Running 0 14s
nginx-deployment-1564180365-nacti 1/1 Running 0 14s
nginx-deployment-1564180365-z9gth 1/1 Running 0 14s
```
Next time you want to update these Pods, you only need to update the Deployment's Pod template again.
Deployment ensures that only a certain number of Pods are down while they are being updated. By default,
it ensures that at least 75% of the desired number of Pods are up (25% max unavailable).
Deployment also ensures that only a certain number of Pods are created above the desired number of Pods.
By default, it ensures that at most 125% of the desired number of Pods are up (25% max surge).
For example, if you look at the above Deployment closely, you will see that it first creates a new Pod,
then deletes an old Pod, and creates another new one. It does not kill old Pods until a sufficient number of
new Pods have come up, and does not create new Pods until a sufficient number of old Pods have been killed.
It makes sure that at least 3 Pods are available and that at max 4 Pods in total are available. In case of
a Deployment with 4 replicas, the number of Pods would be between 3 and 5.
* Get details of your Deployment:
```shell
@ -309,13 +307,13 @@ up to 3 replicas, as well as scaling down the old ReplicaSet to 0 replicas.
Normal ScalingReplicaSet 19s deployment-controller Scaled down replica set nginx-deployment-2035384211 to 1
Normal ScalingReplicaSet 19s deployment-controller Scaled up replica set nginx-deployment-1564180365 to 3
Normal ScalingReplicaSet 14s deployment-controller Scaled down replica set nginx-deployment-2035384211 to 0
```
Here you see that when you first created the Deployment, it created a ReplicaSet (nginx-deployment-2035384211)
and scaled it up to 3 replicas directly. When you updated the Deployment, it created a new ReplicaSet
(nginx-deployment-1564180365) and scaled it up to 1 and waited for it to come up. Then it scaled down the old ReplicaSet
to 2 and scaled up the new ReplicaSet to 2 so that at least 3 Pods were available and at most 4 Pods were created at all times.
It then continued scaling up and down the new and the old ReplicaSet, with the same rolling update strategy.
Finally, you'll have 3 available replicas in the new ReplicaSet, and the old ReplicaSet is scaled down to 0.
{{< note >}}
Kubernetes doesn't count terminating Pods when calculating the number of `availableReplicas`, which must be between
@ -333,7 +331,7 @@ ReplicaSet is scaled to `.spec.replicas` and all old ReplicaSets is scaled to 0.
If you update a Deployment while an existing rollout is in progress, the Deployment creates a new ReplicaSet
as per the update and start scaling that up, and rolls over the ReplicaSet that it was scaling up previously
-- it will add it to its list of old ReplicaSets and start scaling it down.
For example, suppose you create a Deployment to create 5 replicas of `nginx:1.14.2`,
but then update the Deployment to create 5 replicas of `nginx:1.16.1`, when only 3
@ -378,107 +376,107 @@ rolled back.
* Suppose that you made a typo while updating the Deployment, by putting the image name as `nginx:1.161` instead of `nginx:1.16.1`:
```shell
kubectl set image deployment/nginx-deployment nginx=nginx:1.161
```
The output is similar to this:
```
deployment.apps/nginx-deployment image updated
```
* The rollout gets stuck. You can verify it by checking the rollout status:
```shell
kubectl rollout status deployment/nginx-deployment
```
The output is similar to this:
```
Waiting for rollout to finish: 1 out of 3 new replicas have been updated...
```
* Press Ctrl-C to stop the above rollout status watch. For more information on stuck rollouts,
[read more here](#deployment-status).
* You see that the number of old replicas (`nginx-deployment-1564180365` and `nginx-deployment-2035384211`) is 2, and new replicas (nginx-deployment-3066724191) is 1.
```shell
kubectl get rs
```
The output is similar to this:
```
NAME DESIRED CURRENT READY AGE
nginx-deployment-1564180365 3 3 3 25s
nginx-deployment-2035384211 0 0 0 36s
nginx-deployment-3066724191 1 1 0 6s
```
* Looking at the Pods created, you see that 1 Pod created by new ReplicaSet is stuck in an image pull loop.
```shell
kubectl get pods
```
The output is similar to this:
```
NAME READY STATUS RESTARTS AGE
nginx-deployment-1564180365-70iae 1/1 Running 0 25s
nginx-deployment-1564180365-jbqqo 1/1 Running 0 25s
nginx-deployment-1564180365-hysrc 1/1 Running 0 25s
nginx-deployment-3066724191-08mng 0/1 ImagePullBackOff 0 6s
```
{{< note >}}
The Deployment controller stops the bad rollout automatically, and stops scaling up the new ReplicaSet. This depends on the rollingUpdate parameters (`maxUnavailable` specifically) that you have specified. Kubernetes by default sets the value to 25%.
{{< /note >}}
* Get the description of the Deployment:
```shell
kubectl describe deployment
```
The output is similar to this:
```
Name: nginx-deployment
Namespace: default
CreationTimestamp: Tue, 15 Mar 2016 14:48:04 -0700
Labels: app=nginx
Selector: app=nginx
Replicas: 3 desired | 1 updated | 4 total | 3 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.161
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True ReplicaSetUpdated
OldReplicaSets: nginx-deployment-1564180365 (3/3 replicas created)
NewReplicaSet: nginx-deployment-3066724191 (1/1 replicas created)
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {deployment-controller } Normal ScalingReplicaSet Scaled up replica set nginx-deployment-2035384211 to 3
22s 22s 1 {deployment-controller } Normal ScalingReplicaSet Scaled up replica set nginx-deployment-1564180365 to 1
22s 22s 1 {deployment-controller } Normal ScalingReplicaSet Scaled down replica set nginx-deployment-2035384211 to 2
22s 22s 1 {deployment-controller } Normal ScalingReplicaSet Scaled up replica set nginx-deployment-1564180365 to 2
21s 21s 1 {deployment-controller } Normal ScalingReplicaSet Scaled down replica set nginx-deployment-2035384211 to 1
21s 21s 1 {deployment-controller } Normal ScalingReplicaSet Scaled up replica set nginx-deployment-1564180365 to 3
13s 13s 1 {deployment-controller } Normal ScalingReplicaSet Scaled down replica set nginx-deployment-2035384211 to 0
13s 13s 1 {deployment-controller } Normal ScalingReplicaSet Scaled up replica set nginx-deployment-3066724191 to 1
```
To fix this, you need to rollback to a previous revision of Deployment that is stable.
@ -487,131 +485,131 @@ rolled back.
Follow the steps given below to check the rollout history:
1. First, check the revisions of this Deployment:
```shell
kubectl rollout history deployment/nginx-deployment
```
The output is similar to this:
```
deployments "nginx-deployment"
REVISION CHANGE-CAUSE
1 kubectl apply --filename=https://k8s.io/examples/controllers/nginx-deployment.yaml
2 kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
3 kubectl set image deployment/nginx-deployment nginx=nginx:1.161
```
`CHANGE-CAUSE` is copied from the Deployment annotation `kubernetes.io/change-cause` to its revisions upon creation. You can specify the `CHANGE-CAUSE` message by:
* Annotating the Deployment with `kubectl annotate deployment/nginx-deployment kubernetes.io/change-cause="image updated to 1.16.1"`
* Manually editing the manifest of the resource.
2. To see the details of each revision, run:
```shell
kubectl rollout history deployment/nginx-deployment --revision=2
```
The output is similar to this:
```
deployments "nginx-deployment" revision 2
Labels: app=nginx
pod-template-hash=1159050644
Annotations: kubernetes.io/change-cause=kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
Containers:
nginx:
Image: nginx:1.16.1
Port: 80/TCP
QoS Tier:
cpu: BestEffort
memory: BestEffort
Environment Variables: <none>
No volumes.
```
### Rolling Back to a Previous Revision
Follow the steps given below to rollback the Deployment from the current version to the previous version, which is version 2.
1. Now you've decided to undo the current rollout and rollback to the previous revision:
```shell
kubectl rollout undo deployment/nginx-deployment
```
The output is similar to this:
```
deployment.apps/nginx-deployment rolled back
```
Alternatively, you can rollback to a specific revision by specifying it with `--to-revision`:
```shell
kubectl rollout undo deployment/nginx-deployment --to-revision=2
```
The output is similar to this:
```
deployment.apps/nginx-deployment rolled back
```
For more details about rollout related commands, read [`kubectl rollout`](/docs/reference/generated/kubectl/kubectl-commands#rollout).
The Deployment is now rolled back to a previous stable revision. As you can see, a `DeploymentRollback` event
for rolling back to revision 2 is generated from Deployment controller.
2. Check if the rollback was successful and the Deployment is running as expected, run:
```shell
kubectl get deployment nginx-deployment
```
The output is similar to this:
```
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 3/3 3 3 30m
```
3. Get the description of the Deployment:
```shell
kubectl describe deployment nginx-deployment
```
The output is similar to this:
```
Name: nginx-deployment
Namespace: default
CreationTimestamp: Sun, 02 Sep 2018 18:17:55 -0500
Labels: app=nginx
Annotations: deployment.kubernetes.io/revision=4
kubernetes.io/change-cause=kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
Selector: app=nginx
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.16.1
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: nginx-deployment-c4747d96c (3/3 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 12m deployment-controller Scaled up replica set nginx-deployment-75675f5897 to 3
Normal ScalingReplicaSet 11m deployment-controller Scaled up replica set nginx-deployment-c4747d96c to 1
Normal ScalingReplicaSet 11m deployment-controller Scaled down replica set nginx-deployment-75675f5897 to 2
Normal ScalingReplicaSet 11m deployment-controller Scaled up replica set nginx-deployment-c4747d96c to 2
Normal ScalingReplicaSet 11m deployment-controller Scaled down replica set nginx-deployment-75675f5897 to 1
Normal ScalingReplicaSet 11m deployment-controller Scaled up replica set nginx-deployment-c4747d96c to 3
Normal ScalingReplicaSet 11m deployment-controller Scaled down replica set nginx-deployment-75675f5897 to 0
Normal ScalingReplicaSet 11m deployment-controller Scaled up replica set nginx-deployment-595696685f to 1
Normal DeploymentRollback 15s deployment-controller Rolled back deployment "nginx-deployment" to revision 2
Normal ScalingReplicaSet 15s deployment-controller Scaled down replica set nginx-deployment-595696685f to 0
```
## Scaling a Deployment
@ -658,26 +656,26 @@ For example, you are running a Deployment with 10 replicas, [maxSurge](#max-surg
```
* You update to a new image which happens to be unresolvable from inside the cluster.
```shell
kubectl set image deployment/nginx-deployment nginx=nginx:sometag
```
The output is similar to this:
```
deployment.apps/nginx-deployment image updated
```
* The image update starts a new rollout with ReplicaSet nginx-deployment-1989198191, but it's blocked due to the
`maxUnavailable` requirement that you mentioned above. Check out the rollout status:
```shell
kubectl get rs
```
The output is similar to this:
```
NAME DESIRED CURRENT READY AGE
nginx-deployment-1989198191 5 5 0 9s
nginx-deployment-618515232 8 8 8 1m
```
* Then a new scaling request for the Deployment comes along. The autoscaler increments the Deployment replicas
to 15. The Deployment controller needs to decide where to add these new 5 replicas. If you weren't using
@ -741,103 +739,103 @@ apply multiple fixes in between pausing and resuming without triggering unnecess
```
* Pause by running the following command:
```shell
kubectl rollout pause deployment/nginx-deployment
```
The output is similar to this:
```
deployment.apps/nginx-deployment paused
```
* Then update the image of the Deployment:
```shell
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
```
The output is similar to this:
```
deployment.apps/nginx-deployment image updated
```
* Notice that no new rollout started:
```shell
kubectl rollout history deployment/nginx-deployment
```
The output is similar to this:
```
deployments "nginx"
REVISION CHANGE-CAUSE
1 <none>
```
* Get the rollout status to verify that the existing ReplicaSet has not changed:
```shell
kubectl get rs
```
The output is similar to this:
```
NAME DESIRED CURRENT READY AGE
nginx-2142116321 3 3 3 2m
```
* You can make as many updates as you wish, for example, update the resources that will be used:
```shell
kubectl set resources deployment/nginx-deployment -c=nginx --limits=cpu=200m,memory=512Mi
```
The output is similar to this:
```
deployment.apps/nginx-deployment resource requirements updated
```
The initial state of the Deployment prior to pausing its rollout will continue its function, but new updates to
the Deployment will not have any effect as long as the Deployment rollout is paused.
* Eventually, resume the Deployment rollout and observe a new ReplicaSet coming up with all the new updates:
```shell
kubectl rollout resume deployment/nginx-deployment
```
The output is similar to this:
```
deployment.apps/nginx-deployment resumed
```
* Watch the status of the rollout until it's done.
```shell
kubectl get rs -w
```
* Get the status of the latest rollout:
```shell
kubectl get rs
```
The output is similar to this:
```
NAME DESIRED CURRENT READY AGE
nginx-2142116321 0 0 0 2m
nginx-3926361531 3 3 3 28s
```
{{< note >}}
You cannot roll back a paused Deployment until you resume it.
{{< /note >}}
@ -1084,9 +1082,9 @@ For general information about working with config files, see
configuring containers, and [using kubectl to manage resources](/docs/concepts/overview/working-with-objects/object-management/) documents.
When the control plane creates new Pods for a Deployment, the `.metadata.name` of the
Deployment is part of the basis for naming those Pods. The name of a Deployment must be a valid
[DNS subdomain](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names)
value, but this can produce unexpected results for the Pod hostnames. For best compatibility,
the name should follow the more restrictive rules for a
[DNS label](/docs/concepts/overview/working-with-objects/names#dns-label-names).
@ -1153,11 +1151,11 @@ the default value.
All existing Pods are killed before new ones are created when `.spec.strategy.type==Recreate`.
{{< note >}}
This will only guarantee Pod termination previous to creation for upgrades. If you upgrade a Deployment, all Pods
of the old revision will be terminated immediately. Successful removal is awaited before any Pod of the new
revision is created. If you manually delete a Pod, the lifecycle is controlled by the ReplicaSet and the
replacement will be created immediately (even if the old Pod is still in a Terminating state). If you need an
"at most" guarantee for your Pods, you should consider using a
[StatefulSet](/docs/concepts/workloads/controllers/statefulset/).
{{< /note >}}
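For orientation, here is a minimal sketch of a Deployment that opts into this strategy; the names and image are illustrative placeholders rather than values taken from this page:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recreate-demo        # illustrative name
spec:
  replicas: 3
  strategy:
    type: Recreate           # all existing Pods are killed before new Pods are created
  selector:
    matchLabels:
      app: recreate-demo
  template:
    metadata:
      labels:
        app: recreate-demo
    spec:
      containers:
      - name: app
        image: nginx:1.16.1  # placeholder image
```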

View File

@ -296,14 +296,14 @@ Your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >
Any container in a pod can run in privileged mode to use operating system administrative capabilities
that would otherwise be inaccessible. This is available for both Windows and Linux.
### Linux privileged containers
In Linux, any container in a Pod can enable privileged mode using the `privileged` (Linux) flag
on the [security context](/docs/tasks/configure-pod-container/security-context/) of the
container spec. This is useful for containers that want to use operating system administrative
capabilities such as manipulating the network stack or accessing hardware devices.
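As a hedged illustration (the Pod name, container name, and image below are placeholders), enabling privileged mode on a Linux container looks like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: privileged-demo           # illustrative name
spec:
  containers:
  - name: shell
    image: busybox:1.36           # placeholder image
    command: ["sleep", "3600"]
    securityContext:
      privileged: true            # grants broad, host-level capabilities to this container
```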
### Windows privileged containers
{{< feature-state for_k8s_version="v1.26" state="stable" >}}

View File

@ -267,6 +267,11 @@ after successful sandbox creation and network configuration by the runtime
plugin). For a Pod without init containers, the kubelet sets the `Initialized`
condition to `True` before sandbox creation and network configuration starts.
### Pod scheduling readiness {#pod-scheduling-readiness-gate}
{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
See [Pod Scheduling Readiness](/docs/concepts/scheduling-eviction/pod-scheduling-readiness/) for more information.
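As a brief, hedged sketch, a Pod that should not be scheduled yet can declare a scheduling gate; the gate name here is an arbitrary example, and removing the gate lets scheduling proceed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod                        # illustrative name
spec:
  schedulingGates:
  - name: example.com/wait-for-approval  # arbitrary gate name; the scheduler skips the Pod while it is present
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
```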
## Container probes
@ -291,7 +296,7 @@ Each probe must define exactly one of these four mechanisms:
The target should implement
[gRPC health checks](https://grpc.io/grpc/core/md_doc_health-checking.html).
The diagnostic is considered successful if the `status`
of the response is `SERVING`.
gRPC probes are an alpha feature and are only available if you
enable the `GRPCContainerProbe`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
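For illustration only (the image and port are placeholders), a gRPC readiness probe is declared like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: grpc-probe-demo                     # illustrative name
spec:
  containers:
  - name: server
    image: registry.example/grpc-app:1.0    # placeholder image exposing the gRPC health service
    ports:
    - containerPort: 9090
    readinessProbe:
      grpc:
        port: 9090                          # port serving gRPC health checks
      initialDelaySeconds: 5
```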
@ -460,14 +465,32 @@ An example flow:
The containers in the Pod receive the TERM signal at different times and in an arbitrary
order. If the order of shutdowns matters, consider using a `preStop` hook to synchronize.
{{< /note >}}
1. At the same time as the kubelet is starting graceful shutdown of the Pod, the control plane evaluates whether to remove that shutting-down Pod from EndpointSlice (and Endpoints) objects, where those objects represent
a {{< glossary_tooltip term_id="service" text="Service" >}} with a configured
{{< glossary_tooltip text="selector" term_id="selector" >}}.
{{< glossary_tooltip text="ReplicaSets" term_id="replica-set" >}} and other workload resources
no longer treat the shutting-down Pod as a valid, in-service replica. Pods that shut down slowly
should not continue to serve regular traffic and should start terminating and finish processing open connections.
Some applications need to go beyond finishing open connections and need more graceful termination -
for example: session draining and completion. Any endpoints that represent the terminating pods
are not immediately removed from EndpointSlices,
and a status indicating [terminating state](/docs/concepts/services-networking/endpoint-slices/#conditions)
is exposed from the EndpointSlice API (and the legacy Endpoints API). Terminating
endpoints always have their `ready` condition set
to `false` (for backward compatibility with versions before 1.26),
so load balancers do not use them for regular traffic.
If traffic draining for terminating Pods is needed, check the actual readiness via the `serving` condition,
as shown in the EndpointSlice sketch after this list.
You can find more details on how to implement connection draining
in the tutorial [Pods And Endpoints Termination Flow](/docs/tutorials/services/pods-and-endpoint-termination-flow/).
{{<note>}}
If you don't have the `EndpointSliceTerminatingCondition` feature gate enabled
in your cluster (the gate is on by default from Kubernetes 1.22, and locked to default in 1.26), then the Kubernetes control
plane removes a Pod from any relevant EndpointSlices as soon as the Pod's
termination grace period _begins_. The behavior described above applies only when the
`EndpointSliceTerminatingCondition` feature gate is enabled.
{{</note>}}
1. When the grace period expires, the kubelet triggers forcible shutdown. The container runtime sends
`SIGKILL` to any processes still running in any container in the Pod.
The kubelet also cleans up a hidden `pause` container if that container runtime uses one.
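For orientation, here is a hedged sketch of what an EndpointSlice endpoint backing a terminating Pod might report; the names and IP address are placeholders, and EndpointSlices are normally managed for you by the control plane:

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: example-abc12                    # illustrative name
  labels:
    kubernetes.io/service-name: example  # ties the slice to a Service named "example"
addressType: IPv4
ports:
- name: http
  port: 8080
  protocol: TCP
endpoints:
- addresses:
  - "10.1.2.3"                           # placeholder Pod IP
  conditions:
    ready: false                         # terminating endpoints always report ready=false
    serving: true                        # still able to handle traffic, useful for draining
    terminating: true                    # the backing Pod is shutting down
```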

View File

@ -71,7 +71,7 @@ A Pod is given a QoS class of `Burstable` if:
Pods in the `BestEffort` QoS class can use node resources that aren't specifically assigned
to Pods in other QoS classes. For example, if you have a node with 16 CPU cores available to the
kubelet, and you assign 4 CPU cores to a `Guaranteed` Pod, then a Pod in the `BestEffort`
QoS class can try to use any amount of the remaining 12 CPU cores.
The kubelet prefers to evict `BestEffort` Pods if the node comes under resource pressure.
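For example, a Pod such as the following sketch, where no container sets resource requests or limits, is assigned the `BestEffort` class (names and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-demo       # illustrative name
spec:
  containers:
  - name: app
    image: busybox:1.36       # placeholder image
    command: ["sleep", "3600"]
    # no resources.requests or resources.limits, so the Pod gets the BestEffort QoS class
```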

View File

@ -10,7 +10,7 @@ min-kubernetes-server-version: v1.25
{{< feature-state for_k8s_version="v1.25" state="alpha" >}}
This page explains how user namespaces are used in Kubernetes pods. A user
namespace isolates the user running inside the container from the one
in the host.
A process running as root in a container can run as a different (non-root) user

View File

@ -39,6 +39,10 @@ standard](https://www.loc.gov/standards/iso639-2/php/code_list.php) to find your
localization's two-letter language code. For example, the two-letter code for
Korean is `ko`.
Some languages use a lowercase version of the country code as defined by the
ISO-3166 along with their language codes. For example, the Brazilian Portuguese
language code is `pt-br`.
### Fork and clone the repo
First, [create your own
@ -88,6 +92,11 @@ You'll need to know the two-letter language code for your language. Consult the
to find your localization's two-letter language code. For example, the
two-letter code for Korean is `ko`.
If the language you are starting a localization for is spoken in various places
with significant differences between the variants, it might make sense to
combine the lowercased ISO-3166 country code with the language two-letter code.
For example, Brazilian Portuguese is localized as `pt-br`.
When you start a new localization, you must localize all the
[minimum required content](#minimum-required-content) before
the Kubernetes project can publish your changes to the live
@ -153,10 +162,10 @@ For an example of adding a label, see the PR for adding the
The Kubernetes website uses Hugo as its web framework. The website's Hugo
configuration resides in the
[`hugo.toml`](https://github.com/kubernetes/website/tree/main/hugo.toml)
file. You'll need to modify `hugo.toml` to support a new localization.
Add a configuration block for the new language to `hugo.toml` under the
existing `[languages]` block. The German block, for example, looks like:
```toml

View File

@ -271,6 +271,33 @@ Renders to:
{{< tab name="JSON File" include="podtemplate.json" />}}
{{< /tabs >}}
### Source code files
You can use the `{{</* codenew */>}}` shortcode to embed the contents of a file in a code block, allowing users to download or copy its content to their clipboard. This shortcode is used when the contents of the sample file are generic and reusable, and you want the users to try it out themselves.
This shortcode takes in two named parameters: `language` and `file`. The mandatory parameter `file` is used to specify the path to the file being displayed. The optional parameter `language` is used to specify the programming language of the file. If the `language` parameter is not provided, the shortcode will attempt to guess the language based on the file extension.
For example:
```none
{{</* codenew language="yaml" file="application/deployment-scale.yaml" */>}}
```
The output is:
{{< codenew language="yaml" file="application/deployment-scale.yaml" >}}
When adding a new sample file, such as a YAML file, create the file in one of the `<LANG>/examples/` subdirectories where `<LANG>` is the language for the page. In the markdown of your page, use the `codenew` shortcode:
```none
{{</* codenew file="<RELATIVE-PATH>/example-yaml>" */>}}
```
where `<RELATIVE-PATH>` is the path to the sample file to include, relative to the `examples` directory. The following shortcode references a YAML file located at `/content/en/examples/configmap/configmaps.yaml`.
```none
{{</* codenew file="configmap/configmaps.yaml" */>}}
```
## Third party content marker
Running Kubernetes requires third-party software. For example: you
@ -311,7 +338,7 @@ before the item, or just below the heading for the specific item.
To generate a version string for inclusion in the documentation, you can choose from
several version shortcodes. Each version shortcode displays a version string derived from
the value of a version parameter found in the site configuration file, `hugo.toml`.
The two most commonly used version parameters are `latest` and `version`.
### `{{</* param "version" */>}}`

View File

@ -459,12 +459,8 @@ Do | Don't
Update the title in the front matter of the page or blog post. | Use first level heading, as Hugo automatically converts the title in the front matter of the page into a first-level heading.
Use ordered headings to provide a meaningful high-level outline of your content. | Use headings level 4 through 6, unless it is absolutely necessary. If your content is that detailed, it may need to be broken into separate articles.
Use pound or hash signs (`#`) for non-blog post content. | Use underlines (`---` or `===`) to designate first-level headings.
Use sentence case for headings in the page body. For example, **Extend kubectl with plugins** | Use title case for headings in the page body. For example, **Extend Kubectl With Plugins**
Use title case for the page title in the front matter. For example, `title: Kubernetes API Server Bypass Risks` | Use sentence case for page titles in the front matter. For example, don't use `title: Kubernetes API server bypass risks`
{{< /table >}}
### Paragraphs
@ -635,4 +631,5 @@ These steps ... | These simple steps ...
* Learn about [writing a new topic](/docs/contribute/style/write-new-topic/).
* Learn about [using page templates](/docs/contribute/style/page-content-types/).
* Learn about [custom hugo shortcodes](/docs/contribute/style/hugo-shortcodes/).
* Learn about [creating a pull request](/docs/contribute/new-content/open-a-pr/).

View File

@ -89,6 +89,7 @@ operator to use or manage a cluster.
* [kube-scheduler configuration (v1beta2)](/docs/reference/config-api/kube-scheduler-config.v1beta2/),
[kube-scheduler configuration (v1beta3)](/docs/reference/config-api/kube-scheduler-config.v1beta3/) and
[kube-scheduler configuration (v1)](/docs/reference/config-api/kube-scheduler-config.v1/)
* [kube-controller-manager configuration (v1alpha1)](/docs/reference/config-api/kube-controller-manager-config.v1alpha1/)
* [kube-proxy configuration (v1alpha1)](/docs/reference/config-api/kube-proxy-config.v1alpha1/)
* [`audit.k8s.io/v1` API](/docs/reference/config-api/apiserver-audit.v1/)
* [Client authentication API (v1beta1)](/docs/reference/config-api/client-authentication.v1beta1/) and

View File

@ -107,7 +107,7 @@ CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultI
{{< note >}}
The [`ValidatingAdmissionPolicy`](#validatingadmissionpolicy) admission plugin is enabled
by default, but is only active if you enable the `ValidatingAdmissionPolicy`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) **and**
the `admissionregistration.k8s.io/v1alpha1` API.
{{< /note >}}

View File

@ -307,7 +307,7 @@ roleRef:
```
The `csrapproving` controller ships as part of
[kube-controller-manager](/docs/reference/command-line-tools-reference/kube-controller-manager/) and is enabled
by default. The controller uses the
[`SubjectAccessReview` API](/docs/reference/access-authn-authz/authorization/#checking-api-access) to
determine if a given user is authorized to request a CSR, then approves based on

View File

@ -333,6 +333,7 @@ In the following table:
| `WindowsRunAsUserName` | `false` | Alpha | 1.16 | 1.16 |
| `WindowsRunAsUserName` | `true` | Beta | 1.17 | 1.17 |
| `WindowsRunAsUserName` | `true` | GA | 1.18 | 1.20 |
{{< /table >}}
## Descriptions for removed feature gates

File diff suppressed because it is too large

View File

@ -2,7 +2,7 @@
title: Kubeadm
id: kubeadm
date: 2018-04-12
full_link: /docs/reference/setup-tools/kubeadm/
short_description: >
A tool for quickly installing Kubernetes and setting up a secure cluster.

View File

@ -2,7 +2,7 @@
title: Kubectl
id: kubectl
date: 2018-04-12
full_link: /docs/reference/kubectl/
short_description: >
A command line tool for communicating with a Kubernetes cluster.

View File

@ -5,14 +5,21 @@ date: 2018-04-12
full_link: /docs/concepts/services-networking/service/
short_description: >
A way to expose an application running on a set of Pods as a network service.
aka:
tags:
- fundamental
- core-object
---
A method for exposing a network application that is running as one or more
{{< glossary_tooltip text="Pods" term_id="pod" >}} in your cluster.
<!--more-->
The set of Pods targeted by a Service is (usually) determined by a
{{< glossary_tooltip text="selector" term_id="selector" >}}. If more Pods are added or removed,
the set of Pods matching the selector will change. The Service makes sure that network traffic
can be directed to the current set of Pods for the workload.
Kubernetes Services either use IP networking (IPv4, IPv6, or both), or reference an external name in
the Domain Name System (DNS).
The Service abstraction enables other mechanisms, such as Ingress and Gateway.

View File

@ -4,7 +4,7 @@ id: statefulset
date: 2018-04-12
full_link: /docs/concepts/workloads/controllers/statefulset/
short_description: >
A StatefulSet manages deployment and scaling of a set of Pods, with durable storage and persistent identifiers for each Pod.
aka:
tags:
@ -17,6 +17,6 @@ tags:
<!--more-->
Like a {{< glossary_tooltip term_id="deployment" >}}, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These pods are created from the same spec, but are not interchangeable&#58; each has a persistent identifier that it maintains across any rescheduling.
If you want to use storage volumes to provide persistence for your workload, you can use a StatefulSet as part of the solution. Although individual Pods in a StatefulSet are susceptible to failure, the persistent Pod identifiers make it easier to match existing volumes to the new Pods that replace any that have failed.

View File

@ -0,0 +1,38 @@
---
title: CRI Pod & Container Metrics
content_type: reference
weight: 50
description: >-
Collection of Pod & Container metrics via the CRI.
---
<!-- overview -->
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
The [kubelet](/docs/reference/command-line-tools-reference/kubelet/) collects pod and
container metrics via [cAdvisor](https://github.com/google/cadvisor). As an alpha feature,
Kubernetes lets you configure the collection of pod and container
metrics via the {{< glossary_tooltip term_id="cri" text="Container Runtime Interface">}} (CRI). You
must enable the `PodAndContainerStatsFromCRI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and
use a compatible CRI implementation (containerd >= 1.6.0, CRI-O >= 1.23.0) to
use the CRI based collection mechanism.
<!-- body -->
## CRI Pod & Container Metrics
With `PodAndContainerStatsFromCRI` enabled, the kubelet polls the underlying container
runtime for pod and container stats instead of inspecting the host system directly using cAdvisor.
The benefits of relying on the container runtime for this information as opposed to direct
collection with cAdvisor include:
- Potential improved performance if the container runtime already collects this information
during normal operations. In this case, the data can be re-used instead of being aggregated
again by the kubelet.
- It further decouples the kubelet from the container runtime, allowing collection of metrics for
  container runtimes that don't run processes directly on the host alongside the kubelet, where
  cAdvisor could observe them (for example: container runtimes that use virtualization).
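If you want to experiment with this, one hedged sketch is to turn on the feature gate through the kubelet configuration file (only the relevant fields are shown):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodAndContainerStatsFromCRI: true   # opt in to CRI-based pod and container stats collection
```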

View File

@ -37,17 +37,11 @@ kubelet endpoint, and not `/stats/summary`.
## Summary metrics API source {#summary-api-source}
By default, Kubernetes fetches node summary metrics data using an embedded
[cAdvisor](https://github.com/google/cadvisor) that runs within the kubelet. If you
enable the `PodAndContainerStatsFromCRI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
in your cluster, and you use a container runtime that supports statistics access via
{{< glossary_tooltip term_id="cri" text="Container Runtime Interface">}} (CRI), then
the kubelet [fetches Pod- and container-level metric data using CRI](/docs/reference/instrumentation/cri-pod-container-metrics), and not via cAdvisor.
## {{% heading "whatsnext" %}}

View File

@ -1,9 +1,11 @@
---
title: Official CVE Feed
linkTitle: CVE feed
weight: 25
outputs:
- json
- html
- rss
layout: cve-feed
---
@ -14,19 +16,25 @@ the Kubernetes Security Response Committee. See
[Kubernetes Security and Disclosure Information](/docs/reference/issues-security/security/)
for more details.
{{< comment >}}
`replace` is used to bypass known issue with rendering ">"
: https://github.com/gohugoio/hugo/issues/7229 in JSON layouts template
`layouts/_default/cve-feed.json`
{{< /comment >}}
The Kubernetes project publishes a programmatically accessible feed of published
security issues in [JSON feed](/docs/reference/issues-security/official-cve-feed/index.json)
and [RSS feed](/docs/reference/issues-security/official-cve-feed/feed.xml)
formats. You can access it by executing the following commands:
{{< tabs name="CVE feeds" >}}
{{% tab name="JSON feed" %}}
[Link to JSON format](/docs/reference/issues-security/official-cve-feed/index.json)
```shell
curl -Lv https://k8s.io/docs/reference/issues-security/official-cve-feed/index.json
```
{{% /tab %}}
{{% tab name="RSS feed" %}}
[Link to RSS format](/docs/reference/issues-security/official-cve-feed/feed.xml)
```shell
curl -Lv https://k8s.io/docs/reference/issues-security/official-cve-feed/feed.xml
```
{{% /tab %}}
{{< /tabs >}}
{{< cve-feed >}}

View File

@ -41,7 +41,7 @@ echo '[[ $commands[kubectl] ]] && source <(kubectl completion zsh)' >> ~/.zshrc
```
### A note on `--all-namespaces`
Appending `--all-namespaces` happens frequently enough that you should be aware of the shorthand for `--all-namespaces`:
```kubectl -A```

View File

@ -168,6 +168,7 @@ Automanaged APIService objects are deleted by kube-apiserver when it has no buil
{{< /note >}}
There are two possible values:
- `onstart`: The APIService should be reconciled when an API server starts up, but not otherwise.
- `true`: The API server should reconcile this APIService continuously.
@ -191,7 +192,6 @@ The Kubelet populates this label with the hostname. Note that the hostname can b
This label is also used as part of the topology hierarchy. See [topology.kubernetes.io/zone](#topologykubernetesiozone) for more information.
### kubernetes.io/change-cause {#change-cause}
Example: `kubernetes.io/change-cause: "kubectl edit --record deployment foo"`
@ -409,6 +409,7 @@ A zone represents a logical failure domain. It is common for Kubernetes cluster
A region represents a larger domain, made up of one or more zones. It is uncommon for Kubernetes clusters to span multiple regions. While the exact definition of a zone or region is left to infrastructure implementations, common properties of a region include higher network latency between them than within them, non-zero cost for network traffic between them, and failure independence from other zones or regions. For example, nodes within a region might share power infrastructure (e.g. a UPS or generator), but nodes in different regions typically would not.
Kubernetes makes a few assumptions about the structure of zones and regions:
1) regions and zones are hierarchical: zones are strict subsets of regions and no zone can be in 2 regions
2) zone names are unique across regions; for example region "africa-east-1" might be comprised of zones "africa-east-1a" and "africa-east-1b"
@ -431,6 +432,17 @@ Used on: PersistentVolumeClaim
This annotation has been deprecated.
### volume.beta.kubernetes.io/storage-class (deprecated)
Example: `volume.beta.kubernetes.io/storage-class: "example-class"`
Used on: PersistentVolume, PersistentVolumeClaim
This annotation can be used for PersistentVolume(PV) or PersistentVolumeClaim(PVC) to specify the name of [StorageClass](/docs/concepts/storage/storage-classes/). When both `storageClassName` attribute and `volume.beta.kubernetes.io/storage-class` annotation are specified, the annotation `volume.beta.kubernetes.io/storage-class` takes precedence over the `storageClassName` attribute.
This annotation has been deprecated. Instead, set the [`storageClassName` field](/docs/concepts/storage/persistent-volumes/#class)
for the PersistentVolumeClaim or PersistentVolume.
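For example, a PersistentVolumeClaim that relies on the `storageClassName` field instead of the deprecated annotation might look like this sketch (the claim name, class name, and size are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                 # illustrative name
spec:
  storageClassName: example-class  # preferred over the volume.beta.kubernetes.io/storage-class annotation
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```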
### volume.beta.kubernetes.io/mount-options (deprecated) {#mount-options}
Example : `volume.beta.kubernetes.io/mount-options: "ro,soft"`
@ -528,7 +540,6 @@ a request where the client authenticated using the service account token.
If a legacy token was last used before the cluster gained the feature (added in Kubernetes v1.26), then
the label isn't set.
### endpointslice.kubernetes.io/managed-by {#endpointslicekubernetesiomanaged-by}
Example: `endpointslice.kubernetes.io/managed-by: "controller"`
@ -614,6 +625,17 @@ Example: `kubectl.kubernetes.io/default-container: "front-end-app"`
The value of the annotation is the container name that is default for this Pod. For example, `kubectl logs` or `kubectl exec` without `-c` or `--container` flag will use this default container.
### kubectl.kubernetes.io/default-logs-container (deprecated)
Example: `kubectl.kubernetes.io/default-logs-container: "front-end-app"`
The value of the annotation is the container name that is the default logging container for this Pod. For example, `kubectl logs` without `-c` or `--container` flag will use this default container.
{{< note >}}
This annotation is deprecated. You should use the [`kubectl.kubernetes.io/default-container`](#kubectl-kubernetes-io-default-container) annotation instead.
Kubernetes versions 1.25 and newer ignore this annotation.
{{< /note >}}
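As a hedged example (Pod, container, and image names are placeholders), the preferred annotation is set in the Pod metadata like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-demo                                 # illustrative name
  annotations:
    kubectl.kubernetes.io/default-container: front-end-app  # used by kubectl logs/exec when -c is omitted
spec:
  containers:
  - name: front-end-app
    image: registry.example/frontend:1.0                    # placeholder image
  - name: sidecar
    image: registry.example/sidecar:1.0                     # placeholder image
```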
### endpoints.kubernetes.io/over-capacity
Example: `endpoints.kubernetes.io/over-capacity:truncated`
@ -634,7 +656,7 @@ The presence of this annotation on a Job indicates that the control plane is
[tracking the Job status using finalizers](/docs/concepts/workloads/controllers/job/#job-tracking-with-finalizers).
The control plane uses this annotation to safely transition to tracking Jobs
using finalizers, while the feature is in development.
You should **not** manually add or remove this annotation.
{{< note >}}
Starting from Kubernetes 1.26, this annotation is deprecated.
@ -716,7 +738,6 @@ Refer to
for further details about when and how to use this taint.
{{< /caution >}}
### node.cloudprovider.kubernetes.io/uninitialized
Example: `node.cloudprovider.kubernetes.io/uninitialized: "NoSchedule"`
@ -813,6 +834,17 @@ or updating objects that contain Pod templates, such as Deployments, Jobs, State
See [Enforcing Pod Security at the Namespace Level](/docs/concepts/security/pod-security-admission)
for more information.
### rbac.authorization.kubernetes.io/autoupdate
Example: `rbac.authorization.kubernetes.io/autoupdate: "false"`
Used on: ClusterRole, ClusterRoleBinding, Role, RoleBinding
When this annotation is set to `"true"` on default RBAC objects created by the kube-apiserver, they are automatically updated at server start to add missing permissions and subjects (extra permissions and subjects are left in place). To prevent autoupdating a particular role or rolebinding, set this annotation to `"false"`.
If you create your own RBAC objects and set this annotation to `"false"`, `kubectl auth reconcile`
(which allows reconciling arbitrary RBAC objects in a {{< glossary_tooltip text="manifest" term_id="manifest" >}}) respects this annotation and does not automatically add missing permissions and
subjects.
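A minimal sketch of a custom ClusterRole that opts out of auto-reconciliation follows; the role name and rule are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-viewer                                    # illustrative name
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "false"  # do not automatically add missing permissions or subjects
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
```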
### kubernetes.io/psp (deprecated) {#kubernetes-io-psp}
Example: `kubernetes.io/psp: restricted`

View File

@ -14,4 +14,4 @@ Kubernetes documentation, including:
* [Node Metrics Data](/docs/reference/instrumentation/node-metrics).
* [CRI Pod & Container Metrics](/docs/reference/instrumentation/cri-pod-container-metrics).

View File

@ -0,0 +1,300 @@
---
title: Common Expression Language in Kubernetes
reviewers:
- jpbetz
- cici37
content_type: concept
weight: 35
min-kubernetes-server-version: 1.25
---
<!-- overview -->
The [Common Expression Language (CEL)](https://github.com/google/cel-go) is used
in the Kubernetes API to declare validation rules, policy rules, and other
constraints or conditions.
CEL expressions are evaluated directly in the
{{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}, making CEL a
convenient alternative to out-of-process mechanisms, such as webhooks, for many
extensibility use cases. Your CEL expressions continue to execute so long as the
control plane's API server component remains available.
<!-- body -->
## Language overview
The [CEL
language](https://github.com/google/cel-spec/blob/master/doc/langdef.md) has a
straightforward syntax that is similar to the expressions in C, C++, Java,
JavaScript and Go.
CEL was designed to be embedded into applications. Each CEL "program" is a
single expression that evaluates to a single value. CEL expressions are
typically short "one-liners" that inline well into the string fields of Kubernetes
API resources.
Inputs to a CEL program are "variables". Each Kubernetes API field that contains
CEL declares in the API documentation which variables are available to use for
that field. For example, in the `x-kubernetes-validations[i].rules` field of
CustomResourceDefinitions, the `self` and `oldSelf` variables are available and
refer to the current and previous state, respectively, of the custom resource data to be
validated by the CEL expression.
different variables. See the API documentation of the API fields to learn which
variables are available for that field.
Example CEL expressions:
{{< table caption="Examples of CEL expressions and the purpose of each" >}}
| Rule | Purpose |
|------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| `self.minReplicas <= self.replicas && self.replicas <= self.maxReplicas` | Validate that the three fields defining replicas are ordered appropriately |
| `'Available' in self.stateCounts` | Validate that an entry with the 'Available' key exists in a map |
| `(self.list1.size() == 0) != (self.list2.size() == 0)` | Validate that one of two lists is non-empty, but not both |
| `self.envars.filter(e, e.name == 'MY_ENV').all(e, e.value.matches('^[a-zA-Z]*$'))`  | Validate the 'value' field of a listMap entry where key field 'name' is 'MY_ENV'   |
| `has(self.expired) && self.created + self.ttl < self.expired` | Validate that 'expired' date is after a 'create' date plus a 'ttl' duration |
| `self.health.startsWith('ok')` | Validate a 'health' string field has the prefix 'ok' |
| `self.widgets.exists(w, w.key == 'x' && w.foo < 10)` | Validate that the 'foo' property of a listMap item with a key 'x' is less than 10 |
| `type(self) == string ? self == '99%' : self == 42`                                 | Validate an int-or-string field for both the int and string cases                  |
| `self.metadata.name == 'singleton'` | Validate that an object's name matches a specific value (making it a singleton) |
| `self.set1.all(e, !(e in self.set2))` | Validate that two listSets are disjoint |
| `self.names.size() == self.details.size() && self.names.all(n, n in self.details)` | Validate the 'details' map is keyed by the items in the 'names' listSet |
{{< /table >}}
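To show where such rules live, here is a hedged sketch of a CustomResourceDefinition fragment that embeds the first rule above in `x-kubernetes-validations`; the group, kind, and field names are placeholders:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com          # illustrative name
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
    singular: widget
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            x-kubernetes-validations:
            - rule: "self.minReplicas <= self.replicas && self.replicas <= self.maxReplicas"
              message: "replicas must stay within the configured bounds"
            properties:
              minReplicas:
                type: integer
              replicas:
                type: integer
              maxReplicas:
                type: integer
```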
## CEL community libraries
Kubernetes CEL expressions have access to the following CEL community libraries:
- CEL standard functions, defined in the [list of standard definitions](https://github.com/google/cel-spec/blob/master/doc/langdef.md#list-of-standard-definitions)
- CEL standard [macros](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#macros)
- CEL [extended string function library](https://pkg.go.dev/github.com/google/cel-go/ext#Strings)
## Kubernetes CEL libraries
In addition to the CEL community libraries, Kubernetes includes CEL libraries
that are available everywhere CEL is used in Kubernetes.
### Kubernetes list library
The list library includes `indexOf` and `lastIndexOf`, which work similarly to the
string functions of the same names. These functions return either the first or last
positional index of the provided element in the list.
The list library also includes `min`, `max` and `sum`. Sum is supported on all
number types as well as the duration type. Min and max are supported on all
comparable types.
`isSorted` is also provided as a convenience function and is supported on all
comparable types.
Examples:
{{< table caption="Examples of CEL expressions using list library functions" >}}
| CEL Expression | Purpose |
|------------------------------------------------------------------------------------|-----------------------------------------------------------|
| `names.isSorted()` | Verify that a list of names is kept in alphabetical order |
| `items.map(x, x.weight).sum() == 1.0` | Verify that the "weights" of a list of objects sum to 1.0 |
| `lowPriorities.map(x, x.priority).max() < highPriorities.map(x, x.priority).min()` | Verify that two sets of priorities do not overlap |
| `names.indexOf('should-be-first') == 1`                                             | Require that the first name in a list is a specific value |
{{< /table >}}
See the [Kubernetes List Library](https://pkg.go.dev/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/library#Lists)
godoc for more information.
### Kubernetes regex library
In addition to the `matches` function provided by the CEL standard library, the
regex library provides `find` and `findAll`, enabling a much wider range of
regex operations.
Examples:
{{< table caption="Examples of CEL expressions using regex library functions" >}}
| CEL Expression | Purpose |
|-------------------------------------------------------------|----------------------------------------------------------|
| `"abc 123".find('[0-9]*')` | Find the first number in a string |
| `"1, 2, 3, 4".findAll('[0-9]*').map(x, int(x)).sum() < 100` | Verify that the numbers in a string sum to less than 100 |
{{< /table >}}
See the [Kubernetes regex library](https://pkg.go.dev/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/library#Regex)
godoc for more information.
### Kubernetes URL library
To make it easier and safer to process URLs, the following functions have been added:
- `isURL(string)` checks if a string is a valid URL according to the [Go's
net/url](https://pkg.go.dev/net/url#URL) package. The string must be an
absolute URL.
- `url(string) URL` converts a string to a URL or results in an error if the
string is not a valid URL.
Once parsed via the `url` function, the resulting URL object has `getScheme`,
`getHost`, `getHostname`, `getPort`, `getEscapedPath` and `getQuery` accessors.
Examples:
{{< table caption="Examples of CEL expressions using URL library functions" >}}
| CEL Expression | Purpose |
|-----------------------------------------------------------------|------------------------------------------------|
| `url('https://example.com:80/').getHost()` | Get the 'example.com:80' host part of the URL. |
| `url('https://example.com/path with spaces/').getEscapedPath()` | Returns '/path%20with%20spaces/' |
{{< /table >}}
See the [Kubernetes URL library](https://pkg.go.dev/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/library#URLs)
godoc for more information.
## Type checking
CEL is a [gradually typed language](https://github.com/google/cel-spec/blob/master/doc/langdef.md#gradual-type-checking).
Some Kubernetes API fields contain fully type checked CEL expressions. For
example, [CustomResourceDefinitions Validation
Rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules)
are fully type checked.
Some Kubernetes API fields contain partially type checked CEL expressions. A
partially type checked expression is an expression where some of the variables
are statically typed but others are dynamically typed. For example, in the CEL
expressions of
[ValidatingAdmissionPolicies](/docs/reference/access-authn-authz/validating-admission-policy/)
the `request` variable is typed, but the `object` variable is dynamically typed.
As a result, an expression containing `request.namex` would fail type checking
because the `namex` field is not defined. However, `object.namex` would pass
type checking even when the `namex` field is not defined for the resource kinds
that `object` refers to, because `object` is dynamically typed.
The `has()` macro in CEL may be used in CEL expressions to check if a field of a
dynamically typed variable is accessible before attempting to access the field's
value. For example:
```cel
has(object.namex) ? object.namex == 'special' : request.name == 'special'
```
## Type system integration
{{< table caption="Table showing the relationship between OpenAPIv3 types and CEL types" >}}
| OpenAPIv3 type | CEL type |
|----------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| 'object' with Properties                            | object / "message type" (`type(<object>)` evaluates to `selfType<uniqueNumber>.path.to.object.from.self`)                      |
| 'object' with AdditionalProperties | map |
| 'object' with x-kubernetes-embedded-type | object / "message type", 'apiVersion', 'kind', 'metadata.name' and 'metadata.generateName' are implicitly included in schema |
| 'object' with x-kubernetes-preserve-unknown-fields | object / "message type", unknown fields are NOT accessible in CEL expression |
| x-kubernetes-int-or-string | union of int or string, `self.intOrString < 100 \|\| self.intOrString == '50%'` evaluates to true for both `50` and `"50%"` |
| 'array'                                              | list                                                                                                                           |
| 'array' with x-kubernetes-list-type=map | list with map based Equality & unique key guarantees |
| 'array' with x-kubernetes-list-type=set | list with set based Equality & unique entry guarantees |
| 'boolean' | boolean |
| 'number' (all formats) | double |
| 'integer' (all formats) | int (64) |
| _no equivalent_ | uint (64) |
| 'null' | null_type |
| 'string' | string |
| 'string' with format=byte (base64 encoded) | bytes |
| 'string' with format=date | timestamp (google.protobuf.Timestamp) |
| 'string' with format=datetime | timestamp (google.protobuf.Timestamp) |
| 'string' with format=duration | duration (google.protobuf.Duration) |
{{< /table >}}
Also see: [CEL types](https://github.com/google/cel-spec/blob/v0.6.0/doc/langdef.md#values),
[OpenAPI types](https://swagger.io/specification/#data-types),
[Kubernetes Structural Schemas](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema).
Equality comparison for arrays with `x-kubernetes-list-type` of `set` or `map` ignores element
order. For example `[1, 2] == [2, 1]` if the arrays represent Kubernetes `set` values.
Concatenation on arrays with `x-kubernetes-list-type` use the semantics of the
list type:
- `set`: `X + Y` performs a union where the array positions of all elements in
`X` are preserved and non-intersecting elements in `Y` are appended, retaining
their partial order.
- `map`: `X + Y` performs a merge where the array positions of all keys in `X`
are preserved but the values are overwritten by values in `Y` when the key
sets of `X` and `Y` intersect. Elements in `Y` with non-intersecting keys are
appended, retaining their partial order.
## Escaping
Only Kubernetes resource property names of the form
`[a-zA-Z_.-/][a-zA-Z0-9_.-/]*` are accessible from CEL. Accessible property
names are escaped according to the following rules when accessed in the
expression:
{{< table caption="Table of CEL identifier escaping rules" >}}
| escape sequence | property name equivalent |
|-------------------|----------------------------------------------------------------------------------------------|
| `__underscores__` | `__` |
| `__dot__` | `.` |
| `__dash__` | `-` |
| `__slash__` | `/` |
| `__{keyword}__` | [CEL **RESERVED** keyword](https://github.com/google/cel-spec/blob/v0.6.0/doc/langdef.md#syntax) |
{{< /table >}}
When you escape any of CEL's **RESERVED** keywords, the escaping applies only when the property
name matches the keyword exactly; for example, `int` in the word `sprint` would not be escaped,
nor would it need to be.
Examples on escaping:
{{< table caption="Examples escaped CEL identifiers" >}}
| property name | rule with escaped property name |
|---------------|-----------------------------------|
| `namespace` | `self.__namespace__ > 0` |
| `x-prop` | `self.x__dash__prop > 0` |
| `redact__d` | `self.redact__underscores__d > 0` |
| `string` | `self.startsWith('kube')` |
{{< /table >}}
## Resource constraints
CEL is non-Turing complete and offers a variety of production safety controls to
limit execution time. CEL's _resource constraint_ features provide feedback to
developers about expression complexity and help protect the API server from
excessive resource consumption during evaluation. CEL's resource constraint
features are used to prevent CEL evaluation from consuming excessive API server
resources.
A key element of the resource constraint features is a _cost unit_ that CEL
defines as a way of tracking CPU utilization. Cost units are independent of
system load and hardware. Cost units are also deterministic; for any given CEL
expression and input data, evaluation of the expression by the CEL interpreter
will always result in the same cost.
Many of CEL's core operations have fixed costs. The simplest operations, such as
comparisons (e.g. `<`) have a cost of 1. Some have a higher fixed cost, for
example list literal declarations have a fixed base cost of 40 cost units.
Calls to functions implemented in native code approximate cost based on the time
complexity of the operation. For example: operations that use regular
expressions, such as `match` and `find`, are estimated using an approximated
cost of `length(regexString)*length(inputString)`. The approximated cost
reflects the worst case time complexity of Go's RE2 implementation.
### Runtime cost budget
All CEL expressions evaluated by Kubernetes are constrained by a runtime cost
budget. The runtime cost budget is an estimate of actual CPU utilization
computed by incrementing a cost unit counter while interpreting a CEL
expression. If the CEL interpreter executes too many instructions, the runtime
cost budget will be exceeded, execution of the expressions will be halted, and
an error will result.
Some Kubernetes resources define an additional runtime cost budget that bounds
the execution of multiple expressions. If the sum total of the cost of
expressions exceed the budget, execution of the expressions will be halted, and
an error will result. For example the validation of a custom resource has a
_per-validation_ runtime cost budget for all [Validation
Rules](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules)
evaluated to validate the custom resource.
### Estimated cost limits
For some Kubernetes resources, the API server may also check whether the worst case
estimated running time of a CEL expression would be prohibitively expensive to
execute. If so, the API server prevents the CEL expression from being written to
API resources by rejecting create or update operations containing the CEL
expression. This feature offers a stronger assurance that
CEL expressions written to the API resource will be evaluated at runtime without
exceeding the runtime cost budget.

View File

@ -1,6 +1,6 @@
---
title: "Access Applications in a Cluster"
description: Configure load balancing, port forwarding, or setup firewall or DNS configurations to access applications in a cluster.
weight: 100
---

View File

@ -1,6 +1,7 @@
---
title: Access Services Running on Clusters
content_type: task
weight: 140
---
<!-- overview -->

View File

@ -1,7 +1,7 @@
---
title: Communicate Between Containers in the Same Pod Using a Shared Volume
content_type: task
weight: 120
---
<!-- overview -->

View File

@ -41,12 +41,12 @@ cluster's API server.
## Define clusters, users, and contexts
Suppose you have two clusters, one for development work and one for test work.
In the `development` cluster, your frontend developers work in a namespace called `frontend`,
and your storage developers work in a namespace called `storage`. In your `test` cluster,
developers work in the default namespace, or they create auxiliary namespaces as they
see fit. Access to the development cluster requires authentication by certificate. Access
to the test cluster requires authentication by username and password.
Create a directory named `config-exercise`. In your
`config-exercise` directory, create a file named `config-demo` with this content:
@ -60,7 +60,7 @@ clusters:
- cluster:
name: development
- cluster:
name: test
users:
- name: developer
@ -72,7 +72,7 @@ contexts:
- context:
name: dev-storage
- context:
name: exp-test
```
A configuration file describes clusters, users, and contexts. Your `config-demo` file
@ -83,7 +83,7 @@ your configuration file:
```shell
kubectl config --kubeconfig=config-demo set-cluster development --server=https://1.2.3.4 --certificate-authority=fake-ca-file
kubectl config --kubeconfig=config-demo set-cluster test --server=https://5.6.7.8 --insecure-skip-tls-verify
```
Add user details to your configuration file:
@ -108,7 +108,7 @@ Add context details to your configuration file:
```shell
kubectl config --kubeconfig=config-demo set-context dev-frontend --cluster=development --namespace=frontend --user=developer
kubectl config --kubeconfig=config-demo set-context dev-storage --cluster=development --namespace=storage --user=developer
kubectl config --kubeconfig=config-demo set-context exp-test --cluster=test --namespace=default --user=experimenter
```
Open your `config-demo` file to see the added details. As an alternative to opening the
@ -130,7 +130,7 @@ clusters:
- cluster:
insecure-skip-tls-verify: true
server: https://5.6.7.8
name: test
contexts:
- context:
cluster: development
@ -143,10 +143,10 @@ contexts:
user: developer
name: dev-storage
- context:
cluster: test
namespace: default
user: experimenter
name: exp-test
current-context: ""
kind: Config
preferences: {}
@ -220,19 +220,19 @@ users:
client-key: fake-key-file
```
Now suppose you want to work for a while in the test cluster.
Change the current context to `exp-test`:
```shell
kubectl config --kubeconfig=config-demo use-context exp-test
```
Now any `kubectl` command you give will apply to the default namespace of
the `test` cluster. And the command will use the credentials of the user
listed in the `exp-test` context.
View configuration associated with the new current context, `exp-test`.
```shell
kubectl config --kubeconfig=config-demo view --minify
@ -338,10 +338,10 @@ contexts:
user: developer
name: dev-storage
- context:
cluster: test
namespace: default
user: experimenter
name: exp-test
```
For more information about how kubeconfig files are merged, see

View File

@ -1,6 +1,6 @@
---
title: Configure DNS for a Cluster
weight: 120
weight: 130
content_type: concept
---

View File

@ -1,26 +1,24 @@
---
title: Set up Ingress on Minikube with the NGINX Ingress Controller
content_type: task
weight: 100
weight: 110
min-kubernetes-server-version: 1.19
---
<!-- overview -->
An [Ingress](/docs/concepts/services-networking/ingress/) is an API object that defines rules
which allow external access to services in a cluster. An
[Ingress controller](/docs/concepts/services-networking/ingress-controllers/)
fulfills the rules set in the Ingress.
This page shows you how to set up a simple Ingress which routes requests to Service 'web' or
'web2' depending on the HTTP URI.
## {{% heading "prerequisites" %}}
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
If you are using an older Kubernetes version, switch to the documentation for that version.
### Create a Minikube cluster
@ -37,49 +35,60 @@ Locally
1. To enable the NGINX Ingress controller, run the following command:
```shell
minikube addons enable ingress
```
1. Verify that the NGINX Ingress controller is running
{{< tabs name="tab_with_md" >}}
{{% tab name="minikube v1.19 or later" %}}
```shell
kubectl get pods -n ingress-nginx
```
{{< note >}}
It can take up to a minute before you see these pods running OK.
{{< /note >}}
The output is similar to:
```none
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-g9g49 0/1 Completed 0 11m
ingress-nginx-admission-patch-rqp78 0/1 Completed 1 11m
ingress-nginx-controller-59b45fb494-26npt 1/1 Running 0 11m
```
{{% /tab %}}
{{% tab name="minikube v1.18.1 or earlier" %}}
```shell
kubectl get pods -n kube-system
```
{{< note >}}
It can take up to a minute before you see these pods running OK.
{{< /note >}}
The output is similar to:
```none
NAME READY STATUS RESTARTS AGE
default-http-backend-59868b7dd6-xb8tq 1/1 Running 0 1m
kube-addon-manager-minikube 1/1 Running 0 3m
kube-dns-6dcb57bcc8-n4xd4 3/3 Running 0 2m
kubernetes-dashboard-5498ccf677-b8p5h 1/1 Running 0 2m
nginx-ingress-controller-5984b97644-rnkrg 1/1 Running 0 1m
storage-provisioner 1/1 Running 0 2m
```
Make sure that you see a Pod with a name that starts with `nginx-ingress-controller-`.
{{% /tab %}}
{{< /tabs >}}
## Deploy a hello, world app
@ -92,7 +101,7 @@ storage-provisioner 1/1 Running 0 2m
The output should be:
```none
deployment.apps/web created
```
@ -104,19 +113,19 @@ storage-provisioner 1/1 Running 0 2m
The output should be:
```none
service/web exposed
```
1. Verify the Service is created and is available on a node port:
```shell
kubectl get service web
```
The output is similar to:
```none
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
web NodePort 10.104.133.249 <none> 8080:31637/TCP 12m
```
@ -129,26 +138,31 @@ storage-provisioner 1/1 Running 0 2m
The output is similar to:
```none
http://172.17.0.15:31637
```
{{< note >}}
Katacoda environment only: at the top of the terminal panel, click the plus sign,
and then click **Select port to view on Host 1**. Enter the NodePort value,
in this case `31637`, and then click **Display Port**.
{{< /note >}}
The output is similar to:
```none
Hello, world!
Version: 1.0.0
Hostname: web-55b8c6998d-8k564
```
You can now access the sample application via the Minikube IP address and NodePort.
The next step lets you access the application using the Ingress resource.
## Create an Ingress
The following manifest defines an Ingress that sends traffic to your Service via
`hello-world.info`.
1. Create `example-ingress.yaml` from the following file:
@ -162,7 +176,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
The output should be:
```
```none
ingress.networking.k8s.io/example-ingress created
```
@ -172,11 +186,13 @@ The following manifest defines an Ingress that sends traffic to your Service via
kubectl get ingress
```
{{< note >}}This can take a couple of minutes.{{< /note >}}
{{< note >}}
This can take a couple of minutes.
{{< /note >}}
You should see an IPv4 address in the ADDRESS column; for example:
You should see an IPv4 address in the `ADDRESS` column; for example:
```
```none
NAME CLASS HOSTS ADDRESS PORTS AGE
example-ingress <none> hello-world.info 172.17.0.15 80 38s
```
@ -184,30 +200,35 @@ The following manifest defines an Ingress that sends traffic to your Service via
1. Add the following line to the bottom of the `/etc/hosts` file on
your computer (you will need administrator access):
```
```none
172.17.0.15 hello-world.info
```
{{< note >}}If you are running Minikube locally, use `minikube ip` to get the external IP. The IP address displayed within the ingress list will be the internal IP.{{< /note >}}
{{< note >}}
If you are running Minikube locally, use `minikube ip` to get the external IP.
The IP address displayed within the ingress list will be the internal IP.
{{< /note >}}
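On Linux or macOS, one way to add that entry is the following sketch, which assumes Minikube is running locally and that you have `sudo` access:

```shell
# Append the Minikube IP and the test hostname to /etc/hosts
echo "$(minikube ip) hello-world.info" | sudo tee -a /etc/hosts
```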
After you make this change, your web browser sends requests for
hello-world.info URLs to Minikube.
After you make this change, your web browser sends requests for
`hello-world.info` URLs to Minikube.
1. Verify that the Ingress controller is directing traffic:
```shell
curl hello-world.info
```
```shell
curl hello-world.info
```
You should see:
You should see:
```
Hello, world!
Version: 1.0.0
Hostname: web-55b8c6998d-8k564
```
```none
Hello, world!
Version: 1.0.0
Hostname: web-55b8c6998d-8k564
```
{{< note >}}If you are running Minikube locally, you can visit hello-world.info from your browser.{{< /note >}}
{{< note >}}
If you are running Minikube locally, you can visit `hello-world.info` from your browser.
{{< /note >}}
## Create a second Deployment
@ -216,9 +237,10 @@ The following manifest defines an Ingress that sends traffic to your Service via
```shell
kubectl create deployment web2 --image=gcr.io/google-samples/hello-app:2.0
```
The output should be:
```
```none
deployment.apps/web2 created
```
@ -230,7 +252,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
The output should be:
```
```none
service/web2 exposed
```
@ -240,13 +262,13 @@ The following manifest defines an Ingress that sends traffic to your Service via
following lines at the end:
```yaml
- path: /v2
pathType: Prefix
backend:
service:
name: web2
port:
number: 8080
- path: /v2
pathType: Prefix
backend:
service:
name: web2
port:
number: 8080
```
1. Apply the changes:
@ -257,7 +279,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
You should see:
```
```none
ingress.networking/example-ingress configured
```
@ -271,7 +293,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
The output is similar to:
```
```none
Hello, world!
Version: 1.0.0
Hostname: web-55b8c6998d-8k564
@ -285,16 +307,16 @@ The following manifest defines an Ingress that sends traffic to your Service via
The output is similar to:
```
```none
Hello, world!
Version: 2.0.0
Hostname: web2-75cd47646f-t8cjk
```
{{< note >}}If you are running Minikube locally, you can visit hello-world.info and hello-world.info/v2 from your browser.{{< /note >}}
{{< note >}}
If you are running Minikube locally, you can visit `hello-world.info` and
`hello-world.info/v2` from your browser.
{{< /note >}}
## {{% heading "whatsnext" %}}
View File
@ -10,26 +10,15 @@ This page shows how to create a Kubernetes Service object that external
clients can use to access an application running in a cluster. The Service
provides load balancing for an application that has two running instances.
## {{% heading "prerequisites" %}}
{{< include "task-tutorial-prereqs.md" >}}
## {{% heading "objectives" %}}
* Run two instances of a Hello World application.
* Create a Service object that exposes a node port.
* Use the Service object to access the running application.
- Run two instances of a Hello World application.
- Create a Service object that exposes a node port.
- Use the Service object to access the running application.
<!-- lessoncontent -->
@ -41,9 +30,11 @@ Here is the configuration file for the application Deployment:
1. Run a Hello World application in your cluster:
Create the application Deployment using the file above:
```shell
kubectl apply -f https://k8s.io/examples/service/access/hello-application.yaml
```
The preceding command creates a
{{< glossary_tooltip text="Deployment" term_id="deployment" >}}
and an associated
@ -52,30 +43,35 @@ Here is the configuration file for the application Deployment:
{{< glossary_tooltip text="Pods" term_id="pod" >}}
each of which runs the Hello World application.
1. Display information about the Deployment:
```shell
kubectl get deployments hello-world
kubectl describe deployments hello-world
```
1. Display information about your ReplicaSet objects:
```shell
kubectl get replicasets
kubectl describe replicasets
```
1. Create a Service object that exposes the deployment:
```shell
kubectl expose deployment hello-world --type=NodePort --name=example-service
```
1. Display information about the Service:
```shell
kubectl describe services example-service
```
The output is similar to this:
```shell
```none
Name: example-service
Namespace: default
Labels: run=load-balancer-example
@ -90,19 +86,24 @@ Here is the configuration file for the application Deployment:
Session Affinity: None
Events: <none>
```
Make a note of the NodePort value for the service. For example,
in the preceding output, the NodePort value is 31496.
1. List the pods that are running the Hello World application:
```shell
kubectl get pods --selector="run=load-balancer-example" --output=wide
```
The output is similar to this:
```shell
```none
NAME READY STATUS ... IP NODE
hello-world-2895499144-bsbk5 1/1 Running ... 10.200.1.4 worker1
hello-world-2895499144-m1pwt 1/1 Running ... 10.200.2.5 worker2
```
1. Get the public IP address of one of your nodes that is running
a Hello World pod. How you get this address depends on how you set
up your cluster. For example, if you are using Minikube, you can
@ -117,13 +118,16 @@ Here is the configuration file for the application Deployment:
cloud providers offer different ways of configuring firewall rules.
1. Use the node address and node port to access the Hello World application:
```shell
curl http://<public-node-ip>:<node-port>
```
where `<public-node-ip>` is the public IP address of your node,
and `<node-port>` is the NodePort value for your service. The
response to a successful request is a hello message:
```shell
```none
Hello Kubernetes!
```
@ -133,12 +137,8 @@ As an alternative to using `kubectl expose`, you can use a
[service configuration file](/docs/concepts/services-networking/service/)
to create a Service.
## {{% heading "cleanup" %}}
To delete the Service, enter this command:
kubectl delete services example-service
@ -148,9 +148,6 @@ the Hello World application, enter this command:
kubectl delete deployment hello-world
## {{% heading "whatsnext" %}}
Follow the
View File
@ -18,7 +18,7 @@ manually through [`easyrsa`](https://github.com/OpenVPN/easy-rsa), [`openssl`](h
1. Download, unpack, and initialize the patched version of `easyrsa3`.
```shell
curl -LO https://storage.googleapis.com/kubernetes-release/easy-rsa/easy-rsa.tar.gz
curl -LO https://dl.k8s.io/easy-rsa/easy-rsa.tar.gz
tar xzf easy-rsa.tar.gz
cd easy-rsa-master/easyrsa3
./easyrsa init-pki
View File
@ -66,7 +66,7 @@ resources:
```
Each `resources` array item is a separate config and contains a complete configuration. The
`resources.resources` field is an array of Kubernetes resource names (`resource` or `resource.group`
`resources.resources` field is an array of Kubernetes resource names (`resource` or `resource.group`)
that should be encrypted, such as Secrets, ConfigMaps, or other resources.
If custom resources are added to `EncryptionConfiguration` and the cluster version is 1.26 or newer,
@ -103,6 +103,7 @@ Name | Encryption | Strength | Speed | Key Length | Other Considerations
`aesgcm` | AES-GCM with random nonce | Must be rotated every 200k writes | Fastest | 16, 24, or 32-byte | Is not recommended for use except when an automated key rotation scheme is implemented.
`aescbc` | AES-CBC with [PKCS#7](https://datatracker.ietf.org/doc/html/rfc2315) padding | Weak | Fast | 32-byte | Not recommended due to CBC's vulnerability to padding oracle attacks.
`kms` | Uses envelope encryption scheme: Data is encrypted by data encryption keys (DEKs) using AES-CBC with [PKCS#7](https://datatracker.ietf.org/doc/html/rfc2315) padding (prior to v1.25), using AES-GCM starting from v1.25, DEKs are encrypted by key encryption keys (KEKs) according to configuration in Key Management Service (KMS) | Strongest | Fast | 32-bytes | The recommended choice for using a third party tool for key management. Simplifies key rotation, with a new DEK generated for each encryption, and KEK rotation controlled by the user. [Configure the KMS provider](/docs/tasks/administer-cluster/kms-provider/).
{{< /table >}}
Each provider supports multiple keys - the keys are tried in order for decryption, and if the provider
is the first provider, the first key is used for encryption.
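As a minimal sketch of that ordering (the file name, resource list, and provider choice are illustrative; see the table above for the trade-offs of each provider, and pass the resulting file to the API server via `--encryption-provider-config`):

```shell
# Write a local EncryptionConfiguration with two keys for a single provider.
# When this provider is listed first, key1 encrypts new writes; both keys are
# tried, in order, when decrypting existing data.
cat <<EOF > ./encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: $(head -c 32 /dev/urandom | base64)
            - name: key2
              secret: $(head -c 32 /dev/urandom | base64)
      - identity: {}
EOF
```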
View File
@ -8,7 +8,7 @@ weight: 50
<!-- overview -->
The `dockershim` component of Kubernetes allows to use Docker as a Kubernetes's
The `dockershim` component of Kubernetes allows the use of Docker as a Kubernetes
{{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}.
Kubernetes' built-in `dockershim` component was removed in release v1.24.
@ -40,11 +40,11 @@ dependency on Docker:
1. Third-party tools that perform above mentioned privileged operations. See
[Migrating telemetry and security agents from dockershim](/docs/tasks/administer-cluster/migrating-from-dockershim/migrating-telemetry-and-security-agents)
for more information.
1. Make sure there is no indirect dependencies on dockershim behavior.
1. Make sure there are no indirect dependencies on dockershim behavior.
This is an edge case and unlikely to affect your application. Some tooling may be configured
to react to Docker-specific behaviors, for example, raise alert on specific metrics or search for
a specific log message as part of troubleshooting instructions.
If you have such tooling configured, test the behavior on test
If you have such tooling configured, test the behavior on a test
cluster before migration.
## Dependency on Docker explained {#role-of-dockershim}
@ -74,7 +74,7 @@ before to check on these containers is no longer available.
You cannot get container information using `docker ps` or `docker inspect`
commands. As you cannot list containers, you cannot get logs, stop containers,
or execute something inside container using `docker exec`.
or execute something inside a container using `docker exec`.
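If your nodes use a CRI runtime such as containerd, `crictl` offers rough equivalents for those workflows. A sketch (the container ID is a placeholder, and `crictl` must be configured for your runtime endpoint):

```shell
crictl ps                            # list running containers (similar to: docker ps)
crictl inspect <container-id>        # inspect a container (similar to: docker inspect)
crictl logs <container-id>           # fetch container logs (similar to: docker logs)
crictl exec -it <container-id> sh    # run a command in a container (similar to: docker exec)
```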
{{< note >}}
View File
@ -161,7 +161,7 @@ kubectl config set-context prod --namespace=production \
--user=lithe-cocoa-92103_kubernetes
```
By default, the above commands adds two contexts that are saved into file
By default, the above commands add two contexts that are saved into file
`.kube/config`. You can now view the contexts and alternate against the two
new request contexts depending on which namespace you wish to work against.
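For example, you could list the contexts and switch between them like this (a sketch that reuses the `prod` context created above):

```shell
kubectl config get-contexts                     # list the available contexts
kubectl config use-context prod                 # switch to the production namespace context
kubectl config view --minify | grep namespace   # confirm the namespace of the active context
```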
View File
@ -43,7 +43,7 @@ Decide whether you want to deploy a [cloud](#creating-a-calico-cluster-with-goog
## Creating a local Calico cluster with kubeadm
To get a local single-host Calico cluster in fifteen minutes using kubeadm, refer to the
[Calico Quickstart](https://docs.projectcalico.org/latest/getting-started/kubernetes/).
[Calico Quickstart](https://projectcalico.docs.tigera.io/getting-started/kubernetes/).
View File
@ -10,7 +10,7 @@ weight: 30
<!-- overview -->
This page shows how to use Cilium for NetworkPolicy.
For background on Cilium, read the [Introduction to Cilium](https://docs.cilium.io/en/stable/intro).
For background on Cilium, read the [Introduction to Cilium](https://docs.cilium.io/en/stable/overview/intro).
## {{% heading "prerequisites" %}}
@ -82,7 +82,7 @@ policies using an example application.
## Deploying Cilium for Production Use
For detailed instructions around deploying Cilium for production, see:
[Cilium Kubernetes Installation Guide](https://docs.cilium.io/en/stable/concepts/kubernetes/intro/)
[Cilium Kubernetes Installation Guide](https://docs.cilium.io/en/stable/network/kubernetes/concepts/)
This documentation includes detailed requirements, instructions and example
production DaemonSet files.
View File
@ -53,9 +53,9 @@ To get a list of all parameters, you can run
sudo sysctl -a
```
## Enabling Unsafe Sysctls
## Safe and Unsafe Sysctls
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
Kubernetes classes sysctls as either _safe_ or _unsafe_. In addition to proper
namespacing, a _safe_ sysctl must be properly _isolated_ between pods on the
same node. This means that setting a _safe_ sysctl for one pod
@ -80,6 +80,8 @@ The example `net.ipv4.tcp_syncookies` is not namespaced on Linux kernel version
This list will be extended in future Kubernetes versions when the kubelet
supports better isolation mechanisms.
### Enabling Unsafe Sysctls
All _safe_ sysctls are enabled by default.
All _unsafe_ sysctls are disabled by default and must be allowed manually by the
View File
@ -1,6 +1,6 @@
---
title: "Managing Secrets"
weight: 28
weight: 60
description: Managing confidential settings data using Secrets.
---
View File
@ -1,6 +1,6 @@
---
title: "Configure Pods and Containers"
description: Perform common configuration tasks for Pods and containers.
weight: 20
weight: 30
---
View File
@ -2,7 +2,7 @@
title: Assign Pods to Nodes using Node Affinity
min-kubernetes-server-version: v1.10
content_type: task
weight: 120
weight: 160
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Assign Pods to Nodes
content_type: task
weight: 120
weight: 150
---
<!-- overview -->
View File
@ -1,7 +1,6 @@
---
title: Attach Handlers to Container Lifecycle Events
content_type: task
weight: 140
weight: 180
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Configure GMSA for Windows Pods and containers
content_type: task
weight: 20
weight: 30
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Configure Liveness, Readiness and Startup Probes
content_type: task
weight: 110
weight: 140
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Configure a Pod to Use a PersistentVolume for Storage
content_type: task
weight: 60
weight: 90
---
<!-- overview -->
@ -12,27 +12,24 @@ for storage.
Here is a summary of the process:
1. You, as cluster administrator, create a PersistentVolume backed by physical
storage. You do not associate the volume with any Pod.
storage. You do not associate the volume with any Pod.
1. You, now taking the role of a developer / cluster user, create a
PersistentVolumeClaim that is automatically bound to a suitable
PersistentVolume.
PersistentVolumeClaim that is automatically bound to a suitable
PersistentVolume.
1. You create a Pod that uses the above PersistentVolumeClaim for storage.
## {{% heading "prerequisites" %}}
* You need to have a Kubernetes cluster that has only one Node, and the
{{< glossary_tooltip text="kubectl" term_id="kubectl" >}}
command-line tool must be configured to communicate with your cluster. If you
do not already have a single-node cluster, you can create one by using
[Minikube](https://minikube.sigs.k8s.io/docs/).
{{< glossary_tooltip text="kubectl" term_id="kubectl" >}}
command-line tool must be configured to communicate with your cluster. If you
do not already have a single-node cluster, you can create one by using
[Minikube](https://minikube.sigs.k8s.io/docs/).
* Familiarize yourself with the material in
[Persistent Volumes](/docs/concepts/storage/persistent-volumes/).
[Persistent Volumes](/docs/concepts/storage/persistent-volumes/).
<!-- steps -->
@ -50,7 +47,6 @@ In your shell on that Node, create a `/mnt/data` directory:
sudo mkdir /mnt/data
```
In the `/mnt/data` directory, create an `index.html` file:
```shell
@ -71,6 +67,7 @@ cat /mnt/data/index.html
```
The output should be:
```
Hello from Kubernetes storage
```
@ -116,8 +113,10 @@ kubectl get pv task-pv-volume
The output shows that the PersistentVolume has a `STATUS` of `Available`. This
means it has not yet been bound to a PersistentVolumeClaim.
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE
task-pv-volume 10Gi RWO Retain Available manual 4s
```
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE
task-pv-volume 10Gi RWO Retain Available manual 4s
```
## Create a PersistentVolumeClaim
@ -132,7 +131,9 @@ Here is the configuration file for the PersistentVolumeClaim:
Create the PersistentVolumeClaim:
kubectl apply -f https://k8s.io/examples/pods/storage/pv-claim.yaml
```shell
kubectl apply -f https://k8s.io/examples/pods/storage/pv-claim.yaml
```
After you create the PersistentVolumeClaim, the Kubernetes control plane looks
for a PersistentVolume that satisfies the claim's requirements. If the control
@ -147,8 +148,10 @@ kubectl get pv task-pv-volume
Now the output shows a `STATUS` of `Bound`.
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE
task-pv-volume 10Gi RWO Retain Bound default/task-pv-claim manual 2m
```
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE
task-pv-volume 10Gi RWO Retain Bound default/task-pv-claim manual 2m
```
Look at the PersistentVolumeClaim:
@ -159,8 +162,10 @@ kubectl get pvc task-pv-claim
The output shows that the PersistentVolumeClaim is bound to your PersistentVolume,
`task-pv-volume`.
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
task-pv-claim Bound task-pv-volume 10Gi RWO manual 30s
```
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
task-pv-claim Bound task-pv-volume 10Gi RWO manual 30s
```
## Create a Pod
@ -206,15 +211,16 @@ curl http://localhost/
The output shows the text that you wrote to the `index.html` file on the
hostPath volume:
Hello from Kubernetes storage
```
Hello from Kubernetes storage
```
If you see that message, you have successfully configured a Pod to
use storage from a PersistentVolumeClaim.
## Clean up
Delete the Pod, the PersistentVolumeClaim and the PersistentVolume:
Delete the Pod, the PersistentVolumeClaim and the PersistentVolume:
```shell
kubectl delete pod task-pv-pod
@ -242,8 +248,8 @@ You can now close the shell to your Node.
You can perform 2 volume mounts on your nginx container:
`/usr/share/nginx/html` for the static website
`/etc/nginx/nginx.conf` for the default config
- `/usr/share/nginx/html` for the static website
- `/etc/nginx/nginx.conf` for the default config
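As a sketch of what that could look like (the Pod name is illustrative, it reuses the `task-pv-claim` from this task, and it assumes the volume already contains an `nginx.conf` file, since mounting a single file relies on `subPath`):

```shell
# Illustrative only: one PersistentVolumeClaim mounted twice into the same nginx container
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod-two-mounts
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      volumeMounts:
        - name: task-pv-storage
          mountPath: /usr/share/nginx/html    # static website content
        - name: task-pv-storage
          mountPath: /etc/nginx/nginx.conf    # a single file, mounted via subPath
          subPath: nginx.conf
EOF
```

Delete that Pod with `kubectl delete pod task-pv-pod-two-mounts` when you are done with it.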
<!-- discussion -->
@ -256,6 +262,7 @@ with a GID. Then the GID is automatically added to any Pod that uses the
PersistentVolume.
Use the `pv.beta.kubernetes.io/gid` annotation as follows:
```yaml
apiVersion: v1
kind: PersistentVolume
@ -264,6 +271,7 @@ metadata:
annotations:
pv.beta.kubernetes.io/gid: "1234"
```
When a Pod consumes a PersistentVolume that has a GID annotation, the annotated GID
is applied to all containers in the Pod in the same way that GIDs specified in the
Pod's security context are. Every GID, whether it originates from a PersistentVolume
@ -275,12 +283,8 @@ When a Pod consumes a PersistentVolume, the GIDs associated with the
PersistentVolume are not present on the Pod resource itself.
{{< /note >}}
## {{% heading "whatsnext" %}}
* Learn more about [PersistentVolumes](/docs/concepts/storage/persistent-volumes/).
* Read the [Persistent Storage design document](https://git.k8s.io/design-proposals-archive/storage/persistent-storage.md).
@ -290,7 +294,3 @@ PersistentVolume are not present on the Pod resource itself.
* [PersistentVolumeSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#persistentvolumespec-v1-core)
* [PersistentVolumeClaim](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#persistentvolumeclaim-v1-core)
* [PersistentVolumeClaimSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#persistentvolumeclaimspec-v1-core)
View File
@ -1,7 +1,7 @@
---
title: Configure a Pod to Use a ConfigMap
content_type: task
weight: 150
weight: 190
card:
name: tasks
weight: 50
@ -9,61 +9,92 @@ card:
<!-- overview -->
Many applications rely on configuration which is used during either application initialization or runtime.
Most of the times there is a requirement to adjust values assigned to configuration parameters.
ConfigMaps are the Kubernetes way to inject application pods with configuration data.
ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable. This page provides a series of usage examples demonstrating how to create ConfigMaps and configure Pods using data stored in ConfigMaps.
In most cases, you need to adjust the values assigned to configuration parameters.
ConfigMaps are a Kubernetes mechanism that let you inject configuration data into application
{{< glossary_tooltip text="pods" term_id="pod" >}}.
The ConfigMap concept allows you to decouple configuration artifacts from image content to
keep containerized applications portable. For example, you can download and run the same
{{< glossary_tooltip text="container image" term_id="image" >}} to spin up containers for
the purposes of local development, system test, or running a live end-user workload.
This page provides a series of usage examples demonstrating how to create ConfigMaps and
configure Pods using data stored in ConfigMaps.
## {{% heading "prerequisites" %}}
{{< include "task-tutorial-prereqs.md" >}}
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
You need to have the `wget` tool installed. If you have a different tool
such as `curl`, and you do not have `wget`, you will need to adapt the
step that downloads example data.
<!-- steps -->
## Create a ConfigMap
You can use either `kubectl create configmap` or a ConfigMap generator in `kustomization.yaml` to create a ConfigMap. Note that `kubectl` starts to support `kustomization.yaml` since 1.14.
### Create a ConfigMap Using kubectl create configmap
You can use either `kubectl create configmap` or a ConfigMap generator in `kustomization.yaml`
to create a ConfigMap.
Use the `kubectl create configmap` command to create ConfigMaps from [directories](#create-configmaps-from-directories), [files](#create-configmaps-from-files), or [literal values](#create-configmaps-from-literal-values):
### Create a ConfigMap using `kubectl create configmap`
Use the `kubectl create configmap` command to create ConfigMaps from
[directories](#create-configmaps-from-directories), [files](#create-configmaps-from-files),
or [literal values](#create-configmaps-from-literal-values):
```shell
kubectl create configmap <map-name> <data-source>
```
where \<map-name> is the name you want to assign to the ConfigMap and \<data-source> is the directory, file, or literal value to draw the data from.
where \<map-name> is the name you want to assign to the ConfigMap and \<data-source> is the
directory, file, or literal value to draw the data from.
The name of a ConfigMap object must be a valid
[DNS subdomain name](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names).
When you are creating a ConfigMap based on a file, the key in the \<data-source> defaults to the basename of the file, and the value defaults to the file content.
When you are creating a ConfigMap based on a file, the key in the \<data-source> defaults to
the basename of the file, and the value defaults to the file content.
You can use [`kubectl describe`](/docs/reference/generated/kubectl/kubectl-commands/#describe) or
[`kubectl get`](/docs/reference/generated/kubectl/kubectl-commands/#get) to retrieve information
about a ConfigMap.
#### Create ConfigMaps from directories
#### Create a ConfigMap from a directory {#create-configmaps-from-directories}
You can use `kubectl create configmap` to create a ConfigMap from multiple files in the same directory. When you are creating a ConfigMap based on a directory, kubectl identifies files whose basename is a valid key in the directory and packages each of those files into the new ConfigMap. Any directory entries except regular files are ignored (e.g. subdirectories, symlinks, devices, pipes, etc).
You can use `kubectl create configmap` to create a ConfigMap from multiple files in the same
directory. When you are creating a ConfigMap based on a directory, kubectl identifies files
whose filename is a valid key in the directory and packages each of those files into the new
ConfigMap. Any directory entries except regular files are ignored (for example: subdirectories,
symlinks, devices, pipes, and more).
For example:
{{< note >}}
Each filename being used for ConfigMap creation must consist of only acceptable characters,
which are: letters (`A` to `Z` and `a` to `z`), digits (`0` to `9`), '-', '_', or '.'.
If you use `kubectl create configmap` with a directory where any of the file names contains
an unacceptable character, the `kubectl` command may fail.
The `kubectl` command does not print an error when it encounters an invalid filename.
{{< /note >}}
Create the local directory:
```shell
# Create the local directory
mkdir -p configure-pod-container/configmap/
```
Now, download the sample configuration and create the ConfigMap:
```shell
# Download the sample files into `configure-pod-container/configmap/` directory
wget https://kubernetes.io/examples/configmap/game.properties -O configure-pod-container/configmap/game.properties
wget https://kubernetes.io/examples/configmap/ui.properties -O configure-pod-container/configmap/ui.properties
# Create the configmap
# Create the ConfigMap
kubectl create configmap game-config --from-file=configure-pod-container/configmap/
```
The above command packages each file, in this case, `game.properties` and `ui.properties` in the `configure-pod-container/configmap/` directory into the game-config ConfigMap. You can display details of the ConfigMap using the following command:
The above command packages each file, in this case, `game.properties` and `ui.properties`
in the `configure-pod-container/configmap/` directory into the game-config ConfigMap. You can
display details of the ConfigMap using the following command:
```shell
kubectl describe configmaps game-config
@ -95,7 +126,8 @@ allow.textmode=true
how.nice.to.look=fairlyNice
```
The `game.properties` and `ui.properties` files in the `configure-pod-container/configmap/` directory are represented in the `data` section of the ConfigMap.
The `game.properties` and `ui.properties` files in the `configure-pod-container/configmap/`
directory are represented in the `data` section of the ConfigMap.
```shell
kubectl get configmaps game-config -o yaml
@ -106,7 +138,7 @@ The output is similar to this:
apiVersion: v1
kind: ConfigMap
metadata:
creationTimestamp: 2016-02-18T18:52:05Z
creationTimestamp: 2022-02-18T18:52:05Z
name: game-config
namespace: default
resourceVersion: "516"
@ -129,7 +161,8 @@ data:
#### Create ConfigMaps from files
You can use `kubectl create configmap` to create a ConfigMap from an individual file, or from multiple files.
You can use `kubectl create configmap` to create a ConfigMap from an individual file, or from
multiple files.
For example,
@ -164,7 +197,8 @@ secret.code.allowed=true
secret.code.lives=30
```
You can pass in the `--from-file` argument multiple times to create a ConfigMap from multiple data sources.
You can pass in the `--from-file` argument multiple times to create a ConfigMap from multiple
data sources.
```shell
kubectl create configmap game-config-2 --from-file=configure-pod-container/configmap/game.properties --from-file=configure-pod-container/configmap/ui.properties
@ -203,9 +237,6 @@ allow.textmode=true
how.nice.to.look=fairlyNice
```
When `kubectl` creates a ConfigMap from inputs that are not ASCII or UTF-8, the tool puts these into the `binaryData` field of the ConfigMap, and not in `data`. Both text and binary data sources can be combined in one ConfigMap.
If you want to view the `binaryData` keys (and their values) in a ConfigMap, you can run `kubectl get configmap -o jsonpath='{.binaryData}' <name>`.
Use the option `--from-env-file` to create a ConfigMap from an env-file, for example:
```shell
@ -234,18 +265,18 @@ kubectl create configmap game-config-env-file \
--from-env-file=configure-pod-container/configmap/game-env-file.properties
```
would produce the following ConfigMap:
would produce a ConfigMap. View the ConfigMap:
```shell
kubectl get configmap game-config-env-file -o yaml
```
where the output is similar to this:
The output is similar to:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
creationTimestamp: 2017-12-27T18:36:28Z
creationTimestamp: 2019-12-27T18:36:28Z
name: game-config-env-file
namespace: default
resourceVersion: "809965"
@ -276,7 +307,7 @@ where the output is similar to this:
apiVersion: v1
kind: ConfigMap
metadata:
creationTimestamp: 2017-12-27T18:38:34Z
creationTimestamp: 2019-12-27T18:38:34Z
name: config-multi-env-files
namespace: default
resourceVersion: "810136"
@ -292,13 +323,15 @@ data:
#### Define the key to use when creating a ConfigMap from a file
You can define a key other than the file name to use in the `data` section of your ConfigMap when using the `--from-file` argument:
You can define a key other than the file name to use in the `data` section of your ConfigMap
when using the `--from-file` argument:
```shell
kubectl create configmap game-config-3 --from-file=<my-key-name>=<path-to-file>
```
where `<my-key-name>` is the key you want to use in the ConfigMap and `<path-to-file>` is the location of the data source file you want the key to represent.
where `<my-key-name>` is the key you want to use in the ConfigMap and `<path-to-file>` is the
location of the data source file you want the key to represent.
For example:
@ -316,7 +349,7 @@ where the output is similar to this:
apiVersion: v1
kind: ConfigMap
metadata:
creationTimestamp: 2016-02-18T18:54:22Z
creationTimestamp: 2022-02-18T18:54:22Z
name: game-config-3
namespace: default
resourceVersion: "530"
@ -334,13 +367,15 @@ data:
#### Create ConfigMaps from literal values
You can use `kubectl create configmap` with the `--from-literal` argument to define a literal value from the command line:
You can use `kubectl create configmap` with the `--from-literal` argument to define a literal
value from the command line:
```shell
kubectl create configmap special-config --from-literal=special.how=very --from-literal=special.type=charm
```
You can pass in multiple key-value pairs. Each pair provided on the command line is represented as a separate entry in the `data` section of the ConfigMap.
You can pass in multiple key-value pairs. Each pair provided on the command line is represented
as a separate entry in the `data` section of the ConfigMap.
```shell
kubectl get configmaps special-config -o yaml
@ -351,7 +386,7 @@ The output is similar to this:
apiVersion: v1
kind: ConfigMap
metadata:
creationTimestamp: 2016-02-18T19:14:38Z
creationTimestamp: 2022-02-18T19:14:38Z
name: special-config
namespace: default
resourceVersion: "651"
@ -362,26 +397,33 @@ data:
```
### Create a ConfigMap from generator
`kubectl` supports `kustomization.yaml` since 1.14.
You can also create a ConfigMap from generators and then apply it to create the object on
the Apiserver. The generators
should be specified in a `kustomization.yaml` inside a directory.
You can also create a ConfigMap from generators and then apply it to create the object
in the cluster's API server.
You should specify the generators in a `kustomization.yaml` file within a directory.
#### Generate ConfigMaps from files
For example, to generate a ConfigMap from files `configure-pod-container/configmap/game.properties`
```shell
# Create a kustomization.yaml file with ConfigMapGenerator
cat <<EOF >./kustomization.yaml
configMapGenerator:
- name: game-config-4
labels:
game-config: config-4
files:
- configure-pod-container/configmap/game.properties
EOF
```
Apply the kustomization directory to create the ConfigMap object.
Apply the kustomization directory to create the ConfigMap object:
```shell
kubectl apply -k .
```
```
configmap/game-config-4-m9dm2f92bt created
```
@ -389,14 +431,21 @@ You can check that the ConfigMap was created like this:
```shell
kubectl get configmap
```
```
NAME DATA AGE
game-config-4-m9dm2f92bt 1 37s
```
and also:
```shell
kubectl describe configmaps/game-config-4-m9dm2f92bt
```
```
Name: game-config-4-m9dm2f92bt
Namespace: default
Labels: <none>
Labels: game-config=config-4
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","data":{"game.properties":"enemies=aliens\nlives=3\nenemies.cheat=true\nenemies.cheat.level=noGoodRotten\nsecret.code.p...
@ -414,10 +463,11 @@ secret.code.lives=30
Events: <none>
```
Note that the generated ConfigMap name has a suffix appended by hashing the contents. This ensures that a
new ConfigMap is generated each time the content is modified.
Notice that the generated ConfigMap name has a suffix appended by hashing the contents. This
ensures that a new ConfigMap is generated each time the content is modified.
#### Define the key to use when generating a ConfigMap from a file
You can define a key other than the file name to use in the ConfigMap generator.
For example, to generate a ConfigMap from files `configure-pod-container/configmap/game.properties`
with the key `game-special-key`
@ -427,6 +477,8 @@ with the key `game-special-key`
cat <<EOF >./kustomization.yaml
configMapGenerator:
- name: game-config-5
labels:
game-config: config-5
files:
- game-special-key=configure-pod-container/configmap/game.properties
EOF
@ -435,28 +487,51 @@ EOF
Apply the kustomization directory to create the ConfigMap object.
```shell
kubectl apply -k .
```
```
configmap/game-config-5-m67dt67794 created
```
#### Generate ConfigMaps from Literals
To generate a ConfigMap from literals `special.type=charm` and `special.how=very`,
you can specify the ConfigMap generator in `kustomization.yaml` as
```shell
# Create a kustomization.yaml file with ConfigMapGenerator
cat <<EOF >./kustomization.yaml
#### Generate ConfigMaps from literals
This example shows you how to create a `ConfigMap` from two literal key/value pairs:
`special.type=charm` and `special.how=very`, using Kustomize and kubectl. To achieve
this, you can specify the `ConfigMap` generator. Create (or replace)
`kustomization.yaml` so that it has the following contents:
```yaml
---
# kustomization.yaml contents for creating a ConfigMap from literals
configMapGenerator:
- name: special-config-2
literals:
- special.how=very
- special.type=charm
EOF
```
Apply the kustomization directory to create the ConfigMap object.
Apply the kustomization directory to create the ConfigMap object:
```shell
kubectl apply -k .
```
```
configmap/special-config-2-c92b5mmcf2 created
```
## Interim cleanup
Before proceeding, clean up some of the ConfigMaps you made:
```bash
kubectl delete configmap special-config
kubectl delete configmap env-config
kubectl delete configmap -l 'game-config in (config-4,config-5)'
```
Now that you have learned to define ConfigMaps, you can move on to the next
section, and learn how to use these objects with Pods.
---
## Define container environment variables using ConfigMap data
### Define a container environment variable with data from a single ConfigMap
@ -467,7 +542,8 @@ configmap/special-config-2-c92b5mmcf2 created
kubectl create configmap special-config --from-literal=special.how=very
```
2. Assign the `special.how` value defined in the ConfigMap to the `SPECIAL_LEVEL_KEY` environment variable in the Pod specification.
2. Assign the `special.how` value defined in the ConfigMap to the `SPECIAL_LEVEL_KEY`
environment variable in the Pod specification.
{{< codenew file="pods/pod-single-configmap-env-variable.yaml" >}}
@ -481,11 +557,12 @@ configmap/special-config-2-c92b5mmcf2 created
### Define container environment variables with data from multiple ConfigMaps
* As with the previous example, create the ConfigMaps first.
As with the previous example, create the ConfigMaps first.
Here is the manifest you will use:
{{< codenew file="configmap/configmaps.yaml" >}}
{{< codenew file="configmap/configmaps.yaml" >}}
Create the ConfigMap:
* Create the ConfigMap:
```shell
kubectl create -f https://kubernetes.io/examples/configmap/configmaps.yaml
@ -503,6 +580,11 @@ configmap/special-config-2-c92b5mmcf2 created
Now, the Pod's output includes environment variables `SPECIAL_LEVEL_KEY=very` and `LOG_LEVEL=INFO`.
Once you're happy to move on, delete that Pod:
```shell
kubectl delete pod dapi-test-pod --now
```
## Configure all key-value pairs in a ConfigMap as container environment variables
* Create a ConfigMap containing multiple key-value pairs.
@ -515,7 +597,8 @@ configmap/special-config-2-c92b5mmcf2 created
kubectl create -f https://kubernetes.io/examples/configmap/configmap-multikeys.yaml
```
* Use `envFrom` to define all of the ConfigMap's data as container environment variables. The key from the ConfigMap becomes the environment variable name in the Pod.
* Use `envFrom` to define all of the ConfigMap's data as container environment variables. The
key from the ConfigMap becomes the environment variable name in the Pod.
{{< codenew file="pods/pod-configmap-envFrom.yaml" >}}
@ -524,35 +607,47 @@ configmap/special-config-2-c92b5mmcf2 created
```shell
kubectl create -f https://kubernetes.io/examples/pods/pod-configmap-envFrom.yaml
```
Now, the Pod's output includes environment variables `SPECIAL_LEVEL=very` and
`SPECIAL_TYPE=charm`.
Now, the Pod's output includes environment variables `SPECIAL_LEVEL=very` and `SPECIAL_TYPE=charm`.
Once you're happy to move on, delete that Pod:
```shell
kubectl delete pod dapi-test-pod --now
```
## Use ConfigMap-defined environment variables in Pod commands
You can use ConfigMap-defined environment variables in the `command` and `args` of a container using the `$(VAR_NAME)` Kubernetes substitution syntax.
You can use ConfigMap-defined environment variables in the `command` and `args` of a container
using the `$(VAR_NAME)` Kubernetes substitution syntax.
For example, the following Pod specification
For example, the following Pod manifest:
{{< codenew file="pods/pod-configmap-env-var-valueFrom.yaml" >}}
created by running
Create that Pod, by running:
```shell
kubectl create -f https://kubernetes.io/examples/pods/pod-configmap-env-var-valueFrom.yaml
```
produces the following output in the `test-container` container:
That pod produces the following output from the `test-container` container:
```
very charm
```
Once you're happy to move on, delete that Pod:
```shell
kubectl delete pod dapi-test-pod --now
```
## Add ConfigMap data to a Volume
As explained in [Create ConfigMaps from files](#create-configmaps-from-files), when you create a ConfigMap using ``--from-file``, the filename becomes a key stored in the `data` section of the ConfigMap. The file contents become the key's value.
As explained in [Create ConfigMaps from files](#create-configmaps-from-files), when you create
a ConfigMap using `--from-file`, the filename becomes a key stored in the `data` section of
the ConfigMap. The file contents become the key's value.
The examples in this section refer to a ConfigMap named special-config, shown below.
The examples in this section refer to a ConfigMap named `special-config`:
{{< codenew file="configmap/configmap-multikeys.yaml" >}}
@ -565,8 +660,9 @@ kubectl create -f https://kubernetes.io/examples/configmap/configmap-multikeys.y
### Populate a Volume with data stored in a ConfigMap
Add the ConfigMap name under the `volumes` section of the Pod specification.
This adds the ConfigMap data to the directory specified as `volumeMounts.mountPath` (in this case, `/etc/config`).
The `command` section lists directory files with names that match the keys in ConfigMap.
This adds the ConfigMap data to the directory specified as `volumeMounts.mountPath` (in this
case, `/etc/config`). The `command` section lists directory files with names that match the
keys in the ConfigMap.
{{< codenew file="pods/pod-configmap-volume.yaml" >}}
@ -583,14 +679,20 @@ SPECIAL_LEVEL
SPECIAL_TYPE
```
{{< caution >}}
If there are some files in the `/etc/config/` directory, they will be deleted.
{{< /caution >}}
Text data is exposed as files using the UTF-8 character encoding. To use some other
character encoding, use `binaryData`
(see [ConfigMap object](/docs/concepts/configuration/configmap/#configmap-object) for more details).
{{< note >}}
Text data is exposed as files using the UTF-8 character encoding. To use some other character encoding, use binaryData.
If there are any files in the `/etc/config` directory of that container image, the volume
mount will make those files from the image inaccessible.
{{< /note >}}
Once you're happy to move on, delete that Pod:
```shell
kubectl delete pod dapi-test-pod --now
```
### Add ConfigMap data to a specific path in the Volume
Use the `path` field to specify the desired file path for specific ConfigMap items.
@ -614,24 +716,63 @@ very
Like before, all previous files in the `/etc/config/` directory will be deleted.
{{< /caution >}}
Delete that Pod:
```shell
kubectl delete pod dapi-test-pod --now
```
### Project keys to specific paths and file permissions
You can project keys to specific paths and specific permissions on a per-file
basis. The [Secrets](/docs/concepts/configuration/secret/#using-secrets-as-files-from-a-pod) user guide explains the syntax.
basis. The
[Secrets](/docs/concepts/configuration/secret/#using-secrets-as-files-from-a-pod)
guide explains the syntax.
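As a minimal sketch (assuming the `special-config` ConfigMap from above; the Pod name, target path, and file mode are illustrative):

```shell
# Illustrative only: project one key to a custom path with restricted file permissions
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: configmap-projection-demo
spec:
  restartPolicy: Never
  containers:
    - name: test-container
      image: busybox
      command: ["/bin/sh", "-c", "ls -l /etc/config/keys"]
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
  volumes:
    - name: config-volume
      configMap:
        name: special-config
        items:
          - key: SPECIAL_LEVEL
            path: keys/special-level
            mode: 0400
EOF
```

Once you're happy to move on, delete that Pod with `kubectl delete pod configmap-projection-demo --now`.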
### Optional references
A ConfigMap reference may be marked _optional_. If the ConfigMap is non-existent, the mounted
volume will be empty. If the ConfigMap exists, but the referenced key is non-existent, the path
will be absent beneath the mount point. See [Optional ConfigMaps](#optional-configmaps) for more
details.
### Mounted ConfigMaps are updated automatically
When a mounted ConfigMap is updated, the projected content is eventually updated too.
This applies in the case where an optionally referenced ConfigMap comes into
existence after a pod has started.
Kubelet checks whether the mounted ConfigMap is fresh on every periodic sync. However,
it uses its local TTL-based cache for getting the current value of the ConfigMap. As a
result, the total delay from the moment when the ConfigMap is updated to the moment
when new keys are projected to the pod can be as long as kubelet sync period (1
minute by default) + TTL of ConfigMaps cache (1 minute by default) in kubelet. You
can trigger an immediate refresh by updating one of the pod's annotations.
{{< note >}}
A container using a ConfigMap as a [subPath](/docs/concepts/storage/volumes/#using-subpath)
volume will not receive ConfigMap updates.
{{< /note >}}
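For example, after changing the ConfigMap you could bump an arbitrary annotation on the Pod (the annotation key below is made up and only serves to update the Pod object):

```shell
# Prompt the kubelet to pick up the ConfigMap change sooner than the cache TTL allows
kubectl annotate pod dapi-test-pod cache-refresh="$(date +%s)" --overwrite
```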
<!-- discussion -->
## Understanding ConfigMaps and Pods
The ConfigMap API resource stores configuration data as key-value pairs. The data can be consumed in pods or provide the configurations for system components such as controllers. ConfigMap is similar to [Secrets](/docs/concepts/configuration/secret/), but provides a means of working with strings that don't contain sensitive information. Users and system components alike can store configuration data in ConfigMap.
The ConfigMap API resource stores configuration data as key-value pairs. The data can be consumed
in pods or provide the configurations for system components such as controllers. ConfigMap is
similar to [Secrets](/docs/concepts/configuration/secret/), but provides a means of working
with strings that don't contain sensitive information. Users and system components alike can
store configuration data in ConfigMap.
{{< note >}}
ConfigMaps should reference properties files, not replace them. Think of the ConfigMap as representing something similar to the Linux `/etc` directory and its contents. For example, if you create a [Kubernetes Volume](/docs/concepts/storage/volumes/) from a ConfigMap, each data item in the ConfigMap is represented by an individual file in the volume.
ConfigMaps should reference properties files, not replace them. Think of the ConfigMap as
representing something similar to the Linux `/etc` directory and its contents. For example,
if you create a [Kubernetes Volume](/docs/concepts/storage/volumes/) from a ConfigMap, each
data item in the ConfigMap is represented by an individual file in the volume.
{{< /note >}}
The ConfigMap's `data` field contains the configuration data. As shown in the example below, this can be simple -- like individual properties defined using `--from-literal` -- or complex -- like configuration files or JSON blobs defined using `--from-file`.
The ConfigMap's `data` field contains the configuration data. As shown in the example below,
this can be simple (like individual properties defined using `--from-literal`) or complex
(like configuration files or JSON blobs defined using `--from-file`).
```yaml
apiVersion: v1
@ -651,33 +792,24 @@ data:
property.3=value-3
```
### Restrictions
When `kubectl` creates a ConfigMap from inputs that are not ASCII or UTF-8, the tool puts
these into the `binaryData` field of the ConfigMap, and not in `data`. Both text and binary
data sources can be combined in one ConfigMap.
- You must create the `ConfigMap` object before you reference it in a Pod specification. Alternatively, mark the ConfigMap reference as `optional` in the Pod spec (see [Optional ConfigMaps](#optional-configmaps)). If you reference a ConfigMap that doesn't exist and you don't mark the reference as `optional`, the Pod won't start. Similarly, references to keys that don't exist in the ConfigMap will also prevent the Pod from starting, unless you mark the key references as `optional`.
If you want to view the `binaryData` keys (and their values) in a ConfigMap, you can run
`kubectl get configmap -o jsonpath='{.binaryData}' <name>`.
- If you use `envFrom` to define environment variables from ConfigMaps, keys that are considered invalid will be skipped. The pod will be allowed to start, but the invalid names will be recorded in the event log (`InvalidVariableNames`). The log message lists each skipped key. For example:
Pods can load data from a ConfigMap that uses either `data` or `binaryData`.
```shell
kubectl get events
```
The output is similar to this:
```
LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
0s 0s 1 dapi-test-pod Pod Warning InvalidEnvironmentVariableNames {kubelet, 127.0.0.1} Keys [1badkey, 2alsobad] from the EnvFrom configMap default/myconfig were skipped since they are considered invalid environment variable names.
```
- ConfigMaps reside in a specific {{< glossary_tooltip term_id="namespace" >}}. A ConfigMap can only be referenced by pods residing in the same namespace.
- You can't use ConfigMaps for {{< glossary_tooltip text="static pods" term_id="static-pod" >}}, because the Kubelet does not support this.
### Optional ConfigMaps
## Optional ConfigMaps
You can mark a reference to a ConfigMap as _optional_ in a Pod specification.
If the ConfigMap doesn't exist, the configuration for which it provides data in the Pod (e.g. environment variable, mounted volume) will be empty.
If the ConfigMap doesn't exist, the configuration for which it provides data in the Pod
(for example: environment variable, mounted volume) will be empty.
If the ConfigMap exists, but the referenced key is non-existent the data is also empty.
For example, the following Pod specification marks an environment variable from a ConfigMap as optional:
For example, the following Pod specification marks an environment variable from a ConfigMap
as optional:
```yaml
apiVersion: v1
@ -688,7 +820,7 @@ spec:
containers:
- name: test-container
image: gcr.io/google_containers/busybox
command: [ "/bin/sh", "-c", "env" ]
command: ["/bin/sh", "-c", "env"]
env:
- name: SPECIAL_LEVEL_KEY
valueFrom:
@ -704,8 +836,9 @@ If you run this pod, and there is a ConfigMap named `a-config` but that ConfigMa
a key named `akey`, the output is also empty. If you do set a value for `akey` in the `a-config`
ConfigMap, this pod prints that value and then terminates.
You can also mark the volumes and files provided by a ConfigMap as optional. Kubernetes always creates the mount paths for the volume, even if the referenced ConfigMap or key doesn't exist. For example, the following
Pod specification marks a volume that references a ConfigMap as optional:
You can also mark the volumes and files provided by a ConfigMap as optional. Kubernetes always
creates the mount paths for the volume, even if the referenced ConfigMap or key doesn't exist. For
example, the following Pod specification marks a volume that references a ConfigMap as optional:
```yaml
apiVersion: v1
@ -716,7 +849,7 @@ spec:
containers:
- name: test-container
image: gcr.io/google_containers/busybox
command: [ "/bin/sh", "-c", "ls /etc/config" ]
command: ["/bin/sh", "-c", "ls /etc/config"]
volumeMounts:
- name: config-volume
mountPath: /etc/config
@ -730,17 +863,70 @@ spec:
### Mounted ConfigMaps are updated automatically
When a mounted ConfigMap is updated, the projected content is eventually updated too. This applies in the case where an optionally referenced ConfigMap comes into
existence after a pod has started.
When a mounted ConfigMap is updated, the projected content is eventually updated too.
This applies in the case where an optionally referenced ConfigMap comes into existence after
a pod has started.
The kubelet checks whether the mounted ConfigMap is fresh on every periodic sync. However, it uses its local TTL-based cache for getting the current value of the
ConfigMap. As a result, the total delay from the moment when the ConfigMap is updated to the moment when new keys are projected to the pod can be as long as
kubelet sync period (1 minute by default) + TTL of ConfigMaps cache (1 minute by default) in kubelet.
The kubelet checks whether the mounted ConfigMap is fresh on every periodic sync. However, it
uses its local TTL-based cache for getting the current value of the ConfigMap. As a result,
the total delay from the moment when the ConfigMap is updated to the moment when new keys
are projected to the pod can be as long as kubelet sync period (1 minute by default) + TTL of
ConfigMaps cache (1 minute by default) in kubelet.
{{< note >}}
A container using a ConfigMap as a [subPath](/docs/concepts/storage/volumes/#using-subpath) volume will not receive ConfigMap updates.
A container using a ConfigMap as a [subPath](/docs/concepts/storage/volumes/#using-subpath)
volume will not receive ConfigMap updates.
{{< /note >}}
## Restrictions
- You must create the `ConfigMap` object before you reference it in a Pod
specification. Alternatively, mark the ConfigMap reference as `optional` in the Pod spec (see
[Optional ConfigMaps](#optional-configmaps)). If you reference a ConfigMap that doesn't exist
and you don't mark the reference as `optional`, the Pod won't start. Similarly, references
to keys that don't exist in the ConfigMap will also prevent the Pod from starting, unless
you mark the key references as `optional`.
- If you use `envFrom` to define environment variables from ConfigMaps, keys that are considered
invalid will be skipped. The pod will be allowed to start, but the invalid names will be
recorded in the event log (`InvalidVariableNames`). The log message lists each skipped
key. For example:
```shell
kubectl get events
```
The output is similar to this:
```
LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
0s 0s 1 dapi-test-pod Pod Warning InvalidEnvironmentVariableNames {kubelet, 127.0.0.1} Keys [1badkey, 2alsobad] from the EnvFrom configMap default/myconfig were skipped since they are considered invalid environment variable names.
```
- ConfigMaps reside in a specific {{< glossary_tooltip term_id="namespace" >}}.
Pods can only refer to ConfigMaps that are in the same namespace as the Pod.
- You can't use ConfigMaps for
{{< glossary_tooltip text="static pods" term_id="static-pod" >}}, because the
kubelet does not support this.
## {{% heading "cleanup" %}}
Delete the ConfigMaps and Pods that you made:
```bash
kubectl delete configmaps/game-config configmaps/game-config-2 configmaps/game-config-3 \
configmaps/game-config-env-file
kubectl delete pod dapi-test-pod --now
# You might already have removed the next set
kubectl delete configmaps/special-config configmaps/env-config
kubectl delete configmap -l 'game-config in (config-4,config-5)'
```
If you created a directory `configure-pod-container` and no longer need it, you should remove that too,
or move it into the trash can / deleted files location.
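For example, on Linux or macOS:

```shell
# Remove the working directory and the generated kustomization.yaml, if you still have them
rm -r configure-pod-container
rm -f kustomization.yaml
```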
## {{% heading "whatsnext" %}}
* Follow a real world example of [Configuring Redis using a ConfigMap](/docs/tutorials/configuration/configure-redis-using-configmap/).
* Follow a real world example of
[Configuring Redis using a ConfigMap](/docs/tutorials/configuration/configure-redis-using-configmap/).
View File
@ -1,22 +1,18 @@
---
title: Configure Pod Initialization
content_type: task
weight: 130
weight: 170
---
<!-- overview -->
This page shows how to use an Init Container to initialize a Pod before an
application Container runs.
## {{% heading "prerequisites" %}}
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
<!-- steps -->
## Create a Pod that has an Init Container
@ -37,55 +33,63 @@ shared Volume at `/work-dir`, and the application container mounts the shared
Volume at `/usr/share/nginx/html`. The init container runs the following command
and then terminates:
wget -O /work-dir/index.html http://info.cern.ch
```shell
wget -O /work-dir/index.html http://info.cern.ch
```
Notice that the init container writes the `index.html` file in the root directory
of the nginx server.
Create the Pod:
kubectl apply -f https://k8s.io/examples/pods/init-containers.yaml
```shell
kubectl apply -f https://k8s.io/examples/pods/init-containers.yaml
```
Verify that the nginx container is running:
kubectl get pod init-demo
```shell
kubectl get pod init-demo
```
The output shows that the nginx container is running:
NAME READY STATUS RESTARTS AGE
init-demo 1/1 Running 0 1m
```
NAME READY STATUS RESTARTS AGE
init-demo 1/1 Running 0 1m
```
Get a shell into the nginx container running in the init-demo Pod:
kubectl exec -it init-demo -- /bin/bash
```shell
kubectl exec -it init-demo -- /bin/bash
```
In your shell, send a GET request to the nginx server:
root@nginx:~# apt-get update
root@nginx:~# apt-get install curl
root@nginx:~# curl localhost
```
root@nginx:~# apt-get update
root@nginx:~# apt-get install curl
root@nginx:~# curl localhost
```
The output shows that nginx is serving the web page that was written by the init container:
<html><head></head><body><header>
<title>http://info.cern.ch</title>
</header>
<h1>http://info.cern.ch - home of the first website</h1>
...
<li><a href="http://info.cern.ch/hypertext/WWW/TheProject.html">Browse the first website</a></li>
...
```html
<html><head></head><body><header>
<title>http://info.cern.ch</title>
</header>
<h1>http://info.cern.ch - home of the first website</h1>
...
<li><a href="http://info.cern.ch/hypertext/WWW/TheProject.html">Browse the first website</a></li>
...
```
## {{% heading "whatsnext" %}}
* Learn more about
[communicating between Containers running in the same Pod](/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/).
[communicating between Containers running in the same Pod](/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/).
* Learn more about [Init Containers](/docs/concepts/workloads/pods/init-containers/).
* Learn more about [Volumes](/docs/concepts/storage/volumes/).
* Learn more about [Debugging Init Containers](/docs/tasks/debug/debug-application/debug-init-containers/)
View File
@ -4,7 +4,7 @@ reviewers:
- pmorie
title: Configure a Pod to Use a Projected Volume for Storage
content_type: task
weight: 70
weight: 100
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Configure RunAsUserName for Windows pods and containers
content_type: task
weight: 20
weight: 40
---
<!-- overview -->
View File
@ -5,7 +5,7 @@ reviewers:
- thockin
title: Configure Service Accounts for Pods
content_type: task
weight: 90
weight: 120
---
Kubernetes offers two distinct ways for clients that run within your
View File
@ -1,7 +1,7 @@
---
title: Configure a Pod to Use a Volume for Storage
content_type: task
weight: 50
weight: 80
---
<!-- overview -->
View File
@ -1,7 +1,7 @@
---
title: Create a Windows HostProcess Pod
content_type: task
weight: 20
weight: 50
min-kubernetes-server-version: 1.23
---
View File
@ -4,6 +4,7 @@ reviewers:
- tallclair
- liggitt
content_type: task
weight: 240
---
Kubernetes provides a built-in [admission controller](/docs/reference/access-authn-authz/admission-controllers/#podsecurity)
@ -13,9 +14,11 @@ You can configure this admission controller to set cluster-wide defaults and [ex
## {{% heading "prerequisites" %}}
Following an alpha release in Kubernetes v1.22,
Pod Security Admission becaome available by default in Kubernetes v1.23, as
Pod Security Admission became available by default in Kubernetes v1.23, as
a beta. From version 1.25 onwards, Pod Security Admission is generally
available. {{% version-check %}}
available.
{{% version-check %}}
If you are not running Kubernetes {{< skew currentVersion >}}, you can switch
to viewing this page in the documentation for the Kubernetes version that you
@ -29,7 +32,6 @@ For v1.23 and v1.24, use [v1beta1](https://v1-24.docs.kubernetes.io/docs/tasks/c
For v1.22, use [v1alpha1](https://v1-22.docs.kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-admission-controller/).
{{< /note >}}
```yaml
apiVersion: apiserver.config.k8s.io/v1 # see compatibility note
kind: AdmissionConfiguration
@ -63,6 +65,7 @@ plugins:
# Array of namespaces to exempt.
namespaces: []
```
{{< note >}}
The above manifest needs to be specified via the `--admission-control-config-file` to kube-apiserver.
{{< /note >}}
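For example, on a kubeadm-style control plane you could check that the API server was started with that flag (a sketch; the static Pod manifest path shown is the kubeadm default, and the configuration file itself must also be readable from inside the kube-apiserver Pod):

```shell
# Look for the admission configuration flag in the kube-apiserver static Pod manifest
grep -- '--admission-control-config-file' /etc/kubernetes/manifests/kube-apiserver.yaml
```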
View File
@ -4,6 +4,7 @@ reviewers:
- tallclair
- liggitt
content_type: task
weight: 250
---
Namespaces can be labeled to enforce the [Pod Security Standards](/docs/concepts/security/pod-security-standards). The three policies
View File
@ -1,7 +1,7 @@
---
title: Assign Extended Resources to a Container
content_type: task
weight: 40
weight: 70
---
<!-- overview -->
Some files were not shown because too many files have changed in this diff.