Merge branch 'kubernetes:main' into main

This commit is contained in:
Qassie 2023-08-25 08:32:08 +07:00 committed by GitHub
commit d3b76f87fe
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
397 changed files with 68118 additions and 30360 deletions

View File

@ -37,11 +37,11 @@ aliases:
- nate-double-u
- onlydole
- reylejano
- Rishit-dagli # 1.28 Release Team Docs Lead
- sftim
- tengqm
sig-docs-en-reviews: # PR reviews for English content
- bradtopol
- dipesh-rawat
- divya-mohan0209
- kbhawkey
- mehabhalodiya

View File

@ -3,11 +3,11 @@
[![Build Status](https://api.travis-ci.org/kubernetes/website.svg?branch=master)](https://travis-ci.org/kubernetes/website)
[![GitHub release](https://img.shields.io/github/release/kubernetes/website.svg)](https://github.com/kubernetes/website/releases/latest)
Welcome! This repository contains all the assets required to build the [Kubernetes website and documentation](https://kubernetes.io/). We're very glad that you want to contribute!
## Contributing to the docs
Click the **Fork** button in the upper-right area of the screen to create a copy of this repository in your GitHub account. This copy is called a *fork*. After making your changes in your fork, when you are ready to send them to us, go to your fork and create a new pull request to let us know about it.
Once your pull request is created, a Kubernetes reviewer will take responsibility for providing clear, actionable feedback. As the owner of the pull request, **it is your responsibility to modify your pull request to address the feedback provided by the Kubernetes reviewer.**

View File

@ -120,13 +120,13 @@ Open up your browser to <http://localhost:1313> to view the website. As you make
<!--
## Running the website locally using Hugo
Make sure to install the Hugo extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L10) file.
Make sure to install the Hugo extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L11) file.
To build and test the site locally, run:
-->
## Running the website locally using Hugo
Make sure to install the Hugo Extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L10) file.
Make sure to install the Hugo Extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L11) file.
To build and test the site locally, run:

File diff suppressed because it is too large

View File

@ -88,6 +88,7 @@
- fields:
- nominatedNodeName
- hostIP
- hostIPs
- startTime
- phase
- message
@ -99,6 +100,7 @@
- initContainerStatuses
- containerStatuses
- ephemeralContainerStatuses
- resourceClaimStatuses
- resize
- definition: io.k8s.api.core.v1.Container
@ -137,6 +139,7 @@
- livenessProbe
- readinessProbe
- startupProbe
- restartPolicy
- name: Security Context
fields:
- securityContext
@ -228,6 +231,7 @@
fields:
- terminationMessagePath
- terminationMessagePolicy
- restartPolicy
- name: Debugging
fields:
- stdin
@ -393,9 +397,14 @@
fields:
- selector
- manualSelector
- name: Alpha level
- name: Beta level
fields:
- podFailurePolicy
- name: Alpha level
fields:
- backoffLimitPerIndex
- maxFailedIndexes
- podReplacementPolicy
- definition: io.k8s.api.batch.v1.JobStatus
field_categories:
@ -411,6 +420,10 @@
- name: Beta level
fields:
- ready
- name: Alpha level
fields:
- failedIndexes
- terminating
- definition: io.k8s.api.batch.v1.CronJobSpec
field_categories:

View File

@ -153,7 +153,7 @@ parts:
version: v1alpha1
- name: SelfSubjectReview
group: authentication.k8s.io
version: v1beta1
version: v1
- name: Authorization Resources
chapters:
- name: LocalSubjectAccessReview
@ -168,9 +168,6 @@ parts:
- name: SubjectAccessReview
group: authorization.k8s.io
version: v1
- name: SelfSubjectReview
group: authentication.k8s.io
version: v1alpha1
- name: ClusterRole
group: rbac.authorization.k8s.io
version: v1
@ -218,7 +215,7 @@ parts:
version: v1
- name: ValidatingAdmissionPolicy
group: admissionregistration.k8s.io
version: v1alpha1
version: v1beta1
otherDefinitions:
- ValidatingAdmissionPolicyList
- ValidatingAdmissionPolicyBinding

View File

@ -673,18 +673,19 @@ section#cncf {
width: 100%;
overflow: hidden;
clear: both;
display: flex;
justify-content: space-evenly;
flex-wrap: wrap;
h4 {
line-height: normal;
margin-bottom: 15px;
}
& > div:first-child {
float: left;
}
& > div:last-child {
float: right;
& > div {
background-color: #daeaf9;
border-radius: 20px;
padding: 25px;
}
}

View File

@ -354,12 +354,48 @@ main {
word-break: break-word;
}
/* SCSS Related to the Metrics Table */
@media (max-width: 767px) { // for mobile devices, Display the names, Stability levels & types
table.metrics {
th:nth-child(n + 4),
td:nth-child(n + 4) {
display: none;
}
td.metric_type{
min-width: 7em;
}
td.metric_stability_level{
min-width: 6em;
}
}
}
table.metrics tbody{ // Tested dimensions to improve overall aesthetic of the table
tr {
td {
font-size: smaller;
}
td.metric_labels_varying{
min-width: 9em;
}
td.metric_type{
min-width: 9em;
}
td.metric_description{
min-width: 10em;
}
}
}
table.no-word-break td,
table.no-word-break code {
word-break: normal;
}
}
}
// blockquotes and callouts

View File

@ -14,8 +14,6 @@ card:
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,9 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,9 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Roboto+Slab:300,400,700" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -85,7 +85,7 @@ f017f405ff4b gcr.io/google\_containers/hyperkube-arm:v1.1.2 "/hyperkube
When that's looking good we're able to access the master node of the Kubernetes cluster with kubectl. kubectl for ARM can be downloaded from googleapis storage. `kubectl get nodes` shows which cluster nodes are registered, along with their status. The master node is named 127.0.0.1.
```
$ curl -fsSL -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v1.1.2/bin/linux/arm/kubectl
$ curl -fsSL -o /usr/bin/kubectl https://dl.k8s.io/release/v1.1.2/bin/linux/arm/kubectl
$ kubectl get nodes

View File

@ -8,12 +8,11 @@ date: 2018-05-29
[**kustomize**]: https://github.com/kubernetes-sigs/kustomize
[hello world]: https://github.com/kubernetes-sigs/kustomize/blob/master/examples/helloWorld
[kustomization]: https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#kustomization
[mailing list]: https://groups.google.com/forum/#!forum/kustomize
[open an issue]: https://github.com/kubernetes-sigs/kustomize/issues/new
[subproject]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cli/0008-kustomize.md
[subproject]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cli/2377-Kustomize/README.md
[SIG-CLI]: https://github.com/kubernetes/community/tree/master/sig-cli
[workflow]: https://github.com/kubernetes-sigs/kustomize/blob/master/docs/workflows.md
[workflow]: https://github.com/kubernetes-sigs/kustomize/blob/1dd448e65c81aab9d09308b695691175ca6459cd/docs/workflows.md
If you run a Kubernetes environment, chances are you've
customized a Kubernetes configuration — you've copied

View File

@ -80,7 +80,7 @@ In order to manage the Kubernetes cluster, we need to install [kubectl](https://
The recommended way to install it on Linux is to download the pre-built binary and move it to a directory under the `$PATH`.
```shell
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl \
curl -LO https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl \
&& sudo install kubectl /usr/local/bin && rm kubectl
```

View File

per container characteristics like image size or payload) can utilize
the `PodHasNetwork` condition to optimize the set of actions performed when pods
repeatedly fail to come up.
### Updates for Kubernetes 1.28
The `PodHasNetwork` condition has been renamed to `PodReadyToStartContainers`.
Alongside that change, the feature gate `PodHasNetworkCondition` has been replaced by
`PodReadyToStartContainersCondition`. You need to set `PodReadyToStartContainersCondition`
to true in order to use the new feature in v1.28.0 and later.
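As an illustration (this sketch is not part of the original post), the renamed feature gate can be enabled through the kubelet configuration roughly as follows; only the `featureGates` entry matters here, the rest of a real configuration is omitted:

```yaml
# Minimal kubelet configuration sketch: enable the renamed feature gate
# so that the kubelet reports the PodReadyToStartContainers condition.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodReadyToStartContainersCondition: true
```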
### How is this different from the existing Initialized condition reported for pods?
The kubelet sets the status of the existing `Initialized` condition reported in

View File

@ -5,7 +5,7 @@ date: 2022-12-15
slug: dynamic-resource-allocation
---
**Authors:** Patrick Ohly (Intel), Kevin Klues (NVIDIA)
Dynamic resource allocation is a new API for requesting resources. It is a
generalization of the persistent volumes API for generic resources, making it possible to:
@ -19,11 +19,11 @@ Third-party resource drivers are responsible for interpreting these parameters
as well as tracking and allocating resources as requests come in.
Dynamic resource allocation is an *alpha feature* and only enabled when the
`DynamicResourceAllocation` [feature
gate](/docs/reference/command-line-tools-reference/feature-gates/) and the
`resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group"
term_id="api-group" >}} are enabled. For details, see the
`--feature-gates` and `--runtime-config` [kube-apiserver
`DynamicResourceAllocation`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the
`resource.k8s.io/v1alpha1`
{{< glossary_tooltip text="API group" term_id="api-group" >}} are enabled. For details,
see the `--feature-gates` and `--runtime-config` [kube-apiserver
parameters](/docs/reference/command-line-tools-reference/kube-apiserver/).
The kube-scheduler, kube-controller-manager and kubelet components all need
the feature gate enabled as well.
@ -39,8 +39,8 @@ for end-to-end testing, but also can be run manually. See
## API
The new `resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group"
term_id="api-group" >}} provides four new types:
The new `resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group" term_id="api-group" >}}
provides four new types:
ResourceClass
: Defines which resource driver handles a certain kind of
@ -77,7 +77,7 @@ this `.spec` (for example, inside a Deployment or StatefulSet) share the same
ResourceClaim instance. When referencing a ResourceClaimTemplate, each Pod gets
its own ResourceClaim instance.
For a container defined within a Pod, the `resources.claims` list
defines whether that container gets
access to these resource instances, which makes it possible to share resources
between one or more containers inside the same Pod. For example, an init container could
@ -89,7 +89,7 @@ will get created for this Pod and each container gets access to one of them.
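Before the post's own ResourceClass example below, here is a rough, hedged sketch (not taken from the original article) of how a Pod might wire a claim into a container under the alpha `resource.k8s.io/v1alpha1` API described above; the template name `gpu-claim-template` is invented for illustration, and the exact field layout may differ in your cluster version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-claim
spec:
  resourceClaims:
  - name: gpu                 # Pod-level claim entry, referenced by name below
    source:
      resourceClaimTemplateName: gpu-claim-template   # hypothetical template name
  containers:
  - name: workload
    image: registry.k8s.io/pause:3.9
    resources:
      claims:
      - name: gpu             # grants this container access to the claimed resource
```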
Assuming a resource driver called `resource-driver.example.com` was installed
together with the following resource class:
```yaml
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClass
name: resource.example.com
@ -151,8 +151,7 @@ spec:
In contrast to native resources (such as CPU or RAM) and
[extended resources](/docs/concepts/configuration/manage-resources-containers/#extended-resources)
(managed by a
device plugin, advertised by kubelet), the scheduler has no knowledge of what
(managed by a device plugin, advertised by kubelet), the scheduler has no knowledge of what
dynamic resources are available in a cluster or how they could be split up to
satisfy the requirements of a specific ResourceClaim. Resource drivers are
responsible for that. Drivers mark ResourceClaims as _allocated_ once resources
@ -227,8 +226,8 @@ It is up to the driver developer to decide how these two components
communicate. The [KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md) outlines an [approach using
CRDs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation#implementing-a-plugin-for-node-resources).
Within SIG Node, we also plan to provide a complete [example
driver](https://github.com/kubernetes-sigs/dra-example-driver) that can serve
Within SIG Node, we also plan to provide a complete
[example driver](https://github.com/kubernetes-sigs/dra-example-driver) that can serve
as a template for other drivers.
## Running the test driver
@ -236,7 +235,7 @@ as a template for other drivers.
The following steps bring up a local, one-node cluster directly from the
Kubernetes source code. As a prerequisite, your cluster must have nodes with a container
runtime that supports the
[Container Device Interface](https://github.com/container-orchestrated-devices/container-device-interface)
(CDI). For example, you can run CRI-O [v1.23.2](https://github.com/cri-o/cri-o/releases/tag/v1.23.2) or later.
Once containerd v1.7.0 is released, we expect that you can run that or any later version.
In the example below, we use CRI-O.
@ -259,15 +258,16 @@ $ RUNTIME_CONFIG=resource.k8s.io/v1alpha1 \
PATH=$(pwd)/third_party/etcd:$PATH \
./hack/local-up-cluster.sh -O
...
To start using your cluster, you can open up another terminal/tab and run:
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
...
```
Once the cluster is up, in another
terminal run the test driver controller. `KUBECONFIG` must be set for all of
the following commands.
To start using your cluster, you can open up another terminal/tab and run:
```console
$ export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
```
Once the cluster is up, in another terminal run the test driver controller.
`KUBECONFIG` must be set for all of the following commands.
```console
$ go run ./test/e2e/dra/test-driver --feature-gates ContextualLogging=true -v=5 controller
@ -319,7 +319,7 @@ user_a='b'
## Next steps
- See the
[Dynamic Resource Allocation](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md)
KEP for more information on the design.
- Read [Dynamic Resource Allocation](/docs/concepts/scheduling-eviction/dynamic-resource-allocation/)
in the official Kubernetes documentation.
@ -328,6 +328,6 @@ user_a='b'
and / or the [CNCF Container Orchestrated Device Working Group](https://github.com/cncf/tag-runtime/blob/master/wg/COD.md).
- You can view or comment on the [project board](https://github.com/orgs/kubernetes/projects/95/views/1)
for dynamic resource allocation.
- In order to move this feature towards beta, we need feedback from hardware
vendors, so here's a call to action: try out this feature, consider how it can help
with problems that your users are having, and write resource drivers…

View File

@ -0,0 +1,294 @@
---
layout: blog
title: "Spotlight on SIG ContribEx"
date: 2023-08-14
slug: sig-contribex-spotlight-2023
canonicalUrl: https://www.kubernetes.dev/blog/2023/08/14/sig-contribex-spotlight-2023/
---
**Author**: Fyka Ansari
Welcome to the world of Kubernetes and its vibrant contributor
community! In this blog post, we'll be shining a spotlight on the
[Special Interest Group for Contributor
Experience](https://github.com/kubernetes/community/blob/master/sig-contributor-experience/README.md)
(SIG ContribEx), an essential component of the Kubernetes project.
SIG ContribEx in Kubernetes is responsible for developing and
maintaining a healthy and productive community of contributors to the
project. This involves identifying and addressing bottlenecks that may
hinder the project's growth and feature velocity, such as pull request
latency and the number of open pull requests and issues.
SIG ContribEx works to improve the overall contributor experience by
creating and maintaining guidelines, tools, and processes that
facilitate collaboration and communication among contributors. They
also focus on community building and support, including outreach
programs and mentorship initiatives to onboard and retain new
contributors.
Ultimately, the role of SIG ContribEx is to foster a welcoming and
inclusive environment that encourages contribution and supports the
long-term sustainability of the Kubernetes project.
In this blog post, [Fyka Ansari](https://twitter.com/1fyka) interviews
[Kaslin Fields](https://twitter.com/kaslinfields), a DevRel Engineer
at Google, who is a chair of SIG ContribEx, and [Madhav
Jivrajani](https://twitter.com/MadhavJivrajani), a Software Engineer
at VMWare who serves as a SIG ContribEx Tech Lead. This interview
covers various aspects of SIG ContribEx, including current
initiatives, exciting developments, and how interested individuals can
get involved and contribute to the group. It provides valuable
insights into the workings of SIG ContribEx and highlights the
importance of its role in the Kubernetes ecosystem.
### Introductions
**Fyka:** Let's start by diving into your background and how you got
involved in the Kubernetes ecosystem. Can you tell us more about that
journey?
**Kaslin:** I first got involved in the Kubernetes ecosystem through
my mentor, Jonathan Rippy, who introduced me to containers during my
early days in tech. Eventually, I transitioned to a team working with
containers, which sparked my interest in Kubernetes when it was
announced. While researching Kubernetes in that role, I eagerly sought
opportunities to engage with the containers/Kubernetes community. It
was not until my subsequent job that I found a suitable role to
contribute consistently. I joined SIG ContribEx, specifically in the
Contributor Comms subproject, to both deepen my knowledge of
Kubernetes and support the community better.
**Madhav:** My journey with Kubernetes began when I was a student,
searching for interesting and exciting projects to work on. With my
peers, I discovered open source and attended The New Contributor
Workshop organized by the Kubernetes community. The workshop not only
provided valuable insights into the community structure but also gave
me a sense of warmth and welcome, which motivated me to join and
remain involved. I realized that collaboration is at the heart of
open-source communities, and to get answers and support, I needed to
contribute and do my part. I started working on issues in ContribEx,
particularly focusing on GitHub automation, despite not fully
understanding the task at first. I continued to contribute for various
technical and non-technical aspects of the project, finding it to be
one of the most professionally rewarding experiences in my life.
**Fyka:** That's such an inspiration in itself! I'm sure beginners who
are reading this got the ultimate motivation to take their first
steps. Embracing the Learning journey, seeking mentorship, and
engaging with the Kubernetes community can pave the way for exciting
opportunities in the tech industry. Your stories proved the importance
of starting small and being proactive, just like Madhav said Don't be
afraid to take on tasks, even if you're uncertain at first.
### Primary goals and scope
**Fyka:** Given your experience as a member of SIG ContribEx, could
you tell us a bit about the group's primary goals and initiatives? Its
current focus areas? What do you see as the scope of SIG ContribEx and
the impact it has on the Kubernetes community?
**Kaslin:** SIG ContribEx's primary goals are to simplify the
contributions of Kubernetes contributors and foster a welcoming
community. It collaborates with other Kubernetes SIGs, such as
planning the Contributor Summit at KubeCon, ensuring it meets the
needs of various groups. The group's impact is evident in projects
like updating org membership policies and managing critical platforms
like Zoom, YouTube, and Slack. Its scope encompasses making the
contributor experience smoother and supporting the overall Kubernetes
community.
**Madhav:** The Kubernetes project has vertical SIGs and cross-cutting
SIGs, ContribEx is a deeply cross-cutting SIG, impacting virtually
every area of the Kubernetes community. Adding to Kaslin,
sustainability in the Kubernetes project and community is critical now
more than ever, it plays a central role in addressing critical issues,
such as maintainer succession, by facilitating cohorts for SIGs to
train experienced community members to take on leadership
roles. Excellent examples include SIG CLI and SIG Apps, leading to the
onboarding of new reviewers. Additionally, SIG ContribEx is essential
in managing GitHub automation tools, including bots and commands used
by contributors for interacting with [Prow](https://docs.prow.k8s.io/)
and other automation (label syncing, group and GitHub team management,
etc).
### Beginner's guide!
**Fyka:** I'll never forget talking to Kaslin when I joined the
community and needed help with contributing. Kaslin, your quick and
clear answers were a huge help in getting me started. Can you both
give some tips for people new to contributing to Kubernetes? What
makes SIG ContribEx a great starting point? Why should beginners and
current contributors consider it? And what cool opportunities are
there for newbies to jump in?
**Kaslin:** If you want to contribute to Kubernetes for the first
time, it can be overwhelming to know where to start. A good option is
to join SIG ContribEx as it offers great opportunities to know and
serve the community. Within SIG ContribEx, various subprojects allow
you to explore different parts of the Kubernetes project while you
learn how contributions work. Once you know a bit more, it's common
for you to move to other SIGs within the project, and we think that's
wonderful. While many newcomers look for "good first issues" to start
with, these opportunities can be scarce and get claimed
quickly. Instead, the real benefit lies in attending meetings and
getting to know the community. As you learn more about the project and
the people involved, you'll be better equipped to offer your help, and
the community will be more inclined to seek your assistance when
needed. As a co-lead for the Contributor Comms subproject, I can
confidently say that it's an excellent place for beginners to get
involved. We have supportive leads and particularly beginner-friendly
projects too.
**Madhav:** To begin, read the [SIG
README](https://github.com/kubernetes/community/tree/master#readme) on
GitHub, which provides an overview of the projects the SIG
manages. While attending meetings is beneficial for all SIGs, it's
especially recommended for SIG ContribEx, as each subproject gets
dedicated slots for updates and areas that need help. If you can't
attend in real-time due to time zone differences, you can catch the
meeting recordings or
[Notes](https://docs.google.com/document/d/1K3vjCZ9C3LwYrOJOhztQtFuDQCe-urv-ewx1bI8IPVQ/edit?usp=sharing)
later.
### Skills you learn!
**Fyka:** What skills do you look for when bringing in new
contributors to SIG ContribEx, from passion to expertise?
Additionally, what skills can contributors expect to develop while
working with SIG ContribEx?
**Kaslin:** Skills folks need to have or will acquire vary depending
on what area of ContribEx they work upon. Even within a subproject, a
range of skills can be useful and/or developed. For example, the tech
lead role involves technical tasks and overseeing automation, while
the social media lead role requires excellent communication
skills. Working with SIG ContribEx allows contributors to acquire
various skills based on their chosen subproject. By participating in
meetings, listening, learning, and taking on tasks related to their
interests, they can develop and hone these skills. Some subprojects
may require more specialized skills, like program management for the
mentoring project, but all contributors can benefit from offering
their talents to help teach others and contribute to the community.
### Sub-projects under SIG ContribEx
**Fyka:** SIG ContribEx has several smaller projects. Can you tell me
about the aims of these projects and how they've impacted the
Kubernetes community?
**Kaslin:** Some SIGs have one or two subprojects and some have none
at all, but in SIG ContribEx, we have ELEVEN!
Here's a list of them and their respective mission statements:
1. **Community**: Manages the community repository, documentation,
and operations.
2. **Community management**: Handles communication platforms and
policies for the community.
3. **Contributor-comms**: Focuses on promoting the success of
Kubernetes contributors through marketing.
4. **Contributors-documentation**: Writes and maintains documentation
for contributing to Kubernetes.
5. **Devstats**: Maintains and updates the [Kubernetes
statistics](https://k8s.devstats.cncf.io) website.
6. **Elections**: Oversees community elections and maintains related
documentation and software.
7. **Events**: Organizes contributor-focused events like the
Contributor Summit.
8. **Github management**: Manages permissions, repositories, and
groups on GitHub.
9. **Mentoring**: Develop programs to help contributors progress in
their contributions.
10. **Sigs-GitHub-actions**: Repository for GitHub actions related to
all SIGs in Kubernetes.
11. **Slack-infra**: Creates and maintains tools and automation for
Kubernetes Slack.
**Madhav:** Also, Devstats is critical from a sustainability
standpoint!
_(If you are willing to learn more and get involved with any of these
sub-projects, check out the_ [SIG ContribEx
README](https://github.com/kubernetes/community/blob/master/sig-contributor-experience/README.md#subprojects))._
### Accomplishments
**Fyka:** With that said, any SIG-related accomplishment that youre
proud of?
**Kaslin:** I'm proud of the accomplishments made by SIG ContribEx and
its contributors in supporting the community. Some of the recent
achievements include:
1. _Establishment of the elections subproject_: Kubernetes is a massive
project, and ensuring smooth leadership transitions is
crucial. The contributors in this subproject organize fair and
consistent elections, which helps keep the project running
effectively.
2. _New issue triage process_: With such a large open-source project
like Kubernetes, there's always a lot of work to be done. To
ensure things progress safely, we implemented new labels and
updated functionality for issue triage using our Prow tool. This
reduces bottlenecks in the workflow and allows leaders to
accomplish more.
3. _New org membership requirements_: Becoming an org member in
Kubernetes can be overwhelming for newcomers. We view org
membership as a significant milestone for contributors aiming to
take on leadership roles. We recently updated the rules to
automatically remove privileges from inactive members, making sure
that the right people have access to the necessary tools and
responsibilities.
Overall, these accomplishments have greatly benefited our fellow
contributors and strengthened the Kubernetes community.
### Upcoming initiatives
**Fyka:** Could you give us a sneak peek into what's next for the
group? We're excited to hear about upcoming projects and initiatives
from this dynamic team.
**Madhav:** We'd love for more groups to sign up for mentoring
cohorts! We're probably going to have to spend some time polishing the
process around that.
### Final thoughts
**Fyka:** As we wrap up our conversation, would you like to share some
final thoughts for those interested in contributing to SIG ContribEx
or getting involved with Kubernetes?
**Madhav**: Kubernetes is meant to be overwhelming and difficult
initially! You're coming into something that's taken multiple people,
from multiple countries, multiple years to build. Embrace that
diversity! Use the high entropy initially to collide around and gain
as much knowledge about the project and community as possible before
you decide to settle in your niche.
**Fyka:** Thank You Madhav and Kaslin, it was an absolute pleasure
chatting about SIG ContribEx and your experiences as a member. It's
clear that the role of SIG ContribEx in Kubernetes is significant and
essential, ensuring scalability, growth and productivity, and I hope
this interview inspires more people to get involved and contribute to
Kubernetes. I wish SIG ContribEx all the best, and can't wait to see
what exciting things lie ahead!
## What next?
We love meeting new contributors and helping them in investigating
different Kubernetes project spaces. If you are interested in getting
more involved with SIG ContribEx, here are some resources for you to
get started:
* [GitHub](https://github.com/kubernetes/community/tree/master/sig-contributor-experience#contributor-experience-special-interest-group)
* [Mailing list](https://groups.google.com/g/kubernetes-sig-contribex)
* [Open Community
Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fcontributor-experience)
* [Slack](https://slack.k8s.io/)
* [Slack channel
#sig-contribex](https://kubernetes.slack.com/messages/sig-contribex)
* SIG Contribex also hosted a [KubeCon
talk](https://youtu.be/5Bs1bs6iFmY) about studying Kubernetes
Contributor experiences.

View File

@ -0,0 +1,311 @@
---
layout: blog
title: "Kubernetes v1.28: Planternetes"
date: 2023-08-15T12:00:00+0000
slug: kubernetes-v1-28-release
---
**Authors**: [Kubernetes v1.28 Release Team](https://github.com/kubernetes/sig-release/blob/master/releases/release-1.28/release-team.md)
Announcing the release of Kubernetes v1.28 Planternetes, the second release of 2023!
This release consists of 45 enhancements. Of those enhancements, 19 are entering Alpha, 14 have graduated to Beta, and 12 have graduated to Stable.
## Release Theme And Logo
**Kubernetes v1.28: _Planternetes_**
The theme for Kubernetes v1.28 is *Planternetes*.
{{< figure src="/images/blog/2023-08-15-kubernetes-1.28-blog/kubernetes-1.28.png" alt="Kubernetes 1.28 Planternetes logo" class="release-logo" >}}
Each Kubernetes release is the culmination of the hard work of thousands of individuals from our community. The people behind this release come from a wide range of backgrounds: some of us are industry veterans, some are parents, and others are students and newcomers to open source. We combine our unique experience to create a collective artifact with global impact.
Much like a garden, our release has ever-changing growth, challenges and opportunities. This theme celebrates the meticulous care, intention and efforts to get the release to where we are today. Harmoniously together, we grow better.
# What's New (Major Themes)
## Changes to supported skew between control plane and node versions
Kubernetes v1.28 expands the supported skew between core node and control plane
components by one minor version, from _n-2_ to _n-3_, so that node components
(kubelet and kube-proxy) for the oldest supported minor version work with
control plane components (kube-apiserver, kube-scheduler, kube-controller-manager,
cloud-controller-manager) for the newest supported minor version.
Some cluster operators avoid node maintenance and especially changes to node
behavior, because nodes are where the workloads run. For minor version upgrades
to a kubelet, the supported process includes draining that node, and hence
disruption to any Pods that had been executing there. For Kubernetes end users
with very long running workloads, and where Pods should stay running wherever
possible, reducing the time lost to node maintenance is a benefit.
The Kubernetes yearly support period already made annual upgrades possible. Users can
upgrade to the latest patch versions to pick up security fixes and do 3 sequential
minor version upgrades once a year to "catch up" to the latest supported minor version.
Previously, to stay within the supported skew, a cluster operator planning an annual
upgrade would have needed to upgrade their nodes twice (perhaps only hours apart). Now,
with Kubernetes v1.28, you have the option of making a minor version upgrade to
nodes just once in each calendar year and still staying within upstream support.
If you'd like to stay current and upgrade your clusters more often, that's
fine and is still completely supported.
## Generally available: recovery from non-graceful node shutdown
If a node shuts down unexpectedly or ends up in a non-recoverable state (perhaps due to hardware failure or unresponsive OS), Kubernetes allows you to clean up afterward and allow stateful workloads to restart on a different node. For Kubernetes v1.28, that's now a stable feature.
This allows stateful workloads to fail over to a different node successfully after the original node is shut down or in a non-recoverable state, such as a hardware failure or a broken OS.
Versions of Kubernetes earlier than v1.20 lacked handling for node shutdown on Linux; since then, the kubelet integrates with systemd
and implements graceful node shutdown (beta, and enabled by default). However, even an intentional
shutdown might not get handled well, because:
- the node runs Windows
- the node runs Linux, but uses a different `init` (not `systemd`)
- the shutdown does not trigger the system inhibitor locks mechanism
- there is a node-level configuration error
(such as not setting appropriate values for `shutdownGracePeriod` and `shutdownGracePeriodCriticalPods`).
When a node shuts down or fails, and that shutdown was not detected by the kubelet, the pods that are part
of a StatefulSet will be stuck in terminating status on the shutdown node. If the stopped node restarts, the
kubelet on that node can clean up (`DELETE`) the Pods that the Kubernetes API still sees as bound to that node.
However, if the node stays stopped - or if the kubelet isn't able to start after a reboot - then Kubernetes may
not be able to create replacement Pods. When the kubelet on the shut-down node is not available to delete
the old pods, an associated StatefulSet cannot create a new pod (which would have the same name).
There's also a problem with storage. If there are volumes used by the pods, existing VolumeAttachments will
not be disassociated from the original - and now shut down - node so the PersistentVolumes used by these
pods cannot be attached to a different, healthy node. As a result, an application running on an
affected StatefulSet may not be able to function properly. If the original, shut down node does come up, then
those pods will be deleted by its kubelet and new pods can be created on a different running node.
If the original node does not come up (common with an [immutable infrastructure](https://glossary.cncf.io/immutable-infrastructure/) design), those pods would be stuck in a `Terminating` status on the shut-down node forever.
For more information on how to trigger cleanup after a non-graceful node shutdown,
read [non-graceful node shutdown](/docs/concepts/architecture/nodes/#non-graceful-node-shutdown).
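As a hedged illustration (not part of the original announcement), the cleanup is triggered by placing the `node.kubernetes.io/out-of-service` taint on the affected Node; expressed as a manifest fragment it might look like the sketch below, with the node name being a placeholder (in practice this is typically applied with `kubectl taint`):

```yaml
# Sketch: marking a shut-down node as out of service so that stateful
# workloads and their volumes can fail over to a healthy node.
apiVersion: v1
kind: Node
metadata:
  name: failed-node-example   # placeholder node name
spec:
  taints:
  - key: node.kubernetes.io/out-of-service
    value: nodeshutdown
    effect: NoExecute
```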
## Improvements to CustomResourceDefinition validation rules
The [Common Expression Language (CEL)](https://github.com/google/cel-go) can be used to validate
[custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/). The primary goal is to cover the majority of the validation use cases that might once have needed you, as a CustomResourceDefinition (CRD) author, to design and implement a webhook. Instead, and as a beta feature, you can add _validation expressions_ directly into the schema of a CRD.
CRDs need direct support for non-trivial validation. While admission webhooks do support CRD validation, they significantly complicate the development and operability of CRDs.
In 1.28, two optional fields, `reason` and `fieldPath`, were added to let you specify the failure reason and field path when validation fails.
For more information, read [validation rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) in the CRD documentation.
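To make that concrete, here is a rough sketch (not taken from the release notes) of a CRD schema fragment using the new optional fields; the rule, field names, and values are invented for illustration:

```yaml
# Hypothetical fragment of a CustomResourceDefinition schema: the validation
# rule reports a specific reason and field path when it fails.
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      x-kubernetes-validations:
      - rule: "self.minReplicas <= self.replicas"
        message: "replicas must be at least minReplicas"
        reason: FieldValueInvalid
        fieldPath: ".replicas"
      properties:
        minReplicas:
          type: integer
        replicas:
          type: integer
```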
## ValidatingAdmissionPolicies graduate to beta
Common Expression Language for admission control provides customizable, in-process validation of requests to the Kubernetes API server, as an alternative to validating admission webhooks.
This builds on the capabilities of the CRD Validation Rules feature that graduated to beta in 1.25 but with a focus on the policy enforcement capabilities of validating admission control.
This lowers the infrastructure barrier to enforcing customizable policies and provides primitives that help the community establish and adhere to the best practices of both K8s and its extensions.
To use [ValidatingAdmissionPolicies](/docs/reference/access-authn-authz/validating-admission-policy/), you need to enable both the `admissionregistration.k8s.io/v1beta1` API group and the `ValidatingAdmissionPolicy` feature gate in your cluster's control plane.
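As a hedged sketch (not part of the original announcement), a simple policy using the beta API might look like the following; the policy name, match rules, and CEL expression are illustrative, and a separate ValidatingAdmissionPolicyBinding is still needed to put the policy into effect:

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingAdmissionPolicy
metadata:
  name: replica-limit.example.com    # example policy name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  # CEL expression evaluated in-process by the API server.
  - expression: "object.spec.replicas <= 5"
    message: "Deployments may not request more than 5 replicas."
```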
## Match conditions for admission webhooks
Kubernetes v1.27 lets you specify _match conditions_ for admission webhooks,
which lets you narrow the scope of when Kubernetes makes a remote HTTP call at admission time.
The `matchConditions` field for ValidatingWebhookConfiguration and MutatingWebhookConfiguration
is a CEL expression that must evaluate to true for the admission request to be sent to the webhook.
In Kubernetes v1.28, that field moved to beta, and it's enabled by default.
To learn more, see [`matchConditions`](/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-matchconditions) in the Kubernetes documentation.
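For illustration only (a sketch, not an official example), a webhook configuration with a match condition might look like this; the webhook name, service reference, rules, and CEL expression are placeholders:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: demo-webhook                  # placeholder name
webhooks:
- name: validate.example.com
  matchConditions:
  - name: exclude-kubelet-requests
    # Only call the webhook when the request is not made by a node identity.
    expression: '!("system:nodes" in request.userInfo.groups)'
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]
  clientConfig:
    service:
      name: demo-webhook-svc          # placeholder service backing the webhook
      namespace: default
  admissionReviewVersions: ["v1"]
  sideEffects: None
```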
## Beta support for enabling swap space on Linux
This adds swap support to nodes in a controlled, predictable manner so that Kubernetes users can perform testing and provide data to continue building cluster capabilities on top of swap.
There are two distinct types of users for swap, who may overlap:
- Node administrators, who may want swap available for node-level performance tuning and stability/reducing noisy neighbor issues.
- Application developers, who have written applications that would benefit from using swap memory.
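As a sketch only (not from the release announcement), a node administrator might opt a node into swap via the kubelet configuration roughly as below; the `swapBehavior` value shown is an assumption, so check the swap documentation for the values your kubelet version accepts:

```yaml
# Hypothetical kubelet configuration fragment for beta swap support.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false            # allow the kubelet to start on a node with swap enabled
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: UnlimitedSwap   # assumed value; consult the docs for supported options
```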
## Mixed version proxy (alpha) {#mixed-version-proxy}
When a cluster has multiple API servers at mixed versions (such as during an upgrade/downgrade or when runtime-config changes and a rollout happens), not every apiserver can serve every resource at every version.
For Kubernetes v1.28, you can enable the _mixed version proxy_ within the API server's aggregation layer.
The mixed version proxy finds requests that the local API server doesn't recognize but another API server
inside the control plane is able to support. Having found a suitable peer, the aggregation layer proxies
the request to a compatible API server; this is transparent from the client's perspective.
When an upgrade or downgrade is performed on a cluster, for some period of time the API servers
within the control plane may be at differing versions; when that happens, different subsets of the
API servers are able to serve different sets of built-in resources (different groups, versions, and resources
are all possible). This new alpha mechanism lets you hide that skew from clients.
## Source code reorganization for control plane components
Kubernetes contributors have begun to reorganize the code for the kube-apiserver to build on a new staging repository that consumes [k/apiserver](https://github.com/kubernetes/apiserver) but has a bigger, carefully chosen subset of the functionality of kube-apiserver such that it is reusable.
This is a gradual reorganization; eventually there will be a new git repository with generic functionality abstracted from Kubernetes' API server.
## Support for CDI injection into containers (alpha) {#cdi-device-plugin}
CDI provides a standardized way of injecting complex devices into a container (that is, devices that logically require more than just a single /dev node to be injected for them to work). This new feature enables plugin developers to use the `CDIDevices` field added to the CRI in 1.27 to pass CDI devices directly to CDI-enabled runtimes (which containerd and CRI-O support in recent releases).
## API awareness of sidecar containers (alpha) {#sidecar-init-containers}
Kubernetes 1.28 introduces an alpha `restartPolicy` field for [init containers](https://github.com/kubernetes/website/blob/main/content/en/docs/concepts/workloads/pods/init-containers.md),
and uses that to indicate when an init container is also a _sidecar container_.
The kubelet will start init containers with `restartPolicy: Always` in the order
they are defined, along with other init containers.
Instead of waiting for that sidecar container to complete before starting the main
container(s) for the Pod, the kubelet only waits for the sidecar init container to have started.
The kubelet will consider the startup for the sidecar container as being completed
if the startup probe succeeds and the postStart handler is completed.
This condition is represented by the `Started` field of the ContainerStatus type.
If you do not define a startup probe, the kubelet will consider the container
startup to be completed immediately after the postStart handler completion.
For init containers, you can either omit the `restartPolicy` field, or set it to `Always`. Omitting the field
means that you want a true init container that runs to completion before application startup.
Sidecar containers do not block Pod completion: if all regular containers are complete, sidecar
containers in that Pod will be terminated.
Once the sidecar container has started (process running, `postStart` was successful, and
any configured startup probe is passing), and then there's a failure, that sidecar container will be
restarted even when the Pod's overall `restartPolicy` is `Never` or `OnFailure`.
Furthermore, sidecar containers will be restarted (on failure or on normal exit)
_even during Pod termination_.
To learn more, read [API for sidecar containers](/docs/concepts/workloads/pods/init-containers/#api-for-sidecar-containers).
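A minimal sketch (not copied from the release notes) of what this looks like in a Pod spec; the image names are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  initContainers:
  - name: log-shipper
    image: example.com/log-shipper:latest   # placeholder image
    restartPolicy: Always    # marks this init container as a sidecar
  containers:
  - name: app
    image: example.com/app:latest           # placeholder image
```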
## Automatic, retroactive assignment of a default StorageClass graduates to stable
Kubernetes automatically sets a `storageClassName` for a PersistentVolumeClaim (PVC) if you don't provide
a value. The control plane also sets a StorageClass for any existing PVC that doesn't have a `storageClassName`
defined.
Previous versions of Kubernetes also had this behavior; for Kubernetes v1.28 it is automatic and always
active; the feature has graduated to stable (general availability).
To learn more, read about [StorageClass](/docs/concepts/storage/storage-classes/) in the Kubernetes
documentation.
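For example (an illustrative manifest, not from the announcement), a PVC that omits `storageClassName` now reliably gets the cluster's default StorageClass assigned, even if the default is configured later:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  # No storageClassName set: the control plane fills in the default
  # StorageClass, retroactively if a default is created or changed later.
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
```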
## Pod replacement policy for Jobs (alpha) {#pod-replacement-policy}
Kubernetes 1.28 adds a new field for the Job API that allows you to specify if you want the control
plane to make new Pods as soon as the previous Pods begin termination (existing behavior),
or only once the existing pods are fully terminated (new, optional behavior).
Many common machine learning frameworks, such as TensorFlow and JAX, require unique pods per index.
With the older behavior, if a pod that belongs to an `Indexed` Job enters a terminating state (due to preemption, eviction or other external factors), a replacement pod is created but then immediately fails to start due
to the clash with the old pod that has not yet shut down.
Having a replacement Pod appear before the previous one fully terminates can also cause problems
in clusters with scarce resources or with tight budgets. These resources can be difficult to obtain so pods may only be able to find nodes once the existing pods have been terminated. If cluster autoscaler is enabled, early creation of replacement Pods might produce undesired scale-ups.
To learn more, read [Delayed creation of replacement pods](/docs/concepts/workloads/controllers/job/#delayed-creation-of-replacement-pods)
in the Job documentation.
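Here is a hedged sketch of the new field (the names follow the alpha API as described in the text; enabling it is expected to also require the corresponding `JobPodReplacementPolicy` feature gate):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: training-job                 # placeholder name
spec:
  podReplacementPolicy: Failed       # only create replacements once old Pods are fully terminated
  completions: 4
  parallelism: 4
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: example.com/trainer:latest   # placeholder image
```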
## Job retry backoff limit, per index (alpha) {#job-per-index-retry-backoff}
This extends the Job API to support indexed jobs where the backoff limit is per index, and the Job can continue execution despite some of its indexes failing.
Currently, the indexes of an indexed job share a single backoff limit. When the job reaches this shared backoff limit, the job controller marks the entire job as failed, and the resources are cleaned up, including indexes that have yet to run to completion.
As a result, the existing implementation did not cover the situation where the workload is truly
[embarrassingly parallel](https://en.wikipedia.org/wiki/Embarrassingly_parallel): each index is
fully independent of other indexes.
For instance, if indexed jobs were used as the basis for a suite of long-running integration tests, then each test run would only be able to find a single test failure.
For more information, read [Handling Pod and container failures](/docs/concepts/workloads/controllers/job/#handling-pod-and-container-failures) in the Kubernetes documentation.
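As an illustrative sketch (not from the release notes), an Indexed Job using the new per-index fields might look like this; it assumes the corresponding alpha feature gate is enabled, and the image is a placeholder:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: integration-tests            # placeholder name
spec:
  completionMode: Indexed
  completions: 10
  parallelism: 10
  backoffLimitPerIndex: 1   # each index may be retried once
  maxFailedIndexes: 5       # fail the whole Job only after 5 indexes have failed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: test
        image: example.com/test-suite:latest   # placeholder image
```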
<hr />
<a id="cri-container-and-pod-statistics-without-cadvisor" />
**Correction**: the feature CRI container and pod statistics without cAdvisor has been removed as it did not make the release.
The original release announcement stated that Kubernetes 1.28 included the new feature.
## Feature graduations and deprecations in Kubernetes v1.28
### Graduations to stable
This release includes a total of 12 enhancements promoted to Stable:
* [`kubectl events`](https://github.com/kubernetes/enhancements/issues/1440)
* [Retroactive default StorageClass assignment](https://github.com/kubernetes/enhancements/issues/3333)
* [Non-graceful node shutdown](https://github.com/kubernetes/enhancements/issues/2268)
* [Support 3rd party device monitoring plugins](https://github.com/kubernetes/enhancements/issues/606)
* [Auth API to get self-user attributes](https://github.com/kubernetes/enhancements/issues/3325)
* [Proxy Terminating Endpoints](https://github.com/kubernetes/enhancements/issues/1669)
* [Expanded DNS Configuration](https://github.com/kubernetes/enhancements/issues/2595)
* [Cleaning up IPTables Chain Ownership](https://github.com/kubernetes/enhancements/issues/3178)
* [Minimizing iptables-restore input size](https://github.com/kubernetes/enhancements/issues/3453)
* [Graduate the kubelet pod resources endpoint to GA](https://github.com/kubernetes/enhancements/issues/3743)
* [Extend podresources API to report allocatable resources](https://github.com/kubernetes/enhancements/issues/2403)
* [Move EndpointSlice Reconciler into Staging](https://github.com/kubernetes/enhancements/issues/3685)
### Deprecations and removals
Removals:
* [Removal of CSI Migration for GCE PD](https://github.com/kubernetes/enhancements/issues/1488)
Deprecations:
* [Ceph RBD in-tree plugin](https://github.com/kubernetes/kubernetes/pull/118303)
* [Ceph FS in-tree plugin](https://github.com/kubernetes/kubernetes/pull/118143)
## Release Notes
The complete details of the Kubernetes v1.28 release are available in our [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.28.md).
## Availability
Kubernetes v1.28 is available for download on [GitHub](https://github.com/kubernetes/kubernetes/releases/tag/v1.28.0). To get started with Kubernetes, you can run local Kubernetes clusters using [minikube](https://minikube.sigs.k8s.io/docs/), [kind](https://kind.sigs.k8s.io/), etc. You can also easily install v1.28 using [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
## Release Team
Kubernetes is only possible with the support, commitment, and hard work of its community. Each release team is made up of dedicated community volunteers who work together to build the many pieces that make up the Kubernetes releases you rely on. This requires the specialized skills of people from all corners of our community, from the code itself to its documentation and project management.
We would like to thank the entire release team for the hours spent hard at work to ensure we deliver a solid Kubernetes v1.28 release for our community.
Special thanks to our release lead, Grace Nguyen, for guiding us through a smooth and successful release cycle.
## Ecosystem Updates
* KubeCon + CloudNativeCon China 2023 will take place in Shanghai, China, from 26 to 28 September 2023! You can find more information about the conference and registration on the [event site](https://www.lfasiallc.com/kubecon-cloudnativecon-open-source-summit-china/).
* KubeCon + CloudNativeCon North America 2023 will take place in Chicago, Illinois, The United States of America, from 6 to 9 November 2023! You can find more information about the conference and registration on the [event site](https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/).
## Project Velocity
The [CNCF K8s DevStats](https://k8s.devstats.cncf.io/d/12/dashboards?orgId=1&refresh=15m) project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing and is an illustration of the depth and breadth of effort that goes into evolving this ecosystem.
In the v1.28 release cycle, which [ran for 14 weeks](https://github.com/kubernetes/sig-release/tree/master/releases/release-1.28) (May 15 to August 15), we saw contributions from [911 companies](https://k8s.devstats.cncf.io/d/9/companies-table?orgId=1&var-period_name=v1.27.0%20-%20now&var-metric=contributions) and [1440 individuals](https://k8s.devstats.cncf.io/d/66/developer-activity-counts-by-companies?orgId=1&var-period_name=v1.27.0%20-%20now&var-metric=contributions&var-repogroup_name=Kubernetes&var-repo_name=kubernetes%2Fkubernetes&var-country_name=All&var-companies=All).
## Upcoming Release Webinar
Join members of the Kubernetes v1.28 release team on Wednesday, September 6th, 2023, at 9 A.M. PDT to learn about the major features of this release, as well as deprecations and removals to help plan for upgrades. For more information and registration, visit the [event page](https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cncf-live-webinar-kubernetes-128-release/) on the CNCF Online Programs site.
## Get Involved
The simplest way to get involved with Kubernetes is by joining one of the many [Special Interest Groups](https://github.com/kubernetes/community/blob/master/sig-list.md) (SIGs) that align with your interests.
Have something you'd like to broadcast to the Kubernetes community? Share your voice at our weekly [community meeting](https://github.com/kubernetes/community/tree/master/communication), and through the channels below:
* Find out more about contributing to Kubernetes at the [Kubernetes Contributors website](https://www.kubernetes.dev/).
* Follow us on Twitter [@Kubernetesio](https://twitter.com/kubernetesio) for the latest updates.
* Join the community discussion on [Discuss](https://discuss.kubernetes.io/).
* Join the community on [Slack](https://communityinviter.com/apps/kubernetes/community).
* Post questions (or answer questions) on [Server Fault](https://serverfault.com/questions/tagged/kubernetes).
* [Share](https://docs.google.com/forms/d/e/1FAIpQLScuI7Ye3VQHQTwBASrgkjQDSS5TP0g3AXfFhwSM9YpHgxRKFA/viewform) your Kubernetes story.
* Read more about what's happening with Kubernetes on the [blog](https://kubernetes.io/blog/).
* Learn more about the [Kubernetes Release Team](https://github.com/kubernetes/sig-release/tree/master/release-team).

View File

@ -0,0 +1,207 @@
---
layout: blog
title: "pkgs.k8s.io: Introducing Kubernetes Community-Owned Package Repositories"
date: 2023-08-15T20:00:00+0000
slug: pkgs-k8s-io-introduction
---
**Author**: Marko Mudrinić (Kubermatic)
On behalf of Kubernetes SIG Release, I am very excited to introduce the
Kubernetes community-owned software
repositories for Debian and RPM packages: `pkgs.k8s.io`! The new package
repositories are a replacement for the Google-hosted package repositories
(`apt.kubernetes.io` and `yum.kubernetes.io`) that we've been using since
Kubernetes v1.5.
This blog post contains information about these new package repositories,
what they mean for you as an end user, and how to migrate to the new
repositories.
## What you need to know about the new package repositories
- This is an **opt-in change**; you're required to manually migrate from the
Google-hosted repository to the Kubernetes community-owned repositories.
See [how to migrate](#how-to-migrate) later in this announcement for migration information
and instructions.
- Access to the Google-hosted repository will remain intact for the foreseeable
future. However, the Kubernetes project plans to stop publishing packages to
the Google-hosted repository in the future. The project strongly recommends
migrating to the Kubernetes package repositories going forward.
- The Kubernetes package repositories contain packages beginning with those
Kubernetes versions that were still under support when the community took
over the package builds. This means that anything before v1.24.0 will only be
available in the Google-hosted repository.
- There's a dedicated package repository for each Kubernetes minor version.
When upgrading to a different minor release, you must bear in mind that
the package repository details also change.
## Why are we introducing new package repositories?
As the Kubernetes project is growing, we want to ensure the best possible
experience for the end users. The Google-hosted repository has been serving
us well for many years, but we started facing some problems that require
significant changes to how we publish packages. Another goal that we have is to
use community-owned infrastructure for all critical components and that
includes package repositories.
Publishing packages to the Google-hosted repository is a manual process that
can be done only by a team of Google employees called
[Google Build Admins](/releases/release-managers/#build-admins).
[The Kubernetes Release Managers team](/releases/release-managers/#release-managers)
is a very diverse team especially in terms of timezones that we work in.
Given this constraint, we have to do very careful planning for every release to
ensure that we have both Release Manager and Google Build Admin available to
carry out the release.
Another problem is that we only have a single package repository. Because of
this, we were not able to publish packages for prerelease versions (alpha,
beta, and rc). This made testing Kubernetes prereleases harder for anyone
interested in doing so. The feedback that we receive from people testing these
releases is critical to ensure the best quality of releases, so we want to make
testing these releases as easy as possible. On top of that, having only one
repository limited us when it comes to publishing dependencies like `cri-tools`
and `kubernetes-cni`.
Regardless of all these issues, we're very thankful to Google and Google Build
Admins for their involvement, support, and help all these years!
## How do the new package repositories work?
The new package repositories are hosted at `pkgs.k8s.io` for both Debian and
RPM packages. At this time, this domain points to a CloudFront CDN backed by an S3
bucket that contains repositories and packages. However, we plan on onboarding
additional mirrors in the future, giving other companies the possibility to
help us serve packages.
Packages are built and published via the [OpenBuildService (OBS) platform](http://openbuildservice.org).
After a long period of evaluating different solutions, we made a decision to
use OpenBuildService as a platform to manage our repositories and packages.
First of all, OpenBuildService is an open source platform used by a large
number of open source projects and companies, like openSUSE, VideoLAN,
Dell, Intel, and more. OpenBuildService has many features making it very
flexible and easy to integrate with our existing release tooling. It also
allows us to build packages in a similar way as for the Google-hosted
repository, making the migration process as seamless as possible.
SUSE sponsors the Kubernetes project with access to their reference
OpenBuildService setup ([`build.opensuse.org`](http://build.opensuse.org)) and
with technical support to integrate OBS with our release processes.
We use SUSE's OBS instance for building and publishing packages. Upon building
a new release, our tooling automatically pushes needed artifacts and
package specifications to `build.opensuse.org`. That will trigger the build
process that's going to build packages for all supported architectures (AMD64,
ARM64, PPC64LE, S390X). At the end, generated packages will be automatically
pushed to our community-owned S3 bucket making them available to all users.
We want to take this opportunity to thank SUSE for allowing us to use
`build.opensuse.org` and their generous support to make this integration
possible!
## What are the significant differences between the Google-hosted and Kubernetes package repositories?
There are three significant differences that you should be aware of:
- There's a dedicated package repository for each Kubernetes minor release.
For example, a repository called `core:/stable:/v1.28` only hosts packages for
stable Kubernetes v1.28 releases. This means you can install v1.28.0 from
this repository, but you can't install v1.27.0 or any other minor release
other than v1.28. Upon upgrading to another minor version, you have to add a
new repository and optionally remove the old one
- There's a difference in what `cri-tools` and `kubernetes-cni` package
versions are available in each Kubernetes repository
- These two packages are dependencies for `kubelet` and `kubeadm`
- Kubernetes repositories for v1.24 to v1.27 have the same versions of these
packages as the Google-hosted repository
- Kubernetes repositories for v1.28 and onwards are only going to publish
the package versions that are used by that Kubernetes minor release
- Speaking of v1.28, only kubernetes-cni 1.2.0 and cri-tools v1.28 are going
to be available in the repository for Kubernetes v1.28
- Similarly, for v1.29, we only plan on publishing cri-tools v1.29 and
whatever kubernetes-cni version is going to be used by Kubernetes v1.29
- The revision part of the package version (the `-00` part in `1.28.0-00`) is
now autogenerated by the OpenBuildService platform and has a different format.
The revision is now in the format of `-x.y`, e.g. `1.28.0-1.1`
## Does this in any way affect existing Google-hosted repositories?
The Google-hosted repository and all packages published to it will continue
working in the same way as before. There are no changes in how we build and
publish packages to the Google-hosted repository, all newly-introduced changes
are only affecting packages published to the community-owned repositories.
However, as mentioned at the beginning of this blog post, we plan to stop
publishing packages to the Google-hosted repository in the future.
## How to migrate to the Kubernetes community-owned repositories? {#how-to-migrate}
### Debian, Ubuntu, and operating systems using `apt`/`apt-get` {#how-to-migrate-deb}
1. Replace the `apt` repository definition so that `apt` points to the new
repository instead of the Google-hosted repository. Make sure to replace the
Kubernetes minor version in the command below with the minor version
that you're currently using:
```shell
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
```
2. Download the public signing key for the Kubernetes package repositories.
The same signing key is used for all repositories, so you can disregard the
version in the URL:
```shell
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
```
3. Update the `apt` package index:
```shell
sudo apt-get update
```
### CentOS, Fedora, RHEL, and operating systems using `rpm`/`dnf` {#how-to-migrate-rpm}
1. Replace the `yum` repository definition so that `yum` points to the new
repository instead of the Google-hosted repository. Make sure to replace the
Kubernetes minor version in the command below with the minor version
that you're currently using:
```shell
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
```
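As with the `apt` instructions above, you can optionally refresh the metadata cache and confirm that the new repository is enabled (output details will vary by distribution):

```shell
sudo yum makecache
sudo yum repolist
```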
## Can I roll back to the Google-hosted repository after migrating to the Kubernetes repositories?
In general, yes. Just do the same steps as when migrating, but use parameters
for the Google-hosted repository. You can find those parameters in a document
like ["Installing kubeadm"](/docs/setup/production-environment/tools/kubeadm/install-kubeadm).
## Why isnt there a stable list of domains/IPs? Why cant I restrict package downloads?
Our plan for `pkgs.k8s.io` is to make it work as a redirector to a set of
backends (package mirrors) based on the user's location. The nature of this change
means that a user downloading a package could be redirected to any mirror at
any time. Given the architecture and our plans to onboard additional mirrors in
the near future, we can't provide a list of IP addresses or domains that you
can add to an allow list.
Restrictive control mechanisms like man-in-the-middle proxies or network
policies that restrict access to a specific list of IPs/domains will break with
this change. For these scenarios, we encourage you to mirror the release
packages to a local package repository that you have strict control over.
## What should I do if I detect some abnormality with the new repositories?
If you encounter any issue with new Kubernetes package repositories, please
file an issue in the
[`kubernetes/release` repository](https://github.com/kubernetes/release/issues/new/choose).

View File

@ -1,7 +1,7 @@
---
layout: blog
title: "Kubernetes 1.28: Non-Graceful Node Shutdown Moves to GA"
date: 2023-08-15T10:00:00-08:00
date: 2023-08-16T10:00:00-08:00
slug: kubernetes-1-28-non-graceful-node-shutdown-GA
---

View File

@ -0,0 +1,39 @@
---
layout: blog
title: "Kubernetes v1.28: Retroactive Default StorageClass move to GA"
date: 2023-08-18
slug: retroactive-default-storage-class-ga
---
**Author:** Roman Bednář (Red Hat)
Announcing graduation to General Availability (GA): Retroactive Default StorageClass Assignment in Kubernetes v1.28!
The Kubernetes SIG Storage team is thrilled to announce that the "Retroactive Default StorageClass Assignment" feature,
introduced as an alpha in Kubernetes v1.25, has now graduated to GA and is officially part of the Kubernetes v1.28 release.
This enhancement brings a significant improvement to how default
[StorageClasses](/docs/concepts/storage/storage-classes/) are assigned to PersistentVolumeClaims (PVCs).
With this feature enabled, you no longer need to create a default StorageClass first and then a PVC to assign the class.
Instead, any PVCs without a StorageClass assigned will now be retroactively updated to include the default StorageClass.
This enhancement ensures that PVCs no longer get stuck in an unbound state, and storage provisioning works seamlessly,
even when a default StorageClass is not defined at the time of PVC creation.
## What changed?
The PersistentVolume (PV) controller has been modified to automatically assign a default StorageClass to any unbound
PersistentVolumeClaim with the `storageClassName` not set. Additionally, the PersistentVolumeClaim
admission validation mechanism within
the API server has been adjusted to allow changing values from an unset state to an actual StorageClass name.
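To sketch what this means in practice (the claim name below is just an example), consider a PVC that omits `storageClassName` entirely:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim  # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName is intentionally not set
```

Previously, if no default StorageClass existed when this PVC was created, the claim could remain unbound indefinitely. With retroactive assignment, once a default StorageClass is created, the PV controller updates the claim's `storageClassName` to that default and provisioning proceeds.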
## How to use it?
As this feature has graduated to GA, there's no need to enable a feature gate anymore.
Simply make sure you are running Kubernetes v1.28 or later, and the feature will be available for use.
For more details, read about
[default StorageClass assignment](/docs/concepts/storage/persistent-volumes/#retroactive-default-storageclass-assignment) in the Kubernetes documentation.
You can also read the previous [blog post](/blog/2023/01/05/retroactive-default-storage-class/) announcing beta graduation in v1.26.
To provide feedback, join our [Kubernetes Storage Special-Interest-Group](https://github.com/kubernetes/community/tree/master/sig-storage) (SIG)
or participate in discussions on our [public Slack channel](https://app.slack.com/client/T09NY5SBT/C09QZFCE5).

View File

@ -0,0 +1,231 @@
---
layout: blog
title: "Kubernetes 1.28: Improved failure handling for Jobs"
date: 2023-08-21
slug: kubernetes-1-28-jobapi-update
---
**Authors:** Kevin Hannon (G-Research), Michał Woźniak (Google)
This blog discusses two new features in Kubernetes 1.28 to improve Jobs for batch
users: [Pod replacement policy](/docs/concepts/workloads/controllers/job/#pod-replacement-policy)
and [Backoff limit per index](/docs/concepts/workloads/controllers/job/#backoff-limit-per-index).
These features continue the effort started by the
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy)
to improve the handling of Pod failures in a Job.
## Pod replacement policy {#pod-replacement-policy}
By default, when a pod enters a terminating state (e.g. due to preemption or
eviction), Kubernetes immediately creates a replacement Pod. Therefore, both Pods are running
at the same time. In API terms, a pod is considered terminating when it has a
`deletionTimestamp` and it has a phase `Pending` or `Running`.
The scenario when two Pods are running at a given time is problematic for
some popular machine learning frameworks, such as
TensorFlow and [JAX](https://jax.readthedocs.io/en/latest/), which require at most one Pod running at the same time,
for a given index.
TensorFlow gives the following error if two Pods are running for a given index.
```
/job:worker/task:4: Duplicate task registration with task_name=/job:worker/replica:0/task:4
```
See more details in this [issue](https://github.com/kubernetes/kubernetes/issues/115844).
Creating the replacement Pod before the previous one fully terminates can also
cause problems in clusters with scarce resources or with tight budgets, such as:
* cluster resources can be difficult to obtain for Pods waiting to be scheduled,
as Kubernetes might take a long time to find available nodes until the existing
Pods are fully terminated.
* if cluster autoscaler is enabled, the replacement Pods might produce undesired
scale ups.
### How can you use it? {#pod-replacement-policy-how-to-use}
This is an alpha feature, which you can enable by turning on `JobPodReplacementPolicy`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) in
your cluster.
Once the feature is enabled in your cluster, you can use it by creating a new Job that specifies a
`podReplacementPolicy` field as shown here:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: new
...
spec:
podReplacementPolicy: Failed
...
```
In that Job, the Pods would only be replaced once they reached the `Failed` phase,
and not when they are terminating.
Additionally, you can inspect the `.status.terminating` field of a Job. The value
of the field is the number of Pods owned by the Job that are currently terminating.
```shell
kubectl get jobs/myjob -o=jsonpath='{.status.terminating}'
```
```
3 # three Pods are terminating and have not yet reached the Failed phase
```
This can be particularly useful for external queueing controllers, such as
[Kueue](https://github.com/kubernetes-sigs/kueue), which track quota
from running Pods of a Job until the resources are reclaimed from
the currently terminating Job.
Note that `podReplacementPolicy: Failed` is the default when using a custom
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy).
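For illustration, here is a sketch of a Job that combines the two (the name, image, and exit code are placeholders, not a recommended configuration); because a `podFailurePolicy` is set, `podReplacementPolicy: Failed` would be the default even if it were omitted:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-pod-failure-policy  # illustrative name
spec:
  podReplacementPolicy: Failed
  backoffLimit: 3
  podFailurePolicy:
    rules:
    - action: FailJob
      onExitCodes:
        operator: In
        values: [42]   # placeholder exit code
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example/my-batch-image:latest  # placeholder image
```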
## Backoff limit per index {#backoff-limit-per-index}
By default, Pod failures for [Indexed Jobs](/docs/concepts/workloads/controllers/job/#completion-mode)
are counted towards the global limit of retries, represented by `.spec.backoffLimit`.
This means that if there is a consistently failing index, it is restarted
repeatedly until it exhausts the limit. Once the limit is reached, the entire
Job is marked failed and some indexes may never even be started.
This is problematic for use cases where you want to handle Pod failures for
every index independently. For example, you might use Indexed Jobs for running
integration tests where each index corresponds to a testing suite. In that case,
you may want to account for possible flaky tests by allowing 1 or 2 retries per
suite. There might be some buggy suites, making the corresponding
indexes fail consistently. In that case, you may prefer to limit retries for
the buggy suites, while allowing other suites to complete.
The feature allows you to:
* complete execution of all indexes, despite some indexes failing.
* better utilize the computational resources by avoiding unnecessary retries of consistently failing indexes.
### How can you use it? {#backoff-limit-per-index-how-to-use}
This is an alpha feature, which you can enable by turning on the
`JobBackoffLimitPerIndex`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
in your cluster.
Once the feature is enabled in your cluster, you can create an Indexed Job with the
`.spec.backoffLimitPerIndex` field specified.
#### Example
The following example demonstrates how to use this feature to make sure the
Job executes all indexes (provided there is no other reason for the early Job
termination, such as reaching the `activeDeadlineSeconds` timeout, or being
manually deleted by the user), and the number of failures is controlled per index.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: job-backoff-limit-per-index-execute-all
spec:
completions: 8
parallelism: 2
completionMode: Indexed
backoffLimitPerIndex: 1
template:
spec:
restartPolicy: Never
containers:
- name: example # this example container returns an error, and fails,
# when it is run as the second or third index in any Job
# (even after a retry)
image: python
command:
- python3
- -c
- |
import os, sys, time
id = int(os.environ.get("JOB_COMPLETION_INDEX"))
if id == 1 or id == 2:
sys.exit(1)
time.sleep(1)
```
Now, inspect the Pods after the job is finished:
```sh
kubectl get pods -l job-name=job-backoff-limit-per-index-execute-all
```
Returns output similar to this:
```
NAME READY STATUS RESTARTS AGE
job-backoff-limit-per-index-execute-all-0-b26vc 0/1 Completed 0 49s
job-backoff-limit-per-index-execute-all-1-6j5gd 0/1 Error 0 49s
job-backoff-limit-per-index-execute-all-1-6wd82 0/1 Error 0 37s
job-backoff-limit-per-index-execute-all-2-c66hg 0/1 Error 0 32s
job-backoff-limit-per-index-execute-all-2-nf982 0/1 Error 0 43s
job-backoff-limit-per-index-execute-all-3-cxmhf 0/1 Completed 0 33s
job-backoff-limit-per-index-execute-all-4-9q6kq 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-5-z9hqf 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-6-tbkr8 0/1 Completed 0 23s
job-backoff-limit-per-index-execute-all-7-hxjsq 0/1 Completed 0 22s
```
Additionally, you can take a look at the status for that Job:
```sh
kubectl get jobs job-backoff-limit-per-index-execute-all -o yaml
```
The output ends with a `status` similar to:
```yaml
status:
completedIndexes: 0,3-7
failedIndexes: 1,2
succeeded: 6
failed: 4
conditions:
- message: Job has failed indexes
reason: FailedIndexes
status: "True"
type: Failed
```
Here, indexes `1` and `2` were both retried once. After the second failure
of each of them, the specified `.spec.backoffLimitPerIndex` was exceeded, so
the retries stopped. For comparison, if the per-index backoff were disabled,
then the buggy indexes would have retried until the global `backoffLimit` was exceeded,
and then the entire Job would have been marked failed before some of the higher
indexes were started.
## How can you learn more?
- Read the user-facing documentation for [Pod replacement policy](/docs/concepts/workloads/controllers/job/#pod-replacement-policy),
[Backoff limit per index](/docs/concepts/workloads/controllers/job/#backoff-limit-per-index), and
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy)
- Read the KEPs for [Pod Replacement Policy](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3939-allow-replacement-when-fully-terminated),
[Backoff limit per index](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3850-backoff-limits-per-index-for-indexed-jobs), and
[Pod failure policy](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures).
## Getting Involved
These features were sponsored by [SIG Apps](https://github.com/kubernetes/community/tree/master/sig-apps). Batch use cases are actively
being improved for Kubernetes users in the
[batch working group](https://github.com/kubernetes/community/tree/master/wg-batch).
Working groups are relatively short-lived initiatives focused on specific goals.
The goal of the WG Batch is to improve the experience for batch workload users, offer support for
batch processing use cases, and enhance the
Job API for common use cases. If that interests you, please join the working
group either by subscribing to our
[mailing list](https://groups.google.com/a/kubernetes.io/g/wg-batch) or on
[Slack](https://kubernetes.slack.com/messages/wg-batch).
## Acknowledgments
As with any Kubernetes feature, multiple people contributed to getting this
done, from testing and filing bugs to reviewing code.
We would not have been able to achieve either of these features without Aldo
Culquicondor (Google) providing excellent domain knowledge and expertise
throughout the Kubernetes ecosystem.

View File

@ -0,0 +1,128 @@
---
layout: blog
title: 'Kubernetes 1.28: Node podresources API Graduates to GA'
date: 2023-08-23
slug: kubelet-podresources-api-GA
---
**Author:**
Francesco Romani (Red Hat)
The podresources API is an API served by the kubelet locally on the node, which exposes the compute resources exclusively
allocated to containers. With the release of Kubernetes 1.28, that API is now Generally Available.
## What problem does it solve?
The kubelet can allocate exclusive resources to containers, like
[CPUs, granting exclusive access to full cores](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/)
or [memory, either regions or hugepages](https://kubernetes.io/docs/tasks/administer-cluster/memory-manager/).
Workloads which require high performance, or low latency (or both) leverage these features.
The kubelet also can assign [devices to containers](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/).
Collectively, these features which enable exclusive assignments are known as "resource managers".
Without an API like podresources, the only possible option to learn about resource assignment was to read the state files the
resource managers use. While done out of necessity, the problem with this approach is that the path and the format of these files are
both internal implementation details. Albeit very stable, the project reserves the right to change them freely.
Consuming the content of the state files is thus fragile and unsupported, and projects doing this are recommended to consider
moving to the podresources API or to other supported APIs.
## Overview of the API
The podresources API was [initially proposed to enable device monitoring](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
In order to enable monitoring agents, a key prerequisite is to enable introspection of device assignment, which is performed by the kubelet.
Serving this purpose was the initial goal of the API. The first iteration of the API only had a single function implemented, `List`,
to return information about the assignment of devices to containers.
The API is used by [multus CNI](https://github.com/k8snetworkplumbingwg/multus-cni) and by
[GPU monitoring tools](https://github.com/NVIDIA/dcgm-exporter).
Since its inception, the podresources API increased its scope to cover other resource managers than device manager.
Starting from Kubernetes 1.20, the `List` API also reports CPU cores and memory regions (including hugepages); the API also
reports the NUMA locality of the devices, while the locality of CPUs and memory can be inferred from the system.
In Kubernetes 1.21, the API [gained](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2403-pod-resources-allocatable-resources/README.md)
the `GetAllocatableResources` function.
This newer API complements the existing `List` API and enables monitoring agents to determine the unallocated resources,
thus enabling new features built on top of the podresources API like a
[NUMA-aware scheduler plugin](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg/noderesourcetopology/README.md).
Finally, in Kubernetes 1.27, another function, `Get`, was introduced to be more friendly to CNI meta-plugins, making it simpler to access resources
allocated to a specific pod, rather than having to filter through resources for all pods on the node. The `Get` function is currently alpha level.
## Consuming the API
The podresources API is served by the kubelet locally, on the same node on which it is running.
On unix flavors, the endpoint is served over a unix domain socket; the default path is `/var/lib/kubelet/pod-resources/kubelet.sock`.
On Windows, the endpoint is served over a named pipe; the default path is `npipe://\\.\pipe\kubelet-pod-resources`.
In order for a containerized monitoring application to consume the API, the socket should be mounted inside the container.
A good practice is to mount the directory in which the podresources socket endpoint sits, rather than the socket directly.
This ensures that, after a kubelet restart, the containerized monitoring application will be able to reconnect to the socket.
An example manifest for a hypothetical monitoring agent consuming the podresources API and deployed as a DaemonSet could look like:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: podresources-monitoring-app
namespace: monitoring
spec:
selector:
matchLabels:
name: podresources-monitoring
template:
metadata:
labels:
name: podresources-monitoring
spec:
containers:
- args:
- --podresources-socket=unix:///host-podresources/kubelet.sock
command:
- /bin/podresources-monitor
image: podresources-monitor:latest # just for an example
volumeMounts:
- mountPath: /host-podresources
name: host-podresources
serviceAccountName: podresources-monitor
volumes:
- hostPath:
path: /var/lib/kubelet/pod-resources
type: Directory
name: host-podresources
```
I hope you find it straightforward to consume the podresources API programmatically.
The kubelet API package provides the protocol file and the Go type definitions; however, a client package is not yet available from the project,
and the existing code should not be used directly.
The [recommended](https://github.com/kubernetes/kubernetes/blob/v1.28.0-rc.0/pkg/kubelet/apis/podresources/client.go#L32)
approach is to reimplement the client in your projects, copying and pasting the related functions like for example
the multus project is [doing](https://github.com/k8snetworkplumbingwg/multus-cni/blob/v4.0.2/pkg/kubeletclient/kubeletclient.go).
When operating a containerized monitoring application that consumes the podresources API, a few points are worth highlighting to prevent "gotcha" moments:
- Even though the API only exposes data, and by design doesn't allow clients to mutate the kubelet state, the gRPC request/response model requires
read-write access to the podresources API socket. In other words, it is not possible to limit the container mount to `ReadOnly`.
- Multiple clients are allowed to connect to the podresources socket and consume the API, since it is stateless.
- The kubelet has [built-in rate limits](https://github.com/kubernetes/kubernetes/pull/116459) to mitigate local Denial of Service attacks from
misbehaving or malicious consumers. The consumers of the API must tolerate rate limit errors returned by the server. The rate limit is currently
hardcoded and global, so misbehaving clients can consume all the quota and potentially starve correctly behaving clients.
## Future enhancements
For historical reasons, the podresources API has a less precise specification than typical Kubernetes APIs (such as the Kubernetes HTTP API, or the container runtime interface).
This leads to unspecified behavior in corner cases.
An [effort](https://issues.k8s.io/119423) is ongoing to rectify this state and to have a more precise specification.
The [Dynamic Resource Allocation (DRA)](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation) infrastructure
is a major overhaul of the resource management.
The [integration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3695-pod-resources-for-dra) with the podresources API
is already ongoing.
An [effort](https://issues.k8s.io/119817) is ongoing to recommend or create a reference client package ready to be consumed.
## Getting involved
This feature is driven by [SIG Node](https://github.com/Kubernetes/community/blob/master/sig-node/README.md).
Please join us to connect with the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!

View File

@ -0,0 +1,248 @@
---
layout: blog
title: "Kubernetes 1.28: Beta support for using swap on Linux"
date: 2023-08-24T10:00:00-08:00
slug: swap-linux-beta
---
**Author:** Itamar Holder (Red Hat)
The 1.22 release [introduced Alpha support](/blog/2021/08/09/run-nodes-with-swap-alpha/)
for configuring swap memory usage for Kubernetes workloads running on Linux on a per-node basis.
Now, in release 1.28, support for swap on Linux nodes has graduated to Beta, along with many
new improvements.
Prior to version 1.22, Kubernetes did not provide support for swap memory on Linux systems.
This was due to the inherent difficulty in guaranteeing and accounting for pod memory utilization
when swap memory was involved. As a result, swap support was deemed out of scope in the initial
design of Kubernetes, and the default behavior of a kubelet was to fail to start if swap memory
was detected on a node.
In version 1.22, the swap feature for Linux was initially introduced in its Alpha stage. This represented
a significant advancement, providing Linux users with the opportunity to experiment with the swap
feature for the first time. However, as an Alpha version, it was not fully developed and had
several issues, including inadequate support for cgroup v2, insufficient metrics and summary
API statistics, inadequate testing, and more.
Swap in Kubernetes has numerous [use cases](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#user-stories)
for a wide range of users. As a result, the node special interest group within the Kubernetes project
has invested significant effort into supporting swap on Linux nodes for beta.
Compared to the alpha, the kubelet's support for running with swap enabled is more stable and
robust, more user-friendly, and addresses many known shortcomings. This graduation to beta
represents a crucial step towards achieving the goal of fully supporting swap in Kubernetes.
## How do I use it?
To allow Kubernetes workloads to use swap memory on a node where swap has already been
provisioned, enable the `NodeSwap` feature gate on the kubelet.
Additionally, you must set the `failSwapOn` configuration setting, or the deprecated
`--fail-swap-on` command line flag, to false.
It is possible to configure the `memorySwap.swapBehavior` option to define the manner in which a node utilizes swap memory. For instance,
```yaml
# this fragment goes into the kubelet's configuration file
memorySwap:
swapBehavior: UnlimitedSwap
```
The available configuration options for `swapBehavior` are:
- `UnlimitedSwap` (default): Kubernetes workloads can use as much swap memory as they
request, up to the system limit.
- `LimitedSwap`: The utilization of swap memory by Kubernetes workloads is subject to limitations.
Only Pods of [Burstable](/docs/concepts/workloads/pods/pod-qos/#burstable) QoS are permitted to employ swap.
If configuration for `memorySwap` is not specified and the feature gate is
enabled, by default the kubelet will apply the same behaviour as the
`UnlimitedSwap` setting.
Note that `NodeSwap` is supported for **cgroup v2** only. For Kubernetes v1.28,
using swap along with cgroup v1 is no longer supported.
## Install a swap-enabled cluster with kubeadm
### Before you begin
It is required for this demo that the kubeadm tool be installed, following the steps outlined in the
[kubeadm installation guide](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm).
If swap is already enabled on the node, cluster creation may
proceed. If swap is not enabled, please refer to the provided instructions for enabling swap.
### Create a swap file and turn swap on
I'll demonstrate creating 4GiB of unencrypted swap.
```bash
dd if=/dev/zero of=/swapfile bs=128M count=32 # allocate a 4GiB swap file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile # enable the swap file only until this node is rebooted
swapon -s        # optional: verify that the swap file is active
```
To enable the swap file at boot time, add a line like `/swapfile swap swap defaults 0 0` to the `/etc/fstab` file.
### Set up a Kubernetes cluster that uses swap-enabled nodes
To make things clearer, here is an example kubeadm configuration file `kubeadm-config.yaml` for a swap-enabled cluster.
```yaml
---
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: InitConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
featureGates:
NodeSwap: true
memorySwap:
swapBehavior: LimitedSwap
```
Then create a single-node cluster using `kubeadm init --config kubeadm-config.yaml`.
During `kubeadm init`, a warning is emitted if swap is enabled on the node, and also if the kubelet's
`failSwapOn` setting is set to true. We plan to remove this warning in a future release.
## How is the swap limit being determined with LimitedSwap?
The configuration of swap memory, including its limitations, presents a significant
challenge. Not only is it prone to misconfiguration, but as a system-level property, any
misconfiguration could potentially compromise the entire node rather than just a specific
workload. To mitigate this risk and ensure the health of the node, we have implemented
Swap in Beta with automatic configuration of limitations.
With `LimitedSwap`, Pods that do not fall under the Burstable QoS classification (i.e.
`BestEffort`/`Guaranteed` QoS Pods) are prohibited from utilizing swap memory.
`BestEffort` QoS Pods exhibit unpredictable memory consumption patterns and lack
information regarding their memory usage, making it difficult to determine a safe
allocation of swap memory. Conversely, `Guaranteed` QoS Pods are typically employed for
applications that rely on the precise allocation of resources specified by the workload,
with memory being immediately available. To maintain the aforementioned security and node
health guarantees, these Pods are not permitted to use swap memory when `LimitedSwap` is
in effect.
Prior to detailing the calculation of the swap limit, it is necessary to define the following terms:
* `nodeTotalMemory`: The total amount of physical memory available on the node.
* `totalPodsSwapAvailable`: The total amount of swap memory on the node that is available for use by Pods (some swap memory may be reserved for system use).
* `containerMemoryRequest`: The container's memory request.
Swap limitation is configured as:
`(containerMemoryRequest / nodeTotalMemory) × totalPodsSwapAvailable`
In other words, the amount of swap that a container is able to use is proportionate to its
memory request, the node's total physical memory and the total amount of swap memory on
the node that is available for use by Pods.
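As a purely illustrative calculation (the numbers are made up): on a node with 64 GiB of physical memory and 16 GiB of swap available for Pods, a Burstable QoS container requesting 8 GiB of memory would be limited to:

```
(8 GiB / 64 GiB) × 16 GiB = 2 GiB of swap
```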
It is important to note that, for containers within Burstable QoS Pods, it is possible to
opt-out of swap usage by specifying memory requests that are equal to memory limits.
Containers configured in this manner will not have access to swap memory.
## How does it work?
There are a number of possible ways that one could envision swap use on a node.
When swap is already provisioned and available on a node,
SIG Node has [proposed](https://github.com/kubernetes/enhancements/blob/9d127347773ad19894ca488ee04f1cd3af5774fc/keps/sig-node/2400-node-swap/README.md#proposal)
that the kubelet should be configurable so that:
- It can start with swap on.
- It will direct the Container Runtime Interface to allocate zero swap memory
to Kubernetes workloads by default.
Swap configuration on a node is exposed to a cluster admin via the
[`memorySwap` in the KubeletConfiguration](/docs/reference/config-api/kubelet-config.v1).
As a cluster administrator, you can specify the node's behaviour in the
presence of swap memory by setting `memorySwap.swapBehavior`.
The kubelet [employs the CRI](https://kubernetes.io/docs/concepts/architecture/cri/)
(container runtime interface) API to direct the CRI to
configure specific cgroup v2 parameters (such as `memory.swap.max`) in a manner that will
enable the desired swap configuration for a container. The container runtime is then responsible for
writing these settings to the container-level cgroup.
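If you want to inspect the result for a running container, one possible way (assuming cgroup v2 and a runtime that mounts the unified cgroup hierarchy at the conventional path inside the container) is:

```shell
# run inside the container; the path assumes the standard cgroup v2 mount point
cat /sys/fs/cgroup/memory.swap.max
```

A numeric value is the swap limit in bytes; `max` means no limit has been applied.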
## How can I monitor swap?
A notable deficiency in the Alpha version was the inability to monitor and introspect swap
usage. This issue has been addressed in the Beta version introduced in Kubernetes 1.28, which now
provides the capability to monitor swap usage through several different methods.
The beta version of kubelet now collects
[node-level metric statistics](/docs/reference/instrumentation/node-metrics/),
which can be accessed at the `/metrics/resource` and `/stats/summary` kubelet HTTP endpoints.
This allows clients who can directly interrogate the kubelet to
monitor swap usage and remaining swap memory when using LimitedSwap. Additionally, a
`machine_swap_bytes` metric has been added to cadvisor to show the total physical swap capacity of the
machine.
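For example, if you can reach the kubelet through the API server proxy, something like the following (replace `<node-name>` with one of your nodes) returns the raw statistics and resource metrics that include the swap-related fields:

```shell
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/resource"
```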
## Caveats
Having swap available on a system reduces predictability. Swap's performance is
worse than regular memory, sometimes by many orders of magnitude, which can
cause unexpected performance regressions. Furthermore, swap changes a system's
behaviour under memory pressure. Since enabling swap permits
greater memory usage for workloads in Kubernetes that cannot be predictably
accounted for, it also increases the risk of noisy neighbours and unexpected
packing configurations, as the scheduler cannot account for swap memory usage.
The performance of a node with swap memory enabled depends on the underlying
physical storage. When swap memory is in use, performance will be significantly
worse in an I/O operations per second (IOPS) constrained environment, such as a
cloud VM with I/O throttling, when compared to faster storage mediums like
solid-state drives or NVMe.
As such, we do not advocate the utilization of swap memory for workloads or
environments that are subject to performance constraints. Furthermore, it is
recommended to employ `LimitedSwap`, as this significantly mitigates the risks
posed to the node.
Cluster administrators and developers should benchmark their nodes and applications
before using swap in production scenarios, and [we need your help](#how-do-i-get-involved) with that!
### Security risk
Enabling swap on a system without encryption poses a security risk, as critical information,
such as volumes that represent Kubernetes Secrets, [may be swapped out to the disk](/docs/concepts/configuration/secret/#information-security-for-secrets).
If an unauthorized individual gains
access to the disk, they could potentially obtain this confidential data. To mitigate this risk, the
Kubernetes project strongly recommends that you encrypt your swap space.
However, handling encrypted swap is not within the scope of
kubelet; rather, it is a general OS configuration concern and should be addressed at that level.
It is the administrator's responsibility to provision encrypted swap to mitigate this risk.
Furthermore, as previously mentioned, with `LimitedSwap` the user has the option to completely
disable swap usage for a container by specifying memory requests that are equal to memory limits.
This will prevent the corresponding containers from accessing swap memory.
## Looking ahead
The Kubernetes 1.28 release introduced Beta support for swap memory on Linux nodes,
and we will continue to work towards [general availability](/docs/reference/command-line-tools-reference/feature-gates/#feature-stages)
for this feature. I hope that this will include:
* Adding the ability to set a system-reserved quantity of swap from what kubelet detects on the host.
* Adding support for controlling swap consumption at the Pod level via cgroups.
* This point is still under discussion.
* Collecting feedback from user test cases.
* We will consider introducing new configuration modes for swap, such as a
node-wide swap limit for workloads.
## How can I learn more?
You can review the current [documentation](/docs/concepts/architecture/nodes/#swap-memory)
for using swap with Kubernetes.
For more information, and to assist with testing and provide feedback, please
see [KEP-2400](https://github.com/kubernetes/enhancements/issues/4128) and its
[design proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md).
## How do I get involved?
Your feedback is always welcome! SIG Node [meets regularly](https://github.com/kubernetes/community/tree/master/sig-node#meetings)
and [can be reached](https://github.com/kubernetes/community/tree/master/sig-node#contact)
via [Slack](https://slack.k8s.io/) (channel **#sig-node**), or the SIG's
[mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node). A Slack
channel dedicated to swap is also available at **#sig-node-swap**.
Feel free to reach out to me, Itamar Holder (**@iholder101** on Slack and GitHub)
if you'd like to help or ask further questions.

View File

@ -0,0 +1,114 @@
---
layout: blog
title: "Kubernetes v1.28: Introducing native sidecar containers"
date: 2023-08-25
slug: native-sidecar-containers
---
***Authors:*** Todd Neal (AWS), Matthias Bertschy (ARMO), Sergey Kanzhelev (Google), Gunju Kim (NAVER), Shannon Kularathna (Google)
This post explains how to use the new sidecar feature, which enables restartable init containers and is available in alpha in Kubernetes 1.28. We want your feedback so that we can graduate this feature as soon as possible.
The concept of a “sidecar” has been part of Kubernetes since nearly the very beginning. In 2015, sidecars were described in a [blog post](/blog/2015/06/the-distributed-system-toolkit-patterns/) about composite containers as additional containers that “extend and enhance the main container”. Sidecar containers have become a common Kubernetes deployment pattern and are often used for network proxies or as part of a logging system. Until now, sidecars were a concept that Kubernetes users applied without native support. The lack of native support has caused some usage friction, which this enhancement aims to resolve.
## What are sidecar containers in 1.28?
Kubernetes 1.28 adds a new `restartPolicy` field to [init containers](/docs/concepts/workloads/pods/init-containers/) that is available when the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled.
```yaml
apiVersion: v1
kind: Pod
spec:
initContainers:
- name: secret-fetch
image: secret-fetch:1.0
- name: network-proxy
image: network-proxy:1.0
restartPolicy: Always
containers:
...
```
The field is optional and, if set, the only valid value is `Always`. Setting this field changes the behavior of init containers as follows:
- The container restarts if it exits
- Any subsequent init container starts immediately after the [startupProbe](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes) has successfully completed instead of waiting for the restartable init container to exit
- The resource usage calculation changes for the pod as restartable init container resources are now added to the sum of the resource requests by the main containers
[Pod termination](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination) continues to only depend on the main containers. An init container with a `restartPolicy` of `Always` (named a sidecar) won't prevent the pod from terminating after the main containers exit.
The following properties of restartable init containers make them ideal for the sidecar deployment pattern:
- Init containers have a well-defined startup order regardless of whether you set a `restartPolicy`, so you can ensure that your sidecar starts before any container declarations that come after the sidecar declaration in your manifest.
- Sidecar containers don't extend the lifetime of the Pod, so you can use them in short-lived Pods with no changes to the Pod lifecycle.
- Sidecar containers are restarted on exit, which improves resilience and lets you use sidecars to provide services that your main containers can more reliably consume.
## When to use sidecar containers
You might find built-in sidecar containers useful for workloads such as the following:
- **Batch or AI/ML workloads**, or other Pods that run to completion. These workloads will experience the most significant benefits.
- **Network proxies** that start up before any other container in the manifest. Every other container that runs can use the proxy container's services. For instructions, see the [Kubernetes Native sidecars in Istio blog post](https://istio.io/latest/blog/2023/native-sidecars/).
- **Log collection containers**, which can now start before any other container and run until the Pod terminates. This improves the reliability of log collection in your Pods.
- **Jobs**, which can use sidecars for any purpose without Job completion being blocked by the running sidecar. No additional configuration is required to ensure this behavior.
## How did users get sidecar behavior before 1.28?
Prior to the sidecar feature, the following options were available for implementing sidecar behavior depending on the desired lifetime of the sidecar container:
- **Lifetime of sidecar less than Pod lifetime**: Use an init container, which provides well-defined startup order. However, the sidecar has to exit for other init containers and main Pod containers to start.
- **Lifetime of sidecar equal to Pod lifetime**: Use a main container that runs alongside your workload containers in the Pod. This method doesn't give you control over startup order, and lets the sidecar container potentially block Pod termination after the workload containers exit.
The built-in sidecar feature solves for the use case of having a lifetime equal to the Pod lifetime and has the following additional benefits:
- Provides control over startup order
- Doesnt block Pod termination
## Transitioning existing sidecars to the new model
We recommend only using the sidecars feature gate in [short lived testing clusters](/docs/reference/command-line-tools-reference/feature-gates/#feature-stages) at the alpha stage. If you have an existing sidecar that is configured as a main container so it can run for the lifetime of the pod, it can be moved to the `initContainers` section of the pod spec and given a `restartPolicy` of `Always`. In many cases, the sidecar should work as before with the added benefit of having a defined startup ordering and not prolonging the pod lifetime.
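As a rough sketch of that transition (the names and images below are placeholders, not a recommended configuration), the relevant part of the Pod spec would end up looking like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar        # illustrative name
spec:
  initContainers:
  - name: log-shipper           # previously listed under .spec.containers
    image: registry.example/log-shipper:1.0   # placeholder image
    restartPolicy: Always       # turns this init container into a restartable sidecar
  containers:
  - name: main-app
    image: registry.example/main-app:1.0      # placeholder image
```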
## Known issues
The alpha release of built-in sidecar containers has the following known issues, which we'll resolve before graduating the feature to beta:
- The CPU, memory, device, and topology manager are unaware of the sidecar container lifetime and additional resource usage, and will operate as if the Pod had lower resource requests than it actually does.
- The output of `kubectl describe node` is incorrect when sidecars are in use. The output shows resource usage that's lower than the actual usage because it doesn't use the new resource usage calculation for sidecar containers.
## We need your feedback!
In the alpha stage, we want you to try out sidecar containers in your environments and open issues if you encounter bugs or friction points. We're especially interested in feedback about the following:
- The shutdown sequence, especially with multiple sidecars running
- The backoff timeout adjustment for crashing sidecars
- The behavior of Pod readiness and liveness probes when sidecars are running
To open an issue, see the [Kubernetes GitHub repository](https://github.com/kubernetes/kubernetes/issues/new/choose).
## Whats next?
In addition to the known issues that will be resolved, we're working on adding termination ordering for sidecar and main containers. This will ensure that sidecar containers only terminate after the Pod's main containers have exited.
Were excited to see the sidecar feature come to Kubernetes and are interested in feedback.
## Acknowledgements
Many years have passed since the original KEP was written, so we apologize if we omit anyone who worked on this feature over the years. This is a best-effort attempt to recognize the people involved in this effort.
- [mrunalp](https://github.com/mrunalp/) for design discussions and reviews
- [thockin](https://github.com/thockin/) for API discussions and support through the years
- [bobbypage](https://github.com/bobbypage) for reviews
- [smarterclayton](https://github.com/smarterclayton) for detailed review and feedback
- [howardjohn](https://github.com/howardjohn) for feedback over years and trying it early during implementation
- [derekwaynecarr](https://github.com/derekwaynecarr) and [dchen1107](https://github.com/dchen1107) for leadership
- [jpbetz](https://github.com/Jpbetz) for API and termination ordering designs as well as code reviews
- [Joseph-Irving](https://github.com/Joseph-Irving) and [rata](https://github.com/rata) for the design and reviews of the early iterations years back
- [swatisehgal](https://github.com/swatisehgal) and [ffromani](https://github.com/ffromani) for early feedback on resource managers impact
- [alculquicondor](https://github.com/Alculquicondor) for feedback on addressing the version skew of the scheduler
- [wojtek-t](https://github.com/Wojtek-t) for PRR review of a KEP
- [ahg-g](https://github.com/ahg-g) for reviewing the scheduler portion of a KEP
- [adisky](https://github.com/Adisky) for the Job completion issue
## More Information
- Read [API for sidecar containers](/docs/concepts/workloads/pods/init-containers/#api-for-sidecar-containers) in the Kubernetes documentation
- Read the [Sidecar KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/753-sidecar-containers/README.md)

View File

@ -0,0 +1,102 @@
---
layout: blog
title: 'Comparing Local Kubernetes Development Tools: Telepresence, Gefyra, and mirrord'
date: 2023-09-12
slug: local-k8s-development-tools
---
**Author:** Eyal Bukchin (MetalBear)
The Kubernetes development cycle is an evolving landscape with a myriad of tools seeking to streamline the process. Each tool has its unique approach, and the choice often comes down to individual project requirements, the team's expertise, and the preferred workflow.
Among the various solutions, a category we dubbed “Local K8S Development tools” has emerged, which seeks to enhance the Kubernetes development experience by connecting locally running components to the Kubernetes cluster. This facilitates rapid testing of new code in cloud conditions, circumventing the traditional cycle of Dockerization, CI, and deployment.
In this post, we compare three solutions in this category: Telepresence, Gefyra, and our own contender, mirrord.
## Telepresence
The oldest and most well-established solution in the category, [Telepresence](https://www.telepresence.io/) uses a VPN (or more specifically, a `tun` device) to connect the user's machine (or a locally running container) and the cluster's network. It then supports the interception of incoming traffic to a specific service in the cluster, and its redirection to a local port. The traffic being redirected can also be filtered to avoid completely disrupting the remote service. It also offers complementary features to support file access (by locally mounting a volume mounted to a pod) and importing environment variables.
Telepresence requires the installation of a local daemon on the user's machine (which requires root privileges) and a Traffic Manager component on the cluster. Additionally, it runs an Agent as a sidecar on the pod to intercept the desired traffic.
## Gefyra
[Gefyra](https://gefyra.dev/), similar to Telepresence, employs a VPN to connect to the cluster. However, it only supports connecting locally running Docker containers to the cluster. This approach enhances portability across different OSes and local setups. However, the downside is that it does not support natively run uncontainerized code.
Gefyra primarily focuses on network traffic, leaving file access and environment variables unsupported. Unlike Telepresence, it doesn't alter the workloads in the cluster, ensuring a straightforward clean-up process if things go awry.
## mirrord
The newest of the three tools, [mirrord](https://mirrord.dev/) adopts a different approach by injecting itself
into the local binary (utilizing `LD_PRELOAD` on Linux or `DYLD_INSERT_LIBRARIES` on macOS),
and overriding libc function calls, which it then proxies to a temporary agent it runs in the cluster.
For example, when the local process tries to read a file, mirrord intercepts that call and sends it
to the agent, which then reads the file from the remote pod. This method allows mirrord to cover
all inputs and outputs of the process (network access, file access, and
environment variables) uniformly.
By working at the process level, mirrord supports running multiple local processes simultaneously, each in the context of their respective pod in the cluster, without requiring them to be containerized and without needing root permissions on the user's machine.
## Summary
<table>
<caption>Comparison of Telepresence, Gefyra, and mirrord</caption>
<thead>
<tr>
<td class="empty"></td>
<th>Telepresence</th>
<th>Gefyra</th>
<th>mirrord</th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row">Cluster connection scope</th>
<td>Entire machine or container</td>
<td>Container</td>
<td>Process</td>
</tr>
<tr>
<th scope="row">Developer OS support</th>
<td>Linux, macOS, Windows</td>
<td>Linux, macOS, Windows</td>
<td>Linux, macOS, Windows (WSL)</td>
</tr>
<tr>
<th scope="row">Incoming traffic features</th>
<td>Interception</td>
<td>Interception</td>
<td>Interception or mirroring</td>
</tr>
<tr>
<th scope="row">File access</th>
<td>Supported</td>
<td>Unsupported</td>
<td>Supported</td>
</tr>
<tr>
<th scope="row">Environment variables</th>
<td>Supported</td>
<td>Unsupported</td>
<td>Supported</td>
</tr>
<tr>
<th scope="row">Requires local root</th>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<th scope="row">How to use</th>
<td><ul><li>CLI</li><li>Docker Desktop extension</li></ul></td>
<td><ul><li>CLI</li><li>Docker Desktop extension</li></ul></td>
<td><ul><li>CLI</li><li>Visual Studio Code extension</li><li>IntelliJ plugin</li></ul></td>
</tr>
</tbody>
</table>
## Conclusion
Telepresence, Gefyra, and mirrord each offer a unique approach to streamlining the Kubernetes development cycle, and each has its strengths and weaknesses. Telepresence is feature-rich but comes with complexities; mirrord offers a seamless experience and supports various functionalities; Gefyra aims for simplicity and robustness.
Your choice between them should depend on the specific requirements of your project, your team's familiarity with the tools, and the desired development workflow. Whichever tool you choose, we believe the local Kubernetes development approach can provide an easy, effective, and cheap solution to the bottlenecks of the Kubernetes development cycle, and will become even more prevalent as these tools continue to innovate and evolve.

View File

@ -20,7 +20,7 @@ case_study_details:
<h2>Solution</h2>
<p>Mountain has been overseeing the company's migration to <a href="http://kubernetes.io/">Kubernetes</a>, using <a href="https://www.openshift.org/">OpenShift</a> Container Platform, <a href="https://www.redhat.com/en">Red Hat</a>'s enterprise container platform.</p>
<p>Mountain has been overseeing the company's migration to <a href="https://kubernetes.io/">Kubernetes</a>, using <a href="https://www.openshift.org/">OpenShift</a> Container Platform, <a href="https://www.redhat.com/en">Red Hat</a>'s enterprise container platform.</p>
<h2>Impact</h2>
@ -48,7 +48,7 @@ In his two decades at Amadeus, Eric Mountain has been the migrations guy.
<p>While mainly a C++ and Java shop, Amadeus also wanted to be able to adopt new technologies more easily. Some of its developers had started using languages like <a href="https://www.python.org/">Python</a> and databases like <a href="https://www.couchbase.com/">Couchbase</a>, but Mountain wanted still more options, he says, "in order to better adapt our technical solutions to the products we offer, and open up entirely new possibilities to our developers." Working with recent technologies and cool new things would also make it easier to attract new talent.</p>
<p>All of those needs led Mountain and his team on a search for a new platform. "We did a set of studies and proofs of concept over a fairly short period, and we considered many technologies," he says. "In the end, we were left with three choices: build everything on premise, build on top of <a href="http://kubernetes.io/">Kubernetes</a> whatever happens to be missing from our point of view, or go with <a href="https://www.openshift.com/">OpenShift</a> and build whatever remains there."</p>
<p>All of those needs led Mountain and his team on a search for a new platform. "We did a set of studies and proofs of concept over a fairly short period, and we considered many technologies," he says. "In the end, we were left with three choices: build everything on premise, build on top of <a href="https://kubernetes.io/">Kubernetes</a> whatever happens to be missing from our point of view, or go with <a href="https://www.openshift.com/">OpenShift</a> and build whatever remains there."</p>
<p>The team decided against building everything themselves—though they'd done that sort of thing in the past—because "people were already inventing things that looked good," says Mountain.</p>

View File

@ -0,0 +1,106 @@
---
reviewers:
- jpbetz
title: Mixed Version Proxy
content_type: concept
weight: 220
---
<!-- overview -->
{{< feature-state state="alpha" for_k8s_version="v1.28" >}}
Kubernetes {{< skew currentVersion >}} includes an alpha feature that lets an
{{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}
proxy resource requests to other _peer_ API servers. This is useful when there are multiple
API servers running different versions of Kubernetes in one cluster
(for example, during a long-lived rollout to a new release of Kubernetes).
This enables cluster administrators to configure highly available clusters that can be upgraded
more safely, by directing resource requests (made during the upgrade) to the correct kube-apiserver.
That proxying prevents users from seeing unexpected 404 Not Found errors that stem
from the upgrade process.
This mechanism is called the _Mixed Version Proxy_.
## Enabling the Mixed Version Proxy
Ensure that `UnknownVersionInteroperabilityProxy` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled when you start the {{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}:
```shell
kube-apiserver \
--feature-gates=UnknownVersionInteroperabilityProxy=true \
# required command line arguments for this feature
--peer-ca-file=<path to kube-apiserver CA cert> \
--proxy-client-cert-file=<path to aggregator proxy cert> \
--proxy-client-key-file=<path to aggregator proxy key> \
--requestheader-client-ca-file=<path to aggregator CA cert> \
# requestheader-allowed-names can be set to blank to allow any Common Name
--requestheader-allowed-names=<valid Common Names to verify proxy client cert against> \
# optional flags for this feature
--peer-advertise-ip=`IP of this kube-apiserver that should be used by peers to proxy requests` \
--peer-advertise-port=`port of this kube-apiserver that should be used by peers to proxy requests` \
# …and other flags as usual
```
### Proxy transport and authentication between API servers {#transport-and-authn}
* The source kube-apiserver reuses the
[existing APIserver client authentication flags](/docs/tasks/extend-kubernetes/configure-aggregation-layer/#kubernetes-apiserver-client-authentication)
`--proxy-client-cert-file` and `--proxy-client-key-file` to present its identity that
will be verified by its peer (the destination kube-apiserver). The destination API server
verifies that peer connection based on the configuration you specify using the
`--requestheader-client-ca-file` command line argument.
* To authenticate the destination server's serving certs, you must configure a certificate
authority bundle by specifying the `--peer-ca-file` command line argument to the **source** API server.
### Configuration for peer API server connectivity
To set the network location of a kube-apiserver that peers will use to proxy requests, use the
`--peer-advertise-ip` and `--peer-advertise-port` command line arguments to kube-apiserver or specify
these fields in the API server configuration file.
If these flags are unspecified, peers will use the value from either `--advertise-address` or
`--bind-address` command line argument to the kube-apiserver. If those, too, are unset, the host's default interface is used.
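For illustration, a minimal sketch of setting these flags explicitly; the address and port below are placeholder values, not recommendations:

```shell
# Placeholder values; advertise a dedicated peer address so that other
# kube-apiservers can proxy requests to this instance.
kube-apiserver \
  --peer-advertise-ip=192.0.2.10 \
  --peer-advertise-port=6443
  # ...plus the required flags shown earlier
```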
## Mixed version proxying
When you enable mixed version proxying, the [aggregation layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
loads a special filter that does the following:
* When a resource request reaches an API server that cannot serve that API
(either because it is at a version pre-dating the introduction of the API or the API is turned off on the API server)
the API server attempts to send the request to a peer API server that can serve the requested API.
It does so by identifying API groups / versions / resources that the local server doesn't recognise,
and tries to proxy those requests to a peer API server that is capable of handling the request.
* If the peer API server fails to respond, the _source_ API server responds with 503 ("Service Unavailable") error.
### How it works under the hood
When an API Server receives a resource request, it first checks which API servers can
serve the requested resource. This check happens using the internal `StorageVersion` API.
* If the resource is known to the API server that received the request
(for example, `GET /api/v1/pods/some-pod`), the request is handled locally.
* If there is no internal `StorageVersion` object found for the requested resource
(for example, `GET /my-api/v1/my-resource`) and the configured APIService specifies proxying
to an extension API server, that proxying happens following the usual
[flow](/docs/tasks/extend-kubernetes/configure-aggregation-layer/) for extension APIs.
* If a valid internal `StorageVersion` object is found for the requested resource
(for example, `GET /batch/v1/jobs`) and the API server trying to handle the request
(the _handling API server_) has the `batch` API disabled, then the _handling API server_
fetches the peer API servers that do serve the relevant API group / version / resource
(`batch/v1/jobs` in this case) using the information in the fetched `StorageVersion` object.
The _handling API server_ then proxies the request to one of the matching peer kube-apiservers
that are aware of the requested resource.
* If there is no peer known for that API group / version / resource, the handling API server
passes the request to its own handler chain which should eventually return a 404 ("Not Found") response.
* If the handling API server has identified and selected a peer API server, but that peer fails
to respond (for reasons such as network connectivity issues, or a data race between the request
being received and a controller registering the peer's info into the control plane), then the handling
API server responds with a 503 (“Service Unavailable”) error.

View File

@ -9,7 +9,8 @@ weight: 10
<!-- overview -->
Kubernetes runs your {{< glossary_tooltip text="workload" term_id="workload" >}} by placing containers into Pods to run on _Nodes_.
Kubernetes runs your {{< glossary_tooltip text="workload" term_id="workload" >}}
by placing containers into Pods to run on _Nodes_.
A node may be a virtual or physical machine, depending on the cluster. Each node
is managed by the
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}
@ -28,14 +29,15 @@ The [components](/docs/concepts/overview/components/#node-components) on a node
## Management
There are two main ways to have Nodes added to the {{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}:
There are two main ways to have Nodes added to the
{{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}:
1. The kubelet on a node self-registers to the control plane
2. You (or another human user) manually add a Node object
After you create a Node {{< glossary_tooltip text="object" term_id="object" >}},
or the kubelet on a node self-registers, the control plane checks whether the new Node object is
valid. For example, if you try to create a Node from the following JSON manifest:
or the kubelet on a node self-registers, the control plane checks whether the new Node object
is valid. For example, if you try to create a Node from the following JSON manifest:
```json
{
@ -72,11 +74,10 @@ The name of a Node object must be a valid
The [name](/docs/concepts/overview/working-with-objects/names#names) identifies a Node. Two Nodes
cannot have the same name at the same time. Kubernetes also assumes that a resource with the same
name is the same object. In case of a Node, it is implicitly assumed that an instance using the
same name will have the same state (e.g. network settings, root disk contents)
and attributes like node labels. This may lead to
inconsistencies if an instance was modified without changing its name. If the Node needs to be
replaced or updated significantly, the existing Node object needs to be removed from API server
first and re-added after the update.
same name will have the same state (e.g. network settings, root disk contents) and attributes like
node labels. This may lead to inconsistencies if an instance was modified without changing its name.
If the Node needs to be replaced or updated significantly, the existing Node object needs to be
removed from API server first and re-added after the update.
### Self-registration of Nodes
@ -163,10 +164,10 @@ that should run on the Node even if it is being drained of workload applications
A Node's status contains the following information:
* [Addresses](#addresses)
* [Conditions](#condition)
* [Capacity and Allocatable](#capacity)
* [Info](#info)
* [Addresses](/docs/reference/node/node-status/#addresses)
* [Conditions](/docs/reference/node/node-status/#condition)
* [Capacity and Allocatable](/docs/reference/node/node-status/#capacity)
* [Info](/docs/reference/node/node-status/#info)
You can use `kubectl` to view a Node's status and other details:
@ -174,121 +175,21 @@ You can use `kubectl` to view a Node's status and other details:
kubectl describe node <insert-node-name-here>
```
Each section of the output is described below.
See [Node Status](/docs/reference/node/node-status/) for more details.
### Addresses
The usage of these fields varies depending on your cloud provider or bare metal configuration.
* HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
`--hostname-override` parameter.
* ExternalIP: Typically the IP address of the node that is externally routable (available from
outside the cluster).
* InternalIP: Typically the IP address of the node that is routable only within the cluster.
### Conditions {#condition}
The `conditions` field describes the status of all `Running` nodes. Examples of conditions include:
{{< table caption = "Node conditions, and a description of when each condition applies." >}}
| Node Condition | Description |
|----------------------|-------------|
| `Ready` | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 40 seconds) |
| `DiskPressure` | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
| `MemoryPressure` | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
| `PIDPressure` | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
| `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
{{< /table >}}
{{< note >}}
If you use command-line tools to print details of a cordoned Node, the Condition includes
`SchedulingDisabled`. `SchedulingDisabled` is not a Condition in the Kubernetes API; instead,
cordoned nodes are marked Unschedulable in their spec.
{{< /note >}}
In the Kubernetes API, a node's condition is represented as part of the `.status`
of the Node resource. For example, the following JSON structure describes a healthy node:
```json
"conditions": [
{
"type": "Ready",
"status": "True",
"reason": "KubeletReady",
"message": "kubelet is posting ready status",
"lastHeartbeatTime": "2019-06-05T18:38:35Z",
"lastTransitionTime": "2019-06-05T11:41:27Z"
}
]
```
When problems occur on nodes, the Kubernetes control plane automatically creates
[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
affecting the node. An example of this is when the `status` of the Ready condition
remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
which defaults to 40 seconds. This will cause either a `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
These taints affect pending pods as the scheduler takes the Node's taints into consideration when
assigning a pod to a Node. Existing pods scheduled to the node may be evicted due to the application
of `NoExecute` taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
them schedule to and continue running on a Node even though it has a specific taint.
See [Taint Based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions) and
[Taint Nodes by Condition](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition)
for more details.
### Capacity and Allocatable {#capacity}
Describes the resources available on the node: CPU, memory, and the maximum
number of pods that can be scheduled onto the node.
The fields in the capacity block indicate the total amount of resources that a
Node has. The allocatable block indicates the amount of resources on a
Node that is available to be consumed by normal Pods.
You may read more about capacity and allocatable resources while learning how
to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
on a Node.
### Info
Describes general information about the node, such as kernel version, Kubernetes
version (kubelet and kube-proxy version), container runtime details, and which
operating system the node uses.
The kubelet gathers this information from the node and publishes it into
the Kubernetes API.
## Heartbeats
## Node heartbeats
Heartbeats, sent by Kubernetes nodes, help your cluster determine the
availability of each node, and take action when failures are detected.
For nodes there are two forms of heartbeats:
* updates to the `.status` of a Node
* Updates to the [`.status`](/docs/reference/node/node-status/) of a Node.
* [Lease](/docs/concepts/architecture/leases/) objects
within the `kube-node-lease`
{{< glossary_tooltip term_id="namespace" text="namespace">}}.
Each Node has an associated Lease object.
Compared to updates to `.status` of a Node, a Lease is a lightweight resource.
Using Leases for heartbeats reduces the performance impact of these updates
for large clusters.
The kubelet is responsible for creating and updating the `.status` of Nodes,
and for updating their related Leases.
- The kubelet updates the node's `.status` either when there is change in status
or if there has been no update for a configured interval. The default interval
for `.status` updates to Nodes is 5 minutes, which is much longer than the 40
second default timeout for unreachable nodes.
- The kubelet creates and then updates its Lease object every 10 seconds
(the default update interval). Lease updates occur independently from
updates to the Node's `.status`. If the Lease update fails, the kubelet retries,
using exponential backoff that starts at 200 milliseconds and is capped at 7 seconds.
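For example, you can inspect the Lease objects that back these heartbeats; the node name below is a placeholder:

```shell
# Each node has a matching Lease object in the kube-node-lease namespace;
# its spec.renewTime records the most recent heartbeat from that node's kubelet.
kubectl get leases --namespace kube-node-lease

# Inspect the renew time for a single node's Lease (node name is a placeholder).
kubectl get lease my-node --namespace kube-node-lease \
  --output jsonpath='{.spec.renewTime}'
```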
## Node controller
The node {{< glossary_tooltip text="controller" term_id="controller" >}} is a
@ -327,8 +228,7 @@ from more than 1 node per 10 seconds.
The node eviction behavior changes when a node in a given availability zone
becomes unhealthy. The node controller checks what percentage of nodes in the zone
are unhealthy (the `Ready` condition is `Unknown` or `False`) at
the same time:
are unhealthy (the `Ready` condition is `Unknown` or `False`) at the same time:
- If the fraction of unhealthy nodes is at least `--unhealthy-zone-threshold`
(default 0.55), then the eviction rate is reduced.
@ -424,8 +324,8 @@ node shutdown has been detected, so that even Pods with a
{{< glossary_tooltip text="toleration" term_id="toleration" >}} for
`node.kubernetes.io/not-ready:NoSchedule` do not start there.
At the same time when kubelet is setting that condition on its Node via the API, the kubelet also begins
terminating any Pods that are running locally.
At the same time when kubelet is setting that condition on its Node via the API,
the kubelet also begins terminating any Pods that are running locally.
During a graceful shutdown, kubelet terminates pods in two phases:
@ -448,10 +348,9 @@ Graceful node shutdown feature is configured with two
{{< note >}}
There are cases when Node termination was cancelled by the system (or perhaps manually
by an administrator). In either of those situations the
Node will return to the `Ready` state. However Pods which already started the process
of termination
will not be restored by kubelet and will need to be re-scheduled.
by an administrator). In either of those situations the Node will return to the `Ready` state.
However, Pods which already started the process of termination will not be restored by kubelet
and will need to be re-scheduled.
{{< /note >}}
@ -544,7 +443,6 @@ example, you could instead use these settings:
| 1000 |120 seconds |
| 0 |60 seconds |
In the above case, the pods with `custom-class-b` will go into the same bucket
as `custom-class-c` for shutdown.
@ -556,8 +454,8 @@ If this feature is enabled and no configuration is provided, then no ordering
action will be taken.
Using this feature requires enabling the `GracefulNodeShutdownBasedOnPodPriority`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
, and setting `ShutdownGracePeriodByPodPriority` in the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/),
and setting `ShutdownGracePeriodByPodPriority` in the
[kubelet config](/docs/reference/config-api/kubelet-config.v1beta1/)
to the desired configuration containing the pod priority class values and
their respective shutdown periods.
@ -571,9 +469,9 @@ the feature is Beta and is enabled by default.
Metrics `graceful_shutdown_start_time_seconds` and `graceful_shutdown_end_time_seconds`
are emitted under the kubelet subsystem to monitor node shutdowns.
## Non Graceful node shutdown {#non-graceful-node-shutdown}
## Non-graceful node shutdown handling {#non-graceful-node-shutdown}
{{< feature-state state="beta" for_k8s_version="v1.26" >}}
{{< feature-state state="stable" for_k8s_version="v1.28" >}}
A node shutdown action may not be detected by kubelet's Node Shutdown Manager,
either because the command does not trigger the inhibitor locks mechanism used by
@ -582,24 +480,25 @@ ShutdownGracePeriodCriticalPods are not configured properly. Please refer to abo
section [Graceful Node Shutdown](#graceful-node-shutdown) for more details.
When a node is shutdown but not detected by kubelet's Node Shutdown Manager, the pods
that are part of a {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}} will be stuck in terminating status on
the shutdown node and cannot move to a new running node. This is because kubelet on
the shutdown node is not available to delete the pods so the StatefulSet cannot
create a new pod with the same name. If there are volumes used by the pods, the
VolumeAttachments will not be deleted from the original shutdown node so the volumes
that are part of a {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}
will be stuck in terminating status on the shutdown node and cannot move to a new running node.
This is because kubelet on the shutdown node is not available to delete the pods so
the StatefulSet cannot create a new pod with the same name. If there are volumes used by the pods,
the VolumeAttachments will not be deleted from the original shutdown node so the volumes
used by these pods cannot be attached to a new running node. As a result, the
application running on the StatefulSet cannot function properly. If the original
shutdown node comes up, the pods will be deleted by kubelet and new pods will be
created on a different running node. If the original shutdown node does not come up,
these pods will be stuck in terminating status on the shutdown node forever.
To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
or `NoSchedule` effect to a Node marking it out-of-service.
To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service`
with either `NoExecute` or `NoSchedule` effect to a Node marking it out-of-service.
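For example (the node name and the taint value are illustrative):

```shell
# Mark a node that is known to be powered off as out-of-service so that its
# Pods can be force-deleted and their volumes detached.
kubectl taint nodes my-shutdown-node node.kubernetes.io/out-of-service=nodeshutdown:NoExecute

# Remove the taint once the Pods have moved and the node has recovered.
kubectl taint nodes my-shutdown-node node.kubernetes.io/out-of-service=nodeshutdown:NoExecute-
```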
If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}, and a Node is marked out-of-service with this taint, the
pods on the node will be forcefully deleted if there are no matching tolerations on it and volume
detach operations for the pods terminating on the node will happen immediately. This allows the
Pods on the out-of-service node to recover quickly on a different node.
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}},
and a Node is marked out-of-service with this taint, the pods on the node will be forcefully deleted
if there are no matching tolerations on it and volume detach operations for the pods terminating on
the node will happen immediately. This allows the Pods on the out-of-service node to recover quickly
on a different node.
During a non-graceful shutdown, Pods are terminated in the two phases:
@ -607,9 +506,8 @@ During a non-graceful shutdown, Pods are terminated in the two phases:
2. Immediately perform detach volume operation for such pods.
{{< note >}}
- Before adding the taint `node.kubernetes.io/out-of-service` , it should be verified
that the node is already in shutdown or power off state (not in the middle of
restarting).
- Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
that the node is already in shutdown or power off state (not in the middle of restarting).
- The user is required to manually remove the out-of-service taint after the pods are
moved to a new node and the user has checked that the shutdown node has been
recovered since the user was the one who originally added the taint.
@ -617,11 +515,7 @@ During a non-graceful shutdown, Pods are terminated in the two phases:
## Swap memory management {#swap-memory}
{{< feature-state state="alpha" for_k8s_version="v1.22" >}}
Prior to Kubernetes 1.22, nodes did not support the use of swap memory, and a
kubelet would by default fail to start if swap was detected on a node. In 1.22
onwards, swap memory support can be enabled on a per-node basis.
{{< feature-state state="beta" for_k8s_version="v1.28" >}}
To enable swap on a node, the `NodeSwap` feature gate must be enabled on
the kubelet, and the `--fail-swap-on` command line flag or `failSwapOn`
@ -638,37 +532,52 @@ specify how a node will use swap memory. For example,
```yaml
memorySwap:
swapBehavior: LimitedSwap
swapBehavior: UnlimitedSwap
```
The available configuration options for `swapBehavior` are:
- `LimitedSwap`: Kubernetes workloads are limited in how much swap they can
use. Workloads on the node not managed by Kubernetes can still swap.
- `UnlimitedSwap`: Kubernetes workloads can use as much swap memory as they
- `UnlimitedSwap` (default): Kubernetes workloads can use as much swap memory as they
request, up to the system limit.
- `LimitedSwap`: The utilization of swap memory by Kubernetes workloads is subject to limitations.
Only Pods of Burstable QoS are permitted to employ swap.
If configuration for `memorySwap` is not specified and the feature gate is
enabled, by default the kubelet will apply the same behaviour as the
`LimitedSwap` setting.
`UnlimitedSwap` setting.
The behaviour of the `LimitedSwap` setting depends if the node is running with
v1 or v2 of control groups (also known as "cgroups"):
With `LimitedSwap`, Pods that do not fall under the Burstable QoS classification (i.e.
`BestEffort`/`Guaranteed` QoS Pods) are prohibited from utilizing swap memory.
To maintain the aforementioned security and node health guarantees, these Pods
are not permitted to use swap memory when `LimitedSwap` is in effect.
- **cgroupsv1:** Kubernetes workloads can use any combination of memory and
swap, up to the pod's memory limit, if set.
- **cgroupsv2:** Kubernetes workloads cannot use swap memory.
Prior to detailing the calculation of the swap limit, it is necessary to define the following terms:
* `nodeTotalMemory`: The total amount of physical memory available on the node.
* `totalPodsSwapAvailable`: The total amount of swap memory on the node that is available for use by Pods
(some swap memory may be reserved for system use).
* `containerMemoryRequest`: The container's memory request.
Swap limitation is configured as:
`(containerMemoryRequest / nodeTotalMemory) * totalPodsSwapAvailable`.
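As a worked example, assuming (purely for illustration) a 2 GiB memory request on a node with 16 GiB of physical memory and 8 GiB of swap available to Pods:

```shell
# All values are in GiB and are assumptions for this example.
# (containerMemoryRequest / nodeTotalMemory) * totalPodsSwapAvailable
#   = (2 / 16) * 8
#   = 1 GiB of swap for this container
# Multiplying before dividing keeps the arithmetic in whole numbers:
echo $(( 2 * 8 / 16 ))   # prints 1
```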
It is important to note that, for containers within Burstable QoS Pods, it is possible to
opt-out of swap usage by specifying memory requests that are equal to memory limits.
Containers configured in this manner will not have access to swap memory.
Swap is supported only with **cgroup v2**; cgroup v1 is not supported.
For more information, and to assist with testing and provide feedback, please
see [KEP-2400](https://github.com/kubernetes/enhancements/issues/2400) and its
see the blog post [Kubernetes 1.28: NodeSwap graduates to Beta1](/blog/2023/07/18/swap-beta1-1.28-2023/),
[KEP-2400](https://github.com/kubernetes/enhancements/issues/2400) and its
[design proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md).
## {{% heading "whatsnext" %}}
Learn more about the following:
* [Components](/docs/concepts/overview/components/#node-components) that make up a node.
* [API definition for Node](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node) section of the architecture design document.
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node)
section of the architecture design document.
* [Taints and Tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
* [Node Resource Managers](/docs/concepts/policy/node-resource-managers/).
* [Resource Management for Windows nodes](/docs/concepts/configuration/windows-resource-management/).

View File

@ -8,6 +8,12 @@ content_type: concept
description: >
Lower-level detail relevant to creating or administering a Kubernetes cluster.
no_list: true
card:
name: setup
weight: 60
anchors:
- anchor: "#securing-a-cluster"
title: Securing a cluster
---
<!-- overview -->

View File

@ -55,18 +55,75 @@ See [Information security for Secrets](#information-security-for-secrets) for mo
## Uses for Secrets
There are three main ways for a Pod to use a Secret:
You can use Secrets for purposes such as the following:
- As [files](#using-secrets-as-files-from-a-pod) in a
{{< glossary_tooltip text="volume" term_id="volume" >}} mounted on one or more of
its containers.
- As [container environment variable](#using-secrets-as-environment-variables).
- By the [kubelet when pulling images](#using-imagepullsecrets) for the Pod.
- [Set environment variables for a container](/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data).
- [Provide credentials such as SSH keys or passwords to Pods](/docs/tasks/inject-data-application/distribute-credentials-secure/#provide-prod-test-creds).
- [Allow the kubelet to pull container images from private registries](/docs/tasks/configure-pod-container/pull-image-private-registry/).
The Kubernetes control plane also uses Secrets; for example,
[bootstrap token Secrets](#bootstrap-token-secrets) are a mechanism to
help automate node registration.
### Use case: dotfiles in a secret volume
You can make your data "hidden" by defining a key that begins with a dot.
This key represents a dotfile or "hidden" file. For example, when the following secret
is mounted into a volume, `secret-volume`, the volume will contain a single file,
called `.secret-file`, and the `dotfile-test-container` will have this file
present at the path `/etc/secret-volume/.secret-file`.
{{< note >}}
Files beginning with dot characters are hidden from the output of `ls -l`;
you must use `ls -la` to see them when listing directory contents.
{{< /note >}}
```yaml
apiVersion: v1
kind: Secret
metadata:
name: dotfile-secret
data:
.secret-file: dmFsdWUtMg0KDQo=
---
apiVersion: v1
kind: Pod
metadata:
name: secret-dotfiles-pod
spec:
volumes:
- name: secret-volume
secret:
secretName: dotfile-secret
containers:
- name: dotfile-test-container
image: registry.k8s.io/busybox
command:
- ls
- "-l"
- "/etc/secret-volume"
volumeMounts:
- name: secret-volume
readOnly: true
mountPath: "/etc/secret-volume"
```
### Use case: Secret visible to one container in a Pod
Consider a program that needs to handle HTTP requests, do some complex business
logic, and then sign some messages with an HMAC. Because it has complex
application logic, there might be an unnoticed remote file reading exploit in
the server, which could expose the private key to an attacker.
This could be divided into two processes in two containers: a frontend container
which handles user interaction and business logic, but which cannot see the
private key; and a signer container that can see the private key, and responds
to simple signing requests from the frontend (for example, over localhost networking).
With this partitioned approach, an attacker now has to trick the application
server into doing something rather arbitrary, which may be harder than getting
it to read a file.
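A hedged sketch of that layout follows; the image names and the Secret name are hypothetical, and the Secret is assumed to exist already:

```shell
# Mount the signing key only into the signer container; the frontend
# container has no volumeMounts for it and therefore cannot read the key.
# Image names and the Secret name are hypothetical.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hmac-signer-demo
spec:
  volumes:
  - name: signing-key
    secret:
      secretName: hmac-signing-key
  containers:
  - name: frontend
    image: example.com/frontend:latest
  - name: signer
    image: example.com/signer:latest
    volumeMounts:
    - name: signing-key
      readOnly: true
      mountPath: "/etc/signing-key"
EOF
```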
### Alternatives to Secrets
Rather than using a Secret to protect confidential data, you can pick from alternatives.
@ -108,8 +165,8 @@ These types vary in terms of the validations performed and the constraints
Kubernetes imposes on them.
| Built-in Type | Usage |
| ------------------------------------- | --------------------------------------- |
| `Opaque` | arbitrary user-defined data |
| ------------------------------------- |---------------------------------------- |
| `Opaque` | arbitrary user-defined data |
| `kubernetes.io/service-account-token` | ServiceAccount token |
| `kubernetes.io/dockercfg` | serialized `~/.dockercfg` file |
| `kubernetes.io/dockerconfigjson` | serialized `~/.docker/config.json` file |
@ -220,8 +277,8 @@ for information on referencing service account credentials from within Pods.
### Docker config Secrets
You can use one of the following `type` values to create a Secret to
store the credentials for accessing a container image registry:
If you are creating a Secret to store credentials for accessing a container image registry,
you must use one of the following `type` values for that Secret:
- `kubernetes.io/dockercfg`
- `kubernetes.io/dockerconfigjson`
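For example, you can have `kubectl` generate a `kubernetes.io/dockerconfigjson` Secret for you; every credential value below is a placeholder:

```shell
kubectl create secret docker-registry my-registry-credentials \
  --docker-server=registry.example.com \
  --docker-username=my-user \
  --docker-password='not-a-real-password' \
  --docker-email=my-user@example.com
```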
@ -297,10 +354,12 @@ Docker configuration file):
}
```
{{< note >}}
{{< caution >}}
The `auth` value there is base64 encoded; it is obscured but not secret.
Anyone who can read that Secret can learn the registry access bearer token.
{{< /note >}}
It is suggested to use [credential providers](/docs/tasks/administer-cluster/kubelet-credential-provider/) to dynamically and securely provide pull secrets on-demand.
{{< /caution >}}
### Basic authentication Secret
@ -382,6 +441,8 @@ When using this type of Secret, the `tls.key` and the `tls.crt` key must be prov
in the `data` (or `stringData`) field of the Secret configuration, although the API
server doesn't actually validate the values for each key.
As an alternative to using `stringData`, you can use the `data` field to provide the base64 encoded certificate and private key. Refer to [Constraints on Secret names and data](#restriction-names-data) for more on this.
The following YAML contains an example config for a TLS Secret:
```yaml
@ -390,11 +451,13 @@ kind: Secret
metadata:
name: secret-tls
type: kubernetes.io/tls
data:
stringData:
# the data is abbreviated in this example
tls.crt: |
-----BEGIN CERTIFICATE-----
MIIC2DCCAcCgAwIBAgIBATANBgkqh ...
tls.key: |
-----BEGIN RSA PRIVATE KEY-----
MIIEpgIBAAKCAQEA7yn3bRHQ5FHMQ ...
```
@ -412,21 +475,8 @@ kubectl create secret tls my-tls-secret \
--key=path/to/key/file
```
The public/private key pair must exist before hand. The public key certificate
for `--cert` must be DER format as per
[Section 5.1 of RFC 7468](https://datatracker.ietf.org/doc/html/rfc7468#section-5.1),
and must match the given private key for `--key` (PKCS #8 in DER format;
[Section 11 of RFC 7468](https://datatracker.ietf.org/doc/html/rfc7468#section-11)).
{{< note >}}
A kubernetes.io/tls Secret stores the Base64-encoded DER data for keys and
certificates. If you're familiar with PEM format for private keys and for certificates,
the base64 data are the same as that format except that you omit
the initial and the last lines that are used in PEM.
For example, for a certificate, you do **not** include `--------BEGIN CERTIFICATE-----`
and `-------END CERTIFICATE----`.
{{< /note >}}
The public/private key pair must exist beforehand. The public key certificate for `--cert` must be PEM encoded
and must match the given private key for `--key`.
### Bootstrap token Secrets
@ -576,17 +626,17 @@ metadata:
name: mypod
spec:
containers:
- name: mypod
image: redis
volumeMounts:
- name: foo
mountPath: "/etc/foo"
readOnly: true
volumes:
- name: mypod
image: redis
volumeMounts:
- name: foo
secret:
secretName: mysecret
optional: true
mountPath: "/etc/foo"
readOnly: true
volumes:
- name: foo
secret:
secretName: mysecret
optional: true
```
By default, Secrets are required. None of a Pod's containers will start until
@ -697,269 +747,6 @@ for a detailed explanation of that process.
You cannot use ConfigMaps or Secrets with {{< glossary_tooltip text="static Pods" term_id="static-pod" >}}.
## Use cases
### Use case: As container environment variables {#use-case-as-container-environment-variables}
You can create a Secret and use it to
[set environment variables for a container](/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data).
### Use case: Pod with SSH keys
Create a Secret containing some SSH keys:
```shell
kubectl create secret generic ssh-key-secret --from-file=ssh-privatekey=/path/to/.ssh/id_rsa --from-file=ssh-publickey=/path/to/.ssh/id_rsa.pub
```
The output is similar to:
```
secret "ssh-key-secret" created
```
You can also create a `kustomization.yaml` with a `secretGenerator` field containing ssh keys.
{{< caution >}}
Think carefully before sending your own SSH keys: other users of the cluster may have access
to the Secret.
You could instead create an SSH private key representing a service identity that you want to be
accessible to all the users with whom you share the Kubernetes cluster, and that you can revoke
if the credentials are compromised.
{{< /caution >}}
Now you can create a Pod which references the secret with the SSH key and
consumes it in a volume:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: secret-test-pod
labels:
name: secret-test
spec:
volumes:
- name: secret-volume
secret:
secretName: ssh-key-secret
containers:
- name: ssh-test-container
image: mySshImage
volumeMounts:
- name: secret-volume
readOnly: true
mountPath: "/etc/secret-volume"
```
When the container's command runs, the pieces of the key will be available in:
```
/etc/secret-volume/ssh-publickey
/etc/secret-volume/ssh-privatekey
```
The container is then free to use the secret data to establish an SSH connection.
### Use case: Pods with prod / test credentials
This example illustrates a Pod which consumes a secret containing production credentials and
another Pod which consumes a secret with test environment credentials.
You can create a `kustomization.yaml` with a `secretGenerator` field or run
`kubectl create secret`.
```shell
kubectl create secret generic prod-db-secret --from-literal=username=produser --from-literal=password=Y4nys7f11
```
The output is similar to:
```
secret "prod-db-secret" created
```
You can also create a secret for test environment credentials.
```shell
kubectl create secret generic test-db-secret --from-literal=username=testuser --from-literal=password=iluvtests
```
The output is similar to:
```
secret "test-db-secret" created
```
{{< note >}}
Special characters such as `$`, `\`, `*`, `=`, and `!` will be interpreted by your
[shell](https://en.wikipedia.org/wiki/Shell_(computing)) and require escaping.
In most shells, the easiest way to escape the password is to surround it with single quotes (`'`).
For example, if your actual password is `S!B\*d$zDsb=`, you should execute the command this way:
```shell
kubectl create secret generic dev-db-secret --from-literal=username=devuser --from-literal=password='S!B\*d$zDsb='
```
You do not need to escape special characters in passwords from files (`--from-file`).
{{< /note >}}
Now make the Pods:
```shell
cat <<EOF > pod.yaml
apiVersion: v1
kind: List
items:
- kind: Pod
apiVersion: v1
metadata:
name: prod-db-client-pod
labels:
name: prod-db-client
spec:
volumes:
- name: secret-volume
secret:
secretName: prod-db-secret
containers:
- name: db-client-container
image: myClientImage
volumeMounts:
- name: secret-volume
readOnly: true
mountPath: "/etc/secret-volume"
- kind: Pod
apiVersion: v1
metadata:
name: test-db-client-pod
labels:
name: test-db-client
spec:
volumes:
- name: secret-volume
secret:
secretName: test-db-secret
containers:
- name: db-client-container
image: myClientImage
volumeMounts:
- name: secret-volume
readOnly: true
mountPath: "/etc/secret-volume"
EOF
```
Add the pods to the same `kustomization.yaml`:
```shell
cat <<EOF >> kustomization.yaml
resources:
- pod.yaml
EOF
```
Apply all those objects on the API server by running:
```shell
kubectl apply -k .
```
Both containers will have the following files present on their filesystems with the values
for each container's environment:
```
/etc/secret-volume/username
/etc/secret-volume/password
```
Note how the specs for the two Pods differ only in one field; this facilitates
creating Pods with different capabilities from a common Pod template.
You could further simplify the base Pod specification by using two service accounts:
1. `prod-user` with the `prod-db-secret`
1. `test-user` with the `test-db-secret`
The Pod specification is shortened to:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: prod-db-client-pod
labels:
name: prod-db-client
spec:
serviceAccount: prod-db-client
containers:
- name: db-client-container
image: myClientImage
```
### Use case: dotfiles in a secret volume
You can make your data "hidden" by defining a key that begins with a dot.
This key represents a dotfile or "hidden" file. For example, when the following secret
is mounted into a volume, `secret-volume`:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: dotfile-secret
data:
.secret-file: dmFsdWUtMg0KDQo=
---
apiVersion: v1
kind: Pod
metadata:
name: secret-dotfiles-pod
spec:
volumes:
- name: secret-volume
secret:
secretName: dotfile-secret
containers:
- name: dotfile-test-container
image: registry.k8s.io/busybox
command:
- ls
- "-l"
- "/etc/secret-volume"
volumeMounts:
- name: secret-volume
readOnly: true
mountPath: "/etc/secret-volume"
```
The volume will contain a single file, called `.secret-file`, and
the `dotfile-test-container` will have this file present at the path
`/etc/secret-volume/.secret-file`.
{{< note >}}
Files beginning with dot characters are hidden from the output of `ls -l`;
you must use `ls -la` to see them when listing directory contents.
{{< /note >}}
### Use case: Secret visible to one container in a Pod
Consider a program that needs to handle HTTP requests, do some complex business
logic, and then sign some messages with an HMAC. Because it has complex
application logic, there might be an unnoticed remote file reading exploit in
the server, which could expose the private key to an attacker.
This could be divided into two processes in two containers: a frontend container
which handles user interaction and business logic, but which cannot see the
private key; and a signer container that can see the private key, and responds
to simple signing requests from the frontend (for example, over localhost networking).
With this partitioned approach, an attacker now has to trick the application
server into doing something rather arbitrary, which may be harder than getting
it to read a file.
## Immutable Secrets {#secret-immutable}
{{< feature-state for_k8s_version="v1.21" state="stable" >}}

View File

@ -6,6 +6,9 @@ reviewers:
- erictune
- thockin
content_type: concept
card:
name: concepts
weight: 50
---
<!-- overview -->

View File

@ -209,7 +209,7 @@ Aggregated APIs offer more advanced API features and customization of other feat
| Feature | Description | CRDs | Aggregated API |
| ------- | ----------- | ---- | -------------- |
| Validation | Help users prevent errors and allow you to evolve your API independently of your clients. These features are most useful when there are many clients who can't all update at the same time. | Yes. Most validation can be specified in the CRD using [OpenAPI v3.0 validation](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation). Any other validations supported by addition of a [Validating Webhook](/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook-alpha-in-1-8-beta-in-1-9). | Yes, arbitrary validation checks |
| Validation | Help users prevent errors and allow you to evolve your API independently of your clients. These features are most useful when there are many clients who can't all update at the same time. | Yes. Most validation can be specified in the CRD using [OpenAPI v3.0 validation](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation). The [CRDValidationRatcheting](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-ratcheting) feature gate also allows validations specified using OpenAPI to be ignored if the failing part of the resource was unchanged. Any other validations are supported by adding a [Validating Webhook](/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook-alpha-in-1-8-beta-in-1-9). | Yes, arbitrary validation checks |
| Defaulting | See above | Yes, either via [OpenAPI v3.0 validation](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#defaulting) `default` keyword (GA in 1.17), or via a [Mutating Webhook](/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook) (though this will not be run when reading from etcd for old objects). | Yes |
| Multi-versioning | Allows serving the same object through two API versions. Can help ease API changes like renaming fields. Less important if you control your client versions. | [Yes](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning) | Yes |
| Custom Storage | If you need storage with a different performance mode (for example, a time-series database instead of key-value store) or isolation for security (for example, encryption of sensitive information, etc.) | No | Yes |

View File

@ -147,6 +147,23 @@ The general workflow of a device plugin includes the following steps:
runtime configurations for accessing the allocated devices. The kubelet passes this information
to the container runtime.
An `AllocateResponse` contains zero or more `ContainerAllocateResponse` objects. In these, the
device plugin defines modifications that must be made to a container's definition to provide
access to the device. These modifications include:
* [Annotations](/docs/concepts/overview/working-with-objects/annotations/)
* device nodes
* environment variables
* mounts
* fully-qualified CDI device names
{{< note >}}
The processing of the fully-qualified CDI device names by the Device Manager requires
that the `DevicePluginCDIDevices` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled for the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes
v1.28.
{{< /note >}}
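As a minimal sketch (not a complete configuration), the gate could be enabled on the command line of both components:

```shell
# On each node, run the kubelet with the gate enabled
# (the rest of the kubelet configuration is omitted here).
kubelet --feature-gates=DevicePluginCDIDevices=true

# The note above also mentions the API server; the same flag applies there.
kube-apiserver --feature-gates=DevicePluginCDIDevices=true
```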
### Handling kubelet restarts
A device plugin is expected to detect kubelet restarts and re-register itself with the new
@ -195,7 +212,7 @@ of the device allocations during the upgrade.
## Monitoring device plugin resources
{{< feature-state for_k8s_version="v1.15" state="beta" >}}
{{< feature-state for_k8s_version="v1.28" state="stable" >}}
In order to monitor resources provided by device plugins, monitoring agents need to be able to
discover the set of devices that are in-use on the node and obtain metadata to describe which
@ -312,7 +329,7 @@ below:
### `GetAllocatableResources` gRPC endpoint {#grpc-endpoint-getallocatableresources}
{{< feature-state state="beta" for_k8s_version="v1.23" >}}
{{< feature-state state="stable" for_k8s_version="v1.28" >}}
GetAllocatableResources provides information on resources initially available on the worker node.
It provides more information than kubelet exports to APIServer.
@ -338,16 +355,6 @@ message AllocatableResourcesResponse {
}
```
Starting from Kubernetes v1.23, the `GetAllocatableResources` is enabled by default.
You can disable it by turning off the `KubeletPodResourcesGetAllocatable`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
Preceding Kubernetes v1.23, to enable this feature `kubelet` must be started with the following flag:
```
--feature-gates=KubeletPodResourcesGetAllocatable=true
```
`ContainerDevices` do expose the topology information declaring to which NUMA cells the device is
affine. The NUMA cells are identified using an opaque integer ID, whose value is consistent with
what device plugins report
@ -381,8 +388,6 @@ Support for the `PodResourcesLister service` requires `KubeletPodResources`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
It is enabled by default starting with Kubernetes 1.15 and is v1 since Kubernetes 1.20.
### `Get` gRPC endpoint {#grpc-endpoint-get}
{{< feature-state state="alpha" for_k8s_version="v1.27" >}}

View File

@ -10,6 +10,9 @@ weight: 20
card:
name: concepts
weight: 10
anchors:
- anchor: "#why-you-need-kubernetes-and-what-can-it-do"
title: Why Kubernetes?
no_list: true
---

View File

@ -8,6 +8,7 @@ description: >
plane and a set of machines called nodes.
weight: 30
card:
title: Components of a cluster
name: concepts
weight: 20
---

View File

@ -26,7 +26,7 @@ and does the following:
* Modifies the object to add a `metadata.deletionTimestamp` field with the
time you started the deletion.
* Prevents the object from being removed until its `metadata.finalizers` field is empty.
* Prevents the object from being removed until all items are removed from its `metadata.finalizers` field
* Returns a `202` status code (HTTP "Accepted")
The controller managing that finalizer notices the update to the object setting the
@ -45,6 +45,16 @@ controller can't delete it because the finalizer exists. When the Pod stops
using the `PersistentVolume`, Kubernetes clears the `pv-protection` finalizer,
and the controller deletes the volume.
{{<note>}}
* When you `DELETE` an object, Kubernetes adds the deletion timestamp for that object and then
immediately starts to restrict changes to the `.metadata.finalizers` field for the object that is
now pending deletion. You can remove existing finalizers (deleting an entry from the `finalizers`
list) but you cannot add a new finalizer. You also cannot modify the `deletionTimestamp` for an
object once it is set.
* After the deletion is requested, you cannot resurrect this object. The only way forward is to let the deletion complete and then create a new, similar object.
{{</note>}}
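As an illustration, here is a hedged sketch of removing a finalizer from an object that is stuck in deletion; the ConfigMap name and the situation are hypothetical, and you should only do this when you understand why the controller has not cleared the finalizer itself:

```shell
# See which finalizers are still blocking the deletion.
kubectl get configmap example-config --output jsonpath='{.metadata.finalizers}'

# Clear the finalizers so the pending deletion can complete.
kubectl patch configmap example-config --type merge \
  --patch '{"metadata":{"finalizers":null}}'
```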
## Owner references, labels, and finalizers {#owners-labels-finalizers}
Like {{<glossary_tooltip text="labels" term_id="label">}},

View File

@ -11,7 +11,7 @@ weight: 65
{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
Dynamic resource allocation is a new API for requesting and sharing resources
Dynamic resource allocation is an API for requesting and sharing resources
between pods and containers inside a pod. It is a generalization of the
persistent volumes API for generic resources. Third-party resource drivers are
responsible for tracking and allocating resources. Different kinds of
@ -32,7 +32,7 @@ check the documentation for that version of Kubernetes.
## API
The `resource.k8s.io/v1alpha2` {{< glossary_tooltip text="API group"
term_id="api-group" >}} provides four new types:
term_id="api-group" >}} provides four types:
ResourceClass
: Defines which resource driver handles a certain kind of
@ -61,7 +61,7 @@ typically using the type defined by a {{< glossary_tooltip
term_id="CustomResourceDefinition" text="CRD" >}} that was created when
installing a resource driver.
The `core/v1` `PodSpec` defines ResourceClaims that are needed for a Pod in a new
The `core/v1` `PodSpec` defines ResourceClaims that are needed for a Pod in a
`resourceClaims` field. Entries in that list reference either a ResourceClaim
or a ResourceClaimTemplate. When referencing a ResourceClaim, all Pods using
this PodSpec (for example, inside a Deployment or StatefulSet) share the same
@ -168,14 +168,39 @@ The kubelet provides a gRPC service to enable discovery of dynamic resources of
running Pods. For more information on the gRPC endpoints, see the
[resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
## Limitations
## Pre-scheduled Pods
The scheduler plugin must be involved in scheduling Pods which use
ResourceClaims. Bypassing the scheduler by setting the `nodeName` field leads
to Pods that the kubelet refuses to start because the ResourceClaims are not
reserved or not even allocated. It may be possible to [remove this
limitation](https://github.com/kubernetes/kubernetes/issues/114005) in the
future.
When you - or another API client - create a Pod with `spec.nodeName` already set, the scheduler gets bypassed.
If some ResourceClaim needed by that Pod does not exist yet, is not allocated
or not reserved for the Pod, then the kubelet will fail to run the Pod and
re-check periodically because those requirements might still get fulfilled
later.
Such a situation can also arise when support for dynamic resource allocation
was not enabled in the scheduler at the time when the Pod got scheduled
(version skew, configuration, feature gate, etc.). kube-controller-manager
detects this and tries to make the Pod runnable by triggering allocation and/or
reserving the required ResourceClaims.
However, it is better to avoid this because a Pod that is assigned to a node
blocks normal resources (RAM, CPU) that then cannot be used for other Pods
while the Pod is stuck. To make a Pod run on a specific node while still going
through the normal scheduling flow, create the Pod with a node selector that
exactly matches the desired node:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-with-cats
spec:
nodeSelector:
kubernetes.io/hostname: name-of-the-intended-node
...
```
You may also be able to mutate the incoming Pod, at admission time, to unset
the `.spec.nodeName` field and to use a node selector instead.
## Enabling dynamic resource allocation

View File

@ -110,7 +110,7 @@ If you configure a Secret through a
data encoded as base64, sharing this file or checking it in to a source
repository means the secret is available to everyone who can read the manifest.
{{<caution>}}
{{< caution >}}
Base64 encoding is _not_ an encryption method, it provides no additional
confidentiality over plain text.
{{</caution>}}
{{< /caution >}}

View File

@ -292,7 +292,7 @@ Below are the properties a user can specify in the `dnsConfig` field:
This property is optional. When specified, the provided list will be merged
into the base search domain names generated from the chosen DNS policy.
Duplicate domain names are removed.
Kubernetes allows for at most 6 search domains.
Kubernetes allows up to 32 search domains.
- `options`: an optional list of objects where each object may have a `name`
property (required) and a `value` property (optional). The contents in this
property will be merged to the options generated from the specified DNS policy.
@ -325,7 +325,7 @@ options ndots:5
## DNS search domain list limits
{{< feature-state for_k8s_version="1.26" state="beta" >}}
{{< feature-state for_k8s_version="1.28" state="stable" >}}
Kubernetes itself does not limit the DNS Config until the length of the search
domain list exceeds 32 or the total length of all search domains exceeds 2048.

View File

@ -340,12 +340,8 @@ namespaces based on their labels.
## Targeting a Namespace by its name
{{< feature-state for_k8s_version="1.22" state="stable" >}}
The Kubernetes control plane sets an immutable label `kubernetes.io/metadata.name` on all
namespaces, provided that the `NamespaceDefaultLabelName`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled.
The value of the label is the namespace name.
namespaces; the value of the label is the namespace name.
While NetworkPolicy cannot target a namespace by its name with some object field, you can use the
standardized label to target a specific namespace.
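For example, a sketch of a NetworkPolicy that uses that label; the policy, namespace, and label value here are hypothetical:

```shell
# Allow ingress to every Pod in the "backend" namespace, but only from Pods
# in the namespace named "frontend" (all names here are hypothetical).
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: backend
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend
EOF
```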

View File

@ -517,7 +517,7 @@ spec:
#### Reserve Nodeport Ranges to avoid collisions when port assigning
{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
{{< feature-state for_k8s_version="v1.28" state="beta" >}}
The policy for assigning ports to NodePort services applies to both the auto-assignment and
the manual assignment scenarios. When a user wants to create a NodePort service that

View File

@ -224,7 +224,7 @@ volume.
### PersistentVolumeClaim naming
Naming of the automatically created PVCs is deterministic: the name is
a combination of Pod name and volume name, with a hyphen (`-`) in the
a combination of the Pod name and volume name, with a hyphen (`-`) in the
middle. In the example above, the PVC name will be
`my-app-scratch-volume`. This deterministic naming makes it easier to
interact with the PVC because one does not have to search for it once
@ -248,11 +248,10 @@ same namespace, so that these conflicts can't occur.
### Security
Enabling the GenericEphemeralVolume feature allows users to create
PVCs indirectly if they can create Pods, even if they do not have
permission to create PVCs directly. Cluster administrators must be
aware of this. If this does not fit their security model, they should
use an [admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/)
Using generic ephemeral volumes allows users to create PVCs indirectly
if they can create Pods, even if they do not have permission to create PVCs directly.
Cluster administrators must be aware of this. If this does not fit their security model,
they should use an [admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/)
that rejects objects like Pods that have a generic ephemeral volume.
The normal [namespace quota for PVCs](/docs/concepts/policy/resource-quotas/#storage-resource-quota)

View File

@ -9,7 +9,7 @@ title: Persistent Volumes
feature:
title: Storage orchestration
description: >
Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as <a href="https://aws.amazon.com/products/storage/">AWS</a> or <a href="https://cloud.google.com/storage/">GCP</a>, or a network storage system such as NFS, iSCSI, Ceph, Cinder.
Automatically mount the storage system of your choice, whether from local storage, a public cloud provider, or a network storage system such as iSCSI or NFS.
content_type: concept
weight: 20
---
@ -371,7 +371,7 @@ the following types of volumes:
* {{< glossary_tooltip text="csi" term_id="csi" >}}
* flexVolume (deprecated)
* gcePersistentDisk (deprecated)
* rbd
* rbd (deprecated)
* portworxVolume (deprecated)
You can only expand a PVC if its storage class's `allowVolumeExpansion` field is set to true.
@ -488,7 +488,7 @@ value you previously tried.
This is useful if expansion to a higher value did not succeed because of capacity constraint.
If that has happened, or you suspect that it might have, you can retry expansion by specifying a
size that is within the capacity limits of the underlying storage provider. You can monitor the status of the
resize operation by watching `.status.resizeStatus` and events on the PVC.
resize operation by watching `.status.allocatedResourceStatuses` and events on the PVC.
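For example (the PVC name is a placeholder):

```shell
# Check the resize status that the control plane reports on the claim.
kubectl get pvc my-claim --output jsonpath='{.status.allocatedResourceStatuses}'

# Events on the PVC usually explain why an expansion is pending or has failed.
kubectl describe pvc my-claim
```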
Note that,
although you can specify a lower amount of storage than what was requested previously,
@ -501,7 +501,6 @@ Kubernetes does not support shrinking a PVC to less than its current size.
PersistentVolume types are implemented as plugins. Kubernetes currently supports the following plugins:
* [`cephfs`](/docs/concepts/storage/volumes/#cephfs) - CephFS volume
* [`csi`](/docs/concepts/storage/volumes/#csi) - Container Storage Interface (CSI)
* [`fc`](/docs/concepts/storage/volumes/#fc) - Fibre Channel (FC) storage
* [`hostPath`](/docs/concepts/storage/volumes/#hostpath) - HostPath volume
@ -511,7 +510,6 @@ PersistentVolume types are implemented as plugins. Kubernetes currently supports
* [`local`](/docs/concepts/storage/volumes/#local) - local storage devices
mounted on nodes.
* [`nfs`](/docs/concepts/storage/volumes/#nfs) - Network File System (NFS) storage
* [`rbd`](/docs/concepts/storage/volumes/#rbd) - Rados Block Device (RBD) volume
The following types of PersistentVolume are deprecated.
This means that support is still available but will be removed in a future Kubernetes release.
@ -526,6 +524,10 @@ This means that support is still available but will be removed in a future Kuber
(**deprecated** in v1.25)
* [`vsphereVolume`](/docs/concepts/storage/volumes/#vspherevolume) - vSphere VMDK volume
(**deprecated** in v1.19)
* [`cephfs`](/docs/concepts/storage/volumes/#cephfs) - CephFS volume
(**deprecated** in v1.28)
* [`rbd`](/docs/concepts/storage/volumes/#rbd) - Rados Block Device (RBD) volume
(**deprecated** in v1.28)
Older versions of Kubernetes also supported the following in-tree PersistentVolume types:
@ -635,7 +637,7 @@ The access modes are:
`ReadWriteOncePod`
: {{< feature-state for_k8s_version="v1.27" state="beta" >}}
the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod
access mode if you want to ensure that only one pod across whole cluster can
access mode if you want to ensure that only one pod across the whole cluster can
read that PVC or write to it. This is only supported for CSI volumes and
Kubernetes version 1.22+.
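
As a rough sketch, a claim that requests this access mode could look like the following; the claim
and StorageClass names are illustrative, and the StorageClass is assumed to be backed by a CSI driver:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-claim      # illustrative name
spec:
  accessModes:
  - ReadWriteOncePod             # only one Pod in the whole cluster may use the volume at a time
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-example  # assumed CSI-backed StorageClass
```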
@ -715,11 +717,12 @@ Not all Persistent Volume types support mount options.
The following volume types support mount options:
* `azureFile`
* `cephfs`
* `gcePersistentDisk`
* `cephfs` (**deprecated** in v1.28)
* `cinder` (**deprecated** in v1.18)
* `gcePersistentDisk` (**deprecated** in v1.28)
* `iscsi`
* `nfs`
* `rbd`
* `rbd` (**deprecated** in v1.28)
* `vsphereVolume`
Mount options are not validated. If a mount option is invalid, the mount fails.
@ -745,14 +748,35 @@ API reference has more details on this field.
### Phase
A volume will be in one of the following phases:
A PersistentVolume will be in one of the following phases:
* Available -- a free resource that is not yet bound to a claim
* Bound -- the volume is bound to a claim
* Released -- the claim has been deleted, but the resource is not yet reclaimed by the cluster
* Failed -- the volume has failed its automatic reclamation
`Available`
: a free resource that is not yet bound to a claim
The CLI will show the name of the PVC bound to the PV.
`Bound`
: the volume is bound to a claim
`Released`
: the claim has been deleted, but the associated storage resource is not yet reclaimed by the cluster
`Failed`
: the volume has failed its (automated) reclamation
You can see the name of the PVC bound to the PV using `kubectl describe persistentvolume <name>`.
#### Phase transition timestamp
{{< feature-state for_k8s_version="v1.28" state="alpha" >}}
The `.status` field for a PersistentVolume can include an alpha `lastPhaseTransitionTime` field. This field records
the timestamp of when the volume last transitioned its phase. For newly created
volumes the phase is set to `Pending` and `lastPhaseTransitionTime` is set to
the current time.
{{< note >}}
You need to enable the `PersistentVolumeLastPhaseTransitionTime` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
to use or see the `lastPhaseTransitionTime` field.
{{< /note >}}
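
With the feature gate enabled, you could inspect the field for a hypothetical PersistentVolume
named `pv0001` like this:

```shell
# Print the current phase and when the volume last changed phase
kubectl get pv pv0001 -o jsonpath='{.status.phase}{"\t"}{.status.lastPhaseTransitionTime}{"\n"}'
```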
## PersistentVolumeClaims
@ -833,9 +857,9 @@ is turned on.
PVs of that default. Specifying a default StorageClass is done by setting the
annotation `storageclass.kubernetes.io/is-default-class` equal to `true` in
a StorageClass object. If the administrator does not specify a default, the
cluster responds to PVC creation as if the admission plugin were turned off. If
more than one default is specified, the admission plugin forbids the creation of
all PVCs.
cluster responds to PVC creation as if the admission plugin were turned off. If more than one
default StorageClass is specified, the newest default is used when the
PVC is dynamically provisioned.
* If the admission plugin is turned off, there is no notion of a default
StorageClass. All PVCs that have `storageClassName` set to `""` can be
bound only to PVs that have `storageClassName` also set to `""`.
@ -862,7 +886,7 @@ it won't be supported in a future Kubernetes release.
#### Retroactive default StorageClass assignment
{{< feature-state for_k8s_version="v1.26" state="beta" >}}
{{< feature-state for_k8s_version="v1.28" state="stable" >}}
You can create a PersistentVolumeClaim without specifying a `storageClassName`
for the new PVC, and you can do so even when no default StorageClass exists
@ -932,10 +956,12 @@ applicable:
* CSI
* FC (Fibre Channel)
* GCEPersistentDisk
* GCEPersistentDisk (deprecated)
* iSCSI
* Local volume
* RBD (Ceph Block Device)
* OpenStack Cinder
* RBD (deprecated)
* RBD (Ceph Block Device; deprecated)
* VsphereVolume
### PersistentVolume using a Raw Block Volume {#persistent-volume-using-a-raw-block-volume}
@ -1104,7 +1130,7 @@ Kubernetes supports cross namespace volume data sources.
To use cross namespace volume data sources, you must enable the `AnyVolumeDataSource`
and `CrossNamespaceVolumeDataSource`
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/) for
the kube-apiserver, kube-controller-manager.
the kube-apiserver and kube-controller-manager.
Also, you must enable the `CrossNamespaceVolumeDataSource` feature gate for the csi-provisioner.
Enabling the `CrossNamespaceVolumeDataSource` feature gate allows you to specify

View File

@ -235,7 +235,7 @@ parameters:
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
- key: topology.kubernetes.io/zone
values:
- us-central-1a
- us-central-1b
@ -459,6 +459,11 @@ There are few
which you try out for persistent volume management inside Kubernetes for vSphere.
### Ceph RBD
{{< note >}}
{{< feature-state state="deprecated" for_k8s_version="v1.28" >}}
This internal provisioner of Ceph RBD is deprecated. Please use the
[Ceph CSI RBD driver](https://github.com/ceph/ceph-csi) instead.
{{< /note >}}
```yaml
apiVersion: storage.k8s.io/v1

View File

@ -74,10 +74,10 @@ used for provisioning VolumeSnapshots. This field must be specified.
### DeletionPolicy
Volume snapshot classes have a deletionPolicy. It enables you to configure what
happens to a VolumeSnapshotContent when the VolumeSnapshot object it is bound to
is to be deleted. The deletionPolicy of a volume snapshot class can either be
`Retain` or `Delete`. This field must be specified.
Volume snapshot classes have a [deletionPolicy](/docs/concepts/storage/volume-snapshots/#delete).
It enables you to configure what happens to a VolumeSnapshotContent when the VolumeSnapshot
object it is bound to is to be deleted. The deletionPolicy of a volume snapshot class can
either be `Retain` or `Delete`. This field must be specified.
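
For example, a minimal sketch of a volume snapshot class that sets this field (the class name and
driver are illustrative):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: snapclass-example       # illustrative name
driver: hostpath.csi.k8s.io     # assumed CSI driver
deletionPolicy: Delete          # the underlying snapshot is deleted together with the VolumeSnapshotContent
```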
If the deletionPolicy is `Delete`, then the underlying storage snapshot will be
deleted along with the VolumeSnapshotContent object. If the deletionPolicy is `Retain`,

View File

@ -119,6 +119,12 @@ To disable the `azureFile` storage plugin from being loaded by the controller ma
and the kubelet, set the `InTreePluginAzureFileUnregister` flag to `true`.
### cephfs
{{< feature-state for_k8s_version="v1.28" state="deprecated" >}}
{{< note >}}
The Kubernetes project suggests that you use the [CephFS CSI](https://github.com/ceph/ceph-csi) third party
storage driver instead.
{{< /note >}}
A `cephfs` volume allows an existing CephFS volume to be
mounted into your Pod. Unlike `emptyDir`, which is erased when a pod is
@ -762,6 +768,12 @@ A projected volume maps several existing volume sources into the same
directory. For more details, see [projected volumes](/docs/concepts/storage/projected-volumes/).
### rbd
{{< feature-state for_k8s_version="v1.28" state="deprecated" >}}
{{< note >}}
The Kubernetes project suggests that you use the [Ceph CSI](https://github.com/ceph/ceph-csi)
third party storage driver instead, in RBD mode.
{{< /note >}}
An `rbd` volume allows a
[Rados Block Device](https://docs.ceph.com/en/latest/rbd/) (RBD) volume to mount
@ -785,7 +797,7 @@ for more details.
#### RBD CSI migration {#rbd-csi-migration}
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
{{< feature-state for_k8s_version="v1.28" state="deprecated" >}}
The `CSIMigration` feature for `RBD`, when enabled, redirects all plugin
operations from the existing in-tree plugin to the `rbd.csi.ceph.com` {{<
@ -840,7 +852,8 @@ For more details, see [Configuring Secrets](/docs/concepts/configuration/secret/
### vsphereVolume (deprecated) {#vspherevolume}
{{< note >}}
We recommend to use vSphere CSI out-of-tree driver instead.
The Kubernetes project recommends using the [vSphere CSI](https://github.com/kubernetes-sigs/vsphere-csi-driver)
out-of-tree storage driver instead.
{{< /note >}}
A `vsphereVolume` is used to mount a vSphere VMDK volume into your Pod. The contents
@ -1061,7 +1074,7 @@ persistent volume:
`ControllerPublishVolume` and `ControllerUnpublishVolume` calls. This field is
optional, and may be empty if no secret is required. If the Secret
contains more than one secret, all secrets are passed.
`nodeExpandSecretRef`: A reference to the secret containing sensitive
* `nodeExpandSecretRef`: A reference to the secret containing sensitive
information to pass to the CSI driver to complete the CSI
`NodeExpandVolume` call. This field is optional, and may be empty if no
secret is required. If the object contains more than one secret, all
@ -1116,8 +1129,8 @@ For more information on how to develop a CSI driver, refer to the
CSI node plugins need to perform various privileged
operations like scanning of disk devices and mounting of file systems. These operations
differ for each host operating system. For Linux worker nodes, containerized CSI node
node plugins are typically deployed as privileged containers. For Windows worker nodes,
differ for each host operating system. For Linux worker nodes, containerized CSI node
plugins are typically deployed as privileged containers. For Windows worker nodes,
privileged operations for containerized CSI node plugins is supported using
[csi-proxy](https://github.com/kubernetes-csi/csi-proxy), a community-managed,
stand-alone binary that needs to be pre-installed on each Windows node.

View File

@ -4,6 +4,10 @@ weight: 55
description: >
Understand Pods, the smallest deployable compute object in Kubernetes, and the higher-level abstractions that help you to run them.
no_list: true
card:
title: Workloads and Pods
name: concepts
weight: 60
---
{{< glossary_definition term_id="workload" length="short" >}}

View File

@ -18,8 +18,8 @@ hide_summary: true # Listed separately in section index
<!-- overview -->
A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate.
As pods successfully complete, the Job tracks the successful completions. When a specified number
of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up
As pods successfully complete, the Job tracks the successful completions. When a specified number
of successful completions is reached, the task (i.e., Job) is complete. Deleting a Job will clean up
the Pods it created. Suspending a Job will delete its active Pods until the Job
is resumed again.
@ -36,7 +36,7 @@ see [CronJob](/docs/concepts/workloads/controllers/cron-jobs/).
## Running an example Job
Here is an example Job config. It computes π to 2000 places and prints it out.
Here is an example Job config. It computes π to 2000 places and prints it out.
It takes around 10s to complete.
{{% code file="controllers/job.yaml" %}}
@ -163,7 +163,7 @@ The output is similar to this:
pi-5rwd7
```
Here, the selector is the same as the selector for the Job. The `--output=jsonpath` option specifies an expression
Here, the selector is the same as the selector for the Job. The `--output=jsonpath` option specifies an expression
with the name from each Pod in the returned list.
View the standard output of one of the pods:
@ -189,9 +189,9 @@ The output is similar to this:
As with all other Kubernetes config, a Job needs `apiVersion`, `kind`, and `metadata` fields.
When the control plane creates new Pods for a Job, the `.metadata.name` of the
Job is part of the basis for naming those Pods. The name of a Job must be a valid
Job is part of the basis for naming those Pods. The name of a Job must be a valid
[DNS subdomain](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names)
value, but this can produce unexpected results for the Pod hostnames. For best compatibility,
value, but this can produce unexpected results for the Pod hostnames. For best compatibility,
the name should follow the more restrictive rules for a
[DNS label](/docs/concepts/overview/working-with-objects/names#dns-label-names).
Even when the name is a DNS subdomain, the name must be no longer than 63
@ -207,20 +207,21 @@ Job labels will have `batch.kubernetes.io/` prefix for `job-name` and `controlle
The `.spec.template` is the only required field of the `.spec`.
The `.spec.template` is a [pod template](/docs/concepts/workloads/pods/#pod-templates). It has exactly the same schema as a {{< glossary_tooltip text="Pod" term_id="pod" >}}, except it is nested and does not have an `apiVersion` or `kind`.
The `.spec.template` is a [pod template](/docs/concepts/workloads/pods/#pod-templates).
It has exactly the same schema as a {{< glossary_tooltip text="Pod" term_id="pod" >}},
except it is nested and does not have an `apiVersion` or `kind`.
In addition to required fields for a Pod, a pod template in a Job must specify appropriate
labels (see [pod selector](#pod-selector)) and an appropriate restart policy.
Only a [`RestartPolicy`](/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy) equal to `Never` or `OnFailure` is allowed.
Only a [`RestartPolicy`](/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy)
equal to `Never` or `OnFailure` is allowed.
### Pod selector
The `.spec.selector` field is optional. In almost all cases you should not specify it.
The `.spec.selector` field is optional. In almost all cases you should not specify it.
See section [specifying your own pod selector](#specifying-your-own-pod-selector).
### Parallel execution for Jobs {#parallel-jobs}
There are three main types of task suitable to run as a Job:
@ -234,14 +235,18 @@ There are three main types of task suitable to run as a Job:
- when using `.spec.completionMode="Indexed"`, each Pod gets a different index in the range 0 to `.spec.completions-1`.
1. Parallel Jobs with a *work queue*:
- do not specify `.spec.completions`, default to `.spec.parallelism`.
- the Pods must coordinate amongst themselves or an external service to determine what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
- each Pod is independently capable of determining whether or not all its peers are done, and thus that the entire Job is done.
- the Pods must coordinate amongst themselves or an external service to determine
what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
- each Pod is independently capable of determining whether or not all its peers are done,
and thus that the entire Job is done.
- when _any_ Pod from the Job terminates with success, no new Pods are created.
- once at least one Pod has terminated with success and all Pods are terminated, then the Job is completed with success.
- once any Pod has exited with success, no other Pod should still be doing any work for this task or writing any output. They should all be in the process of exiting.
- once at least one Pod has terminated with success and all Pods are terminated,
then the Job is completed with success.
- once any Pod has exited with success, no other Pod should still be doing any work
for this task or writing any output. They should all be in the process of exiting.
For a _non-parallel_ Job, you can leave both `.spec.completions` and `.spec.parallelism` unset. When both are
unset, both are defaulted to 1.
For a _non-parallel_ Job, you can leave both `.spec.completions` and `.spec.parallelism` unset.
When both are unset, both are defaulted to 1.
For a _fixed completion count_ Job, you should set `.spec.completions` to the number of completions needed.
You can set `.spec.parallelism`, or leave it unset and it will default to 1.
@ -249,7 +254,8 @@ You can set `.spec.parallelism`, or leave it unset and it will default to 1.
For a _work queue_ Job, you must leave `.spec.completions` unset, and set `.spec.parallelism` to
a non-negative integer.
For more information about how to make use of the different types of job, see the [job patterns](#job-patterns) section.
For more information about how to make use of the different types of job,
see the [job patterns](#job-patterns) section.
#### Controlling parallelism
@ -261,7 +267,7 @@ Actual parallelism (number of pods running at any instant) may be more or less t
parallelism, for a variety of reasons:
- For _fixed completion count_ Jobs, the actual number of pods running in parallel will not exceed the number of
remaining completions. Higher values of `.spec.parallelism` are effectively ignored.
remaining completions. Higher values of `.spec.parallelism` are effectively ignored.
- For _work queue_ Jobs, no new Pods are started after any Pod has succeeded -- remaining Pods are allowed to complete, however.
- If the Job {{< glossary_tooltip term_id="controller" >}} has not had time to react.
- If the Job controller failed to create Pods for any reason (lack of `ResourceQuota`, lack of permission, etc.),
@ -281,8 +287,11 @@ Jobs with _fixed completion count_ - that is, jobs that have non null
completion is homologous to each other. Note that Jobs that have null
`.spec.completions` are implicitly `NonIndexed`.
- `Indexed`: the Pods of a Job get an associated completion index from 0 to
`.spec.completions-1`. The index is available through three mechanisms:
`.spec.completions-1`. The index is available through four mechanisms:
- The Pod annotation `batch.kubernetes.io/job-completion-index`.
- The Pod label `batch.kubernetes.io/job-completion-index` (for v1.28 and later). Note
the feature gate `PodIndexLabel` must be enabled to use this label, and it is enabled
by default.
- As part of the Pod hostname, following the pattern `$(job-name)-$(index)`.
When you use an Indexed Job in combination with a
{{< glossary_tooltip term_id="Service" >}}, Pods within the Job can use
@ -301,33 +310,36 @@ count towards the completion count and update the status of the Job. The other P
or completed for the same index will be deleted by the Job controller once they are detected.
{{< /note >}}
## Handling Pod and container failures
A container in a Pod may fail for a number of reasons, such as because the process in it exited with
a non-zero exit code, or the container was killed for exceeding a memory limit, etc. If this
a non-zero exit code, or the container was killed for exceeding a memory limit, etc. If this
happens, and the `.spec.template.spec.restartPolicy = "OnFailure"`, then the Pod stays
on the node, but the container is re-run. Therefore, your program needs to handle the case when it is
on the node, but the container is re-run. Therefore, your program needs to handle the case when it is
restarted locally, or else specify `.spec.template.spec.restartPolicy = "Never"`.
See [pod lifecycle](/docs/concepts/workloads/pods/pod-lifecycle/#example-states) for more information on `restartPolicy`.
An entire Pod can also fail, for a number of reasons, such as when the pod is kicked off the node
(node is upgraded, rebooted, deleted, etc.), or if a container of the Pod fails and the
`.spec.template.spec.restartPolicy = "Never"`. When a Pod fails, then the Job controller
starts a new Pod. This means that your application needs to handle the case when it is restarted in a new
pod. In particular, it needs to handle temporary files, locks, incomplete output and the like
`.spec.template.spec.restartPolicy = "Never"`. When a Pod fails, then the Job controller
starts a new Pod. This means that your application needs to handle the case when it is restarted in a new
pod. In particular, it needs to handle temporary files, locks, incomplete output and the like
caused by previous runs.
By default, each pod failure is counted towards the `.spec.backoffLimit` limit,
see [pod backoff failure policy](#pod-backoff-failure-policy). However, you can
customize handling of pod failures by setting the Job's [pod failure policy](#pod-failure-policy).
Additionally, you can choose to count the pod failures independently for each
index of an [Indexed](#completion-mode) Job by setting the `.spec.backoffLimitPerIndex` field
(for more information, see [backoff limit per index](#backoff-limit-per-index)).
Note that even if you specify `.spec.parallelism = 1` and `.spec.completions = 1` and
`.spec.template.spec.restartPolicy = "Never"`, the same program may
sometimes be started twice.
If you do specify `.spec.parallelism` and `.spec.completions` both greater than 1, then there may be
multiple pods running at once. Therefore, your pods must also be tolerant of concurrency.
multiple pods running at once. Therefore, your pods must also be tolerant of concurrency.
When the [feature gates](/docs/reference/command-line-tools-reference/feature-gates/)
`PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled,
@ -352,6 +364,7 @@ Pods associated with the Job are recreated by the Job controller with an
exponential back-off delay (10s, 20s, 40s ...) capped at six minutes.
The number of retries is calculated in two ways:
- The number of Pods with `.status.phase = "Failed"`.
- When using `restartPolicy = "OnFailure"`, the number of retries in all the
containers of Pods with `.status.phase` equal to `Pending` or `Running`.
@ -361,11 +374,78 @@ considered failed.
{{< note >}}
If your job has `restartPolicy = "OnFailure"`, keep in mind that your Pod running the Job
will be terminated once the job backoff limit has been reached. This can make debugging the Job's executable more difficult. We suggest setting
will be terminated once the job backoff limit has been reached. This can make debugging
the Job's executable more difficult. We suggest setting
`restartPolicy = "Never"` when debugging the Job or using a logging system to ensure output
from failed Jobs is not lost inadvertently.
{{< /note >}}
### Backoff limit per index {#backoff-limit-per-index}
{{< feature-state for_k8s_version="v1.28" state="alpha" >}}
{{< note >}}
You can only configure the backoff limit per index for an [Indexed](#completion-mode) Job, if you
have the `JobBackoffLimitPerIndex` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
enabled in your cluster.
{{< /note >}}
When you run an [indexed](#completion-mode) Job, you can choose to handle retries
for pod failures independently for each index. To do so, set the
`.spec.backoffLimitPerIndex` to specify the maximal number of pod failures
per index.
When the per-index backoff limit is exceeded for an index, Kubernetes considers the index as failed and adds it to the
`.status.failedIndexes` field. The succeeded indexes, those with successfully
executed pods, are recorded in the `.status.completedIndexes` field, regardless of whether you set
the `backoffLimitPerIndex` field.
Note that a failing index does not interrupt execution of other indexes.
Once all indexes finish for a Job where you specified a backoff limit per index,
if at least one of those indexes did fail, the Job controller marks the overall
Job as failed, by setting the Failed condition in the status. The Job gets
marked as failed even if some, potentially nearly all, of the indexes were
processed successfully.
You can additionally limit the maximal number of indexes marked failed by
setting the `.spec.maxFailedIndexes` field.
When the number of failed indexes exceeds the `maxFailedIndexes` field, the
Job controller triggers termination of all remaining running Pods for that Job.
Once all pods are terminated, the entire Job is marked failed by the Job
controller, by setting the Failed condition in the Job status.
Here is an example manifest for a Job that defines a `backoffLimitPerIndex`:
{{< codenew file="/controllers/job-backoff-limit-per-index-example.yaml" >}}
In the example above, the Job controller allows for one restart for each
of the indexes. When the total number of failed indexes exceeds 5, then
the entire Job is terminated.
Once the job is finished, the Job status looks as follows:
```sh
kubectl get -o yaml job job-backoff-limit-per-index-example
```
```yaml
status:
completedIndexes: 1,3,5,7,9
failedIndexes: 0,2,4,6,8
succeeded: 5 # 1 succeeded pod for each of 5 succeeded indexes
failed: 10 # 2 failed pods (1 retry) for each of 5 failed indexes
conditions:
- message: Job has failed indexes
reason: FailedIndexes
status: "True"
type: Failed
```
Additionally, you may want to use the per-index backoff along with a
[pod failure policy](#pod-failure-policy). When using
per-index backoff, there is a new `FailIndex` action available which allows you to
avoid unnecessary retries within an index.
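
A minimal sketch of combining the two might look like the following; the Job name, image, and exit
code are illustrative, and the `JobBackoffLimitPerIndex` and `JobPodFailurePolicy` feature gates are
assumed to be enabled:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-failindex-example     # illustrative name
spec:
  completions: 6
  parallelism: 2
  completionMode: Indexed
  backoffLimitPerIndex: 1             # at most one retry per index
  maxFailedIndexes: 3                 # give up on the whole Job if more than 3 indexes fail
  template:
    spec:
      restartPolicy: Never            # required when using a pod failure policy
      containers:
      - name: main
        image: docker.io/library/bash:5
        command: ["bash", "-c", "exit 42"]   # placeholder workload
  podFailurePolicy:
    rules:
    - action: FailIndex               # fail the index right away, without per-index retries
      onExitCodes:
        operator: In
        values: [42]
```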
### Pod failure policy {#pod-failure-policy}
{{< feature-state for_k8s_version="v1.26" state="beta" >}}
@ -376,8 +456,8 @@ You can only configure a Pod failure policy for a Job if you have the
enabled in your cluster. Additionally, it is recommended
to enable the `PodDisruptionConditions` feature gate in order to be able to detect and handle
Pod disruption conditions in the Pod failure policy (see also:
[Pod disruption conditions](/docs/concepts/workloads/pods/disruptions#pod-disruption-conditions)). Both feature gates are
available in Kubernetes {{< skew currentVersion >}}.
[Pod disruption conditions](/docs/concepts/workloads/pods/disruptions#pod-disruption-conditions)).
Both feature gates are available in Kubernetes {{< skew currentVersion >}}.
{{< /note >}}
A Pod failure policy, defined with the `.spec.podFailurePolicy` field, enables
@ -387,11 +467,12 @@ Pod conditions.
In some situations, you may want to have a better control when handling Pod
failures than the control provided by the [Pod backoff failure policy](#pod-backoff-failure-policy),
which is based on the Job's `.spec.backoffLimit`. These are some examples of use cases:
* To optimize costs of running workloads by avoiding unnecessary Pod restarts,
you can terminate a Job as soon as one of its Pods fails with an exit code
indicating a software bug.
* To guarantee that your Job finishes even if there are disruptions, you can
ignore Pod failures caused by disruptions (such {{< glossary_tooltip text="preemption" term_id="preemption" >}},
ignore Pod failures caused by disruptions (such as {{< glossary_tooltip text="preemption" term_id="preemption" >}},
{{< glossary_tooltip text="API-initiated eviction" term_id="api-eviction" >}}
or {{< glossary_tooltip text="taint" term_id="taint" >}}-based eviction) so
that they don't count towards the `.spec.backoffLimit` limit of retries.
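
A rough sketch that covers both of these use cases could look like this; the Job name, container
name, and exit code are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy-sketch    # illustrative name
spec:
  completions: 8
  parallelism: 2
  backoffLimit: 6
  template:
    spec:
      restartPolicy: Never               # required when using a pod failure policy
      containers:
      - name: main
        image: docker.io/library/bash:5
        command: ["bash", "-c", "echo processing && sleep 30"]   # placeholder workload
  podFailurePolicy:
    rules:
    - action: FailJob                    # exit code 42 is assumed to indicate a software bug
      onExitCodes:
        containerName: main
        operator: In
        values: [42]
    - action: Ignore                     # disruptions do not count towards .spec.backoffLimit
      onPodConditions:
      - type: DisruptionTarget
```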
@ -430,6 +511,7 @@ the Pods in that Job that are still Pending or Running.
{{< /note >}}
These are some requirements and semantics of the API:
- if you want to use a `.spec.podFailurePolicy` field for a Job, you must
also define that Job's pod template with `.spec.restartPolicy` set to `Never`.
- the Pod failure policy rules you specify under `spec.podFailurePolicy.rules`
@ -448,6 +530,8 @@ These are some requirements and semantics of the API:
should not be incremented and a replacement Pod should be created.
- `Count`: use to indicate that the Pod should be handled in the default way.
The counter towards the `.spec.backoffLimit` should be incremented.
- `FailIndex`: use this action along with [backoff limit per index](#backoff-limit-per-index)
to avoid unnecessary retries within the index of a failed pod.
{{< note >}}
When you use a `podFailurePolicy`, the job controller only matches Pods in the
@ -460,23 +544,34 @@ Since Kubernetes 1.27, Kubelet transitions deleted pods to a terminal phase
ensures that deleted pods have their finalizers removed by the Job controller.
{{< /note >}}
{{< note >}}
Starting with Kubernetes v1.28, when Pod failure policy is used, the Job controller recreates
terminating Pods only once these Pods reach the terminal `Failed` phase. This behavior is similar
to `podReplacementPolicy: Failed`. For more information, see [Pod replacement policy](#pod-replacement-policy).
{{< /note >}}
## Job termination and cleanup
When a Job completes, no more Pods are created, but the Pods are [usually](#pod-backoff-failure-policy) not deleted either.
Keeping them around
allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output.
The job object also remains after it is completed so that you can view its status. It is up to the user to delete
old jobs after noting their status. Delete the job with `kubectl` (e.g. `kubectl delete jobs/pi` or `kubectl delete -f ./job.yaml`). When you delete the job using `kubectl`, all the pods it created are deleted too.
Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output.
The job object also remains after it is completed so that you can view its status. It is up to the user to delete
old jobs after noting their status. Delete the job with `kubectl` (e.g. `kubectl delete jobs/pi` or `kubectl delete -f ./job.yaml`).
When you delete the job using `kubectl`, all the pods it created are deleted too.
By default, a Job will run uninterrupted unless a Pod fails (`restartPolicy=Never`) or a Container exits in error (`restartPolicy=OnFailure`), at which point the Job defers to the
`.spec.backoffLimit` described above. Once `.spec.backoffLimit` has been reached the Job will be marked as failed and any running Pods will be terminated.
By default, a Job will run uninterrupted unless a Pod fails (`restartPolicy=Never`)
or a Container exits in error (`restartPolicy=OnFailure`), at which point the Job defers to the
`.spec.backoffLimit` described above. Once `.spec.backoffLimit` has been reached the Job will
be marked as failed and any running Pods will be terminated.
Another way to terminate a Job is by setting an active deadline.
Do this by setting the `.spec.activeDeadlineSeconds` field of the Job to a number of seconds.
The `activeDeadlineSeconds` applies to the duration of the job, no matter how many Pods are created.
Once a Job reaches `activeDeadlineSeconds`, all of its running Pods are terminated and the Job status will become `type: Failed` with `reason: DeadlineExceeded`.
Once a Job reaches `activeDeadlineSeconds`, all of its running Pods are terminated and the Job status
will become `type: Failed` with `reason: DeadlineExceeded`.
Note that a Job's `.spec.activeDeadlineSeconds` takes precedence over its `.spec.backoffLimit`. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by `activeDeadlineSeconds`, even if the `backoffLimit` is not yet reached.
Note that a Job's `.spec.activeDeadlineSeconds` takes precedence over its `.spec.backoffLimit`.
Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once
it reaches the time limit specified by `activeDeadlineSeconds`, even if the `backoffLimit` is not yet reached.
Example:
@ -493,14 +588,17 @@ spec:
containers:
- name: pi
image: perl:5.34.0
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
```
Note that both the Job spec and the [Pod template spec](/docs/concepts/workloads/pods/init-containers/#detailed-behavior) within the Job have an `activeDeadlineSeconds` field. Ensure that you set this field at the proper level.
Note that both the Job spec and the [Pod template spec](/docs/concepts/workloads/pods/init-containers/#detailed-behavior)
within the Job have an `activeDeadlineSeconds` field. Ensure that you set this field at the proper level.
Keep in mind that the `restartPolicy` applies to the Pod, and not to the Job itself: there is no automatic Job restart once the Job status is `type: Failed`.
That is, the Job termination mechanisms activated with `.spec.activeDeadlineSeconds` and `.spec.backoffLimit` result in a permanent Job failure that requires manual intervention to resolve.
Keep in mind that the `restartPolicy` applies to the Pod, and not to the Job itself:
there is no automatic Job restart once the Job status is `type: Failed`.
That is, the Job termination mechanisms activated with `.spec.activeDeadlineSeconds`
and `.spec.backoffLimit` result in a permanent Job failure that requires manual intervention to resolve.
## Clean up finished jobs automatically
@ -539,7 +637,7 @@ spec:
containers:
- name: pi
image: perl:5.34.0
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
```
@ -568,32 +666,30 @@ cap on the amount of resources that a particular namespace can
consume.
{{< /note >}}
## Job patterns
The Job object can be used to support reliable parallel execution of Pods. The Job object is not
The Job object can be used to support reliable parallel execution of Pods. The Job object is not
designed to support closely-communicating parallel processes, as commonly found in scientific
computing. It does support parallel processing of a set of independent but related *work items*.
computing. It does support parallel processing of a set of independent but related *work items*.
These might be emails to be sent, frames to be rendered, files to be transcoded, ranges of keys in a
NoSQL database to scan, and so on.
In a complex system, there may be multiple different sets of work items. Here we are just
In a complex system, there may be multiple different sets of work items. Here we are just
considering one set of work items that the user wants to manage together &mdash; a *batch job*.
There are several different patterns for parallel computation, each with strengths and weaknesses.
The tradeoffs are:
- One Job object for each work item, vs. a single Job object for all work items. The latter is
better for large numbers of work items. The former creates some overhead for the user and for the
- One Job object for each work item, vs. a single Job object for all work items. The latter is
better for large numbers of work items. The former creates some overhead for the user and for the
system to manage large numbers of Job objects.
- Number of pods created equals number of work items, vs. each Pod can process multiple work items.
The former typically requires less modification to existing code and containers. The latter
The former typically requires less modification to existing code and containers. The latter
is better for large numbers of work items, for similar reasons to the previous bullet.
- Several approaches use a work queue. This requires running a queue service,
- Several approaches use a work queue. This requires running a queue service,
and modifications to the existing program or container to make it use the work queue.
Other approaches are easier to adapt to an existing containerised application.
The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs.
The pattern names are also links to examples and more detailed description.
@ -603,12 +699,12 @@ The pattern names are also links to examples and more detailed description.
| [Queue with Variable Pod Count] | ✓ | ✓ | |
| [Indexed Job with Static Work Assignment] | ✓ | | ✓ |
| [Job Template Expansion] | | | ✓ |
| [Job with Pod-to-Pod Communication] | ✓ | sometimes | sometimes |
| [Job with Pod-to-Pod Communication] | ✓ | sometimes | sometimes |
When you specify completions with `.spec.completions`, each Pod created by the Job controller
has an identical [`spec`](https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status). This means that
all pods for a task will have the same command line and the same
image, the same volumes, and (almost) the same environment variables. These patterns
has an identical [`spec`](https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status).
This means that all pods for a task will have the same command line and the same
image, the same volumes, and (almost) the same environment variables. These patterns
are different ways to arrange for pods to work on different things.
This table shows the required settings for `.spec.parallelism` and `.spec.completions` for each of the patterns.
@ -649,7 +745,8 @@ When a Job is resumed from suspension, its `.status.startTime` field will be
reset to the current time. This means that the `.spec.activeDeadlineSeconds`
timer will be stopped and reset when a Job is suspended and resumed.
When you suspend a Job, any running Pods that don't have a status of `Completed` will be [terminated](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination).
When you suspend a Job, any running Pods that don't have a status of `Completed`
will be [terminated](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
with a SIGTERM signal. The Pod's graceful termination period will be honored and
your Pod must handle this signal in this period. This may involve saving
progress for later or undoing changes. Pods terminated this way will not count
@ -743,19 +840,19 @@ as soon as the Job was resumed.
{{< feature-state for_k8s_version="v1.27" state="stable" >}}
In most cases a parallel job will want the pods to run with constraints,
In most cases, a parallel job will want the pods to run with constraints,
like all in the same zone, or all either on GPU model x or y but not a mix of both.
The [suspend](#suspending-a-job) field is the first step towards achieving those semantics. Suspend allows a
The [suspend](#suspending-a-job) field is the first step towards achieving those semantics. Suspend allows a
custom queue controller to decide when a job should start; however, once a job is unsuspended,
a custom queue controller has no influence on where the pods of a job will actually land.
This feature allows updating a Job's scheduling directives before it starts, which gives custom queue
controllers the ability to influence pod placement while at the same time offloading actual
pod-to-node assignment to kube-scheduler. This is allowed only for suspended Jobs that have never
controllers the ability to influence pod placement while at the same time offloading actual
pod-to-node assignment to kube-scheduler. This is allowed only for suspended Jobs that have never
been unsuspended before.
The fields in a Job's pod template that can be updated are node affinity, node selector,
The fields in a Job's pod template that can be updated are node affinity, node selector,
tolerations, labels, annotations and [scheduling gates](/docs/concepts/scheduling-eviction/pod-scheduling-readiness/).
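
For example, a custom queue controller (or you, using kubectl) could pin a still-suspended Job to a
zone before releasing it; the Job name and zone below are illustrative:

```shell
# Add a node selector to the pod template of a suspended Job that has never been unsuspended
kubectl patch job myjob --type=strategic --patch \
  '{"spec":{"template":{"spec":{"nodeSelector":{"topology.kubernetes.io/zone":"us-central-1a"}}}}}'

# Then release the Job; its Pods are created with the updated scheduling constraints
kubectl patch job myjob --type=strategic --patch '{"spec":{"suspend":false}}'
```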
### Specifying your own Pod selector
@ -767,17 +864,17 @@ It picks a selector value that will not overlap with any other jobs.
However, in some cases, you might need to override this automatically set selector.
To do this, you can specify the `.spec.selector` of the Job.
Be very careful when doing this. If you specify a label selector which is not
Be very careful when doing this. If you specify a label selector which is not
unique to the pods of that Job, and which matches unrelated Pods, then pods of the unrelated
job may be deleted, or this Job may count other Pods as completing it, or one or both
Jobs may refuse to create Pods or run to completion. If a non-unique selector is
Jobs may refuse to create Pods or run to completion. If a non-unique selector is
chosen, then other controllers (e.g. ReplicationController) and their Pods may behave
in unpredictable ways too. Kubernetes will not stop you from making a mistake when
in unpredictable ways too. Kubernetes will not stop you from making a mistake when
specifying `.spec.selector`.
Here is an example of a case when you might want to use this feature.
Say Job `old` is already running. You want existing Pods
Say Job `old` is already running. You want existing Pods
to keep running, but you want the rest of the Pods it creates
to use a different pod template and for the Job to have a new name.
You cannot update the Job because these fields are not updatable.
@ -823,7 +920,7 @@ spec:
...
```
The new Job itself will have a different uid from `a8f3d00d-c6d2-11e5-9f87-42010af00002`. Setting
The new Job itself will have a different uid from `a8f3d00d-c6d2-11e5-9f87-42010af00002`. Setting
`manualSelector: true` tells the system that you know what you are doing and to allow this
mismatch.
@ -831,32 +928,12 @@ mismatch.
{{< feature-state for_k8s_version="v1.26" state="stable" >}}
{{< note >}}
The control plane doesn't track Jobs using finalizers, if the Jobs were created
when the feature gate `JobTrackingWithFinalizers` was disabled, even after you
upgrade the control plane to 1.26.
{{< /note >}}
The control plane keeps track of the Pods that belong to any Job and notices if
any such Pod is removed from the API server. To do that, the Job controller
creates Pods with the finalizer `batch.kubernetes.io/job-tracking`. The
controller removes the finalizer only after the Pod has been accounted for in
the Job status, allowing the Pod to be removed by other controllers or users.
Jobs created before upgrading to Kubernetes 1.26 or before the feature gate
`JobTrackingWithFinalizers` is enabled are tracked without the use of Pod
finalizers.
The Job {{< glossary_tooltip term_id="controller" text="controller" >}} updates
the status counters for `succeeded` and `failed` Pods based only on the Pods
that exist in the cluster. The control plane can lose track of the progress of
the Job if Pods are deleted from the cluster.
You can determine if the control plane is tracking a Job using Pod finalizers by
checking if the Job has the annotation
`batch.kubernetes.io/job-tracking`. You should **not** manually add or remove
this annotation from Jobs. Instead, you can recreate the Jobs to ensure they
are tracked using Pod finalizers.
### Elastic Indexed Jobs
{{< feature-state for_k8s_version="v1.27" state="beta" >}}
@ -870,12 +947,61 @@ is disabled, `.spec.completions` is immutable.
Use cases for elastic Indexed Jobs include batch workloads which require
scaling an indexed Job, such as MPI, Horovod, Ray, and PyTorch training jobs.
### Delayed creation of replacement pods {#pod-replacement-policy}
{{< feature-state for_k8s_version="v1.28" state="alpha" >}}
{{< note >}}
You can only set `podReplacementPolicy` on Jobs if you enable the `JobPodReplacementPolicy`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
{{< /note >}}
By default, the Job controller recreates Pods as soon as they either fail or are terminating (have a deletion timestamp).
This means that, at a given time, when some of the Pods are terminating, the number of running Pods for a Job
can be greater than `parallelism` or greater than one Pod per index (if you are using an Indexed Job).
You may choose to create replacement Pods only when the terminating Pod is fully terminal (has `status.phase: Failed`).
To do this, set the `.spec.podReplacementPolicy: Failed`.
The default replacement policy depends on whether the Job has a `podFailurePolicy` set.
With no Pod failure policy defined for a Job, omitting the `podReplacementPolicy` field selects the
`TerminatingOrFailed` replacement policy:
the control plane creates replacement Pods immediately upon Pod deletion
(as soon as the control plane sees that a Pod for this Job has `deletionTimestamp` set).
For Jobs with a Pod failure policy set, the default `podReplacementPolicy` is `Failed`, and no other
value is permitted.
See [Pod failure policy](#pod-failure-policy) to learn more about Pod failure policies for Jobs.
```yaml
kind: Job
metadata:
name: new
...
spec:
podReplacementPolicy: Failed
...
```
Provided your cluster has the feature gate enabled, you can inspect the `.status.terminating` field of a Job.
The value of the field is the number of Pods owned by the Job that are currently terminating.
```shell
kubectl get jobs/myjob -o yaml
```
```yaml
apiVersion: batch/v1
kind: Job
# .metadata and .spec omitted
status:
terminating: 3 # three Pods are terminating and have not yet reached the Failed phase
```
## Alternatives
### Bare Pods
When the node that a Pod is running on reboots or fails, the pod is terminated
and will not be restarted. However, a Job will create new Pods to replace terminated ones.
and will not be restarted. However, a Job will create new Pods to replace terminated ones.
For this reason, we recommend that you use a Job rather than a bare Pod, even if your application
requires only a single Pod.
@ -892,12 +1018,12 @@ for pods with `RestartPolicy` equal to `OnFailure` or `Never`.
### Single Job starts controller Pod
Another pattern is for a single Job to create a Pod which then creates other Pods, acting as a sort
of custom controller for those Pods. This allows the most flexibility, but may be somewhat
of custom controller for those Pods. This allows the most flexibility, but may be somewhat
complicated to get started with and offers less integration with Kubernetes.
One example of this pattern would be a Job which starts a Pod which runs a script that in turn
starts a Spark master controller (see [spark example](https://github.com/kubernetes/examples/tree/master/staging/spark/README.md)), runs a spark
driver, and then cleans up.
starts a Spark master controller (see [spark example](https://github.com/kubernetes/examples/tree/master/staging/spark/README.md)),
runs a spark driver, and then cleans up.
An advantage of this approach is that the overall process gets the completion guarantee of a Job
object, but maintains complete control over what Pods are created and how work is assigned to them.
@ -906,10 +1032,10 @@ object, but maintains complete control over what Pods are created and how work i
* Learn about [Pods](/docs/concepts/workloads/pods).
* Read about different ways of running Jobs:
* [Coarse Parallel Processing Using a Work Queue](/docs/tasks/job/coarse-parallel-processing-work-queue/)
* [Fine Parallel Processing Using a Work Queue](/docs/tasks/job/fine-parallel-processing-work-queue/)
* Use an [indexed Job for parallel processing with static work assignment](/docs/tasks/job/indexed-parallel-processing-static/)
* Create multiple Jobs based on a template: [Parallel Processing using Expansions](/docs/tasks/job/parallel-processing-expansion/)
* [Coarse Parallel Processing Using a Work Queue](/docs/tasks/job/coarse-parallel-processing-work-queue/)
* [Fine Parallel Processing Using a Work Queue](/docs/tasks/job/fine-parallel-processing-work-queue/)
* Use an [indexed Job for parallel processing with static work assignment](/docs/tasks/job/indexed-parallel-processing-static/)
* Create multiple Jobs based on a template: [Parallel Processing using Expansions](/docs/tasks/job/parallel-processing-expansion/)
* Follow the links within [Clean up finished jobs automatically](#clean-up-finished-jobs-automatically)
to learn more about how your cluster can clean up completed and / or failed tasks.
* `Job` is part of the Kubernetes REST API.

View File

@ -160,7 +160,8 @@ regardless of which node it's (re)scheduled on.
For a StatefulSet with N [replicas](#replicas), each Pod in the StatefulSet
will be assigned an integer ordinal, that is unique over the Set. By default,
pods will be assigned ordinals from 0 up through N-1.
pods will be assigned ordinals from 0 up through N-1. The StatefulSet controller
will also add a pod label with this index: `apps.kubernetes.io/pod-index`.
### Start ordinal
@ -238,6 +239,16 @@ it adds a label, `statefulset.kubernetes.io/pod-name`, that is set to the name o
the Pod. This label allows you to attach a Service to a specific Pod in
the StatefulSet.
### Pod index label
{{< feature-state for_k8s_version="v1.28" state="beta" >}}
When the StatefulSet {{<glossary_tooltip text="controller" term_id="controller">}} creates a Pod,
the new Pod is labeled with `apps.kubernetes.io/pod-index`. The value of this label is the ordinal index of
the Pod. This label allows you to route traffic to a particular pod index, filter logs/metrics
using the pod index label, and more. Note that the feature gate `PodIndexLabel` must be enabled for this
feature, and it is enabled by default.
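
For example, you can select Pods by their ordinal using a label selector (combine it with the
workload's own labels to narrow the result to a single StatefulSet):

```shell
# List Pods whose ordinal index is 0
kubectl get pods -l apps.kubernetes.io/pod-index=0
```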
## Deployment and Scaling Guarantees
* For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.

View File

@ -5,9 +5,6 @@ title: Pods
content_type: concept
weight: 10
no_list: true
card:
name: concepts
weight: 60
---
<!-- overview -->
@ -110,7 +107,18 @@ that updates those files from a remote source, as in the following diagram:
{{< figure src="/images/docs/pod.svg" alt="Pod creation diagram" class="diagram-medium" >}}
Some Pods have {{< glossary_tooltip text="init containers" term_id="init-container" >}} as well as {{< glossary_tooltip text="app containers" term_id="app-container" >}}. Init containers run and complete before the app containers are started.
Some Pods have {{< glossary_tooltip text="init containers" term_id="init-container" >}}
as well as {{< glossary_tooltip text="app containers" term_id="app-container" >}}.
By default, init containers run and complete before the app containers are started.
{{< feature-state for_k8s_version="v1.28" state="alpha" >}}
Enabling the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
allows you to specify `restartPolicy: Always` for init containers.
Setting the `Always` restart policy ensures that the init containers where you set it are
kept running during the entire lifetime of the Pod.
See [Sidecar containers and restartPolicy](/docs/concepts/workloads/pods/init-containers/#sidecar-containers-and-restartpolicy)
for more details.
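
A minimal sketch of what this looks like, assuming the feature gate is enabled; the container
names, images, and log path are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-sidecar             # illustrative name
spec:
  volumes:
  - name: logs
    emptyDir: {}
  initContainers:
  - name: log-shipper                # stays running for the whole life of the Pod
    image: busybox:1.36
    restartPolicy: Always            # marks this init container as a sidecar
    command: ["sh", "-c", "touch /var/log/app.log && tail -F /var/log/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 1; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log
```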
Pods natively provide two kinds of shared resources for their constituent containers:
[networking](#pod-networking) and [storage](#pod-storage).
@ -364,4 +372,4 @@ To understand the context for why Kubernetes wraps a common Pod API in other res
* [Borg](https://research.google.com/pubs/pub43438.html)
* [Marathon](https://mesosphere.github.io/marathon/docs/rest-api.html)
* [Omega](https://research.google/pubs/pub41684/)
* [Tupperware](https://engineering.fb.com/data-center-engineering/tupperware/).
* [Tupperware](https://engineering.fb.com/data-center-engineering/tupperware/).

View File

@ -44,7 +44,7 @@ You can pass information from available Container-level fields using
### Information available via `fieldRef` {#downwardapi-fieldRef}
For most Pod-level fields, you can provide them to a container either as
For some Pod-level fields, you can provide them to a container either as
an environment variable or using a `downwardAPI` volume. The fields available
via either mechanism are:
@ -75,9 +75,16 @@ The following information is available through environment variables
`status.hostIP`
: the primary IP address of the node to which the Pod is assigned
`status.hostIPs`
: the IP addresses are a dual-stack version of `status.hostIP`; the first address is always the same as `status.hostIP`.
The field is available if you enable the `PodHostIPs` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
`status.podIP`
: the pod's primary IP address (usually, its IPv4 address)
`status.podIPs`
: the IP addresses are a dual-stack version of `status.podIP`; the first address is always the same as `status.podIP`
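
As a sketch, the two dual-stack fields can be exposed as environment variables like this; the Pod
name is illustrative, and the `PodHostIPs` feature gate is assumed to be enabled for `status.hostIPs`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: downward-ips-example         # illustrative name
spec:
  containers:
  - name: main
    image: busybox:1.36
    command: ["sh", "-c", "echo HOST_IPS=$HOST_IPS POD_IPS=$POD_IPS && sleep 3600"]
    env:
    - name: HOST_IPS                 # requires the PodHostIPs feature gate
      valueFrom:
        fieldRef:
          fieldPath: status.hostIPs
    - name: POD_IPS
      valueFrom:
        fieldRef:
          fieldPath: status.podIPs
```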
The following information is available through a `downwardAPI` volume
`fieldRef`, **but not as environment variables**:

View File

@ -289,7 +289,49 @@ The Pod which is already running correctly would be killed by `activeDeadlineSec
The name of each app and init container in a Pod must be unique; a
validation error is thrown for any container sharing a name with another.
### Resources
#### API for sidecar containers
{{< feature-state for_k8s_version="v1.28" state="alpha" >}}
Starting with Kubernetes 1.28 in alpha, a feature gate named `SidecarContainers`
allows you to specify a `restartPolicy` for init containers which is independent of
the Pod and other init containers. Container [probes](/docs/concepts/workloads/pods/pod-lifecycle/#types-of-probe)
can also be added to control their lifecycle.
If an init container is created with its `restartPolicy` set to `Always`, it will
start and remain running during the entire life of the Pod, which is useful for
running supporting services separated from the main application containers.
If a `readinessProbe` is specified for this init container, its result will be used
to determine the `ready` state of the Pod.
Since these containers are defined as init containers, they benefit from the same
ordering and sequential guarantees as other init containers, allowing them to
be mixed with other init containers into complex Pod initialization flows.
Compared to regular init containers, sidecar-style init containers continue to
run and the next init container can begin starting once the kubelet has set
the `started` container status for the sidecar-style init container to true.
That status becomes true either because there is a process running in the
container and no startup probe is defined, or
as a result of its `startupProbe` succeeding.
This feature can be used to implement the sidecar container pattern in a more
robust way, as the kubelet always restarts a sidecar container if it fails.
Here's an example of a Deployment with two containers, one of which is a sidecar:
{{% codenew language="yaml" file="application/deployment-sidecar.yaml" %}}
This feature is also useful for running Jobs with sidecars, as the sidecar
container will not prevent the Job from completing after the main container
has finished.
Here's an example of a Job with two containers, one of which is a sidecar:
{{% codenew language="yaml" file="application/job/job-sidecar.yaml" %}}
#### Resource sharing within containers
Given the ordering and execution for init containers, the following rules
for resource usage apply:

View File

@ -164,7 +164,7 @@ through which the Pod has or has not passed. Kubelet manages the following
PodConditions:
* `PodScheduled`: the Pod has been scheduled to a node.
* `PodHasNetwork`: (alpha feature; must be [enabled explicitly](#pod-has-network)) the
* `PodReadyToStartContainers`: (alpha feature; must be [enabled explicitly](#pod-has-network)) the
Pod sandbox has been successfully created and networking configured.
* `ContainersReady`: all containers in the Pod are ready.
* `Initialized`: all [init containers](/docs/concepts/workloads/pods/init-containers/)
@ -244,15 +244,19 @@ When a Pod's containers are Ready but at least one custom condition is missing o
{{< feature-state for_k8s_version="v1.25" state="alpha" >}}
{{< note >}}
This condition was renamed from PodHasNetwork to PodReadyToStartContainers.
{{< /note >}}
After a Pod gets scheduled on a node, it needs to be admitted by the Kubelet and
have any volumes mounted. Once these phases are complete, the Kubelet works with
a container runtime (using {{< glossary_tooltip term_id="cri" >}}) to set up a
runtime sandbox and configure networking for the Pod. If the `PodHasNetworkCondition`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled,
Kubelet reports whether a Pod has reached this initialization milestone through
the `PodHasNetwork` condition in the `status.conditions` field of a Pod.
runtime sandbox and configure networking for the Pod. If the
`PodReadyToStartContainersCondition` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled,
Kubelet reports whether a pod has reached this initialization milestone through
the `PodReadyToStartContainers` condition in the `status.conditions` field of a Pod.
The `PodHasNetwork` condition is set to `False` by the Kubelet when it detects a
The `PodReadyToStartContainers` condition is set to `False` by the Kubelet when it detects a
Pod does not have a runtime sandbox with networking configured. This occurs in
the following scenarios:
@ -264,10 +268,10 @@ the following scenarios:
sandbox virtual machine rebooting, which then requires creating a new sandbox and
fresh container network configuration.
The `PodHasNetwork` condition is set to `True` by the kubelet after the
The `PodReadyToStartContainers` condition is set to `True` by the kubelet after the
successful completion of sandbox creation and network configuration for the Pod
by the runtime plugin. The kubelet can start pulling container images and create
containers after `PodHasNetwork` condition has been set to `True`.
containers after `PodReadyToStartContainers` condition has been set to `True`.
For a Pod with init containers, the kubelet sets the `Initialized` condition to
`True` after the init containers have successfully completed (which happens
@ -431,13 +435,12 @@ before the Pod is allowed to be forcefully killed. With that forceful shutdown t
place, the {{< glossary_tooltip text="kubelet" term_id="kubelet" >}} attempts graceful
shutdown.
Typically, the container runtime sends a TERM signal to the main process in each
container. Many container runtimes respect the `STOPSIGNAL` value defined in the container
image and send this instead of TERM.
Once the grace period has expired, the KILL signal is sent to any remaining processes, and the Pod
is then deleted from the {{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}.
If the kubelet or the container runtime's management service is restarted while waiting for
processes to terminate, the cluster retries from the start including the full original grace period.
Typically, with this graceful termination of the pod, the kubelet makes requests to the container runtime to attempt to stop the containers in the pod by first sending a TERM (also known as SIGTERM) signal, with a grace period timeout, to the main process in each container. The requests to stop the containers are processed by the container runtime asynchronously. There is no guarantee of the order in which these requests are processed. Many container runtimes respect the `STOPSIGNAL` value defined in the container image and, if it differs, send the `STOPSIGNAL` configured in the container image instead of TERM.
Once the grace period has expired, the KILL signal is sent to any remaining
processes, and the Pod is then deleted from the
{{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}. If the kubelet or the
container runtime's management service is restarted while waiting for processes to terminate, the
cluster retries from the start including the full original grace period.
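As a minimal sketch (the name, image, and value are illustrative), the grace period that bounds this
TERM-then-KILL sequence is configured per Pod through `terminationGracePeriodSeconds`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown-demo       # placeholder name
spec:
  terminationGracePeriodSeconds: 60  # how long the kubelet waits after TERM before sending KILL
  containers:
  - name: app                        # placeholder container
    image: registry.example/app:latest
```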
An example flow:


@ -7,7 +7,7 @@ weight: 85
<!-- overview -->
This page introduces _Quality of Service (QoS) classes_ in Kubernetes, and explains
how Kubernetes assigns a QoS class to each Pods as a consequence of the resource
how Kubernetes assigns a QoS class to each Pod as a consequence of the resource
constraints that you specify for the containers in that Pod. Kubernetes relies on this
classification to make decisions about which Pods to evict when there are not enough
available resources on a Node.


@ -48,11 +48,11 @@ ext4, xfs, fat, tmpfs, overlayfs.
In addition, support is needed in the
{{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}
to use this feature with Kubernetes stateless pods:
to use this feature with Kubernetes pods:
* CRI-O: version 1.25 (and later) supports user namespaces for containers.
containerd v1.7 is not compatible with the userns support in Kubernetes v{{< skew currentVersion >}}.
containerd v1.7 is not compatible with the userns support in Kubernetes v1.27 to v{{< skew latestVersion >}}.
Kubernetes v1.25 and v1.26 used an earlier implementation that **is** compatible with containerd v1.7,
in terms of userns support.
If you are using a version of Kubernetes other than {{< skew currentVersion >}},
@ -75,7 +75,7 @@ A pod can opt-in to use user namespaces by setting the `pod.spec.hostUsers` fiel
to `false`.
The kubelet will pick host UIDs/GIDs a pod is mapped to, and will do so in a way
to guarantee that no two stateless pods on the same node use the same mapping.
to guarantee that no two pods on the same node use the same mapping.
The `runAsUser`, `runAsGroup`, `fsGroup`, etc. fields in the `pod.spec` always
refer to the user inside the container.
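A minimal sketch of a Pod that opts in to user namespaces (name and image are placeholders, and the
relevant feature gate plus container runtime support must be in place):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo                  # placeholder name
spec:
  hostUsers: false                   # opt in: run this Pod inside a user namespace
  containers:
  - name: app                        # placeholder container
    image: registry.example/app:latest
```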
@ -92,7 +92,7 @@ Most applications that need to run as root but don't access other host
namespaces or resources, should continue to run fine without any changes needed
if user namespaces is activated.
## Understanding user namespaces for stateless pods
## Understanding user namespaces for pods {#pods-and-userns}
Several container runtimes with their default configuration (like Docker Engine,
containerd, CRI-O) use Linux namespaces for isolation. Other technologies exist
@ -162,15 +162,6 @@ allowed to set any of:
* `hostIPC: true`
* `hostPID: true`
The pod is allowed to use no volumes at all or, if using volumes, only these
volume types are allowed:
* configmap
* secret
* projected
* downwardAPI
* emptyDir
## {{% heading "whatsnext" %}}
* Take a look at [Use a User Namespace With a Pod](/docs/tasks/configure-pod-container/user-namespaces/)


@ -158,6 +158,10 @@ Figure 2. Preparation for your first contribution.
- Learn about [page content types](/docs/contribute/style/page-content-types/)
and [Hugo shortcodes](/docs/contribute/style/hugo-shortcodes/).
## Getting help when contributing
Making your first contribution can be overwhelming. The [New Contributor Ambassadors](https://github.com/kubernetes/website#new-contributor-ambassadors) are there to walk you through making your first few contributions. You can reach out to them in the [Kubernetes Slack](https://slack.k8s.io/), preferably in the `#sig-docs` channel. There is also the [New Contributors Meet and Greet call](https://www.kubernetes.dev/resources/calendar/) that happens on the first Tuesday of every month. You can interact with the New Contributor Ambassadors and get your queries resolved there.
## Next steps
- Learn to [work from a local clone](/docs/contribute/new-content/open-a-pr/#fork-the-repo)


@ -91,7 +91,7 @@ To submit a blog post, follow these steps:
- Blog posts should be relevant to Kubernetes users.
- Topics related to participation in or results of Kubernetes SIGs activities are always on
topic (see the work in the [Contributor Comms Team](https://github.com/kubernetes/community/blob/master/communication/contributor-comms/storytelling-resources/blog-guidelines.md#upstream-marketing-blog-guidelines)
topic (see the work in the [Contributor Comms Team](https://github.com/kubernetes/community/blob/master/communication/contributor-comms/blogging-resources/blog-guidelines.md#contributor-comms-blog-guidelines)
for support on these posts).
- The components of Kubernetes are purposely modular, so tools that use existing integration
points like CNI and CSI are on topic.


@ -2,9 +2,6 @@
title: Page content types
content_type: concept
weight: 80
card:
name: contribute
weight: 30
---
<!-- overview -->


@ -651,6 +651,12 @@ You can remove ... | You can easily remove ...
These steps ... | These simple steps ...
{{< /table >}}
### EditorConfig file
The Kubernetes project maintains an EditorConfig file that sets common style preferences in text editors
such as VS Code. You can use this file if you want to ensure that your contributions are consistent with
the rest of the project. To view the file, refer to
[`.editorconfig`](https://github.com/kubernetes/website/blob/main/.editorconfig) in the repository root.
## {{% heading "whatsnext" %}}
* Learn about [writing a new topic](/docs/contribute/style/write-new-topic/).


@ -4,7 +4,10 @@ content_type: concept
weight: 10
card:
name: contribute
weight: 20
weight: 15
anchors:
- anchor: "#opening-an-issue"
title: Suggest content improvements
---
<!-- overview -->


@ -41,20 +41,20 @@ cards:
description: "Look up common tasks and how to perform them using a short sequence of steps."
button: "View Tasks"
button_path: "/docs/tasks"
- name: training
title: "Training"
description: "Get certified in Kubernetes and make your cloud native projects successful!"
button: "View training"
button_path: "/training"
- name: reference
title: Look up reference information
description: Browse terminology, command line syntax, API resource types, and setup tool documentation.
button: View Reference
button_path: /docs/reference
- name: training
title: "Training"
description: "Get certified in Kubernetes and make your cloud native projects successful!"
button: "View training"
button_path: "/training"
- name: contribute
title: Contribute to the docs
title: Contribute to Kubernetes
description: Anyone can contribute, whether you're new to the project or you've been around a long time.
button: Contribute to the docs
button: Find out how to help
button_path: /docs/contribute
- name: Download
title: Download Kubernetes


@ -87,7 +87,6 @@ operator to use or manage a cluster.
* [kubelet credential providers (v1alpha1)](/docs/reference/config-api/kubelet-credentialprovider.v1alpha1/),
[kubelet credential providers (v1beta1)](/docs/reference/config-api/kubelet-credentialprovider.v1beta1/) and
[kubelet credential providers (v1)](/docs/reference/config-api/kubelet-credentialprovider.v1/)
* [kube-scheduler configuration (v1beta2)](/docs/reference/config-api/kube-scheduler-config.v1beta2/),
[kube-scheduler configuration (v1beta3)](/docs/reference/config-api/kube-scheduler-config.v1beta3/) and
[kube-scheduler configuration (v1)](/docs/reference/config-api/kube-scheduler-config.v1/)
* [kube-controller-manager configuration (v1alpha1)](/docs/reference/config-api/kube-controller-manager-config.v1alpha1/)
@ -101,6 +100,7 @@ operator to use or manage a cluster.
## Config API for kubeadm
* [v1beta3](/docs/reference/config-api/kubeadm-config.v1beta3/)
* [v1beta4](/docs/reference/config-api/kubeadm-config.v1beta4/)
## Design Docs


@ -1220,7 +1220,7 @@ The following `ExecCredential` manifest describes a cluster information sample.
## API access to authentication information for a client {#self-subject-review}
{{< feature-state for_k8s_version="v1.27" state="beta" >}}
{{< feature-state for_k8s_version="v1.28" state="stable" >}}
If your cluster has the API enabled, you can use the `SelfSubjectReview` API to find out how your Kubernetes cluster maps your authentication
information to identify you as a client. This works whether you are authenticating as a user (typically representing
@ -1230,11 +1230,11 @@ a real person) or as a ServiceAccount.
Request example (the body would be a `SelfSubjectReview`):
```
POST /apis/authentication.k8s.io/v1beta1/selfsubjectreviews
POST /apis/authentication.k8s.io/v1/selfsubjectreviews
```
```json
{
"apiVersion": "authentication.k8s.io/v1beta1",
"apiVersion": "authentication.k8s.io/v1",
"kind": "SelfSubjectReview"
}
```
@ -1242,7 +1242,7 @@ Response example:
```json
{
"apiVersion": "authentication.k8s.io/v1beta1",
"apiVersion": "authentication.k8s.io/v1",
"kind": "SelfSubjectReview",
"status": {
"userInfo": {
@ -1285,7 +1285,7 @@ By providing the output flag, it is also possible to print the JSON or YAML repr
{{% tab name="JSON" %}}
```json
{
"apiVersion": "authentication.k8s.io/v1alpha1",
"apiVersion": "authentication.k8s.io/v1",
"kind": "SelfSubjectReview",
"status": {
"userInfo": {
@ -1314,7 +1314,7 @@ By providing the output flag, it is also possible to print the JSON or YAML repr
{{% tab name="YAML" %}}
```yaml
apiVersion: authentication.k8s.io/v1alpha1
apiVersion: authentication.k8s.io/v1
kind: SelfSubjectReview
status:
userInfo:
@ -1351,8 +1351,10 @@ By default, all authenticated users can create `SelfSubjectReview` objects when
You can only make `SelfSubjectReview` requests if:
* the `APISelfSubjectReview`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled for your cluster (enabled by default after reaching Beta)
* the API server for your cluster has the `authentication.k8s.io/v1alpha1` or `authentication.k8s.io/v1beta1`
is enabled for your cluster (not needed for Kubernetes {{< skew currentVersion >}}, but older
Kubernetes versions might not offer this feature gate, or might default it to be off)
* (if you are running a version of Kubernetes older than v1.28) the API server for your
cluster has the `authentication.k8s.io/v1alpha1` or `authentication.k8s.io/v1beta1`
{{< glossary_tooltip term_id="api-group" text="API group" >}}
enabled.
{{< /note >}}
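For reference, the same v1 request body expressed as a YAML manifest (a sketch; on a cluster where the
feature is available, submitting it with `kubectl create -f <file> -o yaml` should print your attributes
under `status.userInfo`):

```yaml
apiVersion: authentication.k8s.io/v1
kind: SelfSubjectReview
```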


@ -721,14 +721,9 @@ The `matchPolicy` for an admission webhooks defaults to `Equivalent`.
### Matching requests: `matchConditions`
{{< feature-state state="alpha" for_k8s_version="v1.27" >}}
{{< feature-state state="beta" for_k8s_version="v1.28" >}}
{{< note >}}
Use of `matchConditions` requires the [featuregate](/docs/reference/command-line-tools-reference/feature-gates/)
`AdmissionWebhookMatchConditions` to be explicitly enabled on the kube-apiserver before this feature can be used.
{{< /note >}}
You can define _match conditions_for webhooks if you need fine-grained request filtering. These
You can define _match conditions_ for webhooks if you need fine-grained request filtering. These
conditions are useful if you find that match rules, `objectSelectors` and `namespaceSelectors` still
doesn't provide the filtering you want over when to call out over HTTP. Match conditions are
[CEL expressions](/docs/reference/using-api/cel/). All match conditions must evaluate to true for the
@ -754,6 +749,7 @@ webhooks:
namespace: my-namespace
name: my-webhook
caBundle: '<omitted>'
# You can have up to 64 matchConditions per webhook
matchConditions:
- name: 'exclude-leases' # Each match condition must have a unique name
expression: '!(request.resource.group == "coordination.k8s.io" && request.resource.resource == "leases")' # Match non-lease resources.
@ -779,6 +775,7 @@ webhooks:
namespace: my-namespace
name: my-webhook
caBundle: '<omitted>'
# You can have up to 64 matchConditions per webhook
matchConditions:
- name: 'breakglass'
# Skip requests made by users authorized to 'breakglass' on this webhook.
@ -786,6 +783,10 @@ webhooks:
expression: '!authorizer.group("admissionregistration.k8s.io").resource("validatingwebhookconfigurations").name("my-webhook.example.com").check("breakglass").allowed()'
```
{{< note >}}
You can define up to 64 elements in the `matchConditions` field per webhook.
{{< /note >}}
Match conditions have access to the following CEL variables:
- `object` - The object from the incoming request. The value is null for DELETE requests. The object


@ -9,7 +9,7 @@ content_type: concept
<!-- overview -->
{{< feature-state state="alpha" for_k8s_version="v1.26" >}}
{{< feature-state state="beta" for_k8s_version="v1.28" >}}
This page provides an overview of Validating Admission Policy.
@ -45,12 +45,12 @@ At least a `ValidatingAdmissionPolicy` and a corresponding `ValidatingAdmission
must be defined for a policy to have an effect.
If a `ValidatingAdmissionPolicy` does not need to be configured via parameters, simply leave
`spec.paramKind` in `ValidatingAdmissionPolicy` unset.
`spec.paramKind` in `ValidatingAdmissionPolicy` not specified.
## {{% heading "prerequisites" %}}
- Ensure the `ValidatingAdmissionPolicy` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled.
- Ensure that the `admissionregistration.k8s.io/v1alpha1` API is enabled.
- Ensure that the `admissionregistration.k8s.io/v1beta1` API is enabled.
## Getting Started with Validating Admission Policy
@ -61,22 +61,7 @@ with great caution. The following describes how to quickly experiment with Valid
The following is an example of a ValidatingAdmissionPolicy.
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
name: "demo-policy.example.com"
spec:
failurePolicy: Fail
matchConstraints:
resourceRules:
- apiGroups: ["apps"]
apiVersions: ["v1"]
operations: ["CREATE", "UPDATE"]
resources: ["deployments"]
validations:
- expression: "object.spec.replicas <= 5"
```
{{% codenew language="yaml" file="validatingadmissionpolicy/basic-example-policy.yaml" %}}
`spec.validations` contains CEL expressions which use the [Common Expression Language (CEL)](https://github.com/google/cel-spec)
to validate the request. If an expression evaluates to false, the validation check is enforced
@ -85,19 +70,7 @@ according to the `spec.failurePolicy` field.
To configure a validating admission policy for use in a cluster, a binding is required.
The following is an example of a ValidatingAdmissionPolicyBinding.:
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
name: "demo-binding-test.example.com"
spec:
policyName: "demo-policy.example.com"
validationActions: [Deny]
matchResources:
namespaceSelector:
matchLabels:
environment: test
```
{{% codenew language="yaml" file="validatingadmissionpolicy/basic-example-binding.yaml" %}}
When trying to create a Deployment with a replica count that does not satisfy the validation expression, an
error containing the following message is returned:
@ -133,13 +106,13 @@ API response body and the HTTP warning headers.
A `validation` that evaluates to false is always enforced according to these
actions. Failures defined by the `failurePolicy` are enforced
according to these actions only if the `failurePolicy` is set to `Fail` (or unset),
according to these actions only if the `failurePolicy` is set to `Fail` (or not specified),
otherwise the failures are ignored.
See [Audit Annotations: validation failures](/docs/reference/labels-annotations-taints/audit-annotations/#validation-policy-admission-k8s-io-validation_failure)
for more details about the validation failure audit annotation.
#### Parameter resources
### Parameter resources
Parameter resources allow a policy configuration to be separate from its definition.
A policy can define paramKind, which outlines the GVK of the parameter resource,
@ -148,26 +121,7 @@ and then a policy binding ties a policy by name (via policyName) to a particular
If parameter configuration is needed, the following is an example of a ValidatingAdmissionPolicy
with parameter configuration.
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
name: "replicalimit-policy.example.com"
spec:
failurePolicy: Fail
paramKind:
apiVersion: rules.example.com/v1
kind: ReplicaLimit
matchConstraints:
resourceRules:
- apiGroups: ["apps"]
apiVersions: ["v1"]
operations: ["CREATE", "UPDATE"]
resources: ["deployments"]
validations:
- expression: "object.spec.replicas <= params.maxReplicas"
reason: Invalid
```
{{% codenew language="yaml" file="validatingadmissionpolicy/policy-with-param.yaml" %}}
The `spec.paramKind` field of the ValidatingAdmissionPolicy specifies the kind of resources used
to parameterize this policy. For this example, it is configured by ReplicaLimit custom resources.
@ -182,89 +136,51 @@ validation check is enforced according to the `spec.failurePolicy` field.
The validating admission policy author is responsible for providing the ReplicaLimit parameter CRD.
To configure a validating admission policy for use in a cluster, a binding and parameter resource
are created. The following is an example of a ValidatingAdmissionPolicyBinding.
are created. The following is an example of a ValidatingAdmissionPolicyBinding
that uses a **cluster-wide** param - the same param will be used to validate
every resource request that matches the binding:
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
name: "replicalimit-binding-test.example.com"
spec:
policyName: "replicalimit-policy.example.com"
validationActions: [Deny]
paramRef:
name: "replica-limit-test.example.com"
matchResources:
namespaceSelector:
matchLabels:
environment: test
```
{{% codenew language="yaml" file="validatingadmissionpolicy/binding-with-param.yaml" %}}
Notice this binding applies a parameter to the policy for all resources which
are in the `test` environment.
The parameter resource could be as following:
```yaml
apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
name: "replica-limit-test.example.com"
maxReplicas: 3
```
{{% codenew language="yaml" file="validatingadmissionpolicy/replicalimit-param.yaml" %}}
This policy parameter resource limits deployments to a max of 3 replicas in all namespaces in the
test environment. An admission policy may have multiple bindings. To bind all other environments
environment to have a maxReplicas limit of 100, create another ValidatingAdmissionPolicyBinding:
This policy parameter resource limits deployments to a max of 3 replicas.
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
name: "replicalimit-binding-nontest"
spec:
policyName: "replicalimit-policy.example.com"
validationActions: [Deny]
paramRef:
name: "replica-limit-clusterwide.example.com"
matchResources:
namespaceSelector:
matchExpressions:
- key: environment
operator: NotIn
values:
- test
```
An admission policy may have multiple bindings. To bind all other environments
to have a maxReplicas limit of 100, create another ValidatingAdmissionPolicyBinding:
And have a parameter resource like:
{{% codenew language="yaml" file="validatingadmissionpolicy/binding-with-param-prod.yaml" %}}
```yaml
apiVersion: rules.example.com/v1
kind: ReplicaLimit
metadata:
name: "replica-limit-clusterwide.example.com"
maxReplicas: 100
```
Notice this binding applies a different parameter to resources which
are not in the `test` environment.
Bindings can have overlapping match criteria. The policy is evaluated for each matching binding.
In the above example, the "nontest" policy binding could instead have been defined as a global policy:
And have a parameter resource:
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
name: "replicalimit-binding-global"
spec:
policyName: "replicalimit-policy.example.com"
validationActions: [Deny]
params: "replica-limit-clusterwide.example.com"
matchResources:
namespaceSelector:
matchExpressions:
- key: environment
operator: Exists
```
{{% codenew language="yaml" file="validatingadmissionpolicy/replicalimit-param-prod.yaml" %}}
For each admission request, the API server evaluates the CEL expressions of each
(policy, binding, param) combination that matches the request. For a request
to be admitted, it must pass **all** evaluations.
If multiple bindings match the request, the policy will be evaluated for each,
and they must all pass evaluation for the policy to be considered passed.
If multiple parameters match a single binding, the policy rules will be evaluated
for each param, and they too must all pass for the binding to be considered passed.
Bindings can have overlapping match criteria. The policy is evaluated for each
matching binding-parameter combination. A policy may even be evaluated multiple
times if multiple bindings match it, or if a single binding matches multiple
parameters.
The params object representing a parameter resource will not be set if a parameter resource has
not been bound, so for policies requiring a parameter resource, it can be useful to add a check to
ensure one has been bound.
ensure one has been bound. A parameter resource will not be bound and `params` will be null
if `paramKind` of the policy, or `paramRef` of the binding are not specified.
For use cases that require parameter configuration, we recommend adding a param check in
`spec.validations[0].expression`:
@ -274,6 +190,8 @@ For the use cases require parameter configuration, we recommend to add a param c
message: "params missing but required to bind to this policy"
```
#### Optional parameters
It can be convenient to be able to have optional parameters as part of a parameter resource, and
only validate them if present. CEL provides `has()`, which checks if the key passed to it exists.
CEL also implements Boolean short-circuiting. If the first half of a logical OR evaluates to true,
@ -291,7 +209,38 @@ Here, we first check that the optional parameter is present with `!has(params.op
evaluated, and optionalNumber will be checked to ensure that it contains a value between 5 and
10 inclusive.
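The combined expression described above looks roughly like this (a sketch, assuming a parameter field
named `optionalNumber`):

```yaml
validations:
- expression: "!has(params.optionalNumber) || (params.optionalNumber >= 5 && params.optionalNumber <= 10)"
```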
#### Authorization Check
#### Per-namespace Parameters
As the author of a ValidatingAdmissionPolicy and its ValidatingAdmissionPolicyBinding,
you can choose to specify cluster-wide, or per-namespace parameters.
If you specify a `namespace` for the binding's `paramRef`, the control plane only
searches for parameters in that namespace.
However, if `namespace` is not specified in the ValidatingAdmissionPolicyBinding, the
API server can search for relevant parameters in the namespace that a request is against.
For example, if you make a request to modify a ConfigMap in the `default` namespace and
there is a relevant ValidatingAdmissionPolicyBinding with no `namespace` set, then the
API server looks for a parameter object in `default`.
This design enables policy configuration that depends on the namespace
of the resource being manipulated, for more fine-tuned control.
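A sketch of a binding whose `paramRef` pins the parameter lookup to a single namespace (reusing the
replicalimit names from earlier on this page):

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "replicalimit-binding-test.example.com"
spec:
  policyName: "replicalimit-policy.example.com"
  validationActions: [Deny]
  paramRef:
    name: "replica-limit-test.example.com"
    namespace: "test"                # only this namespace is searched for the parameter
    parameterNotFoundAction: Deny    # reject requests if the named parameter is missing
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: test
```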
#### Parameter selector
In addition to specifying a parameter in a binding by `name`, you may
choose instead to specify a label selector, such that all resources of the
policy's `paramKind`, and the param's `namespace` (if applicable) that match the
label selector are selected for evaluation. See {{< glossary_tooltip text="selector" term_id="selector">}} for more information on how label selectors match resources.
If multiple parameters are found to meet the condition, the policy's rules are
evaluated for each parameter found and the results will be ANDed together.
If `namespace` is provided, only objects of the `paramKind` in the provided
namespace are eligible for selection. Otherwise, when `namespace` is empty and
`paramKind` is namespace-scoped, the `namespace` used in the request being
admitted will be used.
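A sketch of the `paramRef` portion of a binding that selects parameters by label rather than by name:

```yaml
paramRef:
  selector:
    matchLabels:
      environment: test              # every param matching this selector is evaluated; results are ANDed
  parameterNotFoundAction: Deny      # reject requests if no parameter matches
```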
#### Authorization checks {#authorization-check}
An authorization check is performed for parameter resources.
A user is expected to have `read` access to the resources referenced by `paramKind` in
@ -312,15 +261,7 @@ admission policy are handled. Allowed values are `Ignore` or `Fail`.
Note that the `failurePolicy` is defined inside `ValidatingAdmissionPolicy`:
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
spec:
...
failurePolicy: Ignore # The default is "Fail"
validations:
- expression: "object.spec.xyz == params.x"
```
{{% codenew language="yaml" file="validatingadmissionpolicy/failure-policy-ignore.yaml" %}}
### Validation Expression
@ -333,7 +274,9 @@ variables as well as some other useful variables:
- 'oldObject' - The existing object. The value is null for CREATE requests.
- 'request' - Attributes of the [admission request](/docs/reference/config-api/apiserver-admission.v1/#admission-k8s-io-v1-AdmissionRequest).
- 'params' - Parameter resource referred to by the policy binding being evaluated. The value is
null if `ParamKind` is unset.
null if `ParamKind` is not specified.
- `namespaceObject` - The namespace, as a Kubernetes resource, that the incoming object belongs to.
The value is null if the incoming object is cluster-scoped.
- `authorizer` - A CEL Authorizer. May be used to perform authorization checks for the principal
(authenticated user) of the request. See
[Authz](https://pkg.go.dev/k8s.io/apiserver/pkg/cel/library#Authz) in the Kubernetes CEL library
@ -466,8 +409,8 @@ event and all other values will be ignored.
To return a more friendly message when the policy rejects a request, we can use a CEL expression
to compose a message with `spec.validations[i].messageExpression`. Similar to the validation expression,
a message expression has access to `object`, `oldObject`, `request`, and `params`. Unlike validations,
message expression must evaluate to a string.
a message expression has access to `object`, `oldObject`, `request`, `params`, and `namespaceObject`.
Unlike validations, message expression must evaluate to a string.
For example, to better inform the user of the reason for denial when the policy refers to a parameter,
we can have the following validation:
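A sketch of such a validation, reusing the `maxReplicas` parameter from the earlier example on this page:

```yaml
validations:
- expression: "object.spec.replicas <= params.maxReplicas"
  messageExpression: "'object.spec.replicas must be no greater than ' + string(params.maxReplicas)"
  reason: Invalid
```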
@ -502,23 +445,7 @@ and an empty `status.typeChecking` means that no errors were detected.
For example, given the following policy definition:
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
name: "deploy-replica-policy.example.com"
spec:
matchConstraints:
resourceRules:
- apiGroups: ["apps"]
apiVersions: ["v1"]
operations: ["CREATE", "UPDATE"]
resources: ["deployments"]
validations:
- expression: "object.replicas > 1" # should be "object.spec.replicas > 1"
message: "must be replicated"
reason: Invalid
```
{{< codenew language="yaml" file="validatingadmissionpolicy/typechecking.yaml" >}}
The status will yield the following information:
@ -536,23 +463,8 @@ status:
If multiple resources are matched in `spec.matchConstraints`, all of matched resources will be checked against.
For example, the following policy definition
```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
name: "replica-policy.example.com"
spec:
matchConstraints:
resourceRules:
- apiGroups: ["apps"]
apiVersions: ["v1"]
operations: ["CREATE", "UPDATE"]
resources: ["deployments","replicasets"]
validations:
- expression: "object.replicas > 1" # should be "object.spec.replicas > 1"
message: "must be replicated"
reason: Invalid
```
{{% codenew language="yaml" file="validatingadmissionpolicy/typechecking-multiple-match.yaml" %}}
will have multiple types and type checking result of each type in the warning message.
@ -579,3 +491,39 @@ Type Checking has the following limitation:
- Type Checking does not affect the policy behavior in any way. Even if type checking detects errors, the policy will continue
to evaluate. If errors do occur during evaluation, the failure policy decides the outcome.
- Type Checking does not apply to CRDs, including matched CRD types and references of paramKind. Support for CRDs will come in a future release.
### Variable composition
If an expression grows too complicated, or part of the expression is reusable and computationally expensive to evaluate,
you can extract parts of the expression into variables. A variable is a named expression that can be referred to later
as `variables.<name>` in other expressions.
```yaml
spec:
variables:
- name: foo
expression: "'foo' in object.spec.metadata.labels ? object.spec.metadata.labels['foo'] : 'default'"
validations:
- expression: variables.foo == 'bar'
```
A variable is lazily evaluated when it is first referred to. Any error that occurs during the evaluation will be
reported during the evaluation of the referring expression. Both the result and any error are memoized and
count only once towards the runtime cost.
The order of variables is important because a variable can refer only to other variables that are defined before it.
This ordering prevents circular references.
The following is a more complex example of enforcing that image repo names match the environment defined in its namespace.
{{< codenew file="access/image-matches-namespace-environment.policy.yaml" >}}
With the policy bound to the namespace `default`, which is labeled `environment: prod`,
the following attempt to create a deployment would be rejected.
```shell
kubectl create deploy --image=dev.example.com/nginx invalid
```
The error message is similar to this.
```console
error: failed to create deployment: deployments.apps "invalid" is forbidden: ValidatingAdmissionPolicy 'image-matches-namespace-environment.policy.example.com' with binding 'demo-binding-test.example.com' denied request: only prod images are allowed in namespace default
```


@ -32,6 +32,9 @@ In the following table:
|---------|---------|-------|-------|-------|
| `Accelerators` | `false` | Alpha | 1.6 | 1.10 |
| `Accelerators` | - | Deprecated | 1.11 | 1.11 |
| `AdvancedAuditing` | `false` | Alpha | 1.7 | 1.7 |
| `AdvancedAuditing` | `true` | Beta | 1.8 | 1.11 |
| `AdvancedAuditing` | `true` | GA | 1.12 | 1.27 |
| `AffinityInAnnotations` | `false` | Alpha | 1.6 | 1.7 |
| `AffinityInAnnotations` | - | Deprecated | 1.8 | 1.8 |
| `AllowExtTrafficLocalEndpoints` | `false` | Beta | 1.4 | 1.6 |
@ -78,6 +81,10 @@ In the following table:
| `CSIMigrationAzureDiskComplete` | - | Deprecated | 1.21 | 1.21 |
| `CSIMigrationAzureFileComplete` | `false` | Alpha | 1.17 | 1.20 |
| `CSIMigrationAzureFileComplete` | - | Deprecated | 1.21 | 1.21 |
| `CSIMigrationGCE` | `false` | Alpha | 1.14 | 1.16 |
| `CSIMigrationGCE` | `false` | Beta | 1.17 | 1.22 |
| `CSIMigrationGCE` | `true` | Beta | 1.23 | 1.24 |
| `CSIMigrationGCE` | `true` | GA | 1.25 | 1.27 |
| `CSIMigrationGCEComplete` | `false` | Alpha | 1.17 | 1.20 |
| `CSIMigrationGCEComplete` | - | Deprecated | 1.21 | 1.21 |
| `CSIMigrationOpenStack` | `false` | Alpha | 1.14 | 1.17 |
@ -96,6 +103,9 @@ In the following table:
| `CSIServiceAccountToken` | `false` | Alpha | 1.20 | 1.20 |
| `CSIServiceAccountToken` | `true` | Beta | 1.21 | 1.21 |
| `CSIServiceAccountToken` | `true` | GA | 1.22 | 1.24 |
| `CSIStorageCapacity` | `false` | Alpha | 1.19 | 1.20 |
| `CSIStorageCapacity` | `true` | Beta | 1.21 | 1.23 |
| `CSIStorageCapacity` | `true` | GA | 1.24 | 1.27 |
| `CSIVolumeFSGroupPolicy` | `false` | Alpha | 1.19 | 1.19 |
| `CSIVolumeFSGroupPolicy` | `true` | Beta | 1.20 | 1.22 |
| `CSIVolumeFSGroupPolicy` | `true` | GA | 1.23 | 1.25 |
@ -134,6 +144,18 @@ In the following table:
| `DefaultPodTopologySpread` | `false` | Alpha | 1.19 | 1.19 |
| `DefaultPodTopologySpread` | `true` | Beta | 1.20 | 1.23 |
| `DefaultPodTopologySpread` | `true` | GA | 1.24 | 1.25 |
| `DelegateFSGroupToCSIDriver` | `false` | Alpha | 1.22 | 1.22 |
| `DelegateFSGroupToCSIDriver` | `true` | Beta | 1.23 | 1.25 |
| `DelegateFSGroupToCSIDriver` | `true` | GA | 1.26 | 1.27 |
| `DevicePlugins` | `false` | Alpha | 1.8 | 1.9 |
| `DevicePlugins` | `true` | Beta | 1.10 | 1.25 |
| `DevicePlugins` | `true` | GA | 1.26 | 1.27 |
| `DisableAcceleratorUsageMetrics` | `false` | Alpha | 1.19 | 1.19 |
| `DisableAcceleratorUsageMetrics` | `true` | Beta | 1.20 | 1.24 |
| `DisableAcceleratorUsageMetrics` | `true` | GA | 1.25 | 1.27 |
| `DryRun` | `false` | Alpha | 1.12 | 1.12 |
| `DryRun` | `true` | Beta | 1.13 | 1.18 |
| `DryRun` | `true` | GA | 1.19 | 1.27 |
| `DynamicAuditing` | `false` | Alpha | 1.13 | 1.18 |
| `DynamicAuditing` | - | Deprecated | 1.19 | 1.19 |
| `DynamicKubeletConfig` | `false` | Alpha | 1.4 | 1.10 |
@ -155,6 +177,9 @@ In the following table:
| `EndpointSliceProxying` | `false` | Alpha | 1.18 | 1.18 |
| `EndpointSliceProxying` | `true` | Beta | 1.19 | 1.21 |
| `EndpointSliceProxying` | `true` | GA | 1.22 | 1.24 |
| `EndpointSliceTerminatingCondition` | `false` | Alpha | 1.20 | 1.21 |
| `EndpointSliceTerminatingCondition` | `true` | Beta | 1.22 | 1.25 |
| `EndpointSliceTerminatingCondition` | `true` | GA | 1.26 | 1.27 |
| `EphemeralContainers` | `false` | Alpha | 1.16 | 1.22 |
| `EphemeralContainers` | `true` | Beta | 1.23 | 1.24 |
| `EphemeralContainers` | `true` | GA | 1.25 | 1.26 |
@ -203,8 +228,12 @@ In the following table:
| `IngressClassNamespacedParams` | `true` | GA | 1.23 | 1.24 |
| `Initializers` | `false` | Alpha | 1.7 | 1.13 |
| `Initializers` | - | Deprecated | 1.14 | 1.14 |
| `KMSv1` | `true` | Deprecated | 1.28 | |
| `KubeletConfigFile` | `false` | Alpha | 1.8 | 1.9 |
| `KubeletConfigFile` | - | Deprecated | 1.10 | 1.10 |
| `KubeletCredentialProviders` | `false` | Alpha | 1.20 | 1.23 |
| `KubeletCredentialProviders` | `true` | Beta | 1.24 | 1.25 |
| `KubeletCredentialProviders` | `true` | GA | 1.26 | 1.28 |
| `KubeletPluginsWatcher` | `false` | Alpha | 1.11 | 1.11 |
| `KubeletPluginsWatcher` | `true` | Beta | 1.12 | 1.12 |
| `KubeletPluginsWatcher` | `true` | GA | 1.13 | 1.16 |
@ -214,6 +243,9 @@ In the following table:
| `LocalStorageCapacityIsolation` | `false` | Alpha | 1.7 | 1.9 |
| `LocalStorageCapacityIsolation` | `true` | Beta | 1.10 | 1.24 |
| `LocalStorageCapacityIsolation` | `true` | GA | 1.25 | 1.26 |
| `MixedProtocolLBService` | `false` | Alpha | 1.20 | 1.23 |
| `MixedProtocolLBService` | `true` | Beta | 1.24 | 1.25 |
| `MixedProtocolLBService` | `true` | GA | 1.26 | 1.27 |
| `MountContainers` | `false` | Alpha | 1.9 | 1.16 |
| `MountContainers` | `false` | Deprecated | 1.17 | 1.17 |
| `MountPropagation` | `false` | Alpha | 1.8 | 1.9 |
@ -224,6 +256,7 @@ In the following table:
| `NetworkPolicyEndPort` | `false` | Alpha | 1.21 | 1.21 |
| `NetworkPolicyEndPort` | `true` | Beta | 1.22 | 1.24 |
| `NetworkPolicyEndPort` | `true` | GA | 1.25 | 1.26 |
| `NetworkPolicyStatus` | `false` | Alpha | 1.24 | 1.27 |
| `NodeDisruptionExclusion` | `false` | Alpha | 1.16 | 1.18 |
| `NodeDisruptionExclusion` | `true` | Beta | 1.19 | 1.20 |
| `NodeDisruptionExclusion` | `true` | GA | 1.21 | 1.22 |
@ -244,6 +277,7 @@ In the following table:
| `PodDisruptionBudget` | `false` | Alpha | 1.3 | 1.4 |
| `PodDisruptionBudget` | `true` | Beta | 1.5 | 1.20 |
| `PodDisruptionBudget` | `true` | GA | 1.21 | 1.25 |
| `PodHasNetworkCondition` | `false` | Alpha | 1.25 | 1.27 |
| `PodOverhead` | `false` | Alpha | 1.16 | 1.17 |
| `PodOverhead` | `true` | Beta | 1.18 | 1.23 |
| `PodOverhead` | `true` | GA | 1.24 | 1.25 |
@ -253,6 +287,9 @@ In the following table:
| `PodReadinessGates` | `false` | Alpha | 1.11 | 1.11 |
| `PodReadinessGates` | `true` | Beta | 1.12 | 1.13 |
| `PodReadinessGates` | `true` | GA | 1.14 | 1.16 |
| `PodSecurity` | `false` | Alpha | 1.22 | 1.22 |
| `PodSecurity` | `true` | Beta | 1.23 | 1.24 |
| `PodSecurity` | `true` | GA | 1.25 | 1.27 |
| `PodShareProcessNamespace` | `false` | Alpha | 1.10 | 1.11 |
| `PodShareProcessNamespace` | `true` | Beta | 1.12 | 1.16 |
| `PodShareProcessNamespace` | `true` | GA | 1.17 | 1.19 |
@ -291,6 +328,12 @@ In the following table:
| `ServiceAppProtocol` | `false` | Alpha | 1.18 | 1.18 |
| `ServiceAppProtocol` | `true` | Beta | 1.19 | 1.19 |
| `ServiceAppProtocol` | `true` | GA | 1.20 | 1.22 |
| `ServiceIPStaticSubrange` | `false` | Alpha | 1.24 | 1.24 |
| `ServiceIPStaticSubrange` | `true` | Beta | 1.25 | 1.25 |
| `ServiceIPStaticSubrange` | `true` | GA | 1.26 | 1.27 |
| `ServiceInternalTrafficPolicy` | `false` | Alpha | 1.21 | 1.21 |
| `ServiceInternalTrafficPolicy` | `true` | Beta | 1.22 | 1.25 |
| `ServiceInternalTrafficPolicy` | `true` | GA | 1.26 | 1.27 |
| `ServiceLBNodePortControl` | `false` | Alpha | 1.20 | 1.21 |
| `ServiceLBNodePortControl` | `true` | Beta | 1.22 | 1.23 |
| `ServiceLBNodePortControl` | `true` | GA | 1.24 | 1.25 |
@ -350,6 +393,7 @@ In the following table:
| `TokenRequestProjection` | `false` | Alpha | 1.11 | 1.11 |
| `TokenRequestProjection` | `true` | Beta | 1.12 | 1.19 |
| `TokenRequestProjection` | `true` | GA | 1.20 | 1.21 |
| `UserNamespacesStatelessPodsSupport` | `false` | Alpha | 1.25 | 1.27 |
| `ValidateProxyRedirects` | `false` | Alpha | 1.12 | 1.13 |
| `ValidateProxyRedirects` | `true` | Beta | 1.14 | 1.21 |
| `ValidateProxyRedirects` | `true` | Deprecated | 1.22 | 1.24 |
@ -374,6 +418,9 @@ In the following table:
| `WindowsGMSA` | `false` | Alpha | 1.14 | 1.15 |
| `WindowsGMSA` | `true` | Beta | 1.16 | 1.17 |
| `WindowsGMSA` | `true` | GA | 1.18 | 1.20 |
| `WindowsHostProcessContainers` | `false` | Alpha | 1.22 | 1.22 |
| `WindowsHostProcessContainers` | `true` | Beta | 1.23 | 1.25 |
| `WindowsHostProcessContainers` | `true` | GA | 1.26 | 1.27 |
| `WindowsRunAsUserName` | `false` | Alpha | 1.16 | 1.16 |
| `WindowsRunAsUserName` | `true` | Beta | 1.17 | 1.17 |
| `WindowsRunAsUserName` | `true` | GA | 1.18 | 1.20 |
@ -389,6 +436,8 @@ In the following table:
- `AffinityInAnnotations`: Enable setting
[Pod affinity or anti-affinity](/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity).
- `AdvancedAuditing`: Enable [advanced auditing](/docs/tasks/debug/debug-cluster/audit/#advanced-audit)
- `AllowExtTrafficLocalEndpoints`: Enable a service to route external requests to node local endpoints.
- `AllowInsecureBackendProxy`: Enable the users to skip TLS verification of
@ -475,6 +524,14 @@ In the following table:
`InTreePluginAzureFileUnregister` feature flag which prevents the registration
of in-tree AzureFile plugin.
- `CSIMigrationGCE`: Enables shims and translation logic to route volume
operations from the GCE-PD in-tree plugin to PD CSI plugin. Supports falling
back to in-tree GCE plugin for mount operations to nodes that have the
feature disabled or that do not have PD CSI plugin installed and configured.
Does not support falling back for provision operations; for those, the CSI
plugin must be installed and configured. Requires the CSIMigration feature flag
to be enabled.
- `CSIMigrationGCEComplete`: Stops registering the GCE-PD in-tree plugin in
kubelet and volume controllers and enables shims and translation logic to
route volume operations from the GCE-PD in-tree plugin to PD CSI plugin.
@ -517,6 +574,11 @@ In the following table:
that they mount volumes for. See
[Token Requests](https://kubernetes-csi.github.io/docs/token-requests.html).
- `CSIStorageCapacity`: Enables CSI drivers to publish storage capacity information
and the Kubernetes scheduler to use that information when scheduling pods. See
[Storage Capacity](/docs/concepts/storage/storage-capacity/).
Check the [`csi` volume type](/docs/concepts/storage/volumes/#csi) documentation for more details.
- `CSIVolumeFSGroupPolicy`: Allows CSIDrivers to use the `fsGroupPolicy` field.
This field controls whether volumes created by a CSIDriver support volume ownership
and permission modifications when these volumes are mounted.
@ -564,6 +626,19 @@ In the following table:
- `DefaultPodTopologySpread`: Enables the use of `PodTopologySpread` scheduling plugin to do
[default spreading](/docs/concepts/scheduling-eviction/topology-spread-constraints/#internal-default-constraints).
- `DelegateFSGroupToCSIDriver`: If supported by the CSI driver, delegates the
role of applying `fsGroup` from a Pod's `securityContext` to the driver by
passing `fsGroup` through the NodeStageVolume and NodePublishVolume CSI calls.
- `DevicePlugins`: Enable the [device-plugins](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)
based resource provisioning on nodes.
- `DisableAcceleratorUsageMetrics`:
[Disable accelerator metrics collected by the kubelet](/docs/concepts/cluster-administration/system-metrics/#disable-accelerator-metrics).
- `DryRun`: Enable server-side [dry run](/docs/reference/using-api/api-concepts/#dry-run) requests
so that validation, merging, and mutation can be tested without committing.
- `DynamicAuditing`: Used to enable dynamic auditing before v1.19.
- `DynamicKubeletConfig`: Enable the dynamic configuration of kubelet. The
@ -593,6 +668,9 @@ In the following table:
Endpoints, enabling scalability and performance improvements. See
[Enabling Endpoint Slices](/docs/concepts/services-networking/endpoint-slices/).
- `EndpointSliceTerminatingCondition`: Enables EndpointSlice `terminating` and `serving`
condition fields.
- `EphemeralContainers`: Enable the ability to add
{{< glossary_tooltip text="ephemeral containers" term_id="ephemeral-container" >}}
to running Pods.
@ -658,6 +736,9 @@ In the following table:
See [setting kubelet parameters via a config file](/docs/tasks/administer-cluster/kubelet-config-file/)
for more details.
- `KubeletCredentialProviders`: Enable kubelet exec credential providers for
image pull credentials.
- `KubeletPluginsWatcher`: Enable probe-based plugin watcher utility to enable kubelet
to discover plugins such as [CSI volume drivers](/docs/concepts/storage/volumes/#csi).
@ -670,6 +751,9 @@ In the following table:
and also the `sizeLimit` property of an
[emptyDir volume](/docs/concepts/storage/volumes/#emptydir).
- `MixedProtocolLBService`: Enable using different protocols in the same `LoadBalancer` type
Service instance.
- `MountContainers`: Enable using utility containers on host as the volume mounter.
- `MountPropagation`: Enable sharing volume mounted by one container to other containers or pods.
@ -679,6 +763,8 @@ In the following table:
{{< glossary_tooltip text="label" term_id="label" >}} `kubernetes.io/metadata.name`
on all namespaces, containing the namespace name.
- `NetworkPolicyStatus`: Enable the `status` subresource for NetworkPolicy objects.
- `NodeDisruptionExclusion`: Enable use of the Node label `node.kubernetes.io/exclude-disruption`
which prevents nodes from being evacuated during zone failures.
@ -699,6 +785,9 @@ In the following table:
- `PodDisruptionBudget`: Enable the [PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/) feature.
- `PodHasNetworkCondition`: Enable the kubelet to mark the [PodHasNetwork](/docs/concepts/workloads/pods/pod-lifecycle/#pod-has-network)
condition on pods. This was renamed to `PodReadyToStartContainersCondition` in 1.28.
- `PodOverhead`: Enable the [PodOverhead](/docs/concepts/scheduling-eviction/pod-overhead/)
feature to account for pod overheads.
@ -709,6 +798,8 @@ In the following table:
Pod readiness evaluation. See [Pod readiness gate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate)
for more details.
- `PodSecurity`: Enables the `PodSecurity` admission plugin.
- `PodShareProcessNamespace`: Enable the setting of `shareProcessNamespace` in a Pod for sharing
a single process namespace between containers running in a pod. More details can be found in
[Share Process Namespace between Containers in a Pod](/docs/tasks/configure-pod-container/share-process-namespace/).
@ -759,6 +850,15 @@ In the following table:
- `ServiceAppProtocol`: Enables the `appProtocol` field on Services and Endpoints.
- `ServiceIPStaticSubrange`: Enables a strategy for Services ClusterIP allocations, whereby the
ClusterIP range is subdivided. Dynamically allocated ClusterIP addresses are allocated preferentially
from the upper range, allowing users to assign static ClusterIPs from the lower range with a low
risk of collision. See
[Avoiding collisions](/docs/reference/networking/virtual-ips/#avoiding-collisions)
for more details.
- `ServiceInternalTrafficPolicy`: Enables the `internalTrafficPolicy` field on Services.
- `ServiceLoadBalancerClass`: Enables the `loadBalancerClass` field on Services. See
[Specifying class of load balancer implementation](/docs/concepts/services-networking/service/#load-balancer-class)
for more details.
@ -820,6 +920,8 @@ In the following table:
- `TokenRequestProjection`: Enable the injection of service account tokens into a Pod through a
[`projected` volume](/docs/concepts/storage/volumes/#projected).
- `UserNamespacesStatelessPodsSupport`: Enable user namespace support for stateless Pods. This flag was renamed in newer releases to `UserNamespacesSupport`.
- `ValidateProxyRedirects`: This flag controls whether the API server should validate that redirects
are only followed to the same host. Only used if the `StreamingProxyRedirects` flag is enabled.
@ -846,6 +948,8 @@ In the following table:
- `WindowsGMSA`: Enables passing of GMSA credential specs from pods to container runtimes.
- `WindowsHostProcessContainers`: Enables support for Windows HostProcess containers.
- `WindowsRunAsUserName`: Enable support for running applications in Windows containers as a
non-default user. See [Configuring RunAsUserName](/docs/tasks/configure-pod-container/configure-runasusername)
for more details.


@ -62,13 +62,12 @@ For a reference to old feature gates that are removed, please refer to
| `APIPriorityAndFairness` | `true` | Beta | 1.20 | |
| `APIResponseCompression` | `false` | Alpha | 1.7 | 1.15 |
| `APIResponseCompression` | `true` | Beta | 1.16 | |
| `APISelfSubjectReview` | `false` | Alpha | 1.26 | 1.26 |
| `APISelfSubjectReview` | `true` | Beta | 1.27 | |
| `APIServerIdentity` | `false` | Alpha | 1.20 | 1.25 |
| `APIServerIdentity` | `true` | Beta | 1.26 | |
| `APIServerTracing` | `false` | Alpha | 1.22 | 1.26 |
| `APIServerTracing` | `true` | Beta | 1.27 | |
| `AdmissionWebhookMatchConditions` | `false` | Alpha | 1.27 | |
| `AdmissionWebhookMatchConditions` | `false` | Alpha | 1.27 | 1.27 |
| `AdmissionWebhookMatchConditions` | `true` | Beta | 1.28 | |
| `AggregatedDiscoveryEndpoint` | `false` | Alpha | 1.26 | 1.26 |
| `AggregatedDiscoveryEndpoint` | `true` | Beta | 1.27 | |
| `AnyVolumeDataSource` | `false` | Alpha | 1.18 | 1.23 |
@ -78,9 +77,9 @@ For a reference to old feature gates that are removed, please refer to
| `CPUManagerPolicyBetaOptions` | `true` | Beta | 1.23 | |
| `CPUManagerPolicyOptions` | `false` | Alpha | 1.22 | 1.22 |
| `CPUManagerPolicyOptions` | `true` | Beta | 1.23 | |
| `CRDValidationRatcheting` | `false` | Alpha | 1.28 | |
| `CSIMigrationPortworx` | `false` | Alpha | 1.23 | 1.24 |
| `CSIMigrationPortworx` | `false` | Beta | 1.25 | |
| `CSIMigrationRBD` | `false` | Alpha | 1.23 | |
| `CSINodeExpandSecret` | `false` | Alpha | 1.25 | 1.26 |
| `CSINodeExpandSecret` | `true` | Beta | 1.27 | |
| `CSIVolumeHealth` | `false` | Alpha | 1.21 | |
@ -89,21 +88,21 @@ For a reference to old feature gates that are removed, please refer to
| `ClusterTrustBundle` | `false` | Alpha | 1.27 | |
| `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 |
| `ComponentSLIs` | `true` | Beta | 1.27 | |
| `ConsistentListFromCache` | `false` | Alpha | 1.28 | |
| `ContainerCheckpoint` | `false` | Alpha | 1.25 | |
| `ContextualLogging` | `false` | Alpha | 1.24 | |
| `CronJobsScheduledAnnotation` | `true` | Beta | 1.28 | |
| `CrossNamespaceVolumeDataSource` | `false` | Alpha| 1.26 | |
| `CustomCPUCFSQuotaPeriod` | `false` | Alpha | 1.12 | |
| `CustomResourceValidationExpressions` | `false` | Alpha | 1.23 | 1.24 |
| `CustomResourceValidationExpressions` | `true` | Beta | 1.25 | |
| `DevicePluginCDIDevices` | `false` | Alpha | 1.28 | |
| `DisableCloudProviders` | `false` | Alpha | 1.22 | |
| `DisableKubeletCloudCredentialProviders` | `false` | Alpha | 1.23 | |
| `DynamicResourceAllocation` | `false` | Alpha | 1.26 | |
| `ElasticIndexedJob` | `true` | Beta | 1.27 | |
| `EventedPLEG` | `false` | Alpha | 1.26 | 1.26 |
| `EventedPLEG` | `false` | Beta | 1.27 | - |
| `ExpandedDNSConfig` | `false` | Alpha | 1.22 | 1.25 |
| `ExpandedDNSConfig` | `true` | Beta | 1.26 | |
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | |
| `GracefulNodeShutdown` | `false` | Alpha | 1.20 | 1.20 |
| `GracefulNodeShutdown` | `true` | Beta | 1.21 | |
| `GracefulNodeShutdownBasedOnPodPriority` | `false` | Alpha | 1.23 | 1.23 |
@ -112,8 +111,6 @@ For a reference to old feature gates that are removed, please refer to
| `HPAContainerMetrics` | `true` | Beta | 1.27 | |
| `HPAScaleToZero` | `false` | Alpha | 1.16 | |
| `HonorPVReclaimPolicy` | `false` | Alpha | 1.23 | |
| `IPTablesOwnershipCleanup` | `false` | Alpha | 1.25 | 1.26 |
| `IPTablesOwnershipCleanup` | `true` | Beta | 1.27 | |
| `InPlacePodVerticalScaling` | `false` | Alpha | 1.27 | |
| `InTreePluginAWSUnregister` | `false` | Alpha | 1.21 | |
| `InTreePluginAzureDiskUnregister` | `false` | Alpha | 1.21 | |
@ -121,25 +118,24 @@ For a reference to old feature gates that are removed, please refer to
| `InTreePluginGCEUnregister` | `false` | Alpha | 1.21 | |
| `InTreePluginOpenStackUnregister` | `false` | Alpha | 1.21 | |
| `InTreePluginPortworxUnregister` | `false` | Alpha | 1.23 | |
| `InTreePluginRBDUnregister` | `false` | Alpha | 1.23 | |
| `InTreePluginvSphereUnregister` | `false` | Alpha | 1.21 | |
| `JobBackoffLimitPerIndex` | `false` | Alpha | 1.28 | |
| `JobPodFailurePolicy` | `false` | Alpha | 1.25 | 1.25 |
| `JobPodFailurePolicy` | `true` | Beta | 1.26 | |
| `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | |
| `JobReadyPods` | `false` | Alpha | 1.23 | 1.23 |
| `JobReadyPods` | `true` | Beta | 1.24 | |
| `KMSv2` | `false` | Alpha | 1.25 | 1.26 |
| `KMSv2` | `true` | Beta | 1.27 | |
| `KMSv2KDF` | `false` | Beta | 1.28 | |
| `KubeProxyDrainingTerminatingNodes` | `false` | Alpha | 1.28 | |
| `KubeletCgroupDriverFromCRI` | `false` | Alpha | 1.28 | |
| `KubeletInUserNamespace` | `false` | Alpha | 1.22 | |
| `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 |
| `KubeletPodResources` | `true` | Beta | 1.15 | |
| `KubeletPodResourcesDynamicResources` | `false` | Alpha | 1.27 | |
| `KubeletPodResourcesGet` | `false` | Alpha | 1.27 | |
| `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 |
| `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | |
| `KubeletTracing` | `false` | Alpha | 1.25 | 1.26 |
| `KubeletTracing` | `true` | Beta | 1.27 | |
| `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.26 | 1.26 |
| `LegacyServiceAccountTokenTracking` | `true` | Beta | 1.27 | |
| `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | |
| `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | - |
| `LogarithmicScaleDown` | `false` | Alpha | 1.21 | 1.21 |
| `LogarithmicScaleDown` | `true` | Beta | 1.22 | |
@ -154,52 +150,50 @@ For a reference to old feature gates that are removed, please refer to
| `MinDomainsInPodTopologySpread` | `false` | Alpha | 1.24 | 1.24 |
| `MinDomainsInPodTopologySpread` | `false` | Beta | 1.25 | 1.26 |
| `MinDomainsInPodTopologySpread` | `true` | Beta | 1.27 | |
| `MinimizeIPTablesRestore` | `false` | Alpha | 1.26 | 1.26 |
| `MinimizeIPTablesRestore` | `true` | Beta | 1.27 | |
| `MultiCIDRRangeAllocator` | `false` | Alpha | 1.25 | |
| `MultiCIDRServiceAllocator` | `false` | Alpha | 1.27 | |
| `NetworkPolicyStatus` | `false` | Alpha | 1.24 | |
| `NewVolumeManagerReconstruction` | `true` | Beta | 1.27 | |
| `NewVolumeManagerReconstruction` | `false` | Beta | 1.27 | 1.27 |
| `NewVolumeManagerReconstruction` | `true` | Beta | 1.28 | |
| `NodeInclusionPolicyInPodTopologySpread` | `false` | Alpha | 1.25 | 1.25 |
| `NodeInclusionPolicyInPodTopologySpread` | `true` | Beta | 1.26 | |
| `NodeLogQuery` | `false` | Alpha | 1.27 | |
| `NodeOutOfServiceVolumeDetach` | `false` | Alpha | 1.24 | 1.25 |
| `NodeOutOfServiceVolumeDetach` | `true` | Beta | 1.26 | |
| `NodeSwap` | `false` | Alpha | 1.22 | |
| `NodeSwap` | `false` | Alpha | 1.22 | 1.27 |
| `NodeSwap` | `false` | Beta | 1.28 | |
| `OpenAPIEnums` | `false` | Alpha | 1.23 | 1.23 |
| `OpenAPIEnums` | `true` | Beta | 1.24 | |
| `PDBUnhealthyPodEvictionPolicy` | `false` | Alpha | 1.26 | 1.26 |
| `PDBUnhealthyPodEvictionPolicy` | `true` | Beta | 1.27 | |
| `PersistentVolumeLastPhaseTransitionTime` | `false` | Alpha | 1.28 | |
| `PodAndContainerStatsFromCRI` | `false` | Alpha | 1.23 | |
| `PodDeletionCost` | `false` | Alpha | 1.21 | 1.21 |
| `PodDeletionCost` | `true` | Beta | 1.22 | |
| `PodDisruptionConditions` | `false` | Alpha | 1.25 | 1.25 |
| `PodDisruptionConditions` | `true` | Beta | 1.26 | |
| `PodHasNetworkCondition` | `false` | Alpha | 1.25 | |
| `PodHostIPs` | `false` | Alpha | 1.28 | |
| `PodIndexLabel` | `true` | Beta | 1.28 | |
| `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | |
| `PodSchedulingReadiness` | `false` | Alpha | 1.26 | 1.26 |
| `PodSchedulingReadiness` | `true` | Beta | 1.27 | |
| `ProbeTerminationGracePeriod` | `false` | Alpha | 1.21 | 1.21 |
| `ProbeTerminationGracePeriod` | `false` | Beta | 1.22 | 1.24 |
| `ProbeTerminationGracePeriod` | `true` | Beta | 1.25 | |
| `ProcMountType` | `false` | Alpha | 1.12 | |
| `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | 1.25 |
| `ProxyTerminatingEndpoints` | `true` | Beta | 1.26 | |
| `QOSReserved` | `false` | Alpha | 1.11 | |
| `ReadWriteOncePod` | `false` | Alpha | 1.22 | 1.26 |
| `ReadWriteOncePod` | `true` | Beta | 1.27 | |
| `RecoverVolumeExpansionFailure` | `false` | Alpha | 1.23 | |
| `RemainingItemCount` | `false` | Alpha | 1.15 | 1.15 |
| `RemainingItemCount` | `true` | Beta | 1.16 | |
| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 |
| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | |
| `RotateKubeletServerCertificate` | `false` | Alpha | 1.7 | 1.11 |
| `RotateKubeletServerCertificate` | `true` | Beta | 1.12 | |
| `SELinuxMountReadWriteOncePod` | `false` | Alpha | 1.25 | 1.26 |
| `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.27 | |
| `SELinuxMountReadWriteOncePod` | `false` | Beta | 1.27 | 1.27 |
| `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | |
| `SchedulerQueueingHints` | `false` | Alpha | 1.28 | |
| `SecurityContextDeny` | `false` | Alpha | 1.27 | |
| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | |
| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 |
| `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | |
| `SidecarContainers` | `false` | Alpha | 1.28 | |
| `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 |
| `SizeMemoryBackedVolumes` | `true` | Beta | 1.22 | |
| `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | |
| `StableLoadBalancerNodeSet` | `true` | Beta | 1.27 | |
| `StatefulSetAutoDeletePVC` | `false` | Alpha | 1.23 | 1.26 |
| `StatefulSetAutoDeletePVC` | `false` | Beta | 1.27 | |
@ -212,16 +206,20 @@ For a reference to old feature gates that are removed, please refer to
| `TopologyAwareHints` | `false` | Beta | 1.23 | 1.23 |
| `TopologyAwareHints` | `true` | Beta | 1.24 | |
| `TopologyManagerPolicyAlphaOptions` | `false` | Alpha | 1.26 | |
| `TopologyManagerPolicyBetaOptions` | `false` | Beta | 1.26 | |
| `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | |
| `UserNamespacesStatelessPodsSupport` | `false` | Alpha | 1.25 | |
| `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | |
| `VolumeCapacityPriority` | `false` | Alpha | 1.21 | - |
| `TopologyManagerPolicyBetaOptions` | `false` | Beta | 1.26 | 1.27 |
| `TopologyManagerPolicyBetaOptions` | `true` | Beta | 1.28 | |
| `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | 1.27 |
| `TopologyManagerPolicyOptions` | `true` | Beta | 1.28 | |
| `UnknownVersionInteroperabilityProxy` | `false` | Alpha | 1.28 | |
| `UserNamespacesSupport` | `false` | Alpha | 1.28 | |
| `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | 1.27 |
| `ValidatingAdmissionPolicy` | `false` | Beta | 1.28 | |
| `VolumeCapacityPriority` | `false` | Alpha | 1.21 | |
| `WatchList` | `false` | Alpha | 1.27 | |
| `WinDSR` | `false` | Alpha | 1.14 | |
| `WinOverlay` | `false` | Alpha | 1.14 | 1.19 |
| `WinOverlay` | `true` | Beta | 1.20 | |
| `WindowsHostNetwork` | `true` | Alpha | 1.26| |
| `WindowsHostNetwork` | `true` | Alpha | 1.26 | |
{{< /table >}}
### Feature gates for graduated or deprecated features
@ -230,9 +228,9 @@ For a reference to old feature gates that are removed, please refer to
| Feature | Default | Stage | Since | Until |
|---------|---------|-------|-------|-------|
| `AdvancedAuditing` | `false` | Alpha | 1.7 | 1.7 |
| `AdvancedAuditing` | `true` | Beta | 1.8 | 1.11 |
| `AdvancedAuditing` | `true` | GA | 1.12 | - |
| `APISelfSubjectReview` | `false` | Alpha | 1.26 | 1.26 |
| `APISelfSubjectReview` | `true` | Beta | 1.27 | 1.27 |
| `APISelfSubjectReview` | `true` | GA | 1.28 | - |
| `CPUManager` | `false` | Alpha | 1.8 | 1.9 |
| `CPUManager` | `true` | Beta | 1.10 | 1.25 |
| `CPUManager` | `true` | GA | 1.26 | - |
@ -240,70 +238,81 @@ For a reference to old feature gates that are removed, please refer to
| `CSIMigrationAzureFile` | `false` | Beta | 1.21 | 1.23 |
| `CSIMigrationAzureFile` | `true` | Beta | 1.24 | 1.25 |
| `CSIMigrationAzureFile` | `true` | GA | 1.26 | |
| `CSIMigrationGCE` | `false` | Alpha | 1.14 | 1.16 |
| `CSIMigrationGCE` | `false` | Beta | 1.17 | 1.22 |
| `CSIMigrationGCE` | `true` | Beta | 1.23 | 1.24 |
| `CSIMigrationGCE` | `true` | GA | 1.25 | - |
| `CSIMigrationRBD` | `false` | Alpha | 1.23 | 1.27 |
| `CSIMigrationRBD` | `false` | Deprecated | 1.28 | |
| `CSIMigrationvSphere` | `false` | Alpha | 1.18 | 1.18 |
| `CSIMigrationvSphere` | `false` | Beta | 1.19 | 1.24 |
| `CSIMigrationvSphere` | `true` | Beta | 1.25 | 1.25 |
| `CSIMigrationvSphere` | `true` | GA | 1.26 | - |
| `CSIStorageCapacity` | `false` | Alpha | 1.19 | 1.20 |
| `CSIStorageCapacity` | `true` | Beta | 1.21 | 1.23 |
| `CSIStorageCapacity` | `true` | GA | 1.24 | - |
| `ConsistentHTTPGetHandlers` | `true` | GA | 1.25 | - |
| `CronJobTimeZone` | `false` | Alpha | 1.24 | 1.24 |
| `CronJobTimeZone` | `true` | Beta | 1.25 | 1.26 |
| `CronJobTimeZone` | `true` | GA | 1.27 | - |
| `DelegateFSGroupToCSIDriver` | `false` | Alpha | 1.22 | 1.22 |
| `DelegateFSGroupToCSIDriver` | `true` | Beta | 1.23 | 1.25 |
| `DelegateFSGroupToCSIDriver` | `true` | GA | 1.26 | - |
| `DevicePlugins` | `false` | Alpha | 1.8 | 1.9 |
| `DevicePlugins` | `true` | Beta | 1.10 | 1.25 |
| `DevicePlugins` | `true` | GA | 1.26 | - |
| `DisableAcceleratorUsageMetrics` | `false` | Alpha | 1.19 | 1.19 |
| `DisableAcceleratorUsageMetrics` | `true` | Beta | 1.20 | 1.24 |
| `DisableAcceleratorUsageMetrics` | `true` | GA | 1.25 | - |
| `DaemonSetUpdateSurge` | `false` | Alpha | 1.21 | 1.21 |
| `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 |
| `DaemonSetUpdateSurge` | `true` | GA | 1.25 | |
| `DefaultHostNetworkHostPortsInPodTemplates` | `false` | Deprecated | 1.28 | |
| `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 |
| `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 |
| `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 |
| `DownwardAPIHugePages` | `true` | GA | 1.27 | - |
| `DryRun` | `false` | Alpha | 1.12 | 1.12 |
| `DryRun` | `true` | Beta | 1.13 | 1.18 |
| `DryRun` | `true` | GA | 1.19 | - |
| `DownwardAPIHugePages` | `true` | GA | 1.27 | |
| `EfficientWatchResumption` | `false` | Alpha | 1.20 | 1.20 |
| `EfficientWatchResumption` | `true` | Beta | 1.21 | 1.23 |
| `EfficientWatchResumption` | `true` | GA | 1.24 | - |
| `EndpointSliceTerminatingCondition` | `false` | Alpha | 1.20 | 1.21 |
| `EndpointSliceTerminatingCondition` | `true` | Beta | 1.22 | 1.25 |
| `EndpointSliceTerminatingCondition` | `true` | GA | 1.26 | |
| `ExecProbeTimeout` | `true` | GA | 1.20 | - |
| `EfficientWatchResumption` | `true` | GA | 1.24 | |
| `ExecProbeTimeout` | `true` | GA | 1.20 | |
| `ExpandedDNSConfig` | `false` | Alpha | 1.22 | 1.25 |
| `ExpandedDNSConfig` | `true` | Beta | 1.26 | 1.27 |
| `ExpandedDNSConfig` | `true` | GA | 1.28 | |
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | 1.27 |
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Deprecated | 1.28 | |
| `GRPCContainerProbe` | `false` | Alpha | 1.23 | 1.23 |
| `GRPCContainerProbe` | `true` | Beta | 1.24 | 1.26 |
| `GRPCContainerProbe` | `true` | GA | 1.27 | |
| `IPTablesOwnershipCleanup` | `false` | Alpha | 1.25 | 1.26 |
| `IPTablesOwnershipCleanup` | `true` | Beta | 1.27 | 1.27 |
| `IPTablesOwnershipCleanup` | `true` | GA | 1.28 | |
| `InTreePluginRBDUnregister` | `false` | Alpha | 1.23 | 1.27 |
| `InTreePluginRBDUnregister` | `false` | Deprecated | 1.28 | |
| `JobMutableNodeSchedulingDirectives` | `true` | Beta | 1.23 | 1.26 |
| `JobMutableNodeSchedulingDirectives` | `true` | GA | 1.27 | |
| `JobTrackingWithFinalizers` | `false` | Alpha | 1.22 | 1.22 |
| `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 |
| `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 |
| `JobTrackingWithFinalizers` | `true` | GA | 1.26 | - |
| `KubeletCredentialProviders` | `false` | Alpha | 1.20 | 1.23 |
| `KubeletCredentialProviders` | `true` | Beta | 1.24 | 1.25 |
| `KubeletCredentialProviders` | `true` | GA | 1.26 | - |
| `JobTrackingWithFinalizers` | `true` | GA | 1.26 | |
| `KMSv1` | `true` | Deprecated | 1.28 | |
| `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 |
| `KubeletPodResources` | `true` | Beta | 1.15 | 1.27 |
| `KubeletPodResources` | `true` | GA | 1.28 | |
| `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 |
| `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | 1.27 |
| `KubeletPodResourcesGetAllocatable` | `true` | GA | 1.28 | |
| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | Beta | 1.24 | 1.25 |
| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | - |
| `MixedProtocolLBService` | `false` | Alpha | 1.20 | 1.23 |
| `MixedProtocolLBService` | `true` | Beta | 1.24 | 1.25 |
| `MixedProtocolLBService` | `true` | GA | 1.26 | - |
| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | |
| `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.26 | 1.26 |
| `LegacyServiceAccountTokenTracking` | `true` | Beta | 1.27 | 1.27 |
| `LegacyServiceAccountTokenTracking` | `true` | GA | 1.28 | |
| `MinimizeIPTablesRestore` | `false` | Alpha | 1.26 | 1.26 |
| `MinimizeIPTablesRestore` | `true` | Beta | 1.27 | 1.27 |
| `MinimizeIPTablesRestore` | `true` | GA | 1.28 | |
| `NodeOutOfServiceVolumeDetach` | `false` | Alpha | 1.24 | 1.25 |
| `NodeOutOfServiceVolumeDetach` | `true` | Beta | 1.26 | 1.27 |
| `NodeOutOfServiceVolumeDetach` | `true` | GA | 1.28 | |
| `OpenAPIV3` | `false` | Alpha | 1.23 | 1.23 |
| `OpenAPIV3` | `true` | Beta | 1.24 | 1.26 |
| `OpenAPIV3` | `true` | GA | 1.27 | - |
| `PodSecurity` | `false` | Alpha | 1.22 | 1.22 |
| `PodSecurity` | `true` | Beta | 1.23 | 1.24 |
| `PodSecurity` | `true` | GA | 1.25 | |
| `OpenAPIV3` | `true` | GA | 1.27 | |
| `ProbeTerminationGracePeriod` | `false` | Alpha | 1.21 | 1.21 |
| `ProbeTerminationGracePeriod` | `false` | Beta | 1.22 | 1.24 |
| `ProbeTerminationGracePeriod` | `true` | Beta | 1.25 | 1.27 |
| `ProbeTerminationGracePeriod` | `true` | GA | 1.28 | |
| `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | 1.25 |
| `ProxyTerminatingEndpoints` | `true` | Beta | 1.26 | 1.27 |
| `ProxyTerminatingEndpoints` | `true` | GA | 1.28 | |
| `RemoveSelfLink` | `false` | Alpha | 1.16 | 1.19 |
| `RemoveSelfLink` | `true` | Beta | 1.20 | 1.23 |
| `RemoveSelfLink` | `true` | GA | 1.24 | - |
| `RemoveSelfLink` | `true` | GA | 1.24 | |
| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 |
| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | 1.27 |
| `RetroactiveDefaultStorageClass` | `true` | GA | 1.28 | |
| `SeccompDefault` | `false` | Alpha | 1.22 | 1.24 |
| `SeccompDefault` | `true` | Beta | 1.25 | 1.26 |
| `SeccompDefault` | `true` | GA | 1.27 | - |
@ -313,21 +322,12 @@ For a reference to old feature gates that are removed, please refer to
| `ServerSideFieldValidation` | `false` | Alpha | 1.23 | 1.24 |
| `ServerSideFieldValidation` | `true` | Beta | 1.25 | 1.26 |
| `ServerSideFieldValidation` | `true` | GA | 1.27 | - |
| `ServiceIPStaticSubrange` | `false` | Alpha | 1.24 | 1.24 |
| `ServiceIPStaticSubrange` | `true` | Beta | 1.25 | 1.25 |
| `ServiceIPStaticSubrange` | `true` | GA | 1.26 | - |
| `ServiceInternalTrafficPolicy` | `false` | Alpha | 1.21 | 1.21 |
| `ServiceInternalTrafficPolicy` | `true` | Beta | 1.22 | 1.25 |
| `ServiceInternalTrafficPolicy` | `true` | GA | 1.26 | - |
| `TopologyManager` | `false` | Alpha | 1.16 | 1.17 |
| `TopologyManager` | `true` | Beta | 1.18 | 1.26 |
| `TopologyManager` | `true` | GA | 1.27 | - |
| `WatchBookmark` | `false` | Alpha | 1.15 | 1.15 |
| `WatchBookmark` | `true` | Beta | 1.16 | 1.16 |
| `WatchBookmark` | `true` | GA | 1.17 | - |
| `WindowsHostProcessContainers` | `false` | Alpha | 1.22 | 1.22 |
| `WindowsHostProcessContainers` | `true` | Beta | 1.23 | 1.25 |
| `WindowsHostProcessContainers` | `true` | GA | 1.26 | - |
{{< /table >}}
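
Before moving on, here is a minimal sketch of how one of the gates listed above can be toggled through a component configuration file; the specific gate shown (`NodeSwap`, beta and off by default in this release per the table above) is only an illustration:

```yaml
# Minimal KubeletConfiguration sketch; the gate under featureGates is illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true   # beta in 1.28, disabled by default
```

The same gate can also be passed on a component command line, for example `--feature-gates=NodeSwap=true`.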
## Using a feature
@ -387,7 +387,6 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `APIServerIdentity`: Assign each API server an ID in a cluster, using a [Lease](/docs/concepts/architecture/leases).
- `APIServerTracing`: Add support for distributed tracing in the API server.
See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details.
- `AdvancedAuditing`: Enable [advanced auditing](/docs/tasks/debug/debug-cluster/audit/#advanced-audit)
- `AggregatedDiscoveryEndpoint`: Enable a single HTTP endpoint `/discovery/<version>` which
supports native HTTP caching with ETags containing all APIResources known to the API server.
- `AnyVolumeDataSource`: Enable use of any custom resource as the `DataSource` of a
@ -412,13 +411,6 @@ Each feature gate is designed for enabling/disabling a specific feature:
installed and configured. Does not support falling back for provision
operations, for those the CSI plugin must be installed and configured.
Requires CSIMigration feature flag enabled.
- `CSIMigrationGCE`: Enables shims and translation logic to route volume
operations from the GCE-PD in-tree plugin to PD CSI plugin. Supports falling
back to in-tree GCE plugin for mount operations to nodes that have the
feature disabled or that do not have PD CSI plugin installed and configured.
Does not support falling back for provision operations, for those the CSI
plugin must be installed and configured. Requires CSIMigration feature flag
enabled.
- `CSIMigrationRBD`: Enables shims and translation logic to route volume
operations from the RBD in-tree plugin to Ceph RBD CSI plugin. Requires
CSIMigration and csiMigrationRBD feature flags enabled and Ceph CSI plugin
@ -437,10 +429,6 @@ Each feature gate is designed for enabling/disabling a specific feature:
Requires Portworx CSI driver to be installed and configured in the cluster.
- `CSINodeExpandSecret`: Enable passing secret authentication data to a CSI driver for use
during a `NodeExpandVolume` CSI operation.
- `CSIStorageCapacity`: Enables CSI drivers to publish storage capacity information
and the Kubernetes scheduler to use that information when scheduling pods. See
[Storage Capacity](/docs/concepts/storage/storage-capacity/).
Check the [`csi` volume type](/docs/concepts/storage/volumes/#csi) documentation for more details.
- `CSIVolumeHealth`: Enable support for CSI volume health monitoring on node.
- `CloudControllerManagerWebhook`: Enable webhooks in cloud controller manager.
- `CloudDualStackNodeIPs`: Enables dual-stack `kubelet --node-ip` with external cloud providers.
@ -452,11 +440,18 @@ Each feature gate is designed for enabling/disabling a specific feature:
allowing you to scrape health check metrics.
- `ConsistentHTTPGetHandlers`: Normalize HTTP get URL and Header passing for lifecycle
handlers with probers.
- `ConsistentListFromCache`: Allow the API server to serve consistent lists from cache.
- `ContainerCheckpoint`: Enables the kubelet `checkpoint` API.
See [Kubelet Checkpoint API](/docs/reference/node/kubelet-checkpoint-api/) for more details.
- `ContextualLogging`: When you enable this feature gate, Kubernetes components that support
contextual logging add extra detail to log output.
- `CronJobsScheduledAnnotation`: Set the scheduled job time as an
{{< glossary_tooltip text="annotation" term_id="annotation" >}} on Jobs that were created
on behalf of a CronJob.
- `CronJobTimeZone`: Allow the use of the `timeZone` optional field in [CronJobs](/docs/concepts/workloads/controllers/cron-jobs/)
- `CRDValidationRatcheting`: Enable updates to custom resources to contain
violations of their OpenAPI schema if the offending portions of the resource
update did not change. See [Validation Ratcheting](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-ratcheting) for more details.
- `CrossNamespaceVolumeDataSource`: Enable the usage of cross namespace volume data source
to allow you to specify a source namespace in the `dataSourceRef` field of a
PersistentVolumeClaim.
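
As an illustration of the `CrossNamespaceVolumeDataSource` gate just described, a minimal sketch of a PersistentVolumeClaim whose `dataSourceRef` points at a VolumeSnapshot in another namespace might look like this (all names here are hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data            # hypothetical name
  namespace: app-team            # hypothetical namespace
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard     # hypothetical StorageClass
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: nightly-backup         # hypothetical snapshot
    namespace: backup-team       # the cross-namespace reference gated by this feature
```

The source namespace also has to permit the reference (for example through a ReferenceGrant), as described in the feature's documentation.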
@ -465,13 +460,16 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `CustomResourceValidationExpressions`: Enable expression language validation in CRDs,
  which validates custom resources based on validation rules written in
  the `x-kubernetes-validations` extension (see the short CRD sketch after this list section).
- `DelegateFSGroupToCSIDriver`: If supported by the CSI driver, delegates the
role of applying `fsGroup` from a Pod's `securityContext` to the driver by
passing `fsGroup` through the NodeStageVolume and NodePublishVolume CSI calls.
- `DevicePlugins`: Enable the [device-plugins](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)
based resource provisioning on nodes.
- `DisableAcceleratorUsageMetrics`:
[Disable accelerator metrics collected by the kubelet](/docs/concepts/cluster-administration/system-metrics/#disable-accelerator-metrics).
- `DaemonSetUpdateSurge`: Enables the DaemonSet workloads to maintain
availability during update per node.
See [Perform a Rolling Update on a DaemonSet](/docs/tasks/manage-daemon/update-daemon-set/).
- `DefaultHostNetworkHostPortsInPodTemplates`: Changes when the default value of
`PodSpec.containers[*].ports[*].hostPort`
is assigned. The default is to only set a default value in Pods.
Enabling this means a default will be assigned even to embedded
PodSpecs (e.g. in a Deployment), which is the historical default.
- `DevicePluginCDIDevices`: Enable support to CDI device IDs in the
[Device Plugin](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/) API.
- `DisableCloudProviders`: Disables any functionality in `kube-apiserver`,
`kube-controller-manager` and `kubelet` related to the `--cloud-provider`
component flag.
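
Relating to the `CustomResourceValidationExpressions` gate mentioned above, a minimal, hypothetical fragment of a CustomResourceDefinition schema carrying an `x-kubernetes-validations` rule might look like this:

```yaml
# Hypothetical fragment of a CRD's openAPIV3Schema
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        replicas:
          type: integer
      x-kubernetes-validations:
        - rule: "!has(self.replicas) || self.replicas <= 10"
          message: "replicas must not exceed 10"
```

The rule is written in CEL and is evaluated against the object (`self`) at the level of the schema where it is declared.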
@ -479,16 +477,12 @@ Each feature gate is designed for enabling/disabling a specific feature:
to authenticate to a cloud provider container registry for image pull credentials.
- `DownwardAPIHugePages`: Enables usage of hugepages in
[downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information).
- `DryRun`: Enable server-side [dry run](/docs/reference/using-api/api-concepts/#dry-run) requests
so that validation, merging, and mutation can be tested without committing.
- `DynamicResourceAllocation`: Enables support for resources with custom parameters and a lifecycle
that is independent of a Pod.
- `ElasticIndexedJob`: Enables Indexed Jobs to be scaled up or down by mutating both
`spec.completions` and `spec.parallelism` together such that `spec.completions == spec.parallelism`.
See docs on [elastic Indexed Jobs](/docs/concepts/workloads/controllers/job#elastic-indexed-jobs)
for more details.
- `EndpointSliceTerminatingCondition`: Enables EndpointSlice `terminating` and `serving`
condition fields.
- `EfficientWatchResumption`: Allows for storage-originated bookmark (progress
notify) events to be delivered to the users. This is only applied to watch operations.
- `EventedPLEG`: Enable support for the kubelet to receive container life cycle events from the
@ -549,8 +543,11 @@ Each feature gate is designed for enabling/disabling a specific feature:
and volume controllers.
- `JobMutableNodeSchedulingDirectives`: Allows updating node scheduling directives in
the pod template of [Job](/docs/concepts/workloads/controllers/job).
- `JobBackoffLimitPerIndex`: Allows specifying the maximum number of pod
  retries per index in Indexed Jobs.
- `JobPodFailurePolicy`: Allow users to specify handling of pod failures based on container
exit codes and pod conditions.
- `JobPodReplacementPolicy`: Allows you to specify pod replacement for terminating pods in a [Job](/docs/concepts/workloads/controllers/job)
- `JobReadyPods`: Enables tracking the number of Pods that have a `Ready`
[condition](/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions).
The count of `Ready` pods is recorded in the
@ -560,9 +557,24 @@ Each feature gate is designed for enabling/disabling a specific feature:
completions without relying on Pods remaining in the cluster indefinitely.
The Job controller uses Pod finalizers and a field in the Job status to keep
track of the finished Pods to count towards completion.
- `KMSv1`: Enables KMS v1 API for encryption at rest. See [Using a KMS Provider for data encryption](/docs/tasks/administer-cluster/kms-provider) for more details.
- `KMSv2`: Enables KMS v2 API for encryption at rest. See [Using a KMS Provider for data encryption](/docs/tasks/administer-cluster/kms-provider) for more details; a minimal configuration sketch appears after this list section.
- `KubeletCredentialProviders`: Enable kubelet exec credential providers for
image pull credentials.
- `KMSv2KDF`: Enables KMS v2 to generate single use data encryption keys.
See [Using a KMS Provider for data encryption](/docs/tasks/administer-cluster/kms-provider) for more details.
If the `KMSv2` feature gate is not enabled in your cluster, the value of the `KMSv2KDF` feature gate has no effect.
- `KubeProxyDrainingTerminatingNodes`: Implement connection draining for
terminating nodes for `externalTrafficPolicy: Cluster` services.
- `KubeletCgroupDriverFromCRI`: Enable detection of the kubelet cgroup driver
configuration option from the {{<glossary_tooltip term_id="cri" text="CRI">}}.
You can use this feature gate on nodes with a kubelet that supports the feature gate
and where there is a CRI container runtime that supports the `RuntimeConfig`
CRI call. If both CRI and kubelet support this feature, the kubelet ignores the
`cgroupDriver` configuration setting (or deprecated `--cgroup-driver` command
line argument). If you enable this feature gate and the container runtime
doesn't support it, the kubelet falls back to using the driver configured using
the `cgroupDriver` configuration setting.
See [Configuring a cgroup driver](/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver)
for more details.
- `KubeletInUserNamespace`: Enables support for running kubelet in a
{{<glossary_tooltip text="user namespace" term_id="userns">}}.
See [Running Kubernetes Node Components as a Non-root User](/docs/tasks/administer-cluster/kubelet-in-userns/).
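
Relating to the `KMSv2` gate listed earlier in this section, a minimal sketch of an `EncryptionConfiguration` that uses a KMS v2 provider could look like the following (the provider name and socket path are hypothetical):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          apiVersion: v2
          name: example-kms                   # hypothetical provider name
          endpoint: unix:///var/run/kms.sock  # hypothetical socket path
          timeout: 3s
      - identity: {}                          # allows reading data that is not yet encrypted
```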
@ -584,9 +596,12 @@ Each feature gate is designed for enabling/disabling a specific feature:
OpenTelemetry trace spans.
See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details.
- `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based
[service account tokens](/docs/reference/access-authn-authz/authentication/#service-account-tokens).
[service account tokens](/docs/concepts/security/service-accounts/#get-a-token).
- `LegacyServiceAccountTokenCleanUp`: Enable cleaning up Secret-based
[service account tokens](/docs/concepts/security/service-accounts/#get-a-token)
when they are not used in a specified time (default to be one year).
- `LegacyServiceAccountTokenTracking`: Track usage of Secret-based
[service account tokens](/docs/reference/access-authn-authz/authentication/#service-account-tokens).
[service account tokens](/docs/concepts/security/service-accounts/#get-a-token).
- `LocalStorageCapacityIsolationFSQuotaMonitoring`: When `LocalStorageCapacityIsolation`
is enabled for
[local ephemeral storage](/docs/concepts/configuration/manage-resources-containers/)
@ -612,11 +627,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
[Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/).
- `MinimizeIPTablesRestore`: Enables new performance-improvement logic
  in the kube-proxy iptables mode.
- `MixedProtocolLBService`: Enable using different protocols in the same `LoadBalancer` type
Service instance.
- `MultiCIDRRangeAllocator`: Enables the MultiCIDR range allocator.
- `MultiCIDRServiceAllocator`: Track IP address allocations for Service cluster IPs using IPAddress objects.
- `NetworkPolicyStatus`: Enable the `status` subresource for NetworkPolicy objects.
- `NewVolumeManagerReconstruction`: Enables improved discovery of mounted volumes during kubelet
startup. Since this code has been significantly refactored, we allow to opt-out in case kubelet
gets stuck at the startup or is not unmounting volumes from terminated Pods. Note that this
@ -645,14 +657,19 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `PDBUnhealthyPodEvictionPolicy`: Enables the `unhealthyPodEvictionPolicy` field of a `PodDisruptionBudget`. This specifies
when unhealthy pods should be considered for eviction. Please see [Unhealthy Pod Eviction Policy](/docs/tasks/run-application/configure-pdb/#unhealthy-pod-eviction-policy)
for more details.
- `PodDeletionCost`: Enable the [Pod Deletion Cost](/docs/concepts/workloads/controllers/replicaset/#pod-deletion-cost)
feature which allows users to influence ReplicaSet downscaling order.
- `PersistentVolumeLastPhaseTransitionTime`: Adds a new field to PersistentVolume
which holds a timestamp of when the volume last transitioned its phase.
- `PodAndContainerStatsFromCRI`: Configure the kubelet to gather container and pod stats from the CRI container runtime rather than gathering them from cAdvisor.
As of 1.26, this also includes gathering metrics from CRI and emitting them over `/metrics/cadvisor` (rather than having cAdvisor emit them directly).
- `PodDeletionCost`: Enable the [Pod Deletion Cost](/docs/concepts/workloads/controllers/replicaset/#pod-deletion-cost)
feature which allows users to influence ReplicaSet downscaling order.
- `PodDisruptionConditions`: Enables support for appending a dedicated pod condition indicating that the pod is being deleted due to a disruption.
- `PodHasNetworkCondition`: Enable the kubelet to mark the [PodHasNetwork](/docs/concepts/workloads/pods/pod-lifecycle/#pod-has-network) condition on pods.
- `PodHostIPs`: Enable the `status.hostIPs` field for pods and the {{< glossary_tooltip term_id="downward-api" text="downward API" >}}.
The field lets you expose host IP addresses to workloads.
- `PodIndexLabel`: Enables the Job controller and StatefulSet controller to add the pod index as a label when creating new pods. See [Job completion mode docs](/docs/concepts/workloads/controllers/job#completion-mode) and [StatefulSet pod index label docs](/docs/concepts/workloads/controllers/statefulset/#pod-index-label) for more details.
- `PodReadyToStartContainersCondition`: Enable the kubelet to mark the [PodReadyToStartContainers](/docs/concepts/workloads/pods/pod-lifecycle/#pod-has-network)
condition on pods. This was previously (1.25-1.27) known as `PodHasNetworkCondition`.
- `PodSchedulingReadiness`: Enable setting `schedulingGates` field to control a Pod's [scheduling readiness](/docs/concepts/scheduling-eviction/pod-scheduling-readiness).
- `PodSecurity`: Enables the `PodSecurity` admission plugin.
- `ProbeTerminationGracePeriod`: Enable [setting probe-level
`terminationGracePeriodSeconds`](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#probe-level-terminationgraceperiodseconds)
on pods. See the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2238-liveness-probe-grace-period)
@ -684,6 +701,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `SELinuxMountReadWriteOncePod`: Speeds up container startup by allowing kubelet to mount volumes
for a Pod directly with the correct SELinux label instead of changing each file on the volumes
recursively. The initial implementation focused on ReadWriteOncePod volumes.
- `SchedulerQueueingHints`: Enables the scheduler's _queueing hints_ enhancement,
  which helps the scheduler requeue Pods more precisely and reduces unnecessary requeueing.
- `SeccompDefault`: Enables the use of `RuntimeDefault` as the default seccomp profile
for all workloads.
The seccomp profile is specified in the `securityContext` of a Pod and/or a Container.
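
For the `SeccompDefault` gate just described, the per-Pod form of the same setting is a `securityContext` like the one in this minimal sketch (name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-runtime-default     # illustrative name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```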
@ -693,15 +712,15 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `ServerSideFieldValidation`: Enables server-side field validation. This means the validation
of resource schema is performed at the API server side rather than the client side
(for example, the `kubectl create` or `kubectl apply` command line).
- `ServiceInternalTrafficPolicy`: Enables the `internalTrafficPolicy` field on Services
- `ServiceIPStaticSubrange`: Enables a strategy for Services ClusterIP allocations, whereby the
  ClusterIP range is subdivided. Dynamically allocated ClusterIP addresses are allocated preferentially
  from the upper range, allowing users to assign static ClusterIPs from the lower range with a low
risk of collision. See
[Avoiding collisions](/docs/reference/networking/virtual-ips/#avoiding-collisions)
- `SidecarContainers`: Allow setting the `restartPolicy` of an init container to
`Always` so that the container becomes a sidecar container (restartable init containers).
See
[Sidecar containers and restartPolicy](/docs/concepts/workloads/pods/init-containers/#sidecar-containers-and-restartpolicy)
  for more details; a minimal Pod sketch appears at the end of this list.
- `SizeMemoryBackedVolumes`: Enable kubelets to determine the size limit for
memory-backed volumes (mainly `emptyDir` volumes).
- `SkipReadOnlyValidationGCE`: Skip validation for GCE. This gate is expected to be
  enabled by default in the next version.
- `StableLoadBalancerNodeSet`: Enables less load balancer re-configurations by
the service controller (KCCM) as an effect of changing node state.
- `StatefulSetStartOrdinal`: Allow configuration of the start ordinal in a
@ -728,7 +747,11 @@ Each feature gate is designed for enabling/disabling a specific feature:
This feature gate guards *a group* of topology manager options whose quality level is beta.
This feature gate will never graduate to stable.
- `TopologyManagerPolicyOptions`: Allow fine-tuning of topology manager policies.
- `UserNamespacesStatelessPodsSupport`: Enable user namespace support for stateless Pods.
- `UnknownVersionInteroperabilityProxy`: Proxy resource requests to the correct peer kube-apiserver when
multiple kube-apiservers exist at varied versions.
See [Mixed version proxy](/docs/concepts/architecture/mixed-version-proxy/) for more information.
- `UserNamespacesSupport`: Enable user namespace support for Pods.
Before Kubernetes v1.28, this feature gate was named `UserNamespacesStatelessPodsSupport`.
- `ValidatingAdmissionPolicy`: Enable [ValidatingAdmissionPolicy](/docs/reference/access-authn-authz/validating-admission-policy/) support so that CEL validations can be used in admission control (see the short policy sketch after this list section).
- `VolumeCapacityPriority`: Enable support for prioritizing nodes in different
topologies based on available PV capacity.
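
Relating to the `ValidatingAdmissionPolicy` gate above, a minimal policy sketch (the resource selection and CEL expression are purely illustrative) might look like this:

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1   # beta API as of v1.28
kind: ValidatingAdmissionPolicy
metadata:
  name: demo-replica-limit                         # illustrative name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas <= 5"
      message: "replica count must not exceed 5"
```

A ValidatingAdmissionPolicyBinding is still needed to put a policy like this into effect.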
@ -737,7 +760,6 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `WinDSR`: Allows kube-proxy to create DSR loadbalancers for Windows.
- `WinOverlay`: Allows kube-proxy to run in overlay mode for Windows.
- `WindowsHostNetwork`: Enables support for joining Windows containers to a host's network namespace.
- `WindowsHostProcessContainers`: Enables support for Windows HostProcess containers.
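
Finally, relating to the `SidecarContainers` gate described earlier in this list, a minimal Pod sketch with a restartable init container could look like this (names, image, and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo                 # illustrative name
spec:
  volumes:
    - name: logs
      emptyDir: {}
  initContainers:
    - name: log-shipper              # keeps running for the Pod's whole lifetime
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/app/app.log"]
      restartPolicy: Always          # this is what makes it a sidecar container
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
```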
## {{% heading "whatsnext" %}}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -153,7 +153,7 @@ requested. e.g. a patch can result in either a CREATE or UPDATE Operation.</p>
</td>
</tr>
<tr><td><code>userInfo</code> <B>[Required]</B><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#userinfo-v1-authentication-k8s-io"><code>authentication/v1.UserInfo</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#userinfo-v1-authentication-k8s-io"><code>authentication/v1.UserInfo</code></a>
</td>
<td>
<p>UserInfo is information about the requesting user</p>
@ -227,7 +227,7 @@ This must be copied over from the corresponding AdmissionRequest.</p>
</td>
</tr>
<tr><td><code>status</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#status-v1-meta"><code>meta/v1.Status</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#status-v1-meta"><code>meta/v1.Status</code></a>
</td>
<td>
<p>Result contains extra details into why an admission request was denied.

View File

@ -72,14 +72,14 @@ For non-resource requests, this is the lower-cased HTTP method.</p>
</td>
</tr>
<tr><td><code>user</code> <B>[Required]</B><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#userinfo-v1-authentication"><code>authentication/v1.UserInfo</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#userinfo-v1-authentication-k8s-io"><code>authentication/v1.UserInfo</code></a>
</td>
<td>
<p>Authenticated user information.</p>
</td>
</tr>
<tr><td><code>impersonatedUser</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#userinfo-v1-authentication"><code>authentication/v1.UserInfo</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#userinfo-v1-authentication-k8s-io"><code>authentication/v1.UserInfo</code></a>
</td>
<td>
<p>Impersonated user information.</p>
@ -117,7 +117,7 @@ Does not apply for List-type requests, or non-resource requests.</p>
</td>
</tr>
<tr><td><code>responseStatus</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#status-v1-meta"><code>meta/v1.Status</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#status-v1-meta"><code>meta/v1.Status</code></a>
</td>
<td>
<p>The response status, populated even when the ResponseObject is not a Status type.
@ -145,14 +145,14 @@ at Response Level.</p>
</td>
</tr>
<tr><td><code>requestReceivedTimestamp</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#microtime-v1-meta"><code>meta/v1.MicroTime</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#microtime-v1-meta"><code>meta/v1.MicroTime</code></a>
</td>
<td>
<p>Time the request reached the apiserver.</p>
</td>
</tr>
<tr><td><code>stageTimestamp</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#microtime-v1-meta"><code>meta/v1.MicroTime</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#microtime-v1-meta"><code>meta/v1.MicroTime</code></a>
</td>
<td>
<p>Time the request reached current audit stage.</p>
@ -189,7 +189,7 @@ should be short. Annotations are included in the Metadata level.</p>
<tr><td><code>metadata</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#listmeta-v1-meta"><code>meta/v1.ListMeta</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#listmeta-v1-meta"><code>meta/v1.ListMeta</code></a>
</td>
<td>
<span class="text-muted">No description provided.</span></td>
@ -224,7 +224,7 @@ categories are logged.</p>
<tr><td><code>metadata</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#objectmeta-v1-meta"><code>meta/v1.ObjectMeta</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#objectmeta-v1-meta"><code>meta/v1.ObjectMeta</code></a>
</td>
<td>
<p>ObjectMeta is included for interoperability with API infrastructure.</p>
@ -279,7 +279,7 @@ in a rule will override the global default.</p>
<tr><td><code>metadata</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#listmeta-v1-meta"><code>meta/v1.ListMeta</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#listmeta-v1-meta"><code>meta/v1.ListMeta</code></a>
</td>
<td>
<span class="text-muted">No description provided.</span></td>
@ -322,14 +322,12 @@ The empty string represents the core API group.</p>
</td>
<td>
<p>Resources is a list of resources this rule applies to.</p>
<p>For example:</p>
<ul>
<li><code>pods</code> matches pods.</li>
<li><code>pods/log</code> matches the log subresource of pods.</li>
<li><code>&ast;</code> matches all resources and their subresources.</li>
<li><code>pods/&ast;</code> matches all subresources of pods.</li>
<li><code>&ast;/scale</code> matches all scale subresources.</li>
</ul>
<p>For example:
'pods' matches pods.
'pods/log' matches the log subresource of pods.
'&ast;' matches all resources and their subresources.
'pods/&ast;' matches all subresources of pods.
'&ast;/scale' matches all scale subresources.</p>
<p>If wildcard is present, the validation rule will ensure resources do not
overlap with each other.</p>
<p>An empty list implies all resources and subresources in this API groups apply.</p>
@ -503,12 +501,10 @@ An empty list implies every namespace.</p>
</td>
<td>
<p>NonResourceURLs is a set of URL paths that should be audited.
<code>&ast;</code>s are allowed, but only as the full, final step in the path.
Examples:</p>
<ul>
<li>&quot;/metrics&quot; - Log requests for apiserver metrics</li>
<li>&quot;/healthz&ast;&quot; - Log all health checks</li>
</ul>
&ast;s are allowed, but only as the full, final step in the path.
Examples:
&quot;/metrics&quot; - Log requests for apiserver metrics
&quot;/healthz&ast;&quot; - Log all health checks</p>
</td>
</tr>
<tr><td><code>omitStages</code><br/>
@ -556,4 +552,4 @@ Policy.OmitManagedFields will stand.</li>

View File

@ -20,8 +20,8 @@ auto_generated: true
<p>EncryptionConfiguration stores the complete configuration for encryption providers.
It also allows the use of wildcards to specify the resources that should be encrypted.
Use <code>&ast;.&lt;group&gt;</code> to encrypt all resources within a group or <code>&ast;.&ast;</code> to encrypt all resources.
<code>&ast;.</code> can be used to encrypt all resource in the core group. <code>&ast;.&ast;</code> will encrypt all
Use '&ast;.&lt;group&gt;' to encrypt all resources within a group or '&ast;.&ast;' to encrypt all resources.
'&ast;.' can be used to encrypt all resources in the core group. '&ast;.&ast;' will encrypt all
resources, even custom resources that are added after API server start.
Use of wildcards that overlap within the same resource list or across multiple
entries are not allowed since part of the configuration would be ineffective.
@ -282,10 +282,10 @@ Set to a negative value to disable caching. This field is only allowed for KMS v
</td>
<td>
<p>resources is a list of kubernetes resources which have to be encrypted. The resource names are derived from <code>resource</code> or <code>resource.group</code> of the group/version/resource.
eg: <code>pandas.awesome.bears.example</code> is a custom resource with 'group': <code>awesome.bears.example</code>, 'resource': <code>pandas</code>.
Use <code>&ast;.&ast;</code> to encrypt all resources and <code>&ast;.&lt;group&gt;</code> to encrypt all resources in a specific group.
eg: <code>&ast;.awesome.bears.example</code> will encrypt all resources in the group <code>awesome.bears.example</code>.
eg: <code>&ast;.</code> will encrypt all resources in the core group (such as pods, configmaps, etc).</p>
eg: pandas.awesome.bears.example is a custom resource with 'group': awesome.bears.example, 'resource': pandas.
Use '&ast;.&ast;' to encrypt all resources and '&ast;.&lt;group&gt;' to encrypt all resources in a specific group.
eg: '&ast;.awesome.bears.example' will encrypt all resources in the group 'awesome.bears.example'.
eg: '&ast;.' will encrypt all resources in the core group (such as pods, configmaps, etc).</p>
</td>
</tr>
<tr><td><code>providers</code> <B>[Required]</B><br/>
@ -325,4 +325,4 @@ Each key has to be 32 bytes long.</p>
</tr>
</tbody>
</table>

View File

@ -206,7 +206,7 @@ itself should at least be protected via file permissions.</p>
<tr><td><code>expirationTimestamp</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#time-v1-meta"><code>meta/v1.Time</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#time-v1-meta"><code>meta/v1.Time</code></a>
</td>
<td>
<p>ExpirationTimestamp indicates a time when the provided credentials expire.</p>

View File

@ -206,7 +206,7 @@ itself should at least be protected via file permissions.</p>
<tr><td><code>expirationTimestamp</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#time-v1-meta"><code>meta/v1.Time</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#time-v1-meta"><code>meta/v1.Time</code></a>
</td>
<td>
<p>ExpirationTimestamp indicates a time when the provided credentials expire.</p>

View File

@ -29,7 +29,7 @@ auto_generated: true
<tr><td><code>metadata</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#objectmeta-v1-meta"><code>meta/v1.ObjectMeta</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#objectmeta-v1-meta"><code>meta/v1.ObjectMeta</code></a>
</td>
<td>
<p>Standard object's metadata.

View File

@ -1,7 +1,7 @@
---
title: kube-controller-manager Configuration (v1alpha1)
content_type: tool-reference
package: controllermanager.config.k8s.io/v1alpha1
package: kubecontrollermanager.config.k8s.io/v1alpha1
auto_generated: true
---
@ -9,190 +9,9 @@ auto_generated: true
## Resource Types
- [LeaderMigrationConfiguration](#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration)
- [KubeControllerManagerConfiguration](#kubecontrollermanager-config-k8s-io-v1alpha1-KubeControllerManagerConfiguration)
- [CloudControllerManagerConfiguration](#cloudcontrollermanager-config-k8s-io-v1alpha1-CloudControllerManagerConfiguration)
## `LeaderMigrationConfiguration` {#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration}
**Appears in:**
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>LeaderMigrationConfiguration provides versioned configuration for all migrating leader locks.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>apiVersion</code><br/>string</td><td><code>controllermanager.config.k8s.io/v1alpha1</code></td></tr>
<tr><td><code>kind</code><br/>string</td><td><code>LeaderMigrationConfiguration</code></td></tr>
<tr><td><code>leaderName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>LeaderName is the name of the leader election resource that protects the migration
E.g. 1-20-KCM-to-1-21-CCM</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>ResourceLock indicates the resource object type that will be used to lock
Should be &quot;leases&quot; or &quot;endpoints&quot;</p>
</td>
</tr>
<tr><td><code>controllerLeaders</code> <B>[Required]</B><br/>
<a href="#controllermanager-config-k8s-io-v1alpha1-ControllerLeaderConfiguration"><code>[]ControllerLeaderConfiguration</code></a>
</td>
<td>
<p>ControllerLeaders contains a list of migrating leader lock configurations</p>
</td>
</tr>
</tbody>
</table>
## `ControllerLeaderConfiguration` {#controllermanager-config-k8s-io-v1alpha1-ControllerLeaderConfiguration}
**Appears in:**
- [LeaderMigrationConfiguration](#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration)
<p>ControllerLeaderConfiguration provides the configuration for a migrating leader lock.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>name</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>Name is the name of the controller being migrated
E.g. service-controller, route-controller, cloud-node-controller, etc</p>
</td>
</tr>
<tr><td><code>component</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>Component is the name of the component in which the controller should be running.
E.g. kube-controller-manager, cloud-controller-manager, etc
Or '*' meaning the controller can be run under any component that participates in the migration</p>
</td>
</tr>
</tbody>
</table>
## `GenericControllerManagerConfiguration` {#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration}
**Appears in:**
- [CloudControllerManagerConfiguration](#cloudcontrollermanager-config-k8s-io-v1alpha1-CloudControllerManagerConfiguration)
- [KubeControllerManagerConfiguration](#kubecontrollermanager-config-k8s-io-v1alpha1-KubeControllerManagerConfiguration)
<p>GenericControllerManagerConfiguration holds configuration for a generic controller-manager.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>Port</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>port is the port that the controller-manager's http service runs on.</p>
</td>
</tr>
<tr><td><code>Address</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>address is the IP address to serve on (set to 0.0.0.0 for all interfaces).</p>
</td>
</tr>
<tr><td><code>MinResyncPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>minResyncPeriod is the resync period in reflectors; will be random between
minResyncPeriod and 2*minResyncPeriod.</p>
</td>
</tr>
<tr><td><code>ClientConnection</code> <B>[Required]</B><br/>
<a href="#ClientConnectionConfiguration"><code>ClientConnectionConfiguration</code></a>
</td>
<td>
<p>ClientConnection specifies the kubeconfig file and client connection
settings for the proxy server to use when communicating with the apiserver.</p>
</td>
</tr>
<tr><td><code>ControllerStartInterval</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>How long to wait between starting controller managers</p>
</td>
</tr>
<tr><td><code>LeaderElection</code> <B>[Required]</B><br/>
<a href="#LeaderElectionConfiguration"><code>LeaderElectionConfiguration</code></a>
</td>
<td>
<p>leaderElection defines the configuration of leader election client.</p>
</td>
</tr>
<tr><td><code>Controllers</code> <B>[Required]</B><br/>
<code>[]string</code>
</td>
<td>
<p>Controllers is the list of controllers to enable or disable
'*' means &quot;all enabled by default controllers&quot;
'foo' means &quot;enable 'foo'&quot;
'-foo' means &quot;disable 'foo'&quot;
first item for a particular name wins</p>
</td>
</tr>
<tr><td><code>Debugging</code> <B>[Required]</B><br/>
<a href="#DebuggingConfiguration"><code>DebuggingConfiguration</code></a>
</td>
<td>
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
</td>
</tr>
<tr><td><code>LeaderMigrationEnabled</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>LeaderMigrationEnabled indicates whether Leader Migration should be enabled for the controller manager.</p>
</td>
</tr>
<tr><td><code>LeaderMigration</code> <B>[Required]</B><br/>
<a href="#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration"><code>LeaderMigrationConfiguration</code></a>
</td>
<td>
<p>LeaderMigration holds the configuration for Leader Migration.</p>
</td>
</tr>
</tbody>
</table>
@ -335,13 +154,18 @@ GarbageCollectorController related features.</p>
<p>CronJobControllerConfiguration holds configuration for CronJobController related features.</p>
</td>
</tr>
<tr><td><code>LegacySATokenCleaner</code> <B>[Required]</B><br/>
<a href="#kubecontrollermanager-config-k8s-io-v1alpha1-LegacySATokenCleanerConfiguration"><code>LegacySATokenCleanerConfiguration</code></a>
</td>
<td>
<p>LegacySATokenCleanerConfiguration holds configuration for LegacySATokenCleaner related features.</p>
</td>
</tr>
<tr><td><code>NamespaceController</code> <B>[Required]</B><br/>
<a href="#kubecontrollermanager-config-k8s-io-v1alpha1-NamespaceControllerConfiguration"><code>NamespaceControllerConfiguration</code></a>
</td>
<td>
<p>NamespaceControllerConfiguration holds configuration for NamespaceController
related features.
NamespaceControllerConfiguration holds configuration for NamespaceController
related features.</p>
</td>
</tr>
@ -424,6 +248,14 @@ related features.</p>
TTLAfterFinishedController related features.</p>
</td>
</tr>
<tr><td><code>ValidatingAdmissionPolicyStatusController</code> <B>[Required]</B><br/>
<a href="#kubecontrollermanager-config-k8s-io-v1alpha1-ValidatingAdmissionPolicyStatusControllerConfiguration"><code>ValidatingAdmissionPolicyStatusControllerConfiguration</code></a>
</td>
<td>
<p>ValidatingAdmissionPolicyStatusControllerConfiguration holds configuration for
ValidatingAdmissionPolicyStatusController related features.</p>
</td>
</tr>
</tbody>
</table>
@ -1014,6 +846,33 @@ but more CPU (and network) load.</p>
</tbody>
</table>
## `LegacySATokenCleanerConfiguration` {#kubecontrollermanager-config-k8s-io-v1alpha1-LegacySATokenCleanerConfiguration}
**Appears in:**
- [KubeControllerManagerConfiguration](#kubecontrollermanager-config-k8s-io-v1alpha1-KubeControllerManagerConfiguration)
<p>LegacySATokenCleanerConfiguration contains elements describing LegacySATokenCleaner</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>CleanUpPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>CleanUpPeriod is the period of time since the last usage of an
auto-generated service account token before it can be deleted.</p>
</td>
</tr>
</tbody>
</table>
## `NamespaceControllerConfiguration` {#kubecontrollermanager-config-k8s-io-v1alpha1-NamespaceControllerConfiguration}
@ -1212,7 +1071,7 @@ and persistent volume claims.</p>
<code>[]string</code>
</td>
<td>
<p>VolumeHostCIDRDenylist is a list of CIDRs that should not be reachable by the
<p>DEPRECATED: VolumeHostCIDRDenylist is a list of CIDRs that should not be reachable by the
controller from plugins.</p>
</td>
</tr>
@ -1220,7 +1079,7 @@ controller from plugins.</p>
<code>bool</code>
</td>
<td>
<p>VolumeHostAllowLocalLoopback indicates if local loopback hosts (127.0.0.1, etc)
<p>DEPRECATED: VolumeHostAllowLocalLoopback indicates if local loopback hosts (127.0.0.1, etc)
should be allowed from plugins.</p>
</td>
</tr>
@ -1523,6 +1382,35 @@ allowed to sync concurrently.</p>
</tbody>
</table>
## `ValidatingAdmissionPolicyStatusControllerConfiguration` {#kubecontrollermanager-config-k8s-io-v1alpha1-ValidatingAdmissionPolicyStatusControllerConfiguration}
**Appears in:**
- [KubeControllerManagerConfiguration](#kubecontrollermanager-config-k8s-io-v1alpha1-KubeControllerManagerConfiguration)
<p>ValidatingAdmissionPolicyStatusControllerConfiguration contains elements describing ValidatingAdmissionPolicyStatusController.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>ConcurrentPolicySyncs</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>ConcurrentPolicySyncs is the number of policy objects that are
allowed to sync concurrently. Larger number = quicker type checking,
but more CPU (and network) load.
The default value is 5.</p>
</td>
</tr>
</tbody>
</table>
## `VolumeConfiguration` {#kubecontrollermanager-config-k8s-io-v1alpha1-VolumeConfiguration}
@ -1879,4 +1767,185 @@ first item for a particular name wins</p>
</tr>
</tbody>
</table>
## `LeaderMigrationConfiguration` {#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration}
**Appears in:**
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>LeaderMigrationConfiguration provides versioned configuration for all migrating leader locks.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>apiVersion</code><br/>string</td><td><code>controllermanager.config.k8s.io/v1alpha1</code></td></tr>
<tr><td><code>kind</code><br/>string</td><td><code>LeaderMigrationConfiguration</code></td></tr>
<tr><td><code>leaderName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>LeaderName is the name of the leader election resource that protects the migration
E.g. 1-20-KCM-to-1-21-CCM</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>ResourceLock indicates the resource object type that will be used to lock
Should be &quot;leases&quot; or &quot;endpoints&quot;</p>
</td>
</tr>
<tr><td><code>controllerLeaders</code> <B>[Required]</B><br/>
<a href="#controllermanager-config-k8s-io-v1alpha1-ControllerLeaderConfiguration"><code>[]ControllerLeaderConfiguration</code></a>
</td>
<td>
<p>ControllerLeaders contains a list of migrating leader lock configurations</p>
</td>
</tr>
</tbody>
</table>
## `ControllerLeaderConfiguration` {#controllermanager-config-k8s-io-v1alpha1-ControllerLeaderConfiguration}
**Appears in:**
- [LeaderMigrationConfiguration](#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration)
<p>ControllerLeaderConfiguration provides the configuration for a migrating leader lock.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>name</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>Name is the name of the controller being migrated
E.g. service-controller, route-controller, cloud-node-controller, etc</p>
</td>
</tr>
<tr><td><code>component</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>Component is the name of the component in which the controller should be running.
E.g. kube-controller-manager, cloud-controller-manager, etc
Or '*' meaning the controller can be run under any component that participates in the migration</p>
</td>
</tr>
</tbody>
</table>
## `GenericControllerManagerConfiguration` {#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration}
**Appears in:**
- [CloudControllerManagerConfiguration](#cloudcontrollermanager-config-k8s-io-v1alpha1-CloudControllerManagerConfiguration)
- [KubeControllerManagerConfiguration](#kubecontrollermanager-config-k8s-io-v1alpha1-KubeControllerManagerConfiguration)
<p>GenericControllerManagerConfiguration holds configuration for a generic controller-manager.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>Port</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>port is the port that the controller-manager's http service runs on.</p>
</td>
</tr>
<tr><td><code>Address</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>address is the IP address to serve on (set to 0.0.0.0 for all interfaces).</p>
</td>
</tr>
<tr><td><code>MinResyncPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>minResyncPeriod is the resync period in reflectors; will be random between
minResyncPeriod and 2*minResyncPeriod.</p>
</td>
</tr>
<tr><td><code>ClientConnection</code> <B>[Required]</B><br/>
<a href="#ClientConnectionConfiguration"><code>ClientConnectionConfiguration</code></a>
</td>
<td>
<p>ClientConnection specifies the kubeconfig file and client connection
settings for the proxy server to use when communicating with the apiserver.</p>
</td>
</tr>
<tr><td><code>ControllerStartInterval</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>How long to wait between starting controller managers</p>
</td>
</tr>
<tr><td><code>LeaderElection</code> <B>[Required]</B><br/>
<a href="#LeaderElectionConfiguration"><code>LeaderElectionConfiguration</code></a>
</td>
<td>
<p>leaderElection defines the configuration of leader election client.</p>
</td>
</tr>
<tr><td><code>Controllers</code> <B>[Required]</B><br/>
<code>[]string</code>
</td>
<td>
<p>Controllers is the list of controllers to enable or disable
'*' means &quot;all enabled by default controllers&quot;
'foo' means &quot;enable 'foo'&quot;
'-foo' means &quot;disable 'foo'&quot;
first item for a particular name wins</p>
</td>
</tr>
<tr><td><code>Debugging</code> <B>[Required]</B><br/>
<a href="#DebuggingConfiguration"><code>DebuggingConfiguration</code></a>
</td>
<td>
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
</td>
</tr>
<tr><td><code>LeaderMigrationEnabled</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>LeaderMigrationEnabled indicates whether Leader Migration should be enabled for the controller manager.</p>
</td>
</tr>
<tr><td><code>LeaderMigration</code> <B>[Required]</B><br/>
<a href="#controllermanager-config-k8s-io-v1alpha1-LeaderMigrationConfiguration"><code>LeaderMigrationConfiguration</code></a>
</td>
<td>
<p>LeaderMigration holds the configuration for Leader Migration.</p>
</td>
</tr>
</tbody>
</table>

View File

@ -13,6 +13,196 @@ auto_generated: true
## `ClientConnectionConfiguration` {#ClientConnectionConfiguration}
**Appears in:**
- [KubeProxyConfiguration](#kubeproxy-config-k8s-io-v1alpha1-KubeProxyConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>ClientConnectionConfiguration contains details for constructing a client.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>kubeconfig</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>kubeconfig is the path to a KubeConfig file.</p>
</td>
</tr>
<tr><td><code>acceptContentTypes</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>acceptContentTypes defines the Accept header sent by clients when connecting to a server, overriding the
default value of 'application/json'. This field will control all connections to the server used by a particular
client.</p>
</td>
</tr>
<tr><td><code>contentType</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>contentType is the content type used when sending data to the server from this client.</p>
</td>
</tr>
<tr><td><code>qps</code> <B>[Required]</B><br/>
<code>float32</code>
</td>
<td>
<p>qps controls the number of queries per second allowed for this connection.</p>
</td>
</tr>
<tr><td><code>burst</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>burst allows extra queries to accumulate when a client is exceeding its rate.</p>
</td>
</tr>
</tbody>
</table>
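For orientation, here is a minimal sketch of how a clientConnection block typically looks in one of the owning configuration files, using KubeProxyConfiguration (listed above) as the host type; the concrete path and rate-limit values are illustrative assumptions, not recommendations:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
clientConnection:
  # kubeconfig the component uses to reach the apiserver
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
  # empty string keeps the client's default Accept header
  acceptContentTypes: ""
  contentType: application/vnd.kubernetes.protobuf
  qps: 5
  burst: 10
```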
## `DebuggingConfiguration` {#DebuggingConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>enableProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableProfiling enables profiling via web interface host:port/debug/pprof/</p>
</td>
</tr>
<tr><td><code>enableContentionProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableContentionProfiling enables block profiling, if
enableProfiling is true.</p>
</td>
</tr>
</tbody>
</table>
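In KubeSchedulerConfiguration these two fields are embedded at the top level of the file; a minimal sketch with illustrative values:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
# serve profiles under host:port/debug/pprof/
enableProfiling: true
# block (contention) profiling; only effective while enableProfiling is true
enableContentionProfiling: true
```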
## `LeaderElectionConfiguration` {#LeaderElectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>LeaderElectionConfiguration defines the configuration of leader election
clients for components that can run with leader election enabled.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>leaderElect</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>leaderElect enables a leader election client to gain leadership
before executing the main loop. Enable this when running replicated
components for high availability.</p>
</td>
</tr>
<tr><td><code>leaseDuration</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>leaseDuration is the duration that non-leader candidates will wait
after observing a leadership renewal until attempting to acquire
leadership of a led but unrenewed leader slot. This is effectively the
maximum duration that a leader can be stopped before it is replaced
by another candidate. This is only applicable if leader election is
enabled.</p>
</td>
</tr>
<tr><td><code>renewDeadline</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>renewDeadline is the interval between attempts by the acting master to
renew a leadership slot before it stops leading. This must be less
than or equal to the lease duration. This is only applicable if leader
election is enabled.</p>
</td>
</tr>
<tr><td><code>retryPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>retryPeriod is the duration the clients should wait between attempting
acquisition and renewal of a leadership. This is only applicable if
leader election is enabled.</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceLock indicates the resource object type that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceName indicates the name of resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceNamespace</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceNamespace indicates the namespace of the resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
</tbody>
</table>
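A minimal sketch of a leaderElection stanza inside a KubeSchedulerConfiguration file; the durations and lock identifiers shown are the commonly used defaults and are included only for illustration:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 15s
  renewDeadline: 10s
  retryPeriod: 2s
  resourceLock: leases
  resourceName: kube-scheduler
  resourceNamespace: kube-system
```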
## `KubeProxyConfiguration` {#kubeproxy-config-k8s-io-v1alpha1-KubeProxyConfiguration}
@ -192,6 +382,15 @@ An empty string slice is meant to select all network interfaces.</p>
<p>DetectLocal contains optional configuration settings related to DetectLocalMode.</p>
</td>
</tr>
<tr><td><code>logging</code> <B>[Required]</B><br/>
<a href="#LoggingConfiguration"><code>LoggingConfiguration</code></a>
</td>
<td>
<p>logging specifies the logging configuration options.
Refer to <a href="https://github.com/kubernetes/component-base/blob/master/logs/options.go">Logs Options</a>
for more information.</p>
</td>
</tr>
</tbody>
</table>
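A rough sketch of the new logging stanza in a kube-proxy configuration file; format and verbosity are the fields most commonly set, and the values here are assumptions for illustration:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
logging:
  format: text   # "json" is the other built-in format
  verbosity: 2
```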
@ -520,200 +719,4 @@ will exit with an error.</p>
## `ClientConnectionConfiguration` {#ClientConnectionConfiguration}
**Appears in:**
- [KubeProxyConfiguration](#kubeproxy-config-k8s-io-v1alpha1-KubeProxyConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>ClientConnectionConfiguration contains details for constructing a client.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>kubeconfig</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>kubeconfig is the path to a KubeConfig file.</p>
</td>
</tr>
<tr><td><code>acceptContentTypes</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>acceptContentTypes defines the Accept header sent by clients when connecting to a server, overriding the
default value of 'application/json'. This field will control all connections to the server used by a particular
client.</p>
</td>
</tr>
<tr><td><code>contentType</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>contentType is the content type used when sending data to the server from this client.</p>
</td>
</tr>
<tr><td><code>qps</code> <B>[Required]</B><br/>
<code>float32</code>
</td>
<td>
<p>qps controls the number of queries per second allowed for this connection.</p>
</td>
</tr>
<tr><td><code>burst</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>burst allows extra queries to accumulate when a client is exceeding its rate.</p>
</td>
</tr>
</tbody>
</table>
## `DebuggingConfiguration` {#DebuggingConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>enableProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableProfiling enables profiling via web interface host:port/debug/pprof/</p>
</td>
</tr>
<tr><td><code>enableContentionProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableContentionProfiling enables block profiling, if
enableProfiling is true.</p>
</td>
</tr>
</tbody>
</table>
## `LeaderElectionConfiguration` {#LeaderElectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [GenericControllerManagerConfiguration](#controllermanager-config-k8s-io-v1alpha1-GenericControllerManagerConfiguration)
<p>LeaderElectionConfiguration defines the configuration of leader election
clients for components that can run with leader election enabled.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>leaderElect</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>leaderElect enables a leader election client to gain leadership
before executing the main loop. Enable this when running replicated
components for high availability.</p>
</td>
</tr>
<tr><td><code>leaseDuration</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>leaseDuration is the duration that non-leader candidates will wait
after observing a leadership renewal until attempting to acquire
leadership of a led but unrenewed leader slot. This is effectively the
maximum duration that a leader can be stopped before it is replaced
by another candidate. This is only applicable if leader election is
enabled.</p>
</td>
</tr>
<tr><td><code>renewDeadline</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>renewDeadline is the interval between attempts by the acting master to
renew a leadership slot before it stops leading. This must be less
than or equal to the lease duration. This is only applicable if leader
election is enabled.</p>
</td>
</tr>
<tr><td><code>retryPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>retryPeriod is the duration the clients should wait between attempting
acquisition and renewal of a leadership. This is only applicable if
leader election is enabled.</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceLock indicates the resource object type that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceName indicates the name of resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceNamespace</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceNamespace indicates the namespace of the resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
</tbody>
</table>
View File
@ -20,6 +20,188 @@ auto_generated: true
## `ClientConnectionConfiguration` {#ClientConnectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
<p>ClientConnectionConfiguration contains details for constructing a client.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>kubeconfig</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>kubeconfig is the path to a KubeConfig file.</p>
</td>
</tr>
<tr><td><code>acceptContentTypes</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>acceptContentTypes defines the Accept header sent by clients when connecting to a server, overriding the
default value of 'application/json'. This field will control all connections to the server used by a particular
client.</p>
</td>
</tr>
<tr><td><code>contentType</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>contentType is the content type used when sending data to the server from this client.</p>
</td>
</tr>
<tr><td><code>qps</code> <B>[Required]</B><br/>
<code>float32</code>
</td>
<td>
<p>qps controls the number of queries per second allowed for this connection.</p>
</td>
</tr>
<tr><td><code>burst</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>burst allows extra queries to accumulate when a client is exceeding its rate.</p>
</td>
</tr>
</tbody>
</table>
## `DebuggingConfiguration` {#DebuggingConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>enableProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableProfiling enables profiling via web interface host:port/debug/pprof/</p>
</td>
</tr>
<tr><td><code>enableContentionProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableContentionProfiling enables block profiling, if
enableProfiling is true.</p>
</td>
</tr>
</tbody>
</table>
## `LeaderElectionConfiguration` {#LeaderElectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
<p>LeaderElectionConfiguration defines the configuration of leader election
clients for components that can run with leader election enabled.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>leaderElect</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>leaderElect enables a leader election client to gain leadership
before executing the main loop. Enable this when running replicated
components for high availability.</p>
</td>
</tr>
<tr><td><code>leaseDuration</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>leaseDuration is the duration that non-leader candidates will wait
after observing a leadership renewal until attempting to acquire
leadership of a led but unrenewed leader slot. This is effectively the
maximum duration that a leader can be stopped before it is replaced
by another candidate. This is only applicable if leader election is
enabled.</p>
</td>
</tr>
<tr><td><code>renewDeadline</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>renewDeadline is the interval between attempts by the acting master to
renew a leadership slot before it stops leading. This must be less
than or equal to the lease duration. This is only applicable if leader
election is enabled.</p>
</td>
</tr>
<tr><td><code>retryPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>retryPeriod is the duration the clients should wait between attempting
acquisition and renewal of a leadership. This is only applicable if
leader election is enabled.</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceLock indicates the resource object type that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceName indicates the name of resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceNamespace</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceNamespace indicates the namespace of the resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
</tbody>
</table>
## `DefaultPreemptionArgs` {#kubescheduler-config-k8s-io-v1-DefaultPreemptionArgs}
@ -191,6 +373,16 @@ with the &quot;default-scheduler&quot; profile, if present here.</p>
with the extender. These extenders are shared by all scheduler profiles.</p>
</td>
</tr>
<tr><td><code>delayCacheUntilActive</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>DelayCacheUntilActive specifies when to start caching. If this is true and leader election is enabled,
the scheduler will wait to fill informer caches until it is the leader. This makes failover slower,
with the benefit of lower memory overhead while waiting to become the leader.
Defaults to false.</p>
</td>
</tr>
</tbody>
</table>
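To make the new field concrete, a hedged sketch of a KubeSchedulerConfiguration that opts into delayed cache filling; it only has an effect when leader election is enabled:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
# fill informer caches only after this replica becomes the leader:
# slower failover, lower memory use while on standby
delayCacheUntilActive: true
```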
@ -210,7 +402,7 @@ with the extender. These extenders are shared by all scheduler profiles.</p>
<tr><td><code>addedAffinity</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#nodeaffinity-v1-core"><code>core/v1.NodeAffinity</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#nodeaffinity-v1-core"><code>core/v1.NodeAffinity</code></a>
</td>
<td>
<p>AddedAffinity is applied to all Pods additionally to the NodeAffinity
@ -309,7 +501,7 @@ The default strategy is LeastAllocated with an equal &quot;cpu&quot; and &quot;m
<tr><td><code>defaultConstraints</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#topologyspreadconstraint-v1-core"><code>[]core/v1.TopologySpreadConstraint</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#topologyspreadconstraint-v1-core"><code>[]core/v1.TopologySpreadConstraint</code></a>
</td>
<td>
<p>DefaultConstraints defines topology spread constraints to be applied to
@ -1089,192 +1281,4 @@ Weight defaults to 1 if not specified or explicitly set to 0.</p>
</tr>
</tbody>
</table>
## `ClientConnectionConfiguration` {#ClientConnectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
<p>ClientConnectionConfiguration contains details for constructing a client.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>kubeconfig</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>kubeconfig is the path to a KubeConfig file.</p>
</td>
</tr>
<tr><td><code>acceptContentTypes</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>acceptContentTypes defines the Accept header sent by clients when connecting to a server, overriding the
default value of 'application/json'. This field will control all connections to the server used by a particular
client.</p>
</td>
</tr>
<tr><td><code>contentType</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>contentType is the content type used when sending data to the server from this client.</p>
</td>
</tr>
<tr><td><code>qps</code> <B>[Required]</B><br/>
<code>float32</code>
</td>
<td>
<p>qps controls the number of queries per second allowed for this connection.</p>
</td>
</tr>
<tr><td><code>burst</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>burst allows extra queries to accumulate when a client is exceeding its rate.</p>
</td>
</tr>
</tbody>
</table>
## `DebuggingConfiguration` {#DebuggingConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
<p>DebuggingConfiguration holds configuration for Debugging related features.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>enableProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableProfiling enables profiling via web interface host:port/debug/pprof/</p>
</td>
</tr>
<tr><td><code>enableContentionProfiling</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>enableContentionProfiling enables block profiling, if
enableProfiling is true.</p>
</td>
</tr>
</tbody>
</table>
## `LeaderElectionConfiguration` {#LeaderElectionConfiguration}
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1-KubeSchedulerConfiguration)
<p>LeaderElectionConfiguration defines the configuration of leader election
clients for components that can run with leader election enabled.</p>
<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>leaderElect</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>leaderElect enables a leader election client to gain leadership
before executing the main loop. Enable this when running replicated
components for high availability.</p>
</td>
</tr>
<tr><td><code>leaseDuration</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>leaseDuration is the duration that non-leader candidates will wait
after observing a leadership renewal until attempting to acquire
leadership of a led but unrenewed leader slot. This is effectively the
maximum duration that a leader can be stopped before it is replaced
by another candidate. This is only applicable if leader election is
enabled.</p>
</td>
</tr>
<tr><td><code>renewDeadline</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>renewDeadline is the interval between attempts by the acting master to
renew a leadership slot before it stops leading. This must be less
than or equal to the lease duration. This is only applicable if leader
election is enabled.</p>
</td>
</tr>
<tr><td><code>retryPeriod</code> <B>[Required]</B><br/>
<a href="https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Duration"><code>meta/v1.Duration</code></a>
</td>
<td>
<p>retryPeriod is the duration the clients should wait between attempting
acquisition and renewal of a leadership. This is only applicable if
leader election is enabled.</p>
</td>
</tr>
<tr><td><code>resourceLock</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceLock indicates the resource object type that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceName indicates the name of resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
<tr><td><code>resourceNamespace</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>resourceNamespace indicates the namespace of the resource object that will be used to lock
during leader election cycles.</p>
</td>
</tr>
</tbody>
</table>
View File
@ -210,7 +210,7 @@ with the extender. These extenders are shared by all scheduler profiles.</p>
<tr><td><code>addedAffinity</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#nodeaffinity-v1-core"><code>core/v1.NodeAffinity</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#nodeaffinity-v1-core"><code>core/v1.NodeAffinity</code></a>
</td>
<td>
<p>AddedAffinity is applied to all Pods additionally to the NodeAffinity
@ -309,7 +309,7 @@ The default strategy is LeastAllocated with an equal &quot;cpu&quot; and &quot;m
<tr><td><code>defaultConstraints</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#topologyspreadconstraint-v1-core"><code>[]core/v1.TopologySpreadConstraint</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#topologyspreadconstraint-v1-core"><code>[]core/v1.TopologySpreadConstraint</code></a>
</td>
<td>
<p>DefaultConstraints defines topology spread constraints to be applied to
@ -1083,8 +1083,6 @@ Weight defaults to 1 if not specified or explicitly set to 0.</p>
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
@ -1141,8 +1139,6 @@ client.</p>
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
@ -1177,8 +1173,6 @@ enableProfiling is true.</p>
**Appears in:**
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerConfiguration)
- [KubeSchedulerConfiguration](#kubescheduler-config-k8s-io-v1beta3-KubeSchedulerConfiguration)
View File
@ -62,7 +62,7 @@ customization).</p>
provided by kubeadm also includes enforcing consistency of values across components when required (e.g.
the <code>--cluster-cidr</code> flag on the controller manager and <code>clusterCIDR</code> on kube-proxy).</p>
<p>Users are always allowed to override default values, with the only exception of a small subset of setting with
relevance for security (e.g. enforce authorization-mode Node and RBAC on api server)</p>
relevance for security (e.g. enforce authorization-mode Node and RBAC on api server).</p>
<p>If the user provides a configuration type that is not expected for the action being performed, kubeadm will
ignore it and print a warning.</p>
<h2>Kubeadm init configuration types</h2>
@ -218,7 +218,7 @@ configuration types to be used during a <code>kubeadm init</code> run.</p>
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">pathType</span>:<span style="color:#bbb"> </span>File<span style="color:#bbb">
</span><span style="color:#bbb"></span><span style="color:#000;font-weight:bold">scheduler</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">extraArgs</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">address</span>:<span style="color:#bbb"> </span><span style="color:#d14">&#34;10.100.0.1&#34;</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">bind-address</span>:<span style="color:#bbb"> </span><span style="color:#d14">&#34;10.100.0.1&#34;</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">extraVolumes</span>:<span style="color:#bbb">
</span><span style="color:#bbb"> </span>- <span style="color:#000;font-weight:bold">name</span>:<span style="color:#bbb"> </span><span style="color:#d14">&#34;some-volume&#34;</span><span style="color:#bbb">
</span><span style="color:#bbb"> </span><span style="color:#000;font-weight:bold">hostPath</span>:<span style="color:#bbb"> </span><span style="color:#d14">&#34;/etc/some-path&#34;</span><span style="color:#bbb">
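Stripped of the inline highlighting, the updated scheduler stanza of the ClusterConfiguration example reads roughly as follows (only the lines visible in the hunk above are reproduced):

```yaml
scheduler:
  extraArgs:
    bind-address: "10.100.0.1"   # replaces the "address" argument used in the previous revision
  extraVolumes:
  - name: "some-volume"
    hostPath: "/etc/some-path"
```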
@ -934,7 +934,7 @@ file from which to load cluster information.</p>
</td>
</tr>
<tr><td><code>pathType</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#hostpathtype-v1-core"><code>core/v1.HostPathType</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#hostpathtype-v1-core"><code>core/v1.HostPathType</code></a>
</td>
<td>
<p><code>pathType</code> is the type of the <code>hostPath</code>.</p>
@ -1153,11 +1153,11 @@ Defaults to the hostname of the node if not provided.</p>
</td>
<td>
<p><code>criSocket</code> is used to retrieve container runtime info.
This information will be annotated to the Node API object, for later re-use</p>
This information will be annotated to the Node API object, for later re-use.</p>
</td>
</tr>
<tr><td><code>taints</code> <B>[Required]</B><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#taint-v1-core"><code>[]core/v1.Taint</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#taint-v1-core"><code>[]core/v1.Taint</code></a>
</td>
<td>
<p><code>taints</code> specifies the taints the Node API object should be registered with.
@ -1184,11 +1184,12 @@ command line except without leading dash(es).</p>
</td>
<td>
<p><code>ignorePreflightErrors</code> provides a list of pre-flight errors to be ignored when
the current node is registered.</p>
the current node is registered, e.g. <code>IsPrivilegedUser,Swap</code>.
Value <code>all</code> ignores errors from all checks.</p>
</td>
</tr>
<tr><td><code>imagePullPolicy</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#pullpolicy-v1-core"><code>core/v1.PullPolicy</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#pullpolicy-v1-core"><code>core/v1.PullPolicy</code></a>
</td>
<td>
<p><code>imagePullPolicy</code> specifies the policy for image pulling during kubeadm &quot;init&quot; and
@ -1281,7 +1282,7 @@ for, so other administrators can know its purpose.</p>
</td>
</tr>
<tr><td><code>expires</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#time-v1-meta"><code>meta/v1.Time</code></a>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#time-v1-meta"><code>meta/v1.Time</code></a>
</td>
<td>
<p><code>expires</code> specifies the timestamp when this token expires. Defaults to being set
File diff suppressed because it is too large

View File
@ -81,9 +81,9 @@ to provide credentials. Images are expected to contain the registry domain
and URL path.</p>
<p>Each entry in matchImages is a pattern which can optionally contain a port and a path.
Globs can be used in the domain, but not in the port or the path. Globs are supported
as subdomains like <code>&ast;.k8s.io</code> or <code>k8s.&ast;.io</code>, and top-level-domains such as <code>k8s.&ast;</code>.
Matching partial subdomains like <code>app&ast;.k8s.io</code> is also supported. Each glob can only match
a single subdomain segment, so <code>&ast;.io</code> does not match <code>&ast;.k8s.io</code>.</p>
as subdomains like '*.k8s.io' or 'k8s.*.io', and top-level-domains such as 'k8s.*'.
Matching partial subdomains like 'app*.k8s.io' is also supported. Each glob can only match
a single subdomain segment, so '*.io' does not match '*.k8s.io'.</p>
<p>A match exists between an image and a matchImage when all of the below are true:</p>
<ul>
<li>Both contain the same number of domain parts and each part matches.</li>
@ -93,9 +93,9 @@ a single subdomain segment, so <code>&ast;.io</code> does not match <code>&ast;.
<p>Example values of matchImages:</p>
<ul>
<li>123456789.dkr.ecr.us-east-1.amazonaws.com</li>
<li>&ast;.azurecr.io</li>
<li>*.azurecr.io</li>
<li>gcr.io</li>
<li>&ast;.&ast;.registry.io</li>
<li>*.*.registry.io</li>
<li>registry.io:8080/path</li>
</ul>
</td>
@ -169,4 +169,4 @@ credential plugin.</p>
</tr>
</tbody>
</table>
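To show how matchImages patterns are consumed, here is a hedged sketch of a kubelet CredentialProviderConfig entry; the plugin name is hypothetical, and the patterns are the example values listed above:

```yaml
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- name: example-credential-plugin   # hypothetical plugin binary on the node
  apiVersion: credentialprovider.kubelet.k8s.io/v1
  defaultCacheDuration: "12h"
  matchImages:
  - "123456789.dkr.ecr.us-east-1.amazonaws.com"
  - "*.azurecr.io"
  - "gcr.io"
  - "*.*.registry.io"
  - "registry.io:8080/path"
```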
Some files were not shown because too many files have changed in this diff