Merge remote-tracking branch 'upstream/main' into dev-1.23

This commit is contained in:
Jesse Butler 2021-11-17 12:55:09 -05:00
commit d330226a95
246 changed files with 6943 additions and 2352 deletions

View File

@ -35,9 +35,11 @@ aliases:
- divya-mohan0209
- jimangel
- kbhawkey
- mehabhalodiya
- onlydole
- rajeshdeshpande02
- sftim
- shannonxtreme
- tengqm
sig-docs-es-owners: # Admins for Spanish content
- raelga
@ -166,6 +168,7 @@ aliases:
- xichengliudui
# zhangxiaoyu-zidif
sig-docs-pt-owners: # Admins for Portuguese content
- edsoncelio
- femrtnz
- jailton
- jcjesus
@ -174,6 +177,7 @@ aliases:
- rikatz
- yagonobre
sig-docs-pt-reviews: # PR reviews for Portugese content
- edsoncelio
- femrtnz
- jailton
- jcjesus
@ -221,12 +225,12 @@ aliases:
# authoritative source: git.k8s.io/community/OWNERS_ALIASES
committee-steering: # provide PR approvals for announcements
- cblecker
- derekwaynecarr
- dims
- justaugustus
- liggitt
- mrbobbytables
- nikhita
- parispittman
- tpepper
# authoritative source: https://git.k8s.io/sig-release/OWNERS_ALIASES
sig-release-leads:
- cpanato # SIG Technical Lead

View File

@ -146,7 +146,8 @@ Learn more about SIG Docs Kubernetes community and meetings on the [community pa
You can also reach the maintainers of this project at:
- [Slack](https://kubernetes.slack.com/messages/sig-docs) [Get an invite for this Slack](https://slack.k8s.io/)
- [Slack](https://kubernetes.slack.com/messages/sig-docs)
- [Get an invite for this Slack](https://slack.k8s.io/)
- [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-docs)
## Contributing to the docs

View File

@ -810,11 +810,10 @@ section#cncf {
}
}
.td-search {
header > .header-filler {
height: $hero-padding-top;
background-color: black;
}
// Header filler size adjustment
.header-hero.filler {
height: $hero-padding-top;
}
// Docs specific
@ -859,17 +858,6 @@ section#cncf {
/* DOCUMENTATION */
body.td-documentation {
header > .header-filler {
height: $hero-padding-top;
background-color: black;
}
/* Special case for if an announcement is active */
header section#announcement ~ .header-filler {
display: none;
}
}
// nav-tabs and tab-content
.nav-tabs {
border-bottom: none !important;

View File

@ -26,6 +26,10 @@ $announcement-size-adjustment: 8px;
}
}
.header-hero #quickstartButton.button {
margin-top: 1em;
}
section {
.main-section {
@media only screen and (min-width: 1024px) {
@ -34,8 +38,11 @@ section {
}
}
.td-outer {
padding: 0 !important;
body {
header + .td-outer {
min-height: 50vh;
height: auto;
}
}
@ -313,33 +320,40 @@ main {
// blockquotes and callouts
.td-content, body {
blockquote.callout {
body {
.alert {
// Override Docsy styles
padding: 0.4rem 0.4rem 0.4rem 1rem;
border: 1px solid #eee;
border-left-width: 0.5em;
border-top: 1px solid #eee;
border-bottom: 1px solid #eee;
border-right: 1px solid #eee;
border-radius: 0.25em;
border-left-width: 0.5em; // fallback in case calc() is missing
background: #fff;
color: #000;
margin-top: 0.5em;
margin-bottom: 0.5em;
}
blockquote.callout {
border-radius: calc(1em/3);
// Set minimum width and radius for alert color
.alert {
border-left-width: calc(max(0.5em, 4px));
border-top-left-radius: calc(max(0.5em, 4px));
border-bottom-left-radius: calc(max(0.5em, 4px));
}
.callout.caution {
.alert.callout.caution {
border-left-color: #f0ad4e;
}
.callout.note {
.alert.callout.note {
border-left-color: #428bca;
}
.callout.warning {
.alert.callout.warning {
border-left-color: #d9534f;
}
.alert.third-party-content {
border-left-color: #444;
}
h1:first-of-type + blockquote.callout {
h1:first-of-type + .alert.callout {
margin-top: 1.5em;
}
}
@ -367,7 +381,7 @@ main {
background: #f8f9cb;
}
.deprecation-warning {
.deprecation-warning, .pageinfo.deprecation-warning {
padding: 20px;
margin: 20px 0;
background-color: #faf5b6;
@ -554,34 +568,6 @@ main.content {
}
}
/* ANNOUNCEMENTS */
section#fp-announcement ~ .header-hero {
padding: $announcement-size-adjustment 0;
> div {
margin-top: $announcement-size-adjustment;
margin-bottom: $announcement-size-adjustment;
}
h1, h2, h3, h4, h5 {
margin: $announcement-size-adjustment 0;
}
}
section#announcement ~ .header-hero {
padding: #{$announcement-size-adjustment / 2} 0;
> div {
margin-top: #{$announcement-size-adjustment / 2};
margin-bottom: #{$announcement-size-adjustment / 2};
padding-bottom: #{$announcement-size-adjustment / 2};
}
h1, h2, h3, h4, h5 {
margin: #{$announcement-size-adjustment / 2} 0;
}
}
/* DOCUMENTATION */
/* Don't show lead text */
@ -607,12 +593,12 @@ body.td-documentation {
@media print {
/* Do not print announcements */
#announcement, section#announcement, #fp-announcement, section#fp-announcement {
#announcement {
display: none;
}
}
#announcement, #fp-announcement {
#announcement {
> * {
color: inherit;
background: inherit;
@ -629,42 +615,90 @@ body.td-documentation {
}
}
#announcement {
padding-top: 105px;
padding-bottom: 25px;
}
.header-hero {
padding-top: 40px;
}
/* Extra announcement height only for landscape viewports */
@media (min-aspect-ratio: 8/9) {
#fp-announcement {
min-height: 25vh;
}
}
#fp-announcement aside {
padding-top: 115px;
padding-bottom: 25px;
}
.announcement {
.content {
#announcement {
.announcement-main {
margin-left: auto;
margin-right: auto;
margin-bottom: 0px;
// for padding-top see _size.scss
padding-bottom: calc(max(2em, 2rem));
max-width: calc(min(1200px - 8em, 80vw));
}
> p {
.gridPage #announcement .content p,
.announcement > h4,
.announcement > h3 {
color: #ffffff;
/* always white */
h1, h2, h3, h4, h5, h6, p * {
color: #ffffff;
background: transparent;
img.event-logo {
display: inline-block;
max-height: calc(min(80px, 8em));
max-width: calc(min(240px, 33vw));
float: right;
}
}
}
#announcement + .header-hero {
padding-top: 2em;
}
// Extra padding for anything except wide viewports
@media (min-width: 992px) {
#announcement aside { // more specific
.announcement-main {
padding-top: calc(max(8em, 8rem));
}
}
}
@media (max-width: 768px) {
#announcement {
padding-top: 4rem;
padding-bottom: 4rem;
.announcement-main, aside .announcement-main {
padding-top: calc(min(2rem,2em));
}
}
}
@media (max-width: 480px) {
#announcement {
padding-bottom: 0.5em;
}
#announcement aside {
h1, h2, h3, h4, h5, h6 {
img.event-logo {
margin-left: auto;
margin-right: auto;
margin-bottom: 0.75em;
display: block;
max-height: initial;
max-width: calc(min(calc(100vw - 2em), 240px));
float: initial;
}
}
}
}
#announcement + .header-hero.filler {
display: none;
}
@media (min-width: 768px) {
#announcement + .header-hero {
display: none;
}
}
// Match Docsy-imposed max width on text body
@media (min-width: 1200px) {
body.td-blog main .td-content > figure {
@ -721,3 +755,13 @@ figure {
}
}
}
// Indent definition lists
dl {
padding-left: 1.5em;
// Add vertical space before definitions
> *:not(dt) + dt, dt:first-child {
margin-top: 1.5em;
}
}

View File

@ -18,3 +18,11 @@ section,
line-height: $vendor-strip-height;
font-size: $vendor-strip-font-size;
}
#announcement {
min-height: $hero-padding-top;
.announcement-main {
padding-top: calc(max(8em, 8rem, #{$hero-padding-top} / 3));
}
}

View File

@ -0,0 +1,137 @@
---
title: Training
bigheader: Kubernetes Training and Certifications
abstract: Training programs, certifications, and partners.
layout: basic
cid: training
class: training
---
<section class="call-to-action">
<div class="main-section">
<div class="call-to-action" id="cta-certification">
<div class="cta-text">
<h2>Shape your cloud native career</h2>
<p>Kubernetes is at the heart of the cloud native movement. Training and certification from the Linux Foundation and our training partners lets you invest in your career, learn Kubernetes, and make your cloud native projects successful.</p>
</div>
<div class="logo-certification cta-image" id="logo-kcnf">
<img src="/images/training/kubernetes-kcnf-white.svg" />
</div>
<div class="logo-certification cta-image" id="logo-cka">
<img src="/images/training/kubernetes-cka-white.svg"/>
</div>
<div class="logo-certification cta-image" id="logo-ckad">
<img src="/images/training/kubernetes-ckad-white.svg"/>
</div>
<div class="logo-certification cta-image" id="logo-cks">
<img src="/images/training/kubernetes-cks-white.svg"/>
</div>
</div>
</div>
</section>
<section>
<div class="main-section padded">
<center>
<h2>Take a free course on edX</h2>
</center>
<div class="col-container">
<div class="col-nav">
<center>
<h5>
<b>Introduction to Kubernetes <br> &nbsp;</b>
</h5>
<p>Want to learn Kubernetes? Get an in-depth primer on this powerful system for managing containerized applications.</p>
<br>
<a href="https://www.edx.org/course/introduction-to-kubernetes" target="_blank" class="button">Go to Course</a>
</center>
</div>
<div class="col-nav">
<center>
<h5>
<b>Introduction to Cloud Infrastructure Technologies</b>
</h5>
<p>Learn the fundamentals of building and managing cloud technologies directly from the Linux Foundation, the leader in open source.</p>
<br>
<a href="https://www.edx.org/course/introduction-to-cloud-infrastructure-technologies" target="_blank" class="button">Go to Course</a>
</center>
</div>
<div class="col-nav">
<center>
<h5>
<b>Introduction to Linux</b>
</h5>
<p>Never learned Linux? Want a refresher? Develop a good working knowledge of Linux using both the graphical interface and command line across the major Linux distributions.</p>
<br>
<a href="https://www.edx.org/course/introduction-to-linux" target="_blank" class="button">Go to Course</a>
</center>
</div>
</div>
</section>
<div class="padded lighter-gray-bg">
<div class="main-section two-thirds-centered">
<center>
<h2>Learn with the Linux Foundation</h2>
<p>The Linux Foundation offers instructor-led and self-paced courses for all aspects of developing and operating Kubernetes applications.</p>
<br/><br/>
<a href="https://training.linuxfoundation.org/training/course-catalog/?_sft_technology=kubernetes" target="_blank" class="button">See Courses</a>
</center>
</div>
</div>
<section id="get-certified">
<div class="main-section padded">
<h2>Get Kubernetes Certified</h2>
<div class="col-container">
<div class="col-nav">
<h5>
<b>Kubernetes and Cloud Native Associate (KCNA)</b>
</h5>
<p>The Kubernetes and Cloud Native Associate (KCNA) exam demonstrates a user's foundational knowledge and skills in Kubernetes and the wider cloud native ecosystem.</p>
<p>A certified KCNA confirms conceptual knowledge of the entire cloud native ecosystem, particularly focusing on Kubernetes.</p>
<br>
<a href="https://training.linuxfoundation.org/certification/kubernetes-cloud-native-associate/" target="_blank" class="button">Go to Certification</a>
</div>
<div class="col-nav">
<h5>
<b>Certified Kubernetes Application Developer (CKAD)</b>
</h5>
<p>The Certified Kubernetes Application Developer exam certifies that candidates can design, build, configure, and deploy cloud native applications for Kubernetes.</p>
<p>A CKAD can define application resources and use core primitives to build, monitor, and troubleshoot scalable applications and tools in Kubernetes.</p>
<br>
<a href="https://training.linuxfoundation.org/certification/certified-kubernetes-application-developer-ckad/" target="_blank" class="button">Go to Certification</a>
</div>
<div class="col-nav">
<h5>
<b>Certified Kubernetes Administrator (CKA)</b>
</h5>
<p>The Certified Kubernetes Administrator (CKA) program provides assurance that CKAs have the skills, knowledge, and competency to perform the responsibilities of Kubernetes administrators.</p>
<p>A certified Kubernetes administrator has demonstrated the ability to do basic installation as well as configure and manage Kubernetes clusters in a production environment.</p>
<br>
<a href="https://training.linuxfoundation.org/certification/certified-kubernetes-administrator-cka/" target="_blank" class="button">Go to Certification</a>
</div>
<div class="col-nav">
<h5>
<b>Certified Kubernetes Security Specialist (CKS)</b>
</h5>
<p>The Certified Kubernetes Security Specialist (CKS) program provides assurance that the certificate holder is familiar with, and competent in, a broad range of best practices. CKS certification covers skills for securing container-based applications and Kubernetes platforms during build, deployment, and runtime.</p>
<p><em>CKS candidates must hold a current Certified Kubernetes Administrator (CKA) certification to demonstrate they possess sufficient Kubernetes expertise before registering for the CKS.</em></p>
<br>
<a href="https://training.linuxfoundation.org/certification/certified-kubernetes-security-specialist/" target="_blank" class="button">Go to Certification</a>
</div>
</div>
</div>
</section>
<div class="padded lighter-gray-bg">
<div class="main-section two-thirds-centered">
<center>
<h2>Kubernetes Training Partners</h2>
<p>Our network of Kubernetes Training Partners provides training services for Kubernetes and cloud native projects.</p>
</center>
</div>
<div class="main-section landscape-section">
{{< cncf-landscape helpers=false category="kubernetes-training-partner" >}}
</div>
</div>

View File

@ -29,7 +29,7 @@ They join continuing members Christoph Blecker ([@cblecker](https://github.com/c
* Josh Berkus ([@jberkus](https://github.com/jberkus)), Red Hat
* Thanks to the Emeritus Steering Committee Members. Your prior service is appreciated by the community:
* Aaron Crickenberger ([@spiffxp](https://github.com/spiffxp)), Google
* and Lachlan Evenson([@lachie8e)](https://github.com/lachie8e)), Microsoft
* and Lachlan Evenson ([@lachie83](https://github.com/lachie83)), Microsoft
* And thank you to all the candidates who came forward to run for election. As [Jorge Castro put it](https://twitter.com/castrojo/status/1315718627639820288?s=20): we are spoiled with capable, kind, and selfless volunteers who put the needs of the project first.
## Get Involved with the Steering Committee

View File

@ -121,7 +121,7 @@ deploymentApplyConfig.Spec.Template.Spec.WithContainers(corev1ac.Container().
)
// apply
applied, err := deploymentClient.Apply(ctx, extractedDeployment, metav1.ApplyOptions{FieldManager: fieldMgr})
applied, err := deploymentClient.Apply(ctx, deploymentApplyConfig, metav1.ApplyOptions{FieldManager: fieldMgr})
```
For developers using Custom Resource Definitions (CRDs), the Kubebuilder apply support will provide the same capabilities. Documentation will be included in the Kubebuilder book when available.

View File

@ -0,0 +1,56 @@
---
layout: blog
title: "Announcing the 2021 Steering Committee Election Results"
date: 2021-11-08
slug: steering-committee-results-2021
---
**Author**: Kaslin Fields
The [2021 Steering Committee Election](https://github.com/kubernetes/community/tree/master/events/elections/2021) is now complete. The Kubernetes Steering Committee consists of 7 seats, 4 of which were up for election in 2021. Incoming committee members serve a term of 2 years, and all members are elected by the Kubernetes Community.
This community body is significant since it oversees the governance of the entire Kubernetes project. With that great power comes great responsibility. You can learn more about the Steering Committee's role in their [charter](https://github.com/kubernetes/steering/blob/master/charter.md).
## Results
Congratulations to the elected committee members whose two year terms begin immediately (listed in alphabetical order by GitHub handle):
* **Christoph Blecker ([@cblecker](https://github.com/cblecker)), Red Hat**
* **Stephen Augustus ([@justaugustus](https://github.com/justaugustus)), Cisco**
* **Paris Pittman ([@parispittman](https://github.com/parispittman)), Apple**
* **Tim Pepper ([@tpepper](https://github.com/tpepper)), VMware**
They join continuing members:
* **Davanum Srinivas ([@dims](https://github.com/dims)), VMware**
* **Jordan Liggitt ([@liggitt](https://github.com/liggitt)), Google**
* **Bob Killen ([@mrbobbytables](https://github.com/mrbobbytables)), Google**
Paris Pittman and Christoph Blecker are returning Steering Committee Members.
## Big Thanks
Thank you and congratulations on a successful election to this round's election officers:
* Alison Dowdney ([@alisondy](https://github.com/alisondy))
* Noah Kantrowitz ([@coderanger](https://github.com/coderanger))
* Josh Berkus ([@jberkus](https://github.com/jberkus))
Special thanks to Arnaud Meukam ([@ameukam](https://github.com/ameukam)), k8s-infra liaison, who enabled our voting software on community-owned infrastructure.
Thanks to the Emeritus Steering Committee Members. Your prior service is appreciated by the community:
* Derek Carr ([@derekwaynecarr](https://github.com/derekwaynecarr))
* Nikhita Raghunath ([@nikhita](https://github.com/nikhita))
And thank you to all the candidates who came forward to run for election.
## Get Involved with the Steering Committee
This governing body, like all of Kubernetes, is open to all. You can follow along with Steering Committee [backlog items](https://github.com/kubernetes/steering/projects/1) and weigh in by filing an issue or creating a PR against their [repo](https://github.com/kubernetes/steering). They have an open meeting on [the first Monday at 9:30am PT of every month](https://github.com/kubernetes/steering) and regularly attend Meet Our Contributors. They can also be contacted at their public mailing list steering@kubernetes.io.
You can see what the Steering Committee meetings are all about by watching past meetings on the [YouTube Playlist](https://www.youtube.com/playlist?list=PL69nYSiGNLP1yP1B_nd9-drjoxp0Q14qM).
---
_This post was written by the [Upstream Marketing Working Group](https://github.com/kubernetes/community/tree/master/communication/marketing-team#contributor-marketing). If you want to write stories about the Kubernetes community, learn more about us._

View File

@ -0,0 +1,238 @@
---
layout: blog
title: 'Non-root Containers And Devices'
date: 2021-11-09
slug: non-root-containers-and-devices
---
**Author:** Mikko Ylinen (Intel)
The user/group ID related security settings in a Pod's `securityContext` trigger a problem when users want to
deploy containers that use accelerator devices (via [Kubernetes Device Plugins](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)) on Linux. In this blog
post I talk about the problem and describe the work done so far to address it. It's not meant to be a long story about getting the [k/k issue](https://github.com/kubernetes/kubernetes/issues/92211) fixed.
Instead, this post aims to raise awareness of the issue and to highlight important device use-cases too. This is needed as Kubernetes works on new related features such as support for user namespaces.
## Why non-root containers can't use devices and why it matters
One of the key security principles for running containers in Kubernetes is the
principle of least privilege. The Pod/container `securityContext` specifies the config
options to set, e.g., Linux capabilities, MAC policies, and user/group ID values to achieve this.
Furthermore, the cluster admins are supported with tools like [PodSecurityPolicy](/docs/concepts/policy/pod-security-policy/) (deprecated) or
[Pod Security Admission](/docs/concepts/security/pod-security-admission/) (alpha) to enforce the desired security settings for pods that are being deployed in
the cluster. These settings could, for instance, require that containers must be `runAsNonRoot` or
that they are forbidden from running with root's group ID in `runAsGroup` or `supplementalGroups`.
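As a minimal, illustrative sketch (the names and ID values are placeholders, not recommendations), a Pod that satisfies such a policy might set:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-root-example        # hypothetical Pod name
spec:
  securityContext:
    runAsNonRoot: true          # refuse to start containers that would run as UID 0
    runAsUser: 1000             # placeholder non-root UID
    runAsGroup: 2000            # placeholder non-root GID
    supplementalGroups: [3000]  # placeholder extra group
  containers:
  - name: app
    image: registry.example/app:1.0   # placeholder image
```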
In Kubernetes, the kubelet builds the list of [`Device`](https://pkg.go.dev/k8s.io/cri-api@v0.22.1/pkg/apis/runtime/v1#Device) resources to be made available to a container
(based on inputs from the Device Plugins) and the list is included in the CreateContainer CRI message
sent to the CRI container runtime. Each `Device` contains little information: host/container device
paths and the desired devices cgroups permissions.
The [OCI Runtime Spec for Linux Container Configuration](https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md)
expects that in addition to the devices cgroup fields, more detailed information about the devices
must be provided:
```yaml
{
"type": "<string>",
"path": "<string>",
"major": <int64>,
"minor": <int64>,
"fileMode": <uint32>,
"uid": <uint32>,
"gid": <uint32>
},
```
The CRI container runtimes (containerd, CRI-O) are responsible for obtaining this information
from the host for each `Device`. By default, the runtimes copy the host device's user and group IDs:
- `uid` (uint32, OPTIONAL) - id of device owner in the container namespace.
- `gid` (uint32, OPTIONAL) - id of device group in the container namespace.
Similarly, the runtimes prepare other mandatory `config.json` sections based on the CRI fields,
including the ones defined in `securityContext`: `runAsUser`/`runAsGroup`, which become part of the POSIX
platforms user structure via:
- `uid` (int, REQUIRED) specifies the user ID in the container namespace.
- `gid` (int, REQUIRED) specifies the group ID in the container namespace.
- `additionalGids` (array of ints, OPTIONAL) specifies additional group IDs in the container namespace to be added to the process.
However, the resulting `config.json` triggers a problem when trying to run containers with
both devices added and with non-root uid/gid set via `runAsUser`/`runAsGroup`: the container user process
has no permission to use the device even when its group id (gid, copied from host) was permissive to
non-root groups. This is because the container user does not belong to that host group (e.g., via `additionalGids`).
Being able to run applications that use devices as a non-root user is normal and expected to work, so that
the security principles can be met. Therefore, several alternatives were considered to fill the gap with what the PodSec/CRI/OCI stack supports today.
## What was done to solve the issue?
You might have noticed from the problem definition that it would at least be possible to work around
the problem by manually adding the device gid(s) to `supplementalGroups`, or, in
the case of just one device, by setting `runAsGroup` to the device's group id (a sketch of this brittle approach follows the listings below). However, this is problematic because the device gid(s) may have
different values depending on the nodes' distro/version in the cluster. For example, with GPUs the following commands for different distros and versions return different gids:
Fedora 33:
```
$ ls -l /dev/dri/
total 0
drwxr-xr-x. 2 root root 80 19.10. 10:21 by-path
crw-rw----+ 1 root video 226, 0 19.10. 10:42 card0
crw-rw-rw-. 1 root render 226, 128 19.10. 10:21 renderD128
$ grep -e video -e render /etc/group
video:x:39:
render:x:997:
```
Ubuntu 20.04:
```
$ ls -l /dev/dri/
total 0
drwxr-xr-x 2 root root 80 19.10. 17:36 by-path
crw-rw---- 1 root video 226, 0 19.10. 17:36 card0
crw-rw---- 1 root render 226, 128 19.10. 17:36 renderD128
$ grep -e video -e render /etc/group
video:x:44:
render:x:133:
```
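A sketch of that brittle workaround, assuming the Ubuntu node above (gid 44 for the `video` group); the hard-coded value would already be wrong on the Fedora node:
```yaml
# Fragile: hard-codes a host-specific gid (44 = "video" on the Ubuntu node above).
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 2000
    supplementalGroups: [44]
```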
Which number to choose in your `securityContext`? Also, what if the `runAsGroup`/`runAsUser` values cannot be hard-coded because
they are automatically assigned during pod admission time via external security policies?
Unlike volumes with `fsGroup`, the devices have no official notion of `deviceGroup`/`deviceUser` that the CRI runtimes (or kubelet)
would be able to use. We considered using container annotations set by the device plugins (e.g., `io.kubernetes.cri.hostDeviceSupplementalGroup/`) to get custom OCI `config.json` uid/gid values.
This would have required changes to all existing device plugins which was not ideal.
Instead, a solution that is *seamless* to end-users without getting the device plugin vendors involved was preferred. The selected approach was
to re-use `runAsUser` and `runAsGroup` values in `config.json` for devices:
```yaml
{
"type": "c",
"path": "/dev/foo",
"major": 123,
"minor": 4,
"fileMode": 438,
"uid": <runAsUser>,
"gid": <runAsGroup>
},
```
With the `runc` OCI runtime (in non-rootless mode), the device is created (`mknod(2)`) in
the container namespace and its ownership is changed to `runAsUser`/`runAsGroup` using `chown(2)`.
{{< note >}}
[Rootless mode](/docs/tasks/administer-cluster/kubelet-in-userns/) with devices is not supported.
{{< /note >}}
Having the ownership updated in the container namespace is justified as the user process is the only one accessing the device. Only `runAsUser`/`runAsGroup`
are taken into account, and, e.g., the `USER` setting in the container is currently ignored.
While it is likely that no "faulty" deployments (i.e., non-root `securityContext` + devices) exist, to be absolutely sure no
deployments break, an opt-in config entry was added to both containerd and CRI-O to enable the new behavior. The setting
`device_ownership_from_security_context (bool)`
defaults to `false` and must be enabled to use the feature.
## See non-root containers using devices after the fix
To demonstrate the new behavior, let's use a Data Plane Development Kit (DPDK) application using hardware accelerators, Kubernetes CPU manager, and HugePages as an example. The cluster runs containerd with:
```toml
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
device_ownership_from_security_context = true
```
or CRI-O with:
```toml
[crio.runtime]
device_ownership_from_security_context = true
```
and the `Guaranteed` QoS Class Pod that runs DPDK's crypto-perf test utility with this YAML:
```yaml
...
metadata:
name: qat-dpdk
spec:
securityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
containers:
- name: crypto-perf
image: intel/crypto-perf:devel
...
resources:
requests:
cpu: "3"
memory: "128Mi"
qat.intel.com/generic: '4'
hugepages-2Mi: "128Mi"
limits:
cpu: "3"
memory: "128Mi"
qat.intel.com/generic: '4'
hugepages-2Mi: "128Mi"
...
```
To verify the results, check the user and group ID that the container runs as:
```
$ kubectl exec -it qat-dpdk -c crypto-perf -- id
```
They are set to non-zero values as expected:
```
uid=1000 gid=2000 groups=2000,3000
```
Next, check that the device nodes (`qat.intel.com/generic` exposes `/dev/vfio/` devices) are accessible to `runAsUser`/`runAsGroup`:
```
$ kubectl exec -it qat-dpdk -c crypto-perf -- ls -la /dev/vfio
total 0
drwxr-xr-x 2 root root 140 Sep 7 10:55 .
drwxr-xr-x 7 root root 380 Sep 7 10:55 ..
crw------- 1 1000 2000 241, 0 Sep 7 10:55 58
crw------- 1 1000 2000 241, 2 Sep 7 10:55 60
crw------- 1 1000 2000 241, 10 Sep 7 10:55 68
crw------- 1 1000 2000 241, 11 Sep 7 10:55 69
crw-rw-rw- 1 1000 2000 10, 196 Sep 7 10:55 vfio
```
Finally, check that the non-root container is also allowed to create HugePages:
```
$ kubectl exec -it qat-dpdk -c crypto-perf -- ls -la /dev/hugepages/
```
`fsGroup` gives a `runAsUser` writable HugePages emptyDir mountpoint:
```
total 0
drwxrwsr-x 2 root 3000 0 Sep 7 10:55 .
drwxr-xr-x 7 root root 380 Sep 7 10:55 ..
```
## Help us test it and provide feedback!
The functionality described here is expected to help with cluster security and the configurability of device permissions. To allow
non-root containers to use devices requires cluster admins to opt in to the functionality by setting
`device_ownership_from_security_context = true`. To make it a default setting, please test it and provide your feedback (via SIG-Node meetings or issues)!
The flag is available in CRI-O v1.22 release and queued for containerd v1.6.
More work is needed to get it *properly* supported. It is known to work with `runc`, but it also needs to be made to function
with other OCI runtimes, where applicable. For instance, Kata Containers supports device passthrough, which makes devices
available to containers in VM sandboxes too.
Moreover, the additional challenge comes with support of user namespaces and devices. This problem is still [open](https://github.com/kubernetes/enhancements/pull/2101)
and requires more brainstorming.
Finally, it needs to be understood whether `runAsUser`/`runAsGroup` are enough or if device-specific settings similar to `fsGroup` are needed in PodSpec/CRI v2.
## Thanks
My thanks go to Mike Brown (IBM, containerd), Peter Hunt (Red Hat, CRI-O), and Alexander Kanevskiy (Intel) for providing all the feedback and good conversations.

View File

@ -0,0 +1,59 @@
---
layout: blog
title: "Dockershim removal is coming. Are you ready?"
date: 2021-11-12
slug: are-you-ready-for-dockershim-removal
---
**Author:** Sergey Kanzhelev, Google. With reviews from Davanum Srinivas, Elana Hashman, Noah Kantrowitz, Rey Lejano.
Last year we announced that Dockershim is being deprecated: [Dockershim Deprecation FAQ](/blog/2020/12/02/dockershim-faq/).
Our current plan is to remove dockershim from the Kubernetes codebase soon.
We are looking for feedback from you on whether you are ready for dockershim
removal, and to ensure that you are ready when the time comes.
**Please fill out this survey: https://forms.gle/svCJmhvTv78jGdSx8**.
The dockershim component that enables Docker as a Kubernetes container runtime is
being deprecated in favor of runtimes that directly use the [Container Runtime Interface](/blog/2016/12/container-runtime-interface-cri-in-kubernetes/)
created for Kubernetes. Many Kubernetes users have migrated to
other container runtimes without problems. However, we see that dockershim is
still very popular. You may see some public numbers in the recent [Container Report](https://www.datadoghq.com/container-report/#8) from DataDog.
Some Kubernetes hosting vendors only recently added support for other runtimes
(especially for Windows nodes). And we know that many third-party tool vendors
are still not ready: [migrating telemetry and security agents](/docs/tasks/administer-cluster/migrating-from-dockershim/migrating-telemetry-and-security-agents/#telemetry-and-security-agent-vendors).
At this point, we believe that there is feature parity between Docker and the
other runtimes. Many end-users have used our [migration guide](/docs/tasks/administer-cluster/migrating-from-dockershim/)
and are running production workload using these different runtimes. The plan of
record today is that dockershim will be removed in version 1.24, slated for
release around April of next year. For those developing or running alpha and
beta versions, dockershim will be removed in December at the beginning of the
1.24 release development cycle.
There is only one month left to give us feedback. We want you to tell us how
ready you are.
**We are collecting opinions through this survey: [https://forms.gle/svCJmhvTv78jGdSx8](https://forms.gle/svCJmhvTv78jGdSx8)**
To better understand preparedness for the dockershim removal, our survey is
asking for the version of Kubernetes you are currently using, and an estimate of
when you think you will adopt Kubernetes 1.24. All the aggregated information
on dockershim removal readiness will be published.
Free form comments will be reviewed by SIG Node leadership. If you want to
discuss any details of migrating from dockershim, report bugs or adoption
blockers, you can use one of the SIG Node contact options any time:
https://github.com/kubernetes/community/tree/master/sig-node#contact
Kubernetes is a mature project. This deprecation is another
step in the effort to move away from permanent beta features and provide more
stability and compatibility guarantees. With the migration from dockershim you
will get more flexibility and choice of container runtime features as well as
fewer dependencies of your apps on specific underlying technology. Please take
time to review the [dockershim migration documentation](/docs/tasks/administer-cluster/migrating-from-dockershim/)
and consult your Kubernetes hosting vendor (if you have one) about the container runtime options available to you.
Read up on the [container runtime documentation with instructions on how to use containerd and CRI-O](/docs/setup/production-environment/container-runtimes/#container-runtimes)
to help prepare you for when you're ready to upgrade to 1.24. CRI-O, containerd, and
Docker with [Mirantis cri-dockerd](https://github.com/Mirantis/cri-dockerd) are
not the only container runtime options; we encourage you to explore the [CNCF landscape on container runtimes](https://landscape.cncf.io/card-mode?category=container-runtime&grouping=category)
in case another suits you better.
Thank you!

View File

@ -19,6 +19,7 @@ cid: community
<div class="community__navbar">
<a href="https://www.kubernetes.dev/">Contributor Community</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="#values">Community Values</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="#conduct">Code of conduct </a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="#videos">Videos</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

View File

@ -43,11 +43,11 @@ The controllers inside the cloud controller manager include:
### Node controller
The node controller is responsible for creating {{< glossary_tooltip text="Node" term_id="node" >}} objects
The node controller is responsible for updating {{< glossary_tooltip text="Node" term_id="node" >}} objects
when new servers are created in your cloud infrastructure. The node controller obtains information about the
hosts running inside your tenancy with the cloud provider. The node controller performs the following functions:
1. Initialize a Node object for each server that the controller discovers through the cloud provider API.
1. Update a Node object with the corresponding server's unique identifier obtained from the cloud provider API.
2. Annotating and labelling the Node object with cloud-specific information, such as the region the node
is deployed into and the resources (CPU, memory, etc) that it has available.
3. Obtain the node's hostname and network addresses.

View File

@ -72,7 +72,8 @@ The name of a Node object must be a valid
The [name](/docs/concepts/overview/working-with-objects/names#names) identifies a Node. Two Nodes
cannot have the same name at the same time. Kubernetes also assumes that a resource with the same
name is the same object. In case of a Node, it is implicitly assumed that an instance using the
same name will have the same state (e.g. network settings, root disk contents). This may lead to
same name will have the same state (e.g. network settings, root disk contents)
and attributes like node labels. This may lead to
inconsistencies if an instance was modified without changing its name. If the Node needs to be
replaced or updated significantly, the existing Node object needs to be removed from API server
first and re-added after the update.
@ -98,6 +99,21 @@ When the [Node authorization mode](/docs/reference/access-authn-authz/node/) and
[NodeRestriction admission plugin](/docs/reference/access-authn-authz/admission-controllers/#noderestriction) are enabled,
kubelets are only authorized to create/modify their own Node resource.
{{< note >}}
As mentioned in the [Node name uniqueness](#node-name-uniqueness) section,
when Node configuration needs to be updated, it is a good practice to re-register
the node with the API server. For example, if the kubelet is restarted with
a new set of `--node-labels` but the same Node name is used, the change will
not take effect, as labels are only set at Node registration.
Pods already scheduled on the Node may misbehave or cause issues if the Node
configuration is changed on kubelet restart. For example, an already running
Pod may be tainted against the new labels assigned to the Node, while other
Pods that are incompatible with that Pod will be scheduled based on the new
labels. Node re-registration ensures all Pods will be drained and properly
re-scheduled.
{{< /note >}}
### Manual Node administration
You can create and modify Node objects using

View File

@ -25,7 +25,7 @@ This page lists some of the available add-ons and links to their respective inst
* [Contrail](https://www.juniper.net/us/en/products-services/sdn/contrail/contrail-networking/), based on [Tungsten Fabric](https://tungsten.io), is an open source, multi-cloud network virtualization and policy management platform. Contrail and Tungsten Fabric are integrated with orchestration systems such as Kubernetes, OpenShift, OpenStack and Mesos, and provide isolation modes for virtual machines, containers/pods and bare metal workloads.
* [Flannel](https://github.com/flannel-io/flannel#deploying-flannel-manually) is an overlay network provider that can be used with Kubernetes.
* [Knitter](https://github.com/ZTE/Knitter/) is a plugin to support multiple network interfaces in a Kubernetes pod.
* [Multus](https://github.com/Intel-Corp/multus-cni) is a Multi plugin for multiple network support in Kubernetes to support all CNI plugins (e.g. Calico, Cilium, Contiv, Flannel), in addition to SRIOV, DPDK, OVS-DPDK and VPP based workloads in Kubernetes.
* Multus is a Multi plugin for multiple network support in Kubernetes to support all CNI plugins (e.g. Calico, Cilium, Contiv, Flannel), in addition to SRIOV, DPDK, OVS-DPDK and VPP based workloads in Kubernetes.
* [OVN-Kubernetes](https://github.com/ovn-org/ovn-kubernetes/) is a networking provider for Kubernetes based on [OVN (Open Virtual Network)](https://github.com/ovn-org/ovn/), a virtual networking implementation that came out of the Open vSwitch (OVS) project. OVN-Kubernetes provides an overlay based networking implementation for Kubernetes, including an OVS based implementation of load balancing and network policy.
* [OVN4NFV-K8S-Plugin](https://github.com/opnfv/ovn4nfv-k8s-plugin) is OVN based CNI controller plugin to provide cloud native based Service function chaining(SFC), Multiple OVN overlay networking, dynamic subnet creation, dynamic creation of virtual networks, VLAN Provider network, Direct provider network and pluggable with other Multi-network plugins, ideal for edge based cloud native workloads in Multi-cluster networking
* [NSX-T](https://docs.vmware.com/en/VMware-NSX-T/2.0/nsxt_20_ncp_kubernetes.pdf) Container Plug-in (NCP) provides integration between VMware NSX-T and container orchestrators such as Kubernetes, as well as integration between NSX-T and container-based CaaS/PaaS platforms such as Pivotal Container Service (PKS) and OpenShift.

View File

@ -246,7 +246,7 @@ Lars Kellogg-Stedman.
### Multus (a Multi Network plugin)
[Multus](https://github.com/Intel-Corp/multus-cni) is a Multi CNI plugin to support the Multi Networking feature in Kubernetes using CRD based network objects in Kubernetes.
Multus is a Multi CNI plugin to support the Multi Networking feature in Kubernetes using CRD based network objects in Kubernetes.
Multus supports all [reference plugins](https://github.com/containernetworking/plugins) (eg. [Flannel](https://github.com/containernetworking/cni.dev/blob/main/content/plugins/v0.9/meta/flannel.md), [DHCP](https://github.com/containernetworking/plugins/tree/master/plugins/ipam/dhcp), [Macvlan](https://github.com/containernetworking/plugins/tree/master/plugins/main/macvlan)) that implement the CNI specification and 3rd party plugins (eg. [Calico](https://github.com/projectcalico/cni-plugin), [Weave](https://github.com/weaveworks/weave), [Cilium](https://github.com/cilium/cilium), [Contiv](https://github.com/contiv/netplugin)). In addition to it, Multus supports [SRIOV](https://github.com/hustcat/sriov-cni), [DPDK](https://github.com/Intel-Corp/sriov-cni), [OVS-DPDK & VPP](https://github.com/intel/vhost-user-net-plugin) workloads in Kubernetes with both cloud native and NFV based applications in Kubernetes.

View File

@ -99,9 +99,10 @@ resource requests/limits of that type for each Container in the Pod.
Limits and requests for CPU resources are measured in *cpu* units.
One cpu, in Kubernetes, is equivalent to **1 vCPU/Core** for cloud providers and **1 hyperthread** on bare-metal Intel processors.
Fractional requests are allowed. A Container with
`spec.containers[].resources.requests.cpu` of `0.5` is guaranteed half as much
CPU as one that asks for 1 CPU. The expression `0.1` is equivalent to the
Fractional requests are allowed. When you define a container with
`spec.containers[].resources.requests.cpu` set to `0.5`, you are requesting half
as much CPU time compared to if you asked for `1.0` CPU.
For CPU resource units, the expression `0.1` is equivalent to the
expression `100m`, which can be read as "one hundred millicpu". Some people say
"one hundred millicores", and this is understood to mean the same thing. A
request with a decimal point, like `0.1`, is converted to `100m` by the API, and
@ -236,7 +237,7 @@ The kubelet also uses this kind of storage to hold
container images, and the writable layers of running containers.
{{< caution >}}
If a node fails, the data in its ephemeral storage can be lost.
If a node fails, the data in its ephemeral storage can be lost.
Your applications cannot expect any performance SLAs (disk IOPS for example)
from local ephemeral storage.
{{< /caution >}}
@ -440,7 +441,7 @@ Kubernetes does not use them.
Quotas are faster and more accurate than directory scanning. When a
directory is assigned to a project, all files created under a
directory are created in that project, and the kernel merely has to
keep track of how many blocks are in use by files in that project.
keep track of how many blocks are in use by files in that project.
If a file is created and deleted, but has an open file descriptor,
it continues to consume space. Quota tracking records that space accurately
whereas directory scans overlook the storage used by deleted files.

View File

@ -55,7 +55,7 @@ DNS server watches the Kubernetes API for new `Services` and creates a set of DN
If you only need access to the port for debugging purposes, you can use the [apiserver proxy](/docs/tasks/access-application-cluster/access-cluster/#manually-constructing-apiserver-proxy-urls) or [`kubectl port-forward`](/docs/tasks/access-application-cluster/port-forward-access-application-cluster/).
If you explicitly need to expose a Pod's port on the node, consider using a [NodePort](/docs/concepts/services-networking/service/#nodeport) Service before resorting to `hostPort`.
If you explicitly need to expose a Pod's port on the node, consider using a [NodePort](/docs/concepts/services-networking/service/#type-nodeport) Service before resorting to `hostPort`.
- Avoid using `hostNetwork`, for the same reasons as `hostPort`.

View File

@ -244,7 +244,7 @@ on the fly:
The `kubernetes.io/basic-auth` type is provided for storing credentials needed
for basic authentication. When using this Secret type, the `data` field of the
Secret must contain the following two keys:
Secret must contain one of the following two keys:
- `username`: the user name for authentication;
- `password`: the password or token for authentication.
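For example, a minimal manifest of this Secret type could look like the following (the credential values are placeholders):
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-basic-auth
type: kubernetes.io/basic-auth
stringData:
  username: admin      # placeholder user name
  password: t0p-Secret # placeholder password
```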

View File

@ -59,7 +59,7 @@ Resources consumed by the command are counted against the Container.
### Hook handler execution
When a Container lifecycle management hook is called,
the Kubernetes management system execute the handler according to the hook action,
the Kubernetes management system executes the handler according to the hook action,
`httpGet` and `tcpSocket` are executed by the kubelet process, and `exec` is executed in the container.
Hook handler calls are synchronous within the context of the Pod containing the Container.
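For instance, here is a sketch of a Pod with a `postStart` handler run via `exec` inside the container and a `preStop` handler run via `httpGet` by the kubelet (names, image, and command are illustrative):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo              # hypothetical Pod name
spec:
  containers:
  - name: app
    image: registry.example/app:1.0 # placeholder image
    lifecycle:
      postStart:
        exec:                       # runs inside the container
          command: ["/bin/sh", "-c", "echo started > /tmp/started"]
      preStop:
        httpGet:                    # executed by the kubelet process
          path: /shutdown
          port: 8080
```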

View File

@ -265,6 +265,73 @@ template needs to include the `.docker/config.json` or mount a drive that contai
All pods will have read access to images in any private registry once private
registry keys are added to the `.docker/config.json`.
### Interpretation of config.json {#config-json}
The interpretation of `config.json` varies between the original Docker
implementation and the Kubernetes interpretation. In Docker, the `auths` keys
can only specify root URLs, whereas Kubernetes allows glob URLs as well as
prefix-matched paths. This means that a `config.json` like this is valid:
```json
{
"auths": {
"*my-registry.io/images": {
"auth": "…"
}
}
}
```
The root URL (`*my-registry.io`) is matched by using the following syntax:
```
pattern:
{ term }
term:
'*' matches any sequence of non-Separator characters
'?' matches any single non-Separator character
'[' [ '^' ] { character-range } ']'
character class (must be non-empty)
c matches character c (c != '*', '?', '\\', '[')
'\\' c matches character c
character-range:
c matches character c (c != '\\', '-', ']')
'\\' c matches character c
lo '-' hi matches character c for lo <= c <= hi
```
Image pull operations would now pass the credentials to the CRI container
runtime for every valid pattern. For example, the following container image names
would match successfully:
- `my-registry.io/images`
- `my-registry.io/images/my-image`
- `my-registry.io/images/another-image`
- `sub.my-registry.io/images/my-image`
- `a.sub.my-registry.io/images/my-image`
The kubelet performs image pulls sequentially for every found credential. This
means that multiple entries in `config.json` are possible, too:
```json
{
"auths": {
"my-registry.io/images": {
"auth": "…"
},
"my-registry.io/images/subpath": {
"auth": "…"
}
}
}
```
If a container now specifies an image `my-registry.io/images/subpath/my-image`
to be pulled, then the kubelet will try to download the image using both
authentication sources if one of them fails.
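As a hypothetical sketch, a Pod pulling that image while referencing registry credentials (the Secret name is an assumption) could look like:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: private-image-example     # hypothetical Pod name
spec:
  containers:
  - name: app
    image: my-registry.io/images/subpath/my-image
  imagePullSecrets:
  - name: my-registry-creds       # assumed kubernetes.io/dockerconfigjson Secret
```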
### Pre-pulled images
{{< note >}}
@ -390,3 +457,4 @@ Kubelet will merge any `imagePullSecrets` into a single virtual `.docker/config.
* Read the [OCI Image Manifest Specification](https://github.com/opencontainers/image-spec/blob/master/manifest.md).
* Learn about [container image garbage collection](/docs/concepts/architecture/garbage-collection/#container-image-garbage-collection).
* Learn more about [pulling an Image from a Private Registry](/docs/tasks/configure-pod-container/pull-image-private-registry).

View File

@ -52,9 +52,9 @@ Flags and configuration files may not always be changeable in a hosted Kubernete
Extensions are software components that extend and deeply integrate with Kubernetes.
They adapt it to support new types and new kinds of hardware.
Most cluster administrators will use a hosted or distribution
instance of Kubernetes. As a result, most Kubernetes users will not need to
install extensions and fewer will need to author new ones.
Many cluster administrators use a hosted or distribution instance of Kubernetes.
These clusters come with extensions pre-installed. As a result, most Kubernetes
users will not need to install extensions and even fewer users will need to author new ones.
## Extension Patterns

View File

@ -148,8 +148,8 @@ and use a controller to handle events.
Usually, each resource in the Kubernetes API requires code that handles REST requests and manages persistent storage of objects. The main Kubernetes API server handles built-in resources like *pods* and *services*, and can also generically handle custom resources through [CRDs](#customresourcedefinitions).
The [aggregation layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) allows you to provide specialized
implementations for your custom resources by writing and deploying your own standalone API server.
The main API server delegates requests to you for the custom resources that you handle,
implementations for your custom resources by writing and deploying your own API server.
The main API server delegates requests to your API server for the custom resources that you handle,
making them available to all of its clients.
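As an illustrative sketch only (the group, version, and Service names are assumptions), registering such an add-on API server with the aggregation layer uses an APIService object:
```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1alpha1.example.my-group.io
spec:
  group: example.my-group.io        # assumed API group served by the add-on server
  version: v1alpha1
  service:
    name: my-api-server             # assumed Service in front of the add-on API server
    namespace: my-namespace
  insecureSkipTLSVerify: true       # sketch only; supply caBundle in real deployments
  groupPriorityMinimum: 1000
  versionPriority: 15
```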
## Choosing a method for adding custom resources

View File

@ -31,9 +31,7 @@ built-in automation from the core of Kubernetes. You can use Kubernetes
to automate deploying and running workloads, *and* you can automate how
Kubernetes does that.
Kubernetes' {{< glossary_tooltip text="controllers" term_id="controller" >}}
concept lets you extend the cluster's behaviour without modifying the code
of Kubernetes itself.
Kubernetes' {{< glossary_tooltip text="operator pattern" term_id="operator-pattern" >}} concept lets you extend the cluster's behaviour without modifying the code of Kubernetes itself by linking {{< glossary_tooltip text="controllers" term_id="controller" >}} to one or more custom resources.
Operators are clients of the Kubernetes API that act as controllers for
a [Custom Resource](/docs/concepts/extend-kubernetes/api-extension/custom-resources/).
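For example, an operator might watch a hypothetical custom resource like the one below and reconcile the cluster toward its declared state:
```yaml
apiVersion: example.my-group.io/v1alpha1  # hypothetical API group defined by a CRD
kind: Database
metadata:
  name: my-database
spec:
  replicas: 3        # the operator's controller reconciles toward this state
  version: "14"
```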

View File

@ -74,10 +74,10 @@ Suggestions for securing your infrastructure in a Kubernetes cluster:
Area of Concern for Kubernetes Infrastructure | Recommendation |
--------------------------------------------- | -------------- |
Network access to API Server (Control plane) | All access to the Kubernetes control plane is not allowed publicly on the internet and is controlled by network access control lists restricted to the set of IP addresses needed to administer the cluster.|
Network access to Nodes (nodes) | Nodes should be configured to _only_ accept connections (via network access control lists)from the control plane on the specified ports, and accept connections for services in Kubernetes of type NodePort and LoadBalancer. If possible, these nodes should not be exposed on the public internet entirely.
Network access to Nodes (nodes) | Nodes should be configured to _only_ accept connections (via network access control lists) from the control plane on the specified ports, and accept connections for services in Kubernetes of type NodePort and LoadBalancer. If possible, these nodes should not be exposed on the public internet entirely.
Kubernetes access to Cloud Provider API | Each cloud provider needs to grant a different set of permissions to the Kubernetes control plane and nodes. It is best to provide the cluster with cloud provider access that follows the [principle of least privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege) for the resources it needs to administer. The [Kops documentation](https://github.com/kubernetes/kops/blob/master/docs/iam_roles.md#iam-roles) provides information about IAM policies and roles.
Access to etcd | Access to etcd (the datastore of Kubernetes) should be limited to the control plane only. Depending on your configuration, you should attempt to use etcd over TLS. More information can be found in the [etcd documentation](https://github.com/etcd-io/etcd/tree/master/Documentation).
etcd Encryption | Wherever possible it's a good practice to encrypt all drives at rest, but since etcd holds the state of the entire cluster (including Secrets) its disk should especially be encrypted at rest.
etcd Encryption | Wherever possible it's a good practice to encrypt all storage at rest, and since etcd holds the state of the entire cluster (including Secrets) its disk should especially be encrypted at rest.
{{< /table >}}
@ -99,7 +99,7 @@ good information practices, read and follow the advice about
Depending on the attack surface of your application, you may want to focus on specific
aspects of security. For example: If you are running a service (Service A) that is critical
in a chain of other resources and a separate workload (Service B) which is
vulnerable to a resource exhaustion attack then the risk of compromising Service A
vulnerable to a resource exhaustion attack, then the risk of compromising Service A
is high if you do not limit the resources of Service B. The following table lists
areas of security concerns and recommendations for securing workloads running in Kubernetes:
@ -108,10 +108,10 @@ Area of Concern for Workload Security | Recommendation |
RBAC Authorization (Access to the Kubernetes API) | https://kubernetes.io/docs/reference/access-authn-authz/rbac/
Authentication | https://kubernetes.io/docs/concepts/security/controlling-access/
Application secrets management (and encrypting them in etcd at rest) | https://kubernetes.io/docs/concepts/configuration/secret/ <br> https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/
Pod Security Policies | https://kubernetes.io/docs/concepts/policy/pod-security-policy/
Ensuring that pods meet defined Pod Security Standards | https://kubernetes.io/docs/concepts/security/pod-security-standards/#policy-instantiation
Quality of Service (and Cluster resource management) | https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
Network Policies | https://kubernetes.io/docs/concepts/services-networking/network-policies/
TLS For Kubernetes Ingress | https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
TLS for Kubernetes Ingress | https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
## Container
@ -137,7 +137,7 @@ are recommendations to protect application code:
Area of Concern for Code | Recommendation |
-------------------------| -------------- |
Access over TLS only | If your code needs to communicate by TCP, perform a TLS handshake with the client ahead of time. With the exception of a few cases, encrypt everything in transit. Going one step further, it's a good idea to encrypt network traffic between services. This can be done through a process known as mutual or [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication) which performs a two sided verification of communication between two certificate holding services. |
Access over TLS only | If your code needs to communicate by TCP, perform a TLS handshake with the client ahead of time. With the exception of a few cases, encrypt everything in transit. Going one step further, it's a good idea to encrypt network traffic between services. This can be done through a process known as mutual TLS authentication or [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication) which performs a two sided verification of communication between two certificate holding services. |
Limiting port ranges of communication | This recommendation may be a bit self-explanatory, but wherever possible you should only expose the ports on your service that are absolutely essential for communication or metric gathering. |
3rd Party Dependency Security | It is a good practice to regularly scan your application's third party libraries for known security vulnerabilities. Each programming language has a tool for performing this check automatically. |
Static Code Analysis | Most languages provide a way for a snippet of code to be analyzed for any potentially unsafe coding practices. Whenever possible you should perform checks using automated tooling that can scan codebases for common security errors. Some of the tools can be found at: https://owasp.org/www-community/Source_Code_Analysis_Tools |

View File

@ -484,7 +484,7 @@ of individual policies are not defined here.
- {{< example file="security/podsecurity-baseline.yaml" >}}Baseline namespace{{< /example >}}
- {{< example file="security/podsecurity-restricted.yaml" >}}Restricted namespace{{< /example >}}
[**PodSecurityPolicy**](/docs/concepts/profile/pod-security-profile/) (Deprecated)
[**PodSecurityPolicy**](/docs/concepts/policy/pod-security-policy/) (Deprecated)
- {{< example file="policy/privileged-psp.yaml" >}}Privileged{{< /example >}}
- {{< example file="policy/baseline-psp.yaml" >}}Baseline{{< /example >}}

View File

@ -54,7 +54,7 @@ kubectl get pods -l run=my-nginx -o yaml | grep podIP
You should be able to ssh into any node in your cluster and curl both IPs. Note that the containers are *not* using port 80 on the node, nor are there any special NAT rules to route traffic to the pod. This means you can run multiple nginx pods on the same node all using the same containerPort and access them from any other pod or node in your cluster using IP. Like Docker, ports can still be published to the host node's interfaces, but the need for this is radically diminished because of the networking model.
You can read more about [how we achieve this](/docs/concepts/cluster-administration/networking/#how-to-achieve-this) if you're curious.
You can read more about the [Kubernetes Networking Model](/docs/concepts/cluster-administration/networking/#the-kubernetes-network-model) if you're curious.
## Creating a Service

View File

@ -28,6 +28,7 @@ Kubernetes as a project supports and maintains [AWS](https://github.com/kubernet
controller.
* [Apache APISIX ingress controller](https://github.com/apache/apisix-ingress-controller) is an [Apache APISIX](https://github.com/apache/apisix)-based ingress controller.
* [Avi Kubernetes Operator](https://github.com/vmware/load-balancer-and-ingress-services-for-kubernetes) provides L4-L7 load-balancing using [VMware NSX Advanced Load Balancer](https://avinetworks.com/).
* [BFE Ingress Controller](https://github.com/bfenetworks/ingress-bfe) is a [BFE](https://www.bfe-networks.net)-based ingress controller.
* The [Citrix ingress controller](https://github.com/citrix/citrix-k8s-ingress-controller#readme) works with
Citrix Application Delivery Controller.
* [Contour](https://projectcontour.io/) is an [Envoy](https://www.envoyproxy.io/) based ingress controller.

View File

@ -51,7 +51,7 @@ graph LR;
An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name-based virtual hosting. An [Ingress controller](/docs/concepts/services-networking/ingress-controllers) is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically
uses a service of type [Service.Type=NodePort](/docs/concepts/services-networking/service/#nodeport) or
uses a service of type [Service.Type=NodePort](/docs/concepts/services-networking/service/#type-nodeport) or
[Service.Type=LoadBalancer](/docs/concepts/services-networking/service/#loadbalancer).
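As a rough sketch, a minimal Ingress that routes HTTP traffic for one host to a single Service could look like the following; the host, Service name, and port are illustrative. An Ingress controller must be running in the cluster for such an object to have any effect.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
spec:
  rules:
    - host: example.local          # illustrative hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web          # existing Service to route traffic to
                port:
                  number: 80
```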
## Prerequisites

View File

@ -568,7 +568,7 @@ The default is `ClusterIP`.
You can also use [Ingress](/docs/concepts/services-networking/ingress/) to expose your Service. Ingress is not a Service type, but it acts as the entry point for your cluster. It lets you consolidate your routing rules
into a single resource as it can expose multiple services under the same IP address.
### Type NodePort {#nodeport}
### Type NodePort {#type-nodeport}
If you set the `type` field to `NodePort`, the Kubernetes control plane
allocates a port from a range specified by the `--service-node-port-range` flag (default: 30000-32767).
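A minimal sketch of such a Service follows; the selector and ports are illustrative. If you omit `nodePort`, the control plane picks a free port from the configured range for you.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nodeport-service
spec:
  type: NodePort
  selector:
    app: my-app          # Pods backing this Service
  ports:
    - port: 80           # port exposed on the cluster IP
      targetPort: 8080   # port the container listens on
      nodePort: 30080    # optional; must fall within --service-node-port-range
```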

View File

@ -0,0 +1,123 @@
---
reviewers:
- sftim
- marosset
- jsturtevant
- zshihang
title: Projected Volumes
content_type: concept
---
<!-- overview -->
This document describes the current state of _projected volumes_ in Kubernetes. Familiarity with [volumes](/docs/concepts/storage/volumes/) is suggested.
<!-- body -->
## Introduction
A `projected` volume maps several existing volume sources into the same directory.
Currently, the following types of volume sources can be projected:
* [`secret`](/docs/concepts/storage/volumes/#secret)
* [`downwardAPI`](/docs/concepts/storage/volumes/#downwardapi)
* [`configMap`](/docs/concepts/storage/volumes/#configmap)
* `serviceAccountToken`
All sources are required to be in the same namespace as the Pod. For more details,
see the [all-in-one volume design document](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/all-in-one-volume.md).
### Example configuration with a secret, a downwardAPI, and a configMap {#example-configuration-secret-downwardapi-configmap}
{{< codenew file="pods/storage/projected-secret-downwardapi-configmap.yaml" >}}
### Example configuration: secrets with a non-default permission mode set {#example-configuration-secrets-nondefault-permission-mode}
{{< codenew file="pods/storage/projected-secrets-nondefault-permission-mode.yaml" >}}
Each projected volume source is listed in the spec under `sources`. The
parameters are nearly the same with two exceptions:
* For secrets, the `secretName` field has been changed to `name` to be consistent
with ConfigMap naming.
* The `defaultMode` can only be specified at the projected level and not for each
volume source. However, as illustrated above, you can explicitly set the `mode`
for each individual projection.
When the `TokenRequestProjection` feature is enabled, you can inject the token
for the current [service account](/docs/reference/access-authn-authz/authentication/#service-account-tokens)
into a Pod at a specified path. For example:
{{< codenew file="pods/storage/projected-service-account-token.yaml" >}}
The example Pod has a projected volume containing the injected service account
token. This token can be used by a Pod's containers to access the Kubernetes API
server. The `audience` field contains the intended audience of the
token. A recipient of the token must identify itself with an identifier specified
in the audience of the token, and otherwise should reject the token. This field
is optional and it defaults to the identifier of the API server.
The `expirationSeconds` is the expected duration of validity of the service account
token. It defaults to 1 hour and must be at least 10 minutes (600 seconds). An administrator
can also limit its maximum value by specifying the `--service-account-max-token-expiration`
option for the API server. The `path` field specifies a relative path to the mount point
of the projected volume.
{{< note >}}
A container using a projected volume source as a [`subPath`](/docs/concepts/storage/volumes/#using-subpath)
volume mount will not receive updates for those volume sources.
{{< /note >}}
## SecurityContext interactions
The [proposal for file permission handling in projected service account volume](https://github.com/kubernetes/enhancements/pull/1598)
enhancement introduced setting the correct owner permissions on the projected
files.
### Linux
In Linux pods that have a projected volume and `RunAsUser` set in the Pod
[`SecurityContext`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context),
the projected files have the correct ownership set including container user
ownership.
### Windows
In Windows pods that have a projected volume and `RunAsUsername` set in the
Pod `SecurityContext`, the ownership is not enforced due to the way user
accounts are managed in Windows. Windows stores and manages local user and group
accounts in a database file called Security Account Manager (SAM). Each
container maintains its own instance of the SAM database, which the host has
no visibility into while the container is running. Windows containers are
designed to run the user mode portion of the OS in isolation from the host,
hence the maintenance of a virtual SAM database. As a result, the kubelet running
on the host does not have the ability to dynamically configure host file
ownership for virtualized container accounts. It is recommended that if files on
the host machine are to be shared with the container then they should be placed
into their own volume mount outside of `C:\`.
By default, the projected files will have the following ownership as shown for
an example projected volume file:
```powershell
Path : Microsoft.PowerShell.Core\FileSystem::C:\var\run\secrets\kubernetes.io\serviceaccount\..2021_08_31_22_22_18.318230061\ca.crt
Owner : BUILTIN\Administrators
Group : NT AUTHORITY\SYSTEM
Access : NT AUTHORITY\SYSTEM Allow FullControl
BUILTIN\Administrators Allow FullControl
BUILTIN\Users Allow ReadAndExecute, Synchronize
Audit :
Sddl : O:BAG:SYD:AI(A;ID;FA;;;SY)(A;ID;FA;;;BA)(A;ID;0x1200a9;;;BU)
```
This implies all administrator users like `ContainerAdministrator` will have
read, write and execute access, while non-administrator users will have read and
execute access.
{{< note >}}
In general, granting the container access to the host is discouraged as it can
open the door for potential security exploits.
Creating a Windows Pod with `RunAsUser` in its `SecurityContext` will result in
the Pod being stuck at `ContainerCreating` forever. It is therefore advised not to use
the Linux-only `RunAsUser` option with Windows Pods.
{{< /note >}}

View File

@ -33,8 +33,8 @@ drivers, but the functionality is somewhat limited.
Kubernetes supports many types of volumes. A {{< glossary_tooltip term_id="pod" text="Pod" >}}
can use any number of volume types simultaneously.
Ephemeral volume types have a lifetime of a pod, but persistent volumes exist beyond
the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes;
however, Kubernetes does not destroy persistent volumes.
the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes;
however, Kubernetes does not destroy persistent volumes.
For any kind of volume in a given pod, data is preserved across container restarts.
At its core, a volume is a directory, possibly with some data in it, which
@ -44,12 +44,21 @@ volume type used.
To use a volume, specify the volumes to provide for the Pod in `.spec.volumes`
and declare where to mount those volumes into containers in `.spec.containers[*].volumeMounts`.
A process in a container sees a filesystem view composed from their Docker
image and volumes. The [Docker image](https://docs.docker.com/userguide/dockerimages/)
is at the root of the filesystem hierarchy. Volumes mount at the specified paths within
the image. Volumes can not mount onto other volumes or have hard links to
other volumes. Each Container in the Pod's configuration must independently specify where to
mount each volume.
A process in a container sees a filesystem view composed from the initial contents of
the {{< glossary_tooltip text="container image" term_id="image" >}}, plus volumes
(if defined) mounted inside the container.
The process sees a root filesystem that initially matches the contents of the container
image.
Any writes within that filesystem hierarchy, if allowed, affect what that process views
when it performs a subsequent filesystem access.
Volumes mount at the [specified paths](#using-subpath) within
the image.
For each container defined within a Pod, you must independently specify where
to mount each volume that the container uses.
Volumes cannot mount within other volumes (but see [Using subPath](#using-subpath)
for a related mechanism). Also, a volume cannot contain a hard link to anything in
a different volume.
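A minimal sketch of this wiring, using an `emptyDir` volume (the names and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-example
spec:
  volumes:                     # declare the volume once, at the Pod level
    - name: cache
      emptyDir: {}
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:            # each container mounts the volume explicitly
        - name: cache
          mountPath: /cache
```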
## Types of Volumes {#volume-types}
@ -217,7 +226,7 @@ It redirects all plugin operations from the existing in-tree plugin to the
`cinder.csi.openstack.org` Container Storage Interface (CSI) Driver.
[OpenStack Cinder CSI Driver](https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md)
must be installed on the cluster.
You can disable Cinder CSI migration for your cluster by setting the `CSIMigrationOpenStack`
You can disable Cinder CSI migration for your cluster by setting the `CSIMigrationOpenStack`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to `false`.
If you disable the `CSIMigrationOpenStack` feature, the in-tree Cinder volume plugin takes responsibility
for all aspects of Cinder volume storage management.
@ -801,143 +810,8 @@ For more details, see the [Portworx volume](https://github.com/kubernetes/exampl
### projected
A `projected` volume maps several existing volume sources into the same directory.
Currently, the following types of volume sources can be projected:
* [`secret`](#secret)
* [`downwardAPI`](#downwardapi)
* [`configMap`](#configmap)
* `serviceAccountToken`
All sources are required to be in the same namespace as the Pod. For more details,
see the [all-in-one volume design document](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/all-in-one-volume.md).
#### Example configuration with a secret, a downwardAPI, and a configMap {#example-configuration-secret-downwardapi-configmap}
```yaml
apiVersion: v1
kind: Pod
metadata:
name: volume-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: all-in-one
mountPath: "/projected-volume"
readOnly: true
volumes:
- name: all-in-one
projected:
sources:
- secret:
name: mysecret
items:
- key: username
path: my-group/my-username
- downwardAPI:
items:
- path: "labels"
fieldRef:
fieldPath: metadata.labels
- path: "cpu_limit"
resourceFieldRef:
containerName: container-test
resource: limits.cpu
- configMap:
name: myconfigmap
items:
- key: config
path: my-group/my-config
```
#### Example configuration: secrets with a non-default permission mode set {#example-configuration-secrets-nondefault-permission-mode}
```yaml
apiVersion: v1
kind: Pod
metadata:
name: volume-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: all-in-one
mountPath: "/projected-volume"
readOnly: true
volumes:
- name: all-in-one
projected:
sources:
- secret:
name: mysecret
items:
- key: username
path: my-group/my-username
- secret:
name: mysecret2
items:
- key: password
path: my-group/my-password
mode: 511
```
Each projected volume source is listed in the spec under `sources`. The
parameters are nearly the same with two exceptions:
* For secrets, the `secretName` field has been changed to `name` to be consistent
with ConfigMap naming.
* The `defaultMode` can only be specified at the projected level and not for each
volume source. However, as illustrated above, you can explicitly set the `mode`
for each individual projection.
When the `TokenRequestProjection` feature is enabled, you can inject the token
for the current [service account](/docs/reference/access-authn-authz/authentication/#service-account-tokens)
into a Pod at a specified path. For example:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: sa-token-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: token-vol
mountPath: "/service-account"
readOnly: true
volumes:
- name: token-vol
projected:
sources:
- serviceAccountToken:
audience: api
expirationSeconds: 3600
path: token
```
The example Pod has a projected volume containing the injected service account
token. This token can be used by a Pod's containers to access the Kubernetes API
server. The `audience` field contains the intended audience of the
token. A recipient of the token must identify itself with an identifier specified
in the audience of the token, and otherwise should reject the token. This field
is optional and it defaults to the identifier of the API server.
The `expirationSeconds` is the expected duration of validity of the service account
token. It defaults to 1 hour and must be at least 10 minutes (600 seconds). An administrator
can also limit its maximum value by specifying the `--service-account-max-token-expiration`
option for the API server. The `path` field specifies a relative path to the mount point
of the projected volume.
{{< note >}}
A container using a projected volume source as a [`subPath`](#using-subpath) volume mount will not
receive updates for those volume sources.
{{< /note >}}
A projected volume maps several existing volume sources into the same
directory. For more details, see [projected volumes](/docs/concepts/storage/projected-volumes/)
### quobyte (deprecated) {#quobyte}

View File

@ -17,8 +17,6 @@ A _CronJob_ creates {{< glossary_tooltip term_id="job" text="Jobs" >}} on a repe
One CronJob object is like one line of a _crontab_ (cron table) file. It runs a job periodically
on a given schedule, written in [Cron](https://en.wikipedia.org/wiki/Cron) format.
In addition, the CronJob schedule supports timezone handling, you can specify the timezone by adding "CRON_TZ=<time zone>" at the beginning of the CronJob schedule, and it is recommended to always set `CRON_TZ`.
{{< caution >}}
All **CronJob** `schedule:` times are based on the timezone of the
{{< glossary_tooltip term_id="kube-controller-manager" text="kube-controller-manager" >}}.
@ -28,6 +26,16 @@ containers, the timezone set for the kube-controller-manager container determine
that the cron job controller uses.
{{< /caution >}}
{{< caution >}}
The [v1 CronJob API](/docs/reference/kubernetes-api/workload-resources/cron-job-v1/)
does not officially support setting timezone as explained above.
Setting variables such as `CRON_TZ` or `TZ` is not officially supported by the Kubernetes project.
They are an implementation detail of the internal library used for parsing the schedule
and calculating the next Job creation time. Using them in a production cluster is
not recommended.
{{< /caution >}}
When creating the manifest for a CronJob resource, make sure the name you provide
is a valid [DNS subdomain name](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names).
The name must be no longer than 52 characters. This is because the CronJob controller will automatically
@ -55,16 +63,15 @@ takes you through this example in more detail).
### Cron schedule syntax
```
# ┌────────────────── timezone (optional)
# | ┌───────────── minute (0 - 59)
# | │ ┌───────────── hour (0 - 23)
# | │ │ ┌───────────── day of the month (1 - 31)
# | │ │ │ ┌───────────── month (1 - 12)
# | │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# | │ │ │ │ │ 7 is also Sunday on some systems)
# | │ │ │ │ │
# | │ │ │ │ │
# CRON_TZ=UTC * * * * *
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │ 7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * *
```
@ -78,9 +85,9 @@ takes you through this example in more detail).
For example, the line below states that the task must be started every Friday at midnight, as well as on the 13th of each month at midnight(in UTC):
For example, the line below states that the task must be started every Friday at midnight, as well as on the 13th of each month at midnight:
`CRON_TZ=UTC 0 0 13 * 5`
`0 0 13 * 5`
To generate CronJob schedule expressions, you can also use web tools like [crontab.guru](https://crontab.guru/).
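As a sketch, the schedule from the example above sits in the `schedule` field of a CronJob manifest; the name, image, and command here are illustrative.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "0 0 13 * 5"       # Fridays at midnight, and the 13th of each month at midnight
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: example
              image: busybox
              command: ["date"]
          restartPolicy: OnFailure
```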

View File

@ -703,6 +703,7 @@ You can pause a Deployment before triggering one or more updates and then resume
apply multiple fixes in between pausing and resuming without triggering unnecessary rollouts.
* For example, with a Deployment that was created:
Get the Deployment details:
```shell
kubectl get deploy
@ -753,7 +754,7 @@ apply multiple fixes in between pausing and resuming without triggering unnecess
REVISION CHANGE-CAUSE
1 <none>
```
* Get the rollout status to ensure that the Deployment is updated successfully:
* Get the rollout status to verify that the existing ReplicaSet has not changed:
```shell
kubectl get rs
```

View File

@ -282,7 +282,7 @@ define readiness distinct from completion. This is enforced during validation.
Use `activeDeadlineSeconds` on the Pod to prevent init containers from failing forever.
The active deadline includes init containers.
However it is recommended to use `activeDeadlineSeconds` if user deploy their application
However, it is recommended to use `activeDeadlineSeconds` only if teams deploy their application
as a Job, because `activeDeadlineSeconds` has an effect even after the init containers finish.
A Pod that is already running correctly would be killed by `activeDeadlineSeconds` once that deadline is reached.
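As a sketch of that recommendation, here is a Job whose Pod template sets `activeDeadlineSeconds`; the dependency check, image, and deadline are illustrative. Note that the deadline keeps counting after the init container finishes, so it also bounds how long the main container may run.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: app-with-init
spec:
  template:
    spec:
      activeDeadlineSeconds: 600   # applies to the whole Pod, init containers included
      initContainers:
        - name: wait-for-dependency
          image: busybox
          command: ["sh", "-c", "until nslookup my-service; do sleep 2; done"]
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "echo finished"]
      restartPolicy: Never
```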

View File

@ -236,41 +236,41 @@ The scheduler will skip the non-matching nodes from the skew calculations if the
Suppose you have a 5-node cluster ranging from zoneA to zoneC:
{{<mermaid>}}
graph BT
subgraph "zoneB"
p3(Pod) --> n3(Node3)
n4(Node4)
end
subgraph "zoneA"
p1(Pod) --> n1(Node1)
p2(Pod) --> n2(Node2)
end
{{<mermaid>}}
graph BT
subgraph "zoneB"
p3(Pod) --> n3(Node3)
n4(Node4)
end
subgraph "zoneA"
p1(Pod) --> n1(Node1)
p2(Pod) --> n2(Node2)
end
classDef plain fill:#ddd,stroke:#fff,stroke-width:4px,color:#000;
classDef k8s fill:#326ce5,stroke:#fff,stroke-width:4px,color:#fff;
classDef cluster fill:#fff,stroke:#bbb,stroke-width:2px,color:#326ce5;
class n1,n2,n3,n4,p1,p2,p3 k8s;
class p4 plain;
class zoneA,zoneB cluster;
{{< /mermaid >}}
classDef plain fill:#ddd,stroke:#fff,stroke-width:4px,color:#000;
classDef k8s fill:#326ce5,stroke:#fff,stroke-width:4px,color:#fff;
classDef cluster fill:#fff,stroke:#bbb,stroke-width:2px,color:#326ce5;
class n1,n2,n3,n4,p1,p2,p3 k8s;
class p4 plain;
class zoneA,zoneB cluster;
{{< /mermaid >}}
{{<mermaid>}}
graph BT
subgraph "zoneC"
n5(Node5)
end
{{<mermaid>}}
graph BT
subgraph "zoneC"
n5(Node5)
end
classDef plain fill:#ddd,stroke:#fff,stroke-width:4px,color:#000;
classDef k8s fill:#326ce5,stroke:#fff,stroke-width:4px,color:#fff;
classDef cluster fill:#fff,stroke:#bbb,stroke-width:2px,color:#326ce5;
class n5 k8s;
class zoneC cluster;
{{< /mermaid >}}
classDef plain fill:#ddd,stroke:#fff,stroke-width:4px,color:#000;
classDef k8s fill:#326ce5,stroke:#fff,stroke-width:4px,color:#fff;
classDef cluster fill:#fff,stroke:#bbb,stroke-width:2px,color:#326ce5;
class n5 k8s;
class zoneC cluster;
{{< /mermaid >}}
and you know that "zoneC" must be excluded. In this case, you can compose a manifest as below, so that "mypod" will be placed in "zoneB" instead of "zoneC". Similarly, `spec.nodeSelector` is also respected.
{{< codenew file="pods/topology-spread-constraints/one-constraint-with-nodeaffinity.yaml" >}}
{{< codenew file="pods/topology-spread-constraints/one-constraint-with-nodeaffinity.yaml" >}}
The scheduler doesn't have prior knowledge of all the zones or other topology domains that a cluster has; they are determined from the existing nodes in the cluster. This can cause a problem in autoscaled clusters: when a node pool (or node group) is scaled down to zero nodes and the user expects the cluster to scale up, those topology domains won't be considered until there is at least one node in them.

View File

@ -180,6 +180,7 @@ SIG Docs communicates with different methods:
## Other ways to contribute
- Visit the [Kubernetes community site](/community/). Participate on Twitter or Stack Overflow, learn about local Kubernetes meetups and events, and more.
- Read the [contributor cheatsheet](https://github.com/kubernetes/community/tree/master/contributors/guide/contributor-cheatsheet) to get involved with Kubernetes feature development.
- Read the [contributor cheatsheet](https://www.kubernetes.dev/docs/contributor-cheatsheet/) to get involved with Kubernetes feature development.
- Visit the contributor site to learn more about [Kubernetes Contributors](https://www.kubernetes.dev/) and [additional contributor resources](https://www.kubernetes.dev/resources/).
- Submit a [blog post or case study](/docs/contribute/new-content/blogs-case-studies/).

View File

@ -56,12 +56,14 @@ prior to submitting new content. The information details follow.
- Write Kubernetes documentation in Markdown and build the Kubernetes site
using [Hugo](https://gohugo.io/).
- Kubernetes documentation uses [CommonMark](https://commonmark.org/) as its flavor of Markdown.
- The source is in [GitHub](https://github.com/kubernetes/website). You can find
Kubernetes documentation at `/content/en/docs/`. Some of the reference
documentation is automatically generated from scripts in
the `update-imported-docs/` directory.
- [Page content types](/docs/contribute/style/page-content-types/) describe the
presentation of documentation content in Hugo.
- You can use [Docsy shortcodes](https://www.docsy.dev/docs/adding-content/shortcodes/) or [custom Hugo shortcodes](/docs/contribute/style/hugo-shortcodes/) to contribute to Kubernetes documentation.
- In addition to the standard Hugo shortcodes, we use a number of
[custom Hugo shortcodes](/docs/contribute/style/hugo-shortcodes/) in our
documentation to control the presentation of content.

View File

@ -242,6 +242,43 @@ Renders to:
{{< tab name="JSON File" include="podtemplate.json" />}}
{{< /tabs >}}
## Third party content marker
Running Kubernetes requires third-party software. For example: you
usually need to add a
[DNS server](/docs/tasks/administer-cluster/dns-custom-nameservers/#introduction)
to your cluster so that name resolution works.
When we link to third-party software, or otherwise mention it,
we follow the [content guide](/docs/contribute/style/content-guide/)
and we also mark those third-party items.
Using these shortcodes adds a disclaimer to any documentation page
that uses them.
### Lists {#third-party-content-list}
For a list of several third-party items, add:
```
{{%/* thirdparty-content */%}}
```
just below the heading for the section that includes all items.
### Items {#third-party-content-item}
If you have a list where most of the items refer to in-project
software (for example: Kubernetes itself, and the separate
[Descheduler](https://github.com/kubernetes-sigs/descheduler)
component), then there is a different form to use.
Add the shortcode:
```
{{%/* thirdparty-content single="true" */%}}
```
before the item, or just below the heading for the specific item.
## Version strings
To generate a version string for inclusion in the documentation, you can choose from

View File

@ -53,7 +53,7 @@ or be treated as an anonymous user.
## Authentication strategies
Kubernetes uses client certificates, bearer tokens, an authenticating proxy, or HTTP basic auth to
Kubernetes uses client certificates, bearer tokens, or an authenticating proxy to
authenticate API requests through authentication plugins. As HTTP requests are
made to the API server, plugins attempt to associate the following attributes
with the request:
@ -356,7 +356,7 @@ You can use an existing public OpenID Connect Identity Provider (such as Google,
Or, you can run your own Identity Provider, such as [dex](https://dexidp.io/),
[Keycloak](https://github.com/keycloak/keycloak),
CloudFoundry [UAA](https://github.com/cloudfoundry/uaa), or
Tremolo Security's [OpenUnison](https://github.com/tremolosecurity/openunison).
Tremolo Security's [OpenUnison](https://openunison.github.io/).
For an identity provider to work with Kubernetes it must:

View File

@ -199,7 +199,7 @@ To allow signing a CertificateSigningRequest:
## Normal user
A few steps are required in order to get a normal user to be able to
authenticate and invoke an API. First, this user must have certificate issued
authenticate and invoke an API. First, this user must have a certificate issued
by the Kubernetes cluster, and then present that certificate to the Kubernetes API.
### Create private key
@ -274,10 +274,10 @@ kubectl get csr myuser -o jsonpath='{.status.certificate}'| base64 -d > myuser.c
### Create Role and RoleBinding
With the certificate created. it is time to define the Role and RoleBinding for
With the certificate created, it is time to define the Role and RoleBinding for
this user to access Kubernetes cluster resources.
This is a sample script to create a Role for this new user:
This is a sample command to create a Role for this new user:
```shell
kubectl create role developer --verb=create --verb=get --verb=list --verb=update --verb=delete --resource=pods

View File

@ -817,7 +817,7 @@ This is commonly used by add-on API servers for unified authentication and autho
<tr>
<td><b>system:persistent-volume-provisioner</b></td>
<td>None</td>
<td>Allows access to the resources required by most <a href="/docs/concepts/storage/persistent-volumes/#provisioner">dynamic volume provisioners</a>.</td>
<td>Allows access to the resources required by most <a href="/docs/concepts/storage/persistent-volumes/#dynamic">dynamic volume provisioners</a>.</td>
</tr>
<tr>
<td><b>system:monitoring</b></td>

View File

@ -15,4 +15,4 @@ tags:
<!--more-->
Most cluster administrators will use a hosted or distribution instance of Kubernetes. As a result, most Kubernetes users will need to install [extensions](/docs/concepts/extend-kubernetes/extend-cluster/#extensions) and fewer will need to author new ones.
Many cluster administrators use a hosted or distribution instance of Kubernetes. These clusters come with extensions pre-installed. As a result, most Kubernetes users will not need to install [extensions](/docs/concepts/extend-kubernetes/extend-cluster/#extensions) and even fewer users will need to author new ones.

View File

@ -15,4 +15,12 @@ tags:
- operation
---
A [Pod Disruption Budget](/docs/concepts/workloads/pods/disruptions/) allows an application owner to create an object for a replicated application, that ensures a certain number or percentage of Pods with an assigned label will not be voluntarily evicted at any point in time. PDBs cannot prevent an involuntary disruption, but will count against the budget.
A [Pod Disruption Budget](/docs/concepts/workloads/pods/disruptions/) allows an
application owner to create an object for a replicated application that ensures
a certain number or percentage of Pods with an assigned label will not be voluntarily
evicted at any point in time.
<!--more-->
PDBs cannot prevent an involuntary disruption, but
will count against the budget.
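A minimal sketch of such an object (the label and threshold are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb
spec:
  minAvailable: 2          # or set maxUnavailable instead
  selector:
    matchLabels:
      app: example         # the replicated application being protected
```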

View File

@ -14,6 +14,11 @@ tags:
- operation
---
[Pod disruption](/docs/concepts/workloads/pods/disruptions/) is the process by which Pods on Nodes are terminated either voluntarily or involuntarily.
[Pod disruption](/docs/concepts/workloads/pods/disruptions/) is the process by which
Pods on Nodes are terminated either voluntarily or involuntarily.
Voluntary disruptions are started intentionally by application owners or cluster administrators. Involuntary disruptions are unintentional and can be triggered by unavoidable issues like Nodes running out of resources, or by accidental deletions.
<!--more-->
Voluntary disruptions are started intentionally by application owners or cluster
administrators. Involuntary disruptions are unintentional and can be triggered by
unavoidable issues like Nodes running out of resources, or by accidental deletions.

View File

@ -10,7 +10,7 @@ aka:
tags:
- core-object
---
A whole-number representation of small or large numbers using SI suffixes.
A whole-number representation of small or large numbers using [SI](https://en.wikipedia.org/wiki/International_System_of_Units) suffixes.
<!--more-->
@ -21,7 +21,7 @@ mega, or giga units.
For instance, the number `1.5` is represented as `1500m`, while the number `1000`
can be represented as `1k`, and `1000000` as `1M`. You can also specify
binary-notation suffixes; the number 2048 can be written as `2Ki`.
[binary-notation](https://en.wikipedia.org/wiki/Binary_prefix) suffixes; the number 2048 can be written as `2Ki`.
The accepted decimal (power-of-10) units are `m` (milli), `k` (kilo,
intentionally lowercase), `M` (mega), `G` (giga), `T` (tera), `P` (peta),
@ -29,3 +29,4 @@ intentionally lowercase), `M` (mega), `G` (giga), `T` (tera), `P` (peta),
The accepted binary (power-of-2) units are `Ki` (kibi), `Mi` (mebi), `Gi` (gibi),
`Ti` (tebi), `Pi` (pebi), `Ei` (exbi).
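For example, the same notation appears when a container requests resources; the values below are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quantity-example
spec:
  containers:
    - name: app
      image: busybox
      resources:
        requests:
          cpu: 500m      # decimal (power-of-10) suffix: half a CPU
          memory: 128Mi  # binary (power-of-2) suffix: 128 mebibytes
```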

View File

@ -27,7 +27,7 @@ We're extremely grateful for security researchers and users that report vulnerab
To make a report, submit your vulnerability to the [Kubernetes bug bounty program](https://hackerone.com/kubernetes). This allows triage and handling of the vulnerability with standardized response times.
You can also email the private [security@kubernetes.io](mailto:security@kubernetes.io) list with the security details and the details expected for [all Kubernetes bug reports](https://git.k8s.io/kubernetes/.github/ISSUE_TEMPLATE/bug-report.md).
You can also email the private [security@kubernetes.io](mailto:security@kubernetes.io) list with the security details and the details expected for [all Kubernetes bug reports](https://github.com/kubernetes/kubernetes/blob/master/.github/ISSUE_TEMPLATE/bug-report.yaml).
You may encrypt your email to this list using the GPG keys of the [Security Response Committee members](https://git.k8s.io/security/README.md#product-security-committee-psc). Encryption using GPG is NOT required to make a disclosure.

View File

@ -219,7 +219,7 @@ kubectl get pods -o json | jq -c 'path(..)|[.[]|tostring]|join(".")'
# Produce ENV for all pods, assuming you have a default container for the pods, default namespace and the `env` command is supported.
# Helpful when running any supported command across all pods, not just `env`
for pod in $(kubectl get po --output=jsonpath={.items..metadata.name}); do echo $pod && kubectl exec -it $pod env; done
for pod in $(kubectl get po --output=jsonpath={.items..metadata.name}); do echo $pod && kubectl exec -it $pod -- env; done
```
## Updating resources

View File

@ -21,7 +21,7 @@ by implementing one or more of these extension points.
You can specify scheduling profiles by running `kube-scheduler --config <filename>`,
using the
KubeSchedulerConfiguration ([v1beta1](/docs/reference/config-api/kube-scheduler-config.v1beta1/)
or [v1beta2](/docs/reference/config-api/kube-scheduler-config.v1beta2/))
or [v1beta2](/docs/reference/config-api/kube-scheduler-config.v1beta2/))
struct.
A minimal configuration looks as follows:
@ -89,7 +89,7 @@ profiles:
- plugins:
score:
disabled:
- name: NodeResourcesLeastAllocated
- name: PodTopologySpread
enabled:
- name: MyCustomPluginA
weight: 2
@ -100,7 +100,7 @@ profiles:
You can use `*` as the name in the disabled array to disable all default plugins
for that extension point. This can also be used to rearrange the plugin order, if
desired.
### Scheduling plugins
The following plugins, enabled by default, implement one or more of these
@ -116,10 +116,6 @@ extension points:
Extension points: `filter`.
- `NodePorts`: Checks if a node has free ports for the requested Pod ports.
Extension points: `preFilter`, `filter`.
- `NodePreferAvoidPods`: Scores nodes according to the node
{{< glossary_tooltip text="annotation" term_id="annotation" >}}
`scheduler.alpha.kubernetes.io/preferAvoidPods`.
Extension points: `score`.
- `NodeAffinity`: Implements
[node selectors](/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector)
and [node affinity](/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity).
@ -170,7 +166,7 @@ extension points:
Extension points: `bind`.
- `DefaultPreemption`: Provides the default preemption mechanism.
Extension points: `postFilter`.
You can also enable the following plugins, through the component config APIs,
that are not enabled by default:
@ -182,7 +178,7 @@ that are not enabled by default:
- `CinderLimits`: Checks that [OpenStack Cinder](https://docs.openstack.org/cinder/)
volume limits can be satisfied for the node.
Extension points: `filter`.
The following plugins are deprecated and can only be enabled in a `v1beta1`
configuration:
@ -206,7 +202,7 @@ configuration:
- `NodePreferAvoidPods`: Prioritizes nodes according to the node annotation
`scheduler.alpha.kubernetes.io/preferAvoidPods`.
Extension points: `score`.
### Multiple profiles
You can configure `kube-scheduler` to run more than one profile.
@ -255,10 +251,47 @@ the same configuration parameters (if applicable). This is because the scheduler
only has one pending pods queue.
{{< /note >}}
## Scheduler configuration migrations
{{< tabs name="tab_with_md" >}}
{{% tab name="v1beta1 → v1beta2" %}}
* With the v1beta2 configuration version, you can use a new score extension for the
`NodeResourcesFit` plugin.
The new extension combines the functionalities of the `NodeResourcesLeastAllocated`,
`NodeResourcesMostAllocated` and `RequestedToCapacityRatio` plugins.
For example, if you previously used the `NodeResourcesMostAllocated` plugin, you
would instead use `NodeResourcesFit` (enabled by default) and add a `pluginConfig`
with a `scoringStrategy` that is similar to:
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
- pluginConfig:
- args:
scoringStrategy:
resources:
- name: cpu
weight: 1
type: MostAllocated
name: NodeResourcesFit
```
* The scheduler plugin `NodeLabel` is deprecated; instead, use the [`NodeAffinity`](/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) plugin (enabled by default) to achieve similar behavior.
* The scheduler plugin `ServiceAffinity` is deprecated; instead, use the [`InterPodAffinity`](/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity) plugin (enabled by default) to achieve similar behavior.
* The scheduler plugin `NodePreferAvoidPods` is deprecated; instead, use [node taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) to achieve similar behavior.
* A plugin enabled in a v1beta2 configuration file takes precedence over the default configuration for that plugin.
* An invalid `host` or `port` configured for the scheduler healthz and metrics bind address will cause a validation failure.
{{% /tab %}}
{{< /tabs >}}
## {{% heading "whatsnext" %}}
* Read the [kube-scheduler reference](/docs/reference/command-line-tools-reference/kube-scheduler/)
* Learn about [scheduling](/docs/concepts/scheduling-eviction/kube-scheduler/)
* Read the [kube-scheduler configuration (v1beta1)](/docs/reference/config-api/kube-scheduler-config.v1beta1/) reference
* Read the [kube-scheduler configuration (v1beta2)](/docs/reference/config-api/kube-scheduler-config.v1beta2/) reference

View File

@ -12,20 +12,27 @@ During `kubeadm init`, kubeadm uploads the `ClusterConfiguration` object to your
in a ConfigMap called `kubeadm-config` in the `kube-system` namespace. This configuration is then read during
`kubeadm join`, `kubeadm reset` and `kubeadm upgrade`.
You can use `kubeadm config print` to print the default configuration and `kubeadm config migrate` to
convert your old configuration files to a newer version. `kubeadm config images list` and
`kubeadm config images pull` can be used to list and pull the images that kubeadm requires.
You can use `kubeadm config print` to print the default static configuration that kubeadm
uses for `kubeadm init` and `kubeadm join`.
For more information navigate to
{{< note >}}
The output of the command is meant to serve as an example. You must manually edit the output
of this command to adapt it to your setup. Remove the fields that you are not certain about; kubeadm
will then try to set default values for them at runtime by examining the host.
{{< /note >}}
For more information on `init` and `join` navigate to
[Using kubeadm init with a configuration file](/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file)
or [Using kubeadm join with a configuration file](/docs/reference/setup-tools/kubeadm/kubeadm-join/#config-file).
You can also configure several kubelet-configuration options with `kubeadm init`. These options will be the same on any node in your cluster.
See [Configuring each kubelet in your cluster using kubeadm](/docs/setup/production-environment/tools/kubeadm/kubelet-integration/) for details.
For more information on using the kubeadm configuration API navigate to
[Customizing components with the kubeadm API](/docs/setup/production-environment/tools/kubeadm/control-plane-flags).
In Kubernetes v1.13.0 and later to list/pull kube-dns images instead of the CoreDNS image
the `--config` method described [here](/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/#cmd-phase-addon)
has to be used.
You can use `kubeadm config migrate` to convert your old configuration files that contain a deprecated
API version to a newer, supported API version.
`kubeadm config images list` and `kubeadm config images pull` can be used to list and pull the images
that kubeadm requires.
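As a rough sketch, a trimmed configuration file of the kind these commands print and consume might look like the following; the API version and values are illustrative, so check the output of `kubeadm config print` for your own cluster.

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.23.0
networking:
  podSubnet: 10.244.0.0/16   # example Pod network CIDR
```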
<!-- body -->
## kubeadm config print {#cmd-config-print}

View File

@ -9,25 +9,82 @@ weight: 20
---
<!-- overview -->
This page describes common concepts in the Kubernetes API.
The Kubernetes API is a resource-based (RESTful) programmatic interface
provided via HTTP. It supports retrieving, creating, updating, and deleting
primary resources via the standard HTTP verbs (POST, PUT, PATCH, DELETE,
GET).
For some resources, the API includes additional subresources that allow
fine-grained authorization (such as separating viewing details for a Pod from
retrieving its logs), and can accept and serve those resources in different
representations for convenience or efficiency.
Kubernetes supports efficient change notifications on resources via *watches*.
Kubernetes also provides consistent list operations so that API clients can
effectively cache, track, and synchronize the state of resources.
You can view the [API reference](/docs/reference/kubernetes-api/) online,
or read on to learn about the API in general.
<!-- body -->
The Kubernetes API is a resource-based (RESTful) programmatic interface provided via HTTP. It supports retrieving, creating,
updating, and deleting primary resources via the standard HTTP verbs (POST, PUT, PATCH, DELETE, GET), includes additional subresources for many objects that allow fine grained authorization (such as binding a pod to a node), and can accept and serve those resources in different representations for convenience or efficiency. It also supports efficient change notifications on resources via "watches" and consistent lists to allow other components to effectively cache and synchronize the state of resources.
## Kubernetes API terminology {#standard-api-terminology}
## Standard API terminology
Kubernetes generally leverages common RESTful terminology to describe the
API concepts:
Most Kubernetes API resource types are [objects](/docs/concepts/overview/working-with-objects/kubernetes-objects/#kubernetes-objects): they represent a concrete instance of a concept on the cluster, like a pod or namespace. A smaller number of API resource types are "virtual" - they often represent operations rather than objects, such as a permission check (use a POST with a JSON-encoded body of `SubjectAccessReview` to the `subjectaccessreviews` resource). All objects will have a unique name to allow idempotent creation and retrieval, but virtual resource types may not have unique names if they are not retrievable or do not rely on idempotency.
* A *resource type* is the name used in the URL (`pods`, `namespaces`, `services`)
* All resource types have a concrete representation (their object schema) which is called a *kind*
* A list of instances of a resource is known as a *collection*
* A single instance of a resource type is called a *resource*, and also usually represents an *object*
* For some resource types, the API includes one or more *sub-resources*, which are represented as URI paths below the resource
Kubernetes generally leverages standard RESTful terminology to describe the API concepts:
Most Kubernetes API resource types are
[objects](/docs/concepts/overview/working-with-objects/kubernetes-objects/#kubernetes-objects):
they represent a concrete instance of a concept on the cluster, like a
pod or namespace. A smaller number of API resource types are *virtual* in
that they often represent operations on objects, rather than objects, such
as a permission check
(use a POST with a JSON-encoded body of `SubjectAccessReview` to the
`subjectaccessreviews` resource), or the `eviction` sub-resource of a Pod
(used to trigger
[API-initiated eviction](/docs/concepts/scheduling-eviction/api-eviction/)).
* A **resource type** is the name used in the URL (`pods`, `namespaces`, `services`)
* All resource types have a concrete representation in JSON (their object schema) which is called a **kind**
* A list of instances of a resource type is known as a **collection**
* A single instance of the resource type is called a **resource**
### Object names
All resource types are either scoped by the cluster (`/apis/GROUP/VERSION/*`) or to a namespace (`/apis/GROUP/VERSION/namespaces/NAMESPACE/*`). A namespace-scoped resource type will be deleted when its namespace is deleted and access to that resource type is controlled by authorization checks on the namespace scope. The following paths are used to retrieve collections and resources:
All objects you can create via the API have a unique object
{{< glossary_tooltip text="name" term_id="name" >}} to allow idempotent creation and
retrieval, except that virtual resource types may not have unique names if they are
not retrievable, or do not rely on idempotency.
Within a {{< glossary_tooltip text="namespace" term_id="namespace" >}}, only one object
of a given kind can have a given name at a time. However, if you delete the object,
you can make a new object with the same name. Some objects are not namespaced (for
example: Nodes), and so their names must be unique across the whole cluster.
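For example, two ConfigMaps can both be named `app-config` as long as they live in different namespaces; the names below are illustrative.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: team-a
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: team-b
```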
### API verbs
Almost all object resource types support the standard HTTP verbs - GET, POST, PUT, PATCH,
and DELETE. Kubernetes also uses its own verbs, which are often written lowercase to distinguish
them from HTTP verbs.
Kubernetes uses the term **list** to describe returning a [collection](#collections) of
resources to distinguish from retrieving a single resource which is usually called
a **get**. If you send an HTTP GET request with the `?watch` query parameter,
Kubernetes calls this a **watch** and not a **get** (see
[Efficient detection of changes](#efficient-detection-of-changes) for more details).
For PUT requests, Kubernetes internally classifies these as either **create** or **update**
based on the state of the existing object. An **update** is different from a **patch**; the
HTTP verb for a **patch** is PATCH.
## Resource URIs
All resource types are either scoped by the cluster (`/apis/GROUP/VERSION/*`) or to a
namespace (`/apis/GROUP/VERSION/namespaces/NAMESPACE/*`). A namespace-scoped resource
type will be deleted when its namespace is deleted and access to that resource type
is controlled by authorization checks on the namespace scope.
You can also access collections of resources (for example: listing all Nodes).
The following paths are used to retrieve collections and resources:
* Cluster-scoped resources:
@ -40,21 +97,41 @@ All resource types are either scoped by the cluster (`/apis/GROUP/VERSION/*`) or
* `GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE` - return collection of all instances of the resource type in NAMESPACE
* `GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE/NAME` - return the instance of the resource type with NAME in NAMESPACE
Since a namespace is a cluster-scoped resource type, you can retrieve the list of all namespaces with `GET /api/v1/namespaces` and details about a particular namespace with `GET /api/v1/namespaces/NAME`.
Almost all object resource types support the standard HTTP verbs - GET, POST, PUT, PATCH, and DELETE. Kubernetes uses the term **list** to describe returning a collection of resources to distinguish from retrieving a single resource which is usually called a **get**.
Some resource types will have one or more sub-resources, represented as sub paths below the resource:
Since a namespace is a cluster-scoped resource type, you can retrieve the list
(“collection”) of all namespaces with `GET /api/v1/namespaces` and details about
a particular namespace with `GET /api/v1/namespaces/NAME`.
* Cluster-scoped subresource: `GET /apis/GROUP/VERSION/RESOURCETYPE/NAME/SUBRESOURCE`
* Namespace-scoped subresource: `GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE/NAME/SUBRESOURCE`
The verbs supported for each subresource will differ depending on the object - see the API documentation for more information. It is not possible to access sub-resources across multiple resources - generally a new virtual resource type would be used if that becomes necessary.
The verbs supported for each subresource will differ depending on the object -
see the [API reference](/docs/reference/kubernetes-api/) for more information. It
is not possible to access sub-resources across multiple resources - generally a new
virtual resource type would be used if that becomes necessary.
## Efficient detection of changes
To enable clients to build a model of the current state of a cluster, all Kubernetes object resource types are required to support consistent lists and an incremental change notification feed called a **watch**. Every Kubernetes object has a `resourceVersion` field representing the version of that resource as stored in the underlying database. When retrieving a collection of resources (either namespace or cluster scoped), the response from the server will contain a `resourceVersion` value that can be used to initiate a watch against the server. The server will return all changes (creates, deletes, and updates) that occur after the supplied `resourceVersion`. This allows a client to fetch the current state and then watch for changes without missing any updates. If the client watch is disconnected they can restart a new watch from the last returned `resourceVersion`, or perform a new collection request and begin again. See [Resource Version Semantics](#resource-versions) for more detail.
The Kubernetes API allows clients to make an initial request for an object or a
collection, and then to track changes since that initial request: a **watch**. Clients
can send a **list** or a **get** and then make a follow-up **watch** request.
To make this change tracking possible, every Kubernetes object has a `resourceVersion`
field representing the version of that resource as stored in the underlying persistence
layer. When retrieving a collection of resources (either namespace or cluster scoped),
the response from the API server contains a `resourceVersion` value. The client can
use that `resourceVersion` to initiate a **watch** against the API server.
When you send a **watch** request, the API server responds with a stream of
changes. These changes itemize the outcome of operations (such as **create**, **delete**,
and **update**) that occurred after the `resourceVersion` you specified as a parameter
to the **watch** request. The overall **watch** mechanism allows a client to fetch
the current state and then subscribe to subsequent changes, without missing any events.
If a client **watch** is disconnected then that client can start a new **watch** from
the last returned `resourceVersion`; the client could also perform a fresh **get** /
**list** request and begin again. See [Resource Version Semantics](#resource-versions)
for more detail.
For example:
@ -74,7 +151,10 @@ For example:
}
```
2. Starting from resource version 10245, receive notifications of any creates, deletes, or updates as individual JSON objects.
2. Starting from resource version 10245, receive notifications of any API operations
(such as **create**, **delete**, **apply** or **update**) that affect Pods in the
_test_ namespace. Each change notification is a JSON document. The HTTP response body
(served as `application/json`) consists of a series of JSON documents.
```
GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245
@ -94,11 +174,24 @@ For example:
...
```
A given Kubernetes server will only preserve a historical list of changes for a limited time. Clusters using etcd3 preserve changes in the last 5 minutes by default. When the requested watch operations fail because the historical version of that resource is not available, clients must handle the case by recognizing the status code `410 Gone`, clearing their local cache, performing a list operation, and starting the watch from the `resourceVersion` returned by that new list operation. Most client libraries offer some form of standard tool for this logic. (In Go this is called a `Reflector` and is located in the `k8s.io/client-go/cache` package.)
A given Kubernetes server will only preserve a historical record of changes for a
limited time. Clusters using etcd 3 preserve changes in the last 5 minutes by default.
When the requested **watch** operations fail because the historical version of that
resource is not available, clients must handle the case by recognizing the status code
`410 Gone`, clearing their local cache, performing a new **get** or **list** operation,
and starting the **watch** from the `resourceVersion` that was returned.
For subscribing to collections, Kubernetes client libraries typically offer some form
of standard tool for this **list**-then-**watch** logic. (In the Go client library,
this is called a `Reflector` and is located in the `k8s.io/client-go/cache` package.)
### Watch bookmarks
To mitigate the impact of short history window, we introduced a concept of `bookmark` watch event. It is a special kind of event to mark that all changes up to a given `resourceVersion` the client is requesting have already been sent. Object returned in that event is of the type requested by the request, but only `resourceVersion` field is set, e.g.:
To mitigate the impact of the short history window, the Kubernetes API provides a watch
event named `BOOKMARK`. It is a special kind of event to mark that all changes up
to a given `resourceVersion` the client is requesting have already been sent. The
document representing the `BOOKMARK` event is of the type requested by the request,
but only includes a `.metadata.resourceVersion` field. For example:
```console
GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245&allowWatchBookmarks=true
@ -118,19 +211,49 @@ Content-Type: application/json
}
```
`Bookmark` events can be requested by `allowWatchBookmarks=true` option in watch requests, but clients shouldn't assume bookmarks are returned at any specific interval, nor may they assume the server will send any `bookmark` event.
As a client, you can request `BOOKMARK` events by setting the
`allowWatchBookmarks=true` query parameter to a **watch** request, but you shouldn't
assume bookmarks are returned at any specific interval, nor can clients assume that
the API server will send any `BOOKMARK` event even when requested.
## Retrieving large results sets in chunks
{{< feature-state for_k8s_version="v1.9" state="beta" >}}
On large clusters, retrieving the collection of some resource types may result in very large responses that can impact the server and client. For instance, a cluster may have tens of thousands of pods, each of which is 1-2kb of encoded JSON. Retrieving all pods across all namespaces may result in a very large response (10-20MB) and consume a large amount of server resources. Starting in Kubernetes 1.9 the server supports the ability to break a single large collection request into many smaller chunks while preserving the consistency of the total request. Each chunk can be returned sequentially which reduces both the total size of the request and allows user-oriented clients to display results incrementally to improve responsiveness.
On large clusters, retrieving the collection of some resource types may result in
very large responses that can impact the server and client. For instance, a cluster
may have tens of thousands of Pods, each of which is equivalent to roughly 2 KiB of
encoded JSON. Retrieving all pods across all namespaces may result in a very large
response (10-20MB) and consume a large amount of server resources.
To retrieve a single list in chunks, two new parameters `limit` and `continue` are supported on collection requests and a new field `continue` is returned from all list operations in the list `metadata` field. A client should specify the maximum results they wish to receive in each chunk with `limit` and the server will return up to `limit` resources in the result and include a `continue` value if there are more resources in the collection. The client can then pass this `continue` value to the server on the next request to instruct the server to return the next chunk of results. By continuing until the server returns an empty `continue` value the client can consume the full set of results.
Provided that you don't explicitly disable the `APIListChunking`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), the
Kubernetes API server supports the ability to break a single large collection request
into many smaller chunks while preserving the consistency of the total request. Each
chunk can be returned sequentially which reduces both the total size of the request and
allows user-oriented clients to display results incrementally to improve responsiveness.
Like a watch operation, a `continue` token will expire after a short amount of time (by default 5 minutes) and return a `410 Gone` if more results cannot be returned. In this case, the client will need to start from the beginning or omit the `limit` parameter.
You can request that the API server handle a **list** by serving a single collection
in pages (which Kubernetes calls _chunks_). To retrieve a single collection in
chunks, two query parameters `limit` and `continue` are supported on requests against
collections, and a response field `continue` is returned from all **list** operations
in the collection's `metadata` field. A client should specify the maximum results they
wish to receive in each chunk with `limit` and the server will return up to `limit`
resources in the result and include a `continue` value if there are more resources
in the collection.
For example, if there are 1,253 pods on the cluster and the client wants to receive chunks of 500 pods at a time, they would request those chunks as follows:
As an API client, you can then pass this `continue` value to the API server on the
next request, to instruct the server to return the next page (_chunk_) of results. By
continuing until the server returns an empty `continue` value, you can retrieve the
entire collection.
Like a **watch** operation, a `continue` token will expire after a short amount
of time (by default 5 minutes) and return a `410 Gone` if more results cannot be
returned. In this case, the client will need to start from the beginning or omit the
`limit` parameter.
For example, if there are 1,253 pods on the cluster and you want to receive chunks
of 500 pods at a time, request those chunks as follows:
1. List all of the pods on a cluster, retrieving up to 500 pods each time.
@ -192,34 +315,40 @@ For example, if there are 1,253 pods on the cluster and the client wants to rece
}
```
Note that the `resourceVersion` of the list remains constant across each request,
indicating the server is showing us a consistent snapshot of the pods. Pods that
are created, updated, or deleted after version `10245` would not be shown unless
the user makes a list request without the `continue` token. This allows clients
to break large requests into smaller chunks and then perform a watch operation
on the full set without missing any updates.
Notice that the `resourceVersion` of the collection remains constant across each request,
indicating the server is showing you a consistent snapshot of the pods. Pods that
are created, updated, or deleted after version `10245` would not be shown unless
you make a separate **list** request without the `continue` token. This allows you
to break large requests into smaller chunks and then perform a **watch** operation
on the full set without missing any updates.
`remainingItemCount` is the number of subsequent items in the list which are not
included in this list response. If the list request contained label or field selectors,
then the number of remaining items is unknown and the API server does not include
a `remainingItemCount` field in its response. If the list is complete (either
because it is not chunking or because this is the last chunk), then there are no
more remaining items and the API server does not include a `remainingItemCount`
field in its response. The intended use of the `remainingItemCount` is estimating
the size of a collection.
`remainingItemCount` is the number of subsequent items in the collection that are not
included in this response. If the **list** request contained label or field
{{< glossary_tooltip text="selectors" term_id="selector">}} then the number of
remaining items is unknown and the API server does not include a `remainingItemCount`
field in its response.
If the **list** is complete (either because it is not chunking, or because this is the
last chunk), then there are no more remaining items and the API server does not include a
`remainingItemCount` field in its response. The intended use of the `remainingItemCount`
is estimating the size of a collection.
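
As a rough illustration of the flow above, here is a minimal shell sketch that pages
through Pods 500 at a time using `kubectl get --raw`. It assumes `kubectl` is configured
for your cluster and that `jq` is installed; the field names match the responses shown
in this section.

```shell
# Page through all Pods, 500 at a time, following the continue token.
CONTINUE=""
while true; do
  RESPONSE="$(kubectl get --raw "/api/v1/pods?limit=500&continue=${CONTINUE}")"
  echo "${RESPONSE}" | jq -r '.items[].metadata.name'
  # The API server omits (or empties) .metadata.continue on the last chunk.
  CONTINUE="$(echo "${RESPONSE}" | jq -r '.metadata.continue // ""')"
  if [ -z "${CONTINUE}" ]; then
    break
  fi
done
```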
## Lists
## Collections
There are dozens of list types (such as `PodList`, `ServiceList`, and `NodeList`) defined in the Kubernetes API.
You can get more information about each list type from the [Kubernetes API](/docs/reference/kubernetes-api/) documentation.
In Kubernetes terminology, the response you get from a **list** is
a _collection_. However, Kubernetes defines concrete kinds for
collections of different types of resource. Collections have a kind
named for the resource kind, with `List` appended.
When you query the API for a particular type, all items returned by that query are of that type. For example, when you
ask for a list of services, the list type is shown as `kind: ServiceList` and each item in that list represents a single Service. For example:
```console
When you query the API for a particular type, all items returned by that query are
of that type.
For example, when you **list** Services, the collection response
has `kind` set to
[`ServiceList`](/docs/reference/kubernetes-api/service-resources/service-v1/#ServiceList); each item in that collection represents a single Service. For example:
```
GET /api/v1/services
---
```
```yaml
{
"kind": "ServiceList",
"apiVersion": "v1",
@ -238,12 +367,21 @@ GET /api/v1/services
...
```
Some tools, such as `kubectl` provide another way to query the Kubernetes API. Because the output of `kubectl` might include multiple list types, the list of items is represented as `kind: List`. For example:
There are dozens of collection types (such as `PodList`, `ServiceList`,
and `NodeList`) defined in the Kubernetes API.
You can get more information about each collection type from the
[Kubernetes API](/docs/reference/kubernetes-api/) documentation.
```console
$ kubectl get services -A -o yaml
Some tools, such as `kubectl`, represent the Kubernetes collection
mechanism slightly differently from the Kubernetes API itself.
Because the output of `kubectl` might include the response from
multiple **list** operations at the API level, `kubectl` represents
a list of items using `kind: List`. For example:
```shell
kubectl get services -A -o yaml
```
```yaml
apiVersion: v1
kind: List
metadata:
@ -276,29 +414,43 @@ items:
```
{{< note >}}
Keep in mind that the Kubernetes API does not have a `kind: List` type. `kind: List` is an internal mechanism type for lists of mixed resources and should not be depended upon.
{{< /note >}}
Keep in mind that the Kubernetes API does not have a `kind` named `List`.
`kind: List` is a client-side, internal implementation detail for processing
collections that might be of different kinds of object. Avoid depending on
`kind: List` in automation or other code.
{{< /note >}}
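
One informal way to see this difference, assuming `kubectl`, `jq`, and access to a
cluster, is to compare the `kind` the API server reports with the `kind` that
`kubectl` prints for its aggregated output:

```shell
# The API server returns a concrete collection kind.
kubectl get --raw /api/v1/services | jq .kind      # "ServiceList"

# kubectl aggregates results client-side and labels them kind: List.
kubectl get services -A -o yaml | grep '^kind:'    # kind: List
```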
## Receiving resources as Tables
`kubectl get` is a simple tabular representation of one or more instances of a particular resource type. In the past, clients were required to reproduce the tabular and describe output implemented in `kubectl` to perform simple lists of objects.
A few limitations of that approach include non-trivial logic when dealing with certain objects. Additionally, types provided by API aggregation or third party resources are not known at compile time. This means that generic implementations had to be in place for types unrecognized by a client.
When you run `kubectl get`, the default output format is a simple tabular
representation of one or more instances of a particular resource type. In the past,
clients were required to reproduce the tabular and describe output implemented in
`kubectl` to perform simple lists of objects.
A few limitations of that approach include non-trivial logic when dealing with
certain objects. Additionally, types provided by API aggregation or third party
resources are not known at compile time. This means that generic implementations
had to be in place for types unrecognized by a client.
In order to avoid potential limitations as described above, clients may request the Table representation of objects, delegating specific details of printing to the server. The Kubernetes API implements standard HTTP content type negotiation: passing an `Accept` header containing a value of `application/json;as=Table;g=meta.k8s.io;v=v1beta1` with a `GET` call will request that the server return objects in the Table content type.
In order to avoid potential limitations as described above, clients may request
the Table representation of objects, delegating specific details of printing to the
server. The Kubernetes API implements standard HTTP content type negotiation: passing
an `Accept` header containing a value of `application/json;as=Table;g=meta.k8s.io;v=v1`
with a `GET` call will request that the server return objects in the Table content
type.
For example, list all of the pods on a cluster in the Table format.
```console
GET /api/v1/pods
Accept: application/json;as=Table;g=meta.k8s.io;v=v1beta1
Accept: application/json;as=Table;g=meta.k8s.io;v=v1
---
200 OK
Content-Type: application/json
{
"kind": "Table",
"apiVersion": "meta.k8s.io/v1beta1",
"apiVersion": "meta.k8s.io/v1",
...
"columnDefinitions": [
...
@ -306,7 +458,9 @@ Content-Type: application/json
}
```
For API resource types that do not have a custom Table definition on the server, a default Table response is returned by the server, consisting of the resource's `name` and `creationTimestamp` fields.
For API resource types that do not have a custom Table definition known to the control
plane, the API server returns a default Table response that consists of the resource's
`name` and `creationTimestamp` fields.
```console
GET /apis/crd.example.com/v1alpha1/namespaces/default/resources
@ -317,7 +471,7 @@ Content-Type: application/json
{
"kind": "Table",
"apiVersion": "meta.k8s.io/v1beta1",
"apiVersion": "meta.k8s.io/v1",
...
"columnDefinitions": [
{
@ -334,18 +488,38 @@ Content-Type: application/json
}
```
Table responses are available beginning in version 1.10 of the kube-apiserver. As such, not all API resource types will support a Table response, specifically when using a client against older clusters. Clients that must work against all resource types, or can potentially deal with older clusters, should specify multiple content types in their `Accept` header to support fallback to non-Tabular JSON:
Not all API resource types support a Table response; for example, a
{{< glossary_tooltip term_id="CustomResourceDefinition" text="CustomResourceDefinition" >}}
might not define field-to-table mappings, and an APIService that
[extends the core Kubernetes API](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
might not serve Table responses at all. If you are implementing a client that
uses the Table information and must work against all resource types, including
extensions, you should make requests that specify multiple content types in the
`Accept` header. For example:
```console
Accept: application/json;as=Table;g=meta.k8s.io;v=v1beta1, application/json
Accept: application/json;as=Table;g=meta.k8s.io;v=v1, application/json
```
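
As an informal sketch of that negotiation, the request below asks for the Table
rendering of Pods and falls back to plain JSON. It assumes `kubectl proxy` can run
locally on port 8001 and that `jq` is installed.

```shell
# Run a local proxy so curl does not need to handle authentication itself.
kubectl proxy --port=8001 &

# Request the server-side Table rendering, falling back to plain JSON.
curl --silent \
  --header 'Accept: application/json;as=Table;g=meta.k8s.io;v=v1, application/json' \
  http://127.0.0.1:8001/api/v1/pods | jq '{kind, columns: [.columnDefinitions[].name]}'
```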
## Alternate representations of resources
By default, Kubernetes returns objects serialized to JSON with content type `application/json`. This is the default serialization format for the API. However, clients may request the more efficient Protobuf representation of these objects for better performance at scale. The Kubernetes API implements standard HTTP content type negotiation: passing an `Accept` header with a `GET` call will request that the server return objects in the provided content type, while sending an object in Protobuf to the server for a `PUT` or `POST` call takes the `Content-Type` header. The server will return a `Content-Type` header if the requested format is supported, or the `406 Not acceptable` error if an invalid content type is provided.
By default, Kubernetes returns objects serialized to JSON with content type
`application/json`. This is the default serialization format for the API. However,
clients may request the more efficient
[Protobuf representation](#protobuf-encoding) of these objects for better performance at scale.
The Kubernetes API implements standard HTTP content type negotiation: passing an
`Accept` header with a `GET` call will request that the server tries to return
a response in your preferred media type, while sending an object in Protobuf to
the server for a `PUT` or `POST` call means that you must set the `Content-Type`
header appropriately.
See the API documentation for a list of supported content types for each API.
The server will return a response with a `Content-Type` header if the requested
format is supported, or the `406 Not acceptable` error if none of the media types you
requested are supported. All built-in resource types support the `application/json`
media type.
See the Kubernetes [API reference](/docs/reference/kubernetes-api/) for a list of
supported content types for each API.
For example:
@ -361,7 +535,8 @@ For example:
... binary encoded PodList object
```
2. Create a pod by sending Protobuf encoded data to the server, but request a response in JSON.
1. Create a pod by sending Protobuf encoded data to the server, but request a response
in JSON.
```console
POST /api/v1/namespaces/test/pods
@ -379,15 +554,25 @@ For example:
}
```
Not all API resource types will support Protobuf, specifically those defined via Custom Resource Definitions or those that are API extensions. Clients that must work against all resource types should specify multiple content types in their `Accept` header to support fallback to JSON:
Not all API resource types support Protobuf; specifically, Protobuf isn't available for
resources that are defined as
{{< glossary_tooltip term_id="CustomResourceDefinition" text="CustomResourceDefinitions" >}}
or are served via the
{{< glossary_tooltip text="aggregation layer" term_id="aggregation-layer" >}}.
As a client, if you might need to work with extension types you should specify multiple
content types in the request `Accept` header to support fallback to JSON.
For example:
```console
Accept: application/vnd.kubernetes.protobuf, application/json
```
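
For example, here is a hedged sketch of the same fallback using `curl`, reusing the
local `kubectl proxy` from the Table example above:

```shell
# Ask for Protobuf where available, JSON otherwise; save the raw body.
curl --silent \
  --header 'Accept: application/vnd.kubernetes.protobuf, application/json' \
  --output pods.bin \
  http://127.0.0.1:8001/api/v1/pods

# The saved body is a binary envelope, not human-readable JSON.
file pods.bin
```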
### Protobuf encoding
### Kubernetes Protobuf encoding {#protobuf-encoding}
Kubernetes uses an envelope wrapper to encode Protobuf responses. That wrapper starts with a 4 byte magic number to help identify content in disk or in etcd as Protobuf (as opposed to JSON), and then is followed by a Protobuf encoded wrapper message, which describes the encoding and type of the underlying object and then contains the object.
Kubernetes uses an envelope wrapper to encode Protobuf responses. That wrapper starts
with a 4 byte magic number to help identify content in disk or in etcd as Protobuf
(as opposed to JSON), and then is followed by a Protobuf encoded wrapper message, which
describes the encoding and type of the underlying object and then contains the object.
The wrapper format is:
@ -419,13 +604,21 @@ An encoded Protobuf message with the following IDL:
}
```
Clients that receive a response in `application/vnd.kubernetes.protobuf` that does not match the expected prefix should reject the response, as future versions may need to alter the serialization format in an incompatible way and will do so by changing the prefix.
{{< note >}}
Clients that receive a response in `application/vnd.kubernetes.protobuf` that does
not match the expected prefix should reject the response, as future versions may need
to alter the serialization format in an incompatible way and will do so by changing
the prefix.
{{< /note >}}
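
If you have saved a Protobuf response (for example `pods.bin` from the sketch above),
you can check for the envelope's 4 byte magic number, `0x6b 0x38 0x73 0x00`
("k8s" followed by a zero byte):

```shell
# Inspect the first four bytes of the saved response.
head -c 4 pods.bin | xxd
# expected output starts with: 6b38 7300  (that is, "k8s" plus a zero byte)
```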
## Resource deletion
Resources are deleted in two phases: 1) finalization, and 2) removal.
When you **delete** a resource this takes place in two phases.
1. _finalization_
2. removal

```go
```yaml
{
"kind": "ConfigMap",
"apiVersion": "v1",
@ -436,47 +629,86 @@ Resources are deleted in two phases: 1) finalization, and 2) removal.
}
```
When a client first deletes a resource, the `.metadata.deletionTimestamp` is set to the current time.
When a client first sends a **delete** to request removal of a resource, the `.metadata.deletionTimestamp` is set to the current time.
Once the `.metadata.deletionTimestamp` is set, external controllers that act on finalizers
may start performing their cleanup work at any time, in any order.
Order is NOT enforced because it introduces significant risk of stuck `.metadata.finalizers`.
`.metadata.finalizers` is a shared field, any actor with permission can reorder it.
If the finalizer list is processed in order, then this can lead to a situation
Order is **not** enforced between finalizers because it would introduce significant
risk of stuck `.metadata.finalizers`.
The `.metadata.finalizers` field is shared: any actor with permission can reorder it.
If the finalizer list were processed in order, then this might lead to a situation
in which the component responsible for the first finalizer in the list is
waiting for a signal (field value, external system, or other) produced by a
waiting for some signal (field value, external system, or other) produced by a
component responsible for a finalizer later in the list, resulting in a deadlock.
Without enforced ordering finalizers are free to order amongst themselves and
are not vulnerable to ordering changes in the list.
Without enforced ordering, finalizers are free to order amongst themselves and are
not vulnerable to ordering changes in the list.
Once the last finalizer is removed, the resource is actually removed from etcd.
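
As a rough illustration of the two phases, the sketch below uses a hypothetical
ConfigMap named `demo` that carries a hypothetical finalizer
(`example.com/block-delete`); the commands assume `kubectl` access to a cluster.

```shell
# Request deletion without waiting for removal.
kubectl delete configmap demo --wait=false

# Phase 1 (finalization): deletionTimestamp is set, finalizers are still present.
kubectl get configmap demo \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'

# Phase 2 (removal): once the finalizer list is emptied (normally by the
# responsible controller), the API server removes the object.
kubectl patch configmap demo --type=merge -p '{"metadata":{"finalizers":null}}'
```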
## Single resource API
API verbs GET, CREATE, UPDATE, PATCH, DELETE and PROXY support single resources only.
These verbs with single resource support have no support for submitting
multiple resources together in an ordered or unordered list or transaction.
Clients including kubectl will parse a list of resources and make
single-resource API requests.
The Kubernetes API verbs **get**, **create**, **apply**, **update**, **patch**,
**delete** and **proxy** support single resources only.
These verbs with single resource support have no support for submitting multiple
resources together in an ordered or unordered list or transaction.
API verbs LIST and WATCH support getting multiple resources, and
DELETECOLLECTION supports deleting multiple resources.
When clients (including kubectl) act on a set of resources, the client makes a series
of single-resource API requests, then aggregates the responses if needed.
By contrast, the Kubernetes API verbs **list** and **watch** allow getting multiple
resources, and **deletecollection** allows deleting multiple resources.
## Dry-run
{{< feature-state for_k8s_version="v1.18" state="stable" >}}
The modifying verbs (`POST`, `PUT`, `PATCH`, and `DELETE`) can accept requests in a _dry run_ mode. Dry run mode helps to evaluate a request through the typical request stages (admission chain, validation, merge conflicts) up until persisting objects to storage. The response body for the request is as close as possible to a non-dry-run response. The system guarantees that dry-run requests will not be persisted in storage or have any other side effects.
When you use HTTP verbs that can modify resources (`POST`, `PUT`, `PATCH`, and
`DELETE`), you can submit your request in a _dry run_ mode. Dry run mode helps to
evaluate a request through the typical request stages (admission chain, validation,
merge conflicts) up until persisting objects to storage. The response body for the
request is as close as possible to a non-dry-run response. Kubernetes guarantees that
dry-run requests will not be persisted in storage or have any other side effects.
### Make a dry-run request
Dry-run is triggered by setting the `dryRun` query parameter. This parameter is a string, working as an enum, and the only accepted values are:
Dry-run is triggered by setting the `dryRun` query parameter. This parameter is a
string, working as an enum, and the only accepted values are:
* `All`: Every stage runs as normal, except for the final storage stage. Admission controllers are run to check that the request is valid, mutating controllers mutate the request, merge is performed on `PATCH`, fields are defaulted, and schema validation occurs. The changes are not persisted to the underlying storage, but the final object which would have been persisted is still returned to the user, along with the normal status code. If the request would trigger an admission controller which would have side effects, the request will be failed rather than risk an unwanted side effect. All built in admission control plugins support dry-run. Additionally, admission webhooks can declare in their [configuration object](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#webhook-v1beta1-admissionregistration-k8s-io) that they do not have side effects by setting the sideEffects field to "None". If a webhook actually does have side effects, then the sideEffects field should be set to "NoneOnDryRun", and the webhook should also be modified to understand the `DryRun` field in AdmissionReview, and prevent side effects on dry-run requests.
* Leave the value empty, which is also the default: Keep the default modifying behavior.
[no value set]
: Allow side effects. You request this with a query string such as `?dryRun`
or `?dryRun&pretty=true`. The response is the final object that would have been
persisted, or an error if the request could not be fulfilled.
For example:
`All`
: Every stage runs as normal, except for the final storage stage where side effects
are prevented.
When you set `?dryRun=All`, any relevant
{{< glossary_tooltip text="admission controllers" term_id="admission-controller" >}}
are run, validating admission controllers check the request post-mutation, merge is
performed on `PATCH`, fields are defaulted, and schema validation occurs. The changes
are not persisted to the underlying storage, but the final object which would have
been persisted is still returned to the user, along with the normal status code.
If the non-dry-run version of a request would trigger an admission controller that has
side effects, the request will be failed rather than risk an unwanted side effect. All
built-in admission control plugins support dry-run. Additionally, admission webhooks can
declare in their
[configuration object](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#webhook-v1beta1-admissionregistration-k8s-io)
that they do not have side effects, by setting their `sideEffects` field to `None`.
{{< note >}}
If a webhook actually does have side effects, then the `sideEffects` field should be
set to `NoneOnDryRun`. That change is appropriate provided that the webhook is also
modified to understand the `DryRun` field in AdmissionReview, and to prevent side
effects on any request marked as a dry run.
{{< /note >}}
Here is an example dry-run request that uses `?dryRun=All`:
```console
POST /api/v1/namespaces/test/pods?dryRun=All
@ -484,144 +716,193 @@ Content-Type: application/json
Accept: application/json
```
The response would look the same as for non-dry-run request, but the values of some generated fields may differ.
The response would look the same as for non-dry-run request, but the values of some
generated fields may differ.
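
If you use `kubectl`, you don't need to set the query parameter yourself. For example,
a sketch (assuming `pod.yaml` is a manifest you already have) is:

```shell
# kubectl sets dryRun=All on the underlying API request for you.
kubectl create -f pod.yaml --dry-run=server -o yaml
```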
### Generated values
Some values of an object are typically generated before the object is persisted. It
is important not to rely upon the values of these fields set by a dry-run request,
since these values will likely be different in dry-run mode from when the real
request is made. Some of these fields are:
* `name`: if `generateName` is set, `name` will have a unique random name
* `creationTimestamp` / `deletionTimestamp`: records the time of creation/deletion
* `UID`: [uniquely identifies](/docs/concepts/overview/working-with-objects/names/#uids) the object and is randomly generated (non-deterministic)
* `resourceVersion`: tracks the persisted version of the object
* Any field set by a mutating admission controller
* For the `Service` resource: Ports or IP addresses that the kube-apiserver assigns to Service objects
### Dry-run authorization
Authorization for dry-run and non-dry-run requests is identical. Thus, to make
a dry-run request, the user must be authorized to make the non-dry-run request.
a dry-run request, you must be authorized to make the non-dry-run request.
For example, to run a dry-run `PATCH` for Deployments, you must have the
`PATCH` permission for Deployments, as in the example of the RBAC rule below.
For example, to run a dry-run **patch** for a Deployment, you must be authorized
to perform that **patch**. Here is an example of a rule for Kubernetes
{{< glossary_tooltip text="RBAC" term_id="rbac">}} that allows patching
Deployments:
```yaml
rules:
- apiGroups: ["extensions", "apps"]
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["patch"]
```
See [Authorization Overview](/docs/reference/access-authn-authz/authorization/).
### Generated values
Some values of an object are typically generated before the object is persisted. It is important not to rely upon the values of these fields set by a dry-run request, since these values will likely be different in dry-run mode from when the real request is made. Some of these fields are:
* `name`: if `generateName` is set, `name` will have a unique random name
* `creationTimestamp`/`deletionTimestamp`: records the time of creation/deletion
* `UID`: uniquely identifies the object and is randomly generated (non-deterministic)
* `resourceVersion`: tracks the persisted version of the object
* Any field set by a mutating admission controller
* For the `Service` resource: Ports or IPs that kube-apiserver assigns to v1.Service objects
## Server Side Apply
Starting from Kubernetes v1.18, you can enable the
[Server Side Apply](/docs/reference/using-api/server-side-apply/)
feature so that the control plane tracks managed fields for all newly created objects.
Kubernetes' [Server Side Apply](/docs/reference/using-api/server-side-apply/)
feature allows the control plane to track managed fields for newly created objects.
Server Side Apply provides a clear pattern for managing field conflicts,
offers server-side `Apply` and `Update` operations, and replaces the
client-side functionality of `kubectl apply`. For more details about this
feature, see the section on
[Server Side Apply](/docs/reference/using-api/server-side-apply/).
client-side functionality of `kubectl apply`.
## Resource Versions
The API verb for Server-Side Apply is **apply**.
See [Server Side Apply](/docs/reference/using-api/server-side-apply/) for more details.
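
As an informal sketch, a Server-Side Apply request is an HTTP `PATCH` with the
`application/apply-patch+yaml` content type and a `fieldManager` query parameter.
The example below assumes the local `kubectl proxy` from earlier sketches is running,
that `deployment.yaml` describes a Deployment named `nginx` in the `default`
namespace, and that `my-manager` is an arbitrary field manager name.

```shell
curl -X PATCH \
  --header 'Content-Type: application/apply-patch+yaml' \
  --data-binary @deployment.yaml \
  'http://127.0.0.1:8001/apis/apps/v1/namespaces/default/deployments/nginx?fieldManager=my-manager'
```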
Resource versions are strings that identify the server's internal version of an object. Resource versions can be used by clients to determine when objects have changed, or to express data consistency requirements when getting, listing and watching resources. Resource versions must be treated as opaque by clients and passed unmodified back to the server. For example, clients must not assume resource versions are numeric, and may only compare two resource versions for equality (i.e. must not compare resource versions for greater-than or less-than relationships).
## Resource versions
### ResourceVersion in metadata
Resource versions are strings that identify the server's internal version of an
object. Resource versions can be used by clients to determine when objects have
changed, or to express data consistency requirements when getting, listing and
watching resources. Resource versions must be treated as opaque by clients and passed
unmodified back to the server.
Clients find resource versions in resources, including the resources in watch events, and list responses returned from the server:
You must not assume resource versions are numeric or collatable. API clients may
only compare two resource versions for equality (this means that you must not compare
resource versions for greater-than or less-than relationships).
### `resourceVersion` fields in metadata {#resourceversion-in-metadata}
Clients find resource versions in resources, including the resources from the response
stream for a **watch**, or when using **list** to enumerate resources.
[v1.meta/ObjectMeta](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#objectmeta-v1-meta) - The `metadata.resourceVersion` of a resource instance identifies the resource version the instance was last modified at.
[v1.meta/ListMeta](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#listmeta-v1-meta) - The `metadata.resourceVersion` of a resource collection (i.e. a list response) identifies the resource version at which the list response was constructed.
[v1.meta/ListMeta](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#listmeta-v1-meta) - The `metadata.resourceVersion` of a resource collection (the response to a **list**) identifies the resource version at which the collection was constructed.
### The ResourceVersion Parameter
### `resourceVersion` parameters in query strings {#the-resourceversion-parameter}
The get, list, and watch operations support the `resourceVersion` parameter.
The **get**, **list**, and **watch** operations support the `resourceVersion` parameter.
From version v1.19, Kubernetes API servers also support the `resourceVersionMatch`
parameter on _list_ requests.
The exact meaning of this parameter differs depending on the operation and the value of `resourceVersion`.
The API server interprets the `resourceVersion` parameter differently depending
on the operation you request, and on the value of `resourceVersion`. If you set
`resourceVersionMatch` then this also affects the way matching happens.
For get and list, the semantics of resource version are:
### Semantics for **get** and **list**
**Get:**
For **get** and **list**, the semantics of `resourceVersion` are:
**get:**
| resourceVersion unset | resourceVersion="0" | resourceVersion="{value other than 0}" |
|-----------------------|---------------------|----------------------------------------|
| Most Recent | Any | Not older than |
**List:**
**list:**
v1.19+ API servers support the `resourceVersionMatch` parameter, which
determines how resourceVersion is applied to list calls. It is highly
recommended that `resourceVersionMatch` be set for list calls where
`resourceVersion` is set. If `resourceVersion` is unset, `resourceVersionMatch`
is not allowed. For backward compatibility, clients must tolerate the server
ignoring `resourceVersionMatch`:
From version v1.19, Kubernetes API servers support the `resourceVersionMatch` parameter
on _list_ requests. If you set both `resourceVersion` and `resourceVersionMatch`, the
`resourceVersionMatch` parameter determines how the API server interprets
`resourceVersion`.
- When using `resourceVersionMatch=NotOlderThan` and limit is set, clients must
handle HTTP 410 "Gone" responses. For example, the client might retry with a
newer `resourceVersion` or fall back to `resourceVersion=""`.
- When using `resourceVersionMatch=Exact` and `limit` is unset, clients must
verify that the `resourceVersion` in the `ListMeta` of the response matches
the requested `resourceVersion`, and handle the case where it does not. For
example, the client might fall back to a request with `limit` set.
You should always set the `resourceVersionMatch` parameter when setting
`resourceVersion` on a **list** request. However, be prepared to handle the case
where the API server that responds is unaware of `resourceVersionMatch`
and ignores it.
Unless you have strong consistency requirements, using `resourceVersionMatch=NotOlderThan` and
a known `resourceVersion` is preferable since it can achieve better performance and scalability
of your cluster than leaving `resourceVersion` and `resourceVersionMatch` unset, which requires
a quorum read to be served.
Setting the `resourceVersionMatch` parameter without setting `resourceVersion` is not valid.
This table explains the behavior of **list** requests with various combinations of
`resourceVersion` and `resourceVersionMatch`:
{{< table caption="resourceVersionMatch and paging parameters for list" >}}
| resourceVersionMatch param | paging params | resourceVersion unset | resourceVersion="0" | resourceVersion="{value other than 0}" |
| resourceVersionMatch param | paging params | resourceVersion not set | resourceVersion="0" | resourceVersion="{value other than 0}" |
|---------------------------------------|-------------------------------|-----------------------|-------------------------------------------|----------------------------------------|
| resourceVersionMatch unset | limit unset | Most Recent | Any | Not older than |
| resourceVersionMatch unset | limit=\<n\>, continue unset | Most Recent | Any | Exact |
| resourceVersionMatch unset | limit=\<n\>, continue=\<token\> | Continue Token, Exact | Invalid, treated as Continue Token, Exact | Invalid, HTTP `400 Bad Request` |
| resourceVersionMatch=Exact [1] | limit unset | Invalid | Invalid | Exact |
| resourceVersionMatch=Exact [1] | limit=\<n\>, continue unset | Invalid | Invalid | Exact |
| resourceVersionMatch=NotOlderThan [1] | limit unset | Invalid | Any | Not older than |
| resourceVersionMatch=NotOlderThan [1] | limit=\<n\>, continue unset | Invalid | Any | Not older than |
| _unset_ | _limit unset_ | Most Recent | Any | Not older than |
| _unset_ | limit=\<n\>, _continue unset_ | Most Recent | Any | Exact |
| _unset_ | limit=\<n\>, continue=\<token\> | Continue Token, Exact | Invalid, treated as Continue Token, Exact | Invalid, HTTP `400 Bad Request` |
| `resourceVersionMatch=Exact` | _limit unset_ | Invalid | Invalid | Exact |
| `resourceVersionMatch=Exact` | limit=\<n\>, _continue unset_ | Invalid | Invalid | Exact |
| `resourceVersionMatch=NotOlderThan` | _limit unset_ | Invalid | Any | Not older than |
| `resourceVersionMatch=NotOlderThan` | limit=\<n\>, _continue unset_ | Invalid | Any | Not older than |
{{< /table >}}
**Footnotes:**
{{< note >}}
If your cluster's API server does not honor the `resourceVersionMatch` parameter,
the behavior is the same as if you did not set it.
{{< /note >}}
[1] If the server does not honor the `resourceVersionMatch` parameter, it is treated as if it is unset.
The meanings of the **get** and **list** semantics are:
The meaning of the get and list semantics are:
- **Most Recent:** Return data at the most recent resource version. The returned data must be
consistent (i.e. served from etcd via a quorum read).
- **Any:** Return data at any resource version. The newest available resource version is preferred,
Any
: Return data at any resource version. The newest available resource version is preferred,
but strong consistency is not required; data at any resource version may be served. It is possible
for the request to return data at a much older resource version that the client has previously
observed, particularly in high availability configurations, due to partitions or stale
caches. Clients that cannot tolerate this should not use this semantic.
- **Not older than:** Return data at least as new as the provided resourceVersion. The newest
available data is preferred, but any data not older than the provided resourceVersion may be
served. For list requests to servers that honor the resourceVersionMatch parameter, this
guarantees that resourceVersion in the ListMeta is not older than the requested resourceVersion,
but does not make any guarantee about the resourceVersion in the ObjectMeta of the list items
since ObjectMeta.resourceVersion tracks when an object was last updated, not how up-to-date the
object is when served.
- **Exact:** Return data at the exact resource version provided. If the provided resourceVersion is
unavailable, the server responds with HTTP 410 "Gone". For list requests to servers that honor the
resourceVersionMatch parameter, this guarantees that resourceVersion in the ListMeta is the same as
the requested resourceVersion, but does not make any guarantee about the resourceVersion in the
ObjectMeta of the list items since ObjectMeta.resourceVersion tracks when an object was last
updated, not how up-to-date the object is when served.
- **Continue Token, Exact:** Return data at the resource version of the initial paginated list
call. The returned Continue Tokens are responsible for keeping track of the initially provided
resource version for all paginated list calls after the initial paginated list call.
Most recent
: Return data at the most recent resource version. The returned data must be
consistent (in detail: served from etcd via a quorum read).
For watch, the semantics of resource version are:
Not older than
: Return data at least as new as the provided `resourceVersion`. The newest
available data is preferred, but any data not older than the provided `resourceVersion` may be
served. For **list** requests to servers that honor the `resourceVersionMatch` parameter, this
guarantees that the collection's `.metadata.resourceVersion` is not older than the requested
`resourceVersion`, but does not make any guarantee about the `.metadata.resourceVersion` of any
of the items in that collection.
**Watch:**
Exact
: Return data at the exact resource version provided. If the provided `resourceVersion` is
unavailable, the server responds with HTTP 410 "Gone". For **list** requests to servers that honor the
`resourceVersionMatch` parameter, this guarantees that the collection's `.metadata.resourceVersion`
is the same as the `resourceVersion` you requested in the query string. That guarantee does
not apply to the `.metadata.resourceVersion` of any items within that collection.
Continue Token, Exact
: Return data at the resource version of the initial paginated **list** call. The returned _continue
tokens_ are responsible for keeping track of the initially provided resource version for all paginated
**list** calls after the initial paginated **list**.
{{< note >}}
When you **list** resources and receive a collection response, the response includes the
[metadata](/docs/reference/generated/kubernetes-api/v1.21/#listmeta-v1-meta) of the collection as
well as [object metadata](/docs/reference/generated/kubernetes-api/v1.21/#objectmeta-v1-meta)
for each item in that collection. For individual objects found within a collection response,
`.metadata.resourceVersion` tracks when that object was last updated, and not how up-to-date
the object is when served.
{{< /note >}}
When using `resourceVersionMatch=NotOlderThan` and limit is set, clients must
handle HTTP 410 "Gone" responses. For example, the client might retry with a
newer `resourceVersion` or fall back to `resourceVersion=""`.
When using `resourceVersionMatch=Exact` and `limit` is unset, clients must
verify that the collection's `.metadata.resourceVersion` matches
the requested `resourceVersion`, and handle the case where it does not. For
example, the client might fall back to a request with `limit` set.
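
For example, a paginated **list** pinned to the resource version from the earlier
example on this page (`10245`) might look like this sketch:

```shell
kubectl get --raw \
  '/api/v1/pods?limit=500&resourceVersion=10245&resourceVersionMatch=NotOlderThan'
```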
### Semantics for **watch**
For **watch**, the semantics of resource version are:
**watch:**
{{< table caption="resourceVersion for watch" >}}
@ -631,18 +912,67 @@ For watch, the semantics of resource version are:
{{< /table >}}
The meaning of the watch semantics are:
The meaning of those **watch** semantics are:
- **Get State and Start at Most Recent:** Start a watch at the most recent resource version, which must be consistent (i.e. served from etcd via a quorum read). To establish initial state, the watch begins with synthetic "Added" events of all resources instances that exist at the starting resource version. All following watch events are for all changes that occurred after the resource version the watch started at.
- **Get State and Start at Any:** Warning: Watches initialize this way may return arbitrarily stale data! Please review this semantic before using it, and favor the other semantics where possible. Start a watch at any resource version, the most recent resource version available is preferred, but not required; any starting resource version is allowed. It is possible for the watch to start at a much older resource version that the client has previously observed, particularly in high availability configurations, due to partitions or stale caches. Clients that cannot tolerate this should not start a watch with this semantic. To establish initial state, the watch begins with synthetic "Added" events for all resources instances that exist at the starting resource version. All following watch events are for all changes that occurred after the resource version the watch started at.
- **Start at Exact:** Start a watch at an exact resource version. The watch events are for all changes after the provided resource version. Unlike "Get State and Start at Most Recent" and "Get State and Start at Any", the watch is not started with synthetic "Added" events for the provided resource version. The client is assumed to already have the initial state at the starting resource version since the client provided the resource version.
Get State and Start at Any
: {{< caution >}} Watches initialized this way may return arbitrarily stale
data. Please review this semantic before using it, and favor the other semantics
where possible.
{{< /caution >}}
Start a **watch** at any resource version; the most recent resource version
available is preferred, but not required. Any starting resource version is
allowed. It is possible for the **watch** to start at a much older resource
version than the client has previously observed, particularly in high availability
configurations, due to partitions or stale caches. Clients that cannot tolerate
this apparent rewinding should not start a **watch** with this semantic. To
establish initial state, the **watch** begins with synthetic "Added" events for
all resource instances that exist at the starting resource version. All following
watch events are for all changes that occurred after the resource version the
**watch** started at.
Get State and Start at Most Recent
: Start a **watch** at the most recent resource version, which must be consistent
(in detail: served from etcd via a quorum read). To establish initial state,
the **watch** begins with synthetic "Added" events of all resource instances
that exist at the starting resource version. All following watch events are for
all changes that occurred after the resource version the **watch** started at.
Start at Exact
: Start a **watch** at an exact resource version. The watch events are for all changes
after the provided resource version. Unlike "Get State and Start at Most Recent"
and "Get State and Start at Any", the **watch** is not started with synthetic
"Added" events for the provided resource version. The client is assumed to already
have the initial state at the starting resource version since the client provided
the resource version.
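
For example, a sketch of a **watch** that starts at the exact resource version from
the earlier example on this page (`10245`), in the `test` namespace used elsewhere on
this page, could be:

```shell
# Streams watch events until interrupted (Ctrl+C).
kubectl get --raw \
  '/api/v1/namespaces/test/pods?watch=true&resourceVersion=10245'
```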
### "410 Gone" responses
Servers are not required to serve all older resource versions and may return a HTTP `410 (Gone)` status code if a client requests a resourceVersion older than the server has retained. Clients must be able to tolerate `410 (Gone)` responses. See [Efficient detection of changes](#efficient-detection-of-changes) for details on how to handle `410 (Gone)` responses when watching resources.
Servers are not required to serve all older resource versions and may return a HTTP
`410 (Gone)` status code if a client requests a `resourceVersion` older than the
server has retained. Clients must be able to tolerate `410 (Gone)` responses. See
[Efficient detection of changes](#efficient-detection-of-changes) for details on
how to handle `410 (Gone)` responses when watching resources.
If you request a a resourceVersion outside the applicable limit then, depending on whether a request is served from cache or not, the API server may reply with a `410 Gone` HTTP response.
If you request a `resourceVersion` outside the applicable limit then, depending
on whether a request is served from cache or not, the API server may reply with a
`410 Gone` HTTP response.
### Unavailable resource versions
Servers are not required to serve unrecognized resource versions. List and Get requests for unrecognized resource versions may wait briefly for the resource version to become available, should timeout with a `504 (Gateway Timeout)` if the provided resource versions does not become available in a reasonable amount of time, and may respond with a `Retry-After` response header indicating how many seconds a client should wait before retrying the request. Currently, the kube-apiserver also identifies these responses with a "Too large resource version" message. Watch requests for an unrecognized resource version may wait indefinitely (until the request timeout) for the resource version to become available.
Servers are not required to serve unrecognized resource versions. If you request
**list** or **get** for a resource version that the API server does not recognize,
then the API server may either:
* wait briefly for the resource version to become available, then time out with a
`504 (Gateway Timeout)` if the provided resource version does not become available
in a reasonable amount of time;
* respond with a `Retry-After` response header indicating how many seconds a client
should wait before retrying the request.
If you request a resource version that an API server does not recognize, the
kube-apiserver additionally identifies its error responses with a "Too large resource
version" message.
If you make a **watch** request for an unrecognized resource version, the API server
may wait indefinitely (until the request timeout) for the resource version to become
available.

View File

@ -71,7 +71,6 @@ their authors, not the Kubernetes team.
| Python | [github.com/tomplus/kubernetes_asyncio](https://github.com/tomplus/kubernetes_asyncio) |
| Python | [github.com/Frankkkkk/pykorm](https://github.com/Frankkkkk/pykorm) |
| Ruby | [github.com/abonas/kubeclient](https://github.com/abonas/kubeclient) |
| Ruby | [github.com/Ch00k/kuber](https://github.com/Ch00k/kuber) |
| Ruby | [github.com/k8s-ruby/k8s-ruby](https://github.com/k8s-ruby/k8s-ruby) |
| Ruby | [github.com/kontena/k8s-client](https://github.com/kontena/k8s-client) |
| Rust | [github.com/clux/kube-rs](https://github.com/clux/kube-rs) |

View File

@ -508,11 +508,3 @@ sub-resources that don't receive the resource object type. If you are
using Server Side Apply with such a sub-resource, the changed fields
won't be tracked.
{{< /caution >}}
## Disabling the feature
Server Side Apply is a beta feature, so it is enabled by default. To turn this
[feature gate](/docs/reference/command-line-tools-reference/feature-gates) off,
you need to include the `--feature-gates ServerSideApply=false` flag when
starting `kube-apiserver`. If you have multiple `kube-apiserver` replicas, all
should have the same flag setting.

View File

@ -93,7 +93,7 @@ when creating a cluster without an internet connection on its nodes.
See [Running kubeadm without an internet connection](/docs/reference/setup-tools/kubeadm/kubeadm-init#without-internet-connection) for more details.
Kubeadm allows you to use a custom image repository for the required images.
See [Using custom images](docs/reference/setup-tools/kubeadm/kubeadm-init#custom-images)
See [Using custom images](/docs/reference/setup-tools/kubeadm/kubeadm-init#custom-images)
for more details.
### Initializing your control-plane node

View File

@ -69,7 +69,11 @@ For more details please see the [Network Plugin Requirements](/docs/concepts/ext
## Check required ports
These
[required ports](/docs/reference/ports-and-protocols/)
need to be open in order for Kubernetes components to communicate with each other.
need to be open in order for Kubernetes components to communicate with each other. You can use telnet to check if a port is open. For example:
```shell
telnet 127.0.0.1 6443
```
The pod network plugin you use (see below) may also require certain ports to be
open. Since this differs with each pod network plugin, please see the
@ -239,7 +243,7 @@ sudo mkdir -p $DOWNLOAD_DIR
Install crictl (required for kubeadm / Kubelet Container Runtime Interface (CRI))
```bash
CRICTL_VERSION="v1.17.0"
CRICTL_VERSION="v1.22.0"
ARCH="amd64"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz
```

View File

@ -151,8 +151,10 @@ systemctl daemon-reload && systemctl restart kubelet
After the kubelet loads the new configuration, kubeadm writes the
`/etc/kubernetes/bootstrap-kubelet.conf` KubeConfig file, which contains a CA certificate and Bootstrap
Token. These are used by the kubelet to perform the TLS Bootstrap and obtain a unique
credential, which is stored in `/etc/kubernetes/kubelet.conf`. When this file is written, the kubelet
has finished performing the TLS Bootstrap.
credential, which is stored in `/etc/kubernetes/kubelet.conf`.
When the `/etc/kubernetes/kubelet.conf` file is written, the kubelet has finished performing the TLS Bootstrap.
Kubeadm deletes the `/etc/kubernetes/bootstrap-kubelet.conf` file after completing the TLS Bootstrap.
## The kubelet drop-in file for systemd

View File

@ -159,7 +159,7 @@ Calico, Canal, and Flannel CNI providers are verified to support HostPort.
For more information, see the [CNI portmap documentation](https://github.com/containernetworking/plugins/blob/master/plugins/meta/portmap/README.md).
If your network provider does not support the portmap CNI plugin, you may need to use the [NodePort feature of
services](/docs/concepts/services-networking/service/#nodeport) or use `HostNetwork=true`.
services](/docs/concepts/services-networking/service/#type-nodeport) or use `HostNetwork=true`.
## Pods are not accessible via their Service IP
@ -224,9 +224,17 @@ the `ca.key` you must sign the embedded certificates in the `kubelet.conf` exter
1. Copy this resulted `kubelet.conf` to `/etc/kubernetes/kubelet.conf` on the failed node.
1. Restart the kubelet (`systemctl restart kubelet`) on the failed node and wait for
`/var/lib/kubelet/pki/kubelet-client-current.pem` to be recreated.
1. Run `kubeadm init phase kubelet-finalize all` on the failed node. This will make the new
`kubelet.conf` file use `/var/lib/kubelet/pki/kubelet-client-current.pem` and will restart the kubelet.
1. Manually edit the `kubelet.conf` to point to the rotated kubelet client certificates, by replacing
`client-certificate-data` and `client-key-data` with:
```yaml
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
```
1. Restart the kubelet.
1. Make sure the node becomes `Ready`.
## Default NIC When using flannel as the pod network in Vagrant
The following error might indicate that something was wrong in the pod network:

View File

@ -33,13 +33,13 @@ To choose a tool which best fits your use case, read [this comparison](https://g
Provision servers with the following [requirements](https://github.com/kubernetes-sigs/kubespray#requirements):
* **Ansible v2.9 and python-netaddr is installed on the machine that will run Ansible commands**
* **Ansible v2.9 and python-netaddr are installed on the machine that will run Ansible commands**
* **Jinja 2.11 (or newer) is required to run the Ansible Playbooks**
* The target servers must have access to the Internet in order to pull docker images. Otherwise, additional configuration is required ([See Offline Environment](https://github.com/kubernetes-sigs/kubespray/blob/master/docs/offline-environment.md))
* The target servers are configured to allow **IPv4 forwarding**
* **Your ssh key must be copied** to all the servers part of your inventory
* The **firewalls are not managed**, you'll need to implement your own rules the way you used to. in order to avoid any issue during deployment you should disable your firewall
* If kubespray is ran from non-root user account, correct privilege escalation method should be configured in the target servers. Then the `ansible_become` flag or command parameters `--become` or `-b` should be specified
* **Your ssh key must be copied** to all the servers in your inventory
* **Firewalls are not managed by kubespray**. You'll need to implement appropriate rules as needed. You should disable your firewall in order to avoid any issues during deployment
* If kubespray is run from a non-root user account, a correct privilege escalation method should be configured on the target servers and the `ansible_become` flag or the command parameters `--become` or `-b` should be specified
Kubespray provides the following utilities to help provision your environment:
@ -85,7 +85,7 @@ Large deployments (100+ nodes) may require [specific adjustments](https://github
### (5/5) Verify the deployment
Kubespray provides a way to verify inter-pod connectivity and DNS resolve with [Netchecker](https://github.com/kubernetes-sigs/kubespray/blob/master/docs/netcheck.md). Netchecker ensures the netchecker-agents pods can resolve DNS requests and ping each over within the default namespace. Those pods mimic similar behavior of the rest of the workloads and serve as cluster health indicators.
Kubespray provides a way to verify inter-pod connectivity and DNS resolve with [Netchecker](https://github.com/kubernetes-sigs/kubespray/blob/master/docs/netcheck.md). Netchecker ensures the netchecker-agents pods can resolve DNS requests and ping each other within the default namespace. Those pods mimic similar behavior to the rest of the workloads and serve as cluster health indicators.
## Cluster operations

View File

@ -499,13 +499,6 @@ passed from the Kubernetes components (kubelet, kube-proxy) are unchanged.
The following list documents differences between how Pod container specifications
work between Windows and Linux:
* `limits.cpu` and `limits.memory` - Windows doesn't use hard limits
for CPU allocations. Instead, a share system is used.
The fields based on millicores are scaled into
relative shares that are followed by the Windows scheduler
See [`kuberuntime/helpers_windows.go`](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/helpers_windows.go),
and [Implementing resource controls for Windows containers](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/resource-controls)
in Microsoft's virtualization documentation.
* Huge pages are not implemented in the Windows container
runtime, and are not available. They require [asserting a user
privilege](https://docs.microsoft.com/en-us/windows/desktop/Memory/large-page-support)
@ -526,13 +519,13 @@ work between Windows and Linux:
* `securityContext.readOnlyRootFilesystem` -
not possible on Windows; write access is required for registry & system
processes to run inside the container
* `EcurityContext.runAsGroup` -
* `securityContext.runAsGroup` -
not possible on Windows as there is no GID support
* `ecurityContext.runAsNonRoot` -
Windows does not have a root user. The closest equivalent is `ContainerAdministrator`
which is an identity that doesn't exist on the node.
* `securityContext.runAsNonRoot` -
this setting will prevent containers from running as `ContainerAdministrator`
which is the closest equivalent to a root user on Windows.
* `securityContext.runAsUser` -
use [`runAsUsername`](/docs/tasks/configure-pod-container/configure-runasusername)
use [`runAsUserName`](/docs/tasks/configure-pod-container/configure-runasusername)
instead
* `securityContext.seLinuxOptions` -
not possible on Windows as SELinux is Linux-specific

View File

@ -27,7 +27,7 @@ Kubernetes {{< glossary_tooltip term_id="service" >}} object.
This task uses
[Services with external load balancers](/docs/tasks/access-application-cluster/create-external-load-balancer/), which
require a supported environment. If your environment does not support this, you can use a Service of type
[NodePort](/docs/concepts/services-networking/service/#nodeport) instead.
[NodePort](/docs/concepts/services-networking/service/#type-nodeport) instead.
<!-- lessoncontent -->

View File

@ -182,7 +182,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
```
1. Add the following line to the bottom of the `/etc/hosts` file on
your computer (you will need adminstrator access):
your computer (you will need administrator access):
```
172.17.0.15 hello-world.info

View File

@ -35,7 +35,7 @@ Dashboard also provides information on the state of Kubernetes resources in your
The Dashboard UI is not deployed by default. To deploy it, run the following command:
```
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.3.1/aio/deploy/recommended.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml
```
## Accessing the Dashboard UI

View File

@ -325,7 +325,7 @@ Here is an example:
```shell
ETCDCTL_API=3 etcdctl --endpoints 10.2.0.9:2379 snapshot restore snapshotdb
```
Another example for restoring using etcdutl options:
Another example for restoring using etcdctl options:
```shell
ETCDCTL_API=3 etcdctl --data-dir <data-dir-location> snapshot restore snapshotdb
```

View File

@ -1,225 +0,0 @@
---
reviewers:
- jszczepkowski
title: Set up a High-Availability Control Plane
content_type: task
aliases: [ '/docs/tasks/administer-cluster/highly-available-master/' ]
---
<!-- overview -->
{{< feature-state for_k8s_version="v1.5" state="alpha" >}}
You can replicate Kubernetes control plane nodes in `kube-up` or `kube-down` scripts for Google Compute Engine. However, these scripts are not suitable for any sort of production use; they are widely used in the project's CI.
This document describes how to use kube-up/down scripts to manage a highly available (HA) control plane and how HA control planes are implemented for use with GCE.
## {{% heading "prerequisites" %}}
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
<!-- steps -->
## Starting an HA-compatible cluster
To create a new HA-compatible cluster, you must set the following flags in your `kube-up` script:
* `MULTIZONE=true` - to prevent removal of control plane kubelets from zones different than server's default zone.
Required if you want to run control plane nodes in different zones, which is recommended.
* `ENABLE_ETCD_QUORUM_READ=true` - to ensure that reads from all API servers will return most up-to-date data.
If true, reads will be directed to leader etcd replica.
Setting this value to true is optional: reads will be more reliable but will also be slower.
Optionally, you can specify a GCE zone where the first control plane node is to be created.
Set the following flag:
* `KUBE_GCE_ZONE=zone` - zone where the first control plane node will run.
The following sample command sets up a HA-compatible cluster in the GCE zone europe-west1-b:
```shell
MULTIZONE=true KUBE_GCE_ZONE=europe-west1-b ENABLE_ETCD_QUORUM_READS=true ./cluster/kube-up.sh
```
Note that the commands above create a cluster with one control plane node;
however, you can add new control plane nodes to the cluster with subsequent commands.
## Adding a new control plane node
After you have created an HA-compatible cluster, you can add control plane nodes to it.
You add control plane nodes by using a `kube-up` script with the following flags:
* `KUBE_REPLICATE_EXISTING_MASTER=true` - to create a replica of an existing control plane
node.
* `KUBE_GCE_ZONE=zone` - zone where the control plane node will run.
Must be in the same region as other control plane nodes' zones.
You don't need to set the `MULTIZONE` or `ENABLE_ETCD_QUORUM_READS` flags,
as those are inherited from when you started your HA-compatible cluster.
The following sample command replicates the control plane node on an existing
HA-compatible cluster:
```shell
KUBE_GCE_ZONE=europe-west1-c KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh
```
## Removing a control plane node
You can remove a control plane node from an HA cluster by using a `kube-down` script with the following flags:
* `KUBE_DELETE_NODES=false` - to prevent deletion of the cluster's nodes (kubelets).
* `KUBE_GCE_ZONE=zone` - the zone from where the control plane node will be removed.
* `KUBE_REPLICA_NAME=replica_name` - (optional) the name of control plane node to
remove. If empty: any replica from the given zone will be removed.
The following sample command removes a control plane node from an existing HA cluster:
```shell
KUBE_DELETE_NODES=false KUBE_GCE_ZONE=europe-west1-c ./cluster/kube-down.sh
```
## Handling control plane node failures
If one of the control plane nodes in your HA cluster fails,
the best practice is to remove the node from your cluster and add a new control plane
node in the same zone.
The following sample commands demonstrate this process:
1. Remove the broken replica:
```shell
KUBE_DELETE_NODES=false KUBE_GCE_ZONE=replica_zone KUBE_REPLICA_NAME=replica_name ./cluster/kube-down.sh
```
<ol start="2"><li>Add a new node in place of the old one:</li></ol>
```shell
KUBE_GCE_ZONE=replica-zone KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh
```
## Best practices for replicating control plane nodes for HA clusters
* Try to place control plane nodes in different zones. During a zone failure, all
control plane nodes placed inside the zone will fail.
To survive zone failure, also place nodes in multiple zones
(see [multiple-zones](/docs/setup/best-practices/multiple-zones/) for details).
* Do not use a cluster with two control plane nodes. Consensus on a two-node
control plane requires both nodes running when changing persistent state.
As a result, both nodes must be available, and a failure of either node puts the cluster
into a majority-failure state.
A two-node control plane is thus inferior, in terms of HA, to a cluster with
one control plane node.
* When you add a control plane node, cluster state (etcd) is copied to a new instance.
If the cluster is large, it may take a long time to duplicate its state.
This operation may be sped up by migrating the etcd data directory, as described in
the [etcd administration guide](https://etcd.io/docs/v2.3/admin_guide/#member-migration)
(we are considering adding support for etcd data dir migration in the future).
<!-- discussion -->
## Implementation notes
![ha-control-plane](/docs/images/ha-control-plane.svg)
### Overview
The figure above illustrates three control plane nodes and their components in a highly available cluster. The control plane components operate as follows:
- etcd: instances are clustered together using consensus.
- Controllers, scheduler and cluster auto-scaler: only one instance of each will be active in a cluster using a lease mechanism.
- Add-on manager: each works independently to keep add-ons in sync.
In addition, a load balancer operating in front of the API servers routes external and internal traffic to the control plane nodes.
Each of the control plane nodes will run the following components in the following mode:
* etcd instance: all instances will be clustered together using consensus;
* API server: each server will talk to local etcd - all API servers in the cluster will be available;
* controllers, scheduler, and cluster auto-scaler: will use lease mechanism - only one instance of each of them will be active in the cluster;
* add-on manager: each manager will work independently trying to keep add-ons in sync.
In addition, there will be a load balancer in front of API servers that will route external and internal traffic to them.
### Load balancing
When starting the second control plane node, a load balancer containing the two replicas will be created
and the IP address of the first replica will be promoted to the IP address of the load balancer.
Similarly, after removal of the penultimate control plane node, the load balancer will be removed and its IP address will be assigned to the last remaining replica.
Please note that creation and removal of a load balancer are complex operations and may take some time (~20 minutes) to propagate.
### Control plane service & kubelets
Instead of trying to keep an up-to-date list of Kubernetes API servers in the Kubernetes service,
the system directs all traffic to the external IP:
* in case of a single node control plane, the IP points to the control plane node,
* in case of an HA control plane, the IP points to the load balancer in front of the control plane nodes.
Similarly, the external IP will be used by kubelets to communicate with the control plane.
### Control plane node certificates
Kubernetes generates TLS certificates for the external public IP and local IP for each control plane node.
There are no certificates for the ephemeral public IP for control plane nodes;
to access a control plane node via its ephemeral public IP, you must skip TLS verification.
### Clustering etcd
To allow etcd clustering, ports needed to communicate between etcd instances will be opened (for inside cluster communication).
To make such deployment secure, communication between etcd instances is authorized using SSL.
### API server identity
{{< feature-state state="alpha" for_k8s_version="v1.20" >}}
The API Server Identity feature is controlled by a
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
and is not enabled by default. You can activate API Server Identity by enabling
the feature gate named `APIServerIdentity` when you start the
{{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}:
```shell
kube-apiserver \
--feature-gates=APIServerIdentity=true \
# …and other flags as usual
```
During bootstrap, each kube-apiserver assigns a unique ID to itself. The ID is
in the format of `kube-apiserver-{UUID}`. Each kube-apiserver creates a
[Lease](/docs/reference/generated/kubernetes-api/{{< param "version" >}}//#lease-v1-coordination-k8s-io)
in the _kube-system_ {{< glossary_tooltip text="namespaces" term_id="namespace">}}.
The Lease name is the unique ID for the kube-apiserver. The Lease contains a
label `k8s.io/component=kube-apiserver`. Each kube-apiserver refreshes its
Lease every `IdentityLeaseRenewIntervalSeconds` (defaults to 10s). Each
kube-apiserver also checks all the kube-apiserver identity Leases every
`IdentityLeaseDurationSeconds` (defaults to 3600s), and deletes Leases that
have not been refreshed for more than `IdentityLeaseDurationSeconds`.
`IdentityLeaseRenewIntervalSeconds` and `IdentityLeaseDurationSeconds` can be
configured by kube-apiserver flags `identity-lease-renew-interval-seconds`
and `identity-lease-duration-seconds`.
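Because each identity Lease carries the `k8s.io/component=kube-apiserver` label described above, you can inspect them with a label selector. A minimal sketch, assuming the feature gate is enabled:
```shell
# List the per-instance kube-apiserver identity Leases in the kube-system namespace
kubectl -n kube-system get lease -l k8s.io/component=kube-apiserver
```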
Enabling this feature is a prerequisite for using features that involve HA API
server coordination (for example, the `StorageVersionAPI` feature gate).
## Additional reading
[Automated HA master deployment - design doc](https://git.k8s.io/community/contributors/design-proposals/cluster-lifecycle/ha_master.md)

View File

@ -165,7 +165,7 @@ These are advanced topics for users who need to integrate their organization's c
### Set up a signer
The Kubernetes Certificate Authority does not work out of the box.
You can configure an external signer such as [cert-manager](https://docs.cert-manager.io/en/latest/tasks/issuers/setup-ca.html), or you can use the built-in signer.
You can configure an external signer such as [cert-manager](https://cert-manager.io/docs/configuration/ca/), or you can use the built-in signer.
The built-in signer is part of [`kube-controller-manager`](/docs/reference/command-line-tools-reference/kube-controller-manager/).
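As a rough sketch, the built-in signer is activated by pointing `kube-controller-manager` at a CA certificate and key; the file paths below are illustrative assumptions:
```shell
kube-controller-manager \
  --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt \
  --cluster-signing-key-file=/etc/kubernetes/pki/ca.key \
  # ...and other flags as usual
```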

View File

@ -14,7 +14,7 @@ without root privileges, by using a {{< glossary_tooltip text="user namespace" t
This technique is also known as _rootless mode_.
{{< note >}}
This document describes how to run Kubernetes Node components (and hence pods) a non-root user.
This document describes how to run Kubernetes Node components (and hence pods) as a non-root user.
If you are just looking for how to run a pod as a non-root user, see [SecurityContext](/docs/tasks/configure-pod-container/security-context/).
{{< /note >}}
@ -141,6 +141,7 @@ the host with an external port forwarder, such as RootlessKit, slirp4netns, or
You can use the port forwarder from K3s.
See [Running K3s in Rootless Mode](https://rancher.com/docs/k3s/latest/en/advanced/#known-issues-with-rootless-mode)
for more details.
The implementation can be found in [the `pkg/rootlessports` package](https://github.com/k3s-io/k3s/blob/v1.22.3+k3s1/pkg/rootlessports/controller.go) of k3s.
### Configuring CRI
@ -152,8 +153,7 @@ containerd or CRI-O and ensure that it is running within the user namespace befo
Running CRI plugin of containerd in a user namespace is supported since containerd 1.4.
Running containerd within a user namespace requires the following configurations
in `/etc/containerd/containerd-config.toml`.
Running containerd within a user namespace requires the following configurations.
```toml
version = 2
@ -176,6 +176,9 @@ version = 2
SystemdCgroup = false
```
The default path of the configuration file is `/etc/containerd/config.toml`.
The path can be specified with `containerd -c /path/to/containerd/config.toml`.
{{% /tab %}}
{{% tab name="CRI-O" %}}
@ -183,7 +186,7 @@ Running CRI-O in a user namespace is supported since CRI-O 1.22.
CRI-O requires an environment variable `_CRIO_ROOTLESS=1` to be set.
The following configurations (in `/etc/crio/crio.conf`) are also recommended:
The following configurations are also recommended:
```toml
[crio]
@ -197,6 +200,8 @@ The following configurations (in `/etc/crio/crio.conf`) are also recommended:
cgroup_manager = "cgroupfs"
```
The default path of the configuration file is `/etc/crio/crio.conf`.
The path can be specified with `crio --config /path/to/crio/crio.conf`.
{{% /tab %}}
{{< /tabs >}}

View File

@ -1,6 +1,6 @@
---
title: Check whether Dockershim deprecation affects you
content_type: task
content_type: task
reviewers:
- SergeyKanzhelev
weight: 20
@ -26,16 +26,21 @@ When alternative container runtime is used, executing Docker commands may either
not work or yield unexpected output. This is how you can find whether you have a
dependency on Docker:
1. Make sure no privileged Pods execute Docker commands.
2. Check that scripts and apps running on nodes outside of Kubernetes
1. Make sure no privileged Pods execute Docker commands (like `docker ps`),
restart the Docker service (commands such as `systemctl restart docker.service`),
or modify Docker-specific files such as `/etc/docker/daemon.json`.
1. Check for any private registries or image mirror settings in the Docker
configuration file (like `/etc/docker/daemon.json`). Those typically need to
be reconfigured for another container runtime.
1. Check that scripts and apps running on nodes outside of your Kubernetes
infrastructure do not execute Docker commands. It might be:
- SSH to nodes to troubleshoot;
- Node startup scripts;
- Monitoring and security agents installed on nodes directly.
3. Third-party tools that perform above mentioned privileged operations. See
1. Check whether any third-party tools in use perform the above-mentioned privileged operations. See
[Migrating telemetry and security agents from dockershim](/docs/tasks/administer-cluster/migrating-from-dockershim/migrating-telemetry-and-security-agents)
for more information.
4. Make sure there is no indirect dependencies on dockershim behavior.
1. Make sure there are no indirect dependencies on dockershim behavior.
This is an edge case and unlikely to affect your application. Some tooling may be configured
to react to Docker-specific behaviors, for example, raise alert on specific metrics or search for
a specific log message as part of troubleshooting instructions.
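One quick, non-exhaustive check for the first item in the list above is to look for Pods that mount the Docker socket directly. A minimal sketch, assuming `jq` is available:
```shell
# Print namespace/name of Pods that mount the Docker Engine socket via hostPath
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
      | select(.spec.volumes[]?.hostPath.path == "/var/run/docker.sock")
      | "\(.metadata.namespace)/\(.metadata.name)"'
```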

View File

@ -0,0 +1,52 @@
---
title: Find Out What Container Runtime is Used on a Node
content_type: task
reviewers:
- SergeyKanzhelev
weight: 10
---
<!-- overview -->
This page outlines steps to find out what [container runtime](/docs/setup/production-environment/container-runtimes/)
the nodes in your cluster use.
Depending on the way you run your cluster, the container runtime for the nodes may
have been pre-configured or you need to configure it. If you're using a managed
Kubernetes service, there might be vendor-specific ways to check what container runtime is
configured for the nodes. The method described on this page should work whenever
the execution of `kubectl` is allowed.
## {{% heading "prerequisites" %}}
Install and configure `kubectl`. See [Install Tools](/docs/tasks/tools/#kubectl) section for details.
## Find out the container runtime used on a Node
Use `kubectl` to fetch and show node information:
```shell
kubectl get nodes -o wide
```
The output is similar to the following. The column `CONTAINER-RUNTIME` outputs
the runtime and its version.
```none
# For dockershim
NAME STATUS VERSION CONTAINER-RUNTIME
node-1 Ready v1.16.15 docker://19.3.1
node-2 Ready v1.16.15 docker://19.3.1
node-3 Ready v1.16.15 docker://19.3.1
```
```none
# For containerd
NAME STATUS VERSION CONTAINER-RUNTIME
node-1 Ready v1.19.6 containerd://1.4.1
node-2 Ready v1.19.6 containerd://1.4.1
node-3 Ready v1.19.6 containerd://1.4.1
```
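If you only want the runtime string for each node, a JSONPath query over `.status.nodeInfo.containerRuntimeVersion` works as well; a minimal sketch:
```shell
# Print each node's name and its container runtime (for example containerd://1.4.1)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
```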
Find out more information about container runtimes
on [Container Runtimes](/docs/setup/production-environment/container-runtimes/) page.

View File

@ -292,7 +292,7 @@ command line arguments to `kube-apiserver`:
* `--service-account-issuer`
* `--service-account-key-file`
* `--service-account-signing-key-file`
* `--api-audiences`
* `--api-audiences` (can be omitted)
{{< /note >}}
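Put together, the flags from the note above might look roughly like the following on a kube-apiserver invocation; the issuer URL and key file paths are illustrative assumptions, not required values:
```shell
kube-apiserver \
  --service-account-issuer=https://kubernetes.default.svc.cluster.local \
  --service-account-key-file=/etc/kubernetes/pki/sa.pub \
  --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \
  # --api-audiences can be omitted, as noted above
  # ...and other flags as usual
```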

View File

@ -90,7 +90,7 @@ plugins:
# Array of authenticated usernames to exempt.
usernames: []
# Array of runtime class names to exempt.
runtimeClassNames: []
runtimeClasses: []
# Array of namespaces to exempt.
namespaces: []
```

View File

@ -6,29 +6,36 @@ weight: 100
<!-- overview -->
This page shows how to create a Pod that uses a Secret to pull an image from a
private Docker registry or repository.
This page shows how to create a Pod that uses a
{{< glossary_tooltip text="Secret" term_id="secret" >}} to pull an image from a
private container image registry or repository.
{{% thirdparty-content single="true" %}}
## {{% heading "prerequisites" %}}
* {{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
* {{< include "task-tutorial-prereqs.md" >}}
* To do this exercise, you need a
[Docker ID](https://docs.docker.com/docker-id/) and password.
* To do this exercise, you need the `docker` command line tool, and a
[Docker ID](https://docs.docker.com/docker-id/) for which you know the password.
<!-- steps -->
## Log in to Docker
## Log in to Docker Hub
On your laptop, you must authenticate with a registry in order to pull a private image:
On your laptop, you must authenticate with a registry in order to pull a private image.
Use the `docker` tool to log in to Docker Hub. See the _log in_ section of
[Docker ID accounts](https://docs.docker.com/docker-id/#log-in) for more information.
```shell
docker login
```
When prompted, enter your Docker username and password.
When prompted, enter your Docker ID, and then the credential you want to use (access token,
or the password for your Docker ID).
The login process creates or updates a `config.json` file that holds an authorization token.
The login process creates or updates a `config.json` file that holds an authorization token. Review [how Kubernetes interprets this file](/docs/concepts/containers/images#config-json).
View the `config.json` file:
@ -171,14 +178,14 @@ You have successfully set your Docker credentials as a Secret called `regcred` i
## Create a Pod that uses your Secret
Here is a configuration file for a Pod that needs access to your Docker credentials in `regcred`:
Here is a manifest for an example Pod that needs access to your Docker credentials in `regcred`:
{{< codenew file="pods/private-reg-pod.yaml" >}}
Download the above file:
Download the above file onto your computer:
```shell
wget -O my-private-reg-pod.yaml https://k8s.io/examples/pods/private-reg-pod.yaml
curl -L -o my-private-reg-pod.yaml https://k8s.io/examples/pods/private-reg-pod.yaml
```
In file `my-private-reg-pod.yaml`, replace `<your-private-image>` with the path to an image in a private registry such as:
@ -200,10 +207,10 @@ kubectl get pod private-reg
## {{% heading "whatsnext" %}}
* Learn more about [Secrets](/docs/concepts/configuration/secret/).
* Learn more about [Secrets](/docs/concepts/configuration/secret/)
* or read the API reference for {{< api-reference page="config-and-storage-resources/secret-v1" >}}
* Learn more about [using a private registry](/docs/concepts/containers/images/#using-a-private-registry).
* Learn more about [adding image pull secrets to a service account](/docs/tasks/configure-pod-container/configure-service-account/#add-imagepullsecrets-to-a-service-account).
* See [kubectl create secret docker-registry](/docs/reference/generated/kubectl/kubectl-commands/#-em-secret-docker-registry-em-).
* See [Secret](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#secret-v1-core).
* See the `imagePullSecrets` field of [PodSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#podspec-v1-core).
* See the `imagePullSecrets` field within the [container definitions](/docs/reference/kubernetes-api/workload-resources/pod-v1/#containers) of a Pod

View File

@ -213,6 +213,13 @@ In order to highlight the full range of configuration, the Service you created
here uses a different port number than the Pods. For many real-world
Services, these values might be the same.
## Any Network Policy Ingress rules affecting the target Pods?
If you have deployed any NetworkPolicy ingress rules that may affect incoming
traffic to the `hostnames-*` Pods, review them to confirm that they allow the traffic.
Please refer to [Network Policies](/docs/concepts/services-networking/network-policies/) for more details.
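A quick way to review them (a minimal sketch; substitute the namespace your Pods run in and the policy names you find):
```shell
# List NetworkPolicies in the namespace of the hostnames-* Pods
kubectl get networkpolicy -n default
# Inspect a specific policy's podSelector and ingress rules
kubectl describe networkpolicy <policy-name> -n default
```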
## Does the Service work by DNS name?
One of the most common ways that clients consume a Service is through a DNS

View File

@ -71,15 +71,26 @@ config. Save it as `my-scheduler.yaml`:
{{< codenew file="admin/sched/my-scheduler.yaml" >}}
An important thing to note here is that the name of the scheduler specified as an
argument to the scheduler command in the container spec should be unique. This is the name that is matched against the value of the optional `spec.schedulerName` on pods, to determine whether this scheduler is responsible for scheduling a particular pod.
In the above manifest, you use a [KubeSchedulerConfiguration](/docs/reference/scheduling/config/)
to customize the behavior of your scheduler implementation. This configuration has been passed to
the `kube-scheduler` during initialization with the `--config` option. The `my-scheduler-config` ConfigMap stores the configuration file. The Pod of the `my-scheduler` Deployment mounts the `my-scheduler-config` ConfigMap as a volume.
Note also that we created a dedicated service account `my-scheduler` and bind the cluster role
In the aforementioned Scheduler Configuration, your scheduler implementation is represented via
a [KubeSchedulerProfile](/docs/reference/config-api/kube-scheduler-config.v1beta2/#kubescheduler-config-k8s-io-v1beta2-KubeSchedulerProfile).
{{< note >}}
To determine if a scheduler is responsible for scheduling a specific Pod, the `spec.schedulerName` field in a
PodTemplate or Pod manifest must match the `schedulerName` field of the `KubeSchedulerProfile`.
All schedulers running in the cluster must have unique names.
{{< /note >}}
Also, note that you create a dedicated service account `my-scheduler` and bind the ClusterRole
`system:kube-scheduler` to it so that it can acquire the same privileges as `kube-scheduler`.
Please see the
[kube-scheduler documentation](/docs/reference/command-line-tools-reference/kube-scheduler/) for
detailed description of other command line arguments.
a detailed description of other command line arguments, and the
[Scheduler Configuration reference](/docs/reference/config-api/kube-scheduler-config.v1beta2/) for
a detailed description of customizable `kube-scheduler` configurations.
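For orientation, the configuration stored in the `my-scheduler-config` ConfigMap boils down to a KubeSchedulerConfiguration like the following (this mirrors the manifest used later in this task):
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-scheduler
leaderElection:
  leaderElect: false
```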
## Run the second scheduler in the cluster
@ -110,11 +121,11 @@ pod in this list.
To run multiple-scheduler with leader election enabled, you must do the following:
First, update the following fields in your YAML file:
Update the following fields for the KubeSchedulerConfiguration in the `my-scheduler-config` ConfigMap in your YAML file:
* `--leader-elect=true`
* `--lock-object-namespace=<lock-object-namespace>`
* `--lock-object-name=<lock-object-name>`
* `leaderElection.leaderElect` to `true`
* `leaderElection.resourceNamespace` to `<lock-object-namespace>`
* `leaderElection.resourceName` to `<lock-object-name>`
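Sketched as YAML, those fields sit under `leaderElection` in the same KubeSchedulerConfiguration (the angle-bracket values are placeholders):
```yaml
leaderElection:
  leaderElect: true
  resourceNamespace: <lock-object-namespace>
  resourceName: <lock-object-name>
```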
{{< note >}}
The control plane creates the lock objects for you, but the namespace must already exist.
@ -168,8 +179,8 @@ scheduler in that pod spec. Let's look at three examples.
{{< codenew file="admin/sched/pod3.yaml" >}}
In this case, we specify that this pod should be scheduled using the scheduler that we
deployed - `my-scheduler`. Note that the value of `spec.schedulerName` should match the name supplied to the scheduler
command as an argument in the deployment config for the scheduler.
deployed - `my-scheduler`. Note that the value of `spec.schedulerName` should match the name supplied for the scheduler
in the `schedulerName` field of the corresponding `KubeSchedulerProfile`.
Save this file as `pod3.yaml` and submit it to the Kubernetes cluster.

View File

@ -25,15 +25,8 @@ You need to configure the API Server to use the Konnectivity service
and direct the network traffic to the cluster nodes:
1. Make sure that
the `ServiceAccountTokenVolumeProjection` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled. You can enable
[service account token volume protection](/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection)
by providing the following flags to the kube-apiserver:
```
--service-account-issuer=api
--service-account-signing-key-file=/etc/kubernetes/pki/sa.key
--api-audiences=system:konnectivity-server
```
the [Service Account Token Volume Projection](/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection)
feature is enabled in your cluster. It is enabled by default since Kubernetes v1.20.
1. Create an egress configuration file such as `admin/konnectivity/egress-selector-configuration.yaml`.
1. Set the `--egress-selector-config-file` flag of the API Server to the path of
your API Server egress configuration file.

View File

@ -12,8 +12,8 @@ card:
## {{% heading "prerequisites" %}}
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew latestVersion >}} client can communicate with v{{< skew prevMinorVersion >}}, v{{< skew latestVersion >}}, and v{{< skew nextMinorVersion >}} control planes.
Using the latest version of kubectl helps avoid unforeseen issues.
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew currentVersion >}} client can communicate with v{{< skew currentVersionAddMinor -1 >}}, v{{< skew currentVersionAddMinor 0 >}}, and v{{< skew currentVersionAddMinor 1 >}} control planes.
Using the latest compatible version of kubectl helps avoid unforeseen issues.
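To check the skew, you can compare your client version against the cluster's; a minimal sketch:
```shell
# Client version only (works without cluster access)
kubectl version --client
# Client and server versions together
kubectl version
```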
## Install kubectl on Linux
@ -130,7 +130,7 @@ For example, to download version {{< param "fullversion" >}} on Linux, type:
{{% /tab %}}
{{< tab name="Red Hat-based distributions" codelang="bash" >}}
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
@ -139,7 +139,7 @@ gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubectl
sudo yum install -y kubectl
{{< /tab >}}
{{< /tabs >}}

View File

@ -12,8 +12,8 @@ card:
## {{% heading "prerequisites" %}}
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew latestVersion >}} client can communicate with v{{< skew prevMinorVersion >}}, v{{< skew latestVersion >}}, and v{{< skew nextMinorVersion >}} control planes.
Using the latest version of kubectl helps avoid unforeseen issues.
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew currentVersion >}} client can communicate with v{{< skew currentVersionAddMinor -1 >}}, v{{< skew currentVersionAddMinor 0 >}}, and v{{< skew currentVersionAddMinor 1 >}} control planes.
Using the latest compatible version of kubectl helps avoid unforeseen issues.
## Install kubectl on macOS

View File

@ -12,8 +12,8 @@ card:
## {{% heading "prerequisites" %}}
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew latestVersion >}} client can communicate with v{{< skew prevMinorVersion >}}, v{{< skew latestVersion >}}, and v{{< skew nextMinorVersion >}} control planes.
Using the latest version of kubectl helps avoid unforeseen issues.
You must use a kubectl version that is within one minor version difference of your cluster. For example, a v{{< skew currentVersion >}} client can communicate with v{{< skew currentVersionAddMinor -1 >}}, v{{< skew currentVersionAddMinor 0 >}}, and v{{< skew currentVersionAddMinor 1 >}} control planes.
Using the latest compatible version of kubectl helps avoid unforeseen issues.
## Install kubectl on Windows

View File

@ -233,7 +233,7 @@ kubectl get events | grep hello-apparmor
We can verify that the container is actually running with that profile by checking its proc attr:
```shell
kubectl exec hello-apparmor cat /proc/1/attr/current
kubectl exec hello-apparmor -- cat /proc/1/attr/current
```
```
k8s-apparmor-example-deny-write (enforce)
@ -242,7 +242,7 @@ k8s-apparmor-example-deny-write (enforce)
Finally, we can see what happens if we try to violate the profile by writing to a file:
```shell
kubectl exec hello-apparmor touch /tmp/test
kubectl exec hello-apparmor -- touch /tmp/test
```
```
touch: /tmp/test: Permission denied

View File

@ -72,7 +72,7 @@ weight: 10
<div class="row">
<div class="col-md-8">
<p><b>The Control Plane is responsible for managing the cluster.</b> The Control Plane coordinates all activities in your cluster, such as scheduling applications, maintaining applications' desired state, scaling applications, and rolling out new updates.</p>
<p><b>A node is a VM or a physical computer that serves as a worker machine in a Kubernetes cluster.</b> Each node has a Kubelet, which is an agent for managing the node and communicating with the Kubernetes control plane. The node should also have tools for handling container operations, such as containerd or Docker. A Kubernetes cluster that handles production traffic should have a minimum of three nodes.</p>
<p><b>A node is a VM or a physical computer that serves as a worker machine in a Kubernetes cluster.</b> Each node has a Kubelet, which is an agent for managing the node and communicating with the Kubernetes control plane. The node should also have tools for handling container operations, such as containerd or Docker. A Kubernetes cluster that handles production traffic should have a minimum of three nodes because if one node goes down, both an etcd member and a control plane instance are lost, and redundancy is compromised. You can mitigate this risk by adding more control plane nodes.</p>
</div>
<div class="col-md-4">

View File

@ -164,7 +164,7 @@ The `client_address` is always the client pod's IP address, whether the client p
## Source IP for Services with `Type=NodePort`
Packets sent to Services with
[`Type=NodePort`](/docs/concepts/services-networking/service/#nodeport)
[`Type=NodePort`](/docs/concepts/services-networking/service/#type-nodeport)
are source NAT'd by default. You can test this by creating a `NodePort` Service:
```shell

View File

@ -266,7 +266,7 @@ to read the data from another.
The command below executes the `zkCli.sh` script to write `world` to the path `/hello` on the `zk-0` Pod in the ensemble.
```shell
kubectl exec zk-0 zkCli.sh create /hello world
kubectl exec zk-0 -- zkCli.sh create /hello world
```
```
@ -279,7 +279,7 @@ Created /hello
To get the data from the `zk-1` Pod use the following command.
```shell
kubectl exec zk-1 zkCli.sh get /hello
kubectl exec zk-1 -- zkCli.sh get /hello
```
The data that you created on `zk-0` is available on all the servers in the

View File

@ -6,7 +6,7 @@ metadata:
spec:
containers:
- name: dnsutils
image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
command:
- sleep
- "3600"

View File

@ -30,6 +30,22 @@ roleRef:
name: system:volume-scheduler
apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
name: my-scheduler-config
namespace: kube-system
data:
my-scheduler-config.yaml: |
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: my-scheduler
leaderElection:
leaderElect: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
@ -55,9 +71,7 @@ spec:
containers:
- command:
- /usr/local/bin/kube-scheduler
- --address=0.0.0.0
- --leader-elect=false
- --scheduler-name=my-scheduler
- --config=/etc/kubernetes/my-scheduler/my-scheduler-config.yaml
image: gcr.io/my-gcp-project/my-kube-scheduler:1.0
livenessProbe:
httpGet:
@ -74,7 +88,12 @@ spec:
cpu: '0.1'
securityContext:
privileged: false
volumeMounts: []
volumeMounts:
- name: config-volume
mountPath: /etc/kubernetes/my-scheduler
hostNetwork: false
hostPID: false
volumes: []
volumes:
- name: config-volume
configMap:
name: my-scheduler-config

View File

@ -0,0 +1,35 @@
apiVersion: v1
kind: Pod
metadata:
name: volume-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: all-in-one
mountPath: "/projected-volume"
readOnly: true
volumes:
- name: all-in-one
projected:
sources:
- secret:
name: mysecret
items:
- key: username
path: my-group/my-username
- downwardAPI:
items:
- path: "labels"
fieldRef:
fieldPath: metadata.labels
- path: "cpu_limit"
resourceFieldRef:
containerName: container-test
resource: limits.cpu
- configMap:
name: myconfigmap
items:
- key: config
path: my-group/my-config

View File

@ -0,0 +1,27 @@
apiVersion: v1
kind: Pod
metadata:
name: volume-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: all-in-one
mountPath: "/projected-volume"
readOnly: true
volumes:
- name: all-in-one
projected:
sources:
- secret:
name: mysecret
items:
- key: username
path: my-group/my-username
- secret:
name: mysecret2
items:
- key: password
path: my-group/my-password
mode: 511

View File

@ -0,0 +1,20 @@
apiVersion: v1
kind: Pod
metadata:
name: sa-token-test
spec:
containers:
- name: container-test
image: busybox
volumeMounts:
- name: token-vol
mountPath: "/service-account"
readOnly: true
volumes:
- name: token-vol
projected:
sources:
- serviceAccountToken:
audience: api
expirationSeconds: 3600
path: token

View File

@ -8,6 +8,6 @@ sitemap:
priority: 0.5
---
Release notes can be found by reading the [Changelog](https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG) that matches your Kubernetes version. View the changelog for {{< skew latestVersion >}} on [GitHub](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-{{< skew latestVersion >}}.md).
Release notes can be found by reading the [Changelog](https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG) that matches your Kubernetes version. View the changelog for {{< skew currentVersionAddMinor 0 >}} on [GitHub](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-{{< skew currentVersionAddMinor 0 >}}.md).
Alternately, release notes can be searched and filtered online at: [relnotes.k8s.io](https://relnotes.k8s.io). View filtered release notes for {{< skew latestVersion >}} on [relnotes.k8s.io](https://relnotes.k8s.io/?releaseVersions={{< skew latestVersion >}}.0).
Alternately, release notes can be searched and filtered online at: [relnotes.k8s.io](https://relnotes.k8s.io). View filtered release notes for {{< skew currentVersionAddMinor 0 >}} on [relnotes.k8s.io](https://relnotes.k8s.io/?releaseVersions={{< skew currentVersionAddMinor 0 >}}.0).

View File

@ -78,10 +78,11 @@ releases may also occur in between these.
| Monthly Patch Release | Cherry Pick Deadline | Target date |
| --------------------- | -------------------- | ----------- |
| October 2021 | 2021-10-22 | 2021-10-27 |
| November 2021 | 2021-11-12 | 2021-11-17 |
| December 2021 | 2021-12-10 | 2021-12-15 |
| January 2022 | 2022-01-14 | 2022-01-19 |
| February 2022 | 2022-02-11 | 2022-02-16 |
| March 2022 | 2022-03-11 | 2022-03-16 |
## Detailed Release History for Active Branches
@ -93,6 +94,7 @@ End of Life for **1.22** is **2022-10-28**
| PATCH RELEASE | CHERRY PICK DEADLINE | TARGET DATE | NOTE |
|---------------|----------------------|-------------|------|
| 1.22.4 | 2021-11-12 | 2021-11-17 | |
| 1.22.3 | 2021-10-22 | 2021-10-27 | |
| 1.22.2 | 2021-09-10 | 2021-09-15 | |
| 1.22.1 | 2021-08-16 | 2021-08-19 | |
@ -105,6 +107,7 @@ End of Life for **1.21** is **2022-06-28**
| PATCH RELEASE | CHERRY PICK DEADLINE | TARGET DATE | NOTE |
| ------------- | -------------------- | ----------- | ---------------------------------------------------------------------- |
| 1.21.7 | 2021-11-12 | 2021-11-17 | |
| 1.21.6 | 2021-10-22 | 2021-10-27 | |
| 1.21.5 | 2021-09-10 | 2021-09-15 | |
| 1.21.4 | 2021-08-07 | 2021-08-11 | |
@ -120,6 +123,7 @@ End of Life for **1.20** is **2022-02-28**
| PATCH RELEASE | CHERRY PICK DEADLINE | TARGET DATE | NOTE |
| ------------- | -------------------- | ----------- | ----------------------------------------------------------------------------------- |
| 1.20.13 | 2021-11-12 | 2021-11-17 | |
| 1.20.12 | 2021-10-22 | 2021-10-27 | |
| 1.20.11 | 2021-09-10 | 2021-09-15 | |
| 1.20.10 | 2021-08-07 | 2021-08-11 | |
@ -133,30 +137,6 @@ End of Life for **1.20** is **2022-02-28**
| 1.20.2 | 2021-01-08 | 2021-01-13 | |
| 1.20.1 | 2020-12-11 | 2020-12-18 | [Tagging Issue](https://groups.google.com/g/kubernetes-dev/c/dNH2yknlCBA) |
### 1.19
**1.19** enters maintenance mode on **2021-08-28**
End of Life for **1.19** is **2021-10-28**
| PATCH RELEASE | CHERRY PICK DEADLINE | TARGET DATE | NOTE |
| ------------- | -------------------- | ----------- | ------------------------------------------------------------------------- |
| 1.19.16 | 2021-10-22 | 2021-09-27 | |
| 1.19.15 | 2021-09-10 | 2021-09-15 | |
| 1.19.14 | 2021-08-07 | 2021-08-11 | |
| 1.19.13 | 2021-07-10 | 2021-07-14 | |
| 1.19.12 | 2021-06-12 | 2021-06-16 | |
| 1.19.11 | 2021-05-07 | 2021-05-12 | [Regression](https://groups.google.com/g/kubernetes-dev/c/KuF8s2zueFs) |
| 1.19.10 | 2021-04-09 | 2021-04-14 | |
| 1.19.9 | 2021-03-12 | 2021-03-17 | |
| 1.19.8 | 2021-02-12 | 2021-02-17 | |
| 1.19.7 | 2021-01-08 | 2021-01-13 | |
| 1.19.6 | 2020-12-11 | 2020-12-18 | [Tagging Issue](https://groups.google.com/g/kubernetes-dev/c/dNH2yknlCBA) |
| 1.19.5 | 2020-12-04 | 2020-12-09 | |
| 1.19.4 | 2020-11-06 | 2020-11-11 | |
| 1.19.3 | 2020-10-09 | 2020-10-14 | |
| 1.19.2 | 2020-09-11 | 2020-09-16 | |
| 1.19.1 | 2020-09-04 | 2020-09-09 | |
## Non-Active Branch History
@ -164,6 +144,7 @@ These releases are no longer supported.
| MINOR VERSION | FINAL PATCH RELEASE | EOL DATE | NOTE |
| ------------- | ------------------- | ---------- | ---------------------------------------------------------------------- |
| 1.19 | 1.19.16 | 2021-10-28 | |
| 1.18 | 1.18.20 | 2021-06-18 | Created to resolve regression introduced in 1.18.19 |
| 1.18 | 1.18.19 | 2021-05-12 | [Regression](https://groups.google.com/g/kubernetes-dev/c/KuF8s2zueFs) |
| 1.17 | 1.17.17 | 2021-01-13 | |

View File

@ -213,6 +213,6 @@ Example: [1.15 Release Team](https://git.k8s.io/sig-release/releases/release-1.1
[handbook-patch-release]: https://git.k8s.io/sig-release/release-engineering/role-handbooks/patch-release-team.md
[k-sig-release-releases]: https://git.k8s.io/sig-release/releases
[patches]: /patch-releases.md
[src]: https://git.k8s.io/community/committee-product-security/README.md
[src]: https://git.k8s.io/community/committee-security-response/README.md
[release-team]: https://git.k8s.io/sig-release/release-team/README.md
[security-release-process]: https://git.k8s.io/security/security-release-process.md

View File

@ -54,6 +54,7 @@ Amazon Web Services | https://aws.amazon.com/security/ |
Google Cloud Platform | https://cloud.google.com/security/ |
IBM Cloud | https://www.ibm.com/cloud/security |
Microsoft Azure | https://docs.microsoft.com/en-us/azure/security/azure-security |
Oracle Cloud Infrastructure | https://www.oracle.com/security/ |
VMWare VSphere | https://www.vmware.com/security/hardening-guides.html |
{{< /table >}}

File diff suppressed because it is too large

View File

@ -68,7 +68,7 @@ sudo apt-get update
sudo apt-get install -y kubectl
{{< /tab >}}
{{< tab name="CentOS, RHEL or Fedora" codelang="bash" >}}cat <<EOF > /etc/yum.repos.d/kubernetes.repo
{{< tab name="CentOS, RHEL or Fedora" codelang="bash" >}}cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
@ -77,7 +77,7 @@ gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubectl
sudo yum install -y kubectl
{{< /tab >}}
{{< /tabs >}}

View File

@ -68,7 +68,7 @@ echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/
sudo apt-get update
sudo apt-get install -y kubectl
{{< /tab >}}
{{< tab name="CentOS, RHEL or Fedora" codelang="bash" >}}cat <<EOF > /etc/yum.repos.d/kubernetes.repo
{{< tab name="CentOS, RHEL or Fedora" codelang="bash" >}}sudo cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
@ -77,7 +77,7 @@ gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubectl
sudo yum install -y kubectl
{{< /tab >}}
{{< /tabs >}}

View File

@ -773,7 +773,7 @@ Hal ini umumnya digunakan oleh pugasan server API untuk otentikasi dan otorisasi
<tr>
<td><b>system:persistent-volume-provisioner</b></td>
<td>Tidak ada</td>
<td>Mengizinkan akses ke sumber daya yang dibutuhkan oleh kebanyakan <a href="/docs/concepts/storage/persistent-volumes/#provisioner">penyedia volume dinamis</a>.</td>
<td>Mengizinkan akses ke sumber daya yang dibutuhkan oleh kebanyakan <a href="/id/docs/concepts/storage/persistent-volumes/#dinamis">penyedia volume dinamis</a>.</td>
</tr>
<tbody>
</table>

View File

@ -6,7 +6,7 @@ metadata:
spec:
containers:
- name: dnsutils
image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
command:
- sleep
- "3600"

View File

@ -35,7 +35,7 @@ Kubernetesクラスターの計画、セットアップ、設定の例を知る
## クラスターをセキュアにする
* [Certificates](/docs/concepts/cluster-administration/certificates/)では、異なるツールチェインを使用して証明書を作成する方法を説明します。
* [Certificates](/ja/docs/concepts/cluster-administration/certificates/)では、異なるツールチェインを使用して証明書を作成する方法を説明します。
* [Kubernetes コンテナの環境](/ja/docs/concepts/containers/container-environment/)では、Kubernetesード上でのKubeletが管理するコンテナの環境について説明します。

View File

@ -303,7 +303,7 @@ ServiceのclusterIPを発見するためにDNSのみを使う場合、このよ
### DNS
ユーザーは[アドオン](/docs/concepts/cluster-administration/addons/)を使ってKubernetesクラスターにDNS Serviceをセットアップできます(常にセットアップすべきです)。
ユーザーは[アドオン](/ja/docs/concepts/cluster-administration/addons/)を使ってKubernetesクラスターにDNS Serviceをセットアップできます(常にセットアップすべきです)。
CoreDNSなどのクラスター対応のDNSサーバーは新しいServiceや、各Service用のDNSレコードのセットのためにKubernetes APIを常に監視します。
もしクラスターを通してDNSが有効になっている場合、全てのPodはDNS名によって自動的にServiceに対する名前解決をするようにできるはずです。

View File

@ -678,7 +678,7 @@ Secretsの内容を読み取るとNamespaceのServiceAccountのクレデンシ
<tr>
<td><b>system:kube-dns</b></td>
<td><b><b>kube-system</b>Namespaceのサービスアカウントkube-dns</b></td>
<td><a href="/docs/concepts/services-networking/dns-pod-service/">kube-dns</a>コンポーネントのRole。</td>
<td><a href="/ja/docs/concepts/services-networking/dns-pod-service/">kube-dns</a>コンポーネントのRole。</td>
</tr>
<tr>
<td><b>system:kubelet-api-admin</b></td>
@ -698,7 +698,7 @@ Secretsの内容を読み取るとNamespaceのServiceAccountのクレデンシ
<tr>
<td><b>system:persistent-volume-provisioner</b></td>
<td>None</td>
<td>ほとんどの<a href="/docs/concepts/storage/persistent-volumes/#provisioner">dynamic volume provisioners</a>が必要とするリソースへのアクセスを許可します。</td>
<td>ほとんどの<a href="/ja/docs/concepts/storage/persistent-volumes/#provisioner">dynamic volume provisioners</a>が必要とするリソースへのアクセスを許可します。</td>
</tr>
</table>
@ -995,7 +995,7 @@ subjects:
--namespace=my-namespace
```
多くの[アドオン](https://kubernetes.io/docs/concepts/cluster-administration/addons/)は、
多くの[アドオン](/ja/docs/concepts/cluster-administration/addons/)は、
Namespace`kube-system`のサービスアカウント「default」として実行されます。
これらのアドオンをスーパーユーザーアクセスでの実行を許可するには、Namespace`kube-system`のサービスアカウント「default」のcluster-admin権限を付与します。

View File

@ -142,7 +142,7 @@ GA 버전과 중복된 사용 중단(deprecated)된 여러 베타 API가 1.22에
# 다가오는 릴리스 웨비나
이번 릴리스에 대한 중요 기능뿐만 아니라 업그레이드 계획을 위해 필요한 사용 중지된 사항이나 제거에 대한 사항을 학습하고 싶다면, 2021년 9월 7일에 쿠버네티스 1.22 릴리스 팀 웨비나에 참여하세요. 더 자세한 정보와 등록에 대해서는 CNCF 온라인 프로그램 사이트의 [이벤트 페이지](https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cncf-live-webinar-kubernetes-122-release/)를 확인하세요.
이번 릴리스에 대한 중요 기능뿐만 아니라 업그레이드 계획을 위해 필요한 사용 중지된 사항이나 제거에 대한 사항을 학습하고 싶다면, 2021년 10월 5일에 쿠버네티스 1.22 릴리스 팀 웨비나에 참여하세요. 더 자세한 정보와 등록에 대해서는 CNCF 온라인 프로그램 사이트의 [이벤트 페이지](https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cncf-live-webinar-kubernetes-122-release/)를 확인하세요.
# 참여하기

View File

@ -122,6 +122,9 @@ kubelets 은 자신의 노드 리소스를 생성/수정할 권한을 가진다.
kubectl cordon $NODENAME
```
보다 자세한 내용은 [안전하게 노드를 드레인(drain)하기](/docs/tasks/administer-cluster/safely-drain-node/)
를 참고한다.
{{< note >}}
{{< glossary_tooltip term_id="daemonset" >}}에 포함되는 일부 파드는
스케줄 불가 노드에서 실행될 수 있다. 일반적으로 데몬셋은 워크로드 애플리케이션을
@ -174,7 +177,7 @@ kubectl describe node <insert-node-name-here>
대신 코드화된 노드는 사양에 스케줄 불가로 표시된다.
{{< /note >}}
노드 컨디션은 JSON 오브젝트로 표현된다. 예를 들어, 다음 응답은 상태 양호한 노드를 나타낸다.
쿠버네티스 API에서, 노드의 컨디션은 노드 리소스의 `.status` 부분에 표현된다. 예를 들어, 다음의 JSON 구조는 상태가 양호한 노드를 나타낸다.
```json
"conditions": [
@ -189,20 +192,30 @@ kubectl describe node <insert-node-name-here>
]
```
ready 컨디션의 상태가 `pod-eviction-timeout` ({{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}에 전달된 인수) 보다 더 길게 `Unknown` 또는 `False`로 유지되는 경우, 노드 상에 모든 파드는 노드 컨트롤러에 의해 삭제되도록 스케줄 된다. 기본 축출 타임아웃 기간은 **5분** 이다. 노드에 접근이 불가할 때와 같은 경우, apiserver는 노드 상의 kubelet과 통신이 불가하다. apiserver와의 통신이 재개될 때까지 파드 삭제에 대한 결정은 kubelet에 전해질 수 없다. 그 사이, 삭제되도록 스케줄 되어진 파드는 분할된 노드 상에서 계속 동작할 수도 있다.
ready 컨디션의 `status``pod-eviction-timeout`
({{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager"
>}}에 전달된 인수)보다 더 길게 `Unknown` 또는 `False`로 유지되는 경우,
[노드 컨트롤러](#node-controller)가 해당 노드에 할당된 전체 파드에 대해
{{< glossary_tooltip text="API를 이용한 축출" term_id="api-eviction" >}}
을 트리거한다. 기본 축출 타임아웃 기간은
**5분** 이다.
노드에 접근이 불가할 때와 같은 경우, API 서버는 노드 상의 kubelet과 통신이 불가하다.
API 서버와의 통신이 재개될 때까지 파드 삭제에 대한 결정은 kubelet에 전해질 수 없다.
그 사이, 삭제되도록 스케줄 되어진 파드는 분할된 노드 상에서 계속 동작할 수도 있다.
노드 컨트롤러가 클러스터 내 동작 중지된 것을 확신할 때까지는 파드를
강제로 삭제하지 않는다. 파드가 `Terminating` 또는 `Unknown` 상태로 있을 때 접근 불가한 노드 상에서
노드 컨트롤러가 클러스터 내 동작 중지된 것을 확신할 때까지는 파드를 강제로 삭제하지 않는다.
파드가 `Terminating` 또는 `Unknown` 상태로 있을 때 접근 불가한 노드 상에서
동작되고 있는 것을 보게 될 수도 있다. 노드가 영구적으로 클러스터에서 삭제되었는지에
대한 여부를 쿠버네티스가 기반 인프라로부터 유추할 수 없는 경우, 노드가 클러스터를 영구적으로
탈퇴하게 되면, 클러스터 관리자는 손수 노드 오브젝트를 삭제해야 할 수도 있다.
쿠버네티스에서 노드 오브젝트를 삭제하면 노드 상에서 동작중인 모든 파드 오브젝트가
apiserver로부터 삭제되어 그 이름을 사용할 수 있는 결과를 낳는다.
API 서버로부터 삭제되어 그 이름을 사용할 수 있는 결과를 낳는다.
노드 수명주기 컨트롤러는 자동으로 컨디션을 나타내
노드에서 문제가 발생하면, 쿠버네티스 컨트롤 플레인은 자동으로 노드 상태에 영향을 주는 조건과 일치하
[테인트(taints)](/ko/docs/concepts/scheduling-eviction/taint-and-toleration/)를 생성한다.
스케줄러는 파드를 노드에 할당 할 때 노드의 테인트를 고려한다.
또한 파드는 노드의 테인트를 극복(tolerate)할 수 있는 톨러레이션(toleration)을 가질 수 있다.
또한 파드는 노드에 특정 테인트가 있더라도 해당 노드에서 동작하도록
{{< glossary_tooltip text="톨러레이션(toleration)" term_id="toleration" >}}을 가질 수 있다.
자세한 내용은
[컨디션별 노드 테인트하기](/ko/docs/concepts/scheduling-eviction/taint-and-toleration/#컨디션별-노드-테인트하기)를 참조한다.
@ -222,8 +235,34 @@ apiserver로부터 삭제되어 그 이름을 사용할 수 있는 결과를 낳
### 정보
커널 버전, 쿠버네티스 버전 (kubelet과 kube-proxy 버전), (사용하는 경우) Docker 버전, OS 이름과 같은노드에 대한 일반적인 정보를 보여준다.
이 정보는 Kubelet에 의해 노드로부터 수집된다.
커널 버전, 쿠버네티스 버전 (kubelet과 kube-proxy 버전), 컨테이너 런타임 상세 정보 및 노드가 사용하는 운영 체계가 무엇인지와 같은 노드에 대한 일반적인 정보가 기술된다.
이 정보는 Kubelet이 노드로부터 수집해서 쿠버네티스 API로 이를 보낸다.
## 하트비트
쿠버네티스 노드가 보내는 하트비트는 클러스터가 개별 노드가 가용한지를
판단할 수 있도록 도움을 주고, 장애가 발견된 경우 조치를 할 수 있게한다.
노드에는 두 가지 형태의 하트비트가 있다.
* 노드의 `.status`에 대한 업데이트
* `kube-node-lease`
{{< glossary_tooltip term_id="namespace" text="네임스페이스">}} 내의 [리스(Lease)](/docs/reference/kubernetes-api/cluster-resources/lease-v1/) 오브젝트.
각 노드는 연관된 리스 오브젝트를 갖는다.
노드의 `.status`와 비교해서, 리스는 경량의 리소스이다.
큰 규모의 클러스터에서는 리스를 하트비트에 사용해서 업데이트를 위해 필요한 성능 영향도를 줄일 수 있다.
kubelet은 노드의 `.status` 생성과 업데이트 및
관련된 리스의 업데이트를 담당한다.
- kubelet은 상태가 변경되거나 설정된 인터벌보다 오래 업데이트가 없는 경우 노드의 `.status`를 업데이트한다.
노드의 `.status` 업데이트에 대한 기본 인터벌은 접근이 불가능한 노드에 대한 타임아웃인 40초 보다 훨씬 긴 5분이다.
- kubelet은 리스 오브젝트를 (기본 업데이트 인터벌인) 매 10초마다 생성하고 업데이트한다.
리스 업데이트는 노드의 `.status` 업데이트와는 독립적이다.
만약 리스 업데이트가 실패하면, kubelet은 200밀리초에서 시작하고 7초의 상한을 갖는 지수적 백오프를 사용해서 재시도한다.
### 노드 컨트롤러
@ -241,39 +280,15 @@ apiserver로부터 삭제되어 그 이름을 사용할 수 있는 결과를 낳
세 번째는 노드의 동작 상태를 모니터링 하는 것이다. 노드 컨트롤러는
다음을 담당한다.
- 노드 다운과 같은 어떤 이유로 노드 컨트롤러가
하트비트 수신이 중단되는 경우 NodeStatus의 NodeReady
컨디션을 ConditionUnknown으로 업데이트 한다.
- 노드가 계속 접근 불가할 경우 나중에 노드로부터 정상적인 종료를 이용해서 모든 파드를 축출 한다.
ConditionUnknown을 알리기 시작하는 기본 타임아웃 값은 40초 이고,
파드를 축출하기 시작하는 값은 5분이다.
- 노드가 접근이 불가능한 상태가되는 경우, 노드의 `.status` 내에 있는 NodeReady 컨디션을 업데이트한다.
이 경우에는 노드 컨트롤러가 NodeReady 컨디션을 `ConditionUnknown`으로 설정한다.
- 노드에 계속 접근이 불가능한 상태로 남아있는 경우에는 해당 노드의 모든 파드에 대해서
[API를 이용한 축출](/docs/concepts/scheduling-eviction/api-eviction/)을 트리거한다.
기본적으로, 노드 컨트롤러는 노드를 `ConditionUnknown`으로 마킹한 뒤 5분을 기다렸다가 최초의 축출 요청을 시작한다.
노드 컨트롤러는 매 `--node-monitor-period` 초 마다 각 노드의 상태를 체크한다.
#### 하트비트
쿠버네티스 노드에서 보내는 하트비트는 노드의 가용성을 결정하는데 도움이 된다.
하트비트의 두 가지 형태는 `NodeStatus`
[리스(Lease) 오브젝트](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#lease-v1-coordination-k8s-io)이다.
각 노드에는 `kube-node-lease` 라는
{{< glossary_tooltip term_id="namespace" text="네임스페이스">}} 에 관련된 리스 오브젝트가 있다.
리스는 경량 리소스로, 클러스터가 확장될 때
노드의 하트비트 성능을 향상 시킨다.
kubelet은 `NodeStatus` 와 리스 오브젝트를 생성하고 업데이트 할
의무가 있다.
- kubelet은 상태가 변경되거나 구성된 상태에 대한 업데이트가 없는 경우,
`NodeStatus` 를 업데이트 한다. `NodeStatus` 의 기본 업데이트
주기는 5분으로, 연결할 수 없는 노드의 시간 제한인 40초
보다 훨씬 길다.
- kubelet은 10초마다 리스 오브젝트를 생성하고 업데이트 한다(기본 업데이트 주기).
리스 업데이트는 `NodeStatus` 업데이트와는
독립적으로 발생한다. 리스 업데이트가 실패하면 kubelet에 의해 재시도하며
7초로 제한된 지수 백오프를 200 밀리초에서 부터 시작한다.
#### 안정성
#### 축출 빈도 한계
대부분의 경우, 노드 컨트롤러는 초당 `--node-eviction-rate`(기본값 0.1)로
축출 속도를 제한한다. 이 말은 10초당 1개의 노드를 초과하여
@ -281,8 +296,8 @@ kubelet은 `NodeStatus` 와 리스 오브젝트를 생성하고 업데이트 할
노드 축출 행위는 주어진 가용성 영역 내 하나의 노드가 상태가 불량할
경우 변화한다. 노드 컨트롤러는 영역 내 동시에 상태가 불량한 노드의 퍼센티지가 얼마나 되는지
체크한다(NodeReady 컨디션은 ConditionUnknown 또는
ConditionFalse 다).
체크한다(NodeReady 컨디션은 `ConditionUnknown` 또는
`ConditionFalse` 다).
- 상태가 불량한 노드의 비율이 최소 `--unhealthy-zone-threshold`
(기본값 0.55)가 되면 축출 속도가 감소한다.
- 클러스터가 작으면 (즉 `--large-cluster-size-threshold`
@ -292,16 +307,15 @@ ConditionFalse 다).
이 정책들이 가용성 영역 단위로 실행되어지는 이유는 나머지가 연결되어 있는 동안
하나의 가용성 영역이 마스터로부터 분할되어 질 수도 있기 때문이다.
만약 클러스터가 여러 클라우드 제공사업자의 가용성 영역에 걸쳐 있지 않으면,
오직 하나의 가용성 영역만 (전체 클러스터) 존재하게 된다.
만약 클러스터가 여러 클라우드 제공사업자의 가용성 영역에 걸쳐 있지 않는 이상,
축출 매커니즘은 영역 별 가용성을 고려하지 않는다.
노드가 가용성 영역들에 걸쳐 퍼져 있는 주된 이유는 하나의 전체 영역이
장애가 발생할 경우 워크로드가 상태 양호한 영역으로 이전되어질 수 있도록 하기 위해서이다.
그러므로, 하나의 영역 내 모든 노드들이 상태가 불량하면 노드 컨트롤러는
`--node-eviction-rate` 의 정상 속도로 축출한다. 코너 케이스란 모든 영역이
완전히 상태불량 (즉 클러스터 내 양호한 노드가 없는 경우) 한 경우이다.
이러한 경우, 노드 컨트롤러는 마스터 연결에 문제가 있어 일부 연결이
복원될 때까지 모든 축출을 중지하는 것으로 여긴다.
완전히 상태불량(클러스터 내 양호한 노드가 없는 경우)한 경우이다.
이러한 경우, 노드 컨트롤러는 컨트롤 플레인과 노드 간 연결에 문제가 있는 것으로 간주하고 축출을 실행하지 않는다. (중단 이후 일부 노드가 다시 보이는 경우 노드 컨트롤러는 상태가 양호하지 않거나 접근이 불가능한 나머지 노드에서 파드를 축출한다.)
또한, 노드 컨트롤러는 파드가 테인트를 허용하지 않을 때 `NoExecute` 테인트 상태의
노드에서 동작하는 파드에 대한 축출 책임을 가지고 있다.
@ -309,7 +323,7 @@ ConditionFalse 다).
{{< glossary_tooltip text="테인트" term_id="taint" >}}를 추가한다.
이는 스케줄러가 비정상적인 노드에 파드를 배치하지 않게 된다.
### 노드 용량
## 리소스 용량 추적 {#node-capacity}
노드 오브젝트는 노드 리소스 용량에 대한 정보: 예를 들어, 사용 가능한 메모리의
양과 CPU의 수를 추적한다.

Some files were not shown because too many files have changed in this diff