Merge branch 'merged-master-dev-1.16' of github.com:simplytunde/website into merged-master-dev-1.16

This commit is contained in:
simplytunde 2019-09-09 07:48:14 -05:00
commit 68aad9888a
48 changed files with 3114 additions and 1413 deletions


@ -290,6 +290,13 @@ includes all containers started by the kubelet, but not containers started direc
If you want to explicitly reserve resources for non-Pod processes, follow this tutorial to
[reserve resources for system daemons](/docs/tasks/administer-cluster/reserve-compute-resources/#system-reserved).
## Node topology
{{< feature-state state="alpha" >}}
If you have enabled the `TopologyManager`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), then
the kubelet can use topology hints when making resource assignment decisions.
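As an illustration only, one way to enable the gate is through the kubelet configuration file; this is a minimal sketch (all other kubelet settings are omitted), and the `--feature-gates` command-line flag can be used instead:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Enables the alpha TopologyManager feature gate for this kubelet.
featureGates:
  TopologyManager: true
```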
## API Object
@ -298,3 +305,7 @@ API object can be found at:
[Node API object](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
{{% /capture %}}
{{% capture whatsnext %}}
* Read about [node components](https://kubernetes.io/docs/concepts/overview/components/#node-components)
* Read about node-level topology: [Control Topology Management Policies on a node](/docs/tasks/administer-cluster/topology-manager/)
{{% /capture %}}


@ -394,4 +394,8 @@ The design documents for
[node affinity](https://git.k8s.io/community/contributors/design-proposals/scheduling/nodeaffinity.md)
and for [inter-pod affinity/anti-affinity](https://git.k8s.io/community/contributors/design-proposals/scheduling/podaffinity.md) contain extra background information about these features.
Once a Pod is assigned to a Node, the kubelet runs the Pod and allocates node-local resources.
The [topology manager](/docs/tasks/administer-cluster/topology-manager/) can take part in node-level
resource allocation decisions.
{{% /capture %}}


@ -0,0 +1,62 @@
---
reviewers:
- dchen1107
- egernst
- tallclair
title: Pod Overhead
content_template: templates/concept
weight: 20
---
{{% capture overview %}}
{{< feature-state for_k8s_version="v1.16" state="alpha" >}}
When you run a Pod on a Node, the Pod itself consumes an amount of system resources. These
resources are in addition to the resources needed to run the container(s) inside the Pod.
_Pod Overhead_ is a feature for accounting for the resources consumed by the pod infrastructure
on top of the container requests & limits.
{{% /capture %}}
{{% capture body %}}
## Pod Overhead
In Kubernetes, the pod's overhead is set at
[admission](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#what-are-admission-webhooks)
time according to the overhead associated with the pod's
[RuntimeClass](https://kubernetes.io/docs/concepts/containers/runtime-class/).
When Pod Overhead is enabled, the overhead is considered in addition to the sum of container
resource requests when scheduling a Pod. Similarly, the kubelet includes the pod overhead when sizing
the pod cgroup, and when carrying out pod eviction ranking.
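For illustration, a RuntimeClass that declares a fixed per-Pod overhead, and a Pod that uses it, might look like the following sketch (the RuntimeClass name, handler, and overhead values are assumptions, not recommendations):
```yaml
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata-fc                  # hypothetical RuntimeClass
handler: kata-fc                 # assumes a CRI handler with this name is configured
overhead:
  podFixed:
    memory: "120Mi"              # assumed per-Pod memory cost of the runtime
    cpu: "250m"                  # assumed per-Pod CPU cost of the runtime
---
apiVersion: v1
kind: Pod
metadata:
  name: overhead-demo            # hypothetical Pod
spec:
  runtimeClassName: kata-fc      # the overhead above is applied to this Pod at admission
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: "500m"
        memory: "100Mi"
```
With the feature gate enabled, the scheduler would treat this Pod as requesting 750m CPU and 220Mi of memory in total.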
### Set Up
You need to make sure that the `PodOverhead`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled (it is off by default)
across your cluster. This means:
- in {{< glossary_tooltip text="kube-scheduler" term_id="kube-scheduler" >}}
- in {{< glossary_tooltip text="kube-apiserver" term_id="kube-apiserver" >}}
- in the {{< glossary_tooltip text="kubelet" term_id="kubelet" >}} on each Node
- in any custom API servers that use feature gates
{{< note >}}
Users who can write to RuntimeClass resources are able to have cluster-wide impact on
workload performance. You can limit access to this ability using Kubernetes access controls.
See [Authorization Overview](/docs/reference/access-authn-authz/authorization/) for more details.
{{< /note >}}
{{% /capture %}}
{{% capture whatsnext %}}
* [RuntimeClass](/docs/concepts/containers/runtime-class/)
* [PodOverhead Design](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/20190226-pod-overhead.md)
{{% /capture %}}


@ -0,0 +1,197 @@
---
reviewers:
- bsalamat
- k82cn
- ahg-g
title: Resource Bin Packing for Extended Resources
content_template: templates/concept
weight: 10
---
{{% capture overview %}}
{{< feature-state for_k8s_version="1.16" state="alpha" >}}
The kube-scheduler can be configured to enable bin packing of resources along with extended resources using the `RequestedToCapacityRatioResourceAllocation` priority function. Priority functions can be used to fine-tune the kube-scheduler according to custom needs.
{{% /capture %}}
{{% capture body %}}
## Enabling Bin Packing using RequestedToCapacityRatioResourceAllocation
Before Kubernetes 1.15, kube-scheduler allowed scoring nodes based on the request-to-capacity ratio of primary resources like CPU and memory. Kubernetes 1.16 added a new parameter to the priority function that lets users specify the resources, along with a weight for each resource, so that nodes are scored based on the request-to-capacity ratio. This allows users to bin pack extended resources with appropriate parameters, which improves the utilization of scarce resources in large clusters. The behavior of the `RequestedToCapacityRatioResourceAllocation` priority function is controlled by a configuration option called `requestedToCapacityRatioArguments`. This argument consists of two parameters, `shape` and `resources`. `shape` lets the user tune the function to behave as least requested or most requested based on `utilization` and `score` values. `resources`
consists of `name`, which specifies the resource to be considered during scoring, and `weight`, which specifies the weight of each resource.
Below is an example configuration that sets `requestedToCapacityRatioArguments` to enable bin packing behavior for the extended resources `intel.com/foo` and `intel.com/bar`:
```json
{
    "kind" : "Policy",
    "apiVersion" : "v1",
    ...
    "priorities" : [
        ...
        {
            "name": "RequestedToCapacityRatioPriority",
            "weight": 2,
            "argument": {
                "requestedToCapacityRatioArguments": {
                    "shape": [
                        {"utilization": 0, "score": 0},
                        {"utilization": 100, "score": 10}
                    ],
                    "resources": [
                        {"name": "intel.com/foo", "weight": 3},
                        {"name": "intel.com/bar", "weight": 5}
                    ]
                }
            }
        }
    ]
}
```
**This feature is disabled by default**
### Tuning RequestedToCapacityRatioResourceAllocation Priority Function
`shape` is used to specify the behavior of the `RequestedToCapacityRatioPriority` function.
```yaml
{"utilization": 0, "score": 0},
{"utilization": 100, "score": 10}
```
The above arguments give the node a score of 0 if utilization is 0% and 10 if utilization is 100%, thus enabling bin packing behavior. To enable least-requested behavior, the score values must be reversed as follows:
```yaml
{"utilization": 0, "score": 100},
{"utilization": 100, "score": 0}
```
`resources` is an optional parameter which by default is set to:
``` yaml
"resources": [
{"name": "CPU", "weight": 1},
{"name": "Memory", "weight": 1}
]
```
It can be used to add extended resources as follows:
```yaml
"resources": [
{"name": "intel.com/foo", "weight": 5},
{"name": "CPU", "weight": 3},
{"name": "Memory", "weight": 1}
]
```
The weight parameter is optional and is set to 1 if not specified. Also, the weight cannot be set to a negative value.
### How the RequestedToCapacityRatioResourceAllocation Priority Function Scores Nodes
This section is intended for those who want to understand the internal details
of this feature.
Below is an example of how the node score is calculated for a given set of values.
```
Requested Resources:

    intel.com/foo: 2
    Memory: 256MB
    CPU: 2

Resource Weights:

    intel.com/foo: 5
    Memory: 1
    CPU: 3

FunctionShapePoint {{0, 0}, {100, 10}}

Node 1 Spec:

    Available:
        intel.com/foo: 4
        Memory: 1 GB
        CPU: 8

    Used:
        intel.com/foo: 1
        Memory: 256MB
        CPU: 1

Node Score:

    intel.com/foo = resourceScoringFunction((2+1), 4)
                  = (100 - ((4-3)*100/4))
                  = (100 - 25)
                  = 75
                  = rawScoringFunction(75)
                  = 7

    Memory        = resourceScoringFunction((256+256), 1024)
                  = (100 - ((1024-512)*100/1024))
                  = 50
                  = rawScoringFunction(50)
                  = 5

    CPU           = resourceScoringFunction((2+1), 8)
                  = (100 - ((8-3)*100/8))
                  = 37.5
                  = rawScoringFunction(37.5)
                  = 3

    NodeScore = ((7 * 5) + (5 * 1) + (3 * 3)) / (5 + 1 + 3)
              = 5

Node 2 Spec:

    Available:
        intel.com/foo: 8
        Memory: 1GB
        CPU: 8

    Used:
        intel.com/foo: 2
        Memory: 512MB
        CPU: 6

Node Score:

    intel.com/foo = resourceScoringFunction((2+2), 8)
                  = (100 - ((8-4)*100/8))
                  = (100 - 50)
                  = 50
                  = rawScoringFunction(50)
                  = 5

    Memory        = resourceScoringFunction((256+512), 1024)
                  = (100 - ((1024-768)*100/1024))
                  = 75
                  = rawScoringFunction(75)
                  = 7

    CPU           = resourceScoringFunction((2+6), 8)
                  = (100 - ((8-8)*100/8))
                  = 100
                  = rawScoringFunction(100)
                  = 10

    NodeScore = ((5 * 5) + (7 * 1) + (10 * 3)) / (5 + 1 + 3)
              = 7
```
{{% /capture %}}


@ -57,10 +57,9 @@ implementation dependent. See the corresponding documentation ([below](#cri-conf
CRI implementation for how to configure.
{{< note >}}
RuntimeClass currently assumes a homogeneous node configuration across the cluster (which means that
all nodes are configured the same way with respect to container runtimes). Any heterogeneity
(varying configurations) must be managed independently of RuntimeClass through scheduling features
(see [Assigning Pods to Nodes](/docs/concepts/configuration/assign-pod-node/)).
RuntimeClass assumes a homogeneous node configuration across the cluster by default (which means
that all nodes are configured the same way with respect to container runtimes). To support
heterogeneous node configurations, see [Scheduling](#scheduling) below.
{{< /note >}}
The configurations have a corresponding `handler` name, referenced by the RuntimeClass. The
@ -147,6 +146,44 @@ table](https://github.com/kubernetes-sigs/cri-o/blob/master/docs/crio.conf.5.md#
See cri-o's config documentation for more details:
https://github.com/kubernetes-sigs/cri-o/blob/master/cmd/crio/config.go
### Scheduling
{{< feature-state for_k8s_version="v1.16" state="beta" >}}
As of Kubernetes v1.16, RuntimeClass includes support for heterogeneous clusters through its
`scheduling` fields. Through the use of these fields, you can ensure that pods running with this
RuntimeClass are scheduled to nodes that support it. To use the scheduling support, you must have
the RuntimeClass [admission controller][] enabled (the default, as of 1.16).
To ensure pods land on nodes supporting a specific RuntimeClass, that set of nodes should have a
common label which is then selected by the `runtimeclass.scheduling.nodeSelector` field. The
RuntimeClass's nodeSelector is merged with the pod's nodeSelector in admission, effectively taking
the intersection of the set of nodes selected by each. If there is a conflict, the pod will be
rejected.
If the supported nodes are tainted to prevent other RuntimeClass pods from running on the node, you
can add `tolerations` to the RuntimeClass. As with the `nodeSelector`, the tolerations are merged
with the pod's tolerations in admission, effectively taking the union of the set of nodes tolerated
by each.
To learn more about configuring the node selector and tolerations, see [Assigning Pods to
Nodes](/docs/concepts/configuration/assign-pod-node/).
[admission controller]: /docs/reference/access-authn-authz/admission-controllers/
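For illustration, a RuntimeClass that steers its pods to a dedicated set of nodes might look like the following sketch (the RuntimeClass name, handler, node label, and taint key are assumptions):
```yaml
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: sandboxed                  # hypothetical RuntimeClass
handler: sandbox-handler           # assumes a CRI handler with this name is configured
scheduling:
  nodeSelector:
    sandbox-runtime: "true"        # assumed label on the nodes that support this runtime
  tolerations:                     # assumes those nodes carry a matching taint
  - key: sandbox-runtime
    operator: Exists
    effect: NoSchedule
```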
### Pod Overhead
{{< feature-state for_k8s_version="v1.16" state="alpha" >}}
As of Kubernetes v1.16, RuntimeClass includes support for specifying overhead associated with
running a pod, as part of the [`PodOverhead`](/docs/concepts/configuration/pod-overhead.md) feature.
To use `PodOverhead`, you must have the PodOverhead [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
enabled (it is off by default).
Pod overhead is defined in RuntimeClass through the `Overhead` fields. Through the use of these fields,
you can specify the overhead of running pods utilizing this RuntimeClass and ensure these overheads
are accounted for in Kubernetes.
### Upgrading RuntimeClass from Alpha to Beta
@ -175,4 +212,11 @@ RuntimeClass feature to the beta version:
in the handler are no longer valid, and must be migrated to a valid handler configuration (see
above).
### Further Reading
- [RuntimeClass Design](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/runtime-class.md)
- [RuntimeClass Scheduling Design](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/runtime-class-scheduling.md)
- Read about the [Pod Overhead](/docs/concepts/configuration/pod-overhead/) concept
- [PodOverhead Feature Design](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/20190226-pod-overhead.md)
{{% /capture %}}


@ -174,7 +174,7 @@ Aggregated APIs offer more advanced API features and customization of other feat
| Feature | Description | CRDs | Aggregated API |
| ------- | ----------- | ---- | -------------- |
| Validation | Help users prevent errors and allow you to evolve your API independently of your clients. These features are most useful when there are many clients who can't all update at the same time. | Yes. Most validation can be specified in the CRD using [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation). Any other validations supported by addition of a [Validating Webhook](/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook-alpha-in-1-8-beta-in-1-9). | Yes, arbitrary validation checks |
| Defaulting | See above | Yes, either via [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#defaulting) `default` keyword (alpha in 1.15), or via a [Mutating Webhook](/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook-beta-in-1-9) | Yes |
| Defaulting | See above | Yes, either via [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#defaulting) `default` keyword (beta in 1.16), or via a [Mutating Webhook](/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook-beta-in-1-9) | Yes |
| Multi-versioning | Allows serving the same object through two API versions. Can help ease API changes like renaming fields. Less important if you control your client versions. | [Yes](/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definition-versioning) | Yes |
| Custom Storage | If you need storage with a different performance mode (for example, time-series database instead of key-value store) or isolation for security (for example, encryption secrets or different | No | Yes |
| Custom Business Logic | Perform arbitrary checks or actions when creating, reading, updating or deleting an object | Yes, using [Webhooks](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks). | Yes |
@ -183,7 +183,7 @@ Aggregated APIs offer more advanced API features and customization of other feat
| Other Subresources | Add operations other than CRUD, such as "logs" or "exec". | No | Yes |
| strategic-merge-patch | The new endpoints support PATCH with `Content-Type: application/strategic-merge-patch+json`. Useful for updating objects that may be modified both locally, and by the server. For more information, see ["Update API Objects in Place Using kubectl patch"](/docs/tasks/run-application/update-api-object-kubectl-patch/) | No | Yes |
| Protocol Buffers | The new resource supports clients that want to use Protocol Buffers | No | Yes |
| OpenAPI Schema | Is there an OpenAPI (swagger) schema for the types that can be dynamically fetched from the server? Is the user protected from misspelling field names by ensuring only allowed fields are set? Are types enforced (in other words, don't put an `int` in a `string` field?) | Yes, based on the [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation) schema (beta in 1.15) | Yes |
| OpenAPI Schema | Is there an OpenAPI (swagger) schema for the types that can be dynamically fetched from the server? Is the user protected from misspelling field names by ensuring only allowed fields are set? Are types enforced (in other words, don't put an `int` in a `string` field?) | Yes, based on the [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation) schema (GA in 1.16) | Yes |
### Common Features


@ -181,6 +181,9 @@ kube-scheduler has a default set of scheduling policies.
{{% /capture %}}
{{% capture whatsnext %}}
* Read about [scheduler performance tuning](/docs/concepts/scheduling/scheduler-perf-tuning/)
* Read about [Pod topology spread constraints](/docs/concepts/workloads/pods/pod-topology-spread-constraints/)
* Read the [reference documentation](/docs/reference/command-line-tools-reference/kube-scheduler/) for kube-scheduler
* Learn about [configuring multiple schedulers](https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/)
* Learn about [configuring multiple schedulers](/docs/tasks/administer-cluster/configure-multiple-schedulers/)
* Learn about [topology management policies](/docs/tasks/administer-cluster/topology-manager/)
* Learn about [Pod Overhead](/docs/concepts/configuration/pod-overhead/)
{{% /capture %}}


@ -98,7 +98,7 @@ Name: hostpath
Namespace: default
StorageClass: example-hostpath
Status: Terminating
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-class=example-hostpath
volume.beta.kubernetes.io/storage-provisioner=example.com/hostpath
@ -116,15 +116,15 @@ Annotations: <none>
Finalizers: [kubernetes.io/pv-protection]
StorageClass: standard
Status: Available
Claim:
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 1Gi
Message:
Source:
Type: HostPath (bare host directory volume)
Path: /tmp/data
HostPathType:
Events: <none>
```
@ -217,9 +217,9 @@ new `PersistentVolume` is never created to satisfy the claim. Instead, an existi
#### CSI Volume expansion
{{< feature-state for_k8s_version="v1.14" state="alpha" >}}
{{< feature-state for_k8s_version="v1.16" state="beta" >}}
CSI volume expansion requires enabling `ExpandCSIVolumes` feature gate and also requires specific CSI driver to support volume expansion. Please refer to documentation of specific CSI driver for more information.
Support for expanding CSI volumes is enabled by default, but it also requires the specific CSI driver to support volume expansion. Refer to the documentation of the specific CSI driver for more information.
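Expansion also requires the volume's StorageClass to allow it; as a minimal sketch (the class name and provisioner are assumptions), the class would set `allowVolumeExpansion`:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: resizable-csi            # hypothetical StorageClass
provisioner: csi.example.com     # assumes a CSI driver that implements volume expansion
allowVolumeExpansion: true       # permits increasing spec.resources.requests.storage on bound PVCs
```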
#### Resizing a volume containing a file system
@ -228,10 +228,10 @@ You can only resize volumes containing a file system if the file system is XFS,
When a volume contains a file system, the file system is only resized when a new Pod is using
the `PersistentVolumeClaim` in ReadWrite mode. File system expansion is either done when Pod is starting up
or is done when Pod is running and underlying file system supports online expansion.
FlexVolumes allow resize if the driver is set with the `RequiresFSResize` capability to true.
The FlexVolume can be resized on pod restart.
#### Resizing an in-use PersistentVolumeClaim
@ -682,7 +682,7 @@ spec:
## Volume Cloning
{{< feature-state for_k8s_version="v1.15" state="alpha" >}}
{{< feature-state for_k8s_version="v1.16" state="beta" >}}
Volume clone feature was added to support CSI Volume Plugins only. For details, see [volume cloning](/docs/concepts/storage/volume-pvc-datasource/).


@ -11,7 +11,7 @@ weight: 30
{{% capture overview %}}
{{< feature-state for_k8s_version="v1.15" state="alpha" >}}
{{< feature-state for_k8s_version="v1.16" state="beta" >}}
This document describes the concept of cloning existing CSI Volumes in Kubernetes. Familiarity with [Volumes](/docs/concepts/storage/volumes) is suggested.
This feature requires the `VolumePVCDataSource` feature gate to be enabled:
@ -32,7 +32,7 @@ The {{< glossary_tooltip text="CSI" term_id="csi" >}} Volume Cloning feature add
A Clone is defined as a duplicate of an existing Kubernetes Volume that can be consumed as any standard Volume would be. The only difference is that upon provisioning, rather than creating a "new" empty Volume, the back end device creates an exact duplicate of the specified Volume.
The implementation of cloning, from the perspective of the Kubernetes API, simply adds the ability to specify an existing, unbound PVC as a `dataSource` during new PVC creation.
Users need to be aware of the following when using this feature:
@ -40,6 +40,10 @@ Users need to be aware of the following when using this feature:
* Cloning support is only available for dynamic provisioners.
* CSI drivers may or may not have implemented the volume cloning functionality.
* You can only clone a PVC when it exists in the same namespace as the destination PVC (source and destination must be in the same namespace).
* Cloning is only supported within the same Storage Class.
    - Destination volume must be the same storage class as the source
    - Default storage class can be used and storageClassName omitted in the spec
## Provisioning
@ -54,6 +58,7 @@ metadata:
spec:
accessModes:
- ReadWriteOnce
storageClassName: cloning
resources:
requests:
storage: 5Gi
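The hunk above shows only part of the manifest. For illustration, a complete clone request might look like the following sketch (the PVC names are assumptions; the source PVC must already exist in the same namespace):
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clone-of-pvc-1           # hypothetical name for the new (cloned) PVC
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: cloning      # must match the storage class of the source PVC
  resources:
    requests:
      storage: 5Gi               # must be at least the size of the source volume
  dataSource:
    kind: PersistentVolumeClaim
    name: pvc-1                  # hypothetical existing PVC to clone from
```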


@ -0,0 +1,209 @@
---
reviewers:
- verb
- yujuhong
title: Ephemeral Containers
content_template: templates/concept
weight: 80
---
{{% capture overview %}}
{{< feature-state state="alpha" >}}
This page provides an overview of ephemeral containers: a special type of container
that runs temporarily in an existing {{< glossary_tooltip term_id="pod" >}} to accomplish user-initiated actions such
as troubleshooting. You use ephemeral containers to inspect services rather than
to build applications.
{{< warning >}}
Ephemeral containers are in early alpha state and are not suitable for production
clusters. You should expect the feature not to work in some situations, such as
when targeting the namespaces of a container. In accordance with the [Kubernetes
Deprecation Policy](/docs/reference/using-api/deprecation-policy/), this alpha
feature could change significantly in the future or be removed entirely.
{{< /warning >}}
{{% /capture %}}
{{% capture body %}}
## Understanding ephemeral containers
{{< glossary_tooltip text="Pods" term_id="pod" >}} are the fundamental building
block of Kubernetes applications. Since Pods are intended to be disposable and
replaceable, you cannot add a container to a Pod once it has been created.
Instead, you usually delete and replace Pods in a controlled fashion using
{{< glossary_tooltip text="deployments" term_id="deployment" >}}.
Sometimes, however, it's necessary to inspect the state of an existing Pod, for
example to troubleshoot a hard-to-reproduce bug. In these cases you can run
an ephemeral container in an existing Pod to inspect its state and run
arbitrary commands.
### What is an ephemeral container?
Ephemeral containers differ from other containers in that they lack guarantees
for resources or execution, and they will never be automatically restarted, so
they are not appropriate for building applications. Ephemeral containers are
described using the same `ContainerSpec` as regular containers, but many fields
are incompatible and disallowed for ephemeral containers.
- Ephemeral containers may not have ports, so fields such as `ports`,
`livenessProbe`, `readinessProbe` are disallowed.
- Pod resource allocations are immutable, so setting `resources` is disallowed.
- For a complete list of allowed fields, see the [EphemeralContainer reference
documentation](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#ephemeralcontainer-v1-core).
Ephemeral containers are created using a special `ephemeralcontainers` handler
in the API rather than by adding them directly to `pod.spec`, so it's not
possible to add an ephemeral container using `kubectl edit`.
As with regular containers, you may not change or remove an ephemeral container
after you have added it to a Pod.
## Uses for ephemeral containers
Ephemeral containers are useful for interactive troubleshooting when `kubectl
exec` is insufficient because a container has crashed or a container image
doesn't include debugging utilities.
In particular, [distroless images](https://github.com/GoogleContainerTools/distroless)
enable you to deploy minimal container images that reduce attack surface
and exposure to bugs and vulnerabilities. Since distroless images do not include a
shell or any debugging utilities, it's difficult to troubleshoot distroless
images using `kubectl exec` alone.
When using ephemeral containers, it's helpful to enable [process namespace
sharing](/docs/tasks/configure-pod-container/share-process-namespace/) so
you can view processes in other containers.
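As a minimal sketch (the Pod name and image are assumptions), process namespace sharing is turned on in the Pod spec itself:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  shareProcessNamespace: true    # lets containers (including ephemeral ones) see each other's processes
  containers:
  - name: nginx
    image: nginx
```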
### Examples
{{< note >}}
The examples in this section require the `EphemeralContainers` [feature
gate](/docs/reference/command-line-tools-reference/feature-gates/) to be
enabled, and a Kubernetes client and server of version v1.16 or later.
{{< /note >}}
The examples in this section demonstrate how ephemeral containers appear in
the API. Users would normally use a `kubectl` plugin for troubleshooting that
would automate these steps.
Ephemeral containers are created using the `ephemeralcontainers` subresource
of Pod, which can be demonstrated using `kubectl --raw`. First describe
the ephemeral container to add as an `EphemeralContainers` list:
```json
{
    "apiVersion": "v1",
    "kind": "EphemeralContainers",
    "metadata": {
        "name": "example-pod"
    },
    "ephemeralContainers": [{
        "command": [
            "sh"
        ],
        "image": "busybox",
        "imagePullPolicy": "IfNotPresent",
        "name": "debugger",
        "stdin": true,
        "tty": true,
        "terminationMessagePolicy": "File"
    }]
}
```
To update the ephemeral containers of the already running `example-pod`, save the manifest above as `ec.json` and run:
```shell
kubectl replace --raw /api/v1/namespaces/default/pods/example-pod/ephemeralcontainers -f ec.json
```
This will return the new list of ephemeral containers:
```json
{
   "kind":"EphemeralContainers",
   "apiVersion":"v1",
   "metadata":{
      "name":"example-pod",
      "namespace":"default",
      "selfLink":"/api/v1/namespaces/default/pods/example-pod/ephemeralcontainers",
      "uid":"a14a6d9b-62f2-4119-9d8e-e2ed6bc3a47c",
      "resourceVersion":"15886",
      "creationTimestamp":"2019-08-29T06:41:42Z"
   },
   "ephemeralContainers":[
      {
         "name":"debugger",
         "image":"busybox",
         "command":[
            "sh"
         ],
         "resources":{},
         "terminationMessagePolicy":"File",
         "imagePullPolicy":"IfNotPresent",
         "stdin":true,
         "tty":true
      }
   ]
}
```
You can view the state of the newly created ephemeral container using `kubectl describe`:
```shell
kubectl describe pod example-pod
```
```
...
Ephemeral Containers:
  debugger:
    Container ID:  docker://cf81908f149e7e9213d3c3644eda55c72efaff67652a2685c1146f0ce151e80f
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:9f1003c480699be56815db0f8146ad2e22efea85129b5b5983d0e0fb52d9ab70
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
    State:          Running
      Started:      Thu, 29 Aug 2019 06:42:21 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
...
```
You can attach to the new ephemeral container using `kubectl attach`:
```shell
kubectl attach -it example-pod -c debugger
```
If process namespace sharing is enabled, you can see processes from all the containers in that Pod. For example:
```shell
/ # ps auxww
PID   USER     TIME  COMMAND
    1 root      0:00 /pause
    6 root      0:00 nginx: master process nginx -g daemon off;
   11 101       0:00 nginx: worker process
   12 101       0:00 nginx: worker process
   13 101       0:00 nginx: worker process
   14 101       0:00 nginx: worker process
   15 101       0:00 nginx: worker process
   16 101       0:00 nginx: worker process
   17 101       0:00 nginx: worker process
   18 101       0:00 nginx: worker process
   19 root      0:00 /pause
   24 root      0:00 sh
   29 root      0:00 ps auxww
```
{{% /capture %}}


@ -101,7 +101,7 @@ Each probe has one of three results:
* Failure: The Container failed the diagnostic.
* Unknown: The diagnostic failed, so no action should be taken.
The kubelet can optionally perform and react to two kinds of probes on running
The kubelet can optionally perform and react to three kinds of probes on running
Containers:
* `livenessProbe`: Indicates whether the Container is running. If
@ -115,7 +115,15 @@ Containers:
state of readiness before the initial delay is `Failure`. If a Container does
not provide a readiness probe, the default state is `Success`.
### When should you use liveness or readiness probes?
* `startupProbe`: Indicates whether the application within the Container is started.
All other probes are disabled if a startup probe is provided, until it succeeds.
If the startup probe fails, the kubelet kills the Container, and the Container
is subjected to its [restart policy](#restart-policy). If a Container does not
provide a startup probe, the default state is `Success`.
### When should you use a liveness probe?
{{< feature-state for_k8s_version="v1.0" state="stable" >}}
If the process in your Container is able to crash on its own whenever it
encounters an issue or becomes unhealthy, you do not necessarily need a liveness
@ -125,6 +133,10 @@ with the Pod's `restartPolicy`.
If you'd like your Container to be killed and restarted if a probe fails, then
specify a liveness probe, and specify a `restartPolicy` of Always or OnFailure.
### When should you use a readiness probe?
{{< feature-state for_k8s_version="v1.0" state="stable" >}}
If you'd like to start sending traffic to a Pod only when a probe succeeds,
specify a readiness probe. In this case, the readiness probe might be the same
as the liveness probe, but the existence of the readiness probe in the spec means
@ -142,6 +154,13 @@ puts itself into an unready state regardless of whether the readiness probe exis
The Pod remains in the unready state while it waits for the Containers in the Pod
to stop.
### When should you use a startup probe?
{{< feature-state for_k8s_version="v1.16" state="alpha" >}}
If your Container usually starts in more than `initialDelaySeconds + failureThreshold × periodSeconds`, you should specify a startup probe that checks the same endpoint as the liveness probe. The default for `periodSeconds` is 10s.
You should then set its `failureThreshold` high enough to allow the Container to start, without changing the default values of the liveness probe. This helps to protect against deadlocks.
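As a minimal sketch (the image, port, and path are assumptions), a liveness probe paired with a startup probe that tolerates up to `failureThreshold × periodSeconds = 300s` of startup time could look like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-start-demo            # hypothetical Pod
spec:
  containers:
  - name: app
    image: registry.example/slow-app:1.0   # hypothetical slow-starting application
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz             # assumed health endpoint
        port: 8080
      periodSeconds: 10
    startupProbe:
      httpGet:
        path: /healthz             # same endpoint as the liveness probe
        port: 8080
      failureThreshold: 30         # allows up to 30 × 10s = 300s for startup
      periodSeconds: 10
```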
For more information about how to set up a liveness or readiness probe, see
[Configure Liveness and Readiness Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/).


@ -0,0 +1,209 @@
---
title: Pod Topology Spread Constraints
content_template: templates/concept
weight: 50
---
{{% capture overview %}}
{{< feature-state for_k8s_version="v1.16" state="alpha" >}}
You can use _topology spread constraints_ to control how {{< glossary_tooltip text="Pods" term_id="Pod" >}} are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
{{% /capture %}}
{{% capture body %}}
## Prerequisites
### Enable Feature Gate
Ensure the `EvenPodsSpread` feature gate is enabled (it is disabled by default
in 1.16). See [Feature Gates](/docs/reference/command-line-tools-reference/feature-gates/)
for an explanation of enabling feature gates. The `EvenPodsSpread` feature gate must be enabled for the
{{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}} **and**
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}}.
### Node Labels
Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in. For example, a Node might have labels: `node=node1,zone=us-east-1a,region=us-east-1`
Suppose you have a 4-node cluster with the following labels:
```
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready <none> 4m26s v1.16.0 node=node1,zone=zoneA
node2 Ready <none> 3m58s v1.16.0 node=node2,zone=zoneA
node3 Ready <none> 3m17s v1.16.0 node=node3,zone=zoneB
node4 Ready <none> 2m43s v1.16.0 node=node4,zone=zoneB
```
Then the cluster is logically viewed as below:
```
+---------------+---------------+
| zoneA | zoneB |
+-------+-------+-------+-------+
| node1 | node2 | node3 | node4 |
+-------+-------+-------+-------+
```
Instead of manually applying labels, you can also reuse the [well-known labels](/docs/reference/kubernetes-api/labels-annotations-taints/) that are created and populated automatically on most clusters.
## Spread Constraints for Pods
### API
The field `pod.spec.topologySpreadConstraints` is introduced in 1.16 as below:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: mypod
spec:
topologySpreadConstraints:
- maxSkew: <integer>
topologyKey: <string>
whenUnsatisfiable: <string>
labelSelector: <object>
```
You can define one or multiple `topologySpreadConstraint` entries to instruct the kube-scheduler how to place each incoming Pod in relation to the existing Pods across your cluster. The fields are:
- **maxSkew** describes the degree to which Pods may be unevenly distributed. It's the maximum permitted difference between the number of matching Pods in any two topology domains of a given topology type. It must be greater than zero.
- **topologyKey** is the key of node labels. If two Nodes are labelled with this key and have identical values for that label, the scheduler treats both Nodes as being in the same topology. The scheduler tries to place a balanced number of Pods into each topology domain.
- **whenUnsatisfiable** indicates how to deal with a Pod if it doesn't satisfy the spread constraint:
- `DoNotSchedule` (default) tells the scheduler not to schedule it.
- `ScheduleAnyway` tells the scheduler to still schedule it while prioritizing nodes that minimize the skew.
- **labelSelector** is used to find matching Pods. Pods that match this label selector are counted to determine the number of Pods in their corresponding topology domain. See [Label Selectors](/docs/concepts/overview/working-with-objects/labels/#label-selectors) for more details.
You can read more about this field by running `kubectl explain Pod.spec.topologySpreadConstraints`.
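Putting the fields together, a Pod that must be spread evenly across zones (with a skew of at most 1) could be sketched as follows; the Pod name, label values, and container are illustrative assumptions:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
  labels:
    foo: bar
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: zone              # counts Pods per value of the node label "zone"
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar                   # only Pods with this label are counted
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.1
```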
### Example: One TopologySpreadConstraint
Suppose you have a 4-node cluster where 3 Pods labeled `foo:bar` are located in node1, node2 and node3 respectively (`P` represents Pod):
```
+---------------+---------------+
| zoneA | zoneB |
+-------+-------+-------+-------+
| node1 | node2 | node3 | node4 |
+-------+-------+-------+-------+
| P | P | P | |
+-------+-------+-------+-------+
```
If we want an incoming Pod to be evenly spread with existing Pods across zones, the spec can be given as:
{{< codenew file="pods/topology-spread-constraints/one-constraint.yaml" >}}
`topologyKey: zone` implies the even distribution will only be applied to nodes which have the label pair "zone:<any value>" present. `whenUnsatisfiable: DoNotSchedule` tells the scheduler to let the incoming Pod stay pending if it can't satisfy the constraint.
If the scheduler placed this incoming Pod into "zoneA", the Pods distribution would become [3, 1], hence the actual skew is 2 (3 - 1) - which violates `maxSkew: 1`. In this example, the incoming Pod can only be placed onto "zoneB":
```
+---------------+---------------+ +---------------+---------------+
| zoneA | zoneB | | zoneA | zoneB |
+-------+-------+-------+-------+ +-------+-------+-------+-------+
| node1 | node2 | node3 | node4 | OR | node1 | node2 | node3 | node4 |
+-------+-------+-------+-------+ +-------+-------+-------+-------+
| P | P | P | P | | P | P | P P | |
+-------+-------+-------+-------+ +-------+-------+-------+-------+
```
You can tweak the Pod spec to meet various kinds of requirements:
- Change `maxSkew` to a bigger value like "2" so that the incoming Pod can be placed onto "zoneA" as well.
- Change `topologyKey` to "node" so as to distribute the Pods evenly across nodes instead of zones. In the above example, if `maxSkew` remains "1", the incoming Pod can only be placed onto "node4".
- Change `whenUnsatisfiable: DoNotSchedule` to `whenUnsatisfiable: ScheduleAnyway` to ensure the incoming Pod is always schedulable (assuming other scheduling APIs are satisfied). However, it's preferred to be placed onto the topology domain which has fewer matching Pods. (Be aware that this preference is jointly normalized with other internal scheduling priorities like resource usage ratio.)
### Example: Multiple TopologySpreadConstraints
This builds upon the previous example. Suppose you have a 4-node cluster where 3 Pods labeled `foo:bar` are located in node1, node2 and node3 respectively (`P` represents Pod):
```
+---------------+---------------+
| zoneA | zoneB |
+-------+-------+-------+-------+
| node1 | node2 | node3 | node4 |
+-------+-------+-------+-------+
| P | P | P | |
+-------+-------+-------+-------+
```
You can use 2 TopologySpreadConstraints to control the Pods spreading on both zone and node:
{{< codenew file="pods/topology-spread-constraints/two-constraints.yaml" >}}
In this case, to match the first constraint, the incoming Pod can only be placed onto "zoneB"; while in terms of the second constraint, the incoming Pod can only be placed onto "node4". Then the results of 2 constraints are ANDed, so the only viable option is to place on "node4".
Multiple constraints can lead to conflicts. Suppose you have a 3-node cluster across 2 zones:
```
+---------------+-------+
| zoneA | zoneB |
+-------+-------+-------+
| node1 | node2 | node3 |
+-------+-------+-------+
| P P | P | P P |
+-------+-------+-------+
```
If you apply "two-constraints.yaml" to this cluster, you will notice "mypod" stays in `Pending` state. This is because: to satisfy the first constraint, "mypod" can only be put to "zoneB"; while in terms of the second constraint, "mypod" can only put to "node2". Then a joint result of "zoneB" and "node2" returns nothing.
To overcome this situation, you can either increase the `maxSkew` or modify one of the constraints to use `whenUnsatisfiable: ScheduleAnyway`.
### Conventions
There are some implicit conventions worth noting here:
- Only the Pods holding the same namespace as the incoming Pod can be matching candidates.
- Nodes without `topologySpreadConstraints[*].topologyKey` present will be bypassed. It implies that:
  1. the Pods located on those nodes do not impact `maxSkew` calculation - in the above example, suppose "node1" does not have the label "zone", then the 2 Pods will be disregarded, hence the incoming Pod will be scheduled into "zoneA".
  2. the incoming Pod has no chance of being scheduled onto such nodes - in the above example, suppose a "node5" carrying the label `{zone-typo: zoneC}` joins the cluster, it will be bypassed due to the absence of the label key "zone".
- Be aware of what will happen if the incoming Pod's `topologySpreadConstraints[*].labelSelector` doesn't match its own labels. In the above example, if we remove the incoming Pod's labels, it can still be placed onto "zoneB" since the constraints are still satisfied. However, after the placement, the degree of imbalance of the cluster remains unchanged - it's still zoneA having 2 Pods which hold label {foo:bar}, and zoneB having 1 Pod which holds label {foo:bar}. So if this is not what you expect, we recommend that the workload's `topologySpreadConstraints[*].labelSelector` match its own labels.
- If the incoming Pod has `spec.nodeSelector` or `spec.affinity.nodeAffinity` defined, nodes not matching them will be bypassed.
Suppose you have a 5-node cluster ranging from zoneA to zoneC:
```
+---------------+---------------+-------+
| zoneA | zoneB | zoneC |
+-------+-------+-------+-------+-------+
| node1 | node2 | node3 | node4 | node5 |
+-------+-------+-------+-------+-------+
| P | P | P | | |
+-------+-------+-------+-------+-------+
```
and you know that "zoneC" must be excluded. In this case, you can compose the yaml as below, so that "mypod" will be placed onto "zoneB" instead of "zoneC". Similarly `spec.nodeSelector` is also respected.
{{< codenew file="pods/topology-spread-constraints/one-constraint-with-nodeaffinity.yaml" >}}
## Comparison with PodAffinity/PodAntiAffinity
In Kubernetes, directives related to "Affinity" control how Pods are
scheduled - more packed or more scattered.
- For `PodAffinity`, you can try to pack any number of Pods into qualifying
topology domain(s)
- For `PodAntiAffinity`, only one Pod can be scheduled into a
single topology domain.
The "EvenPodsSpread" feature provides flexible options to distribute Pods evenly across different
topology domains - to achieve high availability or cost-saving. This can also help with rolling updates of
workloads and with smoothly scaling out replicas.
See [Motivation](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/20190221-even-pods-spreading.md#motivation) for more details.
## Known Limitations
As of 1.16, in which this feature is alpha, there are some known limitations:
- Scaling down a `Deployment` may result in an imbalanced distribution of Pods.
- Pods matched on tainted nodes are respected. See [Issue 80921](https://github.com/kubernetes/kubernetes/issues/80921)
{{% /capture %}}


@ -83,7 +83,6 @@ different Kubernetes components.
| `CustomResourceValidation` | `true` | Beta | 1.9 | |
| `CustomResourceWebhookConversion` | `false` | Alpha | 1.13 | 1.14 |
| `CustomResourceWebhookConversion` | `true` | Beta | 1.15 | |
| `DebugContainers` | `false` | Alpha | 1.10 | |
| `DevicePlugins` | `false` | Alpha | 1.8 | 1.9 |
| `DevicePlugins` | `true` | Beta | 1.10 | |
| `DryRun` | `true` | Beta | 1.13 | |
@ -94,6 +93,7 @@ different Kubernetes components.
| `DynamicVolumeProvisioning` | `true` | Alpha | 1.3 | 1.7 |
| `DynamicVolumeProvisioning` | `true` | GA | 1.8 | |
| `EnableEquivalenceClassCache` | `false` | Alpha | 1.8 | |
| `EphemeralContainers` | `false` | Alpha | 1.16 | |
| `ExpandCSIVolumes` | `false` | Alpha | 1.14 | |
| `ExpandInUsePersistentVolumes` | `false` | Alpha | 1.11 | 1.14 |
| `ExpandInUsePersistentVolumes` | `true` | Beta | 1.15 | |
@ -101,6 +101,7 @@ different Kubernetes components.
| `ExpandPersistentVolumes` | `true` | Beta | 1.11 | |
| `ExperimentalCriticalPodAnnotation` | `false` | Alpha | 1.5 | |
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | |
| `EvenPodsSpread` | `false` | Alpha | 1.16 | |
| `GCERegionalPersistentDisk` | `true` | Beta | 1.10 | 1.12 |
| `GCERegionalPersistentDisk` | `true` | GA | 1.13 | - |
| `HugePages` | `false` | Alpha | 1.8 | 1.9 |
@ -130,13 +131,14 @@ different Kubernetes components.
| `PersistentLocalVolumes` | `false` | Alpha | 1.7 | 1.9 |
| `PersistentLocalVolumes` | `true` | Beta | 1.10 | 1.13 |
| `PersistentLocalVolumes` | `true` | GA | 1.14 | |
| `PodOverhead` | `false` | Alpha | 1.16 | - |
| `PodPriority` | `false` | Alpha | 1.8 | 1.10 |
| `PodPriority` | `true` | Beta | 1.11 | 1.13 |
| `PodPriority` | `true` | GA | 1.14 | |
| `PodReadinessGates` | `false` | Alpha | 1.11 | 1.11 |
| `PodReadinessGates` | `true` | Beta | 1.12 | 1.13 |
| `PodReadinessGates` | `true` | GA | 1.14 | - |
| `PodShareProcessNamespace` | `false` | Alpha | 1.10 | |
| `PodShareProcessNamespace` | `false` | Alpha | 1.10 | 1.11 |
| `PodShareProcessNamespace` | `true` | Beta | 1.12 | |
| `ProcMountType` | `false` | Alpha | 1.12 | |
| `PVCProtection` | `false` | Alpha | 1.9 | 1.9 |
@ -154,6 +156,7 @@ different Kubernetes components.
| `ServerSideApply` | `false` | Alpha | 1.14 | |
| `ServiceLoadBalancerFinalizer` | `false` | Alpha | 1.15 | |
| `ServiceNodeExclusion` | `false` | Alpha | 1.8 | |
| `StartupProbe` | `false` | Alpha | 1.16 | |
| `StorageObjectInUseProtection` | `true` | Beta | 1.10 | 1.10 |
| `StorageObjectInUseProtection` | `true` | GA | 1.11 | |
| `StorageVersionHash` | `false` | Alpha | 1.14 | 1.14 |
@ -177,7 +180,9 @@ different Kubernetes components.
| `TokenRequestProjection` | `false` | Alpha | 1.11 | 1.11 |
| `TokenRequestProjection` | `true` | Beta | 1.12 | |
| `TTLAfterFinished` | `false` | Alpha | 1.12 | |
| `VolumePVCDataSource` | `false` | Alpha | 1.15 | |
| `TopologyManager` | `false` | Alpha | 1.16 | |
| `VolumePVCDataSource` | `false` | Alpha | 1.15 | 1.15 |
| `VolumePVCDataSource` | `true` | Beta | 1.16 | |
| `VolumeScheduling` | `false` | Alpha | 1.9 | 1.9 |
| `VolumeScheduling` | `true` | Beta | 1.10 | 1.12 |
| `VolumeScheduling` | `true` | GA | 1.13 | |
@ -278,7 +283,6 @@ Each feature gate is designed for enabling/disabling a specific feature:
[CustomResourceDefinition](/docs/concepts/api-extension/custom-resources/).
- `CustomResourceWebhookConversion`: Enable webhook-based conversion
on resources created from [CustomResourceDefinition](/docs/concepts/api-extension/custom-resources/).
- `DebugContainers`: Enable running a "debugging" container in a Pod's namespace to
troubleshoot a running Pod.
- `DevicePlugins`: Enable the [device-plugins](/docs/concepts/cluster-administration/device-plugins/)
based resource provisioning on nodes.
@ -289,6 +293,9 @@ Each feature gate is designed for enabling/disabling a specific feature:
This feature is superseded by the `VolumeScheduling` feature completely in v1.12.
- `DynamicVolumeProvisioning`(*deprecated*): Enable the [dynamic provisioning](/docs/concepts/storage/dynamic-provisioning/) of persistent volumes to Pods.
- `EnableEquivalenceClassCache`: Enable the scheduler to cache equivalence of nodes when scheduling Pods.
- `EphemeralContainers`: Enable the ability to add {{< glossary_tooltip text="ephemeral containers"
term_id="ephemeral-container" >}} to running pods.
- `EvenPodsSpread`: Enable pods to be scheduled evenly across topology domains. See [Even Pods Spread](/docs/concepts/configuration/even-pods-spread).
- `ExpandInUsePersistentVolumes`: Enable expanding in-use PVCs. See [Resizing an in-use PersistentVolumeClaim](/docs/concepts/storage/persistent-volumes/#resizing-an-in-use-persistentvolumeclaim).
- `ExpandPersistentVolumes`: Enable the expanding of persistent volumes. See [Expanding Persistent Volumes Claims](/docs/concepts/storage/persistent-volumes/#expanding-persistent-volumes-claims).
- `ExperimentalCriticalPodAnnotation`: Enable annotating specific pods as *critical* so that their [scheduling is guaranteed](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/).
@ -314,13 +321,17 @@ Each feature gate is designed for enabling/disabling a specific feature:
For more details, please see [mount propagation](/docs/concepts/storage/volumes/#mount-propagation).
- `NodeDisruptionExclusion`: Enable use of the node label `node.kubernetes.io/exclude-disruption` which prevents nodes from being evacuated during zone failures.
- `NodeLease`: Enable the new Lease API to report node heartbeats, which could be used as a node health signal.
- `NonPreemptingPriority`: Enable NonPreempting option for PriorityClass and Pod.
- `PersistentLocalVolumes`: Enable the usage of `local` volume type in Pods.
Pod affinity has to be specified if requesting a `local` volume.
- `PodOverhead`: Enable the [PodOverhead](/docs/concepts/configuration/pod-overhead/) feature to account for pod overheads.
- `PodPriority`: Enable the descheduling and preemption of Pods based on their [priorities](/docs/concepts/configuration/pod-priority-preemption/).
- `PodReadinessGates`: Enable the setting of `PodReadinessGate` field for extending
Pod readiness evaluation.
For more details, please see [Pod readiness gate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate).
Pod readiness evaluation. See [Pod readiness gate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate)
for more details.
- `PodShareProcessNamespace`: Enable the setting of `shareProcessNamespace` in a Pod for sharing
a single process namespace between containers running in a pod. More details can be found in
[Share Process Namespace between Containers in a Pod](/docs/tasks/configure-pod-container/share-process-namespace/).
- `ProcMountType`: Enables control over ProcMountType for containers.
- `PVCProtection`: Enable the prevention of a PersistentVolumeClaim (PVC) from
being deleted when it is still used by any Pod.
@ -343,6 +354,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `ServiceLoadBalancerFinalizer`: Enable finalizer protection for Service load balancers.
- `ServiceNodeExclusion`: Enable the exclusion of nodes from load balancers created by a cloud provider.
A node is eligible for exclusion if labelled with "`alpha.service-controller.kubernetes.io/exclude-balancer`" key (when `LegacyNodeRoleBehavior` is on) or `node.kubernetes.io/exclude-from-external-load-balancers`.
- `StartupProbe`: Enable the [startup](/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-startup-probe) probe in the kubelet.
- `StorageObjectInUseProtection`: Postpone the deletion of PersistentVolume or
PersistentVolumeClaim objects if they are still being used.
- `StorageVersionHash`: Allow apiservers to expose the storage version hash in the discovery.


@ -570,7 +570,7 @@ kube-apiserver [flags]
<td colspan="2">--feature-gates mapStringBool</td>
</tr>
<tr>
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:<br/>APIListChunking=true|false (BETA - default=true)<br/>APIResponseCompression=true|false (ALPHA - default=false)<br/>AllAlpha=true|false (ALPHA - default=false)<br/>AppArmor=true|false (BETA - default=true)<br/>AttachVolumeLimit=true|false (BETA - default=true)<br/>BalanceAttachedNodeVolumes=true|false (ALPHA - default=false)<br/>BlockVolume=true|false (BETA - default=true)<br/>BoundServiceAccountTokenVolume=true|false (ALPHA - default=false)<br/>CPUManager=true|false (BETA - default=true)<br/>CRIContainerLogRotation=true|false (BETA - default=true)<br/>CSIBlockVolume=true|false (ALPHA - default=false)<br/>CSIDriverRegistry=true|false (ALPHA - default=false)<br/>CSINodeInfo=true|false (ALPHA - default=false)<br/>CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)<br/>CustomPodDNS=true|false (BETA - default=true)<br/>CustomResourceSubresources=true|false (BETA - default=true)<br/>CustomResourceValidation=true|false (BETA - default=true)<br/>CustomResourceWebhookConversion=true|false (ALPHA - default=false)<br/>DebugContainers=true|false (ALPHA - default=false)<br/>DevicePlugins=true|false (BETA - default=true)<br/>DryRun=true|false (BETA - default=true)<br/>DynamicAuditing=true|false (ALPHA - default=false)<br/>DynamicKubeletConfig=true|false (BETA - default=true)<br/>EnableEquivalenceClassCache=true|false (ALPHA - default=false)<br/>ExpandInUsePersistentVolumes=true|false (ALPHA - default=false)<br/>ExpandPersistentVolumes=true|false (BETA - default=true)<br/>ExperimentalCriticalPodAnnotation=true|false (ALPHA - default=false)<br/>ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)<br/>HugePages=true|false (BETA - default=true)<br/>HyperVContainer=true|false (ALPHA - default=false)<br/>Initializers=true|false (ALPHA - default=false)<br/>KubeletPodResources=true|false (ALPHA - default=false)<br/>LocalStorageCapacityIsolation=true|false (BETA - default=true)<br/>LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)<br/>MountContainers=true|false (ALPHA - default=false)<br/>NodeLease=true|false (ALPHA - default=false)<br/>PersistentLocalVolumes=true|false (BETA - default=true)<br/>PodPriority=true|false (BETA - default=true)<br/>PodReadinessGates=true|false (BETA - default=true)<br/>PodShareProcessNamespace=true|false (BETA - default=true)<br/>ProcMountType=true|false (ALPHA - default=false)<br/>QOSReserved=true|false (ALPHA - default=false)<br/>ResourceLimitsPriorityFunction=true|false (ALPHA - default=false)<br/>ResourceQuotaScopeSelectors=true|false (BETA - default=true)<br/>RotateKubeletClientCertificate=true|false (BETA - default=true)<br/>RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>RunAsGroup=true|false (ALPHA - default=false)<br/>RuntimeClass=true|false (ALPHA - default=false)<br/>SCTPSupport=true|false (ALPHA - default=false)<br/>ScheduleDaemonSetPods=true|false (BETA - default=true)<br/>ServiceNodeExclusion=true|false (ALPHA - default=false)<br/>StreamingProxyRedirects=true|false (BETA - default=true)<br/>SupportPodPidsLimit=true|false (ALPHA - default=false)<br/>Sysctls=true|false (BETA - default=true)<br/>TTLAfterFinished=true|false (ALPHA - default=false)<br/>TaintBasedEvictions=true|false (BETA - default=true)<br/>TaintNodesByCondition=true|false (BETA - default=true)<br/>TokenRequest=true|false (BETA - 
default=true)<br/>TokenRequestProjection=true|false (BETA - default=true)<br/>ValidateProxyRedirects=true|false (ALPHA - default=false)<br/>VolumeSnapshotDataSource=true|false (ALPHA - default=false)<br/>VolumeSubpathEnvExpansion=true|false (ALPHA - default=false)</td>
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:<br/>APIListChunking=true|false (BETA - default=true)<br/>APIResponseCompression=true|false (ALPHA - default=false)<br/>AllAlpha=true|false (ALPHA - default=false)<br/>AppArmor=true|false (BETA - default=true)<br/>AttachVolumeLimit=true|false (BETA - default=true)<br/>BalanceAttachedNodeVolumes=true|false (ALPHA - default=false)<br/>BlockVolume=true|false (BETA - default=true)<br/>BoundServiceAccountTokenVolume=true|false (ALPHA - default=false)<br/>CPUManager=true|false (BETA - default=true)<br/>CRIContainerLogRotation=true|false (BETA - default=true)<br/>CSIBlockVolume=true|false (ALPHA - default=false)<br/>CSIDriverRegistry=true|false (ALPHA - default=false)<br/>CSINodeInfo=true|false (ALPHA - default=false)<br/>CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)<br/>CustomPodDNS=true|false (BETA - default=true)<br/>CustomResourceSubresources=true|false (BETA - default=true)<br/>CustomResourceValidation=true|false (BETA - default=true)<br/>CustomResourceWebhookConversion=true|false (ALPHA - default=false)<br/>DebugContainers=true|false (ALPHA - default=false)<br/>DevicePlugins=true|false (BETA - default=true)<br/>DryRun=true|false (BETA - default=true)<br/>DynamicAuditing=true|false (ALPHA - default=false)<br/>DynamicKubeletConfig=true|false (BETA - default=true)<br/>EnableEquivalenceClassCache=true|false (ALPHA - default=false)<br/>EvenPodsSpread=true|false (ALPHA - default=false)<br/>ExpandInUsePersistentVolumes=true|false (ALPHA - default=false)<br/>ExpandPersistentVolumes=true|false (BETA - default=true)<br/>ExperimentalCriticalPodAnnotation=true|false (ALPHA - default=false)<br/>ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)<br/>HugePages=true|false (BETA - default=true)<br/>HyperVContainer=true|false (ALPHA - default=false)<br/>Initializers=true|false (ALPHA - default=false)<br/>KubeletPodResources=true|false (ALPHA - default=false)<br/>LocalStorageCapacityIsolation=true|false (BETA - default=true)<br/>LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)<br/>MountContainers=true|false (ALPHA - default=false)<br/>NodeLease=true|false (ALPHA - default=false)<br/>PersistentLocalVolumes=true|false (BETA - default=true)<br/>PodPriority=true|false (BETA - default=true)<br/>PodReadinessGates=true|false (BETA - default=true)<br/>PodShareProcessNamespace=true|false (BETA - default=true)<br/>ProcMountType=true|false (ALPHA - default=false)<br/>QOSReserved=true|false (ALPHA - default=false)<br/>ResourceLimitsPriorityFunction=true|false (ALPHA - default=false)<br/>ResourceQuotaScopeSelectors=true|false (BETA - default=true)<br/>RotateKubeletClientCertificate=true|false (BETA - default=true)<br/>RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>RunAsGroup=true|false (ALPHA - default=false)<br/>RuntimeClass=true|false (ALPHA - default=false)<br/>SCTPSupport=true|false (ALPHA - default=false)<br/>ScheduleDaemonSetPods=true|false (BETA - default=true)<br/>ServiceNodeExclusion=true|false (ALPHA - default=false)<br/>StreamingProxyRedirects=true|false (BETA - default=true)<br/>SupportPodPidsLimit=true|false (ALPHA - default=false)<br/>Sysctls=true|false (BETA - default=true)<br/>TTLAfterFinished=true|false (ALPHA - default=false)<br/>TaintBasedEvictions=true|false (BETA - default=true)<br/>TaintNodesByCondition=true|false (BETA - 
default=true)<br/>TokenRequest=true|false (BETA - default=true)<br/>TokenRequestProjection=true|false (BETA - default=true)<br/>ValidateProxyRedirects=true|false (ALPHA - default=false)<br/>VolumeSnapshotDataSource=true|false (ALPHA - default=false)<br/>VolumeSubpathEnvExpansion=true|false (ALPHA - default=false)</td>
</tr>
<tr>

View File

@ -153,7 +153,7 @@ kube-scheduler [flags]
<td colspan="2">--feature-gates mapStringBool</td>
</tr>
<tr>
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:<br/>APIListChunking=true|false (BETA - default=true)<br/>APIResponseCompression=true|false (ALPHA - default=false)<br/>AllAlpha=true|false (ALPHA - default=false)<br/>AppArmor=true|false (BETA - default=true)<br/>AttachVolumeLimit=true|false (BETA - default=true)<br/>BalanceAttachedNodeVolumes=true|false (ALPHA - default=false)<br/>BlockVolume=true|false (BETA - default=true)<br/>BoundServiceAccountTokenVolume=true|false (ALPHA - default=false)<br/>CPUManager=true|false (BETA - default=true)<br/>CRIContainerLogRotation=true|false (BETA - default=true)<br/>CSIBlockVolume=true|false (ALPHA - default=false)<br/>CSIDriverRegistry=true|false (ALPHA - default=false)<br/>CSINodeInfo=true|false (ALPHA - default=false)<br/>CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)<br/>CustomPodDNS=true|false (BETA - default=true)<br/>CustomResourceSubresources=true|false (BETA - default=true)<br/>CustomResourceValidation=true|false (BETA - default=true)<br/>CustomResourceWebhookConversion=true|false (ALPHA - default=false)<br/>DebugContainers=true|false (ALPHA - default=false)<br/>DevicePlugins=true|false (BETA - default=true)<br/>DryRun=true|false (BETA - default=true)<br/>DynamicAuditing=true|false (ALPHA - default=false)<br/>DynamicKubeletConfig=true|false (BETA - default=true)<br/>EnableEquivalenceClassCache=true|false (ALPHA - default=false)<br/>ExpandInUsePersistentVolumes=true|false (ALPHA - default=false)<br/>ExpandPersistentVolumes=true|false (BETA - default=true)<br/>ExperimentalCriticalPodAnnotation=true|false (ALPHA - default=false)<br/>ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)<br/>HugePages=true|false (BETA - default=true)<br/>HyperVContainer=true|false (ALPHA - default=false)<br/>Initializers=true|false (ALPHA - default=false)<br/>KubeletPodResources=true|false (ALPHA - default=false)<br/>LocalStorageCapacityIsolation=true|false (BETA - default=true)<br/>LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)<br/>MountContainers=true|false (ALPHA - default=false)<br/>NodeLease=true|false (ALPHA - default=false)<br/>PersistentLocalVolumes=true|false (BETA - default=true)<br/>PodPriority=true|false (BETA - default=true)<br/>PodReadinessGates=true|false (BETA - default=true)<br/>PodShareProcessNamespace=true|false (BETA - default=true)<br/>ProcMountType=true|false (ALPHA - default=false)<br/>QOSReserved=true|false (ALPHA - default=false)<br/>ResourceLimitsPriorityFunction=true|false (ALPHA - default=false)<br/>ResourceQuotaScopeSelectors=true|false (BETA - default=true)<br/>RotateKubeletClientCertificate=true|false (BETA - default=true)<br/>RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>RunAsGroup=true|false (ALPHA - default=false)<br/>RuntimeClass=true|false (ALPHA - default=false)<br/>SCTPSupport=true|false (ALPHA - default=false)<br/>ScheduleDaemonSetPods=true|false (BETA - default=true)<br/>ServiceNodeExclusion=true|false (ALPHA - default=false)<br/>StreamingProxyRedirects=true|false (BETA - default=true)<br/>SupportPodPidsLimit=true|false (ALPHA - default=false)<br/>Sysctls=true|false (BETA - default=true)<br/>TTLAfterFinished=true|false (ALPHA - default=false)<br/>TaintBasedEvictions=true|false (BETA - default=true)<br/>TaintNodesByCondition=true|false (BETA - default=true)<br/>TokenRequest=true|false (BETA - 
default=true)<br/>TokenRequestProjection=true|false (BETA - default=true)<br/>ValidateProxyRedirects=true|false (ALPHA - default=false)<br/>VolumeSnapshotDataSource=true|false (ALPHA - default=false)<br/>VolumeSubpathEnvExpansion=true|false (ALPHA - default=false)</td>
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:<br/>APIListChunking=true|false (BETA - default=true)<br/>APIResponseCompression=true|false (ALPHA - default=false)<br/>AllAlpha=true|false (ALPHA - default=false)<br/>AppArmor=true|false (BETA - default=true)<br/>AttachVolumeLimit=true|false (BETA - default=true)<br/>BalanceAttachedNodeVolumes=true|false (ALPHA - default=false)<br/>BlockVolume=true|false (BETA - default=true)<br/>BoundServiceAccountTokenVolume=true|false (ALPHA - default=false)<br/>CPUManager=true|false (BETA - default=true)<br/>CRIContainerLogRotation=true|false (BETA - default=true)<br/>CSIBlockVolume=true|false (ALPHA - default=false)<br/>CSIDriverRegistry=true|false (ALPHA - default=false)<br/>CSINodeInfo=true|false (ALPHA - default=false)<br/>CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)<br/>CustomPodDNS=true|false (BETA - default=true)<br/>CustomResourceSubresources=true|false (BETA - default=true)<br/>CustomResourceValidation=true|false (BETA - default=true)<br/>CustomResourceWebhookConversion=true|false (ALPHA - default=false)<br/>DebugContainers=true|false (ALPHA - default=false)<br/>DevicePlugins=true|false (BETA - default=true)<br/>DryRun=true|false (BETA - default=true)<br/>DynamicAuditing=true|false (ALPHA - default=false)<br/>DynamicKubeletConfig=true|false (BETA - default=true)<br/>EnableEquivalenceClassCache=true|false (ALPHA - default=false)<br/>EvenPodsSpread=true|false (ALPHA - default=false)<br/>ExpandInUsePersistentVolumes=true|false (ALPHA - default=false)<br/>ExpandPersistentVolumes=true|false (BETA - default=true)<br/>ExperimentalCriticalPodAnnotation=true|false (ALPHA - default=false)<br/>ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)<br/>HugePages=true|false (BETA - default=true)<br/>HyperVContainer=true|false (ALPHA - default=false)<br/>Initializers=true|false (ALPHA - default=false)<br/>KubeletPodResources=true|false (ALPHA - default=false)<br/>LocalStorageCapacityIsolation=true|false (BETA - default=true)<br/>LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)<br/>MountContainers=true|false (ALPHA - default=false)<br/>NodeLease=true|false (ALPHA - default=false)<br/>PersistentLocalVolumes=true|false (BETA - default=true)<br/>PodPriority=true|false (BETA - default=true)<br/>PodReadinessGates=true|false (BETA - default=true)<br/>PodShareProcessNamespace=true|false (BETA - default=true)<br/>ProcMountType=true|false (ALPHA - default=false)<br/>QOSReserved=true|false (ALPHA - default=false)<br/>ResourceLimitsPriorityFunction=true|false (ALPHA - default=false)<br/>ResourceQuotaScopeSelectors=true|false (BETA - default=true)<br/>RotateKubeletClientCertificate=true|false (BETA - default=true)<br/>RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>RunAsGroup=true|false (ALPHA - default=false)<br/>RuntimeClass=true|false (ALPHA - default=false)<br/>SCTPSupport=true|false (ALPHA - default=false)<br/>ScheduleDaemonSetPods=true|false (BETA - default=true)<br/>ServiceNodeExclusion=true|false (ALPHA - default=false)<br/>StreamingProxyRedirects=true|false (BETA - default=true)<br/>SupportPodPidsLimit=true|false (ALPHA - default=false)<br/>Sysctls=true|false (BETA - default=true)<br/>TTLAfterFinished=true|false (ALPHA - default=false)<br/>TaintBasedEvictions=true|false (BETA - default=true)<br/>TaintNodesByCondition=true|false (BETA - 
default=true)<br/>TokenRequest=true|false (BETA - default=true)<br/>TokenRequestProjection=true|false (BETA - default=true)<br/>ValidateProxyRedirects=true|false (ALPHA - default=false)<br/>VolumeSnapshotDataSource=true|false (ALPHA - default=false)<br/>VolumeSubpathEnvExpansion=true|false (ALPHA - default=false)</td>
</tr>
<tr>

View File

@ -0,0 +1,18 @@
---
title: Ephemeral Container
id: ephemeral-container
date: 2019-08-26
full_link: /docs/concepts/workloads/pods/ephemeral-containers/
short_description: >
  A type of container that you can temporarily run inside a Pod
aka:
tags:
- fundamental
---
A {{< glossary_tooltip term_id="container" >}} type that you can temporarily run inside a {{< glossary_tooltip term_id="pod" >}}.
<!--more-->
If you want to investigate a Pod that is having problems while running, you can add an ephemeral container to that Pod and carry out diagnostics. Ephemeral containers have no resource or scheduling guarantees, and you should not use them to run any part of the workload itself.

View File

@ -8,13 +8,9 @@ content_template: templates/concept
weight: 50
---
{{% capture overview %}}
Beginning with v1.8.0, kubeadm uploads the configuration of your cluster to a ConfigMap called
`kubeadm-config` in the `kube-system` namespace, and later reads the ConfigMap when upgrading.
This enables correct configuration of system components, and provides a seamless user experience.
You can execute `kubeadm config view` to view the ConfigMap. If you initialized your cluster using
kubeadm v1.7.x or lower, you must use `kubeadm config upload` to create the ConfigMap before you
may use `kubeadm upgrade`.
During `kubeadm init`, kubeadm uploads the `ClusterConfiguration` object to your cluster
in a ConfigMap called `kubeadm-config` in the `kube-system` namespace. This configuration is then read during
`kubeadm join`, `kubeadm reset` and `kubeadm upgrade`. To view this ConfigMap call `kubeadm config view`.
You can use `kubeadm config print` to print the default configuration and `kubeadm config migrate` to
convert your old configuration files to a newer version. `kubeadm config images list` and
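As a quick sketch of these subcommands (the file names below are hypothetical):
```shell
# Show the ClusterConfiguration stored in the kubeadm-config ConfigMap
kubeadm config view

# Convert an older configuration file to the current API version
kubeadm config migrate --old-config old-config.yaml --new-config new-config.yaml
```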

View File

@ -2,7 +2,9 @@
title: kubeadm init phase
weight: 90
---
In v1.8.0, kubeadm introduced the `kubeadm alpha phase` command with the aim of making kubeadm more modular. In v1.13.0 this command graduated to `kubeadm init phase`. This modularity enables you to invoke atomic sub-steps of the bootstrap process. Hence, you can let kubeadm do some parts and fill in yourself where you need customizations.
`kubeadm init phase` enables you to invoke atomic steps of the bootstrap process.
Hence, you can let kubeadm do some of the work and fill in the gaps yourself
where you wish to apply customizations.
`kubeadm init phase` is consistent with the [kubeadm init workflow](/docs/reference/setup-tools/kubeadm/kubeadm-init/#init-workflow),
and behind the scenes both use the same code.
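For example, assuming a default setup, you can run a single phase on its own or skip one during a full init:
```shell
# List every phase and sub-phase in order
kubeadm init --help

# Run a single phase on its own, for example certificate generation
kubeadm init phase certs all

# Run a full init while skipping a phase you prefer to handle yourself
kubeadm init --skip-phases=addon/kube-proxy
```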

View File

@ -37,7 +37,7 @@ following steps:
kubeconfig file for administration named `admin.conf`.
1. Generates static Pod manifests for the API server,
controller manager and scheduler. In case an external etcd is not provided,
controller-manager and scheduler. In case an external etcd is not provided,
an additional static Pod manifest is generated for etcd.
Static Pod manifests are written to `/etc/kubernetes/manifests`; the kubelet
@ -75,7 +75,7 @@ following steps:
### Using init phases with kubeadm {#init-phases}
Kubeadm allows you create a control-plane node in phases. In 1.13 the `kubeadm init phase` command has graduated to GA from its previous alpha state under `kubeadm alpha phase`.
Kubeadm allows you to create a control-plane node in phases using the `kubeadm init phase` command.
To view the ordered list of phases and sub-phases you can call `kubeadm init --help`. The list will be located at the top of the help screen and each phase will have a description next to it.
Note that by calling `kubeadm init` all of the phases and sub-phases will be executed in this exact order.
@ -113,9 +113,9 @@ The config file is still considered beta and may change in future versions.
It's possible to configure `kubeadm init` with a configuration file instead of command
line flags, and some more advanced features may only be available as
configuration file options. This file is passed in the `--config` option.
configuration file options. This file is passed with the `--config` option.
In Kubernetes 1.11 and later, the default configuration can be printed out using the
The default configuration can be printed out using the
[kubeadm config print](/docs/reference/setup-tools/kubeadm/kubeadm-config/) command.
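A minimal sketch, assuming you want to start from the defaults and customize them:
```shell
# Write the default init configuration to a file, edit it, then pass it to kubeadm init
kubeadm config print init-defaults > kubeadm-config.yaml
kubeadm init --config kubeadm-config.yaml
```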
It is **recommended** that you migrate your old `v1beta1` configuration to `v1beta2` using
@ -139,8 +139,8 @@ For information about passing flags to control plane components see:
### Using custom images {#custom-images}
By default, kubeadm pulls images from `k8s.gcr.io`, unless
the requested Kubernetes version is a CI version. In this case,
By default, kubeadm pulls images from `k8s.gcr.io`. If the
requested Kubernetes version is a CI label (such as `ci/latest`),
`gcr.io/kubernetes-ci-images` is used.
You can override this behavior by using [kubeadm with a configuration file](#config-file).
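For example, the `imageRepository` field of `ClusterConfiguration` lets you point kubeadm at a different registry (the registry name below is hypothetical):
```shell
cat <<EOF > kubeadm-custom-images.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
imageRepository: registry.example.com/kubernetes
EOF

# Pre-pull images from the custom repository to verify the configuration
kubeadm config images pull --config kubeadm-custom-images.yaml
```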
@ -187,11 +187,10 @@ To do so, you must place them in whatever directory is specified by the
`--cert-dir` flag or `CertificatesDir` configuration file key. By default this
is `/etc/kubernetes/pki`.
If a given certificate and private key pair exists, kubeadm skips the
generation step and existing files are used for the prescribed
use case. This means you can, for example, copy an existing CA into `/etc/kubernetes/pki/ca.crt`
and `/etc/kubernetes/pki/ca.key`, and kubeadm will use this CA for signing the rest
of the certs.
If a given certificate and private key pair exists before running `kubeadm init`,
kubeadm will not overwrite them. This means you can, for example, copy an existing
CA into `/etc/kubernetes/pki/ca.crt` and `/etc/kubernetes/pki/ca.key`,
and kubeadm will use this CA for signing the rest of the certificates.
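A minimal sketch, assuming you already have a CA certificate and key (file names are hypothetical):
```shell
mkdir -p /etc/kubernetes/pki
cp existing-ca.crt /etc/kubernetes/pki/ca.crt
cp existing-ca.key /etc/kubernetes/pki/ca.key

# kubeadm detects the existing CA and signs the remaining certificates with it
kubeadm init
```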
#### External CA mode {#external-ca-mode}
@ -212,47 +211,14 @@ For further information, see [Managing the kubeadm drop-in file for systemd](/do
### Use kubeadm with CRI runtimes
Since v1.6.0, Kubernetes has enabled the use of CRI, Container Runtime Interface, by default.
The container runtime used by default is Docker, which is enabled through the built-in
`dockershim` CRI implementation inside of the `kubelet`.
Other CRI-based runtimes include:
- [cri-containerd](https://github.com/containerd/cri-containerd)
- [cri-o](https://github.com/kubernetes-incubator/cri-o)
- [frakti](https://github.com/kubernetes/frakti)
- [rkt](https://github.com/kubernetes-incubator/rktlet)
Refer to the [CRI installation instructions](/docs/setup/cri) for more information.
After you have successfully installed `kubeadm` and `kubelet`, execute
these two additional steps:
1. Install the runtime shim on every node, following the installation
document in the runtime shim project listing above.
1. Configure kubelet to use the remote CRI runtime. Please remember to change
`RUNTIME_ENDPOINT` to your own value like `/var/run/{your_runtime}.sock`:
```shell
cat > /etc/systemd/system/kubelet.service.d/20-cri.conf <<EOF
[Service]
Environment="KUBELET_EXTRA_ARGS=--container-runtime=remote --container-runtime-endpoint=$RUNTIME_ENDPOINT"
EOF
systemctl daemon-reload
```
Now `kubelet` is ready to use the specified CRI runtime, and you can continue
with the `kubeadm init` and `kubeadm join` workflow to deploy Kubernetes cluster.
You may also want to set `--cri-socket` to `kubeadm init` and `kubeadm reset` when
using an external CRI implementation.
By default kubeadm attempts to detect your container runtime. For more details on this detection, see
the [kubeadm CRI installation guide](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-runtime).
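If automatic detection does not pick the runtime you want, you can point kubeadm at a specific CRI socket; for example, assuming containerd at its default socket path:
```shell
kubeadm init --cri-socket /run/containerd/containerd.sock
```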
### Setting the node name
By default, `kubeadm` assigns a node name based on a machine's host address. You can override this setting with the `--node-name`flag.
By default, `kubeadm` assigns a node name based on a machine's host address. You can override this setting with the `--node-name` flag.
The flag passes the appropriate [`--hostname-override`](/docs/reference/command-line-tools-reference/kubelet/#options)
to the kubelet.
value to the kubelet.
Be aware that overriding the hostname can [interfere with cloud providers](https://github.com/kubernetes/website/pull/8873).
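As a sketch (the node name below is hypothetical):
```shell
kubeadm init --node-name=control-plane-1
```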
@ -260,25 +226,25 @@ Be aware that overriding the hostname can [interfere with cloud providers](https
For running kubeadm without an internet connection you have to pre-pull the required control-plane images.
In Kubernetes 1.11 and later, you can list and pull the images using the `kubeadm config images` sub-command:
You can list and pull the images using the `kubeadm config images` sub-command:
```shell
kubeadm config images list
kubeadm config images pull
```
In Kubernetes 1.12 and later, the `k8s.gcr.io/kube-*`, `k8s.gcr.io/etcd` and `k8s.gcr.io/pause` images
don't require an `-${ARCH}` suffix.
All images that kubeadm requires, such as `k8s.gcr.io/kube-*`, `k8s.gcr.io/etcd` and `k8s.gcr.io/pause`, support multiple architectures.
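You can also pin the images to a specific release when preparing an offline mirror; for example:
```shell
kubeadm config images list --kubernetes-version v1.16.0
kubeadm config images pull --kubernetes-version v1.16.0
```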
### Automating kubeadm
Rather than copying the token you obtained from `kubeadm init` to each node, as
in the [basic kubeadm tutorial](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/), you can parallelize the
token distribution for easier automation. To implement this automation, you must
know the IP address that the control-plane node will have after it is started.
know the IP address that the control-plane node will have after it is started,
or use a DNS name or an address of a load balancer.
1. Generate a token. This token must have the form `<6 character string>.<16
character string>`. More formally, it must match the regex:
character string>`. More formally, it must match the regex:
`[a-z0-9]{6}\.[a-z0-9]{16}`.
kubeadm can generate a token for you:
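For example, the following command prints a token in the required format; treat the output as a secret:
```shell
kubeadm token generate
```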

View File

@ -2,8 +2,9 @@
title: kubeadm join phase
weight: 90
---
In v1.14.0, kubeadm introduces the `kubeadm join phase` command with the aim of making kubeadm more modular. This modularity enables you to invoke atomic sub-steps of the join process.
Hence, you can let kubeadm do some parts and fill in yourself where you need customizations.
`kubeadm join phase` enables you to invoke atomic steps of the join process.
Hence, you can let kubeadm do some of the work and fill in the gaps yourself
where you wish to apply customizations.
`kubeadm join phase` is consistent with the [kubeadm join workflow](/docs/reference/setup-tools/kubeadm/kubeadm-join/#join-workflow),
and behind the scene both use the same code.
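For example, you can inspect the available phases before deciding which ones to run yourself:
```shell
# List every join phase and sub-phase in order
kubeadm join --help

# Inspect the flags accepted by an individual join phase
kubeadm join phase kubelet-start --help
```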

View File

@ -47,7 +47,7 @@ For control-plane nodes additional steps are performed:
### Using join phases with kubeadm {#join-phases}
Kubeadm allows you join a node to the cluster in phases. The `kubeadm join phase` command was added in v1.14.0.
Kubeadm allows you to join a node to the cluster in phases using `kubeadm join phase`.
To view the ordered list of phases and sub-phases you can call `kubeadm join --help`. The list will be located
at the top of the help screen and each phase will have a description next to it.
@ -76,7 +76,7 @@ security expectations you have about your network and node lifecycles.
#### Token-based discovery with CA pinning
This is the default mode in Kubernetes 1.8 and above. In this mode, kubeadm downloads
This is the default mode in kubeadm. In this mode, kubeadm downloads
the cluster configuration (including root CA) and validates it using the token
as well as validating that the root CA public key matches the provided hash and
that the API server certificate is valid under the root CA.
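A sketch of a join command using CA pinning (endpoint, token, and hash are hypothetical placeholders):
```shell
kubeadm join 192.168.0.100:6443 --token abcdef.1234567890abcdef \
    --discovery-token-ca-cert-hash sha256:<hex-encoded-hash>
```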
@ -117,15 +117,14 @@ if the `kubeadm init` command was called with `--upload-certs`.
- The CA hash is not normally known until the control-plane node has been provisioned,
which can make it more difficult to build automated provisioning tools that
use kubeadm. By generating your CA beforehand, you can work around this
limitation though.
limitation.
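If you generate the CA beforehand, you can compute the hash yourself; a sketch assuming the CA certificate is in `ca.crt`:
```shell
openssl x509 -pubkey -in ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'
```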
#### Token-based discovery without CA pinning
_This was the default in Kubernetes 1.7 and earlier_, but comes with some
important caveats. This mode relies only on the symmetric token to sign
This mode relies only on the symmetric token to sign
(HMAC-SHA256) the discovery information that establishes the root of trust for
the control-plane. It's still possible in Kubernetes 1.8 and above using the
`--discovery-token-unsafe-skip-ca-verification` flag, but you should consider
the control-plane. To use this mode, the joining nodes must skip the hash validation of the
CA public key, using `--discovery-token-unsafe-skip-ca-verification`. You should consider
using one of the other modes if possible.
**Example `kubeadm join` command:**
@ -150,9 +149,13 @@ kubeadm join --token abcdef.1234567890abcdef --discovery-token-unsafe-skip-ca-ve
tradeoff in your environment.
#### File or HTTPS-based discovery
This provides an out-of-band way to establish a root of trust between the control-plane node
and bootstrapping nodes. Consider using this mode if you are building automated provisioning
using kubeadm.
and bootstrapping nodes. Consider using this mode if you are building automated provisioning
using kubeadm. The format of the discovery file is a regular Kubernetes
[kubeconfig](/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) file.
If the discovery file does not contain credentials, the TLS discovery token will be used.
**Example `kubeadm join` commands:**
@ -215,7 +218,7 @@ NAME AGE REQUESTOR
node-csr-c69HXe7aYcqkS1bKmH4faEnHAWxn6i2bHZ2mD04jZyQ 1m system:bootstrap:878f07 Approved,Issued
```
Only after `kubectl certificate approve` has been run, `kubeadm join` can proceed.
This enforces a workflow in which `kubeadm join` only succeeds after `kubectl certificate approve` has been run.
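For example, using the CSR name shown in the output above:
```shell
kubectl certificate approve node-csr-c69HXe7aYcqkS1bKmH4faEnHAWxn6i2bHZ2mD04jZyQ
```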
#### Turning off public access to the cluster-info ConfigMap

View File

@ -2,8 +2,9 @@
title: kubeadm reset phase
weight: 90
---
In v1.15.0, kubeadm introduces the `kubeadm reset phase` command with the aim of making kubeadm more modular. This modularity enables you to invoke atomic sub-steps of the reset process.
Hence, you can let kubeadm do some parts and fill in yourself where you need customizations.
`kubeadm reset phase` enables you to invoke atomic steps of the node reset process.
Hence, you can let kubeadm do some of the work and fill in the gaps yourself
where you wish to apply customizations.
`kubeadm reset phase` is consistent with the [kubeadm reset workflow](/docs/reference/setup-tools/kubeadm/kubeadm-reset/#reset-workflow),
and behind the scenes both use the same code.

View File

@ -8,26 +8,18 @@ content_template: templates/concept
weight: 40
---
{{% capture overview %}}
`kubeadm upgrade` is a user-friendly command that wraps complex upgrading logic behind one command, with support
for both planning an upgrade and actually performing it. `kubeadm upgrade` can also be used for downgrading
cluster if necessary.
`kubeadm upgrade` is a user-friendly command that wraps complex upgrading logic
behind one command, with support for both planning an upgrade and actually performing it.
{{% /capture %}}
{{% capture body %}}
## kubeadm upgrade guidance
Every upgrade process might be a bit different, so we've documented each minor upgrade process individually.
For more version-specific upgrade guidance, see the following resources:
The steps for performing an upgrade using kubeadm are outlined in [this document](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/).
For older versions of kubeadm, please refer to older documentation sets of the Kubernetes website.
* [1.12 to 1.13 upgrades](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-13/)
* [1.13 to 1.14 upgrades](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-14/)
* [1.14 to 1.15 upgrades](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-15/)
_For older versions, please refer to older documentation sets on the Kubernetes website._
In Kubernetes v1.11.0 and later, you can use `kubeadm upgrade diff` to see the changes that would be
applied to static pod manifests.
You can use `kubeadm upgrade diff` to see the changes that would be applied to static pod manifests.
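A typical sequence on a control-plane node (the target version below is hypothetical):
```shell
kubeadm upgrade plan
kubeadm upgrade diff v1.16.0
kubeadm upgrade apply v1.16.0
```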
To use kube-dns with upgrades in Kubernetes v1.13.0 and later please follow [this guide](/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/#cmd-phase-addon).

View File

@ -42,7 +42,7 @@ Example usage:
```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
kubernetesVersion: v1.16.0
apiServer:
extraArgs:
advertise-address: 192.168.0.103
@ -59,7 +59,7 @@ Example usage:
```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
kubernetesVersion: v1.16.0
controllerManager:
extraArgs:
cluster-signing-key-file: /home/johndoe/keys/ca.key
@ -75,7 +75,7 @@ Example usage:
```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
kubernetesVersion: v1.16.0
scheduler:
extraArgs:
address: 0.0.0.0

View File

@ -35,17 +35,6 @@ but you may also build them from source for other OSes.
### kubeadm maturity
| Area | Maturity Level |
|---------------------------|--------------- |
| Command line UX | GA |
| Implementation | GA |
| Config file API | Beta |
| CoreDNS | GA |
| kubeadm alpha subcommands | Alpha |
| High availability | Beta |
| DynamicKubeletConfig | Alpha |
kubeadm's overall feature state is **GA**. Some sub-features, like the configuration
file API, are still under active development. The implementation of creating the cluster
may change slightly as the tool evolves, but the overall implementation should be pretty stable.
@ -61,16 +50,10 @@ timeframe; which also applies to `kubeadm`.
| Kubernetes version | Release month | End-of-life-month |
|--------------------|----------------|-------------------|
| v1.6.x | March 2017 | December 2017 |
| v1.7.x | June 2017 | March 2018 |
| v1.8.x | September 2017 | June 2018 |
| v1.9.x | December 2017 | September 2018 |
| v1.10.x | March 2018 | December 2018 |
| v1.11.x | June 2018 | March 2019 |
| v1.12.x | September 2018 | June 2019 |
| v1.13.x | December 2018 | September 2019 |
| v1.14.x | March 2019 | December 2019 |
| v1.15.x | June 2019 | March 2020 |
| v1.16.x | September 2019 | June 2020 |
{{% /capture %}}

View File

@ -201,17 +201,17 @@ systemctl enable --now kubelet
Install CNI plugins (required for most pod network):
```bash
CNI_VERSION="v0.7.5"
CNI_VERSION="v0.8.2"
mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-amd64-${CNI_VERSION}.tgz" | tar -C /opt/cni/bin -xz
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-amd64-${CNI_VERSION}.tgz" | tar -C /opt/cni/bin -xz
```
Install crictl (required for kubeadm / Kubelet Container Runtime Interface (CRI))
```bash
CRICTL_VERSION="v1.12.0"
CRICTL_VERSION="v1.16.0"
mkdir -p /opt/bin
curl -L "https://github.com/kubernetes-incubator/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-amd64.tar.gz" | tar -C /opt/bin -xz
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-amd64.tar.gz" | tar -C /opt/bin -xz
```
Install `kubeadm`, `kubelet`, `kubectl` and add a `kubelet` systemd service:

View File

@ -10,7 +10,7 @@ weight: 100
### Self-hosting the Kubernetes control plane {#self-hosting}
As of 1.8, you can experimentally create a _self-hosted_ Kubernetes control
kubeadm allows you to experimentally create a _self-hosted_ Kubernetes control
plane. This means that key components such as the API server, controller
manager, and scheduler run as [DaemonSet pods](/docs/concepts/workloads/controllers/daemonset/)
configured via the Kubernetes API instead of [static pods](/docs/tasks/administer-cluster/static-pod/)

View File

@ -102,14 +102,32 @@ Pods, Controllers and Services are critical elements to managing Windows workloa
Docker EE-basic 18.09 is required on Windows Server 2019 / 1809 nodes for Kubernetes. This works with the dockershim code included in the kubelet. Additional runtimes such as CRI-ContainerD may be supported in later Kubernetes versions.
#### Storage
#### Persistent Storage
Kubernetes Volumes enable complex applications with data persistence and Pod volume sharing requirements to be deployed on Kubernetes. Kubernetes on Windows supports the following types of [volumes](/docs/concepts/storage/volumes/):
Kubernetes [volumes](/docs/concepts/storage/volumes/) enable complex applications, with data persistence and Pod volume sharing requirements, to be deployed on Kubernetes. Management of persistent volumes associated with a specific storage back-end or protocol includes actions such as: provisioning/de-provisioning/resizing of volumes, attaching/detaching a volume to/from a Kubernetes node and mounting/dismounting a volume to/from individual containers in a pod that needs to persist data. The code implementing these volume management actions for a specific storage back-end or protocol is shipped in the form of a Kubernetes volume [plugin](/docs/concepts/storage/volumes/#types-of-volumes). The following broad classes of Kubernetes volume plugins are supported on Windows:
##### In-tree Volume Plugins
Code associated with in-tree volume plugins ships as part of the core Kubernetes code base. Deployment of in-tree volume plugins does not require installation of additional scripts or deployment of separate containerized plugin components. These plugins can handle: provisioning/de-provisioning and resizing of volumes in the storage backend, attaching/detaching of volumes to/from a Kubernetes node and mounting/dismounting a volume to/from individual containers in a pod. The following in-tree plugins support Windows nodes:
* FlexVolume out-of-tree plugin with [SMB and iSCSI](https://github.com/Microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows) support
* [azureDisk](/docs/concepts/storage/volumes/#azuredisk)
* [azureFile](/docs/concepts/storage/volumes/#azurefile)
* [gcePersistentDisk](/docs/concepts/storage/volumes/#gcepersistentdisk)
* [awsElasticBlockStore](/docs/concepts/storage/volumes/#awselasticblockstore)
* [vsphereVolume](/docs/concepts/storage/volumes/#vspherevolume)
##### FlexVolume Plugins
Code associated with [FlexVolume](/docs/concepts/storage/volumes/#flexVolume) plugins ships as out-of-tree scripts or binaries that need to be deployed directly on the host. FlexVolume plugins handle attaching/detaching of volumes to/from a Kubernetes node and mounting/dismounting a volume to/from individual containers in a pod. Provisioning/De-provisioning of persistent volumes associated with FlexVolume plugins may be handled through an external provisioner that is typically separate from the FlexVolume plugins. The following FlexVolume [plugins](https://github.com/Microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows), deployed as PowerShell scripts on the host, support Windows nodes:
* [SMB](https://github.com/microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows/plugins/microsoft.com~smb.cmd)
* [iSCSI](https://github.com/microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows/plugins/microsoft.com~iscsi.cmd)
##### CSI Plugins
{{< feature-state for_k8s_version="v1.16" state="alpha" >}}
Code associated with {{< glossary_tooltip text="CSI" term_id="csi" >}} plugins ships as out-of-tree scripts and binaries that are typically distributed as container images and deployed using standard Kubernetes constructs like DaemonSets and StatefulSets. CSI plugins handle a wide range of volume management actions in Kubernetes: provisioning/de-provisioning/resizing of volumes, attaching/detaching of volumes to/from a Kubernetes node and mounting/dismounting a volume to/from individual containers in a pod, backup/restore of persistent data using snapshots and cloning. CSI plugins typically consist of node plugins (that run on each node as a DaemonSet) and controller plugins.
CSI node plugins (especially those associated with persistent volumes exposed as either block devices or over a shared file-system) need to perform various privileged operations like scanning of disk devices, mounting of file systems, etc. These operations differ for each host operating system. For Linux worker nodes, containerized CSI node plugins are typically deployed as privileged containers. For Windows worker nodes, privileged operations for containerized CSI node plugins are supported using [csi-proxy](https://github.com/kubernetes-csi/csi-proxy), a community-managed, stand-alone binary that needs to be pre-installed on each Windows node. Please refer to the deployment guide of the CSI plugin you wish to deploy for further details.
#### Networking
@ -131,13 +149,13 @@ Windows supports five different networking drivers/modes: L2bridge, L2tunnel, Ov
| Network Driver | Description | Container Packet Modifications | Network Plugins | Network Plugin Characteristics |
| -------------- | ----------- | ------------------------------ | --------------- | ------------------------------ |
| L2bridge | Containers are attached to an external vSwitch. Containers are attached to the underlay network, although the physical network doesn't need to learn the container MACs because they are rewritten on ingress/egress. Inter-container traffic is bridged inside the container host. | MAC is rewritten to host MAC, IP remains the same. | [win-bridge](https://github.com/containernetworking/plugins/tree/master/plugins/main/windows/win-bridge), [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), Flannel host-gateway uses win-bridge | win-bridge uses L2bridge network mode, connects containers to the underlay of hosts, offering best performance. Requires L2 adjacency between container hosts |
| L2bridge | Containers are attached to an external vSwitch. Containers are attached to the underlay network, although the physical network doesn't need to learn the container MACs because they are rewritten on ingress/egress. Inter-container traffic is bridged inside the container host. | MAC is rewritten to host MAC, IP remains the same. | [win-bridge](https://github.com/containernetworking/plugins/tree/master/plugins/main/windows/win-bridge), [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md), Flannel host-gateway uses win-bridge | win-bridge uses L2bridge network mode, connects containers to the underlay of hosts, offering best performance. Requires user-defined routes (UDR) for inter-node connectivity. |
| L2Tunnel | This is a special case of l2bridge, but only used on Azure. All packets are sent to the virtualization host where SDN policy is applied. | MAC rewritten, IP visible on the underlay network | [Azure-CNI](https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md) | Azure-CNI allows integration of containers with Azure vNET, and allows them to leverage the set of capabilities that [Azure Virtual Network provides](https://azure.microsoft.com/en-us/services/virtual-network/). For example, securely connect to Azure services or use Azure NSGs. See [azure-cni for some examples](https://docs.microsoft.com/en-us/azure/aks/concepts-network#azure-cni-advanced-networking) |
| Overlay (Overlay networking for Windows in Kubernetes is in *alpha* stage) | Containers are given a vNIC connected to an external vSwitch. Each overlay network gets its own IP subnet, defined by a custom IP prefix.The overlay network driver uses VXLAN encapsulation. | Encapsulated with an outer header, inner packet remains the same. | [Win-overlay](https://github.com/containernetworking/plugins/tree/master/plugins/main/windows/win-overlay), Flannel VXLAN (uses win-overlay) | win-overlay should be used when virtual container networks are desired to be isolated from underlay of hosts (e.g. for security reasons). Allows for IPs to be re-used for different overlay networks (which have different VNID tags) if you are restricted on IPs in your datacenter. This option may be used when the container hosts are not L2 adjacent but have L3 connectivity |
| Overlay (Overlay networking for Windows in Kubernetes is in *alpha* stage) | Containers are given a vNIC connected to an external vSwitch. Each overlay network gets its own IP subnet, defined by a custom IP prefix.The overlay network driver uses VXLAN encapsulation. | Encapsulated with an outer header. | [Win-overlay](https://github.com/containernetworking/plugins/tree/master/plugins/main/windows/win-overlay), Flannel VXLAN (uses win-overlay) | win-overlay should be used when virtual container networks are desired to be isolated from underlay of hosts (e.g. for security reasons). Allows for IPs to be re-used for different overlay networks (which have different VNID tags) if you are restricted on IPs in your datacenter. This option requires [KB4489899](https://support.microsoft.com/help/4489899) on Windows Server 2019. |
| Transparent (special use case for [ovn-kubernetes](https://github.com/openvswitch/ovn-kubernetes)) | Requires an external vSwitch. Containers are attached to an external vSwitch which enables intra-pod communication via logical networks (logical switches and routers). | Packet is encapsulated either via [GENEVE](https://datatracker.ietf.org/doc/draft-gross-geneve/) or [STT](https://datatracker.ietf.org/doc/draft-davie-stt/) tunneling to reach pods which are not on the same host. <br/> Packets are forwarded or dropped via the tunnel metadata information supplied by the ovn network controller. <br/> NAT is done for north-south communication. | [ovn-kubernetes](https://github.com/openvswitch/ovn-kubernetes) | [Deploy via ansible](https://github.com/openvswitch/ovn-kubernetes/tree/master/contrib). Distributed ACLs can be applied via Kubernetes policies. IPAM support. Load-balancing can be achieved without kube-proxy. NATing is done without using iptables/netsh. |
| NAT (*not used in Kubernetes*) | Containers are given a vNIC connected to an internal vSwitch. DNS/DHCP is provided using an internal component called [WinNAT](https://blogs.technet.microsoft.com/virtualization/2016/05/25/windows-nat-winnat-capabilities-and-limitations/) | MAC and IP is rewritten to host MAC/IP. | [nat](https://github.com/Microsoft/windows-container-networking/tree/master/plugins/nat) | Included here for completeness |
As outlined above, the [Flannel](https://github.com/coreos/flannel) CNI [meta plugin](https://github.com/containernetworking/plugins/tree/master/plugins/meta/flannel) is also supported on [Windows](https://github.com/containernetworking/plugins/tree/master/plugins/meta/flannel#windows-support-experimental) via the [VXLAN network backend](https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan) (**alpha support** ; delegates to win-overlay) and [host-gateway network backend](https://github.com/coreos/flannel/blob/master/Documentation/backends.md#host-gw) (stable support; delegates to win-bridge). This plugin supports delegating to one of the reference CNI plugins (win-overlay, win-bridge), to work in conjunction with Flannel daemon on Windows (Flanneld) for automatic node subnet lease assignment and HNS network creation. This plugin reads in its own configuration file (net-conf.json), and aggregates it with the environment variables from the FlannelD generated subnet.env file. It then delegates to one of the reference CNI plugins for network plumbing, and sends the correct configuration containing the node-assigned subnet to the IPAM plugin (e.g. host-local).
As outlined above, the [Flannel](https://github.com/coreos/flannel) CNI [meta plugin](https://github.com/containernetworking/plugins/tree/master/plugins/meta/flannel) is also supported on [Windows](https://github.com/containernetworking/plugins/tree/master/plugins/meta/flannel#windows-support-experimental) via the [VXLAN network backend](https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan) (**alpha support** ; delegates to win-overlay) and [host-gateway network backend](https://github.com/coreos/flannel/blob/master/Documentation/backends.md#host-gw) (stable support; delegates to win-bridge). This plugin supports delegating to one of the reference CNI plugins (win-overlay, win-bridge), to work in conjunction with Flannel daemon on Windows (Flanneld) for automatic node subnet lease assignment and HNS network creation. This plugin reads in its own configuration file (cni.conf), and aggregates it with the environment variables from the FlannelD generated subnet.env file. It then delegates to one of the reference CNI plugins for network plumbing, and sends the correct configuration containing the node-assigned subnet to the IPAM plugin (e.g. host-local).
For the node, pod, and service objects, the following network flows are supported for TCP/UDP traffic:
@ -218,7 +236,6 @@ As a result, the following storage functionality is not supported on Windows nod
* Read-only root filesystem. Mapped volumes still support readOnly
* Block device mapping
* Memory as the storage medium
* CSI plugins which require privileged containers
* File system features like uuid/guid, per-user Linux filesystem permissions
* NFS based storage/volume support
* Expanding the mounted volume (resizefs)
@ -259,6 +276,7 @@ These features were added in Kubernetes v1.15:
* ClusterFirstWithHostNet is not supported for DNS. Windows treats all names with a '.' as a FQDN and skips PQDN resolution
* On Linux, you have a DNS suffix list, which is used when trying to resolve PQDNs. On Windows, we only have 1 DNS suffix, which is the DNS suffix associated with that pod's namespace (mydns.svc.cluster.local for example). Windows can resolve FQDNs and services or names resolvable with just that suffix. For example, a pod spawned in the default namespace, will have the DNS suffix **default.svc.cluster.local**. On a Windows pod, you can resolve both **kubernetes.default.svc.cluster.local** and **kubernetes**, but not the in-betweens, like **kubernetes.default** or **kubernetes.default.svc**.
* On Windows, there are multiple DNS resolvers that can be used. As these come with slightly different behaviors, using the `Resolve-DNSName` utility for name query resolutions is recommended.
##### Security
@ -513,6 +531,20 @@ Your main source of help for troubleshooting your Kubernetes cluster should star
This was implemented in Kubernetes 1.15, and the pause infrastructure container `mcr.microsoft.com/k8s/core/pause:1.2.0`. Be sure to use these versions or newer ones.
If you would like to build your own pause infrastructure container, be sure to include [wincat](https://github.com/kubernetes-sigs/sig-windows-tools/tree/master/cmd/wincat)
1. My Kubernetes installation is failing because my Windows Server node is behind a proxy
If you are behind a proxy, the following PowerShell environment variables must be defined:
```PowerShell
[Environment]::SetEnvironmentVariable("HTTP_PROXY", "http://proxy.example.com:80/", [EnvironmentVariableTarget]::Machine)
[Environment]::SetEnvironmentVariable("HTTPS_PROXY", "http://proxy.example.com:443/", [EnvironmentVariableTarget]::Machine)
```
1. What is a `pause` container?
In a Kubernetes Pod, an infrastructure or "pause" container is first created to host the container endpoint. Containers that belong to the same pod, including infrastructure and worker containers, share a common network namespace and endpoint (same IP and port space). Pause containers are needed to accommodate worker containers crashing or restarting without losing any of the networking configuration.
The "pause" (infrastructure) image is hosted on Microsoft Container Registry (MCR). You can access it using `docker pull mcr.microsoft.com/k8s/core/pause:1.2.0`. For more details, see the [DOCKERFILE](https://github.com/kubernetes-sigs/sig-windows-tools/tree/master/cmd/wincat).
### Further investigation
If these steps don't resolve your problem, you can get help running Windows containers on Windows nodes in Kubernetes through:

Binary file not shown (new image, 6.9 MiB).

Binary file not shown (new image, 3.5 MiB).

Binary file not shown (new image, 3.4 MiB).

View File

@ -3,26 +3,36 @@ reviewers:
- michmike
- patricklang
title: Guide for adding Windows Nodes in Kubernetes
content_template: templates/concept
min-kubernetes-server-version: v1.14
content_template: templates/tutorial
weight: 70
---
{{% capture overview %}}
The Kubernetes platform can now be used to run both Linux and Windows containers. One or more Windows nodes can be registered to a cluster. This guide shows how to:
* Register a Windows node to the cluster
* Configure networking so pods on Linux and Windows can communicate
The Kubernetes platform can now be used to run both Linux and Windows containers. This page shows how one or more Windows nodes can be registered to a cluster.
{{% /capture %}}
{{% capture body %}}
## Before you begin
{{% capture prerequisites %}}
* Obtain a [Windows Server license](https://www.microsoft.com/en-us/cloud-platform/windows-server-pricing) in order to configure the Windows node that hosts Windows containers. You can use your organization's licenses for the cluster, or acquire one from Microsoft, a reseller, or via the major cloud providers such as GCP, AWS, and Azure by provisioning a virtual machine running Windows Server through their marketplaces. A [time-limited trial](https://www.microsoft.com/en-us/cloud-platform/windows-server-trial) is also available.
* Obtain a [Windows Server 2019 license](https://www.microsoft.com/en-us/cloud-platform/windows-server-pricing) (or higher) in order to configure the Windows node that hosts Windows containers. You can use your organization's licenses for the cluster, or acquire one from Microsoft, a reseller, or via the major cloud providers such as GCP, AWS, and Azure by provisioning a virtual machine running Windows Server through their marketplaces. A [time-limited trial](https://www.microsoft.com/en-us/cloud-platform/windows-server-trial) is also available.
* Build a Linux-based Kubernetes cluster in which you have access to the control plane (some examples include [Creating a single control-plane cluster with kubeadm](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/), [AKS Engine](/docs/setup/production-environment/turnkey/azure/), [GCE](/docs/setup/production-environment/turnkey/gce/), [AWS](/docs/setup/production-environment/turnkey/aws/).
* Build a Linux-based Kubernetes cluster in which you have access to the control-plane (some examples include [Creating a single control-plane cluster with kubeadm](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/), [AKS Engine](/docs/setup/production-environment/turnkey/azure/), [GCE](/docs/setup/production-environment/turnkey/gce/), [AWS](/docs/setup/production-environment/turnkey/aws/)).
{{% /capture %}}
{{% capture objectives %}}
* Register a Windows node to the cluster
* Configure networking so Pods and Services on Linux and Windows can communicate with each other
{{% /capture %}}
{{% capture lessoncontent %}}
## Getting Started: Adding a Windows Node to Your Cluster
@ -42,7 +52,7 @@ Review the networking options supported in 'Intro to Windows containers in Kuber
### Components that run on Windows
While the Kubernetes control plane runs on your Linux node(s), the following components are configured and run on your Windows node(s).
While the Kubernetes control-plane runs on your Linux node(s), the following components are configured and run on your Windows node(s).
1. kubelet
2. kube-proxy
@ -53,9 +63,9 @@ Get the latest binaries from [https://github.com/kubernetes/kubernetes/releases]
### Networking Configuration
Once you have a Linux-based Kubernetes master node you are ready to choose a networking solution. This guide illustrates using Flannel in VXLAN mode for simplicity.
Once you have a Linux-based Kubernetes control-plane ("Master") node you are ready to choose a networking solution. This guide illustrates using Flannel in VXLAN mode for simplicity.
#### Configuring Flannel in VXLAN mode on the Linux controller
#### Configuring Flannel in VXLAN mode on the Linux control-plane
1. Prepare Kubernetes master for Flannel
@ -122,7 +132,7 @@ Once you have a Linux-based Kubernetes master node you are ready to choose a net
}
```
1. Apply the Flannel yaml and Validate
1. Apply the Flannel manifest and validate
Let's apply the Flannel configuration:
@ -169,120 +179,136 @@ Once you have a Linux-based Kubernetes master node you are ready to choose a net
kube-proxy 2 2 2 2 2 beta.kubernetes.io/os=linux 26d
```
#### Join Windows Worker
In this section we'll cover configuring a Windows node from scratch to join a cluster on-prem. If your cluster is on a cloud you'll likely want to follow the cloud specific guides in the next section.
### Join Windows Worker Node
In this section we'll cover configuring a Windows node from scratch to join an on-premises cluster. If your cluster is in the cloud, you'll likely want to follow the cloud-specific guides in the [public cloud providers section](#public-cloud-providers).
#### Preparing a Windows Node
{{< note >}}
All code snippets in Windows sections are to be run in a PowerShell environment with elevated permissions (Admin).
All code snippets in Windows sections are to be run in a PowerShell environment with elevated permissions (Administrator) on the Windows worker node.
{{< /note >}}
1. Install Docker (requires a system reboot)
1. Download the [SIG Windows tools](https://github.com/kubernetes-sigs/sig-windows-tools) repository containing install and join scripts
```PowerShell
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Start-BitsTransfer https://github.com/kubernetes-sigs/sig-windows-tools/archive/master.zip
tar -xvf .\master.zip --strip-components 3 sig-windows-tools-master/kubeadm/v1.15.0/*
Remove-Item .\master.zip
```
Kubernetes uses [Docker](https://www.docker.com/) as its container engine, so we need to install it. You can follow the [official Docs instructions](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-docker/configure-docker-daemon#install-docker), the [Docker instructions](https://store.docker.com/editions/enterprise/docker-ee-server-windows), or try the following *recommended* steps:
```PowerShell
Enable-WindowsOptionalFeature -FeatureName Containers
Restart-Computer -Force
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name Docker -ProviderName DockerMsftProvider
```
If you are behind a proxy, the following PowerShell environment variables must be defined:
```PowerShell
[Environment]::SetEnvironmentVariable("HTTP_PROXY", "http://proxy.example.com:80/", [EnvironmentVariableTarget]::Machine)
[Environment]::SetEnvironmentVariable("HTTPS_PROXY", "http://proxy.example.com:443/", [EnvironmentVariableTarget]::Machine)
```
After reboot, you can verify that the docker service is ready with the command below.
```PowerShell
docker version
```
If you see error message like the following, you need to start the docker service manually.
1. Customize the Kubernetes [configuration file](https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/kubeadm/v1.15.0/Kubeclustervxlan.json)
```
Client:
Version: 17.06.2-ee-11
API version: 1.30
Go version: go1.8.7
Git commit: 06fc007
Built: Thu May 17 06:14:39 2018
OS/Arch: windows / amd64
error during connect: Get http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.30/version: open //./pipe/docker_engine: The system c
annot find the file specified. In the default daemon configuration on Windows, the docker client must be run elevated to
connect. This error may also indicate that the docker daemon is not running.
```
You can start the docker service manually like below.
```PowerShell
Start-Service docker
```
{{< note >}}
The "pause" (infrastructure) image is hosted on Microsoft Container Registry (MCR). You can access it using "docker pull mcr.microsoft.com/k8s/core/pause:1.2.0". The DOCKERFILE is available at https://github.com/kubernetes-sigs/sig-windows-tools/tree/master/cmd/wincat.
{{< /note >}}
1. Prepare a Windows directory for Kubernetes
Create a "Kubernetes for Windows" directory to store Kubernetes binaries as well as any deployment scripts and config files.
```PowerShell
mkdir c:\k
```
1. Copy Kubernetes certificate
Copy the Kubernetes certificate file `$HOME/.kube/config` [from the Linux controller](https://docs.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/creating-a-linux-master#collect-cluster-information) to this new `C:\k` directory on your Windows node.
Tip: You can use tools such as [xcopy](https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/xcopy), [WinSCP](https://winscp.net/eng/download.php), or this [PowerShell wrapper for WinSCP](https://www.powershellgallery.com/packages/WinSCP/5.13.2.0) to transfer the config file between nodes.
1. Download Kubernetes binaries
To be able to run Kubernetes, you first need to download the `kubelet` and `kube-proxy` binaries. You download these from the Node Binaries links in the CHANGELOG.md file of the [latest releases](https://github.com/kubernetes/kubernetes/releases/). For example 'kubernetes-node-windows-amd64.tar.gz'. You may also optionally download `kubectl` to run on Windows which you can find under Client Binaries.
Use the [Expand-Archive](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.archive/expand-archive?view=powershell-6) PowerShell command to extract the archive and place the binaries into `C:\k`.
#### Join the Windows node to the Flannel cluster
The Flannel overlay deployment scripts and documentation are available in [this repository](https://github.com/Microsoft/SDN/tree/master/Kubernetes/flannel/overlay). The following steps are a simple walkthrough of the more comprehensive instructions available there.
Download the [Flannel start.ps1](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/start.ps1) script, the contents of which should be extracted to `C:\k`:
```PowerShell
cd c:\k
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
wget https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/flannel/start.ps1 -o c:\k\start.ps1
```
{
"Cri" : { // Contains values for container runtime and base container setup
"Name" : "dockerd", // Container runtime name
"Images" : {
"Pause" : "mcr.microsoft.com/k8s/core/pause:1.2.0", // Infrastructure container image
"Nanoserver" : "mcr.microsoft.com/windows/nanoserver:1809", // Base Nanoserver container image
"ServerCore" : "mcr.microsoft.com/windows/servercore:ltsc2019" // Base ServerCore container image
}
},
"Cni" : { // Contains values for networking executables
"Name" : "flannel", // Name of network fabric
"Source" : [{ // Contains array of objects containing values for network daemon(s)
"Name" : "flanneld", // Name of network daemon
"Url" : "https://github.com/coreos/flannel/releases/download/v0.11.0/flanneld.exe" // Direct URL pointing to network daemon executable
}
],
"Plugin" : { // Contains values for CNI network plugin
"Name": "vxlan" // Backend network mechanism to use: ["vxlan" | "bridge"]
},
"InterfaceName" : "Ethernet" // Designated network interface name on Windows node to use as container network
},
"Kubernetes" : { // Contains values for Kubernetes node binaries
"Source" : { // Contains values for Kubernetes node binaries
"Release" : "1.15.0", // Version of Kubernetes node binaries
"Url" : "https://dl.k8s.io/v1.15.0/kubernetes-node-windows-amd64.tar.gz" // Direct URL pointing to Kubernetes node binaries tarball
},
"ControlPlane" : { // Contains values associated with Kubernetes control-plane ("Master") node
"IpAddress" : "kubemasterIP", // IP address of control-plane ("Master") node
"Username" : "localadmin", // Username on control-plane ("Master") node with remote SSH access
"KubeadmToken" : "token", // Kubeadm bootstrap token
"KubeadmCAHash" : "discovery-token-ca-cert-hash" // Kubeadm CA key hash
},
"KubeProxy" : { // Contains values for Kubernetes network proxy configuration
"Gates" : "WinOverlay=true" // Comma-separated key-value pairs passed to kube-proxy feature gate flag
},
"Network" : { // Contains values for IP ranges in CIDR notation for Kubernetes networking
"ServiceCidr" : "10.96.0.0/12", // Service IP subnet used by Services in CIDR notation
"ClusterCidr" : "10.244.0.0/16" // Cluster IP subnet used by Pods in CIDR notation
}
},
"Install" : { // Contains values and configurations for Windows node installation
"Destination" : "C:\\ProgramData\\Kubernetes" // Absolute DOS path where Kubernetes will be installed on the Windows node
}
}
```
{{< note >}}
[start.ps1](https://github.com/Microsoft/SDN/blob/master/Kubernetes/flannel/start.ps1) references [install.ps1](https://github.com/Microsoft/SDN/blob/master/Kubernetes/windows/install.ps1), which downloads additional files such as the `flanneld` executable and the [Dockerfile for infrastructure pod](https://github.com/Microsoft/SDN/blob/master/Kubernetes/windows/Dockerfile) and install those for you. For overlay networking mode, the [firewall](https://github.com/Microsoft/SDN/blob/master/Kubernetes/windows/helper.psm1#L111) is opened for local UDP port 4789. There may be multiple powershell windows being opened/closed as well as a few seconds of network outage while the new external vSwitch for the pod network is being created the first time. Run the script using the arguments as specified below:
Users can generate values for the `ControlPlane.KubeadmToken` and `ControlPlane.KubeadmCAHash` fields by running `kubeadm token create --print-join-command` on the Kubernetes control-plane ("Master") node.
{{< /note >}}
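As a sketch, here is how those values can be generated on the (Linux) control-plane node; the IP address, token, and hash shown are placeholders:
```shell
# Run on the (Linux) control-plane node
kubeadm token create --print-join-command
# The output is similar to:
#   kubeadm join 10.0.0.1:6443 --token abcdef.0123456789abcdef \
#       --discovery-token-ca-cert-hash sha256:1234...abcd
# Copy the value after --token into "KubeadmToken" and the value after
# --discovery-token-ca-cert-hash into "KubeadmCAHash" in the configuration file.
```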
```PowerShell
cd c:\k
.\start.ps1 -ManagementIP <Windows Node IP> `
-NetworkMode overlay `
-ClusterCIDR <Cluster CIDR> `
-ServiceCIDR <Service CIDR> `
-KubeDnsServiceIP <Kube-dns Service IP> `
-LogDir <Log directory>
```
1. Install containers and Kubernetes (requires a system reboot)
| Parameter | Default Value | Notes |
| --- | --- | --- |
| -ManagementIP | N/A (required) | The IP address assigned to the Windows node. You can use `ipconfig` to find this. |
| -NetworkMode | l2bridge | We're using `overlay` here |
| -ClusterCIDR | 10.244.0.0/16 | Refer to your cluster IP plan |
| -ServiceCIDR | 10.96.0.0/12 | Refer to your cluster IP plan |
| -KubeDnsServiceIP | 10.96.0.10 | |
| -InterfaceName | Ethernet | The name of the network interface of the Windows host. You can use `ipconfig` to find this. |
| -LogDir | C:\k | The directory where kubelet and kube-proxy logs are redirected into their respective output files. |
Use the previously downloaded [KubeCluster.ps1](https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/kubeadm/v1.15.0/KubeCluster.ps1) script to install Kubernetes on the Windows Server container host:
```PowerShell
.\KubeCluster.ps1 -ConfigFile .\Kubeclustervxlan.json -install
```
where `-ConfigFile` points to the path of the Kubernetes configuration file.
{{< note >}}
In this example, we are using overlay networking mode. This requires Windows Server version 2019 with [KB4489899](https://support.microsoft.com/help/4489899) and Kubernetes v1.14 or later. Users who cannot meet this requirement must use `L2bridge` networking instead by selecting `bridge` as the [plugin](https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/kubeadm/v1.15.0/Kubeclusterbridge.json#L18) in the configuration file.
{{< /note >}}
![alt_text](../kubecluster.ps1-install.gif "KubeCluster.ps1 install output")
On the Windows node you target, this step will:
1. Enable Windows Server containers role (and reboot)
1. Download and install the chosen container runtime
1. Download all needed container images
1. Download Kubernetes binaries and add them to the `$PATH` environment variable
1. Download CNI plugins based on the selection made in the Kubernetes Configuration file
1. (Optionally) Generate a new SSH key which is required to connect to the control-plane ("Master") node during joining
{{< note >}}For the SSH key generation step, you also need to add the generated public SSH key to the `authorized_keys` file on your (Linux) control-plane node. You only need to do this once. The script prints out the steps you can follow to do this, at the end of its output.{{< /note >}}
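A minimal sketch of this step on the (Linux) control-plane node, assuming the default `~/.ssh/authorized_keys` location; the key text is a placeholder:
```shell
# Append the public key generated on the Windows node (the script prints its
# location and contents) for the user configured in ControlPlane.Username
mkdir -p ~/.ssh
echo "ssh-rsa AAAAB3... windows-node" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```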
Once installation is complete, any of the generated configuration files or binaries can be modified before joining the Windows node.
#### Join the Windows Node to the Kubernetes cluster
This section covers how to join a [Windows node with Kubernetes installed](#preparing-a-windows-node) to an existing (Linux) control plane, to form a cluster.
Use the previously downloaded [KubeCluster.ps1](https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/kubeadm/v1.15.0/KubeCluster.ps1) script to join the Windows node to the cluster:
```PowerShell
.\KubeCluster.ps1 -ConfigFile .\Kubeclustervxlan.json -join
```
where `-ConfigFile` points to the path of the Kubernetes configuration file.
![alt_text](../kubecluster.ps1-join.gif "KubeCluster.ps1 join output")
{{< note >}}
Should the script fail during the bootstrap or joining procedure for any reason, start a new PowerShell session before each subsequent join attempt.
{{< /note >}}
This step will perform the following actions:
1. Connect to the control-plane ("Master") node via SSH, to retrieve the [Kubeconfig file](/docs/concepts/configuration/organize-cluster-access-kubeconfig/)
1. Register kubelet as a Windows service
1. Configure CNI network plugins
1. Create an HNS network on top of the chosen network interface
{{< note >}}
This may cause a network blip for a few seconds while the vSwitch is being created.
{{< /note >}}
1. (If vxlan plugin is selected) Open up inbound firewall UDP port 4789 for overlay traffic
1. Register flanneld as a Windows service
1. Register kube-proxy as a Windows service
Now you can view the Windows nodes in your cluster by running the following:
@ -290,9 +316,28 @@ Now you can view the Windows nodes in your cluster by running the following:
```
kubectl get nodes
```
{{< note >}}
You may want to configure your Windows node components like kubelet and kube-proxy to run as services. View the services and background processes section under [troubleshooting](#troubleshooting) for additional instructions. Once you are running the node components as services, collecting logs becomes an important part of troubleshooting. View the [gathering logs](https://github.com/kubernetes/community/blob/master/sig-windows/CONTRIBUTING.md#gathering-logs) section of the contributing guide for further instructions.
{{< /note >}}
#### Remove the Windows Node from the Kubernetes cluster
In this section we'll cover how to remove a Windows node from a Kubernetes cluster.
Use the previously downloaded [KubeCluster.ps1](https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/kubeadm/v1.15.0/KubeCluster.ps1) script to remove the Windows node from the cluster:
```PowerShell
.\KubeCluster.ps1 -ConfigFile .\Kubeclustervxlan.json -reset
```
where `-ConfigFile` points to the path of the Kubernetes configuration file.
![alt_text](../kubecluster.ps1-reset.gif "KubeCluster.ps1 reset output")
This step will perform the following actions on the targeted Windows node:
1. Delete the Windows node from the Kubernetes cluster
1. Stop all running containers
1. Remove all container networking (HNS) resources
1. Unregister all Kubernetes services (flanneld, kubelet, kube-proxy)
1. Delete all Kubernetes binaries (kube-proxy.exe, kubelet.exe, flanneld.exe, kubeadm.exe)
1. Delete all CNI network plugins binaries
1. Delete the [kubeconfig file](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/) used to access the Kubernetes cluster
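After the reset completes, you can confirm from the (Linux) control-plane node that the Windows node is no longer registered, for example:
```shell
# The removed Windows node should no longer appear in the list
kubectl get nodes -o wide
```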
### Public Cloud Providers
@ -306,10 +351,12 @@ Users can easily deploy a complete Kubernetes cluster on GCE following this step
#### Deployment with kubeadm and cluster API
Kubeadm is becoming the de facto standard for users to deploy a Kubernetes cluster. Windows node support in kubeadm will come in a future release. We are also making investments in cluster API to ensure Windows nodes are properly provisioned.
Kubeadm is becoming the de facto standard for users to deploy a Kubernetes cluster. Windows node support in kubeadm is an alpha feature since Kubernetes release v1.16. We are also making investments in cluster API to ensure Windows nodes are properly provisioned. For more details, please consult the [kubeadm for Windows KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/20190424-kubeadm-for-windows.md).
### Next Steps
Now that you've configured a Windows worker in your cluster to run Windows containers you may want to add one or more Linux nodes as well to run Linux containers. You are now ready to schedule Windows containers on your cluster.
{{% /capture %}}
@ -167,8 +167,8 @@ The finalizer will only be removed after the load balancer resource is cleaned u
This prevents dangling load balancer resources even in corner cases such as the
service controller crashing.
This feature was introduced as alpha in Kubernetes v1.15. You can start using it by
enabling the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
This feature is beta and enabled by default since Kubernetes v1.16. You can also
enable it in v1.15 (alpha) via the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`ServiceLoadBalancerFinalizer`.
## External Load Balancer Providers
@ -1,7 +1,6 @@
---
title: Versions of CustomResourceDefinitions
title: Versions in CustomResourceDefinitions
reviewers:
- mbohlool
- sttts
- liggitt
content_template: templates/task
@ -19,7 +18,7 @@ level of your CustomResourceDefinitions or advance your API to a new version wit
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
* Make sure your Kubernetes cluster has a master version of 1.11.0 or higher.
* Make sure your Kubernetes cluster has a master version of 1.16.0 or higher for `apiextensions.k8s.io/v1`, or 1.11.0 or higher for `apiextensions.k8s.io/v1beta1`.
* Read about [custom resources](/docs/concepts/api-extension/custom-resources/).
@ -29,28 +28,88 @@ level of your CustomResourceDefinitions or advance your API to a new version wit
## Overview
{{< feature-state state="beta" for_kubernetes_version="1.15" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
The CustomResourceDefinition API supports a `versions` field that you can use to
support multiple versions of custom resources that you have developed. Versions
can have different schemas with a conversion webhook to convert custom resources between versions.
The CustomResourceDefinition API provides a workflow for introducing and upgrading
to new versions of a CustomResourceDefinition.
When a CustomResourceDefinition is created, the first version is set in the
CustomResourceDefinition `spec.versions` list to an appropriate stability level
and a version number. For example `v1beta1` would indicate that the first
version is not yet stable. All custom resource objects will initially be stored
at this version.
Once the CustomResourceDefinition is created, clients may begin using the
`v1beta1` API.
Later it might be necessary to add a new version such as `v1`.
Adding a new version:
1. Pick a conversion strategy. Since custom resource objects need to be able to
be served at both versions, they will sometimes be served at a version
different from their storage version. To make this possible, the custom
resource objects must sometimes be converted between the version they are
stored at and the version they are served at. If the conversion involves schema
changes and requires custom logic, a conversion webhook should be used. If there
are no schema changes, the default `None` conversion strategy may be used and
only the `apiVersion` field will be modified when serving different versions.
1. If using conversion webhooks, create and deploy the conversion webhook. See
[Webhook conversion](#webhook-conversion) for more details.
1. Update the CustomResourceDefinition to include the new version in the
`spec.versions` list with `served:true`. Also, set `spec.conversion` field
to the selected conversion strategy. If using a conversion webhook, configure
`spec.conversion.webhookClientConfig` field to call the webhook.
Once the new version is added, clients may incrementally migrate to the new
version. It is perfectly safe for some clients to use the old version while
others use the new version.
Migrate stored objects to the new version:
1. See the [upgrade existing objects to a new stored version](#upgrade-existing-objects-to-a-new-stored-version) section.
It is safe for clients to use both the old and new version before, during and
after upgrading the objects to a new stored version.
Removing an old version:
1. Ensure all clients are fully migrated to the new version. The kube-apiserver
logs can be reviewed to help identify any clients that are still accessing via
the old version.
1. Set `served` to `false` for the old version in the `spec.versions` list. If
any clients are still unexpectedly using the old version they may begin reporting
errors attempting to access the custom resource objects at the old version.
If this occurs, switch back to using `served:true` on the old version, migrate the
remaining clients to the new version and repeat this step.
1. Ensure the [upgrade of existing objects to the new stored version](#upgrade-existing-objects-to-a-new-stored-version) step has been completed.
1. Verify that `storage` is set to `true` for the new version in the `spec.versions` list in the CustomResourceDefinition (see the verification sketch after this list).
1. Verify that the old version is no longer listed in the CustomResourceDefinition `status.storedVersions`.
1. Remove the old version from the CustomResourceDefinition `spec.versions` list.
1. Drop conversion support for the old version in conversion webhooks.
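The verification steps above can be scripted; for example, a sketch using the `crontabs.example.com` CustomResourceDefinition from the examples below:
```shell
# For each version, show whether it is served and whether it is the storage version
kubectl get crd crontabs.example.com \
  -o jsonpath='{range .spec.versions[*]}{.name}{"\t"}{.served}{"\t"}{.storage}{"\n"}{end}'
# Show which versions still appear in status.storedVersions
kubectl get crd crontabs.example.com -o jsonpath='{.status.storedVersions}{"\n"}'
```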
## Specify multiple versions
The CustomResourceDefinition API `versions` field can be used to support multiple versions of custom resources that you
have developed. Versions can have different schemas, and conversion webhooks can convert custom resources between versions.
Webhook conversions should follow the [Kubernetes API conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md) wherever applicable.
Specifically, see the [API change documentation](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md) for a set of useful gotchas and suggestions.
{{< note >}}
Earlier iterations included a `version` field instead of `versions`. The
In `apiextensions.k8s.io/v1beta1`, there was a `version` field instead of `versions`. The
`version` field is deprecated and optional, but if it is not empty, it must
match the first item in the `versions` field.
{{< /note >}}
## Specify multiple versions
This example shows a CustomResourceDefinition with two versions. For the first
example, the assumption is that all versions share the same schema, with no conversion
between them. The comments in the YAML provide more context.
{{< tabs name="CustomResourceDefinition_versioning_example_1" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
@ -65,9 +124,26 @@ spec:
served: true
# One and only one version must be marked as the storage version.
storage: true
# A schema is required
schema:
openAPIV3Schema:
type: object
properties:
host:
type: string
port:
type: string
- name: v1
served: true
storage: false
schema:
openAPIV3Schema:
type: object
properties:
host:
type: string
port:
type: string
# The conversion section is introduced in Kubernetes 1.13+ with a default value of
# None conversion (strategy sub-field set to None).
conversion:
@ -87,6 +163,57 @@ spec:
shortNames:
- ct
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1beta1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
- name: v1
served: true
storage: false
validation:
openAPIV3Schema:
type: object
properties:
host:
type: string
port:
type: string
# The conversion section is introduced in Kubernetes 1.13+ with a default value of
# None conversion (strategy sub-field set to None).
conversion:
# None conversion assumes the same schema for all versions and only sets the apiVersion
# field of custom resources to the proper value
strategy: None
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
```
{{% /tab %}}
{{< /tabs >}}
You can save the CustomResourceDefinition in a YAML file, then use
`kubectl apply` to create it.
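For example (a sketch; the file name `my-versioned-crontab.yaml` is only an assumption for illustration):
```shell
kubectl apply -f my-versioned-crontab.yaml
# Both versions should now be served for the example.com group
kubectl api-versions | grep example.com
```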
@ -149,7 +276,7 @@ the version.
## Webhook conversion
{{< feature-state state="beta" for_kubernetes_version="1.15" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
{{< note >}}
Webhook conversion is available as beta since 1.15, and as alpha since Kubernetes 1.13. The
@ -169,13 +296,13 @@ To cover all of these cases and to optimize conversion by the API server, the co
### Write a conversion webhook server
Please refer to the implementation of the [custom resource conversion webhook
server](https://github.com/kubernetes/kubernetes/tree/v1.13.0/test/images/crd-conversion-webhook/main.go)
server](https://github.com/kubernetes/kubernetes/tree/v1.15.0/test/images/crd-conversion-webhook/main.go)
that is validated in a Kubernetes e2e test. The webhook handles the
`ConversionReview` requests sent by the API servers, and sends back conversion
results wrapped in `ConversionResponse`. Note that the request
contains a list of custom resources that need to be converted independently without
changing the order of objects.
The example server is organized in a way to be reused for other conversions. Most of the common code are located in the [framework file](https://github.com/kubernetes/kubernetes/tree/v1.14.0/test/images/crd-conversion-webhook/converter/framework.go) that leaves only [one function](https://github.com/kubernetes/kubernetes/blob/v1.13.0/test/images/crd-conversion-webhook/converter/example_converter.go#L29-L80) to be implemented for different conversions.
The example server is organized in a way to be reused for other conversions. Most of the common code is located in the [framework file](https://github.com/kubernetes/kubernetes/tree/v1.15.0/test/images/crd-conversion-webhook/converter/framework.go) that leaves only [one function](https://github.com/kubernetes/kubernetes/blob/v1.15.0/test/images/crd-conversion-webhook/converter/example_converter.go#L29-L80) to be implemented for different conversions.
{{< note >}}
The example conversion webhook server leaves the `ClientAuth` field
@ -208,7 +335,75 @@ if a different port is used for the service.
The `None` conversion example can be extended to use the conversion webhook by modifying `conversion`
section of the `spec`:
{{< tabs name="CustomResourceDefinition_versioning_example_2" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1beta1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
# Each version can define its own schema when there is no top-level
# schema defined.
schema:
openAPIV3Schema:
type: object
properties:
hostPort:
type: string
- name: v1
served: true
storage: false
schema:
openAPIV3Schema:
type: object
properties:
host:
type: string
port:
type: string
conversion:
# a Webhook strategy instructs the API server to call an external webhook for any conversion between custom resources.
strategy: Webhook
# webhook is required when strategy is `Webhook` and it configures the webhook endpoint to be called by the API server.
webhook:
# conversionReviewVersions indicates what ConversionReview versions are understood/preferred by the webhook.
# The first version in the list understood by the API server is sent to the webhook.
# The webhook must respond with a ConversionReview object in the same version it received.
conversionReviewVersions: ["v1","v1beta1"]
clientConfig:
service:
namespace: default
name: example-conversion-webhook-server
path: /crdconvert
caBundle: "Ci0tLS0tQk...<base64-encoded PEM bundle>...tLS0K"
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
@ -248,14 +443,13 @@ spec:
conversion:
# a Webhook strategy instructs the API server to call an external webhook for any conversion between custom resources.
strategy: Webhook
# webhookClientConfig is required when strategy is `Webhook` and it configure the webhook endpoint to be
# called by API server.
# webhookClientConfig is required when strategy is `Webhook` and it configures the webhook endpoint to be called by the API server.
webhookClientConfig:
service:
namespace: default
name: example-conversion-webhook-server
path: /crdconvert
caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
caBundle: "Ci0tLS0tQk...<base64-encoded PEM bundle>...tLS0K"
# either Namespaced or Cluster
scope: Namespaced
names:
@ -269,6 +463,8 @@ spec:
shortNames:
- ct
```
{{% /tab %}}
{{< /tabs >}}
You can save the CustomResourceDefinition in a YAML file, then use
`kubectl apply` to apply it.
@ -313,7 +509,25 @@ Fragments ("#...") and query parameters ("?...") are also not allowed.
Here is an example of a conversion webhook configured to call a URL
(and expects the TLS certificate to be verified using system trust roots, so does not specify a caBundle):
{{< tabs name="CustomResourceDefinition_versioning_example_3" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
...
spec:
...
conversion:
strategy: Webhook
webhook:
clientConfig:
url: "https://my-webhook.example.com:9443/my-webhook-path"
...
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
...
@ -325,6 +539,8 @@ spec:
url: "https://my-webhook.example.com:9443/my-webhook-path"
...
```
{{% /tab %}}
{{< /tabs >}}
### Service Reference
@ -337,7 +553,30 @@ Here is an example of a webhook that is configured to call a service on port "12
at the subpath "/my-path", and to verify the TLS connection against the ServerName
`my-service-name.my-service-namespace.svc` using a custom CA bundle.
{{< tabs name="CustomResourceDefinition_versioning_example_4" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
...
spec:
...
conversion:
strategy: Webhook
webhook:
clientConfig:
service:
namespace: my-service-namespace
name: my-service-name
path: /my-path
port: 1234
caBundle: "Ci0tLS0tQk...<base64-encoded PEM bundle>...tLS0K"
...
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
...
@ -354,6 +593,312 @@ spec:
caBundle: "Ci0tLS0tQk...<base64-encoded PEM bundle>...tLS0K"
...
```
{{% /tab %}}
{{< /tabs >}}
## Webhook request and response
### Request
Webhooks are sent a POST request, with `Content-Type: application/json`,
with a `ConversionReview` API object in the `apiextensions.k8s.io` API group
serialized to JSON as the body.
Webhooks can specify what versions of `ConversionReview` objects they accept
with the `conversionReviewVersions` field in their CustomResourceDefinition:
{{< tabs name="conversionReviewVersions" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
...
spec:
...
conversion:
strategy: Webhook
webhook:
conversionReviewVersions: ["v1", "v1beta1"]
...
```
`conversionReviewVersions` is a required field when creating
`apiextensions.k8s.io/v1` custom resource definitions.
Webhooks are required to support at least one `ConversionReview`
version understood by the current and previous API server.
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
...
spec:
...
conversion:
strategy: Webhook
conversionReviewVersions: ["v1", "v1beta1"]
...
```
If no `conversionReviewVersions` are specified, the default when creating
`apiextensions.k8s.io/v1beta1` custom resource definitions is `v1beta1`.
{{% /tab %}}
{{< /tabs >}}
API servers send the first `ConversionReview` version in the `conversionReviewVersions` list they support.
If none of the versions in the list are supported by the API server, the custom resource definition will not be allowed to be created.
If an API server encounters a conversion webhook configuration that was previously created and does not support any of the `ConversionReview`
versions the API server knows how to send, attempts to call to the webhook will fail.
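You can inspect which `ConversionReview` versions a CustomResourceDefinition declares; a sketch for the CronTab example (the field path shown is for `apiextensions.k8s.io/v1`; in `apiextensions.k8s.io/v1beta1` the field is `spec.conversion.conversionReviewVersions`):
```shell
kubectl get crd crontabs.example.com \
  -o jsonpath='{.spec.conversion.webhook.conversionReviewVersions}{"\n"}'
```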
This example shows the data contained in a `ConversionReview` object
for a request to convert `CronTab` objects to `example.com/v1`:
{{< tabs name="ConversionReview_request" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "ConversionReview",
"request": {
# Random uid uniquely identifying this conversion call
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
# The API group and version the objects should be converted to
"desiredAPIVersion": "example.com/v1",
# The list of objects to convert.
# May contain one or more objects, in one or more versions.
"objects": [
{
"kind": "CronTab",
"apiVersion": "example.com/v1beta1",
"metadata": {
"creationTimestamp": "2019-09-04T14:03:02Z",
"name": "local-crontab",
"namespace": "default",
"resourceVersion": "143",
"uid": "3415a7fc-162b-4300-b5da-fd6083580d66"
},
"hostPort": "localhost:1234"
},
{
"kind": "CronTab",
"apiVersion": "example.com/v1beta1",
"metadata": {
"creationTimestamp": "2019-09-03T13:02:01Z",
"name": "remote-crontab",
"resourceVersion": "12893",
"uid": "359a83ec-b575-460d-b553-d859cedde8a0"
},
"hostPort": "example.com:2345"
}
]
}
}
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
{
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
"apiVersion": "apiextensions.k8s.io/v1beta1",
"kind": "ConversionReview",
"request": {
# Random uid uniquely identifying this conversion call
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
# The API group and version the objects should be converted to
"desiredAPIVersion": "example.com/v1",
# The list of objects to convert.
# May contain one or more objects, in one or more versions.
"objects": [
{
"kind": "CronTab",
"apiVersion": "example.com/v1beta1",
"metadata": {
"creationTimestamp": "2019-09-04T14:03:02Z",
"name": "local-crontab",
"namespace": "default",
"resourceVersion": "143",
"uid": "3415a7fc-162b-4300-b5da-fd6083580d66"
},
"hostPort": "localhost:1234"
},
{
"kind": "CronTab",
"apiVersion": "example.com/v1beta1",
"metadata": {
"creationTimestamp": "2019-09-03T13:02:01Z",
"name": "remote-crontab",
"resourceVersion": "12893",
"uid": "359a83ec-b575-460d-b553-d859cedde8a0"
},
"hostPort": "example.com:2345"
}
]
}
}
```
{{% /tab %}}
{{< /tabs >}}
### Response
Webhooks respond with a 200 HTTP status code, `Content-Type: application/json`,
and a body containing a `ConversionReview` object (in the same version they were sent),
with the `response` stanza populated, serialized to JSON.
If conversion succeeds, a webhook should return a `response` stanza containing the following fields:
* `uid`, copied from the `request.uid` sent to the webhook
* `result`, set to `{"status":"Success"}`
* `convertedObjects`, containing all of the objects from `request.objects`, converted to `request.desiredVersion`
Example of a minimal successful response from a webhook:
{{< tabs name="ConversionReview_response_success" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "ConversionReview",
"response": {
# must match <request.uid>
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
"result": {
"status": "Success"
},
# Objects must match the order of request.objects, and have apiVersion set to <request.desiredAPIVersion>.
# kind, metadata.uid, metadata.name, and metadata.namespace fields must not be changed by the webhook.
# metadata.labels and metadata.annotations fields may be changed by the webhook.
# All other changes to metadata fields by the webhook are ignored.
"convertedObjects": [
{
"kind": "CronTab",
"apiVersion": "example.com/v1",
"metadata": {
"creationTimestamp": "2019-09-04T14:03:02Z",
"name": "local-crontab",
"namespace": "default",
"resourceVersion": "143",
"uid": "3415a7fc-162b-4300-b5da-fd6083580d66"
},
"host": "localhost",
"port": "1234"
},
{
"kind": "CronTab",
"apiVersion": "example.com/v1",
"metadata": {
"creationTimestamp": "2019-09-03T13:02:01Z",
"name": "remote-crontab",
"resourceVersion": "12893",
"uid": "359a83ec-b575-460d-b553-d859cedde8a0"
},
"host": "example.com",
"port": "2345"
}
]
}
}
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
{
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
"apiVersion": "apiextensions.k8s.io/v1beta1",
"kind": "ConversionReview",
"response": {
# must match <request.uid>
"uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
"result": {
"status": "Failed"
},
# Objects must match the order of request.objects, and have apiVersion set to <request.desiredAPIVersion>.
# kind, metadata.uid, metadata.name, and metadata.namespace fields must not be changed by the webhook.
# metadata.labels and metadata.annotations fields may be changed by the webhook.
# All other changes to metadata fields by the webhook are ignored.
"convertedObjects": [
{
"kind": "CronTab",
"apiVersion": "example.com/v1",
"metadata": {
"creationTimestamp": "2019-09-04T14:03:02Z",
"name": "local-crontab",
"namespace": "default",
"resourceVersion": "143",
"uid": "3415a7fc-162b-4300-b5da-fd6083580d66"
},
"host": "localhost",
"port": "1234"
},
{
"kind": "CronTab",
"apiVersion": "example.com/v1",
"metadata": {
"creationTimestamp": "2019-09-03T13:02:01Z",
"name": "remote-crontab",
"resourceVersion": "12893",
"uid": "359a83ec-b575-460d-b553-d859cedde8a0"
},
"host": "example.com",
"port": "2345"
}
]
}
}
```
{{% /tab %}}
{{< /tabs >}}
If conversion fails, a webhook should return a `response` stanza containing the following fields:
* `uid`, copied from the `request.uid` sent to the webhook
* `result`, set to `{"status":"Failed"}`
{{< warning >}}
Failing conversion can disrupt read and write access to the custom resources,
including the ability to update or delete the resources. Conversion failures
should be avoided whenever possible, and should not be used to enforce validation
constraints (use validation schemas or webhook admission instead).
{{< /warning >}}
Example of a response from a webhook indicating a conversion request failed, with an optional message:
{{< tabs name="ConversionReview_response_failure" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "ConversionReview",
"response": {
"uid": "<value from request.uid>",
"result": {
"status": "Failed",
"message": "hostPort could not be parsed into a separate host and port"
}
}
}
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
{
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
"apiVersion": "apiextensions.k8s.io/v1beta1",
"kind": "ConversionReview",
"response": {
"uid": "<value from request.uid>",
"result": {
"status": "Failed",
"message": "hostPort could not be parsed into a separate host and port"
}
}
}
```
{{% /tab %}}
{{< /tabs >}}
## Writing, reading, and updating versioned CustomResourceDefinition objects
@ -398,16 +943,23 @@ can exist in storage at a version that has never been a storage version.
## Upgrade existing objects to a new stored version
When deprecating versions and dropping support, devise a storage upgrade
procedure. The following is an example procedure to upgrade from `v1beta1`
to `v1`.
When deprecating versions and dropping support, select a storage upgrade
procedure.
*Option 1:* Use the Storage Version Migrator
1. Run the [Storage Version Migrator](https://github.com/kubernetes-sigs/kube-storage-version-migrator)
2. Remove the old version from the CustomResourceDefinition `status.storedVersions` field.
*Option 2:* Manually upgrade the existing objects to a new stored version
The following is an example procedure to upgrade from `v1beta1` to `v1`.
1. Set `v1` as the storage in the CustomResourceDefinition file and apply it
using kubectl. The `storedVersions` is now `v1beta1, v1`.
2. Write an upgrade procedure to list all existing objects and write them with
the same content. This forces the backend to write objects in the current
storage version, which is `v1` (see the sketch after this list).
3. Update the CustomResourceDefinition `Status` by removing `v1beta1` from
`storedVersions` field.
2. Remove `v1beta1` from the CustomResourceDefinition `status.storedVersions` field.
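A sketch of step 2 for the `CronTab` example, re-writing every existing object so that it is persisted at the new storage version:
```shell
# Listing and re-writing the objects forces them to be stored at the
# current storage version (v1)
kubectl get crontabs --all-namespaces -o yaml | kubectl replace -f -
```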
{{% /capture %}}
@ -2,7 +2,9 @@
title: Extend the Kubernetes API with CustomResourceDefinitions
reviewers:
- deads2k
- enisoc
- jpbetz
- liggitt
- roycaihw
- sttts
content_template: templates/task
weight: 20
@ -19,7 +21,7 @@ into the Kubernetes API by creating a
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
* Make sure your Kubernetes cluster has a master version of 1.7.0 or higher.
* Make sure your Kubernetes cluster has a master version of 1.16.0 or higher to use `apiextensions.k8s.io/v1`, or 1.7.0 or higher for `apiextensions.k8s.io/v1beta1`.
* Read about [custom resources](/docs/concepts/api-extension/custom-resources/).
@ -37,7 +39,54 @@ are available to all namespaces.
For example, if you save the following CustomResourceDefinition to `resourcedefinition.yaml`:
{{< tabs name="CustomResourceDefinition_example_1" >}}
{{% tab name="admissionregistration.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
# name must match the spec fields below, and be in the form: <plural>.<group>
name: crontabs.stable.example.com
spec:
# group name to use for REST API: /apis/<group>/<version>
group: stable.example.com
# list of versions supported by this CustomResourceDefinition
versions:
- name: v1
# Each version can be enabled/disabled by Served flag.
served: true
# One and only one version must be marked as the storage version.
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
# either Namespaced or Cluster
scope: Namespaced
names:
# plural name to be used in the URL: /apis/<group>/<version>/<plural>
plural: crontabs
# singular name to be used as an alias on the CLI and for display
singular: crontab
# kind is normally the CamelCased singular type. Your resource manifests use this.
kind: CronTab
# shortNames allow shorter string to match your resource on the CLI
shortNames:
- ct
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
@ -80,6 +129,8 @@ spec:
replicas:
type: integer
```
{{% /tab %}}
{{< /tabs >}}
And create it:
@ -192,11 +243,11 @@ If you later recreate the same CustomResourceDefinition, it will start out empty
## Specifying a structural schema
{{< feature-state state="beta" for_kubernetes_version="1.15" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
CustomResources traditionally store arbitrary JSON (next to `apiVersion`, `kind` and `metadata`, which is validated by the API server implicitly). With [OpenAPI v3.0 validation](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation) a schema can be specified, which is validated during creation and updates; see below for details and limits of such a schema.
With `apiextensions.k8s.io/v1` the definition of a structural schema will be mandatory for CustomResourceDefinitions, while in `v1beta1` this is still optional.
With `apiextensions.k8s.io/v1` the definition of a structural schema is mandatory for CustomResourceDefinitions, while in `v1beta1` this is still optional.
A structural schema is an [OpenAPI v3.0 validation schema](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation) which:
@ -305,24 +356,34 @@ anyOf:
Violations of the structural schema rules are reported in the `NonStructural` condition in the CustomResourceDefinition.
Not being structural disables the following features:
Structural schemas are a requirement for `apiextensions.k8s.io/v1`; for `apiextensions.k8s.io/v1beta1`, not having a structural schema disables the following features:
* [Validation Schema Publishing](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#publish-validation-schema-in-openapi-v2)
* [Webhook Conversion](/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definition-versioning/#webhook-conversion)
* [Validation Schema Defaulting](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#defaulting)
* [Pruning](#preserving-unknown-fields)
and possibly more features in the future.
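To check whether a CustomResourceDefinition was flagged with the `NonStructural` condition mentioned above, a sketch for the CronTab example:
```shell
# An empty result means the CustomResourceDefinition was not flagged as non-structural
kubectl get crd crontabs.stable.example.com \
  -o jsonpath='{.status.conditions[?(@.type=="NonStructural")].message}{"\n"}'
```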
### Pruning versus preserving unknown fields
{{< feature-state state="beta" for_kubernetes_version="1.15" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
CustomResourceDefinitions traditionally store any (possibly validated) JSON as is in etcd. This means that unspecified fields (if there is an [OpenAPI v3.0 validation schema](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#validation) at all) are persisted. This is in contrast to native Kubernetes resources such as a Pod, where unknown fields are dropped before being persisted to etcd. We call this "pruning" of unknown fields.
If a [structural OpenAPI v3 validation schema](#specifying-a-structural-schema) is defined (either in the global `spec.validation.openAPIV3Schema` or for each version) in a CustomResourceDefinition, pruning can be enabled by setting `spec.preserveUnknownFields` to `false`. Then unspecified fields on creation and on update are dropped.
Compare the CustomResourceDefinition `crontabs.stable.example.com` above. It has pruning enabled. Hence, if you save the following YAML to `my-crontab.yaml`:
{{< tabs name="CustomResourceDefinition_pruning" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
For CustomResourceDefinitions created in `apiextensions.k8s.io/v1`, [structural OpenAPI v3 validation schemas](#specifying-a-structural-schema) are required and pruning is enabled and cannot be disabled (note that CRDs converted from `apiextensions.k8s.io/v1beta1` to `apiextensions.k8s.io/v1` might lack structural schemas, and `spec.preserveUnknownFields` might be `true`).
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
For CustomResourceDefinitions created in `apiextensions.k8s.io/v1beta1`, if a [structural OpenAPI v3 validation schema](#specifying-a-structural-schema) is defined (either in the global `spec.validation.openAPIV3Schema` in `apiextensions.k8s.io/v1beta1` or for each version) in a CustomResourceDefinition, pruning can be enabled by setting `spec.preserveUnknownFields` to `false`.
{{% /tab %}}
{{< /tabs >}}
If pruning is enabled, unspecified fields in CustomResources on creation and on update are dropped.
Compare the CustomResourceDefinition `crontabs.stable.example.com` above. It has pruning enabled (both in `apiextensions.k8s.io/v1` and `apiextensions.k8s.io/v1beta1`). Hence, if you save the following YAML to `my-crontab.yaml`:
```yaml
apiVersion: "stable.example.com/v1"
@ -362,11 +423,9 @@ The field `someRandomField` has been pruned.
Note that the `kubectl create` call uses `--validate=false` to skip client-side validation. Because the [OpenAPI validation schemas are also published](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#publish-validation-schema-in-openapi-v2) to kubectl, it will also check for unknown fields and reject those objects long before they are sent to the API server.
In `apiextensions.k8s.io/v1beta1`, pruning is disabled by default, i.e. `spec.preserveUnknownFields` defaults to `true`. In `apiextensions.k8s.io/v1`, no new CustomResourceDefinitions with `spec.preserveUnknownFields: true` are allowed to be created.
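A quick way to observe pruning, reusing the `my-crontab.yaml` example above (a sketch):
```shell
# Skip client-side validation so the request reaches the API server
kubectl create --validate=false -f my-crontab.yaml
# Read the object back; with pruning enabled, someRandomField is gone
kubectl get crontab my-new-cron-object -o yaml
```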
### Controlling pruning
With `spec.preserveUnknownField: false` in the CustomResourceDefinition, pruning is enabled for all custom resources of that type and in all versions. It is possible though to opt-out of that for JSON sub-trees via `x-kubernetes-preserve-unknown-fields: true` in the [structural OpenAPI v3 validation schema](#specifying-a-structural-schema):
If pruning is enabled (enforced in `apiextensions.k8s.io/v1`, or as opt-in via `spec.preserveUnknownFields: false` in `apiextensions.k8s.io/v1beta1`) in the CustomResourceDefinition, all unspecified fields in custom resources of that type and in all versions are pruned. It is possible, though, to opt out of that for JSON sub-trees via `x-kubernetes-preserve-unknown-fields: true` in the [structural OpenAPI v3 validation schema](#specifying-a-structural-schema):
```yaml
type: object
@ -545,10 +604,11 @@ meaning all finalizers have been executed.
### Validation
{{< feature-state state="beta" for_kubernetes_version="1.9" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
Validation of custom objects is possible via
[OpenAPI v3 schema](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md#schemaObject) or [validatingadmissionwebhook](/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook).
[OpenAPI v3 schemas](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md#schemaObject) or [validatingadmissionwebhook](/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook). In `apiextensions.k8s.io/v1` schemas are required, in `apiextensions.k8s.io/v1beta1` they are optional.
Additionally, the following restrictions are applied to the schema:
- These fields cannot be set:
@ -568,15 +628,10 @@ Additionally, the following restrictions are applied to the schema:
These fields can only be set with specific features enabled:
- `default`: the `CustomResourceDefaulting` feature gate must be enabled, compare [Validation Schema Defaulting](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#defaulting).
- `default`: can be set for `apiextensions.k8s.io/v1` CustomResourceDefinitions. Defaulting is in beta since 1.16 and requires the `CustomResourceDefaulting` feature gate to be enabled (which is the case automatically for many clusters for beta features). Compare [Validation Schema Defaulting](/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/#defaulting).
Note: compare with [structural schemas](#specifying-a-structural-schema) for further restrictions required for certain CustomResourceDefinition features.
{{< note >}}
OpenAPI v3 validation is available as beta. The
`CustomResourceValidation` feature must be enabled, which is the case automatically for many clusters for beta features. Please refer to the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) documentation for more information.
{{< /note >}}
The schema is defined in the CustomResourceDefinition. In the following example, the
CustomResourceDefinition applies the following validations on the custom object:
@ -585,7 +640,46 @@ CustomResourceDefinition applies the following validations on the custom object:
Save the CustomResourceDefinition to `resourcedefinition.yaml`:
{{< tabs name="CustomResourceDefinition_validation" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
versions:
- name: v1
served: true
storage: true
schema:
# openAPIV3Schema is the schema for validating custom objects.
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
replicas:
type: integer
minimum: 1
maximum: 10
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
@ -620,6 +714,8 @@ spec:
minimum: 1
maximum: 10
```
{{% /tab %}}
{{< /tabs >}}
And create it:
@ -685,18 +781,16 @@ crontab "my-new-cron-object" created
### Defaulting
{{< feature-state state="alpha" for_kubernetes_version="1.15" >}}
{{< feature-state state="beta" for_kubernetes_version="1.16" >}}
{{< note >}}
Defaulting is available as alpha since 1.15. It is disabled by default and can be enabled via the `CustomResourceDefaulting` feature gate. Please refer to the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) documentation for more information.
Defaulting also requires a structural schema and pruning.
Defaulting is available as beta since 1.16 in `apiextensions.k8s.io/v1` CustomResourceDefinitions, and hence enabled by default for most clusters (feature gate `CustomResourceDefaulting`, refer to the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) documentation).
{{< /note >}}
Defaulting allows you to specify default values in the [OpenAPI v3 validation schema](#validation):
```yaml
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
@ -706,34 +800,32 @@ spec:
- name: v1
served: true
storage: true
version: v1
schema:
# openAPIV3Schema is the schema for validating custom objects.
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
default: "5 0 * * *"
image:
type: string
replicas:
type: integer
minimum: 1
maximum: 10
default: 1
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
preserveUnknownFields: false
validation:
# openAPIV3Schema is the schema for validating custom objects.
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
default: "5 0 * * *"
image:
type: string
replicas:
type: integer
minimum: 1
maximum: 10
default: 1
- ct
```
With this both `cronSpec` and `replicas` are defaulted:
@ -762,15 +854,19 @@ spec:
Note that defaulting happens on the object
* in the request to the API server using the request version defaults
* when reading from etcd using the storage version defaults
* in the request to the API server using the request version defaults,
* when reading from etcd using the storage version defaults,
* after mutating admission plugins with non-empty patches using the admission webhook object version defaults.
Note that defaults applied when reading data from etcd are not automatically written back to etcd. An update request via the API is required to persist those defaults back into etcd.
Defaults applied when reading data from etcd are not automatically written back to etcd. An update request via the API is required to persist those defaults back into etcd.
Default values must be pruned (with the exception of defaults for `metadata` fields) and must validate against a provided schema.
Default values for `metadata` fields of `x-kubernetes-embedded-resources: true` nodes (or parts of a default value covering `metadata`) are not pruned during CustomResourceDefinition creation, but through the pruning step during handling of requests.
### Publish Validation Schema in OpenAPI v2
{{< feature-state state="beta" for_kubernetes_version="1.15" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
{{< note >}}
OpenAPI v2 Publishing is available as beta since 1.15, and as alpha since 1.14. The
@ -801,34 +897,98 @@ CustomResourceDefinition. The following example adds the `Spec`, `Replicas`, and
columns.
1. Save the CustomResourceDefinition to `resourcedefinition.yaml`.
```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
version: v1
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
additionalPrinterColumns:
- name: Spec
type: string
description: The cron spec defining the interval a CronJob is run
JSONPath: .spec.cronSpec
- name: Replicas
type: integer
description: The number of jobs launched by the CronJob
JSONPath: .spec.replicas
- name: Age
type: date
JSONPath: .metadata.creationTimestamp
```
{{< tabs name="CustomResourceDefinition_printer_columns" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
additionalPrinterColumns:
- name: Spec
type: string
description: The cron spec defining the interval a CronJob is run
jsonPath: .spec.cronSpec
- name: Replicas
type: integer
description: The number of jobs launched by the CronJob
jsonPath: .spec.replicas
- name: Age
type: date
jsonPath: .metadata.creationTimestamp
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
version: v1
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
validation:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
additionalPrinterColumns:
- name: Spec
type: string
description: The cron spec defining the interval a CronJob is run
JSONPath: .spec.cronSpec
- name: Replicas
type: integer
description: The number of jobs launched by the CronJob
JSONPath: .spec.replicas
- name: Age
type: date
JSONPath: .metadata.creationTimestamp
```
{{% /tab %}}
{{< /tabs >}}
2. Create the CustomResourceDefinition:
@ -892,7 +1052,7 @@ The column's `format` controls the style used when `kubectl` prints the value.
### Subresources
{{< feature-state state="beta" for_kubernetes_version="1.11" >}}
{{< feature-state state="stable" for_kubernetes_version="1.16" >}}
Custom resources support `/status` and `/scale` subresources.
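With the scale subresource enabled (as in the example that follows), the standard scaling commands work against the custom resource; a sketch using the `my-new-cron-object` instance from earlier:
```shell
# Scale through the /scale subresource
kubectl scale --replicas=5 crontabs/my-new-cron-object
# Confirm that .spec.replicas was updated
kubectl get crontab my-new-cron-object -o jsonpath='{.spec.replicas}{"\n"}'
```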
@ -972,7 +1132,63 @@ In the following example, both status and scale subresources are enabled.
Save the CustomResourceDefinition to `resourcedefinition.yaml`:
{{< tabs name="CustomResourceDefinition_scale" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
status:
type: object
properties:
replicas:
type: integer
labelSelector:
type: string
# subresources describes the subresources for custom resources.
subresources:
# status enables the status subresource.
status: {}
# scale enables the scale subresource.
scale:
# specReplicasPath defines the JSONPath inside of a custom resource that corresponds to Scale.Spec.Replicas.
specReplicasPath: .spec.replicas
# statusReplicasPath defines the JSONPath inside of a custom resource that corresponds to Scale.Status.Replicas.
statusReplicasPath: .status.replicas
# labelSelectorPath defines the JSONPath inside of a custom resource that corresponds to Scale.Status.Selector.
labelSelectorPath: .status.labelSelector
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
@ -990,6 +1206,26 @@ spec:
kind: CronTab
shortNames:
- ct
validation:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
status:
type: object
properties:
replicas:
type: integer
labelSelector:
type: string
# subresources describes the subresources for custom resources.
subresources:
# status enables the status subresource.
@ -1003,6 +1239,8 @@ spec:
# labelSelectorPath defines the JSONPath inside of a custom resource that corresponds to Scale.Status.Selector.
labelSelectorPath: .status.labelSelector
```
{{% /tab %}}
{{< /tabs >}}
And create it:
@ -1068,8 +1306,10 @@ and illustrates how to output the custom resource using `kubectl get all`.
Save the following CustomResourceDefinition to `resourcedefinition.yaml`:
{{< tabs name="CustomResourceDefinition_categories" >}}
{{% tab name="apiextensions.k8s.io/v1" %}}
```yaml
apiVersion: apiextensions.k8s.io/v1beta1
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
@ -1079,6 +1319,19 @@ spec:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
scope: Namespaced
names:
plural: crontabs
@ -1090,6 +1343,46 @@ spec:
categories:
- all
```
{{% /tab %}}
{{% tab name="apiextensions.k8s.io/v1beta1" %}}
```yaml
# Deprecated in v1.16 in favor of apiextensions.k8s.io/v1
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
versions:
- name: v1
served: true
storage: true
validation:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
replicas:
type: integer
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
# categories is a list of grouped resources the custom resource belongs to.
categories:
- all
```
{{% /tab %}}
{{< /tabs >}}
And create it:
@ -1134,7 +1427,7 @@ crontabs/my-new-cron-object 3s
{{% capture whatsnext %}}
* See [CustomResourceDefinition](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#customresourcedefinition-v1beta1-apiextensions-k8s-io).
* See [CustomResourceDefinition](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#customresourcedefinition-v1-apiextensions-k8s-io).
* Serve [multiple versions](/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definition-versioning/) of a
CustomResourceDefinition.
@ -1,348 +0,0 @@
---
reviewers:
- sig-cluster-lifecycle
title: Upgrading kubeadm clusters from v1.12 to v1.13
content_template: templates/task
---
{{% capture overview %}}
This page explains how to upgrade a Kubernetes cluster created with `kubeadm` from version 1.12.x to version 1.13.x, and from version 1.13.x to 1.13.y, where `y > x`.
{{% /capture %}}
{{% capture prerequisites %}}
- You need to have a `kubeadm` Kubernetes cluster running version 1.12.0 or later.
[Swap must be disabled][swap].
The cluster should use a static control plane and etcd pods.
- Make sure you read the [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md) carefully.
- Make sure to back up any important components, such as app-level state stored in a database.
`kubeadm upgrade` does not touch your workloads, only components internal to Kubernetes, but backups are always a best practice.
[swap]: https://serverfault.com/questions/684771/best-way-to-disable-swap-in-linux
### Additional information
- All containers are restarted after upgrade, because the container spec hash value is changed.
- You can upgrade only from one minor version to the next minor version.
That is, you cannot skip versions when you upgrade.
For example, you can upgrade only from 1.10 to 1.11, not from 1.9 to 1.11.
{{< warning >}}
The command `join --experimental-control-plane` is known to fail on single node clusters created with kubeadm v1.12 and then upgraded to v1.13.x.
This will be fixed when graduating the `join --control-plane` workflow from alpha to beta.
A possible workaround is described [here](https://github.com/kubernetes/kubeadm/issues/1269#issuecomment-441116249).
{{</ warning >}}
{{% /capture %}}
{{% capture steps %}}
## Determine which version to upgrade to
1. Find the latest stable 1.13 version:
{{< tabs name="k8s_install_versions" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
apt update
apt-cache policy kubeadm
# find the latest 1.13 version in the list
# it should look like 1.13.x-00, where x is the latest patch
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
yum list --showduplicates kubeadm --disableexcludes=kubernetes
# find the latest 1.13 version in the list
# it should look like 1.13.x-0, where x is the latest patch
{{% /tab %}}
{{< /tabs >}}
## Upgrade the control plane node
1. On your control plane node, upgrade kubeadm:
{{< tabs name="k8s_install_kubeadm" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.13.x-00 with the latest patch version
apt-mark unhold kubeadm && \
apt-get update && apt-get install -y kubeadm=1.13.x-00 && \
apt-mark hold kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.13.x-0 with the latest patch version
yum install -y kubeadm-1.13.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
1. Verify that the download works and has the expected version:
```shell
kubeadm version
```
1. On the master node, run:
```shell
kubeadm upgrade plan
```
You should see output similar to this:
```shell
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.12.2
[upgrade/versions] kubeadm version: v1.13.0
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 2 x v1.12.2 v1.13.0
Upgrade to the latest version in the v1.12 series:
COMPONENT CURRENT AVAILABLE
API Server v1.12.2 v1.13.0
Controller Manager v1.12.2 v1.13.0
Scheduler v1.12.2 v1.13.0
Kube Proxy v1.12.2 v1.13.0
CoreDNS 1.2.2 1.2.6
Etcd 3.2.24 3.2.24
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.13.0
_____________________________________________________________________
```
This command checks that your cluster can be upgraded, and fetches the versions you can upgrade to.
1. Choose a version to upgrade to, and run the appropriate command. For example:
```shell
kubeadm upgrade apply v1.13.0
```
You should see output similar to this:
<!-- TODO: output from stable -->
```shell
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/apply] Respecting the --cri-socket flag that is set with higher priority than the config file.
[upgrade/version] You have chosen to change the cluster version to "v1.13.0"
[upgrade/versions] Cluster version: v1.12.2
[upgrade/versions] kubeadm version: v1.13.0
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.13.0"...
Static pod: kube-apiserver-ip-10-0-0-7 hash: 4af3463d6ace12615f1795e40811c1a1
Static pod: kube-controller-manager-ip-10-0-0-7 hash: a640b0098f5bddc701786e007c96e220
Static pod: kube-scheduler-ip-10-0-0-7 hash: ee7b1077c61516320f4273309e9b4690
map[localhost:2379:3.2.24]
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests969681047"
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2018-11-20-18-30-42/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-ip-10-0-0-7 hash: 4af3463d6ace12615f1795e40811c1a1
Static pod: kube-apiserver-ip-10-0-0-7 hash: bf5b045d2be93e73654f3eb7027a4ef8
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2018-11-20-18-30-42/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-ip-10-0-0-7 hash: a640b0098f5bddc701786e007c96e220
Static pod: kube-controller-manager-ip-10-0-0-7 hash: 1e0eea23b3d971460ac032c18ab7daac
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2018-11-20-18-30-42/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-ip-10-0-0-7 hash: ee7b1077c61516320f4273309e9b4690
Static pod: kube-scheduler-ip-10-0-0-7 hash: 7f7d929b61a2cc5bcdf36609f75927ec
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ip-10-0-0-7" as an annotation
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.13.0". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
```
1. Manually upgrade your Software Defined Network (SDN).
Your Container Network Interface (CNI) provider may have its own upgrade instructions to follow.
Check the [addons](/docs/concepts/cluster-administration/addons/) page to
find your CNI provider and see whether additional upgrade steps are required.
1. Upgrade the kubelet on the control plane node:
{{< tabs name="k8s_install_kubelet" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.13.x-00 with the latest patch version
apt-mark unhold kubelet && \
apt-get update && apt-get install -y kubelet=1.13.x-00 && \
apt-mark hold kubelet
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.13.x-0 with the latest patch version
yum install -y kubelet-1.13.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
## Upgrade kubectl on all nodes
1. Upgrade kubectl on all nodes:
{{< tabs name="k8s_install_kubectl" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.13.x-00 with the latest patch version
apt-mark unhold kubectl && \
apt-get update && apt-get install -y kubectl=1.13.x-00 && \
apt-mark hold kubectl
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.13.x-0 with the latest patch version
yum install -y kubectl-1.13.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
## Drain control plane and worker nodes
1. Prepare each node for maintenance by marking it unschedulable and evicting the workloads. Run:
```shell
kubectl drain $NODE --ignore-daemonsets
```
On the control plane node, you must add `--ignore-daemonsets`:
```shell
kubectl drain ip-172-31-85-18
node "ip-172-31-85-18" cordoned
error: unable to drain node "ip-172-31-85-18", aborting command...
There are pending nodes to be drained:
ip-172-31-85-18
error: DaemonSet-managed pods (use --ignore-daemonsets to ignore): calico-node-5798d, kube-proxy-thjp9
```
```
kubectl drain ip-172-31-85-18 --ignore-daemonsets
node "ip-172-31-85-18" already cordoned
WARNING: Ignoring DaemonSet-managed pods: calico-node-5798d, kube-proxy-thjp9
node "ip-172-31-85-18" drained
```
## Upgrade the kubelet config on worker nodes
1. On each node except the control plane node, upgrade the kubelet config:
```shell
kubeadm upgrade node config --kubelet-version v1.13.x
```
Replace `x` with the patch version you picked for this upgrade.
## Upgrade kubeadm and the kubelet on worker nodes
1. Upgrade the Kubernetes package version on each `$NODE` node by running the Linux package manager for your distribution:
{{< tabs name="k8s_kubelet_and_kubeadm" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.13.x-00 with the latest patch version
apt-mark unhold kubelet kubeadm
apt-get update
apt-get install -y kubelet=1.13.x-00 kubeadm=1.13.x-00
apt-mark hold kubelet kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.13.x-0 with the latest patch version
yum install -y kubelet-1.13.x-0 kubeadm-1.13.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
## Restart the kubelet for all nodes
1. Restart the kubelet process for all nodes:
```shell
systemctl restart kubelet
```
1. Verify that the new version of the `kubelet` is running on the node:
```shell
systemctl status kubelet
```
1. Bring the node back online by marking it schedulable:
```shell
kubectl uncordon $NODE
```
1. After the kubelet is upgraded on all nodes, verify that all nodes are available again by running the following command from anywhere kubectl can access the cluster:
```shell
kubectl get nodes
```
The `STATUS` column should show `Ready` for all your nodes, and the version number should be updated.
{{% /capture %}}
## Recovering from a failure state
If `kubeadm upgrade` fails and does not roll back, for example because of an unexpected shutdown during execution, you can run `kubeadm upgrade` again.
This command is idempotent and eventually makes sure that the actual state is the desired state you declare.
To recover from a bad state, you can also run `kubeadm upgrade --force` without changing the version that your cluster is running.
## How it works
`kubeadm upgrade apply` does the following:
- Checks that your cluster is in an upgradeable state:
- The API server is reachable
- All nodes are in the `Ready` state
- The control plane is healthy
- Enforces the version skew policies.
- Makes sure the control plane images are available or available to pull to the machine.
- Upgrades the control plane components, or rolls back if any of them fails to come up.
- Applies the new `kube-dns` and `kube-proxy` manifests and makes sure that all necessary RBAC rules are created.
- Creates new certificate and key files of the API server and backs up old files if they're about to expire in 180 days.

View File

@ -1,379 +0,0 @@
---
reviewers:
- sig-cluster-lifecycle
title: Upgrading kubeadm clusters from v1.13 to v1.14
content_template: templates/task
---
{{% capture overview %}}
This page explains how to upgrade a Kubernetes cluster created with kubeadm from version 1.13.x to version 1.14.x,
and from version 1.14.x to 1.14.y (where `y > x`).
The upgrade workflow at high level is the following:
1. Upgrade the primary control plane node.
1. Upgrade additional control plane nodes.
1. Upgrade worker nodes.
{{< note >}}
With the release of Kubernetes v1.14, the kubeadm instructions for upgrading both HA and single control plane clusters
are merged into a single document.
{{</ note >}}
{{% /capture %}}
{{% capture prerequisites %}}
- You need to have a kubeadm Kubernetes cluster running version 1.13.0 or later.
- [Swap must be disabled](https://serverfault.com/questions/684771/best-way-to-disable-swap-in-linux).
- The cluster should use a static control plane and etcd pods.
- Make sure you read the [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md) carefully.
- Make sure to back up any important components, such as app-level state stored in a database.
`kubeadm upgrade` does not touch your workloads, only components internal to Kubernetes, but backups are always a best practice.
### Additional information
- All containers are restarted after upgrade, because the container spec hash value is changed.
- You only can upgrade from one MINOR version to the next MINOR version,
or between PATCH versions of the same MINOR. That is, you cannot skip MINOR versions when you upgrade.
For example, you can upgrade from 1.y to 1.y+1, but not from 1.y to 1.y+2.
{{% /capture %}}
{{% capture steps %}}
## Determine which version to upgrade to
1. Find the latest stable 1.14 version:
{{< tabs name="k8s_install_versions" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
apt update
apt-cache policy kubeadm
# find the latest 1.14 version in the list
# it should look like 1.14.x-00, where x is the latest patch
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
yum list --showduplicates kubeadm --disableexcludes=kubernetes
# find the latest 1.14 version in the list
# it should look like 1.14.x-0, where x is the latest patch
{{% /tab %}}
{{< /tabs >}}
## Upgrade the first control plane node
1. On your first control plane node, upgrade kubeadm:
{{< tabs name="k8s_install_kubeadm_first_cp" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.14.x-00 with the latest patch version
apt-mark unhold kubeadm kubelet && \
apt-get update && apt-get install -y kubeadm=1.14.x-00 && \
apt-mark hold kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.14.x-0 with the latest patch version
yum install -y kubeadm-1.14.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
1. Verify that the download works and has the expected version:
```shell
kubeadm version
```
1. On the control plane node, run:
```shell
sudo kubeadm upgrade plan
```
You should see output similar to this:
```shell
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.13.3
[upgrade/versions] kubeadm version: v1.14.0
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 2 x v1.13.3 v1.14.0
Upgrade to the latest version in the v1.13 series:
COMPONENT CURRENT AVAILABLE
API Server v1.13.3 v1.14.0
Controller Manager v1.13.3 v1.14.0
Scheduler v1.13.3 v1.14.0
Kube Proxy v1.13.3 v1.14.0
CoreDNS 1.2.6 1.3.1
Etcd 3.2.24 3.3.10
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.14.0
_____________________________________________________________________
```
This command checks that your cluster can be upgraded, and fetches the versions you can upgrade to.
1. Choose a version to upgrade to, and run the appropriate command. For example:
```shell
sudo kubeadm upgrade apply v1.14.x
```
   - Replace `x` with the patch version you picked for this upgrade.
You should see output similar to this:
```shell
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/version] You have chosen to change the cluster version to "v1.14.0"
[upgrade/versions] Cluster version: v1.13.3
[upgrade/versions] kubeadm version: v1.14.0
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.14.0"...
Static pod: kube-apiserver-myhost hash: 6436b0d8ee0136c9d9752971dda40400
Static pod: kube-controller-manager-myhost hash: 8ee730c1a5607a87f35abb2183bf03f2
Static pod: kube-scheduler-myhost hash: 4b52d75cab61380f07c0c5a69fb371d4
[upgrade/etcd] Upgrading to TLS for etcd
Static pod: etcd-myhost hash: 877025e7dd7adae8a04ee20ca4ecb239
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-03-14-20-52-44/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: etcd-myhost hash: 877025e7dd7adae8a04ee20ca4ecb239
Static pod: etcd-myhost hash: 877025e7dd7adae8a04ee20ca4ecb239
Static pod: etcd-myhost hash: 64a28f011070816f4beb07a9c96d73b6
[apiclient] Found 1 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests043818770"
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-03-14-20-52-44/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-myhost hash: 6436b0d8ee0136c9d9752971dda40400
Static pod: kube-apiserver-myhost hash: 6436b0d8ee0136c9d9752971dda40400
Static pod: kube-apiserver-myhost hash: 6436b0d8ee0136c9d9752971dda40400
Static pod: kube-apiserver-myhost hash: b8a6533e241a8c6dab84d32bb708b8a1
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-03-14-20-52-44/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-myhost hash: 8ee730c1a5607a87f35abb2183bf03f2
Static pod: kube-controller-manager-myhost hash: 6f77d441d2488efd9fc2d9a9987ad30b
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-03-14-20-52-44/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-myhost hash: 4b52d75cab61380f07c0c5a69fb371d4
Static pod: kube-scheduler-myhost hash: a24773c92bb69c3748fcce5e540b7574
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.14.0". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
```
1. Manually upgrade your CNI provider plugin.
Your Container Network Interface (CNI) provider may have its own upgrade instructions to follow.
Check the [addons](/docs/concepts/cluster-administration/addons/) page to
find your CNI provider and see whether additional upgrade steps are required.
1. Upgrade the kubelet and kubectl on the control plane node:
{{< tabs name="k8s_install_kubelet" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.14.x-00 with the latest patch version
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y kubelet=1.14.x-00 kubectl=1.14.x-00 && \
apt-mark hold kubelet kubectl
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.14.x-0 with the latest patch version
yum install -y kubelet-1.14.x-0 kubectl-1.14.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
1. Restart the kubelet
```shell
sudo systemctl restart kubelet
```
## Upgrade additional control plane nodes
1. Same as the first control plane node but use:
```
sudo kubeadm upgrade node experimental-control-plane
```
instead of:
```
sudo kubeadm upgrade apply
```
Also `sudo kubeadm upgrade plan` is not needed.
## Upgrade worker nodes
The upgrade procedure on worker nodes should be executed one node at a time or few nodes at a time,
without compromising the minimum required capacity for running your workloads.
### Upgrade kubeadm
1. Upgrade kubeadm on all worker nodes:
{{< tabs name="k8s_install_kubeadm_worker_nodes" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.14.x-00 with the latest patch version
apt-mark unhold kubeadm kubelet && \
apt-get update && apt-get install -y kubeadm=1.14.x-00 && \
apt-mark hold kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.14.x-0 with the latest patch version
yum install -y kubeadm-1.14.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
### Cordon the node
1. Prepare the node for maintenance by marking it unschedulable and evicting the workloads. Run:
```shell
kubectl drain $NODE --ignore-daemonsets
```
You should see output similar to this:
```shell
node/ip-172-31-85-18 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-proxy-dj7d7, kube-system/weave-net-z65qx
node/ip-172-31-85-18 drained
```
### Upgrade the kubelet config
1. Upgrade the kubelet config:
```shell
sudo kubeadm upgrade node config --kubelet-version v1.14.x
```
Replace `x` with the patch version you picked for this upgrade.
### Upgrade kubelet and kubectl
1. Upgrade the Kubernetes package version by running the Linux package manager for your distribution:
{{< tabs name="k8s_kubelet_and_kubectl" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.14.x-00 with the latest patch version
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y kubelet=1.14.x-00 kubectl=1.14.x-00 && \
apt-mark hold kubelet kubectl
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.14.x-0 with the latest patch version
yum install -y kubelet-1.14.x-0 kubectl-1.14.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
1. Restart the kubelet
```shell
sudo systemctl restart kubelet
```
### Uncordon the node
1. Bring the node back online by marking it schedulable:
```shell
kubectl uncordon $NODE
```
## Verify the status of the cluster
After the kubelet is upgraded on all nodes, verify that all nodes are available again by running the following command from anywhere kubectl can access the cluster:
```shell
kubectl get nodes
```
The `STATUS` column should show `Ready` for all your nodes, and the version number should be updated.
{{% /capture %}}
## Recovering from a failure state
If `kubeadm upgrade` fails and does not roll back, for example because of an unexpected shutdown during execution, you can run `kubeadm upgrade` again.
This command is idempotent and eventually makes sure that the actual state is the desired state you declare.
To recover from a bad state, you can also run `kubeadm upgrade --force` without changing the version that your cluster is running.
## How it works
`kubeadm upgrade apply` does the following:
- Checks that your cluster is in an upgradeable state:
- The API server is reachable
- All nodes are in the `Ready` state
- The control plane is healthy
- Enforces the version skew policies.
- Makes sure the control plane images are available or available to pull to the machine.
- Upgrades the control plane components, or rolls back if any of them fails to come up.
- Applies the new `kube-dns` and `kube-proxy` manifests and makes sure that all necessary RBAC rules are created.
- Creates new certificate and key files of the API server and backs up old files if they're about to expire in 180 days.
`kubeadm upgrade node experimental-control-plane` does the following on additional control plane nodes:
- Fetches the kubeadm `ClusterConfiguration` from the cluster.
- Optionally backups the kube-apiserver certificate.
- Upgrades the static Pod manifests for the control plane components.

View File

@ -1,168 +0,0 @@
---
reviewers:
- luxas
- timothysc
- jbeda
title: Upgrading kubeadm HA clusters from v1.12 to v1.13
content_template: templates/task
---
{{% capture overview %}}
This page explains how to upgrade a highly available (HA) Kubernetes cluster created with `kubeadm` from version 1.12.x to version 1.13.y. In addition to upgrading, you must also follow the instructions in [Creating HA clusters with kubeadm](/docs/setup/production-environment/tools/kubeadm/high-availability/).
{{% /capture %}}
{{% capture prerequisites %}}
Before proceeding:
- You need to have a `kubeadm` HA cluster running version 1.12 or higher.
- Make sure you read the [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md) carefully.
- Make sure to back up any important components, such as app-level state stored in a database. `kubeadm upgrade` does not touch your workloads, only components internal to Kubernetes, but backups are always a best practice.
- Check the prerequisites for [Upgrading/downgrading kubeadm clusters between v1.12 to v1.13](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-13/).
{{< note >}}
All commands on any control plane or etcd node should be run as root.
{{< /note >}}
{{% /capture %}}
{{% capture steps %}}
## Prepare for both methods
Upgrade `kubeadm` to the version that matches the version of Kubernetes that you are upgrading to:
```shell
apt-mark unhold kubeadm && \
apt-get update && apt-get upgrade -y kubeadm && \
apt-mark hold kubeadm
```
Check prerequisites and determine the upgrade versions:
```shell
kubeadm upgrade plan
```
You should see something like the following:
```
Upgrade to the latest version in the v1.13 series:
COMPONENT CURRENT AVAILABLE
API Server v1.12.2 v1.13.0
Controller Manager v1.12.2 v1.13.0
Scheduler v1.12.2 v1.13.0
Kube Proxy v1.12.2 v1.13.0
CoreDNS 1.2.2 1.2.6
```
## Stacked control plane nodes
### Upgrade the first control plane node
Modify `configmap/kubeadm-config` for this control plane node:
```shell
kubectl edit configmap -n kube-system kubeadm-config
```
Make the following modifications to the ClusterConfiguration key:
- `etcd`
Remove the etcd section completely
Make the following modifications to the ClusterStatus key:
- `apiEndpoints`
Add an entry for each of the additional control plane hosts
Start the upgrade:
```shell
kubeadm upgrade apply v<YOUR-CHOSEN-VERSION-HERE>
```
You should see something like the following:
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.13.0". Enjoy!
The `kubeadm-config` ConfigMap is now updated from `v1alpha3` version to `v1beta1`.
### Upgrading additional control plane nodes
Start the upgrade:
```shell
kubeadm upgrade node experimental-control-plane
```
## External etcd
### Upgrade the first control plane
Run the upgrade:
```
kubeadm upgrade apply v1.13.0
```
### Upgrade the other control plane nodes
For other control plane nodes in the cluster, run the following command:
```
kubeadm upgrade node experimental-control-plane
```
## Next steps
### Manually upgrade your CNI provider
Your Container Network Interface (CNI) provider might have its own upgrade instructions to follow. Check the [addons](/docs/concepts/cluster-administration/addons/) page to find your CNI provider and see whether you need to take additional upgrade steps.
### Update kubelet and kubectl packages
Upgrade the kubelet and kubectl by running the following on each node:
```shell
# use your distro's package manager, e.g. 'apt-get' on Debian-based systems
# for the versions stick to kubeadm's output (see above)
apt-mark unhold kubelet kubectl && \
apt-get update && \
apt-get install kubelet=<NEW-K8S-VERSION> kubectl=<NEW-K8S-VERSION> && \
apt-mark hold kubelet kubectl && \
systemctl restart kubelet
```
In this example a _deb_-based system is assumed and `apt-get` is used for installing the upgraded software. On rpm-based systems the command is `yum install <PACKAGE>=<NEW-K8S-VERSION>` for all packages.
Verify that the new version of the kubelet is running:
```shell
systemctl status kubelet
```
Verify that the upgraded node is available again by running the following command from wherever you run `kubectl`:
```shell
kubectl get nodes
```
If the `STATUS` column shows `Ready` for the upgraded host, you can continue. You might need to repeat the command until the node shows `Ready`.
## If something goes wrong
If the upgrade fails, see whether one of the following scenarios applies:
- If `kubeadm upgrade apply` failed to upgrade the cluster, it will try to perform a rollback. If this is the case on the first master, the cluster is probably still intact.
You can run `kubeadm upgrade apply` again, because it is idempotent and should eventually make sure the actual state is the desired state you are declaring. You can run `kubeadm upgrade apply` to change a running cluster with `x.x.x --> x.x.x` with `--force` to recover from a bad state.
- If `kubeadm upgrade apply` on one of the secondary masters failed, the cluster is upgraded and working, but the secondary masters are in an undefined state. You need to investigate further and join the secondaries manually.
{{% /capture %}}

View File

@ -1,14 +1,17 @@
---
reviewers:
- sig-cluster-lifecycle
title: Upgrading kubeadm clusters from v1.14 to v1.15
title: Upgrading kubeadm clusters
content_template: templates/task
---
{{% capture overview %}}
This page explains how to upgrade a Kubernetes cluster created with kubeadm from version
1.14.x to version 1.15.x, and from version 1.15.x to 1.15.y (where `y > x`).
1.15.x to version 1.16.x, and from version 1.16.x to 1.16.y (where `y > x`).
To see information about upgrading clusters created using older versions of kubeadm,
please pick a Kubernetes version from the drop down menu of this web page.
The upgrade workflow at high level is the following:
@ -20,10 +23,10 @@ The upgrade workflow at high level is the following:
{{% capture prerequisites %}}
- You need to have a kubeadm Kubernetes cluster running version 1.14.0 or later.
- You need to have a kubeadm Kubernetes cluster running version 1.15.0 or later.
- [Swap must be disabled](https://serverfault.com/questions/684771/best-way-to-disable-swap-in-linux).
- The cluster should use a static control plane and etcd pods or external etcd.
- Make sure you read the [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md) carefully.
- Make sure you read the [release notes](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md) carefully.
- Make sure to back up any important components, such as app-level state stored in a database.
`kubeadm upgrade` does not touch your workloads, only components internal to Kubernetes, but backups are always a best practice.
@ -40,19 +43,19 @@ The upgrade workflow at high level is the following:
## Determine which version to upgrade to
1. Find the latest stable 1.15 version:
1. Find the latest stable 1.16 version:
{{< tabs name="k8s_install_versions" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
apt update
apt-cache policy kubeadm
# find the latest 1.15 version in the list
# it should look like 1.15.x-00, where x is the latest patch
# find the latest 1.16 version in the list
# it should look like 1.16.x-00, where x is the latest patch
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
yum list --showduplicates kubeadm --disableexcludes=kubernetes
# find the latest 1.15 version in the list
# it should look like 1.15.x-0, where x is the latest patch
# find the latest 1.16 version in the list
# it should look like 1.16.x-0, where x is the latest patch
{{% /tab %}}
{{< /tabs >}}
@ -64,14 +67,14 @@ The upgrade workflow at high level is the following:
{{< tabs name="k8s_install_kubeadm_first_cp" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.15.x-00 with the latest patch version
# replace x in 1.16.x-00 with the latest patch version
apt-mark unhold kubeadm && \
apt-get update && apt-get install -y kubeadm=1.15.x-00 && \
apt-get update && apt-get install -y kubeadm=1.16.x-00 && \
apt-mark hold kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.15.x-0 with the latest patch version
yum install -y kubeadm-1.15.x-0 --disableexcludes=kubernetes
# replace x in 1.16.x-0 with the latest patch version
yum install -y kubeadm-1.16.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
@ -96,26 +99,26 @@ The upgrade workflow at high level is the following:
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.14.2
[upgrade/versions] kubeadm version: v1.15.0
[upgrade/versions] Cluster version: v1.15.2
[upgrade/versions] kubeadm version: v1.16.0
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 1 x v1.14.2 v1.15.0
Kubelet 1 x v1.15.2 v1.16.0
Upgrade to the latest version in the v1.15 series:
Upgrade to the latest version in the v1.16 series:
COMPONENT CURRENT AVAILABLE
API Server v1.14.2 v1.15.0
Controller Manager v1.14.2 v1.15.0
Scheduler v1.14.2 v1.15.0
Kube Proxy v1.14.2 v1.15.0
CoreDNS 1.3.1 1.3.1
Etcd 3.3.10 3.3.10
API Server v1.15.2 v1.16.0
Controller Manager v1.15.2 v1.16.0
Scheduler v1.15.2 v1.16.0
Kube Proxy v1.15.2 v1.16.0
CoreDNS 1.3.1 1.6.2
Etcd 3.3.10 3.3.15
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.15.0
kubeadm upgrade apply v1.16.0
_____________________________________________________________________
```
@ -123,15 +126,15 @@ The upgrade workflow at high level is the following:
This command checks that your cluster can be upgraded, and fetches the versions you can upgrade to.
{{< note >}}
With the release of Kubernetes v1.15, `kubeadm upgrade` also automatically renews
the certificates that it manages on this node. To opt-out of certificate renewal the flag `--certificate-renewal=false` can be used.
`kubeadm upgrade` also automatically renews the certificates that it manages on this node.
To opt out of certificate renewal, the flag `--certificate-renewal=false` can be used.
For more information see the [certificate management guide](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs).
{{</ note >}}
1. Choose a version to upgrade to, and run the appropriate command. For example:
```shell
sudo kubeadm upgrade apply v1.15.x
sudo kubeadm upgrade apply v1.16.x
```
- Replace `x` with the patch version you picked for this upgrade.
@ -144,9 +147,9 @@ The upgrade workflow at high level is the following:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/version] You have chosen to change the cluster version to "v1.15.0"
[upgrade/versions] Cluster version: v1.14.2
[upgrade/versions] kubeadm version: v1.15.0
[upgrade/version] You have chosen to change the cluster version to "v1.16.0"
[upgrade/versions] Cluster version: v1.15.2
[upgrade/versions] kubeadm version: v1.16.0
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
@ -164,7 +167,7 @@ The upgrade workflow at high level is the following:
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.15.0"...
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.16.0"...
Static pod: kube-apiserver-luboitvbox hash: 8d931c2296a38951e95684cbcbe3b923
Static pod: kube-controller-manager-luboitvbox hash: 2480bf6982ad2103c05f6764e20f2787
Static pod: kube-scheduler-luboitvbox hash: 9b290132363a92652555896288ca3f88
@ -202,8 +205,8 @@ The upgrade workflow at high level is the following:
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upgrade/staticpods] Renewing certificate embedded in "admin.conf"
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
@ -211,7 +214,7 @@ The upgrade workflow at high level is the following:
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.15.0". Enjoy!
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.16.0". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
```
@ -246,14 +249,14 @@ Also `sudo kubeadm upgrade plan` is not needed.
{{< tabs name="k8s_install_kubelet" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.15.x-00 with the latest patch version
# replace x in 1.16.x-00 with the latest patch version
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y kubelet=1.15.x-00 kubectl=1.15.x-00 && \
apt-get update && apt-get install -y kubelet=1.16.x-00 kubectl=1.16.x-00 && \
apt-mark hold kubelet kubectl
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.15.x-0 with the latest patch version
yum install -y kubelet-1.15.x-0 kubectl-1.15.x-0 --disableexcludes=kubernetes
# replace x in 1.16.x-0 with the latest patch version
yum install -y kubelet-1.16.x-0 kubectl-1.16.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
@ -274,14 +277,14 @@ without compromising the minimum required capacity for running your workloads.
{{< tabs name="k8s_install_kubeadm_worker_nodes" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.15.x-00 with the latest patch version
# replace x in 1.16.x-00 with the latest patch version
apt-mark unhold kubeadm && \
apt-get update && apt-get install -y kubeadm=1.15.x-00 && \
apt-get update && apt-get install -y kubeadm=1.16.x-00 && \
apt-mark hold kubeadm
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.15.x-0 with the latest patch version
yum install -y kubeadm-1.15.x-0 --disableexcludes=kubernetes
# replace x in 1.16.x-0 with the latest patch version
yum install -y kubeadm-1.16.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}
@ -315,14 +318,14 @@ without compromising the minimum required capacity for running your workloads.
{{< tabs name="k8s_kubelet_and_kubectl" >}}
{{% tab name="Ubuntu, Debian or HypriotOS" %}}
# replace x in 1.15.x-00 with the latest patch version
# replace x in 1.16.x-00 with the latest patch version
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y kubelet=1.15.x-00 kubectl=1.15.x-00 && \
apt-get update && apt-get install -y kubelet=1.16.x-00 kubectl=1.16.x-00 && \
apt-mark hold kubelet kubectl
{{% /tab %}}
{{% tab name="CentOS, RHEL or Fedora" %}}
# replace x in 1.15.x-0 with the latest patch version
yum install -y kubelet-1.15.x-0 kubectl-1.15.x-0 --disableexcludes=kubernetes
# replace x in 1.16.x-0 with the latest patch version
yum install -y kubelet-1.16.x-0 kubectl-1.16.x-0 --disableexcludes=kubernetes
{{% /tab %}}
{{< /tabs >}}

View File

@ -0,0 +1,153 @@
---
title: Control Topology Management Policies on a node
reviewers:
- ConnorDoyle
- klueska
- lmdaly
- nolancon
content_template: templates/task
---
{{% capture overview %}}
{{< feature-state state="alpha" >}}
An increasing number of systems leverage a combination of CPUs and hardware accelerators to support latency-critical execution and high-throughput parallel computation. These include workloads in fields such as telecommunications, scientific computing, machine learning, financial services and data analytics. Such hybrid systems comprise a high performance environment.
In order to extract the best performance, optimizations related to CPU isolation, memory and device locality are required. However, in Kubernetes, these optimizations are handled by a disjoint set of components.
_Topology Manager_ is a Kubelet component that aims to coordinate the set of components that are responsible for these optimizations.
{{% /capture %}}
{{% capture prerequisites %}}
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
{{% /capture %}}
{{% capture steps %}}
## How Topology Manager Works
Prior to the introduction of the Topology Manager, the CPU and Device Manager in Kubernetes made resource allocation decisions independently of each other.
This can result in undesirable allocations on multi-socket systems; performance/latency sensitive applications will suffer because of these undesirable allocations.
Undesirable in this case means, for example, CPUs and devices being allocated from different NUMA Nodes, thus incurring additional latency.
The Topology Manager is a Kubelet component that acts as a source of truth so that other Kubelet components can make topology-aligned resource allocation choices.
The Topology Manager provides an interface for components, called *Hint Providers*, to send and receive topology information. The Topology Manager has a set of node-level policies which are explained below.
The Topology Manager receives topology information from the *Hint Providers* as a bitmask denoting the NUMA Nodes available and a preferred allocation indication. The Topology Manager policies perform a set of operations on the hints provided and converge on the hint determined by the policy to give the optimal result; if an undesirable hint is stored, the preferred field for the hint is set to false. In the current policies, preferred is the narrowest preferred mask.
The selected hint is stored as part of the Topology Manager. Depending on the policy configured, the pod can be accepted or rejected from the node based on the selected hint.
The hint is then stored in the Topology Manager for use by the *Hint Providers* when they make resource allocation decisions.
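For background, the NUMA Nodes referred to in these hints are the ones the Linux kernel exposes on the node; you can inspect that layout with standard Linux tools. For example (the second command assumes `numactl` is installed, which may require a separate package):

```shell
# Show how many NUMA nodes the machine has and which CPUs belong to each
lscpu | grep -i numa

# If numactl is installed, print the full NUMA topology, including memory per node
numactl --hardware
```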
### Topology Manager Policies
The Topology Manager currently:
- Works on Nodes with the `static` CPU Manager Policy enabled. See [control CPU Management Policies](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/)
- Works on Pods in the `Guaranteed` {{< glossary_tooltip text="QoS class" term_id="qos-class" >}}
If these conditions are met, Topology Manager will align CPU and device requests.
Topology Manager supports four allocation policies. You can set a policy via a Kubelet flag, `--topology-manager-policy`; a sketch of one way to configure this flag is shown after the list below.
The four supported policies are:
* `none` (default)
* `best-effort`
* `restricted`
* `single-numa-node`
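For example, on a kubeadm-provisioned Linux node you might enable the feature gate and select a policy through the kubelet's extra arguments. The file path below is an assumption for deb-based kubeadm installs and varies by setup:

```shell
# Assumed location of the kubelet environment file for kubeadm on deb-based systems
cat <<'EOF' | sudo tee /etc/default/kubelet
KUBELET_EXTRA_ARGS=--feature-gates=TopologyManager=true --topology-manager-policy=single-numa-node
EOF

sudo systemctl restart kubelet
```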
### none policy {#policy-none}
This is the default policy and does not perform any topology alignment.
### best-effort policy {#policy-best-effort}
For each container in a Guaranteed Pod, the kubelet, using the `best-effort` topology
management policy, calls each Hint Provider to discover their resource availability.
Using this information, the Topology Manager stores the
preferred NUMA Node affinity for that container. If the affinity is not preferred,
the Topology Manager stores this and admits the pod to the node anyway.
The *Hint Providers* can then use this information when making the
resource allocation decision.
### restricted policy {#policy-restricted}
For each container in a Guaranteed Pod, the kubelet, using the `restricted` topology
management policy, calls each Hint Provider to discover their resource availability.
Using this information, the Topology Manager stores the
preferred NUMA Node affinity for that container. If the affinity is not preferred,
the Topology Manager rejects this pod from the node. This results in a pod in a `Terminated` state with a pod admission failure.
If the pod is admitted, the *Hint Providers* can then use this information when making the
resource allocation decision.
### single-numa-node policy {#policy-single-numa-node}
For each container in a Guaranteed Pod, the kubelet, using the `single-numa-node` topology
management policy, calls each Hint Provider to discover their resource availability.
Using this information, the Topology Manager determines if a single NUMA Node affinity is possible.
If it is, the Topology Manager stores this and the *Hint Providers* can then use the information when making the
resource allocation decision.
If, however, this is not possible, the Topology Manager rejects the pod from the node. This results in a pod in a `Terminated` state with a pod admission failure.
### Pod Interactions with Topology Manager Policies
Consider the containers in the following pod specs:
```yaml
spec:
containers:
- name: nginx
image: nginx
```
This pod runs in the `BestEffort` QoS class because no resource `requests` or
`limits` are specified.
```yaml
spec:
containers:
- name: nginx
image: nginx
resources:
limits:
memory: "200Mi"
requests:
memory: "100Mi"
```
This pod runs in the `Burstable` QoS class because requests are less than limits.
If the selected policy is anything other than `none`, the Topology Manager would not consider either of these Pod
specifications.
```yaml
spec:
containers:
- name: nginx
image: nginx
resources:
limits:
memory: "200Mi"
cpu: "2"
example.com/device: "1"
requests:
memory: "200Mi"
cpu: "2"
example.com/device: "1"
```
This pod runs in the `Guaranteed` QoS class because `requests` are equal to `limits`.
Topology Manager would consider this Pod. The Topology Manager consults the CPU Manager `static` policy, which returns the topology of available CPUs.
Topology Manager also consults the Device Manager to discover the topology of available devices for `example.com/device`.
Topology Manager will use this information to store the best topology for this container. In the case of this Pod, the CPU Manager and Device Manager will use this stored information at the resource allocation stage.
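As a rough sketch of how to observe the outcome (the manifest name and pod name below are assumptions): wrap the Guaranteed example above in a complete Pod manifest, apply it, and look at the Pod's status and events.

```shell
# Assuming the complete manifest is saved as topo-pod.yaml with metadata.name: topo-pod
kubectl apply -f topo-pod.yaml

# A pod rejected by the Topology Manager ends up Terminated with an admission failure;
# the reason appears in the Events section of the describe output
kubectl describe pod topo-pod
```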
{{% /capture %}}

View File

@ -133,7 +133,7 @@ roleRef:
```
## Configure GMSA credential spec reference in pod spec
In the alpha stage of the feature, the annotation `pod.alpha.windows.kubernetes.io/gmsa-credential-spec-name` is used to specify references to desired GMSA credential spec custom resources in pod specs. This configures all containers in the pod spec to use the specified GMSA. A sample pod spec with the annotation populated to refer to `gmsa-WebApp1`:
In the alpha stage of the feature, the pod spec field `securityContext.windowsOptions.gmsaCredentialSpecName` is used to specify references to desired GMSA credential spec custom resources. This configures all containers in the pod spec to use the specified GMSA. A sample pod spec with the field populated to refer to `gmsa-WebApp1`:
```
apiVersion: apps/v1beta1
@ -152,9 +152,10 @@ spec:
metadata:
labels:
run: with-creds
annotations:
pod.alpha.windows.kubernetes.io/gmsa-credential-spec-name: gmsa-WebApp1 # This must be the name of the cred spec you created
spec:
securityContext:
windowsOptions:
gmsaCredentialSpecName: gmsa-webapp1
containers:
- image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
imagePullPolicy: Always
@ -163,7 +164,7 @@ spec:
beta.kubernetes.io/os: windows
```
Individual containers in a pod spec can also specify the desired GMSA credspec using annotation `<containerName>.container.alpha.windows.kubernetes.io/gmsa-credential-spec`. For example:
Individual containers in a pod spec can also specify the desired GMSA credspec using a per-container `securityContext.windowsOptions.gmsaCredentialSpecName` field. For example:
```
apiVersion: apps/v1beta1
@ -182,13 +183,14 @@ spec:
metadata:
labels:
run: with-creds
annotations:
iis.container.alpha.windows.kubernetes.io/gmsa-credential-spec-name: gmsa-WebApp1 # This must be the name of the cred spec you created
spec:
containers:
- image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
imagePullPolicy: Always
name: iis
securityContext:
windowsOptions:
gmsaCredentialSpecName: gmsa-Webapp1
nodeSelector:
beta.kubernetes.io/os: windows
```

View File

@ -19,6 +19,12 @@ accepting traffic. A Pod is considered ready when all of its Containers are read
One use of this signal is to control which Pods are used as backends for Services.
When a Pod is not ready, it is removed from Service load balancers.
The kubelet uses startup probes to know when a Container application has started.
If such a probe is configured, it disables liveness and readiness checks until
it succeeds, making sure those probes don't interfere with the application startup.
This can be used to apply liveness checks to slow-starting containers, preventing them from
being killed by the kubelet before they are up and running.
{{% /capture %}}
{{% capture prerequisites %}}
@ -231,6 +237,46 @@ livenessProbe:
port: liveness-port
```
## Protect slow starting containers with startup probes {#define-startup-probes}
Sometimes, you have to deal with legacy applications that might require
additional startup time on their first initialization.
In such cases, it can be tricky to set up liveness probe parameters without
compromising the fast response to deadlocks that motivated such a probe.
The trick is to set up a startup probe with the same command, HTTP or TCP
check, with a `failureThreshold * periodSeconds` long enough to cover the
worst-case startup time.
So, the previous example would become:
```yaml
ports:
- name: liveness-port
containerPort: 8080
hostPort: 8080
livenessProbe:
httpGet:
path: /healthz
port: liveness-port
failureThreshold: 1
periodSeconds: 10
startupProbe:
httpGet:
path: /healthz
port: liveness-port
failureThreshold: 30
periodSeconds: 10
```
Thanks to the startup probe, the application will have a maximum of 5 minutes
(30 * 10 = 300s) to finish its startup.
Once the startup probe has succeeded once, the liveness probe takes over to
provide a fast response to container deadlocks.
If the startup probe never succeeds, the container is killed after 300s and
subject to the pod's `restartPolicy`.
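If you want to watch the probes while a container starts, the Pod's events record startup and liveness probe failures; for example (replace `my-pod` with the name of your Pod):

```shell
# Probe failures are recorded as events on the Pod
kubectl describe pod my-pod

# Or list the events for that Pod directly
kubectl get events --field-selector involvedObject.name=my-pod
```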
## Define readiness probes
Sometimes, applications are temporarily unable to serve traffic.

View File

@ -261,9 +261,10 @@ kubectl delete namespace qos-example
* [Configure a Pod Quota for a Namespace](/docs/tasks/administer-cluster/quota-pod-namespace/)
* [Configure Quotas for API Objects](/docs/tasks/administer-cluster/quota-api-object/)
* [Control Topology Management policies on a node](/docs/tasks/administer-cluster/topology-manager/)
{{% /capture %}}

View File

@ -506,6 +506,8 @@ plugin which supports full-text search and analytics.
{{% capture whatsnext %}}
Visit [Auditing with Falco](/docs/tasks/debug-application-cluster/falco)
Visit [Auditing with Falco](/docs/tasks/debug-application-cluster/falco).
Learn about [Mutating webhook auditing annotations](/docs/reference/access-authn-authz/extensible-admission-controllers/#mutating-webhook-auditing-annotations).
{{% /capture %}}

View File

@ -0,0 +1,26 @@
kind: Pod
apiVersion: v1
metadata:
name: mypod
labels:
foo: bar
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
foo: bar
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: zone
operator: NotIn
values:
- zoneC
containers:
- name: pause
image: k8s.gcr.io/pause:3.1

View File

@ -0,0 +1,17 @@
kind: Pod
apiVersion: v1
metadata:
name: mypod
labels:
foo: bar
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
foo: bar
containers:
- name: pause
image: k8s.gcr.io/pause:3.1

View File

@ -0,0 +1,23 @@
kind: Pod
apiVersion: v1
metadata:
name: mypod
labels:
foo: bar
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
foo: bar
- maxSkew: 1
topologyKey: node
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
foo: bar
containers:
- name: pause
image: k8s.gcr.io/pause:3.1