sig-list.md updated

Signed-off-by: Ihor Dvoretskyi <ihor@linux.com>
Ihor Dvoretskyi 2018-06-05 14:52:40 +00:00
commit bed39ba418
151 changed files with 5702 additions and 4833 deletions

Gopkg.lock generated

@ -7,14 +7,14 @@
".",
"cmd/misspell"
]
revision = "59894abde931a32630d4e884a09c682ed20c5c7c"
version = "v0.3.0"
revision = "b90dc15cfd220ecf8bbc9043ecb928cef381f011"
version = "v0.3.4"
[[projects]]
branch = "v2"
name = "gopkg.in/yaml.v2"
packages = ["."]
revision = "eb3733d160e74a9c7e442f435eb3bea458e1d19f"
revision = "5420a8b6744d3b0345ab293f6fcba19c978f1183"
version = "v2.2.1"
[solve-meta]
analyzer-name = "dep"


@ -1,2 +1,6 @@
required = ["github.com/client9/misspell/cmd/misspell"]
[prune]
go-tests = true
unused-packages = true
non-go = true


@ -21,9 +21,10 @@ aliases:
- kris-nova
- countspongebob
sig-azure-leads:
- slack
- justaugustus
- shubheksha
- khenidak
- colemickens
- jdumars
sig-big-data-leads:
- foxish
- erikerlandson
@ -31,6 +32,10 @@ aliases:
- soltysh
- pwittrock
- AdoHe
sig-cloud-provider-leads:
- andrewsykim
- hogepodge
- jagosan
sig-cluster-lifecycle-leads:
- lukemarsden
- roberthbailey

SECURITY_CONTACTS Normal file

@ -0,0 +1,13 @@
# Defined below are the security contacts for this repo.
#
# They are the contact point for the Product Security Team to reach out
# to for triaging and handling of incoming issues.
#
# The below names agree to abide by the
# [Embargo Policy](https://github.com/kubernetes/sig-release/blob/master/security-release-process-documentation/security-release-process.md#embargo-policy)
# and will be removed and replaced if they violate that agreement.
#
# DO NOT REPORT SECURITY VULNERABILITIES DIRECTLY TO THESE NAMES, FOLLOW THE
# INSTRUCTIONS AT https://kubernetes.io/security/
cblecker


@ -1,23 +1,32 @@
# SIG Governance Template
# SIG Charter Guide
## Goals
All Kubernetes SIGs must define a charter defining the scope and governance of the SIG.
The following documents outline recommendations and requirements for SIG governance structure and provide
template documents for SIGs to adapt. The goals are to define the baseline needs for SIGs to self govern
and organize in a way that addresses the needs of the core Kubernetes project.
- The scope must define what areas the SIG is responsible for directing and maintaining.
- The governance must outline the responsibilities within the SIG as well as the roles
owning those responsibilities.
The documents are focused on:
## Steps to create a SIG charter
- Outlining organizational responsibilities
- Outlining organizational roles
- Outlining processes and tools
1. Copy the template into a new file under community/sig-*YOURSIG*/charter.md ([sig-architecture example])
2. Read the [Recommendations and requirements] so you have context for the template
3. Customize your copy of the template for your SIG. Feel free to make adjustments as needed.
4. Update [sigs.yaml] with the individuals holding the roles as defined in the template.
5. Add subprojects owned by your SIG to [sigs.yaml].
6. Create a pull request with a draft of your charter.md and sigs.yaml changes. Communicate it within your SIG
and get feedback as needed.
7. Send the SIG Charter out for review to steering@kubernetes.io. Include the subject "SIG Charter Proposal: YOURSIG"
and a link to the PR in the body.
8. Typically expect feedback within a week of sending your draft. Expect a longer wait if it falls over an
event such as KubeCon or holidays. Make any necessary changes.
9. Once accepted, the steering committee will ratify the PR by merging it.
Specific attention has been given to:
## Steps to update an existing SIG charter
- The role of technical leadership
- The role of operational leadership
- Process for agreeing upon technical decisions
- Process for ensuring technical assets remain healthy
- For significant changes, or any changes that could impact other SIGs, such as the scope, create a
PR and send it to the steering committee for review with the subject: "SIG Charter Update: YOURSIG"
- For minor updates that only impact issues or areas within the scope of the SIG, the SIG Chairs should
facilitate the change.
## How to use the templates
@ -35,6 +44,26 @@ and project.
- [Short Template]
## Goals
The following documents outline recommendations and requirements for SIG charters and provide
template documents for SIGs to adapt. The goals are to define the baseline needs for SIGs to
self govern and exercise ownership over an area of the Kubernetes project.
The documents are focused on:
- Defining SIG scope
- Outlining organizational responsibilities
- Outlining organizational roles
- Outlining processes and tools
Specific attention has been given to:
- The role of technical leadership
- The role of operational leadership
- Process for agreeing upon technical decisions
- Process for ensuring technical assets remain healthy
## FAQ
See [frequently asked questions]
@ -42,3 +71,5 @@ See [frequently asked questions]
[Recommendations and requirements]: sig-governance-requirements.md
[Short Template]: sig-governance-template-short.md
[frequently asked questions]: FAQ.md
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml
[sig-architecture example]: ../../sig-architecture/charter.md


@ -64,7 +64,7 @@ All technical assets *MUST* be owned by exactly 1 SIG subproject. The following
- *SHOULD* define a level of commitment for decisions that have gone through the formal process
(e.g. when is a decision revisited or reversed)
- *MUST* How technical assets of project remain healthy and can be released
- *MUST* define how technical assets of project remain healthy and can be released
- Publicly published signals used to determine if code is in a healthy and releasable state
- Commitment and process to *only* release when signals say code is releasable
- Commitment and process to ensure assets are in a releasable state for milestones / releases


@ -1,8 +1,23 @@
# SIG Governance Template (Short Version)
# SIG YOURSIG Charter
This charter adheres to the conventions described in the [Kubernetes Charter README].
## Scope
This section defines the scope of things that would fall under ownership by this SIG.
It must be used when determining whether subprojects should fall into this SIG.
### In scope
Outline of what falls into the scope of this SIG
### Out of scope
Outline of things that could be confused as falling into this SIG but don't
## Roles
Membership for roles tracked in: <link to OWNERS file>
Membership for roles tracked in: [sigs.yaml]
- Chair
- Run operations and processes governing the SIG
@ -39,7 +54,7 @@ Membership for roles tracked in: <link to OWNERS file>
- *MAY* select additional subproject owners through a [super-majority] vote amongst subproject owners. This
*SHOULD* be supported by a majority of subproject contributors (through [lazy-consensus] with fallback on voting).
- Number: 3-5
- Defined in [sigs.yaml] [OWNERS] files
- Defined in [OWNERS] files that are specified in [sigs.yaml]
- Members
- *MUST* maintain health of at least one subproject or the health of the SIG
@ -50,6 +65,14 @@ Membership for roles tracked in: <link to OWNERS file>
- *MAY* participate in decision making for the subprojects they hold roles in
- Includes all reviewers and approvers in [OWNERS] files for subprojects
- Security Contact
- *MUST* be a contact point for the Product Security Team to reach out to for
triaging and handling of incoming issues
- *MUST* accept the [Embargo Policy](https://github.com/kubernetes/sig-release/blob/master/security-release-process-documentation/security-release-process.md#embargo-policy)
- Defined in `SECURITY_CONTACTS` files; this is only relevant to the root file in
the repository. There is a template
[here](https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS)
## Organizational management
- SIG meets bi-weekly on zoom with agenda in meeting notes
@ -120,3 +143,4 @@ Issues impacting multiple subprojects in the SIG should be resolved by either:
[KEP]: https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml#L1454
[OWNERS]: contributors/devel/owners.md
[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md


@ -10,7 +10,7 @@ The Kubernetes community abides by the [CNCF code of conduct]. Here is an excer
## SIGs
Kubernetes encompasses many projects, organized into [SIGs](sig-list.md).
Kubernetes encompasses many projects, organized into [SIGs](/sig-list.md).
Some communication has moved into SIG-specific channels - see
a given SIG subdirectory for details.
@ -41,11 +41,15 @@ please [file an issue].
## Mailing lists
Development announcements and discussions appear on the Google group
[kubernetes-dev] (send mail to `kubernetes-dev@googlegroups.com`).
Kubernetes mailing lists are hosted through Google Groups. To
receive these lists' emails,
[join](https://support.google.com/groups/answer/1067205) the groups
relevant to you, as you would any other Google Group.
Users trade notes on the Google group
[kubernetes-users] (send mail to `kubernetes-users@googlegroups.com`).
* [kubernetes-announce] broadcasts major project announcements such as releases and security issues
* [kubernetes-dev] hosts development announcements and discussions around developing kubernetes itself
* [kubernetes-users] is where kubernetes users trade notes
* Additional Google groups exist and can be joined for discussion related to each SIG and Working Group. These are linked from the [SIG list](/sig-list.md).
## Accessing community documents
@ -92,6 +96,7 @@ Kubernetes is the main focus of CloudNativeCon/KubeCon, held every spring in Eur
[iCal url]: https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics
[Kubernetes Community Meeting Agenda]: https://docs.google.com/document/d/1VQDIAB0OqiSjIHI8AWMvSdceWhnz56jNpZrLs6o7NJY/edit#
[kubernetes-community-video-chat]: https://groups.google.com/forum/#!forum/kubernetes-community-video-chat
[kubernetes-announce]: https://groups.google.com/forum/#!forum/kubernetes-announce
[kubernetes-dev]: https://groups.google.com/forum/#!forum/kubernetes-dev
[kubernetes-users]: https://groups.google.com/forum/#!forum/kubernetes-users
[kubernetes.slackarchive.io]: https://kubernetes.slackarchive.io


@ -0,0 +1,63 @@
# Moderation on Kubernetes Communications Channels
This page describes the rules and best practices for people chosen to moderate Kubernetes communications channels.
This includes Slack, the mailing lists, and _any communication tool_ used in an official manner by the project.
## Roles and Responsibilities
As part of volunteering to become a moderator, you are now a representative of the Kubernetes community, and it is your responsibility to remain aware of your contributions in this space.
These responsibilities apply to all Kubernetes official channels.
Moderators _MUST_:
- Take action as specified by these Kubernetes Moderator Guidelines.
- You are empowered to take _immediate action_ when there is a violation. You do not need to wait for review or approval if an egregious violation has occurred. Make a judgement call based on our Code of Conduct and Values (see below).
- Removing a bad actor or content from the medium is preferable to letting it sit there.
- Abide by the documented tasks and actions required of moderators.
- Ensure that the [CNCF Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md) is in effect on all official Kubernetes communication channels.
- Become familiar with the [Kubernetes Community Values](https://github.com/kubernetes/steering/blob/master/values.md).
- Take care of spam as soon as possible, which may mean taking action by removing a member from that resource.
- Foster a safe and productive environment by being aware of the potential cultural differences between Kubernetes community members.
- Understand that you might be contacted by moderators, community managers, and other users via private email or a direct message.
- Report egregious behavior to steering@k8s.io.
Moderators _SHOULD_:
- Exercise compassion and empathy when communicating and collaborating with other community members.
- Understand the difference between a user abusing the resource and one who is just having difficulty expressing comments and questions in English.
- Be an example and role model to others in the community.
- Remember to check in and recognize if you need to take a break when you become frustrated or find yourself in a heated debate.
- Help your colleagues if you recognize them in one of the [stages of burnout](https://opensource.com/business/15/12/avoid-burnout-live-happy).
- Be helpful and have fun!
## Violations
The Kubernetes [Steering Committee](https://github.com/kubernetes/steering) will have the final authority regarding escalated moderation matters. Violations of the Code of Conduct will be handled on a case by case basis. Depending on severity this can range up to and including removal of the person from the community, though this is extremely rare.
## Specific Guidelines
These guidelines are for tool-specific policies that don't fit under a general umbrella.
### Mailing Lists
### Slack
- [Slack Guidelines](./slack-guidelines.md)
### Zoom
- [Zoom Guidelines](./zoom-guidelines.md)
### References and Resources
Thanks to the following projects for making their moderation guidelines public, allowing us to build on the shoulders of giants.
Moderators are encouraged to study how other projects handle moderation in order to improve our guidelines:
- Mozilla's [Forum Moderation Guidelines](https://support.mozilla.org/en-US/kb/moderation-guidelines)
- OASIS [How to Moderate a Mailing List](https://www.oasis-open.org/khelp/kmlm/user_help/html/mailing_list_moderation.html)
- Community Spark's [How to effectively moderate forums](http://www.communityspark.com/how-to-effectively-moderate-forums/)
- [5 tips for more effective community moderation](https://www.socialmediatoday.com/social-business/5-tips-more-effective-community-moderation)
- [8 Helpful Moderation Tips for Community Managers](https://sproutsocial.com/insights/tips-community-managers/)
- [Setting Up Community Guidelines for Moderation](https://www.getopensocial.com/blog/community-management/setting-community-guidelines-moderation)


@ -0,0 +1,45 @@
# Zoom Guidelines
Zoom is the main video communication platform for Kubernetes.
It is used for running the [community meeting](https://github.com/kubernetes/community/blob/master/events/community-meeting.md) and SIG meetings.
Since the Zoom meetings are open to the general public, a Zoom host has to moderate a meeting if a person is in violation of the code of conduct.
These guidelines are meant as a tool to help Kubernetes members manage their Zoom resources.
Check the main [moderation](./moderation.md) page for more information on other tools and general moderation guidelines.
## Code of Conduct
Kubernetes adheres to the Cloud Native Computing Foundation's [Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md) throughout the project; this includes all communication mediums.
## Moderation
Zoom has documentation on how to use their moderation tools:
- https://support.zoom.us/hc/en-us/articles/201362603-Host-Controls-in-a-Meeting
Check the "Screen Share Controls" (via the ^ next to Share Screen): Select who can share in your meeting and if you want only the host or any participant to be able to start a new share when someone is sharing.
You can also put an attendee on hold. This allows the host(s) to put attendee on hold to temporarily remove an attendee from the meeting.
Unfortunately, Zoom doesn't have the ability to ban or block people from joining - especially if they have the invitation to that channel and the meeting id is publicly known.
It is required that a host be comfortable with how to use these moderation tools. It is strongly encouraged that at least two people in a given SIG are comfortable with the moderation tools.
## Meeting Archive Videos
If a violation has been addressed by a host and it has been recorded by Zoom, the video should be edited before being posted on the [Kubernetes channel](https://www.youtube.com/c/kubernetescommunity).
Contact [SIG Contributor Experience](https://github.com/kubernetes/community/tree/master/sig-contributor-experience) if you need help to edit a video before posting it to the public.
## Admins
- @parispittman
- @castrojo
Each SIG should have at least one person with a paid Zoom account.
See the [SIG Creation procedure](https://github.com/kubernetes/community/blob/master/sig-governance.md#sig-creation-procedure) document on how to set up an initial account.
The Zoom licenses are managed by the [CNCF Service Desk](https://github.com/cncf/servicedesk).
## Escalating and/or Reporting a Problem
Issues that cannot be handled via normal moderation can be escalated to the [Kubernetes steering committee](https://github.com/kubernetes/steering).


@ -222,7 +222,7 @@ The following apply to the subproject for which one would be an owner.
**Status:** Removed
The Maintainer role has been removed and replaced with a greater focus on [owner](#owner)s.
The Maintainer role has been removed and replaced with a greater focus on [OWNERS].
[code reviews]: contributors/devel/collab.md
[community expectations]: contributors/guide/community-expectations.md


@ -31,7 +31,7 @@ aggregated servers.
* Developers should be able to write their own API server and cluster admins
should be able to add them to their cluster, exposing new APIs at runtime. All
of this should not require any change to the core kubernetes API server.
* These new APIs should be seamless extension of the core kubernetes APIs (ex:
* These new APIs should be seamless extensions of the core kubernetes APIs (ex:
they should be operated upon via kubectl).
## Non Goals


@ -390,7 +390,7 @@ the following command.
### Rollback
For future work, `kubeclt rollout undo` can be implemented in the general case
For future work, `kubectl rollout undo` can be implemented in the general case
as an extension of the [above](#viewing-history ).
```bash


@ -42,7 +42,7 @@ Here are some potential requirements that haven't been covered by this proposal:
- Uptime is critical for each pod of a DaemonSet during an upgrade (e.g. the time
from a DaemonSet pods being killed to recreated and healthy should be < 5s)
- Each DaemonSet pod can still fit on the node after being updated
- Some DaemonSets require the node to be drained before the DeamonSet's pod on it
- Some DaemonSets require the node to be drained before the DaemonSet's pod on it
is updated (e.g. logging daemons)
- DaemonSet's pods are implicitly given higher priority than non-daemons
- DaemonSets can only be operated by admins (i.e. people who manage nodes)


@ -747,7 +747,7 @@ kubectl rollout undo statefulset web
### Rolling Forward
Rolling back is usually the safest, and often the fastest, strategy to mitigate
deployment failure, but rolling forward is sometimes the only practical solution
for stateful applications (e.g. A users has a minor configuration error but has
for stateful applications (e.g. A user has a minor configuration error but has
already modified the storage format for the application). Users can use
sequential `kubectl apply`'s to update the StatefulSet's current
[target state](#target-state). The StatefulSet's `.Spec.GenerationPartition`


@ -0,0 +1,93 @@
# ProcMount/ProcMountType Option
## Background
Currently, Docker and most other container runtimes work by masking certain
paths in `/proc` and setting them read-only. This prevents data that should
not be exposed from leaking into a container. However, there are
certain use cases where it is necessary to turn this off.
## Motivation
For end-users who would like to run unprivileged containers using user namespaces
_nested inside_ CRI containers, we need a `ProcMount` option. That is,
we need an option to explicitly turn off the masking and
read-only setting of these paths so that we can
mount `/proc` in the nested container as an unprivileged user.
Please see the following filed issues for more information:
- [opencontainers/runc#1658](https://github.com/opencontainers/runc/issues/1658#issuecomment-373122073)
- [moby/moby#36597](https://github.com/moby/moby/issues/36597)
- [moby/moby#36644](https://github.com/moby/moby/pull/36644)
Please also see the [use case for building images securely in kubernetes](https://github.com/jessfraz/blog/blob/master/content/post/building-container-images-securely-on-kubernetes.md).
The option to unmask the paths in `/proc` really only makes sense when a user
is nesting
unprivileged containers with user namespaces, as unmasking exposes more information
than is necessary to the program running in the container spawned by
Kubernetes.
The main use case for this option is to run
[genuinetools/img](https://github.com/genuinetools/img) inside a Kubernetes
container. That program then launches sub-containers that take advantage of
user namespaces, re-mask `/proc`, and set it read-only. Therefore
there is no concern with having an unmasked `/proc` open in the top-level container.
It should be noted that this is different from the host `/proc`. It is still
a newly mounted `/proc`; the container runtime simply will not mask the paths.
Since the only use case for this option is to run unprivileged nested
containers,
this option should only be allowed or used if the user in the container is not `root`.
This can be easily enforced with `MustRunAs`.
Since the user inside is still unprivileged,
doing things to `/proc` would be off limits regardless, since Linux user
support already prevents this.
## Existing SecurityContext objects
Kubernetes defines `SecurityContext` for `Container` and `PodSecurityContext`
for `PodSpec`. `SecurityContext` objects define the related security options
for Kubernetes containers, e.g. selinux options.
To support "ProcMount" options in Kubernetes, it is proposed to make
the following changes:
## Changes of SecurityContext objects
A new field named `procMount`, of a new `string` type `ProcMountType`, will be
added to the `SecurityContext` definition to hold the viable options for the
proc mount.
By default, `procMount` is `default`, i.e. the same behavior as today, and the
paths are masked.
This will look like the following in the spec:
```go
type ProcMountType string
const (
// DefaultProcMount uses the container runtime default ProcType. Most
// container runtimes mask certain paths in /proc to avoid accidental security
// exposure of special devices or information.
DefaultProcMount ProcMountType = "Default"
// UnmaskedProcMount bypasses the default masking behavior of the container
// runtime and ensures the newly created /proc for the container stays intact
// with no modifications.
UnmaskedProcMount ProcMountType = "Unmasked"
)
procMount *ProcMountType
```
This requires changes to the CRI runtime integrations so that
kubelet will add the specific `unmasked` or `whatever_it_is_named` option.
## Pod Security Policy changes
A new `[]ProcMountType{}` field named `allowedProcMounts` will be added to the Pod
Security Policy as well to gate the allowed ProcMountTypes a user is allowed to
set. This field will default to `[]ProcMountType{ DefaultProcMount }`.
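As a rough sketch of how such a gate might be expressed (an illustration only, not the actual Pod Security Policy validation code; the helper name is invented here), the check boils down to an allow-list membership test:
```go
package psp

// ProcMountType and DefaultProcMount mirror the definitions proposed above.
type ProcMountType string

const DefaultProcMount ProcMountType = "Default"

// procMountAllowed reports whether the proc mount type requested in a
// container's security context is permitted by the policy's allowedProcMounts
// list. A nil request is treated as DefaultProcMount.
func procMountAllowed(requested *ProcMountType, allowed []ProcMountType) bool {
	effective := DefaultProcMount
	if requested != nil {
		effective = *requested
	}
	for _, a := range allowed {
		if a == effective {
			return true
		}
	}
	return false
}
```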


@ -0,0 +1,89 @@
# Support traffic shaping for CNI network plugin
Version: Alpha
Authors: @m1093782566
## Motivation and background
Currently the kubenet code supports applying basic traffic shaping during pod setup. This will happen if bandwidth-related annotations have been added to the pod's metadata, for example:
```json
{
"kind": "Pod",
"metadata": {
"name": "iperf-slow",
"annotations": {
"kubernetes.io/ingress-bandwidth": "10M",
"kubernetes.io/egress-bandwidth": "10M"
}
}
}
```
Our current implementation uses `linux tc` to add a download (ingress) and an upload (egress) rate limiter using 1 root `qdisc`, 2 `class`es (one for ingress and one for egress) and 2 `filter`s (one for ingress and one for egress, attached to the ingress and egress classes respectively).
Kubelet CNI code doesn't support it yet, though CNI has already added a [traffic shaping plugin](https://github.com/containernetworking/plugins/tree/master/plugins/meta/bandwidth). We can replicate the behavior we have today in kubenet for the kubelet CNI network plugin if we feel this is an important feature.
## Goal
Support traffic shaping for CNI network plugin in Kubernetes.
## Non-goal
CNI plugins to implement this sort of traffic shaping guarantee.
## Proposal
If kubelet starts up with `network-plugin = cni` and the user has enabled traffic shaping via the network plugin configuration, it would then populate the `runtimeConfig` section of the config when calling the `bandwidth` plugin.
Traffic shaping in Kubelet CNI network plugin can work with ptp and bridge network plugins.
### Pod Setup
When we create a pod with bandwidth configuration in its metadata, for example,
```json
{
"kind": "Pod",
"metadata": {
"name": "iperf-slow",
"annotations": {
"kubernetes.io/ingress-bandwidth": "10M",
"kubernetes.io/egress-bandwidth": "10M"
}
}
}
```
Kubelet would first parse the ingress and egress bandwidth values and transform them to Kbps, because both `ingressRate` and `egressRate` in the CNI bandwidth plugin are in Kbps. A user would add something like this to their CNI config list if they want to enable traffic shaping via the plugin:
```json
{
"type": "bandwidth",
"capabilities": {"trafficShaping": true}
}
```
Kubelet would then populate the `runtimeConfig` section of the config when calling the `bandwidth` plugin:
```json
{
"type": "bandwidth",
"runtimeConfig": {
"trafficShaping": {
"ingressRate": "X",
"egressRate": "Y"
}
}
}
```
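To make the conversion concrete, here is a minimal Go sketch (illustrative only; the `parseBandwidth` helper and its suffix handling are assumptions for this example, and a real implementation would likely reuse the Kubernetes resource quantity parser) that turns an annotation value such as `10M` into Kbps:
```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBandwidth converts an annotation value such as "10M" (bits per second)
// into Kbps, the unit the bandwidth plugin expects for ingressRate/egressRate.
// Only the decimal K/M/G suffixes are handled in this sketch.
func parseBandwidth(value string) (int64, error) {
	multipliers := map[string]int64{"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}
	for suffix, mult := range multipliers {
		if strings.HasSuffix(value, suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(value, suffix), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * mult / 1_000, nil // bits per second -> Kbps
		}
	}
	n, err := strconv.ParseInt(value, 10, 64)
	if err != nil {
		return 0, err
	}
	return n / 1_000, nil
}

func main() {
	kbps, _ := parseBandwidth("10M") // from kubernetes.io/ingress-bandwidth
	fmt.Println(kbps)                // 10000
}
```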
### Pod Teardown
When we delete a pod, kubelet will build the runtime config for calling the CNI plugin `DelNetwork/DelNetworkList` API, which will remove this pod's bandwidth configuration.
## Next step
* Support ingress and egress burst bandwidth in Pod.
* Graduate annotations to Pod Spec.


@ -85,7 +85,7 @@ The implementation will mainly be in two parts:
In both parts, we need to implement:
* Fork code for Windows from Linux.
* Convert from Resources.Requests and Resources.Limits to Windows configuration in CRI, and convert from Windows configration in CRI to container configuration.
* Convert from Resources.Requests and Resources.Limits to Windows configuration in CRI, and convert from Windows configuration in CRI to container configuration.
To implement resource controls for Windows containers, refer to [this MSDN documentation](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/resource-controls) and [Docker's conversion to OCI spec](https://github.com/moby/moby/blob/master/daemon/oci_windows.go).


@ -142,11 +142,22 @@ extend this by maintaining a metadata file in the pod directory.
**Log format**
The runtime should decorate each log entry with a RFC 3339Nano timestamp
prefix, the stream type (i.e., "stdout" or "stderr"), and ends with a newline.
prefix, the stream type (i.e., "stdout" or "stderr"), the tags of the log
entry, and the log content, which ends with a newline.
The `tags` field can support multiple tags, delimited by `:`. Currently, only
one tag is defined in CRI to support multi-line log entries: partial or full.
Partial (`P`) is used when a log entry is split into multiple lines by the
runtime, and the entry has not ended yet. Full (`F`) indicates that the log
entry is completed -- it is either a single-line entry, or this is the last
line of the multiple-line entry.
For example,
```
2016-10-06T00:17:09.669794202Z stdout The content of the log entry 1
2016-10-06T00:17:10.113242941Z stderr The content of the log entry 2
2016-10-06T00:17:09.669794202Z stdout F The content of the log entry 1
2016-10-06T00:17:09.669794202Z stdout P First line of log entry 2
2016-10-06T00:17:09.669794202Z stdout P Second line of the log entry 2
2016-10-06T00:17:10.113242941Z stderr F Last line of the log entry 2
```
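For illustration, a minimal Go sketch of parsing a single log line in this format is shown below; the `criLogEntry` type and `parseCRILogLine` helper are names invented for this example, not part of the CRI itself.
```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// criLogEntry is a parsed log line in the format described above:
// <RFC3339Nano timestamp> <stream> <tags> <content>
type criLogEntry struct {
	Timestamp time.Time
	Stream    string // "stdout" or "stderr"
	Partial   bool   // true when the first tag is "P"
	Content   string
}

func parseCRILogLine(line string) (criLogEntry, error) {
	parts := strings.SplitN(line, " ", 4)
	if len(parts) != 4 {
		return criLogEntry{}, fmt.Errorf("malformed log line: %q", line)
	}
	ts, err := time.Parse(time.RFC3339Nano, parts[0])
	if err != nil {
		return criLogEntry{}, err
	}
	tags := strings.Split(parts[2], ":")
	return criLogEntry{
		Timestamp: ts,
		Stream:    parts[1],
		Partial:   tags[0] == "P",
		Content:   parts[3],
	}, nil
}

func main() {
	e, _ := parseCRILogLine("2016-10-06T00:17:09.669794202Z stdout F The content of the log entry 1")
	fmt.Println(e.Stream, e.Partial, e.Content)
}
```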
With this knowledge, the kubelet can parse the logs and serve them for `kubectl


@ -0,0 +1,209 @@
# Support Node-Level User Namespaces Remapping
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [User Stories](#user-stories)
- [Proposal](#proposal)
- [Future Work](#future-work)
- [Risks and Mitigations](#risks-and-mitigations)
- [Graduation Criteria](#graduation-criteria)
- [Alternatives](#alternatives)
_Authors:_
* Mrunal Patel &lt;mpatel@redhat.com&gt;
* Jan Pazdziora &lt;jpazdziora@redhat.com&gt;
* Vikas Choudhary &lt;vichoudh@redhat.com&gt;
## Summary
Container security consists of many different kernel features that work together to make containers secure. User namespaces is one such feature that enables interesting possibilities for containers by allowing them to be root inside the container while not being root on the host. This gives more capabilities to the containers while protecting the host from the container being root and adds one more layer to container security.
In this proposal we discuss:
- use-cases/user-stories that benefit from this enhancement
- implementation design and scope for alpha release
- long-term roadmap to fully support this feature beyond alpha
## Motivation
From user_namespaces(7):
> User namespaces isolate security-related identifiers and attributes, in particular, user IDs and group IDs, the root directory, keys, and capabilities. A process's user and group IDs can be different inside and outside a user namespace. In particular, a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace.
In order to run Pods with software which expects to run as root or with elevated privileges, while still containing the processes and protecting both the Nodes and other Pods, the Linux kernel mechanism of user namespaces can be used to make the processes in the Pods view their environment as having those privileges, while on the host (Node) level these processes appear without privileges or with privileges only affecting processes in the same Pods.
The purpose of using user namespaces in Kubernetes is to let the processes in Pods think they run as one uid set when in fact they run as different “real” uids on the Nodes.
In this text, most everything said about uids can also be applied to gids.
## Goals
Enable user namespace support in a Kubernetes cluster so that workloads that work today also work with user namespaces enabled at runtime. Furthermore, make workloads that require a root/privileged user inside the container safer for the node by using the additional security of user namespaces. Containers will run in a user namespace different from the user-namespace of the underlying host.
## Non-Goals
- Supporting pod/container-level user namespace isolation. There can be images using different users, but on the node, pods/containers running with these images will share a common user namespace remapping configuration. In other words, all containers on a node share a common user-namespace range.
- Remote volume support, e.g. NFS
## User Stories
- As a cluster admin, I want to protect the node from the rogue container process(es) running inside pod containers with root privileges. If such a process is able to break out into the node, it could be a security issue.
- As a cluster admin, I want to support all the images irrespective of what user/group that image is using.
- As a cluster admin, I want to allow some pods to disable user namespaces if they require elevated privileges.
## Proposal
The proposal is to support user-namespaces for pod containers. This can be done at two levels:
- Node-level : This proposal explains this part in detail.
- Namespace-Level/Pod-level: The plan is to target this in the future due to missing support in low-level system components such as runtimes and the kernel. More on this in the `Future Work` section.
Node-level user-namespace support means that, if the feature is enabled, all pods on a node will share a common user-namespace and a common UID (and GID) range, which is a subset of the node's total UIDs (and GIDs). This common user-namespace is the runtime's default user-namespace range, which is remapped to the containers' UIDs (and GIDs), starting with the first UID as the containers' root.
In the general Linux convention, a UID (or GID) mapping consists of three parts:
1. Host (U/G)ID: The first (U/G)ID of the range on the host that is being remapped to the (U/G)IDs in the container user-namespace
2. Container (U/G)ID: The first (U/G)ID of the range in the container namespace, which is mapped to the first (U/G)ID on the host (mentioned in the previous point).
3. Count/Size: The total number of consecutive mappings between the host and container user-namespaces, starting from (and including) the first ones mentioned above.
As an example, `host_id 1000, container_id 0, size 10`
In this case, UIDs 1000 to 1009 on the host will be mapped to UIDs 0 to 9 inside the container.
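A small Go sketch of this translation (illustrative only; the type and function names are invented for the example):
```go
package main

import "fmt"

// idMapping mirrors the host/container/size triple described above.
type idMapping struct {
	HostID      uint32
	ContainerID uint32
	Size        uint32
}

// hostUID translates a container UID into the corresponding host UID,
// returning false when the UID falls outside the mapped range.
func hostUID(m idMapping, containerUID uint32) (uint32, bool) {
	if containerUID < m.ContainerID || containerUID >= m.ContainerID+m.Size {
		return 0, false
	}
	return m.HostID + (containerUID - m.ContainerID), true
}

func main() {
	m := idMapping{HostID: 1000, ContainerID: 0, Size: 10}
	uid, ok := hostUID(m, 0) // container root
	fmt.Println(uid, ok)     // 1000 true
}
```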
User-namespace support should be enabled only when the container runtime on the node supports user-namespace remapping and it is enabled in the runtime's configuration. To enable user-namespaces, a feature-gate flag will need to be passed to Kubelet like this: `--feature-gates="NodeUserNamespace=true"`
A new CRI API, `GetRuntimeConfigInfo` will be added. Kubelet will use this API:
- To verify that user-namespace remapping is enabled in the runtime. If it is found disabled, kubelet will fail to start
- To determine the runtime's default user-namespace range, the starting UID of which is mapped to UID '0' of the container.
### Volume Permissions
Kubelet will change the file ownership (i.e., chown) under `/var/lib/kubelet/pods` prior to any container start so that file permissions are updated according to the remapped UID and GID.
This proposal will work only for local volumes and not with remote volumes such as NFS.
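As a rough illustration of the chown step (a sketch under the assumption that everything under the pods directory is simply re-owned to the remapped root UID/GID; the real implementation may be more selective):
```go
package main

import (
	"os"
	"path/filepath"
)

// chownPodDirs walks the kubelet pods directory and re-owns every file to the
// remapped root UID/GID learned from the runtime, so containers can access
// their volumes after remapping. Error handling is kept minimal for brevity.
func chownPodDirs(podsDir string, hostRootUID, hostRootGID int) error {
	return filepath.Walk(podsDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		return os.Chown(path, hostRootUID, hostRootGID)
	})
}

func main() {
	// remapped root uid/gid learned from the runtime, e.g. 2131616
	_ = chownPodDirs("/var/lib/kubelet/pods", 2131616, 2131616)
}
```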
### How to disable `NodeUserNamespace` for a specific pod
This can be done in two ways:
- **Alpha:** Implicitly using host namespace for the pod containers
This support is already present (currently it seems broken; it will be fixed) in Kubernetes as experimental functionality, which can be enabled using `--feature-gates="ExperimentalHostUserNamespaceDefaulting=true"`.
If Pod-Security-Policy is configured to allow the following to be requested by a pod, host user-namespace will be enabled for the container:
- host namespaces (pid, ipc, net)
- non-namespaced capabilities (mknod, sys_time, sys_module)
- the pod contains a privileged container or uses host path volumes.
- https://github.com/kubernetes/kubernetes/commit/d0d78f478ce0fb9d5e121db3b7c6993b482af82c#diff-a53fa76e941e0bdaee26dcbc435ad2ffR437 introduced via https://github.com/kubernetes/kubernetes/commit/d0d78f478ce0fb9d5e121db3b7c6993b482af82c.
- **Beta:** Explicit API to request host user-namespace in pod spec
This is being targeted under Beta graduation plans.
### CRI API Changes
Proposed CRI API changes:
```golang
// Runtime service defines the public APIs for remote container runtimes
service RuntimeService {
// Version returns the runtime name, runtime version, and runtime API version.
rpc Version(VersionRequest) returns (VersionResponse) {}
…….
…….
// GetRuntimeConfigInfo returns the configuration details of the runtime.
rpc GetRuntimeConfigInfo(GetRuntimeConfigInfoRequest) returns (GetRuntimeConfigInfoResponse) {}
}
// LinuxIDMapping represents a single user namespace mapping in Linux.
message LinuxIDMapping {
// container_id is the starting id for the mapping inside the container.
uint32 container_id = 1;
// host_id is the starting id for the mapping on the host.
uint32 host_id = 2;
// size is the length of the mapping.
uint32 size = 3;
}
message LinuxUserNamespaceConfig {
// is_enabled, if true, indicates that user-namespaces are supported and enabled in the container runtime
bool is_enabled = 1;
// uid_mappings is an array of user id mappings.
repeated LinuxIDMapping uid_mappings = 2;
// gid_mappings is an array of group id mappings.
repeated LinuxIDMapping gid_mappings = 3;
}
message GetRuntimeConfig {
LinuxUserNamespaceConfig user_namespace_config = 1;
}
message GetRuntimeConfigInfoRequest {}
message GetRuntimeConfigInfoResponse {
GetRuntimeConfig runtime_config = 1;
}
...
// NamespaceOption provides options for Linux namespaces.
message NamespaceOption {
// Network namespace for this container/sandbox.
// Note: There is currently no way to set CONTAINER scoped network in the Kubernetes API.
// Namespaces currently set by the kubelet: POD, NODE
NamespaceMode network = 1;
// PID namespace for this container/sandbox.
// Note: The CRI default is POD, but the v1.PodSpec default is CONTAINER.
// The kubelet's runtime manager will set this to CONTAINER explicitly for v1 pods.
// Namespaces currently set by the kubelet: POD, CONTAINER, NODE
NamespaceMode pid = 2;
// IPC namespace for this container/sandbox.
// Note: There is currently no way to set CONTAINER scoped IPC in the Kubernetes API.
// Namespaces currently set by the kubelet: POD, NODE
NamespaceMode ipc = 3;
// User namespace for this container/sandbox.
// Note: There is currently no way to set CONTAINER scoped user namespace in the Kubernetes API.
// The container runtime should ignore this if user namespace is NOT enabled.
// POD is the default value. Kubelet will set it to NODE when trying to use host user-namespace
// Namespaces currently set by the kubelet: POD, NODE
NamespaceMode user = 4;
}
```
### Runtime Support
- Docker: Here is the [user-namespace documentation](https://docs.docker.com/engine/security/userns-remap/) and this is the [implementation PR](https://github.com/moby/moby/pull/12648)
- Concerns:
Docker API does not provide the user-namespace mapping. Therefore, to handle the `GetRuntimeConfigInfo` API, changes will be done in `dockershim` to read the system files `/etc/subuid` and `/etc/subgid` to figure out the default user-namespace mapping (a parsing sketch follows the list below). The `/info` API will be used to figure out if user-namespace is enabled, and `Docker Root Dir` will be used to figure out the host uid mapped to uid `0` in the container, e.g. `Docker Root Dir: /var/lib/docker/2131616.2131616` shows that host uid `2131616` will be mapped to uid `0`.
- CRI-O: https://github.com/kubernetes-incubator/cri-o/pull/1519
- Containerd: https://github.com/containerd/containerd/blob/129167132c5e0dbd1b031badae201a432d1bd681/container_opts_unix.go#L149
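As referenced in the dockershim concern above, here is a minimal Go sketch of reading `/etc/subuid`-style files; the `subIDRange` type and `parseSubIDFile` helper are invented for this example and are not the actual dockershim code.
```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// subIDRange is one entry from /etc/subuid or /etc/subgid,
// in the "name:start:count" format.
type subIDRange struct {
	Name  string
	Start uint32
	Count uint32
}

// parseSubIDFile reads entries such as "dockremap:2131616:65536".
func parseSubIDFile(path string) ([]subIDRange, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var ranges []subIDRange
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		parts := strings.Split(line, ":")
		if len(parts) != 3 {
			continue
		}
		start, err1 := strconv.ParseUint(parts[1], 10, 32)
		count, err2 := strconv.ParseUint(parts[2], 10, 32)
		if err1 != nil || err2 != nil {
			continue
		}
		ranges = append(ranges, subIDRange{parts[0], uint32(start), uint32(count)})
	}
	return ranges, scanner.Err()
}

func main() {
	ranges, err := parseSubIDFile("/etc/subuid")
	fmt.Println(ranges, err)
}
```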
### Implementation Roadmap
#### Phase 1: Support in Kubelet, Alpha, [Target: Kubernetes v1.11]
- Add feature gate `NodeUserNamespace`, disabled by default
- Add new CRI API, `GetRuntimeConfigInfo()`
- Add logic in Kubelet to handle pod creation which includes parsing GetRuntimeConfigInfo response and changing file-permissions in /var/lib/kubelet with learned userns mapping.
- Add changes in dockershim to implement GetRuntimeConfigInfo() for docker runtime
- Add changes in CRI-O to implement userns support and GetRuntimeConfigInfo() support
- Unit test cases
- e2e tests
#### Phase 2: Beta Support [Target: Kubernetes v1.12]
- PSP integration
- To grow ExperimentalHostUserNamespaceDefaulting from experimental feature gate to a Kubelet flag
- API changes to allow a pod to request HostUserNamespace in the pod spec
- e2e tests
### References
- Default host user namespace via experimental flag
- https://github.com/kubernetes/kubernetes/pull/31169
- Enable userns support for containers launched by kubelet
- https://github.com/kubernetes/features/issues/127
- Track Linux User Namespaces in the Pod Security Policy
- https://github.com/kubernetes/kubernetes/issues/59152
- Add support for experimental-userns-remap-root-uid and experimental-userns-remap-root-gid options to match the remapping used by the container runtime.
- https://github.com/kubernetes/kubernetes/pull/55707
- rkt User Namespaces Background
- https://coreos.com/rkt/docs/latest/devel/user-namespaces.html
## Future Work
### Namespace-Level/Pod-Level user-namespace support
There is no runtime today which supports creating containers with a specified user namespace configuration. For example, here is the discussion related to this support in Docker: https://github.com/moby/moby/issues/28593
Once the user-namespace feature in the runtimes has evolved to support a container's request for a specific user-namespace mapping (UID and GID range), we can extend the current Node-level user-namespace support in Kubernetes to Namespace-level isolation (or, if desired, even pod-level isolation) by dividing and allocating the mapping learned from the runtime among Kubernetes namespaces (or pods). From the end-user's perspective, we don't expect any change in the UI related to user namespace support.
### Remote Volumes
Remote volume support should be investigated and targeted in the future once support exists at the lower infrastructure layers.
## Risks and Mitigations
The main risk with this change stems from the fact that processes in Pods will run with different “real” uids than they used to, while expecting the original uids to make operations on the Nodes or consistently access shared persistent storage.
- This can be mitigated by turning the feature on gradually, per-Pod or per Kubernetes namespace.
- For the Kubernetes' cluster Pods (that provide the Kubernetes functionality), testing of their behaviour and ability to run in user namespaced setups is crucial.
## Graduation Criteria
- PSP integration
- API changes to allow a pod to request the host user namespace using, for example, `HostUserNamespace: True` in the pod spec
- e2e tests
## Alternatives
User Namespace mappings can be passed explicitly through kubelet flags, similar to https://github.com/kubernetes/kubernetes/pull/55707, but we do not prefer this option because it is highly prone to misconfiguration.


@ -28,7 +28,7 @@ implied. However, describing the process as "moving" the pod is approximately ac
and easier to understand, so we will use this terminology in the document.
We use the term "rescheduling" to describe any action the system takes to move an
already-running pod. The decision may be made and executed by any component; we wil
already-running pod. The decision may be made and executed by any component; we will
introduce the concept of a "rescheduler" component later, but it is not the only
component that can do rescheduling.


@ -19,8 +19,8 @@ In addition to this, with taint-based-eviction, the Node Controller already tain
| ------------------ | ------------------ | ------------ | -------- |
|Ready |True | - | |
| |False | NoExecute | node.kubernetes.io/not-ready |
| |Unknown | NoExecute | node.kubernetes.io/unreachable |
|OutOfDisk |True | NoSchedule | node.kubernetes.io/out-of-disk |
| |Unknown | NoExecute | node.kubernetes.io/unreachable |
|OutOfDisk |True | NoSchedule | node.kubernetes.io/out-of-disk |
| |False | - | |
| |Unknown | - | |
|MemoryPressure |True | NoSchedule | node.kubernetes.io/memory-pressure |
@ -32,6 +32,9 @@ In addition to this, with taint-based-eviction, the Node Controller already tain
|NetworkUnavailable |True | NoSchedule | node.kubernetes.io/network-unavailable |
| |False | - | |
| |Unknown | - | |
|PIDPressure |True | NoSchedule | node.kubernetes.io/pid-pressure |
| |False | - | |
| |Unknown | - | |
For example, if a CNI network is not detected on the node (e.g. a network is unavailable), the Node Controller will taint the node with `node.kubernetes.io/network-unavailable=:NoSchedule`. This will then allow users to add a toleration to their `PodSpec`, ensuring that the pod can be scheduled to this node if necessary. If the kubelet did not update the node's status after a grace period, the Node Controller will only taint the node with `node.kubernetes.io/unreachable`; it will not taint the node with any unknown condition.
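For illustration, such a toleration could be constructed with the `k8s.io/api/core/v1` types roughly as follows (a sketch, not taken from the proposal):
```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// A toleration that lets a pod schedule onto a node tainted with
	// node.kubernetes.io/network-unavailable=:NoSchedule.
	tol := v1.Toleration{
		Key:      "node.kubernetes.io/network-unavailable",
		Operator: v1.TolerationOpExists,
		Effect:   v1.TaintEffectNoSchedule,
	}
	fmt.Printf("%+v\n", tol)
	// The toleration would be appended to pod.Spec.Tolerations.
}
```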


@ -314,7 +314,7 @@ The attach/detach controller,running as part of the kube-controller-manager bina
When the controller decides to attach a CSI volume, it will call the in-tree CSI volume plugins attach method. The in-tree CSI volume plugins attach method will do the following:
1. Create a new `VolumeAttachment` object (defined in the “Communication Channels” section) to attach the volume.
* The name of the of the `VolumeAttachment` object will be `pv-<SHA256(PVName+NodeName)>`.
* The name of the `VolumeAttachment` object will be `pv-<SHA256(PVName+NodeName)>`.
* `pv-` prefix is used to allow using other scheme(s) for inline volumes in the future, with their own prefix.
* SHA256 hash is to reduce length of `PVName` plus `NodeName` string, each of which could be max allowed name length (hexadecimal representation of SHA256 is 64 characters).
* `PVName` is `PV.name` of the attached PersistentVolume.
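A minimal Go sketch of that naming scheme (illustrative only):
```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// volumeAttachmentName builds the pv-<SHA256(PVName+NodeName)> name described
// above; hashing keeps the object name within length limits.
func volumeAttachmentName(pvName, nodeName string) string {
	sum := sha256.Sum256([]byte(pvName + nodeName))
	return fmt.Sprintf("pv-%x", sum)
}

func main() {
	fmt.Println(volumeAttachmentName("my-pv", "node-1"))
}
```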


@ -198,7 +198,7 @@ we have considered following options:
Cons:
* I don't know if there is a pattern that exists in kube today for shipping shell scripts that are called out from code in Kubernetes. Flex is
different because, none of the flex scripts are shipped with Kuberntes.
different because, none of the flex scripts are shipped with Kubernetes.
3. Ship resizing tools in a container.


@ -55,7 +55,7 @@ the RBD image.
### Pros
- Simple to implement
- Does not cause regression in RBD image names, which remains same as earlier.
- The metada information is not immediately visible to RBD admins
- The metadata information is not immediately visible to RBD admins
### Cons
- NA


@ -0,0 +1,148 @@
# Service Account Token Volumes
Authors:
@smarterclayton
@liggitt
@mikedanese
## Summary
Kubernetes is able to provide pods with unique identity tokens that can prove
the caller is a particular pod to a Kubernetes API server. These tokens are
injected into pods as secrets. This proposal describes a new mechanism of
distribution with support for [improved service account tokens][better-tokens]
and explores how to migrate from the existing mechanism in a backwards-compatible way.
## Motivation
Many workloads running on Kubernetes need to prove to external parties who they
are in order to participate in a larger application environment. This identity
must be attested to by the orchestration system in a way that allows a third
party to trust that an arbitrary container on the cluster is who it says it is.
In addition, infrastructure running on top of Kubernetes needs a simple
mechanism to communicate with the Kubernetes APIs and to provide more complex
tooling. Finally, a significant set of security challenges are associated with
storing service account tokens as secrets in Kubernetes and limiting the methods
whereby malicious parties can get access to these tokens will reduce the risk of
platform compromise.
As a platform, Kubernetes should evolve to allow identity management systems to
provide more powerful workload identity without breaking existing use cases, and
provide a simple out of the box workload identity that is sufficient to cover
the requirements of bootstrapping low-level infrastructure running on
Kubernetes. We expect other systems to cover the more advanced scenarios,
and see this effort as the necessary glue to allow more powerful systems to succeed.
With this feature, we hope to provide a backwards compatible replacement for
service account tokens that strengthens the security and improves the
scalability of the platform.
## Proposal
Kubernetes should implement a ServiceAccountToken volume projection that
maintains a service account token requested by the node from the TokenRequest
API.
### Token Volume Projection
A new volume projection will be implemented with an API that closely matches the
TokenRequest API.
```go
type ProjectedVolumeSource struct {
Sources []VolumeProjection
DefaultMode *int32
}
type VolumeProjection struct {
Secret *SecretProjection
DownwardAPI *DownwardAPIProjection
ConfigMap *ConfigMapProjection
ServiceAccountToken *ServiceAccountTokenProjection
}
// ServiceAccountTokenProjection represents a projected service account token
// volume. This projection can be used to insert a service account token into
// the pods runtime filesystem for use against APIs (Kubernetes API Server or
// otherwise).
type ServiceAccountTokenProjection struct {
// Audience is the intended audience of the token. A recipient of a token
// must identify itself with an identifier specified in the audience of the
// token, and otherwise should reject the token. The audience defaults to the
// identifier of the apiserver.
Audience string
// ExpirationSeconds is the requested duration of validity of the service
// account token. As the token approaches expiration, the kubelet volume
// plugin will proactively rotate the service account token. The kubelet will
// start trying to rotate the token if the token is older than 80 percent of
// its time to live or if the token is older than 24 hours. Defaults to 1 hour
// and must be at least 10 minutes.
ExpirationSeconds int64
// Path is the relative path of the file to project the token into.
Path string
}
```
A volume plugin implemented in the kubelet will project a service account token
sourced from the TokenRequest API into volumes created from
ProjectedVolumeSources. As the token approaches expiration, the kubelet volume
plugin will proactively rotate the service account token. The kubelet will start
trying to rotate the token if the token is older than 80 percent of its time to
live or if the token is older than 24 hours.
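Expressed as a small predicate (an illustrative sketch, not the kubelet's actual code):
```go
package main

import (
	"fmt"
	"time"
)

// shouldRotate mirrors the rotation rule described above: refresh the token
// once it is past 80 percent of its TTL or older than 24 hours.
func shouldRotate(issuedAt time.Time, ttl time.Duration, now time.Time) bool {
	age := now.Sub(issuedAt)
	return age > time.Duration(float64(ttl)*0.8) || age > 24*time.Hour
}

func main() {
	issued := time.Now().Add(-50 * time.Minute)
	fmt.Println(shouldRotate(issued, time.Hour, time.Now())) // true: past 80% of 1h
}
```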
To replace the current service account token secrets, we also need to inject the
cluster's CA certificate bundle. Initially we will deploy the data in a configmap
per-namespace and reference it using a ConfigMapProjection.
A projected volume source that is equivalent to the current service account
secret:
```yaml
sources:
- serviceAccountToken:
expirationSeconds: 3153600000 # 100 years
path: token
- configMap:
name: kube-cacrt
items:
- key: ca.crt
path: ca.crt
- downwardAPI:
items:
- path: namespace
fieldRef: metadata.namespace
```
This fixes one scalability issue with the current service account token
deployment model where secret GETs are a large portion of overall apiserver
traffic.
A projected volume source that requests a token for vault and Istio CA:
```yaml
sources:
- serviceAccountToken:
path: vault-token
audience: vault
- serviceAccountToken:
path: istio-token
audience: ca.istio.io
```
### Alternatives
1. Instead of implementing a service account token volume projection, we could
implement all injection as a flex volume or CSI plugin.
1. Both flex volume and CSI are alpha and are unlikely to graduate soon.
1. Virtual kubelets (like Fargate or ACS) may not be able to run flex
volumes.
1. Service account tokens are a fundamental part of our API.
1. Remove service accounts and service account tokens completely from core, use
an alternate mechanism that sits outside the platform.
1. Other core features need service account integration, leading to all
users needing to install this extension.
1. Complicates installation for the majority of users.
[better-tokens]: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/bound-service-account-tokens.md


@ -365,10 +365,14 @@ being required otherwise.
### Edit defaults.go
If your change includes new fields for which you will need default values, you
need to add cases to `pkg/apis/<group>/<version>/defaults.go` (the core v1 API
is special, its defaults.go is at `pkg/api/v1/defaults.go`. For simplicity, we
will not mention this special case in the rest of the article). Of course, since
you have added code, you have to add a test:
need to add cases to `pkg/apis/<group>/<version>/defaults.go`.
*Note:* In the past the core v1 API
was special. Its `defaults.go` used to live at `pkg/api/v1/defaults.go`.
If you see code referencing that path, you can be sure it's outdated. Now the core v1 API lives at
`pkg/apis/core/v1/defaults.go`, which follows the above convention.
Of course, since you have added code, you have to add a test:
`pkg/apis/<group>/<version>/defaults_test.go`.
Do use pointers to scalars when you need to distinguish between an unset value
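As a minimal sketch of such a defaulting function (assuming a hypothetical `Widget` type with an optional `*int32` replicas field; this is not real Kubernetes API code):
```go
package v1

// Widget is a hypothetical API type used only for this example; its optional
// Replicas field is a pointer so that "unset" can be distinguished from 0.
type Widget struct {
	Replicas *int32
}

// SetDefaults_Widget fills in defaults for unset fields. Real defaulting
// functions live in pkg/apis/<group>/<version>/defaults.go and are wired up
// through the generated RegisterDefaults for that version.
func SetDefaults_Widget(obj *Widget) {
	if obj.Replicas == nil {
		one := int32(1)
		obj.Replicas = &one
	}
}
```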
@ -601,7 +605,6 @@ Due to the fast changing nature of the project, the following content is probabl
to generate protobuf IDL and marshallers.
* You must add the new version to
[cmd/kube-apiserver/app#apiVersionPriorities](https://github.com/kubernetes/kubernetes/blob/v1.8.0-alpha.2/cmd/kube-apiserver/app/aggregator.go#L172)
to let the aggregator list it. This list will be removed before release 1.8.
* You must setup storage for the new version in
[pkg/registry/group_name/rest](https://github.com/kubernetes/kubernetes/blob/v1.8.0-alpha.2/pkg/registry/authentication/rest/storage_authentication.go)


@ -1,3 +0,0 @@
This document has been moved to https://git.k8s.io/community/contributors/guide/coding-conventions.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -134,7 +134,9 @@ development environment, please [set one up](http://golang.org/doc/code.html).
| 1.5, 1.6 | 1.7 - 1.7.5 |
| 1.7 | 1.8.1 |
| 1.8 | 1.8.3 |
| 1.9+ | 1.9.1 |
| 1.9 | 1.9.1 |
| 1.10 | 1.9.1 |
| 1.11+ | 1.10.1 |
Ensure your GOPATH and PATH have been configured in accordance with the Go
environment instructions.


@ -1,4 +0,0 @@
The contents of this file have been moved to https://git.k8s.io/community/contributors/guide/pull-requests.md.
<!--
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
-->


@ -132,7 +132,7 @@ Note: Secrets are passed only to "mount/unmount" call-outs.
See [nginx-lvm.yaml] & [nginx-nfs.yaml] for a quick example on how to use Flexvolume in a pod.
[lvm]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/lvm
[nfs]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nfs
[nginx-lvm.yaml]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nginx-lvm.yaml
[nginx-nfs.yaml]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nginx-nfs.yaml
[lvm]: https://git.k8s.io/examples/staging/volumes/flexvolume/lvm
[nfs]: https://git.k8s.io/examples/staging/volumes/flexvolume/nfs
[nginx-lvm.yaml]: https://git.k8s.io/examples/staging/volumes/flexvolume/nginx-lvm.yaml
[nginx-nfs.yaml]: https://git.k8s.io/examples/staging/volumes/flexvolume/nginx-nfs.yaml


@ -1,3 +0,0 @@
This document's content has been rolled into https://git.k8s.io/community/contributors/guide/coding-conventions.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,4 +0,0 @@
This document has been moved to https://git.k8s.io/community/contributors/guide/owners.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,4 +0,0 @@
This file has been moved to https://git.k8s.io/community/contributors/guide/pull-requests.md.
<!--
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
-->


@ -1,8 +0,0 @@
reviewers:
- saad-ali
- pwittrock
- steveperry-53
- chenopis
- spiffxp
approvers:
- sig-release-leads


@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/README.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/issues.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/release-process-documentation/release-team-guides/patch-release-manager-playbook.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/patch_release.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.


@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/scalability-validation.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.

View File

@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/testing.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.

View File

@ -1,4 +0,0 @@
This document has been moved to https://git.k8s.io/community/contributors/guide/scalability-good-practices.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.

View File

@ -84,7 +84,7 @@ scheduling policies to apply, and can add new ones.
The policies that are applied when scheduling can be chosen in one of two ways.
The default policies used are selected by the functions `defaultPredicates()` and `defaultPriorities()` in
[pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](http://releases.k8s.io/HEAD/examples/scheduler-policy-config.json) for an example
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](https://git.k8s.io/examples/staging/scheduler-policy-config.json) for an example
config file. (Note that the config file format is versioned; the API is defined in [pkg/scheduler/api](http://releases.k8s.io/HEAD/pkg/scheduler/api/)).
Thus to add a new scheduling policy, you should modify [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) or add to the directory [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), and either register the policy in `defaultPredicates()` or `defaultPriorities()`, or use a policy config file.
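For illustration only (not copied from the example file linked above), a policy config file passed via `--policy-config-file` generally has the following shape; the predicate and priority names shown here are just examples and must match policies registered in `defaultPredicates()`/`defaultPriorities()` or added by you:
```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsHostPorts"},
    {"name": "PodFitsResources"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1}
  ]
}
```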

View File

@ -1,3 +0,0 @@
The original content of this file has been migrated to https://git.k8s.io/sig-release/security-release-process-documentation/security-release-process.md
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.

View File

@ -208,7 +208,7 @@ If you haven't noticed by now, we have a large, lively, and friendly open-source
## Events
Kubernetes is the main focus of CloudNativeCon/KubeCon, held twice per year in EMEA and in North America. Information about these and other community events is available on the CNCF [events](https://www.cncf.io/events/) pages.
Kubernetes is the main focus of KubeCon + CloudNativeCon, held three times per year in China, Europe and in North America. Information about these and other community events is available on the CNCF [events](https://www.cncf.io/events/) pages.
### Meetups

View File

@ -17,10 +17,11 @@ A list of common resources when contributing to Kubernetes.
- [Gubernator Dashboard - k8s.reviews](https://k8s-gubernator.appspot.com/pr)
- [Submit Queue](https://submit-queue.k8s.io)
- [Bot commands](https://go.k8s.io/bot-commands)
- [Release Buckets](http://gcsweb.k8s.io/gcs/kubernetes-release/)
- [GitHub labels](https://go.k8s.io/github-labels)
- [Release Buckets](https://gcsweb.k8s.io/gcs/kubernetes-release/)
- Developer Guide
- [Cherry Picking Guide](/contributors/devel/cherry-picks.md) - [Queue](http://cherrypick.k8s.io/#/queue)
- [https://k8s-code.appspot.com/](https://k8s-code.appspot.com/) - Kubernetes Code Search, maintained by [@dims](https://github.com/dims)
- [Cherry Picking Guide](/contributors/devel/cherry-picks.md) - [Queue](https://cherrypick.k8s.io/#/queue)
- [Kubernetes Code Search](https://cs.k8s.io/), maintained by [@dims](https://github.com/dims)
## SIGs and Working Groups
@ -39,8 +40,10 @@ A list of common resources when contributing to Kubernetes.
## Tests
- [Current Test Status](https://prow.k8s.io/)
- [Aggregated Failures](https://storage.googleapis.com/k8s-gubernator/triage/index.html)
- [Test Grid](https://k8s-testgrid.appspot.com/)
- [Aggregated Failures](https://go.k8s.io/triage)
- [Test Grid](https://testgrid.k8s.io)
- [Test Health](https://go.k8s.io/test-health)
- [Test History](https://go.k8s.io/test-history)
## Other

View File

@ -74,6 +74,22 @@ git checkout -b myfeature
Then edit code on the `myfeature` branch.
#### Build
The following section is a quick start on how to build Kubernetes locally, for more detailed information you can see [kubernetes/build](https://git.k8s.io/kubernetes/build/README.md).
The best way to validate your current setup is to build a small part of Kubernetes. This way you can address issues without waiting for the full build to complete. To build a specific part of Kubernetes use the `WHAT` environment variable to let the build scripts know you want to build only a certain package/executable.
```sh
make WHAT=cmd/${package_you_want}
```
*Note:* This applies to all top level folders under kubernetes/cmd.
So for the cli, you can run:
```sh
make WHAT=cmd/kubectl
```
If everything checks out you will have an executable in the `_output/bin` directory to play around with.
*Note:* If you are using `CDPATH`, you must either start it with a leading colon, or unset the variable. The make rules and scripts to build require the current directory to come first on the CD search path in order to properly navigate between directories.
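For example, after building `kubectl` as above you can try the freshly built binary straight from the output directory (a quick sketch; `version --client` only prints the client version and does not need a cluster):
```sh
make WHAT=cmd/kubectl
_output/bin/kubectl version --client
```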

View File

@ -0,0 +1,14 @@
reviewers:
- parispittman
- guineveresaenger
- jberkus
- errordeveloper
- tpepper
- spiffxp
approvers:
- parispittman
- guineveresaenger
- jberkus
- errordeveloper
labels:
- area/new-contributor-track

View File

@ -0,0 +1,12 @@
# Welcome to KubeCon Copenhagen's New Contributor Track!
Hello new contributors!
This subfolder of [kubernetes/community](https://github.com/kubernetes/community) will be used as a safe space for participants in the New Contributor Onboarding Track to familiarize themselves with (some of) the Kubernetes Project's review and pull request processes.
The label associated with this track is `area/new-contributor-track`.
*If you are not currently attending or organizing this event, please DO NOT create issues or pull requests against this section of the community repo.*
A [Youtube playlist](https://www.youtube.com/playlist?list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx) of this workshop has been posted, and an outline of content to videos can be found [here](http://git.k8s.io/community/events/2018/05-contributor-summit).

View File

@ -0,0 +1,4 @@
# Hello from Copenhagen!
Hello everyone who's attending the Contributor Summit at KubeCon + CloudNativeCon in Copenhagen!
Great to see so many amazing people interested in contributing to Kubernetes :)

View File

@ -0,0 +1,350 @@
# Kubernetes New Contributor Workshop - KubeCon EU 2018 - Notes
Joining at the beginning was like onboarding onto a yacht.
Now it is more like onboarding onto a BIG cruise ship.
It will be a packed schedule, so let's hope we can get through everything.
Sig-contributor-experience -> from Non-member contributors to Owner
## SIG presentation
- SIG-docs & SIG-contributor-experience: **Docs and website** contribution
- SIG-testing: **Testing** contribution
- SIG-\* (*depends on the area to contribute on*): **Code** contribution
**=> Find your first topics**: bug, feature, learning, community development and documentation
Table exercise: Introduce yourself and give a tip on where you want to contribute in Kubernetes
## Communication in the community
The Kubernetes community is like a capybara: community members are really friendly with everyone, and they come from many different backgrounds.
- Ask tech questions on Slack and Stack Overflow, not on GitHub
- A lot of discussion happens when GitHub issues and PRs are opened. Don't be frustrated
- Stay patient because there are a lot of contributions
When in doubt, **ask on Slack**
Other communication channels:
- Community meetings
- Mailing lists
- @ on Github
- Office Hour
- Kubernetes meetups https://www.meetup.com/topics/kubernetes
On https://kubernetes.io/community, there is the schedule for all the SIG/Working Group meetings.
If you want to join or create a meetup, go to **slack#sig-contribex**
## SIG - Special Interest Group
Semi-autonomous teams:
- Own leaders & charters
- Code, Github repo, Slack, mailing, meeting responsibility
### Types
[SIG List](https://github.com/kubernetes/community/blob/master/sig-list.md)
1. Features Area
- sig-auth
- sig-apps
- sig-autoscaling
- sig-big-data
- sig-cli
- sig-multicluster
- sig-network
- sig-node
- sig-scalability
- sig-scheduling
- sig-service-catalog
- sig-storage
- sig-ui
2. Plumbing
- sig-cluster-lifecycle
- sig-api-machinery
- sig-instrumentation
3. Cloud Providers *(currently working on moving cloudprovider code out of Core)*
- sig-aws
- sig-azure
- sig-gcp
- sig-ibmcloud
- sig-openstack
4. Meta
- sig-architecture: For all general architectural decisions
- sig-contributor-experience: Helping contributor and community experience
- sig-product-management: Long-term decisions
- sig-release
- sig-testing: In charge of all the tests for Kubernetes
5. Docs
- sig-docs: for documentation and website
## Working groups and "Subproject"
From working group to "subproject".
For specific: tools (ex. Helm), goals (ex. Resource Management) or areas (ex. Machine Learning).
Working groups change around more frequently than SIGs, and some might be temporary.
- wg-app-def
- wg-apply
- wg-cloud-provider
- wg-cluster-api
- wg-container-identity
- ...
### Picking the right SIG:
1. Figure out which area you would like to contribute to
2. Find out which SIG / WG / subproject covers that (tip: ask on #sig-contribex Slack channel)
3. Join that SIG / WG / subproject (you should also join the main SIG when joining a WG / subproject)
## Tour des repositories
Everything will be refactored (cleaned, moved, merged, ...)
### Core repository
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes)
### Project
- [kubernetes/Community](https://github.com/kubernetes/Community): KubeCon, proposals, Code of Conduct and contribution guidelines, SIG list
- [kubernetes/Features](https://github.com/kubernetes/Features): Feature proposals for future releases
- [kubernetes/Steering](https://github.com/kubernetes/Steering)
- [kubernetes/Test-Infra](https://github.com/kubernetes/Test-Infra): All related to test except Perf
- [kubernetes/Perf-Tests](https://github.com/kubernetes/Perf-Tests): Performance and scalability tests
### Docs/Website
- website
- kubernetes-cn
- kubernetes-ko
### Developer Tools
- sample-controller*
- sample-apiserver*
- code-generator*
- k8s.io
- kubernetes-template-project: For new github repo
### Staging repositories
Mirrors of core parts for easy vendoring
### SIG repositories
- release
- federation
- autoscaler
### Cloud Providers
No AWS
### Tools & Products
- kubeadm
- kubectl
- kops
- helm
- charts
- kompose
- ingress-nginx
- minikube
- dashboard
- heapster
- kubernetes-anywhere
- kube-openapi
### 2nd Namespace: Kubernetes-sigs
Too many places for random/incubation stuff.
No working path for **promotion/deprecation**
In future:
1. start in Kubernetes-sigs
2. SIGs determine when and how the project will be **promoted/deprecated**
Those repositories can have their own rules:
- Approval
- Ownership
- ...
## Contribution
### First Bug report
```
- Bug or Feature
- What happened
- How to reproduce
```
### Issues as specifications
Most k8s changes start with an issue:
- Feature proposal
- API changes proposal
- Specification
### From Issue to Code/Docs
1. Start with an issue
2. Apply all appropriate labels
3. cc SIG leads and concerned devs
4. Raise the issue at a SIG meeting or on mailing list
5. If there is *lazy consensus*, submit a PR
### Required labels https://github.com/kubernetes/test-infra/blob/master/label_sync/labels.md
#### On creation
- `sig/\*`: the SIG the issue belongs to
- `kind/\*`:
- bug
- feature
- documentation
- design
- failing-test
#### For issues closed as part of **triage**
- `triage/duplicate`
- `triage/needs-information`
- `triage/support`
- `triage/unreproduceable`
- `triage/unresolved`
#### Priority
- `priority/critical-urgent`
- `priority/important-soon`
- `priority/important-longterm`
- `priority/backlog`
- `priority/awaiting-evidence`
#### Area
Free for dedicated issue area
- `area/kubectl`
- `area/api`
- `area/dns`
- `area/platform/gcp`
#### help-wanted
Currently mostly complicated things
#### SOON
`good-first-issue`
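Most of the labels above are applied by commenting on the issue or PR with Prow bot commands (see the bot commands link in the contributor resources); a minimal sketch, where the specific SIG, kind, priority, and area values are only examples:
```
/sig node
/kind bug
/priority important-soon
/area kubectl
```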
## Making a contribution by Pull Request
We will go through the typical PR process on kubernetes repos.
We will play there: [community/contributors/new-contributor-playground at master · kubernetes/community · GitHub](https://github.com/kubernetes/community/tree/master/contributors/new-contributor-playground)
1. When we contribute to any kubernetes repository, **fork it**
2. Do your modification in your fork
```
$ git clone git@github.com:jgsqware/community.git $GOPATH/src/github.com/kubernetes/community
$ git remote add upstream https://github.com/kubernetes/community.git
$ git remote -v
origin git@github.com:jgsqware/community.git (fetch)
origin git@github.com:jgsqware/community.git (push)
upstream git@github.com:kubernetes/community.git (fetch)
upstream git@github.com:kubernetes/community.git (push)
$ git checkout -b kubecon
Switched to a new branch 'kubecon'
## DO YOUR MODIFICATION IN THE CODE ##
$ git add contributors/new-contributor-playground/new-contibutor-playground-xyz.md
$ git commit
### IN YOUR COMMIT EDITOR ###
Adding a new contributors file
We are currently experimenting with the PR process in the kubernetes repository.
$ git push -u origin kubecon
```
3. Create a Pull request via Github
4. If needed, sign the CLA to make your contribution valid
5. Read the `k8s-ci-robot` message and `/assign @reviewer` recommended by the `k8s-ci-robot`
6. wait for a `LGTM` label from one of the `OWNER/reviewers`
7. wait for approval from one of `OWNER/approvers`
8. `k8s-ci-robot` will automatically merge the PR
`needs-ok-to-test` is applied to pull requests from non-member contributors; a member has to validate the pull request before the tests run
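If reviewers ask for changes, a common way to update the same pull request is to amend or add commits on your branch and push again; a minimal sketch, assuming the `origin`/`upstream` remotes and the `kubecon` branch from the example above:
```sh
# Bring your branch up to date with the upstream master
git fetch upstream
git rebase upstream/master

# Make the requested changes, then update the existing PR
git add .
git commit --amend --no-edit   # or create a new commit instead
git push --force-with-lease origin kubecon
```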
## Test infrastructure
> How the bot tells you when you mess up
At the end of a PR there is a bunch of tests.
2 types:
- required: always run and must pass to validate the PR (e.g. end-to-end tests)
- not required: run only in specific conditions (e.g. when modifying only a specific part of the code)
If something failed, click on `details` and check the test failure logs to see what happened.
There is a `junit-XX.log` file with the list of tests executed and an `e2e-xxxxx` folder with all the component logs.
To check if the test failed because of your PR or another one, you can click on the **TOP** `pull-request-xxx` link and you will see the test-grid and check if your failing test is failing in other PR too.
If you want to retrigger the test manually, you can comment the PR with `/retest` and `k8s-ci-robot` will retrigger the tests.
## SIG-Docs contribution
Anyone can contribute to docs.
### Kubernetes docs
- Websites URL
- Github Repository
- k8s slack: #sig-docs
### Working with docs
Docs use `k8s-ci-robot`. Approval process is the same as for any k8s repo.
In docs, the `master` branch is the current version of the docs, so always branch from `master`. It is continuously deployed.
For docs targeting a specific release, branch from `release-1.X`.
## Local build and Test
The code: [kubernetes/kubernetes]
The process: [kubernetes/community]
### Dev Env
You need:
- Go
- Docker
- Lots of RAM and CPU, and 10 GB of space
- best to use Linux
- place your k8s repo fork in:
- `$GOPATH/src/k8s.io/kubernetes`
- `cd $GOPATH/src/k8s.io/kubernetes`
- build: `./build/run.sh make`
- Build is incremental, keep running `./build/run.sh make` until it works
- To build a variant: `make WHAT="kubectl"`
- Building kubectl on Mac for Linux: `KUBE_*_PLATFORM="linux/amd64" make WHAT="kubectl"`
There is `build` documentation there: https://git.k8s.io/kubernetes/build
### Testing
There is `test` documentation there: https://git.k8s.io/community/contributor/guide
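For example, a minimal sketch of running a subset of the unit tests locally; the `make test` target and the `WHAT`/`GOFLAGS` variables are the ones described in the testing docs, and the package path is only an example:
```sh
# Run unit tests for a single package, verbosely
make test WHAT=./pkg/kubelet GOFLAGS="-v"

# Run the full unit test suite (much slower)
make test
```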

View File

@ -0,0 +1,5 @@
# Hello everyone!
Please feel free to talk amongst yourselves or ask questions if you need help
First commit at kubecon from @mitsutaka

View File

@ -16,7 +16,7 @@ We need the 80% case, Fabric8 is a good example of this. We need a good set of
We also need to look at how to get developer feedback on this so that we're building what they need. Pradeepto did a comparison of Kompose vs. Docker Compose for simplicity/usability.
One of the things we're discussing the Kompose API. We want to get rid of this and supply something which people can use directly with kuberntes. A bunch of shops only have developers. Someone asked though what's so complicated with Kube definitions. Have we identified what gives people trouble with this? We push too many concepts on developers too quickly. We want some high-level abstract types which represent the 95% use case. Then we could decompose these to the real types.
One of the things we're discussing the Kompose API. We want to get rid of this and supply something which people can use directly with kubernetes. A bunch of shops only have developers. Someone asked though what's so complicated with Kube definitions. Have we identified what gives people trouble with this? We push too many concepts on developers too quickly. We want some high-level abstract types which represent the 95% use case. Then we could decompose these to the real types.
What's the gap between compose files and the goal? As an example, say you want to run a webserver pod. You have to deal with ingress, and service, and replication controller, and a bunch of other things. What's the equivalent of "docker run" which is easy to get. The critical thing is how fast you can learn it.

View File

@ -13,35 +13,40 @@ In some sense, the summit is a real-life extension of the community meetings and
## Registration
- [Sign the CLA](/CLA.md) if you have not done so already.
- [Fill out this Google Form](https://goo.gl/forms/TgoUiqbqZLkyZSZw1)
- [Fill out this Google Form](https://goo.gl/forms/TgoUiqbqZLkyZSZw1) - Registration is now <b> closed.</b>
## When and Where
- Tuesday, May 1, 2018 (before Kubecon EU)
- Bella Center
- Copenhagen, Denmark
- Bella Center, Copenhagen, Denmark
- Registration and breakfast start at 8am in Room C1-M0
- Happy hour reception onsite to close at 5:30pm
All day event with a happy hour reception to close
## Agenda
There is a [Slack channel](https://kubernetes.slack.com/messages/contributor-summit) (#contributor-summit) for you to use during the summit to pass URLs, notes, reserve the hallway track room, etc.
## Agenda
### Morning
| Time | Track One | Track Two | Track Three |
| ----------- | ------------------------------- | ---------------------------- | -------------- |
| 8:00 | Registration and Breakfast | | |
| 9:00-9:15 | Welcome and Introduction | | |
| 9:15-9:30 | Steering Committee Update | | |
| Time | Track One - Room: C1-M1 | Track Two - Room: C1-M2 | Track Three - Room: B4-M5 |
| ----------- | ------------------------------- | ---------------------------- | -------------- |
| 8:00 | Registration and Breakfast - <b>Room: C1-M0</b> | | |
| 9:00-9:15 | | Welcome and Introduction | |
| 9:15-9:30 | | Steering Committee Update | |
| | | | |
| | [New Contributor Workshop](/events/2018/05-contributor-summit/new-contributor-workshop.md) | Current Contributor Workshop | Docs Sprint |
| | | | |
| | New Contributor Workshop | Current Contributor Workshop | Docs Sprint |
| | | | |
| 9:30-10:00 | Session | Unconference | |
| 10:00-10:50 | Session | Unconference | |
| 9:30-10:00 | Part 1 | What's next in networking? Lead: thockin | |
| 10:00-10:50 | Part 2 | CRDs and Aggregation - future and pain points. Lead: sttts | |
| 10:50-11:00 | B R E A K | B R E A K | |
| 11:00-12:00 | Session | Unconference | |
| 12:00-1:00 | Session | Unconference | |
| 11:00-12:00 | Part 3 | client-go and API extensions. Lead: munnerz | |
| 12:00-1:00 | Part 4 | Developer Tools. Leads: errordeveloper and r2d4 | |
| 1:00-2:00 | Lunch (Provided) | Lunch (Provided) | |
*Note: The New Contributor Workshop will be a single continuous training, rather than being divided into sessions as the Current Contributor track is. New contributors should plan to stay for the whole 3 hours. [Outline here](/events/2018/05-contributor-summit/new-contributor-workshop.md).*
### Afternoon
| Time | Track One |
@ -55,9 +60,9 @@ All day event with a happy hour reception to close
- SIG Updates (~5 minutes per SIG)
- 2 slides per SIG, focused on cross-SIG issues, not internal SIG discussions (those are for Kubecon)
- Identify potential issues that might affect multiple SIGs across the project
- One-to-many announcements about changes a SIG expects that might affect others
- One-to-many announcements about changes a SIG expects that might affect others
- Track Leads
- New Contributor Workshop - Josh Berkus
- New Contributor Workshop - Josh Berkus, Guinevere Saenger, Ilya Dmitrichenko
- Current Contributor Workshop - Paris Pittman
- SIG Updates - Jorge Castro

View File

@ -0,0 +1,139 @@
# Client-go
**Lead:** munnerz with assist from lavalamp
**Slides:** combined with the CRD session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0) (CRD is first; client-go is after)
**Thanks to our notetakers:** kragniz, mrbobbytales, directxman12, onyiny-ang
## Goals for the Session
* What is currently painful when building a controller
* Questions around best practices
* As someone new:
* What is hard to grasp?
* As someone experienced:
* What important bits of info do you think are critical
## Pain points when building controller
* A lot of boilerplate
* Work queues
* HasSynced functions
* Re-queuing
* Lack of deep documentation in these areas
* Some documentation exists, but focused on k/k core
* Securing webhooks & APIServers
* Validation schemas
* TLS, the number of certs is a pain point
* It is hard right now, the internal k8s CA has been used a bit.
* OpenShift has a 'serving cert controller' that will generate a cert based on an annotation; it might be possible to integrate it upstream.
* Leader election has been problematic, and the Scaling API is low-level and hard to use. It doesn't work well if a resource has multiple meanings of scale (e.g. multiple pools of nodes)
* Registering CRDs, what's the best way to go about it?
* No best way to do it, but has been deployed with application
* Personally, deploy the CRDs first for RBAC reasons
* Declarative API on one end that has to be translated to a transactional API on the other end (e.g. ingress). The controller is trying to change quite a few things.
* You can do locking, but it has to be built.
* Q: how do you deal with "rolling back" if the underlying infrastructure
that you're describing says no on an operation?
* A: use validating webhook?
* A: use status to keep track of things?
* A: two types of controllers: `kube --> kube` and `kube --> external`,
they work differently
* A: Need a record that keeps track of things in progress. e.g. status. Need more info on how to properly tackle this problem.
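To make the "boilerplate" complaints above concrete, here is a heavily trimmed sketch of the informer + workqueue + HasSynced + re-queue pattern most controllers repeat. It is illustrative only: the package paths are the usual client-go ones, error handling and real reconcile logic are omitted, and exact signatures can drift between client-go releases.

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/runtime"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/workqueue"
)

// syncHandler is where the real reconcile logic would live.
func syncHandler(key string) error {
	fmt.Println("reconciling", key)
	return nil
}

func main() {
	// Build a clientset from the local kubeconfig (in-cluster config works too).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Shared informer factory; the resync period mostly matters when you track external resources.
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	// Rate-limited work queue: event handlers only enqueue keys, the worker does the work.
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
				queue.Add(key)
			}
		},
		UpdateFunc: func(_, newObj interface{}) {
			if key, err := cache.MetaNamespaceKeyFunc(newObj); err == nil {
				queue.Add(key)
			}
		},
	})

	stopCh := make(chan struct{}) // closed on shutdown in a real controller
	factory.Start(stopCh)

	// Wait for the cache to sync before processing (the HasSynced dance).
	if !cache.WaitForCacheSync(stopCh, podInformer.HasSynced) {
		runtime.HandleError(fmt.Errorf("timed out waiting for caches to sync"))
		return
	}

	// Worker: process keys one at a time, re-queue with backoff on failure.
	processNextItem := func() bool {
		key, shutdown := queue.Get()
		if shutdown {
			return false
		}
		defer queue.Done(key)

		if err := syncHandler(key.(string)); err != nil {
			queue.AddRateLimited(key) // re-queue with backoff
			return true
		}
		queue.Forget(key)
		return true
	}
	go wait.Until(func() {
		for processNextItem() {
		}
	}, time.Second, stopCh)

	<-stopCh // block until shutdown
}
```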
## Best practices
(discussion may be shown by Q: for question or A: for audience or answer)
* How do you keep external resources up to date with Kubernetes resources?
* A: the original intention was to use the sync period on the controller if
you watch external resources, use that
* Should you set resync period to never if you're not dealing with
external resources?
* A: Yes, it's not a bug if watch fails to deliver things right
* A: controller automatically relists on connection issues, resync
interval is *only* for external resources
* maybe should be renamed to make it clear it's for external resources
* how many times to update status per sync?
* A: use status conditions to communicate "fluffy" status to user
(messages, what might be blocked, etc, in HPA), use fields to
communicate "crunchy" status (last numbers we saw, last metrics, state
I need later).
* How do I generate nice docs (markdown instead of swagger)
* A: kubebuilder (kubernetes-sigs/kubebuilder) generates docs out of the
box
* A: Want to have IDL pipeline that runs on native types to run on CRDs,
run on docs generator
* Conditions vs fields
* used to check a pods state
* "don't use conditions too much"; other features require the use of conditions, status is unsure
* What does condition mean in this context
* Additional fields that can have `ready` with a msg, represents `state`.
* Limit on states that the object can be in.
* Use conditions to reflect the state of the world, is something blocked etc.
* Conditions were created to allow for mixed mode of clients, old clients can ignore some conditions while new clients can follow them. Designed to make it easier to extend status without breaking clients.
* Validating webhooks vs OpenAPI schema
* Can we write a test that spins up main API server in process?
* Can do that currently in some k/k tests, but not easy to consume
* vendoring is hard
* Currently have a bug where you have to serve aggregated APIs on 443,
so that might complicate things
* How are people testing extensions?
* Anyone reusing upstream dind cluster?
* People looking for a good way to test them.
* kube-builder uses the sig-testing framework to bring up a local control plane and use that to test against. (@pwittrock)
* How do you start cluster for e2es?
* Spin up a full cluster with kubeadm and run tests against that
* integration tests -- pull in packages that will build the clusters
* Q: what CIs are you using?
* A: Circle CI and then spin up new VMs to host cluster
* Mirantis has a tool for a multi-node dind cluster for testing
* #testing-commons channel on Slack. 27-page document on this--link will be put in slides
* Deploying and managing Validating/Mutating webhooks?
* how complex should they be?
* When to use subresources?
* Are people switching to api agg to use this today?
* Really just for status and scale
* Why not use subresources today with scale?
* multiple replicas fields
* doesn't fit polymorphic structure that exists
* pwittrock@: kubectl side, scale
* want to push special kubectl verbs into subresources to make kubectl
more tolerant to version skew
## Other Questions
* Q: Client-go generated listers, what is the reason for two separate interfaces to retrieve from client and cache?
* A: historical, but some things are better done local vs on the server.
* issues: client-set interface allows you to pass special options that allow you to do interesting stuff on the API server which isn't necessarily possible in the lister.
* started as same function call and then diverged
* lister gives you slice of pointers
* clientset gives you a slice of not pointers
* a lot of people would take return from clientset and then convert it to a slice of pointers so the listers helped avoid having to do deep copies every time. TLDR: interfaces are not identical
* Where should questions go on this topic for now?
* A: most goes to sig-api-machinery right now
* A : Controller related stuff would probably be best for sig-apps
* Q: Staleness of data, how are people dealing with keeping data up to date with external data?
* A: Specify sync period on your informer, will put everything through the loop and hit external resources.
* Q: With strictly kubernetes resources, should your sync period be never? aka does the watch return everything.
* A: The watch should return everything and should be used if its strictly k8s in and k8s out, no need to set the sync period.
* Q: What about controllers in other languages than go?
* A: [metacontroller](https://github.com/GoogleCloudPlatform/metacontroller) There are client libs in other languages, missing piece is work queue,
informer, etc
* Cluster API controllers cluster, machineset, deployment, have a copy of
deployment code for machines. Can we move this code into a library?
* A: it's a lot of work, someone needs to do it
* A: Janet Kuo is a good person to talk to (worked on getting core workloads
API to GA) about opinions on all of this
* Node name duplication caused issues with AWS and long-term caches
* make sure to store UIDs if you cache across reboot
## Moving Forwards
* How do we share/disseminate knowledge (SIG PlatformDev?)
* Most SIGs maintain their own controllers
* Wiki? Developer Docs working group?
* Existing docs focus on in-tree development. Dedicated 'extending kubernetes' section?
* Git-book being developed for kubebuilder (book.kubebuilder.io); would appreciate feedback @pwittrock
* API extensions authors meetups?
* How do we communicate this knowledge for core kubernetes controllers
* Current-day: code review, hallway conversations
* Working group for platform development kit?
* Q: where should we discuss/have real time conversations?
* A: #sig-apimachinery, or maybe #sig-apps in slack (or mailing lists) for the workloads controllers

View File

@ -0,0 +1,92 @@
# CRDs - future and pain points
**Lead:** sttts
**Slides:** combined with the client-go session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0)
**Thanks to our notetakers:** mrbobbytales, kragniz, tpepper, and onyiny-ang
## outlook - aggregation
* API stable since 1.10. There is a lack of tools and library support.
* GSoC project with @xmudrii: share etcd storage
* `kubectl create etcdstorage your api-server`
* Store custom data in etcd
## outlook custom resources
1.11:
* alpha: multiple versions with/without conversion
* alpha: pruning - blocker for GA - unspecified fields are removed
* deep change of semantics of custom resources
* from JSON blob store to schema based storage
* alpha: defaulting - defaults from openapi validation schema are applied
* alpha: graceful deletion - (maybe? PR exists)
* alpha: server side printing columns for `kubectl get` customization
* beta: subresources - alpha in 1.10
* will have additionalProperties with extensible string map
* mutually exclusive with properties
1.12
* multiple versions with declarative field renames
* strict create mode (issue #5889)
Missing from Roadmap:
- Additional Properties: Forbid additional fields
- Unknown fields are silently dropped instead of erroring
- Istio uses CRDs extensively: proto requires some kind of verification and CRDs are JSON
- currently planning to go to GA without proto support
- possibly in the longer term to plan
- Resource Quotas for Custom Resources
- doable, we know how but not currently implemented
- Defaulting: mutating webhook will default things when they are written
- Is Validation going to be required in the future
- poll the audience!
- gauging general sense of validation requirements (who wants them, what's missing?)
- missing: references to core types aren't allowed/can't be defined -- this can lead to versioning complications
- limit CRDs cluster-wide such that they don't affect all namespaces
- no good discussion about how to improve this yet
- feel free to start one!
- Server side printing columns, per resource type needs to come from server -- client could be in different version than server and highlight wrong columns
Autoscaling is alpha today hopefully beta in 1.11
## The Future: Versioning
* Most asked feature, coming..but slowly
* two types, "noConversion" and "Declarative Conversion"
* "NoConversion" versioning
* maybe in 1.11
* ONLY change is apiGroup
* Run multiple versions at the same time, they are not converted
* "Declarative Conversion" 1.12
* declarative rename e.g
```
spec:
group: kubecon.io
version: v1
conversions:
declarative:
renames:
from: v1alpha1
to: v1
old: spec.foo
new: bar
```
* Support for webhook?
* not currently, very hard to implement
* complex problem for end user
* current need is really only changing for single fields
* Trying to avoid complexity by adding a lot of conversions
## Questions:
* When should someone move to their own API Server
* At the moment, we're telling people to start with CRDs. Move to an aggregated API server if you need custom versioning or other specific use-cases.
* How do I update everything to a new object version?
* Have to touch every object.
* is protobuf support coming in the future?
* possibly, likely yes
* update on resource quotas for CRDs
* PoC PR currently out; it's doable, just not quite done
* Is validation field going to be required?
* Eventually, yes? Some work being done to make CRDs work well with `kubectl apply`
* Can CRDs be cluster-wide but viewable by only some users?
* It's been discussed, but hasn't been tackled.
* Is there support for CRDs in kubectl output?
* server side printing columns will make things easier for client tooling output. Versioning is important for client vs server versioning.

View File

@ -0,0 +1,63 @@
# Developer Tools:
**Leads:** errordeveloper, r2d4
**Slides:** n/a
**Thanks to our notetakers:** mrbobbytales, onyiny-ang
What APIs should we target, what parts of the developer workflow haven't been covered yet?
* Do you think the Developer tools for Kubernetes is a solved problem?
* A: No
### Long form responses from SIG Apps survey
* Need to talk about developer experience
* Kubernetes Community can do a lot more in helping evangelize Software development workflow, including CI/CD. Just expecting some guidelines on the more productive ways to write software that runs in k8s.
* Although my sentiment is neutral on kube, it is getting better as more tools are emerging to allow my devs to stick to app development and not get distracted by kube items. There is a lot of tooling available which is a dual edge sword, these tools range greatly in usability robustness and security. So it takes a lot of effort to...
### Current State of Developer Experience
* Many Tools
* Mostly incompatible
* Few end-to-end workflows
### Comments and Questions
* Idea from scaffold to normalize the interface for builders, be able to swap them out behind the scenes.
* Possible to formalize these as CRDs?
* Lots of choices, helm, other templating, kompose etc..
* So much flexibility in the Kubernetes API that it can become complicated for new developers coming up.
* Debug containers might make things easier for developers to work through building and troubleshooting their app.
* Domains and workflow are so different from companies that everyone has their own opinionated solution.
* Lots of work being done in the app def working group to define what an app is.
* app CRD work should make things easier for developers.
* Break out developer workflow into stages and try and work through expanding them, e.g. develop/debug
* debug containers are looking to be used both in prod and developer workflows
* Tool in sig-cli called kustomize, was previously 'konflate'?
* Hard to talk about all these topics as there isn't the language to talk about these classes of tools.
* @jacob investigation into application definition: re: phases, it's not just build, deploy, debug; it's build, deploy, lifecycle, debug. Managing lifecycle is still a problem, '1-click deploy' doesn't handle lifecycle.
* @Bryan Liles: thoughts about why this is hard:
* kubectl helm apply objects in different orders
* objects vs abstractions
* some people love [ksonnet](https://ksonnet.io/), some hate it. Kubernetes concepts are introduced differently to different people so not everyone is starting with the same base. Thus, some tools are harder for some people to grasp than others. Shout out to everyone who's trying to work through it
* Being tied to one tool breaks compatibility across providers.
* Debug containers are great for break-glass scenarios
* CoreOS had an operator that handled the entire stack, additional objects could be created and certain metrics attached.
* Everything is open source now, etcd, prometheus operator
* Tools are applying things in different orders, and this can be a problem across tooling
* People who depend on startup order also tend to have reliability problems as they have their own operational problems, should try and engineer around it.
* Can be hard if going crazy on high-level abstractions, can make things overly complicated and there are a slew of constraints in play.
* Ordering constraints are needed for certain garbage collection tasks, having ordering may actually be useful.
* Some groups have avoided high-level DSLs because people should understand readiness/liveness probes etc. Developers may have a learning curve, but worthwhile when troubleshooting and getting into the weeds.
* Lots of people don't want to get into it at all, they want to put in a few details on a db etc and get it.
* Maybe standardize on a set of labels to on things that should be managed as a group. Helm is one implementation, it should go beyond helm.
* There is a PR that is out there that might take care of some of this.
* Everyone has their own "style" when it comes to this space.
* Break the phases and components in the development and deployment workflow into sub-problems and they may actually be able to be tackled. Right now the community seems to be tackling everything at once and developing different tools to do the same thing.
* build UI that displays the whole thing as a list and allows easy creation/destruction of cluster
* avoid tools that would prevent portability
* objects rendered to file somehow: happens at runtime, additional operator that takes care of the stack
* 3, 4 minor upgrades without breakage
* @Daniel Smith: start up order problems = probably bigger problems, order shouldn't need to matter but in the real world sometimes it does
* platform team, internal paths team (TSL like theme), etc. In some cases it's best to go crazy focusing on the abstractions--whole lot of plumbing that needs to happen to get everything working properly
* Well defined order of creation may not be a bad thing. ie. ensure objects aren't created that are immediately garbage collected.
* Taking a step back from being contributors and putting on developer hats to consider the tool sprawl that exists and is not necessarily compatible across different aspects of kubernetes. Is there any way to consolidate them and make them more standardized?
* Split into sub-problems
## How can we get involved?
- SIG-Apps - join the conversation on slack, mailing list, or weekly Monday meeting

View File

@ -0,0 +1,129 @@
# Networking
**Lead:** thockin
**Slides:** [here](https://docs.google.com/presentation/d/1Qb2fbyTClpl-_DYJtNSReIllhetlOSxFWYei4Zt0qFU/edit#slide=id.g2264d16f0b_0_14)
**Thanks to our notetakers:** onyiny-ang, mrbobbytales, tpepper
This session is not declaring what's being implemented next, but rather laying out the problems that loom.
## Coming soon
- kube-proxy with IPVS
- currently beta
- core DNS replacing kube DNS
- currently beta
- pod "ready++"
- allow external systems to participate in rolling updates. Say your load-balancer takes 5-10 seconds to program: when you bring up a new pod and take down an old pod, the load balancer has lost the old backends but hasn't yet added the new ones. An external dependency like this becomes a gating pod decorator.
- adds configuration to pod to easily verify readiness
- design agreed upon, alpha (maybe) in 1.11
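As a rough illustration of the "pod ready++" idea above, a pod would declare an extra readiness gate that an external system (such as a load-balancer controller) flips via a pod condition. The field names follow the design proposal and the condition type shown is hypothetical:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ready-plus-plus-example
spec:
  readinessGates:
  # Hypothetical condition type set by an external load-balancer controller
  - conditionType: "example.com/load-balancer-programmed"
  containers:
  - name: app
    image: nginx
```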
## Ingress
* The lowest common-denominator API. This is really limiting for users, especially compared to modern software L7 proxies.
* annotation model of markup limits portability
* ingress survey reports:
* people want portability
* everyone uses non-portable features…
* 2018 L7 requirements are dramatically higher than what they were and many vendors don't support that level of functionality.
* Possible Solution? Routes
* openshift uses routes
* heptio prototyping routes currently
* All things considered, requirements are driving it closer and closer to istio
One possibility: poach some of the ideas and add them natively to Kubernetes.
## Istio
(as a potential solution)
- maturing rapidly with good APIs and support
- Given that plus istio is not part of kubernetes, it's unlikely near term to become a default or required part of a k8s deployment. The general ideas around istio style service mesh could be more native in k8s.
## Topology and node-local Services
- demand for node-local network and service discovery but how to go about it?
- e.g. “I want to talk to the logging daemon on my current host”
- special-case topology?
- client-side choice
- These types of services should not be a service proper.
## Multi-network
- certain scenarios demand multi-network
- A pod can be in multiple networks at once. You might have different quality of service on different networks (eg: fast/expensive, slower/cheaper), or different connectivity (eg: the rack-internal network).
- Tackling scenarios like NFV
- need deeper changes like multiple pod IPs but also need to avoid repeating old mistakes
- SIG-Network WG designing a PoC -- If interested jump on SIG-network WG weekly call
- Q: Would this PoC help if virtual-kubelets were used to span cloud providers? Spanning latency domains in networks is also complicated. Many parts of k8s are chatty, assuming a cluster internal low-latency connectivity.
## Net Plugins vs Device Plugins
- These plugins do not coordinate today and are difficult to work around
- gpu that is also an infiniband device
- causes problems because network and device are very different with verbs etc
- problems encountered with having to schedule devices and network together at the same time.
“I want a GPU on this host that has a GPU attached and I want it to be the same device”
PoC available to make this work, but it's rough and a problem right now.
- Resources WG and networking SIG are discussing this challenging problem
- SIGs/WGs. Conversation may feel like a cycle, but @thockin feels it is a spiral that is slowly converging and he has a doc he can share covering the evolving thinking.
## Net Plugins, gRPC, Services
- tighter coupling between netplugins and kube-proxy could be useful
- grpc is awesome for plugins, why not use a grpc network plugin
- pass services to network plugin to bypass kube-proxy, give more awareness to the network plugin and enable more functionality.
## IPv6
- beta but **no** support for dual-stack (v4 & v6 at the same time)
- Need deeper changes like multiple pod IPs (need to change the pod API--see Multi-network)
- https://github.com/kubernetes/features/issues/563
## Services v3
- Services + Endpoints have a grab-bag of features which is not ideal; "grew organically"
- Need to start segmenting the "core" API group
- write API in a way that is more obvious
- split things out and reflect it in API
- Opportunity to rethink and refactor:
- Endpoints -> Endpoint?
- split the grouping construct from the “gazintas”
- virtualIP, network, dns name moves into the service
- EOL troublesome features
- port remapping
## DNS Reboot
- We abuse DNS and mess up our DNS schema
- it's possible to write queries in DNS that take over names
- @thockin has a doc with more information about the details of this
- Why can't I use more than 6 web domains? bugzilla circa 1996
- problem: it's possible to write queries in DNS that write over names
- create a namespace called “com” and an app named “google” and it'll cause a problem
- “svc” is an artifact and should not be a part of dns
- issues with certain underlying libraries
- Changing it is hard (if we care about compatibility)
- Can we fix DNS spec or use "enlightened" DNS servers
- Smart proxies on behalf of pods that do the searching and become a “better” dns
- External DNS
- Creates DNS entries in external system (route53)
- Currently in incubator, not sure on status, possibly might move out of incubator, but unsure on path forward
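To make the name-takeover problem concrete: a pod's `/etc/resolv.conf` typically looks something like the sketch below (exact values vary by cluster). Because short names are tried against the cluster search domains first, a service "google" in a namespace named "com" resolves as `google.com.svc.cluster.local` and can shadow the external `google.com`:
```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```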
## Perf and Scalability
- iptables is crufty. An nftables implementation should be better.
- ebpf implementation (eg; Cilium) has potential
## Questions:
- Consistent mechanism to continue progress but maintain backwards compatibility
- External DNS was not mentioned -- blue/green traffic switching
- synchronizes kubernetes resources into various Kubernetes services
- it's in incubator right now (deprecated)
- unsure of the future trajectory
- widely used in production
- relies sometimes on annotations and ingress
- Q: Device plugins. . .spiraling around and hoping for eventual convergence/simplification
- A: Resource management on device/net plugin, feels like things are going in a spiral, but progress is being made, it is a very difficult problem and hard to keep all design points tracked. Trying to come to consensus on it all.
- Q: Would CoreDNS be the best place for the plugins and other modes for DNS proxy etc.
- loss of packets are a problem -- long tail of latency
- encourage cloud providers to support gRPC
- Q: With the issues talked about earlier, why can't istio be integrated natively?
- A: Istio can't be required/default: still green
- today we can't proclaim that Kubernetes must support Istio
- probably not enough community support this year (not everyone is using it at this point)
- Q: Thoughts on k8s v2?
- A: Things will not just be turned off, things must be phased out and over the course of years, especially for services which have been core for some time.
## Take Aways:
- This is not a comprehensive list of everything that is up and coming
- A lot of work went into all of these projects

View File

@ -0,0 +1,99 @@
# Kubernetes Summit: New Contributor Workshop
*This was presented as one continuous 3-hour training with a break. For purposes of live coding exercises, participants were asked to bring a laptop with git installed.*
This course was captured on video, and the playlist can be found [here](https://www.youtube.com/playlist?list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx).
*Course Playlist [Part One](https://www.youtube.com/watch?v=obyAKf39H38&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=1):*
* Opening
* Welcome contributors
* Who this is for
* Program
* The contributor ladder
* CLA signing
* Why we have a CLA
* Going through the signing process
* Choose Your Own Adventure: Figuring out where to contribute
* Docs & Website
* Testing
* Community management
* Code
* Main code
* Drivers, platforms, plugins, subprojects
* Finding your first topic
* Things that fit into your work at work
* Interest match
* Skills match
* Choose your own adventure exercise
* Let's talk: Communication
* Importance of communication
* Community standards and courtesy
* Mailing Lists (esp Kube-dev)
* Slack
* Github Issues & PRs
* Zoom meetings & calendar
* Office hours, MoC, other events
* Meetups
* Communication exercise
* The SIG system
* What are SIGs and WGs
* Finding the right SIG
* Most active SIGs
* SIG Membership, governance
* WGs and Subprojects
* Repositories
* Tour de Repo
* Core Repo
* Website/docs
* Testing
* Other core repos
* Satellite Repos
* Owners files
* Repo membership
* BREAK (20min)
*Course Playlist [Part Two](https://www.youtube.com/watch?v=PERboIaNdcI&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=2):*
* Contributing by Issue: Josh (15 min) (1:42)
* Finding the right repo
* What makes a good issue
* Issues as spec for changes
* Labels
* label framework
* required labels
* Following up and communication
* Contributing by PR (with walkthrough)
* bugs vs. features vs. KEP
* PR approval process
* More Labels
* Finding a reviewer
* Following-up and communication
* On you: rebasing, test troubleshooting
* Test infrastructure
* Automated tests
* Understanding test failures
* Doc Contributions
* Upcoming changes to docs
* Building docs locally
* Doc review process
*Course Playlist [Part Three](https://www.youtube.com/watch?v=Z3pLlp6nckI&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=3):*
* Code Contributions: Build and Test
* Local core kubernetes build
* Running unit tests
* Troubleshooting build problems
* Releases
* Brief on Release schedule
* Release schedule details
* Release Team Opportunities (shadows)
* Going beyond
* Org membership
* Meetups & CNCF ambassador
* Mentorship opportunities
* Group Mentoring
* GSOC/Outreachy
* Release Team
* Meet Our Contributors
* 1-on-1 ad-hoc mentoring
* Kubernetes beginner tutorials
* Check your own progress on devstats

View File

@ -0,0 +1,13 @@
# Steering Committee Update
**Leads:** pwittrock, timothysc
**Thanks to our notetaker:** tpepper
* incubation is deprecated, "associated" projects are a thing
* WG are horizontal across SIGs and are ephemeral. Subprojects own a piece
of code and relate to a SIG. Example: SIG-Cluster-Lifecycle with
kubeadm, kops, etc. under it.
* SIG charters: PR a proposed new SIG with the draft charter. Discussion
can then happen on GitHub around the evolving charter. This is cleaner
and more efficient than discussing on mailing list.
* K8s values doc updated by Sarah Novotny
* changes to voting roles and rules are in the works

View File

@ -1,12 +1,10 @@
# Kubernetes Weekly Community Meeting
We have PUBLIC and RECORDED [weekly meeting](https://zoom.us/my/kubernetescommunity) every Thursday at 6pm UTC (1pm EST / 10am PST)
We have PUBLIC and RECORDED [weekly meeting](https://zoom.us/my/kubernetescommunity) every Thursday at [5pm UTC](https://www.google.com/search?q=5pm+UTC).
Map that to your local time with this [timezone table](https://www.google.com/search?q=1800+in+utc)
See it on the web at [calendar.google.com](https://calendar.google.com/calendar/embed?src=cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com&ctz=America/Los_Angeles) , or paste this [iCal url](https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics) into any [iCal client](https://en.wikipedia.org/wiki/ICalendar). Do NOT copy the meetings over to a your personal calendar, you will miss meeting updates. Instead use your client's calendaring feature to say you are attending the meeting so that any changes made to meetings will be reflected on your personal calendar.
See it on the web at [calendar.google.com](https://calendar.google.com/calendar/embed?src=cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com&ctz=America/Los_Angeles) , or paste this [iCal url](https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics) into any [iCal client](https://en.wikipedia.org/wiki/ICalendar). Do NOT copy the meetings over to your personal calendar, you will miss meeting updates. Instead use your client's calendaring feature to say you are attending the meeting so that any changes made to meetings will be reflected on your personal calendar.
All meetings are archived on the [Youtube Channel](https://www.youtube.com/watch?v=onlFHICYB4Q&list=PL69nYSiGNLP1pkHsbPjzAewvMgGUpkCnJ)
All meetings are archived on the [Youtube Channel](https://www.youtube.com/playlist?list=PL69nYSiGNLP1pkHsbPjzAewvMgGUpkCnJ).
Quick links:

View File

@ -6,8 +6,8 @@ Office Hours is a live stream where we answer live questions about Kubernetes fr
Third Wednesday of every month, there are two sessions:
- European Edition: [2pm UTC](https://www.timeanddate.com/worldclock/fixedtime.html?msg=Kubernetes+Office+Hours+%28European+Edition%29&iso=20171115T14&p1=136&ah=1)
- Western Edition: [9pm UTC](https://www.timeanddate.com/worldclock/fixedtime.html?msg=Kubernetes+Office+Hours+%28Western+Edition%29&iso=20171115T13&p1=1241)
- European Edition: [1pm UTC](https://www.google.com/search?q=1pm+UTC)
- Western Edition: [8pm UTC](https://www.google.com/search?q=8pm+UTC)
Tune into the [Kubernetes YouTube Channel](https://www.youtube.com/c/KubernetesCommunity/live) to follow along.

View File

@ -16,7 +16,7 @@ The documentation follows a template and uses the values from [`sigs.yaml`](/sig
**Time Zone gotcha**:
Time zones make everything complicated.
And Daylight Savings time makes it even more complicated.
And Daylight Saving time makes it even more complicated.
Meetings are specified with a time zone and we generate a link to http://www.thetimezoneconverter.com/ so that people can easily convert it to their local time zone.
To make this work you need to specify the time zone in a way that that web site recognizes.
Practically, that means US pacific time must be `PT (Pacific Time)`.

View File

@ -0,0 +1,225 @@
---
kep-number: 8
title: Promote sysctl annotations to fields
authors:
- "@ingvagabund"
owning-sig: sig-node
participating-sigs:
- sig-auth
reviewers:
- "@sjenning"
- "@derekwaynecarr"
approvers:
- "@sjenning "
- "@derekwaynecarr"
editor:
creation-date: 2018-04-30
last-updated: 2018-05-02
status: provisional
see-also:
replaces:
superseded-by:
---
# Promote sysctl annotations to fields
## Table of Contents
* [Promote sysctl annotations to fields](#promote-sysctl-annotations-to-fields)
* [Table of Contents](#table-of-contents)
* [Summary](#summary)
* [Motivation](#motivation)
* [Promote annotations to fields](#promote-annotations-to-fields)
* [Promote --experimental-allowed-unsafe-sysctls kubelet flag to kubelet config api option](#promote---experimental-allowed-unsafe-sysctls-kubelet-flag-to-kubelet-config-api-option)
* [Gate the feature](#gate-the-feature)
* [Proposal](#proposal)
* [User Stories](#user-stories)
* [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
* [Risks and Mitigations](#risks-and-mitigations)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
## Summary
Setting the `sysctl` parameters through annotations provided a successful story
for defining better constraints of running applications.
The `sysctl` feature has been tested by a number of people without any serious
complaints. Promoting the annotations to fields (i.e. to beta) is another step in making the
`sysctl` feature closer towards the stable API.
Currently, the `sysctl` provides `security.alpha.kubernetes.io/sysctls` and `security.alpha.kubernetes.io/unsafe-sysctls` annotations that can be used
in the following way:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: sysctl-example
annotations:
security.alpha.kubernetes.io/sysctls: kernel.shm_rmid_forced=1
security.alpha.kubernetes.io/unsafe-sysctls: net.ipv4.route.min_pmtu=1000,kernel.msgmax=1 2 3
spec:
...
```
The goal is to transition into native fields on pods:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: sysctl-example
spec:
securityContext:
sysctls:
- name: kernel.shm_rmid_forced
value: 1
- name: net.ipv4.route.min_pmtu
value: 1000
unsafe: true
- name: kernel.msgmax
value: "1 2 3"
unsafe: true
...
```
The `sysctl` design document with more details and rationales is available at [design-proposals/node/sysctl.md](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/sysctl.md#pod-api-changes)
## Motivation
As mentioned in [contributors/devel/api_changes.md#alpha-field-in-existing-api-version](https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#alpha-field-in-existing-api-version):
> Previously, annotations were used for experimental alpha features, but are no longer recommended for several reasons:
>
> They expose the cluster to "time-bomb" data added as unstructured annotations against an earlier API server (https://issue.k8s.io/30819)
> They cannot be migrated to first-class fields in the same API version (see the issues with representing a single value in multiple places in backward compatibility gotchas)
>
> The preferred approach adds an alpha field to the existing object, and ensures it is disabled by default:
>
> ...
The annotations as a means to set `sysctl` are no longer necessary.
The original intent of annotations was to provide additional description of Kubernetes
objects through metadata.
It's time to separate the ability to annotate from the ability to change sysctl settings
so a cluster operator can elevate the distinction between experimental and supported usage
of the feature.
### Promote annotations to fields
* Introduce native `sysctl` fields in pods through `spec.securityContext.sysctl` field as:
```yaml
sysctl:
- name: SYSCTL_PATH_NAME
value: SYSCTL_PATH_VALUE
unsafe: true # optional field
```
* Introduce native `sysctl` fields in [PSP](https://kubernetes.io/docs/concepts/policy/pod-security-policy/) as:
```yaml
apiVersion: v1
kind: PodSecurityPolicy
metadata:
name: psp-example
spec:
sysctls:
- kernel.shmmax
- kernel.shmall
- net.*
```
More examples at [design-proposals/node/sysctl.md#allowing-only-certain-sysctls](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/sysctl.md#allowing-only-certain-sysctls)
### Promote `--experimental-allowed-unsafe-sysctls` kubelet flag to kubelet config api option
As there is no longer a need to consider the `sysctl` feature experimental,
the list of unsafe sysctls can be configured accordingly through:
```go
// KubeletConfiguration contains the configuration for the Kubelet
type KubeletConfiguration struct {
...
// Whitelist of unsafe sysctls or unsafe sysctl patterns (ending in *).
// Default: nil
// +optional
AllowedUnsafeSysctls []string `json:"allowedUnsafeSysctls,omitempty"`
}
```
Upstream issue: https://github.com/kubernetes/kubernetes/issues/61669
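For illustration, the option might appear in a kubelet configuration file roughly as in the sketch below; the `kubelet.config.k8s.io/v1beta1` group/version is an assumption for this example and is not mandated by this proposal, while the `allowedUnsafeSysctls` name comes from the JSON tag above:
```yaml
# Hypothetical kubelet configuration file snippet; the group/version and
# surrounding layout are assumptions for illustration only.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
allowedUnsafeSysctls:
- "kernel.msg*"              # a pattern ending in * whitelists a sysctl prefix
- "net.ipv4.route.min_pmtu"  # a single fully-qualified sysctl
```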
### Gate the feature
As the `sysctl` feature stabilizes, it's time to gate the feature [1] and enable it by default.
* Expected feature gate key: `Sysctls`
* Expected default value: `true`
With the `Sysctls` feature gate enabled, the sysctl fields in both `Pod` and `PodSecurityPolicy`
and the whitelist of unsafe sysctls are honored.
If disabled, the fields and the whitelist are simply ignored.
[1] https://kubernetes.io/docs/reference/feature-gates/
## Proposal
### User Stories
* As a cluster admin, I want to have the `sysctl` feature versioned so I can ensure backward compatibility
and proper transformation between the versioned and internal representations and back.
* As a cluster admin, I want to be confident the `sysctl` feature is stable enough and well supported so
applications are properly isolated.
* As a cluster admin, I want to be able to apply the `sysctl` constraints on the cluster level so
I can define the default constraints for all pods.
### Implementation Details/Notes/Constraints
Extending the `PodSecurityContext` struct with a `Sysctls` field:
```go
// PodSecurityContext holds pod-level security attributes and common container settings.
// Some fields are also present in container.securityContext. Field values of
// container.securityContext take precedence over field values of PodSecurityContext.
type PodSecurityContext struct {
...
// Sysctls is a list of sysctls to set for the pod.
Sysctls []Sysctl `json:"sysctls,omitempty"`
}
```
Extending the `PodSecurityPolicySpec` struct with a `Sysctls` field:
```go
// PodSecurityPolicySpec defines the policy enforced on sysctls.
type PodSecurityPolicySpec struct {
...
// Sysctls is a white list of allowed sysctls in a pod spec.
Sysctls []Sysctl `json:"sysctls,omitempty"`
}
```
Follow the steps in [devel/api_changes.md#alpha-field-in-existing-api-version](https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#alpha-field-in-existing-api-version)
during implementation.
Validation checks implemented as part of [#27180](https://github.com/kubernetes/kubernetes/pull/27180).
### Risks and Mitigations
We need to ensure backward compatibility, i.e. object specifications with `sysctl` annotations
must still work after the graduation.
## Graduation Criteria
* API changes allowing pod-scoped `sysctl`s to be configured via the `spec.securityContext` field.
* API changes allowing cluster-scoped `sysctl` policy to be configured via the `PodSecurityPolicy` object.
* Promote the `--experimental-allowed-unsafe-sysctls` kubelet flag to a kubelet config API option.
* Feature gate enabled by default.
* e2e tests.
## Implementation History
The `sysctl` feature is tracked as part of [features#34](https://github.com/kubernetes/features/issues/34).
Promoting the annotations to fields is one of the goals tracked there.

392
keps/0009-node-heartbeat.md Normal file
View File

@ -0,0 +1,392 @@
---
kep-number: 8
title: Efficient Node Heartbeat
authors:
- "@wojtek-t"
- "with input from @bgrant0607, @dchen1107, @yujuhong, @lavalamp"
owning-sig: sig-node
participating-sigs:
- sig-scalability
- sig-apimachinery
- sig-scheduling
reviewers:
- "@deads2k"
- "@lavalamp"
approvers:
- "@dchen1107"
- "@derekwaynecarr"
editor: TBD
creation-date: 2018-04-27
last-updated: 2018-04-27
status: implementable
see-also:
- https://github.com/kubernetes/kubernetes/issues/14733
- https://github.com/kubernetes/kubernetes/pull/14735
replaces:
- n/a
superseded-by:
- n/a
---
# Efficient Node Heartbeats
## Table of Contents
Table of Contents
=================
* [Efficient Node Heartbeats](#efficient-node-heartbeats)
* [Table of Contents](#table-of-contents)
* [Summary](#summary)
* [Motivation](#motivation)
* [Goals](#goals)
* [Non-Goals](#non-goals)
* [Proposal](#proposal)
* [Risks and Mitigations](#risks-and-mitigations)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
* [Alternatives](#alternatives)
* [Dedicated “heartbeat” object instead of “leader election” one](#dedicated-heartbeat-object-instead-of-leader-election-one)
* [Events instead of dedicated heartbeat object](#events-instead-of-dedicated-heartbeat-object)
* [Reuse the Component Registration mechanisms](#reuse-the-component-registration-mechanisms)
* [Split Node object into two parts at etcd level](#split-node-object-into-two-parts-at-etcd-level)
* [Delta compression in etcd](#delta-compression-in-etcd)
* [Replace etcd with other database](#replace-etcd-with-other-database)
## Summary
Node heartbeats are necessary for the correct functioning of a Kubernetes cluster.
This proposal makes them significantly cheaper from both a scalability and a
performance perspective.
## Motivation
While running different scalability tests we observed that in big enough clusters
(more than 2000 nodes) with a non-trivial number of images used by pods on each
node (10-15), we were hitting etcd limits for its database size. That effectively
means that etcd enters "alert mode" and stops accepting write requests.
The underlying root cause is a combination of:
- etcd keeping both the current state and a transaction log with copy-on-write
- node heartbeats being potentially very large objects (note that images
are only one potential problem; the second is volumes, as customers
want to mount 100+ volumes to a single node) - they may easily exceed 15kB;
even though the patch sent over the network is small, etcd stores the
whole Node object
- Kubelet sending heartbeats every 10s
This proposal presents a proper solution for that problem.
Note that currently (by default):
- Lack of a NodeStatus update for `<node-monitor-grace-period>` (default: 40s)
results in NodeController marking the node as NotReady (pods are no longer
scheduled on that node)
- Lack of NodeStatus updates for `<pod-eviction-timeout>` (default: 5m)
results in NodeController starting pod evictions from that node
We would like to preserve that behavior.
### Goals
- Reduce size of etcd by making node heartbeats cheaper
### Non-Goals
The following are nice-to-haves, but not primary goals:
- Reduce resource usage (cpu/memory) of control plane (e.g. due to processing
less and/or smaller objects)
- Reduce watch-related load on Node objects
## Proposal
We propose introducing a new `Lease` built-in API in the newly created API group
`coordination.k8s.io`. To make it easily reusable for other purposes it will
be namespaced. Its schema will be as follows:
```go
type Lease struct {
metav1.TypeMeta `json:",inline"`
// Standard object's metadata.
// More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata
// +optional
ObjectMeta metav1.ObjectMeta `json:"metadata,omitempty"`
// Specification of the Lease.
// More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#spec-and-status
// +optional
Spec LeaseSpec `json:"spec,omitempty"`
}
type LeaseSpec struct {
HolderIdentity string `json:"holderIdentity"`
LeaseDurationSeconds int32 `json:"leaseDurationSeconds"`
AcquireTime metav1.MicroTime `json:"acquireTime"`
RenewTime metav1.MicroTime `json:"renewTime"`
LeaseTransitions int32 `json:"leaseTransitions"`
}
```
The Spec is effectively a copy of the already existing (and thus proven) [LeaderElectionRecord][].
The only difference is using `MicroTime` instead of `Time` for better precision.
That would hopefully allow us to go directly to Beta.
We will use that object to represent a node heartbeat - for each Node there will
be a corresponding `Lease` object with its Name equal to the Node name, in a newly
created dedicated namespace (we considered using the `kube-system` namespace but
decided that it's already too overloaded).
That namespace should be created automatically (similarly to "default" and
"kube-system", probably by NodeController) and never be deleted (so that nodes
don't require permission for it).
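For illustration only, a per-node heartbeat object under this proposal might look roughly like the sketch below; the group/version and the namespace name are placeholders, since neither is fixed by this KEP:
```yaml
# Illustrative sketch; "coordination.k8s.io/v1beta1" and the namespace name
# "kube-node-lease" are placeholders, not decisions made by this KEP.
apiVersion: coordination.k8s.io/v1beta1
kind: Lease
metadata:
  name: node-1                 # matches the Node name
  namespace: kube-node-lease   # the dedicated, automatically created namespace
spec:
  holderIdentity: node-1
  leaseDurationSeconds: 40
  renewTime: "2018-04-27T10:00:00.000000Z"   # refreshed by the Kubelet every ~10s
```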
We considered using a CRD instead of a built-in API. However, even though CRDs are
`the new way` for creating new APIs, they don't yet have versioning support
and are significantly less performant (due to the current lack of protobuf support).
We also don't know whether we could seamlessly transition storage from a CRD
to a built-in API if we ran into performance or any other problems.
As a result, we decided to proceed with a built-in API.
With this new API in place, we will change Kubelet so that:
1. Kubelet keeps computing NodeStatus every 10s (as it does now), but that will
be independent from reporting status
1. Kubelet reports NodeStatus if:
- there was a meaningful change in it (initially we can probably assume that every
change is meaningful, including e.g. images on the node)
- or it didn't report it over the last `node-status-update-period` seconds
1. Kubelet creates and periodically updates its own Lease object, and the frequency
of those updates is independent from the NodeStatus update frequency.
In the meantime, we will change `NodeController` to treat both updates of the NodeStatus
object as well as updates of the new `Lease` object corresponding to a given
node as a health signal from that Kubelet. This will make it work for both old
and new Kubelets.
We should also:
1. audit all other existing core controllers to verify whether they also don't require
similar changes in their logic ([ttl controller][] being one of the examples)
1. change the controller manager to auto-register the new `Lease` API
1. ensure that the `Lease` resource is deleted when the corresponding node is
deleted (probably via owner references)
1. [out-of-scope] migrate all LeaderElection code to use the new `Lease` API
Once all the code changes are done, we will:
1. start updating the `Lease` object every 10s by default, at the same time
reducing the frequency of NodeStatus updates initially to 40s by default.
We will reduce it further later.
Note that this doesn't reduce the frequency at which Kubelet sends "meaningful"
changes - it only impacts the frequency of "lastHeartbeatTime" changes.
<br> TODO: That still results in higher average QPS. It should be acceptable but
needs to be verified.
1. announce that we are going to reduce the frequency of NodeStatus updates further
and give people 1-2 releases to switch their code to use the `Lease`
object (if they relied on frequent NodeStatus changes)
1. further reduce the NodeStatus update frequency to not less often than once per
minute.
We can't stop periodically updating NodeStatus as it would be an API-breaking change,
but it's fine to reduce its frequency (though we should continue writing it at
least once per eviction period).
To be considered:
1. We may consider reducing the frequency of NodeStatus updates to once every 5 minutes
(instead of 1 minute). That would help with performance/scalability even more.
Caveats:
- NodeProblemDetector is currently updating (some) node conditions every 1 minute
(unconditionally, because lastHeartbeatTime always changes). To make the reduction
of NodeStatus update frequency really useful, we should also change NPD to
work in a similar mode (check periodically whether a condition changed, but report only
when something changed or no status was reported for a given time) and decrease
its reporting frequency too.
- In general, we recommend keeping the frequencies of NodeStatus reporting in both
Kubelet and NodeProblemDetector in sync (once all changes are done), and
that should be reflected in the [NPD documentation][].
- Note that reducing the frequency to 1 minute already gives us an almost 6x improvement.
That seems more than enough for any foreseeable future, assuming we won't
significantly increase the size of the Node object.
Note that if we keep adding node conditions owned by other components, the
number of writes of the Node object will go up. But that issue is separate from
this proposal.
Other notes:
1. An additional advantage of using a Lease for this purpose would be the
ability to exclude it from the audit profile and thus reduce the audit log footprint.
[LeaderElectionRecord]: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/leaderelection/resourcelock/interface.go#L37
[ttl controller]: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/ttl/ttl_controller.go#L155
[NPD documentation]: https://kubernetes.io/docs/tasks/debug-application-cluster/monitor-node-health/
[kubernetes/kubernetes#63667]: https://github.com/kubernetes/kubernetes/issues/63677
### Risks and Mitigations
Reducing the default frequency of NodeStatus updates may potentially break clients
relying on frequent Node object updates. However, in non-managed solutions, customers
will still be able to restore the previous behavior by setting appropriate flag values.
Thus, changing the defaults to what we recommend is the path to go with.
## Graduation Criteria
The API can be immediately promoted to Beta, as it is effectively a copy of
the already existing LeaderElectionRecord. It will be promoted to GA once it has spent
a sufficient amount of time as Beta with no changes.
The changes in components logic (Kubelet, NodeController) should be done behind
a feature gate. We suggest making that enabled by default once the feature is
implemented.
## Implementation History
- YYYY-MM-DD: KEP Summary, Motivation and Proposal merged
## Alternatives
We considered a number of alternatives, most important mentioned below.
### Dedicated “heartbeat” object instead of “leader election” one
Instead of introducing and using a “lease” object, we considered
introducing a dedicated “heartbeat” object for that purpose. Apart from that,
all the details of the solution remain pretty much the same.
Pros:
- Conceptually easier to understand what the object is for
Cons:
- Introduces a new, narrow-purpose API. Lease is already used by other
components, implemented using annotations on Endpoints and ConfigMaps.
### Events instead of dedicated heartbeat object
Instead of introducing a dedicated object, we considered using the “Event” object
for that purpose. At a high level the solution looks very similar.
The differences from the initial proposal are:
- we use the existing “Event” API instead of introducing a new API
- we create a dedicated namespace; events that should be treated as a health
signal by NodeController will be written by Kubelets (unconditionally) to that
namespace
- NodeController will be watching only Events from that namespace to avoid
processing all events in the system (the volume of all events would be huge)
- a dedicated namespace also helps with security - we can give write access to
that namespace only to Kubelets
Pros:
- No need to introduce a new API
- Because of that, we could adopt this approach much earlier.
- We already need to optimize event throughput - the separate etcd instance we have
for events may help with tuning
- Low-risk roll-forward/roll-back: no new object is involved (the node controller
starts watching events, kubelet just reduces the frequency of heartbeats)
Cons:
- Events are conceptually “best-effort” in the system:
- they may be silently dropped in case of problems in the system (the event recorder
library doesn't retry on errors, e.g. to not make things worse when the control plane
is starved)
- currently, components reporting events don't even know whether it succeeded or not (the
library is built in a way that you throw the event into it and are not notified whether
it was successfully submitted or not).
A Kubelet sending any other update has full control over how/whether to retry errors.
- lack of fairness mechanisms means that even when some events are being successfully
sent, there is no guarantee that any event from a given Kubelet will be submitted
over a given time period.
So this would require a different mechanism for reporting those “heartbeat” events.
- Once we have a “request priority” concept, events should have the lowest one,
whereas node heartbeats are one of the most important things in the system.
Even though no particular heartbeat is important, the guarantee that some heartbeats
will be successfully sent is crucial (not delivering any of them results in unnecessary
evictions or not scheduling to a given node), so heartbeats should have the
highest priority.
- No core component in the system is currently watching events
- it would make the system's operation harder to explain
- Users watch Node objects for heartbeats (even though we didn't recommend it).
Introducing a new object for the purpose of heartbeats will allow those users to
migrate, while using events for that purpose breaks that ability. (Watching events
may also put us in a tough situation for performance reasons.)
- Deleting all events (e.g. event etcd failure + playbook response) should continue to
not cause a catastrophic failure and the design will need to account for this.
### Reuse the Component Registration mechanisms
Kubelet is one of the control-plane components (a shared controller). Some time ago, the Component
Registration proposal converged into three parts:
- Introducing an API for registering non-pod endpoints, including readiness information: #18610
- Changing the endpoints controller to also watch those endpoints
- Identifying some of those endpoints as “components”
We could reuse that mechanism to represent Kubelets as non-pod endpoints.
Pros:
- Utilizes desired API
Cons:
- Requires introducing that new API
- Stabilizing the API would take some time
- Implementing that API requires multiple changes in different components
### Split Node object into two parts at etcd level
We may stick to the existing Node API and solve the problem at the storage layer. At a
high level, this means splitting the Node object into two parts in etcd (the frequently
modified part and the rest).
Pros:
- No need to introduce new API
- No need to change any components other than kube-apiserver
Cons:
- Very complicated to support watch
- Not very generic (e.g. splitting Spec and Status doesn't help; it needs to be just
the heartbeat part)
- [minor] Doesn't reduce the amount of data that has to be processed in the system (writes,
reads, watches, …)
### Delta compression in etcd
An alternative to the above is to solve this completely at the etcd layer. To
achieve that, instead of storing full updates in the etcd transaction log, we would just
store “deltas” and snapshot the whole object only every X seconds/minutes.
Pros:
- Doesn't require any changes to any Kubernetes components
Cons:
- Computing the delta is tricky (etcd doesn't understand the Kubernetes data model, and
the delta between two protobuf-encoded objects is not necessarily small)
- May require a major rewrite of etcd code and might not even be accepted by its maintainers
- More expensive computationally to get an object at a given resource version (which
is what e.g. watch is doing)
### Replace etcd with other database
Instead of using etcd, we may also consider using some other open-source solution.
Pros:
- Doesn't require a new API
Cons:
- We don't even know whether a solution exists that solves our problems and can be used.
- Migration will take us years.

View File

@ -1 +1 @@
8
13

View File

@ -0,0 +1,222 @@
---
kep-number: 8
title: Kustomize
authors:
- "@pwittrock"
- "@monopole"
owning-sig: sig-cli
participating-sigs:
- sig-cli
reviewers:
- "@droot"
approvers:
- "@maciej"
editor: "@droot"
creation-date: 2018-05-5
last-updated: 2018-05-5
status: implemented
see-also:
- n/a
replaces:
- kinflate # Old name for kustomize
superseded-by:
- n/a
---
# Kustomize
## Table of Contents
- [Kustomize](#kustomize)
- [Table of Contents](#table-of-contents)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
- [Risks and Mitigations](#risks-and-mitigations)
- [Risks of Not Having a Solution](#risks-of-not-having-a-solution)
- [Graduation Criteria](#graduation-criteria)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [FAQ](#faq)
## Summary
Declarative specification of Kubernetes objects is the recommended way to manage Kubernetes
production workloads; however, gaps in the kubectl tooling force users to write their own scripting and
tooling to augment the declarative tools with preprocessing transformations.
While most of these transformations already exist as imperative kubectl commands, they are not natively accessible
from a declarative workflow.
This KEP describes how `kustomize` addresses this problem by providing a declarative format that gives users access to
the imperative kubectl commands they are already familiar with, natively, from declarative workflows.
## Motivation
The kubectl command provides a CLI for:
- accessing the Kubernetes APIs through json or yaml configuration
- porcelain commands for generating and transforming configuration from command-line flags.
Examples:
- Generate a configmap or secret from a text or binary file
- `kubectl create configmap`, `kubectl create secret`
- Users can manage their configmaps and secrets as text and binary files
- Create or update fields that cut across other fields and objects
- `kubectl label`, `kubectl annotate`
- Users can add and update labels for all objects composing an application
- Transform an existing declarative configuration without forking it
- `kubectl patch`
- Users may generate multiple variations of the same workload
- Transform live resources arbitrarily without auditing
- `kubectl edit`
To create a Secret from a binary file, users must first base64 encode the binary file and then create a Secret yaml
config from the resulting data. Because the source of truth is actually the binary file, not the config,
users must write scripting and tooling to keep the 2 sources consistent.
Instead, users should be able to access the simple, but necessary, functionality available in the imperative
kubectl commands from their declarative workflow.
#### Long standing issues
Kustomize addresses a number of long standing issues in kubectl.
- Declarative enumeration of multiple files [kubernetes/kubernetes#24649](https://github.com/kubernetes/kubernetes/issues/24649)
- Declarative configmap and secret creation: [kubernetes/kubernetes#24744](https://github.com/kubernetes/kubernetes/issues/24744), [kubernetes/kubernetes#30337](https://github.com/kubernetes/kubernetes/issues/30337)
- Configmap rollouts: [kubernetes/kubernetes#22368](https://github.com/kubernetes/kubernetes/issues/22368)
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/tree/master/examples/helloWorld#how-this-works-with-kustomize)
- Name/label scoping and safer pruning: [kubernetes/kubernetes#1698](https://github.com/kubernetes/kubernetes/issues/1698)
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/blob/master/examples/breakfast.md#demo-configure-breakfast)
- Template-free add-on customization: [kubernetes/kubernetes#23233](https://github.com/kubernetes/kubernetes/issues/23233)
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/tree/master/examples/helloWorld#staging-kustomization)
### Goals
- Declarative support for defining ConfigMaps and Secrets generated from binary and text files
- Declarative support for adding or updating cross-cutting fields
- labels & selectors
- annotations
- names (as transformation of the original name)
- Declarative support for applying patches to transform arbitrary fields
- use strategic-merge-patch format
- Ease of integration with CI/CD systems that maintain configuration in a version control repository
as a single source of truth, and take action (build, test, deploy, etc.) when that truth changes (gitops).
### Non-Goals
#### Exposing every imperative kubectl command in a declarative fashion
The scope of kustomize is limited only to functionality gaps that would otherwise prevent users from
defining their workloads in a purely declarative manner (e.g. without writing scripts to perform pre-processing
or linting). Commands such as `kubectl run`, `kubectl create deployment` and `kubectl edit` are unnecessary
in a declarative workflow because a Deployment can easily be managed as declarative config.
#### Providing a simpler facade on top of the Kubernetes APIs
The community has developed a number of facades in front of the Kubernetes APIs using
templates or DSLs. Attempting to provide an alternative interface to the Kubernetes API is
a non-goal. Instead the focus is on:
- Facilitating simple cross-cutting transformations on the raw config that would otherwise require other tooling such
as *sed*
- Generating configuration when the source of truth resides elsewhere
- Patching existing configuration with transformations
## Proposal
### Capabilities
**Note:** This proposal has already been implemented in `github.com/kubernetes/kubectl`.
Define a new meta config format called *kustomization.yaml*.
#### *kustomization.yaml* will allow users to reference config files
- Path to config yaml file (similar to `kubectl apply -f <file>`)
- Urls to config yaml file (similar to `kubectl apply -f <url>`)
- Path to *kustomization.yaml* file (takes the output of running kustomize)
#### *kustomization.yaml* will allow users to generate configs from files
- ConfigMap (`kubectl create configmap`)
- Secret (`kubectl create secret`)
#### *kustomization.yaml* will allow users to apply transformations to configs
- Label (`kubectl label`)
- Annotate (`kubectl annotate`)
- Strategic-Merge-Patch (`kubectl patch`)
- Name-Prefix
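To make the capabilities above concrete, a single *kustomization.yaml* tying them together might look like the following sketch; the field names are assumptions based on the capabilities listed above and may differ from the shipped schema:
```yaml
# Illustrative kustomization.yaml sketch; field spellings are assumptions,
# not a normative schema.
namePrefix: staging-           # Name-Prefix transformation
commonLabels:
  app: hello                   # like `kubectl label`
commonAnnotations:
  owner: sig-cli               # like `kubectl annotate`
resources:                     # referenced config files
- deployment.yaml
- service.yaml
configMapGenerator:            # like `kubectl create configmap`
- name: app-config
  files:
  - config.properties
patches:                       # strategic-merge-patches, like `kubectl patch`
- patch-replicas.yaml
```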
### UX
Kustomize will also contain subcommands to facilitate authoring *kustomization.yaml*.
#### Edit
The edit subcommands will allow users to modify the *kustomization.yaml* through cli commands containing
helpful messaging and documentation.
- Add ConfigMap - like `kubectl create configmap` but declarative in *kustomization.yaml*
- Add Secret - like `kubectl create secret` but declarative in *kustomization.yaml*
- Add Resource - adds a file reference to *kustomization.yaml*
- Set NamePrefix - adds NamePrefix declaration to *kustomization.yaml*
#### Diff
The diff subcommand will allow users to see a diff of the original and transformed configuration files
- Generated config (configmap) will show the files as created
- Transformations (name prefix) will show the files as modified
### Implementation Details/Notes/Constraints [optional]
Kustomize has already been implemented in the `github.com/kubernetes/kubectl` repo, and should be moved to a
separate repo for the subproject.
Kustomize was initially developed as its own CLI; however, once it has matured, it should be published
as a subcommand of kubectl or as a statically linked plugin. It should also be more tightly integrated with apply.
- Create the *kustomize* sig-cli subproject and update sigs.yaml
- Move the existing kustomize code from `github.com/kubernetes/kubectl` to `github.com/kubernetes-sigs/kustomize`
### Risks and Mitigations
### Risks of Not Having a Solution
By not providing a viable option for working directly with Kubernetes APIs as json or
yaml config, we risk the ecosystem becoming fragmented with various bespoke API facades.
By ensuring the raw Kubernetes API json or yaml is a usable approach for declaratively
managing applications, even tools that do not use the Kubernetes API as their native format can
better work with one another through transformation to a common format.
## Graduation Criteria
- Dogfood kustomize by either:
- moving one or more of our own (OSS Kubernetes) services to it.
- getting user feedback from one or more mid or large application deployments using kustomize.
- Publish kustomize as a subcommand of kubectl.
## Implementation History
kustomize was implemented in the kubectl repo before subprojects became a first-class concept in Kubernetes.
The code has been fully implemented, but it must be moved to a proper location.
## Drawbacks
## Alternatives
1. Users write their own bespoke scripts to generate and transform the config before it is applied.
2. Users don't work with the API directly, and use or develop DSLs for interacting with Kubernetes.
## FAQ

View File

@ -4,18 +4,22 @@ title: Cloud Provider Controller Manager
authors:
- "@cheftako"
- "@calebamiles"
- "@hogepodge"
owning-sig: sig-apimachinery
participating-sigs:
- sig-apps
- sig-aws
- sig-azure
- sig-cloud-provider
- sig-gcp
- sig-network
- sig-openstack
- sig-storage
reviewers:
- "@wlan0"
- "@andrewsykim"
- "@calebamiles"
- "@hogepodge"
- "@jagosan"
approvers:
- "@thockin"
editor: TBD
@ -41,16 +45,21 @@ replaces:
- [API Server Changes](#api-server-changes)
- [Volume Management Changes](#volume-management-changes)
- [Deployment Changes](#deployment-changes)
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
- [Repository Requirements](#repository-requirements)
- [Notes for Repository Requirements](#notes-for-repository-requirements)
- [Repository Timeline](#repository-timeline)
- [Security Considerations](#security-considerations)
- [Graduation Criteria](#graduation-criteria)
- [Graduation to Beta](#graduation-to-beta)
- [Process Goals](#process-goals)
- [Implementation History](#implementation-history)
- [Alternatives](#alternatives)
## Summary
We want to remove any cloud provider specific logic from the kubernetes/kubernetes repo. We want to restructure the code
to make is easy for any cloud provider to extend the kubernetes core in a consistent manner for their cloud. New cloud
to make it easy for any cloud provider to extend the kubernetes core in a consistent manner for their cloud. New cloud
providers should look at the [Creating a Custom Cluster from Scratch](https://kubernetes.io/docs/getting-started-guides/scratch/#cloud-provider)
and the [cloud provider interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go#L31)
which will need to be implemented.
@ -208,8 +217,8 @@ taints.
### API Server Changes
Finally, in the kube-apiserver, the cloud provider is used for transferring SSH keys to all of the nodes, and within an a
dmission controller for setting labels on persistent volumes.
Finally, in the kube-apiserver, the cloud provider is used for transferring SSH keys to all of the nodes, and within an
admission controller for setting labels on persistent volumes.
Kube-apiserver uses the cloud provider for two purposes
@ -220,7 +229,7 @@ Kube-apiserver uses the cloud provider for two purposes
Volumes need cloud providers, but they only need **specific** cloud providers. The majority of volume management logic
resides in the controller manager. These controller loops need to be moved into the cloud-controller manager. The cloud
controller manager also needs a mechanism to read parameters for initilization from cloud config. This can be done via
controller manager also needs a mechanism to read parameters for initialization from cloud config. This can be done via
config maps.
There are two entirely different approach to refactoring volumes -
@ -257,6 +266,102 @@ In case of the cloud-controller-manager, the deployment should be deleted using
kubectl delete -f cloud-controller-manager.yml
```
### Implementation Details/Notes/Constraints
#### Repository Requirements
**This is a proposed structure, and may change during the 1.11 release cycle.
WG-Cloud-Provider will work with individual sigs to refine these requirements
to maintain consistency while meeting the technical needs of the provider
maintainers**
Each cloud provider hosted within the `kubernetes` organization shall have a
single repository named `kubernetes/cloud-provider-<provider_name>`. Those
repositories shall have the following structure:
* A `cloud-controller-manager` subdirectory that contains the implementation
of the provider-specific cloud controller.
* A `docs` subdirectory.
* A `docs/cloud-controller-manager.md` file that describes the options and
usage of the cloud controller manager code.
* A `docs/testing.md` file that describes how the provider code is tested.
* A `Makefile` with a `test` entrypoint to run the provider tests.
Additionally, the repository should have:
* A `docs/getting-started.md` file that describes the installation and basic
operation of the cloud controller manager code.
Where the provider has additional capabilities, the repository should have
the following subdirectories that contain the common features:
* `dns` for DNS provider code.
* `cni` for the Container Network Interface (CNI) driver.
* `csi` for the Container Storage Interface (CSI) driver.
* `flex` for the Flex Volume driver.
* `installer` for custom installer code.
Each repository may have additional directories and files that are used for
additional features, including but not limited to:
* Other provider specific testing.
* Additional documentation, including examples and developer documentation.
* Dependencies on provider-hosted or other external code.
##### Notes for Repository Requirements
The purpose of these requirements is to define a common structure for the
cloud provider repositories owned by current and future cloud provider SIGs.
In accordance with the
[WG-Cloud-Provider Charter](https://docs.google.com/document/d/1m4Kvnh_u_9cENEE9n1ifYowQEFSgiHnbw43urGJMB64/edit#)
to "define a set of common expected behaviors across cloud providers", this
proposal defines the location and structure of commonly expected code.
As each provider can and will have additional features that go beyond the expected
common code, these requirements only apply to the location of the
following code:
* Cloud Controller Manager implementations.
* Documentation.
This document may be amended with additional locations that relate to enabling
consistent upstream testing, independent storage drivers, and other code with
common integration hooks.
The development of the
[Cloud Controller Manager](https://github.com/kubernetes/kubernetes/tree/master/cmd/cloud-controller-manager)
and
[Cloud Provider Interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go)
has enabled the provider SIGs to develop external providers that
capture the core functionality of the upstream providers. By defining the
expected locations and naming conventions for where the external provider code
lives, we will create a consistent experience for:
* Users of the providers, who will have easily understandable conventions for
discovering and using all of the providers.
* SIG-Docs, who will have a common hook for building or linking to externally
managed documentation.
* SIG-Testing, who will be able to use common entry points for enabling
provider-specific e2e testing.
* Future cloud provider authors, who will have a common framework and examples
from which to build and share their code base.
##### Repository Timeline
To facilitate community development, providers named in the
[Makes SIGs responsible for implementations of `CloudProvider`](https://github.com/kubernetes/community/pull/1862)
patch can immediately migrate their external provider work into their named
repositories.
Each provider will work to implement the required structure during the
Kubernetes 1.11 development cycle, with conformance by the 1.11 release.
WG-Cloud-Provider may actively change repository requirements during the
1.11 release cycle to respond to collective SIG technical needs.
After the 1.11 release all current and new provider implementations must
conform with the requirements outlined in this document.
### Security Considerations
Make sure that you consider the impact of this feature from the point of view of Security.
@ -307,6 +412,20 @@ is proposed to
- serve as a repository for user experience reports related to Cloud Providers
which live within the Kubernetes GitHub organization or desire to do so
Major milestones:
- March 18, 2018: Accepted proposal for repository requirements.
*Major milestones in the life cycle of a KEP should be tracked in `Implementation History`.
Major milestones might include
- the `Summary` and `Motivation` sections being merged signaling SIG acceptance
- the `Proposal` section being merged signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded*
The ultimate intention of WG Cloud Provider is to prevent multiple classes
of software purporting to be an implementation of the Cloud Provider interface
from fracturing the Kubernetes Community while also ensuring that new Cloud

View File

@ -0,0 +1,145 @@
---
kep-number: draft-20180412
title: Kubeadm Config Draft
authors:
- "@liztio"
owning-sig: sig-cluster-lifecycle
participating-sigs: []
reviewers:
- "@timothysc"
approvers:
- TBD
editor: TBD
creation-date: 2018-04-12
last-updated: 2018-04-12
status: draft
see-also: []
replaces: []
superseded-by: []
---
# Kubeadm Config to Beta
## Table of Contents
A table of contents is helpful for quickly jumping to sections of a KEP and for highlighting any additional information provided beyond the standard KEP template.
<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
**Table of Contents**
- [Kubeadm Config to Beta](#kubeadm-config-to-beta)
- [Table of Contents](#table-of-contents)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [User Stories [optional]](#user-stories-optional)
- [As a user upgrading with Kubeadm, I want the upgrade process to not fail with unfamiliar configuration.](#as-a-user-upgrading-with-kubeadm-i-want-the-upgrade-process-to-not-fail-with-unfamiliar-configuration)
- [As an infrastructure system using kubeadm, I want to be able to write configuration files that always work.](#as-an-infrastructure-system-using-kubeadm-i-want-to-be-able-to-write-configuration-files-that-always-work)
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
- [Risks and Mitigations](#risks-and-mitigations)
- [Graduation Criteria](#graduation-criteria)
- [Implementation History](#implementation-history)
- [Alternatives](#alternatives)
<!-- markdown-toc end -->
## Summary
Kubeadm uses MasterConfiguration for two distinct but similar operations: initialising a new cluster and upgrading an existing cluster.
The former is typically created by hand by an administrator.
It is stored on disk and passed to `kubeadm init` via a command line flag.
The latter is produced by kubeadm using supplied configuration files, command line options, and internal defaults.
It is stored in a ConfigMap so upgrade operations can find it.
Right now the configuration format is unversioned.
This means configuration file formats can change between kubeadm versions and there's no safe way to update the configuration format.
We propose a stable versioning of this configuration, `v1alpha2` and eventually `v1beta1`.
Version information will be _mandatory_ going forward, both for user-generated configuration files and machine-generated configuration maps.
There is an [existing document][config] describing current Kubernetes best practices around component configuration.
[config]: https://docs.google.com/document/d/1FdaEJUEh091qf5B98HM6_8MS764iXrxxigNIdwHYW9c/edit#heading=h.nlhhig66a0v6
## Motivation
After 1.10.0, we discovered a bug in the upgrade process.
The `MasterConfiguration` embedded a [struct that had changed][proxyconfig], which caused a backwards-incompatible change to the configuration format.
This caused `kubeadm upgrade` to fail, because a newer version of kubeadm was attempting to deserialise an older version of the struct.
Because the configuration is often written and read by different versions of kubeadm compiled by different versions of kubernetes,
it's very important for this configuration file to be well-versioned.
[proxyconfig]: https://github.com/kubernetes/kubernetes/commit/57071d85ee2c27332390f0983f42f43d89821961
### Goals
* kubeadm init fails if a configuration file isn't versioned
* the config map written out contains a version
* the configuration struct does not embed any other structs
* existing configuration files are converted on upgrade to a known, stable version
* structs should be sparsely populated
* all structs should have reasonable defaults so an empty config is still sensible
### Non-Goals
* kubeadm is able to read and write configuration files for older and newer versions of kubernetes than it was compiled with
* substantially changing the schema of the `MasterConfiguration`
## Proposal
The concrete proposal is as follows.
1. Immediately start writing Kind and Version information into the `MasterConfiguration` struct.
2. Define the previous (1.9) version of the struct as `v1alpha1`.
3. Duplicate the KubeProxyConfig struct that caused the schema change, adding the old version to the `v1alpha1` struct.
4. Create a new `v1alpha2` directory mirroring the existing [`v1alpha1`][v1alpha1], which matches the 1.10 schema.
This version need not duplicate the file as well.
5. Warn users if their configuration files do not have a version and kind.
6. Use [apimachinery's conversion][conversion] library to design migrations from the old (`v1alpha1`) version to the new (`v1alpha2`) version.
7. Determine the changes for `v1beta1`.
8. With `v1beta1`, enforce the presence of version numbers in config files and ConfigMaps, erroring if not present.
[conversion]: https://godoc.org/k8s.io/apimachinery/pkg/conversion
[v1alpha1]: https://github.com/kubernetes/kubernetes/tree/d7d4381961f4eb2a4b581160707feb55731e324e/cmd/kubeadm/app/apis/kubeadm
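For illustration, a user-supplied configuration file under the versioned scheme might start like the sketch below; the mandatory `apiVersion` and `kind` lines are the point here, while the remaining fields are placeholders rather than part of this proposal:
```yaml
# Illustrative sketch; apiVersion/kind carry the required versioning
# information, the other fields are placeholders.
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.0
api:
  advertiseAddress: 192.168.0.10
```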
### User Stories [optional]
#### As a user upgrading with Kubeadm, I want the upgrade process to not fail with unfamiliar configuration.
In the past, the haphazard nature of the versioning system meant it was hard to provide strong guarantees between versions.
Implementing strong version guarantees means any given configuration generated in the past by kubeadm will work with a future version of kubeadm.
Deprecations can happen in the future in well-regulated ways.
#### As an infrastructure system using kubeadm, I want to be able to write configuration files that always work.
Having a configuration file format that changes without notice makes it very difficult to write software that integrates with kubeadm.
By providing strong version guarantees, we can ensure that the files these tools produce will work with a given version of kubeadm.
### Implementation Details/Notes/Constraints
The incident that caused the breakage in alpha wasn't a field changed in kubeadm; it was a struct [referenced][struct] inside the `MasterConfiguration` struct.
By completely owning our own configuration, changes in the rest of the project can't unknowingly affect us.
When we do need to interface with the rest of the project, we will do so explicitly in code and be protected by the compiler.
[struct]: https://github.com/kubernetes/kubernetes/blob/d7d4381961f4eb2a4b581160707feb55731e324e/cmd/kubeadm/app/apis/kubeadm/v1alpha1/types.go#L285
### Risks and Mitigations
Moving to a strongly versioned configuration from a weakly versioned one must be done carefully so as not to break kubeadm for existing users.
We can start requiring versions of the existing `v1alpha1` format, issuing warnings to users when Version and Kind aren't present.
These fields can be used today; they're simply ignored.
In the future, we could require them, and transition to using `v1alpha1`.
## Graduation Criteria
This KEP can be considered complete once all currently supported versions of Kubeadm write out `v1beta1`-version structs.
## Implementation History
## Alternatives
Rather than creating our own copies of all structs in the `MasterConfiguration` struct, we could instead continue embedding the structs.
To provide our guarantees, we would have to invest a lot more in automated testing for upgrades.

View File

@ -0,0 +1,126 @@
---
kep-number: 10
title: Graduate CoreDNS to GA
authors:
- "@johnbelamaric"
- "@rajansandeep"
owning-sig: sig-network
participating-sigs:
- sig-cluster-lifecycle
reviewers:
- "@bowei"
- "@thockin"
approvers:
- "@thockin"
editor: "@rajansandeep"
creation-date: 2018-03-21
last-updated: 2018-05-18
status: provisional
see-also: https://github.com/kubernetes/community/pull/2167
---
# Graduate CoreDNS to GA
## Table of Contents
* [Summary](#summary)
* [Motivation](#motivation)
* [Goals](#goals)
* [Non-Goals](#non-goals)
* [Proposal](#proposal)
* [User Cases](#use-cases)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
## Summary
CoreDNS is a sister CNCF project and the successor to SkyDNS, on which kube-dns is based. It is a flexible, extensible
authoritative DNS server and directly integrates with the Kubernetes API. It can serve as cluster DNS,
complying with the [dns spec](https://git.k8s.io/dns/docs/specification.md). As an independent project,
it is more actively developed than kube-dns and offers performance and functionality beyond what kube-dns has. For more details, see the [introductory presentation](https://docs.google.com/presentation/d/1v6Coq1JRlqZ8rQ6bv0Tg0usSictmnN9U80g8WKxiOjQ/edit#slide=id.g249092e088_0_181), or [coredns.io](https://coredns.io), or the [CNCF webinar](https://youtu.be/dz9S7R8r5gw).
Currently, we are following the road-map defined [here](https://github.com/kubernetes/features/issues/427). CoreDNS is Beta in Kubernetes v1.10, where it can be installed as an alternative to kube-dns.
The purpose of this proposal is to graduate CoreDNS to GA.
## Motivation
* CoreDNS is more flexible and extensible than kube-dns.
* CoreDNS is easily extensible and maintainable using a plugin architecture.
* CoreDNS has fewer moving parts than kube-dns, taking advantage of the plugin architecture, making it a single executable and single process.
* It is written in Go, making it memory-safe (kube-dns includes dnsmasq which is not).
* CoreDNS has [better performance](https://github.com/kubernetes/community/pull/1100#issuecomment-337747482) than [kube-dns](https://github.com/kubernetes/community/pull/1100#issuecomment-338329100) in terms of greater QPS, lower latency, and lower memory consumption.
### Goals
* Graduate CoreDNS to GA.
* Make CoreDNS available as an image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
May be deferred to [next KEP](https://github.com/kubernetes/community/pull/2167) if goal not achieved in time.
* Provide a kube-dns to CoreDNS upgrade path with configuration translation in `kubeadm`.
* Provide a CoreDNS to CoreDNS upgrade path in `kubeadm`.
### Non-Goals
* Translation of CoreDNS ConfigMap back to kube-dns (i.e., downgrade).
* Translation of kube-dns configuration that is defined outside of the kube-dns ConfigMap to equivalent CoreDNS configuration. For example, modifications to the manifest or `dnsmasq` configuration.
* Fate of kube-dns in future releases, i.e. deprecation path.
* Making [CoreDNS the default](https://github.com/kubernetes/community/pull/2167) in every installer.
## Proposal
The proposed solution is to enable the selection of CoreDNS as a GA cluster service discovery DNS for Kubernetes.
Some of the most used deployment tools have been upgraded by the CoreDNS team, in cooperation with the owners of these tools, to be able to deploy CoreDNS:
* kubeadm
* kube-up
* minikube
* kops
For other tools, each maintainer would have to add support for deploying CoreDNS.
### Use Cases
* CoreDNS supports all functionality of kube-dns and also addresses [several use-cases kube-dns lacks](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/network/coredns.md#use-cases). Some of the Use Cases are as follows:
* Supporting [Autopath](https://coredns.io/plugins/autopath/), which reduces the high query load caused by the long DNS search path in Kubernetes.
* Making an alias for an external name [#39792](https://github.com/kubernetes/kubernetes/issues/39792)
* By default, the user experience would be unchanged. For more advanced uses, existing users would need to modify the ConfigMap that contains the CoreDNS configuration file.
* Since CoreDNS has more features than kube-dns, there will be no path to retain the CoreDNS configuration in case a user wants to switch back to kube-dns.
#### Configuring CoreDNS
The CoreDNS configuration file is called a `Corefile` and syntactically is the same as a [Caddyfile](https://caddyserver.com/docs/caddyfile). The file consists of multiple stanzas called _server blocks_.
Each of these represents a set of zones for which that server block should respond, along with the list of plugins to apply to a given request. More details on this can be found in the
[Corefile Explained](https://coredns.io/2017/07/23/corefile-explained/) and [How Queries Are Processed](https://coredns.io/2017/06/08/how-queries-are-processed-in-coredns/) blog entries.
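As an illustration only (not the exact default shipped by any installer), a cluster's Corefile stored in a ConfigMap could look roughly like this, assuming the commonly used errors, health, kubernetes, prometheus, proxy, and cache plugins:
```yaml
# Illustrative sketch of a CoreDNS ConfigMap; the exact plugin set and
# defaults are chosen by each installer, not mandated by this KEP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
    }
```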
The following can be expected when CoreDNS is graduated to GA.
#### Kubeadm
* The CoreDNS feature-gates flag will be marked as GA.
* As Kubeadm maintainers chose to deploy CoreDNS as the default Cluster DNS for Kubernetes 1.11:
* CoreDNS will be installed by default in a fresh install of Kubernetes via kubeadm.
* For users upgrading Kubernetes via kubeadm, it will install CoreDNS by default whether the user had kube-dns or CoreDNS in the previous Kubernetes version.
* In case a user wants to install kube-dns instead of CoreDNS, they have to set the CoreDNS feature gate to false: `--feature-gates=CoreDNS=false`
* When choosing to install CoreDNS, the configmap of a previously installed kube-dns will be automatically translated to the equivalent CoreDNS configmap.
#### Kube-up
* CoreDNS will be installed when the environment variable `CLUSTER_DNS_CORE_DNS` is set to `true`. The default value is `false`.
#### Minikube
* CoreDNS to be an option in the add-on manager, with CoreDNS disabled by default.
## Graduation Criteria
* Verify that all e2e conformance and DNS-related tests (xxx-kubernetes-e2e-gce, ci-kubernetes-e2e-gce-gci-ci-master, filtered by `--ginkgo.skip=\\[Slow\\]|\\[Serial\\]|\\[Disruptive\\]|\\[Flaky\\]|\\[Feature:.+\\]`) run successfully for CoreDNS.
None of the tests that pass with kube-dns should fail with CoreDNS.
* Add CoreDNS as part of the e2e Kubernetes scale runs and ensure tests are not failing.
* Extend [perf-tests](https://github.com/kubernetes/perf-tests/tree/master/dns) for CoreDNS.
* Add dedicated DNS-related tests to the e2e scalability tests ([Feature:performance]).
## Implementation History
* 20170912 - [Feature proposal](https://github.com/kubernetes/features/issues/427) for CoreDNS to be implemented as the default DNS in Kubernetes.
* 20171108 - Successfully released [CoreDNS as an Alpha feature-gate in Kubernetes v1.9](https://github.com/kubernetes/kubernetes/pull/52501).
* 20180226 - CoreDNS graduation to Incubation in CNCF.
* 20180305 - Support for kube-dns ConfigMap translation and promotion of [CoreDNS to Beta](https://github.com/kubernetes/kubernetes/pull/58828) for Kubernetes v1.10.

View File

@ -0,0 +1,574 @@
---
kep-number: TBD
title: IPVS Load Balancing Mode in Kubernetes
status: implemented
authors:
- "@rramkumar1"
owning-sig: sig-network
reviewers:
- "@thockin"
- "@m1093782566"
approvers:
- "@thockin"
- "@m1093782566"
editor:
- "@thockin"
- "@m1093782566"
creation-date: 2018-03-21
---
# IPVS Load Balancing Mode in Kubernetes
**Note: This is a retroactive KEP. Credit goes to @m1093782566, @haibinxie, and @quinton-hoole for all information & design in this KEP.**
**Important References: https://github.com/kubernetes/community/pull/692/files**
## Table of Contents
* [Summary](#summary)
* [Motivation](#motivation)
* [Goals](#goals)
* [Non\-goals](#non-goals)
* [Proposal](#proposal)
* [Kube-Proxy Parameter Changes](#kube-proxy-parameter-changes)
* [Build Changes](#build-changes)
* [Deployment Changes](#deployment-changes)
* [Design Considerations](#design-considerations)
* [IPVS service network topology](#ipvs-service-network-topology)
* [Port remapping](#port-remapping)
* [Falling back to iptables](#falling-back-to-iptables)
* [Supporting NodePort service](#supporting-nodeport-service)
* [Supporting ClusterIP service](#supporting-clusterip-service)
* [Supporting LoadBalancer service](#supporting-loadbalancer-service)
* [Session Affinity](#session-affinity)
* [Cleaning up inactive rules](#cleaning-up-inactive-rules)
* [Sync loop pseudo code](#sync-loop-pseudo-code)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
* [Drawbacks](#drawbacks)
* [Alternatives](#alternatives)
## Summary
We are adding a new implementation of kube-proxy built on top of IPVS (IP Virtual Server).
## Motivation
As Kubernetes grows in usage, the scalability of its resources becomes more and more
important. In particular, the scalability of services is paramount to the adoption of Kubernetes
by developers/companies running large workloads. Kube-proxy, the building block of service routing,
has relied on the battle-hardened iptables to implement the core supported service types such as
ClusterIP and NodePort. However, iptables struggles to scale to tens of thousands of services because
it is designed purely for firewalling purposes and is based on in-kernel rule chains. On the
other hand, IPVS is specifically designed for load balancing and uses more efficient data structures
under the hood. For more information on the performance benefits of IPVS vs. iptables, take a look
at these [slides](https://docs.google.com/presentation/d/1BaIAywY2qqeHtyGZtlyAp89JIZs59MZLKcFLxKE6LyM/edit?usp=sharing).
### Goals
* Improve the performance of services
### Non-goals
None
### Challenges and Open Questions [optional]
None
## Proposal
### Kube-Proxy Parameter Changes
***Parameter: --proxy-mode***
In addition to existing userspace and iptables modes, IPVS mode is configured via --proxy-mode=ipvs. In the initial implementation, it implicitly uses IPVS [NAT](http://www.linuxvirtualserver.org/VS-NAT.html) mode.
***Parameter: --ipvs-scheduler***
A new kube-proxy parameter will be added to specify the IPVS load balancing algorithm: --ipvs-scheduler. If it's not configured, then round-robin (rr) is the default value. If it's incorrectly configured, then kube-proxy will exit with an error message.
* rr: round-robin
* lc: least connection
* dh: destination hashing
* sh: source hashing
* sed: shortest expected delay
* nq: never queue
For more details, refer to http://kb.linuxvirtualserver.org/wiki/Ipvsadm
In the future, we can implement a service-specific scheduler (potentially via an annotation), which would have higher priority and override this value.
***Parameter: --cleanup-ipvs***
Similar to the --cleanup-iptables parameter, if true, clean up the IPVS configuration and iptables rules that were created in IPVS mode.
***Parameter: --ipvs-sync-period***
Maximum interval of how often IPVS rules are refreshed (e.g. '5s', '1m'). Must be greater than 0.
***Parameter: --ipvs-min-sync-period***
Minimum interval of how often the IPVS rules are refreshed (e.g. '5s', '1m'). Must be greater than 0.
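Assuming the flags above are mirrored in the kube-proxy component configuration (an assumption for illustration only, not something this proposal defines), the IPVS mode could be selected roughly as follows:
```yaml
# Illustrative sketch of a kube-proxy component config; field names are
# assumptions mirroring the command-line parameters described above.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"        # --ipvs-scheduler
  syncPeriod: "30s"      # --ipvs-sync-period
  minSyncPeriod: "5s"    # --ipvs-min-sync-period
```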
### Build Changes
No changes at all. The IPVS implementation is built on the [docker/libnetwork](https://godoc.org/github.com/docker/libnetwork/ipvs) IPVS library, which is a pure-golang implementation and talks to the kernel via socket communication.
### Deployment Changes
IPVS kernel module installation is beyond the scope of Kubernetes. It's assumed that the IPVS kernel modules are installed on the node before running kube-proxy. When kube-proxy starts in IPVS mode, it validates whether the IPVS modules are installed on the node. If they are not installed, kube-proxy falls back to the iptables proxy mode.
### Design Considerations
#### IPVS service network topology
We will create a dummy interface and assign all Kubernetes service ClusterIPs to the dummy interface (the default name is `kube-ipvs0`). For example,
```shell
# ip link add kube-ipvs0 type dummy
# ip addr
...
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 26:1f:cc:f8:cd:0f brd ff:ff:ff:ff:ff:ff
#### Assume 10.102.128.4 is service Cluster IP
# ip addr add 10.102.128.4/32 dev kube-ipvs0
...
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
inet 10.102.128.4/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
```
Note that the relationship between a Kubernetes service and an IPVS service is `1:N`. Consider a Kubernetes service with more than one access IP: for example, a service with an external IP has 2 access IPs (the cluster IP and the external IP), so the IPVS proxier will create 2 IPVS services - one for the cluster IP and one for the external IP.
The relationship between a Kubernetes endpoint and an IPVS destination is `1:1`.
Deleting a Kubernetes service will trigger deletion of the corresponding IPVS service(s) and of the address bound to the dummy interface.
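To make the `1:N` mapping concrete, here is a hypothetical sketch (addresses made up) of a service whose cluster IP is `10.102.128.4` and whose external IP is `192.168.100.10`; the IPVS proxier would create two IPVS services sharing the same destination:
```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.102.128.4:3080 rr
  -> 10.244.0.235:8080            Masq    1      0          0
TCP  192.168.100.10:3080 rr
  -> 10.244.0.235:8080            Masq    1      0          0
```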
#### Port remapping
There are 3 forwarding modes in IPVS - NAT (masq), IPIP, and DR. Only NAT mode supports port remapping, so we will use IPVS NAT mode. The following example shows IPVS mapping service port `3080` to container port `8080`.
```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.102.128.4:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
-> 10.244.1.237:8080 Masq 1 0 0
```
#### Falling back to iptables
The IPVS proxier will employ iptables for packet filtering, SNAT, and supporting NodePort type services. Specifically, the IPVS proxier will fall back on iptables in the following 4 scenarios:
* kube-proxy starts with `--masquerade-all=true`
* A cluster CIDR is specified at kube-proxy startup
* `LoadBalancerSourceRanges` is specified for a LoadBalancer (LB) type service
* NodePort type services need to be supported
In addition, the IPVS proxier will maintain 5 Kubernetes-specific chains in the nat table:
- KUBE-POSTROUTING
- KUBE-MARK-MASQ
- KUBE-MARK-DROP
- KUBE-SERVICES
- KUBE-NODEPORTS
`KUBE-POSTROUTING`, `KUBE-MARK-MASQ`, and `KUBE-MARK-DROP` are maintained by kubelet, and the IPVS proxier won't create them. The IPVS proxier will make sure the `KUBE-SERVICES` and `KUBE-NODEPORTS` chains exist in its sync loop.
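A minimal sketch of what "making sure a chain exists" amounts to is shown below; this is illustrative only, since the proxier manipulates iptables programmatically rather than by shelling out:
```shell
# Create each required chain in the nat table only if it does not already exist.
iptables -t nat -L KUBE-SERVICES >/dev/null 2>&1 || iptables -t nat -N KUBE-SERVICES
iptables -t nat -L KUBE-NODEPORTS >/dev/null 2>&1 || iptables -t nat -N KUBE-NODEPORTS
```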
**1. kube-proxy starts with --masquerade-all=true**
If kube-proxy starts with `--masquerade-all=true`, the IPVS proxier will masquerade all traffic accessing a service cluster IP, behaving the same as the iptables proxier.
Suppose there is a service with cluster IP `10.244.5.1` and port `8080`:
```shell
# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
Chain KUBE-MARK-MASQ (6 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 10.244.5.1 /* default/foo:http cluster IP */ tcp dpt:8080
```
**2. Specify cluster CIDR in kube-proxy startup**
If kube-proxy starts with `--cluster-cidr=<cidr>`, the IPVS proxier will masquerade off-cluster traffic accessing a service cluster IP, behaving the same as the iptables proxier.
Suppose kube-proxy is given the cluster CIDR `10.244.16.0/24`, and the service cluster IP is `10.244.5.1` with port `8080`:
```shell
# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
Chain KUBE-MARK-MASQ (6 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !10.244.16.0/24 10.244.5.1 /* default/foo:http cluster IP */ tcp dpt:8080
```
**3. Load Balancer Source Ranges is specified for LB type service**
When a service's `LoadBalancerStatus.ingress.IP` is not empty and its `LoadBalancerSourceRanges` is specified, the IPVS proxier will install iptables rules like those shown below.
Suppose the service's `LoadBalancerStatus.ingress.IP` is `10.96.1.2` and its `LoadBalancerSourceRanges` is `10.120.2.0/24`:
```shell
# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
Chain KUBE-MARK-MASQ (6 references)
target prot opt source destination
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-SERVICES (2 references)
target prot opt source destination
ACCEPT tcp -- 10.120.2.0/24 10.96.1.2 /* default/foo:http loadbalancer IP */ tcp dpt:8080
DROP tcp -- 0.0.0.0/0 10.96.1.2 /* default/foo:http loadbalancer IP */ tcp dpt:8080
```
**4. Support NodePort type service**
Please check the section below.
#### Supporting NodePort service
To support NodePort type services, the IPVS proxier will reuse the existing implementation from the iptables proxier. For example:
```shell
# kubectl describe svc nginx-service
Name: nginx-service
...
Type: NodePort
IP: 10.101.28.148
Port: http 3080/TCP
NodePort: http 31604/TCP
Endpoints: 172.17.0.2:80
Session Affinity: None
# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !172.16.0.0/16 10.101.28.148 /* default/nginx-service:http cluster IP */ tcp dpt:3080
KUBE-SVC-6IM33IEVEEV7U3GP tcp -- 0.0.0.0/0 10.101.28.148 /* default/nginx-service:http cluster IP */ tcp dpt:3080
KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */ tcp dpt:31604
KUBE-SVC-6IM33IEVEEV7U3GP tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */ tcp dpt:31604
Chain KUBE-SVC-6IM33IEVEEV7U3GP (2 references)
target prot opt source destination
KUBE-SEP-Q3UCPZ54E6Q2R4UT all -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */
Chain KUBE-SEP-Q3UCPZ54E6Q2R4UT (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 172.17.0.2 0.0.0.0/0 /* default/nginx-service:http */
DNAT
```
#### Supporting ClusterIP service
When creating a ClusterIP type service, the IPVS proxier will do 3 things:
* make sure the dummy interface exists on the node
* bind the service cluster IP to the dummy interface
* create an IPVS service whose address corresponds to the Kubernetes service cluster IP.
For example,
```shell
# kubectl describe svc nginx-service
Name: nginx-service
...
Type: ClusterIP
IP: 10.102.128.4
Port: http 3080/TCP
Endpoints: 10.244.0.235:8080,10.244.1.237:8080
Session Affinity: None
# ip addr
...
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
inet 10.102.128.4/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.102.128.4:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
-> 10.244.1.237:8080 Masq 1 0 0
```
#### Supporting LoadBalancer service
The IPVS proxier will NOT bind the LB's ingress IP to the dummy interface. When creating a LoadBalancer type service, the IPVS proxier will do 4 things:
- Make sure the dummy interface exists on the node
- Bind the service cluster IP to the dummy interface
- Create an ipvs service whose address corresponds to the Kubernetes service cluster IP
- For each of the LB's ingress IPs, create an ipvs service whose address corresponds to that ingress IP
For example,
```shell
# kubectl describe svc nginx-service
Name: nginx-service
...
IP: 10.102.128.4
Port: http 3080/TCP
Endpoints: 10.244.0.235:8080
Session Affinity: None
#### Only bind Cluster IP to dummy interface
# ip addr
...
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
inet 10.102.128.4/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
#### Suppose the LB's ingress IPs are {10.96.1.2, 10.96.1.3}. The IPVS proxier will create 1 ipvs service for the cluster IP and 2 ipvs services for the LB's ingress IPs. Each ipvs service has its own destination.
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.102.128.4:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
TCP 10.96.1.2:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
TCP 10.96.1.3:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
```
Since there is a need to support access control for `LB.ingress.IP`, the IPVS proxier will fall back on iptables. iptables will drop any packet that is not from `LB.LoadBalancerSourceRanges`. For example:
```shell
# iptables -A KUBE-SERVICES -d {ingress.IP} --dport {service.Port} -s {LB.LoadBalancerSourceRanges} -j ACCEPT
```
When a packet reaches the end of the chain, the IPVS proxier will drop it:
```shell
# iptables -A KUBE-SERVICES -d {ingress.IP} --dport {service.Port} -j KUBE-MARK-DROP
```
#### Supporting Only NodeLocal Endpoints
Similar to the iptables proxier, when a service carries the "Only NodeLocal Endpoints" annotation, the IPVS proxier will only proxy traffic to endpoints on the local node.
```shell
# kubectl describe svc nginx-service
Name: nginx-service
...
IP: 10.102.128.4
Port: http 3080/TCP
Endpoints: 10.244.0.235:8080, 10.244.1.235:8080
Session Affinity: None
#### Assume only endpoint 10.244.0.235:8080 is in the same host with kube-proxy
#### There should be 1 destination for ipvs service.
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.102.128.4:3080 rr
-> 10.244.0.235:8080 Masq 1 0 0
```
#### Session affinity
IPVS supports client-IP session affinity (persistent connections). When a service specifies session affinity, the IPVS proxier will set a timeout value (180 minutes = 10800 seconds by default) on the IPVS service. For example:
```shell
# kubectl describe svc nginx-service
Name: nginx-service
...
IP: 10.102.128.4
Port: http 3080/TCP
Session Affinity: ClientIP
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.102.128.4:3080 rr persistent 10800
```
#### Cleaning up inactive rules
It is difficult to distinguish whether an IPVS service was created by the IPVS proxier or by another process. Currently we assume that IPVS rules on a node are created only by the IPVS proxier, so we can clear all IPVS rules on the node. We should add warnings to the documentation and flag comments.
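As a rough sketch, the cleanup triggered by `--cleanup-ipvs=true` amounts to something like the following (illustrative commands only; the proxier performs these steps through its IPVS and netlink libraries rather than by shelling out):
```shell
# Flush all IPVS virtual services on the node (assumes only the IPVS proxier created them).
ipvsadm -C
# Remove the dummy interface that service addresses were bound to.
ip link delete kube-ipvs0
```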
#### Sync loop pseudo code
Similar to the iptables proxier, the IPVS proxier will perform a full sync at a configured interval. In addition, each update to a Kubernetes service or endpoint will trigger a corresponding IPVS service or destination update. For example:
* Creating a Kubernetes service will trigger creating a new IPVS service.
* Updating a Kubernetes service (for instance, changing session affinity) will trigger updating the existing IPVS service.
* Deleting a Kubernetes service will trigger deleting an IPVS service.
* Adding an endpoint for a Kubernetes service will trigger adding a destination for an existing IPVS service.
* Updating an endpoint for a Kubernetes service will trigger updating a destination for an existing IPVS service.
* Deleting an endpoint for a Kubernetes service will trigger deleting a destination for an existing IPVS service.
Any IPVS service or destination update is sent to the kernel as an update command via socket communication, which won't take the service down.
The sync loop pseudo code is shown below:
```go
func (proxier *Proxier) syncProxyRules() {
	// When a service or endpoint update arrives, sync ipvs rules and iptables rules as needed.
	// Ensure the dummy interface exists; if not, create it.
	for svcName, svcInfo := range proxier.serviceMap {
		// Capture the clusterIP:
		//   construct an ipvs service from svcInfo
		//   set the session affinity flag and timeout value on the ipvs service if session affinity is specified
		//   bind the cluster IP to the dummy interface
		//   call the libnetwork API to create the ipvs service and destinations

		// Capture externalIPs:
		//   if the external IP is local, hold svcInfo.Port so that ipvs rules can be installed on it
		//   construct an ipvs service from svcInfo
		//   set the session affinity flag and timeout value on the ipvs service if session affinity is specified
		//   call the libnetwork API to create the ipvs service and destinations

		// Capture load-balancer ingress.
		for _, ingress := range svcInfo.LoadBalancerStatus.Ingress {
			if ingress.IP != "" {
				if len(svcInfo.LoadBalancerSourceRanges) != 0 {
					// install the specific iptables rules
				}
				// construct an ipvs service from svcInfo
				// set the session affinity flag and timeout value on the ipvs service if session affinity is specified
				// call the libnetwork API to create the ipvs service and destinations
			}
		}

		// Capture nodeports.
		if svcInfo.NodePort != 0 {
			// fall back on iptables, reusing the existing iptables proxier implementation
		}

		// Call the libnetwork API to clean up legacy ipvs services that are no longer active.
		// Unbind stale service addresses from the dummy interface.
		// Clean up legacy iptables chains and rules.
	}
}
```
## Graduation Criteria
### Beta -> GA
The following requirements should be met before moving from Beta to GA. It is
suggested to file an issue which tracks all the action items.
- [ ] Testing
- [ ] 48 hours of green e2e tests.
- [ ] Flakes must be identified and filed as issues.
- [ ] Integrate with scale tests. Failures should be filed as issues.
- [ ] Development work
- [ ] Identify all pending changes/refactors. Release blockers must be prioritized and fixed.
- [ ] Identify all bugs. Release blocking bugs must be identified and fixed.
- [ ] Docs
- [ ] All user-facing documentation must be updated.
### GA -> Future
__TODO__
## Implementation History
**In chronological order**
1. https://github.com/kubernetes/kubernetes/pull/46580
2. https://github.com/kubernetes/kubernetes/pull/52528
3. https://github.com/kubernetes/kubernetes/pull/54219
4. https://github.com/kubernetes/kubernetes/pull/57268
5. https://github.com/kubernetes/kubernetes/pull/58052
## Drawbacks [optional]
None
## Alternatives [optional]
None
View File
@ -0,0 +1,88 @@
---
kep-number: 11
title: Switch CoreDNS to the default DNS
authors:
- "@johnbelamaric"
- "@rajansandeep"
owning-sig: sig-network
participating-sigs:
- sig-cluster-lifecycle
reviewers:
- "@bowei"
- "@thockin"
approvers:
- "@thockin"
editor: "@rajansandeep"
creation-date: 2018-05-18
last-updated: 2018-05-18
status: provisional
---
# Switch CoreDNS to the default DNS
## Table of Contents
* [Summary](#summary)
* [Goals](#goals)
* [Proposal](#proposal)
* [User Cases](#use-cases)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
## Summary
CoreDNS is now well established as the DNS service in Kubernetes, having started as an alpha feature in Kubernetes v1.9 and graduated to GA in v1.11.
After successfully implementing the road-map defined [here](https://github.com/kubernetes/features/issues/427), CoreDNS is GA in Kubernetes v1.11 and can be installed as an alternative to kube-dns in tools such as kubeadm, kops, minikube, and kube-up.
Following the [KEP to graduate CoreDNS to GA](https://github.com/kubernetes/community/pull/1956), the purpose of this proposal is to make CoreDNS the default DNS for Kubernetes, replacing kube-dns.
## Goals
* Make CoreDNS the default DNS for Kubernetes for all the remaining install tools (kube-up, kops, minikube).
* Make CoreDNS available as an image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
This goal is carried over from the [previous KEP](https://github.com/kubernetes/community/pull/1956), in case it cannot be completed there.
## Proposal
The proposed solution is to enable CoreDNS as the default cluster service discovery DNS for Kubernetes.
Some of the most widely used deployment tools will be updated by the CoreDNS team, in cooperation with the owners of those tools, so that they can deploy CoreDNS as the default:
* kubeadm (already done for Kubernetes v1.11)
* kube-up
* minikube
* kops
For other tools, each maintainer would have to add the switch to CoreDNS themselves.
### Use Cases
Use cases for CoreDNS have been well defined in the [previous KEP](https://github.com/kubernetes/community/pull/1956).
The following can be expected when CoreDNS is made the default DNS:
#### Kubeadm
* CoreDNS is already the default DNS as of Kubernetes v1.11 and shall continue to be the default DNS.
* If users want to install kube-dns instead of CoreDNS, they have to set the CoreDNS feature gate to `false` (`--feature-gates=CoreDNS=false`), as sketched below.
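A minimal sketch of the opt-out, assuming a fresh cluster is being created with `kubeadm init` (all other flags omitted):
```shell
# Keep kube-dns instead of CoreDNS when initializing a cluster with kubeadm (v1.11+).
kubeadm init --feature-gates=CoreDNS=false
```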
#### Kube-up
* CoreDNS will now become the default DNS.
* To install kube-dns in place of CoreDNS, set the environment variable `CLUSTER_DNS_CORE_DNS` to `false`, as sketched below.
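A minimal sketch of the opt-out, assuming a kube-up based deployment launched from a kubernetes/kubernetes checkout (the script path is an assumption):
```shell
# Deploy kube-dns instead of CoreDNS with kube-up.
CLUSTER_DNS_CORE_DNS=false ./cluster/kube-up.sh
```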
#### Minikube
* CoreDNS will be enabled by default in the add-on manager, with kube-dns disabled by default.
#### Kops
* CoreDNS will now become the default DNS.
## Graduation Criteria
* Add CoreDNS image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
* Have a certain number (To Be Defined) of clusters of significant size (To Be Defined) adopting and running CoreDNS as their default DNS.
## Implementation History
* 20170912 - [Feature proposal](https://github.com/kubernetes/features/issues/427) for CoreDNS to be implemented as the default DNS in Kubernetes.
* 20171108 - Successfully released [CoreDNS as an Alpha feature-gate in Kubernetes v1.9](https://github.com/kubernetes/kubernetes/pull/52501).
* 20180226 - CoreDNS graduation to Incubation in CNCF.
* 20180305 - Support for Kube-dns configmap translation and move up [CoreDNS to Beta](https://github.com/kubernetes/kubernetes/pull/58828) for Kubernetes v1.10.
* 20180515 - CoreDNS was added as [GA and the default DNS in kubeadm](https://github.com/kubernetes/kubernetes/pull/63509) for Kubernetes v1.11.
View File
@ -1,4 +1,4 @@
# Meet Our Contributors - Ask Us Anything!
# Meet Our Contributors - Ask Us Anything!
When Slack seems like its going too fast, and you just need a quick answer from a human...
@ -6,18 +6,18 @@ Meet Our Contributors gives you a monthly one-hour opportunity to ask questions
## When:
Every first Wednesday of the month at the following times. Grab a copy of the calendar to yours from [kubernetes.io/community](https://kubernetes.io/community/)
* 03:30pm UTC
* 09:00pm UTC
* 02:30pm UTC
* 08:00pm UTC
Tune into the [Kubernetes YouTube Channel](https://www.youtube.com/c/KubernetesCommunity/live) to follow along with video and [#meet-our-contributors](https://kubernetes.slack.com/messages/meet-our-contributors) on Slack for questions and discourse.
Tune into the [Kubernetes YouTube Channel](https://www.youtube.com/c/KubernetesCommunity/live) to follow along with video and [#meet-our-contributors](https://kubernetes.slack.com/messages/meet-our-contributors) on Slack for questions and discourse.
## Whats on-topic:
## Whats on-topic:
* How our contributors got started with k8s
* Advice for getting attention on your PR
* GitHub tooling and automation
* Your first commit
* kubernetes/community
* Testing
* Testing
## Whats off-topic:
* End-user questions (Check out [#office-hours](https://kubernetes.slack.com/messages/office-hours) on slack and details [here](/events/office-hours.md))
@ -33,15 +33,13 @@ Questions will be on a first-come, first-served basis. First half will be dedica
### Code snip / PR for peer code review / Suggestion for part of codebase walk through:
* At least 24 hours before the session to slack channel (#meet-our-contributors)
Problems will be picked based on time commitment needed, skills of the reviewer, and if a large amount are submitted, need for the project.
Problems will be picked based on time commitment needed, skills of the reviewer, and if a large amount are submitted, need for the project.
## Call for Volunteers:
Contributors - [sign up to answer questions!](https://goo.gl/uhEJ33)
Contributors - [sign up to answer questions!](https://goo.gl/uhEJ33)
Expectations of volunteers:
* Be on 5 mins early. You can look at questions in the queue by joining the #meet-our-contributors slack channel to give yourself some prep.
* Expect questions about the contribution process, membership, navigating the kubernetes seas, testing, and general questions about you and your path to open source/kubernetes. It's ok if you don't know the answer!
* We will be using video chat (zoom but live streaming through YouTube) but voice only is fine if you are more comfortable with that.
* Be willing to provide suggestions and feedback to make this better!
View File
@ -70,7 +70,7 @@ Each organization should have the following teams:
- `foo-reviewers`: granted read access to the `foo` repo; intended to be used as
a notification mechanism for interested/active contributors for the `foo` repo
- a `bots` team
- should contain bots such as @k8s-ci-robot and @linuxfoundation that are
- should contain bots such as @k8s-ci-robot and @thelinuxfoundation that are
necessary for org and repo automation
- an `owners` team
- should be populated by everyone who has `owner` privileges to the org
View File
@ -21,7 +21,7 @@ the Linux Foundation CNCF CLA check for your repositories, please read on.
- Pull request: checked
- Issue comment: checked
- Active: checked
1. Add the [@linuxfoundation](https://github.com/linuxfoundation) GitHub user as an **Owner**
1. Add the [@thelinuxfoundation](https://github.com/thelinuxfoundation) GitHub user as an **Owner**
to your organization or repo to ensure the CLA status can be applied on PR's
1. After you send an invite, contact the [Linux Foundation](mailto:helpdesk@rt.linuxfoundation.org); and cc [Chris Aniszczyk](mailto:caniszczyk@linuxfoundation.org), [Ihor Dvoretskyi](mailto:ihor@cncf.io), [Eric Searcy](mailto:eric@linuxfoundation.org) (to ensure that the invite gets accepted).
1. Finally, open up a test PR to check that:
View File
@ -47,34 +47,34 @@ The following subprojects are owned by sig-api-machinery:
- **universal-machinery**
- Owners:
- https://raw.githubusercontent.com/kubernetes/apimachinery/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apimachinery/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apimachinery/OWNERS
- **server-frameworks**
- Owners:
- https://raw.githubusercontent.com/kubernetes/apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apiserver/OWNERS
- **server-crd**
- Owners:
- https://raw.githubusercontent.com/kubernetes/apiextensions-apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apiextensions-apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apiextensions-apiserver/OWNERS
- **server-api-aggregation**
- Owners:
- https://raw.githubusercontent.com/kubernetes/kube-aggregator/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/kube-aggregator/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/kube-aggregator/OWNERS
- **server-sdk**
- Owners:
- https://raw.githubusercontent.com/kubernetes/sample-apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/sample-apiserver/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/sample-apiserver/OWNERS
- https://raw.githubusercontent.com/kubernetes/sample-controller/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/sample-controller/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/sample-controller/OWNERS
- https://raw.githubusercontent.com/kubernetes-incubator/apiserver-builder/master/OWNERS
- **idl-schema-client-pipeline**
- Owners:
- https://raw.githubusercontent.com/kubernetes/gengo/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/code-generator/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/code-generator/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/code-generator/OWNERS
- https://raw.githubusercontent.com/kubernetes/kube-openapi/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/api/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/api/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/api/OWNERS
- https://raw.githubusercontent.com/kubernetes-client/gen/master/OWNERS
- **kubernetes-clients**
- Owners:
@ -90,7 +90,7 @@ The following subprojects are owned by sig-api-machinery:
- https://raw.githubusercontent.com/kubernetes-client/typescript/master/OWNERS
- https://raw.githubusercontent.com/kubernetes-incubator/client-python/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/client-go/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/client-go/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/client-go/OWNERS
- **universal-utils**
- Owners:
- https://raw.githubusercontent.com/kubernetes/utils/master/OWNERS
View File
@ -21,7 +21,7 @@ The Architecture SIG maintains and evolves the design principles of Kubernetes,
The Chairs of the SIG run operations and processes governing the SIG.
* Brian Grant (**[@bgrant0607](https://github.com/bgrant0607)**), Google
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
## Contact
* [Slack](https://kubernetes.slack.com/messages/sig-architecture)
View File
@ -11,7 +11,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
A Special Interest Group for building, deploying, maintaining, supporting, and using Kubernetes on Azure.
## Meetings
* Regular SIG Meeting: [Wednesdays at 16:00 UTC](https://zoom.us/j/2015551212) (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=16:00&tz=UTC).
* Regular SIG Meeting: [Wednesdays at 16:00 UTC](https://zoom.us/j/2015551212) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=16:00&tz=UTC).
* [Meeting notes and Agenda](https://docs.google.com/document/d/1SpxvmOgHDhnA72Z0lbhBffrfe9inQxZkU9xqlafOW9k/edit).
* [Meeting recordings](https://www.youtube.com/watch?v=yQLeUKi_dwg&list=PL69nYSiGNLP2JNdHwB8GxRs2mikK7zyc4).
@ -20,9 +20,15 @@ A Special Interest Group for building, deploying, maintaining, supporting, and u
### Chairs
The Chairs of the SIG run operations and processes governing the SIG.
* Jason Hansen (**[@slack](https://github.com/slack)**), Microsoft
* Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**), Red Hat
* Shubheksha Jalan (**[@shubheksha](https://github.com/shubheksha)**), Microsoft
### Technical Leads
The Technical Leads of the SIG establish new subprojects, decommission existing
subprojects, and resolve cross-subproject technical issues and decisions.
* Kal Khenidak (**[@khenidak](https://github.com/khenidak)**), Microsoft
* Cole Mickens (**[@colemickens](https://github.com/colemickens)**), Red Hat
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
## Contact
* [Slack](https://kubernetes.slack.com/messages/sig-azure)
@ -47,7 +53,13 @@ Monitor these for Github activity if you are not a member of the team.
| Team Name | Details | Google Groups | Description |
| --------- |:-------:|:-------------:| ----------- |
| @kubernetes/sig-azure-api-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-azure-api-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-api-reviews) | API Changes and Reviews |
| @kubernetes/sig-azure-bugs | [link](https://github.com/orgs/kubernetes/teams/sig-azure-bugs) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-bugs) | Bug Triage and Troubleshooting |
| @kubernetes/sig-azure-feature-requests | [link](https://github.com/orgs/kubernetes/teams/sig-azure-feature-requests) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-feature-requests) | Feature Requests |
| @kubernetes/sig-azure-misc | [link](https://github.com/orgs/kubernetes/teams/sig-azure-misc) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-misc) | General Discussion |
| @kubernetes/sig-azure-pr-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-azure-pr-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-pr-reviews) | PR Reviews |
| @kubernetes/sig-azure-proposals | [link](https://github.com/orgs/kubernetes/teams/sig-azure-proposals) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-proposals) | Design Proposals |
| @kubernetes/sig-azure-test-failures | [link](https://github.com/orgs/kubernetes/teams/sig-azure-test-failures) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-test-failures) | Test Failures and Triage |
<!-- BEGIN CUSTOM CONTENT -->
100
sig-azure/charter.md Normal file
View File
@ -0,0 +1,100 @@
# SIG Azure Charter
_The following is a charter for the Kubernetes Special Interest Group for Azure. It delineates the roles of SIG leadership, SIG members, as well as the organizational processes for the SIG, both as they relate to project management and technical processes for SIG subprojects._
## Roles
### SIG Chairs
- Run operations and processes governing the SIG
- Seed members established at SIG founding
- Chairs MAY decide to step down at anytime and propose a replacement. Use lazy consensus amongst chairs with fallback on majority vote to accept proposal. This SHOULD be supported by a majority of SIG Members.
- Chairs MAY select additional chairs through a [super-majority] vote amongst chairs. This SHOULD be supported by a majority of SIG Members.
- Chairs MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed if not proactively working with other chairs to fulfill responsibilities. Coordinated leaves of absence serve as exception to this requirement.
- Number: 2 - 3
- Defined in [sigs.yaml]
### SIG Technical Leads
- Establish new subprojects
- Decommission existing subprojects
- Resolve cross-subproject technical issues and decisions
- Technical Leads MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed if not proactively working with other chairs to fulfill responsibilities. Coordinated leaves of absence serve as exception to this requirement.
- Number: 2 - 3
- Defined in [sigs.yaml]
### Subproject Owners
- Scoped to a subproject defined in [sigs.yaml]
- Seed members established at subproject founding
- MUST be an escalation point for technical discussions and decisions in the subproject
- MUST set milestone priorities or delegate this responsibility
- MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months. Coordinated leaves of absence serve as exception to this requirement.
- MAY be removed if not proactively working with other Subproject Owners to fulfill responsibilities.
- MAY decide to step down at anytime and propose a replacement. Use [lazy-consensus] amongst subproject owners with fallback on majority vote to accept proposal. This SHOULD be supported by a majority of subproject contributors (those having some role in the subproject).
- MAY select additional subproject owners through a [super-majority] vote amongst subproject owners. This SHOULD be supported by a majority of subproject contributors (through [lazy-consensus] with fallback on voting).
- Number: 3 - 5
- Defined in [sigs.yaml] [OWNERS] files
**IMPORTANT**
_With regards to leadership roles i.e., Chairs, Technical Leads, and Subproject Owners, we MUST, as a SIG, ensure that positions are held by a committee of members across a diverse set of companies. This allows for thoughtful discussion and structural management that can serve the needs of every consumer of Kubernetes on Azure._
### Members
- MUST maintain health of at least one subproject or the health of the SIG
- MUST show sustained contributions to at least one subproject or to the SIG
- SHOULD hold some documented role or responsibility in the SIG and / or at least one subproject (e.g. reviewer, approver, etc)
- MAY build new functionality for subprojects
- MAY participate in decision making for the subprojects they hold roles in
- Includes all reviewers and approvers in [OWNERS] files for subprojects
## Organizational management
- SIG meets bi-weekly on zoom with agenda in meeting notes
- SHOULD be facilitated by chairs unless delegated to specific Members
- SIG overview and deep-dive sessions organized for Kubecon
- SHOULD be organized by chairs unless delegated to specific Members
- Contributing instructions defined in the SIG CONTRIBUTING.md
### Project management
#### Subproject creation
Subprojects
may be created by [KEP] proposal and accepted by [lazy-consensus] with fallback on majority vote of SIG Technical Leads. The result SHOULD be supported by the majority of SIG members.
- KEP MUST establish subproject owners
- [sigs.yaml] MUST be updated to include subproject information and [OWNERS] files with subproject owners
- Where subprojects processes differ from the SIG governance, they MUST document how
- e.g., if subprojects release separately - they must document how release and planning is performed
Subprojects must define how releases are performed and milestones are set.
Example:
- Release milestones
- Follows the kubernetes/kubernetes release milestones and schedule
- Priorities for upcoming release are discussed during the SIG meeting following the preceding release and shared through a PR. Priorities are finalized before feature freeze.
- Code and artifacts are published as part of the kubernetes/kubernetes release
### Technical processes
Subprojects of the SIG MUST use the following processes unless explicitly following alternatives they have defined.
- Proposing and making decisions
- Proposals sent as [KEP] PRs and published to Google group as announcement
- Follow [KEP] decision making process
- Test health
- Canonical health of code published to
- Consistently broken tests automatically send an alert to
- SIG members are responsible for responding to broken tests alert. PRs that break tests should be rolled back if not fixed within 24 hours (business hours).
- Test dashboard checked and reviewed at the start of each SIG meeting. Owners assigned for any broken tests and followed up during the next SIG meeting.
Issues impacting multiple subprojects in the SIG should be resolved by SIG Technical Leads, with fallback to consensus of subproject owners.
[lazy-consensus]: http://communitymgt.wikia.com/wiki/Lazy_consensus
[super-majority]: https://en.wikipedia.org/wiki/Supermajority#Two-thirds_vote
[KEP]: https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml#L1454
[OWNERS]: contributors/devel/owners.md
View File
@ -40,6 +40,9 @@ The following subprojects are owned by sig-cli:
- Owners:
- https://raw.githubusercontent.com/kubernetes/kubectl/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/kubectl/OWNERS
- **kustomize**
- Owners:
- https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/OWNERS
## GitHub Teams

View File

@ -0,0 +1,100 @@
# SIG Cloud Provider Charter
## Mission
The Cloud Provider SIG ensures that the Kubernetes ecosystem is evolving in a way that is neutral to all (public and private) cloud providers. It will be responsible for establishing standards and requirements that must be met by all providers to ensure optimal integration with Kubernetes.
## Subprojects & Areas of Focus
* Maintaining the parts of the Kubernetes project that allow Kubernetes to integrate with the underlying provider. This includes, but is not limited to:
* [cloud provider interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go)
* [cloud-controller-manager](https://github.com/kubernetes/kubernetes/tree/master/cmd/cloud-controller-manager)
* Deployment tooling which has historically resided under [cluster/](https://github.com/kubernetes/kubernetes/tree/release-1.11/cluster)
* Code ownership for all cloud providers that fall under the kubernetes organization and have opted to be subprojects of SIG Cloud Provider. Following the guidelines around subprojects we anticipate providers will have full autonomy to maintain their own repositories, however, official code ownership will still belong to SIG Cloud Provider.
* [cloud-provider-azure](https://github.com/kubernetes/cloud-provider-azure)
* [cloud-provider-gcp](https://github.com/kubernetes/cloud-provider-gcp)
* [cloud-provider-openstack](https://github.com/kubernetes/cloud-provider-openstack)
* [cloud-provider-vsphere](https://github.com/kubernetes/cloud-provider-vsphere)
* Standards for documentation that should be included by all providers.
* Defining processes/standards for E2E tests that should be reported by all providers
* Developing future functionality in Kubernetes to support use cases common to all providers while also allowing custom and pluggable implementations when required, some examples include but are not limited to:
* Extendable node status and machine states based on provider
* Extendable node address types based on provider
* See also [Cloud Controller Manager KEP](https://github.com/kubernetes/community/blob/master/keps/0002-controller-manager.md)
* The collection of user experience reports from Kubernetes operators running on provider subprojects; and the delivery of roadmap information to SIG PM
## Organizational Management
* Six months after this charter is first ratified, it MUST be reviewed and re-approved by the SIG in order to evaluate the assumptions made in its initial drafting
* SIG meets bi-weekly on zoom with agenda in meeting notes.
* SHOULD be facilitated by chairs unless delegated to specific Members
* The SIG MUST make a best effort to provide leadership opportunities to individuals who represent different races, national origins, ethnicities, genders, abilities, sexual preferences, ages, backgrounds, levels of educational achievement, and socioeconomic statuses
## Subproject Creation
Each Kubernetes provider will (eventually) be a subproject under SIG Cloud Provider. To add new sub projects (providers), SIG Cloud Provider will maintain an open list of requirements that must be satisfied.
The current requirements can be seen [here](https://github.com/kubernetes/community/blob/master/keps/0002-controller-manager.md#repository-requirements). Each provider subproject is entitled to create 1..N repositories related to cluster turn up or operation on their platform, subject to technical standards set by SIG Cloud Provider.
Creation of a repository SHOULD follow the KEP process to preserve the motivation for the repository and any additional instructions for how other SIGs (e.g., SIG Documentation and SIG Release) should interact with the repository.
Subprojects that fall under SIG Cloud Provider may also be features in Kubernetes that are requested or needed by all, or at least a large majority, of providers. The creation process for these subprojects will follow the usual KEP process.
## Subproject Retirement
Subprojects representing Kubernetes providers may be retired given they do not satisfy requirements for more than 6 months. Final decisions for retirement should be supported by a majority of SIG members using [lazy consensus](http://communitymgt.wikia.com/wiki/Lazy_consensus). Once retired any code related to that provider will be archived into the kubernetes-retired organization.
Subprojects representing Kubernetes features may be retired at any point given a lack of development or a lack of demand. Final decisions for retirement should be supported by a majority of SIG members, ideally from every provider. Once retired, any code related to that subproject will be archived into the kubernetes-retired organization.
## Technical Processes
Subprojects (providers) of the SIG MUST use the following processes unless explicitly following alternatives they have defined.
* Proposals will be sent as [KEP](https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md) PRs, and published to the official group mailing list as an announcement
* Proposals, once submitted, SHOULD be placed on the next full meeting agenda
* Decisions within the scope of individual subprojects should be made by lazy consensus by subproject owners, with fallback to majority vote by subproject owners; if a decision can't be made, it should be escalated to the SIG Chairs
* Issues impacting multiple subprojects in the SIG should be resolved by consensus of the owners of the involved subprojects; if a decision can't be made, it should be escalated to the SIG Chairs
## Roles
The following roles are required for the SIG to function properly. In the event that any role is unfilled, the SIG will make a best effort to fill it. Any decisions reliant on a missing role will be postponed until the role is filled.
### Chairs
* 3 chairs are required
* Run operations and processes governing the SIG
* An initial set of chairs was established at the time the SIG was founded.
* Chairs MAY decide to step down at anytime and propose a replacement, who must be approved by all of the other chairs. This SHOULD be supported by a majority of SIG Members.
* Chairs MAY select additional chairs using lazy consensus amongst SIG Members.
* Chairs MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed by consensus of the other Chairs and members if not proactively working with other Chairs to fulfill responsibilities.
* Chairs WILL be asked to step down if there is inappropriate behavior or code of conduct issues
* SIG Cloud Provider cannot have more than one chair from any one company.
### Subproject/Provider Owners
* There should be at least 1 representative per subproject/provider (though 3 is recommended to avoid deadlock) as specified in the OWNERS file of each cloud provider repository.
* MUST be an escalation point for technical discussions and decisions in the subproject/provider
* MUST set milestone priorities or delegate this responsibility
* MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed by consensus of other subproject owners and Chairs if not proactively working with other Subproject Owners to fulfill responsibilities.
* MAY decide to step down at anytime and propose a replacement. This can be done by updating the OWNERS file for any subprojects.
* MAY select additional subproject owners by updating the OWNERs file.
* WILL be asked to step down if there is inappropriate behavior or code of conduct issues
### SIG Members
Approvers and reviewers in the OWNERS file of all subprojects under SIG Cloud Provider.
## Long Term Goals
The long-term goal of SIG Cloud Provider is to promote a vendor-neutral ecosystem for our community. Vendors wanting to support Kubernetes should feel as empowered to do so
as any of today's existing cloud providers; more importantly, the SIG will ensure a high-quality user experience across providers. The SIG will act as a central group for developing
the Kubernetes project in a way that ensures all providers share common privileges and responsibilities. Below are some concrete goals on how SIG Cloud Provider plans to accomplish this.
### Consolidating Existing Cloud SIGs
SIG Cloud Provider will aim to eventually consolidate existing cloud provider SIGs and have each provider instead form a subproject under it. The subprojects would drive the development of
individual providers and work closely with SIG Cloud Provider to ensure compatibility with Kubernetes. With this model, code ownership for new and existing providers will belong to SIG Cloud Provider,
limiting SIG sprawl as more providers support Kubernetes. Existing SIGs representing cloud providers are highly encouraged to opt in as subprojects under SIG Cloud Provider but are not required to do so.
As a SIG opts in, it will operate to ensure a smooth transition, typically over the course of 3 release cycles.
### Supporting New Cloud Providers
One of the primary goals of SIG Cloud Provider is to become an entrypoint for new providers wishing to support Kubernetes on their platform and ensuring technical excellence from each of those providers.
SIG Cloud Provider will accomplish this by maintaining documentation around how new providers can get started and managing the set of requirements that must be met to onboard them. In addition to
onboarding new providers, the entire lifecycle of providers would also fall under the responsibility of SIG Cloud Provider, which may involve clean up work if a provider decides to no longer support Kubernetes.
View File
@ -0,0 +1,6 @@
reviewers:
- sig-cloud-provider-leads
approvers:
- sig-cloud-provider-leads
labels:
- sig/cloud-provider
View File
@ -0,0 +1,75 @@
<!---
This is an autogenerated file!
Please do not edit this file directly, but instead make changes to the
sigs.yaml file in the project root.
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
-->
# Cloud Provider Special Interest Group
Ensures that the Kubernetes ecosystem is evolving in a way that is neutral to all (public and private) cloud providers. It will be responsible for establishing standards and requirements that must be met by all providers to ensure optimal integration with Kubernetes.
## Meetings
* Regular SIG Meeting: [Wednesdays at 10:00 PT (Pacific Time)](https://zoom.us/my/sigcloudprovider) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=10:00&tz=PT%20%28Pacific%20Time%29).
* [Meeting notes and Agenda](TODO).
* [Meeting recordings](TODO).
## Leadership
### Chairs
The Chairs of the SIG run operations and processes governing the SIG.
* Andrew Sy Kim (**[@andrewsykim](https://github.com/andrewsykim)**), DigitalOcean
* Chris Hoge (**[@hogepodge](https://github.com/hogepodge)**), OpenStack Foundation
* Jago Macleod (**[@jagosan](https://github.com/jagosan)**), Google
## Contact
* [Slack](https://kubernetes.slack.com/messages/sig-cloud-provider)
* [Mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider)
* [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fcloud-provider)
## Subprojects
The following subprojects are owned by sig-cloud-provider:
- **kubernetes-cloud-provider**
- Owners:
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/cmd/cloud-controller-manager/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/controller/cloud/OWNERS
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/cloudprovider/OWNERS
- **cloud-provider-azure**
- Owners:
- https://raw.githubusercontent.com/kubernetes/cloud-provider-azure/master/OWNERS
- **cloud-provider-gcp**
- Owners:
- https://raw.githubusercontent.com/kubernetes/cloud-provider-gcp/master/OWNERS
- **cloud-provider-openstack**
- Owners:
- https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/OWNERS
- **cloud-provider-vsphere**
- Owners:
- https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/OWNERS
## GitHub Teams
The below teams can be mentioned on issues and PRs in order to get attention from the right people.
Note that the links to display team membership will only work if you are a member of the org.
The google groups contain the archive of Github team notifications.
Mentioning a team on Github will CC its group.
Monitor these for Github activity if you are not a member of the team.
| Team Name | Details | Google Groups | Description |
| --------- |:-------:|:-------------:| ----------- |
| @kubernetes/sig-cloud-provider-api-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-api-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-api-reviews) | API Changes and Reviews |
| @kubernetes/sig-cloud-provider-bugs | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-bugs) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-bugs) | Bug Triage and Troubleshooting |
| @kubernetes/sig-cloud-provider-feature-requests | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-feature-requests) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-feature-requests) | Feature Requests |
| @kubernetes/sig-cloud-provider-maintainers | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-maintainers) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-maintainers) | Cloud Providers Maintainers |
| @kubernetes/sig-cloud-providers-misc | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-providers-misc) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-providers-misc) | General Discussion |
| @kubernetes/sig-cloud-provider-pr-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-pr-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-pr-reviews) | PR Reviews |
| @kubernetes/sig-cloud-provider-proposals | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-proposals) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-proposals) | Design Proposals |
| @kubernetes/sig-cloud-provider-test-failures | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-test-failures) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-test-failures) | Test Failures and Triage |
<!-- BEGIN CUSTOM CONTENT -->
<!-- END CUSTOM CONTENT -->
View File
@ -21,7 +21,7 @@ Promote operability and interoperability of Kubernetes clusters. We focus on sha
The Chairs of the SIG run operations and processes governing the SIG.
* Rob Hirschfeld (**[@zehicle](https://github.com/zehicle)**), RackN
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
## Contact
* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)
View File
@ -23,7 +23,7 @@ Project | Owner(s)/Lead(s) | Description | Q1, Q2, Later
[Meet Our Contributors](/mentoring/meet-our-contributors.md) | @parispittman | Monthly web series similar to user office hours that allows anyone to ask new and current contributors questions about our process, ecosystem, or their stories in open source | Q1 - ongoing
[Outreachy](/mentoring/README.md) | @parispittman | Document new features, create new conceptual content, create new user paths | Q1
[Google Summer of Code](/mentoring/google-summer-of-code.md) | @nikhita | Kubernetes participation in Google Summer of Code for students | Q1 - ongoing
["Buddy" Program](https://github.com/kubernetes/community/issues/1803) | @parispittman, @chrisshort | 1 hour 1:1 sessions for new and current contributors to have dedicated time; meet our contributors but personal | Q2
["Buddy" Program](https://github.com/kubernetes/community/issues/1803) | @parispittman, @chris-short | 1 hour 1:1 sessions for new and current contributors to have dedicated time; meet our contributors but personal | Q2
## Contributor Documentation
Ensure the contribution process is well documented, discoverable, and consistent across repos to deliver the best contributor experience.
View File
@ -22,8 +22,8 @@ In order to standardize Special Interest Group efforts, create maximum transpare
### Prerequisites
* Propose the new SIG publicly, including a brief mission statement, by emailing kubernetes-dev@googlegroups.com and kubernetes-users@googlegroups.com, then wait a couple of days for feedback
* Ask a repo maintainer to create a github label, if one doesn't already exist: sig/foo
* Propose the new SIG publicly, including a brief mission statement, by emailing kubernetes-dev@googlegroups.com and kubernetes-users@googlegroups.com, then wait a couple of days for feedback.
* Ask a repo maintainer to create a github label, if one doesn't already exist: sig/foo.
* Request a new [kubernetes.slack.com](http://kubernetes.slack.com) channel (#sig-foo) from the #slack-admins channel. New users can join at [slack.kubernetes.io](http://slack.kubernetes.io).
* Slack activity is archived at [kubernetes.slackarchive.io](http://kubernetes.slackarchive.io). To start archiving a new channel invite the slackarchive bot to the channel via `/invite @slackarchive`
* Organize video meetings as needed. No need to wait for the [Weekly Community Video Conference](community/README.md) to discuss. Please report summary of SIG activities there.
@ -54,7 +54,7 @@ Create Google Groups at [https://groups.google.com/forum/#!creategroup](https://
* Create groups using the name conventions below;
* Groups should be created as e-mail lists with at least three owners (including sarahnovotny at google.com and ihor.dvoretskyi at gmail.com);
* To add the owners, visit the Group Settings (drop-down menu on the right side), select Direct Add Members on the left side and add Sarah and Ihor via email address (with a suitable welcome message); in Members/All Members select Ihor and Sarah and assign to an "owner role";
* Set "View topics", "Post", "Join the Group" permissions to be "Public"
* Set "View topics", "Post", "Join the Group" permissions to be "Public";
Name convention:
View File
@ -24,15 +24,16 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
|------|-------|--------|---------|----------|
|[API Machinery](sig-api-machinery/README.md)|api-machinery|* [Daniel Smith](https://github.com/lavalamp), Google<br>* [David Eads](https://github.com/deads2k), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-api-machinery)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-api-machinery)|* Regular SIG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/apimachinery)<br>
|[Apps](sig-apps/README.md)|apps|* [Matt Farina](https://github.com/mattfarina), Samsung SDS<br>* [Adnan Abdulhussein](https://github.com/prydonius), Bitnami<br>* [Kenneth Owens](https://github.com/kow3ns), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-apps)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-apps)|* Regular SIG Meeting: [Mondays at 9:00 PT (Pacific Time) (weekly)](https://zoom.us/my/sig.apps)<br>* (charts) Charts Chat: [Tuesdays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/166909412)<br>* (helm) Helm Developer call: [Thursdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/j/4526666954)<br>
|[Architecture](sig-architecture/README.md)|architecture|* [Brian Grant](https://github.com/bgrant0607), Google<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-architecture)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-architecture)|* Regular SIG Meeting: [Thursdays at 15:30 UTC (weekly)](https://zoom.us/j/9690526922)<br>
|[Architecture](sig-architecture/README.md)|architecture|* [Brian Grant](https://github.com/bgrant0607), Google<br>* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-architecture)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-architecture)|* Regular SIG Meeting: [Thursdays at 15:30 UTC (weekly)](https://zoom.us/j/9690526922)<br>
|[Auth](sig-auth/README.md)|auth|* [Eric Chiang](https://github.com/ericchiang), Red Hat<br>* [Jordan Liggitt](https://github.com/liggitt), Red Hat<br>* [Tim Allclair](https://github.com/tallclair), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-auth)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-auth)|* Regular SIG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8s.sig.auth)<br>
|[Autoscaling](sig-autoscaling/README.md)|autoscaling|* [Marcin Wielgus](https://github.com/mwielgus), Google<br>* [Solly Ross](https://github.com/directxman12), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-autoscaling)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-autoscaling)|* Regular SIG Meeting: [Mondays at 14:00 UTC (biweekly/triweekly)](https://zoom.us/my/k8s.sig.autoscaling)<br>
|[AWS](sig-aws/README.md)|aws|* [Justin Santa Barbara](https://github.com/justinsb)<br>* [Kris Nova](https://github.com/kris-nova), Heptio<br>* [Bob Wise](https://github.com/countspongebob), AWS<br>|* [Slack](https://kubernetes.slack.com/messages/sig-aws)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-aws)|* Regular SIG Meeting: [Fridays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8ssigaws)<br>
|[Azure](sig-azure/README.md)|azure|* [Jason Hansen](https://github.com/slack), Microsoft<br>* [Cole Mickens](https://github.com/colemickens), Red Hat<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-azure)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-azure)|* Regular SIG Meeting: [Wednesdays at 16:00 UTC (weekly)](https://zoom.us/j/2015551212)<br>
|[Azure](sig-azure/README.md)|azure|* [Stephen Augustus](https://github.com/justaugustus), Red Hat<br>* [Shubheksha Jalan](https://github.com/shubheksha), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-azure)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-azure)|* Regular SIG Meeting: [Wednesdays at 16:00 UTC (biweekly)](https://zoom.us/j/2015551212)<br>
|[Big Data](sig-big-data/README.md)|big-data|* [Anirudh Ramanathan](https://github.com/foxish), Google<br>* [Erik Erlandson](https://github.com/erikerlandson), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-big-data)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-big-data)|* Regular SIG Meeting: [Wednesdays at 17:00 UTC (weekly)](https://zoom.us/my/sig.big.data)<br>
|[CLI](sig-cli/README.md)|cli|* [Maciej Szulik](https://github.com/soltysh), Red Hat<br>* [Phillip Wittrock](https://github.com/pwittrock), Google<br>* [Tony Ado](https://github.com/AdoHe), Alibaba<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cli)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cli)|* Regular SIG Meeting: [Wednesdays at 09:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/sigcli)<br>
|[Cloud Provider](sig-cloud-provider/README.md)|cloud-provider|* [Andrew Sy Kim](https://github.com/andrewsykim), DigitalOcean<br>* [Chris Hoge](https://github.com/hogepodge), OpenStack Foundation<br>* [Jago Macleod](https://github.com/jagosan), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cloud-provider)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider)|* Regular SIG Meeting: [Wednesdays at 10:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/sigcloudprovider)<br>
|[Cluster Lifecycle](sig-cluster-lifecycle/README.md)|cluster-lifecycle|* [Luke Marsden](https://github.com/lukemarsden), Weave<br>* [Robert Bailey](https://github.com/roberthbailey), Google<br>* [Lucas Käldström](https://github.com/luxas), Luxas Labs (occasionally contracting for Weaveworks)<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-lifecycle)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular SIG Meeting: [Tuesdays at 09:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* kubeadm Office Hours: [Wednesdays at 09:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* Cluster API working group: [Wednesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* kops Office Hours: [Fridays at 09:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8ssigaws)<br>
|[Cluster Ops](sig-cluster-ops/README.md)|cluster-ops|* [Rob Hirschfeld](https://github.com/zehicle), RackN<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-ops)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/297937771)<br>
|[Cluster Ops](sig-cluster-ops/README.md)|cluster-ops|* [Rob Hirschfeld](https://github.com/zehicle), RackN<br>* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-ops)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/297937771)<br>
|[Contributor Experience](sig-contributor-experience/README.md)|contributor-experience|* [Elsie Phillips](https://github.com/Phillels), CoreOS<br>* [Paris Pittman](https://github.com/parispittman), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-contribex)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-contribex)|* Regular SIG Meeting: [Wednesdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/j/7658488911)<br>
|[Docs](sig-docs/README.md)|docs|* [Zach Corleissen](https://github.com/zacharysarah), Linux Foundation<br>* [Andrew Chen](https://github.com/chenopis), Google<br>* [Jared Bhatti](https://github.com/jaredbhatti), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-docs)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-docs)|* Regular SIG Meeting: [Tuesdays at 17:30 UTC (weekly)](https://zoom.us/j/678394311)<br>
|[GCP](sig-gcp/README.md)|gcp|* [Adam Worrall](https://github.com/abgworrall), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-gcp)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-gcp)|* Regular SIG Meeting: [Thursdays at 16:00 UTC (biweekly)](https://zoom.us/j/761149873)<br>
@ -42,15 +43,21 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
|[Network](sig-network/README.md)|network|* [Tim Hockin](https://github.com/thockin), Google<br>* [Dan Williams](https://github.com/dcbw), Red Hat<br>* [Casey Davenport](https://github.com/caseydavenport), Tigera<br>|* [Slack](https://kubernetes.slack.com/messages/sig-network)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-network)|* Regular SIG Meeting: [Thursdays at 14:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/5806599998)<br>
|[Node](sig-node/README.md)|node|* [Dawn Chen](https://github.com/dchen1107), Google<br>* [Derek Carr](https://github.com/derekwaynecarr), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-node)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-node)|* Regular SIG Meeting: [Tuesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/j/4799874685)<br>
|[OpenStack](sig-openstack/README.md)|openstack|* [Chris Hoge](https://github.com/hogepodge), OpenStack Foundation<br>* [David Lyle](https://github.com/dklyle), Intel<br>* [Robert Morse](https://github.com/rjmorse), Ticketmaster<br>|* [Slack](https://kubernetes.slack.com/messages/sig-openstack)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-openstack)|* Regular SIG Meeting: [Wednesdays at 16:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/417251241)<br>
|[PM](sig-pm/README.md)|pm|* [Aparna Sinha](https://github.com/apsinha), Google<br>* [Ihor Dvoretskyi](https://github.com/idvoretskyi), CNCF<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/kubernetes-pm)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-pm)|* Regular SIG Meeting: [Tuesdays at 18:30 UTC (biweekly)](https://zoom.us/j/845373595)<br>
|[Release](sig-release/README.md)|release|* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-release)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-release)|* Regular SIG Meeting: [Tuesdays at 21:00 UTC (biweekly)](https://zoom.us/j/664772523)<br>
|[Scalability](sig-scalability/README.md)|scalability|* [Wojciech Tyczynski](https://github.com/wojtek-t), Google<br>* [Bob Wise](https://github.com/countspongebob), AWS<br>|* [Slack](https://kubernetes.slack.com/messages/sig-scalability)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-scale)|* Regular SIG Meeting: [Thursdays at 16:30 UTC (bi-weekly)](https://zoom.us/j/989573207)<br>
|[Scheduling](sig-scheduling/README.md)|scheduling|* [Bobby (Babak) Salamat](https://github.com/bsalamat), Google<br>* [Klaus Ma](https://github.com/k82cn), IBM<br>|* [Slack](https://kubernetes.slack.com/messages/sig-scheduling)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-scheduling)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/7767391691)<br>
|[Service Catalog](sig-service-catalog/README.md)|service-catalog|* [Paul Morie](https://github.com/pmorie), Red Hat<br>* [Aaron Schlesinger](https://github.com/arschles), Microsoft<br>* [Ville Aikas](https://github.com/vaikas-google), Google<br>* [Doug Davis](https://github.com/duglin), IBM<br>|* [Slack](https://kubernetes.slack.com/messages/sig-service-catalog)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-service-catalog)|* Regular SIG Meeting: [Mondays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/j/7201225346)<br>
|[Storage](sig-storage/README.md)|storage|* [Saad Ali](https://github.com/saad-ali), Google<br>* [Bradley Childs](https://github.com/childsb), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-storage)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-storage)|* Regular SIG Meeting: [Thursdays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/614261834)<br>
|[Testing](sig-testing/README.md)|testing|* [Aaron Crickenberger](https://github.com/spiffxp), Samsung SDS<br>* [Erick Feja](https://github.com/fejta), Google<br>* [Steve Kuznetsov](https://github.com/stevekuznetsov), Red Hat<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-testing)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-testing)|* Regular SIG Meeting: [Tuesdays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/my/k8s.sig.testing)<br>* (testing-commons) Testing Commons: [Wednesdays at 07:30 PT (Pacific Time) (bi-weekly)](https://zoom.us/my/k8s.sig.testing)<br>
|[Testing](sig-testing/README.md)|testing|* [Aaron Crickenberger](https://github.com/spiffxp)<br>* [Erick Fejta](https://github.com/fejta), Google<br>* [Steve Kuznetsov](https://github.com/stevekuznetsov), Red Hat<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-testing)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-testing)|* Regular SIG Meeting: [Tuesdays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/my/k8s.sig.testing)<br>* (testing-commons) Testing Commons: [Wednesdays at 07:30 PT (Pacific Time) (bi-weekly)](https://zoom.us/my/k8s.sig.testing)<br>
|[UI](sig-ui/README.md)|ui|* [Dan Romlein](https://github.com/danielromlein), Google<br>* [Sebastian Florek](https://github.com/floreks), Fujitsu<br>|* [Slack](https://kubernetes.slack.com/messages/sig-ui)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-ui)|* Regular SIG Meeting: [Thursdays at 18:00 CET (Central European Time) (weekly)](https://groups.google.com/forum/#!forum/kubernetes-sig-ui)<br>
|[VMware](sig-vmware/README.md)|vmware|* [Fabio Rapposelli](https://github.com/frapposelli), VMware<br>* [Steve Wong](https://github.com/cantbewong), VMware<br>|* [Slack](https://kubernetes.slack.com/messages/sig-vmware)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-vmware)|* Regular SIG Meeting: [Thursdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/183662780)<br>
|[VMware](sig-vmware/README.md)|vmware|* [Fabio Rapposelli](https://github.com/frapposelli), VMware<br>* [Steve Wong](https://github.com/cantbewong), VMware<br>|* [Slack](https://kubernetes.slack.com/messages/sig-vmware)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-vmware)|* Regular SIG Meeting: [Thursdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/183662780)<br>* Cloud Provider vSphere weekly syncup: [Wednesdays at 16:30 UTC (weekly)](https://zoom.us/j/584244729)<br>
|[Windows](sig-windows/README.md)|windows|* [Michael Michael](https://github.com/michmike), Apprenda<br>* [Patrick Lang](https://github.com/patricklang), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-windows)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-windows)|* Regular SIG Meeting: [Tuesdays at 12:30 Eastern Standard Time (EST) (weekly)](https://zoom.us/my/sigwindows)<br>
### Master Working Group List
@ -58,10 +65,10 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
| Name | Organizers | Contact | Meetings |
|------|------------|---------|----------|
|[App Def](wg-app-def/README.md)|* [Antoine Legrand](https://github.com/ant31), CoreOS<br>* [Bryan Liles](https://github.com/bryanl), Heptio<br>* [Gareth Rushgrove](https://github.com/garethr), Docker<br>|* [Slack](https://kubernetes.slack.com/messages/wg-app-def)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-app-def)|* Regular WG Meeting: [Wednesdays at 16:00 UTC (bi-weekly)](https://zoom.us/j/748123863)<br>
|[Apply](wg-apply/README.md)|* [Daniel Smith](https://github.com/lavalamp), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-apply)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-apply)|* Regular WG Meeting: [Tuesdays at 9:30 PT (Pacific Time) (weekly)]()<br>
|[Apply](wg-apply/README.md)|* [Daniel Smith](https://github.com/lavalamp), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-apply)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-apply)|* Regular WG Meeting: [Tuesdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/my/apimachinery)<br>
|[Cloud Provider](wg-cloud-provider/README.md)|* [Sidhartha Mani](https://github.com/wlan0), Caascade Labs<br>* [Jago Macleod](https://github.com/jagosan), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-cloud-provider)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-cloud-provider)|* Regular WG Meeting: [Wednesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/my/cloudprovider)<br>
|[Cluster API](wg-cluster-api/README.md)|* [Kris Nova](https://github.com/kris-nova), Heptio<br>* [Robert Bailey](https://github.com/roberthbailey), Google<br>|* [Slack](https://kubernetes.slack.com/messages/cluster-api)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular WG Meeting: [s at ()]()<br>
|[Container Identity](wg-container-identity/README.md)|* [Clayton Coleman](https://github.com/smarterclayton), Red Hat<br>* [Greg Gastle](https://github.com/destijl), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-container-identity)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-container-identity)|* Regular WG Meeting: [Tuesdays at 15:00 UTC (bi-weekly (On demand))](TBD)<br>
|[Container Identity](wg-container-identity/README.md)|* [Clayton Coleman](https://github.com/smarterclayton), Red Hat<br>* [Greg Castle](https://github.com/destijl), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-container-identity)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-container-identity)|* Regular WG Meeting: [Tuesdays at 15:00 UTC (bi-weekly (On demand))](TBD)<br>
|[Kubeadm Adoption](wg-kubeadm-adoption/README.md)|* [Lucas Käldström](https://github.com/luxas), Luxas Labs (occasionally contracting for Weaveworks)<br>* [Justin Santa Barbara](https://github.com/justinsb)<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-lifecycle)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular WG Meeting: [Tuesdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>
|[Machine Learning](wg-machine-learning/README.md)|* [Vishnu Kannan](https://github.com/vishh), Google<br>* [Kenneth Owens](https://github.com/kow3ns), Google<br>* [Balaji Subramaniam](https://github.com/balajismaniam), Intel<br>* [Connor Doyle](https://github.com/ConnorDoyle), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-machine-learning)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-machine-learning)|* Regular WG Meeting: [Thursdays at 13:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/4799874685)<br>
|[Multitenancy](wg-multitenancy/README.md)|* [David Oppenheimer](https://github.com/davidopp), Google<br>* [Jessie Frazelle](https://github.com/jessfraz), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/wg-multitenancy)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-multitenancy)|* Regular WG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8s.sig.auth)<br>

View File

@ -8,7 +8,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
-->
# Multicluster Special Interest Group
A Special Interest Group focussed on solving common challenges related to the management of multiple Kubernetes clusters, and applications that exist therein. The SIG will be responsible for designing, discussing, implementing and maintaining APIs, tools and documentation related to multi-cluster administration and application management. This includes not only active automated approaches such as Cluster Federation, but also those that employ batch workflow-style continuous deployment systems like Spinnaker and others. Standalone building blocks for these and other similar systems (for example a cluster registry), and proposed changes to kubernetes core where appropriate will also be in scope.
A Special Interest Group focused on solving common challenges related to the management of multiple Kubernetes clusters, and applications that exist therein. The SIG will be responsible for designing, discussing, implementing and maintaining APIs, tools and documentation related to multi-cluster administration and application management. This includes not only active automated approaches such as Cluster Federation, but also those that employ batch workflow-style continuous deployment systems like Spinnaker and others. Standalone building blocks for these and other similar systems (for example a cluster registry), and proposed changes to kubernetes core where appropriate will also be in scope.
## Meetings
* Regular SIG Meeting: [Tuesdays at 9:30 PT (Pacific Time)](https://zoom.us/my/k8s.mc) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9:30&tz=PT%20%28Pacific%20Time%29).

View File

@ -19,7 +19,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
### Chairs
The Chairs of the SIG run operations and processes governing the SIG.
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
* Caleb Miles (**[@calebamiles](https://github.com/calebamiles)**), Google
## Contact

View File

@ -23,7 +23,7 @@ For more details about our objectives please review our [Scaling And Performance
The Chairs of the SIG run operations and processes governing the SIG.
* Wojciech Tyczynski (**[@wojtek-t](https://github.com/wojtek-t)**), Google
* Bob Wise (**[@countspongebob](https://github.com/countspongebob)**), Samsung SDS
* Bob Wise (**[@countspongebob](https://github.com/countspongebob)**), AWS
## Contact
* [Slack](https://kubernetes.slack.com/messages/sig-scalability)
@ -62,32 +62,17 @@ Monitor these for Github activity if you are not a member of the team.
<!-- BEGIN CUSTOM CONTENT -->
## Upcoming 2018 Meeting Dates
* 1/18
* 2/1
* 2/15
* 3/1
* 3/15
* 3/29
* 4/12
* 4/26
* 5/10
* 5/24
* 6/7
* 6/21
* 7/5
* 7/19
* 8/2
* 8/16
* 8/30
* 9/13
* 9/27
## Scalability SLOs
## Scalability/performance SLIs and SLOs
We officially support two different SLOs:
1. "API-responsiveness":
99% of all API calls return in less than 1s
1. "Pod startup time:
99% of pods (with pre-pulled images) start within 5s
This should be valid on appropriate hardware up to a 5000 node cluster with 30 pods/node. We eventually want to expand that to 100 pods/node.
For more details on how we measure those, see: http://blog.kubernetes.io/2015_09_01_archive.html
We are working on refining existing SLOs and defining more for other areas of the system.
Check out the [SLIs/SLOs page](./slos/slos.md).
<!-- END CUSTOM CONTENT -->

View File

@ -37,4 +37,4 @@ This document is a compilation of some interesting scalability/performance regre
- On many occasions our scalability tests caught critical/risky bugs which were missed by most other tests. If not caught, those could've seriously jeopardized production-readiness of k8s.
- SIG-Scalability has caught/fixed several important issues that span across various components, features and SIGs.
- Around 60% of times (possibly even more), we catch scalability regressions with just our medium-scale (and fast) tests, i.e gce-100 and kubemark-500. Making them run as presubmits should act as a strong shield against regressions.
- Majority of the remaining ones are caught by our large-scale (and slow) tests, i.e kubemark-5k and gce-2k. Making them as post-submit blokcers (given they're "usually" quite healthy) should act as a second layer of protection against regressions.
- Majority of the remaining ones are caught by our large-scale (and slow) tests, i.e kubemark-5k and gce-2k. Making them as post-submit blockers (given they're "usually" quite healthy) should act as a second layer of protection against regressions.

View File

@ -1,196 +0,0 @@
# API-machinery SLIs and SLOs
The document was converted from [Google Doc]. Please refer to the original for
extended commentary and discussion.
## Background
Scalability is an important aspect of Kubernetes. However, Kubernetes is
such a large system that we need to manage users' expectations in this area.
To achieve that, we are in the process of redefining what it means for
Kubernetes to support X-node clusters - this doc describes the high-level
proposal. In this doc we describe the API-machinery related SLIs we would
like to introduce and suggest which of those should eventually have a
corresponding SLO replacing the current "99% of API calls return in under 1s" one.
The SLOs we are proposing in this doc are our goal - they may not be currently
satisfied. As a result, while in the future we would like to block the release
when we are violating SLOs, we first need to understand where exactly we are
now, define and implement proper tests and potentially improve the system.
Only once this is done, we may try to introduce a policy of blocking the
release on SLO violation. But this is out of scope of this doc.
### SLIs and SLOs proposal
Below we introduce all SLIs and SLOs we would like to have in the api-machinery
area. A bunch of those are not easy to understand for users, as they are
designed for developers or performance tracking of higher level
user-understandable SLOs. The user-oriented one (which we want to publicly
announce) are additionally highlighted with bold.
### Prerequisite
Kubernetes cluster is available and serving.
### Latency<sup>[1](#footnote1)</sup> of API calls for single objects
__***SLI1: Non-streaming API calls for single objects (POST, PUT, PATCH, DELETE,
GET) latency for every (resource, verb) pair, measured as 99th percentile over
last 5 minutes***__
__***SLI2: 99th percentile for (resource, verb) pairs \[excluding virtual and
aggregated resources and Custom Resource Definitions\] combined***__
__***SLO: In default Kubernetes installation, 99th percentile of SLI2
per cluster-day<sup>[2](#footnote2)</sup> <= 1s***__
User stories:
- As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
response from an API call.
- As an administrator of Kubernetes cluster, if I know characteristics of my
external dependencies of apiserver (e.g custom admission plugins, webhooks and
initializers) I want to be able to provide guarantees for API calls latency to
users of my cluster
Background:
- We obviously cant give any guarantee in general, because cluster
administrators are allowed to register custom admission plugins, webhooks
and/or initializers, which we dont have any control about and they obviously
impact API call latencies.
- As a result, we define the SLIs to be very generic (no matter how your
cluster is set up), but we provide SLO only for default installations (where we
have control over what apiserver is doing). This doesnt provide a false
impression, that we provide guarantee no matter how the cluster is setup and
what is installed on top of it.
- At the same time, API calls are part of pretty much every non-trivial workflow
in Kubernetes, so this metric is a building block for less trivial SLIs and
SLOs.
Other notes:
- The SLO has to be satisfied independently from the used encoding. This
makes the mix of client important while testing. However, we assume that all
`core` components communicate with apiserver with protocol buffers (otherwise
the SLO doesnt have to be satisfied).
- In case of GET requests, user has an option to opt-in for accepting
potentially stale data (the request is then served from cache and not hitting
underlying storage). However, the SLO has to be satisfied even if all requests
ask for up-to-date data, which again makes careful choice of requests in tests
important while testing.
### Latency of API calls for multiple objects
__***SLI1: Non-streaming API calls for multiple objects (LIST) latency for
every (resource, verb) pair, measure as 99th percentile over last 5 minutes***__
__***SLI2: 99th percentile for (resource, verb) pairs [excluding virtual and
aggregated resources and Custom Resource Definitions] combined***__
__***SLO1: In default Kubernetes installation, 99th percentile of SLI2 per
cluster-day***__
- __***is <= 1s if total number of objects of the same type as resource in the
system <= X***__
- __***is <= 5s if total number of objects of the same type as resource in the
system <= Y***__
- __***is <= 30s if total number of objects of the same types as resource in the
system <= Z***__
User stories:
- As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
response from an API call.
- As an administrator of Kubernetes cluster, if I know characteristics of my
external dependencies of apiserver (e.g custom admission plugins, webhooks and
initializers) I want to be able to provide guarantees for API calls latency to
users of my cluster.
Background:
- On top of arguments from latency of API calls for single objects, LIST
operations are crucial part of watch-related frameworks, which in turn are
responsible for overall system performance and responsiveness.
- The above SLO is user-oriented and may have significant buffer in threshold.
In fact, the latency of the request should be proportional to the amount of
work to do (which in our case is number of objects of a given type (potentially
in a requested namespace if specified)) plus some constant overhead. For better
tracking of performance, we define the other SLIs which are supposed to be
purely internal (developer-oriented)
_SLI3: Non-streaming API calls for multiple objects (LIST) latency minus 1s
(maxed with 0) divided by number of objects in the collection
<sup>[3](#footnote3)</sup> (which may be many more than the number of returned
objects) for every (resource, verb) pair, measured as 99th percentile over
last 5 minutes._
_SLI4: 99th percentile for (resource, verb) pairs [excluding virtual and
aggregated resources and Custom Resource Definitions] combined_
_SLO2: In default Kubernetes installation, 99th percentile of SLI4 per
cluster-day <= Xms_
### Watch latency
_SLI1: API-machinery watch latency (measured from the moment when object is
stored in database to when its ready to be sent to all watchers), measured
as 99th percentile over last 5 minutes_
_SLO1 (developer-oriented): 99th percentile of SLI1 per cluster-day <= Xms_
User stories:
- As an administrator, if system is slow, I would like to know if the root
cause is slow api-machinery or something further down the path (lack of network
bandwidth, slow or cpu-starved controllers, ...).
Background:
- Pretty much all control loops in Kubernetes are watch-based, so slow watch
means slow system in general. As a result, we want to give some guarantees on
how fast it is.
- Note that how we measure it, silently assumes no clock-skew in case of HA
clusters.
### Admission plugin latency
_SLI1: Admission latency for each admission plugin type, measured as 99th
percentile over last 5 minutes_
User stories:
- As an administrator, if API calls are slow, I would like to know if this is
because slow admission plugins and if so which ones are responsible.
### Webhook latency
_SLI1: Webhook call latency for each webhook type, measured as 99th percentile
over last 5 minutes_
User stories:
- As an administrator, if API calls are slow, I would like to know if this is
because slow webhooks and if so which ones are responsible.
### Initializer latency
_SLI1: Initializer latency for each initializer, measured as 99th percentile
over last 5 minutes_
User stories:
- As an administrator, if API calls are slow, I would like to know if this is
because of slow initializers and if so which ones are responsible.
---
<a name="footnote1">\[1\]</a>By latency of API call in this doc we mean time
from the moment when apiserver gets the request to last byte of response sent
to the user.
<a name="footnote2">\[2\]</a> For the purpose of visualization it will be a
sliding window. However, for the purpose of reporting the SLO, it means one
point per day (whether SLO was satisfied on a given day or not).
<a name="footnote3">\[3\]</a>A collection contains: (a) all objects of that
type for cluster-scoped resources, (b) all object of that type in a given
namespace for namespace-scoped resources.
[Google Doc]: https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw/edit#

View File

@ -0,0 +1,47 @@
## API call latency SLIs/SLOs details
### User stories
- As a user of vanilla Kubernetes, I want some guarantee of how quickly I get the
response from an API call.
- As an administrator of a Kubernetes cluster, if I know the characteristics of my
apiserver's external dependencies (e.g. custom admission plugins, webhooks and
initializers), I want to be able to provide guarantees for API call latency to
users of my cluster.
### Other notes
- We obviously can't give any guarantee in general, because cluster
administrators are allowed to register custom admission plugins, webhooks
and/or initializers, over which we have no control and which obviously
impact API call latencies.
- As a result, we define the SLIs to be very generic (no matter how your
cluster is set up), but we provide an SLO only for default installations (where we
have control over what the apiserver is doing). This avoids giving the false
impression that we provide guarantees no matter how the cluster is set up and
what is installed on top of it.
- At the same time, API calls are part of pretty much every non-trivial workflow
in Kubernetes, so this metric is a building block for less trivial SLIs and
SLOs.
- The SLO for the latency of read-only API calls of a given type may have a significant
buffer in its threshold. In fact, the latency of a request should be proportional to
the amount of work to do (which is the number of objects of a given type in a given
scope) plus some constant overhead. For better tracking of performance, we
may want to define a purely internal SLI of "latency per object" (a worked example
follows this list), but that isn't in the near-term plans.
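As a purely illustrative example of such a "latency per object" view (the 1s constant below is an assumed fixed overhead, not a defined SLO): a LIST over a collection of 10,000 objects that takes 3s would correspond to roughly (3s - 1s) / 10,000 = 0.2 ms of latency per object.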
### Caveats
- The SLO has to be satisfied independently of the encoding used in user-originated
requests. This makes the mix of clients important while testing. However, we assume
that all `core` components communicate with the apiserver using protocol buffers.
- In the case of GET requests, the user has the option to opt in to accepting potentially
stale data (served from cache), and the SLO again has to be satisfied
independently of that. This makes the careful choice of requests in tests
important.
### TODOs
- We may consider treating `non-namespaced` resources as a separate bucket in
the future. However, it may not make sense if their number is
comparable with the `namespaced` ones.
### Test scenario
__TODO: Describe test scenario.__

View File

@ -0,0 +1,6 @@
## API call extension points latency SLIs details
### User stories
- As an administrator, if API calls are slow, I would like to know if this is
because of slow extension points (admission plugins, webhooks, initializers) and,
if so, which ones are responsible for it.

View File

@ -1,72 +0,0 @@
# Extended Kubernetes scalability SLOs
## Goal
The goal of this effort is to extend SLOs which Kubernetes cluster has to meet to support given number of Nodes. As of April 2017 we have only two SLOs:
- API-responsiveness: 99% of all API calls return in less than 1s
- Pod startup time: 99% of Pods (with pre-pulled images) start within 5s
which are enough to guarantee that cluster doesn't feel completely dead, but not enough to guarantee that it satisfies user's needs.
We're going to define more SLOs based on most important indicators, and standardize the format in which we speak about our objectives. Our SLOs need to have two properties:
- They need to be testable, i.e. we need to have a benchmark to measure if it's met,
- They need to be expressed in a way that's possible to understand by a user not intimately familiar with the system internals, i.e. formulation can't depend on some arcane knowledge.
On the other hand we do not require that:
- SLOs are possible to monitor in a running cluster, i.e. not all SLOs need to be easily translatable to SLAs. Being able to benchmark is enough for us.
## Split metrics from environment
Currently what we measure and how we measure it are tightly coupled. This means that we don't have good environmental constraint suggestions for users (e.g. how many Pods per Namespace we support, how many Endpoints per Service, how to set up the cluster, etc.). We need to decide on what's reasonable and make the environment explicit.
## Split SLOs by kind
Current SLOs implicitly assume that the cluster is in a "steady state". By this we mean that we assume that there's only some, limited, number of things going during benchmarking. We need to make this assumption explicit and split SLOs into two categories: steady-state SLOs and burst SLOs.
## Steady state SLOs
With steady state SLO we want to give users the data about system's behavior during normal operation. We define steady state by limiting the churn on the cluster.
This includes current SLOs:
- API call latency
- E2e Pod startup latency
By churn we understand a measure of the amount of change happening in the cluster. Its formal(-ish) definition will follow, but informally it can be thought of as the number of user-issued requests per second plus the number of pods affected by those requests.
More formally churn per second is defined as:
```
#Pod creations + #PodSpec updates + #user originated requests in a given second
```
The last part is necessary only to get rid of situations when user is spamming API server with various requests. In ordinary circumstances we expect it to be in the order of 1-2.
## Burst SLOs
With burst SLOs we want to give the user an idea of how the system behaves under heavy load, i.e. when one wants the system to do something as quickly as possible, not caring too much about the response time of a single request. Note that this voids all steady-state SLOs.
This includes the new SLO:
- Pod startup throughput
## Environment
A Kubernetes cluster in which we benchmark SLOs needs to meet the following criteria:
- Run a single appropriately sized master machine
- Main etcd runs as a single instance on the master machine
- Events are stored in a separate etcd instance running on the master machine
- Kubernetes version is at least 1.X.Y
- Components configuration = _?_
_TODO: NEED AN HA CONFIGURATION AS WELL_
## SLO template
All our performance SLOs should be defined using the following template:
---
# SLO: *TL;DR description of the SLO*
## (Burst|Steady state) foo bar SLO
### Summary
_One-two sentences describing the SLO, that's possible to understand by the majority of the community_
### User Stories
_A Few user stories showing in what situations users might be interested in this SLO, and why other ones are not enough_
## Full definition
### Test description
_Precise description of test scenario, including maximum number of Pods per Controller, objects per namespace, and anything else that even remotely seems important_
### Formal definition (can be skipped if the same as title/summary)
_Precise and as formal as possible definition of SLO. This does not necessarily need to be easily understandable by layman_

View File

@ -0,0 +1,54 @@
## Pod startup latency SLI/SLO details
### User stories
- As a user of vanilla Kubernetes, I want some guarantee how quickly my pods
will be started.
### Other notes
- Only schedulable and stateless pods contribute to the SLI:
- If there is no space in the cluster to place the pod, there is not much
we can do about it (that is a task for the Cluster Autoscaler, which should have
separate SLIs/SLOs).
- If placing a pod requires preempting other pods, that may heavily depend
on the application (e.g. on their graceful termination period). We don't
want that to contribute to this SLI.
- Mounting disks required by non-stateless pods may potentially also require
non-negligible time, not fully dependent on Kubernetes.
- We are explicitly excluding image pulling time from the SLI. This is
because it highly depends on the locality of the image, image registry performance
characteristics (e.g. throughput), the image size itself, etc. Since we have
no control over any of those (and all of them would significantly affect the SLI),
we decided to simply exclude it.
- We are also explicitly excluding time to run init containers, as, again, this
is heavily application-dependent (and doesn't depend on Kubernetes itself).
- The answer to the question "when should a pod be considered started" is also
not obvious. We decided on the semantic of "when all its containers are
reported as started and observed via watch" (see the sketch after this list), because:
- we require all containers to be started (not e.g. just the first one) to ensure
that the pod is started. We need to ensure that potential regressions like
linearization of container startups within a pod are caught by this SLI.
- note that we don't require all containers to be running - if some of them
finished before the last one was started, that is also fine. It is just
required that all of them have been started (at least once).
- we don't want to rely on "readiness checks", because they heavily
depend on the application. If the application takes a couple of minutes to
initialize before it starts responding to readiness checks, that shouldn't
count towards Kubernetes performance.
- even if your application started, many control loops in Kubernetes will
not fire before they observe it. If the Kubelet is not able to report
the status for some reason, other parts of the system will not have
a way to learn about it - this is why the reporting part is so important
here.
- since watch is so centric to Kubernetes (and many control loops are
triggered by specific watch events), observing the status of the pod is
also part of the SLI (as this is the moment when the next control loops
can potentially be fired).
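To make the "all its containers are reported as started and observed via watch" semantic above concrete, here is a minimal sketch, assuming the pod objects come from a watch; the package and helper names are illustrative and this is not the actual test-framework code:

```go
package podstartup

import (
	"time"

	v1 "k8s.io/api/core/v1"
)

// allContainersStarted reports whether every container in the pod has been
// started at least once: it is either running now, or it already terminated
// (which implies it had been started). This mirrors the "all its containers
// are reported as started" wording above.
func allContainersStarted(pod *v1.Pod) bool {
	if len(pod.Status.ContainerStatuses) != len(pod.Spec.Containers) {
		return false
	}
	for _, cs := range pod.Status.ContainerStatuses {
		if cs.State.Running == nil && cs.State.Terminated == nil {
			return false
		}
	}
	return true
}

// startupLatency approximates the SLI: time from pod creation to the moment
// a watch event showing all containers started was observed (observedAt).
func startupLatency(pod *v1.Pod, observedAt time.Time) time.Duration {
	return observedAt.Sub(pod.CreationTimestamp.Time)
}
```

In a real measurement the pod objects would come from a watch, and image pulling and init container time would still have to be excluded as described above.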
### TODOs
- We should try to provide guarantees for non-stateless pods (the threshold
may be higher for them though).
- Revisit whether we want "watch pod status" part to be included in the SLI.
### Test scenario
__TODO: Describe test scenario.__

View File

@ -0,0 +1,148 @@
# Kubernetes scalability and performance SLIs/SLOs
## What does Kubernetes guarantee?
One of the important aspects of Kubernetes is its scalability and performance
characteristics. As a Kubernetes user or operator/administrator of a cluster,
you would expect to have some guarantees in those areas.
The goal of this doc is to organize the guarantees that Kubernetes provides
in these areas.
## What do we require from SLIs/SLOs?
We are going to define more SLIs and SLOs based on the most important indicators
in the system.
Our SLOs need to have the following properties:
- <b> They need to be testable </b> <br/>
That means that we need to have a benchmark to measure if it's met.
- <b> They need to be understandable for users </b> <br/>
In particular, they need to be understandable for people not familiar
with the system internals, i.e. their formulation can't depend on some
arcane knowledge.
However, we may introduce some internal (developer-only) SLIs that
may be useful for understanding the performance characteristics of the system,
but for which we don't provide any guarantees and which thus may not
be fully understandable for users.
On the other hand, we do NOT require that our SLOs:
- are measurable in a running cluster (though that's desired if possible) <br/>
In other words, not all SLOs need to be easily translatable to SLAs.
Being able to benchmark is enough for us.
## Types of SLOs
While SLIs are very generic and don't really depend on anything (they just
define what and how we measure), it's not the case for SLOs.
SLOs provide guarantees, and satisfying them may depend on meeting some
specific requirements.
As a result, we build our SLOs in a "you promise, we promise" format.
That means that we provide you a guarantee only if you satisfy the requirements
that we put on you.
As a consequence, we introduce two types of SLOs.
### Steady state SLOs
With steady state SLOs, we provide guarantees about the system's behavior during
normal operation. We are able to provide many more guarantees in that situation.
```Definition
We define system to be in steady state when the cluster churn per second is <= 20, where
churn = #(Pod spec creations/updates/deletions) + #(user originated requests) in a given second
```
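For example, a second in which the cluster sees 12 Pod spec creations/updates/deletions and 6 user-originated requests has a churn of 18, so the cluster is still within the steady state bound of 20.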
### Burst SLO
With burst SLOs, we provide guarantees on how the system behaves under heavy load
(when the user wants the system to do something as quickly as possible, not caring too
much about response time).
## Environment
In order to meet the SLOs, the system must run in an environment satisfying
the following criteria:
- Runs one or more appropriately sized master machines
- Main etcd running on master machine(s)
- Events are stored in a separate etcd running on the master machine(s)
- Kubernetes version is at least X.Y.Z
- ...
__TODO: Document other necessary configuration.__
## Thresholds
To make the cluster eligible for the SLOs, users also can't have too many objects in
their clusters. More concretely, the number of different objects in the cluster
MUST satisfy thresholds defined in [thresholds file][].
[thresholds file]: https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md
## Kubernetes SLIs/SLOs
The currently existing SLIs/SLOs are enough to guarantee that the cluster isn't
completely dead. However, they are not enough to satisfy users' needs in most
cases.
We are looking into extending the set of SLIs/SLOs to cover more parts of
Kubernetes.
```
Prerequisite: Kubernetes cluster is available and serving.
```
### Steady state SLIs/SLOs
| Status | SLI | SLO | User stories, test scenarios, ... |
| --- | --- | --- | --- |
| __Official__ | Latency<sup>[1](#footnote1)</sup> of mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[3](#footnote3)</sup> <= 1s | [Details](./api_call_latency.md) |
| __Official__ | Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[4](#footnote4)</sup> API calls for every (resource, scope<sup>[5](#footnote5)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day (a) <= 1s if `scope=resource` (b) <= 5s if `scope=namespace` (c) <= 30s if `scope=cluster` | [Details](./api_call_latency.md) |
| __Official__ | Startup latency of stateless<sup>[6](#footnote6)</sup> and schedulable<sup>[7](#footnote7)</sup> pods, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day <= 5s | [Details](./pod_startup_latency.md) |
<a name="footnote1">\[1\]</a>By latency of API call in this doc we mean time
from the moment when apiserver gets the request to last byte of response sent
to the user.
<a name="footnote2">\[2\]</a>By mutating API calls we mean POST, PUT, DELETE
and PATCH.
<a name="footnote3">\[3\]</a> For the purpose of visualization it will be a
sliding window. However, for the purpose of reporting the SLO, it means one
point per day (whether SLO was satisfied on a given day or not).
<a name="footnote4">\[4\]</a>By non-streaming read-only API calls we mean GET
requests without `watch=true` option set. (Note that in Kubernetes internally
it translates to both GET and LIST calls).
<a name="footnote5">\[5\]</a>A scope of a request can be either (a) `resource`
if the request is about a single object, (b) `namespace` if it is about objects
from a single namespace or (c) `cluster` if it spans objects from multiple
namespaces.
<a name="footnode6">[6\]</a>A `stateless pod` is defined as a pod that doesn't
mount volumes with sources other than secrets, config maps, downward API and
empty dir.
<a name="footnode7">[7\]</a>By schedulable pod we mean a pod that can be
scheduled in the cluster without causing any preemption.
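To make the "measured as 99th percentile over last 5 minutes" wording in the steady state SLIs above concrete, here is a minimal, self-contained sketch; it is illustrative only, the helper below is hypothetical, and real measurements come from apiserver instrumentation rather than code like this:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// percentile returns the p-th percentile (0 < p <= 100) of the given samples
// using the nearest-rank method on a sorted copy.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := int(float64(len(sorted))*p/100.0+0.5) - 1
	if rank < 0 {
		rank = 0
	}
	if rank >= len(sorted) {
		rank = len(sorted) - 1
	}
	return sorted[rank]
}

func main() {
	// Hypothetical latencies for one (resource, verb) pair over the last 5 minutes.
	window := []time.Duration{
		45 * time.Millisecond, 80 * time.Millisecond, 120 * time.Millisecond,
		300 * time.Millisecond, 950 * time.Millisecond,
	}
	p99 := percentile(window, 99)
	fmt.Printf("p99 over the 5-minute window: %v (SLO target for mutating calls: <= 1s)\n", p99)
}
```

The per-(resource, verb) grouping and the per-cluster-day aggregation described above would be layered on top of a window calculation like this one.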
### Burst SLIs/SLOs
| Status | SLI | SLO | User stories, test scenarios, ... |
| --- | --- | --- | --- |
| WIP | Time to start 30\*#nodes pods, measured from test scenario start until observing last Pod as ready | Benchmark: when all images present on all Nodes, 99th percentile <= X minutes | [Details](./system_throughput.md) |
### Other SLIs
| Status | SLI | User stories, ... |
| --- | --- | --- |
| WIP | Watch latency for every resource (from the moment when the object is stored in the database to when it's ready to be sent to all watchers), measured as 99th percentile over last 5 minutes | TODO |
| WIP | Admission latency for each admission plugin type, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |
| WIP | Webhook call latency for each webhook type, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |
| WIP | Initializer latency for each initializer, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |

View File

@ -0,0 +1,28 @@
## System throughput SLI/SLO details
### User stories
- As a user, I want a guarantee that my workload of X pods can be started
within a given time
- As a user, I want to understand how quickly I can react to a dramatic
change in workload profile when my workload exhibits very bursty behavior
(e.g. a shop during a Black Friday sale)
- As a user, I want a guarantee of how quickly I can recreate the whole setup
in case of a serious disaster which brings the whole cluster down.
### Test scenario
- Start with a healthy (all nodes ready, all cluster addons already running)
cluster with N (>0) running pause pods per node.
- Create a number of `Namespaces` and a number of `Deployments` in each of them.
- All `Namespaces` should be isomorphic, possibly excluding the last one, which should
run all pods that didn't fit in the previous ones.
- A single namespace should run 5000 `Pods` in the following configuration (a worked breakdown follows this list):
- one big `Deployment` running ~1/3 of all `Pods` from this `namespace`
- medium `Deployments`, each with 120 `Pods`, in total running ~1/3 of all
`Pods` from this `namespace`
- small `Deployments`, each with 10 `Pods`, in total running ~1/3 of all `Pods`
from this `Namespace`
- Each `Deployment` should be covered by a single `Service`.
- Each `Pod` in any `Deployment` contains two pause containers, one `Secret`
(other than the default `ServiceAccount` secret) and one `ConfigMap`. Additionally, it has
resource requests set and doesn't use any advanced scheduling features or
init containers.
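As a worked illustration of the `Namespace` layout above (the numbers are illustrative and chosen only to satisfy the ~1/3 split): a 5000-Pod `Namespace` could consist of one `Deployment` of 1660 `Pods`, 14 `Deployments` of 120 `Pods` (1680 `Pods` in total), and 166 `Deployments` of 10 `Pods` (1660 `Pods` in total), adding up to 5000 `Pods`.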

View File

@ -1,26 +0,0 @@
# SLO: Kubernetes cluster of size at least X is able to start Y Pods in Z minutes
**This is a WIP SLO doc - something that we want to meet, but we may not be there yet**
## Burst Pod Startup Throughput SLO
### User Stories
- User is running a workload of X total pods and wants to ensure that it can be started in Y time.
- User is running a system that exhibits very bursty behavior (e.g. shop during Black Friday Sale) and wants to understand how quickly they can react to a dramatic change in workload profile.
- User is running a huge serving app on a huge cluster. He wants to know how quickly he can recreate his whole setup in case of a serious disaster which will bring the whole cluster down.
Current steady state SLOs do not provide enough data to make these assessments about burst behavior.
## SLO definition (full)
### Test setup
Standard performance test Kubernetes setup, as described in [the doc](../extending_slo.md#environment).
### Test scenario is the following:
- Start with a healthy (all nodes ready, all cluster addons already running) cluster with N (>0) running pause Pods/Node.
- Create a number of Deployments that run X Pods and Namespaces necessary to create them.
- All namespaces should be isomorphic, possibly excluding last one which should run all Pods that didn't fit in the previous ones.
- Single Namespace should run at most 5000 Pods in the following configuration:
- one big Deployment running 1/3 of all Pods from this Namespace (1667 for 5000 Pod Namespace)
- medium Deployments, each of which is not running more than 120 Pods, running in total 1/3 of all Pods from this Namespace (14 Deployments with 119 Pods each for 5000 Pod Namespace)
- small Deployments, each of which is not running more than 10 Pods, running in total 1/3 of all Pods from this Namespace (238 Deployments with 7 Pods each for 5000 Pod Namespace)
- Each Deployment is covered by a single Service.
- Each Pod in any Deployment contains two pause containers, one secret other than ServiceAccount and one ConfigMap, has resource request set and doesn't use any advanced scheduling features (Affinities, etc.) or init containers.
- Measure the time between starting the test and the moment when the last Pod is started according to its Kubelet. Note that a pause container is ready just after it's started, which may not be true for more complex containers that use nontrivial readiness probes.
### Definition
A Kubernetes cluster of size at least X adhering to the environment definition, when running the specified test, satisfies the following: the 99th percentile of the time necessary to start Y pods (from the time when the user created all controllers to the time when the Kubelet starts the last Pod from the set) is no greater than Z minutes, assuming that all images are already present on all Nodes.

Some files were not shown because too many files have changed in this diff.