sig-list.md updated
Signed-off-by: Ihor Dvoretskyi <ihor@linux.com>
commit bed39ba418
@ -7,14 +7,14 @@
|
|||
".",
|
||||
"cmd/misspell"
|
||||
]
|
||||
revision = "59894abde931a32630d4e884a09c682ed20c5c7c"
|
||||
version = "v0.3.0"
|
||||
revision = "b90dc15cfd220ecf8bbc9043ecb928cef381f011"
|
||||
version = "v0.3.4"
|
||||
|
||||
[[projects]]
|
||||
branch = "v2"
|
||||
name = "gopkg.in/yaml.v2"
|
||||
packages = ["."]
|
||||
revision = "eb3733d160e74a9c7e442f435eb3bea458e1d19f"
|
||||
revision = "5420a8b6744d3b0345ab293f6fcba19c978f1183"
|
||||
version = "v2.2.1"
|
||||
|
||||
[solve-meta]
|
||||
analyzer-name = "dep"
|
||||
|
|
|
|||
|
|
@ -1,2 +1,6 @@
|
|||
|
||||
required = ["github.com/client9/misspell/cmd/misspell"]
|
||||
|
||||
[prune]
|
||||
go-tests = true
|
||||
unused-packages = true
|
||||
non-go = true
|
||||
|
|
|
|||
|
|
@ -21,9 +21,10 @@ aliases:
|
|||
- kris-nova
|
||||
- countspongebob
|
||||
sig-azure-leads:
|
||||
- slack
|
||||
- justaugustus
|
||||
- shubheksha
|
||||
- khenidak
|
||||
- colemickens
|
||||
- jdumars
|
||||
sig-big-data-leads:
|
||||
- foxish
|
||||
- erikerlandson
|
||||
|
|
@ -31,6 +32,10 @@ aliases:
|
|||
- soltysh
|
||||
- pwittrock
|
||||
- AdoHe
|
||||
sig-cloud-provider-leads:
|
||||
- andrewsykim
|
||||
- hogepodge
|
||||
- jagosan
|
||||
sig-cluster-lifecycle-leads:
|
||||
- lukemarsden
|
||||
- roberthbailey
|
||||
|
|
|
|||
|
|
@ -0,0 +1,13 @@
|
|||
# Defined below are the security contacts for this repo.
|
||||
#
|
||||
# They are the contact point for the Product Security Team to reach out
|
||||
# to for triaging and handling of incoming issues.
|
||||
#
|
||||
# The below names agree to abide by the
|
||||
# [Embargo Policy](https://github.com/kubernetes/sig-release/blob/master/security-release-process-documentation/security-release-process.md#embargo-policy)
|
||||
# and will be removed and replaced if they violate that agreement.
|
||||
#
|
||||
# DO NOT REPORT SECURITY VULNERABILITIES DIRECTLY TO THESE NAMES, FOLLOW THE
|
||||
# INSTRUCTIONS AT https://kubernetes.io/security/
|
||||
|
||||
cblecker
|
||||
|
|
@ -1,23 +1,32 @@
|
|||
# SIG Governance Template
|
||||
# SIG Charter Guide
|
||||
|
||||
## Goals
|
||||
All Kubernetes SIGs must define a charter defining the scope and governance of the SIG.
|
||||
|
||||
The following documents outline recommendations and requirements for SIG governance structure and provide
|
||||
template documents for SIGs to adapt. The goals are to define the baseline needs for SIGs to self govern
|
||||
and organize in a way that addresses the needs of the core Kubernetes project.
|
||||
- The scope must define what areas the SIG is responsible for directing and maintaining.
|
||||
- The governance must outline the responsibilities within the SIG as well as the roles
|
||||
owning those responsibilities.
|
||||
|
||||
The documents are focused on:
|
||||
## Steps to create a SIG charter
|
||||
|
||||
- Outlining organizational responsibilities
|
||||
- Outlining organizational roles
|
||||
- Outlining processes and tools
|
||||
1. Copy the template into a new file under community/sig-*YOURSIG*/charter.md ([sig-architecture example])
|
||||
2. Read the [Recommendations and requirements] so you have context for the template
|
||||
3. Customize your copy of the template for your SIG. Feel free to make adjustments as needed.
|
||||
4. Update [sigs.yaml] with the individuals holding the roles as defined in the template.
|
||||
5. Add subprojects owned by your SIG to the [sigs.yaml]
|
||||
6. Create a pull request with a draft of your charter.md and sigs.yaml changes. Communicate it within your SIG
|
||||
and get feedback as needed.
|
||||
7. Send the SIG Charter out for review to steering@kubernetes.io. Include the subject "SIG Charter Proposal: YOURSIG"
|
||||
and a link to the PR in the body.
|
||||
8. Typically expect feedback within a week of sending your draft. Expect a longer time if it falls over an
|
||||
event such as Kubecon or holidays. Make any necessary changes.
|
||||
9. Once accepted, the steering committee will ratify the PR by merging it.
|
||||
|
||||
Specific attention has been given to:
|
||||
## Steps to update an existing SIG charter
|
||||
|
||||
- The role of technical leadership
|
||||
- The role of operational leadership
|
||||
- Process for agreeing upon technical decisions
|
||||
- Process for ensuring technical assets remain healthy
|
||||
- For significant changes, or any changes that could impact other SIGs, such as the scope, create a
|
||||
PR and send it to the steering committee for review with the subject: "SIG Charter Update: YOURSIG"
|
||||
- For minor updates that only impact issues or areas within the scope of the SIG, the SIG Chairs should
|
||||
facilitate the change.
|
||||
|
||||
## How to use the templates
|
||||
|
||||
|
|
@ -35,6 +44,26 @@ and project.
|
|||
|
||||
- [Short Template]
|
||||
|
||||
## Goals
|
||||
|
||||
The following documents outline recommendations and requirements for SIG charters and provide
|
||||
template documents for SIGs to adapt. The goals are to define the baseline needs for SIGs to
|
||||
self govern and exercise ownership over an area of the Kubernetes project.
|
||||
|
||||
The documents are focused on:
|
||||
|
||||
- Defining SIG scope
|
||||
- Outlining organizational responsibilities
|
||||
- Outlining organizational roles
|
||||
- Outlining processes and tools
|
||||
|
||||
Specific attention has been given to:
|
||||
|
||||
- The role of technical leadership
|
||||
- The role of operational leadership
|
||||
- Process for agreeing upon technical decisions
|
||||
- Process for ensuring technical assets remain healthy
|
||||
|
||||
## FAQ
|
||||
|
||||
See [frequently asked questions]
|
||||
|
|
@ -42,3 +71,5 @@ See [frequently asked questions]
|
|||
[Recommendations and requirements]: sig-governance-requirements.md
|
||||
[Short Template]: sig-governance-template-short.md
|
||||
[frequently asked questions]: FAQ.md
|
||||
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml
|
||||
[sig-architecture example]: ../../sig-architecture/charter.md
|
||||
|
|
|
|||
|
|
@ -64,7 +64,7 @@ All technical assets *MUST* be owned by exactly 1 SIG subproject. The following
|
|||
- *SHOULD* define a level of commitment for decisions that have gone through the formal process
|
||||
(e.g. when is a decision revisited or reversed)
|
||||
|
||||
- *MUST* How technical assets of project remain healthy and can be released
|
||||
- *MUST* define how technical assets of project remain healthy and can be released
|
||||
- Publicly published signals used to determine if code is in a healthy and releasable state
|
||||
- Commitment and process to *only* release when signals say code is releasable
|
||||
- Commitment and process to ensure assets are in a releasable state for milestones / releases
|
||||
|
|
|
|||
|
|
@ -1,8 +1,23 @@
|
|||
# SIG Governance Template (Short Version)
|
||||
# SIG YOURSIG Charter
|
||||
|
||||
This charter adheres to the conventions described in the [Kubernetes Charter README].
|
||||
|
||||
## Scope
|
||||
|
||||
This section defines the scope of things that would fall under ownership by this SIG.
|
||||
It must be used when determining whether subprojects should fall into this SIG.
|
||||
|
||||
### In scope
|
||||
|
||||
Outline of what falls into the scope of this SIG
|
||||
|
||||
### Out of scope
|
||||
|
||||
Outline of things that could be confused as falling into this SIG but don't
|
||||
|
||||
## Roles
|
||||
|
||||
Membership for roles tracked in: <link to OWNERS file>
|
||||
Membership for roles tracked in: [sigs.yaml]
|
||||
|
||||
- Chair
|
||||
- Run operations and processes governing the SIG
|
||||
|
|
@ -39,7 +54,7 @@ Membership for roles tracked in: <link to OWNERS file>
|
|||
- *MAY* select additional subproject owners through a [super-majority] vote amongst subproject owners. This
|
||||
*SHOULD* be supported by a majority of subproject contributors (through [lazy-consensus] with fallback on voting).
|
||||
- Number: 3-5
|
||||
- Defined in [sigs.yaml] [OWNERS] files
|
||||
- Defined in [OWNERS] files that are specified in [sigs.yaml]
|
||||
|
||||
- Members
|
||||
- *MUST* maintain health of at least one subproject or the health of the SIG
|
||||
|
|
@ -50,6 +65,14 @@ Membership for roles tracked in: <link to OWNERS file>
|
|||
- *MAY* participate in decision making for the subprojects they hold roles in
|
||||
- Includes all reviewers and approvers in [OWNERS] files for subprojects
|
||||
|
||||
- Security Contact
|
||||
- *MUST* be a contact point for the Product Security Team to reach out to for
|
||||
triaging and handling of incoming issues
|
||||
- *MUST* accept the [Embargo Policy](https://github.com/kubernetes/sig-release/blob/master/security-release-process-documentation/security-release-process.md#embargo-policy)
|
||||
- Defined in `SECURITY_CONTACTS` files; only the file at the root of the
repository is relevant. A template is available
|
||||
[here](https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS)
|
||||
|
||||
## Organizational management
|
||||
|
||||
- SIG meets bi-weekly on zoom with agenda in meeting notes
|
||||
|
|
@ -120,3 +143,4 @@ Issues impacting multiple subprojects in the SIG should be resolved by either:
|
|||
[KEP]: https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md
|
||||
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml#L1454
|
||||
[OWNERS]: contributors/devel/owners.md
|
||||
[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ The Kubernetes community abides by the [CNCF code of conduct]. Here is an excer
|
|||
|
||||
## SIGs
|
||||
|
||||
Kubernetes encompasses many projects, organized into [SIGs](sig-list.md).
|
||||
Kubernetes encompasses many projects, organized into [SIGs](/sig-list.md).
|
||||
Some communication has moved into SIG-specific channels - see
|
||||
a given SIG subdirectory for details.
|
||||
|
||||
|
|
@ -41,11 +41,15 @@ please [file an issue].
|
|||
|
||||
## Mailing lists
|
||||
|
||||
Development announcements and discussions appear on the Google group
|
||||
[kubernetes-dev] (send mail to `kubernetes-dev@googlegroups.com`).
|
||||
Kubernetes mailing lists are hosted through Google Groups. To
|
||||
receive these lists' emails,
|
||||
[join](https://support.google.com/groups/answer/1067205) the groups
|
||||
relevant to you, as you would any other Google Group.
|
||||
|
||||
Users trade notes on the Google group
|
||||
[kubernetes-users] (send mail to `kubernetes-users@googlegroups.com`).
|
||||
* [kubernetes-announce] broadcasts major project announcements such as releases and security issues
|
||||
* [kubernetes-dev] hosts development announcements and discussions around developing kubernetes itself
|
||||
* [kubernetes-users] is where kubernetes users trade notes
|
||||
* Additional Google groups exist and can be joined for discussion related to each SIG and Working Group. These are linked from the [SIG list](/sig-list.md).
|
||||
|
||||
## Accessing community documents
|
||||
|
||||
|
|
@ -92,6 +96,7 @@ Kubernetes is the main focus of CloudNativeCon/KubeCon, held every spring in Eur
|
|||
[iCal url]: https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics
|
||||
[Kubernetes Community Meeting Agenda]: https://docs.google.com/document/d/1VQDIAB0OqiSjIHI8AWMvSdceWhnz56jNpZrLs6o7NJY/edit#
|
||||
[kubernetes-community-video-chat]: https://groups.google.com/forum/#!forum/kubernetes-community-video-chat
|
||||
[kubernetes-announce]: https://groups.google.com/forum/#!forum/kubernetes-announce
|
||||
[kubernetes-dev]: https://groups.google.com/forum/#!forum/kubernetes-dev
|
||||
[kubernetes-users]: https://groups.google.com/forum/#!forum/kubernetes-users
|
||||
[kubernetes.slackarchive.io]: https://kubernetes.slackarchive.io
|
||||
|
|
|
|||
|
|
@ -0,0 +1,63 @@
|
|||
# Moderation on Kubernetes Communications Channels
|
||||
|
||||
This page describes the rules and best practices for people chosen to moderate Kubernetes communications channels.
|
||||
This includes Slack, the mailing lists, and _any communication tool_ used in an official manner by the project.
|
||||
|
||||
## Roles and Responsibilities
|
||||
|
||||
As part of volunteering to become a moderator you are now a representative of the Kubernetes community, and it is your responsibility to remain aware of your contributions in this space.
|
||||
These responsibilities apply to all Kubernetes official channels.
|
||||
|
||||
Moderators _MUST_:
|
||||
|
||||
- Take action as specified by these Kubernetes Moderator Guidelines.
|
||||
- You are empowered to take _immediate action_ when there is a violation. You do not need to wait for review or approval if an egregious violation has occurred. Make a judgement call based on our Code of Conduct and Values (see below).
|
||||
- Removing a bad actor or content from the medium is preferable to letting it sit there.
|
||||
- Abide by the documented tasks and actions required of moderators.
|
||||
- Ensure that the [CNCF Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md) is in effect on all official Kubernetes communication channels.
|
||||
- Become familiar with the [Kubernetes Community Values](https://github.com/kubernetes/steering/blob/master/values.md).
|
||||
- Take care of spam as soon as possible, which may mean taking action by removing a member from that resource.
|
||||
- Foster a safe and productive environment by being aware of potential multiple cultural differences between Kubernetes community members.
|
||||
- Understand that you might be contacted by moderators, community managers, and other users via private email or a direct message.
|
||||
- Report egregious behavior to steering@k8s.io.
|
||||
|
||||
Moderators _SHOULD_:
|
||||
|
||||
- Exercise compassion and empathy when communicating and collaborating with other community members.
|
||||
- Understand the difference between a user abusing the resource and a user who is just having difficulty expressing comments and questions in English.
|
||||
- Be an example and role model to others in the community.
|
||||
- Remember to check in and recognize if you need to take a break when you become frustrated or find yourself in a heated debate.
|
||||
- Help your colleagues if you recognize them in one of the [stages of burnout](https://opensource.com/business/15/12/avoid-burnout-live-happy).
|
||||
- Be helpful and have fun!
|
||||
|
||||
## Violations
|
||||
|
||||
The Kubernetes [Steering Committee](https://github.com/kubernetes/steering) will have the final authority regarding escalated moderation matters. Violations of the Code of Conduct will be handled on a case by case basis. Depending on severity this can range up to and including removal of the person from the community, though this is extremely rare.
|
||||
|
||||
## Specific Guidelines
|
||||
|
||||
These guidelines are for tool-specific policies that don't fit under a general umbrella.
|
||||
|
||||
### Mailing Lists
|
||||
|
||||
|
||||
### Slack
|
||||
|
||||
- [Slack Guidelines](./slack-guidelines.md)
|
||||
|
||||
### Zoom
|
||||
|
||||
- [Zoom Guidelines](./zoom-guidelines.md)
|
||||
|
||||
|
||||
### References and Resources
|
||||
|
||||
Thanks to the following projects for making their moderation guidelines public, allowing us to build on the shoulders of giants.
|
||||
Moderators are encouraged to learn how other projects moderate and learn from them in order to improve our guidelines:
|
||||
|
||||
- Mozilla's [Forum Moderation Guidelines](https://support.mozilla.org/en-US/kb/moderation-guidelines)
|
||||
- OASIS [How to Moderate a Mailing List](https://www.oasis-open.org/khelp/kmlm/user_help/html/mailing_list_moderation.html)
|
||||
- Community Spark's [How to effectively moderate forums](http://www.communityspark.com/how-to-effectively-moderate-forums/)
|
||||
- [5 tips for more effective community moderation](https://www.socialmediatoday.com/social-business/5-tips-more-effective-community-moderation)
|
||||
- [8 Helpful Moderation Tips for Community Managers](https://sproutsocial.com/insights/tips-community-managers/)
|
||||
- [Setting Up Community Guidelines for Moderation](https://www.getopensocial.com/blog/community-management/setting-community-guidelines-moderation)
|
||||
|
|
@ -0,0 +1,45 @@
|
|||
# Zoom Guidelines
|
||||
|
||||
Zoom is the main video communication platform for Kubernetes.
|
||||
It is used for running the [community meeting](https://github.com/kubernetes/community/blob/master/events/community-meeting.md) and SIG meetings.
|
||||
Since the Zoom meetings are open to the general public, a Zoom host has to moderate a meeting if a person is in violation of the code of conduct.
|
||||
|
||||
These guidelines are meant as a tool to help Kubernetes members manage their Zoom resources.
|
||||
Check the main [moderation](./moderation.md) page for more information on other tools and general moderation guidelines.
|
||||
|
||||
## Code of Conduct
|
||||
Kubernetes adheres to the Cloud Native Computing Foundation's [Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md) throughout the project; this includes all communication mediums.
|
||||
|
||||
## Moderation
|
||||
|
||||
Zoom has documentation on how to use their moderation tools:
|
||||
|
||||
- https://support.zoom.us/hc/en-us/articles/201362603-Host-Controls-in-a-Meeting
|
||||
|
||||
Check the "Screen Share Controls" (via the ^ next to Share Screen): Select who can share in your meeting and if you want only the host or any participant to be able to start a new share when someone is sharing.
|
||||
|
||||
You can also put an attendee on hold, which allows the host(s) to temporarily remove an attendee from the meeting.
|
||||
|
||||
Unfortunately, Zoom doesn't have the ability to ban or block people from joining - especially if they have the invitation to that channel and the meeting id is publicly known.
|
||||
|
||||
It is required that a host be comfortable with how to use these moderation tools. It is strongly encouraged that at least two people in a given SIG are comfortable with the moderation tools.
|
||||
|
||||
## Meeting Archive Videos
|
||||
|
||||
If a violation has been addressed by a host and it has been recorded by Zoom, the video should be edited before being posted on the [Kubernetes channel](https://www.youtube.com/c/kubernetescommunity).
|
||||
|
||||
Contact [SIG Contributor Experience](https://github.com/kubernetes/community/tree/master/sig-contributor-experience) if you need help to edit a video before posting it to the public.
|
||||
|
||||
## Admins
|
||||
|
||||
- @parispittman
|
||||
- @castrojo
|
||||
|
||||
Each SIG should have at least one person with a paid Zoom account.
|
||||
See the [SIG Creation procedure](https://github.com/kubernetes/community/blob/master/sig-governance.md#sig-creation-procedure) document on how to set up an initial account.
|
||||
|
||||
The Zoom licenses are managed by the [CNCF Service Desk](https://github.com/cncf/servicedesk).
|
||||
|
||||
## Escalating and/or Reporting a Problem
|
||||
|
||||
Issues that cannot be handled via normal moderation can be escalated to the [Kubernetes steering committee](https://github.com/kubernetes/steering).
|
||||
|
|
@ -222,7 +222,7 @@ The following apply to the subproject for which one would be an owner.
|
|||
|
||||
**Status:** Removed
|
||||
|
||||
The Maintainer role has been removed and replaced with a greater focus on [owner](#owner)s.
|
||||
The Maintainer role has been removed and replaced with a greater focus on [OWNERS].
|
||||
|
||||
[code reviews]: contributors/devel/collab.md
|
||||
[community expectations]: contributors/guide/community-expectations.md
|
||||
|
|
|
|||
|
|
@ -31,7 +31,7 @@ aggregated servers.
|
|||
* Developers should be able to write their own API server and cluster admins
|
||||
should be able to add them to their cluster, exposing new APIs at runtime. All
|
||||
of this should not require any change to the core kubernetes API server.
|
||||
* These new APIs should be seamless extension of the core kubernetes APIs (ex:
|
||||
* These new APIs should be seamless extensions of the core kubernetes APIs (ex:
|
||||
they should be operated upon via kubectl).
|
||||
|
||||
## Non Goals
|
||||
|
|
|
|||
|
|
@ -390,7 +390,7 @@ the following command.
|
|||
|
||||
### Rollback
|
||||
|
||||
For future work, `kubeclt rollout undo` can be implemented in the general case
|
||||
For future work, `kubectl rollout undo` can be implemented in the general case
|
||||
as an extension of the [above](#viewing-history ).
|
||||
|
||||
```bash
|
||||
|
|
|
|||
|
|
@ -42,7 +42,7 @@ Here are some potential requirements that haven't been covered by this proposal:
|
|||
- Uptime is critical for each pod of a DaemonSet during an upgrade (e.g. the time
|
||||
from a DaemonSet pods being killed to recreated and healthy should be < 5s)
|
||||
- Each DaemonSet pod can still fit on the node after being updated
|
||||
- Some DaemonSets require the node to be drained before the DeamonSet's pod on it
|
||||
- Some DaemonSets require the node to be drained before the DaemonSet's pod on it
|
||||
is updated (e.g. logging daemons)
|
||||
- DaemonSet's pods are implicitly given higher priority than non-daemons
|
||||
- DaemonSets can only be operated by admins (i.e. people who manage nodes)
|
||||
|
|
|
|||
|
|
@ -747,7 +747,7 @@ kubectl rollout undo statefulset web
|
|||
### Rolling Forward
|
||||
Rolling back is usually the safest, and often the fastest, strategy to mitigate
|
||||
deployment failure, but rolling forward is sometimes the only practical solution
|
||||
for stateful applications (e.g. A users has a minor configuration error but has
|
||||
for stateful applications (e.g. A user has a minor configuration error but has
|
||||
already modified the storage format for the application). Users can use
|
||||
sequential `kubectl apply`'s to update the StatefulSet's current
|
||||
[target state](#target-state). The StatefulSet's `.Spec.GenerationPartition`
|
||||
|
|
|
|||
|
|
@ -0,0 +1,93 @@
|
|||
# ProcMount/ProcMountType Option
|
||||
|
||||
## Background
|
||||
|
||||
Currently the way docker and most other container runtimes work is by masking
|
||||
and setting as read-only certain paths in `/proc`. This is to prevent data
that should not be exposed from being exposed into a container. However, there are
|
||||
certain use-cases where it is necessary to turn this off.
|
||||
|
||||
## Motivation
|
||||
|
||||
For end-users who would like to run unprivileged containers using user namespaces
|
||||
_nested inside_ CRI containers, we need an option to have a `ProcMount`. That is,
|
||||
we need an option to explicitly turn off the masking and read-only setting of
these paths so that we can
|
||||
mount `/proc` in the nested container as an unprivileged user.
|
||||
|
||||
Please see the following filed issues for more information:
|
||||
- [opencontainers/runc#1658](https://github.com/opencontainers/runc/issues/1658#issuecomment-373122073)
|
||||
- [moby/moby#36597](https://github.com/moby/moby/issues/36597)
|
||||
- [moby/moby#36644](https://github.com/moby/moby/pull/36644)
|
||||
|
||||
Please also see the [use case for building images securely in kubernetes](https://github.com/jessfraz/blog/blob/master/content/post/building-container-images-securely-on-kubernetes.md).
|
||||
|
||||
Unmasking the paths in `/proc` really only makes sense when a user is nesting
unprivileged containers with user namespaces, as it exposes more information
than is necessary to the program running in the container spawned by
Kubernetes.
|
||||
|
||||
The main use case for this option is to run
|
||||
[genuinetools/img](https://github.com/genuinetools/img) inside a kubernetes
|
||||
container. That program then launches sub-containers that take advantage of
|
||||
user namespaces and re-mask /proc and set /proc as read-only. Therefore
|
||||
there is no concern with having an unmasked proc open in the top level container.
|
||||
|
||||
It should be noted that this is different from the host /proc. It is still
a newly mounted /proc; the container runtimes just will not mask the paths.
|
||||
|
||||
Since the only use case for this option is to run unprivileged nested
|
||||
containers,
|
||||
this option should only be allowed or used if the user in the container is not `root`.
|
||||
This can be easily enforced with `MustRunAs`.
|
||||
Since the user inside is still unprivileged,
|
||||
doing things to `/proc` would be off limits regardless, since linux user
|
||||
support already prevents this.
|
||||
|
||||
## Existing SecurityContext objects
|
||||
|
||||
Kubernetes defines `SecurityContext` for `Container` and `PodSecurityContext`
|
||||
for `PodSpec`. `SecurityContext` objects define the related security options
|
||||
for Kubernetes containers, e.g. selinux options.
|
||||
|
||||
To support "ProcMount" options in Kubernetes, it is proposed to make
|
||||
the following changes:
|
||||
|
||||
## Changes of SecurityContext objects
|
||||
|
||||
Add a new `string` type named `ProcMountType`, which will hold the viable
options for `procMount`, to the `SecurityContext`
definition.
|
||||
|
||||
By default, `procMount` is `default`, i.e. the same behavior as today, and the
|
||||
paths are masked.
|
||||
|
||||
This will look like the following in the spec:
|
||||
|
||||
```go
|
||||
type ProcMountType string
|
||||
|
||||
const (
|
||||
// DefaultProcMount uses the container runtime default ProcType. Most
|
||||
// container runtimes mask certain paths in /proc to avoid accidental security
|
||||
// exposure of special devices or information.
|
||||
DefaultProcMount ProcMountType = "Default"
|
||||
|
||||
// UnmaskedProcMount bypasses the default masking behavior of the container
|
||||
// runtime and ensures the newly created /proc for the container stays intact with
|
||||
// no modifications.
|
||||
UnmaskedProcMount ProcMountType = "Unmasked"
|
||||
)
|
||||
|
||||
procMount *ProcMountType
|
||||
```
|
||||
|
||||
This requires changes to the CRI runtime integrations so that
|
||||
kubelet will add the specific `unmasked` or `whatever_it_is_named` option.
|
||||
|
||||
## Pod Security Policy changes
|
||||
|
||||
A new `[]ProcMountType{}` field named `allowedProcMounts` will be added to the Pod
|
||||
Security Policy as well to gate which ProcMountTypes a user is allowed to
|
||||
set. This field will default to `[]ProcMountType{ DefaultProcMount }`.
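
A minimal sketch of how the gating described above could work at admission time, assuming the field shape proposed here; the helper name `procMountAllowed` and the empty-list default are illustrative, not the final API:

```go
// Hypothetical admission-side check: gate the requested procMount value
// against the policy's allowedProcMounts list. Names are illustrative.
package main

import "fmt"

type ProcMountType string

const (
	DefaultProcMount  ProcMountType = "Default"
	UnmaskedProcMount ProcMountType = "Unmasked"
)

// procMountAllowed returns true if the requested ProcMountType is listed in
// allowedProcMounts; an empty list permits only the default behavior.
func procMountAllowed(requested ProcMountType, allowed []ProcMountType) bool {
	if len(allowed) == 0 {
		return requested == DefaultProcMount
	}
	for _, a := range allowed {
		if a == requested {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(procMountAllowed(UnmaskedProcMount, []ProcMountType{DefaultProcMount}))                    // false
	fmt.Println(procMountAllowed(UnmaskedProcMount, []ProcMountType{DefaultProcMount, UnmaskedProcMount})) // true
}
```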
|
||||
|
|
@ -0,0 +1,89 @@
|
|||
# Support traffic shaping for CNI network plugin
|
||||
|
||||
Version: Alpha
|
||||
|
||||
Authors: @m1093782566
|
||||
|
||||
## Motivation and background
|
||||
|
||||
Currently the kubenet code supports applying basic traffic shaping during pod setup. This will happen if bandwidth-related annotations have been added to the pod's metadata, for example:
|
||||
|
||||
```json
|
||||
{
|
||||
"kind": "Pod",
|
||||
"metadata": {
|
||||
"name": "iperf-slow",
|
||||
"annotations": {
|
||||
"kubernetes.io/ingress-bandwidth": "10M",
|
||||
"kubernetes.io/egress-bandwidth": "10M"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Our current implementation uses Linux `tc` to add a download (ingress) and upload (egress) rate limiter using 1 root `qdisc`, 2 `class`es (one for ingress and one for egress) and 2 `filter`s (one for ingress and one for egress, attached to the ingress and egress classes respectively).
|
||||
|
||||
Kubelet CNI code doesn't support it yet, though CNI has already added a [traffic shaping plugin](https://github.com/containernetworking/plugins/tree/master/plugins/meta/bandwidth). We can replicate the behavior we have today in kubenet for the kubelet CNI network plugin if we feel this is an important feature.
|
||||
|
||||
## Goal
|
||||
|
||||
Support traffic shaping for CNI network plugin in Kubernetes.
|
||||
|
||||
## Non-goal
|
||||
|
||||
CNI plugins to implement this sort of traffic shaping guarantee.
|
||||
|
||||
## Proposal
|
||||
|
||||
If kubelet starts up with `network-plugin = cni` and the user has enabled traffic shaping via the network plugin configuration, it would then populate the `runtimeConfig` section of the config when calling the `bandwidth` plugin.
|
||||
|
||||
Traffic shaping in Kubelet CNI network plugin can work with ptp and bridge network plugins.
|
||||
|
||||
### Pod Setup
|
||||
|
||||
When we create a pod with bandwidth configuration in its metadata, for example,
|
||||
|
||||
```json
|
||||
{
|
||||
"kind": "Pod",
|
||||
"metadata": {
|
||||
"name": "iperf-slow",
|
||||
"annotations": {
|
||||
"kubernetes.io/ingress-bandwidth": "10M",
|
||||
"kubernetes.io/egress-bandwidth": "10M"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Kubelet would first parse the ingress and egress bandwidth values and transform them to Kbps, because both `ingressRate` and `egressRate` in the CNI bandwidth plugin are in Kbps (a conversion sketch follows the config examples below). A user would add something like this to their CNI config list if they want to enable traffic shaping via the plugin:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "bandwidth",
|
||||
"capabilities": {"trafficShaping": true}
|
||||
}
|
||||
```
|
||||
|
||||
Kubelet would then populate the `runtimeConfig` section of the config when calling the `bandwidth` plugin:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "bandwidth",
|
||||
"runtimeConfig": {
|
||||
"trafficShaping": {
|
||||
"ingressRate": "X",
|
||||
"egressRate": "Y"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
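
A rough sketch of the parsing and unit-conversion step mentioned above, assuming the annotation values are parsed with `resource.Quantity`; the helper name is illustrative, not the actual kubelet code:

```go
// Illustrative: parse a bandwidth annotation such as "10M" (bits per second)
// and convert it to Kbps for the CNI bandwidth plugin's runtimeConfig.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func bandwidthAnnotationToKbps(value string) (int64, error) {
	q, err := resource.ParseQuantity(value)
	if err != nil {
		return 0, fmt.Errorf("invalid bandwidth value %q: %v", value, err)
	}
	// Quantity.Value() returns the amount in base units (bits/s here);
	// the bandwidth plugin expects kilobits per second.
	return q.Value() / 1000, nil
}

func main() {
	kbps, err := bandwidthAnnotationToKbps("10M")
	if err != nil {
		panic(err)
	}
	fmt.Println(kbps) // 10000
}
```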
|
||||
|
||||
### Pod Teardown
|
||||
|
||||
When we delete a pod, kubelet will build the runtime config for calling the CNI plugin `DelNetwork/DelNetworkList` API, which will remove this pod's bandwidth configuration.
|
||||
|
||||
## Next step
|
||||
|
||||
* Support ingress and egress burst bandwidth in Pod.
|
||||
* Graduate annotations to Pod Spec.
|
||||
|
|
@ -85,7 +85,7 @@ The implementation will mainly be in two parts:
|
|||
|
||||
In both parts, we need to implement:
|
||||
* Fork code for Windows from Linux.
|
||||
* Convert from Resources.Requests and Resources.Limits to Windows configuration in CRI, and convert from Windows configration in CRI to container configuration.
|
||||
* Convert from Resources.Requests and Resources.Limits to Windows configuration in CRI, and convert from Windows configuration in CRI to container configuration.
|
||||
|
||||
To implement resource controls for Windows containers, refer to [this MSDN documentation](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/resource-controls) and [Docker's conversion to OCI spec](https://github.com/moby/moby/blob/master/daemon/oci_windows.go).
|
||||
|
||||
|
|
|
|||
|
|
@ -142,11 +142,22 @@ extend this by maintaining a metadata file in the pod directory.
|
|||
**Log format**
|
||||
|
||||
The runtime should decorate each log entry with a RFC 3339Nano timestamp
|
||||
prefix, the stream type (i.e., "stdout" or "stderr"), and ends with a newline.
|
||||
prefix, the stream type (i.e., "stdout" or "stderr"), the tags of the log
entry, and the log content, which ends with a newline.
|
||||
|
||||
The `tags` field can support multiple tags, delimited by `:`. Currently, only
|
||||
one tag is defined in CRI to support multi-line log entries: partial or full.
|
||||
Partial (`P`) is used when a log entry is split into multiple lines by the
|
||||
runtime, and the entry has not ended yet. Full (`F`) indicates that the log
|
||||
entry is completed -- it is either a single-line entry, or this is the last
|
||||
line of the multiple-line entry.
|
||||
|
||||
For example,
|
||||
```
|
||||
2016-10-06T00:17:09.669794202Z stdout The content of the log entry 1
|
||||
2016-10-06T00:17:10.113242941Z stderr The content of the log entry 2
|
||||
2016-10-06T00:17:09.669794202Z stdout F The content of the log entry 1
|
||||
2016-10-06T00:17:09.669794202Z stdout P First line of log entry 2
|
||||
2016-10-06T00:17:09.669794202Z stdout P Second line of the log entry 2
|
||||
2016-10-06T00:17:10.113242941Z stderr F Last line of the log entry 2
|
||||
```
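
A small sketch of how a consumer could split one such log line into its timestamp, stream, tag, and content fields per the format above; the struct and function names are illustrative only:

```go
// Illustrative parser for one CRI log line:
// <RFC3339Nano timestamp> <stream> <tags> <content>
package main

import (
	"fmt"
	"strings"
	"time"
)

type logEntry struct {
	Timestamp time.Time
	Stream    string // "stdout" or "stderr"
	Tags      string // e.g. "F" or "P", possibly ":"-delimited in the future
	Content   string
}

func parseCRILogLine(line string) (*logEntry, error) {
	parts := strings.SplitN(line, " ", 4)
	if len(parts) != 4 {
		return nil, fmt.Errorf("malformed log line: %q", line)
	}
	ts, err := time.Parse(time.RFC3339Nano, parts[0])
	if err != nil {
		return nil, err
	}
	return &logEntry{Timestamp: ts, Stream: parts[1], Tags: parts[2], Content: parts[3]}, nil
}

func main() {
	e, _ := parseCRILogLine("2016-10-06T00:17:09.669794202Z stdout F The content of the log entry 1")
	fmt.Println(e.Stream, e.Tags, e.Content)
}
```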
|
||||
|
||||
With this knowledge, kubelet can parse the logs and serve them for `kubectl
|
||||
|
|
|
|||
|
|
@ -0,0 +1,209 @@
|
|||
# Support Node-Level User Namespaces Remapping
|
||||
|
||||
- [Summary](#summary)
|
||||
- [Motivation](#motivation)
|
||||
- [Goals](#goals)
|
||||
- [Non-Goals](#non-goals)
|
||||
- [User Stories](#user-stories)
|
||||
- [Proposal](#proposal)
|
||||
- [Future Work](#future-work)
|
||||
- [Risks and Mitigations](#risks-and-mitigations)
- [Graduation Criteria](#graduation-criteria)
- [Alternatives](#alternatives)
|
||||
|
||||
|
||||
_Authors:_
|
||||
|
||||
* Mrunal Patel <mpatel@redhat.com>
|
||||
* Jan Pazdziora <jpazdziora@redhat.com>
|
||||
* Vikas Choudhary <vichoudh@redhat.com>
|
||||
|
||||
## Summary
|
||||
Container security consists of many different kernel features that work together to make containers secure. User namespaces is one such feature that enables interesting possibilities for containers by allowing them to be root inside the container while not being root on the host. This gives more capabilities to the containers while protecting the host from the container being root and adds one more layer to container security.
|
||||
In this proposal we discuss:
|
||||
- use-cases/user-stories that benefit from this enhancement
|
||||
- implementation design and scope for alpha release
|
||||
- long-term roadmap to fully support this feature beyond alpha
|
||||
|
||||
## Motivation
|
||||
From user_namespaces(7):
|
||||
> User namespaces isolate security-related identifiers and attributes, in particular, user IDs and group IDs, the root directory, keys, and capabilities. A process's user and group IDs can be different inside and outside a user namespace. In particular, a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace.
|
||||
|
||||
In order to run Pods with software which expects to run as root or with elevated privileges, while still containing the processes and protecting both the Nodes and other Pods, the Linux kernel mechanism of user namespaces can be used to make the processes in the Pods view their environment as having those privileges, while on the host (Node) level these processes appear to be without privileges, or with privileges only affecting processes in the same Pods.
|
||||
|
||||
The purpose of using user namespaces in Kubernetes is to let the processes in Pods think they run as one uid set when in fact they run as different “real” uids on the Nodes.
|
||||
|
||||
In this text, most everything said about uids can also be applied to gids.
|
||||
|
||||
## Goals
|
||||
Enable user namespace support in a kubernetes cluster so that workloads that work today also work with user namespaces enabled at runtime. Furthermore, make workloads that require root/privileged user inside the container, safer for the node using the additional security of user namespaces. Containers will run in a user namespace different from user-namespace of the underlying host.
|
||||
|
||||
## Non-Goals
|
||||
- Non-goal is to support pod/container level user namespace isolation. There can be images using different users but on the node, pods/containers running with these images will share common user namespace remapping configuration. In other words, all containers on a node share a common user-namespace range.
|
||||
- Remote volumes support, e.g. NFS
|
||||
|
||||
## User Stories
|
||||
- As a cluster admin, I want to protect the node from the rogue container process(es) running inside pod containers with root privileges. If such a process is able to break out into the node, it could be a security issue.
|
||||
- As a cluster admin, I want to support all the images irrespective of what user/group that image is using.
|
||||
- As a cluster admin, I want to allow some pods to disable user namespaces if they require elevated privileges.
|
||||
|
||||
## Proposal
|
||||
Proposal is to support user-namespaces for the pod containers. This can be done at two levels:
|
||||
- Node-level : This proposal explains this part in detail.
|
||||
- Namespace-Level/Pod-level: The plan is to target this in the future due to missing support in the low-level system components such as runtimes and the kernel. More on this in the `Future Work` section.
|
||||
|
||||
Node-level user-namespace support means that, if the feature is enabled, all pods on a node will share a common user-namespace and a common UID (and GID) range (which is a subset of the node’s total UIDs and GIDs). This common user-namespace is the runtime’s default user-namespace range, which is remapped to the containers’ UIDs (and GIDs), starting with the first UID as the container’s ‘root’.
|
||||
Per the general Linux convention, a UID (or GID) mapping consists of three parts:
1. Host (U/G)ID: First (U/G)ID of the range on the host that is being remapped to the (U/G)IDs in the container user-namespace
2. Container (U/G)ID: First (U/G)ID of the range in the container namespace; this is mapped to the first (U/G)ID on the host (mentioned in the previous point).
3. Count/Size: Total number of consecutive mappings between the host and container user-namespaces, starting from (and including) the first ones mentioned above.
|
||||
|
||||
As an example, `host_id 1000, container_id 0, size 10`
|
||||
In this case, 1000 to 1009 on host will be mapped to 0 to 9 inside the container.
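
An illustrative sketch (not proposal code) of translating a container UID to the corresponding host UID under a single mapping entry like the example above; the type and method names are made up for clarity:

```go
// Illustrative container-to-host id translation for one mapping entry.
package main

import "fmt"

type idMapping struct {
	HostID      uint32 // first host id of the range
	ContainerID uint32 // first container id of the range
	Size        uint32 // number of consecutive ids mapped
}

// toHostID returns the host id for a container id, and whether it is covered
// by the mapping.
func (m idMapping) toHostID(containerID uint32) (uint32, bool) {
	if containerID < m.ContainerID || containerID >= m.ContainerID+m.Size {
		return 0, false
	}
	return m.HostID + (containerID - m.ContainerID), true
}

func main() {
	m := idMapping{HostID: 1000, ContainerID: 0, Size: 10}
	host, ok := m.toHostID(0)
	fmt.Println(host, ok) // 1000 true — container root maps to host uid 1000
}
```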
|
||||
|
||||
User-namespace support should be enabled only when the container runtime on the node supports user-namespace remapping and has it enabled in its configuration. To enable user-namespaces, a feature-gate flag will need to be passed to Kubelet like this: `--feature-gates="NodeUserNamespace=true"`
|
||||
|
||||
A new CRI API, `GetRuntimeConfigInfo` will be added. Kubelet will use this API:
|
||||
- To verify if user-namespace remapping is enabled at runtime. If found disabled, kubelet will fail to start
|
||||
- To determine the default user-namespace range at the runtime, starting UID of which is mapped to the UID '0' of the container.
|
||||
|
||||
### Volume Permissions
|
||||
Kubelet will change the file permissions (i.e. chown) at `/var/lib/kubelet/pods` prior to any container start, so that file ownership is updated according to the remapped UID and GID (see the sketch below).
|
||||
This proposal will work only for local volumes and not with remote volumes such as NFS.
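
A rough illustration of the re-owning step described above, not the actual kubelet code; the function name, error handling, and the example host uid/gid are illustrative assumptions:

```go
// Illustrative: chown a pod's directory tree under /var/lib/kubelet/pods to
// the host uid/gid that maps to root inside the containers' user namespace.
package main

import (
	"os"
	"path/filepath"
)

func chownPodDir(podDir string, hostRootUID, hostRootGID int) error {
	return filepath.Walk(podDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		return os.Chown(path, hostRootUID, hostRootGID)
	})
}

func main() {
	// e.g. host uid/gid 2131616 maps to uid/gid 0 inside the containers
	_ = chownPodDir("/var/lib/kubelet/pods/<pod-uid>", 2131616, 2131616)
}
```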
|
||||
|
||||
### How to disable `NodeUserNamespace` for a specific pod
|
||||
This can be done in two ways:
|
||||
- **Alpha:** Implicitly using host namespace for the pod containers
|
||||
This support is already present (currently it seems broken, will be fixed) in Kubernetes as an experimental functionality, which can be enabled using `feature-gates="ExperimentalHostUserNamespaceDefaulting=true"`.
|
||||
If Pod-Security-Policy is configured to allow the following to be requested by a pod, host user-namespace will be enabled for the container:
|
||||
- host namespaces (pid, ipc, net)
|
||||
- non-namespaced capabilities (mknod, sys_time, sys_module)
|
||||
- the pod contains a privileged container or uses host path volumes.
|
||||
- https://github.com/kubernetes/kubernetes/commit/d0d78f478ce0fb9d5e121db3b7c6993b482af82c#diff-a53fa76e941e0bdaee26dcbc435ad2ffR437 introduced via https://github.com/kubernetes/kubernetes/commit/d0d78f478ce0fb9d5e121db3b7c6993b482af82c.
|
||||
|
||||
- **Beta:** Explicit API to request host user-namespace in pod spec
|
||||
This is being targeted under Beta graduation plans.
|
||||
|
||||
### CRI API Changes
|
||||
Proposed CRI API changes:
|
||||
|
||||
```golang
|
||||
// Runtime service defines the public APIs for remote container runtimes
|
||||
service RuntimeService {
|
||||
// Version returns the runtime name, runtime version, and runtime API version.
|
||||
rpc Version(VersionRequest) returns (VersionResponse) {}
|
||||
…….
|
||||
…….
|
||||
// GetRuntimeConfigInfo returns the configuration details of the runtime.
|
||||
rpc GetRuntimeConfigInfo(GetRuntimeConfigInfoRequest) returns (GetRuntimeConfigInfoResponse) {}
|
||||
}
|
||||
// LinuxIDMapping represents a single user namespace mapping in Linux.
|
||||
message LinuxIDMapping {
|
||||
// container_id is the starting id for the mapping inside the container.
|
||||
uint32 container_id = 1;
|
||||
// host_id is the starting id for the mapping on the host.
|
||||
uint32 host_id = 2;
|
||||
// size is the length of the mapping.
|
||||
uint32 size = 3;
|
||||
}
|
||||
|
||||
message LinuxUserNamespaceConfig {
|
||||
// is_enabled, if true indicates that user-namespaces are supported and enabled in the container runtime
|
||||
bool is_enabled = 1;
|
||||
// uid_mappings is an array of user id mappings.
|
||||
repeated LinuxIDMapping uid_mappings = 2;
|
||||
// gid_mappings is an array of group id mappings.
|
||||
repeated LinuxIDMapping gid_mappings = 3;
|
||||
}
|
||||
message GetRuntimeConfig {
|
||||
LinuxUserNamespaceConfig user_namespace_config = 1;
|
||||
}
|
||||
|
||||
message GetRuntimeConfigInfoRequest {}
|
||||
|
||||
message GetRuntimeConfigInfoResponse {
|
||||
GetRuntimeConfig runtime_config = 1;
|
||||
}
|
||||
|
||||
...
|
||||
|
||||
// NamespaceOption provides options for Linux namespaces.
|
||||
message NamespaceOption {
|
||||
// Network namespace for this container/sandbox.
|
||||
// Note: There is currently no way to set CONTAINER scoped network in the Kubernetes API.
|
||||
// Namespaces currently set by the kubelet: POD, NODE
|
||||
NamespaceMode network = 1;
|
||||
// PID namespace for this container/sandbox.
|
||||
// Note: The CRI default is POD, but the v1.PodSpec default is CONTAINER.
|
||||
// The kubelet's runtime manager will set this to CONTAINER explicitly for v1 pods.
|
||||
// Namespaces currently set by the kubelet: POD, CONTAINER, NODE
|
||||
NamespaceMode pid = 2;
|
||||
// IPC namespace for this container/sandbox.
|
||||
// Note: There is currently no way to set CONTAINER scoped IPC in the Kubernetes API.
|
||||
// Namespaces currently set by the kubelet: POD, NODE
|
||||
NamespaceMode ipc = 3;
|
||||
// User namespace for this container/sandbox.
|
||||
// Note: There is currently no way to set CONTAINER scoped user namespace in the Kubernetes API.
|
||||
// The container runtime should ignore this if user namespace is NOT enabled.
|
||||
// POD is the default value. Kubelet will set it to NODE when trying to use host user-namespace
|
||||
// Namespaces currently set by the kubelet: POD, NODE
|
||||
NamespaceMode user = 4;
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
### Runtime Support
|
||||
- Docker: Here is the [user-namespace documentation](https://docs.docker.com/engine/security/userns-remap/) and this is the [implementation PR](https://github.com/moby/moby/pull/12648)
|
||||
- Concerns:
|
||||
Docker API does not provide the user-namespace mapping. Therefore, to handle the `GetRuntimeConfigInfo` API, changes will be done in `dockershim` to read the system files `/etc/subuid` and `/etc/subgid` to figure out the default user-namespace mapping (see the sketch after this list). The `/info` API will be used to figure out if user-namespace is enabled, and `Docker Root Dir` will be used to figure out the host uid mapped to uid `0` in the container; e.g. `Docker Root Dir: /var/lib/docker/2131616.2131616` shows that host uid `2131616` will be mapped to uid `0`.
|
||||
- CRI-O: https://github.com/kubernetes-incubator/cri-o/pull/1519
|
||||
- Containerd: https://github.com/containerd/containerd/blob/129167132c5e0dbd1b031badae201a432d1bd681/container_opts_unix.go#L149
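
A minimal sketch of the `/etc/subuid` parsing mentioned in the dockershim concerns above, assuming the standard `name:start:count` format; the function and type names are illustrative, not actual dockershim code:

```go
// Illustrative: parse /etc/subuid (or /etc/subgid) entries of the form
// "dockremap:2131616:65536" into start/size ranges for a given user.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

type subIDRange struct {
	Start uint32
	Size  uint32
}

func parseSubIDFile(path, user string) ([]subIDRange, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var ranges []subIDRange
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Split(strings.TrimSpace(scanner.Text()), ":")
		if len(fields) != 3 || fields[0] != user {
			continue
		}
		start, err1 := strconv.ParseUint(fields[1], 10, 32)
		size, err2 := strconv.ParseUint(fields[2], 10, 32)
		if err1 != nil || err2 != nil {
			continue
		}
		ranges = append(ranges, subIDRange{Start: uint32(start), Size: uint32(size)})
	}
	return ranges, scanner.Err()
}

func main() {
	r, err := parseSubIDFile("/etc/subuid", "dockremap")
	fmt.Println(r, err)
}
```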
|
||||
|
||||
### Implementation Roadmap
|
||||
#### Phase 1: Support in Kubelet, Alpha, [Target: Kubernetes v1.11]
|
||||
- Add feature gate `NodeUserNamespace`, disabled by default
|
||||
- Add new CRI API, `GetRuntimeConfigInfo()`
|
||||
- Add logic in Kubelet to handle pod creation which includes parsing GetRuntimeConfigInfo response and changing file-permissions in /var/lib/kubelet with learned userns mapping.
|
||||
- Add changes in dockershim to implement GetRuntimeConfigInfo() for docker runtime
|
||||
- Add changes in CRI-O to implement userns support and GetRuntimeConfigInfo() support
|
||||
- Unit test cases
|
||||
- e2e tests
|
||||
|
||||
#### Phase 2: Beta Support [Target: Kubernetes v1.12]
|
||||
- PSP integration
|
||||
- To grow ExperimentalHostUserNamespaceDefaulting from experimental feature gate to a Kubelet flag
|
||||
- API changes to allow a pod to request HostUserNamespace in the pod spec
|
||||
- e2e tests
|
||||
|
||||
### References
|
||||
- Default host user namespace via experimental flag
|
||||
- https://github.com/kubernetes/kubernetes/pull/31169
|
||||
- Enable userns support for containers launched by kubelet
|
||||
- https://github.com/kubernetes/features/issues/127
|
||||
- Track Linux User Namespaces in the Pod Security Policy
|
||||
- https://github.com/kubernetes/kubernetes/issues/59152
|
||||
- Add support for experimental-userns-remap-root-uid and experimental-userns-remap-root-gid options to match the remapping used by the container runtime.
|
||||
- https://github.com/kubernetes/kubernetes/pull/55707
|
||||
- rkt User Namespaces Background
|
||||
- https://coreos.com/rkt/docs/latest/devel/user-namespaces.html
|
||||
|
||||
## Future Work
|
||||
### Namespace-Level/Pod-Level user-namespace support
|
||||
There is no runtime today which supports creating containers with a specified user namespace configuration. For example here is the discussion related to this support in Docker https://github.com/moby/moby/issues/28593
|
||||
Once the user-namespace feature in the runtimes has evolved to support a container’s request for a specific user-namespace mapping (UID and GID range), we can extend the current Node-Level user-namespace support in Kubernetes to support Namespace-level isolation (or, if desired, even pod-level isolation) by dividing and allocating the mapping learned from the runtime among Kubernetes namespaces (or pods, if desired). From an end-user UI perspective, we don't expect any change in the UI related to user namespaces support.
|
||||
### Remote Volumes
|
||||
Remote Volumes support should be investigated and targeted in the future once support exists at the lower infrastructure layers.
|
||||
|
||||
|
||||
## Risks and Mitigations
|
||||
The main risk with this change stems from the fact that processes in Pods will run with different “real” uids than they used to, while expecting the original uids to make operations on the Nodes or consistently access shared persistent storage.
|
||||
- This can be mitigated by turning the feature on gradually, per-Pod or per Kubernetes namespace.
|
||||
- For the Kubernetes cluster's own Pods (those that provide the Kubernetes functionality), testing their behaviour and ability to run in user-namespaced setups is crucial.
|
||||
|
||||
## Graduation Criteria
|
||||
- PSP integration
|
||||
- API changes to allow a pod to request the host user namespace using, for example, `HostUserNamespace: True`, in the pod spec
|
||||
- e2e tests
|
||||
|
||||
## Alternatives
|
||||
User Namespace mappings could be passed explicitly through kubelet flags, similar to https://github.com/kubernetes/kubernetes/pull/55707, but we do not prefer this option because it is very prone to misconfiguration.
|
||||
|
|
@ -28,7 +28,7 @@ implied. However, describing the process as "moving" the pod is approximately ac
|
|||
and easier to understand, so we will use this terminology in the document.
|
||||
|
||||
We use the term "rescheduling" to describe any action the system takes to move an
|
||||
already-running pod. The decision may be made and executed by any component; we wil
|
||||
already-running pod. The decision may be made and executed by any component; we will
|
||||
introduce the concept of a "rescheduler" component later, but it is not the only
|
||||
component that can do rescheduling.
|
||||
|
||||
|
|
|
|||
|
|
@ -19,8 +19,8 @@ In addition to this, with taint-based-eviction, the Node Controller already tain
|
|||
| ------------------ | ------------------ | ------------ | -------- |
|
||||
|Ready |True | - | |
|
||||
| |False | NoExecute | node.kubernetes.io/not-ready |
|
||||
| |Unknown | NoExecute | node.kubernetes.io/unreachable |
|
||||
|OutOfDisk |True | NoSchedule | node.kubernetes.io/out-of-disk |
|
||||
| |Unknown | NoExecute | node.kubernetes.io/unreachable |
|
||||
|OutOfDisk |True | NoSchedule | node.kubernetes.io/out-of-disk |
|
||||
| |False | - | |
|
||||
| |Unknown | - | |
|
||||
|MemoryPressure |True | NoSchedule | node.kubernetes.io/memory-pressure |
|
||||
|
|
@ -32,6 +32,9 @@ In addition to this, with taint-based-eviction, the Node Controller already tain
|
|||
|NetworkUnavailable |True | NoSchedule | node.kubernetes.io/network-unavailable |
|
||||
| |False | - | |
|
||||
| |Unknown | - | |
|
||||
|PIDPressure |True | NoSchedule | node.kubernetes.io/pid-pressure |
|
||||
| |False | - | |
|
||||
| |Unknown | - | |
|
||||
|
||||
For example, if a CNI network is not detected on the node (e.g. a network is unavailable), the Node Controller will taint the node with `node.kubernetes.io/network-unavailable=:NoSchedule`. This will then allow users to add a toleration to their `PodSpec`, ensuring that the pod can be scheduled to this node if necessary. If the kubelet does not update the node’s status within a grace period, the Node Controller will only taint the node with `node.kubernetes.io/unreachable`; it will not taint the node for any unknown condition.
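
A hedged sketch of the kind of toleration a user could add for that taint, expressed with the `k8s.io/api/core/v1` Go types; this is illustrative usage, not part of the proposal itself:

```go
// Illustrative toleration for the node.kubernetes.io/network-unavailable
// taint described above.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	tol := corev1.Toleration{
		Key:      "node.kubernetes.io/network-unavailable",
		Operator: corev1.TolerationOpExists,
		Effect:   corev1.TaintEffectNoSchedule,
	}
	fmt.Printf("%+v\n", tol)
}
```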
|
||||
|
||||
|
|
|
|||
|
|
@ -314,7 +314,7 @@ The attach/detach controller,running as part of the kube-controller-manager bina
|
|||
When the controller decides to attach a CSI volume, it will call the in-tree CSI volume plugin’s attach method. The in-tree CSI volume plugin’s attach method will do the following:
|
||||
|
||||
1. Create a new `VolumeAttachment` object (defined in the “Communication Channels” section) to attach the volume.
|
||||
* The name of the of the `VolumeAttachment` object will be `pv-<SHA256(PVName+NodeName)>`.
|
||||
* The name of the `VolumeAttachment` object will be `pv-<SHA256(PVName+NodeName)>`.
|
||||
* `pv-` prefix is used to allow using other scheme(s) for inline volumes in the future, with their own prefix.
|
||||
* SHA256 hash is to reduce length of `PVName` plus `NodeName` string, each of which could be max allowed name length (hexadecimal representation of SHA256 is 64 characters).
|
||||
* `PVName` is `PV.name` of the attached PersistentVolume.
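
A small sketch of the naming scheme described above; the helper is illustrative, not the actual controller code:

```go
// Illustrative: compute the VolumeAttachment name pv-<SHA256(PVName+NodeName)>.
package main

import (
	"crypto/sha256"
	"fmt"
)

func volumeAttachmentName(pvName, nodeName string) string {
	sum := sha256.Sum256([]byte(pvName + nodeName))
	// hex encoding of SHA256 is 64 characters, keeping the name within limits
	return fmt.Sprintf("pv-%x", sum)
}

func main() {
	fmt.Println(volumeAttachmentName("my-pv", "node-1"))
}
```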
|
||||
|
|
|
|||
|
|
@ -198,7 +198,7 @@ we have considered following options:
|
|||
|
||||
Cons:
|
||||
* I don't know if there is a pattern that exists in kube today for shipping shell scripts that are called out from code in Kubernetes. Flex is
|
||||
different because, none of the flex scripts are shipped with Kuberntes.
|
||||
different because, none of the flex scripts are shipped with Kubernetes.
|
||||
3. Ship resizing tools in a container.
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -55,7 +55,7 @@ the RBD image.
|
|||
### Pros
|
||||
- Simple to implement
|
||||
- Does not cause regression in RBD image names, which remains same as earlier.
|
||||
- The metada information is not immediately visible to RBD admins
|
||||
- The metadata information is not immediately visible to RBD admins
|
||||
|
||||
### Cons
|
||||
- NA
|
||||
|
|
|
|||
|
|
@ -0,0 +1,148 @@
|
|||
# Service Account Token Volumes
|
||||
|
||||
Authors:
|
||||
@smarterclayton
|
||||
@liggitt
|
||||
@mikedanese
|
||||
|
||||
## Summary
|
||||
|
||||
Kubernetes is able to provide pods with unique identity tokens that can prove
|
||||
the caller is a particular pod to a Kubernetes API server. These tokens are
|
||||
injected into pods as secrets. This document proposes a new mechanism of
|
||||
distribution with support for [improved service account tokens][better-tokens]
|
||||
and explores how to migrate from the existing mechanism backwards compatibly.
|
||||
|
||||
## Motivation
|
||||
|
||||
Many workloads running on Kubernetes need to prove to external parties who they
|
||||
are in order to participate in a larger application environment. This identity
|
||||
must be attested to by the orchestration system in a way that allows a third
|
||||
party to trust that an arbitrary container on the cluster is who it says it is.
|
||||
In addition, infrastructure running on top of Kubernetes needs a simple
|
||||
mechanism to communicate with the Kubernetes APIs and to provide more complex
|
||||
tooling. Finally, a significant set of security challenges are associated with
|
||||
storing service account tokens as secrets in Kubernetes, and limiting the methods
|
||||
whereby malicious parties can get access to these tokens will reduce the risk of
|
||||
platform compromise.
|
||||
|
||||
As a platform, Kubernetes should evolve to allow identity management systems to
|
||||
provide more powerful workload identity without breaking existing use cases, and
|
||||
provide a simple out of the box workload identity that is sufficient to cover
|
||||
the requirements of bootstrapping low-level infrastructure running on
|
||||
Kubernetes. We expect other systems to cover the more advanced scenarios,
|
||||
and see this effort as necessary glue to allow more powerful systems to succeed.
|
||||
|
||||
With this feature, we hope to provide a backwards compatible replacement for
|
||||
service account tokens that strengthens the security and improves the
|
||||
scalability of the platform.
|
||||
|
||||
## Proposal
|
||||
|
||||
Kubernetes should implement a ServiceAccountToken volume projection that
|
||||
maintains a service account token requested by the node from the TokenRequest
|
||||
API.
|
||||
|
||||
### Token Volume Projection
|
||||
|
||||
A new volume projection will be implemented with an API that closely matches the
|
||||
TokenRequest API.
|
||||
|
||||
```go
|
||||
type ProjectedVolumeSource struct {
|
||||
Sources []VolumeProjection
|
||||
DefaultMode *int32
|
||||
}
|
||||
|
||||
type VolumeProjection struct {
|
||||
Secret *SecretProjection
|
||||
DownwardAPI *DownwardAPIProjection
|
||||
ConfigMap *ConfigMapProjection
|
||||
ServiceAccountToken *ServiceAccountTokenProjection
|
||||
}
|
||||
|
||||
// ServiceAccountTokenProjection represents a projected service account token
|
||||
// volume. This projection can be used to insert a service account token into
|
||||
// the pods runtime filesystem for use against APIs (Kubernetes API Server or
|
||||
// otherwise).
|
||||
type ServiceAccountTokenProjection struct {
|
||||
// Audience is the intended audience of the token. A recipient of a token
|
||||
// must identify itself with an identifier specified in the audience of the
|
||||
// token, and otherwise should reject the token. The audience defaults to the
|
||||
// identifier of the apiserver.
|
||||
Audience string
|
||||
// ExpirationSeconds is the requested duration of validity of the service
|
||||
// account token. As the token approaches expiration, the kubelet volume
|
||||
// plugin will proactively rotate the service account token. The kubelet will
|
||||
// start trying to rotate the token if the token is older than 80 percent of
|
||||
// its time to live or if the token is older than 24 hours. Defaults to 1 hour
|
||||
// and must be at least 10 minutes.
|
||||
ExpirationSeconds int64
|
||||
// Path is the relative path of the file to project the token into.
|
||||
Path string
|
||||
}
|
||||
```
|
||||
|
||||
A volume plugin implemented in the kubelet will project a service account token
|
||||
sourced from the TokenRequest API into volumes created from
|
||||
ProjectedVolumeSources. As the token approaches expiration, the kubelet volume
|
||||
plugin will proactively rotate the service account token. The kubelet will start
|
||||
trying to rotate the token if the token is older than 80 percent of its time to
|
||||
live or if the token is older than 24 hours.
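
A compact sketch of that rotation rule; the function name and the exact clock handling are illustrative, not the kubelet's implementation:

```go
// Illustrative rotation check: rotate when the token has lived past 80% of
// its TTL, or when it is older than 24 hours.
package main

import (
	"fmt"
	"time"
)

func shouldRotate(issuedAt, expiresAt, now time.Time) bool {
	ttl := expiresAt.Sub(issuedAt)
	age := now.Sub(issuedAt)
	return age >= time.Duration(float64(ttl)*0.8) || age >= 24*time.Hour
}

func main() {
	iat := time.Now().Add(-50 * time.Minute)
	exp := iat.Add(time.Hour)
	fmt.Println(shouldRotate(iat, exp, time.Now())) // true: past 80% of a 1h TTL
}
```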
|
||||
|
||||
To replace the current service account token secrets, we also need to inject the
|
||||
cluster's CA certificate bundle. Initially we will deploy this data in a configmap
|
||||
per-namespace and reference it using a ConfigMapProjection.
|
||||
|
||||
A projected volume source that is equivalent to the current service account
|
||||
secret:
|
||||
|
||||
```yaml
|
||||
sources:
|
||||
- serviceAccountToken:
|
||||
expirationSeconds: 3153600000 # 100 years
|
||||
path: token
|
||||
- configMap:
|
||||
name: kube-cacrt
|
||||
items:
|
||||
- key: ca.crt
|
||||
path: ca.crt
|
||||
- downwardAPI:
|
||||
items:
|
||||
- path: namespace
|
||||
fieldRef: metadata.namespace
|
||||
```
|
||||
|
||||
|
||||
This fixes one scalability issue with the current service account token
|
||||
deployment model where secret GETs are a large portion of overall apiserver
|
||||
traffic.
|
||||
|
||||
A projected volume source that requests a token for vault and Istio CA:
|
||||
|
||||
```yaml
|
||||
sources:
|
||||
- serviceAccountToken:
|
||||
path: vault-token
|
||||
audience: vault
|
||||
- serviceAccountToken:
|
||||
path: istio-token
|
||||
audience: ca.istio.io
|
||||
```
|
||||
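If implemented as proposed, a pod might consume such a projection roughly as sketched below. The pod name, container name, image, and mount path are illustrative placeholders; the `serviceAccountToken` fields mirror the vault example above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vault-client            # hypothetical pod name
spec:
  containers:
  - name: app                   # hypothetical container
    image: example.com/app:latest
    volumeMounts:
    - name: tokens
      mountPath: /var/run/secrets/tokens
      readOnly: true
  volumes:
  - name: tokens
    projected:
      sources:
      - serviceAccountToken:
          path: vault-token
          audience: vault
```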
|
||||
### Alternatives
|
||||
|
||||
1. Instead of implementing a service account token volume projection, we could
|
||||
implement all injection as a flex volume or CSI plugin.
|
||||
1. Both flex volume and CSI are alpha and are unlikely to graduate soon.
|
||||
1. Virtual kubelets (like Fargate or ACS) may not be able to run flex
|
||||
volumes.
|
||||
1. Service account tokens are a fundamental part of our API.
|
||||
1. Remove service accounts and service account tokens completely from core, use
|
||||
an alternate mechanism that sits outside the platform.
|
||||
1. Other core features need service account integration, leading to all
|
||||
users needing to install this extension.
|
||||
1. Complicates installation for the majority of users.
|
||||
|
||||
|
||||
[better-tokens]: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/auth/bound-service-account-tokens.md
|
||||
File diff suppressed because it is too large
|
|
@ -365,10 +365,14 @@ being required otherwise.
|
|||
### Edit defaults.go
|
||||
|
||||
If your change includes new fields for which you will need default values, you
|
||||
need to add cases to `pkg/apis/<group>/<version>/defaults.go` (the core v1 API
|
||||
is special, its defaults.go is at `pkg/api/v1/defaults.go`. For simplicity, we
|
||||
will not mention this special case in the rest of the article). Of course, since
|
||||
you have added code, you have to add a test:
|
||||
need to add cases to `pkg/apis/<group>/<version>/defaults.go`
|
||||
|
||||
*Note:* In the past the core v1 API
|
||||
was special. Its `defaults.go` used to live at `pkg/api/v1/defaults.go`.
|
||||
If you see code referencing that path, you can be sure it's outdated. Now the core v1 API lives at
|
||||
`pkg/apis/core/v1/defaults.go` which follows the above convention.
|
||||
|
||||
Of course, since you have added code, you have to add a test:
|
||||
`pkg/apis/<group>/<version>/defaults_test.go`.
|
||||
|
||||
Do use pointers to scalars when you need to distinguish between an unset value
|
||||
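For orientation, a hedged sketch of what such a defaulting function might look like for a made-up type. The `Widget` names are illustrative only, and the pointer field shows why the advice above about pointers to scalars matters:

```go
// Hypothetical example only; a real SetDefaults_* function would live in
// pkg/apis/<group>/<version>/defaults.go with a matching defaults_test.go.
package v1alpha1

// Widget stands in for your API type.
type Widget struct {
	Spec WidgetSpec
}

// WidgetSpec uses *int32 so an unset value can be told apart from zero.
type WidgetSpec struct {
	Replicas *int32
}

// SetDefaults_Widget fills in defaults for fields the user left unset.
func SetDefaults_Widget(obj *Widget) {
	if obj.Spec.Replicas == nil {
		one := int32(1)
		obj.Spec.Replicas = &one
	}
}
```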
|
|
@ -601,7 +605,6 @@ Due to the fast changing nature of the project, the following content is probabl
|
|||
to generate protobuf IDL and marshallers.
|
||||
* You must add the new version to
|
||||
[cmd/kube-apiserver/app#apiVersionPriorities](https://github.com/kubernetes/kubernetes/blob/v1.8.0-alpha.2/cmd/kube-apiserver/app/aggregator.go#L172)
|
||||
to let the aggregator list it. This list will be removed before release 1.8.
|
||||
* You must setup storage for the new version in
|
||||
[pkg/registry/group_name/rest](https://github.com/kubernetes/kubernetes/blob/v1.8.0-alpha.2/pkg/registry/authentication/rest/storage_authentication.go)
|
||||
|
||||
|
|
|
|||
|
|
@ -1,3 +0,0 @@
|
|||
This document has been moved to https://git.k8s.io/community/contributors/guide/coding-conventions.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -134,7 +134,9 @@ development environment, please [set one up](http://golang.org/doc/code.html).
|
|||
| 1.5, 1.6 | 1.7 - 1.7.5 |
|
||||
| 1.7 | 1.8.1 |
|
||||
| 1.8 | 1.8.3 |
|
||||
| 1.9+ | 1.9.1 |
|
||||
| 1.9 | 1.9.1 |
|
||||
| 1.10 | 1.9.1 |
|
||||
| 1.11+ | 1.10.1 |
|
||||
|
||||
Ensure your GOPATH and PATH have been configured in accordance with the Go
|
||||
environment instructions.
|
||||
|
|
|
|||
|
|
@ -1,4 +0,0 @@
|
|||
The contents of this file have been moved to https://git.k8s.io/community/contributors/guide/pull-requests.md.
|
||||
<!--
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
-->
|
||||
|
|
@ -132,7 +132,7 @@ Note: Secrets are passed only to "mount/unmount" call-outs.
|
|||
See [nginx-lvm.yaml] & [nginx-nfs.yaml] for a quick example on how to use Flexvolume in a pod.
|
||||
|
||||
|
||||
[lvm]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/lvm
|
||||
[nfs]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nfs
|
||||
[nginx-lvm.yaml]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nginx-lvm.yaml
|
||||
[nginx-nfs.yaml]: https://git.k8s.io/kubernetes/examples/volumes/flexvolume/nginx-nfs.yaml
|
||||
[lvm]: https://git.k8s.io/examples/staging/volumes/flexvolume/lvm
|
||||
[nfs]: https://git.k8s.io/examples/staging/volumes/flexvolume/nfs
|
||||
[nginx-lvm.yaml]: https://git.k8s.io/examples/staging/volumes/flexvolume/nginx-lvm.yaml
|
||||
[nginx-nfs.yaml]: https://git.k8s.io/examples/staging/volumes/flexvolume/nginx-nfs.yaml
|
||||
|
|
|
|||
|
|
@ -1,3 +0,0 @@
|
|||
This document's content has been rolled into https://git.k8s.io/community/contributors/guide/coding-conventions.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,4 +0,0 @@
|
|||
This document has been moved to https://git.k8s.io/community/contributors/guide/owners.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
||||
|
|
@ -1,4 +0,0 @@
|
|||
This file has been moved to https://git.k8s.io/community/contributors/guide/pull-requests.md.
|
||||
<!--
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
-->
|
||||
|
|
@ -1,8 +0,0 @@
|
|||
reviewers:
|
||||
- saad-ali
|
||||
- pwittrock
|
||||
- steveperry-53
|
||||
- chenopis
|
||||
- spiffxp
|
||||
approvers:
|
||||
- sig-release-leads
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/README.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/issues.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/release-process-documentation/release-team-guides/patch-release-manager-playbook.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/patch_release.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/scalability-validation.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/ephemera/testing.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -1,4 +0,0 @@
|
|||
This document has been moved to https://git.k8s.io/community/contributors/guide/scalability-good-practices.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
||||
|
|
@ -84,7 +84,7 @@ scheduling policies to apply, and can add new ones.
|
|||
The policies that are applied when scheduling can be chosen in one of two ways.
|
||||
The default policies used are selected by the functions `defaultPredicates()` and `defaultPriorities()` in
|
||||
[pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).
|
||||
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](http://releases.k8s.io/HEAD/examples/scheduler-policy-config.json) for an example
|
||||
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](https://git.k8s.io/examples/staging/scheduler-policy-config.json) for an example
|
||||
config file. (Note that the config file format is versioned; the API is defined in [pkg/scheduler/api](http://releases.k8s.io/HEAD/pkg/scheduler/api/)).
|
||||
Thus to add a new scheduling policy, you should modify [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) or add to the directory [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), and either register the policy in `defaultPredicates()` or `defaultPriorities()`, or use a policy config file.
|
||||
|
||||
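For reference, a policy file passed via `--policy-config-file` looks roughly like the minimal sketch below; the predicate and priority names are examples and must match policies registered in the scheduler:

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsHostPorts"},
    {"name": "PodFitsResources"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1}
  ]
}
```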
|
|
|
|||
|
|
@ -1,3 +0,0 @@
|
|||
The original content of this file has been migrated to https://git.k8s.io/sig-release/security-release-process-documentation/security-release-process.md
|
||||
|
||||
This file is a placeholder to preserve links. Please remove after 3 months or the release of kubernetes 1.10, whichever comes first.
|
||||
|
|
@ -208,7 +208,7 @@ If you haven't noticed by now, we have a large, lively, and friendly open-source
|
|||
|
||||
## Events
|
||||
|
||||
Kubernetes is the main focus of CloudNativeCon/KubeCon, held twice per year in EMEA and in North America. Information about these and other community events is available on the CNCF [events](https://www.cncf.io/events/) pages.
|
||||
Kubernetes is the main focus of KubeCon + CloudNativeCon, held three times per year in China, Europe and in North America. Information about these and other community events is available on the CNCF [events](https://www.cncf.io/events/) pages.
|
||||
|
||||
### Meetups
|
||||
|
||||
|
|
|
|||
|
|
@ -17,10 +17,11 @@ A list of common resources when contributing to Kubernetes.
|
|||
- [Gubernator Dashboard - k8s.reviews](https://k8s-gubernator.appspot.com/pr)
|
||||
- [Submit Queue](https://submit-queue.k8s.io)
|
||||
- [Bot commands](https://go.k8s.io/bot-commands)
|
||||
- [Release Buckets](http://gcsweb.k8s.io/gcs/kubernetes-release/)
|
||||
- [GitHub labels](https://go.k8s.io/github-labels)
|
||||
- [Release Buckets](https://gcsweb.k8s.io/gcs/kubernetes-release/)
|
||||
- Developer Guide
|
||||
- [Cherry Picking Guide](/contributors/devel/cherry-picks.md) - [Queue](http://cherrypick.k8s.io/#/queue)
|
||||
- [https://k8s-code.appspot.com/](https://k8s-code.appspot.com/) - Kubernetes Code Search, maintained by [@dims](https://github.com/dims)
|
||||
- [Cherry Picking Guide](/contributors/devel/cherry-picks.md) - [Queue](https://cherrypick.k8s.io/#/queue)
|
||||
- [Kubernetes Code Search](https://cs.k8s.io/), maintained by [@dims](https://github.com/dims)
|
||||
|
||||
|
||||
## SIGs and Working Groups
|
||||
|
|
@ -39,8 +40,10 @@ A list of common resources when contributing to Kubernetes.
|
|||
## Tests
|
||||
|
||||
- [Current Test Status](https://prow.k8s.io/)
|
||||
- [Aggregated Failures](https://storage.googleapis.com/k8s-gubernator/triage/index.html)
|
||||
- [Test Grid](https://k8s-testgrid.appspot.com/)
|
||||
- [Aggregated Failures](https://go.k8s.io/triage)
|
||||
- [Test Grid](https://testgrid.k8s.io)
|
||||
- [Test Health](https://go.k8s.io/test-health)
|
||||
- [Test History](https://go.k8s.io/test-history)
|
||||
|
||||
## Other
|
||||
|
||||
|
|
|
|||
|
|
@ -74,6 +74,22 @@ git checkout -b myfeature
|
|||
Then edit code on the `myfeature` branch.
|
||||
|
||||
#### Build
|
||||
The following section is a quick start on how to build Kubernetes locally; for more detailed information, see [kubernetes/build](https://git.k8s.io/kubernetes/build/README.md).
|
||||
The best way to validate your current setup is to build a small part of Kubernetes. This way you can address issues without waiting for the full build to complete. To build a specific part of Kubernetes use the `WHAT` environment variable to let the build scripts know you want to build only a certain package/executable.
|
||||
|
||||
```sh
|
||||
make WHAT=cmd/{$package_you_want}
|
||||
```
|
||||
|
||||
*Note:* This applies to all top level folders under kubernetes/cmd.
|
||||
|
||||
So for the CLI, you can run:
|
||||
|
||||
```sh
|
||||
make WHAT=cmd/kubectl
|
||||
```
|
||||
|
||||
If everything checks out you will have an executable in the `_output/bin` directory to play around with.
|
||||
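For example, assuming the `kubectl` build above succeeded, you can try the freshly built binary directly:

```sh
# Run the locally built kubectl (path is relative to the repository root)
./_output/bin/kubectl version --client
```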
|
||||
*Note:* If you are using `CDPATH`, you must either start it with a leading colon, or unset the variable. The make rules and scripts to build require the current directory to come first on the CD search path in order to properly navigate between directories.
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,14 @@
|
|||
reviewers:
|
||||
- parispittman
|
||||
- guineveresaenger
|
||||
- jberkus
|
||||
- errordeveloper
|
||||
- tpepper
|
||||
- spiffxp
|
||||
approvers:
|
||||
- parispittman
|
||||
- guineveresaenger
|
||||
- jberkus
|
||||
- errordeveloper
|
||||
labels:
|
||||
- area/new-contributor-track
|
||||
|
|
@ -0,0 +1,12 @@
|
|||
# Welcome to KubeCon Copenhagen's New Contributor Track!
|
||||
|
||||
Hello new contributors!
|
||||
|
||||
This subfolder of [kubernetes/community](https://github.com/kubernetes/community) will be used as a safe space for participants in the New Contributor Onboarding Track to familiarize themselves with (some of) the Kubernetes Project's review and pull request processes.
|
||||
|
||||
The label associated with this track is `area/new-contributor-track`.
|
||||
|
||||
*If you are not currently attending or organizing this event, please DO NOT create issues or pull requests against this section of the community repo.*
|
||||
|
||||
A [Youtube playlist](https://www.youtube.com/playlist?list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx) of this workshop has been posted, and an outline of content to videos can be found [here](http://git.k8s.io/community/events/2018/05-contributor-summit).
|
||||
|
||||
|
|
@ -0,0 +1,4 @@
|
|||
# Hello from Copenhagen!
|
||||
|
||||
Hello everyone who's attending the Contributor Summit at KubeCon + CloudNativeCon in Copenhagen!
|
||||
Great to see so many amazing people interested in contributing to Kubernetes :)
|
||||
|
|
@ -0,0 +1,350 @@
|
|||
# Kubernetes New Contributor Workshop - KubeCon EU 2018 - Notes
|
||||
|
||||
Joining at the beginning was like onboarding onto a yacht.
|
||||
Now it is more like onboarding onto a BIG cruise ship.
|
||||
|
||||
It will be a packed schedule, and let's hope we can get through everything.
|
||||
SIG Contributor Experience -> from non-member contributor to owner
|
||||
|
||||
## SIG presentation
|
||||
|
||||
- SIG-docs & SIG-contributor-experience: **Docs and website** contribution
|
||||
- SIG-testing: **Testing** contribution
|
||||
- SIG-\* (*depends on the area to contribute on*): **Code** contribution
|
||||
|
||||
**=> Find your first topics**: bug, feature, learning, community development and documentation
|
||||
|
||||
Table exercise: Introduce yourself and give a tip on where you want to contribute in Kubernetes
|
||||
|
||||
|
||||
## Communication in the community
|
||||
|
||||
The Kubernetes community is like a capybara: community members are really cool with everyone, and they come from many different backgrounds.
|
||||
|
||||
- Tech question on Slack and Stack Overflow, not on Github
|
||||
- A lot of discussion will be involved when GH issues and PRs are opened. Don't get frustrated
|
||||
- Stay patient because there are a lot of contributions
|
||||
|
||||
When in doubt, **ask on Slack**
|
||||
|
||||
Other communication channels:
|
||||
|
||||
- Community meetings
|
||||
- Mailing lists
|
||||
- @ on Github
|
||||
- Office Hour
|
||||
- Kubernetes meetups https://www.meetup.com/topics/kubernetes
|
||||
|
||||
On https://kubernetes.io/community there is the schedule for all SIG/Working Group meetings.
|
||||
If you want to join or create a meetup, go to **slack#sig-contribex**.
|
||||
|
||||
## SIG - Special Interest Group
|
||||
|
||||
Semi-autonomous teams:
|
||||
- Own leaders & charters
|
||||
- Responsible for their code, GitHub repos, Slack channels, mailing lists, and meetings
|
||||
|
||||
### Types
|
||||
|
||||
[SIG List](https://github.com/kubernetes/community/blob/master/sig-list.md)
|
||||
|
||||
1. Features Area
|
||||
- sig-auth
|
||||
- sig-apps
|
||||
- sig-autoscaling
|
||||
- sig-big-data
|
||||
- sig-cli
|
||||
- sig-multicluster
|
||||
- sig-network
|
||||
- sig-node
|
||||
- sig-scalability
|
||||
- sig-scheduling
|
||||
- sig-service-catalog
|
||||
- sig-storage
|
||||
- sig-ui
|
||||
2. Plumbing
|
||||
- sig-cluster-lifecycle
|
||||
- sig-api-machinery
|
||||
- sig-instrumentation
|
||||
3. Cloud Providers *(currently working on moving cloudprovider code out of Core)*
|
||||
- sig-aws
|
||||
- sig-azure
|
||||
- sig-gcp
|
||||
- sig-ibmcloud
|
||||
- sig-openstack
|
||||
4. Meta
|
||||
- sig-architecture: For all general architectural decisions
|
||||
- sig-contributor-experience: Helping contributor and community experience
|
||||
- sig-product-management: Long-term decisions
|
||||
- sig-release
|
||||
- sig-testing: In charge of all the tests for Kubernetes
|
||||
5. Docs
|
||||
- sig-docs: for documentation and website
|
||||
|
||||
## Working groups and "Subproject"
|
||||
|
||||
From working group to "subproject".
|
||||
|
||||
For specific: tools (ex. Helm), goals (ex. Resource Management) or areas (ex. Machine Learning).
|
||||
|
||||
Working groups change around more frequently than SIGs, and some might be temporary.
|
||||
|
||||
- wg-app-def
|
||||
- wg-apply
|
||||
- wg-cloud-provider
|
||||
- wg-cluster-api
|
||||
- wg-container-identity
|
||||
- ...
|
||||
|
||||
### Picking the right SIG:
|
||||
1. Figure out which area you would like to contribute to
|
||||
2. Find out which SIG / WG / subproject covers that (tip: ask on #sig-contribex Slack channel)
|
||||
3. Join that SIG / WG / subproject (you should also join the main SIG when joining a WG / subproject)
|
||||
|
||||
## Tour des repositories
|
||||
|
||||
Everything will be refactored (cleaned, moved, merged, ...)
|
||||
|
||||
### Core repository
|
||||
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes)
|
||||
|
||||
### Project
|
||||
|
||||
- [kubernetes/community](https://github.com/kubernetes/community): KubeCon, proposals, Code of Conduct and contribution guidelines, SIG list
|
||||
- [kubernetes/features](https://github.com/kubernetes/features): Feature proposals for future releases
|
||||
- [kubernetes/Steering](https://github.com/kubernetes/Steering)
|
||||
- [kubernetes/test-infra](https://github.com/kubernetes/test-infra): Everything related to testing, except perf
|
||||
- [kubernetes/perf-tests](https://github.com/kubernetes/perf-tests): Performance tests
|
||||
|
||||
### Docs/Website
|
||||
|
||||
- website
|
||||
- kubernetes-cn
|
||||
- kubernetes-ko
|
||||
|
||||
### Developer Tools
|
||||
|
||||
- sample-controller*
|
||||
- sample-apiserver*
|
||||
- code-generator*
|
||||
- k8s.io
|
||||
- kubernetes-template-project: For new github repo
|
||||
|
||||
### Staging repositories
|
||||
|
||||
Mirrors of core parts for easy vendoring
|
||||
|
||||
### SIG repositories
|
||||
|
||||
- release
|
||||
- federation
|
||||
- autoscaler
|
||||
|
||||
### Cloud Providers
|
||||
|
||||
No AWS
|
||||
|
||||
### Tools & Products
|
||||
|
||||
- kubeadm
|
||||
- kubectl
|
||||
- kops
|
||||
- helm
|
||||
- charts
|
||||
- kompose
|
||||
- ingress-nginx
|
||||
- minikube
|
||||
- dashboard
|
||||
- heapster
|
||||
- kubernetes-anywhere
|
||||
- kube-openapi
|
||||
|
||||
### 2nd Namespace: Kubernetes-sigs
|
||||
|
||||
Too many places for random/incubation stuff.
|
||||
No working path for **promotion/deprecation**
|
||||
|
||||
In future:
|
||||
1. start in Kubernetes-sigs
|
||||
2. SIGs determine when and how the project will be **promoted/deprecated**
|
||||
|
||||
Those repositories can have their own rules:
|
||||
- Approval
|
||||
- Ownership
|
||||
- ...
|
||||
|
||||
## Contribution
|
||||
|
||||
### First Bug report
|
||||
|
||||
```
|
||||
- Bug or Feature
|
||||
|
||||
- What happened
|
||||
|
||||
- How to reproduce
|
||||
|
||||
```
|
||||
|
||||
### Issues as specifications
|
||||
|
||||
|
||||
Most of k8s change start with an issue:
|
||||
|
||||
- Feature proposal
|
||||
- API changes proposal
|
||||
- Specification
|
||||
|
||||
### From Issue to Code/Docs
|
||||
|
||||
1. Start with an issue
|
||||
2. Apply all appropriate labels
|
||||
3. cc SIG leads and concerned devs
|
||||
4. Raise the issue at a SIG meeting or on mailing list
|
||||
5. If *Lazy consensus*, submit a PR
|
||||
|
||||
### Required labels https://github.com/kubernetes/test-infra/blob/master/label_sync/labels.md
|
||||
|
||||
#### On creation
|
||||
- `sig/\*`: the SIG the issue belongs to
|
||||
- `kind/\*`:
|
||||
- bug
|
||||
- feature
|
||||
- documentation
|
||||
- design
|
||||
- failing-test
|
||||
|
||||
#### For issues closed as part of **triage**
|
||||
|
||||
- `triage/duplicate`
|
||||
- `triage/needs-information`
|
||||
- `triage/support`
|
||||
- `triage/unreproduceable`
|
||||
- `triage/unresolved`
|
||||
|
||||
#### Priority
|
||||
|
||||
- `priority/critical-urgent`
|
||||
- `priority/important-soon`
|
||||
- `priority/important-longterm`
|
||||
- `priority/backlog`
|
||||
- `priority/awaiting-evidence`
|
||||
|
||||
#### Area
|
||||
|
||||
Free-form labels, one per dedicated issue area; for example:
|
||||
|
||||
- `area/kubectl`
|
||||
- `area/api`
|
||||
- `area/dns`
|
||||
- `area/platform/gcp`
|
||||
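These labels are normally applied by commenting bot commands on the issue or PR; a few illustrative examples:

```
/sig node
/kind bug
/priority important-soon
/area kubectl
```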
|
||||
#### help-wanted
|
||||
|
||||
Currently mostly applied to complicated issues
|
||||
|
||||
#### SOON
|
||||
|
||||
`good-first-issue`
|
||||
|
||||
## Making a contribution by Pull Request
|
||||
|
||||
We will go through the typical PR process on kubernetes repos.
|
||||
|
||||
We will play there: [community/contributors/new-contributor-playground at master · kubernetes/community · GitHub](https://github.com/kubernetes/community/tree/master/contributors/new-contributor-playground)
|
||||
|
||||
1. When we contribute to any kubernetes repository, **fork it**
|
||||
|
||||
2. Do your modification in your fork
|
||||
```
|
||||
$ git clone git@github.com:jgsqware/community.git $GOPATH/src/github.com/kubernetes/community
|
||||
$ git remote add upstream https://github.com/kubernetes/community.git
|
||||
$ git remote -v
|
||||
origin git@github.com:jgsqware/community.git (fetch)
|
||||
origin git@github.com:jgsqware/community.git (push)
|
||||
upstream git@github.com:kubernetes/community.git (fetch)
|
||||
upstream git@github.com:kubernetes/community.git (push)
|
||||
$ git checkout -b kubecon
|
||||
Switched to a new branch 'kubecon'
|
||||
|
||||
## DO YOUR MODIFICATION IN THE CODE ##
|
||||
|
||||
$ git add contributors/new-contributor-playground/new-contributor-playground-xyz.md
|
||||
$ git commit
|
||||
|
||||
|
||||
### IN YOUR COMMIT EDITOR ###
|
||||
|
||||
Adding a new contributors file
|
||||
|
||||
We are currently experimenting with the PR process in the kubernetes repository.
|
||||
|
||||
$ git push -u origin kubecon
|
||||
```
|
||||
|
||||
3. Create a Pull request via Github
|
||||
4. If needed, sign the CLA to make your contribution valid
|
||||
5. Read the `k8s-ci-robot` message and `/assign @reviewer` recommended by the `k8s-ci-robot`
|
||||
6. wait for an `LGTM` label from one of the `OWNERS` reviewers
|
||||
7. wait for approval from one of the `OWNERS` approvers
|
||||
8. `k8s-ci-robot` will automatically merge the PR
|
||||
|
||||
`needs-ok-to-test` is applied to pull requests from non-member contributors; a member has to allow the tests to run on the pull request
|
||||
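If reviewers ask for changes, you will typically update the same branch and push again. A minimal sketch using the remotes set up in step 2 above:

```sh
# Bring the branch up to date with upstream, then re-push to your fork
git fetch upstream
git rebase upstream/master
git push --force-with-lease -u origin kubecon
```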
|
||||
## Test infrastructure
|
||||
|
||||
> How the bot tells you when you mess up
|
||||
|
||||
At the end of a PR there is a bunch of tests.
|
||||
2 types:
|
||||
- required: always run and must pass to validate the PR (e.g. end-to-end tests)
|
||||
- not required: run only in specific conditions (e.g. when only a specific part of the code is modified)
|
||||
|
||||
If something failed, click on `details` and check the test failure logs to see what happened.
|
||||
There is a `junit-XX.log` with the list of tests executed and an `e2e-xxxxx` folder with all the component logs.
|
||||
To check if the test failed because of your PR or for another reason, you can click on the **TOP** `pull-request-xxx` link; you will see the test grid and can check whether your failing test is failing in other PRs too.
|
||||
|
||||
If you want to retrigger the tests manually, you can comment on the PR with `/retest` and `k8s-ci-robot` will retrigger them.
|
||||
|
||||
## SIG-Docs contribution
|
||||
|
||||
Anyone can contribute to docs.
|
||||
|
||||
### Kubernetes docs
|
||||
|
||||
- Websites URL
|
||||
- Github Repository
|
||||
- k8s slack: #sig-docs
|
||||
|
||||
### Working with docs
|
||||
|
||||
Docs use `k8s-ci-robot`. Approval process is the same as for any k8s repo.
|
||||
In docs, the `master` branch is the current version of the docs, so always branch from `master`. It is continuously deployed.
|
||||
For the docs of a specific release, branch from `release-1.X`.
|
||||
|
||||
## Local build and Test
|
||||
|
||||
The code: [kubernetes/kubernetes]
|
||||
The process: [kubernetes/community]
|
||||
|
||||
### Dev Env
|
||||
|
||||
You need:
|
||||
- Go
|
||||
- Docker
|
||||
|
||||
|
||||
- Lots of RAM and CPU, and 10 GB of space
|
||||
- best to use Linux
|
||||
- place your k8s repo fork in:
|
||||
- `$GOPATH/src/k8s.io/kubernetes`
|
||||
- `cd $GOPATH/src/k8s.io/kubernetes`
|
||||
- build: `./build/run.sh make`
|
||||
- Build is incremental; keep running `./build/run.sh make` until it works
|
||||
- To build variant: `make WHAT="kubectl"`
|
||||
- Building kubectl on Mac for Linux: `KUBE_*_PLATFORM="linux/amd64" make WHAT="kubectl"`
|
||||
|
||||
There is `build` documentation here: https://git.k8s.io/kubernetes/build
|
||||
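Putting the bullets above together, a minimal sketch of the local build loop (`<your-github-user>` is a placeholder for your fork):

```sh
mkdir -p "$GOPATH/src/k8s.io"
cd "$GOPATH/src/k8s.io"
git clone https://github.com/<your-github-user>/kubernetes.git
cd kubernetes
./build/run.sh make       # containerized build; re-run until it succeeds
make WHAT=cmd/kubectl     # or build just one component
```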
|
||||
### Testing
|
||||
There is `test` documentation here: https://git.k8s.io/community/contributors/guide
|
||||
|
|
@ -0,0 +1,5 @@
|
|||
# Hello everyone!
|
||||
|
||||
Please feel free to talk amongst yourselves or ask questions if you need help
|
||||
|
||||
First commit at kubecon from @mitsutaka
|
||||
|
|
@ -16,7 +16,7 @@ We need the 80% case, Fabric8 is a good example of this. We need a good set of
|
|||
|
||||
We also need to look at how to get developer feedback on this so that we're building what they need. Pradeepto did a comparison of Kompose vs. Docker Compose for simplicity/usability.
|
||||
|
||||
One of the things we're discussing the Kompose API. We want to get rid of this and supply something which people can use directly with kuberntes. A bunch of shops only have developers. Someone asked though what's so complicated with Kube definitions. Have we identified what gives people trouble with this? We push too many concepts on developers too quickly. We want some high-level abstract types which represent the 95% use case. Then we could decompose these to the real types.
|
||||
One of the things we're discussing the Kompose API. We want to get rid of this and supply something which people can use directly with kubernetes. A bunch of shops only have developers. Someone asked though what's so complicated with Kube definitions. Have we identified what gives people trouble with this? We push too many concepts on developers too quickly. We want some high-level abstract types which represent the 95% use case. Then we could decompose these to the real types.
|
||||
|
||||
What's the gap between compose files and the goal? As an example, say you want to run a webserver pod. You have to deal with ingress, and service, and replication controller, and a bunch of other things. What's the equivalent of "docker run" which is easy to get. The critical thing is how fast you can learn it.
|
||||
|
||||
|
|
|
|||
|
|
@ -13,35 +13,40 @@ In some sense, the summit is a real-life extension of the community meetings and
|
|||
## Registration
|
||||
|
||||
- [Sign the CLA](/CLA.md) if you have not done so already.
|
||||
- [Fill out this Google Form](https://goo.gl/forms/TgoUiqbqZLkyZSZw1)
|
||||
- [Fill out this Google Form](https://goo.gl/forms/TgoUiqbqZLkyZSZw1) - Registration is now <b> closed.</b>
|
||||
|
||||
## When and Where
|
||||
|
||||
- Tuesday, May 1, 2018 (before Kubecon EU)
|
||||
- Bella Center
|
||||
- Copenhagen, Denmark
|
||||
- Bella Center, Copenhagen, Denmark
|
||||
- Registration and breakfast start at 8am in Room C1-M0
|
||||
- Happy hour reception onsite to close at 5:30pm
|
||||
|
||||
All day event with a happy hour reception to close
|
||||
|
||||
## Agenda
|
||||
There is a [Slack channel](https://kubernetes.slack.com/messages/contributor-summit) (#contributor-summit) for you to use during the summit to pass URLs, notes, reserve the hallway track room, etc.
|
||||
|
||||
|
||||
## Agenda
|
||||
|
||||
### Morning
|
||||
|
||||
| Time | Track One | Track Two | Track Three |
|
||||
| ----------- | ------------------------------- | ---------------------------- | -------------- |
|
||||
| 8:00 | Registration and Breakfast | | |
|
||||
| 9:00-9:15 | Welcome and Introduction | | |
|
||||
| 9:15-9:30 | Steering Committee Update | | |
|
||||
| Time | Track One - Room: C1-M1 | Track Two - Room: C1-M2 | Track Three - Room: B4-M5 |
|
||||
| ----------- | ------------------------------- | ---------------------------- | -------------- |
|
||||
| 8:00 | Registration and Breakfast - <b>Room: C1-M0</b> | | |
|
||||
| 9:00-9:15 | | Welcome and Introduction | |
|
||||
| 9:15-9:30 | | Steering Committee Update | |
|
||||
| | | | |
|
||||
| | [New Contributor Workshop](/events/2018/05-contributor-summit/new-contributor-workshop.md) | Current Contributor Workshop | Docs Sprint |
|
||||
| | | | |
|
||||
| | New Contributor Workshop | Current Contributor Workshop | Docs Sprint |
|
||||
| | | | |
|
||||
| 9:30-10:00 | Session | Unconference | |
|
||||
| 10:00-10:50 | Session | Unconference | |
|
||||
| 9:30-10:00 | Part 1 | What's next in networking? Lead: thockin | |
|
||||
| 10:00-10:50 | Part 2 | CRDs and Aggregation - future and pain points. Lead: sttts | |
|
||||
| 10:50-11:00 | B R E A K | B R E A K | |
|
||||
| 11:00-12:00 | Session | Unconference | |
|
||||
| 12:00-1:00 | Session | Unconference | |
|
||||
| 11:00-12:00 | Part 3 | client-go and API extensions. Lead: munnerz | |
|
||||
| 12:00-1:00 | Part 4 | Developer Tools. Leads: errordeveloper and r2d4 | |
|
||||
| 1:00-2:00 | Lunch (Provided) | Lunch (Provided) | |
|
||||
|
||||
*Note: The New Contributor Workshop will be a single continuous training, rather than being divided into sessions as the Current Contributor track is. New contributors should plan to stay for the whole 3 hours. [Outline here](/events/2018/05-contributor-summit/new-contributor-workshop.md).*
|
||||
|
||||
### Afternoon
|
||||
|
||||
| Time | Track One |
|
||||
|
|
@ -55,9 +60,9 @@ All day event with a happy hour reception to close
|
|||
- SIG Updates (~5 minutes per SIG)
|
||||
- 2 slides per SIG, focused on cross-SIG issues, not internal SIG discussions (those are for Kubecon)
|
||||
- Identify potential issues that might affect multiple SIGs across the project
|
||||
- One-to-many announcements about changes a SIG expects that might affect others
|
||||
- One-to-many announcements about changes a SIG expects that might affect others
|
||||
- Track Leads
|
||||
- New Contributor Workshop - Josh Berkus
|
||||
- New Contributor Workshop - Josh Berkus, Guinevere Saenger, Ilya Dmitrichenko
|
||||
- Current Contributor Workshop - Paris Pittman
|
||||
- SIG Updates - Jorge Castro
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,139 @@
|
|||
# Client-go
|
||||
**Lead:** munnerz with assist from lavalamp
|
||||
**Slides:** combined with the CRD session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0) (CRD is first; client-go is after)
|
||||
**Thanks to our notetakers:** kragniz, mrbobbytales, directxman12, onyiny-ang
|
||||
|
||||
## Goals for the Session
|
||||
|
||||
* What is currently painful when building a controller
|
||||
* Questions around best practices
|
||||
* As someone new:
|
||||
* What is hard to grasp?
|
||||
* As someone experienced:
|
||||
* What important bits of info do you think are critical
|
||||
|
||||
|
||||
## Pain points when building controller
|
||||
* A lot of boilerplate
|
||||
* Work queues
|
||||
* HasSynced functions
|
||||
* Re-queuing
|
||||
* Lack of deep documentation in these areas
|
||||
* Some documentation exists, but focused on k/k core
|
||||
* Securing webhooks & APIServers
|
||||
* Validation schemas
|
||||
* TLS, the number of certs is a pain point
|
||||
* It is hard right now, the internal k8s CA has been used a bit.
|
||||
* OpenShift has a 'serving cert controller' that will generate a cert based on an annotation that might be able to possibly integrate upstream.
|
||||
* Election has been problematic, and the Scaling API is low-level and hard to use. It doesn't work well if a resource has multiple meanings of scale (eg multiple pools of nodes)
|
||||
* Registering CRDs, what's the best way to go about it?
|
||||
* No best way to do it, but has been deployed with application
|
||||
* Personally, deploy the CRDs first for RBAC reasons
|
||||
* Declarative API on one end that has to be translated to a transactional API on the other end (e.g. ingress). The controller is trying to change quite a few things.
|
||||
* You can do locking, but it has to be built.
|
||||
* Q: how do you deal with "rolling back" if the underlying infrastructure
|
||||
that you're describing says no on an operation?
|
||||
* A: use validating webhook?
|
||||
* A: use status to keep track of things?
|
||||
* A: two types of controllers: `kube --> kube` and `kube --> external`,
|
||||
they work differently
|
||||
* A: Need a record that keeps track of things in progress. e.g. status. Need more info on how to properly tackle this problem.
|
||||
|
||||
|
||||
## Best practices
|
||||
(discussion may be shown by Q: for question or A: for audience or answer)
|
||||
* How do you keep external resources up to date with Kubernetes resources?
|
||||
* A: the original intention was to use the sync period on the controller if
|
||||
you watch external resources, use that
|
||||
* Should you set resync period to never if you're not dealing with
|
||||
external resources?
|
||||
* A: Yes, it's not a bug if watch fails to deliver things right
|
||||
* A: controller automatically relists on connection issues, resync
|
||||
interval is *only* for external resources
|
||||
* maybe should be renamed to make it clear it's for external resources
|
||||
* how many times to update status per sync?
|
||||
* A: use status conditions to communicate "fluffy" status to user
|
||||
(messages, what might be blocked, etc, in HPA), use fields to
|
||||
communicate "crunchy" status (last numbers we saw, last metrics, state
|
||||
I need later).
|
||||
* How do I generate nice docs (markdown instead of swagger)
|
||||
* A: kubebuilder (kubernetes-sigs/kubebuilder) generates docs out of the
|
||||
box
|
||||
* A: Want to have IDL pipeline that runs on native types to run on CRDs,
|
||||
run on docs generator
|
||||
* Conditions vs fields
|
||||
* used to check a pod's state
|
||||
* "don't use conditions too much"; other features require the use of conditions, status is unsure
|
||||
* What does condition mean in this context
|
||||
* Additional fields that can have `ready` with a msg, represents `state`.
|
||||
* Limit on states that the object can be in.
|
||||
* Use conditions to reflect the state of the world, is something blocked etc.
|
||||
* Conditions were created to allow for mixed mode of clients, old clients can ignore some conditions while new clients can follow them. Designed to make it easier to extend status without breaking clients.
|
||||
* Validating webhooks vs OpenAPI schema
|
||||
* Can we write a test that spins up main API server in process?
|
||||
* Can do that current in some k/k tests, but not easy to consume
|
||||
* vendoring is hard
|
||||
* Currently have a bug where you have to serve aggregated APIs on 443,
|
||||
so that might complicate things
|
||||
* How are people testing extensions?
|
||||
* Anyone reusing upstream dind cluster?
|
||||
* People looking for a good way to test them.
|
||||
* kube-builder uses the sig-testing framework to bring up a local control plane and use that to test against. (@pwittrock)
|
||||
* How do you start cluster for e2es?
|
||||
* Spin up a full cluster with kubeadm and run tests against that
|
||||
* integration tests -- pull in packages that will build the clusters
|
||||
* Q: what CIs are you using?
|
||||
* A: Circle CI and then spin up new VMs to host cluster
|
||||
* Mirantis has a tool for a multi-node dind cluster for testing
|
||||
* #testing-commons channel on Slack. There is a 27-page document on this; the link will be put in the slides
|
||||
* Deploying and managing Validating/Mutating webhooks?
|
||||
* how complex should they be?
|
||||
* When to use subresources?
|
||||
* Are people switching to api agg to use this today?
|
||||
* Really just for status and scale
|
||||
* Why not use subresources today with scale?
|
||||
* multiple replicas fields
|
||||
* doesn't fit polymorphic structure that exists
|
||||
* pwittrock@: kubectl side, scale
|
||||
* want to push special kubectl verbs into subresources to make kubectl
|
||||
more tolerant to version skew
|
||||
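Picking up the resync-period answers above, a minimal client-go sketch (illustrative only; the kubeconfig path is a placeholder) of a shared informer with periodic resync disabled, which is fine when everything you reconcile lives in Kubernetes:

```go
package main

import (
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholder kubeconfig path; adjust for your environment.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// A resync period of 0 disables periodic resync; the watch keeps the
	// cache current. Use a non-zero period only when you also reconcile
	// external resources that Kubernetes cannot notify you about.
	factory := informers.NewSharedInformerFactory(client, 0)
	podInformer := factory.Core().V1().Pods().Informer()
	_ = podInformer // add event handlers here before starting

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
}
```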
|
||||
## Other Questions
|
||||
|
||||
* Q: Client-go generated listers, what is the reason for two separate interfaces to retrieve from client and cache?
|
||||
* A: historical, but some things are better done local vs on the server.
|
||||
* issues: client-set interface allows you to pass special options that allow you to do interesting stuff on the API server which isn't necessarily possible in the lister.
|
||||
* started as same function call and then diverged
|
||||
* lister gives you slice of pointers
|
||||
* clientset gives you a slice of not pointers
|
||||
* a lot of people would take return from clientset and then convert it to a slice of pointers so the listers helped avoid having to do deep copies every time. TLDR: interfaces are not identical
|
||||
* Where should questions go on this topic for now?
|
||||
* A: most goes to sig-api-machinery right now
|
||||
* A : Controller related stuff would probably be best for sig-apps
|
||||
* Q: Staleness of data, how are people dealing with keeping data up to date with external data?
|
||||
* A: Specify sync period on your informer, will put everything through the loop and hit external resources.
|
||||
* Q: With strictly kubernetes resources, should your sync period be never? aka does the watch return everything.
|
||||
* A: The watch should return everything and should be used if its strictly k8s in and k8s out, no need to set the sync period.
|
||||
* Q: What about controllers in other languages than go?
|
||||
* A: [metacontroller](https://github.com/GoogleCloudPlatform/metacontroller) There are client libs in other languages, missing piece is work queue,
|
||||
informer, etc
|
||||
* Cluster API controllers cluster, machineset, deployment, have a copy of
|
||||
deployment code for machines. Can we move this code into a library?
|
||||
* A: it's a lot of work, someone needs to do it
|
||||
* A: Janet Kuo is a good person to talk to (worked on getting core workloads
|
||||
API to GA) about opinions on all of this
|
||||
* Node name duplication caused issues with AWS and long-term caches
|
||||
* make sure to store UIDs if you cache across reboot
|
||||
|
||||
## Moving Forwards
|
||||
* How do we share/disseminate knowledge (SIG PlatformDev?)
|
||||
* Most SIGs maintain their own controllers
|
||||
* Wiki? Developer Docs working group?
|
||||
* Existing docs focus on in-tree development. Dedicated 'extending kubernetes' section?
|
||||
* Git-book being developed for kubebuilder (book.kubebuilder.io); would appreciate feedback @pwittrock
|
||||
* API extensions authors meetups?
|
||||
* How do we communicate this knowledge for core kubernetes controllers
|
||||
* Current-day: code review, hallway conversations
|
||||
* Working group for platform development kit?
|
||||
* Q: where should we discuss/have real time conversations?
|
||||
* A: #sig-apimachinery, or maybe #sig-apps in slack (or mailing lists) for the workloads controllers
|
||||
|
|
@ -0,0 +1,92 @@
|
|||
# CRDs - future and painpoints
|
||||
**Lead:** sttts
|
||||
**Slides:** combined with the client-go session [here](https://www.dropbox.com/s/n2fczhlbnoabug0/API%20extensions%20contributor%20summit.pdf?dl=0)
|
||||
**Thanks to our notetakers:** mrbobbytales, kragniz, tpepper, and onyiny-ang
|
||||
|
||||
## outlook - aggregation
|
||||
* API stable since 1.10. There is a lack of tools and library support.
|
||||
* GSoC project with @xmudrii: share etcd storage
|
||||
* `kubectl create etcdstorage your api-server`
|
||||
* Store custom data in etcd
|
||||
|
||||
## outlook custom resources
|
||||
|
||||
1.11:
|
||||
* alpha: multiple versions with/without conversion
|
||||
* alpha: pruning - blocker for GA - unspecified fields are removed
|
||||
* deep change of semantics of custom resources
|
||||
* from JSON blob store to schema based storage
|
||||
* alpha: defaulting - defaults from openapi validation schema are applied
|
||||
* alpha: graceful deletion - (maybe? PR exists)
|
||||
* alpha: server side printing columns for `kubectl get` customization
|
||||
* beta: subresources - alpha in 1.10
|
||||
* will have additionalProperties with extensible string map
|
||||
* mutually exclusive with properties
|
||||
|
||||
1.12
|
||||
* multiple versions with declarative field renames
|
||||
* strict create mode (issue #5889)
|
||||
|
||||
Missing from Roadmap:
|
||||
- Additional Properties: Forbid additional fields
|
||||
- Unknown fields are silently dropped instead of erroring
|
||||
- Istio used CRD extensively: proto requires some kind of verification and CRDs are JSON
|
||||
- currently planning to go to GA without proto support
|
||||
- possibly in the longer term to plan
|
||||
- Resource Quotas for Custom Resources
|
||||
- doable, we know how but not currently implemented
|
||||
- Defaulting: mutating webhook will default things when they are written
|
||||
- Is Validation going to be required in the future
|
||||
- poll the audience!
|
||||
- gauging general sense of validation requirements (who wants them, what's missing?)
|
||||
- missing: references to core types aren't allowed/can't be defined -- this can lead to versioning complications
|
||||
- limit cluster-wide CRDs such that they don't affect all namespaces
|
||||
- no good discussion about how to improve this yet
|
||||
- feel free to start one!
|
||||
- Server side printing columns, per resource type needs to come from server -- client could be in different version than server and highlight wrong columns
|
||||
|
||||
Autoscaling is alpha today, hopefully beta in 1.11
|
||||
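The pruning, defaulting, and validation items above all hinge on the CRD's OpenAPI schema. For orientation, a minimal illustrative CRD carrying such a schema (the group, kind, and field names are hypothetical):

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: widgets.kubecon.io          # must be <plural>.<group>
spec:
  group: kubecon.io
  version: v1alpha1
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            replicas:
              type: integer
              minimum: 1
```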
|
||||
## The Future: Versioning
|
||||
* Most asked feature, coming..but slowly
|
||||
* two types, "noConversion" and "Declarative Conversion"
|
||||
* "NoConversion" versioning
|
||||
* maybe in 1.11
|
||||
* ONLY change is apiGroup
|
||||
* Run multiple versions at the same time, they are not converted
|
||||
|
||||
* "Declarative Conversion" 1.12
|
||||
* declarative rename e.g
|
||||
```
|
||||
spec:
|
||||
group: kubecon.io
|
||||
version: v1
|
||||
conversions:
|
||||
declarative:
|
||||
renames:
|
||||
from: v1alpha1
|
||||
to: v1
|
||||
old: spec.foo
|
||||
new: bar
|
||||
```
|
||||
* Support for webhook?
|
||||
* not currently, very hard to implement
|
||||
* complex problem for end user
|
||||
* current need is really only changing for single fields
|
||||
* Trying to avoid complexity by adding a lot of conversions
|
||||
|
||||
## Questions:
|
||||
* When should someone move to their own API Server
|
||||
* At the moment, we tell people to start with CRDs. Move to an aggregated API server if you need custom versioning or have other specific use-cases.
|
||||
* How do I update everything to a new object version?
|
||||
* Have to touch every object.
|
||||
* Is protobuf support coming in the future?
|
||||
* possibly, likely yes
|
||||
* update on resource quotas for CRDs
|
||||
* A PoC PR is currently out; it's doable, just not quite done
|
||||
* Is validation field going to be required?
|
||||
* Eventually, yes? Some work being done to make CRDs work well with `kubectl apply`
|
||||
* Can CRDs be cluster wide but viewable to only some users.
|
||||
* It's been discussed, but hasn't been tackled.
|
||||
* Is there support for CRDs in kubectl output?
|
||||
* server side printing columns will make things easier for client tooling output. Versioning is important for client vs server versioning.
|
||||
|
|
@ -0,0 +1,63 @@
|
|||
# Developer Tools:
|
||||
**Leads:** errordeveloper, r2d4
|
||||
**Slides:** n/a
|
||||
**Thanks to our notetakers:** mrbobbytales, onyiny-ang
|
||||
|
||||
What APIs should we target, what parts of the developer workflow haven't been covered yet?
|
||||
|
||||
* Do you think the Developer tools for Kubernetes is a solved problem?
|
||||
* A: No
|
||||
|
||||
### Long form responses from SIG Apps survey
|
||||
* Need to talk about developer experience
|
||||
* Kubernetes Community can do a lot more in helping evangelize Software development workflow, including CI/CD. Just expecting some guidelines on the more productive ways to write software that runs in k8s.
|
||||
* Although my sentiment is neutral on kube, it is getting better as more tools are emerging to allow my devs to stick to app development and not get distracted by kube items. There is a lot of tooling available which is a dual edge sword, these tools range greatly in usability robustness and security. So it takes a lot of effort to...
|
||||
|
||||
### Current State of Developer Experience
|
||||
* Many Tools
|
||||
* Mostly incompatible
|
||||
* Few end-to-end workflows
|
||||
|
||||
### Comments and Questions
|
||||
* Idea from scaffold to normalize the interface for builders, be able to swap them out behind the scenes.
|
||||
* Possible to formalize these as CRDs?
|
||||
* Lots of choices, helm, other templating, kompose etc..
|
||||
* So much flexibility in the Kubernetes API that it can become complicated for new developers coming up.
|
||||
* Debug containers might make things easier for developers to work through building and troubleshooting their app.
|
||||
* Domains and workflow are so different from companies that everyone has their own opinionated solution.
|
||||
* Lots of work being done in the app def working group to define what an app is.
|
||||
* app CRD work should make things easier for developers.
|
||||
* Break out developer workflow into stages and try and work through expanding them, e.g. develop/debug
|
||||
* debug containers are looking to be used both in prod and developer workflows
|
||||
* Tool in sig-cli called kustomize, was previously 'konflate'?
|
||||
* Hard to talk about all these topics as there isn't the language to talk about these classes of tools.
|
||||
* @jacob investigation into application definition: re: phases, its not just build, deploy, debug, its build, deploy, lifecycle, debug. Managing lifecycle is still a problem, '1-click deploy' doesn't handle lifecycle.
|
||||
* @Bryan Liles: thoughts about why this is hard:
|
||||
* kubectl helm apply objects in different orders
|
||||
* objects vs abstractions
|
||||
* some people love [ksonnet](https://ksonnet.io/), some hate it. Kubernetes concepts are introduced differently to different people so not everyone is starting with the same base. Thus, some tools are harder for some people to grasp than others. Shout out to everyone who's trying to work through it
* Being tied to one tool breaks compatibility across providers.
|
||||
* Debug containers are great for break-glass scenarios
|
||||
* CoreOS had an operator that handled the entire stack, additional objects could be created and certain metrics attached.
|
||||
* Everything is open source now, etcd, prometheus operator
|
||||
* Tools are applying things in different orders, and this can be a problem across tooling
|
||||
* People who depend on startup order also tend to have reliability problems, as they have their own operational problems; they should try to engineer around it.
|
||||
* Can be hard if going crazy on high-level abstractions, can make things overly complicated and there are a slew of constraints in play.
|
||||
* Ordering constraints are needed for certain garbage collection tasks, having ordering may actually be useful.
|
||||
* Some groups have avoided high-level DSLs because people should understand readiness/liveness probes etc. Developers may have a learning curve, but it is worthwhile when troubleshooting and getting into the weeds.
|
||||
* Lots of people don't want to get into it at all, they want to put in a few details on a db etc and get it.
|
||||
* Maybe standardize on a set of labels to on things that should be managed as a group. Helm is one implementation, it should go beyond helm.
|
||||
* There is a PR that is out there that might take care of some of this.
|
||||
* Everyone has their own "style" when it comes to this space.
|
||||
* Break the phases and components in the development and deployment workflow into sub-problems and they may be able to actually be tackled. Right now the community seems to tackling everything at once and developing different tools to do the same thing.
|
||||
* build UI that displays the whole thing as a list and allows easy creation/destruction of cluster
|
||||
* avoid tools that would prevent portability
|
||||
* objects rendered to file somehow: happens at runtime, additional operator that takes care of the stack
|
||||
* 3, 4 minor upgrades without breakage
|
||||
* @Daniel Smith: start up order problems = probably bigger problems, order shouldn't need to matter but in the real world sometimes it does
|
||||
* platform team, internal paths team (TSL like theme), etc. In some cases it's best to go crazy focusing on the abstractions--whole lot of plumbing that needs to happen to get everything working properly
|
||||
* Well defined order of creation may not be a bad thing. ie. ensure objects aren't created that are immediately garbage collected.
|
||||
* Taking a step back from being contributors and putting on developer hats to consider the tool sprawl that exists and is not necessarily compatible across different aspects of kubernetes: is there any way to consolidate these tools and make them more standardized?
|
||||
* Split into sub-problems
|
||||
|
||||
## How can we get involved?
|
||||
- SIG-Apps - join the conversation on slack, mailing list, or weekly Monday meeting
|
||||
|
|
@ -0,0 +1,129 @@
|
|||
# Networking
|
||||
**Lead:** thockin
|
||||
**Slides:** [here](https://docs.google.com/presentation/d/1Qb2fbyTClpl-_DYJtNSReIllhetlOSxFWYei4Zt0qFU/edit#slide=id.g2264d16f0b_0_14)
|
||||
**Thanks to our notetakers:** onyiny-ang, mrbobbytales, tpepper
|
||||
|
||||
|
||||
This session is not declaring what's being implemented next, but rather laying out the problems that loom.
|
||||
|
||||
## Coming soon
|
||||
- kube-proxy with IPVS
|
||||
- currently beta
|
||||
- core DNS replacing kube DNS
|
||||
- currently beta
|
||||
- pod "ready++"
|
||||
- allow external systems to participate in rolling updates. Say your load-balancer takes 5-10 seconds to program, when you bring up new pod and take down old pod the load balancer has lost old backends but hasn't yet added new backends. The external dependency like this becomes a gating pod decorator.
|
||||
- adds configuration to pod to easily verify readiness
|
||||
- design agreed upon, alpha (maybe) in 1.11
|
||||
|
||||
## Ingress
|
||||
* The lowest common-denominator API. This is really limiting for users, especially compared to modern software L7 proxies.
|
||||
* annotation model of markup limits portability
|
||||
* ingress survey reports:
|
||||
* people want portability
|
||||
* everyone uses non-portable features…
|
||||
* 2018 L7 requirements are dramatically higher than what they were and many vendors don’t support that level of functionality.
|
||||
* Possible Solution? Routes
|
||||
* openshift uses routes
|
||||
* heptio prototyping routes currently
|
||||
* All things considered, requirements are driving it closer and closer to istio
|
||||
One possibility: poach some of the ideas and add them to Kubernetes natively.
|
||||
|
||||
## Istio
|
||||
(as a potential solution)
|
||||
- maturing rapidly with good APIs and support
|
||||
- Given that plus istio is not part of kubernetes, it's unlikely near term to become a default or required part of a k8s deployment. The general ideas around istio style service mesh could be more native in k8s.
|
||||
|
||||
## Topology and node-local Services
|
||||
- demand for node-local network and service discovery but how to go about it?
|
||||
- e.g. “I want to talk to the logging daemon on my current host”
|
||||
- special-case topology?
|
||||
- client-side choice
|
||||
- These types of services should not be a service proper.
|
||||
|
||||
## Multi-network
|
||||
|
||||
- certain scenarios demand multi-network
|
||||
- A pod can be in multiple networks at once. You might have different quality of service on different networks (eg: fast/expensive, slower/cheaper), or different connectivity (eg: the rack-internal network).
|
||||
- Tackling scenarios like NFV
|
||||
- need deeper changes like multiple pod IPs but also need to avoid repeating old mistakes
|
||||
- SIG-Network WG designing a PoC -- If interested jump on SIG-network WG weekly call
|
||||
- Q: Would this PoC help if virtual-kubelets were used to span cloud providers? Spanning latency domains in networks is also complicated. Many parts of k8s are chatty, assuming a cluster internal low-latency connectivity.
|
||||
|
||||
## Net Plugins vs Device Plugins
|
||||
- These plugins do not coordinate today and are difficult to work around
|
||||
- gpu that is also an infiniband device
|
||||
- causes problems because network and device are very different with verbs etc
|
||||
- problems encountered with having to schedule devices and network together at the same time.
|
||||
“I want a gpu on this host that has a gpu attached and I want it to be the same device”
|
||||
PoC available to make this work, but it's rough and a problem right now.
|
||||
- Resources WG and networking SIG are discussing this challenging problem
|
||||
- SIGs/WGs. Conversation may feel like a cycle, but @thockin feels it is a spiral that is slowly converging and he has a doc he can share covering the evolving thinking.
|
||||
|
||||
## Net Plugins, gRPC, Services
|
||||
- tighter coupling between netplugins and kube-proxy could be useful
|
||||
- grpc is awesome for plugins, why not use a grpc network plugin
|
||||
- pass services to network plugin to bypass kube-proxy, give more awareness to the network plugin and enable more functionality.
|
||||
|
||||
## IPv6
|
||||
- beta but **no** support for dual-stack (v4 & v6 at the same time)
|
||||
- Need deeper changes like multiple pod IPs (need to change the pod API--see Multi-network)
|
||||
- https://github.com/kubernetes/features/issues/563
|
||||
|
||||
## Services v3
|
||||
|
||||
- Services + Endpoints have a grab-bag of features which is not ideal; "grew organically"
|
||||
- Need to start segmenting the "core" API group
|
||||
- write API in a way that is more obvious
|
||||
- split things out and reflect it in API
|
||||
- Opportunity to rethink and refactor:
|
||||
- Endpoints -> Endpoint?
|
||||
- split the grouping construct from the “gazintas”
|
||||
- virtualIP, network, dns name moves into the service
|
||||
- EOL troublesome features
|
||||
- port remapping
|
||||
|
||||
## DNS Reboot
|
||||
- We abuse DNS and mess up our DNS schema
|
||||
- it's possible to write queries in DNS that take over names
|
||||
- @thockin has a doc with more information about the details of this
|
||||
- Why can't I use more than 6 web domains? bugzilla circa 1996
|
||||
- problem: it's possible to write queries in DNS that write over names
|
||||
- create a namespace called “com” and an app named “google” and it’ll cause a problem
|
||||
- “svc” is an artifact and should not be a part of dns
|
||||
- issues with certain underlying libraries
|
||||
- Changing it is hard (if we care about compatibility)
|
||||
- Can we fix DNS spec or use "enlightened" DNS servers
|
||||
- Smart proxies on behalf of pods that do the searching and become a “better” dns
|
||||
- External DNS
|
||||
- Creates DNS entries in external system (route53)
|
||||
- Currently in incubator, not sure on status, possibly might move out of incubator, but unsure on path forward
|
||||
|
||||
## Perf and Scalability
|
||||
- iptables is crufty. An nftables implementation should be better.
|
||||
- an eBPF implementation (e.g., Cilium) has potential
|
||||
|
||||
## Questions:
|
||||
|
||||
- Consistent mechanism to continue progress but maintain backwards compatibility
|
||||
- External DNS was not mentioned -- blue/green traffic switching
|
||||
- synchronizes Kubernetes resources into various external DNS services
|
||||
- it's in the incubator right now (the incubator process is deprecated)
|
||||
- unsure of the future trajectory
|
||||
- widely used in production
|
||||
- relies sometimes on annotations and ingress
|
||||
- Q: Device plugins. . .spiraling around and hoping for eventual convergence/simplification
|
||||
- A: Resource management on device/net plugin, feels like things are going in a spiral, but progress is being made, it is a very difficult problem and hard to keep all design points tracked. Trying to come to consensus on it all.
|
||||
- Q: Would CoreDNS be the best place for the plugins and other modes for DNS proxy etc.
|
||||
- packet loss is a problem -- long tail of latency
|
||||
- encourage cloud providers to support gRPC
|
||||
- Q: With the issues talked about earlier, why can’t istio be integrated natively?
|
||||
- A: Istio can't be required/default: still green
|
||||
- today we can't proclaim that Kubernetes must support Istio
|
||||
- probably not enough community support this year (not everyone is using it at this point)
|
||||
- Q: Thoughts on k8s v2?
|
||||
- A: Things will not just be turned off; they must be phased out over the course of years, especially for services which have been core for some time.
|
||||
|
||||
## Takeaways:
|
||||
- This is not a comprehensive list of everything that is up and coming
|
||||
- A lot of work went into all of these projects
|
||||
|
|
@ -0,0 +1,99 @@
|
|||
# Kubernetes Summit: New Contributor Workshop
|
||||
|
||||
*This was presented as one continuous 3-hour training with a break. For purposes of live coding exercises, participants were asked to bring a laptop with git installed.*
|
||||
|
||||
This course was captured on video, and the playlist can be found [here](https://www.youtube.com/playlist?list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx).
|
||||
|
||||
*Course Playlist [Part One](https://www.youtube.com/watch?v=obyAKf39H38&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=1):*
|
||||
* Opening
|
||||
* Welcome contributors
|
||||
* Who this is for
|
||||
* Program
|
||||
* The contributor ladder
|
||||
* CLA signing
|
||||
* Why we have a CLA
|
||||
* Going through the signing process
|
||||
* Choose Your Own Adventure: Figuring out where to contribute
|
||||
* Docs & Website
|
||||
* Testing
|
||||
* Community management
|
||||
* Code
|
||||
* Main code
|
||||
* Drivers, platforms, plugins, subprojects
|
||||
* Finding your first topic
|
||||
* Things that fit into your work at work
|
||||
* Interest match
|
||||
* Skills match
|
||||
* Choose your own adventure exercise
|
||||
* Let's talk: Communication
|
||||
* Importance of communication
|
||||
* Community standards and courtesy
|
||||
* Mailing Lists (esp Kube-dev)
|
||||
* Slack
|
||||
* Github Issues & PRs
|
||||
* Zoom meetings & calendar
|
||||
* Office hours, MoC, other events
|
||||
* Meetups
|
||||
* Communication exercise
|
||||
* The SIG system
|
||||
* What are SIGs and WGs
|
||||
* Finding the right SIG
|
||||
* Most active SIGs
|
||||
* SIG Membership, governance
|
||||
* WGs and Subprojects
|
||||
* Repositories
|
||||
* Tour de Repo
|
||||
* Core Repo
|
||||
* Website/docs
|
||||
* Testing
|
||||
* Other core repos
|
||||
* Satellite Repos
|
||||
* Owners files
|
||||
* Repo membership
|
||||
* BREAK (20min)
|
||||
|
||||
*Course Playlist [Part Two](https://www.youtube.com/watch?v=PERboIaNdcI&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=2):*
|
||||
* Contributing by Issue: Josh (15 min) (1:42)
|
||||
* Finding the right repo
|
||||
* What makes a good issue
|
||||
* Issues as spec for changes
|
||||
* Labels
|
||||
* label framework
|
||||
* required labels
|
||||
* Following up and communication
|
||||
* Contributing by PR (with walkthrough)
|
||||
* bugs vs. features vs. KEP
|
||||
* PR approval process
|
||||
* More Labels
|
||||
* Finding a reviewer
|
||||
* Following-up and communication
|
||||
* On you: rebasing, test troubleshooting
|
||||
* Test infrastructure
|
||||
* Automated tests
|
||||
* Understanding test failures
|
||||
* Doc Contributions
|
||||
* Upcoming changes to docs
|
||||
* Building docs locally
|
||||
* Doc review process
|
||||
|
||||
*Course Playlist [Part Three](https://www.youtube.com/watch?v=Z3pLlp6nckI&list=PL69nYSiGNLP3M5X7stuD7N4r3uP2PZQUx&t=0s&index=3):*
|
||||
|
||||
* Code Contributions: Build and Test
|
||||
* Local core kubernetes build
|
||||
* Running unit tests
|
||||
* Troubleshooting build problems
|
||||
* Releases
|
||||
* Brief on Release schedule
|
||||
* Release schedule details
|
||||
* Release Team Opportunities (shadows)
|
||||
* Going beyond
|
||||
* Org membership
|
||||
* Meetups & CNCF ambassador
|
||||
* Mentorship opportunities
|
||||
* Group Mentoring
|
||||
* GSOC/Outreachy
|
||||
* Release Team
|
||||
* Meet Our Contributors
|
||||
* 1-on-1 ad-hoc mentoring
|
||||
* Kubernetes beginner tutorials
|
||||
* Check your own progress on devstats
|
||||
|
|
@ -0,0 +1,13 @@
|
|||
# Steering Committee Update
|
||||
**Leads:** pwittrock, timothysc
|
||||
**Thanks to our notetaker:** tpepper
|
||||
|
||||
* incubation is deprecated, "associated" projects are a thing
|
||||
* WGs are horizontal across SIGs and are ephemeral. Subprojects own a piece
|
||||
of code and relate to a SIG. Example: SIG-Cluster-Lifecycle with
|
||||
kubeadm, kops, etc. under it.
|
||||
* SIG charters: PR a proposed new SIG with the draft charter. Discussion
|
||||
can then happen on GitHub around the evolving charter. This is cleaner
|
||||
and more efficient than discussing on mailing list.
|
||||
* K8s values doc updated by Sarah Novotny
|
||||
* changes to voting roles and rules are in the works
|
||||
|
|
@ -1,12 +1,10 @@
|
|||
# Kubernetes Weekly Community Meeting
|
||||
|
||||
We have PUBLIC and RECORDED [weekly meeting](https://zoom.us/my/kubernetescommunity) every Thursday at 6pm UTC (1pm EST / 10am PST)
|
||||
We have PUBLIC and RECORDED [weekly meeting](https://zoom.us/my/kubernetescommunity) every Thursday at [5pm UTC](https://www.google.com/search?q=5pm+UTC).
|
||||
|
||||
Map that to your local time with this [timezone table](https://www.google.com/search?q=1800+in+utc)
|
||||
See it on the web at [calendar.google.com](https://calendar.google.com/calendar/embed?src=cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com&ctz=America/Los_Angeles) , or paste this [iCal url](https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics) into any [iCal client](https://en.wikipedia.org/wiki/ICalendar). Do NOT copy the meetings over to a your personal calendar, you will miss meeting updates. Instead use your client's calendaring feature to say you are attending the meeting so that any changes made to meetings will be reflected on your personal calendar.
|
||||
|
||||
See it on the web at [calendar.google.com](https://calendar.google.com/calendar/embed?src=cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com&ctz=America/Los_Angeles), or paste this [iCal url](https://calendar.google.com/calendar/ical/cgnt364vd8s86hr2phapfjc6uk%40group.calendar.google.com/public/basic.ics) into any [iCal client](https://en.wikipedia.org/wiki/ICalendar). Do NOT copy the meetings over to your personal calendar; you will miss meeting updates. Instead use your client's calendaring feature to say you are attending the meeting so that any changes made to meetings will be reflected on your personal calendar.
|
||||
|
||||
All meetings are archived on the [Youtube Channel](https://www.youtube.com/watch?v=onlFHICYB4Q&list=PL69nYSiGNLP1pkHsbPjzAewvMgGUpkCnJ)
|
||||
All meetings are archived on the [Youtube Channel](https://www.youtube.com/playlist?list=PL69nYSiGNLP1pkHsbPjzAewvMgGUpkCnJ).
|
||||
|
||||
Quick links:
|
||||
|
||||
|
|
|
|||
|
|
@ -6,8 +6,8 @@ Office Hours is a live stream where we answer live questions about Kubernetes fr
|
|||
|
||||
Third Wednesday of every month, there are two sessions:
|
||||
|
||||
- European Edition: [2pm UTC](https://www.timeanddate.com/worldclock/fixedtime.html?msg=Kubernetes+Office+Hours+%28European+Edition%29&iso=20171115T14&p1=136&ah=1)
|
||||
- Western Edition: [9pm UTC](https://www.timeanddate.com/worldclock/fixedtime.html?msg=Kubernetes+Office+Hours+%28Western+Edition%29&iso=20171115T13&p1=1241)
|
||||
- European Edition: [1pm UTC](https://www.google.com/search?q=1pm+UTC)
|
||||
- Western Edition: [8pm UTC](https://www.google.com/search?q=8pm+UTC)
|
||||
|
||||
Tune into the [Kubernetes YouTube Channel](https://www.youtube.com/c/KubernetesCommunity/live) to follow along.
|
||||
|
||||
|
|
|
|||
|
|
@ -16,7 +16,7 @@ The documentation follows a template and uses the values from [`sigs.yaml`](/sig
|
|||
|
||||
**Time Zone gotcha**:
|
||||
Time zones make everything complicated.
|
||||
And Daylight Savings time makes it even more complicated.
|
||||
And Daylight Saving time makes it even more complicated.
|
||||
Meetings are specified with a time zone and we generate a link to http://www.thetimezoneconverter.com/ so that people can easily convert it to their local time zone.
|
||||
To make this work you need to specify the time zone in a way that that website recognizes.
|
||||
Practically, that means US pacific time must be `PT (Pacific Time)`.
|
||||
|
|
|
|||
|
|
@ -0,0 +1,225 @@
|
|||
---
|
||||
kep-number: 8
|
||||
title: Promote sysctl annotations to fields
|
||||
authors:
|
||||
- "@ingvagabund"
|
||||
owning-sig: sig-node
|
||||
participating-sigs:
|
||||
- sig-auth
|
||||
reviewers:
|
||||
- "@sjenning"
|
||||
- "@derekwaynecarr"
|
||||
approvers:
|
||||
- "@sjenning "
|
||||
- "@derekwaynecarr"
|
||||
editor:
|
||||
creation-date: 2018-04-30
|
||||
last-updated: 2018-05-02
|
||||
status: provisional
|
||||
see-also:
|
||||
replaces:
|
||||
superseded-by:
|
||||
---
|
||||
|
||||
# Promote sysctl annotations to fields
|
||||
|
||||
## Table of Contents
|
||||
|
||||
* [Promote sysctl annotations to fields](#promote-sysctl-annotations-to-fields)
|
||||
* [Table of Contents](#table-of-contents)
|
||||
* [Summary](#summary)
|
||||
* [Motivation](#motivation)
|
||||
* [Promote annotations to fields](#promote-annotations-to-fields)
|
||||
* [Promote --experimental-allowed-unsafe-sysctls kubelet flag to kubelet config api option](#promote---experimental-allowed-unsafe-sysctls-kubelet-flag-to-kubelet-config-api-option)
|
||||
* [Gate the feature](#gate-the-feature)
|
||||
* [Proposal](#proposal)
|
||||
* [User Stories](#user-stories)
|
||||
* [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
|
||||
* [Risks and Mitigations](#risks-and-mitigations)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
|
||||
## Summary
|
||||
|
||||
Setting `sysctl` parameters through annotations has proven a successful way
|
||||
to define better constraints for running applications.
|
||||
The `sysctl` feature has been tested by a number of people without any serious
|
||||
complaints. Promoting the annotations to fields (i.e. to beta) is another step in moving the
|
||||
`sysctl` feature closer to the stable API.
|
||||
|
||||
Currently, the `sysctl` feature provides the `security.alpha.kubernetes.io/sysctls` and `security.alpha.kubernetes.io/unsafe-sysctls` annotations, which can be used
|
||||
in the following way:
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: sysctl-example
|
||||
annotations:
|
||||
security.alpha.kubernetes.io/sysctls: kernel.shm_rmid_forced=1
|
||||
security.alpha.kubernetes.io/unsafe-sysctls: net.ipv4.route.min_pmtu=1000,kernel.msgmax=1 2 3
|
||||
spec:
|
||||
...
|
||||
```
|
||||
|
||||
The goal is to transition into native fields on pods:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: sysctl-example
|
||||
spec:
|
||||
securityContext:
|
||||
sysctls:
|
||||
- name: kernel.shm_rmid_forced
|
||||
value: 1
|
||||
- name: net.ipv4.route.min_pmtu
|
||||
value: 1000
|
||||
unsafe: true
|
||||
- name: kernel.msgmax
|
||||
value: "1 2 3"
|
||||
unsafe: true
|
||||
...
|
||||
```
|
||||
|
||||
The `sysctl` design document with more details and rationale is available at [design-proposals/node/sysctl.md](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/sysctl.md#pod-api-changes)
|
||||
|
||||
## Motivation
|
||||
|
||||
As mentioned in [contributors/devel/api_changes.md#alpha-field-in-existing-api-version](https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#alpha-field-in-existing-api-version):
|
||||
|
||||
> Previously, annotations were used for experimental alpha features, but are no longer recommended for several reasons:
|
||||
>
|
||||
> They expose the cluster to "time-bomb" data added as unstructured annotations against an earlier API server (https://issue.k8s.io/30819)
|
||||
> They cannot be migrated to first-class fields in the same API version (see the issues with representing a single value in multiple places in backward compatibility gotchas)
|
||||
>
|
||||
> The preferred approach adds an alpha field to the existing object, and ensures it is disabled by default:
|
||||
>
|
||||
> ...
|
||||
|
||||
The annotations as a means to set `sysctl` are no longer necessary.
|
||||
The original intent of annotations was to provide additional description of Kubernetes
|
||||
objects through metadata.
|
||||
It's time to separate the ability to annotate from the ability to change sysctl settings
|
||||
so a cluster operator can clearly distinguish between experimental and supported usage
|
||||
of the feature.
|
||||
|
||||
### Promote annotations to fields
|
||||
|
||||
* Introduce native `sysctl` fields in pods through the `spec.securityContext.sysctls` field as:
|
||||
|
||||
```yaml
|
||||
sysctls:
|
||||
- name: SYSCTL_PATH_NAME
|
||||
value: SYSCTL_PATH_VALUE
|
||||
unsafe: true # optional field
|
||||
```
|
||||
|
||||
* Introduce native `sysctl` fields in [PSP](https://kubernetes.io/docs/concepts/policy/pod-security-policy/) as:
|
||||
|
||||
```yaml
|
||||
apiVersion: policy/v1beta1
|
||||
kind: PodSecurityPolicy
|
||||
metadata:
|
||||
name: psp-example
|
||||
spec:
|
||||
sysctls:
|
||||
- kernel.shmmax
|
||||
- kernel.shmall
|
||||
- net.*
|
||||
```
|
||||
|
||||
More examples at [design-proposals/node/sysctl.md#allowing-only-certain-sysctls](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/sysctl.md#allowing-only-certain-sysctls)
|
||||
|
||||
### Promote `--experimental-allowed-unsafe-sysctls` kubelet flag to kubelet config api option
|
||||
|
||||
As there is no longer a need to consider the `sysctl` feature experimental,
|
||||
the list of unsafe sysctls can be configured accordingly through:
|
||||
|
||||
```go
|
||||
// KubeletConfiguration contains the configuration for the Kubelet
|
||||
type KubeletConfiguration struct {
|
||||
...
|
||||
// Whitelist of unsafe sysctls or unsafe sysctl patterns (ending in *).
|
||||
// Default: nil
|
||||
// +optional
|
||||
AllowedUnsafeSysctls []string `json:"allowedUnsafeSysctls,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
Upstream issue: https://github.com/kubernetes/kubernetes/issues/61669
|
||||
|
||||
### Gate the feature
|
||||
|
||||
As the `sysctl` feature stabilizes, it's time to gate the feature [1] and enable it by default.
|
||||
|
||||
* Expected feature gate key: `Sysctls`
|
||||
* Expected default value: `true`
|
||||
|
||||
With the `Sysctls` feature gate enabled, both the sysctl fields in `Pod` and `PodSecurityPolicy`
|
||||
and the whitelist of unsafe sysctls are acknowledged.
|
||||
If disabled, the fields and the whitelist are just ignored.
|
||||
|
||||
[1] https://kubernetes.io/docs/reference/feature-gates/
|
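For illustration, a minimal kubelet configuration sketch that enables the proposed gate and whitelists a few unsafe sysctls. The `kubelet.config.k8s.io/v1beta1` group/version is an assumption here; the field names follow the proposal above.

```yaml
# Sketch only: enables the proposed Sysctls feature gate and whitelists
# a few unsafe sysctls via the proposed allowedUnsafeSysctls field.
apiVersion: kubelet.config.k8s.io/v1beta1   # assumed config API group/version
kind: KubeletConfiguration
featureGates:
  Sysctls: true                             # expected default once the gate exists
allowedUnsafeSysctls:
- "kernel.msg*"                             # pattern ending in *
- "net.ipv4.route.min_pmtu"
```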
||||
|
||||
## Proposal
|
||||
|
||||
This is where we get down to the nitty gritty of what the proposal actually is.
|
||||
|
||||
### User Stories
|
||||
|
||||
* As a cluster admin, I want to have the `sysctl` feature versioned so I can assure backward compatibility
|
||||
and proper transformation between versioned and internal representations and back.
|
||||
* As a cluster admin, I want to be confident the `sysctl` feature is stable enough and well supported so
|
||||
applications are properly isolated.
|
||||
* As a cluster admin, I want to be able to apply the `sysctl` constraints on the cluster level so
|
||||
I can define the default constraints for all pods.
|
||||
|
||||
### Implementation Details/Notes/Constraints
|
||||
|
||||
Extending the `PodSecurityContext` struct with a `Sysctls` field:
|
||||
|
||||
```go
|
||||
// PodSecurityContext holds pod-level security attributes and common container settings.
|
||||
// Some fields are also present in container.securityContext. Field values of
|
||||
// container.securityContext take precedence over field values of PodSecurityContext.
|
||||
type PodSecurityContext struct {
|
||||
...
|
||||
// Sysctls is a list of sysctls to be applied to the pod.
|
||||
Sysctls []Sysctl `json:"sysctls,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
Extending `PodSecurityPolicySpec` struct with `Sysctls` field:
|
||||
|
||||
```go
|
||||
// PodSecurityPolicySpec defines the policy enforced on sysctls.
|
||||
type PodSecurityPolicySpec struct {
|
||||
...
|
||||
// Sysctls is a white list of allowed sysctls in a pod spec.
|
||||
Sysctls []Sysctl `json:"sysctls,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
Follow the steps in [devel/api_changes.md#alpha-field-in-existing-api-version](https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#alpha-field-in-existing-api-version)
|
||||
during implementation.
|
||||
|
||||
Validation checks implemented as part of [#27180](https://github.com/kubernetes/kubernetes/pull/27180).
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
We need to assure backward compatibility, i.e. object specifications with `sysctl` annotations
|
||||
must still work after the graduation.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
* API changes allowing the pod-scoped `sysctl` to be configured via the `spec.securityContext` field.
|
||||
* API changes allowing the cluster-scoped `sysctl` to be configured via the `PodSecurityPolicy` object
|
||||
* Promote `--experimental-allowed-unsafe-sysctls` kubelet flag to kubelet config api option
|
||||
* feature gate enabled by default
|
||||
* e2e tests
|
||||
|
||||
## Implementation History
|
||||
|
||||
The `sysctl` feature is tracked as part of [features#34](https://github.com/kubernetes/features/issues/34).
|
||||
Promoting the annotations to fields is one of the goals tracked there.
|
||||
|
|
@ -0,0 +1,392 @@
|
|||
---
|
||||
kep-number: 8
|
||||
title: Efficient Node Heartbeat
|
||||
authors:
|
||||
- "@wojtek-t"
|
||||
- "with input from @bgrant0607, @dchen1107, @yujuhong, @lavalamp"
|
||||
owning-sig: sig-node
|
||||
participating-sigs:
|
||||
- sig-scalability
|
||||
- sig-apimachinery
|
||||
- sig-scheduling
|
||||
reviewers:
|
||||
- "@deads2k"
|
||||
- "@lavalamp"
|
||||
approvers:
|
||||
- "@dchen1107"
|
||||
- "@derekwaynecarr"
|
||||
editor: TBD
|
||||
creation-date: 2018-04-27
|
||||
last-updated: 2018-04-27
|
||||
status: implementable
|
||||
see-also:
|
||||
- https://github.com/kubernetes/kubernetes/issues/14733
|
||||
- https://github.com/kubernetes/kubernetes/pull/14735
|
||||
replaces:
|
||||
- n/a
|
||||
superseded-by:
|
||||
- n/a
|
||||
---
|
||||
|
||||
# Efficient Node Heartbeats
|
||||
|
||||
## Table of Contents
|
||||
|
||||
Table of Contents
|
||||
=================
|
||||
|
||||
* [Efficient Node Heartbeats](#efficient-node-heartbeats)
|
||||
* [Table of Contents](#table-of-contents)
|
||||
* [Summary](#summary)
|
||||
* [Motivation](#motivation)
|
||||
* [Goals](#goals)
|
||||
* [Non-Goals](#non-goals)
|
||||
* [Proposal](#proposal)
|
||||
* [Risks and Mitigations](#risks-and-mitigations)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
* [Alternatives](#alternatives)
|
||||
* [Dedicated “heartbeat” object instead of “leader election” one](#dedicated-heartbeat-object-instead-of-leader-election-one)
|
||||
* [Events instead of dedicated heartbeat object](#events-instead-of-dedicated-heartbeat-object)
|
||||
* [Reuse the Component Registration mechanisms](#reuse-the-component-registration-mechanisms)
|
||||
* [Split Node object into two parts at etcd level](#split-node-object-into-two-parts-at-etcd-level)
|
||||
* [Delta compression in etcd](#delta-compression-in-etcd)
|
||||
* [Replace etcd with other database](#replace-etcd-with-other-database)
|
||||
|
||||
## Summary
|
||||
|
||||
Node heartbeats are necessary for the correct functioning of a Kubernetes cluster.
|
||||
This proposal makes them significantly cheaper from both scalability and
|
||||
performance perspective.
|
||||
|
||||
## Motivation
|
||||
|
||||
While running different scalability tests we observed that in big enough clusters
|
||||
(more than 2000 nodes) with a non-trivial number of images used by pods on all
|
||||
nodes (10-15), we were hitting etcd limits for its database size. That effectively
|
||||
means that etcd enters "alert mode" and stops accepting all write requests.
|
||||
|
||||
The underlying root cause is a combination of:
|
||||
|
||||
- etcd keeping both current state and transaction log with copy-on-write
|
||||
- node heartbeats being potentially very large objects (note that images
|
||||
are only one potential problem; the second is volumes, as customers
|
||||
want to mount 100+ volumes to a single node) - they may easily exceed 15kB;
|
||||
even though the patch sent over the network is small, in etcd we store the
|
||||
whole Node object
|
||||
- Kubelet sending heartbeats every 10s
|
||||
|
||||
This proposal presents a proper solution for that problem.
|
||||
|
||||
|
||||
Note that currently (by default):
|
||||
|
||||
- Lack of NodeStatus update for `<node-monitor-grace-period>` (default: 40s)
|
||||
results in NodeController marking node as NotReady (pods are no longer
|
||||
scheduled on that node)
|
||||
- Lack of NodeStatus updates for `<pod-eviction-timeout>` (default: 5m)
|
||||
results in NodeController starting pod evictions from that node
|
||||
|
||||
We would like to preserve that behavior.
|
||||
|
||||
|
||||
### Goals
|
||||
|
||||
- Reduce size of etcd by making node heartbeats cheaper
|
||||
|
||||
### Non-Goals
|
||||
|
||||
The following are nice-to-haves, but not primary goals:
|
||||
|
||||
- Reduce resource usage (cpu/memory) of control plane (e.g. due to processing
|
||||
less and/or smaller objects)
|
||||
- Reduce watch-related load on Node objects
|
||||
|
||||
## Proposal
|
||||
|
||||
We propose introducing a new `Lease` built-in API in the newly created API group
|
||||
`coordination.k8s.io`. To make it easily reusable for other purposes it will
|
||||
be namespaced. Its schema will be as follows:
|
||||
|
||||
```go
|
||||
type Lease struct {
|
||||
metav1.TypeMeta `json:",inline"`
|
||||
// Standard object's metadata.
|
||||
// More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata
|
||||
// +optional
|
||||
ObjectMeta metav1.ObjectMeta `json:"metadata,omitempty"`
|
||||
|
||||
// Specification of the Lease.
|
||||
// More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#spec-and-status
|
||||
// +optional
|
||||
Spec LeaseSpec `json:"spec,omitempty"`
|
||||
}
|
||||
|
||||
type LeaseSpec struct {
|
||||
HolderIdentity string `json:"holderIdentity"`
|
||||
LeaseDurationSeconds int32 `json:"leaseDurationSeconds"`
|
||||
AcquireTime metav1.MicroTime `json:"acquireTime"`
|
||||
RenewTime metav1.MicroTime `json:"renewTime"`
|
||||
LeaseTransitions int32 `json:"leaseTransitions"`
|
||||
}
|
||||
```
|
||||
|
||||
The Spec is effectively a copy of the already existing (and thus proven) [LeaderElectionRecord][].
|
||||
The only difference is using `MicroTime` instead of `Time` for better precision.
|
||||
That would hopefully allow us to go directly to Beta.
|
||||
|
||||
We will use that object to represent node heartbeat - for each Node there will
|
||||
be a corresponding `Lease` object with Name equal to Node name in a newly
|
||||
created dedicated namespace (we considered using `kube-system` namespace but
|
||||
decided that it's already too overloaded).
|
||||
That namespace should be created automatically (similarly to "default" and
|
||||
"kube-system", probably by NodeController) and never be deleted (so that nodes
|
||||
don't require permission for it).
|
||||
|
||||
We considered using CRD instead of built-in API. However, even though CRDs are
|
||||
`the new way` for creating new APIs, they don't yet have versioning support
|
||||
and are significantly less performant (due to lack of protobuf support yet).
|
||||
We also don't know whether we could seamlessly transition storage from a CRD
|
||||
to a built-in API if we ran into a performance or any other problems.
|
||||
As a result, we decided to proceed with built-in API.
|
||||
|
||||
|
||||
With this new API in place, we will change Kubelet so that:
|
||||
|
||||
1. Kubelet is periodically computing NodeStatus every 10s (as it is now), but that will
|
||||
be independent from reporting status
|
||||
1. Kubelet is reporting NodeStatus if:
|
||||
- there was a meaningful change in it (initially we can probably assume that every
|
||||
change is meaningful, including e.g. images on the node)
|
||||
- or it didn’t report it over the last `node-status-update-period` seconds
|
||||
1. Kubelet creates and periodically updates its own Lease object and frequency
|
||||
of those updates is independent from NodeStatus update frequency.
|
||||
|
||||
In the meantime, we will change `NodeController` to treat both updates of NodeStatus
|
||||
object as well as updates of the new `Lease` object corresponding to a given
|
||||
node as a healthiness signal from a given Kubelet. This will make it work for both old
|
||||
and new Kubelets.
|
||||
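For illustration, a minimal sketch of what such a per-node `Lease` object could look like. The `coordination.k8s.io/v1beta1` version, the `kube-node-lease` namespace name, and the concrete values are assumptions, not part of this proposal.

```yaml
# Sketch only: a Lease corresponding to node "node-1", renewed by its Kubelet.
apiVersion: coordination.k8s.io/v1beta1   # assumed version for the new group
kind: Lease
metadata:
  name: node-1                 # same name as the Node object
  namespace: kube-node-lease   # hypothetical dedicated namespace
spec:
  holderIdentity: node-1
  leaseDurationSeconds: 40     # illustrative; ties to node-monitor-grace-period
  renewTime: "2018-04-27T10:00:00.000000Z"
```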
|
||||
We should also:
|
||||
|
||||
1. audit all other existing core controllers to verify if they also don’t require
|
||||
similar changes in their logic ([ttl controller][] being one of the examples)
|
||||
1. change controller manager to auto-register that `Lease` CRD
|
||||
1. ensure that `Lease` resource is deleted when corresponding node is
|
||||
deleted (probably via owner references)
|
||||
1. [out-of-scope] migrate all LeaderElection code to use that CRD
|
||||
|
||||
Once all the code changes are done, we will:
|
||||
|
||||
1. start updating `Lease` object every 10s by default, at the same time
|
||||
reducing frequency of NodeStatus updates initially to 40s by default.
|
||||
We will reduce it further later.
|
||||
Note that it doesn't reduce frequency by which Kubelet sends "meaningful"
|
||||
changes - it only impacts the frequency of "lastHeartbeatTime" changes.
|
||||
<br> TODO: That still results in higher average QPS. It should be acceptable but
|
||||
needs to be verified.
|
||||
1. announce that we are going to reduce frequency of NodeStatus updates further
|
||||
and give people 1-2 releases to switch their code to use `Lease`
|
||||
object (if they relied on frequent NodeStatus changes)
|
||||
1. further reduce NodeStatus updates frequency to not less often than once per
|
||||
1 minute.
|
||||
We can’t stop periodically updating NodeStatus as it would be API breaking change,
|
||||
but it’s fine to reduce its frequency (though we should continue writing it at
|
||||
least once per eviction period).
|
||||
|
||||
|
||||
To be considered:
|
||||
|
||||
1. We may consider reducing frequency of NodeStatus updates to once every 5 minutes
|
||||
(instead of 1 minute). That would help with performance/scalability even more.
|
||||
Caveats:
|
||||
- NodeProblemDetector is currently updating (some) node conditions every 1 minute
|
||||
(unconditionally, because lastHeartbeatTime always changes). To make reduction
|
||||
of NodeStatus updates frequency really useful, we should also change NPD to
|
||||
work in a similar mode (check periodically if condition changes, but report only
|
||||
when something changed or no status was reported for a given time) and decrease
|
||||
its reporting frequency too.
|
||||
- In general, we recommend to keep frequencies of NodeStatus reporting in both
|
||||
Kubelet and NodeProblemDetector in sync (once all changes will be done) and
|
||||
that should be reflected in [NPD documentation][].
|
||||
- Note that reducing frequency to 1 minute already gives us almost a 6x improvement.
|
||||
It seems more than enough for any foreseeable future assuming we won’t
|
||||
significantly increase the size of object Node.
|
||||
Note that if we keep adding node conditions owned by other components, the
|
||||
number of writes of Node object will go up. But that issue is separate from
|
||||
that proposal.
|
||||
|
||||
Other notes:
|
||||
|
||||
1. Additional advantage of using Lease for that purpose would be the
|
||||
ability to exclude it from audit profile and thus reduce the audit logs footprint.
|
||||
|
||||
[LeaderElectionRecord]: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/leaderelection/resourcelock/interface.go#L37
|
||||
[ttl controller]: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/ttl/ttl_controller.go#L155
|
||||
[NPD documentation]: https://kubernetes.io/docs/tasks/debug-application-cluster/monitor-node-health/
|
||||
[kubernetes/kubernetes#63667]: https://github.com/kubernetes/kubernetes/issues/63677
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
Reducing the default frequency of NodeStatus updates may potentially break clients
|
||||
relying on frequent Node object updates. However, in non-managed solutions, customers
|
||||
will still be able to restore previous behavior by setting appropriate flag values.
|
||||
Thus, changing defaults to what we recommend is the path to go with.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
The API can be immediately promoted to Beta, as the API is effectively a copy of
|
||||
already existing LeaderElectionRecord. It will be promoted to GA once it has spent
|
||||
a sufficient amount of time as Beta with no changes.
|
||||
|
||||
The changes in components logic (Kubelet, NodeController) should be done behind
|
||||
a feature gate. We suggest making that enabled by default once the feature is
|
||||
implemented.
|
||||
|
||||
## Implementation History
|
||||
|
||||
- RRRR-MM-DD: KEP Summary, Motivation and Proposal merged
|
||||
|
||||
## Alternatives
|
||||
|
||||
We considered a number of alternatives, most important mentioned below.
|
||||
|
||||
### Dedicated “heartbeat” object instead of “leader election” one
|
||||
|
||||
Instead of introducing and using “lease” object, we considered
|
||||
introducing a dedicated “heartbeat” object for that purpose. Apart from that,
|
||||
all the details about the solution remain pretty much the same.
|
||||
|
||||
Pros:
|
||||
|
||||
- Conceptually easier to understand what the object is for
|
||||
|
||||
Cons:
|
||||
|
||||
- Introduces a new, narrow-purpose API. Lease is already used by other
|
||||
components, implemented using annotations on Endpoints and ConfigMaps.
|
||||
|
||||
### Events instead of dedicated heartbeat object
|
||||
|
||||
Instead of introducing a dedicated object, we considered using “Event” object
|
||||
for that purpose. At the high-level the solution looks very similar.
|
||||
The differences from the initial proposal are:
|
||||
|
||||
- we use existing “Event” api instead of introducing a new API
|
||||
- we create a dedicated namespace; events that should be treated as healthiness
|
||||
signal by NodeController will be written by Kubelets (unconditionally) to that
|
||||
namespace
|
||||
- NodeController will be watching only Events from that namespace to avoid
|
||||
processing all events in the system (the volume of all events will be huge)
|
||||
- dedicated namespace also helps with security - we can give access to write to
|
||||
that namespace only to Kubelets
|
||||
|
||||
Pros:
|
||||
|
||||
- No need to introduce new API
|
||||
- We could use that approach much earlier because of that.
|
||||
- We already need to optimize event throughput - separate etcd instance we have
|
||||
for them may help with tuning
|
||||
- Low-risk roll-forward/roll-back: no new objects are involved (node controller
|
||||
starts watching events, kubelet just reduces the frequency of heartbeats)
|
||||
|
||||
Cons:
|
||||
|
||||
- Events are conceptually “best-effort” in the system:
|
||||
- they may be silently dropped in case of problems in the system (the event recorder
|
||||
library doesn’t retry on errors, e.g. to not make things worse when control-plane
|
||||
is starved)
|
||||
- currently, components reporting events don’t even know if it succeeded or not (the
|
||||
library is built in a way that you throw the event into it and are not notified if
|
||||
that was successfully submitted or not).
|
||||
A Kubelet sending any other update has full control over how/whether to retry errors.
|
||||
- lack of fairness mechanisms means that even when some events are being successfully
|
||||
sent, there is no guarantee that any event from a given Kubelet will be submitted
|
||||
over a given time period
|
||||
So this would require a different mechanism of reporting those “heartbeat” events.
|
||||
- Once we have a “request priority” concept, I think events should have the lowest one.
|
||||
Even though no particular heartbeat is important, a guarantee that some heartbeats will
|
||||
be successfully sent is crucial (not delivering any of them will result in unnecessary
|
||||
evictions or not-scheduling to a given node). So heartbeats should be of the highest
|
||||
priority. OTOH, node heartbeats are one of the most important things in the system
|
||||
(not delivering them may result in unnecessary evictions), so they should have the
|
||||
highest priority.
|
||||
- No core component in the system is currently watching events
|
||||
- it would make the system’s operation harder to explain
|
||||
- Users watch Node objects for heartbeats (even though we didn’t recommend it).
|
||||
Introducing a new object for the purpose of heartbeat will allow those users to
|
||||
migrate, while using events for that purpose breaks that ability. (Watching events
|
||||
may also put us in a tough situation for performance reasons.)
|
||||
- Deleting all events (e.g. event etcd failure + playbook response) should continue to
|
||||
not cause a catastrophic failure and the design will need to account for this.
|
||||
|
||||
### Reuse the Component Registration mechanisms
|
||||
|
||||
Kubelet is one of the control-plane components (shared controller). Some time ago, the Component
|
||||
Registration proposal converged into three parts:
|
||||
|
||||
- Introducing an API for registering non-pod endpoints, including readiness information: #18610
|
||||
- Changing endpoints controller to also watch those endpoints
|
||||
- Identifying some of those endpoints as “components”
|
||||
|
||||
We could reuse that mechanism to represent Kubelets as non-pod endpoint API.
|
||||
|
||||
Pros:
|
||||
|
||||
- Utilizes desired API
|
||||
|
||||
Cons:
|
||||
|
||||
- Requires introducing that new API
|
||||
- Stabilizing the API would take some time
|
||||
- Implementing that API requires multiple changes in different components
|
||||
|
||||
### Split Node object into two parts at etcd level
|
||||
|
||||
We may stick to existing Node API and solve the problem at storage layer. At the
|
||||
high level, this means splitting the Node object into two parts in etcd (frequently
|
||||
modified one and the rest).
|
||||
|
||||
Pros:
|
||||
|
||||
- No need to introduce new API
|
||||
- No need to change any components other than kube-apiserver
|
||||
|
||||
Cons:
|
||||
|
||||
- Very complicated to support watch
|
||||
- Not very generic (e.g. splitting Spec and Status doesn’t help, it needs to be just
|
||||
the heartbeat part)
|
||||
- [minor] Doesn’t reduce amount of data that should be processed in the system (writes,
|
||||
reads, watches, …)
|
||||
|
||||
### Delta compression in etcd
|
||||
|
||||
An alternative for the above can be solving this completely at the etcd layer. To
|
||||
achieve that, instead of storing full updates in etcd transaction log, we will just
|
||||
store “deltas” and snapshot the whole object only every X seconds/minutes.
|
||||
|
||||
Pros:
|
||||
|
||||
- Doesn’t require any changes to any Kubernetes components
|
||||
|
||||
Cons:
|
||||
|
||||
- Computing the delta is tricky (etcd doesn’t understand the Kubernetes data model, and
|
||||
the delta between two protobuf-encoded objects is not necessarily small)
|
||||
- May require a major rewrite of etcd code and not even be accepted by its maintainers
|
||||
- More expensive computationally to get an object in a given resource version (which
|
||||
is what e.g. watch is doing)
|
||||
|
||||
### Replace etcd with other database
|
||||
|
||||
Instead of using etcd, we may also consider using some other open-source solution.
|
||||
|
||||
Pros:
|
||||
|
||||
- Doesn’t require new API
|
||||
|
||||
Cons:
|
||||
|
||||
- We don’t even know if there exists a solution that solves our problems and can be used.
|
||||
- Migration will take us years.
|
||||
|
|
@ -1 +1 @@
|
|||
8
|
||||
13
|
||||
|
|
|
|||
|
|
@ -0,0 +1,222 @@
|
|||
---
|
||||
kep-number: 8
|
||||
title: Kustomize
|
||||
authors:
|
||||
- "@pwittrock"
|
||||
- "@monopole"
|
||||
owning-sig: sig-cli
|
||||
participating-sigs:
|
||||
- sig-cli
|
||||
reviewers:
|
||||
- "@droot"
|
||||
approvers:
|
||||
- "@maciej"
|
||||
editor: "@droot"
|
||||
creation-date: 2018-05-05
|
||||
last-updated: 2018-05-05
|
||||
status: implemented
|
||||
see-also:
|
||||
- n/a
|
||||
replaces:
|
||||
- kinflate # Old name for kustomize
|
||||
superseded-by:
|
||||
- n/a
|
||||
---
|
||||
|
||||
# Kustomize
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Kustomize](#kustomize)
|
||||
- [Table of Contents](#table-of-contents)
|
||||
- [Summary](#summary)
|
||||
- [Motivation](#motivation)
|
||||
- [Goals](#goals)
|
||||
- [Non-Goals](#non-goals)
|
||||
- [Proposal](#proposal)
|
||||
- [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
|
||||
- [Risks and Mitigations](#risks-and-mitigations)
|
||||
- [Risks of Not Having a Solution](#risks-of-not-having-a-solution)
|
||||
- [Graduation Criteria](#graduation-criteria)
|
||||
- [Implementation History](#implementation-history)
|
||||
- [Drawbacks](#drawbacks)
|
||||
- [Alternatives](#alternatives)
|
||||
- [FAQ](#faq)
|
||||
|
||||
## Summary
|
||||
|
||||
Declarative specification of Kubernetes objects is the recommended way to manage Kubernetes
|
||||
production workloads; however, gaps in the kubectl tooling force users to write their own scripting and
|
||||
tooling to augment the declarative tools with preprocessing transformations.
|
||||
While most of these transformations already exist as imperative kubectl commands, they are not natively accessible
|
||||
from a declarative workflow.
|
||||
|
||||
This KEP describes how `kustomize` addresses this problem by providing a declarative format for users to access
|
||||
the imperative kubectl commands they are already familiar with from declarative workflows.
|
||||
|
||||
## Motivation
|
||||
|
||||
The kubectl command provides a cli for:
|
||||
|
||||
- accessing the Kubernetes APIs through json or yaml configuration
|
||||
- porcelain commands for generating and transforming configuration from command-line flags.
|
||||
|
||||
Examples:
|
||||
|
||||
- Generate a configmap or secret from a text or binary file
|
||||
- `kubectl create configmap`, `kubectl create secret`
|
||||
- Users can manage their configmaps and secrets as text and binary files
|
||||
|
||||
- Create or update fields that cut across other fields and objects
|
||||
- `kubectl label`, `kubectl annotate`
|
||||
- Users can add and update labels for all objects composing an application
|
||||
|
||||
- Transform an existing declarative configuration without forking it
|
||||
- `kubectl patch`
|
||||
- Users may generate multiple variations of the same workload
|
||||
|
||||
- Transform live resources arbitrarily without auditing
|
||||
- `kubectl edit`
|
||||
|
||||
To create a Secret from a binary file, users must first base64 encode the binary file and then create a Secret yaml
|
||||
config from the resulting data. Because the source of truth is actually the binary file, not the config,
|
||||
users must write scripting and tooling to keep the two sources consistent.
|
||||
|
||||
Instead, users should be able to access the simple, but necessary, functionality available in the imperative
|
||||
kubectl commands from their declarative workflow.
|
||||
|
||||
#### Long standing issues
|
||||
|
||||
Kustomize addresses a number of long standing issues in kubectl.
|
||||
|
||||
- Declarative enumeration of multiple files [kubernetes/kubernetes#24649](https://github.com/kubernetes/kubernetes/issues/24649)
|
||||
- Declarative configmap and secret creation: [kubernetes/kubernetes#24744](https://github.com/kubernetes/kubernetes/issues/24744), [kubernetes/kubernetes#30337](https://github.com/kubernetes/kubernetes/issues/30337)
|
||||
- Configmap rollouts: [kubernetes/kubernetes#22368](https://github.com/kubernetes/kubernetes/issues/22368)
|
||||
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/tree/master/examples/helloWorld#how-this-works-with-kustomize)
|
||||
- Name/label scoping and safer pruning: [kubernetes/kubernetes#1698](https://github.com/kubernetes/kubernetes/issues/1698)
|
||||
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/blob/master/examples/breakfast.md#demo-configure-breakfast)
|
||||
- Template-free add-on customization: [kubernetes/kubernetes#23233](https://github.com/kubernetes/kubernetes/issues/23233)
|
||||
- [Example in kustomize](https://github.com/kubernetes-sigs/kustomize/tree/master/examples/helloWorld#staging-kustomization)
|
||||
|
||||
### Goals
|
||||
|
||||
- Declarative support for defining ConfigMaps and Secrets generated from binary and text files
|
||||
- Declarative support for adding or updating cross-cutting fields
|
||||
- labels & selectors
|
||||
- annotations
|
||||
- names (as transformation of the original name)
|
||||
- Declarative support for applying patches to transform arbitrary fields
|
||||
- use strategic-merge-patch format
|
||||
- Ease of integration with CICD systems that maintain configuration in a version control repository
|
||||
as a single source of truth, and take action (build, test, deploy, etc.) when that truth changes (gitops).
|
||||
|
||||
### Non-Goals
|
||||
|
||||
#### Exposing every imperative kubectl command in a declarative fashion
|
||||
|
||||
The scope of kustomize is limited only to functionality gaps that would otherwise prevent users from
|
||||
defining their workloads in a purely declarative manner (e.g. without writing scripts to perform pre-processing
|
||||
or linting). Commands such as `kubectl run`, `kubectl create deployment` and `kubectl edit` are unnecessary
|
||||
in a declarative workflow because a Deployment can easily be managed as declarative config.
|
||||
|
||||
#### Providing a simpler facade on top of the Kubernetes APIs
|
||||
|
||||
The community has developed a number of facades in front of the Kubernetes APIs using
|
||||
templates or DSLs. Attempting to provide an alternative interface to the Kubernetes API is
|
||||
a non-goal. Instead the focus is on:
|
||||
|
||||
- Facilitating simple cross-cutting transformations on the raw config that would otherwise require other tooling such
|
||||
as *sed*
|
||||
- Generating configuration when the source of truth resides elsewhere
|
||||
- Patching existing configuration with transformations
|
||||
|
||||
## Proposal
|
||||
|
||||
### Capabilities
|
||||
|
||||
**Note:** This proposal has already been implemented in `github.com/kubernetes/kubectl`.
|
||||
|
||||
Define a new meta config format called *kustomization.yaml*.
|
||||
|
||||
#### *kustomization.yaml* will allow users to reference config files
|
||||
|
||||
- Path to config yaml file (similar to `kubectl apply -f <file>`)
|
||||
- URLs to config yaml files (similar to `kubectl apply -f <url>`)
|
||||
- Path to *kustomization.yaml* file (takes the output of running kustomize)
|
||||
|
||||
#### *kustomization.yaml* will allow users to generate configs from files
|
||||
|
||||
- ConfigMap (`kubectl create configmap`)
|
||||
- Secret (`kubectl create secret`)
|
||||
|
||||
#### *kustomization.yaml* will allow users to apply transformations to configs
|
||||
|
||||
- Label (`kubectl label`)
|
||||
- Annotate (`kubectl annotate`)
|
||||
- Strategic-Merge-Patch (`kubectl patch`)
|
||||
- Name-Prefix
|
||||
|
||||
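For illustration, a small *kustomization.yaml* sketch combining the references, generators, and transformations listed above. The field names are indicative of the kustomize format and may not match the exact schema; the file names are hypothetical.

```yaml
# Sketch only: combines resource references, generators, and transformations.
namePrefix: staging-            # Name-Prefix transformation
commonLabels:                   # cross-cutting labels (like `kubectl label`)
  app: hello
commonAnnotations:              # cross-cutting annotations (like `kubectl annotate`)
  owner: team-a
resources:                      # referenced config files
- deployment.yaml
- service.yaml
configMapGenerator:             # like `kubectl create configmap`
- name: app-config
  files:
  - config/app.properties
secretGenerator:                # like `kubectl create secret`
- name: app-tls
  files:
  - tls.crt
  - tls.key
patches:                        # strategic-merge-patch (like `kubectl patch`)
- set-replicas.yaml
```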
### UX
|
||||
|
||||
Kustomize will also contain subcommands to facilitate authoring *kustomization.yaml*.
|
||||
|
||||
#### Edit
|
||||
|
||||
The edit subcommands will allow users to modify the *kustomization.yaml* through cli commands containing
|
||||
helpful messaging and documentation.
|
||||
|
||||
- Add ConfigMap - like `kubectl create configmap` but declarative in *kustomization.yaml*
|
||||
- Add Secret - like `kubectl create secret` but declarative in *kustomization.yaml*
|
||||
- Add Resource - adds a file reference to *kustomization.yaml*
|
||||
- Set NamePrefix - adds NamePrefix declaration to *kustomization.yaml*
|
||||
|
||||
#### Diff
|
||||
|
||||
The diff subcommand will allow users to see a diff of the original and transformed configuration files
|
||||
|
||||
- Generated config (configmap) will show the files as created
|
||||
- Transformations (name prefix) will show the files as modified
|
||||
|
||||
### Implementation Details/Notes/Constraints [optional]
|
||||
|
||||
Kustomize has already been implemented in the `github.com/kubernetes/kubectl` repo, and should be moved to a
|
||||
separate repo for the subproject.
|
||||
|
||||
Kustomize was initially developed as its own CLI; however, once it has matured, it should be published
|
||||
as a subcommand of kubectl or as a statically linked plugin. It should also be more tightly integrated with apply.
|
||||
|
||||
- Create the *kustomize* sig-cli subproject and update sigs.yaml
|
||||
- Move the existing kustomize code from `github.com/kubernetes/kubectl` to `github.com/kubernetes-sigs/kustomize`
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
|
||||
### Risks of Not Having a Solution
|
||||
|
||||
By not providing a viable option for working directly with Kubernetes APIs as json or
|
||||
yaml config, we risk the ecosystem becoming fragmented with various bespoke API facades.
|
||||
By ensuring the raw Kubernetes API json or yaml is a usable approach for declaratively
|
||||
managing applications, even tools that do not use the Kubernetes API as their native format can
|
||||
better work with one another through transformation to a common format.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
- Dogfood kustomize by either:
|
||||
- moving one or more of our own (OSS Kubernetes) services to it.
|
||||
- getting user feedback from one or more mid or large application deployments using kustomize.
|
||||
- Publish kustomize as a subcommand of kubectl.
|
||||
|
||||
## Implementation History
|
||||
|
||||
kustomize was implemented in the kubectl repo before subprojects became a first class thing in Kubernetes.
|
||||
The code has been fully implemented, but it must be moved to a proper location.
|
||||
|
||||
## Drawbacks
|
||||
|
||||
|
||||
## Alternatives
|
||||
|
||||
1. Users write their own bespoke scripts to generate and transform the config before it is applied.
|
||||
2. Users don't work with the API directly, and use or develop DSLs for interacting with Kubernetes.
|
||||
|
||||
## FAQ
|
||||
|
|
@ -4,18 +4,22 @@ title: Cloud Provider Controller Manager
|
|||
authors:
|
||||
- "@cheftako"
|
||||
- "@calebamiles"
|
||||
- "@hogepodge"
|
||||
owning-sig: sig-apimachinery
|
||||
participating-sigs:
|
||||
- sig-apps
|
||||
- sig-aws
|
||||
- sig-azure
|
||||
- sig-cloud-provider
|
||||
- sig-gcp
|
||||
- sig-network
|
||||
- sig-openstack
|
||||
- sig-storage
|
||||
reviewers:
|
||||
- "@wlan0"
|
||||
- "@andrewsykim"
|
||||
- "@calebamiles"
|
||||
- "@hogepodge"
|
||||
- "@jagosan"
|
||||
approvers:
|
||||
- "@thockin"
|
||||
editor: TBD
|
||||
|
|
@ -41,16 +45,21 @@ replaces:
|
|||
- [API Server Changes](#api-server-changes)
|
||||
- [Volume Management Changes](#volume-management-changes)
|
||||
- [Deployment Changes](#deployment-changes)
|
||||
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
|
||||
- [Repository Requirements](#repository-requirements)
|
||||
- [Notes for Repository Requirements](#notes-for-repository-requirements)
|
||||
- [Repository Timeline](#repository-timeline)
|
||||
- [Security Considerations](#security-considerations)
|
||||
- [Graduation Criteria](#graduation-criteria)
|
||||
- [Graduation to Beta](#graduation-to-beta)
|
||||
- [Process Goals](#process-goals)
|
||||
- [Implementation History](#implementation-history)
|
||||
- [Alternatives](#alternatives)
|
||||
|
||||
## Summary
|
||||
|
||||
We want to remove any cloud provider specific logic from the kubernetes/kubernetes repo. We want to restructure the code
|
||||
to make is easy for any cloud provider to extend the kubernetes core in a consistent manner for their cloud. New cloud
|
||||
to make it easy for any cloud provider to extend the kubernetes core in a consistent manner for their cloud. New cloud
|
||||
providers should look at the [Creating a Custom Cluster from Scratch](https://kubernetes.io/docs/getting-started-guides/scratch/#cloud-provider)
|
||||
and the [cloud provider interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go#L31)
|
||||
which will need to be implemented.
|
||||
|
|
@ -208,8 +217,8 @@ taints.
|
|||
|
||||
### API Server Changes
|
||||
|
||||
Finally, in the kube-apiserver, the cloud provider is used for transferring SSH keys to all of the nodes, and within an a
|
||||
dmission controller for setting labels on persistent volumes.
|
||||
Finally, in the kube-apiserver, the cloud provider is used for transferring SSH keys to all of the nodes, and within an
|
||||
admission controller for setting labels on persistent volumes.
|
||||
|
||||
Kube-apiserver uses the cloud provider for two purposes
|
||||
|
||||
|
|
@ -220,7 +229,7 @@ Kube-apiserver uses the cloud provider for two purposes
|
|||
|
||||
Volumes need cloud providers, but they only need **specific** cloud providers. The majority of volume management logic
|
||||
resides in the controller manager. These controller loops need to be moved into the cloud-controller manager. The cloud
|
||||
controller manager also needs a mechanism to read parameters for initilization from cloud config. This can be done via
|
||||
controller manager also needs a mechanism to read parameters for initialization from cloud config. This can be done via
|
||||
config maps.
|
||||
|
||||
There are two entirely different approach to refactoring volumes -
|
||||
|
|
@ -257,6 +266,102 @@ In case of the cloud-controller-manager, the deployment should be deleted using
|
|||
kubectl delete -f cloud-controller-manager.yml
|
||||
```
|
||||
|
||||
### Implementation Details/Notes/Constraints
|
||||
|
||||
#### Repository Requirements
|
||||
|
||||
**This is a proposed structure, and may change during the 1.11 release cycle.
|
||||
WG-Cloud-Provider will work with individual SIGs to refine these requirements
|
||||
to maintain consistency while meeting the technical needs of the provider
|
||||
maintainers.**
|
||||
|
||||
Each cloud provider hosted within the `kubernetes` organization shall have a
|
||||
single repository named `kubernetes/cloud-provider-<provider_name>`. Those
|
||||
repositories shall have the following structure:
|
||||
|
||||
* A `cloud-controller-manager` subdirectory that contains the implementation
|
||||
of the provider-specific cloud controller.
|
||||
* A `docs` subdirectory.
|
||||
* A `docs/cloud-controller-manager.md` file that describes the options and
|
||||
usage of the cloud controller manager code.
|
||||
* A `docs/testing.md` file that describes how the provider code is tested.
|
||||
* A `Makefile` with a `test` entrypoint to run the provider tests.
|
||||
|
||||
Additionally, the repository should have:
|
||||
|
||||
* A `docs/getting-started.md` file that describes the installation and basic
|
||||
operation of the cloud controller manager code.
|
||||
|
||||
Where the provider has additional capabilities, the repository should have
|
||||
the following subdirectories that contain the common features:
|
||||
|
||||
* `dns` for DNS provider code.
|
||||
* `cni` for the Container Network Interface (CNI) driver.
|
||||
* `csi` for the Container Storage Interface (CSI) driver.
|
||||
* `flex` for the Flex Volume driver.
|
||||
* `installer` for custom installer code.
|
||||
|
||||
Each repository may have additional directories and files that are used for
|
||||
additional feature that include but are not limited to:
|
||||
|
||||
* Other provider specific testing.
|
||||
* Additional documentation, including examples and developer documentation.
|
||||
* Dependencies on provider-hosted or other external code.
|
||||
|
||||
|
||||
##### Notes for Repository Requirements
|
||||
|
||||
The purpose of these requirements is to define a common structure for the
|
||||
cloud provider repositories owned by current and future cloud provider SIGs.
|
||||
In accordance with the
|
||||
[WG-Cloud-Provider Charter](https://docs.google.com/document/d/1m4Kvnh_u_9cENEE9n1ifYowQEFSgiHnbw43urGJMB64/edit#)
|
||||
to "define a set of common expected behaviors across cloud providers", this
|
||||
proposal defines the location and structure of commonly expected code.
|
||||
|
||||
As each provider can and will have additional features that go beyond expected
|
||||
common code, requirements only apply to the location of the
|
||||
following code:
|
||||
|
||||
* Cloud Controller Manager implementations.
|
||||
* Documentation.
|
||||
|
||||
This document may be amended with additional locations that relate to enabling
|
||||
consistent upstream testing, independent storage drivers, and other code with
|
||||
common integration hooks.
|
||||
|
||||
The development of the
|
||||
[Cloud Controller Manager](https://github.com/kubernetes/kubernetes/tree/master/cmd/cloud-controller-manager)
|
||||
and
|
||||
[Cloud Provider Interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go)
|
||||
has enabled the provider SIGs to develop external providers that
|
||||
capture the core functionality of the upstream providers. By defining the
|
||||
expected locations and naming conventions of where the external provider code
|
||||
is, we will create a consistent experience for:
|
||||
|
||||
* Users of the providers, who will have easily understandable conventions for
|
||||
discovering and using all of the providers.
|
||||
* SIG-Docs, who will have a common hook for building or linking to externally
|
||||
managed documentation
|
||||
* SIG-Testing, who will be able to use common entry points for enabling
|
||||
provider-specific e2e testing.
|
||||
* Future cloud provider authors, who will have a common framework and examples
|
||||
from which to build and share their code base.
|
||||
|
||||
##### Repository Timeline
|
||||
|
||||
To facilitate community development, providers named in the
|
||||
[Makes SIGs responsible for implementations of `CloudProvider`](https://github.com/kubernetes/community/pull/1862)
|
||||
patch can immediately migrate their external provider work into their named
|
||||
repositories.
|
||||
|
||||
Each provider will work to implement the required structure during the
|
||||
Kubernetes 1.11 development cycle, with conformance by the 1.11 release.
|
||||
WG-Cloud-Provider may actively change repository requirements during the
|
||||
1.11 release cycle to respond to collective SIG technical needs.
|
||||
|
||||
After the 1.11 release all current and new provider implementations must
|
||||
conform with the requirements outlined in this document.
|
||||
|
||||
### Security Considerations
|
||||
|
||||
Make sure that you consider the impact of this feature from the point of view of Security.
|
||||
|
|
@ -307,6 +412,20 @@ is proposed to
|
|||
- serve as a repository for user experience reports related to Cloud Providers
|
||||
which live within the Kubernetes GitHub organization or desire to do so
|
||||
|
||||
Major milestones:
|
||||
|
||||
- March 18, 2018: Accepted proposal for repository requirements.
|
||||
|
||||
*Major milestones in the life cycle of a KEP should be tracked in `Implementation History`.
|
||||
Major milestones might include
|
||||
|
||||
- the `Summary` and `Motivation` sections being merged signaling SIG acceptance
|
||||
- the `Proposal` section being merged signaling agreement on a proposed design
|
||||
- the date implementation started
|
||||
- the first Kubernetes release where an initial version of the KEP was available
|
||||
- the version of Kubernetes where the KEP graduated to general availability
|
||||
- when the KEP was retired or superseded*
|
||||
|
||||
The ultimate intention of WG Cloud Provider is to prevent multiple classes
|
||||
of software purporting to be an implementation of the Cloud Provider interface
|
||||
from fracturing the Kubernetes Community while also ensuring that new Cloud
|
||||
|
|
@ -0,0 +1,145 @@
|
|||
---
|
||||
kep-number: draft-20180412
|
||||
title: Kubeadm Config Draft
|
||||
authors:
|
||||
- "@liztio"
|
||||
owning-sig: sig-cluster-lifecycle
|
||||
participating-sigs: []
|
||||
reviewers:
|
||||
- "@timothysc"
|
||||
approvers:
|
||||
- TBD
|
||||
editor: TBD
|
||||
creation-date: 2018-04-12
|
||||
last-updated: 2018-04-12
|
||||
status: draft
|
||||
see-also: []
|
||||
replaces: []
|
||||
superseded-by: []
|
||||
---
|
||||
|
||||
# Kubeadm Config to Beta
|
||||
|
||||
## Table of Contents
|
||||
|
||||
A table of contents is helpful for quickly jumping to sections of a KEP and for highlighting any additional information provided beyond the standard KEP template.
|
||||
|
||||
<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
|
||||
**Table of Contents**
|
||||
|
||||
- [Kubeadm Config to Beta](#kubeadm-config-to-beta)
|
||||
- [Table of Contents](#table-of-contents)
|
||||
- [Summary](#summary)
|
||||
- [Motivation](#motivation)
|
||||
- [Goals](#goals)
|
||||
- [Non-Goals](#non-goals)
|
||||
- [Proposal](#proposal)
|
||||
- [User Stories [optional]](#user-stories-optional)
|
||||
- [As a user upgrading with Kubeadm, I want the upgrade process to not fail with unfamiliar configuration.](#as-a-user-upgrading-with-kubeadm-i-want-the-upgrade-process-to-not-fail-with-unfamiliar-configuration)
|
||||
- [As an infrastructure system using kubeadm, I want to be able to write configuration files that always work.](#as-an-infrastructure-system-using-kubeadm-i-want-to-be-able-to-write-configuration-files-that-always-work)
|
||||
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
|
||||
- [Risks and Mitigations](#risks-and-mitigations)
|
||||
- [Graduation Criteria](#graduation-criteria)
|
||||
- [Implementation History](#implementation-history)
|
||||
- [Alternatives](#alternatives)
|
||||
|
||||
<!-- markdown-toc end -->
|
||||
|
||||
## Summary
|
||||
|
||||
Kubeadm uses `MasterConfiguration` for two distinct but similar operations: initialising a new cluster and upgrading an existing cluster.
|
||||
The former is typically created by hand by an administrator.
|
||||
It is stored on disk and passed to `kubeadm init` via command line flag.
|
||||
The latter is produced by kubeadm using supplied configuration files, command line options, and internal defaults.
|
||||
It will be stored in a ConfigMap so upgrade operations can find it.
|
||||
|
||||
Right now the configuration format is unversioned.
|
||||
This means configuration file formats can change between kubeadm versions and there's no safe way to update the configuration format.
|
||||
|
||||
We propose a stable versioning of this configuration, `v1alpha2` and eventually `v1beta1`.
|
||||
Version information will be _mandatory_ going forward, both for user-generated configuration files and machine-generated configuration maps.
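As a rough sketch of what this means in practice, a user-supplied configuration file would carry explicit type information and be passed to kubeadm by flag. The group/version string `kubeadm.k8s.io/v1alpha2` and the field names and values shown are assumptions for illustration, not the final schema.

```shell
# Hypothetical versioned configuration file; once versioning is mandatory,
# kubeadm would warn on (and later reject) a file that omits apiVersion/kind.
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.0
api:
  advertiseAddress: 192.168.0.10
EOF

kubeadm init --config kubeadm-config.yaml
```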
|
||||
|
||||
There is an [existing document][config] describing current Kubernetes best practices around component configuration.
|
||||
|
||||
[config]: https://docs.google.com/document/d/1FdaEJUEh091qf5B98HM6_8MS764iXrxxigNIdwHYW9c/edit#heading=h.nlhhig66a0v6
|
||||
|
||||
## Motivation
|
||||
|
||||
After 1.10.0, we discovered a bug in the upgrade process.
|
||||
The `MasterConfiguration` embedded a [struct that had changed][proxyconfig], which caused a backwards-incompatible change to the configuration format.
|
||||
This caused `kubeadm upgrade` to fail, because a newer version of kubeadm was attempting to deserialise an older version of the struct.
|
||||
|
||||
Because the configuration is often written and read by different versions of kubeadm compiled against different versions of Kubernetes,
it's very important for this configuration file to be well-versioned.
|
||||
|
||||
[proxyconfig]: https://github.com/kubernetes/kubernetes/commit/57071d85ee2c27332390f0983f42f43d89821961
|
||||
|
||||
### Goals
|
||||
|
||||
* kubeadm init fails if a configuration file isn't versioned
|
||||
* the config map written out contains a version
|
||||
* the configuration struct does not embed any other structs
|
||||
* existing configuration files are converted on upgrade to a known, stable version
|
||||
* structs should be sparsely populated
|
||||
* all structs should have reasonable defaults so an empty config is still sensible
|
||||
|
||||
### Non-Goals
|
||||
|
||||
* kubeadm is able to read and write configuration files for older and newer versions of kubernetes than it was compiled with
|
||||
* substantially changing the schema of the `MasterConfiguration`
|
||||
|
||||
## Proposal
|
||||
|
||||
The concrete proposal is as follows.
|
||||
|
||||
1. Immediately start writing Kind and Version information into the `MasterConfiguration` struct.
2. Define the previous (1.9) version of the struct as `v1alpha1`.
3. Duplicate the KubeProxyConfig struct that caused the schema change, adding the old version to the `v1alpha1` struct.
4. Create a new `v1alpha2` directory mirroring the existing [`v1alpha1`][v1alpha1], which matches the 1.10 schema.
   This version need not duplicate the file as well.
5. Warn users if their configuration files do not have a version and kind.
6. Use [apimachinery's conversion][conversion] library to design migrations from the old (v1alpha1) versions to the new (v1alpha2) versions.
7. Determine the changes for v1beta1.
8. With v1beta1, enforce presence of version numbers in config files and ConfigMaps, erroring if not present.
|
||||
|
||||
[conversion]: https://godoc.org/k8s.io/apimachinery/pkg/conversion
|
||||
[v1alpha1]: https://github.com/kubernetes/kubernetes/tree/d7d4381961f4eb2a4b581160707feb55731e324e/cmd/kubeadm/app/apis/kubeadm
|
||||
|
||||
### User Stories [optional]
|
||||
|
||||
#### As a user upgrading with Kubeadm, I want the upgrade process to not fail with unfamiliar configuration.
|
||||
|
||||
In the past, the haphazard nature of the versioning system has meant it was hard to provide strong guarantees between versions.
|
||||
Implementing strong version guarantees mean any given configuration generated in the past by kubeadm will work with a future version of kubeadm.
|
||||
Deprecations can happen in the future in well-regulated ways.
|
||||
|
||||
#### As an infrastructure system using kubeadm, I want to be able to write configuration files that always work.
|
||||
|
||||
Having a configuration file that changes without notice makes it very difficult to write software that integrates with kubeadm.
|
||||
By providing strong version guarantees, we can guarantee that the files these tools produce will work with a given version of kubeadm.
|
||||
|
||||
### Implementation Details/Notes/Constraints
|
||||
|
||||
The incident that caused the breakage in alpha wasn't a field changed in kubeadm; it was a struct [referenced][struct] inside the `MasterConfiguration` struct.
|
||||
By completely owning our own configuration, changes in the rest of the project can't unknowingly affect us.
|
||||
When we do need to interface with the rest of the project, we will do so explicitly in code and be protected by the compiler.
|
||||
|
||||
[struct]: https://github.com/kubernetes/kubernetes/blob/d7d4381961f4eb2a4b581160707feb55731e324e/cmd/kubeadm/app/apis/kubeadm/v1alpha1/types.go#L285
|
||||
|
||||
### Risks and Mitigations
|
||||
|
||||
Moving to a strongly versioned configuration from a weakly versioned one must be done carefully so as not to break kubeadm for existing users.
|
||||
We can start requiring versions of the existing `v1alpha1` format, issuing warnings to users when Version and Kind aren't present.
|
||||
These fields can be used today; they're simply ignored.
|
||||
In the future, we could require them, and transition to using `v1alpha1`.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
This KEP can be considered complete once all currently supported versions of Kubeadm write out `v1beta1`-version structs.
|
||||
|
||||
## Implementation History
|
||||
|
||||
## Alternatives
|
||||
|
||||
Rather than creating our own copies of all structs in the `MasterConfiguration` struct, we could instead continue embedding the structs.
|
||||
To provide our guarantees, we would have to invest a lot more in automated testing for upgrades.
|
||||
|
|
@ -0,0 +1,126 @@
|
|||
---
|
||||
kep-number: 10
|
||||
title: Graduate CoreDNS to GA
|
||||
authors:
|
||||
- "@johnbelamaric"
|
||||
- "@rajansandeep"
|
||||
owning-sig: sig-network
|
||||
participating-sigs:
|
||||
- sig-cluster-lifecycle
|
||||
reviewers:
|
||||
- "@bowei"
|
||||
- "@thockin"
|
||||
approvers:
|
||||
- "@thockin"
|
||||
editor: "@rajansandeep"
|
||||
creation-date: 2018-03-21
|
||||
last-updated: 2018-05-18
|
||||
status: provisional
|
||||
see-also: https://github.com/kubernetes/community/pull/2167
|
||||
---
|
||||
|
||||
# Graduate CoreDNS to GA
|
||||
|
||||
## Table of Contents
|
||||
|
||||
* [Summary](#summary)
|
||||
* [Motivation](#motivation)
|
||||
* [Goals](#goals)
|
||||
* [Non-Goals](#non-goals)
|
||||
* [Proposal](#proposal)
|
||||
* [Use Cases](#use-cases)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
|
||||
## Summary
|
||||
|
||||
CoreDNS is a sister CNCF project and is the successor to SkyDNS, on which kube-dns is based. It is a flexible, extensible
|
||||
authoritative DNS server and directly integrates with the Kubernetes API. It can serve as cluster DNS,
|
||||
complying with the [dns spec](https://git.k8s.io/dns/docs/specification.md). As an independent project,
|
||||
it is more actively developed than kube-dns and offers performance and functionality beyond what kube-dns has. For more details, see the [introductory presentation](https://docs.google.com/presentation/d/1v6Coq1JRlqZ8rQ6bv0Tg0usSictmnN9U80g8WKxiOjQ/edit#slide=id.g249092e088_0_181), or [coredns.io](https://coredns.io), or the [CNCF webinar](https://youtu.be/dz9S7R8r5gw).
|
||||
|
||||
Currently, we are following the road-map defined [here](https://github.com/kubernetes/features/issues/427). CoreDNS is Beta in Kubernetes v1.10 and can be installed as an alternative to kube-dns.
|
||||
The purpose of this proposal is to graduate CoreDNS to GA.
|
||||
|
||||
## Motivation
|
||||
|
||||
* CoreDNS is more flexible and extensible than kube-dns.
|
||||
* CoreDNS is easily extensible and maintainable using a plugin architecture.
|
||||
* CoreDNS has fewer moving parts than kube-dns, taking advantage of the plugin architecture, making it a single executable and single process.
|
||||
* It is written in Go, making it memory-safe (kube-dns includes dnsmasq which is not).
|
||||
* CoreDNS has [better performance](https://github.com/kubernetes/community/pull/1100#issuecomment-337747482) than [kube-dns](https://github.com/kubernetes/community/pull/1100#issuecomment-338329100) in terms of greater QPS, lower latency, and lower memory consumption.
|
||||
|
||||
### Goals
|
||||
|
||||
* Bump up CoreDNS to be GA.
|
||||
* Make CoreDNS available as an image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
|
||||
This goal may be deferred to the [next KEP](https://github.com/kubernetes/community/pull/2167) if it is not achieved in time.
|
||||
* Provide a kube-dns to CoreDNS upgrade path with configuration translation in `kubeadm`.
|
||||
* Provide a CoreDNS to CoreDNS upgrade path in `kubeadm`.
|
||||
|
||||
### Non-Goals
|
||||
|
||||
* Translation of CoreDNS ConfigMap back to kube-dns (i.e., downgrade).
|
||||
* Translation of kube-dns configuration to equivalent CoreDNS configuration that is defined outside of the kube-dns ConfigMap. For example, modifications to the manifest or `dnsmasq` configuration.
|
||||
* Fate of kube-dns in future releases, i.e. deprecation path.
|
||||
* Making [CoreDNS the default](https://github.com/kubernetes/community/pull/2167) in every installer.
|
||||
|
||||
## Proposal
|
||||
|
||||
The proposed solution is to enable the selection of CoreDNS as a GA cluster service discovery DNS for Kubernetes.
|
||||
Some of the most used deployment tools have been upgraded by the CoreDNS team, in cooperation with the owners of these tools, to be able to deploy CoreDNS:
|
||||
* kubeadm
|
||||
* kube-up
|
||||
* minikube
|
||||
* kops
|
||||
|
||||
For other tools, each maintainer would have to add the upgrade to CoreDNS.
|
||||
|
||||
### Use Cases
|
||||
|
||||
* CoreDNS supports all functionality of kube-dns and also addresses [several use-cases kube-dns lacks](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/network/coredns.md#use-cases). Some of the Use Cases are as follows:
|
||||
* Supporting [Autopath](https://coredns.io/plugins/autopath/), which reduces the high query load caused by the long DNS search path in Kubernetes.
|
||||
* Making an alias for an external name [#39792](https://github.com/kubernetes/kubernetes/issues/39792)
|
||||
|
||||
* By default, the user experience would be unchanged. For more advanced uses, existing users would need to modify the ConfigMap that contains the CoreDNS configuration file.
|
||||
* Since CoreDNS supports more features than kube-dns, there will be no path to retain the CoreDNS configuration if a user wants to switch back to kube-dns.
|
||||
|
||||
#### Configuring CoreDNS
|
||||
|
||||
The CoreDNS configuration file is called a `Corefile` and syntactically is the same as a [Caddyfile](https://caddyserver.com/docs/caddyfile). The file consists of multiple stanzas called _server blocks_.
|
||||
Each of these represents a set of zones for which that server block should respond, along with the list of plugins to apply to a given request. More details on this can be found in the
|
||||
[Corefile Explained](https://coredns.io/2017/07/23/corefile-explained/) and [How Queries Are Processed](https://coredns.io/2017/06/08/how-queries-are-processed-in-coredns/) blog entries.
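For orientation, a minimal cluster `Corefile` might look like the sketch below. The exact plugin set shipped by an installer can differ; `cluster.local` and the upstream resolver are assumptions for illustration.

```shell
# Illustrative Corefile for cluster DNS; each plugin line enables one feature.
cat <<EOF > Corefile
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
}
EOF
```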
|
||||
|
||||
The following can be expected when CoreDNS is graduated to GA.
|
||||
|
||||
#### Kubeadm
|
||||
|
||||
* The CoreDNS feature-gates flag will be marked as GA.
|
||||
* As Kubeadm maintainers chose to deploy CoreDNS as the default Cluster DNS for Kubernetes 1.11:
|
||||
* CoreDNS will be installed by default in a fresh install of Kubernetes via kubeadm.
|
||||
* For users upgrading Kubernetes via kubeadm, kubeadm will install CoreDNS by default, whether the user had kube-dns or CoreDNS in the previous Kubernetes version.
|
||||
* If a user wants to install kube-dns instead of CoreDNS, they have to set the CoreDNS feature gate to false (`--feature-gates=CoreDNS=false`), as shown in the sketch after this list.
|
||||
* When choosing to install CoreDNS, the configmap of a previously installed kube-dns will be automatically translated to the equivalent CoreDNS configmap.
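A minimal sketch of the two paths described above (cluster details omitted; flag usage as described in the bullets):

```shell
# Fresh install on Kubernetes v1.11: CoreDNS is deployed as the default cluster DNS.
kubeadm init

# Opt out of CoreDNS and install kube-dns instead.
kubeadm init --feature-gates=CoreDNS=false
```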
|
||||
|
||||
#### Kube-up
|
||||
|
||||
* CoreDNS will be installed when the environment variable `CLUSTER_DNS_CORE_DNS` is set to `true`. The default value is `false`.
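For example, an illustrative kube-up invocation (the script path assumes a kubernetes/kubernetes checkout):

```shell
# Opt into CoreDNS as the cluster DNS when bringing a cluster up with kube-up.
export CLUSTER_DNS_CORE_DNS=true
./cluster/kube-up.sh
```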
|
||||
|
||||
#### Minikube
|
||||
|
||||
* CoreDNS to be an option in the add-on manager, with CoreDNS disabled by default.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
* Verify that all e2e conformance and DNS-related tests (xxx-kubernetes-e2e-gce, ci-kubernetes-e2e-gce-gci-ci-master, filtered by `--ginkgo.skip=\\[Slow\\]|\\[Serial\\]|\\[Disruptive\\]|\\[Flaky\\]|\\[Feature:.+\\]`) run successfully for CoreDNS.
None of the tests that pass with kube-dns should fail with CoreDNS.
|
||||
* Add CoreDNS as part of the e2e Kubernetes scale runs and ensure tests are not failing.
|
||||
* Extend [perf-tests](https://github.com/kubernetes/perf-tests/tree/master/dns) for CoreDNS.
|
||||
* Add dedicated DNS-related tests to the e2e scalability tests [Feature:performance].
|
||||
|
||||
## Implementation History
|
||||
|
||||
* 20170912 - [Feature proposal](https://github.com/kubernetes/features/issues/427) for CoreDNS to be implemented as the default DNS in Kubernetes.
|
||||
* 20171108 - Successfully released [CoreDNS as an Alpha feature-gate in Kubernetes v1.9](https://github.com/kubernetes/kubernetes/pull/52501).
|
||||
* 20180226 - CoreDNS graduation to Incubation in CNCF.
|
||||
* 20180305 - Support for Kube-dns configmap translation and move up [CoreDNS to Beta](https://github.com/kubernetes/kubernetes/pull/58828) for Kubernetes v1.10.
|
||||
|
|
@ -0,0 +1,574 @@
|
|||
---
|
||||
kep-number: TBD
|
||||
title: IPVS Load Balancing Mode in Kubernetes
|
||||
status: implemented
|
||||
authors:
|
||||
- "@rramkumar1"
|
||||
owning-sig: sig-network
|
||||
reviewers:
|
||||
- "@thockin"
|
||||
- "@m1093782566"
|
||||
approvers:
|
||||
- "@thockin"
|
||||
- "@m1093782566"
|
||||
editor:
|
||||
- "@thockin"
|
||||
- "@m1093782566"
|
||||
creation-date: 2018-03-21
|
||||
---
|
||||
|
||||
# IPVS Load Balancing Mode in Kubernetes
|
||||
|
||||
**Note: This is a retroactive KEP. Credit goes to @m1093782566, @haibinxie, and @quinton-hoole for all information & design in this KEP.**
|
||||
|
||||
**Important References: https://github.com/kubernetes/community/pull/692/files**
|
||||
|
||||
## Table of Contents
|
||||
|
||||
* [Summary](#summary)
|
||||
* [Motivation](#motivation)
|
||||
* [Goals](#goals)
|
||||
* [Non\-goals](#non-goals)
|
||||
* [Proposal](#proposal)
|
||||
* [Kube-Proxy Parameter Changes](#kube-proxy-parameter-changes)
|
||||
* [Build Changes](#build-changes)
|
||||
* [Deployment Changes](#deployment-changes)
|
||||
* [Design Considerations](#design-considerations)
|
||||
* [IPVS service network topology](#ipvs-service-network-topology)
|
||||
* [Port remapping](#port-remapping)
|
||||
* [Falling back to iptables](#falling-back-to-iptables)
|
||||
* [Supporting NodePort service](#supporting-nodeport-service)
|
||||
* [Supporting ClusterIP service](#supporting-clusterip-service)
|
||||
* [Supporting LoadBalancer service](#supporting-loadbalancer-service)
|
||||
* [Session Affinity](#session-affinity)
|
||||
* [Cleaning up inactive rules](#cleaning-up-inactive-rules)
|
||||
* [Sync loop pseudo code](#sync-loop-pseudo-code)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
* [Drawbacks](#drawbacks)
|
||||
* [Alternatives](#alternatives)
|
||||
|
||||
## Summary
|
||||
|
||||
We are building a new implementation of kube-proxy on top of IPVS (IP Virtual Server).
|
||||
|
||||
## Motivation
|
||||
|
||||
As Kubernetes grows in usage, the scalability of its resources becomes more and more
|
||||
important. In particular, the scalability of services is paramount to the adoption of Kubernetes
|
||||
by developers/companies running large workloads. Kube Proxy, the building block of service routing,
|
||||
has relied on the battle-hardened iptables to implement the core supported service types such as
|
||||
ClusterIP and NodePort. However, iptables struggles to scale to tens of thousands of services because
|
||||
it is designed purely for firewalling purposes and is based on in-kernel rule chains. On the
|
||||
other hand, IPVS is specifically designed for load balancing and uses more efficient data structures
|
||||
under the hood. For more information on the performance benefits of IPVS vs. iptables, take a look
|
||||
at these [slides](https://docs.google.com/presentation/d/1BaIAywY2qqeHtyGZtlyAp89JIZs59MZLKcFLxKE6LyM/edit?usp=sharing).
|
||||
|
||||
### Goals
|
||||
|
||||
* Improve the performance of services
|
||||
|
||||
### Non-goals
|
||||
|
||||
None
|
||||
|
||||
### Challenges and Open Questions [optional]
|
||||
|
||||
None
|
||||
|
||||
|
||||
## Proposal
|
||||
|
||||
### Kube-Proxy Parameter Changes
|
||||
|
||||
***Parameter: --proxy-mode***
|
||||
In addition to existing userspace and iptables modes, IPVS mode is configured via --proxy-mode=ipvs. In the initial implementation, it implicitly uses IPVS [NAT](http://www.linuxvirtualserver.org/VS-NAT.html) mode.
|
||||
|
||||
***Parameter: --ipvs-scheduler***
|
||||
A new kube-proxy parameter will be added to specify the IPVS load balancing algorithm, with the parameter being --ipvs-scheduler. If it is not configured, then round-robin (rr) is the default value. If it is incorrectly configured, then kube-proxy will exit with an error message. The supported schedulers are:
|
||||
* rr: round-robin
|
||||
* lc: least connection
|
||||
* dh: destination hashing
|
||||
* sh: source hashing
|
||||
* sed: shortest expected delay
|
||||
* nq: never queue
|
||||
For more details, refer to http://kb.linuxvirtualserver.org/wiki/Ipvsadm
|
||||
|
||||
In the future, we can implement a service-specific scheduler (potentially via annotation), which would have higher priority and override this value.
|
||||
|
||||
***Parameter: --cleanup-ipvs***
|
||||
Similar to the --cleanup-iptables parameter: if true, clean up the IPVS configuration and iptables rules that were created in IPVS mode.
|
||||
|
||||
***Parameter: --ipvs-sync-period***
|
||||
Maximum interval of how often IPVS rules are refreshed (e.g. '5s', '1m'). Must be greater than 0.
|
||||
|
||||
***Parameter: --ipvs-min-sync-period***
|
||||
Minimum interval of how often the IPVS rules are refreshed (e.g. '5s', '1m'). Must be greater than 0.
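Putting the flags together, a minimal sketch of starting kube-proxy with the IPVS proxier (values are illustrative; other required kube-proxy flags are omitted):

```shell
kube-proxy \
  --proxy-mode=ipvs \
  --ipvs-scheduler=rr \
  --ipvs-sync-period=30s \
  --ipvs-min-sync-period=5s
```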
|
||||
|
||||
|
||||
### Build Changes
|
||||
|
||||
No changes at all. The IPVS implementation is built on the [docker/libnetwork](https://godoc.org/github.com/docker/libnetwork/ipvs) IPVS library, which is a pure-golang implementation and talks to the kernel via socket communication.
|
||||
|
||||
### Deployment Changes
|
||||
|
||||
IPVS kernel module installation is beyond the scope of Kubernetes. It is assumed that the IPVS kernel modules are installed on the node before running kube-proxy. When kube-proxy starts with the IPVS proxy mode, it validates that the IPVS kernel modules are installed on the node; if they are not, kube-proxy falls back to the iptables proxy mode.
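For illustration, an operator might pre-load and verify the commonly required modules before starting kube-proxy (module names can vary slightly by kernel version):

```shell
# Load the IPVS-related kernel modules and confirm they are present.
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
  modprobe "${mod}"
done
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
```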
|
||||
|
||||
### Design Considerations
|
||||
|
||||
#### IPVS service network topology
|
||||
|
||||
We will create a dummy interface and assign all Kubernetes service Cluster IPs to the dummy interface (the default name is `kube-ipvs0`). For example,
|
||||
|
||||
```shell
|
||||
# ip link add kube-ipvs0 type dummy
|
||||
# ip addr
|
||||
...
|
||||
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
|
||||
link/ether 26:1f:cc:f8:cd:0f brd ff:ff:ff:ff:ff:ff
|
||||
|
||||
#### Assume 10.102.128.4 is service Cluster IP
|
||||
# ip addr add 10.102.128.4/32 dev kube-ipvs0
|
||||
...
|
||||
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
|
||||
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
|
||||
inet 10.102.128.4/32 scope global kube-ipvs0
|
||||
valid_lft forever preferred_lft forever
|
||||
```
|
||||
|
||||
Note that the relationship between a Kubernetes service and an IPVS service is `1:N`. Consider a Kubernetes service that has more than one access IP. For example, an External IP type service has 2 access IPs (Cluster IP and External IP). The IPVS proxier will then create 2 IPVS services - one for the Cluster IP and the other for the External IP.
|
||||
|
||||
The relationship between a Kubernetes endpoint and an IPVS destination is `1:1`.
Deletion of a Kubernetes service will trigger deletion of the corresponding IPVS service(s) and of the addresses bound to the dummy interface.
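As a sketch of the `1:N` case, a service with both a Cluster IP and an External IP (addresses are illustrative) would yield two IPVS virtual services sharing the same destinations:

```shell
# Assume Cluster IP 10.102.128.4 and External IP 192.168.10.50 for one Kubernetes service.
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.102.128.4:3080 rr
  -> 10.244.0.235:8080            Masq    1      0          0
TCP  192.168.10.50:3080 rr
  -> 10.244.0.235:8080            Masq    1      0          0
```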
|
||||
|
||||
|
||||
#### Port remapping
|
||||
|
||||
There are 3 proxy modes in IPVS - NAT (masq), IPIP and DR. Only NAT mode supports port remapping, so we will use IPVS NAT mode in order to support port remapping. The following example shows IPVS mapping service port `3080` to container port `8080`.
|
||||
|
||||
```shell
|
||||
# ipvsadm -ln
|
||||
IP Virtual Server version 1.2.1 (size=4096)
|
||||
Prot LocalAddress:Port Scheduler Flags
|
||||
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
|
||||
TCP 10.102.128.4:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
-> 10.244.1.237:8080 Masq 1 0 0
|
||||
|
||||
```
|
||||
|
||||
#### Falling back to iptables
|
||||
|
||||
The IPVS proxier will employ iptables for packet filtering, SNAT, and supporting NodePort type services. Specifically, the IPVS proxier will fall back on iptables in the following 4 scenarios:
|
||||
|
||||
* kube-proxy start with --masquerade-all=true
|
||||
* Specify cluster CIDR in kube-proxy startup
|
||||
* Load Balancer Source Ranges is specified for LB type service
|
||||
* Support NodePort type service
|
||||
|
||||
In addition, the IPVS proxier will make use of 5 Kubernetes-specific chains in the nat table:

- KUBE-POSTROUTING
- KUBE-MARK-MASQ
- KUBE-MARK-DROP
- KUBE-SERVICES
- KUBE-NODEPORTS

`KUBE-POSTROUTING`, `KUBE-MARK-MASQ`, and `KUBE-MARK-DROP` are maintained by kubelet, so the IPVS proxier won't create them. The IPVS proxier will make sure the chains `KUBE-SERVICES` and `KUBE-NODEPORTS` exist in its sync loop.
|
||||
|
||||
**1. kube-proxy start with --masquerade-all=true**
|
||||
|
||||
If kube-proxy starts with `--masquerade-all=true`, the IPVS proxier will masquerade all traffic accessing the service Cluster IP, which behaves the same as the iptables proxier.
Suppose there is a service with Cluster IP `10.244.5.1` and port `8080`:
|
||||
|
||||
```shell
|
||||
# iptables -t nat -nL
|
||||
|
||||
Chain PREROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain OUTPUT (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain POSTROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
|
||||
|
||||
Chain KUBE-POSTROUTING (1 references)
|
||||
target prot opt source destination
|
||||
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
|
||||
|
||||
Chain KUBE-MARK-DROP (0 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
|
||||
|
||||
Chain KUBE-MARK-MASQ (6 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
|
||||
|
||||
Chain KUBE-SERVICES (2 references)
|
||||
target prot opt source destination
|
||||
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 10.244.5.1 /* default/foo:http cluster IP */ tcp dpt:8080
|
||||
```
|
||||
|
||||
**2. Specify cluster CIDR in kube-proxy startup**
|
||||
|
||||
If kube-proxy starts with `--cluster-cidr=<cidr>`, the IPVS proxier will masquerade off-cluster traffic accessing the service Cluster IP, which behaves the same as the iptables proxier.
|
||||
Suppose kube-proxy is provided with the cluster cidr `10.244.16.0/24`, and service Cluster IP is `10.244.5.1` and port is `8080`:
|
||||
|
||||
```shell
|
||||
# iptables -t nat -nL
|
||||
|
||||
Chain PREROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain OUTPUT (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain POSTROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
|
||||
|
||||
Chain KUBE-POSTROUTING (1 references)
|
||||
target prot opt source destination
|
||||
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
|
||||
|
||||
Chain KUBE-MARK-DROP (0 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
|
||||
|
||||
Chain KUBE-MARK-MASQ (6 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
|
||||
|
||||
Chain KUBE-SERVICES (2 references)
|
||||
target prot opt source destination
|
||||
KUBE-MARK-MASQ tcp -- !10.244.16.0/24 10.244.5.1 /* default/foo:http cluster IP */ tcp dpt:8080
|
||||
```
|
||||
|
||||
**3. Load Balancer Source Ranges is specified for LB type service**
|
||||
|
||||
When a service's `LoadBalancerStatus.ingress.IP` is not empty and its `LoadBalancerSourceRanges` is specified, the IPVS proxier will install iptables rules like those shown below.
|
||||
|
||||
Suppose service's `LoadBalancerStatus.ingress.IP` is `10.96.1.2` and service's `LoadBalancerSourceRanges` is `10.120.2.0/24`:
|
||||
|
||||
```shell
|
||||
# iptables -t nat -nL
|
||||
|
||||
Chain PREROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain OUTPUT (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain POSTROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
|
||||
|
||||
Chain KUBE-POSTROUTING (1 references)
|
||||
target prot opt source destination
|
||||
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
|
||||
|
||||
Chain KUBE-MARK-DROP (0 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x8000
|
||||
|
||||
Chain KUBE-MARK-MASQ (6 references)
|
||||
target prot opt source destination
|
||||
MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
|
||||
|
||||
Chain KUBE-SERVICES (2 references)
|
||||
target prot opt source destination
|
||||
ACCEPT tcp -- 10.120.2.0/24 10.96.1.2 /* default/foo:http loadbalancer IP */ tcp dpt:8080
|
||||
DROP tcp -- 0.0.0.0/0 10.96.1.2 /* default/foo:http loadbalancer IP */ tcp dpt:8080
|
||||
```
|
||||
|
||||
**4. Support NodePort type service**
|
||||
|
||||
Please check the section below.
|
||||
|
||||
#### Supporting NodePort service
|
||||
|
||||
To support NodePort type services, the IPVS proxier will fall back on iptables, reusing the existing implementation from the iptables proxier. For example,
|
||||
|
||||
```shell
|
||||
# kubectl describe svc nginx-service
|
||||
Name: nginx-service
|
||||
...
|
||||
Type: NodePort
|
||||
IP: 10.101.28.148
|
||||
Port: http 3080/TCP
|
||||
NodePort: http 31604/TCP
|
||||
Endpoints: 172.17.0.2:80
|
||||
Session Affinity: None
|
||||
|
||||
# iptables -t nat -nL
|
||||
|
||||
[root@100-106-179-225 ~]# iptables -t nat -nL
|
||||
Chain PREROUTING (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain OUTPUT (policy ACCEPT)
|
||||
target prot opt source destination
|
||||
KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service portals */
|
||||
|
||||
Chain KUBE-SERVICES (2 references)
|
||||
target prot opt source destination
|
||||
KUBE-MARK-MASQ tcp -- !172.16.0.0/16 10.101.28.148 /* default/nginx-service:http cluster IP */ tcp dpt:3080
|
||||
KUBE-SVC-6IM33IEVEEV7U3GP tcp -- 0.0.0.0/0 10.101.28.148 /* default/nginx-service:http cluster IP */ tcp dpt:3080
|
||||
KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
|
||||
|
||||
Chain KUBE-NODEPORTS (1 references)
|
||||
target prot opt source destination
|
||||
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */ tcp dpt:31604
|
||||
KUBE-SVC-6IM33IEVEEV7U3GP tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */ tcp dpt:31604
|
||||
|
||||
Chain KUBE-SVC-6IM33IEVEEV7U3GP (2 references)
|
||||
target prot opt source destination
|
||||
KUBE-SEP-Q3UCPZ54E6Q2R4UT all -- 0.0.0.0/0 0.0.0.0/0 /* default/nginx-service:http */
|
||||
Chain KUBE-SEP-Q3UCPZ54E6Q2R4UT (1 references)
|
||||
target prot opt source destination
|
||||
KUBE-MARK-MASQ all -- 172.17.0.2 0.0.0.0/0 /* default/nginx-service:http */
|
||||
DNAT
|
||||
```
|
||||
|
||||
#### Supporting ClusterIP service
|
||||
|
||||
When creating a ClusterIP type service, IPVS proxier will do 3 things:
|
||||
|
||||
* make sure dummy interface exists in the node
|
||||
* bind service cluster IP to the dummy interface
|
||||
* create an IPVS service whose address corresponds to the Kubernetes service Cluster IP.
|
||||
|
||||
For example,
|
||||
|
||||
```shell
|
||||
# kubectl describe svc nginx-service
|
||||
Name: nginx-service
|
||||
...
|
||||
Type: ClusterIP
|
||||
IP: 10.102.128.4
|
||||
Port: http 3080/TCP
|
||||
Endpoints: 10.244.0.235:8080,10.244.1.237:8080
|
||||
Session Affinity: None
|
||||
|
||||
# ip addr
|
||||
...
|
||||
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
|
||||
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
|
||||
inet 10.102.128.4/32 scope global kube-ipvs0
|
||||
valid_lft forever preferred_lft forever
|
||||
|
||||
# ipvsadm -ln
|
||||
IP Virtual Server version 1.2.1 (size=4096)
|
||||
Prot LocalAddress:Port Scheduler Flags
|
||||
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
|
||||
TCP 10.102.128.4:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
-> 10.244.1.237:8080 Masq 1 0 0
|
||||
```
|
||||
|
||||
#### Supporting LoadBalancer service
|
||||
|
||||
IPVS proxier will NOT bind LB's ingress IP to the dummy interface. When creating a LoadBalancer type service, ipvs proxier will do 4 things:
|
||||
|
||||
- Make sure dummy interface exists in the node
|
||||
- Bind service cluster IP to the dummy interface
|
||||
- Create an ipvs service whose address corresponds to the Kubernetes service Cluster IP
- Iterate over the LB's ingress IPs and create an ipvs service whose address corresponds to each ingress IP
|
||||
|
||||
For example,
|
||||
|
||||
```shell
|
||||
# kubectl describe svc nginx-service
|
||||
Name: nginx-service
|
||||
...
|
||||
IP: 10.102.128.4
|
||||
Port: http 3080/TCP
|
||||
Endpoints: 10.244.0.235:8080
|
||||
Session Affinity: None
|
||||
|
||||
#### Only bind Cluster IP to dummy interface
|
||||
# ip addr
|
||||
...
|
||||
73: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 1000
|
||||
link/ether 1a:ce:f5:5f:c1:4d brd ff:ff:ff:ff:ff:ff
|
||||
inet 10.102.128.4/32 scope global kube-ipvs0
|
||||
valid_lft forever preferred_lft forever
|
||||
|
||||
#### Suppose the LB's ingress IPs are {10.96.1.2, 10.96.1.3}. The IPVS proxier will create 1 ipvs service for the cluster IP and 2 ipvs services for the LB's ingress IPs. Each ipvs service has its own destination.
|
||||
# ipvsadm -ln
|
||||
IP Virtual Server version 1.2.1 (size=4096)
|
||||
Prot LocalAddress:Port Scheduler Flags
|
||||
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
|
||||
TCP 10.102.128.4:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
TCP 10.96.1.2:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
TCP 10.96.1.3:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
```
|
||||
|
||||
Since there is a need to support access control for `LB.ingress.IP`, the IPVS proxier will fall back on iptables. iptables will drop any packet which is not from `LB.LoadBalancerSourceRanges`. For example,
|
||||
|
||||
```shell
|
||||
# iptables -A KUBE-SERVICES -d {ingress.IP} --dport {service.Port} -s {LB.LoadBalancerSourceRanges} -j ACCEPT
|
||||
```
|
||||
|
||||
When a packet reaches the end of the chain, the IPVS proxier will drop it:
|
||||
|
||||
```shell
|
||||
# iptables -A KUBE-SERVICES -d {ingress.IP} --dport {service.Port} -j KUBE-MARK-DROP
|
||||
```
|
||||
|
||||
#### Support Only NodeLocal Endpoints
|
||||
|
||||
Similar to the iptables proxier, when a service has the "Only NodeLocal Endpoints" annotation, the IPVS proxier will only proxy traffic to endpoints on the local node. For example,
|
||||
|
||||
```shell
|
||||
# kubectl describe svc nginx-service
|
||||
Name: nginx-service
|
||||
...
|
||||
IP: 10.102.128.4
|
||||
Port: http 3080/TCP
|
||||
Endpoints: 10.244.0.235:8080, 10.244.1.235:8080
|
||||
Session Affinity: None
|
||||
|
||||
#### Assume only endpoint 10.244.0.235:8080 is in the same host with kube-proxy
|
||||
|
||||
#### There should be 1 destination for ipvs service.
|
||||
[root@SHA1000130405 home]# ipvsadm -ln
|
||||
IP Virtual Server version 1.2.1 (size=4096)
|
||||
Prot LocalAddress:Port Scheduler Flags
|
||||
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
|
||||
TCP 10.102.128.4:3080 rr
|
||||
-> 10.244.0.235:8080 Masq 1 0 0
|
||||
```
|
||||
|
||||
#### Session affinity
|
||||
|
||||
IPVS supports client IP session affinity (persistent connection). When a service specifies session affinity, the IPVS proxier will set a timeout value (180min = 10800s by default) in the IPVS service. For example,
|
||||
|
||||
```shell
|
||||
# kubectl describe svc nginx-service
|
||||
Name: nginx-service
|
||||
...
|
||||
IP: 10.102.128.4
|
||||
Port: http 3080/TCP
|
||||
Session Affinity: ClientIP
|
||||
|
||||
# ipvsadm -ln
|
||||
IP Virtual Server version 1.2.1 (size=4096)
|
||||
Prot LocalAddress:Port Scheduler Flags
|
||||
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
|
||||
TCP 10.102.128.4:3080 rr persistent 10800
|
||||
```
|
||||
|
||||
#### Cleaning up inactive rules
|
||||
|
||||
It seems difficult to distinguish whether an IPVS service was created by the IPVS proxier or by another process. Currently we assume IPVS rules are created only by the IPVS proxier on a node, so we can clear all IPVS rules on the node. We should add warnings in the documentation and in the flag comments.
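For illustration, a full manual cleanup of a node's IPVS state amounts to the following, which is also why IPVS rules created by other processes on the node would be lost:

```shell
# Wipe the entire IPVS virtual server table on the node.
ipvsadm --clear

# Remove the dummy interface (and the service addresses bound to it).
ip link del kube-ipvs0
```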
|
||||
|
||||
#### Sync loop pseudo code
|
||||
|
||||
Similar to the iptables proxier, the IPVS proxier will do a full sync loop in a configured period. Also, each update on a Kubernetes service or endpoint will trigger an IPVS service or destination update. For example,
|
||||
|
||||
* Creating a Kubernetes service will trigger creating a new IPVS service.
|
||||
* Updating a Kubernetes service (for instance, changing session affinity) will trigger updating an existing IPVS service.
|
||||
* Deleting a Kubernetes service will trigger deleting an IPVS service.
|
||||
* Adding an endpoint for a Kubernetes service will trigger adding a destination for an existing IPVS service.
|
||||
* Updating an endpoint for a Kubernetes service will trigger updating a destination for an existing IPVS service.
|
||||
* Deleting an endpoint for a Kubernetes service will trigger deleting a destination for an existing IPVS service.
|
||||
|
||||
Any IPVS service or destination update will send an update command to the kernel via socket communication, which won't take a service down.
|
||||
|
||||
The sync loop pseudo code is shown below:
|
||||
|
||||
```go
func (proxier *Proxier) syncProxyRules() {
    // When a service or endpoint updates, begin syncing ipvs rules and iptables rules if needed.
    // Ensure the dummy interface exists; if not, create one.
    for svcName, svcInfo := range proxier.serviceMap {
        // Capture the clusterIP.
        // Construct an ipvs service from svcInfo.
        // Set the session affinity flag and timeout value for the ipvs service if session affinity is specified.
        // Bind the Cluster IP to the dummy interface.
        // Call the libnetwork API to create the ipvs service and destinations.

        // Capture externalIPs.
        // If the externalIP is local, hold svcInfo.Port so that ipvs rules can be installed on it.
        // Construct an ipvs service from svcInfo.
        // Set the session affinity flag and timeout value for the ipvs service if session affinity is specified.
        // Call the libnetwork API to create the ipvs service and destinations.

        // Capture load-balancer ingress.
        for _, ingress := range svcInfo.LoadBalancerStatus.Ingress {
            if ingress.IP != "" {
                if len(svcInfo.LoadBalancerSourceRanges) != 0 {
                    // Install the specific iptables rules.
                }
                // Construct an ipvs service from svcInfo.
                // Set the session affinity flag and timeout value for the ipvs service if session affinity is specified.
                // Call the libnetwork API to create the ipvs service and destinations.
            }
        }

        // Capture nodeports.
        if svcInfo.NodePort != 0 {
            // Fall back on iptables, reusing the existing iptables proxier implementation.
        }

        // Call the libnetwork API to clean up legacy ipvs services which are no longer active.
        // Unbind the service address from the dummy interface.
        // Clean up legacy iptables chains and rules.
    }
}
```
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
### Beta -> GA
|
||||
|
||||
The following requirements should be met before moving from Beta to GA. It is
|
||||
suggested to file an issue which tracks all the action items.
|
||||
|
||||
- [ ] Testing
|
||||
- [ ] 48 hours of green e2e tests.
|
||||
- [ ] Flakes must be identified and filed as issues.
|
||||
- [ ] Integrate with scale tests. Failures should be filed as issues.
|
||||
- [ ] Development work
|
||||
- [ ] Identify all pending changes/refactors. Release blockers must be prioritized and fixed.
|
||||
- [ ] Identify all bugs. Release blocking bugs must be identified and fixed.
|
||||
- [ ] Docs
|
||||
- [ ] All user-facing documentation must be updated.
|
||||
|
||||
### GA -> Future
|
||||
|
||||
__TODO__
|
||||
|
||||
## Implementation History
|
||||
|
||||
**In chronological order**
|
||||
|
||||
1. https://github.com/kubernetes/kubernetes/pull/46580
|
||||
|
||||
2. https://github.com/kubernetes/kubernetes/pull/52528
|
||||
|
||||
3. https://github.com/kubernetes/kubernetes/pull/54219
|
||||
|
||||
4. https://github.com/kubernetes/kubernetes/pull/57268
|
||||
|
||||
5. https://github.com/kubernetes/kubernetes/pull/58052
|
||||
|
||||
|
||||
## Drawbacks [optional]
|
||||
|
||||
None
|
||||
|
||||
## Alternatives [optional]
|
||||
|
||||
None
|
||||
|
|
@ -0,0 +1,88 @@
|
|||
---
|
||||
kep-number: 11
|
||||
title: Switch CoreDNS to the default DNS
|
||||
authors:
|
||||
- "@johnbelamaric"
|
||||
- "@rajansandeep"
|
||||
owning-sig: sig-network
|
||||
participating-sigs:
|
||||
- sig-cluster-lifecycle
|
||||
reviewers:
|
||||
- "@bowei"
|
||||
- "@thockin"
|
||||
approvers:
|
||||
- "@thockin"
|
||||
editor: "@rajansandeep"
|
||||
creation-date: 2018-05-18
|
||||
last-updated: 2018-05-18
|
||||
status: provisional
|
||||
---
|
||||
|
||||
# Switch CoreDNS to the default DNS
|
||||
|
||||
## Table of Contents
|
||||
|
||||
* [Summary](#summary)
|
||||
* [Goals](#goals)
|
||||
* [Proposal](#proposal)
|
||||
* [Use Cases](#use-cases)
|
||||
* [Graduation Criteria](#graduation-criteria)
|
||||
* [Implementation History](#implementation-history)
|
||||
|
||||
## Summary
|
||||
|
||||
CoreDNS is now well-established in Kubernetes as the DNS service, with CoreDNS starting as an alpha feature from Kubernetes v1.9 to now being GA in v1.11.
|
||||
After successfully implementing the road-map defined [here](https://github.com/kubernetes/features/issues/427), CoreDNS is GA in Kubernetes v1.11 and can be installed as an alternative to kube-dns in tools like kubeadm, kops, minikube and kube-up.
Following the [KEP to graduate CoreDNS to GA](https://github.com/kubernetes/community/pull/1956), the purpose of this proposal is to make CoreDNS the default DNS for Kubernetes, replacing kube-dns.
|
||||
|
||||
## Goals
|
||||
* Make CoreDNS the default DNS for Kubernetes for all the remaining install tools (kube-up, kops, minikube).
|
||||
* Make CoreDNS available as an image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
|
||||
This goal is carried over from the [previous KEP](https://github.com/kubernetes/community/pull/1956), in case it cannot be completed there.
|
||||
|
||||
## Proposal
|
||||
|
||||
The proposed solution is to enable CoreDNS as the default cluster service discovery DNS for Kubernetes.
|
||||
Some of the most used deployment tools will be upgraded by the CoreDNS team, in cooperation with the owners of these tools, to be able to deploy CoreDNS as default:
|
||||
* kubeadm (already done for Kubernetes v1.11)
|
||||
* kube-up
|
||||
* minikube
|
||||
* kops
|
||||
|
||||
For other tools, each maintainer would have to add the upgrade to CoreDNS.
|
||||
|
||||
### Use Cases
|
||||
|
||||
Use cases for CoreDNS have been well defined in the [previous KEP](https://github.com/kubernetes/community/pull/1956).
|
||||
The following can be expected when CoreDNS is made the default DNS.
|
||||
|
||||
#### Kubeadm
|
||||
|
||||
* CoreDNS is already the default DNS from Kubernetes v1.11 and shall continue to be the default DNS.
|
||||
* If users want to install kube-dns instead of CoreDNS, they have to set the CoreDNS feature gate to false (`--feature-gates=CoreDNS=false`).
|
||||
|
||||
#### Kube-up
|
||||
|
||||
* CoreDNS will now become the default DNS.
|
||||
* To install kube-dns in place of CoreDNS, set the environment variable `CLUSTER_DNS_CORE_DNS` to `false`.
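For example (illustrative; the script path assumes a kubernetes/kubernetes checkout):

```shell
# Keep kube-dns instead of the new CoreDNS default when using kube-up.
export CLUSTER_DNS_CORE_DNS=false
./cluster/kube-up.sh
```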
|
||||
|
||||
#### Minikube
|
||||
|
||||
* CoreDNS to be enabled by default in the add-on manager, with kube-dns disabled by default.
|
||||
|
||||
#### Kops
|
||||
|
||||
* CoreDNS will now become the default DNS.
|
||||
|
||||
## Graduation Criteria
|
||||
|
||||
* Add CoreDNS image in a Kubernetes repository (To Be Defined) and ensure a workflow/process to update the CoreDNS versions in the future.
|
||||
* Have a certain number (To Be Defined) of clusters of significant size (To Be Defined) adopting and running CoreDNS as their default DNS.
|
||||
|
||||
## Implementation History
|
||||
|
||||
* 20170912 - [Feature proposal](https://github.com/kubernetes/features/issues/427) for CoreDNS to be implemented as the default DNS in Kubernetes.
|
||||
* 20171108 - Successfully released [CoreDNS as an Alpha feature-gate in Kubernetes v1.9](https://github.com/kubernetes/kubernetes/pull/52501).
|
||||
* 20180226 - CoreDNS graduation to Incubation in CNCF.
|
||||
* 20180305 - Support for Kube-dns configmap translation and move up [CoreDNS to Beta](https://github.com/kubernetes/kubernetes/pull/58828) for Kubernetes v1.10.
|
||||
* 20180515 - CoreDNS was added as [GA and the default DNS in kubeadm](https://github.com/kubernetes/kubernetes/pull/63509) for Kubernetes v1.11.
|
||||
|
|
@ -1,4 +1,4 @@
|
|||
# Meet Our Contributors - Ask Us Anything!
|
||||
|
||||
When Slack seems like it’s going too fast, and you just need a quick answer from a human...
|
||||
|
||||
|
|
@ -6,18 +6,18 @@ Meet Our Contributors gives you a monthly one-hour opportunity to ask questions
|
|||
|
||||
## When:
|
||||
Every first Wednesday of the month at the following times. Add a copy of the calendar to yours from [kubernetes.io/community](https://kubernetes.io/community/).
|
||||
* 03:30pm UTC
|
||||
* 09:00pm UTC
|
||||
* 02:30pm UTC
|
||||
* 08:00pm UTC
|
||||
|
||||
Tune into the [Kubernetes YouTube Channel](https://www.youtube.com/c/KubernetesCommunity/live) to follow along with video and [#meet-our-contributors](https://kubernetes.slack.com/messages/meet-our-contributors) on Slack for questions and discourse.
|
||||
|
||||
## What’s on-topic:
|
||||
* How our contributors got started with k8s
|
||||
* Advice for getting attention on your PR
|
||||
* GitHub tooling and automation
|
||||
* Your first commit
|
||||
* kubernetes/community
|
||||
* Testing
|
||||
|
||||
## What’s off-topic:
|
||||
* End-user questions (Check out [#office-hours](https://kubernetes.slack.com/messages/office-hours) on slack and details [here](/events/office-hours.md))
|
||||
|
|
@ -33,15 +33,13 @@ Questions will be on a first-come, first-served basis. First half will be dedica
|
|||
### Code snip / PR for peer code review / Suggestion for part of codebase walk through:
|
||||
* At least 24 hours before the session to slack channel (#meet-our-contributors)
|
||||
|
||||
Problems will be picked based on time commitment needed, skills of the reviewer, and if a large amount are submitted, need for the project.
|
||||
|
||||
## Call for Volunteers:
|
||||
Contributors - [sign up to answer questions!](https://goo.gl/uhEJ33)
|
||||
|
||||
Expectations of volunteers:
|
||||
* Be on 5 mins early. You can look at questions in the queue by joining the #meet-our-contributors slack channel to give yourself some prep.
|
||||
* Expect questions about the contribution process, membership, navigating the kubernetes seas, testing, and general questions about you and your path to open source/kubernetes. It's ok if you don't know the answer!
|
||||
* We will be using video chat (zoom but live streaming through YouTube) but voice only is fine if you are more comfortable with that.
|
||||
* Be willing to provide suggestions and feedback to make this better!
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -70,7 +70,7 @@ Each organization should have the following teams:
|
|||
- `foo-reviewers`: granted read access to the `foo` repo; intended to be used as
|
||||
a notification mechanism for interested/active contributors for the `foo` repo
|
||||
- a `bots` team
|
||||
- should contain bots such as @k8s-ci-robot and @linuxfoundation that are
|
||||
- should contain bots such as @k8s-ci-robot and @thelinuxfoundation that are
|
||||
necessary for org and repo automation
|
||||
- an `owners` team
|
||||
- should be populated by everyone who has `owner` privileges to the org
|
||||
|
|
|
|||
|
|
@ -21,7 +21,7 @@ the Linux Foundation CNCF CLA check for your repositories, please read on.
|
|||
- Pull request: checked
|
||||
- Issue comment: checked
|
||||
- Active: checked
|
||||
1. Add the [@linuxfoundation](https://github.com/linuxfoundation) GitHub user as an **Owner**
|
||||
1. Add the [@thelinuxfoundation](https://github.com/thelinuxfoundation) GitHub user as an **Owner**
|
||||
to your organization or repo to ensure the CLA status can be applied on PR's
|
||||
1. After you send an invite, contact the [Linux Foundation](mailto:helpdesk@rt.linuxfoundation.org); and cc [Chris Aniszczyk](mailto:caniszczyk@linuxfoundation.org), [Ihor Dvoretskyi](mailto:ihor@cncf.io), [Eric Searcy](mailto:eric@linuxfoundation.org) (to ensure that the invite gets accepted).
|
||||
1. Finally, open up a test PR to check that:
|
||||
|
|
|
|||
|
|
@ -47,34 +47,34 @@ The following subprojects are owned by sig-api-machinery:
|
|||
- **universal-machinery**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/apimachinery/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apimachinery/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apimachinery/OWNERS
|
||||
- **server-frameworks**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apiserver/OWNERS
|
||||
- **server-crd**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/apiextensions-apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/apiextensions-apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/apiextensions-apiserver/OWNERS
|
||||
- **server-api-aggregation**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/kube-aggregator/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/kube-aggregator/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/kube-aggregator/OWNERS
|
||||
- **server-sdk**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/sample-apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/sample-apiserver/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/sample-apiserver/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/sample-controller/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/sample-controller/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/sample-controller/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes-incubator/apiserver-builder/master/OWNERS
|
||||
- **idl-schema-client-pipeline**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/gengo/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/code-generator/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/code-generator/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/code-generator/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kube-openapi/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/api/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/api/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/api/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes-client/gen/master/OWNERS
|
||||
- **kubernetes-clients**
|
||||
- Owners:
|
||||
|
|
@ -90,7 +90,7 @@ The following subprojects are owned by sig-api-machinery:
|
|||
- https://raw.githubusercontent.com/kubernetes-client/typescript/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes-incubator/client-python/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/client-go/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/staging/src/k8s.io/client-go/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/staging/src/k8s.io/client-go/OWNERS
|
||||
- **universal-utils**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/utils/master/OWNERS
|
||||
|
|
|
|||
|
|
@ -21,7 +21,7 @@ The Architecture SIG maintains and evolves the design principles of Kubernetes,
|
|||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Brian Grant (**[@bgrant0607](https://github.com/bgrant0607)**), Google
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
|
||||
|
||||
## Contact
|
||||
* [Slack](https://kubernetes.slack.com/messages/sig-architecture)
|
||||
|
|
|
|||
|
|
@ -11,7 +11,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
|
|||
A Special Interest Group for building, deploying, maintaining, supporting, and using Kubernetes on Azure.
|
||||
|
||||
## Meetings
|
||||
* Regular SIG Meeting: [Wednesdays at 16:00 UTC](https://zoom.us/j/2015551212) (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=16:00&tz=UTC).
|
||||
* Regular SIG Meeting: [Wednesdays at 16:00 UTC](https://zoom.us/j/2015551212) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=16:00&tz=UTC).
|
||||
* [Meeting notes and Agenda](https://docs.google.com/document/d/1SpxvmOgHDhnA72Z0lbhBffrfe9inQxZkU9xqlafOW9k/edit).
|
||||
* [Meeting recordings](https://www.youtube.com/watch?v=yQLeUKi_dwg&list=PL69nYSiGNLP2JNdHwB8GxRs2mikK7zyc4).
|
||||
|
||||
|
|
@ -20,9 +20,15 @@ A Special Interest Group for building, deploying, maintaining, supporting, and u
|
|||
### Chairs
|
||||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Jason Hansen (**[@slack](https://github.com/slack)**), Microsoft
|
||||
* Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**), Red Hat
|
||||
* Shubheksha Jalan (**[@shubheksha](https://github.com/shubheksha)**), Microsoft
|
||||
|
||||
### Technical Leads
|
||||
The Technical Leads of the SIG establish new subprojects, decommission existing
|
||||
subprojects, and resolve cross-subproject technical issues and decisions.
|
||||
|
||||
* Kal Khenidak (**[@khenidak](https://github.com/khenidak)**), Microsoft
|
||||
* Cole Mickens (**[@colemickens](https://github.com/colemickens)**), Red Hat
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
|
||||
|
||||
## Contact
|
||||
* [Slack](https://kubernetes.slack.com/messages/sig-azure)
|
||||
|
|
@ -47,7 +53,13 @@ Monitor these for Github activity if you are not a member of the team.
|
|||
|
||||
| Team Name | Details | Google Groups | Description |
|
||||
| --------- |:-------:|:-------------:| ----------- |
|
||||
| @kubernetes/sig-azure-api-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-azure-api-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-api-reviews) | API Changes and Reviews |
|
||||
| @kubernetes/sig-azure-bugs | [link](https://github.com/orgs/kubernetes/teams/sig-azure-bugs) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-bugs) | Bug Triage and Troubleshooting |
|
||||
| @kubernetes/sig-azure-feature-requests | [link](https://github.com/orgs/kubernetes/teams/sig-azure-feature-requests) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-feature-requests) | Feature Requests |
|
||||
| @kubernetes/sig-azure-misc | [link](https://github.com/orgs/kubernetes/teams/sig-azure-misc) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-misc) | General Discussion |
|
||||
| @kubernetes/sig-azure-pr-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-azure-pr-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-pr-reviews) | PR Reviews |
|
||||
| @kubernetes/sig-azure-proposals | [link](https://github.com/orgs/kubernetes/teams/sig-azure-proposals) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-proposals) | Design Proposals |
|
||||
| @kubernetes/sig-azure-test-failures | [link](https://github.com/orgs/kubernetes/teams/sig-azure-test-failures) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-azure-test-failures) | Test Failures and Triage |
|
||||
|
||||
<!-- BEGIN CUSTOM CONTENT -->
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,100 @@
|
|||
# SIG Azure Charter
|
||||
|
||||
_The following is a charter for the Kubernetes Special Interest Group for Azure. It delineates the roles of SIG leadership and SIG members, as well as the organizational processes for the SIG, both as they relate to project management and to technical processes for SIG subprojects._
|
||||
|
||||
## Roles
|
||||
|
||||
### SIG Chairs
|
||||
|
||||
- Run operations and processes governing the SIG
|
||||
- Seed members established at SIG founding
|
||||
- Chairs MAY decide to step down at any time and propose a replacement. Use lazy consensus amongst chairs with fallback on majority vote to accept the proposal. This SHOULD be supported by a majority of SIG Members.
|
||||
- Chairs MAY select additional chairs through a [super-majority] vote amongst chairs. This SHOULD be supported by a majority of SIG Members.
|
||||
- Chairs MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed if not proactively working with other chairs to fulfill responsibilities. Coordinated leaves of absence serve as an exception to this requirement.
|
||||
- Number: 2 - 3
|
||||
- Defined in [sigs.yaml]
|
||||
|
||||
### SIG Technical Leads
|
||||
|
||||
- Establish new subprojects
|
||||
- Decommission existing subprojects
|
||||
- Resolve cross-subproject technical issues and decisions
|
||||
- Technical Leads MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed if not proactively working with other chairs to fulfill responsibilities. Coordinated leaves of absence serve as an exception to this requirement.
|
||||
- Number: 2 - 3
|
||||
- Defined in [sigs.yaml]
|
||||
|
||||
### Subproject Owners
|
||||
|
||||
- Scoped to a subproject defined in [sigs.yaml]
|
||||
- Seed members established at subproject founding
|
||||
- MUST be an escalation point for technical discussions and decisions in the subproject
|
||||
- MUST set milestone priorities or delegate this responsibility
|
||||
- MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months. Coordinated leaves of absence serve as an exception to this requirement.
|
||||
- MAY be removed if not proactively working with other Subproject Owners to fulfill responsibilities.
|
||||
- MAY decide to step down at any time and propose a replacement. Use [lazy-consensus] amongst subproject owners with fallback on majority vote to accept the proposal. This SHOULD be supported by a majority of subproject contributors (those having some role in the subproject).
|
||||
- MAY select additional subproject owners through a [super-majority] vote amongst subproject owners. This SHOULD be supported by a majority of subproject contributors (through [lazy-consensus] with fallback on voting).
|
||||
- Number: 3 - 5
|
||||
- Defined in [sigs.yaml] [OWNERS] files
|
||||
|
||||
**IMPORTANT**
|
||||
|
||||
_With regards to leadership roles i.e., Chairs, Technical Leads, and Subproject Owners, we MUST, as a SIG, ensure that positions are held by a committee of members across a diverse set of companies. This allows for thoughtful discussion and structural management that can serve the needs of every consumer of Kubernetes on Azure._
|
||||
|
||||
### Members
|
||||
|
||||
- MUST maintain the health of at least one subproject or the health of the SIG
|
||||
- MUST show sustained contributions to at least one subproject or to the SIG
|
||||
- SHOULD hold some documented role or responsibility in the SIG and / or at least one subproject (e.g. reviewer, approver, etc)
|
||||
- MAY build new functionality for subprojects
|
||||
- MAY participate in decision making for the subprojects they hold roles in
|
||||
- Includes all reviewers and approvers in [OWNERS] files for subprojects
|
||||
|
||||
## Organizational management
|
||||
|
||||
- SIG meets bi-weekly on zoom with agenda in meeting notes
|
||||
- SHOULD be facilitated by chairs unless delegated to specific Members
|
||||
- SIG overview and deep-dive sessions organized for Kubecon
|
||||
- SHOULD be organized by chairs unless delegated to specific Members
|
||||
- Contributing instructions defined in the SIG CONTRIBUTING.md
|
||||
|
||||
### Project management
|
||||
|
||||
#### Subproject creation
|
||||
|
||||
Subprojects may be created by [KEP] proposal and accepted by [lazy-consensus] with fallback on majority vote of SIG Technical Leads. The result SHOULD be supported by the majority of SIG members.
|
||||
|
||||
- KEP MUST establish subproject owners
|
||||
- [sigs.yaml] MUST be updated to include subproject information and [OWNERS] files with subproject owners (see the sketch after this list)
|
||||
- Where subproject processes differ from the SIG governance, they MUST document how
|
||||
- e.g., if subprojects release separately - they must document how releases and planning are performed
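
As a rough illustration of the [sigs.yaml] requirement above, a subproject entry might look like the sketch below. The subproject name and OWNERS URL are placeholders, and the exact field names should be checked against the current [sigs.yaml] schema rather than taken from this charter.

```yaml
# Hypothetical fragment of a SIG's entry in sigs.yaml (illustrative only).
subprojects:
  - name: my-subproject        # placeholder subproject name
    owners:
      # OWNERS file(s) listing the subproject owners
      - https://raw.githubusercontent.com/kubernetes/my-subproject/master/OWNERS
```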
|
||||
|
||||
Subprojects must define how releases are performed and milestones are set.
|
||||
|
||||
Example:
|
||||
- Release milestones
|
||||
- Follows the kubernetes/kubernetes release milestones and schedule
|
||||
- Priorities for the upcoming release are discussed during the SIG meeting following the preceding release and shared through a PR. Priorities are finalized before feature freeze.
|
||||
- Code and artifacts are published as part of the kubernetes/kubernetes release
|
||||
|
||||
### Technical processes
|
||||
|
||||
Subprojects of the SIG MUST use the following processes unless explicitly following alternatives they have defined.
|
||||
|
||||
- Proposing and making decisions
|
||||
- Proposals sent as [KEP] PRs and published to the Google group as an announcement
|
||||
- Follow [KEP] decision making process
|
||||
|
||||
- Test health
|
||||
- Canonical health of code published to
|
||||
- Consistently broken tests automatically send an alert to
|
||||
- SIG members are responsible for responding to broken-test alerts. PRs that break tests should be rolled back if not fixed within 24 hours (business hours).
|
||||
- The test dashboard is checked and reviewed at the start of each SIG meeting. Owners are assigned for any broken tests and followed up with during the next SIG meeting.
|
||||
|
||||
Issues impacting multiple subprojects in the SIG should be resolved by SIG Technical Leads, with fallback to consensus of subproject owners.
|
||||
|
||||
[lazy-consensus]: http://communitymgt.wikia.com/wiki/Lazy_consensus
|
||||
[super-majority]: https://en.wikipedia.org/wiki/Supermajority#Two-thirds_vote
|
||||
[KEP]: https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md
|
||||
[sigs.yaml]: https://github.com/kubernetes/community/blob/master/sigs.yaml#L1454
|
||||
[OWNERS]: contributors/devel/owners.md
|
||||
|
|
@ -40,6 +40,9 @@ The following subprojects are owned by sig-cli:
|
|||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/kubectl/master/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/kubectl/OWNERS
|
||||
- **kustomize**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/OWNERS
|
||||
|
||||
## GitHub Teams
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,100 @@
|
|||
# SIG Cloud Provider Charter
|
||||
|
||||
## Mission
|
||||
The Cloud Provider SIG ensures that the Kubernetes ecosystem is evolving in a way that is neutral to all (public and private) cloud providers. It will be responsible for establishing standards and requirements that must be met by all providers to ensure optimal integration with Kubernetes.
|
||||
|
||||
## Subprojects & Areas of Focus
|
||||
|
||||
* Maintaining the parts of the Kubernetes project that allow Kubernetes to integrate with the underlying provider. This includes, but is not limited to (a deployment sketch follows this list):
|
||||
* [cloud provider interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go)
|
||||
* [cloud-controller-manager](https://github.com/kubernetes/kubernetes/tree/master/cmd/cloud-controller-manager)
|
||||
* Deployment tooling which has historically resided under [cluster/](https://github.com/kubernetes/kubernetes/tree/release-1.11/cluster)
|
||||
* Code ownership for all cloud providers that fall under the kubernetes organization and have opted to be subprojects of SIG Cloud Provider. Following the guidelines around subprojects, we anticipate providers will have full autonomy to maintain their own repositories; however, official code ownership will still belong to SIG Cloud Provider.
|
||||
* [cloud-provider-azure](https://github.com/kubernetes/cloud-provider-azure)
|
||||
* [cloud-provider-gcp](https://github.com/kubernetes/cloud-provider-gcp)
|
||||
* [cloud-provider-openstack](https://github.com/kubernetes/cloud-provider-openstack)
|
||||
* [cloud-provider-vsphere](https://github.com/kubernetes/cloud-provider-vsphere)
|
||||
* Standards for documentation that should be included by all providers.
|
||||
* Defining processes/standards for E2E tests that should be reported by all providers
|
||||
* Developing future functionality in Kubernetes to support use cases common to all providers while also allowing custom and pluggable implementations when required. Some examples include, but are not limited to:
|
||||
* Extendable node statuses and machine states based on provider
|
||||
* Extendable node address types based on provider
|
||||
* See also [Cloud Controller Manager KEP](https://github.com/kubernetes/community/blob/master/keps/0002-controller-manager.md)
|
||||
* The collection of user experience reports from Kubernetes operators running on provider subprojects, and the delivery of roadmap information to SIG PM
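
To make the cloud-controller-manager item above concrete, the sketch below shows one hypothetical way a provider could run its out-of-tree cloud-controller-manager in a cluster. The image, binary path, and provider name are placeholders; this charter does not prescribe a deployment method, and a real deployment would also need RBAC, credentials, and high-availability configuration.

```yaml
# Illustrative only: a minimal pod running a provider's cloud-controller-manager.
apiVersion: v1
kind: Pod
metadata:
  name: cloud-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: cloud-controller-manager
    # Hypothetical image; each provider publishes its own build.
    image: registry.example.com/acme/cloud-controller-manager:v1.11.0
    command:
    - /cloud-controller-manager   # hypothetical binary path inside the image
    - --cloud-provider=acme       # selects the provider implementation (placeholder name)
    - --leader-elect=true         # safe when running more than one replica
```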
|
||||
|
||||
## Organizational Management
|
||||
|
||||
* Six months after this charter is first ratified, it MUST be reviewed and re-approved by the SIG in order to evaluate the assumptions made in its initial drafting
|
||||
* SIG meets bi-weekly on zoom with agenda in meeting notes.
|
||||
* SHOULD be facilitated by chairs unless delegated to specific Members
|
||||
* The SIG MUST make a best effort to provide leadership opportunities to individuals who represent different races, national origins, ethnicities, genders, abilities, sexual preferences, ages, backgrounds, levels of educational achievement, and socioeconomic statuses
|
||||
|
||||
## Subproject Creation
|
||||
|
||||
Each Kubernetes provider will (eventually) be a subproject under SIG Cloud Provider. To add new subprojects (providers), SIG Cloud Provider will maintain an open list of requirements that must be satisfied.
|
||||
The current requirements can be seen [here](https://github.com/kubernetes/community/blob/master/keps/0002-controller-manager.md#repository-requirements). Each provider subproject is entitled to create 1..N repositories related to cluster turn-up or operation on its platform, subject to technical standards set by SIG Cloud Provider.
|
||||
Creation of a repository SHOULD follow the KEP process to preserve the motivation for the repository and any additional instructions for how other SIGs (e.g., SIG Documentation and SIG Release) should interact with the repository.
|
||||
|
||||
Subprojects that fall under SIG Cloud Provider may also be features in Kubernetes that are requested or needed by all, or at least a large majority, of providers. The creation process for these subprojects will follow the usual KEP process.
|
||||
|
||||
## Subproject Retirement
|
||||
|
||||
Subprojects representing Kubernetes providers may be retired if they do not satisfy the requirements for more than 6 months. Final decisions for retirement should be supported by a majority of SIG members using [lazy consensus](http://communitymgt.wikia.com/wiki/Lazy_consensus). Once retired, any code related to that provider will be archived into the kubernetes-retired organization.
|
||||
|
||||
Subprojects representing Kubernetes features may be retired at any point given a lack of development or a lack of demand. Final decisions for retirement should be supported by a majority of SIG members, ideally from every provider. Once retired, any code related to that subproject will be archived into the kubernetes-retired organization.
|
||||
|
||||
|
||||
## Technical Processes
|
||||
Subprojects (providers) of the SIG MUST use the following processes unless explicitly following alternatives they have defined.
|
||||
|
||||
* Proposals will be sent as [KEP](https://github.com/kubernetes/community/blob/master/keps/0000-kep-template.md) PRs, and published to the official group mailing list as an announcement
|
||||
* Proposals, once submitted, SHOULD be placed on the next full meeting agenda
|
||||
* Decisions within the scope of individual subprojects should be made by lazy consensus by subproject owners, with fallback to majority vote by subproject owners; if a decision can’t be made, it should be escalated to the SIG Chairs
|
||||
* Issues impacting multiple subprojects in the SIG should be resolved by consensus of the owners of the involved subprojects; if a decision can’t be made, it should be escalated to the SIG Chairs
|
||||
|
||||
## Roles
|
||||
The following roles are required for the SIG to function properly. In the event that any role is unfilled, the SIG will make a best effort to fill it. Any decisions reliant on a missing role will be postponed until the role is filled.
|
||||
|
||||
|
||||
### Chairs
|
||||
* 3 chairs are required
|
||||
* Run operations and processes governing the SIG
|
||||
* An initial set of chairs was established at the time the SIG was founded.
|
||||
* Chairs MAY decide to step down at any time and propose a replacement, who must be approved by all of the other chairs. This SHOULD be supported by a majority of SIG Members.
|
||||
* Chairs MAY select additional chairs using lazy consensus amongst SIG Members.
|
||||
* Chairs MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed by consensus of the other Chairs and members if not proactively working with other Chairs to fulfill responsibilities.
|
||||
* Chairs WILL be asked to step down if there is inappropriate behavior or code of conduct issues
|
||||
* SIG Cloud Provider cannot have more than one chair from any one company.
|
||||
|
||||
### Subproject/Provider Owners
|
||||
* There should be at least 1 representative per subproject/provider (though 3 is recommended to avoid deadlock) as specified in the OWNERS file of each cloud provider repository.
|
||||
* MUST be an escalation point for technical discussions and decisions in the subproject/provider
|
||||
* MUST set milestone priorities or delegate this responsibility
|
||||
* MUST remain active in the role and are automatically removed from the position if they are unresponsive for > 3 months and MAY be removed by consensus of other subproject owners and Chairs if not proactively working with other Subproject Owners to fulfill responsibilities.
|
||||
* MAY decide to step down at any time and propose a replacement. This can be done by updating the OWNERS file for the relevant subprojects (see the sketch after this list).
|
||||
* MAY select additional subproject owners by updating the OWNERS file.
|
||||
* WILL be asked to step down if there is inappropriate behavior or code of conduct issues
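
As a rough illustration of the OWNERS updates mentioned above, a provider repository's OWNERS file might look like the sketch below. The GitHub handles are placeholders, not actual subproject owners; adding or removing a handle under `approvers` is how the step-down and owner-selection bullets above are carried out in practice.

```yaml
# Hypothetical OWNERS file at the root of a provider repository (illustrative only).
reviewers:
  - provider-reviewer-1   # placeholder handle
  - provider-reviewer-2   # placeholder handle
approvers:
  - provider-owner-1      # placeholder handle
  - provider-owner-2      # placeholder handle
labels:
  - sig/cloud-provider
```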
|
||||
|
||||
### SIG Members
|
||||
|
||||
Approvers and reviewers in the OWNERS files of all subprojects under SIG Cloud Provider.
|
||||
|
||||
## Long Term Goals
|
||||
|
||||
The long-term goal of SIG Cloud Provider is to promote a vendor-neutral ecosystem for our community. Vendors wanting to support Kubernetes should feel equally empowered to do so
as any of today’s existing cloud providers; more importantly, the SIG will ensure a high-quality user experience across providers. The SIG will act as a central group for developing
the Kubernetes project in a way that ensures all providers share common privileges and responsibilities. Below are some concrete goals on how SIG Cloud Provider plans to accomplish this.
|
||||
|
||||
### Consolidating Existing Cloud SIGs
|
||||
|
||||
SIG Cloud Provider will aim to eventually consolidate existing cloud provider SIGs and have each provider instead form a subproject under it. The subprojects would drive the development of
individual providers and work closely with SIG Cloud Provider to ensure compatibility with Kubernetes. With this model, code ownership for new and existing providers will belong to SIG Cloud Provider,
limiting SIG sprawl as more providers support Kubernetes. Existing SIGs representing cloud providers are highly encouraged to opt in as subprojects under SIG Cloud Provider but are not required to do so.
When a SIG opts in, it will work to ensure a smooth transition, typically over the course of 3 release cycles.
|
||||
|
||||
### Supporting New Cloud Providers
|
||||
|
||||
One of the primary goals of SIG Cloud Provider is to become an entry point for new providers wishing to support Kubernetes on their platform and to ensure technical excellence from each of those providers.
SIG Cloud Provider will accomplish this by maintaining documentation around how new providers can get started and managing the set of requirements that must be met to onboard them. In addition to
onboarding new providers, the entire lifecycle of providers would also fall under the responsibility of SIG Cloud Provider, which may involve clean-up work if a provider decides to no longer support Kubernetes.
|
||||
|
||||
|
|
@ -0,0 +1,6 @@
|
|||
reviewers:
|
||||
- sig-cloud-provider-leads
|
||||
approvers:
|
||||
- sig-cloud-provider-leads
|
||||
labels:
|
||||
- sig/cloud-provider
|
||||
|
|
@ -0,0 +1,75 @@
|
|||
<!---
|
||||
This is an autogenerated file!
|
||||
|
||||
Please do not edit this file directly, but instead make changes to the
|
||||
sigs.yaml file in the project root.
|
||||
|
||||
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
|
||||
-->
|
||||
# Cloud Provider Special Interest Group
|
||||
|
||||
Ensures that the Kubernetes ecosystem is evolving in a way that is neutral to all (public and private) cloud providers. It will be responsible for establishing standards and requirements that must be met by all providers to ensure optimal integration with Kubernetes.
|
||||
|
||||
## Meetings
|
||||
* Regular SIG Meeting: [Wednesdays at 10:00 PT (Pacific Time)](https://zoom.us/my/sigcloudprovider) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=10:00&tz=PT%20%28Pacific%20Time%29).
|
||||
* [Meeting notes and Agenda](TODO).
|
||||
* [Meeting recordings](TODO).
|
||||
|
||||
## Leadership
|
||||
|
||||
### Chairs
|
||||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Andrew Sy Kim (**[@andrewsykim](https://github.com/andrewsykim)**), DigitalOcean
|
||||
* Chris Hoge (**[@hogepodge](https://github.com/hogepodge)**), OpenStack Foundation
|
||||
* Jago Macleod (**[@jagosan](https://github.com/jagosan)**), Google
|
||||
|
||||
## Contact
|
||||
* [Slack](https://kubernetes.slack.com/messages/sig-cloud-provider)
|
||||
* [Mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider)
|
||||
* [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fcloud-provider)
|
||||
|
||||
## Subprojects
|
||||
|
||||
The following subprojects are owned by sig-cloud-provider:
|
||||
- **kubernetes-cloud-provider**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/cmd/cloud-controller-manager/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/controller/cloud/OWNERS
|
||||
- https://raw.githubusercontent.com/kubernetes/kubernetes/master/pkg/cloudprovider/OWNERS
|
||||
- **cloud-provider-azure**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/cloud-provider-azure/master/OWNERS
|
||||
- **cloud-provider-gcp**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/cloud-provider-gcp/master/OWNERS
|
||||
- **cloud-provider-openstack**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/OWNERS
|
||||
- **cloud-provider-vsphere**
|
||||
- Owners:
|
||||
- https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/OWNERS
|
||||
|
||||
## GitHub Teams
|
||||
|
||||
The below teams can be mentioned on issues and PRs in order to get attention from the right people.
|
||||
Note that the links to display team membership will only work if you are a member of the org.
|
||||
|
||||
The google groups contain the archive of Github team notifications.
|
||||
Mentioning a team on Github will CC its group.
|
||||
Monitor these for Github activity if you are not a member of the team.
|
||||
|
||||
| Team Name | Details | Google Groups | Description |
|
||||
| --------- |:-------:|:-------------:| ----------- |
|
||||
| @kubernetes/sig-cloud-provider-api-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-api-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-api-reviews) | API Changes and Reviews |
|
||||
| @kubernetes/sig-cloud-provider-bugs | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-bugs) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-bugs) | Bug Triage and Troubleshooting |
|
||||
| @kubernetes/sig-cloud-provider-feature-requests | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-feature-requests) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-feature-requests) | Feature Requests |
|
||||
| @kubernetes/sig-cloud-provider-maintainers | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-maintainers) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-maintainers) | Cloud Providers Maintainers |
|
||||
| @kubernetes/sig-cloud-providers-misc | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-providers-misc) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-providers-misc) | General Discussion |
|
||||
| @kubernetes/sig-cloud-provider-pr-reviews | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-pr-reviews) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-pr-reviews) | PR Reviews |
|
||||
| @kubernetes/sig-cloud-provider-proposals | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-proposals) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-proposals) | Design Proposals |
|
||||
| @kubernetes/sig-cloud-provider-test-failures | [link](https://github.com/orgs/kubernetes/teams/sig-cloud-provider-test-failures) | [link](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider-test-failures) | Test Failures and Triage |
|
||||
|
||||
<!-- BEGIN CUSTOM CONTENT -->
|
||||
|
||||
<!-- END CUSTOM CONTENT -->
|
||||
|
|
@ -21,7 +21,7 @@ Promote operability and interoperability of Kubernetes clusters. We focus on sha
|
|||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Rob Hirschfeld (**[@zehicle](https://github.com/zehicle)**), RackN
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
|
||||
|
||||
## Contact
|
||||
* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)
|
||||
|
|
|
|||
|
|
@ -23,7 +23,7 @@ Project | Owner(s)/Lead(s) | Description | Q1, Q2, Later
|
|||
[Meet Our Contributors](/mentoring/meet-our-contributors.md) | @parispittman | Monthly web series similar to user office hours that allows anyone to ask new and current contributors questions about our process, ecosystem, or their stories in open source | Q1 - ongoing
|
||||
[Outreachy](/mentoring/README.md) | @parispittman | Document new features, create new conceptual content, create new user paths | Q1
|
||||
[Google Summer of Code](/mentoring/google-summer-of-code.md) | @nikhita | Kubernetes participation in Google Summer of Code for students | Q1 - ongoing
|
||||
["Buddy" Program](https://github.com/kubernetes/community/issues/1803) | @parispittman, @chrisshort | 1 hour 1:1 sessions for new and current contributors to have dedicated time; meet our contributors but personal | Q2
|
||||
["Buddy" Program](https://github.com/kubernetes/community/issues/1803) | @parispittman, @chris-short | 1 hour 1:1 sessions for new and current contributors to have dedicated time; meet our contributors but personal | Q2
|
||||
|
||||
## Contributor Documentation
|
||||
Ensure the contribution process is well documented, discoverable, and consistent across repos to deliver the best contributor experience.
|
||||
|
|
|
|||
|
|
@ -22,8 +22,8 @@ In order to standardize Special Interest Group efforts, create maximum transpare
|
|||
|
||||
### Prerequisites
|
||||
|
||||
* Propose the new SIG publicly, including a brief mission statement, by emailing kubernetes-dev@googlegroups.com and kubernetes-users@googlegroups.com, then wait a couple of days for feedback
|
||||
* Ask a repo maintainer to create a github label, if one doesn't already exist: sig/foo
|
||||
* Propose the new SIG publicly, including a brief mission statement, by emailing kubernetes-dev@googlegroups.com and kubernetes-users@googlegroups.com, then wait a couple of days for feedback.
|
||||
* Ask a repo maintainer to create a github label, if one doesn't already exist: sig/foo.
|
||||
* Request a new [kubernetes.slack.com](http://kubernetes.slack.com) channel (#sig-foo) from the #slack-admins channel. New users can join at [slack.kubernetes.io](http://slack.kubernetes.io).
|
||||
* Slack activity is archived at [kubernetes.slackarchive.io](http://kubernetes.slackarchive.io). To start archiving a new channel, invite the slackarchive bot to the channel via `/invite @slackarchive`
|
||||
* Organize video meetings as needed. There is no need to wait for the [Weekly Community Video Conference](community/README.md) to discuss. Please report a summary of SIG activities there.
|
||||
|
|
@ -54,7 +54,7 @@ Create Google Groups at [https://groups.google.com/forum/#!creategroup](https://
|
|||
* Create groups using the name conventions below;
|
||||
* Groups should be created as e-mail lists with at least three owners (including sarahnovotny at google.com and ihor.dvoretskyi at gmail.com);
|
||||
* To add the owners, visit the Group Settings (drop-down menu on the right side), select Direct Add Members on the left side, and add Sarah and Ihor via email address (with a suitable welcome message); in Members/All Members, select Ihor and Sarah and assign them an "owner role";
|
||||
* Set "View topics", "Post", "Join the Group" permissions to be "Public"
|
||||
* Set "View topics", "Post", "Join the Group" permissions to be "Public";
|
||||
|
||||
Name convention:
|
||||
|
||||
|
|
|
|||
|
|
@ -24,15 +24,16 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
|
|||
|------|-------|--------|---------|----------|
|
||||
|[API Machinery](sig-api-machinery/README.md)|api-machinery|* [Daniel Smith](https://github.com/lavalamp), Google<br>* [David Eads](https://github.com/deads2k), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-api-machinery)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-api-machinery)|* Regular SIG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/apimachinery)<br>
|
||||
|[Apps](sig-apps/README.md)|apps|* [Matt Farina](https://github.com/mattfarina), Samsung SDS<br>* [Adnan Abdulhussein](https://github.com/prydonius), Bitnami<br>* [Kenneth Owens](https://github.com/kow3ns), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-apps)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-apps)|* Regular SIG Meeting: [Mondays at 9:00 PT (Pacific Time) (weekly)](https://zoom.us/my/sig.apps)<br>* (charts) Charts Chat: [Tuesdays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/166909412)<br>* (helm) Helm Developer call: [Thursdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/j/4526666954)<br>
|
||||
|[Architecture](sig-architecture/README.md)|architecture|* [Brian Grant](https://github.com/bgrant0607), Google<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-architecture)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-architecture)|* Regular SIG Meeting: [Thursdays at 15:30 UTC (weekly)](https://zoom.us/j/9690526922)<br>
|
||||
|[Architecture](sig-architecture/README.md)|architecture|* [Brian Grant](https://github.com/bgrant0607), Google<br>* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-architecture)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-architecture)|* Regular SIG Meeting: [Thursdays at 15:30 UTC (weekly)](https://zoom.us/j/9690526922)<br>
|
||||
|[Auth](sig-auth/README.md)|auth|* [Eric Chiang](https://github.com/ericchiang), Red Hat<br>* [Jordan Liggitt](https://github.com/liggitt), Red Hat<br>* [Tim Allclair](https://github.com/tallclair), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-auth)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-auth)|* Regular SIG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8s.sig.auth)<br>
|
||||
|[Autoscaling](sig-autoscaling/README.md)|autoscaling|* [Marcin Wielgus](https://github.com/mwielgus), Google<br>* [Solly Ross](https://github.com/directxman12), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-autoscaling)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-autoscaling)|* Regular SIG Meeting: [Mondays at 14:00 UTC (biweekly/triweekly)](https://zoom.us/my/k8s.sig.autoscaling)<br>
|
||||
|[AWS](sig-aws/README.md)|aws|* [Justin Santa Barbara](https://github.com/justinsb)<br>* [Kris Nova](https://github.com/kris-nova), Heptio<br>* [Bob Wise](https://github.com/countspongebob), AWS<br>|* [Slack](https://kubernetes.slack.com/messages/sig-aws)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-aws)|* Regular SIG Meeting: [Fridays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8ssigaws)<br>
|
||||
|[Azure](sig-azure/README.md)|azure|* [Jason Hansen](https://github.com/slack), Microsoft<br>* [Cole Mickens](https://github.com/colemickens), Red Hat<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-azure)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-azure)|* Regular SIG Meeting: [Wednesdays at 16:00 UTC (weekly)](https://zoom.us/j/2015551212)<br>
|
||||
|[Azure](sig-azure/README.md)|azure|* [Stephen Augustus](https://github.com/justaugustus), Red Hat<br>* [Shubheksha Jalan](https://github.com/shubheksha), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-azure)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-azure)|* Regular SIG Meeting: [Wednesdays at 16:00 UTC (biweekly)](https://zoom.us/j/2015551212)<br>
|
||||
|[Big Data](sig-big-data/README.md)|big-data|* [Anirudh Ramanathan](https://github.com/foxish), Google<br>* [Erik Erlandson](https://github.com/erikerlandson), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-big-data)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-big-data)|* Regular SIG Meeting: [Wednesdays at 17:00 UTC (weekly)](https://zoom.us/my/sig.big.data)<br>
|
||||
|[CLI](sig-cli/README.md)|cli|* [Maciej Szulik](https://github.com/soltysh), Red Hat<br>* [Phillip Wittrock](https://github.com/pwittrock), Google<br>* [Tony Ado](https://github.com/AdoHe), Alibaba<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cli)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cli)|* Regular SIG Meeting: [Wednesdays at 09:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/sigcli)<br>
|
||||
|[Cloud Provider](sig-cloud-provider/README.md)|cloud-provider|* [Andrew Sy Kim](https://github.com/andrewsykim), DigitalOcean<br>* [Chris Hoge](https://github.com/hogepodge), OpenStack Foundation<br>* [Jago Macleod](https://github.com/jagosan), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cloud-provider)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cloud-provider)|* Regular SIG Meeting: [Wednesdays at 10:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/sigcloudprovider)<br>
|
||||
|[Cluster Lifecycle](sig-cluster-lifecycle/README.md)|cluster-lifecycle|* [Luke Marsden](https://github.com/lukemarsden), Weave<br>* [Robert Bailey](https://github.com/roberthbailey), Google<br>* [Lucas Käldström](https://github.com/luxas), Luxas Labs (occasionally contracting for Weaveworks)<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-lifecycle)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular SIG Meeting: [Tuesdays at 09:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* kubeadm Office Hours: [Wednesdays at 09:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* Cluster API working group: [Wednesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>* kops Office Hours: [Fridays at 09:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8ssigaws)<br>
|
||||
|[Cluster Ops](sig-cluster-ops/README.md)|cluster-ops|* [Rob Hirschfeld](https://github.com/zehicle), RackN<br>* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-ops)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/297937771)<br>
|
||||
|[Cluster Ops](sig-cluster-ops/README.md)|cluster-ops|* [Rob Hirschfeld](https://github.com/zehicle), RackN<br>* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-ops)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-ops)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/297937771)<br>
|
||||
|[Contributor Experience](sig-contributor-experience/README.md)|contributor-experience|* [Elsie Phillips](https://github.com/Phillels), CoreOS<br>* [Paris Pittman](https://github.com/parispittman), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-contribex)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-contribex)|* Regular SIG Meeting: [Wednesdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/j/7658488911)<br>
|
||||
|[Docs](sig-docs/README.md)|docs|* [Zach Corleissen](https://github.com/zacharysarah), Linux Foundation<br>* [Andrew Chen](https://github.com/chenopis), Google<br>* [Jared Bhatti](https://github.com/jaredbhatti), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-docs)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-docs)|* Regular SIG Meeting: [Tuesdays at 17:30 UTC (weekly)](https://zoom.us/j/678394311)<br>
|
||||
|[GCP](sig-gcp/README.md)|gcp|* [Adam Worrall](https://github.com/abgworrall), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-gcp)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-gcp)|* Regular SIG Meeting: [Thursdays at 16:00 UTC (biweekly)](https://zoom.us/j/761149873)<br>
|
||||
|
|
@ -42,15 +43,21 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
|
|||
|[Network](sig-network/README.md)|network|* [Tim Hockin](https://github.com/thockin), Google<br>* [Dan Williams](https://github.com/dcbw), Red Hat<br>* [Casey Davenport](https://github.com/caseydavenport), Tigera<br>|* [Slack](https://kubernetes.slack.com/messages/sig-network)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-network)|* Regular SIG Meeting: [Thursdays at 14:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/5806599998)<br>
|
||||
|[Node](sig-node/README.md)|node|* [Dawn Chen](https://github.com/dchen1107), Google<br>* [Derek Carr](https://github.com/derekwaynecarr), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-node)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-node)|* Regular SIG Meeting: [Tuesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/j/4799874685)<br>
|
||||
|[OpenStack](sig-openstack/README.md)|openstack|* [Chris Hoge](https://github.com/hogepodge), OpenStack Foundation<br>* [David Lyle](https://github.com/dklyle), Intel<br>* [Robert Morse](https://github.com/rjmorse), Ticketmaster<br>|* [Slack](https://kubernetes.slack.com/messages/sig-openstack)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-openstack)|* Regular SIG Meeting: [Wednesdays at 16:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/417251241)<br>
|
||||
<<<<<<< HEAD
|
||||
|[PM](sig-pm/README.md)|pm|* [Aparna Sinha](https://github.com/apsinha), Google<br>* [Ihor Dvoretskyi](https://github.com/idvoretskyi), CNCF<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/kubernetes-pm)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-pm)|* Regular SIG Meeting: [Tuesdays at 18:30 UTC (biweekly)](https://zoom.us/j/845373595)<br>
|
||||
|[Release](sig-release/README.md)|release|* [Jaice Singer DuMars](https://github.com/jdumars), Microsoft<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-release)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-release)|* Regular SIG Meeting: [Tuesdays at 21:00 UTC (biweekly)](https://zoom.us/j/664772523)<br>
|
||||
|[Scalability](sig-scalability/README.md)|scalability|* [Wojciech Tyczynski](https://github.com/wojtek-t), Google<br>* [Bob Wise](https://github.com/countspongebob), Samsung SDS<br>|* [Slack](https://kubernetes.slack.com/messages/sig-scalability)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-scale)|* Regular SIG Meeting: [Thursdays at 16:30 UTC (bi-weekly)](https://zoom.us/j/989573207)<br>
|
||||
=======
|
||||
|[Product Management](sig-product-management/README.md)|none|* [Aparna Sinha](https://github.com/apsinha), Google<br>* [Ihor Dvoretskyi](https://github.com/idvoretskyi), CNCF<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/kubernetes-pm)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-pm)|* Regular SIG Meeting: [Tuesdays at 16:00 UTC (biweekly)](https://zoom.us/j/845373595)<br>
|
||||
|[Release](sig-release/README.md)|release|* [Jaice Singer DuMars](https://github.com/jdumars), Google<br>* [Caleb Miles](https://github.com/calebamiles), Google<br>|* [Slack](https://kubernetes.slack.com/messages/sig-release)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-release)|* Regular SIG Meeting: [Tuesdays at 21:00 UTC (biweekly)](https://zoom.us/j/664772523)<br>
|
||||
|[Scalability](sig-scalability/README.md)|scalability|* [Wojciech Tyczynski](https://github.com/wojtek-t), Google<br>* [Bob Wise](https://github.com/countspongebob), AWS<br>|* [Slack](https://kubernetes.slack.com/messages/sig-scalability)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-scale)|* Regular SIG Meeting: [Thursdays at 16:30 UTC (bi-weekly)](https://zoom.us/j/989573207)<br>
|
||||
>>>>>>> upstream/master
|
||||
|[Scheduling](sig-scheduling/README.md)|scheduling|* [Bobby (Babak) Salamat](https://github.com/bsalamat), Google<br>* [Klaus Ma](https://github.com/k82cn), IBM<br>|* [Slack](https://kubernetes.slack.com/messages/sig-scheduling)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-scheduling)|* Regular SIG Meeting: [Thursdays at 20:00 UTC (biweekly)](https://zoom.us/j/7767391691)<br>
|
||||
|[Service Catalog](sig-service-catalog/README.md)|service-catalog|* [Paul Morie](https://github.com/pmorie), Red Hat<br>* [Aaron Schlesinger](https://github.com/arschles), Microsoft<br>* [Ville Aikas](https://github.com/vaikas-google), Google<br>* [Doug Davis](https://github.com/duglin), IBM<br>|* [Slack](https://kubernetes.slack.com/messages/sig-service-catalog)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-service-catalog)|* Regular SIG Meeting: [Mondays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/j/7201225346)<br>
|
||||
|[Storage](sig-storage/README.md)|storage|* [Saad Ali](https://github.com/saad-ali), Google<br>* [Bradley Childs](https://github.com/childsb), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/sig-storage)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-storage)|* Regular SIG Meeting: [Thursdays at 9:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/614261834)<br>
|
||||
|[Testing](sig-testing/README.md)|testing|* [Aaron Crickenberger](https://github.com/spiffxp), Samsung SDS<br>* [Erick Feja](https://github.com/fejta), Google<br>* [Steve Kuznetsov](https://github.com/stevekuznetsov), Red Hat<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-testing)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-testing)|* Regular SIG Meeting: [Tuesdays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/my/k8s.sig.testing)<br>* (testing-commons) Testing Commons: [Wednesdays at 07:30 PT (Pacific Time) (bi-weekly)](https://zoom.us/my/k8s.sig.testing)<br>
|
||||
|[Testing](sig-testing/README.md)|testing|* [Aaron Crickenberger](https://github.com/spiffxp)<br>* [Erick Feja](https://github.com/fejta), Google<br>* [Steve Kuznetsov](https://github.com/stevekuznetsov), Red Hat<br>* [Timothy St. Clair](https://github.com/timothysc), Heptio<br>|* [Slack](https://kubernetes.slack.com/messages/sig-testing)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-testing)|* Regular SIG Meeting: [Tuesdays at 13:00 PT (Pacific Time) (weekly)](https://zoom.us/my/k8s.sig.testing)<br>* (testing-commons) Testing Commons: [Wednesdays at 07:30 PT (Pacific Time) (bi-weekly)](https://zoom.us/my/k8s.sig.testing)<br>
|
||||
|[UI](sig-ui/README.md)|ui|* [Dan Romlein](https://github.com/danielromlein), Google<br>* [Sebastian Florek](https://github.com/floreks), Fujitsu<br>|* [Slack](https://kubernetes.slack.com/messages/sig-ui)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-ui)|* Regular SIG Meeting: [Thursdays at 18:00 CET (Central European Time) (weekly)](https://groups.google.com/forum/#!forum/kubernetes-sig-ui)<br>
|
||||
|[VMware](sig-vmware/README.md)|vmware|* [Fabio Rapposelli](https://github.com/frapposelli), VMware<br>* [Steve Wong](https://github.com/cantbewong), VMware<br>|* [Slack](https://kubernetes.slack.com/messages/sig-vmware)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-vmware)|* Regular SIG Meeting: [Thursdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/183662780)<br>
|
||||
|[VMware](sig-vmware/README.md)|vmware|* [Fabio Rapposelli](https://github.com/frapposelli), VMware<br>* [Steve Wong](https://github.com/cantbewong), VMware<br>|* [Slack](https://kubernetes.slack.com/messages/sig-vmware)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-vmware)|* Regular SIG Meeting: [Thursdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/183662780)<br>* Cloud Provider vSphere weekly syncup: [Wednesdays at 16:30 UTC (weekly)](https://zoom.us/j/584244729)<br>
|
||||
|[Windows](sig-windows/README.md)|windows|* [Michael Michael](https://github.com/michmike), Apprenda<br>* [Patrick Lang](https://github.com/patricklang), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/sig-windows)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-windows)|* Regular SIG Meeting: [Tuesdays at 12:30 Eastern Standard Time (EST) (weekly)](https://zoom.us/my/sigwindows)<br>
|
||||
|
||||
### Master Working Group List
|
||||
|
|
@ -58,10 +65,10 @@ When the need arises, a [new SIG can be created](sig-creation-procedure.md)
|
|||
| Name | Organizers | Contact | Meetings |
|
||||
|------|------------|---------|----------|
|
||||
|[App Def](wg-app-def/README.md)|* [Antoine Legrand](https://github.com/ant31), CoreOS<br>* [Bryan Liles](https://github.com/bryanl), Heptio<br>* [Gareth Rushgrove](https://github.com/garethr), Docker<br>|* [Slack](https://kubernetes.slack.com/messages/wg-app-def)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-app-def)|* Regular WG Meeting: [Wednesdays at 16:00 UTC (bi-weekly)](https://zoom.us/j/748123863)<br>
|
||||
|[Apply](wg-apply/README.md)|* [Daniel Smith](https://github.com/lavalamp), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-apply)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-apply)|* Regular WG Meeting: [Tuesdays at 9:30 PT (Pacific Time) (weekly)]()<br>
|
||||
|[Apply](wg-apply/README.md)|* [Daniel Smith](https://github.com/lavalamp), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-apply)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-apply)|* Regular WG Meeting: [Tuesdays at 9:30 PT (Pacific Time) (weekly)](https://zoom.us/my/apimachinery)<br>
|
||||
|[Cloud Provider](wg-cloud-provider/README.md)|* [Sidhartha Mani](https://github.com/wlan0), Caascade Labs<br>* [Jago Macleod](https://github.com/jagosan), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-cloud-provider)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-cloud-provider)|* Regular WG Meeting: [Wednesdays at 10:00 PT (Pacific Time) (weekly)](https://zoom.us/my/cloudprovider)<br>
|
||||
|[Cluster API](wg-cluster-api/README.md)|* [Kris Nova](https://github.com/kris-nova), Heptio<br>* [Robert Bailey](https://github.com/roberthbailey), Google<br>|* [Slack](https://kubernetes.slack.com/messages/cluster-api)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular WG Meeting: [s at ()]()<br>
|
||||
|[Container Identity](wg-container-identity/README.md)|* [Clayton Coleman](https://github.com/smarterclayton), Red Hat<br>* [Greg Gastle](https://github.com/destijl), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-container-identity)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-container-identity)|* Regular WG Meeting: [Tuesdays at 15:00 UTC (bi-weekly (On demand))](TBD)<br>
|
||||
|[Container Identity](wg-container-identity/README.md)|* [Clayton Coleman](https://github.com/smarterclayton), Red Hat<br>* [Greg Castle](https://github.com/destijl), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-container-identity)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-container-identity)|* Regular WG Meeting: [Tuesdays at 15:00 UTC (bi-weekly (On demand))](TBD)<br>
|
||||
|[Kubeadm Adoption](wg-kubeadm-adoption/README.md)|* [Lucas Käldström](https://github.com/luxas), Luxas Labs (occasionally contracting for Weaveworks)<br>* [Justin Santa Barbara](https://github.com/justinsb)<br>|* [Slack](https://kubernetes.slack.com/messages/sig-cluster-lifecycle)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle)|* Regular WG Meeting: [Tuesdays at 18:00 UTC (bi-weekly)](https://zoom.us/j/166836%E2%80%8B624)<br>
|
||||
|[Machine Learning](wg-machine-learning/README.md)|* [Vishnu Kannan](https://github.com/vishh), Google<br>* [Kenneth Owens](https://github.com/kow3ns), Google<br>* [Balaji Subramaniam](https://github.com/balajismaniam), Intel<br>* [Connor Doyle](https://github.com/ConnorDoyle), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-machine-learning)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-machine-learning)|* Regular WG Meeting: [Thursdays at 13:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/4799874685)<br>
|
||||
|[Multitenancy](wg-multitenancy/README.md)|* [David Oppenheimer](https://github.com/davidopp), Google<br>* [Jessie Frazelle](https://github.com/jessfraz), Microsoft<br>|* [Slack](https://kubernetes.slack.com/messages/wg-multitenancy)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-multitenancy)|* Regular WG Meeting: [Wednesdays at 11:00 PT (Pacific Time) (biweekly)](https://zoom.us/my/k8s.sig.auth)<br>
|
||||
|
|
|
|||
|
|
@ -8,7 +8,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
|
|||
-->
|
||||
# Multicluster Special Interest Group
|
||||
|
||||
A Special Interest Group focussed on solving common challenges related to the management of multiple Kubernetes clusters, and applications that exist therein. The SIG will be responsible for designing, discussing, implementing and maintaining API’s, tools and documentation related to multi-cluster administration and application management. This includes not only active automated approaches such as Cluster Federation, but also those that employ batch workflow-style continuous deployment systems like Spinnaker and others. Standalone building blocks for these and other similar systems (for example a cluster registry), and proposed changes to kubernetes core where appropriate will also be in scope.
|
||||
A Special Interest Group focused on solving common challenges related to the management of multiple Kubernetes clusters, and applications that exist therein. The SIG will be responsible for designing, discussing, implementing and maintaining API’s, tools and documentation related to multi-cluster administration and application management. This includes not only active automated approaches such as Cluster Federation, but also those that employ batch workflow-style continuous deployment systems like Spinnaker and others. Standalone building blocks for these and other similar systems (for example a cluster registry), and proposed changes to kubernetes core where appropriate will also be in scope.
|
||||
|
||||
## Meetings
|
||||
* Regular SIG Meeting: [Tuesdays at 9:30 PT (Pacific Time)](https://zoom.us/my/k8s.mc) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9:30&tz=PT%20%28Pacific%20Time%29).
|
||||
|
|
|
|||
|
|
@ -19,7 +19,7 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
|
|||
### Chairs
|
||||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Microsoft
|
||||
* Jaice Singer DuMars (**[@jdumars](https://github.com/jdumars)**), Google
|
||||
* Caleb Miles (**[@calebamiles](https://github.com/calebamiles)**), Google
|
||||
|
||||
## Contact
|
||||
|
|
|
|||
|
|
@ -23,7 +23,7 @@ For more details about our objectives please review our [Scaling And Performance
|
|||
The Chairs of the SIG run operations and processes governing the SIG.
|
||||
|
||||
* Wojciech Tyczynski (**[@wojtek-t](https://github.com/wojtek-t)**), Google
|
||||
* Bob Wise (**[@countspongebob](https://github.com/countspongebob)**), Samsung SDS
|
||||
* Bob Wise (**[@countspongebob](https://github.com/countspongebob)**), AWS
|
||||
|
||||
## Contact
|
||||
* [Slack](https://kubernetes.slack.com/messages/sig-scalability)
|
||||
|
|
@ -62,32 +62,17 @@ Monitor these for Github activity if you are not a member of the team.
|
|||
|
||||
<!-- BEGIN CUSTOM CONTENT -->
|
||||
## Upcoming 2018 Meeting Dates
|
||||
* 1/18
|
||||
* 2/1
|
||||
* 2/15
|
||||
* 3/1
|
||||
* 3/15
|
||||
* 3/29
|
||||
* 4/12
|
||||
* 4/26
|
||||
* 5/10
|
||||
* 5/24
|
||||
* 6/7
|
||||
* 6/21
|
||||
* 7/5
|
||||
* 7/19
|
||||
* 8/2
|
||||
* 8/16
|
||||
* 8/30
|
||||
* 9/13
|
||||
* 9/27
|
||||
|
||||
## Scalability SLOs
|
||||
## Scalability/performance SLIs and SLOs
|
||||
|
||||
We officially support two different SLOs:
|
||||
|
||||
1. "API-responsiveness":
|
||||
99% of all API calls return in less than 1s
|
||||
|
||||
1. "Pod startup time:
|
||||
99% of pods (with pre-pulled images) start within 5s
|
||||
|
||||
This should be valid on appropriate hardware up to a 5000 node cluster with 30 pods/node. We eventually want to expand that to 100 pods/node.
|
||||
|
||||
For more details how do we measure those, you can look at: http://blog.kubernetes.io/2015_09_01_archive.html
|
||||
|
||||
We are working on refining existing SLOs and defining more for other areas of the system.
|
||||
Check out [SLIs/SLOs page](./slos/slos.md).
|
||||
<!-- END CUSTOM CONTENT -->
|
||||
|
|
|
|||
|
|
@ -37,4 +37,4 @@ This document is a compilation of some interesting scalability/performance regre
|
|||
- On many occasions our scalability tests caught critical/risky bugs which were missed by most other tests. If not caught, those could've seriously jeopardized production-readiness of k8s.
|
||||
- SIG-Scalability has caught/fixed several important issues that span across various components, features and SIGs.
|
||||
- Around 60% of times (possibly even more), we catch scalability regressions with just our medium-scale (and fast) tests, i.e gce-100 and kubemark-500. Making them run as presubmits should act as a strong shield against regressions.
|
||||
- Majority of the remaining ones are caught by our large-scale (and slow) tests, i.e kubemark-5k and gce-2k. Making them as post-submit blokcers (given they're "usually" quite healthy) should act as a second layer of protection against regressions.
|
||||
- Majority of the remaining ones are caught by our large-scale (and slow) tests, i.e kubemark-5k and gce-2k. Making them as post-submit blockers (given they're "usually" quite healthy) should act as a second layer of protection against regressions.
|
||||
|
|
|
|||
|
|
@ -1,196 +0,0 @@
|
|||
# API-machinery SLIs and SLOs
|
||||
|
||||
The document was converted from [Google Doc]. Please refer to the original for
|
||||
extended commentary and discussion.
|
||||
|
||||
## Background
|
||||
|
||||
Scalability is an important aspect of the Kubernetes. However, Kubernetes is
|
||||
such a large system that we need to manage users expectations in this area.
|
||||
To achieve it, we are in process of redefining what does it mean that
|
||||
Kubernetes supports X-node clusters - this doc describes the high-level
|
||||
proposal. In this doc we are describing API-machinery related SLIs we would
|
||||
like to introduce and suggest which of those should eventually have a
|
||||
corresponding SLO replacing current "99% of API calls return in under 1s" one.
|
||||
|
||||
The SLOs we are proposing in this doc are our goal - they may not be currently
|
||||
satisfied. As a result, while in the future we would like to block the release
|
||||
when we are violating SLOs, we first need to understand where exactly we are
|
||||
now, define and implement proper tests and potentially improve the system.
|
||||
Only once this is done, we may try to introduce a policy of blocking the
|
||||
release on SLO violation. But this is out of scope of this doc.
|
||||
|
||||
|
||||
### SLIs and SLOs proposal
|
||||
|
||||
Below we introduce all SLIs and SLOs we would like to have in the api-machinery
|
||||
area. A bunch of those are not easy to understand for users, as they are
|
||||
designed for developers or performance tracking of higher level
|
||||
user-understandable SLOs. The user-oriented one (which we want to publicly
|
||||
announce) are additionally highlighted with bold.
|
||||
|
||||
### Prerequisite
|
||||
|
||||
Kubernetes cluster is available and serving.
|
||||
|
||||
### Latency<sup>[1](#footnote1)</sup> of API calls for single objects
|
||||
|
||||
__***SLI1: Non-streaming API calls for single objects (POST, PUT, PATCH, DELETE,
|
||||
GET) latency for every (resource, verb) pair, measured as 99th percentile over
|
||||
last 5 minutes***__
|
||||
|
||||
__***SLI2: 99th percentile for (resource, verb) pairs \[excluding virtual and
|
||||
aggregated resources and Custom Resource Definitions\] combined***__
|
||||
|
||||
__***SLO: In default Kubernetes installation, 99th percentile of SLI2
|
||||
per cluster-day<sup>[2](#footnote2)</sup> <= 1s***__
|
||||
|
||||
User stories:
|
||||
- As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
|
||||
response from an API call.
|
||||
- As an administrator of Kubernetes cluster, if I know characteristics of my
|
||||
external dependencies of apiserver (e.g custom admission plugins, webhooks and
|
||||
initializers) I want to be able to provide guarantees for API calls latency to
|
||||
users of my cluster
|
||||
|
||||
Background:
|
||||
- We obviously can’t give any guarantee in general, because cluster
|
||||
administrators are allowed to register custom admission plugins, webhooks
|
||||
and/or initializers, which we don’t have any control about and they obviously
|
||||
impact API call latencies.
|
||||
- As a result, we define the SLIs to be very generic (no matter how your
|
||||
cluster is set up), but we provide SLO only for default installations (where we
|
||||
have control over what apiserver is doing). This doesn’t provide a false
|
||||
impression, that we provide guarantee no matter how the cluster is setup and
|
||||
what is installed on top of it.
|
||||
- At the same time, API calls are part of pretty much every non-trivial workflow
|
||||
in Kubernetes, so this metric is a building block for less trivial SLIs and
|
||||
SLOs.
|
||||
|
||||
Other notes:
|
||||
- The SLO has to be satisfied independently from from the used encoding. This
|
||||
makes the mix of client important while testing. However, we assume that all
|
||||
`core` components communicate with apiserver with protocol buffers (otherwise
|
||||
the SLO doesn’t have to be satisfied).
|
||||
- In case of GET requests, user has an option to opt-in for accepting
|
||||
potentially stale data (the request is then served from cache and not hitting
|
||||
underlying storage). However, the SLO has to be satisfied even if all requests
|
||||
ask for up-to-date data, which again makes careful choice of requests in tests
|
||||
important while testing.
|
||||
|
||||
|
||||
### Latency of API calls for multiple objects
|
||||
|
||||
__***SLI1: Non-streaming API calls for multiple objects (LIST) latency for
|
||||
every (resource, verb) pair, measure as 99th percentile over last 5 minutes***__
|
||||
|
||||
__***SLI2: 99th percentile for (resource, verb) pairs [excluding virtual and
|
||||
aggregated resources and Custom Resource Definitions] combined***__
|
||||
|
||||
__***SLO1: In default Kubernetes installation, 99th percentile of SLI2 per
|
||||
cluster-day***__
|
||||
- __***is <= 1s if total number of objects of the same type as resource in the
|
||||
system <= X***__
|
||||
- __***is <= 5s if total number of objects of the same type as resource in the
|
||||
system <= Y***__
|
||||
- __***is <= 30s if total number of objects of the same types as resource in the
|
||||
system <= Z***__
|
||||
|
||||
User stories:
|
||||
- As a user of vanilla Kubernetes, I want some guarantee how quickly I get the
|
||||
response from an API call.
|
||||
- As an administrator of Kubernetes cluster, if I know characteristics of my
|
||||
external dependencies of apiserver (e.g custom admission plugins, webhooks and
|
||||
initializers) I want to be able to provide guarantees for API calls latency to
|
||||
users of my cluster.
|
||||
|
||||
Background:
|
||||
- On top of arguments from latency of API calls for single objects, LIST
|
||||
operations are crucial part of watch-related frameworks, which in turn are
|
||||
responsible for overall system performance and responsiveness.
|
||||
- The above SLO is user-oriented and may have significant buffer in threshold.
|
||||
In fact, the latency of the request should be proportional to the amount of
|
||||
work to do (which in our case is number of objects of a given type (potentially
|
||||
in a requested namespace if specified)) plus some constant overhead. For better
|
||||
tracking of performance, we define the other SLIs which are supposed to be
|
||||
purely internal (developer-oriented)
|
||||
|
||||
|
||||
_SLI3: Non-streaming API calls for multiple objects (LIST) latency minus 1s
|
||||
(maxed with 0) divided by number of objects in the collection
|
||||
<sup>[3](#footnote3)</sup> (which may be many more than the number of returned
|
||||
objects) for every (resource, verb) pair, measured as 99th percentile over
|
||||
last 5 minutes._
|
||||
|
||||
_SLI4: 99th percentile for (resource, verb) pairs [excluding virtual and
|
||||
aggregated resources and Custom Resource Definitions] combined_
|
||||
|
||||
_SLO2: In default Kubernetes installation, 99th percentile of SLI4 per
|
||||
cluster-day <= Xms_
|
||||
|
||||
|
||||
### Watch latency
|
||||
|
||||
_SLI1: API-machinery watch latency (measured from the moment when object is
|
||||
stored in database to when it’s ready to be sent to all watchers), measured
|
||||
as 99th percentile over last 5 minutes_
|
||||
|
||||
_SLO1 (developer-oriented): 99th percentile of SLI1 per cluster-day <= Xms_
|
||||
|
||||
User stories:
|
||||
- As an administrator, if system is slow, I would like to know if the root
|
||||
cause is slow api-machinery or something farther the path (lack of network
|
||||
bandwidth, slow or cpu-starved controllers, ...).
|
||||
|
||||
Background:
|
||||
- Pretty much all control loops in Kubernetes are watch-based, so slow watch
|
||||
means slow system in general. As a result, we want to give some guarantees on
|
||||
how fast it is.
|
||||
- Note that how we measure it, silently assumes no clock-skew in case of HA
|
||||
clusters.
|
||||
|
||||
|
||||
### Admission plugin latency
|
||||
|
||||
_SLI1: Admission latency for each admission plugin type, measured as 99th
|
||||
percentile over last 5 minutes_
|
||||
|
||||
User stories:
|
||||
- As an administrator, if API calls are slow, I would like to know if this is
|
||||
because slow admission plugins and if so which ones are responsible.
|
||||
|
||||
|
||||
### Webhook latency
|
||||
|
||||
_SLI1: Webhook call latency for each webhook type, measured as 99th percentile
|
||||
over last 5 minutes_
|
||||
|
||||
User stories:
|
||||
- As an administrator, if API calls are slow, I would like to know if this is
|
||||
because slow webhooks and if so which ones are responsible.
|
||||
|
||||
|
||||
### Initializer latency
|
||||
|
||||
_SLI1: Initializer latency for each initializer, measured as 99th percentile
|
||||
over last 5 minutes_
|
||||
|
||||
User stories:
|
||||
- As an administrator, if API calls are slow, I would like to know if this is
|
||||
because of slow initializers and if so which ones are responsible.
|
||||
|
||||
---
|
||||
<a name="footnote1">\[1\]</a>By latency of API call in this doc we mean time
|
||||
from the moment when apiserver gets the request to last byte of response sent
|
||||
to the user.
|
||||
|
||||
<a name="footnote2">\[2\]</a> For the purpose of visualization it will be a
|
||||
sliding window. However, for the purpose of reporting the SLO, it means one
|
||||
point per day (whether SLO was satisfied on a given day or not).
|
||||
|
||||
<a name="footnote3">\[3\]</a>A collection contains: (a) all objects of that
|
||||
type for cluster-scoped resources, (b) all object of that type in a given
|
||||
namespace for namespace-scoped resources.
|
||||
|
||||
|
||||
[Google Doc]: https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw/edit#
|
||||
|
|
@ -0,0 +1,47 @@
|
|||
## API call latency SLIs/SLOs details
|
||||
|
||||
### User stories
|
||||
- As a user of vanilla Kubernetes, I want some guarantee of how quickly I get the
|
||||
response from an API call.
|
||||
- As an administrator of a Kubernetes cluster, if I know the characteristics of my
|
||||
external dependencies of the apiserver (e.g. custom admission plugins, webhooks and
|
||||
initializers), I want to be able to provide guarantees for API call latency to
|
||||
users of my cluster.
|
||||
|
||||
### Other notes
|
||||
- We obviously can’t give any guarantee in general, because cluster
|
||||
administrators are allowed to register custom admission plugins, webhooks
|
||||
and/or initializers, over which we have no control and which obviously
|
||||
impact API call latencies.
|
||||
- As a result, we define the SLIs to be very generic (no matter how your
|
||||
cluster is set up), but we provide an SLO only for default installations (where we
|
||||
have control over what the apiserver is doing). This avoids giving the false
|
||||
impression that we provide guarantees no matter how the cluster is set up and
|
||||
what is installed on top of it.
|
||||
- At the same time, API calls are part of pretty much every non-trivial workflow
|
||||
in Kubernetes, so this metric is a building block for less trivial SLIs and
|
||||
SLOs.
|
||||
- The SLO for latency of read-only API calls of a given type may have a significant
|
||||
buffer in its threshold. In fact, the latency of the request should be proportional to
|
||||
the amount of work to do (which is the number of objects of a given type in a given
|
||||
scope) plus some constant overhead. For better tracking of performance, we
|
||||
may want to define a purely internal SLI of "latency per object". But that
|
||||
isn't in the near-term plans.
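
To make the aggregation behind these SLIs concrete, here is a minimal, illustrative Go sketch (not part of any Kubernetes component) that computes a 99th-percentile latency per (resource, verb) pair over a 5-minute sliding window from hypothetical request samples; in a real cluster this data would typically come from apiserver metrics rather than raw samples.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// sample is a hypothetical record of a single finished API request.
type sample struct {
	resource string
	verb     string
	when     time.Time
	latency  time.Duration
}

// percentile returns the p-th percentile (0 < p <= 100) of the given latencies
// using the nearest-rank method.
func percentile(latencies []time.Duration, p float64) time.Duration {
	if len(latencies) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), latencies...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(float64(len(sorted))*p/100.0+0.5) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

// sliPerPair computes, for each (resource, verb) pair, the 99th percentile of
// request latency over the last `window` (e.g. 5 minutes).
func sliPerPair(samples []sample, now time.Time, window time.Duration) map[[2]string]time.Duration {
	buckets := map[[2]string][]time.Duration{}
	for _, s := range samples {
		if now.Sub(s.when) <= window {
			key := [2]string{s.resource, s.verb}
			buckets[key] = append(buckets[key], s.latency)
		}
	}
	out := map[[2]string]time.Duration{}
	for key, lats := range buckets {
		out[key] = percentile(lats, 99)
	}
	return out
}

func main() {
	now := time.Now()
	samples := []sample{
		{"pods", "GET", now.Add(-time.Minute), 80 * time.Millisecond},
		{"pods", "GET", now.Add(-2 * time.Minute), 120 * time.Millisecond},
		{"pods", "POST", now.Add(-30 * time.Second), 300 * time.Millisecond},
	}
	for pair, p99 := range sliPerPair(samples, now, 5*time.Minute) {
		fmt.Printf("%s %s: p99=%v\n", pair[1], pair[0], p99)
	}
}
```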
|
||||
|
||||
### Caveats
|
||||
- The SLO has to be satisfied independently of the encoding used in user-originated
|
||||
requests. This makes the mix of clients important while testing. However, we assume
|
||||
that all `core` components communicate with the apiserver using protocol buffers.
|
||||
- In the case of GET requests, the user has the option to opt in to accepting potentially
|
||||
stale data (served from the cache), and the SLO again has to be satisfied
|
||||
independently of that. This makes the careful choice of requests in tests
|
||||
important.
|
||||
|
||||
### TODOs
|
||||
- We may consider treating `non-namespaced` resources as a separate bucket in
|
||||
the future. However, it may not make sense if their number turns out to be
|
||||
comparable to the number of `namespaced` ones.
|
||||
|
||||
### Test scenario
|
||||
|
||||
__TODO: Describe test scenario.__
|
||||
|
|
@ -0,0 +1,6 @@
|
|||
## API call extension points latency SLIs details
|
||||
|
||||
### User stories
|
||||
- As an administrator, if API calls are slow, I would like to know if this is
|
||||
because of slow extension points (admission plugins, webhooks, initializers) and,
|
||||
if so, which ones are responsible.
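
A minimal sketch of how such attribution could look, assuming hypothetical per-extension-point latency samples are collected somewhere: rank the extension points by their worst recent latency.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// slowestExtensions ranks extension points (admission plugins, webhooks,
// initializers) by their maximum observed latency, worst first.
func slowestExtensions(samples map[string][]time.Duration) []string {
	names := make([]string, 0, len(samples))
	worst := make(map[string]time.Duration, len(samples))
	for name, lats := range samples {
		names = append(names, name)
		for _, l := range lats {
			if l > worst[name] {
				worst[name] = l
			}
		}
	}
	sort.Slice(names, func(i, j int) bool { return worst[names[i]] > worst[names[j]] })
	return names
}

func main() {
	// Hypothetical plugin/webhook names and samples, for illustration only.
	samples := map[string][]time.Duration{
		"NamespaceLifecycle":    {2 * time.Millisecond, 3 * time.Millisecond},
		"my-validating-webhook": {150 * time.Millisecond, 900 * time.Millisecond},
	}
	fmt.Println("slowest first:", slowestExtensions(samples))
}
```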
|
||||
|
|
@ -1,72 +0,0 @@
|
|||
# Extended Kubernetes scalability SLOs
|
||||
|
||||
## Goal
|
||||
The goal of this effort is to extend SLOs which Kubernetes cluster has to meet to support given number of Nodes. As of April 2017 we have only two SLOs:
|
||||
- API-responsiveness: 99% of all API calls return in less than 1s
|
||||
- Pod startup time: 99% of Pods (with pre-pulled images) start within 5s
|
||||
which are enough to guarantee that cluster doesn't feel completely dead, but not enough to guarantee that it satisfies user's needs.
|
||||
|
||||
We're going to define more SLOs based on most important indicators, and standardize the format in which we speak about our objectives. Our SLOs need to have two properties:
|
||||
- They need to be testable, i.e. we need to have a benchmark to measure if it's met,
|
||||
- They need to be expressed in a way that's possible to understand by a user not intimately familiar with the system internals, i.e. formulation can't depend on some arcane knowledge.
|
||||
|
||||
On the other hand we do not require that:
|
||||
- SLOs are possible to monitor in a running cluster, i.e. not all SLOs need to be easily translatable to SLAs. Being able to benchmark is enough for us.
|
||||
|
||||
## Split metrics from environment
|
||||
Currently what me measure and how we measure it is tightly coupled. This means that we don't have good environmental constraint suggestions for users (e.g. how many Pods per Namespace we support, how many Endpoints per Service, how to setup the cluster etc.). We need to decide on what's reasonable and make the environment explicit.
|
||||
|
||||
## Split SLOs by kind
|
||||
Current SLOs implicitly assume that the cluster is in a "steady state". By this we mean that we assume that there's only some, limited, number of things going during benchmarking. We need to make this assumption explicit and split SLOs into two categories: steady-state SLOs and burst SLOs.
|
||||
|
||||
## Steady state SLOs
|
||||
With steady state SLO we want to give users the data about system's behavior during normal operation. We define steady state by limiting the churn on the cluster.
|
||||
|
||||
This includes current SLOs:
|
||||
- API call latency
|
||||
- E2e Pod startup latency
|
||||
|
||||
By churn we understand a measure of amount changes happening in the cluster. Its formal(-ish) definition will follow, but informally it can be thought about as number of user-issued requests per second plus number of pods affected by those requests.
|
||||
|
||||
More formally churn per second is defined as:
|
||||
```
|
||||
#Pod creations + #PodSpec updates + #user originated requests in a given second
|
||||
```
|
||||
The last part is necessary only to get rid of situations when user is spamming API server with various requests. In ordinary circumstances we expect it to be in the order of 1-2.
|
||||
|
||||
## Burst SLOs
|
||||
With burst SLOs we want to give user idea on how system behaves under the heavy load, i.e. when one want the system to do something as quickly as possible, not caring too much about response time for a single request. Note that this voids all steady-state SLOs.
|
||||
|
||||
This includes the new SLO:
|
||||
- Pod startup throughput
|
||||
|
||||
## Environment
|
||||
A Kubernetes cluster in which we benchmark SLOs needs to meet the following criteria:
|
||||
- Run a single appropriately sized master machine
|
||||
- Main etcd runs as a single instance on the master machine
|
||||
- Events are stored in a separate etcd instance running on the master machine
|
||||
- Kubernetes version is at least 1.X.Y
|
||||
- Components configuration = _?_
|
||||
|
||||
_TODO: NEED AN HA CONFIGURATION AS WELL_
|
||||
|
||||
## SLO template
|
||||
All our performance SLOs should be defined using the following template:
|
||||
|
||||
---
|
||||
|
||||
# SLO: *TL;DR description of the SLO*
|
||||
## (Burst|Steady state) foo bar SLO
|
||||
|
||||
### Summary
|
||||
_One-two sentences describing the SLO, that's possible to understand by the majority of the community_
|
||||
|
||||
### User Stories
|
||||
_A Few user stories showing in what situations users might be interested in this SLO, and why other ones are not enough_
|
||||
|
||||
## Full definition
|
||||
### Test description
|
||||
_Precise description of test scenario, including maximum number of Pods per Controller, objects per namespace, and anything else that even remotely seems important_
|
||||
|
||||
### Formal definition (can be skipped if the same as title/summary)
|
||||
_Precise and as formal as possible definition of SLO. This does not necessarily need to be easily understandable by layman_
|
||||
|
|
@ -0,0 +1,54 @@
|
|||
## Pod startup latency SLI/SLO details
|
||||
|
||||
### User stories
|
||||
- As a user of vanilla Kubernetes, I want some guarantee of how quickly my pods
|
||||
will be started.
|
||||
|
||||
### Other notes
|
||||
- Only schedulable and stateless pods contribute to the SLI:
|
||||
- If there is no space in the cluster to place the pod, there is not much
|
||||
we can do about it (that is a task for the Cluster Autoscaler, which should have
|
||||
separate SLIs/SLOs).
|
||||
- If placing a pod requires preempting other pods, that may heavily depend
|
||||
on the application (e.g. on their graceful termination period). We don't
|
||||
want that to contribute to this SLI.
|
||||
- Mounting disks required by non-stateless pods may potentially also require
|
||||
non-negligible time, not fully dependent on Kubernetes.
|
||||
- We are explicitly excluding image pulling from the SLI. This is
|
||||
because it highly depends on the locality of the image, the image registry's performance
|
||||
characteristics (e.g. throughput), the image size itself, etc. Since we have
|
||||
no control over any of those (and all of those would significantly affect the SLI),
|
||||
we decided to simply exclude it.
|
||||
- We are also explicitly excluding time to run init containers, as, again, this
|
||||
is heavily application-dependent (and doesn't depend on Kubernetes itself).
|
||||
- The answer to the question of when a pod should be considered started is also
|
||||
not obvious. We settled on the semantics of "when all its containers are
|
||||
reported as started and observed via watch", because:
|
||||
- we require all containers to be started (not e.g. the first one) to ensure
|
||||
that the pod is started. We need to ensure that potential regressions like
|
||||
linearization of container startups within a pod will be caught by this SLI.
|
||||
- note that we don't require all containers to be running - if some of them
|
||||
finished before the last one was started, that is also fine. It is just
|
||||
required that all of them have been started (at least once).
|
||||
- we don't want to rely on "readiness checks", because they heavily
|
||||
depend on the application. If the application takes a couple of minutes to
|
||||
initialize before it starts responding to readiness checks, that shouldn't
|
||||
count towards Kubernetes performance.
|
||||
- even if your application has started, many control loops in Kubernetes will
|
||||
not fire before they observe that. If the Kubelet is not able to report
|
||||
the status for some reason, other parts of the system will not have
|
||||
a way to learn about it - this is why the reporting part is so important
|
||||
here.
|
||||
- since watch is so central to Kubernetes (and many control loops are
|
||||
triggered by specific watch events), observing the status of the pod is
|
||||
also part of the SLI (as this is the moment when the next control loops
|
||||
can potentially be fired).
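
As an illustration of the chosen semantics, below is a minimal Go sketch (with hypothetical types, not the actual measurement code) that treats a pod as started once the last of its containers has been observed as started via watch, measuring from the pod creation timestamp; the image-pull and init-container exclusions described above are omitted for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// podRecord is a hypothetical view of a single pod, assembled from watch events.
type podRecord struct {
	name       string
	createdAt  time.Time   // pod creation timestamp
	observedAt []time.Time // moment each container was observed (via watch) as started
}

// startupLatency returns the pod startup latency as defined above, and false if
// not every container has been observed as started yet.
func startupLatency(p podRecord) (time.Duration, bool) {
	if len(p.observedAt) == 0 {
		return 0, false
	}
	last := p.observedAt[0]
	for _, t := range p.observedAt[1:] {
		if t.After(last) {
			last = t
		}
	}
	if last.Before(p.createdAt) {
		return 0, false // clock skew or incomplete data; don't count the sample
	}
	return last.Sub(p.createdAt), true
}

func main() {
	created := time.Now().Add(-10 * time.Second)
	p := podRecord{
		name:      "example",
		createdAt: created,
		observedAt: []time.Time{
			created.Add(3 * time.Second),
			created.Add(4 * time.Second), // the last container observed as started
		},
	}
	if lat, ok := startupLatency(p); ok {
		fmt.Printf("pod %s startup latency: %v\n", p.name, lat)
	}
}
```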
|
||||
|
||||
### TODOs
|
||||
- We should try to provide guarantees for non-stateless pods (the threshold
|
||||
may be higher for them though).
|
||||
- Revisit whether we want "watch pod status" part to be included in the SLI.
|
||||
|
||||
### Test scenario
|
||||
|
||||
__TODO: Describe test scenario.__
|
||||
|
|
@ -0,0 +1,148 @@
|
|||
# Kubernetes scalability and performance SLIs/SLOs
|
||||
|
||||
## What does Kubernetes guarantee?
|
||||
|
||||
One of the important aspects of Kubernetes is its scalability and performance
|
||||
characteristics. As a Kubernetes user or an operator/administrator of a cluster,
|
||||
you would expect to have some guarantees in those areas.
|
||||
|
||||
The goal of this doc is to organize the guarantees that Kubernetes provides
|
||||
in these areas.
|
||||
|
||||
## What do we require from SLIs/SLOs?
|
||||
|
||||
We are going to define more SLIs and SLOs based on the most important indicators
|
||||
in the system.
|
||||
|
||||
Our SLOs need to have the following properties:
|
||||
- <b> They need to be testable </b> <br/>
|
||||
That means that we need to have a benchmark to measure if it's met.
|
||||
- <b> They need to be understandable for users </b> <br/>
|
||||
In particular, they need to be understandable for people not familiar
|
||||
with the system internals, i.e. their formulation can't depend on some
|
||||
arcane knowledge.
|
||||
|
||||
However, we may introduce some internal (for developers only) SLIs that
|
||||
may be useful for understanding the performance characteristics of the system,
|
||||
but for which we don't provide any guarantees to users and which thus may not
|
||||
be fully understandable to users.
|
||||
|
||||
On the other hand, we do NOT require that our SLOs:
|
||||
- are measurable in a running cluster (though that's desired if possible) <br/>
|
||||
In other words, not all SLOs need to be easily translatable to SLAs.
|
||||
Being able to benchmark is enough for us.
|
||||
|
||||
## Types of SLOs
|
||||
|
||||
While SLIs are very generic and don't really depend on anything (they just
|
||||
define what and how we measure), the same is not true for SLOs.
|
||||
SLOs provide guarantees, and satisfying them may depend on meeting some
|
||||
specific requirements.
|
||||
|
||||
As a result, we build our SLOs in a "you promise, we promise" format.
|
||||
That means that we provide you a guarantee only if you satisfy the requirements
|
||||
that we put on you.
|
||||
|
||||
As a consequence, we introduce two types of SLOs.
|
||||
|
||||
### Steady state SLOs
|
||||
|
||||
With steady state SLOs, we provide guarantees about the system's behavior during
|
||||
normal operation. We are able to provide many more guarantees in that situation.
|
||||
|
||||
```Definition
|
||||
We define system to be in steady state when the cluster churn per second is <= 20, where
|
||||
|
||||
churn = #(Pod spec creations/updates/deletions) + #(user originated requests) in a given second
|
||||
```
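
A minimal sketch, assuming hypothetical per-second counters collected elsewhere, of how this definition could be evaluated and checked against the steady state limit of 20:

```go
package main

import "fmt"

// secondCounters holds hypothetical counts observed during a single second.
type secondCounters struct {
	podSpecChanges int // Pod spec creations + updates + deletions
	userRequests   int // user-originated API requests
}

// churn implements the definition above for one second.
func churn(c secondCounters) int {
	return c.podSpecChanges + c.userRequests
}

// inSteadyState reports whether every observed second stayed at or below the limit.
func inSteadyState(seconds []secondCounters, limit int) bool {
	for _, c := range seconds {
		if churn(c) > limit {
			return false
		}
	}
	return true
}

func main() {
	window := []secondCounters{
		{podSpecChanges: 5, userRequests: 3},
		{podSpecChanges: 12, userRequests: 6},
	}
	fmt.Println("steady state:", inSteadyState(window, 20)) // true: max churn is 18
}
```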
|
||||
|
||||
### Burst SLOs
|
||||
|
||||
With burst SLOs, we provide guarantees on how the system behaves under heavy load
|
||||
(when the user wants the system to do something as quickly as possible, not caring too
|
||||
much about the response time of a single request).
|
||||
|
||||
## Environment
|
||||
|
||||
In order to meet the SLOs, the system must run in an environment satisfying
|
||||
the following criteria:
|
||||
- Runs one or more appropriately sized master machines
|
||||
- Main etcd running on master machine(s)
|
||||
- Events are stored in a separate etcd running on the master machine(s)
|
||||
- Kubernetes version is at least X.Y.Z
|
||||
- ...
|
||||
|
||||
__TODO: Document other necessary configuration.__
|
||||
|
||||
## Thresholds
|
||||
|
||||
To make the cluster eligible for the SLOs, users also can't have too many objects in
|
||||
their clusters. More concretely, the number of different objects in the cluster
|
||||
MUST satisfy the thresholds defined in the [thresholds file][].
|
||||
|
||||
[thresholds file]: https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md
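
As a rough, illustrative sketch of what such an eligibility check could look like (the resource names and limit values below are placeholders, not the actual thresholds):

```go
package main

import "fmt"

// eligibleForSLO reports whether every observed object count stays within its
// threshold. Resources without an explicit threshold are ignored here.
func eligibleForSLO(counts, thresholds map[string]int) (bool, []string) {
	var violations []string
	for resource, limit := range thresholds {
		if counts[resource] > limit {
			violations = append(violations, resource)
		}
	}
	return len(violations) == 0, violations
}

func main() {
	// Placeholder thresholds for illustration only.
	thresholds := map[string]int{"pods": 150000, "services": 10000, "namespaces": 10000}
	counts := map[string]int{"pods": 90000, "services": 12000, "namespaces": 400}
	ok, bad := eligibleForSLO(counts, thresholds)
	fmt.Println("eligible:", ok, "violations:", bad)
}
```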
|
||||
|
||||
|
||||
## Kubernetes SLIs/SLOs
|
||||
|
||||
The currently existing SLIs/SLOs are enough to guarantee that the cluster isn't
|
||||
completely dead. However, they are not enough to satisfy users' needs in most
|
||||
cases.
|
||||
|
||||
We are looking into extending the set of SLIs/SLOs to cover more parts of
|
||||
Kubernetes.
|
||||
|
||||
```
|
||||
Prerequisite: Kubernetes cluster is available and serving.
|
||||
```
|
||||
|
||||
### Steady state SLIs/SLOs
|
||||
|
||||
| Status | SLI | SLO | User stories, test scenarios, ... |
|
||||
| --- | --- | --- | --- |
|
||||
| __Official__ | Latency<sup>[1](#footnote1)</sup> of mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[3](#footnote3)</sup> <= 1s | [Details](./api_call_latency.md) |
|
||||
| __Official__ | Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[4](#footnote4)</sup> API calls for every (resource, scope<sup>[5](#footnote5)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day (a) <= 1s if `scope=resource` (b) <= 5s if `scope=namespace` (c) <= 30s if `scope=cluster` | [Details](./api_call_latency.md) |
|
||||
| __Official__ | Startup latency of stateless<sup>[6](#footnote6)</sup> and schedulable<sup>[7](#footnote7)</sup> pods, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day <= 5s | [Details](./pod_startup_latency.md) |
|
||||
|
||||
<a name="footnote1">\[1\]</a>By latency of API call in this doc we mean time
|
||||
from the moment when apiserver gets the request to last byte of response sent
|
||||
to the user.
|
||||
|
||||
<a name="footnote2">\[2\]</a>By mutating API calls we mean POST, PUT, DELETE
|
||||
and PATCH.
|
||||
|
||||
<a name="footnote3">\[3\]</a> For the purpose of visualization it will be a
|
||||
sliding window. However, for the purpose of reporting the SLO, it means one
|
||||
point per day (whether SLO was satisfied on a given day or not).
|
||||
|
||||
<a name="footnote4">\[4\]</a>By non-streaming read-only API calls we mean GET
|
||||
requests without the `watch=true` option set. (Note that internally in Kubernetes
|
||||
this translates to both GET and LIST calls.)
|
||||
|
||||
<a name="footnote5">\[5\]</a>A scope of a request can be either (a) `resource`
|
||||
if the request is about a single object, (b) `namespace` if it is about objects
|
||||
from a single namespace, or (c) `cluster` if it spans objects from multiple
|
||||
namespaces.
|
||||
|
||||
<a name="footnode6">[6\]</a>A `stateless pod` is defined as a pod that doesn't
|
||||
mount volumes with sources other than secrets, config maps, downward API and
|
||||
empty dir.
|
||||
|
||||
<a name="footnode7">[7\]</a>By schedulable pod we mean a pod that can be
|
||||
scheduled in the cluster without causing any preemption.
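
To illustrate footnote 5 and how the read-only latency SLO above depends on the request scope, here is a small Go sketch; the `readOnlyRequest` type is a hypothetical simplification, not an actual API type.

```go
package main

import (
	"fmt"
	"time"
)

// readOnlyRequest is a hypothetical description of a non-streaming read-only call.
type readOnlyRequest struct {
	namespace string // empty if the request is not namespaced
	name      string // empty if the request targets a collection
}

// scope classifies a request as described in footnote 5.
func scope(r readOnlyRequest) string {
	switch {
	case r.name != "":
		return "resource" // a single object
	case r.namespace != "":
		return "namespace" // objects from a single namespace
	default:
		return "cluster" // objects spanning multiple namespaces
	}
}

// sloThreshold returns the latency threshold from the steady state SLO table above.
func sloThreshold(s string) time.Duration {
	switch s {
	case "resource":
		return 1 * time.Second
	case "namespace":
		return 5 * time.Second
	default:
		return 30 * time.Second
	}
}

func main() {
	r := readOnlyRequest{namespace: "default"}
	s := scope(r)
	fmt.Printf("scope=%s threshold=%v\n", s, sloThreshold(s)) // scope=namespace threshold=5s
}
```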
|
||||
|
||||
### Burst SLIs/SLOs
|
||||
|
||||
| Status | SLI | SLO | User stories, test scenarios, ... |
|
||||
| --- | --- | --- | --- |
|
||||
| WIP | Time to start 30\*#nodes pods, measured from test scenario start until observing the last Pod as ready | Benchmark: when all images are present on all Nodes, 99th percentile <= X minutes | [Details](./system_throughput.md) |
|
||||
|
||||
### Other SLIs
|
||||
|
||||
| Status | SLI | User stories, ... |
|
||||
| --- | --- | --- |
|
||||
| WIP | Watch latency for every resource (from the moment when the object is stored in the database to when it's ready to be sent to all watchers), measured as 99th percentile over last 5 minutes | TODO |
|
||||
| WIP | Admission latency for each admission plugin type, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |
|
||||
| WIP | Webhook call latency for each webhook type, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |
|
||||
| WIP | Initializer latency for each initializer, measured as 99th percentile over last 5 minutes | [Details](./api_extensions_latency.md) |
|
||||
|
||||
|
|
@ -0,0 +1,28 @@
|
|||
## System throughput SLI/SLO details
|
||||
|
||||
### User stories
|
||||
- As a user, I want a guarantee that my workload of X pods can be started
|
||||
within a given time
|
||||
- As a user, I want to understand how quickly I can react to a dramatic
|
||||
change in workload profile when my workload exhibits very bursty behavior
|
||||
(e.g. a shop during a Black Friday sale).
|
||||
- As a user, I want a guarantee of how quickly I can recreate the whole setup
|
||||
in case of a serious disaster which brings the whole cluster down.
|
||||
|
||||
### Test scenario
|
||||
- Start with a healthy (all nodes ready, all cluster addons already running)
|
||||
cluster with N (>0) running pause pods per node.
|
||||
- Create a number of `Namespaces` and a number of `Deployments` in each of them.
|
||||
- All `Namespaces` should be isomorphic, possibly excluding the last one, which should
|
||||
run all pods that didn't fit in the previous ones.
|
||||
- A single namespace should run 5000 `Pods` in the following configuration:
|
||||
- one big `Deployment` running ~1/3 of all `Pods` from this `namespace`
|
||||
- medium `Deployments`, each with 120 `Pods`, in total running ~1/3 of all
|
||||
`Pods` from this `namespace`
|
||||
- small `Deployments`, each with 10 `Pods`, in total running ~1/3 of all `Pods`
|
||||
from this `Namespace`
|
||||
- Each `Deployment` should be covered by a single `Service`.
|
||||
- Each `Pod` in any `Deployment` contains two pause containers, one `Secret`
|
||||
other than the default `ServiceAccount` secret, and one `ConfigMap`. Additionally, it has
|
||||
resource requests set and doesn't use any advanced scheduling features or
|
||||
init containers.
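
The per-namespace layout described above can be computed mechanically. Below is a small, illustrative Go sketch of one possible split; the exact rounding of the last medium and small `Deployments` is an assumption, not prescribed by the scenario.

```go
package main

import "fmt"

// deploymentLayout splits podsPerNamespace into the big/medium/small thirds
// described above: one big Deployment, medium Deployments of up to 120 Pods,
// and small Deployments of up to 10 Pods.
func deploymentLayout(podsPerNamespace int) (big int, medium []int, small []int) {
	third := podsPerNamespace / 3
	big = podsPerNamespace - 2*third // give any rounding remainder to the big Deployment

	for remaining := third; remaining > 0; {
		size := 120
		if remaining < size {
			size = remaining
		}
		medium = append(medium, size)
		remaining -= size
	}
	for remaining := third; remaining > 0; {
		size := 10
		if remaining < size {
			size = remaining
		}
		small = append(small, size)
		remaining -= size
	}
	return big, medium, small
}

func main() {
	big, medium, small := deploymentLayout(5000)
	fmt.Printf("big: 1 Deployment x %d Pods\n", big)
	fmt.Printf("medium: %d Deployments (up to 120 Pods each)\n", len(medium))
	fmt.Printf("small: %d Deployments (up to 10 Pods each)\n", len(small))
}
```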
|
||||
|
|
@ -1,26 +0,0 @@
|
|||
# SLO: Kubernetes cluster of size at least X is able to start Y Pods in Z minutes
|
||||
**This is a WIP SLO doc - something that we want to meet, but we may not be there yet**
|
||||
|
||||
## Burst Pod Startup Throughput SLO
|
||||
### User Stories
|
||||
- User is running a workload of X total pods and wants to ensure that it can be started in Y time.
|
||||
- User is running a system that exhibits very bursty behavior (e.g. shop during Black Friday Sale) and wants to understand how quickly they can react to a dramatic change in workload profile.
|
||||
- User is running a huge serving app on a huge cluster. He wants to know how quickly he can recreate his whole setup in case of a serious disaster which will bring the whole cluster down.
|
||||
|
||||
Current steady state SLOs are do not provide enough data to make these assessments about burst behavior.
|
||||
## SLO definition (full)
|
||||
### Test setup
|
||||
Standard performance test kubernetes setup, as describe in [the doc](../extending_slo.md#environment).
|
||||
### Test scenario is following:
|
||||
- Start with a healthy (all nodes ready, all cluster addons already running) cluster with N (>0) running pause Pods/Node.
|
||||
- Create a number of Deployments that run X Pods and Namespaces necessary to create them.
|
||||
- All namespaces should be isomorphic, possibly excluding last one which should run all Pods that didn't fit in the previous ones.
|
||||
- Single Namespace should run at most 5000 Pods in the following configuration:
|
||||
- one big Deployment running 1/3 of all Pods from this Namespace (1667 for 5000 Pod Namespace)
|
||||
- medium Deployments, each of which is not running more than 120 Pods, running in total 1/3 of all Pods from this Namespace (14 Deployments with 119 Pods each for 5000 Pod Namespace)
|
||||
- small Deployments, each of which is not running more than 10 Pods, running in total 1/3 of all Pods from this Namespace (238 Deployments with 7 Pods each for 5000 Pod Namespace)
|
||||
- Each Deployment is covered by a single Service.
|
||||
- Each Pod in any Deployment contains two pause containers, one secret other than ServiceAccount and one ConfigMap, has resource request set and doesn't use any advanced scheduling features (Affinities, etc.) or init containers.
|
||||
- Measure the time between starting the test and moment when last Pod is started according to it's Kubelet. Note that pause container is ready just after it's started, which may not be true for more complex containers that use nontrivial readiness probes.
|
||||
### Definition
|
||||
Kubernetes cluster of size at least X adhering to the environment definition, when running the specified test, 99th percentile of time necessary to start Y pods from the time when user created all controllers to the time when Kubelet starts the last Pod from the set is no greater than Z minutes, assuming that all images are already present on all Nodes.
|
||||