Create Reliability WG charter

wojtekt 2020-09-25 10:30:44 +02:00
parent c038f43a4c
commit 74a24a8974
3 changed files with 101 additions and 0 deletions


@@ -2791,6 +2791,7 @@ workinggroups:
Allow users to safely use Kubernetes for managing production workloads by ensuring
Kubernetes is stable and reliable.
charter_link: charter.md
stakeholder_sigs:
- Architecture
- Cluster Lifecycle


@ -10,6 +10,8 @@ To understand how this file is generated, see https://git.k8s.io/community/gener
Allow users to safely use Kubernetes for managing production workloads by ensuring Kubernetes is stable and reliable.
The [charter](charter.md) defines the scope and governance of the Reliability Working Group.
## Stakeholder SIGs
* SIG Architecture
* SIG Cluster Lifecycle

wg-reliability/charter.md Normal file

@ -0,0 +1,98 @@
# WG Reliability Charter
This charter adheres to the conventions described in the [Kubernetes Charter README]
and uses the Roles and Organization Management outlined in [sig-governance].
[sig-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance.md
[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md
## Scope
The Reliability Working Group (WG Reliability) is organized with the goal of
allowing users to safely use Kubernetes for managing production workloads by
ensuring Kubernetes is stable and reliable.
### In Scope
- Defining what reliability means for Kubernetes and how to measure it
- Measuring Kubernetes reliability in tests
- Introducing criteria for blocking the release if the reliability is
below the bar
- Building a list of end-user outages and reliability issues
(with mitigations and/or workarounds, where applicable)
- Creating and prioritizing a list of areas that require reliability
investments
- Working with relevant SIGs on delivering necessary infrastructure
(e.g. test frameworks) to unblock further steps
- Initiating and driving cross-SIG reliability improvements
### Out of Scope
- Designing and executing improvements clearly falling into individual SIG
responsibilities.
## Special Powers
The Reliability WG has the power to block feature-oriented contributions from
any SIG if requested reliability-related improvements are not being addressed.
Before this power can be exercised, SIG Architecture must approve the criteria
proposed by this working group.
Given that WGs are by definition temporary, on WG Reliability's retirement we
will pass this responsibility to the SIG Architecture Production Readiness
subproject, or to SIG Architecture generally for reassignment at the leads'
discretion.
## Stakeholders
Stakeholders in this working group span multiple SIGs.
In the first phase (defining reliability for Kubernetes and building a list of
reliability gaps and areas for investment), the following SIGs will be
involved:
- SIG Architecture
High-level input on requirements.
- SIG Scalability
Input on scale test gaps and reliability issues at scale.
- SIG Cluster Lifecycle
Input on cluster setup and upgrade mechanics.
- SIG Release
Input on blocking and soak requirements.
- SIG Testing
Input on testing mechanics, missing frameworks, etc.
- SIG *
Input on reliability gaps in their areas.
The group will also reach out to users and cluster operators
(e.g. via surveys) to build the full picture.
In the later phase (improving reliability), any SIG may be involved,
depending on the findings from the initial phase.
## Deliverables
The group is expected to deliver the following artifacts:
- Document defining what reliability means for Kubernetes and how to measure it
- List of known user outages and potential failure modes
- List of specific investments that should happen to improve reliability
- Set of processes to introduce in Kubernetes to prevent reliability from
degrading over time
The actual investments will be owned by corresponding SIGs.
## Roles and Organization Management
This working group adheres to the Roles and Organization Management outlined in
[sig-governance] and opts in to updates and modifications to [sig-governance].
[sig-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance.md
## Timelines and Disbanding
The exact timeline for the existence of this working group is hard to predict
at this time.
The group will start by working on the deliverables mentioned above. Once the
group is satisfied with their shape and no additional coordination on their
execution is needed, we will retire the Working Group and pass oversight of
reliability to the SIG Architecture PRR subproject.