Improve Kubernetes scalability definition

wojtekt 2019-06-01 21:38:17 +02:00
parent bc3fe5de36
commit f042a6d212
1 changed files with 46 additions and 18 deletions


@@ -9,27 +9,55 @@ you would expect to have some guarantees in those areas.
 The goal of this doc is to organize the guarantees that Kubernetes provides
 in these areas.
-## What do we require from SLIs/SLOs?
-We are in the process of extending the number of SLIs ([Service Level Indicators])
-and SLOs ([Service Level Objectives]) built on top of these SLIs to cover more areas
-of the system and user expectations.
-Our SLIs/SLOs need to have the following properties:
-- <b> They need to be testable </b> <br/>
-Ideally, they (SLIs and SLOs) should be measurable in all running clusters,
-but if that isn't possible a benchmark may be enough in some situations.
-That means that not every SLO may be translatable to SLA ([Service
-Level Agreement]).
-- <b> They need to be understandable for users </b> <br/>
-In particular, they need to be understandable for people not familiar
-with the system internals, i.e. their formulation can't depend on some
-arcane knowledge.
+## How we define scalability?
+Our scalability definition is built on two concepts:
+- [Service Level Indicators]
+- [Service Level Objectives]
+We require our SLIs/SLOs to have the following properties:
+- <b> They are precise and well-defined </b> <br/>
+It's extremely important to ensure that both users and us have exactly the
+same understanding of what we guarantee.
+- <b> They are consistent with each other </b> <br/>
+This is mostly about using the same terminology, same concepts, etc.
+- <b> They are user-oriented </b> <br/>
+First, the SLOs we provide need to be things users really care about.
+Second, they need to be understandable for people not familiar with the system
+internals (e.g. their formulation can't depend on some arcane knowledge or
+implementation details of the system).
+- <b> They are testable </b> <br/>
+Ideally, SLIs/SLOs should be measurable in all running clusters, but if measuring
+some metrics isn't possible or would be extremely expensive (e.g. in terms
+of resource overhead for the system), benchmarks sometimes may be enough.
+That means that not every SLO may be translatable to SLA ([Service Level
+Agreement]).
-We may also introduce internal(for developers only) SLIs, that may be useful
-for understanding performance characteristic of the system, but for which
-we don't provide any guarantees for users (and thus don't require them to be
-that easily understandable).
+While SLIs are generic (they just define what and how we measure), SLOs provide
+specific guarantees and satisfying them may depend on meeting some specific
+requirements. Specific examples that may visibly affect ability to satisfy them
+are:
+- cluster configuration
+- user of Kubernetes extensibility features
+- load on the cluster.
+As a result, we define Kubernetes scalability using "you promise, we promise"
+framework, as following:
+<b> If you promise to:
+- correctly configure your cluster
+- use extensibility features "reasonably"
+- keep the load in the cluster within recommended limits
+then we promise that your cluster scales, i.e.:
+- all the SLOs are satisfied. </b>
+We are in the process of extending coverage of the system with SLIs and SLOs
+to better reflect user expectations.
+Note that may also introduce internal (for developers only) SLIs, that may be
+useful for understanding performance characteristic of the system, but for which
+we will not provide any guarantees for users.
 [Service Level Indicators]: https://en.wikipedia.org/wiki/Service_level_indicator
 [Service Level Objectives]: https://en.wikipedia.org/wiki/Service_level_objective