Document pod startup SLO

This commit is contained in:
wojtekt 2018-06-01 12:31:33 +02:00
parent bc690fa3d1
commit 0c4f1bf871
2 changed files with 60 additions and 2 deletions


@ -0,0 +1,54 @@
## Pod startup latency SLI/SLO details
### User stories
- As a user of vanilla Kubernetes, I want a guarantee of how quickly my pods
will be started.
### Other notes
- Only schedulable and stateless pods contribute to the SLI:
  - If there is no space in the cluster to place the pod, there is not much
  we can do about it (that is a task for the Cluster Autoscaler, which should
  have separate SLIs/SLOs).
  - If placing a pod requires preempting other pods, that may depend heavily
  on the application (e.g. on its graceful termination period). We don't
  want that to contribute to this SLI.
  - Mounting disks required by non-stateless pods may also take
  non-negligible time that is not fully dependent on Kubernetes.
- We are explicitly excluding image pulling time from the SLI. This is
because it highly depends on the locality of the image, the image registry's
performance characteristics (e.g. throughput), the image size itself, etc.
Since we have no control over any of these (and all of them would
significantly affect the SLI), we decided to simply exclude it.
- We are also explicitly excluding the time to run init containers, as, again,
this is heavily application-dependent (and doesn't depend on Kubernetes itself).
- The answer to the question "when should a pod be considered started" is also
not obvious. We decided on the semantics of "when all its containers are
reported as started and observed via watch", because:
  - we require all containers to be started (not e.g. just the first one) to
  ensure that the pod is started. We need to ensure that potential regressions,
  such as linearization of container startups within a pod, will be caught by
  this SLI.
  - note that we don't require all containers to be running - if some of them
  finished before the last one was started, that is also fine. It is only
  required that all of them have been started (at least once).
  - we don't want to rely on "readiness checks", because they heavily
  depend on the application. If the application takes a couple of minutes to
  initialize before it starts responding to readiness checks, that shouldn't
  count towards Kubernetes performance.
  - even if your application has started, many control loops in Kubernetes
  will not fire before they observe that. If the Kubelet is unable to report
  the status for some reason, other parts of the system will have no way to
  learn about it - this is why the reporting part is so important here.
  - since watch is so central to Kubernetes (and many control loops are
  triggered by specific watch events), observing the pod's status is also
  part of the SLI (as this is the moment when the next control loops can
  potentially be fired).
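The semantics above can be sketched as a small helper: a pod counts as started
once every container has been started at least once (terminated containers
still count), and the SLI clock stops only at the watch event reporting that
status. This is an illustrative sketch, not a Kubernetes API; the function
names and the dict-based pod representation are assumptions.

```python
from datetime import datetime, timedelta

def pod_started_at(container_first_started):
    """Time at which the pod counts as started under this SLI: when the
    last of its containers has been started at least once. The mapping is
    container name -> first startedAt time (None if never started).
    Containers that already finished still count, as long as they
    started once."""
    if any(t is None for t in container_first_started.values()):
        return None  # some container has never been started
    return max(container_first_started.values())

def startup_latency(pod_created, container_first_started, status_observed):
    """SLI latency: from the pod creation timestamp to the watch event in
    which all containers are reported as started. Returns None while the
    pod is not yet started. (Image-pull and init-container time, which the
    SLI excludes, would have to be subtracted by the caller.)"""
    if pod_started_at(container_first_started) is None:
        return None
    return status_observed - pod_created
```

For example, a pod created at 12:00:00 whose last container first started at
12:00:03, with the status observed via watch at 12:00:04, contributes a
4-second sample.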
### TODOs
- We should try to provide guarantees for non-stateless pods (the threshold
may be higher for them though).
- Revisit whether we want "watch pod status" part to be included in the SLI.
### Test scenario
__TODO: Describe test scenario.__


@ -102,6 +102,7 @@ Prerequisite: Kubernetes cluster is available and serving.
| --- | --- | --- | --- |
| __Official__ | Latency<sup>[1](#footnote1)</sup> of mutating<sup>[2](#footnote2)</sup> API calls for single objects for every (resource, verb) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, verb) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day<sup>[3](#footnote3)</sup> <= 1s | [Details](./api_call_latency.md) |
| __Official__ | Latency<sup>[1](#footnote1)</sup> of non-streaming read-only<sup>[4](#footnote4)</sup> API calls for every (resource, scope<sup>[5](#footnote5)</sup>) pair, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, for every (resource, scope) pair, excluding virtual and aggregated resources and Custom Resource Definitions, 99th percentile per cluster-day (a) <= 1s if `scope=resource` (b) <= 5s if `scope=namespace` (c) <= 30s if `scope=cluster` | [Details](./api_call_latency.md) |
| __Official__ | Startup latency of stateless<sup>[6](#footnote6)</sup> and schedulable<sup>[7](#footnote7)</sup> pods, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch, measured as 99th percentile over last 5 minutes | In default Kubernetes installation, 99th percentile per cluster-day <= 5s | [Details](./pod_startup_latency.md) |
<a name="footnote1">\[1\]</a>By latency of an API call in this doc we mean the time
from the moment when the apiserver gets the request to the last byte of the response sent
@ -123,9 +124,12 @@ if the request is about a single object, (b) `namespace` if it is about objects
from a single namespace or (c) `cluster` if it spans objects from multiple
namespaces.
<a name="footnote6">\[6\]</a>A `stateless pod` is defined as a pod that doesn't
mount volumes with sources other than secrets, config maps, downward API and
empty dir.
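The stateless-pod definition reduces to a check on volume sources. A minimal
sketch, assuming a pod spec shaped like the PodSpec JSON from
`kubectl get pod -o json`; the helper itself is hypothetical, not a
Kubernetes API:

```python
# PodSpec JSON volume-source field names that the definition allows
# for a "stateless" pod.
STATELESS_SOURCES = {"secret", "configMap", "downwardAPI", "emptyDir"}

def is_stateless(pod_spec):
    """pod_spec: dict shaped like the PodSpec in `kubectl get pod -o json`.
    A pod is stateless iff every volume's source is one of the allowed
    kinds; a pod with no volumes is trivially stateless."""
    for volume in pod_spec.get("volumes", []):
        # In the API, each volume has a "name" plus exactly one source key.
        source_keys = set(volume) - {"name"}
        if not source_keys <= STATELESS_SOURCES:
            return False
    return True
```

So a pod mounting only a config map is stateless, while one mounting a
persistent volume claim is not.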
<a name="footnote7">\[7\]</a>By a schedulable pod we mean a pod that can be
scheduled in the cluster without causing any preemption.
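The pod-startup SLO in the table above aggregates per-pod latency samples as a
99th percentile, which must stay <= 5s per cluster-day. A minimal nearest-rank
sketch of that aggregation; illustrative only, since in practice the SLO is
evaluated from histogram metrics over 5-minute windows rather than raw samples:

```python
def percentile_99(latencies):
    """Nearest-rank 99th percentile of startup-latency samples (seconds).
    Returns None when there are no samples."""
    if not latencies:
        return None
    ordered = sorted(latencies)
    rank = -(-99 * len(ordered) // 100)  # ceil(0.99 * n), 1-based rank
    return ordered[rank - 1]
```

Under the SLO, this value computed over a cluster-day of samples must not
exceed 5 seconds.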
### Burst SLIs/SLOs