From bc8fd2d88a2aca438eaf4f7095810411090c53d8 Mon Sep 17 00:00:00 2001
From: Victor Agababov <vagababov@gmail.com>
Date: Wed, 20 May 2020 09:55:57 -0700
Subject: [PATCH] Add the documentation to the new
 scale-to-zero-pod-retention-period flag (#2478)

* Add the documentation to the new scale-to-zero-pod-retention-period flag

* review
---
 docs/serving/configuring-autoscaling.md | 41 ++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/docs/serving/configuring-autoscaling.md b/docs/serving/configuring-autoscaling.md
index 8b2fdd01d..a6b0c7927 100644
--- a/docs/serving/configuring-autoscaling.md
+++ b/docs/serving/configuring-autoscaling.md
@@ -496,7 +496,7 @@ This period is an upper bound amount of time the system waits internally for the
 
 * **Global key:** `scale-to-zero-grace-period`
 * **Per-revision annotation key:** n/a
-* **Possible values:** Duration
+* **Possible values:** Duration (must be at least 6s).
 * **Default:** `30s`
 
 **Example:**
@@ -526,6 +526,45 @@ spec:
 {{< /tab >}}
 {{< /tabs >}}
 
+
+### Scale To Zero Last Pod Retention Period
+
+The `scale-to-zero-pod-retention-period` flag determines the **minimum** amount of time that the last pod will remain active after the Autoscaler has decided to scale pods to zero.
+
+This contrasts with the `scale-to-zero-grace-period` flag, which determines the **maximum** amount of time that the last pod will remain active after the Autoscaler has decided to scale pods to zero.
+
+* **Global key:** `scale-to-zero-pod-retention-period`
+* **Per-revision annotation key:** n/a
+* **Possible values:** Non-negative duration string
+* **Default:** `0s`
+
+**Example:**
+{{< tabs name="scale-to-zero-grace" default="Global (ConfigMap)" >}}
+{{% tab name="Global (ConfigMap)" %}}
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: config-autoscaler
+  namespace: knative-serving
+data:
+  scale-to-zero-pod-retention-period: "42s"
+```
+{{< /tab >}}
+{{% tab name="Global (Operator)" %}}
+```yaml
+apiVersion: operator.knative.dev/v1alpha1
+kind: KnativeServing
+metadata:
+  name: knative-serving
+spec:
+  config:
+    autoscaler:
+      scale-to-zero-pod-retention-period: "42s"
+```
+{{< /tab >}}
+{{< /tabs >}}
+
 ## Modes: Stable and Panic
 
 The KPA acts on the respective metrics (concurrency or RPS) aggregated over time-based windows. These windows define the amount of historical data the autoscaler takes into account and are used to smooth the data over the specified amount of time. The shorter these windows are, the more quickly the autoscaler will react but the more hysterically it will react as well.