add stable window documentation to "scale-bounds" (#4514)

* add stable window documentation to "scale-bounds"

* moving scaleDownDelayBack and referencing panic mode

* Update docs/serving/autoscaling/scale-bounds.md

Co-authored-by: Samia Nneji <snneji@vmware.com>

* Update docs/serving/autoscaling/scale-bounds.md

Co-authored-by: Samia Nneji <snneji@vmware.com>

* Update docs/serving/autoscaling/scale-bounds.md

Co-authored-by: Julian Friedman <julz.friedman@uk.ibm.com>

* use better formatting for kpa only note

* Update docs/serving/autoscaling/scale-bounds.md

Co-authored-by: Samia Nneji <snneji@vmware.com>

* Update docs/serving/autoscaling/scale-bounds.md

Co-authored-by: Samia Nneji <snneji@vmware.com>

Co-authored-by: Samia Nneji <snneji@vmware.com>
Co-authored-by: Julian Friedman <julz.friedman@uk.ibm.com>
Erez Freiberger 2022-01-17 17:16:29 +01:00 committed by GitHub
parent d4902d2b10
commit 362c294b52
1 changed files with 57 additions and 5 deletions


@ -158,10 +158,6 @@ When the Revision is created, the larger of initial scale and lower bound is aut
allow-zero-initial-scale: "true"
```
## Scale Down Delay
Scale Down Delay specifies a time window which must pass at reduced concurrency
@ -170,6 +166,8 @@ keep containers around for a configurable duration to avoid a cold start
penalty if new requests come in. Unlike setting a lower bound, the revision
will eventually be scaled down if reduced concurrency is maintained for the
delay period.
!!! note
    Scale Down Delay is only supported for the default KPA autoscaler class.
* **Global key:** `scale-down-delay`
* **Per-revision annotation key:** `autoscaling.knative.dev/scale-down-delay`
@ -217,4 +215,58 @@ delay period.
        autoscaler:
          scale-down-delay: "15m"
    ```
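The delay behaviour described in this section can be pictured with a small sketch. This is an illustrative model only, not the autoscaler's actual code: the class name `DelayedScaleDown` and its method are hypothetical, and it assumes the delay can be modelled as taking the maximum desired replica count seen within the delay period.

```python
from collections import deque


class DelayedScaleDown:
    """Hypothetical model: hold the replica count at its recent maximum."""

    def __init__(self, delay_s: float):
        self.delay_s = delay_s
        self._history = deque()  # (timestamp_s, desired_replicas)

    def decide(self, now_s: float, desired: int) -> int:
        # Drop decisions that are older than the delay period.
        while self._history and self._history[0][0] <= now_s - self.delay_s:
            self._history.popleft()
        self._history.append((now_s, desired))
        # Scale-ups take effect immediately; a scale-down only takes effect
        # once every higher value has aged out of the delay period.
        return max(d for _, d in self._history)


delay = DelayedScaleDown(delay_s=15 * 60)  # matches scale-down-delay: "15m"
print(delay.decide(0, 5))    # load needs 5 replicas
print(delay.decide(60, 1))   # load dropped, but the delay has not passed
print(delay.decide(900, 1))  # reduced load held for the full 15 minutes
```

In this model a lower value only wins after it has been the highest value for the whole delay period, which is why the revision is eventually scaled down if reduced concurrency is maintained.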
## Stable window
The stable window defines the sliding time window over which metrics are averaged to provide the input for scaling decisions when the autoscaler is not in [Panic mode](kpa-specific.md).
* **Global key:** `stable-window`
* **Per-revision annotation key:** `autoscaling.knative.dev/window`
* **Possible values:** Duration, `6s` <= value <= `1h`
* **Default:** `60s`
!!! note
During scale down, in most cases the last Replica is removed after there has been no traffic to the Revision for the entire duration of the stable window.
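To make the sliding-window averaging concrete, here is a minimal sketch. It is a simplified illustration and not the autoscaler's implementation: the class `SlidingWindowAverage` and its methods are hypothetical names, samples are assumed to be plain `(timestamp, concurrency)` pairs, and only the documented `6s` to `1h` bounds are taken from the text above.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical model: average the samples inside the stable window."""

    def __init__(self, window_s: float = 60.0):
        # Bounds mirror the documented range: 6s <= value <= 1h.
        if not 6.0 <= window_s <= 3600.0:
            raise ValueError("stable window must be between 6s and 1h")
        self.window_s = window_s
        self._samples = deque()  # (timestamp_s, observed_concurrency)

    def record(self, now_s: float, value: float) -> None:
        self._samples.append((now_s, value))

    def average(self, now_s: float) -> float:
        # Evict samples that have slid out of the window.
        while self._samples and self._samples[0][0] <= now_s - self.window_s:
            self._samples.popleft()
        if not self._samples:
            return 0.0
        return sum(v for _, v in self._samples) / len(self._samples)


window = SlidingWindowAverage(window_s=40)  # matches window: "40s"
window.record(0, 10)
window.record(10, 10)
window.record(30, 0)       # traffic has stopped
print(window.average(45))  # still above zero: the sample at t=10 is in the window
print(window.average(80))  # zero: every sample is older than 40s
```

The average only reaches zero once every non-zero sample has aged out of the window, which fits the note above about the last Replica being removed after a full stable window without traffic.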
**Example:**
=== "Per Revision"
    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/window: "40s"
        spec:
          containers:
            - image: gcr.io/knative-samples/helloworld-go
    ```
=== "Global (ConfigMap)"
    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: config-autoscaler
      namespace: knative-serving
    data:
      stable-window: "40s"
    ```
=== "Global (Operator)"
    ```yaml
    apiVersion: operator.knative.dev/v1alpha1
    kind: KnativeServing
    metadata:
      name: knative-serving
    spec:
      config:
        autoscaler:
          stable-window: "40s"
    ```
---