From e802f6830e9a87a5d035afed7708c6e7d6fc62ac Mon Sep 17 00:00:00 2001
From: Ashleigh Brennan
Date: Mon, 22 Nov 2021 08:25:14 -0600
Subject: [PATCH] Move and cleanup autoscaling docs (#4486)

* Move and cleanup autoscaling docs
* fix links
* updates from review comments
---
 config/nav.yml                     | 22 +++++++++++-----------
 docs/serving/autoscaling/README.md | 25 ++++++++++---------------
 2 files changed, 21 insertions(+), 26 deletions(-)

diff --git a/config/nav.yml b/config/nav.yml
index c50f01b65..b6ca5aab2 100644
--- a/config/nav.yml
+++ b/config/nav.yml
@@ -64,6 +64,17 @@ nav:
 ###############################################################################
   - Serving:
     - Knative Serving overview: serving/README.md
+    - Autoscaling:
+      - About autoscaling: serving/autoscaling/README.md
+      - Supported autoscaler types: serving/autoscaling/autoscaler-types.md
+      - Configuring metrics: serving/autoscaling/autoscaling-metrics.md
+      - Configuring targets: serving/autoscaling/autoscaling-targets.md
+      - Configuring scale to zero: serving/autoscaling/scale-to-zero.md
+      - Configuring concurrency: serving/autoscaling/concurrency.md
+      - Configuring the requests per second (RPS) target: serving/autoscaling/rps-target.md
+      - Configuring scale bounds: serving/autoscaling/scale-bounds.md
+      - Additional autoscaling configuration for Knative Pod Autoscaler: serving/autoscaling/kpa-specific.md
+      - Autoscale Sample App - Go: serving/autoscaling/autoscale-go/README.md
     # Serving - developer docs
     - Developer Topics:
       - Services:
@@ -87,17 +98,6 @@ nav:
         - Configuring target burst capacity: serving/load-balancing/target-burst-capacity.md
         - Configuring Activator capacity: serving/load-balancing/activator-capacity.md
       - Revision garbage collection: serving/revision-gc.md
-      - Autoscaling:
-        - About autoscaling: serving/autoscaling/README.md
-        - Supported autoscaler types: serving/autoscaling/autoscaler-types.md
-        - Configuring metrics: serving/autoscaling/autoscaling-metrics.md
-        - Configuring targets: serving/autoscaling/autoscaling-targets.md
-        - Configuring scale to zero: serving/autoscaling/scale-to-zero.md
-        - Configuring concurrency: serving/autoscaling/concurrency.md
-        - Configuring the requests per second (RPS) target: serving/autoscaling/rps-target.md
-        - Configuring scale bounds: serving/autoscaling/scale-bounds.md
-        - Additional autoscaling configuration for Knative Pod Autoscaler: serving/autoscaling/kpa-specific.md
-        - Autoscale Sample App - Go: serving/autoscaling/autoscale-go/README.md
     # Serving - admin docs
     - Administrator Topics:
       - Kubernetes services: serving/knative-kubernetes-services.md
diff --git a/docs/serving/autoscaling/README.md b/docs/serving/autoscaling/README.md
index c058f3aaa..65d2a7b6e 100644
--- a/docs/serving/autoscaling/README.md
+++ b/docs/serving/autoscaling/README.md
@@ -1,22 +1,17 @@
 # Autoscaling

-One of the main features of Knative is automatic scaling of replicas for an application to closely match incoming demand, including scaling applications to zero if no traffic is being received.
-Knative Serving enables this by default, using the Knative Pod Autoscaler (KPA).
-The Autoscaler component watches traffic flow to the application, and scales replicas up or down based on configured metrics.
+Knative Serving provides automatic scaling, or _autoscaling_, for applications to match incoming demand. This is provided by default, by using the Knative Pod Autoscaler (KPA).

-Knative services default to using autoscaling settings that are suitable for the majority of use cases. However, some workloads may require a custom, more finely-tuned configuration.
-This guide provides information about configuration options that you can modify to fit the requirements of your workload.
+For example, if an application is receiving no traffic and scale to zero is enabled, Knative Serving scales the application down to zero replicas. If scaling to zero is disabled, the application is scaled down to the minimum number of replicas specified for applications on the cluster. Replicas are scaled up to meet demand if traffic to the application increases.

-For more information about autoscaling in Knative, see the [Autoscaler types](autoscaler-types.md) documentation.
+You can enable and disable scale to zero functionality for your cluster if you have cluster administrator permissions. See [Configuring scale to zero](scale-to-zero.md).
+
+To use autoscaling for your application if it is enabled on your cluster, you must configure [concurrency](concurrency.md) and [scale bounds](scale-bounds.md).
+

-For more information about which metrics can be used to control the Autoscaler, see the [metrics](autoscaling-metrics.md) documentation.
+## Additional resources

-## Optional autoscaling configuration tasks
-
-* Configure your Knative deployment to use the Kubernetes Horizontal Pod Autoscaler (HPA)
-instead of the default KPA.
-For how to install HPA, see [Install optional Serving extensions](../../install/serving/install-serving-with-yaml.md#install-optional-serving-extensions).
-* Disable scale to zero functionality for your cluster ([global configuration only](scale-to-zero.md)).
-* Configure the [type of metrics](autoscaling-metrics.md) your Autoscaler consumes.
-* Configure [concurrency limits](concurrency.md) for applications.
+ * Try out the [Go Autoscale Sample App](autoscale-go/README.md).
+* Configure your Knative deployment to use the Kubernetes Horizontal Pod Autoscaler (HPA) instead of the default KPA. For how to install HPA, see [Install optional Serving extensions](../../install/serving/install-serving-with-yaml.md#install-optional-serving-extensions).
+* Configure the [types of metrics](autoscaling-metrics.md) that the Autoscaler consumes.