mirror of https://github.com/knative/docs.git
Renamed autoscaling section, added details for PodAutoscaler custom resource (#1877)
This commit is contained in:
parent 080b4b03fa
commit 178957f7ec
@@ -12,7 +12,7 @@ focus on solving mundane but difficult tasks such as:
 - [Deploying a container](./serving/getting-started-knative-app.md)
 - [Routing and managing traffic with blue/green deployment](./serving/samples/blue-green-deployment.md)
-- [Scaling automatically and sizing workloads based on demand](./serving/configuring-the-autoscaler.md)
+- [Scaling automatically and sizing workloads based on demand](./serving/autoscaling.md)
 - [Binding running services to eventing ecosystems](./eventing/samples/kubernetes-event-source/)
 
 Developers on Knative can use familiar idioms, languages, and frameworks to
@@ -1,16 +1,18 @@
 ---
-title: "Configuring the Autoscaler"
+title: "Configuring autoscaling"
 weight: 10
 type: "docs"
+aliases:
+- /docs/serving/configuring-the-autoscaler/
 ---
 
-Since Knative v0.2, per-revision autoscalers have been replaced by a single
-shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which
+Knative uses a single shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which
 provides fast, request-based autoscaling capabilities out of the box.
+You can also configure Knative to use the Horizontal Pod Autoscaler (HPA), or use your own autoscaler, by implementing the PodAutoscaler custom resource.
 
-# Configuring Knative Pod Autoscaler
+# Modifying the ConfigMap for KPA
 
-To modify the Knative Pod Autoscaler (KPA) configuration, you must modify a
+To modify the KPA configuration, you must modify a
 Kubernetes ConfigMap called `config-autoscaler` in the `knative-serving`
 namespace.
@@ -18,7 +20,7 @@ You can view the default contents of this ConfigMap using the following command.
 
 `kubectl -n knative-serving get cm config-autoscaler`
 
-## Example of default ConfigMap
+## Example of the default Kubernetes ConfigMap
 
 ```
 apiVersion: v1
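The default ConfigMap is truncated in the hunk above. As a point of reference, a minimal sketch of a `config-autoscaler` ConfigMap covering only the keys this page discusses follows; the values shown are the commonly cited defaults for this era of Knative, assumed rather than read from a cluster:

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Keys discussed on this page; values are assumed defaults
  # and may differ between Knative releases.
  enable-scale-to-zero: "true"
  stable-window: "60s"
  scale-to-zero-grace-period: "30s"
```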
@@ -76,16 +78,15 @@ Ensure that enable-scale-to-zero is set to `true`, if scale to zero is desired.
 
 The termination period is the time that the pod takes to shut down after the
 last request is finished. The termination period of the pod is equal to the sum
-of the values of the `stable-window` and `scale-to-zero-grace-period`
-parameters. In the case of this example, the termination period would be 90s.
+of the values of the `stable-window` and `scale-to-zero-grace-period` parameters. In the case of this example, the termination period would be at least 90s.
 
 ## Configuring concurrency
 
 Concurrency for autoscaling can be configured using the following methods.
 
-## Configuring concurrent request limits
+### Configuring concurrent request limits
 
-### target
+#### target
 
 `target` defines how many concurrent requests are wanted at a given time (soft
 limit) and is the recommended configuration for autoscaling in Knative.
@@ -103,7 +104,7 @@ This value can be configured by adding or modifying the
 autoscaling.knative.dev/target: 50
 ```
 
-### containerConcurrency
+#### containerConcurrency
 
 **NOTE:** `containerConcurrency` should only be used if there is a clear need to
 limit how many requests reach the app at a given time. Using
@@ -137,11 +138,11 @@ used to prevent cold starts or to help control computing costs.
 
 ```
-spec:
-  template:
-    metadata:
-      annotations:
-        autoscaling.knative.dev/minScale: "2"
-        autoscaling.knative.dev/maxScale: "10"
+template:
+  metadata:
+    annotations:
+      autoscaling.knative.dev/minScale: "2"
+      autoscaling.knative.dev/maxScale: "10"
 ```
 
 Using these annotations in the revision template will propagate this to
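For context, the `minScale`/`maxScale` annotations in the hunk above sit under the revision template of a Service manifest. A sketch follows; the service name is hypothetical and the API version is assumed to be the `v1alpha1` version current for this era of the docs:

```
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: helloworld-go   # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/maxScale: "10"
```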
@@ -165,7 +166,7 @@ If the `minScale` annotation is not set, pods will scale to zero (or to 1 if
 If the `maxScale` annotation is not set, there will be no upper limit for the
 number of pods created.
 
-## Configuring CPU-based autoscaling
+# Configuring Horizontal Pod Autoscaler (HPA)
 
 **NOTE:** You can configure Knative autoscaling to work with either the default
 KPA or a CPU based metric, i.e. Horizontal Pod Autoscaler (HPA).
@@ -177,16 +178,47 @@ template.
 
 ```
-spec:
-  template:
-    metadata:
-      annotations:
-        autoscaling.knative.dev/metric: cpu
-        autoscaling.knative.dev/target: 70
-        autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
+template:
+  metadata:
+    annotations:
+      autoscaling.knative.dev/metric: cpu
+      autoscaling.knative.dev/target: 70
+      autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
 ```
+
+## Using the recommended autoscaling reconciler for custom Go implementations
+
+It is recommended to use the [`autoscaling-base-reconciler`](https://github.com/knative/serving/blob/master/pkg/reconciler/autoscaling/reconciler.go) as implemented in Knative Serving.
+
+To use this reconciler, ensure that you are calling `ReconcileSKS` from the `autoscaling-base-reconciler`.
+
+If you want to use metrics collected by Knative, like `concurrency`, ensure that you are using `ReconcileMetric` to enable that system.
+
+## Implementing your own Pod Autoscaler
+
+The PodAutoscaler custom resource allows you to implement your own autoscaler without changing anything else about the Knative Serving system.
+
+You can implement your own Pod Autoscaler if the requirements of your workload cannot be covered by the KPA or HPA, for example if you want to use a more specialized autoscaling algorithm, or if you need a specialized set of metrics that Knative does not support out of the box.
+
+To implement your own Pod Autoscaler, you can create a reconciler that operates on your own class of PodAutoscaler.
+
+To do this, you can copy a [Knative sample controller](https://github.com/knative/sample-controller) and modify its configuration to suit your desired use case.
+
+For example, if your service's template YAML includes a class annotation like:
+
+```
+autoscaling.knative.dev/class: sample
+```
+
+your reconciler should only reconcile PodAutoscaler resources with that class.
+
+The informer setup of your controller might look like this:
+
+```golang
+paInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
+	FilterFunc: reconciler.AnnotationFilterFunc(autoscaling.ClassAnnotationKey, "sample", false),
+	Handler:    controller.HandleAll(impl.Enqueue),
+})
+```
 
-## Additional resources
+# Additional resources
 
 - [Go autoscaling sample](https://knative.dev/docs/serving/samples/autoscale-go/index.html)
 - ["Knative v0.3 Autoscaling - A Love Story" blog post](https://knative.dev/blog/2019/03/27/knative-v0.3-autoscaling-a-love-story/)