mirror of https://github.com/knative/docs.git
Renamed autoscaling section, added details for PodAutoscaler custom resource (#1877)
This commit is contained in:
parent 080b4b03fa
commit 178957f7ec
@@ -12,7 +12,7 @@ focus on solving mundane but difficult tasks such as:
 - [Deploying a container](./serving/getting-started-knative-app.md)
 - [Routing and managing traffic with blue/green deployment](./serving/samples/blue-green-deployment.md)
-- [Scaling automatically and sizing workloads based on demand](./serving/configuring-the-autoscaler.md)
+- [Scaling automatically and sizing workloads based on demand](./serving/autoscaling.md)
 - [Binding running services to eventing ecosystems](./eventing/samples/kubernetes-event-source/)
 
 Developers on Knative can use familiar idioms, languages, and frameworks to
@@ -1,16 +1,18 @@
 ---
-title: "Configuring the Autoscaler"
+title: "Configuring autoscaling"
 weight: 10
 type: "docs"
+aliases:
+- /docs/serving/configuring-the-autoscaler/
 ---
 
-Since Knative v0.2, per-revision autoscalers have been replaced by a single
-shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which
+Knative uses a single shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which
 provides fast, request-based autoscaling capabilities out of the box.
+You can also configure Knative to use the Horizontal Pod Autoscaler (HPA), or use your own autoscaler, by implementing the PodAutoscaler custom resource.
 
-# Configuring Knative Pod Autoscaler
+# Modifying the ConfigMap for KPA
 
-To modify the Knative Pod Autoscaler (KPA) configuration, you must modify a
+To modify the KPA configuration, you must modify a
 Kubernetes ConfigMap called `config-autoscaler` in the `knative-serving`
 namespace.
@@ -18,7 +20,7 @@ You can view the default contents of this ConfigMap using the following command.
 
 `kubectl -n knative-serving get cm config-autoscaler`
 
-## Example of default ConfigMap
+## Example of the default Kubernetes ConfigMap
 
 ```
 apiVersion: v1
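The default ConfigMap is truncated in the hunk above. As a point of reference, a minimal sketch of a `config-autoscaler` ConfigMap covering only the keys this page discusses follows; the values shown are the commonly cited defaults for this era of Knative, assumed rather than read from a cluster:

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Keys discussed on this page; values are assumed defaults
  # and may differ between Knative releases.
  enable-scale-to-zero: "true"
  stable-window: "60s"
  scale-to-zero-grace-period: "30s"
```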
@@ -76,16 +78,15 @@ Ensure that enable-scale-to-zero is set to `true`, if scale to zero is desired.
 
 The termination period is the time that the pod takes to shut down after the
 last request is finished. The termination period of the pod is equal to the sum
-of the values of the `stable-window` and `scale-to-zero-grace-period`
-parameters. In the case of this example, the termination period would be 90s.
+of the values of the `stable-window` and `scale-to-zero-grace-period` parameters. In the case of this example, the termination period would be at least 90s.
 
 ## Configuring concurrency
 
 Concurrency for autoscaling can be configured using the following methods.
 
-## Configuring concurrent request limits
+### Configuring concurrent request limits
 
-### target
+#### target
 
 `target` defines how many concurrent requests are wanted at a given time (soft
 limit) and is the recommended configuration for autoscaling in Knative.
@@ -103,7 +104,7 @@ This value can be configured by adding or modifying the
 autoscaling.knative.dev/target: 50
 ```
 
-### containerConcurrency
+#### containerConcurrency
 
 **NOTE:** `containerConcurrency` should only be used if there is a clear need to
 limit how many requests reach the app at a given time. Using
@@ -137,11 +138,11 @@ used to prevent cold starts or to help control computing costs.
 
 ```
-spec:
-  template:
-    metadata:
-      annotations:
-        autoscaling.knative.dev/minScale: "2"
-        autoscaling.knative.dev/maxScale: "10"
+template:
+  metadata:
+    annotations:
+      autoscaling.knative.dev/minScale: "2"
+      autoscaling.knative.dev/maxScale: "10"
 ```
 
 Using these annotations in the revision template will propagate this to
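For context, the `minScale`/`maxScale` annotations in the hunk above sit under the revision template of a Service manifest. A sketch follows; the service name is hypothetical and the API version is assumed to be the `v1alpha1` version current for this era of the docs:

```
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: helloworld-go   # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/maxScale: "10"
```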
@@ -165,7 +166,7 @@ If the `minScale` annotation is not set, pods will scale to zero (or to 1 if
 If the `maxScale` annotation is not set, there will be no upper limit for the
 number of pods created.
 
-## Configuring CPU-based autoscaling
+# Configuring Horizontal Pod Autoscaler (HPA)
 
 **NOTE:** You can configure Knative autoscaling to work with either the default
 KPA or a CPU based metric, i.e. Horizontal Pod Autoscaler (HPA).
@@ -177,16 +178,47 @@ template.
 
 ```
-spec:
-  template:
-    metadata:
-      annotations:
-        autoscaling.knative.dev/metric: cpu
-        autoscaling.knative.dev/target: 70
-        autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
+template:
+  metadata:
+    annotations:
+      autoscaling.knative.dev/metric: cpu
+      autoscaling.knative.dev/target: 70
+      autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
 ```
+
+## Using the recommended autoscaling reconciler for custom Go implementations
+
+It is recommended to use the [`autoscaling-base-reconciler`](https://github.com/knative/serving/blob/master/pkg/reconciler/autoscaling/reconciler.go) as implemented in Knative Serving.
+
+To use this reconciler, ensure that you are calling `ReconcileSKS` from the `autoscaling-base-reconciler`.
+
+If you want to use metrics collected by Knative, like `concurrency`, ensure that you are using `ReconcileMetric` to enable that system.
+
+## Implementing your own Pod Autoscaler
+
+The PodAutoscaler custom resource allows you to implement your own autoscaler without changing anything else about the Knative Serving system.
+
+You can implement your own Pod Autoscaler if the requirements of your workload cannot be covered by the KPA or HPA, for example if you want to use a more specialized autoscaling algorithm, or if you need a specialized set of metrics that Knative does not support out of the box.
+
+To implement your own Pod Autoscaler, you can create a reconciler that operates on your own class of PodAutoscaler.
+
+To do this, you can copy a [Knative sample controller](https://github.com/knative/sample-controller) and modify its configuration to suit your desired use case.
+
+For example, if your service's template YAML includes a class annotation like:
+
+```
+autoscaling.knative.dev/class: sample
+```
+
+your reconciler should only reconcile PodAutoscaler resources with that class.
+
+The informer setup of your controller might look like this:
+
+```golang
+paInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
+	FilterFunc: reconciler.AnnotationFilterFunc(autoscaling.ClassAnnotationKey, "sample", false),
+	Handler:    controller.HandleAll(impl.Enqueue),
+})
+```
 
-## Additional resources
+# Additional resources
 
 - [Go autoscaling sample](https://knative.dev/docs/serving/samples/autoscale-go/index.html)
 - ["Knative v0.3 Autoscaling - A Love Story" blog post](https://knative.dev/blog/2019/03/27/knative-v0.3-autoscaling-a-love-story/)