# Autoscaling
The Knative Pod Autoscaler (KPA) provides fast, request-based autoscaling
capabilities. To configure scale-to-zero correctly for revisions, you must
modify the autoscaler's parameters.

`target` defines how many concurrent requests each replica should handle at a
given time (a soft limit) and is the recommended configuration for autoscaling
in Knative.

The `minScale` and `maxScale` annotations configure the minimum and maximum
number of pods that can serve an application.

You can access autoscaling capabilities by using `kn` to modify Knative services
without editing YAML files directly.

Use the `service create` and `service update` commands with the following flags
to configure the autoscaling behavior:
| Flag                      | Description                                                                                                                   |
| :------------------------ | :---------------------------------------------------------------------------------------------------------------------------- |
| `--concurrency-limit int` | Hard limit of concurrent requests to be processed by a single replica.                                                        |
| `--scale-target int`      | Recommendation for when to scale up, based on the number of concurrent incoming requests. Defaults to `--concurrency-limit`. |
| `--scale-max int`         | Maximum number of replicas.                                                                                                   |
| `--scale-min int`         | Minimum number of replicas.                                                                                                   |
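
As a rough illustration of how the flags above combine, the following sketch wraps `kn service create` and `kn service update` in two small shell functions. The function names are hypothetical helpers and not part of `kn`. The service name, image, and numeric values are placeholders supplied by the caller. The `--image` flag is assumed here because `service create` needs a container image, although it is not covered in this section.

```shell
# Sketch only: both functions are hypothetical helpers, not part of kn.
# They pass their arguments straight to the flags listed in the table above.

# Create a service that keeps between MIN and MAX replicas.
# Usage: create_scaled_service NAME IMAGE MIN MAX
create_scaled_service() {
  kn service create "$1" --image "$2" \
    --scale-min "$3" \
    --scale-max "$4"
}

# Update the concurrency settings of an existing service.
# LIMIT is the hard per-replica cap; TARGET is the soft scale-up recommendation.
# Usage: set_service_concurrency NAME LIMIT TARGET
set_service_concurrency() {
  kn service update "$1" \
    --concurrency-limit "$2" \
    --scale-target "$3"
}
```

For example, `create_scaled_service hello gcr.io/knative-samples/helloworld-go 1 5` followed by `set_service_concurrency hello 20 10` would ask for between one and five replicas, with a hard limit of 20 concurrent requests per replica and a scale-up recommendation of 10. A minimum of 1 keeps the service from scaling to zero. Passing 0 instead should allow scale-to-zero, assuming the cluster's autoscaler configuration permits it.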