
Autoscaling

Knative Serving provides automatic scaling, or autoscaling, so that applications match incoming demand. Autoscaling is enabled by default and is performed by the Knative Pod Autoscaler (KPA).

For example, if an application is receiving no traffic and scale to zero is enabled, Knative Serving scales the application down to zero replicas. If scaling to zero is disabled, the application is scaled down to the minimum number of replicas specified for applications on the cluster. Replicas are scaled up to meet demand if traffic to the application increases.

If you have cluster administrator permissions, you can enable or disable scale-to-zero functionality for your cluster. See Configuring scale to zero.
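As a minimal sketch of the cluster-wide setting, assuming a default installation where the autoscaler reads its configuration from the `config-autoscaler` ConfigMap in the `knative-serving` namespace, scale to zero can be turned off with the `enable-scale-to-zero` key. Treat this as an illustration and check the scale-to-zero page for the options your version supports.

```yaml
# Cluster-wide autoscaler settings (requires cluster administrator permissions).
# Assumes Knative Serving is installed in the default knative-serving namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # "true" lets idle applications scale down to zero replicas;
  # "false" keeps them at their minimum replica count instead.
  enable-scale-to-zero: "false"
```

Editing an existing ConfigMap in place (for example with `kubectl edit`) is usually safer than applying a fresh manifest, because applying a manifest that contains only this key may overwrite other keys already set in `data`.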

If autoscaling is enabled on your cluster, you must configure concurrency and scale bounds to use it for your application.
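The following sketch shows one way concurrency and scale bounds might be set on a single application through annotations on the revision template. The Service name and image are hypothetical placeholders, and the numeric values are arbitrary examples rather than recommendations.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-app            # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Soft concurrency target: in-flight requests per replica
        # that the autoscaler aims for.
        autoscaling.knative.dev/target: "50"
        # Scale bounds: lower and upper limits on the number of replicas.
        autoscaling.knative.dev/min-scale: "1"
        autoscaling.knative.dev/max-scale: "5"
    spec:
      containers:
        - image: registry.example.com/example-app:latest   # placeholder image
```

With `min-scale` set to `"1"`, this application keeps at least one replica even when scale to zero is enabled on the cluster.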

Additional resources

Try out the Go Autoscale Sample App.
Configure your Knative deployment to use the Kubernetes Horizontal Pod Autoscaler (HPA) instead of the default KPA. To install HPA, see Install optional Serving extensions.
Configure the types of metrics that the Autoscaler consumes.
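As a rough illustration of the last two items, the autoscaler class and the metric it consumes can be selected per revision with annotations. This sketch assumes the HPA extension is already installed. The Service name and image are hypothetical, and the target value is an arbitrary example whose unit depends on the chosen metric, so check the metrics documentation before relying on it.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-app            # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Use the HPA instead of the default KPA for this revision.
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        # Metric the autoscaler consumes; "cpu" requires the HPA class.
        autoscaling.knative.dev/metric: "cpu"
        # Target for the chosen metric (example value only).
        autoscaling.knative.dev/target: "80"
    spec:
      containers:
        - image: registry.example.com/example-app:latest   # placeholder image
```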