mirror of https://github.com/knative/docs.git
More autoscale sample details (#242)
* Adding bloat and sleep commands to the sample.
* Run all three in parallel.
* Load test tool.
* Concurrency and qps settings for load test.
* Formatting tweaks.
* Change health check address.
* Move load test into subdirectory.
* Delete hey Dockerfile now that we have our own load tester.
* Output resource usage summary from autoscale.go and total requests from load.go.
* Rename sample.yaml as service.yaml to match other samples.
* Update README to run new load test.
* A few README fixes.
* Stop test after duration.
* Add files via upload
* Add files via upload
* Add dashboard screenshots.
* Further experiments.
* Add metrics prerequisite.
* Replace Configuration and Revision with Service.
* Remove docs from image path.
* Revert edits to helloworld-go.
parent fed93e3048
commit cca2e039b5
@@ -1,10 +1,11 @@
 # Autoscale Sample

-A demonstration of the autoscaling capabilities of an Knative Serving Revision.
+A demonstration of the autoscaling capabilities of a Knative Serving Revision.

 ## Prerequisites

 1. A Kubernetes cluster with [Knative Serving](https://github.com/knative/docs/blob/master/install/README.md) installed.
+1. A [metrics installation](https://github.com/knative/docs/blob/master/serving/installing-logging-metrics-traces.md) for viewing scaling graphs (optional).
 1. Install [Docker](https://docs.docker.com/get-started/#prepare-your-docker-environment).
 1. Check out the code:
 ```
@@ -51,54 +52,106 @@ Build the application container and publish it to a container registry:

 1. Deploy the Knative Serving sample:
 ```
-kubectl apply -f serving/samples/autoscale-go/sample.yaml
+kubectl apply -f serving/samples/autoscale-go/service.yaml
 ```

 1. Find the ingress hostname and IP and export as an environment variable:
 ```
-export SERVICE_HOST=`kubectl get route autoscale-route -o jsonpath="{.status.domain}"`
-export SERVICE_IP=`kubectl get svc knative-ingressgateway -n istio-system -o jsonpath="{.status.loadBalancer.ingress[*].ip}"`
+export IP_ADDRESS=`kubectl get svc knative-ingressgateway -n istio-system -o jsonpath="{.status.loadBalancer.ingress[*].ip}"`
 ```

 ## View the Autoscaling Capabilities

-1. Request the largest prime less than 40,000,000 from the autoscale app. Note that it consumes about 1 cpu/sec.
+1. Make a request to the autoscale app to see it consume some resources.
 ```
-time curl --header "Host:$SERVICE_HOST" http://${SERVICE_IP?}/primes/40000000
+curl --header "Host: autoscale-go.default.example.com" "http://${IP_ADDRESS?}?sleep=100&prime=1000000&bloat=50"
+```
+```
+Allocated 50 Mb of memory.
+The largest prime less than 1000000 is 999983.
+Slept for 100.13 milliseconds.
 ```

-1. Ramp up traffic on the autoscale app (about 300 QPS):
+1. Ramp up traffic to maintain 10 in-flight requests.

 ```
-kubectl delete namespace hey --ignore-not-found && kubectl create namespace hey
+go run serving/samples/autoscale-go/test/test.go -sleep 100 -prime 1000000 -bloat 50 -qps 9999 -concurrency 10
 ```
 ```
-for i in `seq 2 2 60`; do
-  kubectl -n hey run hey-$i --image josephburnett/hey --restart Never -- \
-    -n 999999 -c $i -z 2m -host $SERVICE_HOST \
-    "http://${SERVICE_IP?}/primes/40000000"
-  sleep 1
-done
+REQUEST STATS:
+Total: 34 Inflight: 10 Done: 34 Success Rate: 100.00% Avg Latency: 0.2584 sec
+Total: 69 Inflight: 10 Done: 35 Success Rate: 100.00% Avg Latency: 0.2750 sec
+Total: 108 Inflight: 10 Done: 39 Success Rate: 100.00% Avg Latency: 0.2598 sec
+Total: 148 Inflight: 10 Done: 40 Success Rate: 100.00% Avg Latency: 0.2565 sec
+Total: 185 Inflight: 10 Done: 37 Success Rate: 100.00% Avg Latency: 0.2624 sec
+...
 ```
+> Note: Use CTRL+C to exit the load test.

 1. Watch the Knative Serving deployment pod count increase.
 ```
 kubectl get deploy --watch
 ```
 > Note: Use CTRL+C to exit watch mode.

-1. Watch the pod traffic ramp up.
+## Analysis

+### Algorithm

+Knative Serving autoscaling is based on the average number of in-flight requests per pod (concurrency). The system has a default [target concurrency of 1.0](https://github.com/knative/serving/blob/5441a18b360805d261528b2ac8ac13124e826946/config/config-autoscaler.yaml#L27).

+For example, if a Revision is receiving 35 requests per second, each of which takes about 0.25 seconds, Knative Serving will determine that the Revision needs about 9 pods:

+```
+35 * .25 = 8.75
+ceil(8.75) = 9
+```
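To make the arithmetic above concrete, here is a minimal Go sketch of that calculation. The `desiredPods` helper and its signature are illustrative only, not part of the Knative autoscaler's actual API; it simply multiplies observed request rate by average latency, divides by the per-pod concurrency target, and rounds up.

```go
package main

import (
	"fmt"
	"math"
)

// desiredPods is an illustrative helper (not a real Knative function).
// It estimates the pod count from request rate (req/sec), average
// request latency (sec), and the per-pod concurrency target.
func desiredPods(rps, latencySec, targetConcurrency float64) int {
	// Average in-flight requests across the Revision: rate * latency.
	observedConcurrency := rps * latencySec
	return int(math.Ceil(observedConcurrency / targetConcurrency))
}

func main() {
	// 35 req/sec * 0.25 sec = 8.75 concurrent requests -> 9 pods
	// at the default target concurrency of 1.0.
	fmt.Println(desiredPods(35, 0.25, 1.0))
}
```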

+### Dashboards

+View the Knative Serving Scaling and Request dashboards (if configured).
+```
+kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000
+```

+

+

+### Other Experiments

+1. Maintain 100 concurrent requests.
 ```
-kubectl get pods -n hey --show-all --watch
+go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 100
 ```

-1. Look at the latency, requests/sec and success rate of each pod.
+1. Maintain 100 qps with fast requests.
 ```
-for i in `seq 2 2 60`; do kubectl -n hey logs hey-$i ; done
+go run serving/samples/autoscale-go/test/test.go -qps 100 -concurrency 9999
+```

+1. Maintain 100 qps with slow requests.
+```
+go run serving/samples/autoscale-go/test/test.go -qps 100 -concurrency 9999 -sleep 500
+```

+1. Heavy CPU usage.
+```
+go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 10 -prime 40000000
+```

+1. Heavy memory usage.
+```
+go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 5 -bloat 1000
 ```

 ## Cleanup

 ```
-kubectl delete namespace hey
 kubectl delete -f serving/samples/autoscale-go/sample.yaml
 ```

+## Further reading

+1. [Autoscaling Developer Documentation](https://github.com/knative/serving/blob/master/docs/scaling/DEVELOPMENT.md)
@@ -16,16 +16,18 @@ limitations under the License.
 package main

 import (
-	"encoding/json"
+	"fmt"
 	"math"
 	"net/http"
 	"strconv"
+	"sync"
+	"time"
 )

 // Algorithm from https://stackoverflow.com/a/21854246

 // Only primes less than or equal to N will be generated
-func primes(N int) []int {
+func allPrimes(N int) []int {

 	var x, y, n int
 	nsqrt := math.Sqrt(float64(N))
@@ -71,22 +73,87 @@ func primes(N int) []int {
 	return primes
 }

-const primesPath = "/primes/"
+func bloat(mb int) string {
+	b := make([]byte, mb*1024*1024)
+	b[0] = 1
+	b[len(b)-1] = 1
+	return fmt.Sprintf("Allocated %v Mb of memory.\n", mb)
+}

+func prime(max int) string {
+	p := allPrimes(max)
+	if len(p) > 0 {
+		return fmt.Sprintf("The largest prime less than %v is %v.\n", max, p[len(p)-1])
+	} else {
+		return fmt.Sprintf("There are no primes smaller than %v.\n", max)
+	}
+}

+func sleep(ms int) string {
+	start := time.Now().UnixNano()
+	time.Sleep(time.Duration(ms) * time.Millisecond)
+	end := time.Now().UnixNano()
+	return fmt.Sprintf("Slept for %.2f milliseconds.\n", float64(end-start)/1000000)
+}

+func parseIntParam(r *http.Request, param string) (int, bool, error) {
+	if value := r.URL.Query().Get(param); value != "" {
+		i, err := strconv.Atoi(value)
+		if err != nil {
+			return 0, false, err
+		}
+		if i == 0 {
+			return i, false, nil
+		}
+		return i, true, nil
+	}
+	return 0, false, nil
+}

 func handler(w http.ResponseWriter, r *http.Request) {
-	w.Header().Set("Content-Type", "application/json")
-	param := r.URL.Path[len(primesPath):]
-	n, err := strconv.Atoi(param)
+	// Validate inputs.
+	ms, hasMs, err := parseIntParam(r, "sleep")
 	if err != nil {
-		w.WriteHeader(http.StatusBadRequest)
-	} else {
-		w.WriteHeader(http.StatusOK)
-		p := primes(n)
-		json.NewEncoder(w).Encode(p[len(p)-1:])
+		http.Error(w, err.Error(), http.StatusBadRequest)
+		return
 	}
+	max, hasMax, err := parseIntParam(r, "prime")
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusBadRequest)
+		return
+	}
+	mb, hasMb, err := parseIntParam(r, "bloat")
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusBadRequest)
+		return
+	}
+	// Consume time, cpu and memory in parallel.
+	var wg sync.WaitGroup
+	defer wg.Wait()
+	if hasMs {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			fmt.Fprint(w, sleep(ms))
+		}()
+	}
+	if hasMax {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			fmt.Fprint(w, prime(max))
+		}()
+	}
+	if hasMb {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			fmt.Fprint(w, bloat(mb))
+		}()
 	}
 }

 func main() {
-	http.HandleFunc(primesPath, handler)
+	http.HandleFunc("/", handler)
 	http.ListenAndServe(":8080", nil)
 }
@@ -1,5 +0,0 @@
-FROM golang
-
-RUN go get -u github.com/rakyll/hey
-
-ENTRYPOINT ["hey"]
Binary file not shown (new image, 183 KiB).
@@ -1,46 +0,0 @@
-# Copyright 2018 The Knative Authors
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-apiVersion: serving.knative.dev/v1alpha1
-kind: Configuration
-metadata:
-  name: autoscale-configuration
-  namespace: default
-spec:
-  revisionTemplate:
-    metadata:
-      labels:
-        knative.dev/type: app
-    spec:
-      container:
-        # This is the Go import path for the binary to containerize
-        # and substitute here.
-        image: github.com/knative/docs/serving/samples/autoscale-go
-        # When scaling up Knative controller doesn't have a way to
-        # know if the user container is good for serving. Having a
-        # readiness probe prevents traffic to be routed to a pod
-        # before the user container is ready.
-        readinessProbe:
-          httpGet:
-            path: "primes/4"
-          periodSeconds: 2
----
-apiVersion: serving.knative.dev/v1alpha1
-kind: Route
-metadata:
-  name: autoscale-route
-  namespace: default
-spec:
-  traffic:
-  - configurationName: autoscale-configuration
-    percent: 100
Binary file not shown (new image, 133 KiB).
@@ -0,0 +1,25 @@
+# Copyright 2018 The Knative Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: serving.knative.dev/v1alpha1
+kind: Service
+metadata:
+  name: autoscale-go
+  namespace: default
+spec:
+  runLatest:
+    configuration:
+      revisionTemplate:
+        spec:
+          container:
+            image: github.com/knative/docs/serving/samples/autoscale-go
@@ -0,0 +1,134 @@
+package main
+
+import (
+	"flag"
+	"fmt"
+	"net/http"
+	"os"
+	"sync/atomic"
+	"time"
+)
+
+var (
+	sleep       = flag.Int("sleep", 0, "milliseconds to sleep")
+	prime       = flag.Int("prime", 0, "calculate largest prime less than")
+	bloat       = flag.Int("bloat", 0, "mb of memory to consume")
+	ip          = flag.String("ip", "", "ip address of knative ingress")
+	port        = flag.String("port", "80", "port of call")
+	host        = flag.String("host", "autoscale-go.default.example.com", "host name of revision under test")
+	qps         = flag.Int("qps", 10, "max requests per second")
+	concurrency = flag.Int("concurrency", 10, "max in-flight requests")
+	duration    = flag.Duration("duration", time.Minute, "duration of the test")
+	verbose     = flag.Bool("verbose", false, "verbose output for debugging")
+)
+
+type result struct {
+	success    bool
+	statusCode int
+	latency    int64
+}
+
+func get(url string, client *http.Client, report chan *result) {
+	start := time.Now()
+	result := &result{}
+	defer func() {
+		end := time.Now()
+		result.latency = end.UnixNano() - start.UnixNano()
+		report <- result
+	}()
+
+	req, err := http.NewRequest("GET", url, nil)
+	if err != nil {
+		if *verbose {
+			fmt.Printf("%v\n", err)
+		}
+		return
+	}
+	req.Host = *host
+	res, err := client.Do(req)
+	if err != nil {
+		if *verbose {
+			fmt.Printf("%v\n", err)
+		}
+		return
+	}
+	result.statusCode = res.StatusCode
+	if result.statusCode != http.StatusOK {
+		if *verbose {
+			fmt.Printf("%+v\n", res)
+		}
+		return
+	}
+	result.success = true
+}
+
+func reporter(stopCh <-chan time.Time, report chan *result, inflight *int64) {
+	tickerCh := time.NewTicker(time.Second).C
+	var (
+		total       int64
+		count       int64
+		nanoseconds int64
+		successful  int64
+	)
+	fmt.Println("REQUEST STATS:")
+	for {
+		select {
+		case <-stopCh:
+			return
+		case <-tickerCh:
+			fmt.Printf("Total: %v\tInflight: %v\tDone: %v ", total, atomic.LoadInt64(inflight), count)
+			if count > 0 {
+				fmt.Printf("\tSuccess Rate: %.2f%%\tAvg Latency: %.4f sec\n", float64(successful)/float64(count)*100, float64(nanoseconds)/float64(count)/(1000000000))
+			} else {
+				fmt.Printf("\n")
+			}
+			count = 0
+			nanoseconds = 0
+			successful = 0
+		case r := <-report:
+			total++
+			count++
+			nanoseconds += r.latency
+			if r.success {
+				successful++
+			}
+		}
+	}
+}
+
+func main() {
+	flag.Parse()
+	if *ip == "" {
+		ipAddress := os.Getenv("IP_ADDRESS")
+		ip = &ipAddress
+	}
+	if *ip == "" {
+		panic("need either $IP_ADDRESS env var or --ip flag")
+	}
+	url := fmt.Sprintf(
+		"http://%v:%v?sleep=%v&prime=%v&bloat=%v",
+		*ip, *port, *sleep, *prime, *bloat)
+	client := &http.Client{}
+
+	stopCh := time.After(*duration)
+	report := make(chan *result, 10000)
+	var inflight int64
+
+	go reporter(stopCh, report, &inflight)
+
+	qpsCh := time.NewTicker(time.Duration(time.Second.Nanoseconds() / int64(*qps))).C
+	for {
+		select {
+		case <-stopCh:
+			return
+		case <-qpsCh:
+			if atomic.LoadInt64(&inflight) < int64(*concurrency) {
+				atomic.AddInt64(&inflight, 1)
+				go func() {
+					get(url, client, report)
+					atomic.AddInt64(&inflight, -1)
+				}()
+			}
+		}
+	}
+}