[0.8] layout and initial content for perf and scalability section (#1076)

* [wip] layout for perf and scalability section

* Some actual content for fortio including embedding result example + scalability page

* iframe border 0

* Minor typos fixed

Still need more content

* Still wip, moved out of concepts

* Still wip; some content and update

* Review feedback from daneyon

* Updates order->weight

Looks like the ordering now is called weight

* Added @ozevren provided section for ubench

and removed latin in other sections too with some early content/pointers

* Remove draft:true so preview shows the page

Even though the content is incomplete

* Add more links and data

* More explanation

* Showcase some more features

* Set ymax and ylog for first histogram

* Making spell check happy

Hopefully, as I can’t seem to run mdspell locally

* Further spell checks

* Adding diagram

* Use BluePerf consistently

* Linters...

* Adding preliminary sizing information

* Spelling/grammar

* Adding latency summary

(Feedback from Louis)

* Rephrase goal and highlight the features you get for cpu cost

* Update scalability page title to include sizing guide

Even if it’s “light” for now

* Move to new location

* Updates for hugo

* Somehow _site was removed from gitignore... fixing

* More merge/gitignore issues

* Put micro benchmark first/before scenarios

and more Hugo removals

* Remove "here" as anchor

Cc @ozevren / from
http://preliminary.istio.io/about/contribute/style-guide/#create-useful-links

* Make spellchecker happy (Github->GitHub)

* Adding more information about logging on/off and mTLS results

* Review comments

* Switch to approx 10ms

* Hoping to solve linter/spelling
Laurent Demailly 2018-05-29 18:42:56 -07:00 committed by Lin Sun
parent ba0cf03b36
commit 786ce41a26
17 changed files with 298 additions and 0 deletions

View File

@ -553,6 +553,12 @@ yamls
yournamespace
zipkin_dashboard.png
zipkin_span.png
BluePerf
embeddable
p99
perfcheck.sh
vCPU
AES-NI
qcc
- search.md

View File

@ -0,0 +1,19 @@
---
title: Micro Benchmarks
overview: Performance measurement through code-level micro-benchmarks.
weight: 20
layout: docs
type: markdown
---
{% include home.html %}
We use Go's native tools to write targeted micro-benchmarks in performance-sensitive areas. Our main goal with this approach is to provide easy-to-use micro-benchmarks that developers can use to perform quick before/after performance comparisons for their changes.
See this [sample micro-benchmark](https://github.com/istio/istio/blob/master/mixer/test/perf/singlecheck_test.go) for Mixer, which measures the performance of the attribute processing code.
Developers can also use a golden-files approach, capturing benchmark results in the source tree for tracking and reference; see this [baseline file](https://github.com/istio/istio/blob/master/mixer/test/perf/bench.baseline) as an example.
Due to the nature of this type of testing, latency numbers show high variance across machines. We recommend comparing micro-benchmark numbers captured this way only against previous runs on the same machine.
The [perfcheck.sh](https://github.com/istio/istio/blob/master/bin/perfcheck.sh) script can be used to quickly run benchmarks in a sub-folder and compare their results against the co-located baseline files.
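To illustrate the pattern, here is a minimal sketch of such a micro-benchmark, runnable with `go test -bench=.`. It is not the actual Mixer benchmark; `processAttributes` is a hypothetical stand-in for the code under test:

```go
package perf

import "testing"

// processAttributes is a hypothetical stand-in for the
// performance-sensitive code being measured.
func processAttributes() {
	sum := 0
	for i := 0; i < 1000; i++ {
		sum += i
	}
	_ = sum
}

// Standard Go benchmark pattern: the testing framework chooses b.N
// so the loop runs long enough to yield a stable ns/op figure.
func BenchmarkProcessAttributes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		processAttributes()
	}
}
```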

View File

@ -0,0 +1,21 @@
---
title: Overview
overview: Provides a conceptual introduction to Istio's performance and scalability work.
weight: 10
layout: docs
type: markdown
---
{% include home.html %}
The Performance and Scalability working group takes a four-pronged approach to Istio's performance characterization, tracking, and improvement:
* Code level micro-benchmarks
* Synthetic end-to-end benchmarks across various scenarios
* Realistic complex app end-to-end benchmarks across various settings
* Automation to ensure performance doesn't regress
We also aim to provide guidance on sizing and configuring Istio installations in production, and on how to ensure scalability.

View File

@ -0,0 +1,17 @@
---
title: Automation
overview: How we ensure performance is tracked and improves or does not regress across releases.
weight: 50
layout: docs
type: markdown
---
{% include home.html %}
Both the synthetic benchmarks (Fortio-based) and the realistic application benchmark (BluePerf)
are part of the nightly release pipeline, and you can see the results at:
* [https://fortio-daily.istio.io/](https://fortio-daily.istio.io/)
* [https://ibmcloud-perf.istio.io/regpatrol/](https://ibmcloud-perf.istio.io/regpatrol/)
This enables us to catch regressions early and track improvements over time.

View File

@ -0,0 +1,13 @@
---
title: Realistic Application Benchmark
overview: Performance measurement through realistic microservice application tests.
weight: 40
layout: docs
type: markdown
---
{% include home.html %}
For realistic application benchmarks, we use [IBM's BluePerf](https://github.com/blueperf).
The results for each build are published automatically on the [Istio Regression Patrol](https://ibmcloud-perf.istio.io/regpatrol/) site.

View File

@ -0,0 +1,33 @@
---
title: Scalability and Sizing Guide
overview: Setup of Istio components to scale horizontally. High availability. Sizing guide.
weight: 60
layout: docs
type: markdown
---
{% include home.html %}
* Set up multiple replicas of the control plane components.
* Set up [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
* Split the Mixer check and report pods.
* Set up high availability (HA).
* See also [Istio's Performance oriented FAQ](https://github.com/istio/istio/wiki/Istio-Performance-oriented-setup-FAQ)
* And the work of the [Performance and Scalability Working Group](https://github.com/istio/community/blob/master/WORKING-GROUPS.md#performance-and-scalability).
Current recommendations (when using all Istio features):
* 1 vCPU per peak thousand requests per second for the sidecar(s).
* Assuming a typical cache hit ratio (>80%) for Mixer checks: 0.5 vCPU per peak thousand requests per second for the Mixer pods.
* The latency cost/overhead is about [14 milliseconds](https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json) for service-to-service calls (2 proxies involved, with Mixer telemetry and checks) as of 0.7.1; we are working on bringing this down to low single-digit milliseconds.
We plan on providing more granular guidance for customers adopting Istio "à la carte".
Istio's goal for 2018 is to reduce both the CPU overhead and the latency of adding Istio to your application. Note, however, that if your application handles its own telemetry, policy, security, network routing, A/B testing, and so on, all of that code and cost can be removed, which should offset most if not all of the Istio overhead.

View File

@ -0,0 +1,23 @@
---
title: Testing Scenarios
overview: The different scenarios we are tracking for performance and scalability.
weight: 15
layout: docs
type: markdown
toc: false
---
{% include home.html %}
{% include image.html width="75%" ratio="61.44%"
link="https://raw.githubusercontent.com/istio/istio/master/tools/perf_setup.svg?sanitize=true"
alt="Performance scenarios diagram"
title="Performance scenarios diagram"
caption="Performance scenarios diagram"
%}
The synthetic benchmark scenarios and the source code of the tests are described
on [GitHub](https://github.com/istio/istio/tree/master/tools#istio-load-testing-user-guide).
<!-- add blueperf and more details -->

View File

@ -0,0 +1,29 @@
---
title: Synthetic End-to-End Benchmarks
overview: Fortio is our simple synthetic HTTP and gRPC benchmarking tool.
weight: 30
layout: docs
type: markdown
---
{% include home.html %}
We use Fortio (Φορτίο) as Istio's synthetic end-to-end load testing tool. Fortio runs at a specified number of queries per second (qps), records a histogram of execution times, and calculates percentiles (e.g. p99, the response time such that 99% of requests take less than that number, in seconds). It can run for a set duration, for a fixed number of calls, or until interrupted (at a constant target QPS, or at maximum speed/load per connection/thread).
Fortio is a fast, small, reusable, embeddable Go library, as well as a command-line tool and a server process; the server includes a simple web UI and graphical representations of the results (both a single-result latency graph and a comparative graph of min, max, average, and percentiles across multiple results).
Fortio is also 100% open source, with no external dependencies besides Go and gRPC, so you can easily reproduce all our results and add your own variants or scenarios you are interested in exploring.
Here is an example result from one scenario (out of the 8 we run for every build), graphing the latency distribution for Istio 0.7.1 at 400 queries per second (qps) between 2 services inside the mesh (with mutual TLS, Mixer checks, and telemetry):
<iframe src="https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json&xMax=105&yLog=true" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
Comparing the 0.6.0 and 0.7.1 histograms/response-time distributions for the same scenario clearly shows the 0.7 improvements:
<iframe src="https://fortio.istio.io/?xMin=2&xMax=110&xLog=true&sel=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06&sel=qps_400-s1_to_s2-0.6.0-2018-04-05-22-33" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
And here is the progress tracked across all tested releases for that scenario:
<iframe src="https://fortio.istio.io/?s=qps_400-s1_to_s2" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
You can learn more about [Fortio](https://github.com/istio/fortio/blob/master/README.md#fortio) on GitHub and see results at [https://fortio.istio.io](https://fortio.istio.io).

View File

@ -22,6 +22,9 @@ is where you can learn about what Istio does and how it does it.
- [Guides](/docs/guides/). Guides are fully working stand-alone examples
intended to highlight a particular set of Istio's features.
- [Performance and Scalability](/docs/performance-and-scalability/).
Information about Istio's performance and scalability processes and results.
- [Reference](/docs/reference/). Detailed exhaustive lists of
command-line options, configuration options, API definitions, and procedures.

View File

@ -0,0 +1,8 @@
---
title: Performance and Scalability
description: Introduces Performance and Scalability methodology, results and best practices for Istio components.
weight: 50
type: section-index
---

View File

@ -0,0 +1,15 @@
---
title: Micro Benchmarks
overview: Performance measurement through code-level micro-benchmarks.
weight: 20
---
We use Go's native tools to write targeted micro-benchmarks in performance-sensitive areas. Our main goal with this approach is to provide easy-to-use micro-benchmarks that developers can use to perform quick before/after performance comparisons for their changes.
See the [sample micro-benchmark](https://github.com/istio/istio/blob/master/mixer/test/perf/singlecheck_test.go) for Mixer, which measures the performance of the attribute processing code.
Developers can also use a golden-files approach, capturing benchmark results in the source tree for tracking and reference; see this [baseline file](https://github.com/istio/istio/blob/master/mixer/test/perf/bench.baseline) as an example.
Due to the nature of this type of testing, latency numbers show high variance across machines. We recommend comparing micro-benchmark numbers captured this way only against previous runs on the same machine.
The [perfcheck.sh](https://github.com/istio/istio/blob/master/bin/perfcheck.sh) script can be used to quickly run benchmarks in a sub-folder and compare their results against the co-located baseline files.

View File

@ -0,0 +1,17 @@
---
title: Overview
overview: Provides a conceptual introduction to Istio's performance and scalability work.
weight: 10
---
The Performance and Scalability working group takes a four-pronged approach to Istio's performance characterization, tracking, and improvement:
* Code level micro-benchmarks
* Synthetic end-to-end benchmarks across various scenarios
* Realistic complex app end-to-end benchmarks across various settings
* Automation to ensure performance doesn't regress
We also aim to provide guidance on sizing and configuring Istio installations in production, and on how to ensure scalability.

View File

@ -0,0 +1,13 @@
---
title: Automation
overview: How we ensure performance is tracked and improves or does not regress across releases.
weight: 50
---
Both the synthetic benchmarks (Fortio-based) and the realistic application benchmark (BluePerf)
are part of the nightly release pipeline, and you can see the results at:
* [https://fortio-daily.istio.io/](https://fortio-daily.istio.io/)
* [https://ibmcloud-perf.istio.io/regpatrol/](https://ibmcloud-perf.istio.io/regpatrol/)
This enables us to catch regressions early and track improvements over time.

View File

@ -0,0 +1,9 @@
---
title: Realistic Application Benchmark
overview: Performance measurement through realistic microservice application tests.
weight: 40
---
For realistic application benchmarks, we use [IBM's BluePerf](https://github.com/blueperf).
The results for each build are published automatically on the [Istio Regression Patrol](https://ibmcloud-perf.istio.io/regpatrol/) site.

View File

@ -0,0 +1,31 @@
---
title: Scalability and Sizing Guide
overview: Setup of Istio components to scale horizontally. High availability. Sizing guide.
weight: 60
---
* Set up multiple replicas of the control plane components.
* Set up [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
* Split the Mixer check and report pods.
* Set up high availability (HA).
* See also [Istio's Performance oriented FAQ](https://github.com/istio/istio/wiki/Istio-Performance-oriented-setup-FAQ)
* And the work of the [Performance and Scalability Working Group](https://github.com/istio/community/blob/master/WORKING-GROUPS.md#performance-and-scalability).
Current recommendations (when using all Istio features):
* 1 vCPU per peak thousand requests per second for the sidecar(s) with access logging (which is on by default), and 0.5 without; `fluentd` on the node is a big contributor to that cost, as it captures and uploads logs.
* Assuming a typical cache hit ratio (>80%) for Mixer checks: 0.5 vCPU per peak thousand requests per second for the Mixer pods.
* The latency cost/overhead is approximately [10 milliseconds](https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json) for service-to-service calls (2 proxies involved, with Mixer telemetry and checks) as of 0.7.1; we expect to bring this down to low single-digit milliseconds.
* mTLS costs are negligible on AES-NI capable hardware, in terms of both CPU and latency.
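As a worked example of these recommendations (illustrative arithmetic only, applying the rough figures above): a mesh whose services peak at a combined 2,000 requests per second would need about 2 vCPU of sidecar capacity with access logging on (about 1 vCPU with it off), plus roughly 1 vCPU for the Mixer pods, and each service-to-service call would gain on the order of 10 ms of latency.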
We plan on providing more granular guidance for customers adopting Istio "à la carte".
Istio's goal for 2018 is to reduce both the CPU overhead and the latency of adding Istio to your application. Note, however, that if your application handles its own telemetry, policy, security, network routing, A/B testing, and so on, all of that code and cost can be removed, which should offset most if not all of the Istio overhead.

View File

@ -0,0 +1,16 @@
---
title: Testing Scenarios
overview: The different scenarios we are tracking for performance and scalability.
weight: 25
---
{{< image width="75%" ratio="61.44%"
link="https://raw.githubusercontent.com/istio/istio/master/tools/perf_setup.svg?sanitize=true"
alt="Performance scenarios diagram"
caption="Performance scenarios diagram"
>}}
The synthetic benchmark scenarios and the source code of the tests are described
on [GitHub](https://github.com/istio/istio/tree/master/tools#istio-load-testing-user-guide).
<!-- add blueperf and more details -->

View File

@ -0,0 +1,25 @@
---
title: Synthetic End-to-End Benchmarks
overview: Fortio is our simple synthetic HTTP and gRPC benchmarking tool.
weight: 30
---
We use Fortio (Φορτίο) as Istio's synthetic end-to-end load testing tool. Fortio runs at a specified number of queries per second (qps), records a histogram of execution times, and calculates percentiles (e.g. p99, the response time such that 99% of requests take less than that number, in seconds). It can run for a set duration, for a fixed number of calls, or until interrupted (at a constant target QPS, or at maximum speed/load per connection/thread).
Fortio is a fast, small, reusable, embeddable Go library, as well as a command-line tool and a server process; the server includes a simple web UI and graphical representations of the results (both a single-result latency graph and a comparative graph of min, max, average, and percentiles across multiple results).
Fortio is also 100% open source, with no external dependencies besides Go and gRPC, so you can easily reproduce all our results and add your own variants or scenarios you are interested in exploring.
Here is an example result from one scenario (out of the 8 we run for every build), graphing the latency distribution for Istio 0.7.1 at 400 queries per second (qps) between 2 services inside the mesh (with mutual TLS, Mixer checks, and telemetry):
<iframe src="https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json&xMax=105&yLog=true" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
Comparing the 0.6.0 and 0.7.1 histograms/response-time distributions for the same scenario clearly shows the 0.7 improvements:
<iframe src="https://fortio.istio.io/?xMin=2&xMax=110&xLog=true&sel=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06&sel=qps_400-s1_to_s2-0.6.0-2018-04-05-22-33" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
And here is the progress tracked across all tested releases for that scenario:
<iframe src="https://fortio.istio.io/?s=qps_400-s1_to_s2" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
You can learn more about [Fortio](https://github.com/istio/fortio/blob/master/README.md#fortio) on GitHub and see results at [https://fortio.istio.io](https://fortio.istio.io).