[0.8] layout and initial content for perf and scalability section (#1076)

* [wip] layout for perf and scalability section

* Some actual content for fortio including embedding result example + scalability page

* iframe border 0

* Minor typos fixed

Still need more content

* Still wip, moved out of concepts

* Still wip; some content and update

* Review feedback from daneyon

* Updates order->weight

Looks like the ordering now is called weight

* Added @ozevren provided section for ubench

and removed latin in other sections too with some early content/pointers

* Remove draft:true so preview shows the page

Even though the content is incomplete

* Add more links and data

* More explanation

* Showcase some more features

* Set ymax and ylog for first histogram

* Making spell check happy

Hopefully, as I can’t seem to run mdspell locally

* Further spell checks

* Adding diagram

* Use BluePerf consistently

* Linters...

* Adding preliminary sizing information

* Spelling/grammar

* Adding latency summary

(Feedback from Louis)

* Rephrase goal and highlight the features you get for cpu cost

* Update scalability page title to include sizing guide

Even if it’s “light” for now

* Move to new location

* Updates for hugo

* Somehow _site was removed from gitignore... fixing

* More merge/gitignore issues

* Put micro benchmark first/before scenarios

and more Hugo removals

* Remove "here" as anchor

Cc @ozevren / from
http://preliminary.istio.io/about/contribute/style-guide/#create-useful-links

* Make spellchecker happy (Github->GitHub)

* Adding more information about logging on/off and mTLS results

* Review comments

* Switch to approx 10ms

* Hoping to solve linter/spelling
Laurent Demailly 2018-05-29 18:42:56 -07:00 committed by Lin Sun
parent ba0cf03b36
commit 786ce41a26
17 changed files with 298 additions and 0 deletions

View File

@ -553,6 +553,12 @@ yamls
yournamespace
zipkin_dashboard.png
zipkin_span.png
BluePerf
embeddable
p99
perfcheck.sh
vCPU
AES-NI
qcc
- search.md

View File

@ -0,0 +1,19 @@
---
title: Micro Benchmarks
overview: Performance measurement through code-level micro-benchmarks.
weight: 20
layout: docs
type: markdown
---
{% include home.html %}
We use Go's native tools to write targeted micro-benchmarks in performance-sensitive areas. Our main goal with this approach is to provide easy-to-use micro-benchmarks that developers can use to perform quick before/after performance comparisons for their changes.
See this [sample micro-benchmark](https://github.com/istio/istio/blob/master/mixer/test/perf/singlecheck_test.go) for Mixer, which measures the performance of the attribute processing code.
Developers can also use a golden-files approach, capturing benchmark results in the source tree for tracking and reference; see this [baseline file](https://github.com/istio/istio/blob/master/mixer/test/perf/bench.baseline) as an example.
Due to the nature of this type of testing, latency numbers show high variance across machines. We recommend comparing micro-benchmark numbers captured this way only against previous runs on the same machine.
The [perfcheck.sh](https://github.com/istio/istio/blob/master/bin/perfcheck.sh) script can be used to quickly run benchmarks in a sub-folder and compare their results against the co-located baseline files.
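To illustrate the pattern, here is a minimal sketch of such a micro-benchmark, runnable with `go test -bench=.`. It is not the actual Mixer benchmark; `processAttributes` is a hypothetical stand-in for the code under test:

```go
package perf

import "testing"

// processAttributes is a hypothetical stand-in for the
// performance-sensitive code being measured.
func processAttributes() {
	sum := 0
	for i := 0; i < 1000; i++ {
		sum += i
	}
	_ = sum
}

// Standard Go benchmark pattern: the testing framework chooses b.N
// so the loop runs long enough to yield a stable ns/op figure.
func BenchmarkProcessAttributes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		processAttributes()
	}
}
```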

View File

@ -0,0 +1,21 @@
---
title: Overview
overview: Provides a conceptual introduction to Istio's performance and scalability work.
weight: 10
layout: docs
type: markdown
---
{% include home.html %}
The Performance and Scalability working group takes a four-pronged approach to Istio's performance characterization, tracking, and improvement:
* Code level micro-benchmarks
* Synthetic end-to-end benchmarks across various scenarios
* Realistic complex app end-to-end benchmarks across various settings
* Automation to ensure performance doesn't regress
We also aim to provide guidance on sizing and configuring Istio installations in production, and on how to ensure scalability.

View File

@ -0,0 +1,17 @@
---
title: Automation
overview: How we ensure performance is tracked and improves or does not regress across releases.
weight: 50
layout: docs
type: markdown
---
{% include home.html %}
Both the synthetic benchmarks (Fortio-based) and the realistic application benchmark (BluePerf)
are part of the nightly release pipeline, and you can see the results at:
* [https://fortio-daily.istio.io/](https://fortio-daily.istio.io/)
* [https://ibmcloud-perf.istio.io/regpatrol/](https://ibmcloud-perf.istio.io/regpatrol/)
This enables us to catch regressions early and track improvements over time.

View File

@ -0,0 +1,13 @@
---
title: Realistic Application Benchmark
overview: Performance measurement through realistic microservice application tests.
weight: 40
layout: docs
type: markdown
---
{% include home.html %}
For realistic application benchmarks, we use [IBM's BluePerf](https://github.com/blueperf).
The results for each build are published automatically on the [Istio Regression Patrol](https://ibmcloud-perf.istio.io/regpatrol/) site.

View File

@ -0,0 +1,33 @@
---
title: Scalability and Sizing Guide
overview: Setup of Istio components to scale horizontally. High availability. Sizing guide.
weight: 60
layout: docs
type: markdown
---
{% include home.html %}
* Set up multiple replicas of the control plane components.
* Set up [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
* Split the Mixer check and report pods.
* Set up high availability (HA).
* See also [Istio's Performance oriented FAQ](https://github.com/istio/istio/wiki/Istio-Performance-oriented-setup-FAQ)
* And the work of the [Performance and Scalability Working Group](https://github.com/istio/community/blob/master/WORKING-GROUPS.md#performance-and-scalability).
Current recommendations (when using all Istio features):
* 1 vCPU per peak thousand requests per second for the sidecar(s).
* Assuming a typical cache hit ratio (>80%) for Mixer checks: 0.5 vCPU per peak thousand requests per second for the Mixer pods.
* The latency cost/overhead is about [14 milliseconds](https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json) for service-to-service calls (2 proxies involved, with Mixer telemetry and checks) as of 0.7.1; we are working on bringing this down to low single-digit milliseconds.
We plan on providing more granular guidance for customers adopting Istio "à la carte".
Istio's goal for 2018 is to reduce both the CPU overhead and the latency of adding Istio to your application. Note, however, that if your application handles its own telemetry, policy, security, network routing, A/B testing, and so on, all of that code and cost can be removed, which should offset most if not all of the Istio overhead.

View File

@ -0,0 +1,23 @@
---
title: Testing Scenarios
overview: The different scenarios we are tracking for performance and scalability.
weight: 15
layout: docs
type: markdown
toc: false
---
{% include home.html %}
{% include image.html width="75%" ratio="61.44%"
link="https://raw.githubusercontent.com/istio/istio/master/tools/perf_setup.svg?sanitize=true"
alt="Performance scenarios diagram"
title="Performance scenarios diagram"
caption="Performance scenarios diagram"
%}
The synthetic benchmark scenarios and the source code of the tests are described
on [GitHub](https://github.com/istio/istio/tree/master/tools#istio-load-testing-user-guide).
<!-- add blueperf and more details -->

View File

@ -0,0 +1,29 @@
---
title: Synthetic End-to-End Benchmarks
overview: Fortio is our simple synthetic HTTP and gRPC benchmarking tool.
weight: 30
layout: docs
type: markdown
---
{% include home.html %}
We use Fortio (Φορτίο) as Istio's synthetic end-to-end load testing tool. Fortio runs at a specified number of queries per second (qps), records a histogram of execution times, and calculates percentiles (e.g. p99, the response time such that 99% of requests take less than that number, in seconds). It can run for a set duration, for a fixed number of calls, or until interrupted (at a constant target QPS, or at maximum speed/load per connection/thread).
Fortio is a fast, small, reusable, embeddable Go library, as well as a command-line tool and a server process; the server includes a simple web UI and graphical representations of the results (both a single-result latency graph and a comparative graph of min, max, average, and percentiles across multiple results).
Fortio is also 100% open source, with no external dependencies besides Go and gRPC, so you can easily reproduce all our results and add your own variants or scenarios you are interested in exploring.
Here is an example result from one scenario (out of the 8 we run for every build), graphing the latency distribution for Istio 0.7.1 at 400 queries per second (qps) between 2 services inside the mesh (with mutual TLS, Mixer checks, and telemetry):
<iframe src="https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json&xMax=105&yLog=true" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
Comparing the 0.6.0 and 0.7.1 histograms/response-time distributions for the same scenario clearly shows the 0.7 improvements:
<iframe src="https://fortio.istio.io/?xMin=2&xMax=110&xLog=true&sel=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06&sel=qps_400-s1_to_s2-0.6.0-2018-04-05-22-33" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
And here is the progress tracked across all tested releases for that scenario:
<iframe src="https://fortio.istio.io/?s=qps_400-s1_to_s2" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
You can learn more about [Fortio](https://github.com/istio/fortio/blob/master/README.md#fortio) on GitHub and see results at [https://fortio.istio.io](https://fortio.istio.io).

View File

@ -22,6 +22,9 @@ is where you can learn about what Istio does and how it does it.
- [Guides](/docs/guides/). Guides are fully working stand-alone examples
intended to highlight a particular set of Istio's features.
- [Performance and Scalability](/docs/performance-and-scalability/).
Information about Istio's performance and scalability processes and results.
- [Reference](/docs/reference/). Detailed exhaustive lists of
command-line options, configuration options, API definitions, and procedures.

View File

@ -0,0 +1,8 @@
---
title: Performance and Scalability
description: Introduces Performance and Scalability methodology, results and best practices for Istio components.
weight: 50
type: section-index
---

View File

@ -0,0 +1,15 @@
---
title: Micro Benchmarks
overview: Performance measurement through code-level micro-benchmarks.
weight: 20
---
We use Go's native tools to write targeted micro-benchmarks in performance-sensitive areas. Our main goal with this approach is to provide easy-to-use micro-benchmarks that developers can use to perform quick before/after performance comparisons for their changes.
See the [sample micro-benchmark](https://github.com/istio/istio/blob/master/mixer/test/perf/singlecheck_test.go) for Mixer, which measures the performance of the attribute processing code.
Developers can also use a golden-files approach, capturing benchmark results in the source tree for tracking and reference; see this [baseline file](https://github.com/istio/istio/blob/master/mixer/test/perf/bench.baseline) as an example.
Due to the nature of this type of testing, latency numbers show high variance across machines. We recommend comparing micro-benchmark numbers captured this way only against previous runs on the same machine.
The [perfcheck.sh](https://github.com/istio/istio/blob/master/bin/perfcheck.sh) script can be used to quickly run benchmarks in a sub-folder and compare their results against the co-located baseline files.

View File

@ -0,0 +1,17 @@
---
title: Overview
overview: Provides a conceptual introduction to Istio's performance and scalability work.
weight: 10
---
The Performance and Scalability working group takes a four-pronged approach to Istio's performance characterization, tracking, and improvement:
* Code level micro-benchmarks
* Synthetic end-to-end benchmarks across various scenarios
* Realistic complex app end-to-end benchmarks across various settings
* Automation to ensure performance doesn't regress
We also aim to provide guidance on sizing and configuring Istio installations in production, and on how to ensure scalability.

View File

@ -0,0 +1,13 @@
---
title: Automation
overview: How we ensure performance is tracked and improves or does not regress across releases.
weight: 50
---
Both the synthetic benchmarks (Fortio-based) and the realistic application benchmark (BluePerf)
are part of the nightly release pipeline, and you can see the results at:
* [https://fortio-daily.istio.io/](https://fortio-daily.istio.io/)
* [https://ibmcloud-perf.istio.io/regpatrol/](https://ibmcloud-perf.istio.io/regpatrol/)
This enables us to catch regressions early and track improvements over time.

View File

@ -0,0 +1,9 @@
---
title: Realistic Application Benchmark
overview: Performance measurement through realistic microservice application tests.
weight: 40
---
For realistic application benchmarks, we use [IBM's BluePerf](https://github.com/blueperf).
The results for each build are published automatically on the [Istio Regression Patrol](https://ibmcloud-perf.istio.io/regpatrol/) site.

View File

@ -0,0 +1,31 @@
---
title: Scalability and Sizing Guide
overview: Setup of Istio components to scale horizontally. High availability. Sizing guide.
weight: 60
---
* Set up multiple replicas of the control plane components.
* Set up [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
* Split the Mixer check and report pods.
* Set up high availability (HA).
* See also [Istio's Performance oriented FAQ](https://github.com/istio/istio/wiki/Istio-Performance-oriented-setup-FAQ)
* And the work of the [Performance and Scalability Working Group](https://github.com/istio/community/blob/master/WORKING-GROUPS.md#performance-and-scalability).
Current recommendations (when using all Istio features):
* 1 vCPU per peak thousand requests per second for the sidecar(s) with access logging (which is on by default), and 0.5 without; `fluentd` on the node is a big contributor to that cost, as it captures and uploads logs.
* Assuming a typical cache hit ratio (>80%) for Mixer checks: 0.5 vCPU per peak thousand requests per second for the Mixer pods.
* The latency cost/overhead is approximately [10 milliseconds](https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json) for service-to-service calls (2 proxies involved, with Mixer telemetry and checks) as of 0.7.1; we expect to bring this down to low single-digit milliseconds.
* mTLS costs are negligible on AES-NI capable hardware, in terms of both CPU and latency.
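As a worked example of these recommendations (illustrative arithmetic only, applying the rough figures above): a mesh whose services peak at a combined 2,000 requests per second would need about 2 vCPU of sidecar capacity with access logging on (about 1 vCPU with it off), plus roughly 1 vCPU for the Mixer pods, and each service-to-service call would gain on the order of 10 ms of latency.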
We plan on providing more granular guidance for customers adopting Istio "à la carte".
Istio's goal for 2018 is to reduce both the CPU overhead and the latency of adding Istio to your application. Note, however, that if your application handles its own telemetry, policy, security, network routing, A/B testing, and so on, all of that code and cost can be removed, which should offset most if not all of the Istio overhead.

View File

@ -0,0 +1,16 @@
---
title: Testing Scenarios
overview: The different scenarios we are tracking for performance and scalability.
weight: 25
---
{{< image width="75%" ratio="61.44%"
link="https://raw.githubusercontent.com/istio/istio/master/tools/perf_setup.svg?sanitize=true"
alt="Performance scenarios diagram"
caption="Performance scenarios diagram"
>}}
The synthetic benchmark scenarios and the source code of the tests are described
on [GitHub](https://github.com/istio/istio/tree/master/tools#istio-load-testing-user-guide).
<!-- add blueperf and more details -->

View File

@ -0,0 +1,25 @@
---
title: Synthetic End-to-End Benchmarks
overview: Fortio is our simple synthetic HTTP and gRPC benchmarking tool.
weight: 30
---
We use Fortio (Φορτίο) as Istio's synthetic end-to-end load testing tool. Fortio runs at a specified number of queries per second (qps), records a histogram of execution times, and calculates percentiles (e.g. p99, the response time such that 99% of requests take less than that number, in seconds). It can run for a set duration, for a fixed number of calls, or until interrupted (at a constant target QPS, or at maximum speed/load per connection/thread).
Fortio is a fast, small, reusable, embeddable Go library, as well as a command-line tool and a server process; the server includes a simple web UI and graphical representations of the results (both a single-result latency graph and a comparative graph of min, max, average, and percentiles across multiple results).
Fortio is also 100% open source, with no external dependencies besides Go and gRPC, so you can easily reproduce all our results and add your own variants or scenarios you are interested in exploring.
Here is an example result from one scenario (out of the 8 we run for every build), graphing the latency distribution for Istio 0.7.1 at 400 queries per second (qps) between 2 services inside the mesh (with mutual TLS, Mixer checks, and telemetry):
<iframe src="https://fortio.istio.io/browse?url=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06.json&xMax=105&yLog=true" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
Comparing the 0.6.0 and 0.7.1 histograms/response-time distributions for the same scenario clearly shows the 0.7 improvements:
<iframe src="https://fortio.istio.io/?xMin=2&xMax=110&xLog=true&sel=qps_400-s1_to_s2-0.7.1-2018-04-05-22-06&sel=qps_400-s1_to_s2-0.6.0-2018-04-05-22-33" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
And here is the progress tracked across all tested releases for that scenario:
<iframe src="https://fortio.istio.io/?s=qps_400-s1_to_s2" width="100%" height="1024" scrolling="no" frameborder="0"></iframe>
You can learn more about [Fortio](https://github.com/istio/fortio/blob/master/README.md#fortio) on GitHub and see results at [https://fortio.istio.io](https://fortio.istio.io).