---
title: Exponential Histograms
date: 2023-05-22
author: '[Daniel Dyla](https://github.com/dyladan)'
spelling: cSpell:ignore subsetting Ruslan Vovalov Ganesh Vernekar
canonical_url: https://dyladan.me/histograms/2023/05/04/exponential-histograms/
---
Previously, in [Why Histograms?][] and [Histograms vs Summaries][], I went over
the basics of histograms and summaries, explaining the tradeoffs, benefits, and
limitations of each. Because they're easy to understand and demonstrate, those
posts focused on so-called explicit bucket histograms. The exponential bucket
histogram, also referred to as the native histogram in Prometheus, is a low-cost,
efficient alternative to the explicit bucket histogram. In this post, I go through
what exponential histograms are, how they work, and the problems they solve that
explicit bucket histograms struggle with.
## Types of histograms
For the purposes of this blog post, there are two major types of histograms:
explicit bucket histograms and exponential bucket histograms. In previous posts,
I've focused on what OpenTelemetry calls explicit bucket histograms and
Prometheus simply refers to as histograms. As the name implies, an explicit
bucket histogram has each bucket configured explicitly by either the user or
some default list of buckets. Exponential histograms work by calculating bucket
boundaries using an exponential growth function. This means each consecutive
bucket is larger than the previous one, ensuring a constant relative error for
every bucket.
## Exponential histograms
In OpenTelemetry exponential histograms, buckets are calculated automatically
from an integer _scale factor_, with larger scale factors offering smaller
buckets and greater precision. It is important to select a scale factor that is
appropriate for the distribution of values you are collecting in order to
minimize error, maximize efficiency, and ensure the values being collected fit
in a reasonable number of buckets. In the next few sections, I'll go over the
scale and error calculations in detail.
## Scale factor
The most important and most fundamental part of an exponential histogram is also
one of the trickiest to understand: the scale factor. The bucket boundaries, and
by extension the resolution, range, and error rate, are all derived from it. The
first step is to calculate the histogram base.
The base is a constant derived directly from the scale using the equation
`2 ^ (2 ^ -scale)`. For example, given a scale of 3, the base can be calculated
as `2^(2^-3) ~= 1.090508`. Because the scale is negated in the exponent, the base
shrinks as the scale grows, and vice versa. As will be shown later, this is the
fundamental reason that a greater scale factor results in smaller buckets and a
higher-resolution histogram.
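To make this concrete, here are a few lines of Python (my own illustration, not from the original post) that evaluate the formula for the scales used in the rest of this article:

```python
# Base for a given scale: base = 2 ** (2 ** -scale).
# A larger scale produces a base closer to 1, and therefore narrower buckets.
for scale in (-1, 0, 1, 3):
    base = 2 ** (2 ** -scale)
    print(f"scale {scale:>2}: base = {base:.6f}")
# scale -1: base = 4.000000
# scale  0: base = 2.000000
# scale  1: base = 1.414214
# scale  3: base = 1.090508
```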
## Bucket calculation
Given a scale factor and its resulting base, we can calculate every possible
bucket in the histogram. From the base, the upper bound of each bucket at index
`i` is defined to be `base ^ (i + 1)`, with the first bucket having a lower
boundary of 1. Because of this, the upper boundary of the first bucket at index 0 is also
exactly the base. For now, we will only consider nonnegative indices, but
negative indexed buckets are also possible and define all buckets between 0
and 1. Keeping with our example using a scale of 3 and resulting base of
1.090508, the third bucket at index 2 has an upper bound of
`1.090508^(2+1) = 1.29684`. The following table shows upper bounds for the first
10 buckets of a few different scale factors:

| index | scale -1 | scale 0 | scale 1 | scale 3 |
| ----- | -------- | ------- | ------- | ------- |
| -1 | **1** | **1** | **1** | **1** |
| 0 | **4** | 2 | 1.4142 | 1.0905 |
| 1 | **16** | **4** | 2 | 1.1892 |
| 2 | 64 | 8 | 2.8284 | 1.2968 |
| 3 | 256 | **16** | **4** | 1.4142 |
| 4 | 1024 | 32 | 5.6569 | 1.5422 |
| 5 | 4096 | 64 | 8 | 1.6818 |
| 6 | 16384 | 128 | 11.3137 | 1.8340 |
| 7 | 65536 | 256 | **16** | 2 |
| 8 | 262144 | 512 | 22.6274 | 2.1810 |
| 9 | 1048576 | 1024 | 32 | 2.3784 |
I've bolded some of the values here to show an important property of exponential
histograms called _perfect subsetting_.
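Before moving on, a short Python sketch (mine, not part of the original post) shows how a measurement maps to a bucket index, and reproduces one of the upper bounds from the table above:

```python
import math

def bucket_index(value: float, scale: int) -> int:
    """Index i of the bucket (base**i, base**(i+1)] that contains value."""
    # Equivalent to ceil(log_base(value)) - 1; using log2 makes the scale
    # factor explicit, since log_base(v) = log2(v) * 2**scale.
    return math.ceil(math.log2(value) * 2 ** scale) - 1

def upper_bound(index: int, scale: int) -> float:
    base = 2 ** (2 ** -scale)
    return base ** (index + 1)

# A value of 1.25 at scale 3 lands in the bucket at index 2,
# whose upper bound is the 1.2968 shown in the table.
print(bucket_index(1.25, scale=3))        # 2
print(round(upper_bound(2, scale=3), 4))  # 1.2968
```

Actual SDK implementations are more careful about floating-point rounding at exact bucket boundaries, but the idea is the same.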
## Perfect subsetting
In the table above, some of the bucket boundaries are shared between histograms
with differing scale factors. In fact, each time the scale factor increases by
1, exactly 1 boundary is inserted between each existing boundary. This feature
is called perfect subsetting because each set of boundaries for a given scale
factor is a perfect subset of the boundaries for any histogram with a greater
scale factor.
Because of this, histograms with differing scale factors can be normalized to
whichever has the lesser scale factor by combining neighboring buckets. This
means that histograms with different scale factors can still be combined into a
single histogram with exactly the precision of the least precise histogram being
combined. For example, histogram _A_ with scale 3 and histogram _B_ with scale 2
can be combined into a single histogram _C_ with scale 2 by first summing each
pair of neighboring buckets in _A_ to form histogram _A'_ with scale 2. Then,
each bucket in _A'_ is summed with the corresponding bucket of the same index in
_B_ to make _C_.
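A simplified Python sketch of that merge (my own illustration; it ignores the bucket offset that real implementations track) looks like this:

```python
import math

def downscale(counts: list[int], scale_change: int) -> list[int]:
    """Lower a histogram's scale by merging groups of neighboring buckets.

    Each one-point drop in scale doubles the bucket width, so pairs of
    neighboring buckets are summed. `counts` are bucket counts starting
    at index 0.
    """
    factor = 2 ** scale_change
    merged = [0] * math.ceil(len(counts) / factor)
    for i, count in enumerate(counts):
        merged[i // factor] += count
    return merged

# Histogram A (scale 3) is rescaled to scale 2, then added bucket by bucket
# to histogram B, which is already at scale 2, producing histogram C.
a_scale3 = [1, 4, 2, 0, 3, 5, 1, 2]
b_scale2 = [2, 1, 0, 7]
a_scale2 = downscale(a_scale3, scale_change=1)  # [5, 2, 8, 3]
c_scale2 = [x + y for x, y in zip(a_scale2, b_scale2)]
print(c_scale2)  # [7, 3, 8, 10]
```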
## Relative Error
A histogram does not store exact values for each point, but represents each
point as a bucket consisting of a range of possible points. This can be thought
of as being similar to lossy compression. In the same way that it is impossible
to recover an exact source image from a compressed JPEG, it is impossible to
recover the exact input data set from a histogram. The difference between the
input data and the estimated reconstruction of the data is the error of the
histogram. It is important to understand histogram errors because they affect
φ-quantile estimation and may affect how you define your SLOs.
The relative error for a histogram is defined as half the bucket width divided
by the bucket midpoint. Because the relative error is the same across all
buckets, we can use the first bucket with the upper bound of the base to make
the math easy. An example is shown below using a scale of 3.
```
scale = 3
# For base calculation, see above
base = 1.090508
relative error = (bucketWidth / 2) / bucketMidpoint
               = ((upper - lower) / 2) / ((upper + lower) / 2)
               = ((base - 1) / 2) / ((base + 1) / 2)
               = (base - 1) / (base + 1)
               = (1.090508 - 1) / (1.090508 + 1)
               = 0.04329
               = 4.329%
```
For more information regarding histogram errors, see [OTEP 149][] and the
[specification for exponential histogram aggregations][].
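Since the error depends only on the base, a short Python loop (my addition) shows how quickly it shrinks as the scale grows, which is useful context for the next section:

```python
# Relative error per scale: (base - 1) / (base + 1), with base = 2 ** (2 ** -scale).
for scale in range(0, 7):
    base = 2 ** (2 ** -scale)
    error = (base - 1) / (base + 1)
    print(f"scale {scale}: ~{error:.2%} relative error")
# scale 0: ~33.33%, scale 3: ~4.33%, scale 6: ~0.54%, and so on
```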
## Choosing a scale
Because increasing the scale factor increases the resolution and decreases the
relative error, it may be tempting to choose a large scale factor. After all,
why would you want to introduce error? The answer is that there is a positive
relationship between the scale factor and the number of buckets required to
represent values within a specified range. For example, with 160 buckets (the
OpenTelemetry default), histogram _A_ with a scale factor of 3 can represent
values between 1 and about 1 million; histogram _B_ with a scale of 4 and the
same number of buckets would only be able to represent values between 1 and
about 1000, albeit at half the relative error. To represent the same range of
values as _A_ with _B_, twice as many buckets are required; in this case 320.
This brings me to the first and most important consideration when choosing a
scale: _data contrast_. Data contrast is how you describe the difference in scale between the
smallest possible value x and the largest possible value y in your dataset and
is calculated as the constant multiple c such that `y = c * x`. For example, if
your data is between 1 and 1000 milliseconds, your data contrast is 1000. If
your data is between 1 kilobyte and 1 terabyte, your data contrast is
1,000,000,000. Data contrast, scale, and the number of buckets are all
interlinked such that if you know any two of them, you can calculate the third.
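Here is that relationship as a couple of Python helpers (my own sketch, assuming the data starts at the first bucket boundary):

```python
import math

def buckets_needed(contrast: float, scale: int) -> int:
    """Buckets required to cover a given data contrast at a given scale."""
    # Each bucket spans a factor of base = 2 ** (2 ** -scale), so covering a
    # contrast of c takes log_base(c) = log2(c) * 2**scale buckets.
    return math.ceil(math.log2(contrast) * 2 ** scale)

def best_scale(contrast: float, max_buckets: int) -> int:
    """Largest scale whose buckets still cover the contrast within max_buckets."""
    return math.floor(math.log2(max_buckets / math.log2(contrast)))

print(buckets_needed(2 ** 20, scale=3))     # 160 -- histogram A from the example
print(buckets_needed(2 ** 20, scale=4))     # 320 -- histogram B needs twice as many
print(best_scale(10_000, max_buckets=160))  # 3 -- the 1 ms to 10 s web request case
```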
Fortunately, if you are using OpenTelemetry, scale choice is largely done for
you. In OpenTelemetry, you configure a maximum scale (default 20) and a maximum
size (default 160), or number of buckets, in the histogram. The histogram is
initially assumed to have the maximum scale. As additional data points are added,
the histogram will rescale itself down such that the data points always fit
within your maximum number of buckets. The default of 160 buckets was chosen by
the OpenTelemetry authors to be able to cover typical web requests between 1ms
and 10s with less than 5% relative error. If your data has less contrast, your
error will be even less.
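If you do want to control these limits yourself, the configuration looks roughly like the following in the OpenTelemetry Python SDK; the class and parameter names are taken from my reading of that SDK and may differ by version and language, so treat this as a sketch rather than a reference:

```python
from opentelemetry.metrics import get_meter, set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import ExponentialBucketHistogramAggregation, View

# Use the base-2 exponential aggregation for every histogram instrument, with
# the default limits: at most 160 buckets, starting from scale 20 and
# rescaling down automatically as higher-contrast data arrives.
provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(ConsoleMetricExporter())],
    views=[
        View(
            instrument_name="*",
            aggregation=ExponentialBucketHistogramAggregation(max_size=160, max_scale=20),
        )
    ],
)
set_meter_provider(provider)

histogram = get_meter("example").create_histogram("http.server.duration", unit="ms")
histogram.record(12.5)
```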
## Negative or zero values
For the bulk of this post we have ignored zero and negative values, but negative
buckets work much the same way, growing larger as the buckets get further from
zero. All of the math and explanation above applies in the same way to negative
values, with each value replaced by its absolute value, and bucket upper bounds
becoming lower bounds (upper bounds in absolute value). Zero values, or values
with an absolute value less than a configurable threshold, go into a special
zero bucket. When merging histograms with differing zero thresholds, the larger
threshold is used, and the counts of any buckets whose absolute-value upper
bounds fall within that threshold are added to the zero bucket before the
buckets themselves are discarded.
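Putting it together, a tiny Python sketch (mine, with an arbitrary zero threshold) shows where a single measurement ends up:

```python
import math

ZERO_THRESHOLD = 1e-6  # values with an absolute value at or below this go to the zero bucket

def classify(value: float, scale: int):
    """Return the bucket set (zero, positive, or negative) and index for a measurement."""
    if abs(value) <= ZERO_THRESHOLD:
        return ("zero", None)
    index = math.ceil(math.log2(abs(value)) * 2 ** scale) - 1
    return ("positive" if value > 0 else "negative", index)

print(classify(0.0, scale=3))    # ('zero', None)
print(classify(1.25, scale=3))   # ('positive', 2)
print(classify(-1.25, scale=3))  # ('negative', 2) -- same index, in the negative bucket set
```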
## OpenTelemetry and Prometheus
Compatibility between OpenTelemetry and Prometheus is probably a topic large
enough for its own post. For now, I will just say that for all practical
purposes, OpenTelemetry exponential histograms are 1:1 compatible with
Prometheus native histograms. Scale calculations, bucket boundaries, error
rates, zero buckets, etc. are all the same. For more information, I recommend you
watch this talk given by Ruslan Vovalov and Ganesh Vernekar: [Using
OpenTelemetry's Exponential Histograms in Prometheus][].
_A version of this article was [originally posted][] to the author's blog._
<!-- prettier-ignore-start -->
[Using OpenTelemetry's Exponential Histograms in Prometheus]:
https://www.youtube.com/watch?v=W2_TpDcess8
[OTEP 149]: https://github.com/open-telemetry/oteps/blob/976c9395e4cbb3ea933d3b51589eba94b87a17bd/text/0149-exponential-histogram.md
[specification for exponential histogram aggregations]: /docs/specs/otel/metrics/sdk/#base2-exponential-bucket-histogram-aggregation
[Why Histograms?]: {{% relref "why-histograms" %}}
[Histograms vs Summaries]: {{% relref "histograms-vs-summaries" %}}
[originally posted]: {{% param canonical_url %}}
<!-- prettier-ignore-end -->