add info about custom latency buckets (#4236)

* doc: http metrics path normalization

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* doc: code review & path matching rename

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* doc: add configuration examples

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* update: update docs based on last proposal changes

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* feat: more updates based on the ingress/egress merge

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* doc: code review comments

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* doc: code review comments

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* feat: add excludeVerbs

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* feat: new line

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* feat: add review meeting changes

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>

* v1.14 - cherry pick path normalization

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* add additional changes

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* add additional changes

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* add additional changes

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* format table

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Mark Fussell <markfussell@gmail.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* explain buckets

Signed-off-by: Filinto Duran <filinto@diagrid.io>

* Apply suggestions from code review

Co-authored-by: Alice Gibbons <alicejgibbons@gmail.com>
Signed-off-by: Mark Fussell <markfussell@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Mark Fussell <markfussell@gmail.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

* Update daprdocs/content/en/operations/observability/metrics/metrics-overview.md

Co-authored-by: Mark Fussell <markfussell@gmail.com>
Signed-off-by: Filinto Duran <duranto@gmail.com>

---------

Signed-off-by: nelson.parente <nelson_parente@live.com.pt>
Signed-off-by: Filinto Duran <filinto@diagrid.io>
Signed-off-by: Filinto Duran <duranto@gmail.com>
Signed-off-by: Mark Fussell <markfussell@gmail.com>
Signed-off-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Co-authored-by: nelson.parente <nelson_parente@live.com.pt>
Co-authored-by: Mark Fussell <markfussell@gmail.com>
Co-authored-by: Alice Gibbons <alicejgibbons@gmail.com>
Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Filinto Duran 2024-07-10 09:36:58 -05:00 committed by GitHub
parent 867b15348d
commit 64a22cbe3c
3 changed files with 64 additions and 8 deletions

@ -108,6 +108,7 @@ The `metrics` section under the `Configuration` spec contains the following prop
metrics:
  enabled: true
  rules: []
  latencyDistributionBuckets: []
  http:
    increasedCardinality: true
    pathMatching:
@ -121,17 +122,18 @@ metrics:
    excludeVerbs: false
```
In the examples above, the path filter `/orders/{orderID}/items/{itemID}` would return a single metric count matching all the `orderIDs` and all the `itemIDs`, rather than multiple metrics for each `itemID`. For more information, see [HTTP metrics path matching]({{< ref "metrics-overview.md#http-metrics-path-matching" >}}).
In the examples above, this path filter `/orders/{orderID}/items/{itemID}` would return a single metric count matching all the `orderIDs` and all the `itemIDs`, rather than multiple metrics for each `itemID`. For more information, see [HTTP metrics path matching]({{< ref "metrics-overview.md#http-metrics-path-matching" >}}).
The following table lists the properties for metrics:
| Property | Type | Description |
|--------------|--------|-------------|
| `enabled` | boolean | When set to true, the default, enables metrics collection and the metrics endpoint. |
| `rules` | array | Named rule to filter metrics. Each rule contains a set of `labels` to filter on and a `regex` expression to apply to the metrics path. |
| `http.increasedCardinality` | boolean | When set to `true` (default), in the Dapr HTTP server, each request path causes the creation of a new "bucket" of metrics. This can cause issues, including excessive memory consumption, when there are many different requested endpoints (such as when interacting with RESTful APIs).<br> To mitigate high memory usage and egress costs associated with [high cardinality metrics]({{< ref "metrics-overview.md#high-cardinality-metrics" >}}) with the HTTP server, you should set the `metrics.http.increasedCardinality` property to `false`.|
| `http.pathMatching` | array | Paths used for path matching, allowing users to define matching paths in order to manage cardinality. |
| `http.excludeVerbs` | boolean | When set to `true` (default is `false`), the Dapr HTTP server ignores each request HTTP verb when building the method metric label. |
| Property | Type | Description |
|------------------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `enabled` | boolean | When set to true, the default, enables metrics collection and the metrics endpoint. |
| `rules` | array | Named rule to filter metrics. Each rule contains a set of `labels` to filter on and a `regex` expression to apply to the metrics path. |
| `latencyDistributionBuckets` | array | Array of latency distribution buckets in milliseconds for latency metrics histograms. |
| `http.increasedCardinality` | boolean | When set to `true` (default), each request path in the Dapr HTTP server causes the creation of a new "bucket" of metrics. This can cause issues, including excessive memory consumption, when there are many different requested endpoints (such as when interacting with RESTful APIs).<br> To mitigate high memory usage and egress costs associated with [high cardinality metrics]({{< ref "metrics-overview.md#high-cardinality-metrics" >}}) with the HTTP server, set the `metrics.http.increasedCardinality` property to `false`. |
| `http.pathMatching` | array | Array of paths for path matching, allowing users to define matching paths to manage cardinality. |
| `http.excludeVerbs` | boolean | When set to true (default is false), the Dapr HTTP server ignores each request HTTP verb when building the method metric label. |
To further help manage cardinality, path matching lets you match request paths against defined patterns, reducing the number of unique metrics paths and thus controlling metric cardinality. This feature is particularly useful for applications with dynamic URLs, ensuring that metrics remain meaningful and manageable without excessive memory consumption.
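
The effect of path matching on cardinality can be illustrated with a small Python sketch. This is not Dapr's implementation: the helper names `compile_template` and `normalize` are hypothetical, and the sketch assumes that each `{param}` placeholder matches exactly one path segment and that unmatched paths are returned unchanged.

```python
import re

def compile_template(template):
    """Turn a template such as '/orders/{orderID}' into a regex.

    Assumption for this sketch: each {param} matches exactly one path segment.
    """
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") and part.endswith("}") else re.escape(part)
        for part in parts
    )
    return re.compile("^" + pattern + "$")

def normalize(path, templates):
    """Return the first template matching the path, or the path unchanged."""
    for template in templates:
        if compile_template(template).match(path):
            return template
    return path

templates = ["/orders/{orderID}/items/{itemID}"]
paths = ["/orders/1/items/7", "/orders/2/items/9", "/orders/2/items/12"]

# Three distinct request paths collapse into a single path label.
labels = {normalize(path, templates) for path in paths}
print(labels)  # prints: {'/orders/{orderID}/items/{itemID}'}
```

Without a matching template, each of the three request paths would produce its own path label, which is the growth in cardinality that path matching is meant to limit.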

@ -198,7 +198,58 @@ dapr_http_server_request_count{app_id="order-service",method="",path="/orders",s
In this example, the HTTP method is excluded from the metrics, resulting in a single metric for all requests to the `/orders` endpoint.
## Configuring custom latency histogram buckets
Dapr uses cumulative histogram metrics to group latency values into buckets. The count for each bucket includes:
- The requests whose latency falls in that bucket
- All the requests with lower latency
### Using the default latency bucket configurations
By default, Dapr groups request latency metrics into the following buckets:
```
1, 2, 3, 4, 5, 6, 8, 10, 13, 16, 20, 25, 30, 40, 50, 65, 80, 100, 130, 160, 200, 250, 300, 400, 500, 650, 800, 1000, 2000, 5000, 10000, 20000, 50000, 100000
```
Grouping latency values in a cumulative fashion allows buckets to be used or dropped as needed for increased or decreased granularity of data.
For example, if a request takes 3ms, it's counted in the 3ms bucket, the 4ms bucket, the 5ms bucket, and so on.
Similarly, if a request takes 10ms, it's counted in the 10ms bucket, the 13ms bucket, the 16ms bucket, and so on.
After these two requests have completed, the 3ms bucket has a count of 1 and the 10ms bucket has a count of 2, since both the 3ms and 10ms requests are included here.
This shows up as follows:
|1|2|3|4|5|6|8|10|13|16|20|25|30|40|50|65|80|100|130|160| ..... | 100000 |
|-|-|-|-|-|-|-|--|--|--|--|--|--|--|--|--|--|---|---|---|-------|--------|
|0|0|1|1|1|1|1| 2| 2| 2| 2| 2| 2| 2| 2| 2| 2| 2 | 2 | 2 | ..... | 2 |
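
The cumulative counting in the table can be reproduced with a short Python sketch. The function name `cumulative_counts` is hypothetical and the code only illustrates the counting rule; it is not how Dapr records metrics internally.

```python
# Default bucket upper bounds in milliseconds, copied from the list above.
DEFAULT_BUCKETS_MS = [
    1, 2, 3, 4, 5, 6, 8, 10, 13, 16, 20, 25, 30, 40, 50, 65, 80, 100,
    130, 160, 200, 250, 300, 400, 500, 650, 800, 1000, 2000, 5000,
    10000, 20000, 50000, 100000,
]

def cumulative_counts(latencies_ms, buckets_ms):
    """For each bucket, count the requests whose latency is at or below its bound."""
    return [
        sum(1 for latency in latencies_ms if latency <= bound)
        for bound in buckets_ms
    ]

# The two requests from the example: one took 3ms, the other 10ms.
counts = cumulative_counts([3, 10], DEFAULT_BUCKETS_MS)
by_bucket = dict(zip(DEFAULT_BUCKETS_MS, counts))

print(by_bucket[2], by_bucket[3], by_bucket[8], by_bucket[10], by_bucket[100000])
# prints: 0 1 1 2 2
```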
The default number of buckets works well for most use cases, but can be adjusted as needed. Each latency metric is reported as 34 separate bucket counts, so the total number of metrics can grow considerably across a large number of applications.
More accurate latency percentiles can be achieved by increasing the number of buckets. However, a higher number of buckets increases the amount of memory used to store the metrics, potentially negatively impacting your monitoring system.
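
To see why the number of buckets affects percentile accuracy, the following Python sketch estimates a percentile from cumulative bucket counts by interpolating linearly inside the bucket that contains it. This is one common way to approximate percentiles from histogram buckets and is an assumption here; your monitoring system may use a different method. The function names, the sample latencies, and both bucket lists are hypothetical.

```python
def cumulative_counts(latencies_ms, buckets_ms):
    """For each bucket, count the requests whose latency is at or below its bound."""
    return [sum(1 for l in latencies_ms if l <= bound) for bound in buckets_ms]

def estimate_percentile(q, buckets_ms, cumulative):
    """Estimate the q-quantile (0 < q <= 1) from cumulative bucket counts.

    Interpolates linearly between the bounds of the bucket containing the rank.
    """
    rank = q * cumulative[-1]
    lower_bound, lower_count = 0.0, 0
    for bound, count in zip(buckets_ms, cumulative):
        if count >= rank:
            if count == lower_count:  # only happens when there are no samples
                return float(bound)
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (bound - lower_bound) * fraction
        lower_bound, lower_count = float(bound), count
    return float(buckets_ms[-1])

# Hypothetical sample of ten request latencies in milliseconds.
latencies = [12, 18, 22, 27, 33, 41, 48, 55, 62, 90]

fine = [10, 20, 30, 40, 50, 60, 70, 100]  # hypothetical fine-grained buckets
coarse = [50, 100]                        # hypothetical coarse buckets

p90_fine = estimate_percentile(0.9, fine, cumulative_counts(latencies, fine))
p90_coarse = estimate_percentile(0.9, coarse, cumulative_counts(latencies, coarse))

print(round(p90_fine, 1), round(p90_coarse, 1))  # prints: 70.0 83.3
```

In this sample the ninth-slowest request took 62ms. The fine-grained buckets place the 90th percentile within the 60-70ms bucket, while the coarse buckets can only place it somewhere between 50ms and 100ms.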
It is recommended to keep the default latency buckets, unless you are seeing unwanted memory pressure in your monitoring system. Configuring the buckets lets you choose the applications where:
- You want to see more detail, using a higher number of buckets
- Broader values are sufficient, using fewer buckets
Take note of the default latency values your applications are producing before configuring the number of buckets.
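
The memory trade-off can be approximated with simple arithmetic. The Python sketch below is a rough estimate only: the function name `bucket_series`, the fleet size, and the number of latency metrics per application are hypothetical, and the calculation ignores label combinations and any extra series an exporter adds.

```python
def bucket_series(app_count, latency_metrics_per_app, bucket_count):
    """Approximate the number of bucket counts stored across all applications."""
    return app_count * latency_metrics_per_app * bucket_count

# Hypothetical fleet: 50 applications, each exposing 4 latency metrics.
default_total = bucket_series(50, 4, 34)  # the 34 default buckets
reduced_total = bucket_series(50, 4, 11)  # a hypothetical 11-bucket configuration

print(default_total, reduced_total)  # prints: 6800 2200
```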
### Customizing latency buckets to your scenario
Tailor the latency buckets to your needs by modifying the `spec.metrics.latencyDistributionBuckets` field in the [Dapr configuration spec]({{< ref configuration-schema.md >}}) for your application(s).
For example, if you aren't interested in extremely low latency values (1-10ms), you can group them in a single 10ms bucket. Similarly, you can group the high values in a single bucket (1000-5000ms), while keeping more detail in the middle range of values that you are most interested in.
The following Configuration spec example replaces the default 34 buckets with 11 buckets, giving a higher level of granularity in the middle range of values:
```yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: custom-metrics
spec:
  metrics:
    enabled: true
    latencyDistributionBuckets: [10, 25, 40, 50, 70, 100, 150, 200, 500, 1000, 5000]
```
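
To check how a custom bucket list groups your latencies before applying it, you can map sample values to the first bucket that contains them. The Python sketch below uses the 11 buckets from the example above; the helper name `first_bucket` and the sample latencies are hypothetical.

```python
from bisect import bisect_left

# Bucket bounds in milliseconds from the Configuration example above.
CUSTOM_BUCKETS_MS = [10, 25, 40, 50, 70, 100, 150, 200, 500, 1000, 5000]

def first_bucket(latency_ms, buckets_ms):
    """Return the smallest bucket bound >= latency, or None if it exceeds all bounds."""
    index = bisect_left(buckets_ms, latency_ms)
    return buckets_ms[index] if index < len(buckets_ms) else None

# Hypothetical sample latencies in milliseconds.
samples = [2, 7, 10, 60, 65, 1200, 4000]
grouped = [first_bucket(latency, CUSTOM_BUCKETS_MS) for latency in samples]

print(grouped)  # prints: [10, 10, 10, 70, 70, 5000, 5000]
```

All latencies of 10ms or less land in the single 10ms bucket, and everything between 1000ms and 5000ms lands in the 5000ms bucket, while the middle range keeps finer separation.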
## Transform metrics with regular expressions

@ -36,6 +36,9 @@ spec:
        labels:
          - name: <LABEL-NAME>
            regex: {}
    latencyDistributionBuckets:
      - <BUCKET-VALUE-MS-0>
      - <BUCKET-VALUE-MS-1>
    http:
      increasedCardinality: <TRUE-OR-FALSE>
      pathMatching: