Update metric instrumentation guide
parent
0512891c73
commit
50d38598e8
|
|
@@ -1,11 +1,28 @@
|
||||||
## Instrumenting Kubernetes with a new metric
|
## Instrumenting Kubernetes
|
||||||
|
|
||||||
The following is a step-by-step guide for adding a new metric to the Kubernetes
|
The following references and outlines general guidelines for metric instrumentation
|
||||||
code base.
|
in Kubernetes components. Components are instrumented using the
|
||||||
|
[Prometheus Go client library](https://github.com/prometheus/client_golang). For non-Go
|
||||||
|
components, [libraries in other languages](https://prometheus.io/docs/instrumenting/clientlibs/)
|
||||||
|
are available.
|
||||||
|
|
||||||
We use the Prometheus monitoring system's golang client library for
|
The metrics are exposed via HTTP in the
|
||||||
instrumenting our code. Once you've picked out a file that you want to add a
|
[Prometheus metric format](https://prometheus.io/docs/instrumenting/exposition_formats/),
|
||||||
metric to, you should:
|
which is open and well understood by a wide range of third-party applications and vendors
|
||||||
|
outside of the Prometheus ecosystem.
|
||||||
|
|
||||||
|
The [general instrumentation advice](https://prometheus.io/docs/practices/instrumentation/)
|
||||||
|
from the Prometheus documentation applies. This document reiterates common pitfalls and some
|
||||||
|
Kubernetes-specific considerations.
|
||||||
|
|
||||||
|
Prometheus metrics are cheap as they have minimal internal memory state. Set and increment
|
||||||
|
operations are thread safe and take 10-25 nanoseconds (Go & Java).
|
||||||
|
Thus, instrumentation can and should cover all operationally relevant aspects of an application,
|
||||||
|
internal and external.
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
The following describes the basic steps required to add a new metric (in Go).
|
||||||
|
|
||||||
1. Import "github.com/prometheus/client_golang/prometheus".
|
1. Import "github.com/prometheus/client_golang/prometheus".
|
||||||
|
|
||||||
|
|
@@ -22,29 +39,180 @@ the values.
|
||||||
labels on the metric. If so, add "Vec" to the name of the type of metric you
|
labels on the metric. If so, add "Vec" to the name of the type of metric you
|
||||||
want and add a slice of the label names to the definition.
|
want and add a slice of the label names to the definition.
|
||||||
|
|
||||||
https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53
|
[Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53)
|
||||||
https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/kubelet/metrics/metrics.go#L31
|
```go
|
||||||
|
requestCounter = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "apiserver_request_count",
|
||||||
|
Help: "Counter of apiserver requests broken out for each verb, API resource, client, and HTTP response code.",
|
||||||
|
},
|
||||||
|
[]string{"verb", "resource", "client", "code"},
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
3. Register the metric so that prometheus will know to export it.
|
3. Register the metric so that prometheus will know to export it.
|
||||||
|
|
||||||
https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/kubelet/metrics/metrics.go#L74
|
[Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78)
|
||||||
https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78
|
```go
|
||||||
|
func init() {
|
||||||
|
prometheus.MustRegister(requestCounter)
|
||||||
|
prometheus.MustRegister(requestLatencies)
|
||||||
|
prometheus.MustRegister(requestLatenciesSummary)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
4. Use the metric by calling the appropriate method for your metric type (Set,
|
4. Use the metric by calling the appropriate method for your metric type (Set,
|
||||||
Inc/Add, or Observe, respectively for Gauge, Counter, or Histogram/Summary),
|
Inc/Add, or Observe, respectively for Gauge, Counter, or Histogram/Summary),
|
||||||
first calling WithLabelValues if your metric has any labels
|
first calling WithLabelValues if your metric has any labels
|
||||||
|
|
||||||
https://github.com/kubernetes/kubernetes/blob/3ce7fe8310ff081dbbd3d95490193e1d5250d2c9/pkg/kubelet/kubelet.go#L1384
|
[Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87)
|
||||||
https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87
|
```go
|
||||||
|
requestCounter.WithLabelValues(*verb, *resource, client, strconv.Itoa(*httpCode)).Inc()
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
These are the metric type definitions if you're curious to learn about them or
|
## Instrumentation types
|
||||||
need more information:
|
|
||||||
|
Components have metrics capturing events and states that are inherent to their
|
||||||
|
application logic. Examples are request and error counters, request latency
|
||||||
|
histograms, or internal garbage collection cycles. Those metrics are instrumented
|
||||||
|
directly in the application code.
|
||||||
|
|
||||||
|
Secondly, there are business logic metrics. Those are not about observed application
|
||||||
|
behavior but abstract system state, such as desired replicas for a deployment.
|
||||||
|
They are not directly instrumented but collected from otherwise exposed data.
|
||||||
|
|
||||||
|
In Kubernetes they are generally captured in the [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)
|
||||||
|
component, which reads them from the API server.
|
||||||
|
For this type of metric exposition, the
|
||||||
|
[exporter guidelines](https://prometheus.io/docs/instrumenting/writing_exporters/)
|
||||||
|
apply additionally.
|
||||||
|
|
||||||
|
## Naming
|
||||||
|
|
||||||
|
Metrics added directly by application or package code should have a unique name.
|
||||||
|
This avoids collisions with metrics added via dependencies. Unique names also clearly
|
||||||
|
distinguish metrics collected with different semantics. This is solved through
|
||||||
|
prefixes:
|
||||||
|
|
||||||
|
```
|
||||||
|
<component_name>_<metric>
|
||||||
|
```
|
||||||
|
|
||||||
|
For example, suppose the kubelet instrumented its HTTP requests but also uses
|
||||||
|
an HTTP router providing its own implementation. Both expose metrics on total
|
||||||
|
http requests. They should be distinguishable as in:
|
||||||
|
|
||||||
|
```
|
||||||
|
kubelet_http_requests_total{path="/some/path",status="200"}
|
||||||
|
routerpkg_http_requests_total{path="/some/path",status="200",method="GET"}
|
||||||
|
```
|
||||||
|
|
||||||
|
As we can see they expose different labels and thus a naming collision would
|
||||||
|
not have been possible to resolve even if both metrics counted the exact same
|
||||||
|
requests.
|
||||||
|
|
||||||
|
Resource objects that occur in names should inherit the spelling that is used
|
||||||
|
in kubectl, i.e. daemon sets are `daemonset` rather than `daemon_set`.
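
The prefix convention can be sketched as a small helper. This is only an illustration: `metricName` is a hypothetical function, not part of Kubernetes or the Prometheus client library, and the validation pattern is the metric name pattern given in the Prometheus data model documentation.

```go
package main

import (
	"fmt"
	"regexp"
)

// validName is the metric name pattern from the Prometheus data model.
var validName = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// metricName is a hypothetical helper that builds a name of the form
// <component_name>_<metric> and rejects names Prometheus would not accept.
func metricName(component, metric string) (string, error) {
	if component == "" || metric == "" {
		return "", fmt.Errorf("component and metric must be non-empty")
	}
	name := component + "_" + metric
	if !validName.MatchString(name) {
		return "", fmt.Errorf("invalid metric name %q", name)
	}
	return name, nil
}

func main() {
	// Two sources counting HTTP requests stay distinguishable by prefix.
	for _, component := range []string{"kubelet", "routerpkg"} {
		name, err := metricName(component, "http_requests_total")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```

In practice the name is usually written out as a literal in the metric definition. A helper like this is mainly useful in tests that check naming conventions.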
|
||||||
|
|
||||||
|
## Dimensionality & Cardinality
|
||||||
|
|
||||||
|
Metrics can often replace more expensive logging as they are time-aggregated
|
||||||
|
over a sampling interval. The [multidimensional data model](https://prometheus.io/docs/concepts/data_model/)
|
||||||
|
enables deep insights and all metrics should use those label dimensions
|
||||||
|
where appropriate.
|
||||||
|
|
||||||
|
A common error that often causes performance issues in the ingesting metric
|
||||||
|
system is considering dimensions that inhibit or eliminate time aggregation
|
||||||
|
by being too specific. Typically those are user IDs or error messages.
|
||||||
|
More generally: one should know a comprehensive list of all possible values
|
||||||
|
for a label at instrumentation time.
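
To make the cost of an overly specific label concrete, the following sketch computes the upper bound on the number of time series for a single metric as the product of the possible values per label. The label names and counts are made-up numbers for illustration, not measurements from any Kubernetes component.

```go
package main

import "fmt"

// seriesUpperBound returns the maximum number of time series a metric can
// produce: the product of the number of possible values of each label.
func seriesUpperBound(valuesPerLabel map[string]int) int {
	total := 1
	for _, n := range valuesPerLabel {
		total *= n
	}
	return total
}

func main() {
	// Hypothetical label set whose values are all known up front.
	bounded := map[string]int{"verb": 8, "resource": 50, "code": 10}
	fmt.Println("bounded labels:", seriesUpperBound(bounded))

	// Adding a hypothetical per-user label multiplies the bound by the
	// number of users.
	withUser := map[string]int{"verb": 8, "resource": 50, "code": 10, "user_id": 100000}
	fmt.Println("with user_id:", seriesUpperBound(withUser))
}
```

With these assumed counts the bound grows from 4,000 to 400,000,000 series, which is why labels such as user IDs are better left to logs.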
|
||||||
|
|
||||||
|
Notable exceptions are exporters like kube-state-metrics, which expose per-pod
|
||||||
|
or per-deployment metrics, which are theoretically unbounded over time as one could
|
||||||
|
constantly create new ones, with new names. However, they have
|
||||||
|
a reasonable upper bound for a given size of infrastructure they refer to and
|
||||||
|
its typical frequency of changes.
|
||||||
|
|
||||||
|
In general, “external” labels like pod or node name do not belong in the
|
||||||
|
instrumentation itself. They are to be attached to metrics by the collecting
|
||||||
|
system that has the external knowledge ([blog post](https://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/)).
|
||||||
|
|
||||||
|
## Normalization
|
||||||
|
|
||||||
|
Metrics should be normalized with respect to their dimensions. They should
|
||||||
|
expose the minimal set of labels, each of which provides additional information.
|
||||||
|
Labels that are composed from values of different labels are not desirable.
|
||||||
|
For example:
|
||||||
|
|
||||||
|
```
|
||||||
|
example_metric{pod="abc",container="proxy",container_long="abc/proxy"}
|
||||||
|
```
|
||||||
|
|
||||||
|
It often seems feasible to add additional meta information about an object
|
||||||
|
to all metrics about that object, e.g.:
|
||||||
|
|
||||||
|
```
|
||||||
|
kube_pod_container_restarts{namespace=...,pod=...,container=...}
|
||||||
|
```
|
||||||
|
|
||||||
|
A common use case is wanting to look at such metrics with respect to the node the
|
||||||
|
pod is scheduled on. So it seems convenient to add a “node” label.
|
||||||
|
|
||||||
|
```
|
||||||
|
kube_pod_container_restarts{namespace=...,pod=...,container=...,node=...}
|
||||||
|
```
|
||||||
|
|
||||||
|
This, however, only caters to one specific query use case. There are many more
|
||||||
|
pieces of metadata that could be added, effectively blowing up the instrumentation.
|
||||||
|
They are also not guaranteed to be stable over time. What if pods at some
|
||||||
|
point can be live migrated?
|
||||||
|
Those pieces of information should be normalized into an info-level metric
|
||||||
|
([blog post](https://www.robustperception.io/exposing-the-software-version-to-prometheus/)),
|
||||||
|
which is always set to 1. For example:
|
||||||
|
|
||||||
|
```
|
||||||
|
kube_pod_info{pod=...,namespace=...,pod_ip=...,host_ip=...,node=...,...}
|
||||||
|
```
|
||||||
|
|
||||||
|
The metric system can later denormalize those along the identifying
|
||||||
|
“pod” and “namespace” labels. This leads to...
|
||||||
|
|
||||||
|
## Resource Referencing
|
||||||
|
|
||||||
|
It is often desirable to correlate different metrics about a common object,
|
||||||
|
such as a pod. Label dimensions can be used to match up different metrics.
|
||||||
|
This is easiest if label names and values follow a common pattern.
|
||||||
|
For metrics exposed by the same application, that often happens naturally.
|
||||||
|
|
||||||
|
For a system composed of several independent and pluggable components,
|
||||||
|
it makes sense to set cross-component standards to allow easy querying in
|
||||||
|
metric systems without extensive post-processing of data.
|
||||||
|
In Kubernetes, those are the resource objects such as deployments,
|
||||||
|
pods, or services and the namespace they belong to.
|
||||||
|
|
||||||
|
The following should be consistently used:
|
||||||
|
|
||||||
|
```
|
||||||
|
example_metric_ccc{pod="example-app-5378923", namespace="default"}
|
||||||
|
```
|
||||||
|
|
||||||
|
An object is referenced by its unique name in a label named after the resource
|
||||||
|
itself (i.e. `pod`/`deployment`/... and not `pod_name`/`deployment_name`)
|
||||||
|
and the namespace it belongs to in the `namespace` label.
|
||||||
|
|
||||||
|
Note: namespace/name combinations are only unique at a certain point in time.
|
||||||
|
For time series this is given by the timestamp associated with any data point.
|
||||||
|
UUIDs are truly unique but not convenient to use in user-facing time series
|
||||||
|
queries.
|
||||||
|
They can still be incorporated using an info-level metric as described above for
|
||||||
|
`kube_pod_info`. A query to a metric system selecting by UUID via the info-level
|
||||||
|
metric could look as follows:
|
||||||
|
|
||||||
|
```
|
||||||
|
kube_pod_restarts and on(namespace, pod) kube_pod_info{uuid="ABC"}
|
||||||
|
```
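
The matching that `and on(namespace, pod)` performs in the query above can be sketched in plain Go. This is a simplified model for illustration only: the `sample` type and the `andOnNamespacePod` function are hypothetical and not part of any Prometheus or Kubernetes API.

```go
package main

import "fmt"

// sample is a simplified stand-in for one time series sample.
type sample struct {
	labels map[string]string
	value  float64
}

// key builds a match key from the identifying labels. Namespace names
// cannot contain "/", so the separator is unambiguous.
func key(l map[string]string) string {
	return l["namespace"] + "/" + l["pod"]
}

// andOnNamespacePod keeps the samples in left that have a partner in right
// with identical namespace and pod labels, and returns them unchanged.
func andOnNamespacePod(left, right []sample) []sample {
	seen := make(map[string]bool, len(right))
	for _, s := range right {
		seen[key(s.labels)] = true
	}
	var out []sample
	for _, s := range left {
		if seen[key(s.labels)] {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	restarts := []sample{
		{labels: map[string]string{"namespace": "default", "pod": "a"}, value: 3},
		{labels: map[string]string{"namespace": "default", "pod": "b"}, value: 0},
	}
	// The info-level metric, already filtered to uuid="ABC".
	info := []sample{
		{labels: map[string]string{"namespace": "default", "pod": "a", "uuid": "ABC"}, value: 1},
	}
	for _, s := range andOnNamespacePod(restarts, info) {
		fmt.Println(s.labels["namespace"], s.labels["pod"], s.value)
	}
}
```

Only the restart series for the pod whose info metric carries the selected UUID is kept. The UUID itself never has to appear as a label on the restart metric.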
|
||||||
|
|
||||||
https://github.com/prometheus/client_golang/blob/master/prometheus/gauge.go
|
|
||||||
https://github.com/prometheus/client_golang/blob/master/prometheus/counter.go
|
|
||||||
https://github.com/prometheus/client_golang/blob/master/prometheus/histogram.go
|
|
||||||
https://github.com/prometheus/client_golang/blob/master/prometheus/summary.go
|
|
||||||
|
|
||||||
|
|
||||||
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
|
<!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
|
||||||
|
|
|
||||||