semantic-conventions/docs/general/metrics.md

<!--- Hugo front matter used to generate the website version of this page:
linkTitle: Metrics
aliases: [metrics-general]
--->

# Metrics semantic conventions

**Status**: [Mixed][DocumentStatus]

<!-- toc -->

- [General Guidelines](#general-guidelines)
  - [Units](#units)
  - [Instrument Units](#instrument-units)
  - [Instrument Types](#instrument-types)
  - [Consistent UpDownCounter timeseries](#consistent-updowncounter-timeseries)

<!-- tocstop -->

The following semantic conventions surrounding metrics are defined:

* **[General Guidelines](#general-guidelines): General metrics guidelines.**
* [Database](/docs/database/database-metrics.md): For SQL and NoSQL client metrics.
* [FaaS](/docs/faas/faas-metrics.md): For [Function as a Service](https://wikipedia.org/wiki/Function_as_a_service) metrics.
* [GenAI](/docs/gen-ai/gen-ai-metrics.md): For Generative AI metrics.
* [HTTP](/docs/http/http-metrics.md): For HTTP client and server metrics.
* [Messaging](/docs/messaging/messaging-metrics.md): For messaging systems (queues, publish/subscribe, etc.) metrics.
* [RPC](/docs/rpc/rpc-metrics.md): For RPC client and server metrics.
* [.NET](/docs/dotnet/README.md): For network-related metrics emitted by .NET runtime components.
* **System metrics**
  * [System](/docs/system/system-metrics.md): For standard system metrics.
  * [Container](/docs/system/container-metrics.md)
  * [Hardware](/docs/system/hardware-metrics.md): For hardware-related metrics.
  * [K8s](/docs/system/k8s-metrics.md): For K8s metrics.
  * [Process](/docs/system/process-metrics.md): For standard process metrics.
  * [Runtime Environment](/docs/runtime/README.md#metrics): For runtime environment metrics.
* [OTel SDK Telemetry](/docs/otel/sdk-metrics.md): Metrics emitted by the OpenTelemetry SDK components.

Apart from semantic conventions for metrics, [traces](trace.md), [logs](logs.md), and [events](events.md), OpenTelemetry also
defines the concept of overarching [Resources](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.42.0/specification/resource/sdk.md) with
their own [Resource Semantic Conventions](/docs/resource/README.md).

## General Guidelines

**Status**: [Development][DocumentStatus]

When defining new metric names and attributes, consider the prior art of
existing standard metrics and metrics from frameworks/libraries.

Associated metrics SHOULD be nested together in a hierarchy based on their
usage. Define a top-level hierarchy for common metric categories: for OS
metrics, like CPU and network; for app runtimes, like GC internals. Libraries
and frameworks should nest their metrics into a hierarchy as well. This aids
in discovery and ad hoc comparison, and allows a user to find similar metrics
given a certain metric.

The hierarchical structure of metrics defines the namespacing. Supporting
OpenTelemetry artifacts define the metric structures and hierarchies for some
categories of metrics, and these can assist decisions when creating future
metrics.
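
As a rough illustration of how namespacing aids discovery, the following Python
sketch finds the metrics that share a namespace with a given metric. The helper
names (`namespace_of`, `related_metrics`) and the `KNOWN_METRICS` list are
hypothetical and exist only for this example; they are not part of any
OpenTelemetry API.

```python
# Illustrative only: helper names and the metric list are assumptions made for this sketch.
KNOWN_METRICS = [
    "system.cpu.time",
    "system.cpu.utilization",
    "system.network.io",
    "jvm.gc.duration",
    "jvm.memory.used",
    "jvm.memory.committed",
]


def namespace_of(metric_name: str) -> str:
    """Return everything before the last dot, e.g. 'jvm.memory' for 'jvm.memory.used'."""
    return metric_name.rpartition(".")[0]


def related_metrics(metric_name: str, known: list) -> list:
    """Return the other known metrics that live in the same namespace."""
    namespace = namespace_of(metric_name)
    return sorted(
        name for name in known
        if name != metric_name and namespace_of(name) == namespace
    )


jvm_memory_siblings = related_metrics("jvm.memory.used", KNOWN_METRICS)
system_cpu_siblings = related_metrics("system.cpu.time", KNOWN_METRICS)
```

Because the namespace is encoded in the name itself, a plain prefix comparison
is enough to surface neighboring metrics.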

Common attributes SHOULD be consistently named. This aids discoverability and
disambiguates similar attributes across metric names.

["As a rule of thumb, **aggregations** over all the attributes of a given
metric **SHOULD** be
meaningful,"](https://prometheus.io/docs/practices/naming/#metric-names) as
Prometheus recommends.
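
One way to read this rule of thumb is sketched below in Python. The data points,
their values, and the `aggregate_over_all_attributes` helper are made up for
illustration, and the `kind` attribute in the counter-example is hypothetical.

```python
# Illustrative data points: (attributes, value) pairs for one metric that counts
# bytes and is broken down by a direction attribute. The values are invented.
network_io_points = [
    ({"network.io.direction": "transmit"}, 1200),
    ({"network.io.direction": "receive"}, 3400),
]


def aggregate_over_all_attributes(points):
    """Sum every data point of a metric, ignoring its attributes."""
    return sum(value for _attributes, value in points)


# Meaningful: the total number of bytes moved in either direction.
total_bytes = aggregate_over_all_attributes(network_io_points)

# Counter-example: if one metric mixed "used" and "limit" values under a single
# (hypothetical) attribute, the same sum would add a usage to a limit and
# describe nothing useful.
mixed_points = [
    ({"kind": "used"}, 512),
    ({"kind": "limit"}, 1024),
]
meaningless_sum = aggregate_over_all_attributes(mixed_points)
```

The arithmetic succeeds in both cases; only the first result answers a question
a user would actually ask, which suggests the second design wants two metrics.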

Semantic ambiguity SHOULD be avoided. Use prefixed metric names in cases
where similar metrics have significantly different implementations across the
breadth of all existing metrics. For example, every garbage-collected runtime
has slightly different strategies and measures. Using a single set of metric
names for GC, not divided by the runtime, could create dissimilar comparisons
and confusion for end users. (For example, prefer `jvm.gc.*` over
`gc.*`.) Measures of many operating system metrics are similarly
ambiguous.

Metric names and attributes SHOULD follow the general [naming guidelines](naming.md).

### Units

Conventional metrics or metrics that have their units included in
OpenTelemetry metadata (e.g. `metric.WithUnit` in Go) SHOULD NOT include the
units in the metric name. Units may be included when they provide additional
meaning to the metric name. Metrics MUST, above all, be understandable and
usable.

When building components that interoperate between OpenTelemetry and a system
using the OpenMetrics exposition format, use the
[OpenMetrics Guidelines](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.42.0/specification/compatibility/prometheus_and_openmetrics.md).

### Instrument Units

**Status**: [Stable][DocumentStatus]

Units should follow the
[Unified Code for Units of Measure](https://unitsofmeasure.org/ucum.html).

- Instruments for **utilization** metrics (that measure the fraction out of a
  total) are dimensionless and SHOULD use the default unit `1` (the unity).
- All non-units that use curly braces to annotate a quantity need to match the
  grammatical number of the quantity they represent. For example, if measuring
  the number of individual requests to a process, the unit would be `{request}`,
  not `{requests}`.
- Instruments that measure an integer count of something SHOULD only use
  [annotations](https://ucum.org/ucum.html#para-curly) with curly braces to
  give additional meaning *without* the leading default unit (`1`). For example,
  use `{packet}`, `{error}`, `{fault}`, etc.
- Instrument units other than `1` and those that use
  [annotations](https://ucum.org/ucum.html#para-curly) SHOULD be specified using
  the UCUM case sensitive ("c/s") variant.
  For example, "Cel" for the unit with full name "degree Celsius".
- Instruments SHOULD use non-prefixed units (e.g. `By` instead of `MiBy`)
  unless there is a good technical reason not to do so.
- When instruments are measuring durations, seconds (i.e. `s`) SHOULD be used.
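
Several of the rules above amount to normalizing a value or a unit string before
recording it. The Python sketch below shows that idea under stated assumptions:
every helper name is hypothetical and none of them belongs to an OpenTelemetry
API.

```python
import re

# Hypothetical helpers for this sketch; none of these names come from an OpenTelemetry API.
_LEADING_ONE_ANNOTATION = re.compile(r"^1(\{[^{}]+\})$")


def normalize_count_unit(unit: str) -> str:
    """Drop the leading default unit `1` from an annotated count, e.g. '1{packet}' -> '{packet}'."""
    match = _LEADING_ONE_ANNOTATION.match(unit)
    return match.group(1) if match else unit


def milliseconds_to_seconds(milliseconds: float) -> float:
    """Durations are reported in seconds (`s`)."""
    return milliseconds / 1000.0


def mebibytes_to_bytes(mebibytes: float) -> int:
    """Sizes are reported in the non-prefixed unit `By` rather than `MiBy`."""
    return int(mebibytes * 1024 * 1024)


def utilization(used: float, total: float) -> float:
    """A fraction out of a total is dimensionless and uses the unit `1`."""
    return used / total
```

For instance, a latency measured as 250 ms would be recorded as `0.25` with unit
`s`, and a 2 MiB buffer as `2097152` with unit `By`.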

### Instrument Types

**Status**: [Stable][DocumentStatus]

The metrics semantic conventions are written using the names of the synchronous instrument types,
like `Counter` or `UpDownCounter`. However, compliant implementations MAY use the asynchronous equivalent instead,
like `Asynchronous Counter` or `Asynchronous UpDownCounter`.
Whether implementations choose the synchronous type or the asynchronous equivalent is considered to be an
implementation detail. Both choices are compliant with this specification.
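
The Python sketch below shows why the choice can be treated as an implementation
detail: both styles yield the same cumulative value at collection time. The
classes `SketchCounter` and `SketchAsyncCounter` are simplified stand-ins
invented for this example and do not mirror any real SDK API.

```python
class SketchCounter:
    """Synchronous style: instrumented code reports each increment as it happens."""

    def __init__(self):
        self._total = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a Counter only accepts non-negative increments")
        self._total += amount

    def collect(self):
        return self._total


class SketchAsyncCounter:
    """Asynchronous style: a callback reports the current cumulative total on collection."""

    def __init__(self, callback):
        self._callback = callback

    def collect(self):
        return self._callback()


request_sizes = [10, 20, 30]

# Synchronous: call add() once per request.
sync_counter = SketchCounter()
for size in request_sizes:
    sync_counter.add(size)

# Asynchronous: the application already tracks a running total and exposes it
# through a callback that is only invoked when the value is collected.
running_total = {"bytes": 0}
for size in request_sizes:
    running_total["bytes"] += size
async_counter = SketchAsyncCounter(lambda: running_total["bytes"])
```

The asynchronous form tends to fit when the total is already maintained
elsewhere, since it avoids instrumenting every individual update.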

### Consistent UpDownCounter timeseries

**Status**: [Development][DocumentStatus]

When recording `UpDownCounter` metrics, the same attribute values used to record an increment SHOULD be used to record
any associated decrement; otherwise, those increments and decrements will end up as different timeseries.

For example, if you are tracking `active_requests` with an `UpDownCounter`, and you are incrementing it each time a
request starts and decrementing it each time a request ends, then any attributes which are not yet available when
incrementing the counter at request start should not be used when decrementing the counter at request end.
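
The effect can be reproduced with a minimal Python sketch that keeps one running
sum per distinct attribute set. `SketchUpDownCounter` is a hypothetical stand-in
written for this example, not an SDK class, and the attribute values are
illustrative.

```python
from collections import defaultdict


class SketchUpDownCounter:
    """Keeps one running sum per distinct attribute set, i.e. per timeseries."""

    def __init__(self):
        self._series = defaultdict(int)

    def add(self, amount, attributes):
        # The sorted attribute items identify the timeseries.
        key = tuple(sorted(attributes.items()))
        self._series[key] += amount

    def series(self):
        return dict(self._series)


# Inconsistent: the status code is only known at request end, so the decrement
# lands in a different timeseries than the increment.
inconsistent = SketchUpDownCounter()
inconsistent.add(1, {"http.request.method": "GET"})
inconsistent.add(-1, {"http.request.method": "GET", "http.response.status_code": 200})

# Consistent: the decrement reuses exactly the attributes of the increment.
consistent = SketchUpDownCounter()
start_attributes = {"http.request.method": "GET"}
consistent.add(1, start_attributes)
consistent.add(-1, start_attributes)
```

In the inconsistent case one series stays at `1` forever while another sits at
`-1`; in the consistent case the single series returns to `0` once the request
ends.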

[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status