Semantic conventions for Kubernetes metrics
Status: Development
K8s metrics
This document describes instruments and attributes for common K8s level metrics in OpenTelemetry. These metrics are collected from technology-specific, well-defined APIs (e.g. Kubelet's API).
Instruments in the k8s. namespace SHOULD be attached to a K8s Resource
and therefore inherit its attributes, such as k8s.pod.name and k8s.pod.uid.
- Pod metrics
- Container metrics
- Node metrics
- Deployment metrics
- ReplicaSet metrics
- ReplicationController metrics
- StatefulSet metrics
- HorizontalPodAutoscaler metrics
- DaemonSet metrics
- Job metrics
- CronJob metrics
- Namespace metrics
- K8s Container metrics
  - Metric: k8s.container.cpu.limit
  - Metric: k8s.container.cpu.request
  - Metric: k8s.container.memory.limit
  - Metric: k8s.container.memory.request
  - Metric: k8s.container.storage.limit
  - Metric: k8s.container.storage.request
  - Metric: k8s.container.ephemeral_storage.limit
  - Metric: k8s.container.ephemeral_storage.request
  - Metric: k8s.container.restart.count
Pod metrics
Description: Pod level metrics captured under the namespace k8s.pod.
Metric: k8s.pod.uptime
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.uptime | Gauge | s | The time the Pod has been running [1] | | |
[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available.
The actual accuracy would depend on the instrumentation and operating system.
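The note above can be illustrated with a minimal sketch that computes an uptime value as a floating-point number of seconds. The function name and the idea of deriving uptime from a start timestamp are assumptions made for illustration; an actual instrumentation may obtain the value differently.

```python
import time
from typing import Optional


def uptime_seconds(start_time_unix: float, now_unix: Optional[float] = None) -> float:
    """Return uptime in seconds as a float (hypothetical helper).

    start_time_unix: start time as seconds since the Unix epoch.
    now_unix: current time; defaults to time.time() when omitted.
    """
    if now_unix is None:
        now_unix = time.time()
    # Clamp at zero so clock skew never yields a negative uptime.
    return max(0.0, now_unix - start_time_unix)
```

The result is a double that can be recorded directly as the gauge value, keeping whatever sub-second precision the clock provides.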
Metric: k8s.pod.cpu.time
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.cpu.time | Counter | s | Total CPU time consumed [1] | | |
[1]: Total CPU time consumed by the specific Pod on all available CPU cores
Metric: k8s.pod.cpu.usage
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.cpu.usage | Gauge | {cpu} | Pod's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | | |
[1]: CPU usage of the specific Pod on all available CPU cores, averaged over the sample window
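To make "averaged over the sample window" concrete, the sketch below derives an average usage in cpus from two cumulative CPU-time samples. This is an assumption about how a collector could compute such a value, not a description of how the Kubelet computes it; the function name and parameters are hypothetical.

```python
from typing import Optional


def average_cpu_usage(prev_cpu_s: float, prev_ts: float,
                      cur_cpu_s: float, cur_ts: float) -> Optional[float]:
    """Average CPU usage in cpus between two samples (hypothetical helper).

    prev_cpu_s / cur_cpu_s: cumulative CPU seconds consumed at each sample.
    prev_ts / cur_ts: wall-clock timestamps of each sample, in seconds.
    """
    window = cur_ts - prev_ts
    if window <= 0:
        raise ValueError("sample window must be positive")
    delta = cur_cpu_s - prev_cpu_s
    if delta < 0:
        # The cumulative counter went backwards (for example after a restart),
        # so no meaningful average exists for this window.
        return None
    return delta / window
```

For example, 15 CPU-seconds consumed over a 10-second window corresponds to an average usage of 1.5 cpus.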
Metric: k8s.pod.memory.usage
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.memory.usage | Gauge | By | Memory usage of the Pod [1] | | |
[1]: Total memory usage of the Pod
Metric: k8s.pod.network.io
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.network.io | Counter | By | Network bytes for the Pod | | |
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.interface.name | string | The network interface name. | lo; eth0 | Recommended | |
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| receive | receive | |
| transmit | transmit | |
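As a rough illustration of how the two attributes combine, the sketch below turns per-interface byte counters into one data point per interface and direction. The input shape (a dictionary with `rx_bytes` and `tx_bytes` keys) and the function name are hypothetical; only the attribute names and the well-known direction values come from the tables above.

```python
from typing import Dict, List, Tuple

Attributes = Dict[str, str]


def network_io_points(interfaces: Dict[str, Dict[str, int]]) -> List[Tuple[Attributes, int]]:
    """Build (attributes, value) pairs for a network.io style counter.

    interfaces: hypothetical input, e.g. {"eth0": {"rx_bytes": 10, "tx_bytes": 20}}.
    """
    points: List[Tuple[Attributes, int]] = []
    for name, counters in sorted(interfaces.items()):
        # One point per direction, using the well-known values.
        for direction, key in (("receive", "rx_bytes"), ("transmit", "tx_bytes")):
            attrs = {
                "network.interface.name": name,
                "network.io.direction": direction,
            }
            points.append((attrs, counters[key]))
    return points
```

Each interface therefore contributes two time series, one for `receive` and one for `transmit`.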
Metric: k8s.pod.network.errors
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.pod.network.errors | Counter | {error} | Pod network errors | | |
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.interface.name | string | The network interface name. | lo; eth0 | Recommended | |
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| receive | receive | |
| transmit | transmit | |
Container metrics
Description: Container level metrics captured under the namespace k8s.container.
Metric: k8s.container.status.state
This metric is recommended.
[1]: All possible container states will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state will be non-zero.
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| k8s.container.status.state | string | The state of the container. K8s ContainerState | terminated; running; waiting | Required | |
k8s.container.status.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| running | The container is running. | |
| terminated | The container has terminated. | |
| waiting | The container is waiting. | |
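The reporting rule described in note [1] for this metric (every possible state reported at each interval, with only the current state non-zero) can be sketched as follows. The function name is hypothetical, and using 1 as the non-zero value is an assumption; the note only requires that the current state's value be non-zero.

```python
from typing import Dict

# Well-known values of the k8s.container.status.state attribute.
WELL_KNOWN_STATES = ("running", "terminated", "waiting")


def state_points(current_state: str) -> Dict[str, int]:
    """Return one value per state: 1 for the current state, 0 otherwise."""
    points = {state: int(state == current_state) for state in WELL_KNOWN_STATES}
    if current_state not in points:
        # A custom value MAY be used when no well-known value applies.
        points[current_state] = 1
    return points
```

Emitting the zero-valued points as well keeps every series present at each interval, so a state transition shows up as a value change rather than as a series appearing or disappearing.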
Metric: k8s.container.status.reason
This metric is recommended.
[1]: All possible container state reasons will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state reason will be non-zero.
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| k8s.container.status.reason | string | The reason for the container state. Corresponds to the reason field of the K8s ContainerStateWaiting or K8s ContainerStateTerminated | ContainerCreating; CrashLoopBackOff; CreateContainerConfigError; ErrImagePull; ImagePullBackOff; OOMKilled; Completed; Error; ContainerCannotRun | Required | |
k8s.container.status.reason has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Node metrics
Description: Node level metrics captured under the namespace k8s.node.
Metric: k8s.node.uptime
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.uptime | Gauge | s | The time the Node has been running [1] | | |
[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available.
The actual accuracy would depend on the instrumentation and operating system.
Metric: k8s.node.cpu.time
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.cpu.time | Counter | s | Total CPU time consumed [1] | | |
[1]: Total CPU time consumed by the specific Node on all available CPU cores
Metric: k8s.node.cpu.usage
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.cpu.usage | Gauge | {cpu} | Node's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | | |
[1]: CPU usage of the specific Node on all available CPU cores, averaged over the sample window
Metric: k8s.node.memory.usage
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.memory.usage | Gauge | By | Memory usage of the Node [1] | | |
[1]: Total memory usage of the Node
Metric: k8s.node.network.io
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.network.io | Counter | By | Network bytes for the Node | | |
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.interface.name | string | The network interface name. | lo; eth0 | Recommended | |
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| receive | receive | |
| transmit | transmit | |
Metric: k8s.node.network.errors
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.node.network.errors | Counter | {error} | Node network errors | | |
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.interface.name | string | The network interface name. | lo; eth0 | Recommended | |
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| receive | receive | |
| transmit | transmit | |
Deployment metrics
Description: Deployment level metrics captured under the namespace k8s.deployment.
Metric: k8s.deployment.desired_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.deployment.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this deployment [1] | | |
[1]: This metric aligns with the replicas field of the
K8s DeploymentSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.deployment resource.
Metric: k8s.deployment.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the
K8s DeploymentStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.deployment resource.
ReplicaSet metrics
Description: ReplicaSet level metrics captured under the namespace k8s.replicaset.
Metric: k8s.replicaset.desired_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.replicaset.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this replicaset [1] | | |
[1]: This metric aligns with the replicas field of the
K8s ReplicaSetSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.replicaset resource.
Metric: k8s.replicaset.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the
K8s ReplicaSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.replicaset resource.
ReplicationController metrics
Description: ReplicationController level metrics captured under the namespace k8s.replicationcontroller.
Metric: k8s.replicationcontroller.desired_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.replicationcontroller.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this replication controller [1] | | |
[1]: This metric aligns with the replicas field of the
K8s ReplicationControllerSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.replicationcontroller resource.
Metric: k8s.replicationcontroller.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the
K8s ReplicationControllerStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.replicationcontroller resource.
StatefulSet metrics
Description: StatefulSet level metrics captured under the namespace k8s.statefulset.
Metric: k8s.statefulset.desired_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.statefulset.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this statefulset [1] | | |
[1]: This metric aligns with the replicas field of the
K8s StatefulSetSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.statefulset resource.
Metric: k8s.statefulset.ready_pods
This metric is recommended.
[1]: This metric aligns with the readyReplicas field of the
K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.statefulset resource.
Metric: k8s.statefulset.current_pods
This metric is recommended.
[1]: This metric aligns with the currentReplicas field of the
K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.statefulset resource.
Metric: k8s.statefulset.updated_pods
This metric is recommended.
[1]: This metric aligns with the updatedReplicas field of the
K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.statefulset resource.
HorizontalPodAutoscaler metrics
Description: HorizontalPodAutoscaler level metrics captured under the namespace k8s.hpa.
Metric: k8s.hpa.desired_pods
This metric is recommended.
[1]: This metric aligns with the desiredReplicas field of the
K8s HorizontalPodAutoscalerStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.hpa resource.
Metric: k8s.hpa.current_pods
This metric is recommended.
[1]: This metric aligns with the currentReplicas field of the
K8s HorizontalPodAutoscalerStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.hpa resource.
Metric: k8s.hpa.max_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.hpa.max_pods | UpDownCounter | {pod} | The upper limit for the number of replica pods to which the autoscaler can scale up [1] | | |
[1]: This metric aligns with the maxReplicas field of the
K8s HorizontalPodAutoscalerSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.hpa resource.
Metric: k8s.hpa.min_pods
This metric is recommended.
[1]: This metric aligns with the minReplicas field of the
K8s HorizontalPodAutoscalerSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.hpa resource.
DaemonSet metrics
Description: DaemonSet level metrics captured under the namespace k8s.daemonset.
Metric: k8s.daemonset.current_scheduled_nodes
This metric is recommended.
[1]: This metric aligns with the currentNumberScheduled field of the
K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.daemonset resource.
Metric: k8s.daemonset.desired_scheduled_nodes
This metric is recommended.
[1]: This metric aligns with the desiredNumberScheduled field of the
K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.daemonset resource.
Metric: k8s.daemonset.misscheduled_nodes
This metric is recommended.
[1]: This metric aligns with the numberMisscheduled field of the
K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.daemonset resource.
Metric: k8s.daemonset.ready_nodes
This metric is recommended.
[1]: This metric aligns with the numberReady field of the
K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.daemonset resource.
Job metrics
Description: Job level metrics captured under the namespace k8s.job.
Metric: k8s.job.active_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.job.active_pods | UpDownCounter | {pod} | The number of pending and actively running pods for a job [1] | | |
[1]: This metric aligns with the active field of the
K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.job resource.
Metric: k8s.job.failed_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.job.failed_pods | UpDownCounter | {pod} | The number of pods which reached phase Failed for a job [1] | | |
[1]: This metric aligns with the failed field of the
K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.job resource.
Metric: k8s.job.successful_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.job.successful_pods | UpDownCounter | {pod} | The number of pods which reached phase Succeeded for a job [1] | | |
[1]: This metric aligns with the succeeded field of the
K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.job resource.
Metric: k8s.job.desired_successful_pods
This metric is recommended.
[1]: This metric aligns with the completions field of the
K8s JobSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.job resource.
Metric: k8s.job.max_parallel_pods
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.job.max_parallel_pods | UpDownCounter | {pod} | The max desired number of pods the job should run at any given time [1] | | |
[1]: This metric aligns with the parallelism field of the
K8s JobSpec.
This metric SHOULD, at a minimum, be reported against a
k8s.job resource.
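The field alignments stated in the Job metric notes can be summarised in a small sketch that reads a Job object (represented as a plain dictionary, as it would appear after decoding the API's JSON) and produces one value per metric. The function name and the handling of missing fields are assumptions: absent status counters are treated as zero, and absent spec fields are skipped rather than guessed.

```python
from typing import Any, Dict

# Metric name -> (section, field) in the K8s Job object, per the notes above.
JOB_METRIC_FIELDS = {
    "k8s.job.active_pods": ("status", "active"),
    "k8s.job.failed_pods": ("status", "failed"),
    "k8s.job.successful_pods": ("status", "succeeded"),
    "k8s.job.desired_successful_pods": ("spec", "completions"),
    "k8s.job.max_parallel_pods": ("spec", "parallelism"),
}


def job_metrics(job: Dict[str, Any]) -> Dict[str, int]:
    """Extract Job metric values from a decoded Job object (hypothetical helper)."""
    values: Dict[str, int] = {}
    for metric, (section, field) in JOB_METRIC_FIELDS.items():
        value = job.get(section, {}).get(field)
        if value is None:
            if section != "status":
                continue  # Assumption: do not invent a value for an unset spec field.
            value = 0  # Assumption: an absent status counter means zero.
        values[metric] = value
    return values
```

The same table-driven approach would apply to the other workload sections, with the mapping replaced by the fields named in their notes.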
CronJob metrics
Description: CronJob level metrics captured under the namespace k8s.cronjob.
Metric: k8s.cronjob.active_jobs
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.cronjob.active_jobs | UpDownCounter | {job} | The number of actively running jobs for a cronjob [1] | | |
[1]: This metric aligns with the active field of the
K8s CronJobStatus.
This metric SHOULD, at a minimum, be reported against a
k8s.cronjob resource.
Namespace metrics
Description: Namespace level metrics captured under the namespace k8s.namespace.
Metric: k8s.namespace.phase
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.namespace.phase | UpDownCounter | {namespace} | Describes number of K8s namespaces that are currently in a given phase. [1] | | |
[1]: This metric SHOULD, at a minimum, be reported against a
k8s.namespace resource.
| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| k8s.namespace.phase | string | The phase of the K8s namespace. [1] | active; terminating | Required | |
[1] k8s.namespace.phase: This attribute aligns with the phase field of the
K8s NamespaceStatus.
k8s.namespace.phase has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
|---|---|---|
| active | Active namespace phase as described by K8s API | |
| terminating | Terminating namespace phase as described by K8s API | |
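A minimal sketch of counting namespaces per phase is shown below. It assumes the input is a list of phase strings as returned by the K8s API (which capitalises them, for example `Active`) and lower-cases them to match the well-known attribute values; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# Well-known values of the k8s.namespace.phase attribute.
WELL_KNOWN_PHASES = ("active", "terminating")


def namespace_phase_counts(phases: Iterable[str]) -> Dict[str, int]:
    """Count namespaces per phase, keyed by the lower-case attribute value."""
    counts = Counter(phase.lower() for phase in phases)
    for phase in WELL_KNOWN_PHASES:
        # Report a zero for well-known phases that currently have no namespaces.
        counts.setdefault(phase, 0)
    return dict(counts)
```

When the metric is reported against an individual k8s.namespace resource, the same logic reduces to a value of 1 for that namespace's current phase.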
K8s Container metrics
Description: K8s Container level metrics captured under the namespace k8s.container.
Metric: k8s.container.cpu.limit
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.cpu.limit | Gauge | {cpu} | Maximum CPU resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.cpu.request
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.cpu.request | Gauge | {cpu} | CPU resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.memory.limit
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.memory.limit | Gauge | By | Maximum memory resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.memory.request
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.memory.request | Gauge | By | Memory resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
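Container resource limits and requests are expressed in the K8s API as quantity strings such as `500m` or `128Mi`, while the metrics above use the units {cpu} and By. The sketch below shows one possible conversion. It is a simplified assumption that handles only plain numbers, the milli-CPU suffix `m`, and the common binary and decimal byte suffixes; the function names are hypothetical and the full K8s quantity grammar is not covered.

```python
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}


def cpu_quantity_to_cpus(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to a number of cpus."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # milli-CPUs
    return float(quantity)


def memory_quantity_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(float(quantity))  # plain number of bytes
```

With this conversion, a CPU limit of `500m` would be reported as 0.5 and a memory request of `128Mi` as 134217728.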
Metric: k8s.container.storage.limit
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.storage.limit | Gauge | By | Maximum storage resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.storage.request
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.storage.request | Gauge | By | Storage resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.ephemeral_storage.limit
This metric is recommended.
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.ephemeral_storage.request
This metric is recommended.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
|---|---|---|---|---|---|
| k8s.container.ephemeral_storage.request | Gauge | By | Ephemeral storage resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.restart.count
This metric is recommended.
[1]: This value is pulled directly from the K8s API and the value can go indefinitely high and be reset to 0 at any time depending on how your kubelet is configured to prune dead containers. It is best to not depend too much on the exact value but rather look at it as either == 0, in which case you can conclude there were no restarts in the recent past, or > 0, in which case you can conclude there were restarts in the recent past, and not try and analyze the value beyond that.
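The interpretation recommended in the note can be sketched as a small helper that reduces the raw count to a boolean signal and flags a reset instead of treating a decrease as meaningful. The function name and the returned dictionary shape are hypothetical.

```python
from typing import Dict, Optional


def restart_signal(current: int, previous: Optional[int] = None) -> Dict[str, bool]:
    """Interpret a restart count as coarse signals rather than an exact value.

    current: latest value of the restart count.
    previous: value from the prior collection, if any.
    """
    if current < 0:
        raise ValueError("restart count cannot be negative")
    return {
        # > 0 means restarts happened in the recent past; == 0 means none did.
        "recent_restarts": current > 0,
        # A drop relative to the previous sample indicates the value was reset.
        "counter_reset": previous is not None and current < previous,
    }
```

Alerting on `recent_restarts` alone follows the guidance above; the exact magnitude of the count is deliberately ignored.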