Semantic conventions for Kubernetes metrics
Status: Development
K8s metrics
This document describes instruments and attributes for common K8s level metrics in OpenTelemetry. These metrics are collected from technology-specific, well-defined APIs (e.g. Kubelet's API).
Metrics in k8s. instruments SHOULD be attached to a K8s Resource and therefore inherit its attributes, like k8s.pod.name and k8s.pod.uid.
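As an illustration of what "inherit its attributes" means in practice, the sketch below merges resource attributes into a data point's own attributes. The Resource and DataPoint classes, the effective_attributes helper, and the pod name and UID values are hypothetical stand-ins written for this example, not OpenTelemetry SDK types.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a K8s Resource carrying identifying attributes."""
    attributes: dict


@dataclass
class DataPoint:
    """Hypothetical stand-in for a single metric data point."""
    name: str
    value: float
    attributes: dict = field(default_factory=dict)


def effective_attributes(resource, point):
    """Return the attributes a backend would see: resource attributes plus point attributes."""
    merged = dict(resource.attributes)
    merged.update(point.attributes)
    return merged


# Illustrative values only.
pod = Resource({
    "k8s.pod.name": "checkout-5d9c",
    "k8s.pod.uid": "275ecb36-5aa8-4c2a-9c47-d8bb681b9aff",
})
point = DataPoint("k8s.pod.memory.usage", 52428800.0)
merged = effective_attributes(pod, point)
```

Because the pod identity lives on the resource, the data point itself does not need to repeat k8s.pod.name or k8s.pod.uid.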
- Pod metrics
- Container metrics
- Node metrics
- Deployment metrics
- ReplicaSet metrics
- ReplicationController metrics
- StatefulSet metrics
- HorizontalPodAutoscaler metrics
- DaemonSet metrics
- Job metrics
- CronJob metrics
- Namespace metrics
- K8s Container metrics
- Metric: k8s.container.cpu.limit
- Metric: k8s.container.cpu.request
- Metric: k8s.container.memory.limit
- Metric: k8s.container.memory.request
- Metric: k8s.container.storage.limit
- Metric: k8s.container.storage.request
- Metric: k8s.container.ephemeral_storage.limit
- Metric: k8s.container.ephemeral_storage.request
- Metric: k8s.container.restart.count
- Metric: k8s.container.ready
Pod metrics
Description: Pod level metrics captured under the namespace k8s.pod.
Metric: k8s.pod.uptime
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.uptime | Gauge | s | The time the Pod has been running [1] | | |
[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available.
The actual accuracy would depend on the instrumentation and operating system.
Metric: k8s.pod.cpu.time
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.cpu.time | Counter | s | Total CPU time consumed [1] | | |
[1]: Total CPU time consumed by the specific Pod on all available CPU cores
Metric: k8s.pod.cpu.usage
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.cpu.usage | Gauge | {cpu} | Pod's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | | |
[1]: CPU usage of the specific Pod on all available CPU cores, averaged over the sample window
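One way to picture the relationship between k8s.pod.cpu.time (a cumulative counter in seconds) and k8s.pod.cpu.usage (an average over a sample window, in cpus) is to divide the growth in CPU time by the length of the window. The sketch below assumes a collector has two cumulative samples with timestamps; the cpu_usage function and its parameter names are hypothetical, and real collectors may obtain usage directly from the source API instead.

```python
def cpu_usage(prev_cpu_seconds, prev_ts, curr_cpu_seconds, curr_ts):
    """Average CPU usage in cpus over the window [prev_ts, curr_ts].

    Both *_cpu_seconds arguments are cumulative CPU time in seconds;
    both *_ts arguments are sample timestamps in seconds.
    """
    window = curr_ts - prev_ts
    if window <= 0:
        raise ValueError("sample window must be positive")
    delta = curr_cpu_seconds - prev_cpu_seconds
    if delta < 0:
        # A cumulative counter that decreases usually indicates a reset.
        raise ValueError("cumulative CPU time decreased between samples")
    return delta / window


# 15 s of CPU time consumed during a 10 s window is 1.5 cpus on average.
usage = cpu_usage(120.0, 1000.0, 135.0, 1010.0)
```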
Metric: k8s.pod.memory.usage
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.memory.usage | Gauge | By | Memory usage of the Pod [1] | | |
[1]: Total memory usage of the Pod
Metric: k8s.pod.network.io
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.network.io | Counter | By | Network bytes for the Pod | | |
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
network.interface.name | string | The network interface name. | lo ; eth0 | Recommended | |
network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
receive | receive | |
transmit | transmit | |
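To make the attribute tables above concrete, the following sketch turns per-interface byte counters into k8s.pod.network.io data points, one per interface and direction. The input shape (a dict with rx_bytes and tx_bytes keys) and the network_io_points function are assumptions made for this example; they do not correspond to a specific Kubelet or SDK API.

```python
# Well-known values for network.io.direction, as listed in the table above.
WELL_KNOWN_DIRECTIONS = ("receive", "transmit")


def network_io_points(interface_stats):
    """Build one data point per (interface, direction) pair.

    interface_stats is a hypothetical mapping such as
    {"eth0": {"rx_bytes": 10, "tx_bytes": 20}}.
    """
    source_keys = {"receive": "rx_bytes", "transmit": "tx_bytes"}
    points = []
    for interface in sorted(interface_stats):
        for direction in WELL_KNOWN_DIRECTIONS:
            points.append({
                "name": "k8s.pod.network.io",
                "unit": "By",
                "value": interface_stats[interface][source_keys[direction]],
                "attributes": {
                    "network.interface.name": interface,
                    "network.io.direction": direction,
                },
            })
    return points


points = network_io_points({"eth0": {"rx_bytes": 10, "tx_bytes": 20}})
```

The same shape would apply to k8s.pod.network.errors, with the unit {error} and error counts as values.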
Metric: k8s.pod.network.errors
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.pod.network.errors | Counter | {error} | Pod network errors | | |
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
network.interface.name | string | The network interface name. | lo ; eth0 | Recommended | |
network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
receive | receive | |
transmit | transmit | |
Container metrics
Description: Container level metrics captured under the namespace k8s.container.
Metric: k8s.container.status.state
This metric is recommended.
[1]: All possible container states will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state will be non-zero.
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
k8s.container.status.state | string | The state of the container. K8s ContainerState | terminated ; running ; waiting | Required | |
k8s.container.status.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
running | The container is running. | |
terminated | The container has terminated. | |
waiting | The container is waiting. | |
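The note for k8s.container.status.state says every possible state is reported at each interval and only the current one is non-zero. A minimal sketch of that reporting pattern follows; the state_points function is hypothetical, and the choice of 1 as the non-zero value is an assumption of this example.

```python
# Well-known values for the k8s.container.status.state attribute.
CONTAINER_STATES = ("running", "terminated", "waiting")


def state_points(current_state):
    """Emit one data point per state; only the current state is non-zero."""
    states = list(CONTAINER_STATES)
    if current_state not in states:
        # A custom value MAY be used when no well-known value applies.
        states.append(current_state)
    return [
        {
            "name": "k8s.container.status.state",
            "value": 1 if state == current_state else 0,
            "attributes": {"k8s.container.status.state": state},
        }
        for state in states
    ]


points_for_running = state_points("running")
```

Reporting the zero-valued series as well keeps every state's time series continuous, so a query for a given state never has to distinguish "zero" from "missing".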
Metric: k8s.container.status.reason
This metric is recommended.
[1]: All possible container state reasons will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state reason will be non-zero.
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
k8s.container.status.reason | string | The reason for the container state. Corresponds to the reason field of the: K8s ContainerStateWaiting or K8s ContainerStateTerminated | ContainerCreating ; CrashLoopBackOff ; CreateContainerConfigError ; ErrImagePull ; ImagePullBackOff ; OOMKilled ; Completed ; Error ; ContainerCannotRun | Required | |
k8s.container.status.reason has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Node metrics
Description: Node level metrics captured under the namespace k8s.node.
Metric: k8s.node.uptime
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.uptime | Gauge | s | The time the Node has been running [1] | | |
[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available.
The actual accuracy would depend on the instrumentation and operating system.
Metric: k8s.node.cpu.time
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.cpu.time | Counter | s | Total CPU time consumed [1] | | |
[1]: Total CPU time consumed by the specific Node on all available CPU cores
Metric: k8s.node.cpu.usage
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.cpu.usage | Gauge | {cpu} | Node's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1] | | |
[1]: CPU usage of the specific Node on all available CPU cores, averaged over the sample window
Metric: k8s.node.memory.usage
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.memory.usage | Gauge | By | Memory usage of the Node [1] | | |
[1]: Total memory usage of the Node
Metric: k8s.node.network.io
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.network.io | Counter | By | Network bytes for the Node | | |
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
network.interface.name | string | The network interface name. | lo ; eth0 | Recommended | |
network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
receive | receive | |
transmit | transmit | |
Metric: k8s.node.network.errors
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.node.network.errors | Counter | {error} | Node network errors | | |
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
network.interface.name | string | The network interface name. | lo ; eth0 | Recommended | |
network.io.direction | string | The network IO operation direction. | transmit | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
receive | receive | |
transmit | transmit | |
Deployment metrics
Description: Deployment level metrics captured under the namespace k8s.deployment.
Metric: k8s.deployment.desired_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.deployment.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this deployment [1] | | |
[1]: This metric aligns with the replicas field of the K8s DeploymentSpec.
This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.
Metric: k8s.deployment.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the K8s DeploymentStatus.
This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.
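The two Deployment metrics map onto one spec field and one status field. The sketch below reads them from a Deployment object represented as a plain dict, as it might look after decoding the JSON returned by the Kubernetes API. The deployment_metrics function is hypothetical; the fallback values are assumptions of this example (the API documents a default of 1 for spec.replicas, and availableReplicas is treated as 0 when absent).

```python
def deployment_metrics(deployment):
    """Map Deployment fields onto the k8s.deployment.* metric names."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})
    return {
        # replicas field of DeploymentSpec; assumed to default to 1 if unset.
        "k8s.deployment.desired_pods": spec.get("replicas", 1),
        # availableReplicas field of DeploymentStatus; assumed 0 if absent.
        "k8s.deployment.available_pods": status.get("availableReplicas", 0),
    }


# Illustrative object; only the fields used above are shown.
example = {
    "metadata": {"name": "checkout"},
    "spec": {"replicas": 3},
    "status": {"availableReplicas": 2},
}
values = deployment_metrics(example)
```

The ReplicaSet, ReplicationController, and StatefulSet metrics described in the following sections follow the same pattern with their respective spec and status fields.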
ReplicaSet metrics
Description: ReplicaSet level metrics captured under the namespace k8s.replicaset.
Metric: k8s.replicaset.desired_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.replicaset.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this replicaset [1] | | |
[1]: This metric aligns with the replicas field of the K8s ReplicaSetSpec.
This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.
Metric: k8s.replicaset.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the K8s ReplicaSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.
ReplicationController metrics
Description: ReplicationController level metrics captured under the namespace k8s.replicationcontroller.
Metric: k8s.replicationcontroller.desired_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.replicationcontroller.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this replication controller [1] | | |
[1]: This metric aligns with the replicas field of the K8s ReplicationControllerSpec.
This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.
Metric: k8s.replicationcontroller.available_pods
This metric is recommended.
[1]: This metric aligns with the availableReplicas field of the K8s ReplicationControllerStatus.
This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.
StatefulSet metrics
Description: StatefulSet level metrics captured under the namespace k8s.statefulset.
Metric: k8s.statefulset.desired_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.statefulset.desired_pods | UpDownCounter | {pod} | Number of desired replica pods in this statefulset [1] | | |
[1]: This metric aligns with the replicas field of the K8s StatefulSetSpec.
This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.
Metric: k8s.statefulset.ready_pods
This metric is recommended.
[1]: This metric aligns with the readyReplicas field of the K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.
Metric: k8s.statefulset.current_pods
This metric is recommended.
[1]: This metric aligns with the currentReplicas field of the K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.
Metric: k8s.statefulset.updated_pods
This metric is recommended.
[1]: This metric aligns with the updatedReplicas field of the K8s StatefulSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.
HorizontalPodAutoscaler metrics
Description: HorizontalPodAutoscaler level metrics captured under the namespace k8s.hpa.
Metric: k8s.hpa.desired_pods
This metric is recommended.
[1]: This metric aligns with the desiredReplicas field of the K8s HorizontalPodAutoscalerStatus.
This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.
Metric: k8s.hpa.current_pods
This metric is recommended.
[1]: This metric aligns with the currentReplicas field of the K8s HorizontalPodAutoscalerStatus.
This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.
Metric: k8s.hpa.max_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.hpa.max_pods | UpDownCounter | {pod} | The upper limit for the number of replica pods to which the autoscaler can scale up [1] | | |
[1]: This metric aligns with the maxReplicas field of the K8s HorizontalPodAutoscalerSpec.
This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.
Metric: k8s.hpa.min_pods
This metric is recommended.
[1]: This metric aligns with the minReplicas field of the K8s HorizontalPodAutoscalerSpec.
This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.
DaemonSet metrics
Description: DaemonSet level metrics captured under the namespace k8s.daemonset.
Metric: k8s.daemonset.current_scheduled_nodes
This metric is recommended.
[1]: This metric aligns with the currentNumberScheduled field of the K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
Metric: k8s.daemonset.desired_scheduled_nodes
This metric is recommended.
[1]: This metric aligns with the desiredNumberScheduled field of the K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
Metric: k8s.daemonset.misscheduled_nodes
This metric is recommended.
[1]: This metric aligns with the numberMisscheduled field of the K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
Metric: k8s.daemonset.ready_nodes
This metric is recommended.
[1]: This metric aligns with the numberReady field of the K8s DaemonSetStatus.
This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
Job metrics
Description: Job level metrics captured under the namespace k8s.job.
Metric: k8s.job.active_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.job.active_pods | UpDownCounter | {pod} | The number of pending and actively running pods for a job [1] | | |
[1]: This metric aligns with the active field of the K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a k8s.job resource.
Metric: k8s.job.failed_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.job.failed_pods | UpDownCounter | {pod} | The number of pods which reached phase Failed for a job [1] | | |
[1]: This metric aligns with the failed field of the K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a k8s.job resource.
Metric: k8s.job.successful_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.job.successful_pods | UpDownCounter | {pod} | The number of pods which reached phase Succeeded for a job [1] | | |
[1]: This metric aligns with the succeeded field of the K8s JobStatus.
This metric SHOULD, at a minimum, be reported against a k8s.job resource.
Metric: k8s.job.desired_successful_pods
This metric is recommended.
[1]: This metric aligns with the completions field of the K8s JobSpec.
This metric SHOULD, at a minimum, be reported against a k8s.job resource.
Metric: k8s.job.max_parallel_pods
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.job.max_parallel_pods | UpDownCounter | {pod} | The max desired number of pods the job should run at any given time [1] | | |
[1]: This metric aligns with the parallelism field of the K8s JobSpec.
This metric SHOULD, at a minimum, be reported against a k8s.job resource.
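The five Job metrics each align with a single JobStatus or JobSpec field, which can be summarized as a lookup table. The sketch below applies such a table to a Job object represented as a plain dict. The JOB_FIELD_MAP and job_metrics names are hypothetical, and skipping a metric when its field is absent is an assumption of this example; a collector could equally choose to report 0.

```python
# Metric name -> (section of the Job object, field name), per the notes above.
JOB_FIELD_MAP = {
    "k8s.job.active_pods": ("status", "active"),
    "k8s.job.failed_pods": ("status", "failed"),
    "k8s.job.successful_pods": ("status", "succeeded"),
    "k8s.job.desired_successful_pods": ("spec", "completions"),
    "k8s.job.max_parallel_pods": ("spec", "parallelism"),
}


def job_metrics(job):
    """Return metric values for every mapped field present on the Job."""
    values = {}
    for metric, (section, field_name) in JOB_FIELD_MAP.items():
        value = job.get(section, {}).get(field_name)
        if value is not None:
            values[metric] = value
    return values


# Illustrative object; only the fields used above are shown.
example_job = {
    "spec": {"completions": 5, "parallelism": 2},
    "status": {"active": 2, "succeeded": 3},
}
job_values = job_metrics(example_job)
```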
CronJob metrics
Description: CronJob level metrics captured under the namespace k8s.cronjob.
Metric: k8s.cronjob.active_jobs
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.cronjob.active_jobs | UpDownCounter | {job} | The number of actively running jobs for a cronjob [1] | | |
[1]: This metric aligns with the active field of the K8s CronJobStatus.
This metric SHOULD, at a minimum, be reported against a k8s.cronjob resource.
Namespace metrics
Description: Namespace level metrics captured under the namespace k8s.namespace.
Metric: k8s.namespace.phase
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.namespace.phase | UpDownCounter | {namespace} | Describes number of K8s namespaces that are currently in a given phase. [1] | | |
[1]: This metric SHOULD, at a minimum, be reported against a k8s.namespace resource.
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
k8s.namespace.phase | string | The phase of the K8s namespace. [1] | active ; terminating | Required | |
[1] k8s.namespace.phase: This attribute aligns with the phase field of the K8s NamespaceStatus.
k8s.namespace.phase has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
active | Active namespace phase as described by K8s API | |
terminating | Terminating namespace phase as described by K8s API | |
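As a rough illustration of "number of namespaces in a given phase", the sketch below counts namespaces by phase from a list of Namespace objects represented as plain dicts. The namespace_phase_counts function is hypothetical. It assumes the API reports the phase as Active or Terminating and lowercases it to match the well-known attribute values; aggregating across a list is one reading of the description, and a collector reporting against individual k8s.namespace resources would instead emit a value per namespace.

```python
from collections import Counter

# Well-known values for the k8s.namespace.phase attribute.
NAMESPACE_PHASES = ("active", "terminating")


def namespace_phase_counts(namespaces):
    """Count namespaces per phase, including zero counts for well-known phases."""
    counts = Counter(ns["status"]["phase"].lower() for ns in namespaces)
    for phase in NAMESPACE_PHASES:
        counts.setdefault(phase, 0)
    return dict(counts)


# Illustrative objects; only the field used above is shown.
phase_counts = namespace_phase_counts([
    {"status": {"phase": "Active"}},
    {"status": {"phase": "Active"}},
    {"status": {"phase": "Terminating"}},
])
```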
K8s Container metrics
Description: K8s Container level metrics captured under the namespace k8s.container.
Metric: k8s.container.cpu.limit
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.cpu.limit | Gauge | {cpu} | Maximum CPU resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.cpu.request
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.cpu.request | Gauge | {cpu} | CPU resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.memory.limit
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.memory.limit | Gauge | By | Maximum memory resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.memory.request
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.memory.request | Gauge | By | Memory resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
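Kubernetes expresses resource limits and requests as quantity strings such as 500m or 128Mi, while the metrics above use the units {cpu} and By. The sketch below shows one way to convert between the two. The parse_quantity and container_resource_metrics functions are hypothetical, and only a subset of quantity suffixes is handled; it is an illustration of the unit conversion, not a complete quantity parser.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes handled by this sketch.
_SUFFIXES = {
    "Ki": Decimal(2 ** 10), "Mi": Decimal(2 ** 20),
    "Gi": Decimal(2 ** 30), "Ti": Decimal(2 ** 40),
    "m": Decimal("0.001"),
    "k": Decimal(10 ** 3), "M": Decimal(10 ** 6),
    "G": Decimal(10 ** 9), "T": Decimal(10 ** 12),
}


def parse_quantity(text):
    """Convert a quantity string such as '500m' or '128Mi' to a Decimal."""
    # Check two-character suffixes before one-character ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def container_resource_metrics(container):
    """Map a container's resources block onto CPU and memory metric names."""
    resources = container.get("resources", {})
    values = {}
    for kind, suffix in (("limits", "limit"), ("requests", "request")):
        block = resources.get(kind, {})
        if "cpu" in block:
            values["k8s.container.cpu." + suffix] = float(parse_quantity(block["cpu"]))
        if "memory" in block:
            values["k8s.container.memory." + suffix] = int(parse_quantity(block["memory"]))
    return values


# Illustrative container spec fragment.
resource_values = container_resource_metrics({
    "resources": {
        "limits": {"cpu": "500m", "memory": "128Mi"},
        "requests": {"cpu": "250m", "memory": "64Mi"},
    }
})
```

Using Decimal for the intermediate value avoids binary floating point rounding when scaling by suffixes such as m.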
Metric: k8s.container.storage.limit
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.storage.limit | Gauge | By | Maximum storage resource limit set for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.storage.request
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.storage.request | Gauge | By | Storage resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.ephemeral_storage.limit
This metric is recommended.
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.ephemeral_storage.request
This metric is recommended.
Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
---|---|---|---|---|---|
k8s.container.ephemeral_storage.request | Gauge | By | Ephemeral storage resource requested for the container [1] | | k8s.container |
[1]: See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#resourcerequirements-v1-core for details.
Metric: k8s.container.restart.count
This metric is recommended.
[1]: This value is pulled directly from the K8s API. It can grow indefinitely and can be reset to 0 at any time, depending on how your kubelet is configured to prune dead containers. It is best not to depend on the exact value. Instead, treat it as either == 0, in which case you can conclude there were no restarts in the recent past, or > 0, in which case you can conclude there were restarts in the recent past, and do not analyze the value beyond that.
Metric: k8s.container.ready
This metric is recommended.
[1]: This metric SHOULD reflect the value of the ready field in the K8s ContainerStatus.