Semantic conventions for Kubernetes metrics

K8s metrics

This document describes instruments and attributes for common K8s level metrics in OpenTelemetry. These metrics are collected from technology-specific, well-defined APIs (e.g. Kubelet's API).

Metrics in `k8s.` instruments SHOULD be attached to a K8s Resource and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`.
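
As an illustration of what inheriting resource attributes means in practice, the following minimal sketch layers a data point's own attributes on top of the attributes of the resource it is attached to. The `DataPoint` type, the `effective_attributes` helper, and the example Pod name and UID are hypothetical and are not part of any OpenTelemetry SDK.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataPoint:
    """A single measurement with its own point-level attributes (hypothetical type)."""
    metric: str
    value: float
    attributes: dict = field(default_factory=dict)


def effective_attributes(resource: dict, point: DataPoint) -> dict:
    """Return the resource attributes with the point attributes layered on top."""
    return {**resource, **point.attributes}


# Hypothetical Pod resource; only the attribute names come from the conventions.
pod_resource = {
    "k8s.pod.name": "checkout-7d9f",
    "k8s.pod.uid": "275ecb36-5aa8-4c2a-9c47-d8bb681b9aff",
}
point = DataPoint("k8s.pod.memory.usage", 52428800.0)
print(effective_attributes(pod_resource, point))
```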

- Pod metrics
- Container metrics
  - Metric: `k8s.container.status.state`
  - Metric: `k8s.container.status.reason`
- Node metrics
- Deployment metrics
  - Metric: `k8s.deployment.desired_pods`
  - Metric: `k8s.deployment.available_pods`
- ReplicaSet metrics
  - Metric: `k8s.replicaset.desired_pods`
  - Metric: `k8s.replicaset.available_pods`
- ReplicationController metrics
  - Metric: `k8s.replicationcontroller.desired_pods`
  - Metric: `k8s.replicationcontroller.available_pods`
- StatefulSet metrics
- HorizontalPodAutoscaler metrics
- DaemonSet metrics
- Job metrics
- CronJob metrics
  - Metric: `k8s.cronjob.active_jobs`
- Namespace metrics
  - Metric: `k8s.namespace.phase`
- K8s Container metrics

Pod metrics

Description: Pod level metrics captured under the namespace k8s.pod.

Metric: `k8s.pod.uptime`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.uptime`	Gauge	`s`	The time the Pod has been running [1]

[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available. The actual accuracy would depend on the instrumentation and operating system.
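
A minimal sketch of the note above: uptime is computed as floating-point seconds rather than truncated to an integer. It assumes the start time is available as a timezone-aware timestamp (for example, parsed from the Pod's start time); the `uptime_seconds` helper is hypothetical.

```python
from datetime import datetime, timezone


def uptime_seconds(start_time: datetime, now: datetime) -> float:
    """Return the elapsed time since start_time as floating-point seconds."""
    return (now - start_time).total_seconds()


# Hypothetical timestamps, 1.5 seconds apart.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 12, 0, 1, 500000, tzinfo=timezone.utc)
print(uptime_seconds(start, now))  # 1.5
```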

Metric: `k8s.pod.cpu.time`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.cpu.time`	Counter	`s`	Total CPU time consumed [1]

[1]: Total CPU time consumed by the specific Pod on all available CPU cores

Metric: `k8s.pod.cpu.usage`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.cpu.usage`	Gauge	`{cpu}`	Pod's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1]

[1]: CPU usage of the specific Pod on all available CPU cores, averaged over the sample window
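
One way to read the relationship between `k8s.pod.cpu.time` and `k8s.pod.cpu.usage` is that usage is the change in cumulative CPU time divided by the length of the sample window. The sketch below assumes two samples of the form `(timestamp_s, cpu_time_s)`; the `cpu_usage` helper is hypothetical, and an instrumentation may instead read usage directly from its data source.

```python
from typing import Tuple


def cpu_usage(prev: Tuple[float, float], curr: Tuple[float, float]) -> float:
    """Average CPU usage in cpus between two (timestamp_s, cpu_time_s) samples."""
    prev_ts, prev_cpu = prev
    curr_ts, curr_cpu = curr
    window = curr_ts - prev_ts
    if window <= 0:
        raise ValueError("samples must be in increasing time order")
    # CPU seconds consumed per wall-clock second, summed over all cores.
    return (curr_cpu - prev_cpu) / window


# 5 CPU-seconds consumed over a 10-second window: half a CPU on average.
print(cpu_usage((0.0, 10.0), (10.0, 15.0)))  # 0.5
```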

Metric: `k8s.pod.memory.usage`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.memory.usage`	Gauge	`By`	Memory usage of the Pod [1]

[1]: Total memory usage of the Pod

Metric: `k8s.pod.network.io`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.network.io`	Counter	`By`	Network bytes for the Pod

Attribute	Type	Description	Examples	Requirement Level	Stability
`network.interface.name`	string	The network interface name.	`lo`; `eth0`	`Recommended`
`network.io.direction`	string	The network IO operation direction.	`transmit`	`Recommended`

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`receive`	receive
`transmit`	transmit

Metric: `k8s.pod.network.errors`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.pod.network.errors`	Counter	`{error}`	Pod network errors

Attribute	Type	Description	Examples	Requirement Level	Stability
`network.interface.name`	string	The network interface name.	`lo`; `eth0`	`Recommended`
`network.io.direction`	string	The network IO operation direction.	`transmit`	`Recommended`

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`receive`	receive
`transmit`	transmit
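
The two network metrics above share the same attributes, so each interface contributes one data point per metric and direction. The following sketch assumes per-interface statistics arrive as dictionaries with `name`, `rxBytes`, `txBytes`, `rxErrors` and `txErrors` keys; that input shape and the `network_points` helper are assumptions of this example.

```python
# (metric name, network.io.direction value, assumed input field)
NETWORK_FIELDS = [
    ("k8s.pod.network.io", "receive", "rxBytes"),
    ("k8s.pod.network.io", "transmit", "txBytes"),
    ("k8s.pod.network.errors", "receive", "rxErrors"),
    ("k8s.pod.network.errors", "transmit", "txErrors"),
]


def network_points(interfaces):
    """Return (metric, value, attributes) tuples for each interface and direction."""
    points = []
    for iface in interfaces:
        for metric, direction, key in NETWORK_FIELDS:
            if key not in iface:
                continue  # skip counters the source did not report
            attributes = {
                "network.interface.name": iface["name"],
                "network.io.direction": direction,
            }
            points.append((metric, iface[key], attributes))
    return points


# Hypothetical statistics for a single interface.
stats = [{"name": "eth0", "rxBytes": 100, "txBytes": 40, "rxErrors": 1, "txErrors": 0}]
for metric, value, attributes in network_points(stats):
    print(metric, value, attributes)
```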

Container metrics

Description: Container level metrics captured under the namespace k8s.container.

Metric: `k8s.container.status.state`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.container.status.state`	UpDownCounter	`{container}`	Describes the number of K8s containers that are currently in a given state [1]

[1]: All possible container states will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state will be non-zero.

Attribute	Type	Description	Examples	Requirement Level	Stability
`k8s.container.status.state`	string	The state of the container. K8s ContainerState	`terminated`; `running`; `waiting`	`Required`

k8s.container.status.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`running`	The container is running.
`terminated`	The container has terminated.
`waiting`	The container is waiting.
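
The reporting rule in note [1] can be sketched as follows: every well-known state is emitted at each interval, and only the current state carries a non-zero value. The `state_points` helper is hypothetical; including a custom state alongside the well-known ones is one possible reading of the MAY clause above.

```python
WELL_KNOWN_STATES = ("running", "terminated", "waiting")


def state_points(current_state: str) -> dict:
    """Map each reported state to 1 if it is the container's current state, else 0."""
    states = list(WELL_KNOWN_STATES)
    if current_state not in states:
        states.append(current_state)  # custom values MAY be used
    return {state: int(state == current_state) for state in states}


print(state_points("waiting"))  # {'running': 0, 'terminated': 0, 'waiting': 1}
```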

Metric: `k8s.container.status.reason`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.container.status.reason`	UpDownCounter	`{container}`	Describes the number of K8s containers that are currently in a state for a given reason [1]

[1]: All possible container state reasons will be reported at each time interval to avoid missing metrics. Only the value corresponding to the current state reason will be non-zero.

Attribute	Type	Description	Examples	Requirement Level	Stability
`k8s.container.status.reason`	string	The reason for the container state. Corresponds to the `reason` field of the: K8s ContainerStateWaiting or K8s ContainerStateTerminated	`ContainerCreating`; `CrashLoopBackOff`; `CreateContainerConfigError`; `ErrImagePull`; `ImagePullBackOff`; `OOMKilled`; `Completed`; `Error`; `ContainerCannotRun`	`Required`

k8s.container.status.reason has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`Completed`	The container has completed execution.
`ContainerCannotRun`	The container cannot run.
`ContainerCreating`	The container is being created.
`CrashLoopBackOff`	The container is in a crash loop back off state.
`CreateContainerConfigError`	There was an error creating the container configuration.
`ErrImagePull`	There was an error pulling the container image.
`Error`	There was an error with the container.
`ImagePullBackOff`	The container image pull is in back off state.
`OOMKilled`	The container was killed due to out of memory.
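
Because the reason comes from the `reason` field of either the waiting or the terminated state, an instrumentation has to look in both places. The sketch below assumes the container state is available as a dictionary keyed by `waiting`, `running` or `terminated`; the `container_reason` helper and that input shape are assumptions of this example.

```python
from typing import Optional

WELL_KNOWN_REASONS = {
    "Completed", "ContainerCannotRun", "ContainerCreating", "CrashLoopBackOff",
    "CreateContainerConfigError", "ErrImagePull", "Error", "ImagePullBackOff",
    "OOMKilled",
}


def container_reason(state: dict) -> Optional[str]:
    """Return the reason of a waiting or terminated state, or None if there is none."""
    for key in ("waiting", "terminated"):
        detail = state.get(key)
        if detail and detail.get("reason"):
            return detail["reason"]
    return None  # a running container has no reason field


reason = container_reason({"waiting": {"reason": "CrashLoopBackOff"}})
print(reason, reason in WELL_KNOWN_REASONS)  # CrashLoopBackOff True
```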

Node metrics

Description: Node level metrics captured under the namespace k8s.node.

Metric: `k8s.node.uptime`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
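`k8s.node.uptime`	Gauge	`s`	The time the Node has been running [1]

[1]: Instrumentations SHOULD use a gauge with type double and measure uptime in seconds as a floating point number with the highest precision available. The actual accuracy would depend on the instrumentation and operating system.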

Metric: `k8s.node.cpu.time`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.node.cpu.time`	Counter	`s`	Total CPU time consumed [1]

[1]: Total CPU time consumed by the specific Node on all available CPU cores

Metric: `k8s.node.cpu.usage`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.node.cpu.usage`	Gauge	`{cpu}`	Node's CPU usage, measured in cpus. Range from 0 to the number of allocatable CPUs [1]

[1]: CPU usage of the specific Node on all available CPU cores, averaged over the sample window

Metric: `k8s.node.memory.usage`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.node.memory.usage`	Gauge	`By`	Memory usage of the Node [1]

[1]: Total memory usage of the Node

Metric: `k8s.node.network.io`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.node.network.io`	Counter	`By`	Network bytes for the Node

Attribute	Type	Description	Examples	Requirement Level	Stability
`network.interface.name`	string	The network interface name.	`lo`; `eth0`	`Recommended`
`network.io.direction`	string	The network IO operation direction.	`transmit`	`Recommended`

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`receive`	receive
`transmit`	transmit

Metric: `k8s.node.network.errors`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.node.network.errors`	Counter	`{error}`	Node network errors

Attribute	Type	Description	Examples	Requirement Level	Stability
`network.interface.name`	string	The network interface name.	`lo`; `eth0`	`Recommended`
`network.io.direction`	string	The network IO operation direction.	`transmit`	`Recommended`

network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

Value	Description	Stability
`receive`	receive
`transmit`	transmit

Deployment metrics

Description: Deployment level metrics captured under the namespace k8s.deployment.

Metric: `k8s.deployment.desired_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.deployment.desired_pods`	UpDownCounter	`{pod}`	Number of desired replica pods in this deployment [1]

[1]: This metric aligns with the replicas field of the K8s DeploymentSpec.

This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.

Metric: `k8s.deployment.available_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.deployment.available_pods`	UpDownCounter	`{pod}`	Total number of available replica pods (ready for at least minReadySeconds) targeted by this deployment [1]

[1]: This metric aligns with the availableReplicas field of the K8s DeploymentStatus.

This metric SHOULD, at a minimum, be reported against a k8s.deployment resource.
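
To make the field alignment concrete, the sketch below reads the two Deployment metrics from an object shaped like a Deployment manifest, with `spec.replicas` and `status.availableReplicas`. The `deployment_metrics` helper is hypothetical, and treating a missing `availableReplicas` as 0 is an assumption of this example.

```python
def deployment_metrics(deployment: dict) -> dict:
    """Extract the Deployment pod counts from a Deployment-shaped dictionary."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})
    metrics = {
        # Assumption: an absent availableReplicas field means no available pods.
        "k8s.deployment.available_pods": status.get("availableReplicas", 0),
    }
    if "replicas" in spec:
        metrics["k8s.deployment.desired_pods"] = spec["replicas"]
    return metrics


# Hypothetical Deployment with 3 desired and 2 available replicas.
deployment = {"spec": {"replicas": 3}, "status": {"availableReplicas": 2}}
print(deployment_metrics(deployment))
```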

ReplicaSet metrics

Description: ReplicaSet level metrics captured under the namespace k8s.replicaset.

Metric: `k8s.replicaset.desired_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.replicaset.desired_pods`	UpDownCounter	`{pod}`	Number of desired replica pods in this replicaset [1]

[1]: This metric aligns with the replicas field of the K8s ReplicaSetSpec.

This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.

Metric: `k8s.replicaset.available_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.replicaset.available_pods`	UpDownCounter	`{pod}`	Total number of available replica pods (ready for at least minReadySeconds) targeted by this replicaset [1]

[1]: This metric aligns with the availableReplicas field of the K8s ReplicaSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.replicaset resource.

ReplicationController metrics

Description: ReplicationController level metrics captured under the namespace k8s.replicationcontroller.

Metric: `k8s.replicationcontroller.desired_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.replicationcontroller.desired_pods`	UpDownCounter	`{pod}`	Number of desired replica pods in this replication controller [1]

[1]: This metric aligns with the replicas field of the K8s ReplicationControllerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.

Metric: `k8s.replicationcontroller.available_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.replicationcontroller.available_pods`	UpDownCounter	`{pod}`	Total number of available replica pods (ready for at least minReadySeconds) targeted by this replication controller [1]

[1]: This metric aligns with the availableReplicas field of the K8s ReplicationControllerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.replicationcontroller resource.

StatefulSet metrics

Description: StatefulSet level metrics captured under the namespace k8s.statefulset.

Metric: `k8s.statefulset.desired_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.statefulset.desired_pods`	UpDownCounter	`{pod}`	Number of desired replica pods in this statefulset [1]

[1]: This metric aligns with the replicas field of the K8s StatefulSetSpec.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: `k8s.statefulset.ready_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.statefulset.ready_pods`	UpDownCounter	`{pod}`	The number of replica pods created for this statefulset with a Ready Condition [1]

[1]: This metric aligns with the readyReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: `k8s.statefulset.current_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.statefulset.current_pods`	UpDownCounter	`{pod}`	The number of replica pods created by the statefulset controller from the statefulset version indicated by currentRevision [1]

[1]: This metric aligns with the currentReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

Metric: `k8s.statefulset.updated_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.statefulset.updated_pods`	UpDownCounter	`{pod}`	Number of replica pods created by the statefulset controller from the statefulset version indicated by updateRevision [1]

[1]: This metric aligns with the updatedReplicas field of the K8s StatefulSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.statefulset resource.

HorizontalPodAutoscaler metrics

Description: HorizontalPodAutoscaler level metrics captured under the namespace k8s.hpa.

Metric: `k8s.hpa.desired_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.hpa.desired_pods`	UpDownCounter	`{pod}`	Desired number of replica pods managed by this horizontal pod autoscaler, as last calculated by the autoscaler [1]

[1]: This metric aligns with the desiredReplicas field of the K8s HorizontalPodAutoscalerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: `k8s.hpa.current_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.hpa.current_pods`	UpDownCounter	`{pod}`	Current number of replica pods managed by this horizontal pod autoscaler, as last seen by the autoscaler [1]

[1]: This metric aligns with the currentReplicas field of the K8s HorizontalPodAutoscalerStatus.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: `k8s.hpa.max_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.hpa.max_pods`	UpDownCounter	`{pod}`	The upper limit for the number of replica pods to which the autoscaler can scale up [1]

[1]: This metric aligns with the maxReplicas field of the K8s HorizontalPodAutoscalerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

Metric: `k8s.hpa.min_pods`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.hpa.min_pods`	UpDownCounter	`{pod}`	The lower limit for the number of replica pods to which the autoscaler can scale down [1]

[1]: This metric aligns with the minReplicas field of the K8s HorizontalPodAutoscalerSpec.

This metric SHOULD, at a minimum, be reported against a k8s.hpa resource.

DaemonSet metrics

Description: DaemonSet level metrics captured under the namespace k8s.daemonset.

Metric: `k8s.daemonset.current_scheduled_nodes`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.daemonset.current_scheduled_nodes`	UpDownCounter	`{node}`	Number of nodes that are running at least 1 daemon pod and are supposed to run the daemon pod [1]

[1]: This metric aligns with the currentNumberScheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: `k8s.daemonset.desired_scheduled_nodes`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.daemonset.desired_scheduled_nodes`	UpDownCounter	`{node}`	Number of nodes that should be running the daemon pod (including nodes currently running the daemon pod) [1]

[1]: This metric aligns with the desiredNumberScheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: `k8s.daemonset.misscheduled_nodes`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.daemonset.misscheduled_nodes`	UpDownCounter	`{node}`	Number of nodes that are running the daemon pod, but are not supposed to run the daemon pod [1]

[1]: This metric aligns with the numberMisscheduled field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.

Metric: `k8s.daemonset.ready_nodes`

This metric is recommended.

Name	Instrument Type	Unit (UCUM)	Description	Stability	Entity Associations
`k8s.daemonset.ready_nodes`	UpDownCounter	`{node}`	Number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready [1]

[1]: This metric aligns with the numberReady field of the K8s DaemonSetStatus.

This metric SHOULD, at a minimum, be reported against a k8s.daemonset resource.
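
The four DaemonSet metrics above each align with one field of the DaemonSetStatus, which lends itself to a table-driven mapping. The following sketch assumes the status is available as a dictionary keyed by those field names; the `daemonset_metrics` helper is hypothetical, and fields missing from the input are skipped rather than guessed.

```python
# Metric name -> DaemonSetStatus field, as listed in the notes above.
DAEMONSET_FIELDS = {
    "k8s.daemonset.current_scheduled_nodes": "currentNumberScheduled",
    "k8s.daemonset.desired_scheduled_nodes": "desiredNumberScheduled",
    "k8s.daemonset.misscheduled_nodes": "numberMisscheduled",
    "k8s.daemonset.ready_nodes": "numberReady",
}


def daemonset_metrics(status: dict) -> dict:
    """Return the node counts found in a DaemonSetStatus-shaped dictionary."""
    return {
        metric: status[field_name]
        for metric, field_name in DAEMONSET_FIELDS.items()
        if field_name in status
    }


# Hypothetical status for a DaemonSet scheduled on 5 nodes, 4 of them ready.
status = {
    "currentNumberScheduled": 5,
    "desiredNumberScheduled": 5,
    "numberMisscheduled": 0,
    "numberReady": 4,
}
print(daemonset_metrics(status))
```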

Job metrics

Description: Job level metrics captured under the namespace k8s.job.