Add k8s metrics for jobs and cronjobs (#1660)

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Liudmila Molkova <limolkova@microsoft.com>
Christos Markou 2025-01-09 02:01:00 +02:00 committed by GitHub
parent 42165ae045
commit f0c108784d
4 changed files with 315 additions and 0 deletions

.chloggen/add_k8s_jobs.yaml Executable file

@ -0,0 +1,22 @@
# Use this changelog template to create an entry for release notes.
#
# If your change doesn't affect end users you should instead start
# your pull request title with [chore] or use the "Skip Changelog" label.
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement
# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
component: k8s
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add k8s metrics for job and cronjob
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
# The values here must be integers.
issues: [1660]
# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:


@ -49,6 +49,8 @@ and one for disabling the old schema called `semconv.k8s.disableLegacy`. Then:
- [K8s StatefulsSet metrics](#k8s-statefulsset-metrics)
- [K8s HorizontalPodAutoscaler metrics](#k8s-horizontalpodautoscaler-metrics)
- [K8s DaemonSet metrics](#k8s-daemonset-metrics)
- [K8s Job metrics](#k8s-job-metrics)
- [K8s CronJob metrics](#k8s-cronjob-metrics)
<!-- tocstop -->
@ -195,3 +197,40 @@ The changes in their metric types are the following:
| `k8s.daemonset.ready_nodes` (type: `gauge`) | `k8s.daemonset.ready_nodes` (type: `updowncounter`) |
<!-- prettier-ignore-end -->
### K8s Job metrics
The K8s Job metrics implemented by the Collector and specifically the
[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.115.0/receiver/k8sclusterreceiver/documentation.md)
receiver were introduced as semantic conventions in
[#1660](https://github.com/open-telemetry/semantic-conventions/pull/1660) (TODO: replace with SemConv version once
available).
The changes in their metric types are the following:
<!-- prettier-ignore-start -->
| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New |
|----------------------------------------------------------|----------------------------------------|
| `k8s.job.active_pods` (type: `gauge`) | `k8s.job.active_pods` (type: `updowncounter`) |
| `k8s.job.failed_pods` (type: `gauge`) | `k8s.job.failed_pods` (type: `updowncounter`) |
| `k8s.job.desired_successful_pods` (type: `gauge`) | `k8s.job.desired_successful_pods` (type: `updowncounter`) |
| `k8s.job.max_parallel_pods` (type: `gauge`) | `k8s.job.max_parallel_pods` (type: `updowncounter`) |
<!-- prettier-ignore-end -->
### K8s CronJob metrics
The K8s CronJob metrics implemented by the Collector and specifically the
[k8scluster](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.115.0/receiver/k8sclusterreceiver/documentation.md)
receiver were introduced as semantic conventions in
[#1660](https://github.com/open-telemetry/semantic-conventions/pull/1660) (TODO: replace with SemConv version once
available).
The changes in their metric types are the following:
<!-- prettier-ignore-start -->
| Old (Collector) ![changed](https://img.shields.io/badge/changed-orange?style=flat) | New |
|--------------------------------------------------|--------------------------------|
| `k8s.cronjob.active_jobs` (type: `gauge`) | `k8s.cronjob.active_jobs` (type: `updowncounter`) |
<!-- prettier-ignore-end -->
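
The tables above change only the instrument type, not the metric names. One possible consequence for a producer, sketched below in Python, is that a synchronous UpDownCounter records changes rather than absolute readings. The `DeltaTracker` class, the `"my-job"` key, and the sample readings are hypothetical and only illustrate the arithmetic; they are not taken from the Collector or any SDK.

```python
class DeltaTracker:
    """Converts successive absolute readings into deltas.

    Hypothetical helper: the resulting deltas are what a synchronous
    UpDownCounter-style add() call would receive.
    """

    def __init__(self):
        # Last absolute value seen per key (for example, per job name).
        self._last = {}

    def delta(self, key, current):
        # A key that has never been seen starts from zero.
        previous = self._last.get(key, 0)
        self._last[key] = current
        return current - previous


tracker = DeltaTracker()
# Assumed sample readings of k8s.job.active_pods for one job over time.
readings = [3, 5, 2]
deltas = [tracker.delta("my-job", value) for value in readings]
# The running sum of the deltas equals the latest absolute reading.
total = sum(deltas)
```

Here the sum of the recorded deltas always matches the last absolute reading, which is the value a gauge would have reported directly.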


@ -57,6 +57,14 @@ and therefore inherit its attributes, like `k8s.pod.name` and `k8s.pod.uid`.
- [Metric: `k8s.daemonset.desired_scheduled_nodes`](#metric-k8sdaemonsetdesired_scheduled_nodes)
- [Metric: `k8s.daemonset.misscheduled_nodes`](#metric-k8sdaemonsetmisscheduled_nodes)
- [Metric: `k8s.daemonset.ready_nodes`](#metric-k8sdaemonsetready_nodes)
- [Job Metrics](#job-metrics)
- [Metric: `k8s.job.active_pods`](#metric-k8sjobactive_pods)
- [Metric: `k8s.job.failed_pods`](#metric-k8sjobfailed_pods)
- [Metric: `k8s.job.successful_pods`](#metric-k8sjobsuccessful_pods)
- [Metric: `k8s.job.desired_successful_pods`](#metric-k8sjobdesired_successful_pods)
- [Metric: `k8s.job.max_parallel_pods`](#metric-k8sjobmax_parallel_pods)
- [CronJob Metrics](#cronjob-metrics)
- [Metric: `k8s.cronjob.active_jobs`](#metric-k8scronjobactive_jobs)
<!-- tocstop -->
@ -856,5 +864,169 @@ This metric SHOULD, at a minimum, be reported against a
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
## Job Metrics
**Description:** Job level metrics captured under the namespace `k8s.job`.
### Metric: `k8s.job.active_pods`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.job.active_pods -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.job.active_pods` | UpDownCounter | `{pod}` | The number of pending and actively running pods for a job [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `active` field of the
[K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.job`](../resource/k8s.md#job) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
### Metric: `k8s.job.failed_pods`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.job.failed_pods -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.job.failed_pods` | UpDownCounter | `{pod}` | The number of pods which reached phase Failed for a job [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `failed` field of the
[K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.job`](../resource/k8s.md#job) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
### Metric: `k8s.job.successful_pods`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.job.successful_pods -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.job.successful_pods` | UpDownCounter | `{pod}` | The number of pods which reached phase Succeeded for a job [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `succeeded` field of the
[K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.job`](../resource/k8s.md#job) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
### Metric: `k8s.job.desired_successful_pods`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.job.desired_successful_pods -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.job.desired_successful_pods` | UpDownCounter | `{pod}` | The desired number of successfully finished pods the job should be run with [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `completions` field of the
[K8s JobSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobspec-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.job`](../resource/k8s.md#job) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
### Metric: `k8s.job.max_parallel_pods`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.job.max_parallel_pods -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.job.max_parallel_pods` | UpDownCounter | `{pod}` | The max desired number of pods the job should run at any given time [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `parallelism` field of the
[K8s JobSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobspec-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.job`](../resource/k8s.md#job) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
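
The field alignment described in the notes above can be sketched as a small mapping from a Job object to metric values. This is a minimal illustration, assuming the Job is available as a plain dictionary shaped like the Kubernetes API JSON; the `job_metrics` function and the sample object are hypothetical and not part of any receiver.

```python
def job_metrics(job):
    """Maps a Job object (as a dict) to the k8s.job.* metric values.

    Hypothetical helper: status counters default to 0 when absent, and
    spec-derived metrics are omitted when the field is not set.
    """
    status = job.get("status") or {}
    spec = job.get("spec") or {}
    values = {
        "k8s.job.active_pods": status.get("active") or 0,
        "k8s.job.failed_pods": status.get("failed") or 0,
        "k8s.job.successful_pods": status.get("succeeded") or 0,
        "k8s.job.desired_successful_pods": spec.get("completions"),
        "k8s.job.max_parallel_pods": spec.get("parallelism"),
    }
    # Drop metrics whose source field is missing from the spec.
    return {name: value for name, value in values.items() if value is not None}


# Assumed sample Job: 5 completions wanted, at most 2 pods in parallel.
sample_job = {
    "spec": {"completions": 5, "parallelism": 2},
    "status": {"active": 2, "succeeded": 1},
}
metrics = job_metrics(sample_job)
```

Omitting a spec-derived metric when its field is unset is a design choice of this sketch, not something the conventions prescribe.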
## CronJob Metrics
**Description:** CronJob level metrics captured under the namespace `k8s.cronjob`.
### Metric: `k8s.cronjob.active_jobs`
This metric is [recommended][MetricRecommended].
<!-- semconv metric.k8s.cronjob.active_jobs -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->
| Name | Instrument Type | Unit (UCUM) | Description | Stability |
| -------- | --------------- | ----------- | -------------- | --------- |
| `k8s.cronjob.active_jobs` | UpDownCounter | `{job}` | The number of actively running jobs for a cronjob [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
**[1]:** This metric aligns with the `active` field of the
[K8s CronJobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#cronjobstatus-v1-batch).
This metric SHOULD, at a minimum, be reported against a
[`k8s.cronjob`](../resource/k8s.md#cronjob) resource.
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
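
In the Kubernetes API the `active` field of CronJobStatus is a list of references to running jobs, so the metric value is the length of that list. The sketch below assumes the CronJob is available as a plain dictionary shaped like the API JSON; the `cronjob_active_jobs` function and the sample object names are hypothetical.

```python
def cronjob_active_jobs(cronjob):
    """Returns the k8s.cronjob.active_jobs value for a CronJob dict.

    Hypothetical helper: a missing or null `status.active` counts as 0.
    """
    status = cronjob.get("status") or {}
    active = status.get("active") or []
    return len(active)


# Assumed sample CronJob with two jobs currently running.
sample_cronjob = {
    "status": {
        "active": [
            {"kind": "Job", "name": "backup-001"},
            {"kind": "Job", "name": "backup-002"},
        ]
    }
}
active_jobs = cronjob_active_jobs(sample_cronjob)
```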
[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status
[MetricRecommended]: /docs/general/metric-requirement-level.md#recommended


@ -360,3 +360,85 @@ groups:
      [`k8s.daemonset`](../resource/k8s.md#daemonset) resource.
    instrument: updowncounter
    unit: "{node}"

  # k8s.job.* metrics
  - id: metric.k8s.job.active_pods
    type: metric
    metric_name: k8s.job.active_pods
    stability: experimental
    brief: "The number of pending and actively running pods for a job"
    instrument: updowncounter
    unit: "{pod}"
    note: |
      This metric aligns with the `active` field of the
      [K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.job`](../resource/k8s.md#job) resource.
  - id: metric.k8s.job.failed_pods
    type: metric
    metric_name: k8s.job.failed_pods
    stability: experimental
    brief: "The number of pods which reached phase Failed for a job"
    instrument: updowncounter
    unit: "{pod}"
    note: |
      This metric aligns with the `failed` field of the
      [K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.job`](../resource/k8s.md#job) resource.
  - id: metric.k8s.job.successful_pods
    type: metric
    metric_name: k8s.job.successful_pods
    stability: experimental
    brief: "The number of pods which reached phase Succeeded for a job"
    instrument: updowncounter
    unit: "{pod}"
    note: |
      This metric aligns with the `succeeded` field of the
      [K8s JobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobstatus-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.job`](../resource/k8s.md#job) resource.
  - id: metric.k8s.job.desired_successful_pods
    type: metric
    metric_name: k8s.job.desired_successful_pods
    stability: experimental
    brief: "The desired number of successfully finished pods the job should be run with"
    instrument: updowncounter
    unit: "{pod}"
    note: |
      This metric aligns with the `completions` field of the
      [K8s JobSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobspec-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.job`](../resource/k8s.md#job) resource.
  - id: metric.k8s.job.max_parallel_pods
    type: metric
    metric_name: k8s.job.max_parallel_pods
    stability: experimental
    brief: "The max desired number of pods the job should run at any given time"
    instrument: updowncounter
    unit: "{pod}"
    note: |
      This metric aligns with the `parallelism` field of the
      [K8s JobSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#jobspec-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.job`](../resource/k8s.md#job) resource.

  # k8s.cronjob.* metrics
  - id: metric.k8s.cronjob.active_jobs
    type: metric
    metric_name: k8s.cronjob.active_jobs
    stability: experimental
    brief: "The number of actively running jobs for a cronjob"
    instrument: updowncounter
    unit: "{job}"
    note: |
      This metric aligns with the `active` field of the
      [K8s CronJobStatus](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#cronjobstatus-v1-batch).

      This metric SHOULD, at a minimum, be reported against a
      [`k8s.cronjob`](../resource/k8s.md#cronjob) resource.