diff --git a/.chloggen/requests-duration.yaml b/.chloggen/requests-duration.yaml new file mode 100644 index 000000000..a22113d54 --- /dev/null +++ b/.chloggen/requests-duration.yaml @@ -0,0 +1,22 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: 'enhancement' + +# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) +component: 'otel' + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Adds SDK self-monitoring metric for exporter call duration + +# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. +# The values here must be integers. +issues: [1906] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. +subtext: diff --git a/docs/otel/sdk-metrics.md b/docs/otel/sdk-metrics.md index d63cc5174..e27fcc00f 100644 --- a/docs/otel/sdk-metrics.md +++ b/docs/otel/sdk-metrics.md @@ -27,6 +27,8 @@ This document describes metrics emitted by the OpenTelemetry SDK components them - [Metric: `otel.sdk.processor.log.processed`](#metric-otelsdkprocessorlogprocessed) - [Metric: `otel.sdk.exporter.log.inflight`](#metric-otelsdkexporterloginflight) - [Metric: `otel.sdk.exporter.log.exported`](#metric-otelsdkexporterlogexported) +- [Operation Metrics](#operation-metrics) + - [Metric: `otel.sdk.exporter.operation.duration`](#metric-otelsdkexporteroperationduration) @@ -885,5 +887,135 @@ E.g. for Java the fully qualified classname SHOULD be used in this case. 
+## Operation Metrics + +### Metric: `otel.sdk.exporter.operation.duration` + +This metric is [recommended][MetricRecommended]. + +This metric SHOULD be specified with +[`ExplicitBucketBoundaries`](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.42.0/specification/metrics/api.md#instrument-advisory-parameters) +with a single bucket with no boundaries. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | +| -------- | --------------- | ----------- | -------------- | --------- | +| `otel.sdk.exporter.operation.duration` | Histogram | `s` | The duration of exporting a batch of telemetry records. [1] | ![Development](https://img.shields.io/badge/-development-blue) | + +**[1]:** This metric defines successful operations using the full success definitions for [http](https://github.com/open-telemetry/opentelemetry-proto/blob/v1.5.0/docs/specification.md#full-success-1) +and [grpc](https://github.com/open-telemetry/opentelemetry-proto/blob/v1.5.0/docs/specification.md#full-success). Anything else is defined as an unsuccessful operation. For successful +operations, `error.type` MUST NOT be set. For unsuccessful export operations, `error.type` MUST contain a relevant failure cause. + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [1] | `rejected`; `timeout`; `500`; `java.net.UnknownHostException` | `Conditionally Required` If operation has ended with an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`http.response.status_code`](/docs/attributes-registry/http.md) | int | The HTTP status code of the last HTTP request performed in scope of this export call.
| `200` | `Recommended` when applicable | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`otel.component.name`](/docs/attributes-registry/otel.md) | string | A name uniquely identifying the instance of the OpenTelemetry component within its containing SDK instance. [2] | `otlp_grpc_span_exporter/0`; `custom-name` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) | +| [`otel.component.type`](/docs/attributes-registry/otel.md) | string | A name identifying the type of the OpenTelemetry component. [3] | `otlp_grpc_span_exporter`; `com.example.MySpanExporter` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) | +| [`rpc.grpc.status_code`](/docs/attributes-registry/rpc.md) | int | The gRPC status code of the last gRPC request performed in scope of this export call. | `0`; `1`; `2` | `Recommended` when applicable | ![Development](https://img.shields.io/badge/-development-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` when applicable | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [5] | `80`; `8080`; `443` | `Recommended` when applicable | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1] `error.type`:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality. + +When `error.type` is set to a type (e.g., an exception type), its +canonical class name identifying the type within the artifact SHOULD be used. + +Instrumentations SHOULD document the list of errors they report. + +The cardinality of `error.type` within one instrumentation library SHOULD be low.
+Telemetry consumers that aggregate data from multiple instrumentation libraries and applications +should be prepared for `error.type` to have high cardinality at query time when no +additional filters are applied. + +If the operation has completed successfully, instrumentations SHOULD NOT set `error.type`. + +If a specific domain defines its own set of error identifiers (such as HTTP or gRPC status codes), +it's RECOMMENDED to: + +- Use a domain-specific attribute +- Set `error.type` to capture all errors, regardless of whether they are defined within the domain-specific set or not. + +**[2] `otel.component.name`:** Implementations SHOULD ensure a low cardinality for this attribute, even across application or SDK restarts. +E.g. implementations MUST NOT use UUIDs as values for this attribute. + +Implementations MAY achieve these goals by following a `<otel.component.type>/<instance-counter>` pattern, e.g. `batching_span_processor/0`. +Hereby `otel.component.type` refers to the corresponding attribute value of the component. + +The value of `instance-counter` MAY be automatically assigned by the component and uniqueness within the enclosing SDK instance MUST be guaranteed. +For example, `<instance-counter>` MAY be implemented by using a monotonically increasing counter (starting with `0`), which is incremented every time an +instance of the given component type is started. + +With this implementation, for example the first Batching Span Processor would have `batching_span_processor/0` +as `otel.component.name`, the second one `batching_span_processor/1` and so on. +These values will therefore be reused in the case of an application restart. + +**[3] `otel.component.type`:** If none of the standardized values apply, implementations SHOULD use the language-defined name of the type. +E.g. for Java the fully qualified classname SHOULD be used in this case.
+ +**[4] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. + +**[5] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. + +--- + +`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +--- + +`otel.component.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `batching_log_processor` | The builtin SDK Batching LogRecord Processor | ![Development](https://img.shields.io/badge/-development-blue) | +| `batching_span_processor` | The builtin SDK Batching Span Processor | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_grpc_log_exporter` | OTLP LogRecord exporter over gRPC with protobuf serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_grpc_span_exporter` | OTLP span exporter over gRPC with protobuf serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_http_json_log_exporter` | OTLP LogRecord exporter over HTTP with JSON serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_http_json_span_exporter` | OTLP span exporter over HTTP with JSON serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_http_log_exporter` | OTLP LogRecord exporter over HTTP with protobuf serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `otlp_http_span_exporter` | OTLP span exporter over HTTP with protobuf serialization | ![Development](https://img.shields.io/badge/-development-blue) | +| `simple_log_processor` | The builtin SDK Simple LogRecord Processor | ![Development](https://img.shields.io/badge/-development-blue) | +| `simple_span_processor` | The builtin SDK Simple Span Processor | ![Development](https://img.shields.io/badge/-development-blue) | + +--- + +`rpc.grpc.status_code` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `0` | OK | ![Development](https://img.shields.io/badge/-development-blue) | +| `1` | CANCELLED | ![Development](https://img.shields.io/badge/-development-blue) | +| `2` | UNKNOWN | ![Development](https://img.shields.io/badge/-development-blue) | +| `3` | INVALID_ARGUMENT | ![Development](https://img.shields.io/badge/-development-blue) | +| `4` | DEADLINE_EXCEEDED | ![Development](https://img.shields.io/badge/-development-blue) | +| `5` | NOT_FOUND | ![Development](https://img.shields.io/badge/-development-blue) | +| `6` | ALREADY_EXISTS | ![Development](https://img.shields.io/badge/-development-blue) | +| `7` | PERMISSION_DENIED | ![Development](https://img.shields.io/badge/-development-blue) | +| `8` | RESOURCE_EXHAUSTED | ![Development](https://img.shields.io/badge/-development-blue) | +| `9` | FAILED_PRECONDITION | ![Development](https://img.shields.io/badge/-development-blue) | +| `10` | ABORTED | ![Development](https://img.shields.io/badge/-development-blue) | +| `11` | OUT_OF_RANGE | ![Development](https://img.shields.io/badge/-development-blue) | +| `12` | UNIMPLEMENTED | ![Development](https://img.shields.io/badge/-development-blue) | +| `13` | INTERNAL | ![Development](https://img.shields.io/badge/-development-blue) | +| `14` | UNAVAILABLE | ![Development](https://img.shields.io/badge/-development-blue) | +| `15` | DATA_LOSS | ![Development](https://img.shields.io/badge/-development-blue) | +| `16` | UNAUTHENTICATED | ![Development](https://img.shields.io/badge/-development-blue) | + + + + + + [DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status [MetricRecommended]: /docs/general/metric-requirement-level.md#recommended diff --git a/model/otel/metrics.yaml b/model/otel/metrics.yaml index 658555480..a8833bc1f 100644 --- a/model/otel/metrics.yaml +++ b/model/otel/metrics.yaml @@ -205,3 +205,37 @@ groups: recommended: when applicable - ref: error.type examples: ["rejected", 
"timeout", "500", "java.net.UnknownHostException"] + + - id: metric.otel.sdk.exporter.operation.duration + type: metric + metric_name: otel.sdk.exporter.operation.duration + stability: development + brief: "The duration of exporting a batch of telemetry records." + note: | + This metric defines successful operations using the full success definitions for [http](https://github.com/open-telemetry/opentelemetry-proto/blob/v1.5.0/docs/specification.md#full-success-1) + and [grpc](https://github.com/open-telemetry/opentelemetry-proto/blob/v1.5.0/docs/specification.md#full-success). Anything else is defined as an unsuccessful operation. For successful + operations, `error.type` MUST NOT be set. For unsuccessful export operations, `error.type` MUST contain a relevant failure cause. + instrument: histogram + unit: "s" + attributes: + - ref: otel.component.type + examples: ["otlp_grpc_span_exporter", "com.example.MySpanExporter"] + - ref: otel.component.name + - ref: server.address + requirement_level: + recommended: when applicable + - ref: server.port + requirement_level: + recommended: when applicable + - ref: error.type + requirement_level: + conditionally_required: If operation has ended with an error + examples: ["rejected", "timeout", "500", "java.net.UnknownHostException"] + - ref: http.response.status_code + brief: The HTTP status code of the last HTTP request performed in scope of this export call. + requirement_level: + recommended: when applicable + - ref: rpc.grpc.status_code + brief: The gRPC status code of the last gRPC request performed in scope of this export call. + requirement_level: + recommended: when applicable