Define a common algorithm for service.instance.id (#312)

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Co-authored-by: Josh Suereth <joshuasuereth@google.com>
Juraci Paixão Kröhling 2024-02-23 20:02:08 +01:00 committed by GitHub
parent 4362b16491
commit b825ce1b93
3 changed files with 59 additions and 15 deletions


@@ -0,0 +1,4 @@
change_type: 'enhancement'
component: resource
note: Define a common algorithm for `service.instance.id`.
issues: [312]


@@ -99,10 +99,35 @@ as specified in the [Resource SDK specification](https://github.com/open-telemet
<!-- semconv service_experimental --> <!-- semconv service_experimental -->
| Attribute | Type | Description | Examples | Requirement Level | | Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---| |---|---|---|---|---|
| `service.instance.id` | string | The string ID of the service instance. [1] | `my-k8s-pod-deployment-1`; `627cc493-f310-47de-96bd-71410b7dec09` | Recommended | | `service.instance.id` | string | The string ID of the service instance. [1] | `627cc493-f310-47de-96bd-71410b7dec09` | Recommended |
| `service.namespace` | string | A namespace for `service.name`. [2] | `Shop` | Recommended | | `service.namespace` | string | A namespace for `service.name`. [2] | `Shop` | Recommended |
**[1]:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled service). It is preferable for the ID to be persistent and stay the same for the lifetime of the service instance, however it is acceptable that the ID is ephemeral and changes during important lifetime events for the service (e.g. service restarts). If the service has no inherent unique ID that can be used as the value of this attribute it is recommended to generate a random Version 1 or Version 4 RFC 4122 UUID (services aiming for reproducible UUIDs may also use Version 5, see RFC 4122 for more recommendations). **[1]:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words
`service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to
distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled
service).
Implementations, such as SDKs, are recommended to generate a random Version 1 or Version 4 [RFC
4122](https://www.ietf.org/rfc/rfc4122.txt) UUID, but are free to use an inherent unique ID as the source of
this value if stability is desirable. In that case, the ID SHOULD be used as source of a UUID Version 5 and
SHOULD use the following UUID as the namespace: `4d63009a-8d0f-11ee-aad7-4c796ed8e320`.
UUIDs are typically recommended, as only an opaque value for the purposes of identifying a service instance is
needed. Similar to what can be seen in the man page for the
[`/etc/machine-id`](https://www.freedesktop.org/software/systemd/man/machine-id.html) file, the underlying
data, such as pod name and namespace, should be treated as confidential; it is the user's choice whether to
expose it via another resource attribute.
For applications running behind an application server (like unicorn), we do not recommend using one identifier
for all processes participating in the application. Instead, it's recommended that each division (e.g. a worker
thread in unicorn) have its own instance.id.
It's not recommended for a Collector to set `service.instance.id` if it can't unambiguously determine the
service instance that is generating that telemetry. For instance, creating a UUID based on `pod.name` will
likely be wrong, as the Collector might not know from which container within that pod the telemetry originated.
However, Collectors can set the `service.instance.id` if they can unambiguously determine the service instance
for that telemetry. This is typically the case for scraping receivers, as they know the target address and
port.
**[2]:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace. **[2]:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.
<!-- endsemconv --> <!-- endsemconv -->


@@ -22,16 +22,31 @@ groups:
type: string type: string
brief: > brief: >
The string ID of the service instance. The string ID of the service instance.
note: > note: |
MUST be unique for each instance of the same `service.namespace,service.name` pair MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words
(in other words `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to
The ID helps to distinguish instances of the same service that exist at the same time distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled
(e.g. instances of a horizontally scaled service). It is preferable for the ID to be persistent service).
and stay the same for the lifetime of the service instance, however it is acceptable that
the ID is ephemeral and changes during important lifetime events for the service Implementations, such as SDKs, are recommended to generate a random Version 1 or Version 4 [RFC
(e.g. service restarts). 4122](https://www.ietf.org/rfc/rfc4122.txt) UUID, but are free to use an inherent unique ID as the source of
If the service has no inherent unique ID that can be used as the value of this attribute this value if stability is desirable. In that case, the ID SHOULD be used as source of a UUID Version 5 and
it is recommended to generate a random Version 1 or Version 4 RFC 4122 UUID SHOULD use the following UUID as the namespace: `4d63009a-8d0f-11ee-aad7-4c796ed8e320`.
(services aiming for reproducible UUIDs may also use Version 5, see RFC 4122
for more recommendations). UUIDs are typically recommended, as only an opaque value for the purposes of identifying a service instance is
examples: ["my-k8s-pod-deployment-1", "627cc493-f310-47de-96bd-71410b7dec09"] needed. Similar to what can be seen in the man page for the
[`/etc/machine-id`](https://www.freedesktop.org/software/systemd/man/machine-id.html) file, the underlying
data, such as pod name and namespace, should be treated as confidential; it is the user's choice whether to
expose it via another resource attribute.
For applications running behind an application server (like unicorn), we do not recommend using one identifier
for all processes participating in the application. Instead, it's recommended that each division (e.g. a worker
thread in unicorn) have its own instance.id.
It's not recommended for a Collector to set `service.instance.id` if it can't unambiguously determine the
service instance that is generating that telemetry. For instance, creating a UUID based on `pod.name` will
likely be wrong, as the Collector might not know from which container within that pod the telemetry originated.
However, Collectors can set the `service.instance.id` if they can unambiguously determine the service instance
for that telemetry. This is typically the case for scraping receivers, as they know the target address and
port.
examples: ["627cc493-f310-47de-96bd-71410b7dec09"]
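
The algorithm in the new note can be sketched roughly as follows. This is a minimal illustration, not the SDK implementation: the function name `service_instance_id` and the sample inherent ID strings are hypothetical, and the note does not say how an inherent ID should be encoded, so the sketch simply assumes it is passed as a plain string. Only the namespace UUID and the choice of UUID versions come from the note.

```python
import uuid

# Namespace UUID given in the note for deriving a stable `service.instance.id`.
SERVICE_INSTANCE_NAMESPACE = uuid.UUID("4d63009a-8d0f-11ee-aad7-4c796ed8e320")


def service_instance_id(inherent_id=None):
    """Return a value for `service.instance.id` (hypothetical helper).

    With an inherent unique ID, derive a stable Version 5 UUID from it
    under the namespace above; otherwise fall back to a random
    Version 4 UUID.
    """
    if inherent_id:
        # Assumption: the inherent ID is used as-is as the UUIDv5 name.
        return str(uuid.uuid5(SERVICE_INSTANCE_NAMESPACE, inherent_id))
    return str(uuid.uuid4())


# Same inherent ID, same result; no inherent ID, a fresh random value.
stable = service_instance_id("example-inherent-id")
random_id = service_instance_id()
```

Because the Version 5 value is a hash of the namespace and the inherent ID, the underlying data (for example a pod name) is not exposed directly in the attribute, which matches the note's point about treating it as confidential.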