BREAKING: Introduce common `url.*` attributes, and improve use of namespacing under `http.*` (#3355)

Liudmila Molkova 2023-05-08 19:55:53 -07:00 committed by Josh Suereth
parent 70ffed7002
commit 92a7f35fc0
3 changed files with 50 additions and 5 deletions

@@ -44,7 +44,7 @@ Names SHOULD follow these rules:
 purpose should primarily drive the decision about forming nested namespaces.
 - For each multi-word dot-delimited component of the attribute name separate the
-words by underscores (i.e. use snake_case). For example `http.status_code`
+words by underscores (i.e. use snake_case). For example `http.response.status_code`
 denotes the status code in the http namespace.
 - Names SHOULD NOT coincide with namespaces. For example if
@@ -96,8 +96,8 @@ denote old attribute names in rename operations).
 - Semantic conventions exist for four areas: for Resource, Span, Log, and Metric
 attribute names. In addition, for spans we have two more areas: Event and Link
 attribute names. Identical namespaces or names in all these areas MUST have
-identical meanings. For example the `http.method` span attribute name denotes
-exactly the same concept as the `http.method` metric attribute, has the same
+identical meanings. For example the `http.request.method` span attribute name denotes
+exactly the same concept as the `http.request.method` metric attribute, has the same
 data type and the same set of possible values (in both cases it records the
 value of the HTTP protocol's request method as a string).

@@ -38,7 +38,7 @@ For example, [Database semantic convention](../trace/semantic_conventions/databa
 ## Required
-All instrumentations MUST populate the attribute. A semantic convention defining a Required attribute expects an absolute majority of instrumentation libraries and applications are able to efficiently retrieve and populate it, and can additionally meet requirements for cardinality, security, and any others specific to the signal defined by the convention. `http.method` is an example of a Required attribute.
+All instrumentations MUST populate the attribute. A semantic convention defining a Required attribute expects an absolute majority of instrumentation libraries and applications are able to efficiently retrieve and populate it, and can additionally meet requirements for cardinality, security, and any others specific to the signal defined by the convention. `http.request.method` is an example of a Required attribute.
 _Note: Consumers of telemetry can detect if a telemetry item follows a specific semantic convention by checking for the presence of a `Required` attribute defined by such convention. For example, the presence of the `db.system` attribute on a span can be used as an indication that the span follows database semantics._
@@ -71,4 +71,4 @@ Here are several examples of expensive operations to be avoided by default:
 - DNS lookups to populate `server.address` when only an IP address is available to the instrumentation. Caching lookup results does not solve the issue for all possible cases and should be avoided by default too.
 - forcing an `http.route` calculation before the HTTP framework calculates it
-- reading response stream to find `http.response_content_length` when `Content-Length` header is not available
+- reading response stream to find `http.response.body.size` when `Content-Length` header is not available
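
The attribute renames visible in the hunks above could be applied to already-collected attribute dictionaries with a small mapping. The following is a minimal sketch, not part of this commit: the `RENAMED_ATTRIBUTES` table and the `migrate_attributes` helper are hypothetical names, and the table covers only the three renames shown in this diff.

```python
# Old -> new names as they appear in this diff; not an exhaustive list.
RENAMED_ATTRIBUTES = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "http.response_content_length": "http.response.body.size",
}


def migrate_attributes(attributes: dict) -> dict:
    """Return a copy of `attributes` with old names replaced by new ones.

    Assumption: if both the old and the new name are present,
    the value stored under the new name is kept.
    """
    migrated = {}
    for key, value in attributes.items():
        new_key = RENAMED_ATTRIBUTES.get(key, key)
        if new_key != key and new_key in attributes:
            continue  # the new-style attribute already exists; drop the old one
        migrated[new_key] = value
    return migrated
```

For example, `migrate_attributes({"http.method": "GET", "http.route": "/users"})` yields a dictionary keyed by `http.request.method` and `http.route`.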

@@ -0,0 +1,45 @@
# Semantic conventions for URL
**Status**: [Experimental](../document-status.md)
This document defines semantic conventions that describe URL and its components.
<details>
<summary>Table of Contents</summary>
<!-- toc -->
- [Attributes](#attributes)
- [Sensitive information](#sensitive-information)
<!-- tocstop -->
</details>
## Attributes
<!-- semconv url -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `https`; `ftp`; `telnet` | Recommended |
| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [1] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | Recommended |
| `url.path` | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [2] | `/search` | Recommended |
| `url.query` | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [3] | `q=OpenTelemetry` | Recommended |
| `url.fragment` | string | The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | `SemConv` | Recommended |
**[1]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless.
`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password should be redacted and attribute's value should be `https://REDACTED:REDACTED@www.example.com/`.
`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) and SHOULD NOT be validated or modified except for sanitizing purposes.
**[2]:** When missing, the value is assumed to be `/`
**[3]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it.
<!-- endsemconv -->
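
One way an instrumentation might derive these attributes from a URL string is sketched below, using Python's standard `urllib.parse`. The helper name `url_attributes` is hypothetical. As assumptions beyond the text above, any userinfo (even a username without a password) is replaced with `REDACTED:REDACTED`, and empty components are omitted rather than defaulted.

```python
from urllib.parse import urlsplit, urlunsplit


def url_attributes(url: str) -> dict:
    """Build url.* attributes from a URL string, redacting userinfo credentials."""
    parts = urlsplit(url)
    full = url  # left untouched unless sanitizing is needed
    if "@" in parts.netloc:
        # Credentials are present: keep host[:port], replace the userinfo part.
        host_port = parts.netloc.rpartition("@")[2]
        full = urlunsplit(
            (parts.scheme, "REDACTED:REDACTED@" + host_port,
             parts.path, parts.query, parts.fragment)
        )
    attributes = {"url.full": full}
    components = (
        ("url.scheme", parts.scheme),
        ("url.path", parts.path),
        ("url.query", parts.query),
        ("url.fragment", parts.fragment),
    )
    for name, value in components:
        if value:  # assumption: empty components are simply not recorded
            attributes[name] = value
    return attributes
```

The URL is only rebuilt when credentials have to be removed, so that `url.full` is otherwise recorded exactly as received.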
## Sensitive information
Capturing URL and its components MAY impose security risk. User and password information, when they are provided in [User Information](https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.1) subcomponent, MUST NOT be recorded.
Instrumentations that are aware of specific sensitive query string parameters MUST scrub their values before capturing `url.query` attribute. For example, native instrumentation of a client library that passes credentials or user location in URL, must scrub corresponding properties.
_Note: Applications and telemetry consumers should scrub sensitive information from URL attributes on collected telemetry. In systems unable to identify sensitive information, certain attribute values may be redacted entirely._
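
Scrubbing known sensitive query parameters before recording `url.query` could look like the sketch below. The parameter names in `SENSITIVE_PARAMS`, the `REDACTED` placeholder for query values, and the `scrub_query` helper are illustrative assumptions; a real instrumentation would use the parameter names it actually knows to be sensitive.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical parameter names, chosen for illustration only.
SENSITIVE_PARAMS = frozenset({"password", "token", "api_key"})


def scrub_query(query: str, sensitive=SENSITIVE_PARAMS) -> str:
    """Replace the values of sensitive parameters in a query string."""
    pairs = parse_qsl(query, keep_blank_values=True)
    scrubbed = [
        (key, "REDACTED" if key.lower() in sensitive else value)
        for key, value in pairs
    ]
    # Note: urlencode re-encodes every pair, so the percent-encoding of
    # untouched parameters may differ from the original string.
    return urlencode(scrubbed)
```

With these assumptions, `scrub_query("q=OpenTelemetry&token=abc123")` keeps the `q` parameter and replaces only the value of `token`.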