Add additional LLM span attributes (#1059)
Co-authored-by: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com>
Co-authored-by: Liudmila Molkova <limolkova@microsoft.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
parent c1955204b5
commit 82c0926c82
@@ -0,0 +1,4 @@
+change_type: enhancement
+component: gen-ai
+note: Adding `gen_ai.request.top_k`, `gen_ai.request.presence_penalty`, `gen_ai.request.frequency_penalty` and `gen_ai.request.stop_sequences` attributes.
+issues: [839]
@@ -10,22 +10,26 @@
 This document defines the attributes used to describe telemetry in the context of Generative Artificial Intelligence (GenAI) model requests and responses.

 | Attribute | Type | Description | Examples | Stability |
 |---|---|---|---|---|
 | `gen_ai.completion` | string | The full response received from the GenAI model. [1] | `[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]` |  |
 | `gen_ai.operation.name` | string | The name of the operation being performed. | `chat`; `completion` |  |
 | `gen_ai.prompt` | string | The full prompt sent to the GenAI model. [2] | `[{'role': 'user', 'content': 'What is the capital of France?'}]` |  |
+| `gen_ai.request.frequency_penalty` | double | The frequency penalty setting for the GenAI request. | `0.1` |  |
 | `gen_ai.request.max_tokens` | int | The maximum number of tokens the model generates for a request. | `100` |  |
 | `gen_ai.request.model` | string | The name of the GenAI model a request is being made to. | `gpt-4` |  |
+| `gen_ai.request.presence_penalty` | double | The presence penalty setting for the GenAI request. | `0.1` |  |
+| `gen_ai.request.stop_sequences` | string[] | List of sequences that the model will use to stop generating further tokens. | `["forest", "lived"]` |  |
 | `gen_ai.request.temperature` | double | The temperature setting for the GenAI request. | `0.0` |  |
+| `gen_ai.request.top_k` | double | The top_k sampling setting for the GenAI request. | `1.0` |  |
 | `gen_ai.request.top_p` | double | The top_p sampling setting for the GenAI request. | `1.0` |  |
 | `gen_ai.response.finish_reasons` | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]` |  |
 | `gen_ai.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` |  |
 | `gen_ai.response.model` | string | The name of the model that generated the response. | `gpt-4-0613` |  |
 | `gen_ai.system` | string | The Generative AI product as identified by the client instrumentation. [3] | `openai` |  |
 | `gen_ai.token.type` | string | The type of token being counted. | `input`; `output` |  |
 | `gen_ai.usage.completion_tokens` | int | The number of tokens used in the GenAI response (completion). | `180` |  |
 | `gen_ai.usage.prompt_tokens` | int | The number of tokens used in the GenAI input or prompt. | `100` |  |

 **[1]:** It's RECOMMENDED to format completions as a JSON string matching the [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation)
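As a usage sketch (not part of the commit): an instrumentation could record the newly added request attributes with the OpenTelemetry Python API roughly as follows. The tracer name, span name, and option values are illustrative assumptions, not mandated by the convention.

```python
from opentelemetry import trace

tracer = trace.get_tracer("genai-demo")  # tracer name is illustrative

# Hypothetical chat request; option values mirror the table's examples.
with tracer.start_as_current_span("chat gpt-4") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.operation.name", "chat")
    span.set_attribute("gen_ai.request.model", "gpt-4")
    # The four attributes introduced by this commit:
    span.set_attribute("gen_ai.request.top_k", 1.0)
    span.set_attribute("gen_ai.request.presence_penalty", 0.1)
    span.set_attribute("gen_ai.request.frequency_penalty", 0.1)
    span.set_attribute("gen_ai.request.stop_sequences", ["forest", "lived"])
```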
@@ -47,8 +47,12 @@ These attributes track input data and metadata for a request to a GenAI model.
 |---|---|---|---|---|---|
 | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. [1] | `gpt-4` | `Required` |  |
 | [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client instrumentation. [2] | `openai` | `Required` |  |
+| [`gen_ai.request.frequency_penalty`](/docs/attributes-registry/gen-ai.md) | double | The frequency penalty setting for the GenAI request. | `0.1` | `Recommended` |  |
 | [`gen_ai.request.max_tokens`](/docs/attributes-registry/gen-ai.md) | int | The maximum number of tokens the model generates for a request. | `100` | `Recommended` |  |
+| [`gen_ai.request.presence_penalty`](/docs/attributes-registry/gen-ai.md) | double | The presence penalty setting for the GenAI request. | `0.1` | `Recommended` |  |
+| [`gen_ai.request.stop_sequences`](/docs/attributes-registry/gen-ai.md) | string[] | List of sequences that the model will use to stop generating further tokens. | `["forest", "lived"]` | `Recommended` |  |
 | [`gen_ai.request.temperature`](/docs/attributes-registry/gen-ai.md) | double | The temperature setting for the GenAI request. | `0.0` | `Recommended` |  |
+| [`gen_ai.request.top_k`](/docs/attributes-registry/gen-ai.md) | double | The top_k sampling setting for the GenAI request. | `1.0` | `Recommended` |  |
 | [`gen_ai.request.top_p`](/docs/attributes-registry/gen-ai.md) | double | The top_p sampling setting for the GenAI request. | `1.0` | `Recommended` |  |
 | [`gen_ai.response.finish_reasons`](/docs/attributes-registry/gen-ai.md) | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]` | `Recommended` |  |
 | [`gen_ai.response.id`](/docs/attributes-registry/gen-ai.md) | string | The unique identifier for the completion. | `chatcmpl-123` | `Recommended` |  |
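A sketch of how the requirement levels above could translate into instrumentation logic: the two `Required` attributes are always set, while `Recommended` ones are set only when the caller actually supplied the option. The helper name and the OpenAI-style parameter names are assumptions for illustration, not part of the convention.

```python
from opentelemetry.trace import Span

# Maps hypothetical OpenAI-style request options to the Recommended attributes.
_OPTIONAL_ATTRS = {
    "max_tokens": "gen_ai.request.max_tokens",
    "temperature": "gen_ai.request.temperature",
    "top_p": "gen_ai.request.top_p",
    "top_k": "gen_ai.request.top_k",
    "stop": "gen_ai.request.stop_sequences",
    "frequency_penalty": "gen_ai.request.frequency_penalty",
    "presence_penalty": "gen_ai.request.presence_penalty",
}

def set_request_attributes(span: Span, model: str, system: str, **options) -> None:
    # Required attributes are always set.
    span.set_attribute("gen_ai.request.model", model)
    span.set_attribute("gen_ai.system", system)
    # Recommended attributes are set only when the option was provided.
    for option, attribute in _OPTIONAL_ATTRS.items():
        if (value := options.get(option)) is not None:
            span.set_attribute(attribute, value)
```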
@@ -54,6 +54,27 @@ groups:
         type: double
         brief: The top_p sampling setting for the GenAI request.
         examples: [1.0]
+        tag: llm-generic-request
+      - id: request.top_k
+        stability: experimental
+        type: double
+        brief: The top_k sampling setting for the GenAI request.
+        examples: [1.0]
+      - id: request.stop_sequences
+        stability: experimental
+        type: string[]
+        brief: List of sequences that the model will use to stop generating further tokens.
+        examples: ['forest', 'lived']
+      - id: request.frequency_penalty
+        stability: experimental
+        type: double
+        brief: The frequency penalty setting for the GenAI request.
+        examples: [0.1]
+      - id: request.presence_penalty
+        stability: experimental
+        type: double
+        brief: The presence penalty setting for the GenAI request.
+        examples: [0.1]
       - id: response.id
         stability: experimental
         type: string
@@ -18,6 +18,14 @@ groups:
         requirement_level: recommended
       - ref: gen_ai.request.top_p
         requirement_level: recommended
+      - ref: gen_ai.request.top_k
+        requirement_level: recommended
+      - ref: gen_ai.request.stop_sequences
+        requirement_level: recommended
+      - ref: gen_ai.request.frequency_penalty
+        requirement_level: recommended
+      - ref: gen_ai.request.presence_penalty
+        requirement_level: recommended
       - ref: gen_ai.response.id
         requirement_level: recommended
       - ref: gen_ai.response.model
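With the span definition now referencing the new attributes, an instrumentation test might verify them end to end using the OpenTelemetry Python SDK's in-memory exporter. This is a minimal sketch; the span name and attribute values are illustrative.

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

# Wire a provider to an in-memory exporter so finished spans can be inspected.
exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
tracer = provider.get_tracer("genai-test")

with tracer.start_as_current_span("chat gpt-4") as span:
    span.set_attribute("gen_ai.request.model", "gpt-4")
    span.set_attribute("gen_ai.request.top_k", 1.0)
    span.set_attribute("gen_ai.request.stop_sequences", ["forest", "lived"])

(finished,) = exporter.get_finished_spans()
assert finished.attributes["gen_ai.request.top_k"] == 1.0
# The SDK stores array-valued attributes as tuples, hence the list() conversion.
assert list(finished.attributes["gen_ai.request.stop_sequences"]) == ["forest", "lived"]
```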