Added span support for genAI langchain llm invocation (#3665)

* Added span support for llm invocation

* removed invalid code

* added entry point and fixed unwrap

* fixed check runs and updated dependencies

* fixed ruff error

* moved span generation code and added test coverage

* ruff formatting

* ruff formatting again

* removed config exception logger

* removed dontThrow

* fixed span name

* fixed ruff

* fixed typecheck

* added span exist check

* fixed typecheck

* removed start time from span state and moved error handler method to span manager

* fixed ruff

* made SpanManager class and method private

* removed deprecated gen_ai.system attribute

* Moved model to fixture and changed imports

* Fixed ruff errors and renamed method

* Added bedrock support and test

* Fixed ruff errors

* Addressed Aaron's comments

* Reverted versions and ignored typecheck errors

* removed context and added issue

* fixed versions

* skipped telemetry for providers other than ChatOpenAI and ChatBedrock. Added test for the same.

* Fixed telemetry skipping logic

* Fixed ruff

* added notice file

* fixed conflict

* fixed ruff and typecheck

* fixed ruff

* upgraded semcov version

---------

Co-authored-by: Aaron Abbott <aaronabbott@google.com>
wrisa 2025-09-19 14:19:40 -07:00 committed by GitHub
parent 6edb3f8dc7
commit 60a670f093
20 changed files with 1359 additions and 4 deletions

View File

@ -40,3 +40,7 @@ components:
util/opentelemetry-util-genai:
- DylanRussell
- keith-decker
instrumentation-genai/opentelemetry-instrumentation-langchain:
- zhirafovod
- wrisa

View File

@ -5,4 +5,7 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## Unreleased
- Added span support for genAI langchain llm invocation.
([#3665](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/3665))

View File

@ -0,0 +1,3 @@
This project is inspired by and portions of it are derived from Traceloop OpenLLMetry
(https://github.com/traceloop/openllmetry).
Licensed under the Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0).

View File

@ -0,0 +1,8 @@
# Update this with your real OpenAI API key
OPENAI_API_KEY=sk-YOUR_API_KEY
# Uncomment and change to your OTLP endpoint
# OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=opentelemetry-python-langchain-manual

View File

@ -0,0 +1,39 @@
OpenTelemetry Langchain Instrumentation Example
===============================================
This is an example of how to instrument Langchain when configuring OpenTelemetry SDK and instrumentations manually.
When :code:`main.py <main.py>`_ is run, it exports traces to an OTLP-compatible endpoint.
Traces include details such as the span name and other attributes.
Note: :code:`.env <.env>`_ file configures additional environment variables:
- :code:`OTEL_LOGS_EXPORTER=otlp` to specify exporter type.
- :code:`OPENAI_API_KEY` your OpenAI API key for accessing the OpenAI API.
- :code:`OTEL_EXPORTER_OTLP_ENDPOINT` to specify the endpoint for exporting traces (default is http://localhost:4317).
Setup
-----
Minimally, update the :code:`.env <.env>`_ file with your :code:`OPENAI_API_KEY`.
An OTLP-compatible endpoint should be listening for traces at http://localhost:4317.
If not, update :code:`OTEL_EXPORTER_OTLP_ENDPOINT` as well.
Next, set up a virtual environment like this:
::
python3 -m venv .venv
source .venv/bin/activate
pip install "python-dotenv[cli]"
pip install -r requirements.txt
Run
---
Run the example like this:
::
dotenv run -- python main.py
You should see the capital of France generated by Langchain ChatOpenAI while traces export to your configured observability tool.

View File

@ -0,0 +1,48 @@
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
OTLPSpanExporter,
)
from opentelemetry.instrumentation.langchain import LangChainInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
# Configure tracing
trace.set_tracer_provider(TracerProvider())
span_processor = BatchSpanProcessor(OTLPSpanExporter())
trace.get_tracer_provider().add_span_processor(span_processor)
def main():
# Set up instrumentation
LangChainInstrumentor().instrument()
# ChatOpenAI
llm = ChatOpenAI(
model="gpt-3.5-turbo",
temperature=0.1,
max_tokens=100,
top_p=0.9,
frequency_penalty=0.5,
presence_penalty=0.5,
stop_sequences=["\n", "Human:", "AI:"],
seed=100,
)
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
result = llm.invoke(messages)
print("LLM output:\n", result)
# Un-instrument after use
LangChainInstrumentor().uninstrument()
if __name__ == "__main__":
main()

View File

@ -0,0 +1,7 @@
langchain==0.3.21
langchain_openai
opentelemetry-sdk>=1.31.0
opentelemetry-exporter-otlp-proto-grpc>=1.31.0
# Uncomment after the langchain instrumentation is released
# opentelemetry-instrumentation-langchain~=2.0b0.dev

View File

@ -0,0 +1,8 @@
# Update this with your real OpenAI API key
OPENAI_API_KEY=sk-YOUR_API_KEY
# Uncomment and change to your OTLP endpoint
# OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=opentelemetry-python-langchain-zero-code

View File

@ -0,0 +1,40 @@
OpenTelemetry Langchain Zero-Code Instrumentation Example
==========================================================
This is an example of how to instrument Langchain with zero code changes,
using `opentelemetry-instrument`.
When :code:`main.py <main.py>`_ is run, it exports traces to an OTLP-compatible endpoint.
Traces include details such as the span name and other attributes.
Note: :code:`.env <.env>`_ file configures additional environment variables:
- :code:`OTEL_LOGS_EXPORTER=otlp` to specify exporter type.
- :code:`OPENAI_API_KEY` your OpenAI API key for accessing the OpenAI API.
- :code:`OTEL_EXPORTER_OTLP_ENDPOINT` to specify the endpoint for exporting traces (default is http://localhost:4317).
Setup
-----
Minimally, update the :code:`.env <.env>`_ file with your :code:`OPENAI_API_KEY`.
An OTLP-compatible endpoint should be listening for traces at http://localhost:4317.
If not, update :code:`OTEL_EXPORTER_OTLP_ENDPOINT` as well.
Next, set up a virtual environment like this:
::
python3 -m venv .venv
source .venv/bin/activate
pip install "python-dotenv[cli]"
pip install -r requirements.txt
Run
---
Run the example like this:
::
dotenv run -- opentelemetry-instrument python main.py
You should see the capital of France generated by Langchain ChatOpenAI while traces export to your configured observability tool.

View File

@ -0,0 +1,27 @@
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI
def main():
llm = ChatOpenAI(
model="gpt-3.5-turbo",
temperature=0.1,
max_tokens=100,
top_p=0.9,
frequency_penalty=0.5,
presence_penalty=0.5,
stop_sequences=["\n", "Human:", "AI:"],
seed=100,
)
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
result = llm.invoke(messages).content
print("LLM output:\n", result)
if __name__ == "__main__":
main()

View File

@ -0,0 +1,8 @@
langchain==0.3.21
langchain_openai
opentelemetry-sdk>=1.31.0
opentelemetry-exporter-otlp-proto-grpc>=1.31.0
opentelemetry-distro~=0.51b0
# Uncomment after the langchain instrumentation is released
# opentelemetry-instrumentation-langchain~=2.0b0.dev

View File

@ -25,9 +25,9 @@ classifiers = [
"Programming Language :: Python :: 3.13",
]
dependencies = [
"opentelemetry-api ~= 1.30",
"opentelemetry-instrumentation ~= 0.51b0",
"opentelemetry-semantic-conventions ~= 0.51b0"
"opentelemetry-api >= 1.31.0",
"opentelemetry-instrumentation ~= 0.57b0",
"opentelemetry-semantic-conventions ~= 0.57b0"
]
[project.optional-dependencies]
@ -35,6 +35,9 @@ instruments = [
"langchain >= 0.3.21",
]
[project.entry-points.opentelemetry_instrumentor]
langchain = "opentelemetry.instrumentation.langchain:LangChainInstrumentor"
[project.urls]
Homepage = "https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation-genai/opentelemetry-instrumentation-langchain"
Repository = "https://github.com/open-telemetry/opentelemetry-python-contrib"
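
Note: the new ``opentelemetry_instrumentor`` entry point above is what lets auto-instrumentation tooling discover ``LangChainInstrumentor`` without an explicit import. As a rough illustration only (not part of this change, and assuming Python 3.10+ for the ``group=`` keyword), the discovery boils down to something like:

    from importlib.metadata import entry_points

    # Load every registered OpenTelemetry instrumentor and activate it.
    # This mirrors, in simplified form, what the opentelemetry-instrument tooling does.
    for entry_point in entry_points(group="opentelemetry_instrumentor"):
        instrumentor_cls = entry_point.load()  # e.g. LangChainInstrumentor
        instrumentor_cls().instrument()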

View File

@ -0,0 +1,122 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Langchain instrumentation supporting `ChatOpenAI` and `ChatBedrock`; it can be enabled
using ``LangChainInstrumentor``. Other providers/LLMs may be supported in the future; their telemetry is skipped for now.
Usage
-----
.. code:: python
from opentelemetry.instrumentation.langchain import LangChainInstrumentor
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI
LangChainInstrumentor().instrument()
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, max_tokens=1000)
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
result = llm.invoke(messages)
LangChainInstrumentor().uninstrument()
API
---
"""
from typing import Any, Callable, Collection
from langchain_core.callbacks import BaseCallbackHandler # type: ignore
from wrapt import wrap_function_wrapper # type: ignore
from opentelemetry.instrumentation.instrumentor import BaseInstrumentor
from opentelemetry.instrumentation.langchain.callback_handler import (
OpenTelemetryLangChainCallbackHandler,
)
from opentelemetry.instrumentation.langchain.package import _instruments
from opentelemetry.instrumentation.langchain.version import __version__
from opentelemetry.instrumentation.utils import unwrap
from opentelemetry.semconv.schemas import Schemas
from opentelemetry.trace import get_tracer
class LangChainInstrumentor(BaseInstrumentor):
"""
OpenTelemetry instrumentor for LangChain.
This adds a custom callback handler to the LangChain callback manager
to capture LLM telemetry.
"""
def __init__(
self,
):
super().__init__()
def instrumentation_dependencies(self) -> Collection[str]:
return _instruments
def _instrument(self, **kwargs: Any):
"""
Enable Langchain instrumentation.
"""
tracer_provider = kwargs.get("tracer_provider")
tracer = get_tracer(
__name__,
__version__,
tracer_provider,
schema_url=Schemas.V1_37_0.value,
)
otel_callback_handler = OpenTelemetryLangChainCallbackHandler(
tracer=tracer,
)
wrap_function_wrapper(
module="langchain_core.callbacks",
name="BaseCallbackManager.__init__",
wrapper=_BaseCallbackManagerInitWrapper(otel_callback_handler),
)
def _uninstrument(self, **kwargs: Any):
"""
Cleanup instrumentation (unwrap).
"""
unwrap("langchain_core.callbacks.base.BaseCallbackManager", "__init__")
class _BaseCallbackManagerInitWrapper:
"""
Wrap BaseCallbackManager.__init__ to insert the custom callback handler into the manager's handlers list.
"""
def __init__(
self, callback_handler: OpenTelemetryLangChainCallbackHandler
):
self._otel_handler = callback_handler
def __call__(
self,
wrapped: Callable[..., None],
instance: BaseCallbackHandler, # type: ignore
args: tuple[Any, ...],
kwargs: dict[str, Any],
):
wrapped(*args, **kwargs)
# Ensure our OTel callback is present if not already.
for handler in instance.inheritable_handlers: # type: ignore
if isinstance(handler, type(self._otel_handler)):
break
else:
instance.add_handler(self._otel_handler, inherit=True) # type: ignore
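
A minimal sketch (not part of the PR) of what the wrapper above buys you: once ``instrument()`` has run, any callback manager LangChain constructs carries the OpenTelemetry handler as an inheritable handler, so it propagates to child runs.

    from langchain_core.callbacks import BaseCallbackManager

    from opentelemetry.instrumentation.langchain import LangChainInstrumentor
    from opentelemetry.instrumentation.langchain.callback_handler import (
        OpenTelemetryLangChainCallbackHandler,
    )

    LangChainInstrumentor().instrument()
    manager = BaseCallbackManager(handlers=[])  # wrapped __init__ injects the handler
    assert any(
        isinstance(handler, OpenTelemetryLangChainCallbackHandler)
        for handler in manager.inheritable_handlers
    )
    LangChainInstrumentor().uninstrument()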

View File

@ -0,0 +1,224 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import annotations
from typing import Any
from uuid import UUID
from langchain_core.callbacks import BaseCallbackHandler # type: ignore
from langchain_core.messages import BaseMessage # type: ignore
from langchain_core.outputs import LLMResult # type: ignore
from opentelemetry.instrumentation.langchain.span_manager import _SpanManager
from opentelemetry.semconv._incubating.attributes import (
gen_ai_attributes as GenAI,
)
from opentelemetry.trace import Tracer
class OpenTelemetryLangChainCallbackHandler(BaseCallbackHandler): # type: ignore[misc]
"""
A callback handler for LangChain that uses OpenTelemetry to create spans for LLM calls (and, in the future, chains, tools, etc.).
"""
def __init__(
self,
tracer: Tracer,
) -> None:
super().__init__() # type: ignore
self.span_manager = _SpanManager(
tracer=tracer,
)
def on_chat_model_start(
self,
serialized: dict[str, Any],
messages: list[list[BaseMessage]], # type: ignore
*,
run_id: UUID,
tags: list[str] | None,
parent_run_id: UUID | None,
metadata: dict[str, Any] | None,
**kwargs: Any,
) -> None:
# Only ChatOpenAI and ChatBedrock are supported for now; telemetry for other providers/LLMs is skipped.
if serialized.get("name") not in ("ChatOpenAI", "ChatBedrock"):
return
if "invocation_params" in kwargs:
params = (
kwargs["invocation_params"].get("params")
or kwargs["invocation_params"]
)
else:
params = kwargs
request_model = "unknown"
for model_tag in (
"model_name", # ChatOpenAI
"model_id", # ChatBedrock
):
if (model := (params or {}).get(model_tag)) is not None:
request_model = model
break
elif (model := (metadata or {}).get(model_tag)) is not None:
request_model = model
break
# Skip telemetry for unsupported request models
if request_model == "unknown":
return
span = self.span_manager.create_chat_span(
run_id=run_id,
parent_run_id=parent_run_id,
request_model=request_model,
)
if params is not None:
top_p = params.get("top_p")
if top_p is not None:
span.set_attribute(GenAI.GEN_AI_REQUEST_TOP_P, top_p)
frequency_penalty = params.get("frequency_penalty")
if frequency_penalty is not None:
span.set_attribute(
GenAI.GEN_AI_REQUEST_FREQUENCY_PENALTY, frequency_penalty
)
presence_penalty = params.get("presence_penalty")
if presence_penalty is not None:
span.set_attribute(
GenAI.GEN_AI_REQUEST_PRESENCE_PENALTY, presence_penalty
)
stop_sequences = params.get("stop")
if stop_sequences is not None:
span.set_attribute(
GenAI.GEN_AI_REQUEST_STOP_SEQUENCES, stop_sequences
)
seed = params.get("seed")
if seed is not None:
span.set_attribute(GenAI.GEN_AI_REQUEST_SEED, seed)
# ChatOpenAI
temperature = params.get("temperature")
if temperature is not None:
span.set_attribute(
GenAI.GEN_AI_REQUEST_TEMPERATURE, temperature
)
# ChatOpenAI
max_tokens = params.get("max_completion_tokens")
if max_tokens is not None:
span.set_attribute(GenAI.GEN_AI_REQUEST_MAX_TOKENS, max_tokens)
if metadata is not None:
provider = metadata.get("ls_provider")
if provider is not None:
span.set_attribute("gen_ai.provider.name", provider)
# ChatBedrock
temperature = metadata.get("ls_temperature")
if temperature is not None:
span.set_attribute(
GenAI.GEN_AI_REQUEST_TEMPERATURE, temperature
)
# ChatBedrock
max_tokens = metadata.get("ls_max_tokens")
if max_tokens is not None:
span.set_attribute(GenAI.GEN_AI_REQUEST_MAX_TOKENS, max_tokens)
def on_llm_end(
self,
response: LLMResult, # type: ignore [reportUnknownParameterType]
*,
run_id: UUID,
parent_run_id: UUID | None,
**kwargs: Any,
) -> None:
span = self.span_manager.get_span(run_id)
if span is None:
# If the span does not exist, we cannot set attributes or end it
return
finish_reasons: list[str] = []
for generation in getattr(response, "generations", []): # type: ignore
for chat_generation in generation:
generation_info = getattr(
chat_generation, "generation_info", None
)
if generation_info is not None:
finish_reason = generation_info.get(
"finish_reason", "unknown"
)
if finish_reason is not None:
finish_reasons.append(str(finish_reason))
if chat_generation.message:
if (
generation_info is None
and chat_generation.message.response_metadata
):
finish_reason = (
chat_generation.message.response_metadata.get(
"stopReason", "unknown"
)
)
if finish_reason is not None:
finish_reasons.append(str(finish_reason))
if chat_generation.message.usage_metadata:
input_tokens = (
chat_generation.message.usage_metadata.get(
"input_tokens", 0
)
)
output_tokens = (
chat_generation.message.usage_metadata.get(
"output_tokens", 0
)
)
span.set_attribute(
GenAI.GEN_AI_USAGE_INPUT_TOKENS, input_tokens
)
span.set_attribute(
GenAI.GEN_AI_USAGE_OUTPUT_TOKENS, output_tokens
)
span.set_attribute(
GenAI.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons
)
llm_output = getattr(response, "llm_output", None) # type: ignore
if llm_output is not None:
response_model = llm_output.get("model_name") or llm_output.get(
"model"
)
if response_model is not None:
span.set_attribute(
GenAI.GEN_AI_RESPONSE_MODEL, str(response_model)
)
response_id = llm_output.get("id")
if response_id is not None:
span.set_attribute(GenAI.GEN_AI_RESPONSE_ID, str(response_id))
# End the LLM span
self.span_manager.end_span(run_id)
def on_llm_error(
self,
error: BaseException,
*,
run_id: UUID,
parent_run_id: UUID | None,
**kwargs: Any,
) -> None:
self.span_manager.handle_error(error, run_id)

View File

@ -0,0 +1,117 @@
# Copyright The OpenTelemetry Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from uuid import UUID
from opentelemetry.semconv._incubating.attributes import (
gen_ai_attributes as GenAI,
)
from opentelemetry.semconv.attributes import (
error_attributes as ErrorAttributes,
)
from opentelemetry.trace import Span, SpanKind, Tracer, set_span_in_context
from opentelemetry.trace.status import Status, StatusCode
__all__ = ["_SpanManager"]
@dataclass
class _SpanState:
span: Span
children: List[UUID] = field(default_factory=list)
class _SpanManager:
def __init__(
self,
tracer: Tracer,
) -> None:
self._tracer = tracer
# Map from run_id -> _SpanState, to keep track of spans and parent/child relationships
# TODO: Use weak references or a TTL cache to avoid memory leaks in long-running processes. See #3735
self.spans: Dict[UUID, _SpanState] = {}
def _create_span(
self,
run_id: UUID,
parent_run_id: Optional[UUID],
span_name: str,
kind: SpanKind = SpanKind.INTERNAL,
) -> Span:
if parent_run_id is not None and parent_run_id in self.spans:
parent_state = self.spans[parent_run_id]
parent_span = parent_state.span
ctx = set_span_in_context(parent_span)
span = self._tracer.start_span(
name=span_name, kind=kind, context=ctx
)
parent_state.children.append(run_id)
else:
# top-level or missing parent
span = self._tracer.start_span(name=span_name, kind=kind)
set_span_in_context(span)
span_state = _SpanState(span=span)
self.spans[run_id] = span_state
return span
def create_chat_span(
self,
run_id: UUID,
parent_run_id: Optional[UUID],
request_model: str,
) -> Span:
span = self._create_span(
run_id=run_id,
parent_run_id=parent_run_id,
span_name=f"{GenAI.GenAiOperationNameValues.CHAT.value} {request_model}",
kind=SpanKind.CLIENT,
)
span.set_attribute(
GenAI.GEN_AI_OPERATION_NAME,
GenAI.GenAiOperationNameValues.CHAT.value,
)
if request_model:
span.set_attribute(GenAI.GEN_AI_REQUEST_MODEL, request_model)
return span
def end_span(self, run_id: UUID) -> None:
state = self.spans[run_id]
for child_id in state.children:
child_state = self.spans.get(child_id)
if child_state:
child_state.span.end()
del self.spans[child_id]
state.span.end()
del self.spans[run_id]
def get_span(self, run_id: UUID) -> Optional[Span]:
state = self.spans.get(run_id)
return state.span if state else None
def handle_error(self, error: BaseException, run_id: UUID):
span = self.get_span(run_id)
if span is None:
# If the span does not exist, we cannot set the error status
return
span.set_status(Status(StatusCode.ERROR, str(error)))
span.set_attribute(
ErrorAttributes.ERROR_TYPE, type(error).__qualname__
)
self.end_span(run_id)
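
The TODO above points at issue #3735; one possible direction (purely a hypothetical sketch, not what this PR implements — the field and function names below are made up) is a time-based sweep that ends and drops span states whose end callbacks never fired:

    import time
    from dataclasses import dataclass, field
    from typing import Dict, List
    from uuid import UUID

    from opentelemetry.trace import Span


    @dataclass
    class _AgedSpanState:
        # Hypothetical variant of _SpanState that remembers when it was created.
        span: Span
        children: List[UUID] = field(default_factory=list)
        created_at: float = field(default_factory=time.monotonic)


    def _sweep_stale(
        spans: Dict[UUID, _AgedSpanState], max_age_s: float = 600.0
    ) -> None:
        # End and discard spans older than max_age_s whose end callbacks never fired.
        cutoff = time.monotonic() - max_age_s
        for run_id in [rid for rid, state in spans.items() if state.created_at < cutoff]:
            state = spans.pop(run_id, None)
            if state is not None:
                state.span.end()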

View File

@ -0,0 +1,157 @@
interactions:
- request:
body: |-
{
"messages": [
{
"content": "You are a helpful assistant!",
"role": "system"
},
{
"content": "What is the capital of France?",
"role": "user"
}
],
"model": "gpt-3.5-turbo",
"frequency_penalty": 0.5,
"max_completion_tokens": 100,
"presence_penalty": 0.5,
"seed": 100,
"stop": [
"\n",
"Human:",
"AI:"
],
"stream": false,
"temperature": 0.1,
"top_p": 0.9
}
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate, zstd
authorization:
- Bearer test_openai_api_key
connection:
- keep-alive
content-length:
- '316'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.106.1
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.106.1
x-stainless-raw-response:
- 'true'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.5
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: |-
{
"id": "chatcmpl-CCAQbtjsmG2294sQ6utRc16OQWeol",
"object": "chat.completion",
"created": 1757016057,
"model": "gpt-3.5-turbo-0125",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris.",
"refusal": null,
"annotations": []
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 7,
"total_tokens": 31,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"service_tier": "default",
"system_fingerprint": null
}
headers:
CF-RAY:
- 97a01376ad4d2af1-LAX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 04 Sep 2025 20:00:57 GMT
Server:
- cloudflare
Set-Cookie: test_set_cookie
Strict-Transport-Security:
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '822'
openai-organization: test_openai_org_id
openai-processing-ms:
- '282'
openai-project:
- proj_GLiYlAc06hF0Fm06IMReZLy4
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '287'
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '200000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '199982'
x-ratelimit-reset-requests:
- 8.64s
x-ratelimit-reset-tokens:
- 5ms
x-request-id:
- req_0e343602788d4f33869d09afcc7d4819
status:
code: 200
message: OK
version: 1

View File

@ -0,0 +1,90 @@
interactions:
- request:
body: |-
{
"messages": [
{
"role": "user",
"content": [
{
"text": "What is the capital of France?"
}
]
}
],
"system": [
{
"text": "You are a helpful assistant!"
}
],
"inferenceConfig": {
"maxTokens": 100,
"temperature": 0.1
}
}
headers:
Content-Length:
- '202'
Content-Type:
- !!binary |
YXBwbGljYXRpb24vanNvbg==
User-Agent:
- !!binary |
Qm90bzMvMS40MC4yNCBtZC9Cb3RvY29yZSMxLjQwLjI0IHVhLzIuMSBvcy9tYWNvcyMyNC42LjAg
bWQvYXJjaCNhcm02NCBsYW5nL3B5dGhvbiMzLjEzLjUgbWQvcHlpbXBsI0NQeXRob24gbS9iLFos
RCBjZmcvcmV0cnktbW9kZSNsZWdhY3kgQm90b2NvcmUvMS40MC4yNCB4LWNsaWVudC1mcmFtZXdv
cms6bGFuZ2NoYWluLWF3cw==
X-Amz-Date:
- !!binary |
MjAyNTA5MDRUMjAwMDU4Wg==
amz-sdk-invocation-id:
- !!binary |
MGQ5MTVjMDUtNzM3YS00OTQwLWIzM2ItMzYwMGIzZGIzYzMy
amz-sdk-request:
- !!binary |
YXR0ZW1wdD0x
authorization:
- Bearer test_openai_api_key
method: POST
uri: https://bedrock-runtime.us-west-2.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-west-2%3A906383545488%3Ainference-profile%2Fus.amazon.nova-lite-v1%3A0/converse
response:
body:
string: |-
{
"metrics": {
"latencyMs": 416
},
"output": {
"message": {
"content": [
{
"text": "The capital of France is Paris. It is not only the capital but also the largest city in the country. Paris is known for its rich history, culture, and landmarks such as the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral."
}
],
"role": "assistant"
}
},
"stopReason": "end_turn",
"usage": {
"inputTokens": 13,
"outputTokens": 50,
"totalTokens": 63
}
}
headers:
Connection:
- keep-alive
Content-Length:
- '412'
Content-Type:
- application/json
Date:
- Thu, 04 Sep 2025 20:00:58 GMT
Set-Cookie: test_set_cookie
openai-organization: test_openai_org_id
x-amzn-RequestId:
- 9fd6d377-fc60-4b28-ab3b-a5c723b218c2
status:
code: 200
message: OK
version: 1

View File

@ -0,0 +1,181 @@
"""Unit tests configuration module."""
import json
import os
import boto3
import pytest
import yaml
from langchain_aws import ChatBedrock
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI
from opentelemetry.instrumentation.langchain import LangChainInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import (
InMemorySpanExporter,
)
@pytest.fixture(scope="function", name="chat_openai_gpt_3_5_turbo_model")
def fixture_chat_openai_gpt_3_5_turbo_model():
llm = ChatOpenAI(
model="gpt-3.5-turbo",
temperature=0.1,
max_tokens=100,
top_p=0.9,
frequency_penalty=0.5,
presence_penalty=0.5,
stop_sequences=["\n", "Human:", "AI:"],
seed=100,
)
yield llm
@pytest.fixture(scope="function", name="us_amazon_nova_lite_v1_0")
def fixture_us_amazon_nova_lite_v1_0():
llm_model_value = "us.amazon.nova-lite-v1:0"
llm = ChatBedrock(
model_id=llm_model_value,
client=boto3.client(
"bedrock-runtime",
aws_access_key_id="test_key",
aws_secret_access_key="test_secret",
region_name="us-west-2",
aws_account_id="test_account",
),
provider="amazon",
temperature=0.1,
max_tokens=100,
)
yield llm
@pytest.fixture(scope="function", name="gemini")
def fixture_gemini():
llm_model_value = "gemini-2.5-pro"
llm = ChatGoogleGenerativeAI(model=llm_model_value, api_key="test_key")
yield llm
@pytest.fixture(scope="function", name="span_exporter")
def fixture_span_exporter():
exporter = InMemorySpanExporter()
yield exporter
@pytest.fixture(scope="function", name="tracer_provider")
def fixture_tracer_provider(span_exporter):
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(span_exporter))
return provider
@pytest.fixture(scope="function")
def start_instrumentation(
tracer_provider,
):
instrumentor = LangChainInstrumentor()
instrumentor.instrument(
tracer_provider=tracer_provider,
)
yield instrumentor
instrumentor.uninstrument()
@pytest.fixture(autouse=True)
def environment():
if not os.getenv("OPENAI_API_KEY"):
os.environ["OPENAI_API_KEY"] = "test_openai_api_key"
@pytest.fixture(scope="module")
def vcr_config():
return {
"filter_headers": [
("cookie", "test_cookie"),
("authorization", "Bearer test_openai_api_key"),
("openai-organization", "test_openai_org_id"),
("openai-project", "test_openai_project_id"),
],
"decode_compressed_response": True,
"before_record_response": scrub_response_headers,
}
class LiteralBlockScalar(str):
"""Formats the string as a literal block scalar, preserving whitespace and
without interpreting escape characters"""
def literal_block_scalar_presenter(dumper, data):
"""Represents a scalar string as a literal block, via '|' syntax"""
return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")
yaml.add_representer(LiteralBlockScalar, literal_block_scalar_presenter)
def process_string_value(string_value):
"""Pretty-prints JSON or returns long strings as a LiteralBlockScalar"""
try:
json_data = json.loads(string_value)
return LiteralBlockScalar(json.dumps(json_data, indent=2))
except (ValueError, TypeError):
if len(string_value) > 80:
return LiteralBlockScalar(string_value)
return string_value
def convert_body_to_literal(data):
"""Searches the data for body strings, attempting to pretty-print JSON"""
if isinstance(data, dict):
for key, value in data.items():
# Handle response body case (e.g., response.body.string)
if key == "body" and isinstance(value, dict) and "string" in value:
value["string"] = process_string_value(value["string"])
# Handle request body case (e.g., request.body)
elif key == "body" and isinstance(value, str):
data[key] = process_string_value(value)
else:
convert_body_to_literal(value)
elif isinstance(data, list):
for idx, choice in enumerate(data):
data[idx] = convert_body_to_literal(choice)
return data
class PrettyPrintJSONBody:
"""This makes request and response body recordings more readable."""
@staticmethod
def serialize(cassette_dict):
cassette_dict = convert_body_to_literal(cassette_dict)
return yaml.dump(
cassette_dict, default_flow_style=False, allow_unicode=True
)
@staticmethod
def deserialize(cassette_string):
return yaml.load(cassette_string, Loader=yaml.Loader)
@pytest.fixture(scope="module", autouse=True)
def fixture_vcr(vcr):
vcr.register_serializer("yaml", PrettyPrintJSONBody)
return vcr
def scrub_response_headers(response):
"""
This scrubs sensitive response headers. Note they are case-sensitive!
"""
response["headers"]["openai-organization"] = "test_openai_org_id"
response["headers"]["Set-Cookie"] = "test_set_cookie"
return response

View File

@ -0,0 +1,166 @@
from typing import Optional
import pytest
from langchain_core.messages import HumanMessage, SystemMessage
from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.semconv._incubating.attributes import gen_ai_attributes
# span_exporter, start_instrumentation, chat_openai_gpt_3_5_turbo_model are coming from fixtures defined in conftest.py
@pytest.mark.vcr()
def test_chat_openai_gpt_3_5_turbo_model_llm_call(
span_exporter, start_instrumentation, chat_openai_gpt_3_5_turbo_model
):
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
response = chat_openai_gpt_3_5_turbo_model.invoke(messages)
assert response.content == "The capital of France is Paris."
# verify spans
spans = span_exporter.get_finished_spans()
print(f"spans: {spans}")
for span in spans:
print(f"span: {span}")
print(f"span attributes: {span.attributes}")
assert_openai_completion_attributes(spans[0], response)
# span_exporter, start_instrumentation, us_amazon_nova_lite_v1_0 are coming from fixtures defined in conftest.py
@pytest.mark.vcr()
def test_us_amazon_nova_lite_v1_0_bedrock_llm_call(
span_exporter, start_instrumentation, us_amazon_nova_lite_v1_0
):
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
result = us_amazon_nova_lite_v1_0.invoke(messages)
assert result.content.find("The capital of France is Paris") != -1
# verify spans
spans = span_exporter.get_finished_spans()
print(f"spans: {spans}")
for span in spans:
print(f"span: {span}")
print(f"span attributes: {span.attributes}")
# TODO: fix the code and ensure the assertions are correct
assert_bedrock_completion_attributes(spans[0], result)
# span_exporter, start_instrumentation, gemini are coming from fixtures defined in conftest.py
@pytest.mark.vcr()
def test_gemini(span_exporter, start_instrumentation, gemini):
messages = [
SystemMessage(content="You are a helpful assistant!"),
HumanMessage(content="What is the capital of France?"),
]
result = gemini.invoke(messages)
assert result.content.find("The capital of France is **Paris**") != -1
# verify spans
spans = span_exporter.get_finished_spans()
assert len(spans) == 0 # No spans should be created for gemini as of now
def assert_openai_completion_attributes(
span: ReadableSpan, response: Optional
):
assert span.name == "chat gpt-3.5-turbo"
assert span.attributes[gen_ai_attributes.GEN_AI_OPERATION_NAME] == "chat"
assert (
span.attributes[gen_ai_attributes.GEN_AI_REQUEST_MODEL]
== "gpt-3.5-turbo"
)
assert (
span.attributes[gen_ai_attributes.GEN_AI_RESPONSE_MODEL]
== "gpt-3.5-turbo-0125"
)
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_MAX_TOKENS] == 100
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_TEMPERATURE] == 0.1
assert span.attributes["gen_ai.provider.name"] == "openai"
assert gen_ai_attributes.GEN_AI_RESPONSE_ID in span.attributes
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_TOP_P] == 0.9
assert (
span.attributes[gen_ai_attributes.GEN_AI_REQUEST_FREQUENCY_PENALTY]
== 0.5
)
assert (
span.attributes[gen_ai_attributes.GEN_AI_REQUEST_PRESENCE_PENALTY]
== 0.5
)
stop_sequences = span.attributes.get(
gen_ai_attributes.GEN_AI_REQUEST_STOP_SEQUENCES
)
assert all(seq in ["\n", "Human:", "AI:"] for seq in stop_sequences)
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_SEED] == 100
input_tokens = response.response_metadata.get("token_usage").get(
"prompt_tokens"
)
if input_tokens:
assert (
input_tokens
== span.attributes[gen_ai_attributes.GEN_AI_USAGE_INPUT_TOKENS]
)
else:
assert (
gen_ai_attributes.GEN_AI_USAGE_INPUT_TOKENS not in span.attributes
)
output_tokens = response.response_metadata.get("token_usage").get(
"completion_tokens"
)
if output_tokens:
assert (
output_tokens
== span.attributes[gen_ai_attributes.GEN_AI_USAGE_OUTPUT_TOKENS]
)
else:
assert (
gen_ai_attributes.GEN_AI_USAGE_OUTPUT_TOKENS not in span.attributes
)
def assert_bedrock_completion_attributes(
span: ReadableSpan, response: Optional
):
assert span.name == "chat us.amazon.nova-lite-v1:0"
assert span.attributes[gen_ai_attributes.GEN_AI_OPERATION_NAME] == "chat"
assert (
span.attributes[gen_ai_attributes.GEN_AI_REQUEST_MODEL]
== "us.amazon.nova-lite-v1:0"
)
assert span.attributes["gen_ai.provider.name"] == "amazon_bedrock"
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_MAX_TOKENS] == 100
assert span.attributes[gen_ai_attributes.GEN_AI_REQUEST_TEMPERATURE] == 0.1
input_tokens = response.usage_metadata.get("input_tokens")
if input_tokens:
assert (
input_tokens
== span.attributes[gen_ai_attributes.GEN_AI_USAGE_INPUT_TOKENS]
)
else:
assert (
gen_ai_attributes.GEN_AI_USAGE_INPUT_TOKENS not in span.attributes
)
output_tokens = response.usage_metadata.get("output_tokens")
if output_tokens:
assert (
output_tokens
== span.attributes[gen_ai_attributes.GEN_AI_USAGE_OUTPUT_TOKENS]
)
else:
assert (
gen_ai_attributes.GEN_AI_USAGE_OUTPUT_TOKENS not in span.attributes
)

View File

@ -0,0 +1,100 @@
import unittest.mock
import uuid
import pytest
from opentelemetry.instrumentation.langchain.span_manager import (
_SpanManager,
_SpanState,
)
from opentelemetry.trace import SpanKind, get_tracer
from opentelemetry.trace.span import Span
class TestSpanManager:
@pytest.fixture
def tracer(self):
return get_tracer("test_tracer")
@pytest.fixture
def handler(self, tracer):
return _SpanManager(tracer=tracer)
@pytest.mark.parametrize(
"parent_run_id,parent_in_spans",
[
(None, False), # No parent
(uuid.uuid4(), False), # Parent not in spans
(uuid.uuid4(), True), # Parent in spans
],
)
def test_create_span(
self, handler, tracer, parent_run_id, parent_in_spans
):
# Arrange
run_id = uuid.uuid4()
span_name = "test_span"
kind = SpanKind.INTERNAL
mock_span = unittest.mock.Mock(spec=Span)
# Setup parent if needed
if parent_run_id is not None and parent_in_spans:
parent_mock_span = unittest.mock.Mock(spec=Span)
handler.spans[parent_run_id] = _SpanState(span=parent_mock_span)
with (
unittest.mock.patch.object(
tracer, "start_span", return_value=mock_span
) as mock_start_span,
unittest.mock.patch(
"opentelemetry.instrumentation.langchain.span_manager.set_span_in_context"
) as mock_set_span_in_context,
):
# Act
result = handler._create_span(
run_id, parent_run_id, span_name, kind
)
# Assert
assert result == mock_span
assert run_id in handler.spans
assert handler.spans[run_id].span == mock_span
# Verify parent-child relationship
if parent_run_id is not None and parent_in_spans:
mock_set_span_in_context.assert_called_once_with(
handler.spans[parent_run_id].span
)
mock_start_span.assert_called_once_with(
name=span_name,
kind=kind,
context=mock_set_span_in_context.return_value,
)
assert run_id in handler.spans[parent_run_id].children
else:
mock_start_span.assert_called_once_with(
name=span_name, kind=kind
)
mock_set_span_in_context.assert_called_once_with(mock_span)
def test_end_span(self, handler):
# Arrange
run_id = uuid.uuid4()
mock_span = unittest.mock.Mock(spec=Span)
handler.spans[run_id] = _SpanState(span=mock_span)
# Add a child to verify it's removed
child_run_id = uuid.uuid4()
child_mock_span = unittest.mock.Mock(spec=Span)
handler.spans[child_run_id] = _SpanState(span=child_mock_span)
handler.spans[run_id].children.append(child_run_id)
# Act
handler.end_span(run_id)
# Assert
mock_span.end.assert_called_once()
child_mock_span.end.assert_called_once()
assert run_id not in handler.spans
assert child_run_id not in handler.spans