elasticsearch: don't set body as db statement for bulk requests (#2355)

* elasticsearch: don't set body as db statement for bulk requests

Bulk request bodies can be too large and too varied to be meaningful as a
db statement. In addition, the sanitizer currently only handles dicts, so
it crashes on bulk bodies.

Closes #2150

Co-authored-by: Jason Mobarak <git@jason.mobarak.name>
Co-authored-by: Quentin Pradet <quentin.pradet@gmail.com>

* Update CHANGELOG

* Please the linter

Riccardo Magliocchetti 2024-03-22 23:56:29 +01:00 committed by GitHub
parent ada27842bd
commit ca082a7c52
3 changed files with 41 additions and 3 deletions
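
The guard described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the instrumentation's actual code: `sanitize_dict_body` is a hypothetical stand-in for the real sanitizer, `build_span_attributes` is an invented helper name, and the `"db.statement"` key is assumed to match `SpanAttributes.DB_STATEMENT`.

```python
import json

# Assumed to match SpanAttributes.DB_STATEMENT in the semantic conventions.
DB_STATEMENT = "db.statement"


def sanitize_dict_body(body):
    """Hypothetical stand-in for the real sanitizer: masks every leaf value."""

    def mask(value):
        if isinstance(value, dict):
            return {key: mask(inner) for key, inner in value.items()}
        if isinstance(value, list):
            return [mask(item) for item in value]
        return "?"

    return json.dumps(mask(body))


def build_span_attributes(body):
    """Set the statement attribute only for dict bodies (hypothetical helper)."""
    attributes = {}
    # Bulk bodies are not dicts, so they are skipped here instead of being
    # passed to a sanitizer that only understands dicts.
    if body and isinstance(body, dict):
        attributes[DB_STATEMENT] = sanitize_dict_body(body)
    return attributes


search_body = {"query": {"match": {"name": "adam"}}}
bulk_body = [{"index": {"_index": "sw"}}, {"name": "adam"}]

search_attributes = build_span_attributes(search_body)
bulk_attributes = build_span_attributes(bulk_body)
```

With this shape, a regular search body still produces a sanitized statement, while a bulk body leaves the attribute unset rather than raising.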

@@ -21,6 +21,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#2151](https://github.com/open-telemetry/opentelemetry-python-contrib/issues/2298))
 - Avoid losing repeated HTTP headers
   ([#2266](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2266))
+- `opentelemetry-instrumentation-elasticsearch` Don't send bulk request body as db statement
+  ([#2355](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2355))
 ## Version 1.23.0/0.44b0 (2024-02-23)

@@ -245,9 +245,11 @@ def _wrap_perform_request(
 if method:
     attributes["elasticsearch.method"] = method
 if body:
-    attributes[SpanAttributes.DB_STATEMENT] = sanitize_body(
-        body
-    )
+    # Don't set db.statement for bulk requests, as it can be very large
+    if isinstance(body, dict):
+        attributes[
+            SpanAttributes.DB_STATEMENT
+        ] = sanitize_body(body)
 if params:
     attributes["elasticsearch.params"] = str(params)
 if doc_id:

@@ -51,6 +51,8 @@ else:
     Article = helpers.Article
+# pylint: disable=too-many-public-methods
 @mock.patch(
     "elasticsearch.connection.http_urllib3.Urllib3HttpConnection.perform_request"
@@ -486,3 +488,35 @@ class TestElasticsearchIntegration(TestBase):
             sanitize_body(json.dumps(sanitization_queries.interval_query)),
             str(sanitization_queries.interval_query_sanitized),
         )
+    def test_bulk(self, request_mock):
+        request_mock.return_value = (1, {}, "")
+        es = Elasticsearch()
+        es.bulk(
+            [
+                {
+                    "_op_type": "index",
+                    "_index": "sw",
+                    "_doc_type": "_doc",
+                    "_id": 1,
+                    "doc": {"name": "adam"},
+                },
+                {
+                    "_op_type": "index",
+                    "_index": "sw",
+                    "_doc_type": "_doc",
+                    "_id": 1,
+                    "doc": {"name": "adam"},
+                },
+            ]
+        )
+        spans_list = self.get_finished_spans()
+        self.assertEqual(len(spans_list), 1)
+        span = spans_list[0]
+        # Check version and name in span's instrumentation info
+        self.assertEqualSpanInstrumentationInfo(
+            span, opentelemetry.instrumentation.elasticsearch
+        )
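
The test above checks that the bulk call no longer crashes and produces one span. A follow-up assertion that the statement attribute is absent could look roughly like the sketch below. It is only an illustration: `BulkSpanAssertions`, `assert_no_db_statement`, and the `SimpleNamespace` fake span are hypothetical, and the `"db.statement"` key is assumed to match `SpanAttributes.DB_STATEMENT`. In the real test suite the span would come from `self.get_finished_spans()` instead.

```python
import unittest
from types import SimpleNamespace


class BulkSpanAssertions(unittest.TestCase):
    """Hypothetical test case; not part of the commit."""

    def assert_no_db_statement(self, span):
        # "db.statement" is assumed to be the value of SpanAttributes.DB_STATEMENT.
        self.assertNotIn("db.statement", span.attributes)

    def test_bulk_span_has_no_statement(self):
        # Fake span standing in for a finished span of a bulk request.
        fake_span = SimpleNamespace(
            attributes={"elasticsearch.method": "POST"}
        )
        self.assert_no_db_statement(fake_span)


# Run the case programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BulkSpanAssertions)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```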