* consolidate flags for configuring telemetry
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Enable configuring metrics via service config
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Make components take MetricsLevel from TelemetrySettings
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Fix lint errors
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Remove configuration for the metrics prefix and for adding the instance ID
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Make entire Collector available to telemetry initialization, use it to set metrics prefix to buildInfo.Command
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* fix metrics prefix tests
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Fix lint errors
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* config/telemetry: parseLevel() no longer needs to be exported
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* config/telemetry: remove instanceID and prefix flags
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Apply PR feedback
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* address PR feedback
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Avoid linter complaining about use of deprecated functions
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* Update CHANGELOG
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
* chore: adds porto and fixes vanity imports.
* chore: fixes target overriding.
* chore: fixes install of porto.
* chore: includes porto as a tool.
* chore: upgrades porto to check internals.
* chore: rebase and update vanity import.
* chore: removes unnecessary space.
* chore: rolls back vanity import in generated files.
This addresses one of the items raised in #4025 (move persistent_queue to internal). The overall refactor is broken into three parts:
* **Part 2**: actually moving the Persistent Queue into exporterhelper/internal (this also requires exporting `internal.RequestUnmarshaller` and `internal.PersistentRequest`) 👈 (this PR)
Link to tracking Issue: #4025
Testing: Unit tests updated, manual tests to follow
* Add telemetry for dropped data due to exporter sending queue overflow
This change adds internal metrics for dropped spans, metric points and log records when the exporter sending queue is full:
- exporter/enqueue_failed_metric_points
- exporter/enqueue_failed_spans
- exporter/enqueue_failed_log_records
* Make report*EnqueueFailure methods private
They are moved to the package where they are used, which requires some code duplication.
This commit adds observability to the queued_retry exporter helper. It adds the first metric, "queue_length", which indicates the current size of the queue per exporter. The metric is updated every second.
This is the first commit to address the issue https://github.com/open-telemetry/opentelemetry-collector/issues/2434
* Tidy up `consumer/consumererror` package.
* Updated docblocks for grammar and consistency
* Added `IsPartial()` predicate to match `IsPermanent()`
* Ensured tests for `PartialError` test the public interface
Remove `PartialError` and replace with individual signal error types
Refactor consumererror signal extraction to simplify exporterhelper request interface
* Rename consumererror signal error types to align with rest of codebase
* Rename `onPartialError` to `onError` in `exporterhelper.request` interface
* Provide conversion methods to consumererror signal error types.
This moves the accessors for signal data to methods on the individual error types
and provides As<Signal>() package functions that behave as targeted versions of
the errors.As() function.
* Avoid unnecessary allocation, fixup docs
Even in the metrics code we did not have a consistent implementation,
and, most importantly, it was no longer used in the observability helper.
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
* add helper function to convert resource attributes to labels
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* add unit tests to improve coverage
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* get a copy of incoming metrics and modify it
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* update unit tests to make sure incoming pdata.Metrics are unchanged
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* expose Resource_To_Label setting option for all exporters
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* use the helper function directly instead of setting up a consumer
* update README and address PR feedback
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* add unit tests to improve coverage
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* PR feedback: rename ResourceToLabel to ResourceToTelemetry
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
* refactoring: extract common code and create a simple func
Signed-off-by: Rayhan Hossain <hossain.rayhan@outlook.com>
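The commits above describe copying the incoming metrics and adding resource attributes as labels on the copy. A simplified sketch of that idea follows, using plain maps instead of `pdata.Metrics`. The types and the function name are hypothetical, and the rule that existing point labels win on a key conflict is an assumption of this sketch:

```go
package main

import "fmt"

// dataPoint and metrics are hypothetical, simplified stand-ins for pdata types.
type dataPoint struct {
	labels map[string]string
}

type metrics struct {
	resourceAttrs map[string]string
	points        []dataPoint
}

// convertResourceToLabels returns a copy of md in which every data point also
// carries the resource attributes as labels. The input is left unchanged.
// Assumption: on a key conflict the existing point label is kept.
func convertResourceToLabels(md metrics) metrics {
	out := metrics{
		resourceAttrs: make(map[string]string, len(md.resourceAttrs)),
		points:        make([]dataPoint, len(md.points)),
	}
	for k, v := range md.resourceAttrs {
		out.resourceAttrs[k] = v
	}
	for i, p := range md.points {
		labels := make(map[string]string, len(p.labels)+len(md.resourceAttrs))
		for k, v := range md.resourceAttrs {
			labels[k] = v
		}
		for k, v := range p.labels {
			labels[k] = v
		}
		out.points[i] = dataPoint{labels: labels}
	}
	return out
}

func main() {
	in := metrics{
		resourceAttrs: map[string]string{"service.name": "checkout"},
		points:        []dataPoint{{labels: map[string]string{"method": "GET"}}},
	}
	out := convertResourceToLabels(in)
	fmt.Println(len(in.points[0].labels), len(out.points[0].labels))
}
```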
This change introduces diagnostic debug, info and error logging in
queued_retry exporter helpers. The logging will be applied to all
exporters which use the helpers.
All logging is heavily sampled to avoid flooding the logs. We will
likely need to revise the contributing guide here to clarify that
such sampled logging is allowed:
https://github.com/open-telemetry/opentelemetry-collector/blob/master/CONTRIBUTING.md#logging
We will likely also want to fix the logging for the batch processor,
which currently outputs messages on failures that are not very user-friendly.
Resolves https://github.com/open-telemetry/opentelemetry-collector/issues/2013
Sample output with otlphttp exporter and destination unavailable:
```
{"level":"info","ts":1603846135.431853,"caller":"service/service.go:252","msg":"Everything is ready. Begin running and processing data."}
{"level":"info","ts":1603846140.7589068,"caller":"exporterhelper/queued_retry.go:250","msg":"Exporting failed. Will retry the request after interval.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","interval":"3.571319363s"}
{"level":"info","ts":1603846151.5628319,"caller":"exporterhelper/queued_retry.go:250","msg":"Exporting failed. Will retry the request after interval.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","interval":"13.463090615s"}
{"level":"error","ts":1603846153.3259032,"caller":"exporterhelper/queued_retry.go:238","msg":"Exporting failed. No more retries left. Dropping data.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"max elapsed time expired failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","dropped_items":10,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:238\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/metricshelper.go:119\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:129\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/Users/tnajaryan/.go/pkg/mod/github.com/jaegertracing/jaeger@v1.20.0/pkg/queue/bounded_queue.go:77"}
{"level":"info","ts":1603846162.86763,"caller":"exporterhelper/queued_retry.go:250","msg":"Exporting failed. Will retry the request after interval.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","interval":"7.729392865s"}
{"level":"error","ts":1603846163.918241,"caller":"exporterhelper/queued_retry.go:238","msg":"Exporting failed. No more retries left. Dropping data.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"max elapsed time expired failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","dropped_items":10,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:238\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/metricshelper.go:119\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:129\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/Users/tnajaryan/.go/pkg/mod/github.com/jaegertracing/jaeger@v1.20.0/pkg/queue/bounded_queue.go:77"}
{"level":"error","ts":1603846174.1390579,"caller":"exporterhelper/queued_retry.go:238","msg":"Exporting failed. No more retries left. Dropping data.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"max elapsed time expired failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","dropped_items":12,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:238\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/metricshelper.go:119\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:129\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/Users/tnajaryan/.go/pkg/mod/github.com/jaegertracing/jaeger@v1.20.0/pkg/queue/bounded_queue.go:77"}
{"level":"info","ts":1603846174.140094,"caller":"exporterhelper/queued_retry.go:250","msg":"Exporting failed. Will retry the request after interval.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","interval":"6.505275214s"}
{"level":"error","ts":1603846177.78816,"caller":"exporterhelper/queued_retry.go:150","msg":"Dropping data because sending_queue is full. Try increasing queue_size.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","dropped_items":12,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:150\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*metricsExporter).ConsumeMetrics\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/metricshelper.go:79\ngo.opentelemetry.io/collector/processor/batchprocessor.(*batchMetrics).export\n\t/Users/tnajaryan/work/repos/collector-core/processor/batchprocessor/batch_processor.go:260\ngo.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).sendItems\n\t/Users/tnajaryan/work/repos/collector-core/processor/batchprocessor/batch_processor.go:165\ngo.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).startProcessingCycle\n\t/Users/tnajaryan/work/repos/collector-core/processor/batchprocessor/batch_processor.go:145"}
{"level":"warn","ts":1603846177.78847,"caller":"batchprocessor/batch_processor.go:166","msg":"Sender failed","component_kind":"processor","component_type":"batch","component_name":"batch","error":"Dropping data because sending_queue is full. Try increasing queue_size."}
{"level":"warn","ts":1603846179.797478,"caller":"batchprocessor/batch_processor.go:166","msg":"Sender failed","component_kind":"processor","component_type":"batch","component_name":"batch","error":"Dropping data because sending_queue is full. Try increasing queue_size."}
{"level":"warn","ts":1603846180.798402,"caller":"batchprocessor/batch_processor.go:166","msg":"Sender failed","component_kind":"processor","component_type":"batch","component_name":"batch","error":"Dropping data because sending_queue is full. Try increasing queue_size."}
{"level":"warn","ts":1603846181.8058171,"caller":"batchprocessor/batch_processor.go:166","msg":"Sender failed","component_kind":"processor","component_type":"batch","component_name":"batch","error":"Dropping data because sending_queue is full. Try increasing queue_size."}
{"level":"error","ts":1603846185.3383482,"caller":"exporterhelper/queued_retry.go:238","msg":"Exporting failed. No more retries left. Dropping data.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"max elapsed time expired failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","dropped_items":12,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:238\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/metricshelper.go:119\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1\n\t/Users/tnajaryan/work/repos/collector-core/exporter/exporterhelper/queued_retry.go:129\ngithub.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1\n\t/Users/tnajaryan/.go/pkg/mod/github.com/jaegertracing/jaeger@v1.20.0/pkg/queue/bounded_queue.go:77"}
{"level":"info","ts":1603846185.339205,"caller":"exporterhelper/queued_retry.go:250","msg":"Exporting failed. Will retry the request after interval.","component_kind":"exporter","component_type":"otlphttp","component_name":"otlphttp","error":"failed to make an HTTP request: Post \"http://localhost:1234\": dial tcp [::1]:1234: connect: connection refused","interval":"5.318897979s"}
^C{"level":"info","ts":1603846187.05004,"caller":"service/service.go:265","msg":"Received signal from OS","signal":"interrupt"}
{"level":"info","ts":1603846187.050097,"caller":"service/service.go:432","msg":"Starting shutdown..."}
```
* Add support for queued retry in the exporter helper.
Changed only the OTLP exporter for the moment to use the new settings.
Timeout is enabled for all the exporters. Fixes #1193
There are some missing features that will be added in a followup PR:
1. Enforcing errors. For the moment, the Throttle error was added as a hack to keep backwards compatibility with OTLP.
2. Enable queueing and retry for all exporters.
3. Fix observability metrics for the case when requests are dropped because the queue is full.
* First round of comments addressed
This is the first part of moving all components to use the obsreport package (with the goal of making the metrics used by various components uniform).
- introduce the command-line option to select the new or legacy metrics; the default for now is legacy metrics.
- remove the options to enable or disable metrics and tracing in exporterhelper, since all usages enabled both (the only one in contrib that did not was a mistake)
* INITIAL
* Properly implement name
* reword comment
* address comments and rename stop to shutdown
* Use assert in tests, change to shutdown instead of shutdownFunc and add some missing comments.
* Change census-instrumentation to open-telemetry and update authors
census-instrumentation/opencensus is now open-telemetry/opentelemetry
"OpenCensus Authors" is now "OpenTelemetry Authors"
"Copyright 2018" is now "Copyright 2019"
Fix go fmt
* Add exporterhelper to wrap all exporters with observability.
* Remove the nop view exporter, not needed.
* Move constants in constants.go, add TODO to handle gRPC errors.
* Don't export the errors exporterhelper returns.