inits Deployment contrib (#2498)

Co-authored-by: Patrice Chalin <pchalin@gmail.com>
Co-authored-by: Alex Boten <alex@boten.ca>
Co-authored-by: Patrice Chalin <chalin@users.noreply.github.com>
Co-authored-by: Severin Neumann <severin.neumann@altmuehlnet.de>
Co-authored-by: Phillip Carter <pcarter@fastmail.com>
Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Michael Hausenblas 2023-05-10 19:24:56 +01:00 committed by GitHub
parent 3d9fe511a5
commit d98c705cbf
11 changed files with 1066 additions and 33 deletions

View File

@@ -198,7 +198,7 @@ Uplight currently has a few different Collector configurations:
Doug's ultimate goal is for any deployment in any environment to be able to
easily send telemetry to an
-[OTel Collector gateway](/docs/collector/deployment/#gateway).
+[OTel Collector gateway](/docs/collector/deployment/gateway/).
Collectors at Uplight are typically run and maintained by the infrastructure
team, unless individual teams decide to take ownership of their own Collectors.

View File

@@ -1,32 +0,0 @@
---
title: Deployment
weight: 2
---
The OpenTelemetry Collector consists of a single binary and two primary
deployment methods:
- **Agent:** A Collector instance running with the application or on the same
host as the application (e.g. binary, sidecar, or daemonset).
- **Gateway:** One or more Collector instances running as a standalone service
(e.g. container or deployment) typically per cluster, data center or region.
## Agent
It is recommended to deploy the Agent on every host within an environment. In
doing so, the Agent is capable of receiving telemetry data (push and pull based)
as well as enhancing telemetry data with metadata such as custom tags or
infrastructure information. In addition, the Agent can offload responsibilities
that client instrumentation would otherwise need to handle including batching,
retry, encryption, compression and more.
## Gateway
Additionally, a Gateway cluster can be deployed in every cluster, data center,
or region. A Gateway cluster runs as a standalone service and can offer advanced
capabilities over the Agent including tail-based sampling. In addition, a
Gateway cluster can limit the number of egress points required to send data as
well as consolidate API token management. Each Collector instance in a Gateway
cluster operates independently so it is easy to scale the architecture based on
performance needs with a simple load balancer. If a gateway cluster is deployed,
it usually receives data from Agents deployed within an environment.

View File

@@ -0,0 +1,20 @@
---
title: Deployment
description: Patterns you can apply to deploy the OpenTelemetry collector
weight: 2
---
The OpenTelemetry collector consists of a single binary which you can use in
different ways, for different use cases. This section describes deployment
patterns, their use cases, pros and cons, and best practices for collector
configurations in cross-environment and multi-backend deployments.
## Resources
- KubeCon NA 2021 Talk on [OpenTelemetry Collector Deployment
Patterns][y-patterns]
- [Deployment Patterns][gh-patterns] accompanying the talk
[gh-patterns]:
https://github.com/jpkrohling/opentelemetry-collector-deployment-patterns/
[y-patterns]: https://www.youtube.com/watch?v=WhRrwSHDBFs

View File

@@ -0,0 +1,134 @@
---
title: Agent
description:
  Why and how to send signals to collectors and from there to backends
weight: 2
---
The agent collector deployment pattern consists of applications —
[instrumented][instrumentation] with an OpenTelemetry SDK using [OpenTelemetry
protocol (OTLP)][otlp] — or other collectors (using the OTLP exporter) that send
telemetry signals to a [collector][collector] instance running with the
application or on the same host as the application (such as a sidecar or a
daemonset).
Each client-side SDK or downstream collector is configured with a collector
location:
![Decentralized collector deployment concept](../../img/agent-sdk.svg)
1. In the app, the SDK is configured to send OTLP data to a collector.
1. The collector is configured to send telemetry data to one or more backends.
## Example
A concrete example of the agent collector deployment pattern could look as
follows: you manually instrument, say, a [Java application to export
metrics][instrument-java-metrics] using the OpenTelemetry Java SDK. In the
context of the app, you would set the `OTEL_METRICS_EXPORTER` to `otlp` (which
is the default value) and configure the [OTLP exporter][otlp-exporter] with the
address of your collector, for example (in Bash or `zsh` shell):
```shell
export OTEL_EXPORTER_OTLP_ENDPOINT=http://collector.example.com:4318
```
The collector serving at `collector.example.com:4318` would then be configured
like so:
<!-- prettier-ignore-start -->
{{< tabpane lang=yaml persistLang=false >}}
{{< tab Traces >}}
receivers:
  otlp: # the OTLP receiver the app is sending traces to
    protocols:
      grpc:
processors:
  batch:
exporters:
  jaeger: # the Jaeger exporter, to ingest traces to backend
    endpoint: "https://jaeger.example.com:14250"
    tls:
      insecure: true
service:
  pipelines:
    traces/dev:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
{{< /tab >}}
{{< tab Metrics >}}
receivers:
  otlp: # the OTLP receiver the app is sending metrics to
    protocols:
      grpc:
processors:
  batch:
exporters:
  prometheusremotewrite: # the PRW exporter, to ingest metrics to backend
    endpoint: "https://prw.example.com/v1/api/remote_write"
service:
  pipelines:
    metrics/prod:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
{{< /tab >}}
{{< tab Logs >}}
receivers:
  otlp: # the OTLP receiver the app is sending logs to
    protocols:
      grpc:
processors:
  batch:
exporters:
  file: # the File Exporter, to ingest logs to local file
    path: "./app42_example.log"
    rotation:
service:
  pipelines:
    logs/dev:
      receivers: [otlp]
      processors: [batch]
      exporters: [file]
{{< /tab >}}
{{< /tabpane>}}
<!-- prettier-ignore-end -->
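To try the agent configuration above locally, you could run a collector next to
your app. The following is a minimal sketch only, using Docker Compose; the
config file name (`config.yaml`), the contrib image tag, and the published ports
are assumptions, not part of the example above:

```yaml
# Sketch: run the OpenTelemetry Collector (contrib distribution) with the
# agent configuration from the tabs above, saved locally as config.yaml.
# Image tag and port mappings are illustrative assumptions.
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.76.1
    volumes:
      - ./config.yaml:/etc/otelcol-contrib/config.yaml
    ports:
      - "4317:4317" # OTLP gRPC, as enabled in the receiver above
      - "4318:4318" # OTLP HTTP (enable the http protocol in the receiver to use this port)
```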
If you want to try it out for yourself, you can have a look at the end-to-end
[Java][java-otlp-example] or [Python][py-otlp-example] examples.
## Tradeoffs
Pros:
- Simple to get started
- Clear 1:1 mapping between application and collector
Cons:
- Scalability (human and load-wise)
- Inflexible
[instrumentation]: /docs/instrumentation/
[otlp]: /docs/reference/specification/protocol/
[collector]: /docs/collector/
[instrument-java-metrics]: /docs/instrumentation/java/manual/#metrics
[otlp-exporter]: /docs/reference/specification/protocol/exporter/
[java-otlp-example]:
https://github.com/open-telemetry/opentelemetry-java-docs/tree/main/otlp
[py-otlp-example]:
https://opentelemetry-python.readthedocs.io/en/stable/examples/metrics/instruments/README.html
[lb-exporter]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/loadbalancingexporter
[spanmetrics-processor]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/spanmetricsprocessor

View File

@@ -0,0 +1,170 @@
---
title: Gateway
description:
  Why and how to send signals to a single OTLP endpoint and from there to
  backends
weight: 3
---
The gateway collector deployment pattern consists of applications (or other
collectors) sending telemetry signals to a single OTLP endpoint provided by one
or more collector instances running as a standalone service (for example, a
deployment in Kubernetes), typically per cluster, per data center or per region.
In the general case you can use an out-of-the-box load balancer to distribute
the load amongst the collectors:
![Gateway deployment concept](../../img/gateway-sdk.svg)
For use cases where the processing of the telemetry data has to happen in a
specific collector, you would use a two-tiered setup with a collector that has a
pipeline configured with the [Trace ID/Service-name aware load-balancing
exporter][lb-exporter] in the first tier and the collectors handling the scale
out in the second tier. For example, you will need to use the load-balancing
exporter when using the [Tail Sampling processor][tailsample-processor] so that
all spans for a given trace reach the same collector instance where the tail
sampling policy is applied.
Let's have a look at such a case where we are using the load-balancing exporter:
![Gateway deployment with load-balancing exporter](../../img/gateway-lb-sdk.svg)
1. In the app, the SDK is configured to send OTLP data to a central location.
1. A collector configured using the load-balancing exporter that distributes
signals to a group of collectors.
1. The collectors are configured to send telemetry data to one or more backends.
{{% alert title="Note" color="info" %}} Currently, the load-balancing exporter
only supports pipelines of the `traces` type. {{% /alert %}}
## Example
For a concrete example of the centralized collector deployment pattern we first
need to have a closer look at the load-balancing exporter. It has two main
configuration fields:
- The `resolver`, which determines where to find the downstream collectors (or
  backends). If you use the `static` sub-key here, you will have to manually
  enumerate the collector URLs. The other supported resolver is the DNS
  resolver, which will periodically check for updates and resolve IP addresses.
  For this resolver type, the `hostname` sub-key specifies the hostname to query
  in order to obtain the list of IP addresses.
- With the `routing_key` field you tell the load-balancing exporter to route
  spans to specific downstream collectors. If you set this field to `traceID`
  (default), then the load-balancing exporter exports spans based on their
  `traceID`. Otherwise, if you use `service` as the value for `routing_key`, it
  exports spans based on their service name, which is useful when using
  connectors like the [Span Metrics connector][spanmetrics-connector], so all
  spans of a service will be sent to the same downstream collector for metric
  collection, guaranteeing accurate aggregations.
The first-tier collector servicing the OTLP endpoint would be configured as
shown below:
<!-- prettier-ignore-start -->
{{< tabpane lang=yaml persistLang=false >}}
{{< tab Static >}}
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      static:
        hostnames:
          - collector-1.example.com:4317
          - collector-2.example.com:5317
          - collector-3.example.com
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
{{< /tab >}}
{{< tab DNS >}}
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: collectors.example.com
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
{{< /tab >}}
{{< tab "DNS with service" >}}
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  loadbalancing:
    routing_key: "service"
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: collectors.example.com
        port: 5317
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
{{< /tab >}}
{{< /tabpane>}}
<!-- prettier-ignore-end -->
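The second-tier collectors that the resolver points to are plain collectors with
their own pipelines. A minimal sketch of such a downstream collector, assuming a
tail sampling policy that keeps traces containing errors and an OTLP-capable
backend at `backend.example.com` (both assumptions for illustration), could look
like this:

```yaml
# Sketch of a second-tier collector sitting behind the load-balancing exporter.
# The sampling policy and the backend endpoint are illustrative assumptions.
receivers:
  otlp: # receives the spans routed to this instance by the first tier
    protocols:
      grpc:
processors:
  tail_sampling:
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
exporters:
  otlp:
    endpoint: backend.example.com:4317
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```

Because the first tier routes by `traceID` by default, each second-tier instance
sees all spans of a given trace and can apply the tail sampling decision
consistently.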
The load-balancing exporter emits metrics including
`otelcol_loadbalancer_num_backends` and `otelcol_loadbalancer_backend_latency`
that you can use for health and performance monitoring of the OTLP endpoint
collector.
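These metrics are part of the collector's own telemetry. A sketch of how to
expose them for scraping, assuming the collector's default self-monitoring port
`8888`, follows:

```yaml
# Sketch: expose the collector's own metrics (including the
# otelcol_loadbalancer_* series) via the service telemetry settings.
service:
  telemetry:
    metrics:
      level: detailed
      address: 0.0.0.0:8888 # Prometheus-style metrics endpoint (8888 is the default port)
```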
## Tradeoffs
Pros:
- Separation of concerns such as centrally managed credentials
- Centralized policy management (for example, filtering certain logs or
sampling)
Cons:
- It's one more thing to maintain, and one more thing that can fail (complexity)
- Added latency in case of cascaded collectors
- Higher overall resource usage (costs)
[instrumentation]: /docs/instrumentation/
[otlp]: /docs/reference/specification/protocol/
[collector]: /docs/collector/
[instrument-java-metrics]: /docs/instrumentation/java/manual/#metrics
[otlp-exporter]: /docs/reference/specification/protocol/exporter/
[java-otlp-example]:
https://github.com/open-telemetry/opentelemetry-java-docs/tree/main/otlp
[py-otlp-example]:
https://opentelemetry-python.readthedocs.io/en/stable/examples/metrics/instruments/README.html
[lb-exporter]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/loadbalancingexporter
[tailsample-processor]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor
[spanmetrics-connector]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector

View File

@@ -0,0 +1,32 @@
---
title: No Collector
description: Why and how to send signals directly from app to backends
weight: 1
---
The simplest pattern is not to use a collector at all. This pattern consists of
applications [instrumented][instrumentation] with an OpenTelemetry SDK that
export telemetry signals (traces, metrics, logs) directly into a backend:
![No collector deployment concept](../../img/sdk.svg)
## Example
See the [code instrumentation for programming languages][instrumentation] for
concrete end-to-end examples for how to export signals from your app directly
into a backend.
## Tradeoffs
Pros:
- Simple to use (especially in a dev/test environment)
- No additional moving parts to operate (in production environments)
Cons:
- Requires code changes if collection, processing, or ingestion changes
- Strong coupling between the application code and the backend
- Limited number of exporters per language implementation
[instrumentation]: /docs/instrumentation/

Four image files added (diffs suppressed): 94 KiB, 137 KiB, 114 KiB, and 53 KiB.

View File

@@ -1675,6 +1675,10 @@
"StatusCode": 200,
"LastSeen": "2023-02-20T08:10:49.246765-05:00"
},
"https://github.com/jpkrohling/opentelemetry-collector-deployment-patterns/": {
"StatusCode": 200,
"LastSeen": "2023-03-14T06:35:50.116854Z"
},
"https://github.com/jufab/opentelemetry-angular-interceptor": {
"StatusCode": 200,
"LastSeen": "2023-02-20T07:43:48.729669-05:00"
@@ -3347,6 +3351,10 @@
"StatusCode": 200,
"LastSeen": "2023-02-16T17:43:51.469854-05:00"
},
"https://opentelemetry-python.readthedocs.io/en/stable/examples/metrics/instruments/README.html": {
"StatusCode": 200,
"LastSeen": "2023-03-14T06:35:50.907539Z"
},
"https://opentelemetry-python.readthedocs.io/en/stable/shim/opentracing_shim/opentracing_shim.html": {
"StatusCode": 200,
"LastSeen": "2023-02-16T17:45:15.666125-05:00"