Add troubleshooting guidance to OTel Operator auto-instrumentation docs (#3098)

Co-authored-by: Patrice Chalin <chalin@users.noreply.github.com>
Co-authored-by: Phillip Carter <pcarter@fastmail.com>
Adriana Villela 2023-07-28 12:55:35 -04:00 committed by GitHub
parent d5ec5401a3
commit 716c116848
2 changed files with 188 additions and 7 deletions


@@ -4,8 +4,8 @@ linkTitle: Auto-instrumentation
weight: 11
description:
An implementation of auto-instrumentation using the OpenTelemetry Operator.
# prettier-ignore
cSpell:ignore: autoinstrumentation GRPCNETCLIENT k8sattributesprocessor otelinst otlpreceiver REDISCALA
---
The OpenTelemetry Operator supports injecting and configuring
@@ -297,11 +297,12 @@ spec:
EOF
```
By default, the `Instrumentation` resource that auto-instruments Python services
uses `otlp` with the `http/protobuf` protocol (gRPC is not supported at this
time). This means that the configured endpoint must be able to receive OTLP over
`http/protobuf`. Therefore, the example uses `http://demo-collector:4318`, which
will connect to the `http` port of the `otlpreceiver` of the Collector created
in the previous step.
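To double-check that the Collector's `Service` actually exposes that port, you can list it; a quick sketch, assuming the `Service` is named `demo-collector` as in the example above and is in the current namespace:
```sh
# The PORT(S) column should include 4318, the OTLP http/protobuf port
kubectl get service demo-collector
```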
> As of operator v0.67.0, the Instrumentation resource automatically sets
> `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` and `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL`
@@ -373,3 +374,175 @@ Alternatively, the annotation can be added to a namespace, which will result in
all services in that namespace opting in to automatic instrumentation. See the
[Operator's auto-instrumentation documentation](https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md#opentelemetry-auto-instrumentation-injection)
for more details.
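For example, a namespace opted in to Python auto-instrumentation might look like this (a sketch; the namespace name `application` matches the sample output below):
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: application
  annotations:
    # Every pod created in this namespace gets Python auto-instrumentation
    instrumentation.opentelemetry.io/inject-python: 'true'
```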
## Troubleshooting
If you run into problems trying to auto-instrument your code, here are a few
things that you can try.
### Did the Instrumentation resource install?
After installing the `Instrumentation` resource, verify that it installed
correctly by running this command, where `<namespace>` is the namespace in which
the `Instrumentation` resource is deployed:
```sh
kubectl describe otelinst -n <namespace>
```
Sample output:
```
Name:         python-instrumentation
Namespace:    application
Labels:       app.kubernetes.io/managed-by=opentelemetry-operator
Annotations:  instrumentation.opentelemetry.io/default-auto-instrumentation-apache-httpd-image:
                ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.3
              instrumentation.opentelemetry.io/default-auto-instrumentation-dotnet-image:
                ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:0.7.0
              instrumentation.opentelemetry.io/default-auto-instrumentation-go-image:
                ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.2.1-alpha
              instrumentation.opentelemetry.io/default-auto-instrumentation-java-image:
                ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.26.0
              instrumentation.opentelemetry.io/default-auto-instrumentation-nodejs-image:
                ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.40.0
              instrumentation.opentelemetry.io/default-auto-instrumentation-python-image:
                ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.39b0
API Version:  opentelemetry.io/v1alpha1
Kind:         Instrumentation
Metadata:
  Creation Timestamp:  2023-07-28T03:42:12Z
  Generation:          1
  Resource Version:    3385
  UID:                 646661d5-a8fc-4b64-80b7-8587c9865f53
Spec:
  ...
  Exporter:
    Endpoint:  http://otel-collector-collector.opentelemetry.svc.cluster.local:4318
  ...
  Propagators:
    tracecontext
    baggage
  Python:
    Image:  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.39b0
    Resource Requirements:
      Limits:
        Cpu:     500m
        Memory:  32Mi
      Requests:
        Cpu:     50m
        Memory:  32Mi
  Resource:
  Sampler:
Events:  <none>
```
### Do the OTel Operator logs show any auto-instrumentation errors?
Check the OTel Operator logs for any errors pertaining to auto-instrumentation
by running this command:
```sh
kubectl logs -l app.kubernetes.io/name=opentelemetry-operator --container manager -n opentelemetry-operator-system --follow
```
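If the output is large, you can narrow it down; for example, this sketch filters for log lines that mention instrumentation:
```sh
kubectl logs -l app.kubernetes.io/name=opentelemetry-operator \
  --container manager -n opentelemetry-operator-system \
  | grep -i instrumentation
```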
### Were the resources deployed in the right order?
Order matters! The `Instrumentation` resource needs to be deployed before the
application is deployed; otherwise, the auto-instrumentation won't work.
Recall the auto-instrumentation annotation:
```yaml
annotations:
instrumentation.opentelemetry.io/inject-python: 'true'
```
The annotation above tells the OTel Operator to look for an `Instrumentation`
object in the pod's namespace, and to inject Python auto-instrumentation into
the pod. When the pod starts up, the Operator adds an
[init-container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/)
called `opentelemetry-auto-instrumentation` to the application's pod, which is
then used to inject the auto-instrumentation into the app container.
If the `Instrumentation` resource isn't present by the time the application is
deployed, however, the init-container can't be created. Therefore, if the
application is deployed _before_ deploying the `Instrumentation` resource, the
auto-instrumentation will fail.
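Another way to confirm whether the init-container was injected is to inspect the pod spec directly; a sketch, where `<pod_name>` and `<namespace>` are placeholders for your own values:
```sh
# Prints the names of all init-containers in the pod; expect
# opentelemetry-auto-instrumentation to be among them
kubectl get pod <pod_name> -n <namespace> \
  -o jsonpath='{.spec.initContainers[*].name}'
```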
To make sure that the `opentelemetry-auto-instrumentation` init-container has
started up correctly (or has even started up at all), run the following command:
```sh
kubectl get events -n <your_app_namespace>
```
This should output something like the following:
```
53s Normal Created pod/py-otel-server-7f54bf4cbc-p8wmj Created container opentelemetry-auto-instrumentation
53s Normal Started pod/py-otel-server-7f54bf4cbc-p8wmj Started container opentelemetry-auto-instrumentation
```
If the output is missing `Created` and/or `Started` entries for
`opentelemetry-auto-instrumentation`, then there is an issue with your
auto-instrumentation. This can be the result of any of the following:
- The `Instrumentation` resource wasn't installed (or wasn't installed
  properly).
- The `Instrumentation` resource was installed _after_ the application was
  deployed.
- There's an error in the auto-instrumentation annotation, or the annotation is
  in the wrong spot (see the next section).
Be sure to check the output of `kubectl get events` for any errors, as these
might help point to the issue.
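For example, to narrow the output down to warnings only (a sketch using a standard `kubectl` field selector):
```sh
kubectl get events -n <your_app_namespace> --field-selector type=Warning
```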
### Is the auto-instrumentation annotation correct?
Sometimes auto-instrumentation can fail due to errors in the
auto-instrumentation annotation.
Here are a few things to check for:
- **Is the auto-instrumentation for the right language?** For example, when
instrumenting a Python application, make sure that the annotation doesn't
incorrectly say `instrumentation.opentelemetry.io/inject-java: "true"`
instead.
- **Is the auto-instrumentation annotation in the correct location?** When
  defining a `Deployment`, annotations can be added in one of two locations:
  the top-level `metadata.annotations` and `spec.template.metadata.annotations`.
  The auto-instrumentation annotation needs to be added to
  `spec.template.metadata.annotations`, otherwise it won't work, as shown in the
  sketch below.
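Here is a minimal sketch of a `Deployment` with the annotation in the correct location (the name `py-otel-server` matches the sample events above; the image is illustrative):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: py-otel-server
spec:
  selector:
    matchLabels:
      app: py-otel-server
  template:
    metadata:
      labels:
        app: py-otel-server
      annotations:
        # Must be on the pod template, not on the Deployment's own metadata
        instrumentation.opentelemetry.io/inject-python: 'true'
    spec:
      containers:
        - name: py-otel-server
          image: py-otel-server:0.1.0 # illustrative image
```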
### Was the auto-instrumentation endpoint configured correctly?
The `spec.exporter.endpoint` attribute of the `Instrumentation` resource defines
where to send telemetry data. This can be an [OTel Collector](/docs/collector/),
or any OTLP endpoint. If this attribute is left out, it defaults to
`http://localhost:4317`, which most likely won't send telemetry data anywhere.
When sending telemetry to an OTel Collector located in the same Kubernetes
cluster, `spec.exporter.endpoint` should reference the name of the OTel
Collector
[`Service`](https://kubernetes.io/docs/concepts/services-networking/service/).
For example:
```yaml
spec:
exporter:
endpoint: http://otel-collector-collector.opentelemetry.svc.cluster.local:4317
```
Here, the Collector endpoint is set to
`http://otel-collector-collector.opentelemetry.svc.cluster.local:4317`, where
`otel-collector-collector` is the name of the OTel Collector Kubernetes
`Service` (created by the Operator for an OpenTelemetry Collector resource
named `otel-collector`). In the above example, the Collector is running in a
different namespace from the application, which means that
`opentelemetry.svc.cluster.local` must be appended to the Collector's service
name, where `opentelemetry` is the namespace in which the Collector resides.
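If you're unsure of the `Service` name, you can look it up; a sketch, assuming the Collector lives in the `opentelemetry` namespace as in the example:
```sh
# The NAME column shows the value to use in spec.exporter.endpoint
kubectl get service -n opentelemetry
```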


@@ -3439,6 +3439,10 @@
"StatusCode": 206,
"LastSeen": "2023-06-29T16:06:58.962765-04:00"
},
"https://kubernetes.io/docs/concepts/services-networking/service/": {
"StatusCode": 206,
"LastSeen": "2023-07-28T12:17:39.379917-04:00"
},
"https://kubernetes.io/docs/concepts/workloads/controllers/": {
"StatusCode": 206,
"LastSeen": "2023-06-29T16:07:14.3411-04:00"
@@ -3467,6 +3471,10 @@
"StatusCode": 206,
"LastSeen": "2023-06-29T16:14:26.894177-04:00"
},
"https://kubernetes.io/docs/concepts/workloads/pods/init-containers/": {
"StatusCode": 206,
"LastSeen": "2023-07-28T12:17:39.145108-04:00"
},
"https://kubernetes.io/docs/reference/kubectl/": {
"StatusCode": 206,
"LastSeen": "2023-06-30T11:49:31.853471-04:00"