Merge branch 'observability-concept' of https://github.com/dapr/docs into observability-concept

Ori Zohar 2021-02-15 15:32:52 -08:00
commit 669d991d77
10 changed files with 359 additions and 52 deletions


@ -9,10 +9,12 @@ description: >
When building an application, understanding how the system is behaving is an important part of operating it - this includes having the ability to observe the internal calls of an application, gauging its performance and becoming aware of problems as soon as they occur. This is challenging for any system, but even more so for a distributed system comprised of multiple microservices, where a flow made of several calls may start in one microservice and continue in another. Observability is critical in production environments, but it is also useful during development to understand bottlenecks, improve performance and perform basic debugging across the span of microservices.
While some data points about an application can be gathered from the underlying infrastructure (e.g. memory consumption, CPU usage), other meaningful information must be collected from an "application aware" layer - one that can show how an important series of calls is executed across microservices. This usually means a developer must add some code to instrument an application for this purpose. Often, instrumentation code is simply meant to send collected data such as traces and metrics to an external monitoring tool or service that can help store, visualize and analyze all this information.
Having to maintain this code, which is not part of the core logic of the application, is another burden on the developer, sometimes requiring an understanding of the monitoring tools' APIs, the use of additional SDKs, etc. This instrumentation may also add to the portability challenges of an application, which may require different instrumentation depending on where the application is deployed. For example, different cloud providers offer different monitoring solutions and an on-prem deployment might require an on-prem solution.
## Observability for your application with Dapr
When building an application which leverages Dapr building blocks to perform service-to-service calls and pub/sub messaging, Dapr offers an advantage with respect to [distributed tracing]({{<ref tracing>}}): because this inter-service communication flows through the Dapr sidecar, the sidecar is in a unique position to offload the burden of application-level instrumentation.
### Distributed tracing
Dapr can be [configured to emit tracing data]({{<ref setup-tracing.md>}}), and because Dapr does so using widely adopted protocols such as the [Zipkin](https://zipkin.io) protocol, it can be easily integrated with multiple [monitoring backends]({{<ref supported-tracing-backends>}}).
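For reference, a minimal sketch of such a tracing configuration - assuming a Zipkin-compatible backend reachable inside the cluster at `http://zipkin.default.svc.cluster.local:9411` - looks like this:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: appconfig
spec:
  tracing:
    samplingRate: "1"   # sample every trace; lower this for high-traffic environments
    zipkin:
      endpointAddress: "http://zipkin.default.svc.cluster.local:9411/api/v2/spans"
```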
@ -28,14 +30,15 @@ Dapr can also be configured to work with the [OpenTelemetry Collector]({{<ref op
Dapr uses [W3C tracing]({{<ref w3c-tracing>}}) specification for tracing context and can generate and propagate the context header itself or propagate user provided context headers.
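As an illustration, a caller can pass a W3C `traceparent` header on a request made through the Dapr sidecar and Dapr propagates it downstream; if the header is omitted, Dapr generates one. The app id `otherapp` and method `neworder` below are hypothetical:

```bash
# Invoke another service through the local Dapr sidecar, supplying a W3C trace context header
curl http://localhost:3500/v1.0/invoke/otherapp/method/neworder \
  -H "traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
```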
## Observability for the Dapr sidecar and system services
As with other parts of your system, you will want to be able to observe Dapr itself and collect metrics and logs emitted by the Dapr sidecar that runs alongside each microservice, as well as the Dapr-related services in your environment, such as the control plane services that are deployed for a Dapr-enabled Kubernetes cluster.
<img src="/images/observability-sidecar.png" width=1000 alt="Dapr sidecar metrics, logs and health checks">
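On Kubernetes, for example, the control plane services mentioned above can be listed with kubectl (assuming Dapr was installed into the default `dapr-system` namespace):

```bash
# List the Dapr control plane pods whose logs and metrics you will want to collect
kubectl get pods -n dapr-system
```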
### Logging
Dapr generates [logs]({{<ref "logs.md">}}) to provide visibility into sidecar operation and to help users identify issues and perform debugging. Logs events contain warning, error, info, and debug messages produced by Dapr system services. Dapr can also be configured to send logs to collectors such as [Fluentd]({{< ref fluentd.md >}}) and [Azure Monitor]({{< ref azure-monitor.md >}}) so they can be easily searched, analyzed and provide insights.
Dapr generates [logs]({{<ref "logs.md">}}) to provide visibility into sidecar operation and to help users identify issues and perform debugging. Log events contain warning, error, info, and debug messages produced by Dapr system services. Dapr can also be configured to send logs to collectors such as [Fluentd]({{< ref fluentd.md >}}) and [Azure Monitor]({{< ref azure-monitor.md >}}) so they can be easily searched and analyzed to provide insights.
### Metrics
Metrics are the series of measured values and counts that are collected and stored over time. [Dapr metrics]({{<ref "metrics">}}) provide monitoring capabilities to understand the behavior of the Dapr sidecar and system services. For example, the metrics between a Dapr sidecar and the user application show call latency, traffic failures, error rates of requests etc. Dapr [system services metrics](https://github.com/dapr/dapr/blob/master/docs/development/dapr-metrics.md) show side car injection failures, health of the system services including CPU usage, number of actor placement made etc.
Metrics are series of measured values and counts that are collected and stored over time. [Dapr metrics]({{<ref "metrics">}}) provide monitoring capabilities to understand the behavior of the Dapr sidecar and system services. For example, the metrics between a Dapr sidecar and the user application show call latency, traffic failures, error rates of requests, etc. Dapr [system services metrics](https://github.com/dapr/dapr/blob/master/docs/development/dapr-metrics.md) show sidecar injection failures, health of the system services including CPU usage, number of actor placements made, etc.
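As an illustration, the sidecar exposes these metrics in Prometheus format - assuming the default metrics port of 9090 - so they can be inspected directly or scraped by a metrics collector:

```bash
# Inspect the Prometheus-format metrics exposed by a locally running Dapr sidecar
curl http://localhost:9090/metrics
```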
### Health checks
The Dapr sidecar exposes an HTTP endpoint for [health checks]({{<ref sidecar-health.md>}}). With this API, user code or hosting environments can probe the Dapr sidecar to determine its status and identify issues with sidecar readiness.
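For example - assuming the sidecar's default HTTP port of 3500 - a readiness probe or a quick manual check can call the health endpoint, which returns an empty 204 response when the sidecar is healthy:

```bash
# Probe the Dapr sidecar health endpoint; HTTP 204 No Content indicates a healthy sidecar
curl -i http://localhost:3500/v1.0/healthz
```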


@ -74,7 +74,7 @@ dapr init -k -n mynamespace
### Install in highly available mode
You can run Dapr with 3 replicas of each control plane pod in the dapr-system namespace for [production scenarios]({{< ref kubernetes-production.md >}}).
```bash
dapr init -k --enable-ha=true
@ -129,6 +129,18 @@ The latest Dapr helm chart no longer supports Helm v2. Please migrate from Helm
--wait
```
To install in high availability mode:
```bash
helm upgrade --install dapr dapr/dapr \
--version=1.0.0 \
--namespace dapr-system \
--create-namespace \
--set global.ha.enabled=true \
--wait
```
See [Guidelines for production ready deployments on Kubernetes]({{<ref kubernetes-production.md>}}) for more information on installing and upgrading Dapr using Helm.
### Verify installation


@ -1,7 +1,7 @@
---
type: docs
title: "Monitor your application with Dapr"
linkTitle: "Monitoring"
weight: 400
title: "Observe your application with Dapr"
linkTitle: "Observability"
weight: 80
description: "How to monitor your application using Dapr integrations"
---


@ -0,0 +1,132 @@
---
type: docs
title: "How-To: Set up Azure Monitor to search logs and collect metrics"
linkTitle: "Azure Monitor"
weight: 2000
description: "Enable Dapr metrics and logs with Azure Monitor for Azure Kubernetes Service (AKS)"
---
## Prerequisites
- [Azure Kubernetes Service](https://docs.microsoft.com/en-us/azure/aks/)
- [Enable Azure Monitor For containers in AKS](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-overview)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
## Enable Prometheus metric scrape using config map
1. Make sure that the omsagent pods are running
```bash
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
...
omsagent-75qjs 1/1 Running 1 44h
omsagent-c7c4t 1/1 Running 0 44h
omsagent-rs-74f488997c-dshpx 1/1 Running 1 44h
omsagent-smtk7 1/1 Running 1 44h
...
```
2. Apply the config map to enable Prometheus metrics endpoint scraping.
You can use [azm-config-map.yaml](/docs/azm-config-map.yaml) to enable the Prometheus metrics endpoint scrape.
If you installed Dapr to a different namespace, you need to change the `monitor_kubernetes_pods_namespaces` array values. For example:
```yaml
...
  prometheus-data-collection-settings: |-
    [prometheus_data_collection_settings.cluster]
        interval = "1m"
        monitor_kubernetes_pods = true
        monitor_kubernetes_pods_namespaces = ["dapr-system", "default"]
    [prometheus_data_collection_settings.node]
        interval = "1m"
...
```
Apply config map:
```bash
kubectl apply -f ./azm-config-map.yaml
```
## Install Dapr with JSON formatted logs
1. Install Dapr with JSON-formatted logs enabled
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.logAsJson=true
```
2. Enable JSON-formatted logs in the Dapr sidecar and add the Prometheus annotations.
> Note: The OMS Agent scrapes the metrics only if the replica set has the Prometheus annotations.
Add the `dapr.io/log-as-json: "true"` annotation to your deployment yaml. A quick way to verify the sidecar's JSON log output is sketched after the example below.
Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  namespace: default
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/app-id: "pythonapp"
        dapr.io/log-as-json: "true"
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
        prometheus.io/path: "/"
...
```
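Once the deployment is applied, you can spot-check that the sidecar is emitting JSON-formatted logs - a sketch assuming the pod labels from the example above and the default sidecar container name `daprd`:

```bash
# Tail the Dapr sidecar logs for the example deployment; each line should be a JSON object
kubectl logs -l app=python -c daprd --tail=5
```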
## Search metrics and logs with Azure Monitor
1. Go to Azure Monitor
2. Search Dapr logs
Here is an example query that parses the JSON-formatted logs and queries logs from the Dapr system processes.
```
ContainerLog
| extend parsed=parse_json(LogEntry)
| project Time=todatetime(parsed['time']), app_id=parsed['app_id'], scope=parsed['scope'],level=parsed['level'], msg=parsed['msg'], type=parsed['type'], ver=parsed['ver'], instance=parsed['instance']
| where level != ""
| sort by Time
```
3. Search metrics
This query queries the `process_resident_memory_bytes` Prometheus metric for the Dapr system processes and renders a time chart.
```
InsightsMetrics
| where Namespace == "prometheus" and Name == "process_resident_memory_bytes"
| extend tags=parse_json(Tags)
| project TimeGenerated, Name, Val, app=tostring(tags['app'])
| summarize memInBytes=percentile(Val, 99) by bin(TimeGenerated, 1m), app
| where app startswith "dapr-"
| render timechart
```
## References
* [Configure scraping of Prometheus metrics with Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-prometheus-integration)
* [Configure agent data collection for Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-agent-config)
* [Azure Monitor Query](https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/query-language)


@ -0,0 +1,72 @@
---
type: docs
title: "Using OpenTelemetry Collector to collect traces to send to AppInsights"
linkTitle: "Using the OpenTelemetry for Azure AppInsights"
weight: 1000
description: "How to push trace events to Azure Application Insights, using the OpenTelemetry Collector."
---
Dapr integrates with [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) using the Zipkin API. This guide walks through an example using Dapr to push trace events to Azure Application Insights, using the OpenTelemetry Collector.
## Requirements
An installation of Dapr on Kubernetes.
## How to configure distributed tracing with Application Insights
### Setup Application Insights
1. First, you'll need an Azure account. See instructions [here](https://azure.microsoft.com/free/) to apply for a **free** Azure account.
2. Follow instructions [here](https://docs.microsoft.com/en-us/azure/azure-monitor/app/create-new-resource) to create a new Application Insights resource.
3. Get the Application Insights Instrumentation Key from your Application Insights page. If you prefer the command line, you can also retrieve it with the Azure CLI as sketched below.
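A sketch of retrieving the key with the Azure CLI - this assumes the `application-insights` CLI extension is installed, and the resource and group names below are hypothetical:

```bash
# Requires the Application Insights CLI extension: az extension add --name application-insights
az monitor app-insights component show \
  --app my-appinsights \
  --resource-group my-resource-group \
  --query instrumentationKey \
  --output tsv
```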
### Run OpenTelemetry Collector to push to your Application Insights instance
Install the OpenTelemetry Collector to your Kubernetes cluster to push events to your Application Insights instance
1. Check out the file [open-telemetry-collector-appinsights.yaml](/docs/open-telemetry-collector/open-telemetry-collector-appinsights.yaml) and replace the `<INSTRUMENTATION-KEY>` placeholder with your Application Insights Instrumentation Key.
2. Apply the configuration with `kubectl apply -f open-telemetry-collector-appinsights.yaml`.
Next, set up both a Dapr configuration file to turn on tracing and deploy a tracing exporter component that uses the OpenTelemetry Collector.
1. Create a collector-config.yaml file with this [content](/docs/open-telemetry-collector/collector-config.yaml). A sketch of what this configuration contains is shown after this list.
2. Apply the configuration with `kubectl apply -f collector-config.yaml`.
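For orientation, a minimal sketch of what such a configuration contains - assuming the collector's Zipkin receiver is exposed by an `otel-collector` service on port 9411 in the `default` namespace:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: appconfig
  namespace: default
spec:
  tracing:
    samplingRate: "1"
    zipkin:
      endpointAddress: "http://otel-collector.default.svc.cluster.local:9411/api/v2/spans"
```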
### Deploy your app with tracing
When running in Kubernetes mode, apply the `appconfig` configuration by adding a `dapr.io/config` annotation to the container that you want to participate in the distributed tracing, as shown in the following example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    metadata:
      ...
      annotations:
        dapr.io/enabled: "true"
        dapr.io/app-id: "MyApp"
        dapr.io/app-port: "8080"
        dapr.io/config: "appconfig"
```
Some of the quickstarts such as [distributed calculator](https://github.com/dapr/quickstarts/tree/master/distributed-calculator) already configure these settings, so if you are using those no additional settings are needed.
That's it! There's no need to include any SDKs or instrument your application code. Dapr automatically handles the distributed tracing for you.
> **NOTE**: You can register multiple tracing exporters at the same time, and the tracing logs are forwarded to all registered exporters.
Deploy and run some applications. After a few minutes, you should see tracing logs appearing in your Application Insights resource. You can also use the **Application Map** to examine the topology of your services, as shown below:
![Application map](/images/open-telemetry-app-insights.png)
> **NOTE**: Only operations going through the Dapr API exposed by the Dapr sidecar (e.g. service invocation or event publishing) are displayed in the Application Map topology.
## Related links
* Try out the [observability quickstart](https://github.com/dapr/quickstarts/tree/master/observability/README.md)
* How to set [tracing configuration options]({{< ref "configuration-overview.md#tracing" >}})


@ -1,38 +1,35 @@
---
type: docs
title: "Using OpenTelemetry Collector to collect traces"
linkTitle: "OpenTelemetry"
weight: 1000
description: "How to use Dapr to push trace events to Azure Application Insights, through the OpenTelemetry Collector."
linkTitle: "Using the OpenTelemetry Collector"
weight: 900
description: "How to use Dapr to push trace events through the OpenTelemetry Collector."
---
Dapr will export traces in the OpenTelemetry format once OpenTelemetry is GA. In the meantime, traces can be exported using the Zipkin format. Combined with the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector), you can still send traces to many popular tracing backends (such as Azure AppInsights, AWS X-Ray, Stackdriver, etc.).
![Using the OpenTelemetry Collector to integrate with many backends](/images/open-telemetry-collector.png)
## Requirements
1. An installation of Dapr on Kubernetes.
2. You have already set up your trace backend to receive traces.
3. Check the OpenTelemetry Collector exporters [here](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter) and [here](https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter) to see if your trace backend is supported by the OpenTelemetry Collector. On those linked pages, find the exporter you want to use and read its documentation to find out the parameters required.
## Setting OpenTelemetry Collector
### Run OpenTelemetry Collector to push to your trace backend
1. Check out the file [open-telemetry-collector-generic.yaml](/docs/open-telemetry-collector/open-telemetry-collector-generic.yaml) and replace the section marked with `<your-exporter-here>` with the correct settings for your trace exporter. Again, refer to the OpenTelemetry Collector links in the Requirements section to determine the correct settings.
2. Apply the configuration with `kubectl apply -f open-telemetry-collector-generic.yaml`.
## Set up Dapr to send trace to OpenTelemetry Collector
### Turn on tracing in Dapr
Next, set up both a Dapr configuration file to turn on tracing and deploy a tracing exporter component that uses the OpenTelemetry Collector.
1. Create a collector-config.yaml file with this [content](/docs/open-telemetry-collector/collector-config.yaml)
@ -66,26 +63,9 @@ That's it! There's no need include any SDKs or instrument your application code.
> **NOTE**: You can register multiple tracing exporters at the same time, and the tracing logs are forwarded to all registered exporters.
Deploy and run some applications. Wait for the traces to propagate to your tracing backend and view them there.
## Related links
* Try out the [observability quickstart](https://github.com/dapr/quickstarts/tree/master/observability/README.md)
* How to set [tracing configuration options]({{< ref "configuration-overview.md#tracing" >}})
## Tracing configuration
The `tracing` section under the `Configuration` spec contains the following properties:
```yml
tracing:
samplingRate: "1"
```
The following table lists the different properties.
| Property     | Type   | Description |
|--------------|--------|-------------|
| samplingRate | string | Set sampling rate for tracing to be enabled or disabled. |
`samplingRate` is used to enable or disable tracing. To disable tracing, set `samplingRate: "0"` in the configuration. The valid range of `samplingRate` is between 0 and 1 inclusive. The sampling rate determines whether a trace span is sampled based on this value. `samplingRate: "1"` samples all traces. By default, the sampling rate is 1 in 10,000.
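For example, to disable tracing entirely:

```yml
tracing:
  samplingRate: "0"
```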


@ -2,6 +2,6 @@
type: docs
title: "Performance and Scalability"
linkTitle: "Performance and Scalability"
weight: 90
description: "Benchmarks and guidelines for Dapr building blocks"
---


@ -0,0 +1,108 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      zipkin:
        endpoint: 0.0.0.0:9411
    extensions:
      health_check:
      pprof:
        endpoint: :1888
      zpages:
        endpoint: :55679
    exporters:
      logging:
        loglevel: debug
      # Depending on where you want to export your trace, use the
      # correct OpenTelemetry trace exporter here.
      #
      # Refer to
      # https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter
      # and
      # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter
      # for full lists of trace exporters that you can use, and how to
      # configure them.
      <your-exporter-here>:
        ...
    service:
      extensions: [pprof, zpages, health_check]
      pipelines:
        traces:
          receivers: [zipkin]
          # List your exporter here.
          exporters: [<your-exporter-here>,logging]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  labels:
    app: opencesus
    component: otel-collector
spec:
  ports:
  - name: zipkin # Default endpoint for Zipkin receiver.
    port: 9411
    protocol: TCP
    targetPort: 9411
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  replicas: 1  # scale out based on your usage
  selector:
    matchLabels:
      app: opentelemetry
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
      - name: otel-collector
        image: otel/opentelemetry-collector-contrib-dev:latest
        command:
          - "/otelcontribcol"
          - "--config=/conf/otel-collector-config.yaml"
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 400Mi
        ports:
          - containerPort: 9411 # Default endpoint for Zipkin receiver.
        volumeMounts:
          - name: otel-collector-config-vol
            mountPath: /conf
        livenessProbe:
          httpGet:
            path: /
            port: 13133
        readinessProbe:
          httpGet:
            path: /
            port: 13133
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
