Initial observability docs (concept + howto) (#431)

* add config yaml

* initial doc for observability

* Add links

* Update README.md

* Update logs.md

* Update setup-azure-monitor.md

* Update setup-fluentd-es-kibana.md

* Adding how-to set up Prometheus and Grafana documentation and images.

* deleting .PNG files

* -adding .png files

Co-authored-by: Mark Fussell <mfussell@microsoft.com>
Co-authored-by: Shalabh Mohan Shrivastava <shalabhms@gmail.com>
This commit is contained in:
Young Bu Park 2020-03-12 08:23:27 -07:00 committed by GitHub
parent 4f0c3b45b3
commit fdb5d93815
19 changed files with 856 additions and 0 deletions

View File

@ -0,0 +1,36 @@
# Observability
Observability is a term from control theory. Observability means you can answer questions about what's happening on the inside of a system by observing the outside of the system, without having to ship new code to answer new questions. Observability is critical in production environments and services to debug, operate, and monitor Dapr system services, components, and user applications.
The observability capabilities enable users to monitor the Dapr system services and their interaction with user applications, and to understand how these monitored services behave. The observability capabilities are divided into three main areas:
* **[Metrics](./metrics.md)**: are the series of measured values and counts that are collected and stored over time. Dapr metrics provide monitoring and understanding of the behavior of Dapr system services and user apps. For example, the service metrics between Dapr sidecars and user apps show call latency, traffic failures, error rates of requests, etc. Dapr system services metrics show sidecar injection failures, health of the system services including CPU usage, number of actor placements made, etc.
* **[Logs](./logs.md)**: are records of events that occur and that can be used to determine failures or other status. Log events contain warning, error, info, and debug messages produced by Dapr system services. Each log event includes metadata such as message type, hostname, component name, Dapr app ID, IP address, etc.
* **[Distributed tracing](./traces.md)**: is used to profile and monitor Dapr system services and user apps. Distributed tracing helps pinpoint where failures occur and what causes poor performance. Distributed tracing is particularly well-suited to debugging and monitoring distributed software architectures, such as microservices. You can use distributed tracing to help debug and optimize application code. Distributed tracing contains trace spans between the Dapr runtime, Dapr system services, and user apps across process, node, network, and security boundaries. It provides a detailed understanding of service invocations (call flows) and service dependencies.
## Implementation Status
The table below shows the current status of each of the observability capabilities for the Dapr runtime and system services. N/A means not applicable.
| | Runtime | Operator | Injector | Placement | Sentry|
|---------|---------|----------|----------|-----------|--------|
|Metrics | Yes | Yes | Yes | Yes | Yes |
|Tracing | Yes | N/A | N/A | *Planned* | N/A |
|Logs | Yes | Yes | Yes | Yes | Yes |
## Supported monitoring tools
The observability tools listed below are ones that have been tested to work with Dapr.
### Metrics
* [Prometheus + Grafana](../../howto/observe-metrics-with-prometheus/README.md)
* [Azure Monitor](../../howto/setup-monitoring-tools/setup-azure-monitor.md)
### Logs
* [Fluentd + Elastic Search + Kibana](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md)
* [Azure Monitor](../../howto/setup-monitoring-tools/setup-azure-monitor.md)
### Traces
* [Zipkin](../../howto/diagnose-with-tracing/zipkin.md)
* [Application Insights](../../howto/diagnose-with-tracing/azure-monitor.md)

View File

@ -0,0 +1,95 @@
# Logs
Dapr produces structured logs to stdout, either as plain text or in JSON format. By default, all Dapr processes (runtime and system services) write to console out in plain text. To enable JSON-formatted logs, you need to add the `--log-as-json` command flag when running Dapr processes.
If you want to use a search engine such as Elasticsearch or Azure Monitor to search the logs, it is recommended to use JSON-formatted logs, which the log collector and search engine can parse using their built-in JSON parsers.
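For example, in self-hosted mode the same flag can be passed when launching an app through the Dapr CLI. This is a minimal sketch; the app ID, port, and launch command below are placeholders for your own application.
```bash
# Run an app with its Dapr sidecar emitting JSON-formatted logs.
# "myapp", port 3000, and "node app.js" are placeholders for your own app.
dapr run --app-id myapp --app-port 3000 --log-as-json node app.js
```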
## Log schema
Dapr produces logs based on the following schema.
| Field | Description | Example |
|-------|-------------------|---------|
| time | ISO8601 Timestamp | `2011-10-05T14:48:00.000Z` |
| level | Log Level (info/warn/debug/error) | `info` |
| type | Log Type | `log` |
| msg | Log Message | `hello dapr!` |
| scope | Logging Scope | `dapr.runtime` |
| instance | Container Name | `dapr-pod-xxxxx` |
| app_id | Dapr App ID | `dapr-app` |
| ver | Dapr Runtime Version | `0.5.0` |
## Plain text and JSON formatted logs
* Plain text log examples
```bash
time="2020-03-11T17:08:48.303776-07:00" level=info msg="starting Dapr Runtime -- version 0.5.0-rc.2 -- commit v0.3.0-rc.0-155-g5dfcf2e" instance=dapr-pod-xxxx scope=dapr.runtime type=log ver=0.5.0-rc.2
time="2020-03-11T17:08:48.303913-07:00" level=info msg="log level set to: info" instance=dapr-pod-xxxx scope=dapr.runtime type=log ver=0.5.0-rc.2
```
* JSON formatted log examples
```json
{"instance":"dapr-pod-xxxx","level":"info","msg":"starting Dapr Runtime -- version 0.5.0-rc.2 -- commit v0.3.0-rc.0-155-g5dfcf2e","scope":"dapr.runtime","time":"2020-03-11T17:09:45.788005Z","type":"log","ver":"0.5.0-rc.2"}
{"instance":"dapr-pod-xxxx","level":"info","msg":"log level set to: info","scope":"dapr.runtime","time":"2020-03-11T17:09:45.788075Z","type":"log","ver":"0.5.0-rc.2"}
```
## Configuring plain-text or JSON-formatted logs
Dapr supports both plain-text and JSON-formatted logs. The default format is plain text. If you use plain-text logs, you do not need to change any configuration options.
To use JSON-formatted logs, you need to add additional configuration when you install Dapr and deploy your app. The recommendation is to use JSON-formatted logs because most log collectors and search engines can parse JSON more easily with built-in parsers.
## Configuring log format in Kubernetes
The following steps describe how to configure JSON-formatted logs for Kubernetes.
### Install Dapr to your cluster using the Helm chart
You can enable JSON-formatted logs for the Dapr system services by adding the `--set global.LogAsJSON=true` option to the Helm command.
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
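To confirm the setting took effect, you can inspect the logs of one of the Dapr system services; each line should now be a JSON object. The deployment name below assumes a default Helm install into the `dapr-system` namespace.
```bash
# Print the first few log lines of the Dapr operator; each should be JSON.
kubectl logs deployment/dapr-operator -n dapr-system | head -n 5
```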
### Enable JSON-formatted logs for Dapr sidecars
You can enable JSON-formatted logs in Dapr sidecars activated by the Dapr sidecar-injector service by adding the `dapr.io/log-as-json: "true"` annotation to the deployment.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
...
```
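Once the deployment is applied, a quick way to verify that the sidecar emits JSON is to read its container logs. The label selector below assumes the `app: python` label from the example above; the sidecar container injected by Dapr is named `daprd`.
```bash
# Show the last few sidecar log lines for the example deployment.
kubectl logs -l app=python -c daprd --tail=5
```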
## Log collectors
If you run Dapr in a Kubernetes cluster, [Fluentd](https://www.fluentd.org/) is a popular container log collector. You can use Fluentd with a [JSON parser plugin](https://docs.fluentd.org/parser/json) to parse Dapr JSON-formatted logs. This [how-to](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md) shows how to configure Fluentd in your cluster.
If you are using Azure Kubernetes Service, you can use the default OMS Agent to collect logs with Azure Monitor without needing to install Fluentd.
## Search engines
If you use [Fluentd](https://www.fluentd.org/), we recommend using Elasticsearch and Kibana. This [how-to](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md) shows how to set up Elasticsearch and Kibana in your Kubernetes cluster.
If you are using Azure Kubernetes Service, you can use [Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-overview) without installing any additional monitoring tools. Also read [How to enable Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-onboard).
## References
- [How-to: Set up Fluentd, Elasticsearch, and Kibana](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md)
- [How-to: Set up Azure Monitor in Azure Kubernetes Service](../../howto/setup-monitoring-tools/setup-azure-monitor.md)

View File

@ -0,0 +1,79 @@
kind: ConfigMap
apiVersion: v1
data:
schema-version:
#string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
v1
config-version:
#string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
ver1
log-data-collection-settings: |-
# Log data collection settings
# Any errors related to config map settings can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
[log_collection_settings]
[log_collection_settings.stdout]
# In the absense of this configmap, default value for enabled is true
enabled = true
# exclude_namespaces setting holds good only if enabled is set to true
# kube-system log collection is disabled by default in the absence of 'log_collection_settings.stdout' setting. If you want to enable kube-system, remove it from the following setting.
# If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
# In the absense of this configmap, default value for exclude_namespaces = ["kube-system"]
exclude_namespaces = ["kube-system"]
[log_collection_settings.stderr]
# Default value for enabled is true
enabled = true
# exclude_namespaces setting holds good only if enabled is set to true
# kube-system log collection is disabled by default in the absence of 'log_collection_settings.stderr' setting. If you want to enable kube-system, remove it from the following setting.
# If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
# In the absense of this cofigmap, default value for exclude_namespaces = ["kube-system"]
exclude_namespaces = ["kube-system"]
[log_collection_settings.env_var]
# In the absense of this configmap, default value for enabled is true
enabled = true
[log_collection_settings.enrich_container_logs]
# In the absense of this configmap, default value for enrich_container_logs is false
enabled = true
# When this is enabled (enabled = true), every container log entry (both stdout & stderr) will be enriched with container Name & container Image
[log_collection_settings.collect_all_kube_events]
# In the absense of this configmap, default value for collect_all_kube_events is false
# When the setting is set to false, only the kube events with !normal event type will be collected
enabled = false
# When this is enabled (enabled = true), all kube events including normal events will be collected
prometheus-data-collection-settings: |-
# Custom Prometheus metrics data collection settings
[prometheus_data_collection_settings.cluster]
# Cluster level scrape endpoint(s). These metrics will be scraped from agent's Replicaset (singleton)
# Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
#Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
interval = "1m"
## Uncomment the following settings with valid string arrays for prometheus scraping
#fieldpass = ["metric_to_pass1", "metric_to_pass12"]
#fielddrop = ["metric_to_drop"]
# An array of urls to scrape metrics from.
# urls = ["http://myurl:9101/metrics"]
# An array of Kubernetes services to scrape metrics from.
# kubernetes_services = ["http://my-service-dns.my-namespace:9102/metrics"]
# When monitor_kubernetes_pods = true, replicaset will scrape Kubernetes pods for the following prometheus annotations:
# - prometheus.io/scrape: Enable scraping for this pod
# - prometheus.io/scheme: If the metrics endpoint is secured then you will need to
# set this to `https` & most likely set the tls config.
# - prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
# - prometheus.io/port: If port is not 9102 use this annotation
monitor_kubernetes_pods = true
## Restricts Kubernetes monitoring to namespaces for pods that have annotations set and are scraped using the monitor_kubernetes_pods setting.
## This will take effect when monitor_kubernetes_pods is set to true
## ex: monitor_kubernetes_pods_namespaces = ["default1", "default2", "default3"]
monitor_kubernetes_pods_namespaces = ["dapr-system", "default"]
[prometheus_data_collection_settings.node]
# Node level scrape endpoint(s). These metrics will be scraped from agent's DaemonSet running in every node in the cluster
# Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
#Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
interval = "1m"
## Uncomment the following settings with valid string arrays for prometheus scraping
# An array of urls to scrape metrics from. $NODE_IP (all upper case) will substitute of running Node's IP address
# urls = ["http://$NODE_IP:9103/metrics"]
#fieldpass = ["metric_to_pass1", "metric_to_pass12"]
#fielddrop = ["metric_to_drop"]
metadata:
name: container-azm-ms-agentconfig
namespace: kube-system

View File

@ -0,0 +1,105 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentd-config
namespace: kube-system
data:
fluent.conf: |
<match fluent.**>
@type null
</match>
<match kubernetes.var.log.containers.**fluentd**.log>
@type null
</match>
<match kubernetes.var.log.containers.**kube-system**.log>
@type null
</match>
<match kubernetes.var.log.containers.**kibana**.log>
@type null
</match>
<source>
@type tail
path /var/log/containers/*.log
pos_file fluentd-docker.pos
time_format %Y-%m-%dT%H:%M:%S
tag kubernetes.*
<parse>
@type multi_format
<pattern>
format json
time_key time
time_type string
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
keep_time_key false
</pattern>
<pattern>
format regexp
expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
time_format '%Y-%m-%dT%H:%M:%S.%N%:z'
keep_time_key false
</pattern>
</parse>
</source>
<filter kubernetes.**>
@type kubernetes_metadata
@id filter_kube_metadata
</filter>
<filter kubernetes.var.log.containers.**>
@type parser
<parse>
@type json
format json
time_key time
time_type string
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
keep_time_key false
</parse>
key_name log
replace_invalid_sequence true
emit_invalid_record_to_error true
reserve_data true
</filter>
<match **>
@type elasticsearch
@id out_es
@log_level info
include_tag_key true
host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
user "#{ENV['FLUENT_ELASTICSEARCH_USER'] || use_default}"
password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD'] || use_default}"
reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'false'}"
logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'dapr'}"
logstash_dateformat "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_DATEFORMAT'] || '%Y.%m.%d'}"
logstash_format "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_FORMAT'] || 'true'}"
index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'dapr'}"
type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
include_timestamp "#{ENV['FLUENT_ELASTICSEARCH_INCLUDE_TIMESTAMP'] || 'false'}"
template_name "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_NAME'] || use_nil}"
template_file "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_FILE'] || use_nil}"
template_overwrite "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_OVERWRITE'] || use_default}"
sniffer_class_name "#{ENV['FLUENT_SNIFFER_CLASS_NAME'] || 'Fluent::Plugin::ElasticsearchSimpleSniffer'}"
request_timeout "#{ENV['FLUENT_ELASTICSEARCH_REQUEST_TIMEOUT'] || '5s'}"
<buffer>
flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
retry_forever true
</buffer>
</match>

View File

@ -0,0 +1,99 @@
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: fluentd
namespace: kube-system
rules:
- apiGroups:
- ""
resources:
- pods
- namespaces
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: fluentd
roleRef:
kind: ClusterRole
name: fluentd
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: fluentd
namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: kube-system
labels:
k8s-app: fluentd-logging
version: v1
spec:
selector:
matchLabels:
k8s-app: fluentd-logging
version: v1
template:
metadata:
labels:
k8s-app: fluentd-logging
version: v1
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.9.2-debian-elasticsearch7-1.0
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch-master.dapr-monitoring"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENT_UID
value: "0"
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: fluentd-config
mountPath: /fluentd/etc
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: fluentd-config
configMap:
name: fluentd-config

(11 binary image files added: screenshots referenced by the how-to guides below.)

View File

@ -0,0 +1,133 @@
# Set up Azure Monitor to search logs and collect metrics for Dapr
This document describes how to enable Dapr metrics and logs with Azure Monitor for Azure Kubernetes Service (AKS).
## Prerequisites
- [Azure Kubernetes Service](https://docs.microsoft.com/en-us/azure/aks/)
- [Enable Azure Monitor For containers in AKS](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-overview)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
## Contents
- [Enable Prometheus metric scrape using config map](#enable-prometheus-metric-scrape-using-config-map)
- [Install Dapr with JSON formatted logs](#install-dapr-with-json-formatted-logs)
- [Search metrics and logs with Azure Monitor](#search-metrics-and-logs-with-azure-monitor)
## Enable Prometheus metric scrape using config map
1. Make sure that the omsagent pods are running
```bash
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
...
omsagent-75qjs 1/1 Running 1 44h
omsagent-c7c4t 1/1 Running 0 44h
omsagent-rs-74f488997c-dshpx 1/1 Running 1 44h
omsagent-smtk7 1/1 Running 1 44h
...
```
2. Apply config map to enable the Prometheus metrics endpoint scrape.
You can use [azm-config-map.yaml](./azm-config-map.yaml) to enable the Prometheus metrics endpoint scrape.
If you installed Dapr to a different namespace, you need to change the `monitor_kubernetes_pods_namespaces` array value. For example:
```yaml
...
prometheus-data-collection-settings: |-
  [prometheus_data_collection_settings.cluster]
    interval = "1m"
    monitor_kubernetes_pods = true
    monitor_kubernetes_pods_namespaces = ["dapr-system", "default"]
  [prometheus_data_collection_settings.node]
    interval = "1m"
...
```
Apply the config map:
```bash
kubectl apply -f ./azm-config-map.yaml
```
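You can verify that the config map was created; the omsagent pods should pick up the new settings and restart automatically after a few minutes.
```bash
# Confirm the agent config map exists in the kube-system namespace.
kubectl get configmap container-azm-ms-agentconfig -n kube-system
```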
## Install Dapr with JSON formatted logs
1. Install Dapr with JSON-formatted logs enabled
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
2. Enable JSON-formatted logs in the Dapr sidecar and add Prometheus annotations.
> Note: The OMS Agent scrapes the metrics only if the pods have the Prometheus annotations.
Add the `dapr.io/log-as-json: "true"` annotation to your deployment yaml.
Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
        prometheus.io/path: "/"
...
```
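After deploying, you can optionally verify that the annotations made it onto the pod. The `app=python` label comes from the example above.
```bash
# Print the annotations of the first pod that matches the example's label.
kubectl get pods -l app=python -o jsonpath='{.items[0].metadata.annotations}'
```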
## Search metrics and logs with Azure Monitor
1. Go to Azure Monitor
2. Search Dapr logs
Here is an example query that parses JSON-formatted logs and queries logs from Dapr system processes.
```
ContainerLog
| extend parsed=parse_json(LogEntry)
| project Time=todatetime(parsed['time']), app_id=parsed['app_id'], scope=parsed['scope'],level=parsed['level'], msg=parsed['msg'], type=parsed['type'], ver=parsed['ver'], instance=parsed['instance']
| where level != ""
| sort by Time
```
3. Search metrics
This query queries the `process_resident_memory_bytes` Prometheus metric for Dapr system processes and renders a time chart.
```
InsightsMetrics
| where Namespace == "prometheus" and Name == "process_resident_memory_bytes"
| extend tags=parse_json(Tags)
| project TimeGenerated, Name, Val, app=tostring(tags['app'])
| summarize memInBytes=percentile(Val, 99) by bin(TimeGenerated, 1m), app
| where app startswith "dapr-"
| render timechart
```
# References
* [Configure scraping of Prometheus metrics with Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-prometheus-integration)
* [Configure agent data collection for Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-agent-config)
* [Azure Monitor Query](https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/query-language)

View File

@ -0,0 +1,173 @@
# Set up Fluentd, Elasticsearch, and Kibana in Kubernetes
This document describes how to install Fluentd, Elasticsearch, and Kibana to search logs in Kubernetes.
## Prerequisites
- Kubernetes (> 1.14)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
## Contents
- [Install Elasticsearch and Kibana](#install-elasticsearch-and-kibana)
- [Install Fluentd](#install-fluentd)
- [Install Dapr with JSON formatted logs](#install-dapr-with-json-formatted-logs)
- [Search logs](#search-logs)
## Install Elasticsearch and Kibana
1. Create a namespace for monitoring tools
```bash
kubectl create namespace dapr-monitoring
```
2. Add Elastic helm repo
```bash
helm repo add elastic https://helm.elastic.co
helm repo update
```
3. Install Elasticsearch using Helm
```bash
helm install elasticsearch elastic/elasticsearch -n dapr-monitoring
```
If you are using minikube or want to disable persistent volumes for development purposes, you can disable them by using the following command instead.
```bash
helm install elasticsearch elastic/elasticsearch -n dapr-monitoring --set persistence.enabled=false --set replicas=1
```
4. Install Kibana
```bash
helm install kibana elastic/kibana -n dapr-monitoring
```
5. Validation
Ensure Elasticsearch and Kibana are running in your Kubernetes cluster.
```bash
kubectl get pods -n dapr-monitoring
NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 1/1 Running 0 6m58s
kibana-kibana-95bc54b89-zqdrk 1/1 Running 0 4m21s
```
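If you want to double-check that Elasticsearch is reachable before installing Fluentd, you can port-forward to it and query the cluster health API. This is an optional sanity check; the service name matches the pod listing above.
```bash
# In one terminal: forward the Elasticsearch HTTP port locally.
kubectl port-forward svc/elasticsearch-master 9200:9200 -n dapr-monitoring
# In another terminal: query the cluster health API.
curl "http://localhost:9200/_cluster/health?pretty"
```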
## Install Fluentd
1. Install the config map and Fluentd as a DaemonSet
> Note: If you already have Fluentd running in your cluster, enable the nested JSON parser so that it can parse JSON-formatted logs from Dapr.
```bash
kubectl apply -f ./fluentd-config-map.yaml
kubectl apply -f ./fluentd-dapr-with-rbac.yaml
```
2. Ensure that Fluentd is running as a DaemonSet
```bash
kubectl get pods -n kube-system -w
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-cxjxk 1/1 Running 0 4m41s
coredns-6955765f44-jlskv 1/1 Running 0 4m41s
etcd-m01 1/1 Running 0 4m48s
fluentd-sdrld 1/1 Running 0 14s
```
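You can also check the Fluentd logs to confirm that it connected to Elasticsearch; any connection errors to `elasticsearch-master.dapr-monitoring` would show up here. The DaemonSet name comes from the manifest applied above.
```bash
# Show recent output from one of the Fluentd DaemonSet pods.
kubectl logs daemonset/fluentd -n kube-system --tail=20
```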
## Install Dapr with JSON formatted logs
1. Install Dapr with JSON-formatted logs enabled
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
2. Enable JSON-formatted logs in the Dapr sidecar
Add `dapr.io/log-as-json: "true"` annotation to your deployment yaml.
Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
...
```
## Search logs
> Note: Elasticsearch takes some time to index the logs that Fluentd sends.
1. Port-forward to svc/kibana-kibana
```
$ kubectl port-forward svc/kibana-kibana 5601 -n dapr-monitoring
Forwarding from 127.0.0.1:5601 -> 5601
Forwarding from [::1]:5601 -> 5601
Handling connection for 5601
Handling connection for 5601
```
2. Browse `http://localhost:5601`
3. Click Management -> Index Management
![kibana management](./img/kibana-1.png)
4. Wait until `dapr-*` is indexed.
![index log](./img/kibana-2.png)
5. Once `dapr-*` is indexed, click Kibana -> Index Patterns and then Create Index Pattern
![create index pattern](./img/kibana-3.png)
6. Define the index pattern by typing `dapr*` in the index pattern field
![define index pattern](./img/kibana-4.png)
7. Select the time stamp field: `@timestamp`
![timestamp](./img/kibana-5.png)
8. Confirm that `scope`, `type`, `app_id`, `level`, etc. are being indexed.
> Note: If you cannot find the indexed fields, please wait; the time it takes depends on the volume of data and the size of the resources where Elasticsearch is running.
![indexing](./img/kibana-6.png)
9. Click the `Discover` icon and search for `scope:*`
> Note: It can take some time for logs to become searchable, depending on the data volume and resources.
![discover](./img/kibana-7.png)
# References
* [Fluentd for Kubernetes](https://docs.fluentd.org/v/0.12/articles/kubernetes-fluentd)
* [Elasticsearch Helm chart](https://github.com/elastic/helm-charts/tree/master/elasticsearch)
* [Kibana Helm chart](https://github.com/elastic/helm-charts/tree/master/kibana)
* [Kibana Query Language](https://www.elastic.co/guide/en/kibana/current/kuery-query.html)

View File

@ -0,0 +1,136 @@
# Set up Prometheus and Grafana
This document shows how to install Prometheus and Grafana to view metrics.
## Prerequisites
- Kubernetes (> 1.14)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
## Contents
- [Install Prometheus and Grafana](#install-prometheus-and-grafana)
- [View metrics](#view-metrics)
## Install Prometheus and Grafana
1. Create a namespace for monitoring tools
```bash
kubectl create namespace dapr-monitoring
```
2. Install Prometheus
```bash
helm install dapr-prom stable/prometheus -n dapr-monitoring
```
If you are using minikube or want to disable persistent volumes for development purposes, you can disable them by using the following command instead.
```bash
helm install dapr-prom stable/prometheus -n dapr-monitoring \
    --set alertmanager.persistentVolume.enabled=false \
    --set pushgateway.persistentVolume.enabled=false \
    --set server.persistentVolume.enabled=false
```
3. Install Grafana
```bash
helm install grafana stable/grafana -n dapr-monitoring
```
If you are using minikube or want to disable persistent volumes for development purposes, you can disable them by using the following command instead.
```bash
helm install grafana stable/grafana -n dapr-monitoring --set persistence.enabled=false
```
4. Validation
Ensure Prometheus and Grafana are running in your cluster.
```bash
kubectl get pods -n dapr-monitoring
NAME READY STATUS RESTARTS AGE
dapr-prom-kube-state-metrics-9849d6cc6-t94p8 1/1 Running 0 4m58s
dapr-prom-prometheus-alertmanager-749cc46f6-9b5t8 2/2 Running 0 4m58s
dapr-prom-prometheus-node-exporter-5jh8p 1/1 Running 0 4m58s
dapr-prom-prometheus-node-exporter-88gbg 1/1 Running 0 4m58s
dapr-prom-prometheus-node-exporter-bjp9f 1/1 Running 0 4m58s
dapr-prom-prometheus-pushgateway-688665d597-h4xx2 1/1 Running 0 4m58s
dapr-prom-prometheus-server-694fd8d7c-q5d59 2/2 Running 0 4m58s
grafana-c49889cff-x56vj 1/1 Running 0 5m10s
```
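Before configuring Grafana, you can optionally confirm that Prometheus is already scraping Dapr metrics by port-forwarding to the Prometheus server and calling its query API. The service name below is the one created by this guide's install; `process_resident_memory_bytes` is one of the metrics exposed by the Dapr processes.
```bash
# In one terminal: forward the Prometheus server's HTTP port (the service listens on port 80).
kubectl port-forward svc/dapr-prom-prometheus-server 9090:80 -n dapr-monitoring
# In another terminal: query a metric exposed by the Dapr processes to confirm scraping works.
curl "http://localhost:9090/api/v1/query?query=process_resident_memory_bytes"
```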
## View metrics
1. Port-forward to svc/grafana
```
$ kubectl port-forward svc/grafana 8080:80 -n dapr-monitoring
Forwarding from 127.0.0.1:8080 -> 3000
Forwarding from [::1]:8080 -> 3000
Handling connection for 8080
Handling connection for 8080
```
2. Browse `http://localhost:8080`
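> Note: If Grafana prompts you to sign in, the `stable/grafana` chart stores a generated password for the `admin` user in a secret named after the release. Assuming the release name `grafana` used above, it can be retrieved like this:
```bash
# Decode the generated Grafana admin password.
kubectl get secret grafana -n dapr-monitoring -o jsonpath="{.data.admin-password}" | base64 --decode && echo
```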
3. Click Configuration Settings -> Data Sources
![data source](./img/grafana-datasources.png)
4. Add Prometheus as a data source.
![add data source](./img/grafana-datasources.png)
5. Enter the Prometheus server address in your cluster.
You can get the Prometheus server address by running the following command.
```bash
kubectl get svc -n dapr-monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dapr-prom-kube-state-metrics ClusterIP 10.0.174.177 <none> 8080/TCP 7d9h
dapr-prom-prometheus-alertmanager ClusterIP 10.0.255.199 <none> 80/TCP 7d9h
dapr-prom-prometheus-node-exporter ClusterIP None <none> 9100/TCP 7d9h
dapr-prom-prometheus-pushgateway ClusterIP 10.0.190.59 <none> 9091/TCP 7d9h
dapr-prom-prometheus-server ClusterIP 10.0.172.191 <none> 80/TCP 7d9h
elasticsearch-master ClusterIP 10.0.36.146 <none> 9200/TCP,9300/TCP 7d10h
elasticsearch-master-headless ClusterIP None <none> 9200/TCP,9300/TCP 7d10h
grafana ClusterIP 10.0.15.229 <none> 80/TCP 5d5h
kibana-kibana ClusterIP 10.0.188.224 <none> 5601/TCP 7d10h
```
In this setup tutorial, the Prometheus server is `dapr-prom-prometheus-server`.
So you need to provide `http://dapr-prom-prometheus-server.dapr-monitoring` in the URL field.
![prometheus server](./img/grafana-prometheus-server-url.png)
6. Click the Save & Test button to verify that the connection succeeded.
7. Import Dapr dashboards.
You can now import built-in [Grafana dashboard templates](https://github.com/dapr/docs/tree/master/monitoring/grafana/dashboards).
Refer [here](https://github.com/dapr/docs/tree/master/monitoring/grafana) for details.
![upload json](./img/grafana-uploadjson.png)
You can find screenshots of Dapr dashboards [here](https://github.com/dapr/docs/tree/master/monitoring/grafana/img).
# References
* [Prometheus Installation](https://github.com/helm/charts/tree/master/stable/prometheus-operator)
* [Prometheus on Kubernetes](https://github.com/coreos/kube-prometheus)
* [Prometheus Kubernetes Operator](https://github.com/helm/charts/tree/master/stable/prometheus-operator)
* [Prometheus Query Language](https://prometheus.io/docs/prometheus/latest/querying/basics/)