Initial observability docs (concept + howto) (#431)
* add config yaml
* initial doc for observability
* Add links
* Update README.md
* Update logs.md
* Update setup-azure-monitor.md
* Update setup-fluentd-es-kibana.md
* Adding how-to set up Prometheus and Grafana documentation and images
* deleting .PNG files
* adding .png files

Co-authored-by: Mark Fussell <mfussell@microsoft.com>
Co-authored-by: Shalabh Mohan Shrivastava <shalabhms@gmail.com>
# Observability
|
||||
|
||||
Observability is a term from control theory. Observability means you can answer questions about what’s happening on the inside of a system by observing the outside of the system, without having to ship new code to answer new questions. Observability is critical in production environments and services to debug, operate, and monitor Dapr system services, components, and user applications.
|
||||
|
||||
The observability capabilities enable users to monitor the Dapr system services and their interaction with user applications, and to understand how these monitored services behave. The observability capabilities are divided into three main areas:
|
||||
|
||||
* **[Metrics](./metrics.md)**: are the series of measured values and counts that are collected and stored over time. Dapr metrics provide monitoring of and insight into the behavior of Dapr system services and user apps. For example, the service metrics between Dapr sidecars and user apps show call latency, traffic failures, error rates of requests, etc. Dapr system services metrics show sidecar injection failures, the health of the system services (including CPU usage), the number of actor placements made, etc.
|
||||
* **[Logs](./logs.md)**: are records of events that occur and that can be used to determine failures or other status. Log events contain warning, error, info, and debug messages produced by Dapr system services. Each log event includes metadata such as message type, hostname, component name, Dapr app ID, IP address, etc.
|
||||
* **[Distributed tracing](./traces.md)**: is used to profile and monitor Dapr system services and user apps. Distributed tracing helps pinpoint where failures occur and what causes poor performance. Distributed tracing is particularly well-suited to debugging and monitoring distributed software architectures, such as microservices. You can use distributed tracing to help debug and optimize application code. Distributed tracing contains trace spans between the Dapr runtime, Dapr system services, and user apps across process, nodes, network, and security boundaries. It provides a detailed understanding of service invocations (call flows) and service dependencies.
|
||||
|
||||
## Implementation Status
|
||||
The table below shows the current status of each of the observability capabilities for the Dapr runtime and system services. N/A means not applicable.
|
||||
|
||||
| | Runtime | Operator | Injector | Placement | Sentry |
|---------|---------|----------|----------|-----------|--------|
| Metrics | Yes | Yes | Yes | Yes | Yes |
| Tracing | Yes | N/A | N/A | *Planned* | N/A |
| Logs | Yes | Yes | Yes | Yes | Yes |
|
||||
|
||||
## Supported monitoring tools
|
||||
The observability tools listed below have been tested to work with Dapr.
|
||||
|
||||
### Metrics
|
||||
|
||||
* [Prometheus + Grafana](../../howto/observe-metrics-with-prometheus/README.md)
* [Azure Monitor](../../howto/setup-monitoring-tools/setup-azure-monitor.md)
|
||||
|
||||
### Logs
|
||||
|
||||
* [Fluentd + Elasticsearch + Kibana](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md)
* [Azure Monitor](../../howto/setup-monitoring-tools/setup-azure-monitor.md)
|
||||
|
||||
### Traces
|
||||
|
||||
* [Zipkin](../../howto/diagnose-with-tracing/zipkin.md)
* [Application Insights](../../howto/diagnose-with-tracing/azure-monitor.md)
|
# Logs
|
||||
|
||||
Dapr produces structured logs to stdout, either as plain text or JSON formatted. By default, all Dapr processes (runtime and system services) write to the console in plain text. To enable JSON formatted logs, add the `--log-as-json` command flag when running Dapr processes.
|
||||
|
||||
If you want to use a search engine such as Elasticsearch or Azure Monitor to search the logs, it is recommended to use JSON formatted logs, which the log collector and search engine can parse using their built-in JSON parsers.
|
||||
|
||||
## Log schema
|
||||
|
||||
Dapr produces logs based on the following schema.
|
||||
|
||||
| Field | Description | Example |
|-------|-------------------|---------|
| time | ISO8601 Timestamp | `2011-10-05T14:48:00.000Z` |
| level | Log Level (info/warn/debug/error) | `info` |
| type | Log Type | `log` |
| msg | Log Message | `hello dapr!` |
| scope | Logging Scope | `dapr.runtime` |
| instance | Container Name | `dapr-pod-xxxxx` |
| app_id | Dapr App ID | `dapr-app` |
| ver | Dapr Runtime Version | `0.5.0` |
|
||||
|
||||
## Plain text and JSON formatted logs
|
||||
|
||||
* Plain text log examples
|
||||
```bash
time="2020-03-11T17:08:48.303776-07:00" level=info msg="starting Dapr Runtime -- version 0.5.0-rc.2 -- commit v0.3.0-rc.0-155-g5dfcf2e" instance=dapr-pod-xxxx scope=dapr.runtime type=log ver=0.5.0-rc.2
time="2020-03-11T17:08:48.303913-07:00" level=info msg="log level set to: info" instance=dapr-pod-xxxx scope=dapr.runtime type=log ver=0.5.0-rc.2
```
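Plain-text lines like these need custom parsing before they can be queried. A minimal sketch of pulling the key/value pairs out of such a line (the regex is an illustration, not an official parser):

```python
import re

# A plain-text Dapr log line, taken from the example above.
line = ('time="2020-03-11T17:08:48.303913-07:00" level=info '
        'msg="log level set to: info" instance=dapr-pod-xxxx '
        'scope=dapr.runtime type=log ver=0.5.0-rc.2')

# Match key="quoted value" or key=bare-value pairs.
pairs = re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line)
fields = {key: quoted or bare for key, quoted, bare in pairs}

print(fields["level"], "-", fields["msg"])  # info - log level set to: info
```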
|
||||
|
||||
* JSON formatted log examples
|
||||
```json
{"instance":"dapr-pod-xxxx","level":"info","msg":"starting Dapr Runtime -- version 0.5.0-rc.2 -- commit v0.3.0-rc.0-155-g5dfcf2e","scope":"dapr.runtime","time":"2020-03-11T17:09:45.788005Z","type":"log","ver":"0.5.0-rc.2"}
{"instance":"dapr-pod-xxxx","level":"info","msg":"log level set to: info","scope":"dapr.runtime","time":"2020-03-11T17:09:45.788075Z","type":"log","ver":"0.5.0-rc.2"}
```
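With `--log-as-json` enabled, each line is a complete JSON object, so any JSON parser recovers the schema fields directly. A small illustration using one of the lines above:

```python
import json

# A JSON formatted Dapr log line from the example above.
line = ('{"instance":"dapr-pod-xxxx","level":"info","msg":"log level set to: info",'
        '"scope":"dapr.runtime","time":"2020-03-11T17:09:45.788075Z","type":"log",'
        '"ver":"0.5.0-rc.2"}')

entry = json.loads(line)
# The keys map one-to-one onto the log schema table above.
print(entry["time"], entry["level"], entry["scope"], entry["msg"])
```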
|
||||
|
||||
## Configuring plain text or JSON formatted logs
|
||||
|
||||
Dapr supports both plain text and JSON formatted logs. The default format is plain text. If you want to use plain text with a search engine, you do not need to change any configuration options.
|
||||
|
||||
To use JSON formatted logs, you need to add additional configuration when you install Dapr and deploy your app. The recommendation is to use JSON formatted logs because most log collectors and search engines can parse JSON more easily with built-in parsers.
|
||||
|
||||
## Configuring log format in Kubernetes
|
||||
The following steps describe how to configure JSON formatted logs for Kubernetes.
|
||||
|
||||
### Install Dapr to your cluster using the Helm chart
|
||||
|
||||
You can enable JSON formatted logs for Dapr system services by adding the `--set global.LogAsJSON=true` option to the Helm command.
|
||||
|
||||
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
|
||||
|
||||
### Enable JSON formatted logs for Dapr sidecars
|
||||
|
||||
You can enable JSON-formatted logs in Dapr sidecars activated by the Dapr sidecar-injector service by adding the `dapr.io/log-as-json: "true"` annotation to the deployment.
|
||||
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
...
```
|
||||
|
||||
## Log collectors
|
||||
|
||||
If you run Dapr in a Kubernetes cluster, [Fluentd](https://www.fluentd.org/) is a popular container log collector. You can use Fluentd with a [JSON parser plugin](https://docs.fluentd.org/parser/json) to parse Dapr JSON formatted logs. This [how-to](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md) shows how to configure Fluentd in your cluster.
|
||||
|
||||
If you are using the Azure Kubernetes Service, you can use the default OMS Agent to collect logs with Azure Monitor without needing to install Fluentd.
|
||||
|
||||
## Search engines
|
||||
|
||||
If you use [Fluentd](https://www.fluentd.org/), we recommend using Elasticsearch and Kibana. This [how-to](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md) shows how to set up Elasticsearch and Kibana in your Kubernetes cluster.
|
||||
|
||||
If you are using Azure Kubernetes Service, you can use [Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-overview) without installing any additional monitoring tools. Also read [How to enable Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-onboard).
|
||||
|
||||
## References
|
||||
|
||||
- [How-to: Set up Fluentd, Elasticsearch, and Kibana](../../howto/setup-monitoring-tools/setup-fluentd-es-kibana.md)
|
||||
- [How-to: Set up Azure Monitor in Azure Kubernetes Service](../../howto/setup-monitoring-tools/setup-azure-monitor.md)
|
kind: ConfigMap
|
||||
apiVersion: v1
|
||||
data:
|
||||
schema-version:
|
||||
#string.used by agent to parse config. supported versions are {v1}. Configs with other schema versions will be rejected by the agent.
|
||||
v1
|
||||
config-version:
|
||||
#string.used by customer to keep track of this config file's version in their source control/repository (max allowed 10 chars, other chars will be truncated)
|
||||
ver1
|
||||
log-data-collection-settings: |-
|
||||
# Log data collection settings
|
||||
# Any errors related to config map settings can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
|
||||
[log_collection_settings]
|
||||
[log_collection_settings.stdout]
|
||||
# In the absence of this configmap, default value for enabled is true
|
||||
enabled = true
|
||||
# exclude_namespaces setting holds good only if enabled is set to true
|
||||
# kube-system log collection is disabled by default in the absence of 'log_collection_settings.stdout' setting. If you want to enable kube-system, remove it from the following setting.
|
||||
# If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
|
||||
# In the absence of this configmap, default value for exclude_namespaces = ["kube-system"]
|
||||
exclude_namespaces = ["kube-system"]
|
||||
[log_collection_settings.stderr]
|
||||
# Default value for enabled is true
|
||||
enabled = true
|
||||
# exclude_namespaces setting holds good only if enabled is set to true
|
||||
# kube-system log collection is disabled by default in the absence of 'log_collection_settings.stderr' setting. If you want to enable kube-system, remove it from the following setting.
|
||||
# If you want to continue to disable kube-system log collection keep this namespace in the following setting and add any other namespace you want to disable log collection to the array.
|
||||
# In the absence of this configmap, default value for exclude_namespaces = ["kube-system"]
|
||||
exclude_namespaces = ["kube-system"]
|
||||
[log_collection_settings.env_var]
|
||||
# In the absence of this configmap, default value for enabled is true
|
||||
enabled = true
|
||||
[log_collection_settings.enrich_container_logs]
|
||||
# In the absence of this configmap, default value for enrich_container_logs is false
|
||||
enabled = true
|
||||
# When this is enabled (enabled = true), every container log entry (both stdout & stderr) will be enriched with container Name & container Image
|
||||
[log_collection_settings.collect_all_kube_events]
|
||||
# In the absence of this configmap, default value for collect_all_kube_events is false
|
||||
# When the setting is set to false, only the kube events with !normal event type will be collected
|
||||
enabled = false
|
||||
# When this is enabled (enabled = true), all kube events including normal events will be collected
|
||||
prometheus-data-collection-settings: |-
|
||||
# Custom Prometheus metrics data collection settings
|
||||
[prometheus_data_collection_settings.cluster]
|
||||
# Cluster level scrape endpoint(s). These metrics will be scraped from agent's Replicaset (singleton)
|
||||
# Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
|
||||
#Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
|
||||
interval = "1m"
|
||||
## Uncomment the following settings with valid string arrays for prometheus scraping
|
||||
#fieldpass = ["metric_to_pass1", "metric_to_pass12"]
|
||||
#fielddrop = ["metric_to_drop"]
|
||||
# An array of urls to scrape metrics from.
|
||||
# urls = ["http://myurl:9101/metrics"]
|
||||
# An array of Kubernetes services to scrape metrics from.
|
||||
# kubernetes_services = ["http://my-service-dns.my-namespace:9102/metrics"]
|
||||
# When monitor_kubernetes_pods = true, replicaset will scrape Kubernetes pods for the following prometheus annotations:
|
||||
# - prometheus.io/scrape: Enable scraping for this pod
|
||||
# - prometheus.io/scheme: If the metrics endpoint is secured then you will need to
|
||||
# set this to `https` & most likely set the tls config.
|
||||
# - prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
|
||||
# - prometheus.io/port: If port is not 9102 use this annotation
|
||||
monitor_kubernetes_pods = true
|
||||
## Restricts Kubernetes monitoring to namespaces for pods that have annotations set and are scraped using the monitor_kubernetes_pods setting.
|
||||
## This will take effect when monitor_kubernetes_pods is set to true
|
||||
## ex: monitor_kubernetes_pods_namespaces = ["default1", "default2", "default3"]
|
||||
monitor_kubernetes_pods_namespaces = ["dapr-system", "default"]
|
||||
[prometheus_data_collection_settings.node]
|
||||
# Node level scrape endpoint(s). These metrics will be scraped from agent's DaemonSet running in every node in the cluster
|
||||
# Any errors related to prometheus scraping can be found in the KubeMonAgentEvents table in the Log Analytics workspace that the cluster is sending data to.
|
||||
#Interval specifying how often to scrape for metrics. This is duration of time and can be specified for supporting settings by combining an integer value and time unit as a string value. Valid time units are ns, us (or µs), ms, s, m, h.
|
||||
interval = "1m"
|
||||
## Uncomment the following settings with valid string arrays for prometheus scraping
|
||||
# An array of urls to scrape metrics from. $NODE_IP (all upper case) will substitute of running Node's IP address
|
||||
# urls = ["http://$NODE_IP:9103/metrics"]
|
||||
#fieldpass = ["metric_to_pass1", "metric_to_pass12"]
|
||||
#fielddrop = ["metric_to_drop"]
|
||||
metadata:
|
||||
name: container-azm-ms-agentconfig
|
||||
namespace: kube-system
|
apiVersion: v1
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
name: fluentd-config
|
||||
namespace: kube-system
|
||||
data:
|
||||
fluent.conf: |
|
||||
<match fluent.**>
|
||||
@type null
|
||||
</match>
|
||||
|
||||
<match kubernetes.var.log.containers.**fluentd**.log>
|
||||
@type null
|
||||
</match>
|
||||
|
||||
<match kubernetes.var.log.containers.**kube-system**.log>
|
||||
@type null
|
||||
</match>
|
||||
|
||||
<match kubernetes.var.log.containers.**kibana**.log>
|
||||
@type null
|
||||
</match>
|
||||
|
||||
<source>
|
||||
@type tail
|
||||
path /var/log/containers/*.log
|
||||
pos_file fluentd-docker.pos
|
||||
time_format %Y-%m-%dT%H:%M:%S
|
||||
tag kubernetes.*
|
||||
<parse>
|
||||
@type multi_format
|
||||
<pattern>
|
||||
format json
|
||||
time_key time
|
||||
time_type string
|
||||
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
|
||||
keep_time_key false
|
||||
</pattern>
|
||||
<pattern>
|
||||
format regexp
|
||||
expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
|
||||
time_format '%Y-%m-%dT%H:%M:%S.%N%:z'
|
||||
keep_time_key false
|
||||
</pattern>
|
||||
</parse>
|
||||
</source>
|
||||
|
||||
<filter kubernetes.**>
|
||||
@type kubernetes_metadata
|
||||
@id filter_kube_metadata
|
||||
</filter>
|
||||
|
||||
<filter kubernetes.var.log.containers.**>
|
||||
@type parser
|
||||
<parse>
|
||||
@type json
|
||||
format json
|
||||
time_key time
|
||||
time_type string
|
||||
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
|
||||
keep_time_key false
|
||||
</parse>
|
||||
key_name log
|
||||
replace_invalid_sequence true
|
||||
emit_invalid_record_to_error true
|
||||
reserve_data true
|
||||
</filter>
|
||||
|
||||
<match **>
|
||||
@type elasticsearch
|
||||
@id out_es
|
||||
@log_level info
|
||||
include_tag_key true
|
||||
host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
|
||||
port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
|
||||
path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
|
||||
scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
|
||||
ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
|
||||
ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
|
||||
user "#{ENV['FLUENT_ELASTICSEARCH_USER'] || use_default}"
|
||||
password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD'] || use_default}"
|
||||
reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
|
||||
reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
|
||||
reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
|
||||
log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'false'}"
|
||||
logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'dapr'}"
|
||||
logstash_dateformat "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_DATEFORMAT'] || '%Y.%m.%d'}"
|
||||
logstash_format "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_FORMAT'] || 'true'}"
|
||||
index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'dapr'}"
|
||||
type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
|
||||
include_timestamp "#{ENV['FLUENT_ELASTICSEARCH_INCLUDE_TIMESTAMP'] || 'false'}"
|
||||
template_name "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_NAME'] || use_nil}"
|
||||
template_file "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_FILE'] || use_nil}"
|
||||
template_overwrite "#{ENV['FLUENT_ELASTICSEARCH_TEMPLATE_OVERWRITE'] || use_default}"
|
||||
sniffer_class_name "#{ENV['FLUENT_SNIFFER_CLASS_NAME'] || 'Fluent::Plugin::ElasticsearchSimpleSniffer'}"
|
||||
request_timeout "#{ENV['FLUENT_ELASTICSEARCH_REQUEST_TIMEOUT'] || '5s'}"
|
||||
<buffer>
|
||||
flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
|
||||
flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
|
||||
chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
|
||||
queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
|
||||
retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
|
||||
retry_forever true
|
||||
</buffer>
|
||||
</match>
|
---
|
||||
apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
name: fluentd
|
||||
namespace: kube-system
|
||||
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
||||
kind: ClusterRole
|
||||
metadata:
|
||||
name: fluentd
|
||||
namespace: kube-system
|
||||
rules:
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- pods
|
||||
- namespaces
|
||||
verbs:
|
||||
- get
|
||||
- list
|
||||
- watch
|
||||
|
||||
---
|
||||
kind: ClusterRoleBinding
|
||||
apiVersion: rbac.authorization.k8s.io/v1beta1
|
||||
metadata:
|
||||
name: fluentd
|
||||
roleRef:
|
||||
kind: ClusterRole
|
||||
name: fluentd
|
||||
apiGroup: rbac.authorization.k8s.io
|
||||
subjects:
|
||||
- kind: ServiceAccount
|
||||
name: fluentd
|
||||
namespace: kube-system
|
||||
---
|
||||
apiVersion: apps/v1
|
||||
kind: DaemonSet
|
||||
metadata:
|
||||
name: fluentd
|
||||
namespace: kube-system
|
||||
labels:
|
||||
k8s-app: fluentd-logging
|
||||
version: v1
|
||||
spec:
|
||||
selector:
|
||||
matchLabels:
|
||||
k8s-app: fluentd-logging
|
||||
version: v1
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
k8s-app: fluentd-logging
|
||||
version: v1
|
||||
spec:
|
||||
serviceAccount: fluentd
|
||||
serviceAccountName: fluentd
|
||||
tolerations:
|
||||
- key: node-role.kubernetes.io/master
|
||||
effect: NoSchedule
|
||||
containers:
|
||||
- name: fluentd
|
||||
image: fluent/fluentd-kubernetes-daemonset:v1.9.2-debian-elasticsearch7-1.0
|
||||
env:
|
||||
- name: FLUENT_ELASTICSEARCH_HOST
|
||||
value: "elasticsearch-master.dapr-monitoring"
|
||||
- name: FLUENT_ELASTICSEARCH_PORT
|
||||
value: "9200"
|
||||
- name: FLUENT_ELASTICSEARCH_SCHEME
|
||||
value: "http"
|
||||
- name: FLUENT_UID
|
||||
value: "0"
|
||||
resources:
|
||||
limits:
|
||||
memory: 200Mi
|
||||
requests:
|
||||
cpu: 100m
|
||||
memory: 200Mi
|
||||
volumeMounts:
|
||||
- name: varlog
|
||||
mountPath: /var/log
|
||||
- name: varlibdockercontainers
|
||||
mountPath: /var/lib/docker/containers
|
||||
readOnly: true
|
||||
- name: fluentd-config
|
||||
mountPath: /fluentd/etc
|
||||
terminationGracePeriodSeconds: 30
|
||||
volumes:
|
||||
- name: varlog
|
||||
hostPath:
|
||||
path: /var/log
|
||||
- name: varlibdockercontainers
|
||||
hostPath:
|
||||
path: /var/lib/docker/containers
|
||||
- name: fluentd-config
|
||||
configMap:
|
||||
name: fluentd-config
|
# Set up Azure Monitor to search logs and collect metrics for Dapr
|
||||
|
||||
This document describes how to enable Dapr metrics and logs with Azure Monitor for Azure Kubernetes Service (AKS).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- [Azure Kubernetes Service](https://docs.microsoft.com/en-us/azure/aks/)
- [Enable Azure Monitor For containers in AKS](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-overview)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
|
||||
|
||||
## Contents
|
||||
|
||||
- [Enable Prometheus metric scrape using config map](#enable-prometheus-metric-scrape-using-config-map)
- [Install Dapr with JSON formatted logs](#install-dapr-with-json-formatted-logs)
- [Search metrics and logs with Azure Monitor](#search-metrics-and-logs-with-azure-monitor)
|
||||
|
||||
## Enable Prometheus metric scrape using config map
|
||||
|
||||
1. Make sure that the omsagent pods are running:
|
||||
|
||||
```bash
$ kubectl get pods -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
...
omsagent-75qjs                 1/1     Running   1          44h
omsagent-c7c4t                 1/1     Running   0          44h
omsagent-rs-74f488997c-dshpx   1/1     Running   1          44h
omsagent-smtk7                 1/1     Running   1          44h
...
```
|
||||
|
||||
2. Apply config map to enable Prometheus metrics endpoint scrape.
|
||||
|
||||
You can use [azm-config-map.yaml](./azm-config-map.yaml) to enable the Prometheus metrics endpoint scrape.
|
||||
|
||||
If you installed Dapr to a different namespace, you need to change the `monitor_kubernetes_pods_namespaces` array values. For example:
|
||||
|
||||
```yaml
...
prometheus-data-collection-settings: |-
  [prometheus_data_collection_settings.cluster]
    interval = "1m"
    monitor_kubernetes_pods = true
    monitor_kubernetes_pods_namespaces = ["dapr-system", "default"]
  [prometheus_data_collection_settings.node]
    interval = "1m"
...
```
|
||||
|
||||
Apply the config map:

```bash
kubectl apply -f ./azm-config-map.yaml
```
|
||||
|
||||
## Install Dapr with JSON formatted logs
|
||||
|
||||
1. Install Dapr with JSON formatted logs enabled:
|
||||
|
||||
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
|
||||
|
||||
2. Enable JSON formatted logs in the Dapr sidecar and add Prometheus annotations.
|
||||
|
||||
> Note: The OMS Agent scrapes the metrics only if the replicaset has Prometheus annotations.
|
||||
|
||||
Add `dapr.io/log-as-json: "true"` annotation to your deployment yaml.
|
||||
|
||||
Example:
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
        prometheus.io/path: "/"

...
```
|
||||
|
||||
## Search metrics and logs with Azure Monitor
|
||||
|
||||
1. Go to Azure Monitor
|
||||
|
||||
2. Search Dapr logs
|
||||
|
||||
Here is an example query to parse JSON formatted logs and query logs from Dapr system processes.
|
||||
|
||||
```
ContainerLog
| extend parsed=parse_json(LogEntry)
| project Time=todatetime(parsed['time']), app_id=parsed['app_id'], scope=parsed['scope'], level=parsed['level'], msg=parsed['msg'], type=parsed['type'], ver=parsed['ver'], instance=parsed['instance']
| where level != ""
| sort by Time
```
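To make what the query does concrete, here is the same transformation sketched in Python over made-up `ContainerLog` rows (the row shape and all values are assumptions for illustration, not real Azure Monitor data):

```python
import json

# Simulated ContainerLog rows: the raw Dapr log line sits in the LogEntry column.
rows = [
    {"LogEntry": '{"time":"2020-03-11T17:09:45Z","app_id":"pythonapp",'
                 '"scope":"dapr.runtime","level":"info","msg":"started",'
                 '"type":"log","ver":"0.5.0","instance":"dapr-pod-xxxx"}'},
    {"LogEntry": "a plain text line that is not JSON"},
]

# Mirrors: extend parsed=parse_json(LogEntry) | project ... | where level != ""
projected = []
for row in rows:
    try:
        parsed = json.loads(row["LogEntry"])
    except ValueError:  # non-JSON rows yield no fields, like parse_json
        parsed = {}
    if parsed.get("level"):
        projected.append({k: parsed.get(k)
                          for k in ("time", "app_id", "scope", "level", "msg")})

print(projected)  # only the JSON row with a non-empty level survives
```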
|
||||
|
||||
3. Search metrics
|
||||
|
||||
This query retrieves the `process_resident_memory_bytes` Prometheus metric for Dapr system processes and renders a time chart.
|
||||
|
||||
```
InsightsMetrics
| where Namespace == "prometheus" and Name == "process_resident_memory_bytes"
| extend tags=parse_json(Tags)
| project TimeGenerated, Name, Val, app=tostring(tags['app'])
| summarize memInBytes=percentile(Val, 99) by bin(TimeGenerated, 1m), app
| where app startswith "dapr-"
| render timechart
```
|
||||
|
||||
## References
|
||||
|
||||
* [Configure scraping of Prometheus metrics with Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-prometheus-integration)
* [Configure agent data collection for Azure Monitor for containers](https://docs.microsoft.com/en-us/azure/azure-monitor/insights/container-insights-agent-config)
* [Azure Monitor Query](https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/query-language)
|
# Set up Fluentd, Elasticsearch, and Kibana in Kubernetes
|
||||
|
||||
This document describes how to install Fluentd, Elasticsearch, and Kibana to search logs in Kubernetes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Kubernetes (> 1.14)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)
|
||||
|
||||
## Contents
|
||||
|
||||
- [Install Elasticsearch and Kibana](#install-elasticsearch-and-kibana)
- [Install Fluentd](#install-fluentd)
- [Install Dapr with JSON formatted logs](#install-dapr-with-json-formatted-logs)
- [Search logs](#search-logs)
|
||||
|
||||
## Install Elasticsearch and Kibana
|
||||
|
||||
1. Create a namespace for monitoring tools
|
||||
|
||||
```bash
kubectl create namespace dapr-monitoring
```
|
||||
|
||||
2. Add the Elastic Helm repo
|
||||
|
||||
```bash
helm repo add elastic https://helm.elastic.co
helm repo update
```
|
||||
|
||||
3. Install Elasticsearch using Helm
|
||||
|
||||
```bash
helm install elasticsearch elastic/elasticsearch -n dapr-monitoring
```
|
||||
|
||||
If you are using minikube or want to disable persistent volumes for development purposes, you can disable them with the following command:

```bash
helm install elasticsearch elastic/elasticsearch -n dapr-monitoring --set persistence.enabled=false --set replicas=1
```
|
||||
|
||||
4. Install Kibana
|
||||
|
||||
```bash
helm install kibana elastic/kibana -n dapr-monitoring
```
|
||||
|
||||
5. Validation
|
||||
|
||||
Ensure that Elasticsearch and Kibana are running in your Kubernetes cluster:
|
||||
|
||||
```bash
kubectl get pods -n dapr-monitoring
NAME                            READY   STATUS    RESTARTS   AGE
elasticsearch-master-0          1/1     Running   0          6m58s
kibana-kibana-95bc54b89-zqdrk   1/1     Running   0          4m21s
```
|
||||
|
||||
## Install Fluentd
|
||||
|
||||
1. Install the config map and Fluentd as a daemonset
|
||||
|
||||
> Note: If you are already running Fluentd in your cluster, please enable the nested JSON parser so that it can parse JSON formatted logs from Dapr.
|
||||
|
||||
```bash
kubectl apply -f ./fluentd-config-map.yaml
kubectl apply -f ./fluentd-dapr-with-rbac.yaml
```
|
||||
|
||||
2. Ensure that Fluentd is running as a daemonset
|
||||
|
||||
```bash
kubectl get pods -n kube-system -w
NAME                       READY   STATUS    RESTARTS   AGE
coredns-6955765f44-cxjxk   1/1     Running   0          4m41s
coredns-6955765f44-jlskv   1/1     Running   0          4m41s
etcd-m01                   1/1     Running   0          4m48s
fluentd-sdrld              1/1     Running   0          14s
```
|
||||
|
||||
|
||||
## Install Dapr with JSON formatted logs
|
||||
|
||||
1. Install Dapr with JSON formatted logs enabled:
|
||||
|
||||
```bash
helm install dapr dapr/dapr --namespace dapr-system --set global.LogAsJSON=true
```
|
||||
|
||||
2. Enable JSON formatted logs in the Dapr sidecar
|
||||
|
||||
Add `dapr.io/log-as-json: "true"` annotation to your deployment yaml.
|
||||
|
||||
Example:
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pythonapp
  labels:
    app: python
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python
  template:
    metadata:
      labels:
        app: python
      annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "pythonapp"
        dapr.io/log-as-json: "true"
...
```
|
||||
|
||||
## Search logs
|
||||
|
||||
> Note: Elasticsearch takes some time to index the logs that Fluentd sends.
|
||||
|
||||
1. Port-forward to svc/kibana-kibana
|
||||
|
||||
```bash
$ kubectl port-forward svc/kibana-kibana 5601 -n dapr-monitoring
Forwarding from 127.0.0.1:5601 -> 5601
Forwarding from [::1]:5601 -> 5601
Handling connection for 5601
Handling connection for 5601
```
|
||||
|
||||
2. Browse `http://localhost:5601`
|
||||
|
||||
3. Click Management -> Index Management
|
||||
|
||||

|
||||
|
||||
4. Wait until dapr-* is indexed.
|
||||
|
||||

|
||||
|
||||
5. Once dapr-* is indexed, click Kibana -> Index Patterns and then Create Index Pattern
|
||||
|
||||

|
||||
|
||||
6. Define the index pattern: type `dapr*` in the index pattern field
|
||||
|
||||

|
||||
|
||||
7. Select the time field `@timestamp`
|
||||
|
||||

|
||||
|
||||
8. Confirm that `scope`, `type`, `app_id`, `level`, etc. are being indexed.
|
||||
|
||||
> Note: If you cannot find an indexed field, please wait; indexing time depends on the volume of data and the size of the resources where Elasticsearch is running.
|
||||
|
||||

|
||||
|
||||
9. Click `discover` icon and search `scope:*`
|
||||
|
||||
> Note: it would take some time to make log searchable based on the data volume and resource.
|
||||
|
||||


# References

* [Fluentd for Kubernetes](https://docs.fluentd.org/v/0.12/articles/kubernetes-fluentd)
* [Elastic search helm chart](https://github.com/elastic/helm-charts/tree/master/elasticsearch)
* [Kibana helm chart](https://github.com/elastic/helm-charts/tree/master/kibana)
* [Kibana Query Language](https://www.elastic.co/guide/en/kibana/current/kuery-query.html)

# Set up Prometheus and Grafana

This document shows how to install Prometheus and Grafana to view Dapr metrics.

## Prerequisites

- Kubernetes (> 1.14)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Helm 3](https://helm.sh/)

## Contents

- [Install Prometheus and Grafana](#install-prometheus-and-grafana)
- [View metrics](#view-metrics)

## Install Prometheus and Grafana

1. Create a namespace for the monitoring tools:

```bash
kubectl create namespace dapr-monitoring
```

2. Install Prometheus:

```bash
helm install dapr-prom stable/prometheus -n dapr-monitoring
```

If you are a Minikube user, or want to disable persistent volumes for development purposes, you can do so with the following command:

```bash
helm install dapr-prom stable/prometheus -n dapr-monitoring \
  --set alertmanager.persistentVolume.enabled=false \
  --set pushgateway.persistentVolume.enabled=false \
  --set server.persistentVolume.enabled=false
```

3. Install Grafana:

```bash
helm install grafana stable/grafana -n dapr-monitoring
```

If you are a Minikube user, or want to disable persistent volumes for development purposes, you can do so with the following command:

```bash
helm install grafana stable/grafana -n dapr-monitoring --set persistence.enabled=false
```

4. Validate the installation.

Ensure Prometheus and Grafana are running in your cluster:

```bash
kubectl get pods -n dapr-monitoring

NAME                                                READY   STATUS    RESTARTS   AGE
dapr-prom-kube-state-metrics-9849d6cc6-t94p8        1/1     Running   0          4m58s
dapr-prom-prometheus-alertmanager-749cc46f6-9b5t8   2/2     Running   0          4m58s
dapr-prom-prometheus-node-exporter-5jh8p            1/1     Running   0          4m58s
dapr-prom-prometheus-node-exporter-88gbg            1/1     Running   0          4m58s
dapr-prom-prometheus-node-exporter-bjp9f            1/1     Running   0          4m58s
dapr-prom-prometheus-pushgateway-688665d597-h4xx2   1/1     Running   0          4m58s
dapr-prom-prometheus-server-694fd8d7c-q5d59         2/2     Running   0          4m58s
grafana-c49889cff-x56vj                             1/1     Running   0          5m10s
```

## View metrics

1. Port-forward to svc/grafana

```
$ kubectl port-forward svc/grafana 8080:80 -n dapr-monitoring
Forwarding from 127.0.0.1:8080 -> 3000
Forwarding from [::1]:8080 -> 3000
Handling connection for 8080
Handling connection for 8080
```

2. Browse to `http://localhost:8080`

3. Click Configuration Settings -> Data Sources

![data source](./img/grafana-datasources.png)

4. Add Prometheus as a data source.

![add data source](./img/grafana-datasources.png)

5. Enter the Prometheus server address in your cluster.

You can get the Prometheus server address by running the following command:

```bash
kubectl get svc -n dapr-monitoring

NAME                                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
dapr-prom-kube-state-metrics         ClusterIP   10.0.174.177   <none>        8080/TCP            7d9h
dapr-prom-prometheus-alertmanager    ClusterIP   10.0.255.199   <none>        80/TCP              7d9h
dapr-prom-prometheus-node-exporter   ClusterIP   None           <none>        9100/TCP            7d9h
dapr-prom-prometheus-pushgateway     ClusterIP   10.0.190.59    <none>        9091/TCP            7d9h
dapr-prom-prometheus-server          ClusterIP   10.0.172.191   <none>        80/TCP              7d9h
elasticsearch-master                 ClusterIP   10.0.36.146    <none>        9200/TCP,9300/TCP   7d10h
elasticsearch-master-headless        ClusterIP   None           <none>        9200/TCP,9300/TCP   7d10h
grafana                              ClusterIP   10.0.15.229    <none>        80/TCP              5d5h
kibana-kibana                        ClusterIP   10.0.188.224   <none>        5601/TCP            7d10h
```

In this tutorial, the server is `dapr-prom-prometheus-server`, so enter `http://dapr-prom-prometheus-server.dapr-monitoring` in the URL field.

![prometheus server address](./img/grafana-prometheus-server-url.png)

6. Click the Save & Test button to verify that the connection succeeded.

7. Import Dapr dashboards.

You can now import the built-in [Grafana dashboard templates](https://github.com/dapr/docs/tree/master/monitoring/grafana/dashboards).

Refer [here](https://github.com/dapr/docs/tree/master/monitoring/grafana) for details.

![grafana dashboard](./img/system-service-dashboard.png)

You can find screenshots of the Dapr dashboards [here](https://github.com/dapr/docs/tree/master/monitoring/grafana/img).
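
The dashboards are built from PromQL queries. If you want to experiment with queries directly, port-forward to `svc/dapr-prom-prometheus-server` and use the Prometheus expression browser. The metric names below are the standard process metrics exposed by Prometheus client libraries; Dapr-specific metric names depend on the Dapr version, so browse a scrape target's `/metrics` endpoint for the full list:

```
# Per-process CPU usage (in cores), averaged over the last 5 minutes
rate(process_cpu_seconds_total[5m])

# Resident memory of each scraped process, in bytes
process_resident_memory_bytes
```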

# References

* [Prometheus Installation](https://github.com/helm/charts/tree/master/stable/prometheus-operator)
* [Prometheus on Kubernetes](https://github.com/coreos/kube-prometheus)
* [Prometheus Kubernetes Operator](https://github.com/helm/charts/tree/master/stable/prometheus-operator)
* [Prometheus Query Language](https://prometheus.io/docs/prometheus/latest/querying/basics/)