Add Stackdriver as alternative logging destination (#936)

* add google_cloud output plugin

* refactored structure

* modified readme

* add readme files and rename fluent-es

* leave the fluentd sidecar image unchanged

* address comments for first round

* add comment for dev and prod configs

* rename remaining fluent-es

* config for sidecar and daemonset may be different

* remove servicemonitor for fluentd-ds

* change fluentd-es to fluentd-ds

* change stackdriver logging viewer url

* explicitly specify the port for forwarding

* rename config/monitoring:everything to config/monitoring:everything-es

* rename elafros to knative

* renamed

* fix a conflict

* changed doc

* rename

* remove space

* rename

* fix typo and remove BUILD and METADATA files

* add the LICENSE file back

* add the LICENSE file back
Yanwei Guo 2018-06-06 16:29:08 -07:00 committed by GitHub
parent ff2a6276b9
commit a7d6a930b0
2 changed files with 161 additions and 19 deletions


@@ -0,0 +1,82 @@
# Setting Up A Logging Plugin
Knative allows cluster operators to use different backends for their logging
needs. This document describes how to change these settings. Knative currently
requires changes in Fluentd configuration files; however, we plan to abstract
logging configuration in the future
([#906](https://github.com/knative/serving/issues/906)). Once
[#906](https://github.com/knative/serving/issues/906) is complete, the
methodology described in this document will no longer be valid and migration to
a new process will be required. To minimize the effort of a future migration,
we recommend changing only the Fluentd output configuration and leaving the
rest intact.
## Configuring
### Configure the DaemonSet for stdout/stderr logs
Operators can follow these steps to configure the Fluentd DaemonSet that
collects `stdout/stderr` logs from the containers:
1. Replace the `900.output.conf` section in
[fluentd-configmap.yaml](/config/monitoring/fluentd-configmap.yaml) with the
desired output configuration. Knative provides samples for sending logs to
Elasticsearch or Stackdriver; operators can choose one of the `150-*` samples
under [/config/monitoring](/config/monitoring) or supply their own output
configuration (a minimal sketch is shown after this list).
1. Replace the `image` field of the `fluentd-ds` container
in [fluentd-ds.yaml](/third_party/config/monitoring/common/fluentd/fluentd-ds.yaml)
with a Fluentd image that includes the desired Fluentd output plugin.
See [here](/image/fluentd/README.md) for the requirements that Knative places
on the Fluentd image.
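As a rough illustration only (the `150-*` samples under
[/config/monitoring](/config/monitoring) are the authoritative versions), a
Stackdriver-bound output section might contain little more than a match rule
for the `google_cloud` output type, assuming the Fluentd image bundles the
`fluent-plugin-google-cloud` plugin:
```
# Illustrative sketch of a 900.output.conf replacement -- not the shipped sample.
<match **>
  @type google_cloud               # output type provided by fluent-plugin-google-cloud
  buffer_type file                 # buffer to disk so log delivery survives Fluentd restarts
  buffer_path /var/log/fluentd-buffers/stackdriver.buffer
</match>
```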
### Configure the Sidecar for log files under /var/log
Currently operators have to configure the Fluentd sidecar separately to
collect log files under `/var/log`. An
[effort](https://github.com/knative/serving/issues/818)
is in progress to remove the sidecar. The steps to configure it are:
1. Replace the `logging.fluentd-sidecar-output-config` flag in
[elaconfig](/config/elaconfig.yaml) with the desired output configuration
(see the illustrative fragment after this list). **NOTE**: The Fluentd
DaemonSet runs in the `monitoring` namespace while the Fluentd sidecar runs in
the same namespace as the app, so the DaemonSet and sidecar configurations may
differ slightly even when the desired backend is the same.
1. Replace the `logging.fluentd-sidecar-image` flag in
[elaconfig](/config/elaconfig.yaml) with a Fluentd image that includes the
desired Fluentd output plugin. In theory, this is the same image as the one
used for the Fluentd DaemonSet.
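For orientation, a fragment of `elaconfig.yaml` covering these two flags might
look roughly like the following, assuming the file defines a ConfigMap; the
image value is a placeholder, not a published image:
```yaml
# Illustrative fragment only -- see config/elaconfig.yaml for the real file.
data:
  logging.fluentd-sidecar-output-config: |
    # desired Fluentd <match> output configuration for the sidecar goes here
  logging.fluentd-sidecar-image: gcr.io/your-project/fluentd-with-your-plugin:latest
```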
## Deploying
Operators need to redeploy the Knative components after changing the configuration:
```shell
# Delete the controller first so that re-applying it below picks up the new
# configuration, even if the controller code has not changed
bazel run config:controller.delete
# Deploy the configuration for sidecar
kubectl apply -f config/elaconfig.yaml
# Deploy the controller to make configuration for sidecar take effect
bazel run config:controller.apply
# Deploy the DaemonSet to make configuration for DaemonSet take effect
kubectl apply -f <the-fluentd-config-for-daemonset> \
-f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
-f config/monitoring/200-common/100-fluentd.yaml \
-f config/monitoring/200-common/100-istio.yaml
```
In the commands above, replace `<the-fluentd-config-for-daemonset>` with the
Fluentd DaemonSet configuration file, e.g. `config/monitoring/150-stackdriver-prod`.
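For example, with the Stackdriver sample configuration the DaemonSet step above becomes:
```shell
# Same command as above, with the placeholder replaced by the Stackdriver sample config
kubectl apply -f config/monitoring/150-stackdriver-prod \
    -f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
    -f config/monitoring/200-common/100-fluentd.yaml \
    -f config/monitoring/200-common/100-istio.yaml
```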
**NOTE**: Operators sometimes need to deploy extra services as the logging
backends. For example, if they desire Elasticsearch & Kibana, they have to deploy
the Elasticsearch and Kibana services. Knative provides this sample:
```shell
kubectl apply -R -f third_party/config/monitoring/elasticsearch
```
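After applying, a quick sanity check is to confirm that the logging pods in the
`monitoring` namespace reach the `Running` state:
```shell
# Elasticsearch, Kibana and the Fluentd DaemonSet pods should all end up Running
kubectl get pods -n monitoring
```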
See [here](/config/monitoring/README.md) for instructions on deploying all of
the Knative monitoring components.


@@ -1,54 +1,108 @@
# Logs and metrics
## Monitoring Components Setup
First, deploy monitoring components.
### Elasticsearch, Kibana, Prometheus & Grafana Setup
You can use two different setups:
1. **150-elasticsearch-prod**: This configuration collects logs & metrics from user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-prod \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-elasticsearch-dev**: This configuration collects everything in (1) plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-dev \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
### Stackdriver (logs), Prometheus & Grafana Setup
If your Knative Serving does not run on a GCP-based cluster, or you want to send logs to
another GCP project, you need to build your own Fluentd image and modify the
configuration first. See:
1. [Fluentd image on Knative Serving](/image/fluentd/README.md)
2. [Setting up a logging plugin](setting-up-a-logging-plugin.md)
Then you can use two different setups:
1. **150-stackdriver-prod**: This configuration collects logs & metrics from user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-prod \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-stackdriver-dev**: This configuration collects everything in (1) plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-dev \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
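Whichever setup you choose, you can confirm that the log and metric collectors
came up by listing the DaemonSets and pods in the `monitoring` namespace:
```shell
kubectl get daemonsets -n monitoring
kubectl get pods -n monitoring
```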
## Accessing logs
### Elasticsearch & Kibana
Run:
```shell
kubectl proxy
```
Then open Kibana UI at this [link](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana)
(*it might take a couple of minutes for the proxy to work*).
When Kibana is opened for the first time, it will ask you to create an index; accept the default options as is. As more logs are ingested,
new fields will be discovered. To have them indexed, go to Management -> Index Patterns -> Refresh button (top right) -> Refresh fields.
#### Accessing configuration and revision logs
To access logs for a configuration, use the following search term in the Kibana UI:
```
kubernetes.labels.knative_dev\/configuration: "configuration-example"
```
Replace `configuration-example` with your configuration's name.
To access logs for a revision, use the following search term in the Kibana UI:
```
kubernetes.labels.knative_dev\/revision: "configuration-example-00001"
```
Replace `configuration-example-00001` with your revision's name.
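If you are unsure of the exact names, configurations and revisions can be
listed with `kubectl` (assuming the Knative Serving CRDs are installed in your cluster):
```shell
kubectl get configurations
kubectl get revisions
```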
#### Accessing build logs
To access logs for a build, use the following search term in the Kibana UI:
```
kubernetes.labels.build\-name: "test-build"
```
Replace `test-build` with your build's name. A build's name is specified in its YAML file as follows:
```yaml
apiVersion: build.dev/v1alpha1
kind: Build
@@ -56,6 +110,11 @@ metadata:
name: test-build
```
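Similarly, existing builds can usually be listed with `kubectl` (assuming the
Knative Build CRD is installed):
```shell
kubectl get builds
```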
### Stackdriver
Go to the [GCP Console logging page](https://console.cloud.google.com/logs/viewer) for
the GCP project that stores your logs via Stackdriver.
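Logs can also be read from the command line with the `gcloud` CLI; the filter
below is only illustrative, since the exact resource type and labels depend on
how your cluster writes entries to Stackdriver:
```shell
# Read the ten most recent container log entries; replace your-gcp-project with your project ID
gcloud logging read 'resource.type="container"' --limit 10 --project your-gcp-project
```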
## Accessing metrics
Run:
@@ -65,14 +124,16 @@ kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=a
```
Then open Grafana UI at [http://localhost:3000](http://localhost:3000). The following dashboards are pre-installed with Knative Serving:
* **Revision HTTP Requests:** HTTP request count, latency and size metrics per revision and per configuration
* **Nodes:** CPU, memory, network and disk metrics at node level
* **Pods:** CPU, memory and network metrics at pod level
* **Deployment:** CPU, memory and network metrics aggregated at deployment level
* **Istio, Mixer and Pilot:** Detailed Istio mesh, Mixer and Pilot metrics
* **Kubernetes:** Dashboards giving insights into cluster health, deployments and capacity usage
### Accessing per request traces
First, open the Kibana UI as shown above. Browse to Management -> Index Patterns -> +Create Index Pattern, type "zipkin*" (without the quotes) into the "Index pattern" text field and hit the "Create" button. This will create a new index pattern that will store per-request traces captured by Zipkin. This is a one time step and is needed only for fresh installations.
Next, start the proxy if it is not already running:
@@ -86,6 +147,7 @@ Then open Zipkin UI at this [link](http://localhost:8001/api/v1/namespaces/istio
To see a demo of distributed tracing, deploy the [Telemetry sample](../sample/telemetrysample/README.md), send some traffic to it and explore the traces it generates from Zipkin UI.
## Default metrics
The following metrics are collected by default:
* Knative Serving controller metrics
* Istio metrics (mixer, envoy and pilot)
@@ -94,17 +156,14 @@ Following metrics are collected by default:
There are several other collectors that are pre-configured but not enabled. To see the full list, browse to the `config/monitoring/prometheus-exporter` and `config/monitoring/prometheus-servicemonitor` folders, and deploy the ones you need using `kubectl apply -f`.
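For example, enabling one of these pre-configured collectors is a single apply of the corresponding folder:
```shell
kubectl apply -f config/monitoring/prometheus-exporter
```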
## Default logs
The deployment above enables collection of the following logs:
* stdout & stderr from all `user-container` containers
* stdout & stderr from the build controller
To enable log collection from other containers and destinations, see
[setting up a logging plugin](setting-up-a-logging-plugin.md).
## Metrics troubleshooting
You can use the Prometheus web UI to troubleshoot publishing and service discovery issues for metrics.
@@ -120,13 +179,14 @@ Then browse to http://localhost:9090 to access the UI:
## Generating metrics
If you want to send metrics from your controller, follow the steps below.
These steps are already applied to the autoscaler and controller. For those controllers,
simply add your new metric definitions to the `view`, create new `tag.Key`s if necessary and
instrument your code as described in step 3.
In the example below, we will set up the service to host the metrics and instrument a sample
'Gauge' type metric using that setup.
1. First, go through [OpenCensus Go Documentation](https://godoc.org/go.opencensus.io).
2. Add the following to your application startup: