Add Stackdriver as alternative logging destination (#936)

* add google_cloud output plugin

* refactored structure

* modified readme

* add readme files and rename fluent-es

* leave the fluentd sidecar image unchanged

* address comments for first round

* add comment for dev and prod configs

* rename remaining fluent-es

* config for sidecar and daemonset may be different

* remove servicemonitor for fluentd-ds

* change fluentd-es to fluentd-ds

* change stackdriver logging viewer url

* explicitly specify the port for forwarding

* rename config/monitoring:everything to config/monitoring:everything-es

* rename elafros to knative

* renamed

* fix a conflict

* changed doc

* rename

* remove space

* rename

* fix typo and remove BUILD and METADATA files

* add the LICENSE file back

* add the LICENSE file back
Yanwei Guo 2018-06-06 16:29:08 -07:00 committed by GitHub
parent ff2a6276b9
commit a7d6a930b0
2 changed files with 161 additions and 19 deletions


@@ -0,0 +1,82 @@
# Setting Up A Logging Plugin
Knative allows cluster operators to use different backends for their logging
needs. This document describes how to change these settings. Knative currently
requires changes in Fluentd configuration files; however, we plan to abstract
logging configuration in the future
([#906](https://github.com/knative/serving/issues/906)). Once
[#906](https://github.com/knative/serving/issues/906) is complete, the
methodology described in this document will no longer be valid and migration to
a new process will be required. To minimize the effort of a future migration,
we recommend changing only the Fluentd output configuration and leaving the
rest intact.
## Configuring
### Configure the DaemonSet for stdout/stderr logs
Operators can follow these steps to configure the Fluentd DaemonSet that
collects `stdout/stderr` logs from the containers:
1. Replace the `900.output.conf` section in
[fluentd-configmap.yaml](/config/monitoring/fluentd-configmap.yaml) with the
desired output configuration. Knative provides samples for sending logs to
Elasticsearch or Stackdriver; operators can choose one of the `150-*` samples
under [/config/monitoring](/config/monitoring) or supply their own output
configuration (a minimal sketch is shown after this list).
1. Replace the `image` field of the `fluentd-ds` container
in [fluentd-ds.yaml](/third_party/config/monitoring/common/fluentd/fluentd-ds.yaml)
with a Fluentd image that includes the desired Fluentd output plugin.
See [here](/image/fluentd/README.md) for the requirements that Knative places
on the Fluentd image.
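As a rough illustration only (the `150-*` samples under
[/config/monitoring](/config/monitoring) are the authoritative versions), a
Stackdriver-bound output section might contain little more than a match rule
for the `google_cloud` output type, assuming the Fluentd image bundles the
`fluent-plugin-google-cloud` plugin:
```
# Illustrative sketch of a 900.output.conf replacement -- not the shipped sample.
<match **>
  @type google_cloud               # output type provided by fluent-plugin-google-cloud
  buffer_type file                 # buffer to disk so log delivery survives Fluentd restarts
  buffer_path /var/log/fluentd-buffers/stackdriver.buffer
</match>
```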
### Configure the Sidecar for log files under /var/log
Currently operators have to configure the Fluentd sidecar separately to
collect log files under `/var/log`. An
[effort](https://github.com/knative/serving/issues/818)
is in progress to remove the sidecar. The steps to configure it are:
1. Replace the `logging.fluentd-sidecar-output-config` flag in
[elaconfig](/config/elaconfig.yaml) with the desired output configuration
(see the illustrative fragment after this list). **NOTE**: The Fluentd
DaemonSet runs in the `monitoring` namespace while the Fluentd sidecar runs in
the same namespace as the app, so the DaemonSet and sidecar configurations may
differ slightly even when the desired backend is the same.
1. Replace the `logging.fluentd-sidecar-image` flag in
[elaconfig](/config/elaconfig.yaml) with a Fluentd image that includes the
desired Fluentd output plugin. In theory, this is the same image as the one
used for the Fluentd DaemonSet.
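For orientation, a fragment of `elaconfig.yaml` covering these two flags might
look roughly like the following, assuming the file defines a ConfigMap; the
image value is a placeholder, not a published image:
```yaml
# Illustrative fragment only -- see config/elaconfig.yaml for the real file.
data:
  logging.fluentd-sidecar-output-config: |
    # desired Fluentd <match> output configuration for the sidecar goes here
  logging.fluentd-sidecar-image: gcr.io/your-project/fluentd-with-your-plugin:latest
```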
## Deploying
Operators need to redeploy the Knative components after changing the configuration:
```shell
# Delete the controller first so that re-applying it below picks up the new
# configuration, even if the controller code has not changed
bazel run config:controller.delete
# Deploy the configuration for sidecar
kubectl apply -f config/elaconfig.yaml
# Deploy the controller to make configuration for sidecar take effect
bazel run config:controller.apply
# Deploy the DaemonSet to make configuration for DaemonSet take effect
kubectl apply -f <the-fluentd-config-for-daemonset> \
-f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
-f config/monitoring/200-common/100-fluentd.yaml \
-f config/monitoring/200-common/100-istio.yaml
```
In the commands above, replace `<the-fluentd-config-for-daemonset>` with the
Fluentd DaemonSet configuration file, e.g. `config/monitoring/150-stackdriver-prod`.
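For example, with the Stackdriver sample configuration the DaemonSet step above becomes:
```shell
# Same command as above, with the placeholder replaced by the Stackdriver sample config
kubectl apply -f config/monitoring/150-stackdriver-prod \
    -f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
    -f config/monitoring/200-common/100-fluentd.yaml \
    -f config/monitoring/200-common/100-istio.yaml
```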
**NOTE**: Operators sometimes need to deploy extra services as the logging
backends. For example, if they desire Elasticsearch & Kibana, they have to deploy
the Elasticsearch and Kibana services. Knative provides this sample:
```shell
kubectl apply -R -f third_party/config/monitoring/elasticsearch
```
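After applying, a quick sanity check is to confirm that the logging pods in the
`monitoring` namespace reach the `Running` state:
```shell
# Elasticsearch, Kibana and the Fluentd DaemonSet pods should all end up Running
kubectl get pods -n monitoring
```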
See [here](/config/monitoring/README.md) for instructions on deploying all of
the Knative monitoring components.


@@ -1,54 +1,108 @@
# Logs and metrics
## Monitoring Components Setup
First, deploy monitoring components.
### Elasticsearch, Kibana, Prometheus & Grafana Setup
You can use two different setups:
1. **150-elasticsearch-prod**: This configuration collects logs & metrics from user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-prod \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-elasticsearch-dev**: This configuration collects everything in (1) plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-dev \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
### Stackdriver (logs), Prometheus & Grafana Setup
If your Knative Serving does not run on a GCP-based cluster, or you want to send logs to
another GCP project, you need to build your own Fluentd image and modify the
configuration first. See:
1. [Fluentd image on Knative Serving](/image/fluentd/README.md)
2. [Setting up a logging plugin](setting-up-a-logging-plugin.md)
Then you can use two different setups:
1. **150-stackdriver-prod**: This configuration collects logs & metrics from user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-prod \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-stackdriver-dev**: This configuration collects everything in (1) plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-dev \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
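Whichever setup you choose, you can confirm that the log and metric collectors
came up by listing the DaemonSets and pods in the `monitoring` namespace:
```shell
kubectl get daemonsets -n monitoring
kubectl get pods -n monitoring
```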
## Accessing logs
### Elasticsearch & Kibana
Run:
```shell
kubectl proxy
```
Then open Kibana UI at this [link](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana)
(*it might take a couple of minutes for the proxy to work*).
When Kibana is opened for the first time, it will ask you to create an index; accept the default options as is. As more logs are ingested,
new fields will be discovered. To have them indexed, go to Management -> Index Patterns -> Refresh button (top right) -> Refresh fields.
#### Accessing configuration and revision logs
To access logs for a configuration, use the following search term in the Kibana UI:
```
kubernetes.labels.knative_dev\/configuration: "configuration-example"
```
Replace `configuration-example` with your configuration's name.
To access logs for a revision, use the following search term in the Kibana UI:
```
kubernetes.labels.knative_dev\/revision: "configuration-example-00001"
```
Replace `configuration-example-00001` with your revision's name.
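If you are unsure of the exact names, configurations and revisions can be
listed with `kubectl` (assuming the Knative Serving CRDs are installed in your cluster):
```shell
kubectl get configurations
kubectl get revisions
```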
#### Accessing build logs
To access logs for a build, use the following search term in the Kibana UI:
```
kubernetes.labels.build\-name: "test-build"
```
Replace `test-build` with your build's name. A build's name is specified in its YAML file as follows:
```yaml
apiVersion: build.dev/v1alpha1
kind: Build
@@ -56,6 +110,11 @@ metadata:
name: test-build
```
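Similarly, existing builds can usually be listed with `kubectl` (assuming the
Knative Build CRD is installed):
```shell
kubectl get builds
```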
### Stackdriver
Go to the [GCP Console logging page](https://console.cloud.google.com/logs/viewer) for
the GCP project that stores your logs via Stackdriver.
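Logs can also be read from the command line with the `gcloud` CLI; the filter
below is only illustrative, since the exact resource type and labels depend on
how your cluster writes entries to Stackdriver:
```shell
# Read the ten most recent container log entries; replace your-gcp-project with your project ID
gcloud logging read 'resource.type="container"' --limit 10 --project your-gcp-project
```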
## Accessing metrics
Run:
@@ -65,14 +124,16 @@ kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=a
```
Then open Grafana UI at [http://localhost:3000](http://localhost:3000). The following dashboards are pre-installed with Knative Serving:
* **Revision HTTP Requests:** HTTP request count, latency and size metrics per revision and per configuration
* **Nodes:** CPU, memory, network and disk metrics at node level
* **Pods:** CPU, memory and network metrics at pod level
* **Deployment:** CPU, memory and network metrics aggregated at deployment level
* **Istio, Mixer and Pilot:** Detailed Istio mesh, Mixer and Pilot metrics
* **Kubernetes:** Dashboards giving insights into cluster health, deployments and capacity usage
### Accessing per request traces
First, open the Kibana UI as shown above. Browse to Management -> Index Patterns -> +Create Index Pattern, type "zipkin*" (without the quotes) into the "Index pattern" text field and hit the "Create" button. This will create a new index pattern that will store per-request traces captured by Zipkin. This is a one time step and is needed only for fresh installations.
Next, start the proxy if it is not already running:
@@ -86,6 +147,7 @@ Then open Zipkin UI at this [link](http://localhost:8001/api/v1/namespaces/istio
To see a demo of distributed tracing, deploy the [Telemetry sample](../sample/telemetrysample/README.md), send some traffic to it and explore the traces it generates from Zipkin UI.
## Default metrics
The following metrics are collected by default:
* Knative Serving controller metrics
* Istio metrics (mixer, envoy and pilot)
@@ -94,17 +156,14 @@ Following metrics are collected by default:
There are several other collectors that are pre-configured but not enabled. To see the full list, browse to the `config/monitoring/prometheus-exporter` and `config/monitoring/prometheus-servicemonitor` folders, and deploy the ones you need using `kubectl apply -f`.
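For example, enabling one of these pre-configured collectors is a single apply of the corresponding folder:
```shell
kubectl apply -f config/monitoring/prometheus-exporter
```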
## Default logs
The deployment above enables collection of the following logs:
* stdout & stderr from all `user-container` containers
* stdout & stderr from the build controller
To enable log collection from other containers and destinations, see
[setting up a logging plugin](setting-up-a-logging-plugin.md).
## Metrics troubleshooting
You can use the Prometheus web UI to troubleshoot publishing and service discovery issues for metrics.
@@ -120,13 +179,14 @@ Then browse to http://localhost:9090 to access the UI:
## Generating metrics
If you want to send metrics from your controller, follow the steps below.
These steps are already applied to the autoscaler and controller. For those controllers,
simply add your new metric definitions to the `view`, create new `tag.Key`s if necessary and
instrument your code as described in step 3.
In the example below, we will set up the service to host the metrics and instrument a sample
'Gauge' type metric using that setup.
1. First, go through [OpenCensus Go Documentation](https://godoc.org/go.opencensus.io).
2. Add the following to your application startup: