Move docs from serving repo (#180)

Ryan Gregg 2018-07-18 10:38:48 -07:00 committed by GitHub
parent 0401c1ada6
commit 526a49c5a9
4 changed files with 595 additions and 0 deletions

# Setting Up A Docker Registry
This document explains how to use different Docker registries with Knative Serving. It
assumes you have gone through the steps listed in
[DEVELOPMENT.md](/DEVELOPMENT.md) to set up your development environment (or
that you at least have installed `go`, set `GOPATH`, and put `$GOPATH/bin` on
your `PATH`).
It currently only contains instructions for [Google Container Registry
(GCR)](https://cloud.google.com/container-registry/), but you should be able to
use any Docker registry.
## Google Container Registry (GCR)
### Required Tools
Install the following tools:
1. [`gcloud`](https://cloud.google.com/sdk/downloads)
1. [`docker-credential-gcr`](https://github.com/GoogleCloudPlatform/docker-credential-gcr)
If you installed `gcloud` using the archive or installer, you can install
`docker-credential-gcr` like this:
```shell
gcloud components install docker-credential-gcr
```
If you installed `gcloud` using a package manager, you may need to install
it with `go get`:
```shell
go get github.com/GoogleCloudPlatform/docker-credential-gcr
```
If you used `go get` to install and `$GOPATH/bin` isn't already in `PATH`,
add it:
```shell
export PATH=$PATH:$GOPATH/bin
```
### Setup
1. If you haven't already set up a GCP project, create one and export its name
for use in later commands.
```shell
export PROJECT_ID=my-project-name
gcloud projects create "${PROJECT_ID}"
```
1. Enable the GCR API.
```shell
gcloud --project="${PROJECT_ID}" services enable \
containerregistry.googleapis.com
```
1. Hook up your GCR credentials. This may complain if you don't have the docker
CLI installed, but the docker CLI is not required and the command should still work.
```shell
docker-credential-gcr configure-docker
```
1. If you need to, update the `KO_DOCKER_REPO` and/or `DOCKER_REPO_OVERRIDE`
values in your `.bashrc`. They should now be:
```shell
export KO_DOCKER_REPO='us.gcr.io/<your-project-id>'
export DOCKER_REPO_OVERRIDE="${KO_DOCKER_REPO}"
```
(You may need to use a different region than `us` if you didn't pick a `us`
Google Cloud region.)
That's it, you're done!
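As a quick sanity check once you have pushed your first image (for example via
`ko apply`), you can list the repositories under your registry; this assumes
`KO_DOCKER_REPO` is set as above:
```shell
gcloud container images list --repository="${KO_DOCKER_REPO}"
```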
## Local registry
This section has yet to be written. If you'd like to write it, see issue
[#23](https://github.com/knative/serving/issues/23).

# Setting Up A Logging Plugin
Knative allows cluster operators to use different backends for their logging
needs. This document describes how to change these settings. Knative currently
requires changes to Fluentd configuration files; however, we plan to abstract
logging configuration in the future
([#906](https://github.com/knative/serving/issues/906)). Once
[#906](https://github.com/knative/serving/issues/906) is complete, the
methodology described in this document will no longer be valid and migration to
a new process will be required. In order to minimize the effort for a future
migration, we recommend only changing the output configuration of Fluentd and
leaving the rest intact.
## Configuring
### Configure the DaemonSet for stdout/stderr logs
Operators can do the following steps to configure the Fluentd DaemonSet for
collecting `stdout/stderr` logs from the containers:
1. Replace the `900.output.conf` part of
[fluentd-configmap.yaml](/config/monitoring/fluentd-configmap.yaml) with the
desired output configuration. Knative provides samples for sending logs to
Elasticsearch or Stackdriver. Developers can simply choose one of the `150-*`
samples from [/config/monitoring](/config/monitoring) or supply their own output
configuration (a quick way to list the available samples is shown after this list).
1. Replace the `image` field of the `fluentd-ds` container
in [fluentd-ds.yaml](/third_party/config/monitoring/common/fluentd/fluentd-ds.yaml)
with a Fluentd image that includes the desired Fluentd output plugin.
See [here](/image/fluentd/README.md) for the requirements for Fluentd images
on Knative.
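For reference, a quick way to list the available `150-*` sample output
configurations (a minimal sketch, run from the repository root):
```shell
# Show the sample output configurations shipped with Knative Serving.
ls -d config/monitoring/150-*
```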
### Configure the Sidecar for log files under /var/log
Currently, operators must configure the Fluentd sidecar separately to collect
log files under `/var/log`. An
[effort](https://github.com/knative/serving/issues/818)
is underway to remove the sidecar. The steps to configure it are:
1. Replace the `logging.fluentd-sidecar-output-config` flag in
[config-observability](/config/config-observability.yaml) with the
desired output configuration (see the sketch after this list for a quick way to
inspect the current flags). **NOTE**: The Fluentd DaemonSet runs in the
`monitoring` namespace while the Fluentd sidecar runs in the same namespace as
the app, so there may be small differences between the DaemonSet and sidecar
configurations even when the desired backends are the same.
1. Replace the `logging.fluentd-sidecar-image` flag in
[config-observability](/config/config-observability.yaml) with a Fluentd image
that includes the desired output plugin. In theory, this is the same image used
for the Fluentd DaemonSet.
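To inspect the current values of these two flags before changing them, you can
grep the file from the repository root (a minimal sketch):
```shell
# Show the current sidecar output config and image flags.
grep -A 3 'logging.fluentd-sidecar' config/config-observability.yaml
```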
## Deploying
After configuring, operators need to redeploy the affected Knative components:
```shell
# Delete the controller first so the apply below takes effect even if the controller code has not changed
bazel run config:controller.delete
# Deploy the configuration for sidecar
kubectl apply -f config/config-observability.yaml
# Deploy the controller to make configuration for sidecar take effect
bazel run config:controller.apply
# Deploy the DaemonSet to make configuration for DaemonSet take effect
kubectl apply -f <the-fluentd-config-for-daemonset> \
-f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
-f config/monitoring/200-common/100-fluentd.yaml \
-f config/monitoring/200-common/100-istio.yaml
```
In the commands above, replace `<the-fluentd-config-for-daemonset>` with the
Fluentd DaemonSet configuration directory, e.g. `config/monitoring/150-stackdriver-prod`;
a fully substituted example is shown below.
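For example, using the Stackdriver sample mentioned above, the DaemonSet
deployment command would look like this:
```shell
kubectl apply -f config/monitoring/150-stackdriver-prod \
  -f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
  -f config/monitoring/200-common/100-fluentd.yaml \
  -f config/monitoring/200-common/100-istio.yaml
```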
**NOTE**: Operators sometimes need to deploy extra services as logging
backends. For example, to use Elasticsearch and Kibana, they must deploy the
Elasticsearch and Kibana services. Knative provides this sample:
```shell
kubectl apply -R -f third_party/config/monitoring/elasticsearch
```
See [here](/config/monitoring/README.md) for how to deploy the complete set of
Knative monitoring components.

# Setting Up Static IP for Knative Gateway
Knative uses a shared Gateway to serve all incoming traffic within the Knative
service mesh: the "knative-shared-gateway" Gateway in the "knative-serving"
namespace. The IP address used to access the gateway is the external IP address
of the "knative-ingressgateway" service in the "istio-system" namespace. So, to
set a static IP for the Knative shared gateway, you just need to set the
external IP address of the "knative-ingressgateway" service to the static IP
you reserved.
## Prerequisites
### Prerequisite 1: Reserve a static IP
#### Knative on GKE
If you are running your Knative cluster on GKE, you can follow the [instructions](https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address#reserve_new_static) to reserve a REGIONAL
IP address. The region of the IP address should be the region your Knative
cluster is running in (e.g. us-east1, us-central1).
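For example, with `gcloud` you can reserve a regional address and then print it
for use in the patch command later in this document (a sketch;
`knative-ingress-ip` and `us-central1` are placeholder values, substitute your
own address name and cluster region):
```shell
# Reserve a regional static IP address.
gcloud compute addresses create knative-ingress-ip --region=us-central1
# Print the reserved address.
gcloud compute addresses describe knative-ingress-ip --region=us-central1 \
  --format='value(address)'
```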
TODO: add documentation on reserving static IP in other cloud platforms.
### Prerequisite 2: Deploy Istio And Knative Serving
Follow the [instructions](https://github.com/knative/serving/blob/master/DEVELOPMENT.md)
to deploy Istio and Knative Serving into your cluster.
Once both prerequisites are complete, you can set up the static IP for the
Knative gateway.
## Set Up Static IP for Knative Gateway
### Step 1: Update external IP of "knative-ingressgateway" service
Run the following command to set the external IP of the
"knative-ingressgateway" service to the static IP you reserved:
```shell
kubectl patch svc knative-ingressgateway -n istio-system --patch '{"spec": { "loadBalancerIP": "<your-reserved-static-ip>" }}'
```
### Step 2: Verify static IP address of knative-ingressgateway service
You can check the external IP of the "knative-ingressgateway" service with:
```shell
kubectl get svc knative-ingressgateway -n istio-system
```
The result should look something like:
```
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
knative-ingressgateway LoadBalancer 10.50.250.120 35.210.48.100 80:32380/TCP,443:32390/TCP,32400:32400/TCP 5h
```
The external IP will eventually be set to the static IP; this can take several
minutes.
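To watch the service until the change takes effect, you can add the `-w` flag
to the same command (press Ctrl+C to stop watching):
```shell
kubectl get svc knative-ingressgateway -n istio-system -w
```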

# Logs and metrics
## Monitoring components setup
First, deploy monitoring components.
### Elasticsearch, Kibana, Prometheus, and Grafana Setup
You can use two different setups:
1. **150-elasticsearch-prod**: This configuration collects logs & metrics from
user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-prod \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
1. **150-elasticsearch-dev**: This configuration collects everything
**150-elasticsearch-prod** does, plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-dev \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
### Stackdriver, Prometheus, and Grafana Setup
If your Knative Serving cluster is not running on Google Cloud Platform, or you
want to send logs to a different GCP project, you need to build your own
Fluentd image and modify the configuration first. See:
1. [Fluentd image on Knative Serving](/image/fluentd/README.md)
2. [Setting up a logging plugin](setting-up-a-logging-plugin.md)
Then you can use two different setups:
1. **150-stackdriver-prod**: This configuration collects logs and metrics from
user containers, build controller, and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-prod \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-stackdriver-dev**: This configuration collects everything
**150-stackdriver-prod** does, plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-dev \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
## Accessing logs
### Kibana and Elasticsearch
To open the Kibana UI (the visualization tool for [Elasticsearch](https://info.elastic.co)),
enter the following command:
```shell
kubectl proxy
```
This starts a local proxy of Kibana on port 8001. The Kibana UI is only exposed within
the cluster for security reasons.
Navigate to the [Kibana UI](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana)
(*It might take a couple of minutes for the proxy to work*).
When you open Kibana for the first time, it will ask you to create an index
pattern. Accept the default options:
![Kibana UI Configuring an Index Pattern](images/kibana-landing-page-configure-index.png)
The Discover tab of the Kibana UI looks like this:
![Kibana UI Discover tab](images/kibana-discover-tab-annotated.png)
You can change the time frame of logs Kibana displays in the upper right corner
of the screen. The main search bar is across the top of the Discover page.
As more logs are ingested, new fields will be discovered. To have them indexed,
go to Management > Index Patterns > Refresh button (on top right) > Refresh
fields.
<!-- TODO: create a video walkthrough of the Kibana UI -->
#### Accessing configuration and revision logs
To access the logs for a configuration, enter the following search query in Kibana:
```
kubernetes.labels.knative_dev\/configuration: "configuration-example"
```
Replace `configuration-example` with your configuration's name. Enter the following
command to get your configuration's name:
```shell
kubectl get configurations
```
To access logs for a revision, enter the following search query in Kibana:
```
kubernetes.labels.knative_dev\/revision: "configuration-example-00001"
```
Replace `configuration-example-00001` with your revision's name.
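To find your revision's name, you can list the revisions in your namespace (add
`-n <namespace>` if you are not using the default namespace):
```shell
kubectl get revisions
```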
#### Accessing build logs
To access the logs for a build, enter the following search query in Kibana:
```
kubernetes.labels.build\-name: "test-build"
```
Replace `test-build` with your build's name. The build name is specified in the `.yaml` file as follows:
```yaml
apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
name: test-build
```
### Stackdriver
Go to the [Google Cloud Console logging page](https://console.cloud.google.com/logs/viewer) for
the GCP project that stores your logs via Stackdriver.
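If you prefer the command line, you can also read the same logs with `gcloud`
(a hedged sketch; `my-gcp-project` is a placeholder and the exact filter depends
on the Stackdriver resource types your cluster uses):
```shell
# Read the 10 most recent container log entries from Stackdriver.
gcloud logging read 'resource.type="container"' \
  --project=my-gcp-project --limit=10
```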
## Accessing metrics
Enter:
```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000
```
Then open the Grafana UI at [http://localhost:3000](http://localhost:3000). The following dashboards are
pre-installed with Knative Serving:
* **Revision HTTP Requests:** HTTP request count, latency and size metrics per revision and per configuration
* **Nodes:** CPU, memory, network and disk metrics at node level
* **Pods:** CPU, memory and network metrics at pod level
* **Deployment:** CPU, memory and network metrics aggregated at deployment level
* **Istio, Mixer and Pilot:** Detailed Istio mesh, Mixer and Pilot metrics
* **Kubernetes:** Dashboards giving insights into cluster health, deployments and capacity usage
### Accessing per request traces
Before you can view per-request traces, you'll need to create a new index
pattern that will store the per-request traces captured by Zipkin:
1. Start the Kibana UI serving on local port 8001 by entering the following command:
```shell
kubectl proxy
```
1. Open the [Kibana UI](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana).
1. Navigate to Management -> Index Patterns -> Create Index Pattern.
1. Enter `zipkin*` in the "Index pattern" text field.
1. Click **Create**.
After you've created the Zipkin index pattern, open the
[Zipkin UI](http://localhost:8001/api/v1/namespaces/istio-system/services/zipkin:9411/proxy/zipkin/).
Click on "Find Traces" to see the latest traces. You can search for a trace ID
or look at traces of a specific application. Click on a trace to see a detailed
view of a specific call.
To see a demo of distributed tracing, deploy the
[Telemetry sample](../sample/telemetrysample/README.md), send some traffic to it,
and then explore the traces it generates in the Zipkin UI.
<!--TODO: Consider adding a video here. -->
## Default metrics
The following metrics are collected by default:
* Knative Serving controller metrics
* Istio metrics (mixer, envoy and pilot)
* Node and pod metrics
There are several other collectors that are pre-configured but not enabled.
To see the full list, browse to the config/monitoring/prometheus-exporter
and config/monitoring/prometheus-servicemonitor folders; to enable any of them,
deploy the corresponding files using `kubectl apply -f`.
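For example, a minimal sketch that enables everything in both folders (the exact
file names under these folders may differ in your checkout):
```shell
kubectl apply -f config/monitoring/prometheus-exporter \
  -f config/monitoring/prometheus-servicemonitor
```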
## Default logs
The deployment above enables collection of the following logs:
* stdout & stderr from all `user-container` containers
* stdout & stderr from the build controller
To enable log collection from other containers and destinations, see
[setting up a logging plugin](setting-up-a-logging-plugin.md).
## Metrics troubleshooting
You can use the Prometheus web UI to troubleshoot publishing and service
discovery issues for metrics. To access the web UI, forward the Prometheus
server to your machine:
```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=prometheus --output=jsonpath="{.items[0].metadata.name}") 9090
```
Then browse to http://localhost:9090 to access the UI.
* To see the targets that are being scraped, go to Status -> Targets
* To see what Prometheus service discovery is picking up vs. dropping, go to Status -> Service Discovery
## Generating metrics
If you want to send metrics from your controller, follow the steps below. These
steps are already applied to the autoscaler and the controller. For those
components, simply add your new metric definitions to the `view`, create new
`tag.Key`s if necessary, and instrument your code as described in step 3.
In the example below, we set up the service to host the metrics and instrument
a sample 'Gauge' type metric.
1. First, go through [OpenCensus Go Documentation](https://godoc.org/go.opencensus.io).
2. Add the following to your application startup:
```go
import (
	"net/http"
	"time"

	"github.com/golang/glog"
	"go.opencensus.io/exporter/prometheus"
	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
	"go.opencensus.io/tag"
)

var (
	desiredPodCountM *stats.Int64Measure
	namespaceTagKey  tag.Key
	revisionTagKey   tag.Key
)

func main() {
	// Create the Prometheus exporter that serves the collected metrics.
	exporter, err := prometheus.NewExporter(prometheus.Options{Namespace: "{your metrics namespace (eg: autoscaler)}"})
	if err != nil {
		glog.Fatal(err)
	}
	view.RegisterExporter(exporter)
	view.SetReportingPeriod(10 * time.Second)

	// Create a sample gauge.
	desiredPodCountM = stats.Int64(
		"desired_pod_count",
		"Number of pods autoscaler wants to allocate",
		stats.UnitNone)

	// Tag the statistics with namespace and revision labels.
	namespaceTagKey, err = tag.NewKey("namespace")
	if err != nil {
		// Error handling
	}
	revisionTagKey, err = tag.NewKey("revision")
	if err != nil {
		// Error handling
	}

	// Create a view to export our measurement.
	err = view.Register(
		&view.View{
			Description: "Number of pods autoscaler wants to allocate",
			Measure:     desiredPodCountM,
			Aggregation: view.LastValue(),
			TagKeys:     []tag.Key{namespaceTagKey, revisionTagKey},
		},
	)
	if err != nil {
		// Error handling
	}

	// Start the endpoint for Prometheus scraping.
	mux := http.NewServeMux()
	mux.Handle("/metrics", exporter)
	http.ListenAndServe(":8080", mux)
}
```
3. In the code you want to instrument, record the metric with the appropriate
tag values, for example:
```go
// Attach the namespace and revision tags to the context, then record the value.
ctx, err := tag.New(
	context.TODO(),
	tag.Insert(namespaceTagKey, namespace),
	tag.Insert(revisionTagKey, revision))
if err != nil {
	// Error handling
}
stats.Record(ctx, desiredPodCountM.M({Measurement Value}))
```
4. Add the following to the scrape config file located at
config/monitoring/200-common/300-prometheus/100-scrape-config.yaml:
```yaml
- job_name: <YOUR SERVICE NAME>
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Scrape only the targets matching the following metadata
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_label_app, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: {SERVICE NAMESPACE};{APP LABEL};{PORT NAME}
  # Rename metadata labels to be reader friendly
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    regex: (.*)
    target_label: namespace
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    regex: (.*)
    target_label: pod
    replacement: $1
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    regex: (.*)
    target_label: service
    replacement: $1
```
5. Redeploy Prometheus and its configuration:
```sh
kubectl delete -f config/monitoring/200-common/300-prometheus
kubectl apply -f config/monitoring/200-common/300-prometheus
```
6. Add a dashboard for your metrics - you can see examples under the
config/grafana/dashboard-definition folder. An easy way to generate JSON
definitions is to use the Grafana UI (make sure to log in as an admin user) and
[export JSON](http://docs.grafana.org/reference/export_import) from it.
7. Validate the metrics flow in either the Grafana UI or the Prometheus UI (see
the Metrics troubleshooting section above for how to access the Prometheus UI).
An example query against the Prometheus HTTP API is sketched below.
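A sketch, after port-forwarding the Prometheus server as described in the
Metrics troubleshooting section (the exact metric name includes the namespace
you passed to the exporter, e.g. `autoscaler_desired_pod_count`):
```shell
curl 'http://localhost:9090/api/v1/query?query=autoscaler_desired_pod_count'
```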
## Distributed tracing with Zipkin
See the [Telemetry sample](../sample/telemetrysample/README.md) for an example of
using [OpenZipkin](https://zipkin.io/pages/existing_instrumentations)'s Go client library.
## Delete monitoring components
Enter:
```shell
ko delete --ignore-not-found=true \
-f config/monitoring/200-common/100-istio.yaml \
-f config/monitoring/200-common/100-zipkin.yaml \
-f config/monitoring/100-common
```