Move docs from serving repo (#180)

Ryan Gregg 2018-07-18 10:38:48 -07:00 committed by GitHub
parent 0401c1ada6
commit 526a49c5a9
4 changed files with 595 additions and 0 deletions

# Setting Up A Docker Registry
This document explains how to use different Docker registries with Knative Serving. It
assumes you have gone through the steps listed in
[DEVELOPMENT.md](/DEVELOPMENT.md) to set up your development environment (or
that you at least have installed `go`, set `GOPATH`, and put `$GOPATH/bin` on
your `PATH`).
It currently only contains instructions for [Google Container Registry
(GCR)](https://cloud.google.com/container-registry/), but you should be able to
use any Docker registry.
## Google Container Registry (GCR)
### Required Tools
Install the following tools:
1. [`gcloud`](https://cloud.google.com/sdk/downloads)
1. [`docker-credential-gcr`](https://github.com/GoogleCloudPlatform/docker-credential-gcr)
If you installed `gcloud` using the archive or installer, you can install
`docker-credential-gcr` like this:
```shell
gcloud components install docker-credential-gcr
```
If you installed `gcloud` using a package manager, you may need to install
it with `go get`:
```shell
go get github.com/GoogleCloudPlatform/docker-credential-gcr
```
If you used `go get` to install and `$GOPATH/bin` isn't already in `PATH`,
add it:
```shell
export PATH=$PATH:$GOPATH/bin
```
### Setup
1. If you haven't already set up a GCP project, create one and export its name
for use in later commands.
```shell
export PROJECT_ID=my-project-name
gcloud projects create "${PROJECT_ID}"
```
1. Enable the GCR API.
```shell
gcloud --project="${PROJECT_ID}" services enable \
containerregistry.googleapis.com
```
1. Hook up your GCR credentials. This may complain if you don't have the docker
CLI installed, but the docker CLI is not required and the command should still work.
```shell
docker-credential-gcr configure-docker
```
1. If you need to, update the `KO_DOCKER_REPO` and/or `DOCKER_REPO_OVERRIDE`
values in your `.bashrc`. They should now be:
```shell
export KO_DOCKER_REPO='us.gcr.io/<your-project-id>'
export DOCKER_REPO_OVERRIDE="${KO_DOCKER_REPO}"
```
(You may need to use a different region than `us` if you didn't pick a `us`
Google Cloud region.)
That's it, you're done!
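As a quick sanity check once you have pushed your first image (for example via
`ko apply`), you can list the repositories under your registry; this assumes
`KO_DOCKER_REPO` is set as above:
```shell
gcloud container images list --repository="${KO_DOCKER_REPO}"
```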
## Local registry
This section has yet to be written. If you'd like to write it, see issue
[#23](https://github.com/knative/serving/issues/23).

# Setting Up A Logging Plugin
Knative allows cluster operators to use different backends for their logging
needs. This document describes how to change these settings. Knative currently
requires changes to Fluentd configuration files; however, we plan to abstract
logging configuration in the future
([#906](https://github.com/knative/serving/issues/906)). Once
[#906](https://github.com/knative/serving/issues/906) is complete, the
methodology described in this document will no longer be valid and migration to
a new process will be required. In order to minimize the effort for a future
migration, we recommend only changing the output configuration of Fluentd and
leaving the rest intact.
## Configuring
### Configure the DaemonSet for stdout/stderr logs
Operators can do the following steps to configure the Fluentd DaemonSet for
collecting `stdout/stderr` logs from the containers:
1. Replace the `900.output.conf` part of
[fluentd-configmap.yaml](/config/monitoring/fluentd-configmap.yaml) with the
desired output configuration. Knative provides samples for sending logs to
Elasticsearch or Stackdriver. Developers can simply choose one of the `150-*`
samples from [/config/monitoring](/config/monitoring) or supply their own output
configuration (a quick way to list the available samples is shown after this list).
1. Replace the `image` field of the `fluentd-ds` container
in [fluentd-ds.yaml](/third_party/config/monitoring/common/fluentd/fluentd-ds.yaml)
with a Fluentd image that includes the desired Fluentd output plugin.
See [here](/image/fluentd/README.md) for the requirements for Fluentd images
on Knative.
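For reference, a quick way to list the available `150-*` sample output
configurations (a minimal sketch, run from the repository root):
```shell
# Show the sample output configurations shipped with Knative Serving.
ls -d config/monitoring/150-*
```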
### Configure the Sidecar for log files under /var/log
Currently, operators must configure the Fluentd sidecar separately to collect
log files under `/var/log`. An
[effort](https://github.com/knative/serving/issues/818)
is underway to remove the sidecar. The steps to configure it are:
1. Replace the `logging.fluentd-sidecar-output-config` flag in
[config-observability](/config/config-observability.yaml) with the
desired output configuration (see the sketch after this list for a quick way to
inspect the current flags). **NOTE**: The Fluentd DaemonSet runs in the
`monitoring` namespace while the Fluentd sidecar runs in the same namespace as
the app, so there may be small differences between the DaemonSet and sidecar
configurations even when the desired backends are the same.
1. Replace the `logging.fluentd-sidecar-image` flag in
[config-observability](/config/config-observability.yaml) with a Fluentd image
that includes the desired output plugin. In theory, this is the same image used
for the Fluentd DaemonSet.
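To inspect the current values of these two flags before changing them, you can
grep the file from the repository root (a minimal sketch):
```shell
# Show the current sidecar output config and image flags.
grep -A 3 'logging.fluentd-sidecar' config/config-observability.yaml
```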
## Deploying
After configuring, operators need to redeploy the affected Knative components:
```shell
# Delete the controller first so the apply below takes effect even if the controller code has not changed
bazel run config:controller.delete
# Deploy the configuration for sidecar
kubectl apply -f config/config-observability.yaml
# Deploy the controller to make configuration for sidecar take effect
bazel run config:controller.apply
# Deploy the DaemonSet to make configuration for DaemonSet take effect
kubectl apply -f <the-fluentd-config-for-daemonset> \
-f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
-f config/monitoring/200-common/100-fluentd.yaml \
-f config/monitoring/200-common/100-istio.yaml
```
In the commands above, replace `<the-fluentd-config-for-daemonset>` with the
Fluentd DaemonSet configuration directory, e.g. `config/monitoring/150-stackdriver-prod`;
a fully substituted example is shown below.
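For example, using the Stackdriver sample mentioned above, the DaemonSet
deployment command would look like this:
```shell
kubectl apply -f config/monitoring/150-stackdriver-prod \
  -f third_party/config/monitoring/common/kubernetes/fluentd/fluentd-ds.yaml \
  -f config/monitoring/200-common/100-fluentd.yaml \
  -f config/monitoring/200-common/100-istio.yaml
```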
**NOTE**: Operators sometimes need to deploy extra services as logging
backends. For example, to use Elasticsearch and Kibana, they must deploy the
Elasticsearch and Kibana services. Knative provides this sample:
```shell
kubectl apply -R -f third_party/config/monitoring/elasticsearch
```
See [here](/config/monitoring/README.md) for how to deploy the complete set of
Knative monitoring components.

# Setting Up Static IP for Knative Gateway
Knative uses a shared Gateway to serve all incoming traffic within the Knative
service mesh: the "knative-shared-gateway" Gateway in the "knative-serving"
namespace. The IP address used to access the gateway is the external IP address
of the "knative-ingressgateway" service in the "istio-system" namespace. So, to
set a static IP for the Knative shared gateway, you just need to set the
external IP address of the "knative-ingressgateway" service to the static IP
you reserved.
## Prerequisites
### Prerequisite 1: Reserve a static IP
#### Knative on GKE
If you are running your Knative cluster on GKE, you can follow the [instructions](https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address#reserve_new_static) to reserve a REGIONAL
IP address. The region of the IP address should be the region your Knative
cluster is running in (e.g. us-east1, us-central1).
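For example, with `gcloud` you can reserve a regional address and then print it
for use in the patch command later in this document (a sketch;
`knative-ingress-ip` and `us-central1` are placeholder values, substitute your
own address name and cluster region):
```shell
# Reserve a regional static IP address.
gcloud compute addresses create knative-ingress-ip --region=us-central1
# Print the reserved address.
gcloud compute addresses describe knative-ingress-ip --region=us-central1 \
  --format='value(address)'
```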
TODO: add documentation on reserving static IP in other cloud platforms.
### Prerequisite 2: Deploy Istio And Knative Serving
Follow the [instructions](https://github.com/knative/serving/blob/master/DEVELOPMENT.md)
to deploy Istio and Knative Serving into your cluster.
Once both prerequisites are complete, you can set up the static IP for the
Knative gateway.
## Set Up Static IP for Knative Gateway
### Step 1: Update external IP of "knative-ingressgateway" service
Run the following command to set the external IP of the
"knative-ingressgateway" service to the static IP you reserved:
```shell
kubectl patch svc knative-ingressgateway -n istio-system --patch '{"spec": { "loadBalancerIP": "<your-reserved-static-ip>" }}'
```
### Step 2: Verify static IP address of knative-ingressgateway service
You can check the external IP of the "knative-ingressgateway" service with:
```shell
kubectl get svc knative-ingressgateway -n istio-system
```
The result should look something like:
```
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
knative-ingressgateway LoadBalancer 10.50.250.120 35.210.48.100 80:32380/TCP,443:32390/TCP,32400:32400/TCP 5h
```
The external IP will eventually be set to the static IP; this can take several
minutes.
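To watch the service until the change takes effect, you can add the `-w` flag
to the same command (press Ctrl+C to stop watching):
```shell
kubectl get svc knative-ingressgateway -n istio-system -w
```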

# Logs and metrics
## Monitoring components setup
First, deploy monitoring components.
### Elasticsearch, Kibana, Prometheus, and Grafana Setup
You can use two different setups:
1. **150-elasticsearch-prod**: This configuration collects logs & metrics from
user containers, build controller and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-prod \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
1. **150-elasticsearch-dev**: This configuration collects everything
**150-elasticsearch-prod** does, plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-elasticsearch-dev \
-f third_party/config/monitoring/common \
-f third_party/config/monitoring/elasticsearch \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
### Stackdriver, Prometheus, and Grafana Setup
If your Knative Serving cluster is not running on Google Cloud Platform, or you
want to send logs to a different GCP project, you need to build your own
Fluentd image and modify the configuration first. See:
1. [Fluentd image on Knative Serving](/image/fluentd/README.md)
2. [Setting up a logging plugin](setting-up-a-logging-plugin.md)
Then you can use two different setups:
1. **150-stackdriver-prod**: This configuration collects logs and metrics from
user containers, build controller, and Istio requests.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-prod \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
2. **150-stackdriver-dev**: This configuration collects everything
**150-stackdriver-prod** does, plus Knative Serving controller logs.
```shell
kubectl apply -R -f config/monitoring/100-common \
-f config/monitoring/150-stackdriver-dev \
-f third_party/config/monitoring/common \
-f config/monitoring/200-common \
-f config/monitoring/200-common/100-istio.yaml
```
## Accessing logs
### Kibana and Elasticsearch
To open the Kibana UI (the visualization tool for [Elasticsearch](https://info.elastic.co)),
enter the following command:
```shell
kubectl proxy
```
This starts a local proxy of Kibana on port 8001. The Kibana UI is only exposed within
the cluster for security reasons.
Navigate to the [Kibana UI](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana)
(*It might take a couple of minutes for the proxy to work*).
When you open Kibana for the first time, it will ask you to create an index
pattern. Accept the default options:
![Kibana UI Configuring an Index Pattern](images/kibana-landing-page-configure-index.png)
The Discover tab of the Kibana UI looks like this:
![Kibana UI Discover tab](images/kibana-discover-tab-annotated.png)
You can change the time frame of logs Kibana displays in the upper right corner
of the screen. The main search bar is across the top of the Discover page.
As more logs are ingested, new fields will be discovered. To have them indexed,
go to Management > Index Patterns > Refresh button (on top right) > Refresh
fields.
<!-- TODO: create a video walkthrough of the Kibana UI -->
#### Accessing configuration and revision logs
To access the logs for a configuration, enter the following search query in Kibana:
```
kubernetes.labels.knative_dev\/configuration: "configuration-example"
```
Replace `configuration-example` with your configuration's name. Enter the following
command to get your configuration's name:
```shell
kubectl get configurations
```
To access logs for a revision, enter the following search query in Kibana:
```
kubernetes.labels.knative_dev\/revision: "configuration-example-00001"
```
Replace `configuration-example-00001` with your revision's name.
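To find your revision's name, you can list the revisions in your namespace (add
`-n <namespace>` if you are not using the default namespace):
```shell
kubectl get revisions
```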
#### Accessing build logs
To access the logs for a build, enter the following search query in Kibana:
```
kubernetes.labels.build\-name: "test-build"
```
Replace `test-build` with your build's name. The build name is specified in the `.yaml` file as follows:
```yaml
apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
name: test-build
```
### Stackdriver
Go to the [Google Cloud Console logging page](https://console.cloud.google.com/logs/viewer) for
the GCP project that stores your logs via Stackdriver.
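If you prefer the command line, you can also read the same logs with `gcloud`
(a hedged sketch; `my-gcp-project` is a placeholder and the exact filter depends
on the Stackdriver resource types your cluster uses):
```shell
# Read the 10 most recent container log entries from Stackdriver.
gcloud logging read 'resource.type="container"' \
  --project=my-gcp-project --limit=10
```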
## Accessing metrics
Enter:
```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000
```
Then open the Grafana UI at [http://localhost:3000](http://localhost:3000). The following dashboards are
pre-installed with Knative Serving:
* **Revision HTTP Requests:** HTTP request count, latency and size metrics per revision and per configuration
* **Nodes:** CPU, memory, network and disk metrics at node level
* **Pods:** CPU, memory and network metrics at pod level
* **Deployment:** CPU, memory and network metrics aggregated at deployment level
* **Istio, Mixer and Pilot:** Detailed Istio mesh, Mixer and Pilot metrics
* **Kubernetes:** Dashboards giving insights into cluster health, deployments and capacity usage
### Accessing per request traces
Before you can view per-request traces, you'll need to create a new index
pattern that will store the per-request traces captured by Zipkin:
1. Start the Kibana UI serving on local port 8001 by entering the following command:
```shell
kubectl proxy
```
1. Open the [Kibana UI](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana).
1. Navigate to Management -> Index Patterns -> Create Index Pattern.
1. Enter `zipkin*` in the "Index pattern" text field.
1. Click **Create**.
After you've created the Zipkin index pattern, open the
[Zipkin UI](http://localhost:8001/api/v1/namespaces/istio-system/services/zipkin:9411/proxy/zipkin/).
Click on "Find Traces" to see the latest traces. You can search for a trace ID
or look at traces of a specific application. Click on a trace to see a detailed
view of a specific call.
To see a demo of distributed tracing, deploy the
[Telemetry sample](../sample/telemetrysample/README.md), send some traffic to it,
and then explore the traces it generates in the Zipkin UI.
<!--TODO: Consider adding a video here. -->
## Default metrics
The following metrics are collected by default:
* Knative Serving controller metrics
* Istio metrics (mixer, envoy and pilot)
* Node and pod metrics
There are several other collectors that are pre-configured but not enabled.
To see the full list, browse to the config/monitoring/prometheus-exporter
and config/monitoring/prometheus-servicemonitor folders; to enable any of them,
deploy the corresponding files using `kubectl apply -f`.
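For example, a minimal sketch that enables everything in both folders (the exact
file names under these folders may differ in your checkout):
```shell
kubectl apply -f config/monitoring/prometheus-exporter \
  -f config/monitoring/prometheus-servicemonitor
```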
## Default logs
The deployment above enables collection of the following logs:
* stdout & stderr from all `user-container` containers
* stdout & stderr from the build controller
To enable log collection from other containers and destinations, see
[setting up a logging plugin](setting-up-a-logging-plugin.md).
## Metrics troubleshooting
You can use the Prometheus web UI to troubleshoot publishing and service
discovery issues for metrics. To access the web UI, forward the Prometheus
server to your machine:
```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=prometheus --output=jsonpath="{.items[0].metadata.name}") 9090
```
Then browse to http://localhost:9090 to access the UI.
* To see the targets that are being scraped, go to Status -> Targets
* To see what Prometheus service discovery is picking up vs. dropping, go to Status -> Service Discovery
## Generating metrics
If you want to send metrics from your controller, follow the steps below. These
steps are already applied to the autoscaler and the controller. For those
components, simply add your new metric definitions to the `view`, create new
`tag.Key`s if necessary, and instrument your code as described in step 3.
In the example below, we set up the service to host the metrics and instrument
a sample 'Gauge' type metric.
1. First, go through [OpenCensus Go Documentation](https://godoc.org/go.opencensus.io).
2. Add the following to your application startup:
```go
import (
	"net/http"
	"time"

	"github.com/golang/glog"
	"go.opencensus.io/exporter/prometheus"
	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
	"go.opencensus.io/tag"
)

var (
	desiredPodCountM *stats.Int64Measure
	namespaceTagKey  tag.Key
	revisionTagKey   tag.Key
)

func main() {
	// Create the Prometheus exporter that serves the collected metrics.
	exporter, err := prometheus.NewExporter(prometheus.Options{Namespace: "{your metrics namespace (eg: autoscaler)}"})
	if err != nil {
		glog.Fatal(err)
	}
	view.RegisterExporter(exporter)
	view.SetReportingPeriod(10 * time.Second)

	// Create a sample gauge.
	desiredPodCountM = stats.Int64(
		"desired_pod_count",
		"Number of pods autoscaler wants to allocate",
		stats.UnitNone)

	// Tag the statistics with namespace and revision labels.
	namespaceTagKey, err = tag.NewKey("namespace")
	if err != nil {
		// Error handling
	}
	revisionTagKey, err = tag.NewKey("revision")
	if err != nil {
		// Error handling
	}

	// Create a view to export our measurement.
	err = view.Register(
		&view.View{
			Description: "Number of pods autoscaler wants to allocate",
			Measure:     desiredPodCountM,
			Aggregation: view.LastValue(),
			TagKeys:     []tag.Key{namespaceTagKey, revisionTagKey},
		},
	)
	if err != nil {
		// Error handling
	}

	// Start the endpoint for Prometheus scraping.
	mux := http.NewServeMux()
	mux.Handle("/metrics", exporter)
	http.ListenAndServe(":8080", mux)
}
```
3. In the code you want to instrument, record the metric with the appropriate
tag values, for example:
```go
// Attach the namespace and revision tags to the context, then record the value.
ctx, err := tag.New(
	context.TODO(),
	tag.Insert(namespaceTagKey, namespace),
	tag.Insert(revisionTagKey, revision))
if err != nil {
	// Error handling
}
stats.Record(ctx, desiredPodCountM.M({Measurement Value}))
```
4. Add the following to the scrape config file located at
config/monitoring/200-common/300-prometheus/100-scrape-config.yaml:
```yaml
- job_name: <YOUR SERVICE NAME>
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Scrape only the targets matching the following metadata
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_label_app, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: {SERVICE NAMESPACE};{APP LABEL};{PORT NAME}
  # Rename metadata labels to be reader friendly
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    regex: (.*)
    target_label: namespace
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    regex: (.*)
    target_label: pod
    replacement: $1
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    regex: (.*)
    target_label: service
    replacement: $1
```
5. Redeploy Prometheus and its configuration:
```sh
kubectl delete -f config/monitoring/200-common/300-prometheus
kubectl apply -f config/monitoring/200-common/300-prometheus
```
6. Add a dashboard for your metrics - you can see examples under the
config/grafana/dashboard-definition folder. An easy way to generate JSON
definitions is to use the Grafana UI (make sure to log in as an admin user) and
[export JSON](http://docs.grafana.org/reference/export_import) from it.
7. Validate the metrics flow in either the Grafana UI or the Prometheus UI (see
the Metrics troubleshooting section above for how to access the Prometheus UI).
An example query against the Prometheus HTTP API is sketched below.
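A sketch, after port-forwarding the Prometheus server as described in the
Metrics troubleshooting section (the exact metric name includes the namespace
you passed to the exporter, e.g. `autoscaler_desired_pod_count`):
```shell
curl 'http://localhost:9090/api/v1/query?query=autoscaler_desired_pod_count'
```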
## Distributed tracing with Zipkin
See the [Telemetry sample](../sample/telemetrysample/README.md) for an example of
using [OpenZipkin](https://zipkin.io/pages/existing_instrumentations)'s Go client library.
## Delete monitoring components
Enter:
```shell
ko delete --ignore-not-found=true \
-f config/monitoring/200-common/100-istio.yaml \
-f config/monitoring/200-common/100-zipkin.yaml \
-f config/monitoring/100-common
```