# Logs and metrics

First, deploy the monitoring components. You can use one of two setups:

1. **everything**: This configuration collects logs & metrics from user containers, the build controller, and Istio requests.

   ```shell
   bazel run config/monitoring:everything.apply
   ```

2. **everything-dev**: This configuration collects everything in (1) plus Elafros controller logs.

   ```shell
   bazel run config/monitoring:everything-dev.apply
   ```
## Accessing logs

Run:

```shell
kubectl proxy
```

Then open the Kibana UI at this [link](http://localhost:8001/api/v1/namespaces/monitoring/services/kibana-logging/proxy/app/kibana)
(*it might take a couple of minutes for the proxy to work*).

When Kibana is opened for the first time, it will ask you to create an index. Accept the default options as is. As logs get ingested,
new fields will be discovered; to have them indexed, go to Management -> Index Patterns -> Refresh button (top right) -> Refresh fields.

## Accessing metrics

Run:

```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000
```

Then open the Grafana UI at [http://localhost:3000](http://localhost:3000).

## Accessing per request traces

First, open the Kibana UI as shown above. Browse to Management -> Index Patterns -> +Create Index Pattern, type "zipkin*" (without the quotes) into the "Index pattern" text field, and click the "Create" button. This creates a new index pattern that stores the per request traces captured by Zipkin. This is a one-time step and is needed only for fresh installations.

Next, start the proxy if it is not already running:

```shell
kubectl proxy
```

Then open the Zipkin UI at this [link](http://localhost:8001/api/v1/namespaces/istio-system/services/zipkin:9411/proxy/zipkin/). Click "Find Traces" to see the latest traces. You can search for a trace ID or look at the traces of a specific application within this UI. Click on a trace to see a detailed view of a specific call.

To see a demo of distributed tracing, deploy the [Telemetry sample](../sample/telemetrysample/README.md), send some traffic to it, and explore the traces it generates from the Zipkin UI.

## Default metrics

The following metrics are collected by default:

* Elafros controller metrics
* Istio metrics (Mixer, Envoy, and Pilot)
* Node and pod metrics

There are several other collectors that are pre-configured but not enabled. To see the full list, browse the config/monitoring/prometheus-exporter and config/monitoring/prometheus-servicemonitor folders; you can deploy them with kubectl apply -f, as shown below.
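
A minimal sketch, assuming the commands are run from the repository root. Applying each folder enables every pre-configured collector in it at once; point `-f` at individual files instead if you only want a subset:

```shell
# Deploy the pre-configured exporters and their matching ServiceMonitors.
# Pass individual file paths instead of the folders to enable only specific collectors.
kubectl apply -f config/monitoring/prometheus-exporter/
kubectl apply -f config/monitoring/prometheus-servicemonitor/
```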

## Default logs

The deployment above enables collection of the following logs:

* stdout & stderr from all ela-containers
* stdout & stderr from build-controller

To enable log collection from other containers and destinations, edit fluentd-es-configmap.yaml (search for "fluentd-containers.log" as the starting point). Then run the following:

```shell
kubectl replace -f config/monitoring/fluentd/fluentd-es-configmap.yaml
kubectl replace -f config/monitoring/fluentd/fluentd-es-ds.yaml
```
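
To confirm the new configuration is being used, a quick sketch for checking that the fluentd pods in the monitoring namespace come back up (the exact pod names depend on the DaemonSet defined in fluentd-es-ds.yaml):

```shell
# List the fluentd pods and wait until they are all Running again
kubectl get pods -n monitoring | grep fluentd
```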

Note: We will enable a plugin mechanism to define other logs to collect; this step is a workaround until then.

## Metrics troubleshooting

You can use the Prometheus web UI to troubleshoot metrics publishing and service discovery issues.
To access the web UI, forward the Prometheus server to your machine:

```shell
kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=prometheus --output=jsonpath="{.items[0].metadata.name}") 9090
```

Then browse to http://localhost:9090 to access the UI:

* To see the targets that are being scraped, go to Status -> Targets
* To see what Prometheus service discovery is picking up vs. dropping, go to Status -> Service Discovery
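
If a target does not show up at all, discovery is usually the problem. Since discovery in this setup is driven by ServiceMonitor objects, a quick command-line sketch (assuming the prometheus-operator CRDs installed with the monitoring components, and using the myappname placeholder from the steps below):

```shell
# List the ServiceMonitor objects Prometheus can discover
kubectl get servicemonitors --all-namespaces

# Inspect one to verify its label selector and namespace selector
kubectl describe servicemonitor myappname -n monitoring
```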

## Generating metrics

See the [Telemetry Sample](../sample/telemetrysample/README.md) for deploying a dedicated instance of Prometheus and emitting metrics to it.

If you want to generate metrics within Elafros services and send them to the shared instance of Prometheus, follow the steps below. We will create a counter metric:

1. Go through the [Prometheus Documentation](https://prometheus.io/docs/introduction/overview/) and read the [Data model](https://prometheus.io/docs/concepts/data_model/) and [Metric types](https://prometheus.io/docs/concepts/metric_types/) sections.

2. Create a top level variable in your Go file and register it in init() - example:

    ```go
    import "github.com/prometheus/client_golang/prometheus"

    // Counter partitioned by a "status" label, exported under the "elafros" namespace.
    var myCounter = prometheus.NewCounterVec(prometheus.CounterOpts{
        Namespace: "elafros",
        Name:      "mycomponent_something_count",
        Help:      "Counter to keep track of something in my component",
    }, []string{"status"})

    func init() {
        prometheus.MustRegister(myCounter)
    }
    ```

3. In your code where you want to instrument, increment the counter with the appropriate label values - example:

    ```go
    err := doSomething()
    if err == nil {
        myCounter.With(prometheus.Labels{"status": "success"}).Inc()
    } else {
        myCounter.With(prometheus.Labels{"status": "failure"}).Inc()
    }
    ```

4. Start an HTTP listener to serve as the metrics endpoint for Prometheus scraping (_this step and onwards are needed only once per service. ela-controller is already set up for metrics scraping and you can skip the rest of these steps if you are targeting ela-controller_):

    ```go
    import (
        "context"
        "net/http"
        "time"

        "github.com/golang/glog"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // In your main() func
    srv := &http.Server{Addr: ":9090"}
    http.Handle("/metrics", promhttp.Handler())
    go func() {
        if err := srv.ListenAndServe(); err != nil {
            glog.Infof("Httpserver: ListenAndServe() finished with error: %s", err)
        }
    }()

    // Wait for the service to shutdown
    <-stopCh

    // Close the http server gracefully
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    srv.Shutdown(ctx)
    ```

5. Add a Service for the metrics HTTP endpoint:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: myappname
        prometheus: myappname
      name: myappname
      namespace: mynamespace
    spec:
      ports:
      - name: metrics
        port: 9090
        protocol: TCP
        targetPort: 9090
      selector:
        app: myappname # put the appropriate value here to select your application
    ```

6. Add a ServiceMonitor to tell Prometheus to discover the pods and scrape the service defined above:

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: myappname
      namespace: monitoring
      labels:
        # The shared Prometheus instance only targets 'k8s', 'istio', 'node',
        # 'prometheus' or 'ela-system' - if you pick something else, you need
        # to deploy your own Prometheus instance or edit the shared instance
        # to target the new category.
        monitor-category: ela-system
    spec:
      selector:
        matchLabels:
          app: myappname
          prometheus: myappname
      namespaceSelector:
        matchNames:
        - mynamespace
      endpoints:
      - port: metrics
        interval: 30s
    ```

7. Add a dashboard for your metrics - you can see examples under the config/grafana/dashboard-definition folder. An easy way to generate JSON definitions is to use the Grafana UI (make sure to log in as the admin user) and [export JSON](http://docs.grafana.org/reference/export_import) from it.

8. Add the YAML files to the BUILD files.

9. Deploy the changes with bazel (see the sketch after this list).

10. Validate the metrics flow via either the Grafana UI or the Prometheus UI (see the Metrics troubleshooting section above for how to access the Prometheus UI).
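
For step 9, a minimal sketch, assuming your new YAML files are wired into the same config/monitoring target used during the initial deployment:

```shell
# Re-run the monitoring target deployed during setup;
# use config/monitoring:everything-dev.apply if that is the setup you installed.
bazel run config/monitoring:everything.apply
```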

## Generating logs

Use [glog](https://godoc.org/github.com/golang/glog) to write logs in your code. In your container spec, add the following args to redirect the logs to stderr:

```yaml
args:
- "-logtostderr=true"
- "-stderrthreshold=INFO"
```

See the [helloworld](../sample/helloworld/README.md) sample's configuration file as an example.

## Distributed tracing with Zipkin

Check the [Telemetry sample](../sample/telemetrysample/README.md) for an example usage of [OpenZipkin](https://zipkin.io/pages/existing_instrumentations)'s Go client library.