title: Exporting Logs to BigQuery, GCS, Pub/Sub through Stackdriver
description: How to export Istio Access Logs to different sinks like BigQuery, GCS, Pub/Sub through Stackdriver.
publishdate: 2018-07-09
subtitle:
attribution: Nupur Garg and Douglas Reid
weight: 87

This post shows how to direct Istio logs to Stackdriver and export those logs to configured sinks such as BigQuery, Google Cloud Storage (GCS), or Cloud Pub/Sub. By the end of this post you will be able to run analytics on Istio data from whichever of these destinations you prefer.

The Bookinfo sample application is used as the example application throughout this task.

Before you begin

Install Istio in your cluster and deploy an application.

Configuring Istio to export logs

Istio exports logs using the logentry template that is configured for Mixer as the accesslog entry. This template specifies all the variables that are available for analysis, including the source service, the destination service, and auth metrics (coming..), among others. The following diagram shows the pipeline:

{{< image width="75%" ratio="75%" link="./istio-analytics-using-stackdriver.png" caption="Diagram of exporting logs from Istio to StackDriver for analysis" >}}

Istio supports exporting logs to Stackdriver, which can in turn be configured to export them to a sink such as BigQuery, Pub/Sub, or GCS. Follow the steps below to set up your chosen sink first, and then configure Stackdriver in Istio.

Setting up various log sinks

Common setup for all sinks:

  1. Enable the Stackdriver Monitoring API for the project.
  2. Make sure the principalEmail that sets up the sink has write access to the project and the Logging Admin role.
  3. Make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is set. Follow the instructions here to set it up.
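
As a minimal sketch of step 3, the variable can be exported in the shell used for the setup steps. The key file path and the `KEY_FILE` variable are hypothetical placeholders, not something this post prescribes; point them at your own service account key.

```shell
# Hypothetical location of a service account key file; replace with your own.
KEY_FILE="${KEY_FILE:-$HOME/keys/stackdriver-exporter.json}"

# Google client libraries look up this variable to find Application Default Credentials.
export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"

# Warn early if the file cannot be read, rather than failing later during setup.
if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
  echo "warning: $GOOGLE_APPLICATION_CREDENTIALS is not readable" >&2
fi
```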

BigQuery

  1. Create a BigQuery dataset as a destination for the logs export.
  2. Record the ID of the dataset. It has the form bigquery.googleapis.com/projects/[PROJECT_ID]/datasets/[DATASET_ID] and is needed to configure the Stackdriver handler.
  3. Give the sink's writer identity, cloud-logs@system.gserviceaccount.com, the BigQuery Data Editor role in IAM.
  4. If using Google Kubernetes Engine, make sure the bigquery scope is enabled on the cluster.

Google Cloud Storage (GCS)

  1. Create a GCS bucket to receive the exported logs.
  2. Record the ID of the bucket. It has the form storage.googleapis.com/[BUCKET_ID] and is needed to configure Stackdriver.
  3. Give the sink's writer identity, cloud-logs@system.gserviceaccount.com, the Storage Object Creator role in IAM.

Google Cloud Pub/Sub

  1. Create a Google Cloud Pub/Sub topic to receive the exported logs.
  2. Record the ID of the topic. It has the form pubsub.googleapis.com/projects/[PROJECT_ID]/topics/[TOPIC_ID] and is needed to configure Stackdriver.
  3. Give the sink's writer identity, cloud-logs@system.gserviceaccount.com, the Pub/Sub Publisher role in IAM.
  4. If using Google Kubernetes Engine, make sure the pubsub scope is enabled on the cluster.
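
The three destination formats recorded above can be assembled with a few small shell helpers. This is only an illustrative sketch: the function names and the example IDs (`my-project`, `istio_logs`, and so on) are assumptions made for this example.

```shell
# Placeholder identifiers; substitute the IDs you recorded in the steps above.
PROJECT_ID="my-project"
DATASET_ID="istio_logs"
BUCKET_ID="my-istio-logs"
TOPIC_ID="istio-logs"

# Each helper prints a sink destination in the form described for that sink.
bigquery_destination() {
  printf 'bigquery.googleapis.com/projects/%s/datasets/%s\n' "$1" "$2"
}

gcs_destination() {
  printf 'storage.googleapis.com/%s\n' "$1"
}

pubsub_destination() {
  printf 'pubsub.googleapis.com/projects/%s/topics/%s\n' "$1" "$2"
}

# Pick the destination for the sink you set up; BigQuery is used here.
SINK_DESTINATION="$(bigquery_destination "$PROJECT_ID" "$DATASET_ID")"
echo "$SINK_DESTINATION"
# prints: bigquery.googleapis.com/projects/my-project/datasets/istio_logs
```

The resulting string is the value intended for the sink destination field of the Stackdriver handler.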

Setting up Stackdriver

A Stackdriver handler must be created to export data to Stackdriver. The configuration schema for a Stackdriver handler can be found here, and the config proto on which the handler configuration is based can be found here.

  1. Save the following YAML as stackdriver.yaml. Replace <project_id>, <sink_id>, <sink_destination>, and <log_filter> with your own values.

        apiVersion: "config.istio.io/v1alpha2"
        kind: stackdriver
        metadata:
          name: handler
          namespace: istio-system
        spec:
          # We'll use the default value from the adapter, once per minute, so we don't need to supply a value.
          # pushInterval: 1m
          # Must be supplied for the Stackdriver adapter to work
          project_id: "<project_id>"
          # One of the following must be set; the preferred method is `appCredentials`, which corresponds to
          # Google Application Default Credentials. See:
          #    https://developers.google.com/identity/protocols/application-default-credentials
          # If none is provided we default to app credentials.
          # appCredentials:
          # apiKey:
          # serviceAccountPath:
          # Describes how to map Istio logs into Stackdriver.
          logInfo:
            accesslog.logentry.istio-system:
              payloadTemplate: '{{or (.sourceIp) "-"}} - {{or (.sourceUser) "-"}} [{{or (.timestamp.Format "02/Jan/2006:15:04:05 -0700") "-"}}] "{{or (.method) "-"}} {{or (.url) "-"}} {{or (.protocol) "-"}}" {{or (.responseCode) "-"}} {{or (.responseSize) "-"}}'
              httpMapping:
                url: url
                status: responseCode
                requestSize: requestSize
                responseSize: responseSize
                latency: latency
                localIp: sourceIp
                remoteIp: destinationIp
                method: method
                userAgent: userAgent
                referer: referer
              labelNames:
              - sourceIp
              - destinationIp
              - sourceService
              - sourceUser
              - sourceNamespace
              - destinationService
              - destinationNamespace
              - apiClaims
              - apiKey
              - protocol
              - method
              - url
              - responseCode
              - responseSize
              - requestSize
              - latency
              - connectionMtls
              - userAgent
              - responseTimestamp
              - receivedBytes
              - sentBytes
              - referer
              sinkInfo:
                id: '<sink_id>'
                destination: '<sink_destination>'
                filter: '<log_filter>'
        ---
        apiVersion: "config.istio.io/v1alpha2"
        kind: rule
        metadata:
          name: stackdriver
          namespace: istio-system
        spec:
          match: "true" # If omitted match is true.
          actions:
          - handler: handler.stackdriver
            instances:
            - accesslog.logentry
        ---
    
  2. Push the configuration:

    $ kubectl apply -f stackdriver.yaml
    stackdriver "handler" created
    rule "stackdriver" created
    logentry "stackdriverglobalmr" created
    metric "stackdriverrequestcount" created
    metric "stackdriverrequestduration" created
    metric "stackdriverrequestsize" created
    metric "stackdriverresponsesize" created
    
  3. Send traffic to the sample application.

    For the Bookinfo sample, visit http://$GATEWAY_URL/productpage in your web browser or issue the following command:

    $ curl http://$GATEWAY_URL/productpage
    
  4. Verify that logs are flowing through Stackdriver to the configured sink.

    • Stackdriver: Navigate to the Stackdriver Logs Viewer for your project and look under "GKE Container" -> "Cluster Name" -> "Namespace Id" for Istio access logs.
    • BigQuery: Navigate to the BigQuery Interface for your project. You should find a table with the prefix accesslog_logentry_istio in your sink dataset.
    • GCS: Navigate to the Storage Browser for your project. You should find a folder named accesslog.logentry.istio-system in your sink bucket.
    • Pub/Sub: Navigate to the Pub/Sub TopicList for your project. You should find a topic for accesslog in your sink topic.
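
The placeholder replacement in step 1 can also be scripted instead of done by hand. The sketch below assumes the unedited YAML is kept in a template file (the name `stackdriver.yaml.tmpl` is hypothetical) and that `PROJECT_ID`, `SINK_ID`, `SINK_DESTINATION`, and `LOG_FILTER` are set in the environment. The helper names are made up for this example. It also assumes the filter contains no single quotes, since the YAML wraps that value in single quotes.

```shell
# Escape characters that are special in a sed replacement when "|" is the
# delimiter: backslash, "&", and "|" itself.
escape_sed() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Print the template file given as $1 with the four placeholders filled in
# from the environment. "|" is the sed delimiter because sink destinations
# contain "/".
render_stackdriver_config() {
  sed -e "s|<project_id>|$(escape_sed "$PROJECT_ID")|g" \
      -e "s|<sink_id>|$(escape_sed "$SINK_ID")|g" \
      -e "s|<sink_destination>|$(escape_sed "$SINK_DESTINATION")|g" \
      -e "s|<log_filter>|$(escape_sed "$LOG_FILTER")|g" \
      "$1"
}

# Example usage (all values are illustrative):
#   PROJECT_ID="my-project" SINK_ID="istio-accesslog-sink"
#   SINK_DESTINATION="storage.googleapis.com/my-istio-logs"
#   LOG_FILTER='severity >= ERROR'
#   render_stackdriver_config stackdriver.yaml.tmpl > stackdriver.yaml
```

Rendering to a separate file keeps the template reusable when you switch between sinks.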

Understanding what happened

The stackdriver.yaml file above configures Istio to send access logs to Stackdriver and adds a sink configuration through which those logs are exported. In detail:

  1. Added a handler of kind stackdriver

        apiVersion: "config.istio.io/v1alpha2"
        kind: stackdriver
        metadata:
          name: handler
          namespace: <your defined namespace>
    
  2. Added logInfo in spec

        spec:
          logInfo:
            accesslog.logentry.istio-system:
              labelNames:
              - sourceIp
              - destinationIp
              ...
              ...
              sinkInfo:
                id: '<sink_id>'
                destination: '<sink_destination>'
                filter: '<log_filter>'
    

    In the above configuration, sinkInfo describes the sink to which you want the logs exported. For more information on how to fill it in for different sinks, refer here.

  3. Added a rule for Stackdriver

        apiVersion: "config.istio.io/v1alpha2"
        kind: rule
        metadata:
          name: stackdriver
          namespace: istio-system
        spec:
          match: "true" # If omitted match is true
          actions:
          - handler: handler.stackdriver
            instances:
            - accesslog.logentry
    

Cleanup

  • Remove the new Stackdriver configuration:

    $ kubectl delete -f stackdriver.yaml
    
  • If you are not planning to explore any follow-on tasks, refer to the Bookinfo cleanup instructions to shut down the application.

Availability of logs in export sinks

Logs exported to BigQuery are available within minutes (in our experience, almost instantly). Logs exported to GCS can be delayed by 2 to 12 hours, and logs exported to Pub/Sub are available almost immediately.