+++
title = "Proxy Metrics"
docpage = true
[menu.docs]
  parent = "proxy-metrics"
+++

The Conduit proxy exposes metrics that describe the traffic flowing through it. The following metrics are available at /metrics on the proxy's metrics port (default: :4191) in the Prometheus exposition format:

Protocol-Level Metrics

request_total

A counter of the number of requests the proxy has received. This is incremented when the request stream begins.

response_total

A counter of the number of responses the proxy has received. This is incremented when the response stream ends.

response_latency_ms

A histogram of the total latency of a response, in milliseconds. This is measured from when the request headers are received to when the response stream has completed.
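Prometheus exposes a histogram as cumulative bucket series, each keyed by an le (upper bound) label. As a rough sketch, a latency percentile can be estimated from those buckets by linear interpolation, similar in spirit to PromQL's histogram_quantile. The function name estimate_quantile and the bucket bounds and counts in the usage note are invented for illustration; they are not the proxy's actual bucket layout.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound; the last bound may be math.inf, as in Prometheus output.
    """
    if not buckets:
        raise ValueError("no buckets")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Cannot interpolate inside the +Inf bucket; report the
                # highest finite bound instead.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            # Assume observations are spread evenly within the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, with the hypothetical buckets [(10, 50), (100, 90), (1000, 100), (math.inf, 100)], estimate_quantile(0.5, ...) falls at the top of the first bucket. The estimate is only as precise as the bucket boundaries allow.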

Labels

Each of these metrics has the following labels:

  • authority: The value of the :authority (HTTP/2) or Host (HTTP/1.1) header of the request.
  • direction: inbound if the request originated from outside of the pod, outbound if the request originated from inside of the pod.
  • meshed: false if the request's source and destination were not in the mesh, true if the request's source and destination were in the mesh.

Response Labels

The following labels are only applicable on response_* metrics.

  • classification: success if the response was successful, or failure if a server error occurred. This classification is based on the gRPC status code if one is present, and on the HTTP status code otherwise.
  • grpc_status_code: The value of the grpc-status trailer. Only applicable to gRPC responses.
  • status_code: The HTTP status code of the response.
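The classification rule can be sketched as a small function. This is an illustration, not the proxy's code: the function name classify is hypothetical, and it assumes that a grpc-status of 0 (OK) means success, that any other gRPC status means failure, and that "server error" means an HTTP 5xx status.

```python
from typing import Optional


def classify(status_code: int, grpc_status_code: Optional[int] = None) -> str:
    """Return the value of the classification label for one response."""
    if grpc_status_code is not None:
        # Assumption: grpc-status 0 (OK) is the only successful gRPC status.
        return "success" if grpc_status_code == 0 else "failure"
    # Assumption: only HTTP 5xx responses count as server errors.
    return "failure" if 500 <= status_code <= 599 else "success"
```

Under these assumptions, a gRPC response with HTTP status 200 and a non-zero grpc-status is classified as a failure, while a plain HTTP 404 is classified as a success because it is a client error rather than a server error.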

Outbound Labels

The following labels are only applicable if direction=outbound.

  • dst_deployment: The deployment to which this request is being sent.
  • dst_k8s_job: The job to which this request is being sent.
  • dst_replica_set: The replica set to which this request is being sent.
  • dst_daemon_set: The daemon set to which this request is being sent.
  • dst_stateful_set: The stateful set to which this request is being sent.
  • dst_replication_controller: The replication controller to which this request is being sent.
  • dst_namespace: The namespace to which this request is being sent.
  • dst_service: The service to which this request is being sent.
  • dst_pod_template_hash: The pod-template-hash of the pod to which this request is being sent. This label selector roughly approximates a pod's ReplicaSet or ReplicationController.
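As an illustration of how these labels can be used, the following sketch parses Prometheus text-format samples and sums outbound request_total values per dst_deployment. The helper names are hypothetical and the parser is deliberately minimal: it skips comment lines, ignores timestamps, and does not unescape label values.

```python
import re
from collections import defaultdict

# name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric_name, labels, value) for each sample line in `text`."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def outbound_requests_by_deployment(text):
    """Sum request_total over outbound samples, grouped by dst_deployment."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == "request_total" and labels.get("direction") == "outbound":
            totals[labels.get("dst_deployment", "")] += value
    return dict(totals)
```

In practice this kind of aggregation is usually done in Prometheus itself; the sketch only shows which labels are involved.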

Prometheus Collector labels

The following labels are added by the Prometheus collector.

  • instance: ip:port of the pod.
  • job: The Prometheus job responsible for the collection, typically conduit-proxy.

Kubernetes labels added at collection time

Kubernetes namespace, pod name, and all labels are mapped to corresponding Prometheus labels.

  • namespace: Kubernetes namespace that the pod belongs to.
  • pod: Kubernetes pod name.
  • pod_template_hash: Corresponds to the pod-template-hash Kubernetes label. This value changes during redeploys and rolling restarts. This label selector roughly approximates a pod's ReplicaSet or ReplicationController.

Conduit labels added at collection time

Kubernetes labels prefixed with conduit.io/ are added to your application at conduit inject time. More specifically, Kubernetes labels prefixed with conduit.io/proxy-* will correspond to these Prometheus labels:

  • daemon_set: The daemon set that the pod belongs to (if applicable).
  • deployment: The deployment that the pod belongs to (if applicable).
  • k8s_job: The job that the pod belongs to (if applicable).
  • replica_set: The replica set that the pod belongs to (if applicable).
  • replication_controller: The replication controller that the pod belongs to (if applicable).
  • stateful_set: The stateful set that the pod belongs to (if applicable).

Example

Here's a concrete example, given the following pod snippet:

name: vote-bot-5b7f5657f6-xbjjw
namespace: emojivoto
labels:
  app: vote-bot
  conduit.io/control-plane-ns: conduit
  conduit.io/proxy-deployment: vote-bot
  pod-template-hash: "3957278789"
  test: vote-bot-test

The resulting Prometheus labels will look like this:

request_total{
  pod="vote-bot-5b7f5657f6-xbjjw",
  namespace="emojivoto",
  app="vote-bot",
  conduit_io_control_plane_ns="conduit",
  deployment="vote-bot",
  pod_template_hash="3957278789",
  test="vote-bot-test",
  instance="10.1.3.93:4191",
  job="conduit-proxy"
}
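The mapping in this example can be approximated with a short sketch. The function names are hypothetical, and the rules are inferred only from the example above: the conduit.io/proxy- prefix is dropped, and any character that is not valid in a Prometheus label name becomes an underscore. Other labels, such as k8s_job, may follow a different rule, and instance and job are added by the collector rather than derived from pod labels.

```python
import re

PROXY_PREFIX = "conduit.io/proxy-"


def to_prometheus_label(k8s_label: str) -> str:
    """Map a Kubernetes label key to a Prometheus label name (inferred rule)."""
    if k8s_label.startswith(PROXY_PREFIX):
        k8s_label = k8s_label[len(PROXY_PREFIX):]
    # Prometheus label names may only contain letters, digits, and underscores.
    return re.sub(r"[^a-zA-Z0-9_]", "_", k8s_label)


def prometheus_labels(pod_name: str, namespace: str, k8s_labels: dict) -> dict:
    """Build the pod-derived Prometheus labels for one pod."""
    labels = {"pod": pod_name, "namespace": namespace}
    for key, value in k8s_labels.items():
        labels[to_prometheus_label(key)] = value
    return labels
```

Calling prometheus_labels with the pod snippet above reproduces every label in the example except instance and job.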

Transport-Level Metrics

The following metrics are collected at the level of the underlying transport layer.

tcp_open_total

A counter of the total number of opened transport connections.

tcp_close_total

A counter of the total number of transport connections that have closed.

tcp_open_connections

A gauge of the number of transport connections currently open.

tcp_write_bytes_total

A counter of the total number of sent bytes. This is updated when the connection closes.

tcp_read_bytes_total

A counter of the total number of received bytes. This is updated when the connection closes.

tcp_connection_duration_ms

A histogram of connection lifetimes, in milliseconds. This is updated when the connection closes.
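To make the update timing concrete, here is a toy model of how these metrics change over a connection's lifetime, based only on the descriptions above. The class TransportMetrics and its method names are hypothetical, and the histogram is simplified to a list of raw observations; this is not how the proxy is implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TransportMetrics:
    tcp_open_total: int = 0
    tcp_close_total: int = 0
    tcp_open_connections: int = 0
    tcp_write_bytes_total: int = 0
    tcp_read_bytes_total: int = 0
    # Simplification: raw observations instead of histogram buckets.
    tcp_connection_duration_ms: List[float] = field(default_factory=list)

    def on_open(self) -> None:
        """A connection opened: bump the counter and the gauge."""
        self.tcp_open_total += 1
        self.tcp_open_connections += 1

    def on_close(self, written: int, read: int, duration_ms: float) -> None:
        """A connection closed: byte counts and duration are recorded only now."""
        self.tcp_close_total += 1
        self.tcp_open_connections -= 1
        self.tcp_write_bytes_total += written
        self.tcp_read_bytes_total += read
        self.tcp_connection_duration_ms.append(duration_ms)
```

One consequence of this timing is that bytes transferred over a long-lived connection do not appear in the byte counters until that connection closes.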

Labels

Each of these metrics has the following labels:

  • direction: inbound if the connection was established either from outside the pod to the proxy or from the proxy to the application; outbound if the connection was established either from the application to the proxy or from the proxy to outside the pod.
  • peer: src if the proxy accepted the connection from the source; dst if the proxy opened the connection to the destination.
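The four combinations of direction and peer can be written out as a lookup table. The endpoint names "outside", "proxy", and "application" and the function transport_labels are hypothetical, chosen only to mirror the wording of the two label descriptions.

```python
# (connection source, connection destination) -> labels
TRANSPORT_LABELS = {
    ("outside", "proxy"): {"direction": "inbound", "peer": "src"},
    ("proxy", "application"): {"direction": "inbound", "peer": "dst"},
    ("application", "proxy"): {"direction": "outbound", "peer": "src"},
    ("proxy", "outside"): {"direction": "outbound", "peer": "dst"},
}


def transport_labels(source: str, destination: str) -> dict:
    """Return the direction and peer labels for one transport connection."""
    try:
        # Copy so callers cannot mutate the shared table.
        return dict(TRANSPORT_LABELS[(source, destination)])
    except KeyError:
        raise ValueError(f"unsupported connection: {source} -> {destination}")
```

Each request that passes through a proxy therefore involves two transport connections with the same direction: one with peer="src" that the proxy accepted, and one with peer="dst" that the proxy opened.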

Note that the labels described above under the heading "Prometheus Collector labels" are also added to transport-level metrics, when applicable.

Connection Close Labels

The following labels are added only to metrics which are updated when a connection closes (tcp_close_total and tcp_connection_duration_ms):

  • classification: success if the connection terminated cleanly, failure if the connection closed due to a connection failure.