This PR removes the unused `request_duration_ms` and `response_duration_ms` histogram metrics from the proxy. It also removes them from the `simulate-proxy` script's output and from `docs/proxy-metrics.md`.
Closes #821
The Destination service does not provide ReplicaSet information to the
proxy.
The `pod-template-hash` label approximates a selection of all pods in a
ReplicaSet or ReplicationController. Modify the Destination service to
provide this label to the proxy.
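For illustration only (not the actual Destination service code), a minimal Go sketch of what forwarding this label might look like; the `addrLabels` helper and the output label name are hypothetical:

```go
package main

import "fmt"

// addrLabels is a hypothetical stand-in for the label metadata the
// Destination service could stream to the proxy alongside each endpoint.
func addrLabels(podLabels map[string]string) map[string]string {
	out := map[string]string{}
	// Forward the pod-template-hash label so that metrics can be grouped
	// by ReplicaSet / ReplicationController.
	if hash, ok := podLabels["pod-template-hash"]; ok {
		out["pod_template_hash"] = hash
	}
	return out
}

func main() {
	podLabels := map[string]string{
		"app":               "emoji",
		"pod-template-hash": "3676751344",
	}
	fmt.Println(addrLabels(podLabels))
}
```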
Relates to #508 and #741
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Conduit 0.4.0 overhauls Conduit's telemetry system and improves service discovery
reliability.
* Web UI
* **New** automatically-configured Grafana dashboards for all Deployments.
* Command-line interface
* `conduit stat` has been completely rewritten to accept arguments like `kubectl get`.
The `--to` and `--from` filters can be used to filter traffic by destination and
source, respectively. `conduit stat` currently can operate on `Namespace` and
`Deployment` Kubernetes resources. More resource types will be added in the next
release!
* Proxy (data plane)
* **New** Prometheus-formatted metrics are now exposed on `:4191/metrics`, including
rich destination labeling for outbound HTTP requests. The proxy no longer pushes
metrics to the control plane.
* The proxy now handles `SIGINT` or `SIGTERM`, gracefully draining requests until all
are complete or `SIGQUIT` is received.
* SMTP and MySQL (ports 25 and 3306) are now treated as opaque TCP by default. You
should no longer have to specify `--skip-outbound-ports` to communicate with such
services.
* Previously, when the proxy reconnected to the controller, it could continue to send
requests to old endpoints. Now, the proxy properly removes invalid endpoints when it
reconnects.
* A bug impacting some HTTP/2 reset scenarios has been fixed.
* Service Discovery
* Previously, the proxy failed to resolve some domain names that could be misinterpreted
as a Kubernetes Service name. This has been fixed by extending the _Destination_ API
with a negative acknowledgement response.
* Control Plane
* The _Telemetry_ service and associated APIs have been removed.
* Documentation
* Updated Roadmap
* Added Prometheus metrics guide
The Destination service used slightly different labels from those the
telemetry pipeline expected; specifically, labels prefixed with `k8s_`.
Make all Prometheus labels consistent by dropping the `k8s_` prefix. Also
rename `pod_name` to `pod` for consistency with `deployment`, etc., and
update and reorganize `proxy-metrics.md` to reflect the new labeling.
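A minimal sketch of the renaming rules, assuming a simple string rewrite over label names; `normalizeLabel` is a hypothetical helper for illustration, not code from this change:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLabel illustrates the renaming rules described above: drop the
// "k8s_" prefix, and use "pod" rather than "pod_name" to match labels
// like "deployment".
func normalizeLabel(name string) string {
	name = strings.TrimPrefix(name, "k8s_")
	if name == "pod_name" {
		name = "pod"
	}
	return name
}

func main() {
	for _, old := range []string{"k8s_deployment", "k8s_pod_template_hash", "pod_name"} {
		fmt.Printf("%s -> %s\n", old, normalizeLabel(old))
	}
}
```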
Fixes #655
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Previously, we used the `instance` label to uniquely identify a pod.
This meant that getting stats by pod name required extra queries to
Kubernetes to map pod name to instance.
This change adds a `pod_name` label to metrics at collection time. This
should not affect cardinality, as `pod_name` is invariant with respect
to `instance`.
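To make the cardinality argument concrete, a small self-contained Go sketch (with made-up instance and pod names) showing that adding a label which is a pure function of `instance` does not create new label combinations:

```go
package main

import "fmt"

func main() {
	// Hypothetical scrape targets: each proxy instance (ip:port) belongs to
	// exactly one pod, so pod_name is a function of instance.
	instanceToPod := map[string]string{
		"10.1.3.4:4191": "emoji-7dc976fc95-bkq9t",
		"10.1.3.5:4191": "voting-5f9c8b7d6b-x2jtp",
		"10.1.3.6:4191": "web-6b8c4d9f77-qm4lz",
	}

	withoutPod := map[string]bool{}
	withPod := map[string]bool{}
	for inst, pod := range instanceToPod {
		withoutPod[inst] = true
		withPod[inst+"|"+pod] = true
	}

	// Because pod_name never varies for a given instance, the number of
	// distinct label combinations is unchanged.
	fmt.Println(len(withoutPod), len(withPod)) // 3 3
}
```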
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR adds a `classification` label to proxy response metrics, as @olix0r described in https://github.com/runconduit/conduit/issues/634#issuecomment-376964083. The label is either "success" or "failure", depending on the following rules:
+ **if** the response had a gRPC status code, *then*
- gRPC status code 0 is considered a success
- all others are considered failures
+ **else if** the response had an HTTP status code, *then*
- status codes < 500 are considered successes,
- status codes >= 500 are considered failures
+ **else if** the response stream failed, *then*
- the response is a failure.
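For illustration, here are the rules above sketched in Go (the proxy itself is implemented in Rust, so the `response` type and `classify` function are purely hypothetical, not the actual implementation):

```go
package main

import "fmt"

// response is a hypothetical summary of a completed (or failed) response.
type response struct {
	grpcStatus *uint32 // from the grpc-status trailer, if present
	httpStatus *int    // HTTP status code, if the response started
	streamErr  bool    // the response stream failed before completing
}

// classify applies the rules described above.
func classify(r response) string {
	switch {
	case r.grpcStatus != nil:
		if *r.grpcStatus == 0 {
			return "success"
		}
		return "failure"
	case r.httpStatus != nil:
		if *r.httpStatus < 500 {
			return "success"
		}
		return "failure"
	case r.streamErr:
		return "failure"
	}
	// Not covered by the rules above; treated as a success for illustration.
	return "success"
}

func main() {
	ok, notFound, internal := uint32(0), 404, 500
	fmt.Println(classify(response{grpcStatus: &ok}))       // success
	fmt.Println(classify(response{httpStatus: &notFound})) // success
	fmt.Println(classify(response{httpStatus: &internal})) // failure
	fmt.Println(classify(response{streamErr: true}))       // failure
}
```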
I've also added end-to-end tests for the classification of HTTP responses (with some work towards classifying gRPC responses as well). Additionally, I've updated `doc/proxy_metrics.md` to reflect the added `classification` label.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The Prometheus scrape config collects from Conduit proxies, and maps
Kubernetes labels to Prometheus labels, adding a `k8s_` prefix.
This change keeps the resultant Prometheus labels consistent with their
source Kubernetes labels: for example, `deployment` and
`pod_template_hash`.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The inject code detects the object it is being injected into, and writes
self-identifying information into the `CONDUIT_PROMETHEUS_LABELS`
environment variable, so that `conduit-proxy` may read this information
and report it to Prometheus at collection time.
This change puts the self-identifying information directly into
Kubernetes labels, which Prometheus already collects, removing the need
for `conduit-proxy` to be aware of this information. The resulting label
in Prometheus is recorded in the form `k8s_deployment`.
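For context, a rough sketch of the indirection this change removes; the comma-separated key=value parsing format shown is an assumption for illustration, not necessarily the exact encoding `conduit-proxy` used:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// labelsFromEnv sketches the old approach being removed here: the proxy
// parses CONDUIT_PROMETHEUS_LABELS and attaches the result to the metrics
// it reports. The key=value format is assumed for illustration.
func labelsFromEnv() map[string]string {
	labels := map[string]string{}
	for _, pair := range strings.Split(os.Getenv("CONDUIT_PROMETHEUS_LABELS"), ",") {
		if kv := strings.SplitN(pair, "=", 2); len(kv) == 2 {
			labels[kv[0]] = kv[1]
		}
	}
	return labels
}

func main() {
	os.Setenv("CONDUIT_PROMETHEUS_LABELS", "deployment=emoji,namespace=emojivoto")
	fmt.Println(labelsFromEnv())
	// With this change, none of this is needed: Prometheus picks up the same
	// information from Kubernetes pod labels at scrape time.
}
```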
Signed-off-by: Andrew Seigner <siggy@buoyant.io>