14 Commits

Author SHA1 Message Date
Kevin Lingerfelt ec2433e9bd
Update controller to use 'tls' metric label (#1044)
* Update controller to use 'tls' metric label
* Fix meshed column formatter

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-06-01 16:44:33 -07:00
Andrew Seigner 95f9f8dc35
Add meshed label support to Grafana (#1021)
The Grafana dashboards currently show Request Volume by ns/deploy/pod.

Add a `meshed` dimension to the Request Volume graphs, in anticipation
of the `meshed`/`secured` label from the proxy. Also increase `irate`
time window queries from `20s` to `30s`, per recommendation from
Prometheus team.
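
A minimal sketch of why a wider `irate` window can matter, using a simplified model of `irate` (per-second rate from the last two samples inside the window). The 10s scrape interval, the missed scrape, and the sample values are assumptions for illustration; the commit does not state them, and this is not Prometheus's implementation.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def irate(samples: List[Sample], now: float, window: float) -> Optional[float]:
    """Simplified irate: rate between the last two samples in [now - window, now].

    Assumes `samples` is sorted by timestamp.
    """
    in_window = [s for s in samples if now - window <= s[0] <= now]
    if len(in_window) < 2:
        return None  # fewer than two samples in the window: no data point
    (t0, v0), (t1, v1) = in_window[-2], in_window[-1]
    # A decrease is treated as a counter reset, so the new value is the delta.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / (t1 - t0)


# Hypothetical counter scraped every 10s, with the scrape at t=20 missed.
samples = [(0.0, 100.0), (10.0, 150.0), (30.0, 250.0)]

rate_20s = irate(samples, now=35.0, window=20.0)  # only one sample in window
rate_30s = irate(samples, now=35.0, window=30.0)  # two samples in window
```

Under these assumptions the `20s` window yields no value after a single missed scrape, while the `30s` window still has two samples to work with.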

Relates to #388.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-05-25 14:10:57 -07:00
Andrew Seigner 6fccdee58e
Stop special-casing conduit controller in Grafana (#984)
The Grafana dashboards were explicitly filtering out Conduit
control-plane data.

Remove control-plane filtering from Grafana dashboards. This brings
Grafana in line with web, and also encourages better dog-fooding of our
proxy metrics and dashboards. Also update Grafana to 5.1.3, update the
BUILD.md architecture diagram to include Prometheus and Grafana, and
introduce a Prometheus Benchmark dashboard, courtesy of Robust
Perception.

Fixes #908

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-05-23 13:47:20 -07:00
Andrew Seigner 1275b1ae89
Introduce Grafana, K8s, and Prom dashboards (#904)
Grafana provides default dashboards for Prometheus and Grafana health.
The community also provides Kubernetes-specific dashboards. Conduit was
not taking advantage of these.

Introduce new Grafana dashboards focused on Grafana, Kubernetes, and
Prometheus health. Tag all Conduit dashboards for easier UI navigation.
Also fix layout in Conduit Health dashboard.

Part of #420

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-05-08 23:11:43 +02:00
Andrew Seigner 5a5c6d14ab
Update Grafana to 5.1.0, handle missing data (#876)
Conduit 0.4.1 contained some rough edges in the Grafana deployment.

This PR includes the following:
- bump Grafana to 5.1.0
- fix deployment and rc graphs when no data present
- fix some text sections overlapping due to scrolling

Fixes #705

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-29 22:24:22 +02:00
Eliza Weisman d55e334a42
Add TCP stats to deployment dashboards (#824)
This PR adds the TCP metrics added in #785 and #790 to the Grafana deployment dashboards. I've added three new charts in the "Inbound Traffic" and "Outbound Traffic" headings:
+ "TCP Connection Failures": plots the number of failed TCP connections over time
+ "TCP Connections Open": shows the number of accepted and opened connections currently open
+ "TCP Connection Duration": a heatmap of connection durations over time

I'm planning on adding similar graphs to other dashboards as well in subsequent PRs.
2018-04-25 16:26:43 -07:00
Risha Mars aca09813fd
Add a Replication Controller grafana dashboard (#843)
* Add a Replication Controller grafana dashboard, very similar to the Deployment one
2018-04-25 10:57:41 -07:00
Andrew Seigner 326d9f493c
Fix top-line Grafana counts (#815)
The top-line single stat numbers were not calculated properly, resulting
in inflated counts.

Modify the underlying Prometheus queries to ensure accurate counts of
Deployments, Pods, and Namespaces.
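
The commit does not say what caused the inflation. One common cause, assumed here purely for illustration, is counting time series instead of distinct label values. The label names and series below are hypothetical.

```python
# Hypothetical time series: one per (namespace, deployment, pod, direction).
series = [
    {"namespace": "ns-a", "deployment": "web", "pod": "web-1", "direction": "inbound"},
    {"namespace": "ns-a", "deployment": "web", "pod": "web-1", "direction": "outbound"},
    {"namespace": "ns-a", "deployment": "web", "pod": "web-2", "direction": "inbound"},
    {"namespace": "ns-b", "deployment": "api", "pod": "api-1", "direction": "inbound"},
]

# Naive count: every series is counted, so the number is inflated.
naive_count = len(series)

# Distinct counts: deduplicate on the labels that identify each resource.
namespaces = {s["namespace"] for s in series}
deployments = {(s["namespace"], s["deployment"]) for s in series}
pods = {(s["namespace"], s["pod"]) for s in series}
```

Here the naive count is 4, while the distinct counts are 2 namespaces, 2 deployments, and 3 pods.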

Fixes #801.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-19 18:01:06 -07:00
Andrew Seigner c26955186b
Introduce service-centric Grafana dashboard (#810)
Conduit's Grafana currently provides Top-line, Deployment, Pod, and Mesh
Health dashboards.

This change adds a new Conduit Service dashboard. In addition to
top-line information, this dashboard focuses primarily on requests to a
Service, as only dst_service is available in our metrics.

Part of #706

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-19 17:32:42 -07:00
Kevin Lingerfelt 71a51afb40
Expose pod stats in CLI, web UI, and Grafana (#788)
* Expose pod stats in CLI, web UI, and Grafana
* Fix js api helpers test
* Add outbound traffic stats to pod dashboard

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-04-18 11:26:47 -07:00
Andrew Seigner c9cdd838dc
Standardize and polish Grafana for 0.4.0 release (#766)
The top-line, deployments, and health Grafana dashboards had
inconsistent layouts and data.

This change standardizes our Grafana dashboards. Every row is composed
of Success Rate, Request Rate, and Latency.

Part of #420.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-13 18:01:44 -07:00
Andrew Seigner b6bcdcc059
Namespace-aware Grafana dashboards (#716)
The Grafana dashboards keyed off of deployment, but had no awareness of
namespaces, causing incorrect metrics aggregation and display.

This change makes the Grafana dashboards key off of namespaces, and also
modifies the Grafana links in the Conduit dashboard to link to
namespace+deployment.
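
A minimal sketch of the aggregation problem described above: two deployments with the same name in different namespaces get merged when the key is the deployment alone. The namespaces, counts, and the `success_rate_by` helper are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

# (namespace, deployment, successful requests, total requests); made-up data.
Row = Tuple[str, str, int, int]
rows: List[Row] = [
    ("prod", "web", 90, 100),
    ("staging", "web", 10, 100),
]


def success_rate_by(rows: List[Row], key: Callable[[Row], Hashable]) -> Dict[Hashable, float]:
    """Sum successes and totals per key, then compute the success rate."""
    ok: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for row in rows:
        k = key(row)
        ok[k] += row[2]
        total[k] += row[3]
    return {k: ok[k] / total[k] for k in total}


# Keyed by deployment only: both "web" deployments collapse into one entry.
by_deployment = success_rate_by(rows, key=lambda r: r[1])

# Keyed by namespace + deployment: each deployment keeps its own rate.
by_ns_deployment = success_rate_by(rows, key=lambda r: (r[0], r[1]))
```

With this data the deployment-only key reports a single 50% success rate for `web`, hiding the 90% and 10% rates of the two distinct deployments.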

Fixes #704
Part of #420

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-06 15:37:53 -07:00
Andrew Seigner 9508e11b45
Build conduit-specific Grafana Docker image (#679)
Using a vanilla Grafana Docker image as part of `conduit install`
avoided maintaining a conduit-specific Grafana Docker image, but made
packaging dashboard json files cumbersome.

Roll our own Grafana Docker image that includes conduit-specific
dashboard json files. This significantly decreases the `conduit install`
output size, and enables dashboard integration in the docker-compose
environment.

Fixes #567
Part of #420

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-05 14:20:05 -07:00
Andrew Seigner 12c6531546
Update docker-compose environment to match prod (#609)
The Prometheus config in the docker-compose environment had fallen
behind the prod setup.

This change updates the docker-compose environment in the following
ways:
- Prometheus config more closely matches prod, based on #583
- simulate-proxy labels match prod, based on #605
- add Grafana container

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-03-23 17:00:39 -07:00