linkerd2

Commit Graph

Author	SHA1	Message	Date
Alejandro Pedraza	8e6680ba1d	Parametrized datasource in grafana dashboards, better script handling (#7603 ) Added the `DS_PROMETHEUS` parameter in all the Grafana dashboard definitions. When importing a definition into a Grafana Cloud instance, for example, the import form will allow selecting the datasource from those currently available. OTOH, when using an in-cluster instance such as one installed through the Grafana helm chart, the parameter gets overridden with the `values.yaml` entry `dashboards.default.{name}.datasource`. Also, the javascript snippet used in the dashboard definitions for checking for the latest linkerd version has been wrapped in a hidden div. This avoids showing the script itself when it gets escaped while importing the definition into Grafana Cloud.	2022-01-14 11:30:19 -05:00
Alejandro Pedraza	67dfebb259	Stop shipping grafana-based image (#7567 ) * Stop shipping grafana-based image Fixes #6045 #7358 With this change we stop building a Grafana-based image preloaded with the Linkerd Grafana dashboards. Instead, we recommend that users install Grafana themselves, and we provide a file `grafana/values.yaml` with a default config that points to all the same Grafana dashboards we had, which are now hosted in https://grafana.com/orgs/linkerd/dashboards . The new file `grafana/README.md` contains instructions for installing the official Grafana Helm chart, and mentions other available methods. The `grafana.enabled` flag has been removed, and `grafanaUrl` has been moved to `grafana.url`. This will help consolidate other grafana settings that might emerge, in particular when #7429 gets addressed. ## Dashboard definition changes The dashboard definitions under `grafana/dashboards` (which should be kept in sync with what's published in https://grafana.com/orgs/linkerd/dashboards) were updated by adding the `__inputs`, `__elements` and `__requires` entries at the beginning, which were required in order to be published.	2022-01-11 14:47:40 -05:00
Alejandro Pedraza	72a0cba83f	Fix some grafana links (#7306 ) As per how links to grafana charts are [built](`b0a799eee7/web/app/js/components/GrafanaLink.jsx (L7)`), all the chart's UUIDs should be prefixed by `linkerd-`. So this fixes the broken links to charts for deployments, cronjobs, jobs, daemonsets, replicasets, replicationcontrollers and statefulsets.	2021-11-17 17:50:46 -05:00
Alejandro Pedraza	b8ed799372	Include viz components in Prom scrapes, fix Linkerd Health charts (#5656 ) * Include viz components in Prom scrapes, fix Linkerd Health charts Fixes #5429 Expanded the `linkerd-controller` Prometheus scraping config so it also includes the `linkerd-viz` namespace. Also simplified the first relabelling config there by removing the `_meta_kubernetes_pod_label_linkerd_io_control_plane_component` source label, which wasn't serving any purpose. On its own, that extra scraping now allows having non-empty Go charts at the bottom of the `Linkerd Health` charts for the viz components. Additionally, the `namespace-viz` variable was added to `health.json`, which is then leveraged in the queries for the `Control-Plane Traffic` and `Control-Plane TCP Metrics` charts to include the viz pods. Finally, in that same file the queries for the `Data-Plane Telemetry` section were simplified by removing the filter on the `control_plane_ns` label, which was redundant.	2021-02-04 09:40:23 -05:00
Alejandro Pedraza	2f8d669890	Fixed multicluster Grafana chart (#5114 ) The graphs were empty because they were relying on the metric label `dst_target_gateway` which is no longer relevant.	2020-10-21 10:06:37 -05:00
aimbot31	7c08fffd8a	Fix kubernetes grafana dashboard (#4380 ) (#5012 ) Prometheus uses a relabel rule that has changed since 1.16. Use "pod_name" and "pod" to avoid breaking changes. Also use "container" and "container_name" for the same reason. Fixes #4380 Signed-off-by: Florian Davasse <florian.davasse@stack-labs.com>	2020-09-29 11:28:53 -05:00
Alejandro Pedraza	5eb890e735	Upgrade Grafana to v7.1.5 to get CVE fixes (#4981 ) Fixes #4884 Upgrades the underlying Alpine base distro, which resolves [CVE-2020-12723](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-12723) and [CVE-2020-13777](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13777) I tested Grafana continues to work as expected.	2020-09-21 09:12:42 -05:00
Josh Soref	72aadb540f	Spelling (#4872 ) This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling). The misspellings have been reported at `aaf440489e (commitcomment-41423663)` The action reports that the changes in this PR would make it happy: `5b82c6c5ca` Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately. Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-08-12 21:59:50 -07:00
Suraj Deshmukh	d7dbe9cbff	Fix spelling mistakes using codespell (#4700 ) The misspellings were found with the following command and then fixed: ``` codespell --skip CHANGES.md,.git,go.sum,\ controller/cmd/service-mirror/events_formatting.go,\ controller/cmd/service-mirror/cluster_watcher_test_util.go,\ SECURITY_AUDIT.pdf,.gcp.json.enc,web/app/img/favicon.png \ --ignore-words-list=aks,uint,ans,files\' --check-filenames \ --check-hidden ``` Signed-off-by: Suraj Deshmukh <surajd.service@gmail.com>	2020-07-07 17:07:22 -05:00
cpretzer	b176fbeb6d	Upgrade Grafana to 7.0.3 (#4600 ) * Upgrade Grafana to 7.0.3 * use go netdns to avoid DNS resolution errors on alpine Signed-off-by: Charles Pretzer <charles@buoyant.io>	2020-06-17 21:35:29 -07:00
Zahari Dichev	3365455e45	Fix mc labels (#4560 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-06-05 19:36:09 +03:00
Kevin Leimkuhler	8a932ac905	Change text to use source/target terminology in events and metrics (#4527 ) Change terminology from local/remote to source/target in events and metrics. This does not change any variable, function, struct, or field names since testing is still improving Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2020-06-03 15:02:39 -04:00
Zahari Dichev	ef1a2c2b10	Multicluster dashboard for traffic metrics (#4178 ) This change adds labels to endpoints that target remote services. It also adds a Grafana dashboard that can be used to monitor multicluster traffic. Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-14 17:48:27 +03:00
Alex Lundberg	0d4d2dca65	Add route dashboard to grafana instance (#4155 ) * Add route dashboard to grafana instance Fixes #1737 Signed-off-by: alex lundberg <alex.lundberg@commonbond.co>	2020-03-27 09:16:00 -05:00
mmiller1	c39b698525	Use a globbing operator in top-line metrics dashboard for the "all" value (#4057 ) * use custom all values for top line dashboard * convert remaining allValue params to wildcard glob Signed-off-by: Matt Miller <mamiller@rosettastone.com>	2020-03-10 10:08:48 -07:00
Sergio C. Arteaga	cee8e3d0ae	Add CronJobs and ReplicaSets to dashboard and CLI (#3687 ) This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. Closes #3614 Closes #3630 Closes #3584 Closes #3585 Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com	2019-12-11 10:02:37 -08:00
Ivan Sim	5e51208b5d	Increase the Grafana dashboards refresh interval (#3464 ) Signed-off-by: Ivan Sim <ivan@buoyant.io>	2019-09-23 14:47:59 -07:00
Pascal Bourque	b65207213e	Added a "Linkerd Namespace" Grafana dashboard (#3301 ) Closes #3299 Signed-off-by: Pascal Bourque <pascal@studyo.co>	2019-08-21 17:30:38 -07:00
Thomas Rampelberg	ca5b4fab2e	Add container metrics and grafana dashboard (#3217 ) * Add container metrics and grafana dashboard * Review cleanup * Update templates	2019-08-12 08:03:57 -07:00
Andrew Seigner	48a69cb88a	Bump Prometheus to 2.11.1, Grafana to 6.2.5 (#3123 ) - set `disable_sanitize_html` in `grafana.ini`. - make all text box dimensions whole integers to fix dropdown issue, reported in: https://github.com/linkerd/linkerd2/issues/2955#issuecomment-503085444 - rev all dashboards to `schemaVersion` 18 for Grafana 6.2.5 - `prometheus-benchmark.json` based on: https://grafana.com/grafana/dashboards/9761 - `prometheus.json` based on: `69c93e6401/public/app/plugins/datasource/prometheus/dashboards/prometheus_2_stats.json` - `grafana.json` based on: `85aed0276e/public/app/plugins/datasource/prometheus/dashboards/grafana_stats.json` Fixes #2955 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-23 13:37:56 -07:00
Andrew Seigner	81790b6735	Bump Prometheus to v2.10.0 (#2979 ) Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-06-21 12:51:31 -07:00
Gaurav Kumar	cbcd201715	Add TCP stats to the Linkerd Pod Grafana dashboard (#2329 ) (#2477 ) * Add TCP stats to the Linkerd Pod Grafana dashboard (#2329) * Minimize tcp stats and link it to dashboard tcp tables * Add rows to fix minimization issues Signed-off-by: Gaurav Kumar <gaurav.kumar9825@gmail.com>	2019-03-14 14:49:13 -07:00
Tarun Pothulapati	8f6c63d5ea	Added Jobs Resource to Linkerd Dashboard along with grafana. (#2439 ) Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2019-03-06 17:06:46 -08:00
Andrew Seigner	8384f1eb56	Ensure shared tooltips in Linkerd Health dashboard (#2324 ) All Grafana graphs use shared tooltips (display all series in the tooltip rather than the one currently moused-over), except for 3 graphs in the Linkerd Health dashboard. This change ensures all tooltips are shared. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-19 15:55:36 -08:00
Risha Mars	ee18a7fe31	Modify the grafana variable queries to use a tcp-based metric (#2272 ) Currently, we use request_total for the variable query to determine the names in the grafana dropdowns. We should use a non-http-based metric instead, so that if there is only TCP traffic, the dropdowns will still be populated. This branch uses process_start_time_seconds instead of the http-based request_total to query for grafana variables	2019-02-19 13:46:02 -08:00
Andrew Seigner	1df1683b6a	Instrument k8s clients (#2243 ) The control-plane's clients, specifically the Kubernetes clients, did not provide telemetry information. Introduce a `prometheus.ClientWithTelemetry` wrapper to instrument arbitrary clients. Apply this wrapper to Kubernetes clients. Fixes #2183 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-18 09:10:02 -08:00
Andrew Seigner	a9b9908908	Bump Prometheus to v2.7.1, Grafana to 5.4.3 (#2242 ) Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-13 11:27:15 -08:00
Ivan Sim	f6e75ec83a	Add statefulsets to the dashboard and CLI (#2234 ) Fixes #1983 Signed-off-by: Ivan Sim <ivan@buoyant.io>	2019-02-08 15:37:44 -08:00
zak	8c413ca38b	Wire up stats commands for daemonsets (#2006 ) (#2086 ) DaemonSet stats are not currently shown in the cli stat command, web ui or grafana dashboard. This commit adds daemonset support for stat. Update stat command's help message to reference daemonsets. Update the public-api to support stats for daemonsets. Add tests for stat summary and api. Add daemonset get/list/watch permissions to the linkerd-controller cluster role that's created using the install command. Update golden expectation test files for install command yaml manifest output. Update web UI with daemonsets Update navigation, overview and pages to list daemonsets and the pods associated to them. Add daemonset paths to server, and ui apps. Add grafana dashboard for daemonsets; a clone of the deployment dashboard. Update dependencies and dockerfile hashes Add DaemonSet support to tap and top commands Fixes of #2006 Signed-off-by: Zak Knill <zrjknill@gmail.com>	2019-01-24 14:34:13 -08:00
Kevin Lingerfelt	a27bb2e0ce	Proxy grafana requests through web service (#2039 ) * Proxy grafana requests through web service * Fix -grafana-addr default, clarify -api-addr flag * Fix version check in grafana dashboards * Fix comment typo Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2019-01-04 16:07:57 -08:00
Kevin Lingerfelt	37ae423bb3	Add linkerd- prefix to all objects in linkerd install (#1920 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-12-04 15:41:47 -08:00
Oliver Gould	747fd328e9	grafana: Show TCP closes by errno (#1839 ) linkerd/linkerd2-proxy#116 removes the `classification` label for the `tcp_close_total` metric because TCP sockets that close with an error do not actually indicate any sort of failure -- many graceful shutdown situations can still cause a socket error. This change uses the `errno` label to enumerate tcp_close_total metrics.	2018-11-02 10:20:11 -07:00
Alejandro Pedraza	338848d2bc	Add Grafana dashboard for Authorities (#1772 ) * Add Grafana dashboard for Authorities Proposal for #1225 Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Implement code review suggestions Modified Inbound by Deployment and Inbound by Pod graphs according to klingerf's feedback. Removed template variables values. Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>	2018-10-18 13:56:13 -07:00
Andrew Seigner	dccccebd79	Add LICENSE files to all Docker images (#1727 ) To comply with certain environments, include our LICENSE file in all Docker images. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-10-02 16:25:52 -07:00
Kevin Lingerfelt	12b10e27c1	Update version checks to support release channels (#1667 ) * Update version checks to support release channels * Update based on review feedback * Fix sidebar tests * Update CI config for edge and stable tags Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-17 17:13:50 -07:00
Andrew Seigner	b708378d07	Add version check to Grafana dashboard (#1638 ) * Add version check to Grafana dashboard The web dashboard checks the local Linkerd version against the latest release, and informs the user if an update is available. Grafana was not doing this. Modify the Grafana dashboard to perform a version check, and prompt the user to update if needed. Fixes #1607 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-09-13 15:28:44 -07:00
Andrew Seigner	bae05410fd	Bump Prometheus to v2.4.0, Grafana to 5.2.4 (#1625 ) Prometheus v2.3.1 -> v2.4.0 Grafana 5.1.3 -> 5.2.4 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-09-11 14:45:55 -07:00
Kevin Lingerfelt	4845b4ec04	Restore linkerd.io/control-plane* labels (#1411 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-07 13:53:29 -07:00
Kevin Lingerfelt	e0a01c5dd8	Remove node scrape target, kubernetes grafana dashboard (#1410 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-07 13:41:38 -07:00
Kevin Lingerfelt	4b9700933a	Update prometheus labels to match k8s resource names (#1355 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-07-23 15:45:05 -07:00
Franziska von der Goltz	c7ac072acc	update grafana dashboards: conduit to linkerd (#1320 ) * update grafana dashboards to remove conduit reference and replace with linkerd instances * update test install fixtures to reflect changes Fixes: #1315 Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>	2018-07-16 13:05:01 -07:00
Kevin Lingerfelt	e5cce1abaf	Rename CLI from conduit to linkerd (#1312 ) * Rename CLI binary * Update integration tests for new binary name * Rename --conduit-namespace flag, change default ns * Rename occurrences of conduit in rest of CLI * Rename inject and install components * Remove conduit occurrences in docker files * Additional miscellaneous cleanup * Move protobuf definitions to linkerd2 package * Rename conduit.io labels to use linkerd.io * Rename conduit-managed segment to linkerd-managed * Fix conduit references in web project Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-07-12 17:14:07 -07:00
Kevin Lingerfelt	6f804d600c	Remove docker-compose / simulate-proxy environment (#1294 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-07-06 17:44:35 -07:00
Andrew Seigner	e70d62dc9f	Introduce Proxy process telemetry in Grafana (#1199 ) PR #1128 introduced new proxy process stats. Introduce Grafana graphs that expose these new proxy process stats. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-06-27 00:58:28 +01:00
Risha Mars	fdb0b7f63f	Grafana: remove fill and stack from individual resource breakouts (#1092 ) Remove the filling and stacking in request rate graphs that combine resources, to make it easier to spot outliers. * Grafana: remove fill and stack from individual resource breakouts * Remove all the stacks and fills from request rates everywhere	2018-06-18 10:14:39 -07:00
Risha Mars	53b713b2a8	Remove the ⚠️ emoji from non-tlsed grafana stat labels (#1089 )	2018-06-08 15:00:56 -07:00
Risha Mars	b930bc6b88	Fix conduit health grafana dashboard (#1086 ) * Scope health queries to controller namespace * Add a prometheus query variable to get the conduit namespace	2018-06-08 12:57:05 -07:00
Kevin Lingerfelt	ec2433e9bd	Update controller to use 'tls' metric label (#1044 ) * Update controller to use 'tls' metric label * Fix meshed column formatter Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-06-01 16:44:33 -07:00
Andrew Seigner	95f9f8dc35	Add meshed label support to Grafana (#1021 ) The Grafana dashboards currently show Request Volume by ns/deploy/pod. Add a `meshed` dimension to the Request Volume graphs, in anticipation of the `meshed`/`secured` label from the proxy. Also increase `irate` time window queries from `20s` to `30s`, per recommendation from Prometheus team. Relates to #388. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-05-25 14:10:57 -07:00
Andrew Seigner	6fccdee58e	Stop special-casing conduit controller in Grafana (#984 ) The Grafana dashboards were explicitly filtering out Conduit control-plane data. Remove control-plane filtering from Grafana dashboards. This brings Grafana in line with web, and also encourages better dog-fooding of our proxy metrics and dashboards. Also update Grafana to 5.1.3, update the BUILD.md architecture diagram to include Prometheus and Grafana, and introduce a Prometheus Benchmark dashboard, courtesy of Robust Perception. Fixes #908 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-05-23 13:47:20 -07:00


61 Commits