linkerd2

Commit Graph

Author	SHA1	Message	Date
harsh jain	976bc40345	Fixes #2607 : Remove TLS from stat (#2613 ) Removes the TLS percentages from the stat command in the CLI.	2019-04-04 10:37:42 -07:00
Andrew Seigner	9f748d2d2e	lint: Enable unparam (#2369 ) unparam reports unused function parameters: https://github.com/mvdan/unparam Part of #217 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-27 10:34:02 -08:00
Andrew Seigner	25e462352d	lint: Enable goimports (#2366 ) goimports checks import lines, adding missing ones and removing unreferenced ones: https://godoc.org/golang.org/x/tools/cmd/goimports It also requires named imports for packages whose import paths don't match their package names: - https://github.com/golang/go/issues/28428 - https://go-review.googlesource.com/c/tools/+/145699/ Also standardized named imports of common Kubernetes packages. Part of #217 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-25 15:51:10 -08:00
Andrew Seigner	35a0b652f2	lint: Enable goconst (#2365 ) goconst finds repeated strings that could be replaced by a constant: https://github.com/jgautheron/goconst Part of #217 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-25 12:00:03 -08:00
Risha Mars	80b6e41d5d	Modify StatSummary to also return TCP stats (#2262 ) Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats. This branch returns TCP stats at /api/tps-reports when this flag is true. TCP stats are now displayed on the Resource Detail pages. The current queried TCP stats are: tcp_open_connections tcp_read_bytes_total tcp_write_bytes_total	2019-02-25 10:37:39 -08:00
Alex Leong	771542dde2	Add support for retries (#2038 )	2019-01-16 14:13:48 -08:00
Radu M	07cbfe2725	Fix most golint issues that are not comment related (#1982 ) Signed-off-by: Radu Matei <radu@radu-matei.com>	2018-12-20 10:37:47 -08:00
Alejandro Pedraza	8c67bfbcc6	Add parameter to stats API to skip retrieving Prometheus stats (#1871 ) * Add parameter to stats API to skip retrieving Prometheus stats Used by the dashboard to populate list of resources. Fixes #1022 Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Prometheus queries check results were being ignored * Refactor verifyPromQueries() to also test when no prometheus queries should be generated * Add test for SkipStats=true Includes adding ability to public.GenStatSummaryResponse to not generate basicStats * Fix previous test	2018-12-10 16:48:12 -08:00
Alex Leong	7a7f6b6ecb	Add TopRoutes method to the public api and route CLI command to consume it (#1860 ) Add a routes command which displays per-route stats for services that have service profiles defined. This change has three parts: * A new public-api RPC called `TopRoutes` which serves per-route stat data about a service * An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC works, and much of the code was refactored and shared. * A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command, and much of the code was refactored and shared. Note that as of the currently targeted proxy version, only outbound route stats are supported, so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-11-19 12:20:30 -08:00
Kevin Leimkuhler	c68693e820	Fix stat filtering for `--from` queries (#1856 ) # Problem When we add a `--from` query to `linkerd stat au` we get more rows than if we had just run `linkerd stat au`. Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`). # Solution Destination query labels are now appended to `labels` so that those labels can be filtered on. # Validation Tests have been updated to reflect the expected destination labels now appended in `--from` queries. Fixes #1766 Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2018-11-14 10:52:27 -08:00
Kevin Lingerfelt	b5ff29c8aa	Add data plane check to validate proxy version (#1574 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-04 15:22:38 -07:00
Kevin Lingerfelt	bd19e8aaff	Update prometheus to only scrape proxies in the same mesh (#1402 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-06 12:05:55 -07:00
Kevin Lingerfelt	4b9700933a	Update prometheus labels to match k8s resource names (#1355 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-07-23 15:45:05 -07:00
Kevin Lingerfelt	e5cce1abaf	Rename CLI from conduit to linkerd (#1312 ) * Rename CLI binary * Update integration tests for new binary name * Rename --conduit-namespace flag, change default ns * Rename occurrences of conduit in rest of CLI * Rename inject and install components * Remove conduit occurrences in docker files * Additional miscellaneous cleanup * Move protobuf definitions to linkerd2 package * Rename conduit.io labels to use linkerd.io * Rename conduit-managed segment to linkerd-managed * Fix conduit references in web project Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-07-12 17:14:07 -07:00
Oliver Gould	941cad4a9c	Migrate build infrastructure to linkerd2 (#1298 ) This PR begins to migrate Conduit to Linkerd2: * The proxy has been completely removed from this repo, and is now located at github.com/linkerd/linkerd2-proxy. * A `Dockerfile-proxy` has been added to fetch the most-recently published proxy binary from build.l5d.io. * Proxy-specific protobuf bindings have been moved to github.com/linkerd/linkerd2-proxy-api. * All docker images now use the gcr.io/linkerd-io registry. * `inject` now uses `LINKERD2_PROXY_` environment variables * Go paths have been updated to reflect the new (future) repo location.	2018-07-09 15:38:38 -07:00
Risha Mars	9050b2d312	Fix authority stat queries when a --from flag is used (#1289 ) * Fix bug where we were using dst_authorities as a group by instead of authorities * Add test to make sure we don't dst_authorities Previously, we were only checking to make sure we didn't add dst_authorities in the query labels in promDstQueryLabels but we weren't checking the groupBy labels in promDstGroupByLabelNames - this caused us to try to query for dst_authorities when a --from query was sent. There are no dst_authorities, so there would be no named results.	2018-07-06 17:29:08 -07:00
Risha Mars	ba2e13c731	Small tweaks to error modal, add Reason to api error response (#1246 ) - Add Reason to the error data passed from the api - Rewrite error logic in the UI to try to make it clearer - Show 0/0 pods meshed instead of 0/0 pods meshed (N/A) if 0 pods are meshed	2018-07-03 17:14:27 -07:00
Risha Mars	5963b2ac24	Better format empty errors (#1202 )	2018-06-28 14:52:04 -07:00
Risha Mars	68586fe697	Add the ability to query stats by authority (#1181 ) Adds the ability to query by a new non-kubernetes resource type, "authorities", in the StatSummary api. This includes an extensive refactor of stat_summary.go to deal with non-kubernetes resource types. - Add documentation to Resource in the public api so we can use it for authority - Handle non-k8s resource requests in the StatSummary endpoint - Rewrite stat summary fetching and parsing to handle non-k8s resources - keys stat summary metric handling by Resource instead of a generated string - Adds authority to the CLI - Adds /authorities to the Web UI - Adds some more stat integration and unit tests	2018-06-28 14:31:44 -07:00
Risha Mars	e2c2f19d2c	Propagate errors in conduit containers to the api (#1117 ) - It would be nice to display container errors in the UI. This PR gets the pod's container statuses and returns them in the public api - Also add a terminationMessagePolicy to conduit's inject so that we can capture the proxy's error messages if it terminates	2018-06-14 16:22:31 -07:00
Kevin Lingerfelt	6e66f6d662	Rename Lister to API and expose informers as well as listers (#1072 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-06-12 10:27:55 -07:00
Risha Mars	7d4c4aa290	CLI: print resources in the same order every time stat all is run (#1088 ) Previously, in conduit stat all we would just print the map of stat results, which resulted in the order in which stats were displayed varying between prints. Fix: Define an array, k8s.StatAllResourceTypes and use the order in this array to print the map; ensuring a consistent print order every time the command is run.	2018-06-08 15:02:17 -07:00
Kevin Lingerfelt	ec2433e9bd	Update controller to use 'tls' metric label (#1044 ) * Update controller to use 'tls' metric label * Fix meshed column formatter Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-06-01 16:44:33 -07:00
Risha Mars	ffabdefc6c	Add queries to prometheus to determine number of fully meshed requests (#983 ) - Update the `response_total` prometheus query of the StatSummary endpoint to also break queries out by a `meshed` label. - Add a 'Secured' column to the web UI/CLI stat displays, which indicates the percentage of traffic starting and ending in the mesh This meshed label is used in the CLI/Web UI to display a column of the percentage of traffic that starts/ends in the mesh. (This is a proxy indicator for whether that traffic is 'secured' when we add TLS by default for intra-mesh requests.) The `meshed` label is not yet added anywhere, so until it is supplied by the proxy, all traffic will show up as 0% secured in the web/CLI.	2018-05-24 11:05:09 -07:00
Andrew Seigner	84e6eb5c87	Fix nil pointer dereference in StatSummary (#991 ) The StatSummary endpoint was dereferencing StatSummaryRequest.Selector.Resource, causing a panic when it received an empty request. Fix StatSummary to use the nil-friendly StatSummaryRequest.GetSelector().GetResource() methods, and add a test to validate. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-05-23 13:21:49 -07:00
Risha Mars	1e6434f6de	Fix bug in the public-api where conduit stat params were ignored (#971 ) * Fix bug where we were dropping parts of the StatSummaryRequest * Add tests for prometheus query strings and for failed cases Problem In #928 I rewrote the stat api to handle 'all' as a resource type. To query for all resource types, we would copy the Resource, LabelSelector and TimeWindow of the original request, and then go through all the resource types and set Resource.Type for each resource we wanted to get. The bug is that while we copy over some fields of the original request, we didn't copy over all of them - namely Resource.Name and the Outbound resource. So the Stat endpoint would ignore any --to or --from flags, and would ignore requests for a specific named resource. Solution Copy over all fields from the request. I've also added tests for this case. In this process I've refactored the stat_summary_test code to make it a bit easier to read/use.	2018-05-18 16:06:06 -07:00
Risha Mars	b8dc83f9d2	Modify the Stat API to handle requests for resource type "all" (#928 ) Allow the Stat endpoint in the public-api to accept requests for resourceType "all". Currently, this queries Pods, Deployments, RCs and Services, but can be modified to query other resources as well. Both the CLI and web endpoints now work if you set resourceType to all. e.g. `conduit stat all`	2018-05-11 14:35:37 -07:00
Kevin Lingerfelt	4e8e1eb84d	CLI: Fix validation for service stats (#935 ) * CLI: Fix validation for service stats * Address review feedback Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-05-11 10:28:49 -07:00
Risha Mars	f94856e489	Modify the Stat endpoint to also return the number of failed conduit pods (#895 ) * Modify the Stat endpoint to also return the count of failed pods * Add comments explaining pod count stats * Rename total pod count to running pod count This is to support the service mesh overview page, as I'd like to include an indicator of failed pods there.	2018-05-08 10:35:21 -07:00
Andrew Seigner	a0a9a42e23	Implement Public API and Tap on top of Lister (#835 ) public-api and tap were both using their own implementations of the Kubernetes Informer/Lister APIs. This change factors out all Informer/Lister usage into the Lister module. This also introduces a new `Lister.GetObjects` method. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-24 18:10:48 -07:00
Andrew Seigner	79bdc638b3	Service support in stat command (#809 ) The `stat` command did not support `service` as a resource type. This change adds `service` support to the `stat` command. Specifically: - as a destination resource on `--to` commands - as a target resource on `--from` commands Fixes #805 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-19 16:51:20 -07:00
Kevin Lingerfelt	653dc6bfaa	Add replication controller stats in CLI (#794 ) * Add replication controller stats in CLI * Fix pod status in stat summary tests Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-18 18:12:14 -07:00
Kevin Lingerfelt	71a51afb40	Expose pod stats in CLI, web UI, and Grafana (#788 ) * Expose pod stats in CLI, web UI, and Grafana * Fix js api helpers test * Add outbound traffic stats to pod dashboard Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-18 11:26:47 -07:00
Andrew Seigner	727521f914	Permit arbitrary time windows in public-api (#774 ) The public-api previously only permitted 4 hard-coded time windows: 10s, 1m, 10m, 1h. This was primarily a relic of the recently removed telemetry system. Modify the public-api to validate the time string, but allow for any window size, which is then passed through to Prometheus. Fixes #686 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-16 17:37:17 -07:00
Kevin Lingerfelt	11a4359e9a	Misc cleanup following the telemetry rewrite (#771 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-16 15:51:07 -07:00
Andrew Seigner	77fb6d3709	Add namespace as a resource type in public-api (#760 ) * Add namespace as a resource type in public-api The cli and public-api only supported deployments as a resource type. This change adds support for namespace as a resource type in the cli and public-api. This change also includes: - cli statsummary now prints `-`'s when objects are not in the mesh - cli statsummary prints `No resources found.` when applicable - removed `out-` from cli statsummary flags, and analogous proto changes - switched public-api to use native prometheus label types - misc error handling and logging fixes Part of #627 Signed-off-by: Andrew Seigner <siggy@buoyant.io> * Refactor filter and groupby label formulation Signed-off-by: Kevin Lingerfelt <kl@buoyant.io> * Rename stat_summary.go to stat.go in cli Signed-off-by: Kevin Lingerfelt <kl@buoyant.io> * Update rbac privileges for namespace stats Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-13 16:53:01 -07:00
Andrew Seigner	21886760c6	Use apps/v1beta2 for Kubernetes 1.8 compatibility (#762 ) Conduit was relying on apps/v1 for the Deployment and ReplicaSet APIs. apps/v1 is not available on Kubernetes 1.8. This prevented the public-api from starting. Switch Conduit to use apps/v1beta2. Also increase the Kubernetes API cache sync timeout from 10 to 60 seconds, as it was taking 11 seconds on a test cluster. Fixes #761 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-13 12:08:16 -07:00
Andrew Seigner	e9b209829d	Handle NaN metrics (#750 ) The Prometheus client sometimes returns NaN if a calculation is invalid, such as histogram_quantile when no requests have occurred. Add IsNaN check in the public-api and set output to zero. Fixes #747 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-12 15:21:00 -07:00
Kevin Lingerfelt	47caf1ca07	Add --all-namespaces flag to CLI statsummary command (#745 ) * Add --all-namespaces flag to CLI statsummary command * Fix statsummary output formatting Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-11 16:40:25 -07:00
Andrew Seigner	259fdcd134	Add latency stats in new stat summary endpoint (#737 ) The new StatSummary endpoint was only providing request volume and success rate information. Add support for retrieving latency stats via StatSummary. Also make all prometheus calls in parallel, and implement kubernetes test fixtures. Fixes #681 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-11 11:58:32 -07:00
Kevin Lingerfelt	91c359e612	Switch public API to use cached k8s resources (#724 ) * Switch public API to use cached k8s resources * Move shared informer code to separate goroutine * Fix spelling issue Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-10 11:39:31 -07:00
Andrew Seigner	3a341abe9a	Fix success rate calculation in public api (#723 ) The success rate calculation relies on the `classification` label, but was incorrectly specifying `fail` rather than `failure`. Fix public api to specify `failure`. Also re-org public api tests for easier Kubernetes and Prometheus mocking. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-10 11:04:04 -07:00
Andrew Seigner	716b392231	Move StatSummary logic into grpc server (#717 ) The StatSummary logic was implemented as a method on http_server. Move the StatSummary logic into grpc_server, for consistency with the other endpoints. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-06 16:46:15 -07:00
Andrew Seigner	50c323c617	Use canonical k8s names, fix prom labels (#702 ) The new statsummary command accepted friendly k8s names, which worked for k8s queries, but Prometheus requires a specific key. Modify the statsummary query to map friendly k8s names to canonical k8s names when constructing the query. Then during the query, map the canonical k8s name to a specific Prometheus label. Fixes #695 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-06 12:34:54 -07:00
Risha Mars	2f5b5ea5f2	Start implementing conduit stat summary endpoint (#671 ) Start implementing new conduit stat summary endpoint. Changes the public-api to call prometheus directly instead of the telemetry service. Wired through to `api/stat` on the web server, as well as `conduit statsummary` on the CLI. Works for deployments only. Current implementation just retrieves requests and mesh/total pod count (so latency stats are always 0). Uses API defined in #663 Example queries the stat endpoint will eventually satisfy in #627 This branch includes commits from @klingerf * run ./bin/dep ensure * run ./bin/update-go-deps-shas	2018-04-05 17:05:06 -07:00

45 Commits