Commit Graph

43 Commits

Author SHA1 Message Date
Kevin Leimkuhler 75fcc9d623
Move tap from core into Viz extension (#5651)
Closes #5545.

This change moves all tap and tap-injector code into the viz directory. 

The tap and tap-injector components now also use a new tap image—separating
these components from the controller image that they are currently part of. This
means the controller image has removed all its build dependencies related to
tap.

Finally, the tap Protobuf has been separated from the metrics-api and moved into
it's own `.proto` file and gen directory. This introduces a clear split between
metrics-api and tap Protobuf.

There is no change in behavior for the `viz tap` command.

### Reviewing

#### Docker images

All the bin directory scripts should be updated to build and load the tap image.
All the CI workflows should be updated to build and push the tap image.

#### Controller and pkg directories

This is primarily deletions. Most of the deleted code in this directory is now
in the tap directory of the Viz extension.

#### viz/tap

This is the location that all the tap related code now lives in. New files are
mostly moved from the controller and pkg directories. Imports have all been
updated to point at the right locations and Protobuf.

The Protobuf here is taken from metrics-api and contains all tap-related
Protobuf.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-09 12:43:21 -05:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Alejandro Pedraza f3b1ebfa99
Separate observability API (#5510)
* Separate observability API

Closes #5312

This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.

- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.

The other changes in the go files are just changes in the imports to point to the new protobufs.

Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
2021-01-13 14:34:54 -05:00
Tarun Pothulapati 39e7f84773
cli: fix and update timeout warnings in profile cmd (#5122)
Fixes #5121

* cli: skip emitting warnings in Profile


Whenever the tapDuration gets completed, there is a warning occured
which we do not emit. This looks like it has been changed in the latest
versions of the dependency.

* Use context.withDeadline instead of client.timeout

The usage of `client.Timeout` is not working correctly causing `W1022
17:20:12.372780   19049 transport.go:260] Unable to cancel request for
   promhttp.RoundTripperFunc` to be emitted by the Kubernetes Client.

This is fixed by using context.WithDeadline and passing that into the
http Request.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-10-27 22:08:21 +05:30
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Tharun Rajendran e24c323bf9
Gateway Metrics in Dashboard (#4717)
* Introduce multicluster gateway api handler in web api server
* Added MetricsUtil for Gateway metrics
* Added gateway api helper
* Added Gateway Component

Updated metricsTable component to support gateway metrics
Added handler for gateway

Fixes #4601

Signed-off-by: Tharun <rajendrantharun@live.com>
2020-07-27 12:43:54 -07:00
Alejandro Pedraza 419b9f1502
Fix race condition in web service (#3883)
Fixes #3859, followup to #3769

The addition of the web service's `statCache` introduced a race condition on the `h.statCache` variable, that is read and written in `handleAPIStat()` without mutext guards. I've moved the `statCache` initialization into `/web/srv/server.go` to avoid this problem.

The issue can be easily reproduced with
```bash
$ bin/web dev

$ for run in {1..2}; do curl 'http://localhost:7777/api/tps-reports?resource_type=deployment&namespace=linkerd&tcp_stats=true&resource_name=linkerd-destination&window=1m' &  done
[1] 11672
[2] 11673
{"ok":{"statTables":[{"podGroup":{"rows":[{"resource":{"namespace":"linkerd","type":"deployment","name":"linkerd-destination"},"timeWindow":"1m","status":"","meshedPodCount":"1","runningPodCount":"1","failedPodCount":"0","stats":{"successCount":"18","failureCount":"0","latencyMsP50":"1","latencyMsP95":"9","latencyMsP99":"10","actualSuccessCount":"0","actualFailureCount":"0"},"tcpStats":{"openConnections":"7","readBytesTotal":"23174","writeBytesTotal":"22946"},"tsStats":null,"errorsByPod":{}}]}}]}}{"ok":{"statTables":[{"podGroup":{"rows":[{"resource":{"namespace":"linkerd","type":"deployment","name":"linkerd-destination"},"timeWindow":"1m","status":"","meshedPodCount":"1","runningPodCount":"1","failedPodCount":"0","stats":{"successCount":"18","failureCount":"0","latencyMsP50":"1","latencyMsP95":"9","latencyMsP99":"10","actualSuccessCount":"0","actualFailureCount":"0"},"tcpStats":{"openConnections":"7","readBytesTotal":"23174","writeBytesTotal":"22946"},"tsStats":null,"errorsByPod":{}}]}}]}}[1]-  Done                    curl 'http://localhost:7777/api/tps-reports?resource_type=deployment&namespace=linkerd&tcp_stats=true&resource_name=linkerd-destination&window=1m'
[2]+  Done                    curl 'http://localhost:7777/api/tps-reports?resource_type=deployment&namespace=linkerd&tcp_stats=true&resource_name=linkerd-destination&window=1m'

==================
WARNING: DATA RACE
Read at 0x00c000192308 by goroutine 58:
  github.com/linkerd/linkerd2/web/srv.(*handler).handleAPIStat()
      /home/alpeb/src/linkerd2/web/srv/api_handlers.go:140 +0x61
  github.com/linkerd/linkerd2/web/srv.(*handler).handleAPIStat-fm()
      /home/alpeb/src/linkerd2/web/srv/api_handlers.go:138 +0x7d
  github.com/julienschmidt/httprouter.(*Router).ServeHTTP()
      /home/alpeb/go/pkg/mod/github.com/julienschmidt/httprouter@v1.2.0/router.go:334 +0x10b7
  github.com/linkerd/linkerd2/web/srv.(*Server).ServeHTTP()
      /home/alpeb/src/linkerd2/web/srv/server.go:69 +0x4c0
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:100 +0xf8
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:196 +0x104
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:68 +0x13c
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  go.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP()
      /home/alpeb/go/pkg/mod/go.opencensus.io@v0.22.0/plugin/ochttp/server.go:86 +0x3f9
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2802 +0xce
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1890 +0x837

Previous write at 0x00c000192308 by goroutine 56:
  github.com/linkerd/linkerd2/web/srv.(*handler).handleAPIStat()
      /home/alpeb/src/linkerd2/web/srv/api_handlers.go:141 +0xd5e
  github.com/linkerd/linkerd2/web/srv.(*handler).handleAPIStat-fm()
      /home/alpeb/src/linkerd2/web/srv/api_handlers.go:138 +0x7d
  github.com/julienschmidt/httprouter.(*Router).ServeHTTP()
      /home/alpeb/go/pkg/mod/github.com/julienschmidt/httprouter@v1.2.0/router.go:334 +0x10b7
  github.com/linkerd/linkerd2/web/srv.(*Server).ServeHTTP()
      /home/alpeb/src/linkerd2/web/srv/server.go:69 +0x4c0
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:100 +0xf8
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:196 +0x104
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func1()
      /home/alpeb/go/pkg/mod/github.com/prometheus/client_golang@v1.2.1/prometheus/promhttp/instrument_server.go:68 +0x13c
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2007 +0x51
  go.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP()
      /home/alpeb/go/pkg/mod/go.opencensus.io@v0.22.0/plugin/ochttp/server.go:86 +0x3f9
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2802 +0xce
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1890 +0x837

Goroutine 58 (running) created at:
  net/http.(*Server).Serve()
      /usr/local/go/src/net/http/server.go:2927 +0x5be
  net/http.(*Server).ListenAndServe()
      /usr/local/go/src/net/http/server.go:2825 +0x102
  main.main.func1()
      /home/alpeb/src/linkerd2/web/main.go:105 +0xdd

Goroutine 56 (running) created at:
  net/http.(*Server).Serve()
      /usr/local/go/src/net/http/server.go:2927 +0x5be
  net/http.(*Server).ListenAndServe()
      /usr/local/go/src/net/http/server.go:2825 +0x102
  main.main.func1()
      /home/alpeb/src/linkerd2/web/main.go:105 +0xdd
```
2020-01-07 17:21:45 -05:00
Sergio C. Arteaga a1141fc507 Cache StatSummary responses in dashboard web server (#3769)
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-17 09:15:00 -05:00
Sergio C. Arteaga 78ed5f8883 Make resource definition available to dashboard (#3666)
This PR allows the dashboard to query for a resource's definition in YAML
format, if the boolean `queryForDefinition` in the `ResourceDetail` component is
set to true. 

This change to the web API and the dashboard component was made for a future
redesigned dashboard detail page. At present, `queryForDefinition` is set to
false and there is no visible change to the user with this PR. 

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com> Signed-off-by: Cintia
Sanchez Garcia <cynthiasg@icloud.com>
2019-12-03 10:25:20 -08:00
Sergio C. Arteaga eff1714a08 Add `linkerd check` to dashboard (#3656)
`linkerd check` can now be run from the dashboard in the `/controlplane` view.
Once the check results are received, they are displayed in a modal in a similar
style to the CLI output.

Closes #3613
2019-11-12 12:37:36 -08:00
Kevin Leimkuhler e41986e255
Remove redundant `HTTPError` cast check in web server (#3222)
* Clean up HTTPError cast check

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-09 11:06:50 -07:00
Kevin Leimkuhler db381a007a
Check for 403 status code when preparing tap error (#3215)
* Check for 403 to pass to websocketError

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-08 12:09:54 -07:00
Kevin Leimkuhler 5d7662fd90
Update web server to use tap APIService (#3208)
### Motivation

PR #3167 introduced the tap APIService and migrated `linkerd tap` to use it.
Subsequent PRs (#3186 and #3187) updated `linkerd top` and `linkerd profile
--tap` to use the tap APIService. This PR moves the web's Go server to now also
use the tap APIService instead of the public API. It also ensures an error
banner is shown to the user when unauthorized taps fail via `linkerd top`
command in *Overview* and *Top*, and `linkerd tap` command in *Tap*.

### Details

The majority of these changes are focused around piping through the HTTP error
that occurs and making sure the error banner generated displays the error
message explaining to view the tap RBAC docs.

`httpError` is now public (`HTTPError`) and the error message generated is short
enough to fit in a control frame (explained [here](https://github.com/linkerd/linkerd2/blob/kleimkuhler%2Fweb-tap-apiserver/web/srv/api_handlers.go#L173-L175)).

### Testing

The error we are testing for only occurs when the linkerd-web service account is
not authorzied to tap resources. Unforutnately that is not the case on Docker
For Mac (assuming that is what you use locally), so you'll need to test on a
different cluster. I chose a GKE cluster made through the GKE console--not made
through cluster-utils because it adds cluster-admin.

Checkout the branch locally and `bin/docker-build` or `ares-build` if you have
it setup. It should produce a linkerd with the version `git-04e61786`. I have
already pushed the dependent components, so you won't need to `bin/docker-push
git-04e61786`.

Install linkerd on this GKE cluster and try to run `tap` or `top` commands via
the web. You should see the following errors:

### Tap

![web-tap-unauthorized](https://user-images.githubusercontent.com/4572153/62661243-51464900-b925-11e9-907b-29d7ca3f815d.png)

### Top

![web-top-unauthorized](https://user-images.githubusercontent.com/4572153/62661308-894d8c00-b925-11e9-9498-6c9d38b371f6.png)

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-08 10:18:32 -07:00
Carol A. Scott dce462acd9
Add Edges table to resource detail view of dashboard (#2965)
Adds an Edges table to the resource detail view that shows the source,
destination name and identity for proxied connections to and from the resource
shown.
2019-06-20 10:50:11 -07:00
Alejandro Pedraza 928d4cb522
Remove unimplemented debug page on dashboard (#2952)
* Remove unimplemented debug page on dashboard

Fixes #2895

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-17 14:43:09 -05:00
Andrew Seigner bc735ebdc2
Fix goconst linter breakage following master merge (#2378)
linkerd/linkerd2#2365 introduced the goconst linter and fixes, but additional lint
errors had been introduced to master.

This change fixes the one remaining goconst issue.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 12:57:27 -08:00
Risha Mars 80b6e41d5d
Modify StatSummary to also return TCP stats (#2262)
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.

TCP stats are now displayed on the Resource Detail pages.

The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
2019-02-25 10:37:39 -08:00
Andrew Seigner 4b6f6aeedd
Enable gosimple linter, fix issues (#2356)
gosimple is a Go linter that specializes in simplifying code

Also fix one spelling error in `cred_test.go`

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-22 17:19:07 -08:00
Risha Mars 3e9c7d2132
Add an Endpoints view to the web dashboard (#2275)
In #2195 we introduced `linkerd endpoints` on the CLI. I would like similar
information to be on the web.

This PR adds an api endpoint at `/api/endpoints`, and introduces a new debugging
pagethat shows a table of endpoints, available at `/debug`
2019-02-21 11:57:51 -08:00
Risha Mars cce6183c4a
Update Top Routes in dashboard to use new API, add `to` query options (#2112)
The recent routes API changes caused the Top Routes tab to stop working, as it
wasn't looking for the changed structure of the response. This PR updates that
page to accept the new API response.

This PR also adds to fields to the Top Routes query form, so that the equivalent
of linkerd routes deploy --to deploy/authors will work in the dashboard.
2019-01-23 14:29:27 -08:00
Alena Varkockova 28f662c9c6 Introduce resource selector and deprecate namespace field for ListPods (#2025)
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-23 10:35:55 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Alejandro Pedraza 8c67bfbcc6 Add parameter to stats API to skip retrieving Prometheus stats (#1871)
* Add parameter to stats API to skip retrieving Prometheus stats

Used by the dashboard to populate list of resources.

Fixes #1022

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated

* Add test for SkipStats=true

Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats

* Fix previous test
2018-12-10 16:48:12 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Risha Mars d9539bcb37
Add the top routes feature to the dashboard UI (#1868)
Adds a (currently not displayed in sidebar, but available at /routes) page to
mirror the current functionality of `linkerd routes <service>`. So far, this is just a
barebones form and table, but it works.

Adds a /api/routes path and handler to the api to receive TopRoutes requests from the web.
2018-11-27 16:53:10 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method the the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Dennis Adjei-Baah d23103111c
Move grpc tapClient initialization to goroutine (#1686)
When a resource has no tap events being streamed to the Tap UI, and a user hits the "Stop" button in the Tap page, the tap stream is left open due to the WebSocket connection not being closed.

It looks like the web server's tap client that is created to stream events from the tap server blocks the main request thread in the web server. This causes the web server to stop receiving any subsequent close frames from the UI i.e. when the "Stop" button is clicked.

This PR moves the tapClient initialization code to a separate goroutine, specifically, the goroutine that reads tap events from the incoming grpc tap stream. This allows the main thread to continue reading messages from the WebSocket connection and allow it to receive close frames.

fixes #1665

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-19 13:06:57 -07:00
Risha Mars d679c1fa0e
Error display fixes in web for the tap query handling. (#1583)
Previously, WebSocket error messages would appear with the first 
couple characters cut off. I've fixed this by using ws.WriteControl 
instead of ws.WriteMessage to write errors, as gorilla does in 
their example app.

- Use writeControl to write error messages to the client
- Stop the spinner if there is an error present
2018-09-05 10:28:59 -07:00
Risha Mars 249b51f950
Increase MaxRps in Tap server, remove default setting from Web (#1560)
Increase the MaxRps on the tap server to 100 RPS.

The max RPS for tap/top was increased in for the CLI #1531, but we were
still manually setting this to 1 RPS in the Web UI and Web server.

Remove the pervasive setting of MaxRps to 1 in the web frontend and server
2018-08-30 13:37:37 -07:00
Risha Mars 0e6c0a2f3b
Tap: Make use of the Web UI to render tap events in a table (#1391)
* Make use of the Web UI to render tap events in a table

- Return JSON tap events instead of the command line output
- Experiment with a different way of rendering the EventList
- changed the default width back to 100% of the screen because this 
table does not look great squished
2018-08-03 13:45:04 -07:00
Risha Mars ec3c861743
Enable Tap from the Web UI (#1356)
Adds a tap endpoint in the web api that communicates with the dashboard 
via websockets.
I've moved a bunch of code from the cli tap.go into utils so that the code 
can be shared between web and CLI. I think we should consider making the 
display more suited to web, but in the short term, reusing the CLI's 
rendering of tap events works.

Adds a Tap page in the Web UI that you can use to make tap requests. 
The form currently only allows you to enter a resource and namespace, 
other filters coming in a follow-up branch.
2018-07-24 14:23:42 -04:00
Kevin Lingerfelt 4b9700933a
Update prometheus labels to match k8s resource names (#1355)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-07-23 15:45:05 -07:00
Oliver Gould 941cad4a9c
Migrate build infrastructure to linkerd2 (#1298)
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
  github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
  binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
  github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
2018-07-09 15:38:38 -07:00
Risha Mars 5ed7fc563c
Add controller component pod uptimes to the ServiceMesh page (#1205)
- Return pod uptimes from the GetPods endpoint
- Adds filtering by namespace to api.GetPods
- Adds a --namespace filter to conduit get pods
- Adds pod uptimes to the controller component toolitps on the ServiceMesh page
- Moves the ServiceMesh page back to using /api/pods
2018-06-28 15:42:00 -07:00
Risha Mars 0ff1bb4ad8
Don't allow stat requests for named resources in --all-namespaces (#1163)
Don't allow the CLI or Web UI to request named resources if --all-namespaces is used.

This follows kubectl, which also does not allow requesting named resources
over all namespaces.

This PR also updates the Web API's behaviour to be in line with the CLI's. 
Both will now default to the default namespace if no namespace is specified.
2018-06-20 12:59:31 -07:00
Andrew Seigner 77fb6d3709
Add namespace as a resource type in public-api (#760)
* Add namespace as a resource type in public-api

The cli and public-api only supported deployments as a resource type.

This change adds support for namespace as a resource type in the cli and
public-api. This also change includes:
- cli statsummary now prints `-`'s when objects are not in the mesh
- cli statsummary prints `No resources found.` when applicable
- removed `out-` from cli statsummary flags, and analagous proto changes
- switched public-api to use native prometheus label types
- misc error handling and logging fixes

Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>

* Refactor filter and groupby label formulation

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Rename stat_summary.go to stat.go in cli

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Update rbac privileges for namespace stats

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-04-13 16:53:01 -07:00
Kevin Lingerfelt 37434d048a
Update web component to use new stat api (#753)
* Update web component to use new stat api
* Address review feedback
* Add external link icon

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-04-12 17:35:03 -07:00
Risha Mars 2f5b5ea5f2
Start implementing conduit stat summary endpoint (#671)
Start implementing new conduit stat summary endpoint. 
Changes the public-api to call prometheus directly instead of the
telemetry service. Wired through to `api/stat` on the web server,
as well as `conduit statsummary` on the CLI. Works for deployments only.

Current implementation just retrieves requests and mesh/total pod count 
(so latency stats are always 0). 

Uses API defined in #663
Example queries the stat endpoint will eventually satisfy in #627

This branch includes commits from @klingerf 

* run ./bin/dep ensure
* run ./bin/update-go-deps-shas
2018-04-05 17:05:06 -07:00
Eliza Weisman 458e9d2ac5
Remove per-path metrics from telemetry pipeline (#317)
Follow-up from #315.

Now that the UIs don't report per-path metrics, we can remove the path label from Prometheus, the path aggregation and filtering options from the telemetry API, and the path field from the proxy report API.

I've modified the tests to no longer expect the removed fields, and manually verified that Conduit still works after making these changes.

Closes #265 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-09 14:20:28 -08:00
Eliza Weisman 2015d992cc
Remove pod-level metrics from web and CLI (#304)
This PR updates the web UI to remove the pod detail page, and to remove the links to that page from pod names in metrics tables. It also removes the `pods` option from `conduit stat`, and the `sourcePod` and `targetPod` fields from the controller API proto's `MetricMetadata` message.

I've updated the `conduit stat` tests to reflect these changes, and manually verified the web UI changes.

Closes #261 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-08 19:07:10 -08:00
Kevin Lingerfelt 2f114e69fa
Add support for path stats in cli and web api (#13)
* Add support for path stats in cli and web api

The cli stat command supports grouping by pod and deployment. With this
change, it will also support grouping by path, in order to facilitate a
summary stats per individual endpoint.

* Right-align numeric columns in stat output
2017-12-08 12:24:39 -08:00
Oliver Gould b104bd0676 Introducing Conduit, the ultralight service mesh
We’ve built Conduit from the ground up to be the fastest, lightest,
simplest, and most secure service mesh in the world. It features an
incredibly fast and safe data plane written in Rust, a simple yet
powerful control plane written in Go, and a design that’s focused on
performance, security, and usability. Most importantly, Conduit
incorporates the many lessons we’ve learned from over 18 months of
production service mesh experience with Linkerd.

This repository contains a few tightly-related components:
- `proxy` -- an HTTP/2 proxy written in Rust;
- `controller` -- a control plane written in Go with gRPC;
- `web` -- a UI written in React, served by Go.
2017-12-05 00:24:55 +00:00