When a resource has no tap events streaming to the Tap UI and a user clicks the "Stop" button on the Tap page, the tap stream is left open because the WebSocket connection is never closed.
It looks like the web server's tap client, which is created to stream events from the tap server, blocks the main request thread in the web server. This prevents the web server from receiving any subsequent close frames from the UI, i.e. when the "Stop" button is clicked.
This PR moves the tapClient initialization code to a separate goroutine, specifically the goroutine that reads tap events from the incoming gRPC tap stream. This allows the main thread to keep reading messages from the WebSocket connection and to receive close frames.
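A minimal sketch of the pattern described above, using plain channels in place of the real gRPC stream and WebSocket connection. The names `handleTap`, `tapEvents`, and `wsMessages` are hypothetical and are not taken from the actual web server code.

```go
package main

import "fmt"

// handleTap is a hypothetical stand-in for the web server's tap handler.
// tapEvents models the gRPC tap stream; wsMessages models frames read from
// the WebSocket connection, with "close" standing in for a close frame.
func handleTap(tapEvents <-chan string, wsMessages <-chan string) []string {
	var forwarded []string
	done := make(chan struct{})
	finished := make(chan struct{})

	// Read tap events in a separate goroutine so that a quiet stream
	// cannot block the loop below.
	go func() {
		defer close(finished)
		for {
			select {
			case <-done:
				return
			case ev, ok := <-tapEvents:
				if !ok {
					return
				}
				forwarded = append(forwarded, ev)
			}
		}
	}()

	// The main request path stays free to read WebSocket messages.
	for msg := range wsMessages {
		if msg == "close" {
			break
		}
	}

	close(done)
	<-finished // wait for the reader before touching forwarded
	return forwarded
}

func main() {
	tapEvents := make(chan string) // a resource with no tap events
	wsMessages := make(chan string, 1)
	wsMessages <- "close" // the user clicks "Stop"
	fmt.Println("events forwarded:", len(handleTap(tapEvents, wsMessages)))
}
```

Even though no tap event ever arrives, the handler returns as soon as the close frame is read.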
fixes #1665
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
The success rate mini chart shows a colour based on SR, and also shows the SR
via the proportion of the chart that's filled out. If the success rate is 0% (as
in the VotePoop endpoint in the emojivoto demo), the chart would be zero
percent filled out, causing it to be entirely gray. Really, it should be entirely
red, since zero SR is pretty bad.
Fix: fully fill the bar with red if there is a zero SR
Prevent error when trying to tap without having a namespace and resource
selected by disabling the tap button.
Fixes #1670
Signed-off-by: Mathis Wiehl <mathis.wiehl@sinnerschrader.com>
* Update version checks to support release channels
* Update based on review feedback
* Fix sidebar tests
* Update CI config for edge and stable tags
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
If you go to /top, select a namespace and then a resource, and then clear the
resource, a JavaScript error would prevent the whole app from rendering. Fix
this.
Problem
Previously, we'd display one row in top per sourcePod -> dstPod. When
viewing resources at a higher level though (e.g. deployments with multiple pods)
the src/dst column displays the resource at that level, and displaying multiple
rows with deploy/foo is confusing.
Solution
Key the top table off of the resource currently being requested, so
that all the rows are rolled up appropriately. In the popover for that column,
display a list of pods/ips that are rolled up.
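The roll-up described above could look roughly like the following. This is only a sketch: the real table lives in the JavaScript frontend, and the `topRow` and `rolledUpRow` types and the `rollUp` function are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// topRow is a hypothetical per-pod row, as displayed before this change.
type topRow struct {
	Resource string // the resource at the requested level, e.g. "deploy/foo"
	Pod      string
	Count    int
}

// rolledUpRow is one row per resource, with the pods that were merged into it.
type rolledUpRow struct {
	Resource string
	Pods     []string
	Count    int
}

// rollUp keys rows by resource and preserves first-seen order.
// It assumes each input row for a resource refers to a distinct pod.
func rollUp(rows []topRow) []rolledUpRow {
	index := map[string]int{} // resource -> position in result
	var result []rolledUpRow
	for _, r := range rows {
		i, seen := index[r.Resource]
		if !seen {
			i = len(result)
			index[r.Resource] = i
			result = append(result, rolledUpRow{Resource: r.Resource})
		}
		result[i].Pods = append(result[i].Pods, r.Pod)
		result[i].Count += r.Count
	}
	return result
}

func main() {
	rows := []topRow{
		{"deploy/foo", "foo-1", 2},
		{"deploy/foo", "foo-2", 3},
		{"deploy/bar", "bar-1", 1},
	}
	for _, r := range rollUp(rows) {
		fmt.Println(r.Resource, r.Count, r.Pods)
	}
}
```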
This branch also adds a generic list of resources to the tap/top dropdown (you
were always able to tap them, but when I switched from autocomplete to select
for this dropdown, you lost the ability to type in arbitrary resources).
If you select a "from" namespace and "from" resource in /tap and try to clear
them using the little x in the form field, a JavaScript error would prevent the
app from rendering. Fix this.
Also removes filterOptions which wasn't being used any more. This will probably
make parsing tap results ever so slightly faster as we're now not trying to also
aggregate potential filter options.
* Fix js errors on Tap form when Clear button is hit
* Remove filter options code since we're not using the filters anywhere
When a websocket connection is closed between Chrome and a server, we get a 1006 error code signifying abnormal closure of the websocket connection. It seems as if we only get this error on Chrome web clients. Firefox and Safari do not encounter this issue.
The solution is to suppress 1006 errors that occur in the web browser since the connection is closed anyway. There is no negative side effect that occurs when the connection is closed abnormally and so the error message is benign.
fixes #1630
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Don't display RPS unit in metrics table
* Fix Tap and Top icons not being minimized correctly
* Remove metric tooltip on RPS column
* Fix extra spacing on Tap/Top in sidebar
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps
Signed-off-by: Alex Leong <alex@buoyant.io>
Try to make the tap table easier to parse by moving some info into the expanded
row. You can also now click anywhere on the row to expand.
The mocks in #1629 have Authority, Path and Latency buried, but I figured they might be
useful to see at the top level, so they're here.
The `linkerd check` command hits
https://versioncheck.linkerd.io/version.json to check for the latest
Linkerd version. This loses information, as that endpoint is intended to
record the current version, uuid, and source.
Modify `linkerd check` to set `version`, `uuid`, and `source`
parameters when performing a version check.
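As a rough illustration, the request URL could be assembled as below. The parameter names `version`, `uuid`, and `source` come from the description above; the function name `versionCheckURL` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// versionCheckURL builds the version check URL with the three query
// parameters named in the commit message. The helper itself is a sketch,
// not the CLI's actual implementation.
func versionCheckURL(version, uuid, source string) string {
	params := url.Values{}
	params.Set("version", version)
	params.Set("uuid", uuid)
	params.Set("source", source)
	// Encode sorts parameters by key, so the output order is stable.
	return "https://versioncheck.linkerd.io/version.json?" + params.Encode()
}

func main() {
	// Sample values are placeholders for illustration only.
	fmt.Println(versionCheckURL("v1.2.3", "abc", "cli"))
}
```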
Part of #1604.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
I find the tap and top icons a bit strange. Using the filter icon for tap is weird because we already use the filter icon for filtering columns. The caret-up icon looks weird to me for top because it looks like something that is clicked to expand.
Change the tap icon to the Font Awesome microscope. Change the top icon to the Font Awesome stream.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Remove wait option and make it a default for check
* Switch the wait default to true
* Wait by default also for dashboard
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
* Add version check to Grafana dashboard
The web dashboard checks the local Linkerd version against the latest
release, and informs the user if an update is available. Grafana was not
doing this.
Modify the Grafana dashboard to perform a version check, and prompt the
user to update if needed.
Fixes #1607
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd check` command was not validating whether data plane
proxies were successfully reporting metrics to Prometheus.
Introduce a new check that validates data plane proxies are found in
Prometheus. This is made possible via the existing `ListPods` endpoint
in the public API, which includes an `Added` field, indicating a pod's
metrics were found in Prometheus.
Fixes #1517
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Sometimes, the tap server causes the controller pod to restart after it receives this error.
This error arises when the tap server terminates its streams to its upstream clients without first closing its gRPC tap streams to the proxies, which causes the controller pod to restart.
This PR uses the request context from the initial TapByResource call to help shut down tap streams to the data plane proxies gracefully.
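The idea of tying proxy streams to the request context can be sketched as follows. Channels stand in for the gRPC streams, and `tapProxies` and `shutdownDemo` are hypothetical names; this is not the tap server's actual code.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// tapProxies reads from one stream per proxy until the request context is
// done, then returns the number of streams that were shut down.
func tapProxies(ctx context.Context, streams []<-chan string) int {
	var wg sync.WaitGroup
	closed := make(chan struct{}, len(streams))
	for _, s := range streams {
		wg.Add(1)
		go func(s <-chan string) {
			defer wg.Done()
			defer func() { closed <- struct{}{} }()
			for {
				select {
				case <-ctx.Done():
					// The upstream request ended: stop reading from this proxy.
					return
				case _, ok := <-s:
					if !ok {
						return
					}
				}
			}
		}(s)
	}
	wg.Wait()
	return len(closed)
}

// shutdownDemo cancels the request context and reports how many proxy
// streams were closed as a result.
func shutdownDemo() int {
	ctx, cancel := context.WithCancel(context.Background())
	streams := []<-chan string{make(chan string), make(chan string)}
	cancel() // the upstream client goes away
	return tapProxies(ctx, streams)
}

func main() {
	fmt.Println("streams closed:", shutdownDemo())
}
```

Because every per-proxy goroutine selects on the same context, cancelling the original request is enough to stop all of them before the handler returns.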
fixes #1504
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Change breadcrumb header to default font in styles.css
* Change font weight for header to global font weight
* Adjust height pixels and set global font to Lato
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
There are two variables we use to control the volume of Top output,
maxRowsToDisplay, which controls how many rows are in the table, and
maxRowsToStore, which controls the size of the event index we keep in memory for
aggregating results.
Previously, we were only keeping maxRowsToDisplay rows in the index, which for
the Resource Detail page was 10. That is really small for high-traffic, REST-y
resources: rows get deleted from the index too soon, which makes the data in
the table change a lot. Change this to store maxRowsToStore rows, and also bump
this to 50. This allows us to store results for longer, and also ensures more
consistent data over time.
Another fix for the appearance of the Top columns is to add fixed widths to the
metrics. This will prevent the table from wobbling from side to side.
A bunch of web UI tweaks:
- Add a small success rate chart to the metrics tables
- Improve latency formatting for seconds latencies
- Rename upstream/downstream to inbound/outbound
- Make Top table look consistent with rest of tables on page
- Fix widths of metrics columns so that tables align
Consolidate the source and destination columns into one column,
and add a direction column (To/From) so the user knows if the
displayed resource is src/dst.
This PR is the result of a change request that was missed in PR #1613. This change removes an unnecessary calc() function in sidebar.css.
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
The web client displays `Websocket [code]` on websocket close errors.
Modify the web client to render a more helpful error message to the
user. If a reason is present, render that, otherwise translate the
websocket error code into a message.
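The fallback logic described above might look like this. The close codes are the standard ones from RFC 6455, but the message texts and the `closeErrorMessage` helper are hypothetical, and the real client is written in JavaScript rather than Go.

```go
package main

import "fmt"

// closeCodeMessages maps a few RFC 6455 close codes to readable text.
// The wording of each message is an assumption made for this sketch.
var closeCodeMessages = map[int]string{
	1000: "normal closure",
	1001: "endpoint going away",
	1006: "connection closed abnormally",
	1011: "server encountered an internal error",
}

// closeErrorMessage prefers the reason sent with the close frame and falls
// back to a translation of the close code.
func closeErrorMessage(code int, reason string) string {
	if reason != "" {
		return reason
	}
	if msg, ok := closeCodeMessages[code]; ok {
		return msg
	}
	return fmt.Sprintf("websocket closed with code %d", code)
}

func main() {
	fmt.Println(closeErrorMessage(1006, ""))
	fmt.Println(closeErrorMessage(1011, "tap request failed"))
}
```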
Fixes #1599
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR adds a breadcrumb style navigation to the Linkerd dashboard. Each "crumb" links to its corresponding page in the UI.
This PR also includes a small UI fix in the sidebar. The select box always seemed to revert to the All Namespaces option whenever there was a state change on the React side. The fix ensures that the select box always displays the namespace filter if one is available, and reverts to All Namespaces when no namespace is selected.
fixes #1464, fixes #1543, fixes #1627
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
`linkerd inject` was not checking its input for known sidecars and
initContainers.
Modify `linkerd inject` to check for existing sidecars and
initContainers, specifically, Linkerd, Istio, and Contour.
Part of #1516
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes #1593
Add a `--hide-sources` flag to `linkerd top`. Setting this removes the source column from the output.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Use Tap data on Resource Detail page to display unmeshed resources
that send traffic to the specified resource.
* Don't update neighbors on every websocket recv; this causes too much rendering.
Instead, store them in an internal variable and update with the api results.
This branch uses the src data from tap to discern which unmeshed resources are
sending traffic to the specified resource. We then show this resource in the
octopus graph.
Note that tap is sampled data, so it's possible for an unmeshed resource to not
show up. Also, because we won't know about the resource until it appears in the
Tap results, results could pop into the chart at any time.
Linkerd only routes TCP data, but `linkerd inject` does not warn when it
injects into pods with ports set to `protocol: UDP`.
Modify `linkerd inject` to warn when injecting into a pod with
`protocol: UDP`. The Linkerd sidecar will still be injected, but the
stderr output will include a warning.
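A simplified sketch of such a check is shown below. The `containerPort` type, the `udpWarnings` function, and the warning text are hypothetical; the real command works on parsed Kubernetes objects rather than these stand-in structs.

```go
package main

import (
	"fmt"
	"os"
)

// containerPort is a pared-down stand-in for a Kubernetes container port.
type containerPort struct {
	ContainerPort int
	Protocol      string // "TCP" or "UDP"
}

// udpWarnings returns one warning per UDP port. Injection itself is not
// blocked; the caller is expected to print these to stderr.
func udpWarnings(resource string, ports []containerPort) []string {
	var warnings []string
	for _, p := range ports {
		if p.Protocol == "UDP" {
			warnings = append(warnings,
				fmt.Sprintf("%s uses \"protocol: UDP\" on port %d", resource, p.ContainerPort))
		}
	}
	return warnings
}

func main() {
	ports := []containerPort{{53, "UDP"}, {80, "TCP"}}
	for _, w := range udpWarnings("deploy/dns", ports) {
		fmt.Fprintln(os.Stderr, "warning:", w)
	}
}
```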
Also add stderr checking on all inject unit tests.
Part of #1516.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
_.throttle setState for receiving websocket tap events to prevent continuous rerendering
Problem
We receive a lot of websocket events from the tap server. Previously, we
were processing each event as we received it, then calling setState after
processing to update the tables. Each call to setState triggered a re-render of
the whole table. We were re-rendering multiple times a second, causing the whole
page to become unresponsive.
Solution
Throttle setState for receiving websocket tap events to prevent
continuous rerendering. Store the tap events in an index outside of state, and
only update the state once every specified interval (currently 500ms).
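The buffering idea can be sketched independently of React. In the Go sketch below, `throttledTable` and its methods are hypothetical names: `receive` plays the role of the websocket handler, `flush` plays the role of the throttled setState, and `run` shows how a timer could drive `flush` at a fixed interval such as 500ms.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// throttledTable keeps incoming events outside the rendered state and only
// copies them over when flush is called.
type throttledTable struct {
	mu      sync.Mutex
	pending []string // events received since the last flush
	state   []string // what the table would render
	renders int      // number of state updates (re-renders)
}

// receive records an event without triggering a re-render.
func (t *throttledTable) receive(event string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.pending = append(t.pending, event)
}

// flush moves all pending events into state in a single update.
func (t *throttledTable) flush() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if len(t.pending) == 0 {
		return
	}
	t.state = append(t.state, t.pending...)
	t.pending = nil
	t.renders++
}

// run flushes once per interval until stop is closed.
func (t *throttledTable) run(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			t.flush()
		case <-stop:
			return
		}
	}
}

func main() {
	table := &throttledTable{}
	for _, ev := range []string{"a", "b", "c"} {
		table.receive(ev)
	}
	table.flush() // in practice driven by run(500*time.Millisecond, stop)
	fmt.Println("rows:", len(table.state), "renders:", table.renders)
}
```

Three events result in one state update instead of three, which is the effect the throttle is meant to achieve.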
We can now view entire namespaces with Top and the page won't crash!
To verify: Go to /top and try topping a namespace
When scrollbars are set to always be visible in a browser, we see them appear in the sidebar component of the dashboard.
This PR adds CSS that hides the scrollbar in WebKit browsers (i.e. Chrome and Safari), and uses an overflow: hidden technique inspired by this solution to hide the scrollbar in Firefox.
fixes #1611
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
If an input file is un-injectable, existing inject behavior is to simply
output a copy of the input.
Introduce a report, printed to stderr, that communicates the end state
of the inject command. Currently this includes checking for hostNetwork
and unsupported resources.
Malformed YAML documents will continue to cause no YAML output, and return
error code 1.
This change also modifies integration tests to handle stdout and stderr separately.
Example outputs:
some pods injected, none with host networking:
```
hostNetwork: pods do not use host networking...............................[ok]
supported: at least one resource injected..................................[ok]
Summary: 4 of 8 YAML document(s) injected
deploy/emoji
deploy/voting
deploy/web
deploy/vote-bot
```
some pods injected, one host networking:
```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true"
supported: at least one resource injected..................................[ok]
Summary: 3 of 8 YAML document(s) injected
deploy/emoji
deploy/voting
deploy/web
```
no pods injected:
```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true"
supported: at least one resource injected..................................[warn] -- no supported objects found
Summary: 0 of 8 YAML document(s) injected
```
TODO: check for UDP and other init containers
Part of #1516
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds a new page that shows all namespaces in an accordion. This will replace
ServiceMesh as the default landing page.
The page will request stats for all namespaces, and then pick the first meshed
namespace that's not the linkerd namespace to auto-expand in the accordion.
This branch also updates the definition of "added to the mesh" in the frontend
to be runningPodCount > 0 && meshedPodCount > 0 (previously, it was
runningPodCount === meshedPodCount, which would count resources with no pods as
"added").
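The two definitions can be compared side by side. The function names below are hypothetical, and the real check lives in the JavaScript frontend; only the two boolean expressions come from the description above.

```go
package main

import "fmt"

// isMeshed is the new definition: at least one running pod and at least
// one meshed pod.
func isMeshed(runningPodCount, meshedPodCount int) bool {
	return runningPodCount > 0 && meshedPodCount > 0
}

// wasMeshed is the previous definition, which treated a resource with no
// pods (0 == 0) as added to the mesh.
func wasMeshed(runningPodCount, meshedPodCount int) bool {
	return runningPodCount == meshedPodCount
}

func main() {
	fmt.Println("no pods, old:", wasMeshed(0, 0), "new:", isMeshed(0, 0))
	fmt.Println("3 running, 1 meshed, old:", wasMeshed(3, 1), "new:", isMeshed(3, 1))
}
```

Note that the new definition also counts partially meshed resources as added, which the old equality check did not.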
I've also moved the link to /namespaces out of the top-level sidebar and into
the Resources sub-menu.
* Use url query params for tap/top form filters
* Add comment explaining react-url-query onChange handlers
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Previously, WebSocket error messages would appear with the first
couple characters cut off. I've fixed this by using ws.WriteControl
instead of ws.WriteMessage to write errors, as gorilla does in
their example app.
- Use WriteControl to write error messages to the client
- Stop the spinner if there is an error present
For a better user experience, the sidebar should scroll independently of the detail view. This PR allows the sidebar and the detail view to scroll independently.
fixes #1547
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>