In linkerd/linkerd2-proxy#99, several proxy configuration variables were
deprecated.
This change updates the CLI to use the updated names to avoid
deprecation warnings during startup.
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
* Add a clear button to the tap and top query forms
Add clear functionality to Tap
Add clear functionality to Top
Fix a react key error when there were multiple unmeshed upstreams
Fix bug where tap results from other queries persisted when changing the query
Fix key error in autocomplete dropdown
Use the same tap data we use to display the unmeshed resources in the Octopus
graph to add the unmeshed rows to the Inbound stat table.
The unmeshed rows are filtered by resource type, so if we're on a Deployments
page, only upstreams which are deployments will show in the table. (Others, such
as IPs, will still show in the octopus graph).
* Use the list of unmeshed resources to display unmeshed sources in the table
* Keep track of number of pods in unmeshed sources
When I deleted a resource, I noticed that hard refreshing the page resulted in
js errors that would break the UI (e.g. if you were on a pod page, and the pod's
deployment was deleted, the pod would no longer be found, and the page would
error). This PR better handles not-present resources so that the UI still shows
up and shows you that there aren't metrics for that resource.
Also clean up the undefined/undefined octopus node that would show up in that
case.
Previously, we would display source and destination info in the Top/Tap table
popovers in a vertical format. This PR places them in a table so that each type
of source/dest (ip, pod, pod owner) can be read left to right.
* Display the popover Source/Destination info for the tap and top tables in a
tabular format
* Added an arrow column between Src and Dst
When pods or deployments are in an "Initialization" phase we currently see a "warning" icon that represents pods going under some kind of change. This may sometimes seem alarming when initially injecting pods after installing Linkerd.
This PR adds a new icon that shows up when pods are in the "PodInitializing" phase and shows the former "warning" icon when there is an error in starting pods.
fixes#1652
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Draw customizable SVG paths for octopus arms.
This also combines all the unmeshed resources into a list and displays them in
one resource box, instead of adding one box per unmeshed resource. This helps
keep the box heights constant, which I want to draw the arrows.
When a resource has no tap events being streamed to the Tap UI, and a user hits the "Stop" button in the Tap page, the tap stream is left open due to the WebSocket connection not being closed.
It looks like the web server's tap client that is created to stream events from the tap server blocks the main request thread in the web server. This causes the web server to stop receiving any subsequent close frames from the UI i.e. when the "Stop" button is clicked.
This PR moves the tapClient initialization code to a separate goroutine, specifically, the goroutine that reads tap events from the incoming grpc tap stream. This allows the main thread to continue reading messages from the WebSocket connection and allow it to receive close frames.
fixes#1665
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
The success rate mini chart shows a colour based on SR, and also shows the SR
via the proportion of the chart that's filled out. If the success rate is 0% (as
in the VotePoop endpoint in the emojivoto demo), the chart would be zero
percent filled out, causing it to be entirely gray. Really, it should be entirely
red, since zero SR is pretty bad.
Fix: fully fill the bar with red if there is a zero SR
Prevent error when trying to tap without having a namespace and resource
selected by disabling the tap button.
Fixes#1670
Signed-off-by: Mathis Wiehl <mathis.wiehl@sinnerschrader.com>
* Update version checks to support release channels
* Update based on review feedback
* Fix sidebar tests
* Update CI config for edge and stable tags
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
If you go to /top and select a namespace and then a resource, and then clear the
resource, there would be a javascript error that would cause the whole app not
to render. Fix this.
Problem
Previously, we'd display one row in top per sourcePod -> dstPod. When
viewing resources at a higher level though (e.g. deployments with multiple pods)
the src/dst column displays the resource at that level, and displaying multiple
rows with deploy/foo is confusing.
Solution
Key the top table off of the resource currently being requested, so
that all the rows are rolled up appropriately. In the popover for that column,
display a list of pods/ips that are rolled up.
This branch also adds a generic list of resources to the tap/top dropdown (you
were always able to tap them, but when I switched from autocomplete to select
for this dropdown, you lost the ability to type in arbitrary resources).
If you select a from namespace and from resource in /tap and try to clear them
using the little x in the form field, there would be a huge js error causing the
app to not render. Fix this.
Also removes filterOptions which wasn't being used any more. This will probably
make parsing tap results ever so slightly faster as we're now not trying to also
aggregate potential filter options.
* Fix js errors on Tap form when Clear button is hit
* Remove filter options code since we're not using the filters anywhere
When a websocket connection is closed between Chrome and a server, we get a 1006 error code signifying abnormal closure of the websocket connection. It seems as if we only get this error on Chrome web clients. Firefox and Safari do not encounter this issue.
The solution is to suppress 1006 errors that occur in the web browser since the connection is closed anyway. There is no negative side effect that occurs when the connection is closed abnormally and so the error message is benign.
fixes#1630
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Don't display RPS unit in metrics table
* Fix Tap and Top icons not being minimized correctly
* remove metric tooltip on RPS column
* Fix extra spacing on Tap/Top in sidebar
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps
Signed-off-by: Alex Leong <alex@buoyant.io>
Try to make the tap table easier to parse by moving some info into the expanded
row. You can also now click anywhere on the row to expand.
The mocks in #1629 have Authority, Path and Latency buried, but I figured they might be
useful to see in the top level, so they're here.
The `linkerd check` parameter hits
https://versioncheck.linkerd.io/version.json to check for the latest
Linkerd version. This loses information, as that endpoint is intended to
record current version, uuid, and source.
Modify `linkerd check` to set `version`, `uuid`, and `source`
parameters when performing a version check.
Part of #1604.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
I find the tap and top icons a bit strange. Using the filter icon for tap is weird because we already use the filter icon for filtering columns. The caret-up icon looks weird to me for top because it looks like something that is click to expand.
Change the tap icon to the Font Awesome microscope. Change the top icon to the Font Awesome stream.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Remove wait option and make it a default for check
* Switch the wait default to true
* Wait by default also for dashboard
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
* Add version check to Grafana dashboard
The web dashboard checks the local Linkerd version against the latest
release, and informs the user if an update is available. Grafana was not
doing this.
Modify the Grafana dashboard to perform a version check, and prompt the
user to update if needed.
Fixes#1607
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd check` command was not validating whether data plane
proxies were successfully reporting metrics to Prometheus.
Introduce a new check that validates data plane proxies are found in
Prometheus. This is made possible via the existing `ListPods` endpoint
in the public API, which includes an `Added` field, indicating a pod's
metrics were found in Prometheus.
Fixes#1517
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Sometimes, the tap server causes the controller pod to restart after it receives this error.
This error arises when the Tap server does not close gRPC tap streams to proxies before the tap server terminates its streams to its upstream clients and causes the controller pod to restart.
This PR uses the request context from the initial TapByReource to help shutdown tap streams to the data plane proxies gracefully.
fixes#1504
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* change breadcrumb header to default font in styles.css
* change font weight for header to global font weight
* adjust height pixels and set global font to Lato
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
There are two variables we use to control the volume of Top output,
maxRowsToDisplay, which controls how many rows are in the table, and
maxRowsToStore, which controls the size of the event index we keep in memory for
aggregating results.
Previously, we were only keeping in index maxRowsToDisplay rows, which for the
Resource Detail page was 10 (which is really small for high traffic rest-y
resource traffic - it causes rows to be deleted from the index too soon, and
then causes the data in the table to change a lot). Change this to store
maxRowsToStore rows, and also bump this to 50. This allows us to store results
for longer, and also ensures more consistent data over time.
Another fix for the appearance of the Top columns is to add fixed widths to the
metrics. This will prevent the table from wobbling from side to side.
A bunch of web UI tweaks:
- Add a small success rate chart to the metrics tables
- Improve latency formatting for seconds latencies
- Rename upstream/downstream to inbound/outbound
- Make Top table look consistent with rest of tables on page
- Fix widths of metrics column columns so that tables align