Commit Graph

1231 Commits

Author SHA1 Message Date
Risha Mars 7d2f2afb36
Improve top table to better cope with high RPS traffic (#1634)
There are two variables we use to control the volume of Top output,
maxRowsToDisplay, which controls how many rows are in the table, and
maxRowsToStore, which controls the size of the event index we keep in memory for
aggregating results.

Previously, we were only keeping maxRowsToDisplay rows in the index, which for
the Resource Detail page was 10. That is really small for high-traffic REST-y
resources: rows get deleted from the index too soon, which in turn makes the
data in the table change a lot. Change this to store maxRowsToStore rows, and
also bump this to 50. This allows us to store results for longer, and also
ensures more consistent data over time.

Another fix for the appearance of the Top columns is to add fixed widths to the
metrics. This will prevent the table from wobbling from side to side.
2018-09-12 14:56:24 -07:00
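The two limits described in #1634 above can be sketched roughly as follows. This is a minimal illustration, not the dashboard's actual code: the `TopIndex` class, its `add` and `rows` methods, the eviction policy (drop the least recently updated row), and the sort order are all assumptions made for the example. Only the names `maxRowsToStore` and `maxRowsToDisplay` and the values 50 and 10 come from the commit message.

```javascript
// Hypothetical sketch: keep up to maxRowsToStore rows in memory,
// but only hand maxRowsToDisplay rows to the table.
class TopIndex {
  constructor(maxRowsToStore = 50, maxRowsToDisplay = 10) {
    this.maxRowsToStore = maxRowsToStore;
    this.maxRowsToDisplay = maxRowsToDisplay;
    this.index = new Map(); // key -> { key, count, lastUpdated }
  }

  // Record one tap event for the given key (e.g. "GET /api/list").
  add(key, timestamp) {
    const row = this.index.get(key) || { key, count: 0, lastUpdated: 0 };
    row.count += 1;
    row.lastUpdated = timestamp;
    this.index.set(key, row);

    // Assumed policy: evict the least recently updated rows once the
    // store is over capacity.
    while (this.index.size > this.maxRowsToStore) {
      let oldest = null;
      for (const candidate of this.index.values()) {
        if (oldest === null || candidate.lastUpdated < oldest.lastUpdated) {
          oldest = candidate;
        }
      }
      this.index.delete(oldest.key);
    }
  }

  // Rows for display: highest count first, truncated to maxRowsToDisplay.
  rows() {
    return [...this.index.values()]
      .sort((a, b) => b.count - a.count)
      .slice(0, this.maxRowsToDisplay);
  }
}
```

Because the store is larger than the display, a row that temporarily drops out of the visible set keeps its accumulated count instead of restarting from zero when it reappears.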
Risha Mars b49ccce5f0
Add small success rate chart to table, misc web tweaks (#1628)
A bunch of web UI tweaks: 
- Add a small success rate chart to the metrics tables
- Improve latency formatting for seconds latencies
- Rename upstream/downstream to inbound/outbound
- Make Top table look consistent with rest of tables on page
- Fix widths of metrics columns so that tables align
2018-09-12 13:47:46 -07:00
Risha Mars 01be78e455
Consolidate the source and destination columns in the Tap and Top tables (#1620)
Consolidate the source and destination columns into one column, 
and add a direction column (To/From) so the user knows if the 
displayed resource is src/dst.
2018-09-12 13:30:52 -07:00
Dennis Adjei-Baah b10b8cb8c4
remove extraneous calc function in sidebar.css (#1632)
This PR is a result of a change request that was missed in PR #1613. This change removes an unnecessary calc() function in the sidebar.css

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-12 11:37:59 -07:00
Andrew Seigner 6c45c07ede
Display more helpful websocket errors (#1626)
The web client displays `Websocket [code]` on websocket close errors.

Modify the web client to render a more helpful error message to the
user. If a reason is present, render that, otherwise translate the
websocket error code into a message.

Fixes #1599

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-12 11:29:11 -07:00
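The behaviour described in #1626 above (prefer the close reason, otherwise translate the code) might look something like the sketch below. The function name `websocketErrorMessage` and the exact message wording are hypothetical; the close codes and their meanings are the standard ones from RFC 6455, and `code` and `reason` are the standard `CloseEvent` properties.

```javascript
// Hypothetical helper: prefer the close reason, otherwise map the
// close code to a short description (codes per RFC 6455).
const closeCodeMessages = {
  1000: "Normal closure",
  1001: "Going away",
  1002: "Protocol error",
  1003: "Unsupported data",
  1006: "Abnormal closure",
  1009: "Message too big",
  1011: "Internal error",
};

function websocketErrorMessage(closeEvent) {
  // A non-empty reason supplied by the server wins.
  if (closeEvent.reason) {
    return closeEvent.reason;
  }
  const known = closeCodeMessages[closeEvent.code];
  return known
    ? `${known} (code ${closeEvent.code})`
    : `WebSocket closed with code ${closeEvent.code}`;
}
```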
Dennis Adjei-Baah b1181e552d
Add breadcrumb navigation at the top of linkerd dashboard (#1613)
This PR adds a breadcrumb style navigation to the Linkerd dashboard. Each "crumb" links to its corresponding page in the UI.

This PR also includes a small UI fix in the sidebar. The select box always seems to revert to the All Namespaces option whenever there is a state change on the React side. The fix ensures that the select box always displays the namespace filter if it is available and reverts to All Namespaces when no namespace is selected.

fixes #1464
fixes #1543
fixes #1627

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-12 09:22:01 -07:00
Andrew Seigner 5d85680ec1
Introduce inject check for known sidecars (#1619)
`linkerd inject` was not checking its input for known sidecars and
initContainers.

Modify `linkerd inject` to check for existing sidecars and
initContainers, specifically, Linkerd, Istio, and Contour.

Part of #1516

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-11 15:09:19 -07:00
Andrew Seigner bae05410fd
Bump Prometheus to v2.4.0, Grafana to 5.2.4 (#1625)
Prometheus v2.3.1 -> v2.4.0
Grafana 5.1.3 -> 5.2.4

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-11 14:45:55 -07:00
Kevin Lingerfelt c4a0278a75
Improve performance of tap table by throttling updates (#1623)
* Improve performance of tap table by throttling updates
* Rename debounced to throttled

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-11 14:28:54 -07:00
Alex Leong bd15482329
Add with-source flag to top (#1614)
Fixes #1593 

Add a `--hide-sources` flag to `linkerd top`.  Setting this removes the source column from the output.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-09-11 14:21:36 -07:00
Risha Mars 6b830ef4b3
Use Tap data on Resource Detail page to display unmeshed resources (#1596)
* Use Tap data on Resource Detail page to display unmeshed resources
that send traffic to the specified resource.

* Don't update neighbors on every websocket recv; this causes too much rendering.
Instead, store them in an internal variable and update with the api results.

This branch uses the src data from tap to discern which unmeshed resources are
sending traffic to the specified resource. We then show this resource in the
octopus graph.

Note that tap is sampled data, so it's possible for an unmeshed resource to not
show up. Also, because we won't know about the resource until it appears in the
Tap results, results could pop into the chart at any time.
2018-09-11 10:34:27 -07:00
Andrew Seigner 7eec5f181d
Inject warns on UDP ports (#1617)
linkerd only routes TCP data, but `linkerd inject` does not warn when it
injects into pods with ports set to `protocol: UDP`.

Modify `linkerd inject` to warn when injected into a pod with
`protocol: UDP`. The Linkerd sidecar will still be injected, but the
stderr output will include a warning.

Also add stderr checking on all inject unit tests.

Part of #1516.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-11 10:12:45 -07:00
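To illustrate the check described in #1617 above: `linkerd inject` itself is written in Go, so the JavaScript below is only a sketch of the idea over an already-parsed pod spec. The function name `udpPortWarnings` and the warning text are hypothetical. The fields `containers`, `ports`, `protocol`, and `containerPort` are standard Kubernetes pod spec fields.

```javascript
// Hypothetical sketch: collect warnings for container ports that
// declare `protocol: UDP` in a parsed pod spec.
function udpPortWarnings(resourceName, podSpec) {
  const warnings = [];
  for (const container of podSpec.containers || []) {
    for (const port of container.ports || []) {
      // Kubernetes treats an omitted `protocol` as TCP, so only an
      // explicit "UDP" triggers a warning.
      if (port.protocol === "UDP") {
        warnings.push(
          `${resourceName}: container "${container.name}" uses "protocol: UDP" on port ${port.containerPort}`
        );
      }
    }
  }
  return warnings;
}
```

Returning the warnings rather than printing them mirrors the commit's approach of still injecting the sidecar and reporting the problem separately on stderr.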
Kevin Lingerfelt f3301594ad
Fix landing page when there are no meshed namespaces (#1622)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-10 19:02:55 -07:00
Risha Mars 55402da493
Improve performance of Top tables (#1616)
_.throttle setState for receiving websocket tap events to prevent continuous rerendering

Problem 

We receive a lot of websocket events from the tap server. Previously, we
were processing each event as we received it, then calling setState after
processing to update the tables. Each call to setState triggered a re-render of
the whole table. We were rerendering multiple times a second, causing the whole
page to become unresponsive.

Solution 

Throttle setState for receiving websocket tap events to prevent
continuous rerendering. Store the tap events in an index outside of state, and
only update the state once every specified interval (currently 500ms).

We can now view entire namespaces with Top and the page won't crash! 
To verify: Go to /top and try topping a namespace
2018-09-10 16:02:29 -07:00
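The approach in #1616 above (accumulate events outside of state, update state at most once per interval) can be sketched without React or lodash as follows. The names `createThrottledIndex`, `receive`, and `onFlush`, and the `event.id` key, are hypothetical; the 500ms default comes from the commit message. Unlike lodash's `_.throttle`, which by default also fires a trailing call, this simplified version only flushes when an event arrives.

```javascript
// Hypothetical sketch: buffer tap events outside of component state and
// publish a snapshot at most once per interval.
function createThrottledIndex(onFlush, intervalMs = 500, now = Date.now) {
  const index = new Map(); // accumulates events between flushes
  let lastFlush = -Infinity;

  return function receive(event) {
    index.set(event.id, event); // cheap: no re-render happens here

    const current = now();
    if (current - lastFlush >= intervalMs) {
      lastFlush = current;
      onFlush([...index.values()]); // e.g. a single setState call
    }
  };
}
```

In a component, `onFlush` would be the place to call `setState`, so the table re-renders roughly twice a second regardless of how many websocket messages arrive.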
Dennis Adjei-Baah 7cc64843a3
Hide scrollbars in sidebar (#1615)
When scrollbars are set to always be visible in a browser, we see them appear in the sidebar component of the dashboard.

This PR adds CSS that hides the scrollbar for WebKit browsers, i.e., Chrome and Safari and uses an overflow: hidden technique inspired by this solution to hide the scrollbar in Firefox.

fixes #1611

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-10 13:20:08 -07:00
Andrew Seigner c5a719da47
Modify inject to warn when file is un-injectable (#1603)
If an input file is un-injectable, existing inject behavior is to simply
output a copy of the input.

Introduce a report, printed to stderr, that communicates the end state
of the inject command. Currently this includes checking for hostNetwork
and unsupported resources.

Malformed YAML documents will continue to cause no YAML output, and return
error code 1.

This change also modifies integration tests to handle stdout and stderr separately.

example outputs...

some pods injected, none with host networking:

```
hostNetwork: pods do not use host networking...............................[ok]
supported: at least one resource injected..................................[ok]

Summary: 4 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
  deploy/vote-bot
```

some pods injected, one host networking:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true"
supported: at least one resource injected..................................[ok]

Summary: 3 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
```

no pods injected:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true"
supported: at least one resource injected..................................[warn] -- no supported objects found

Summary: 0 of 8 YAML document(s) injected
```

TODO: check for UDP and other init containers

Part of #1516

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-10 10:34:25 -07:00
Risha Mars 828ea29321
Fix sidebar dot colours for resources not receiving traffic (#1612)
Colour the dot gray in the sidebar if the resource isn't 
receiving traffic (i.e. success rate is null).
2018-09-07 14:26:36 -07:00
Risha Mars 761d8453a8
Add a new namespace overview page with expandable sections (#1605)
Adds a new page that shows all namespaces in an accordion. This will replace
ServiceMesh as the default landing page.

The page will request stats for all namespaces, and then pick the first meshed
namespace that's not the linkerd namespace to auto-expand in the accordion.

This branch also updates the definition of "added to the mesh" in the frontend
to be runningPodCount > 0 && meshedPodCount > 0 (previously, it was
runningPodCount === meshedPodCount, which would count resources with no pods as
"added").

I've also moved the link to /namespaces out of the top-level sidebar and into
the Resources sub-menu.
2018-09-07 13:30:52 -07:00
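The change in the "added to the mesh" definition from #1605 above is easiest to see side by side. The two predicates follow the commit message directly; the function names, the `namespaceToExpand` helper, and the shape of the namespace objects are assumptions made for this sketch.

```javascript
// Previous definition: a resource with zero pods (0 === 0) counted as "added".
function wasConsideredMeshed(resource) {
  return resource.runningPodCount === resource.meshedPodCount;
}

// Updated definition from the commit message.
function isMeshed(resource) {
  return resource.runningPodCount > 0 && resource.meshedPodCount > 0;
}

// Hypothetical helper: pick the first meshed namespace other than the
// linkerd namespace to auto-expand in the accordion.
function namespaceToExpand(namespaces, controlPlaneNamespace = "linkerd") {
  const found = namespaces.find(
    ns => ns.name !== controlPlaneNamespace && isMeshed(ns)
  );
  return found ? found.name : null;
}
```

Note that the new definition also counts partially meshed resources (some but not all running pods meshed) as added, which the strict equality check did not.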
Kevin Lingerfelt f884caf56d
Upgrade protobuf to v1.2.0 (#1591)
* Upgrade protobuf to v1.2.0
* Fix Gopkg.lock
* Switch linkerd2-proxy-api dep back to stable

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-06 11:36:29 -07:00
Kevin Lingerfelt e4f14cab66
Use url query params for tap/top form filters (#1584)
* Use url query params for tap/top form filters
* Add comment explaining react-url-query onChange handlers

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-05 11:47:42 -07:00
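The commit above (#1584) uses react-url-query for this. As a library-free illustration of the same idea, the sketch below round-trips form filters through a query string with the standard `URLSearchParams` API. The function names and the filter keys (`namespace`, `resource`, `authority`) are hypothetical.

```javascript
// Hypothetical sketch: serialize non-empty form filters to a query string.
function filtersToQuery(filters) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    if (value !== "" && value != null) {
      params.set(key, value);
    }
  }
  return params.toString();
}

// Parse a query string back into a plain filters object.
function queryToFilters(query) {
  return Object.fromEntries(new URLSearchParams(query));
}
```

Keeping the filters in the URL means a tap or top query can be bookmarked or shared, and survives a page reload.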
Risha Mars d679c1fa0e
Error display fixes in web for the tap query handling. (#1583)
Previously, WebSocket error messages would appear with the first 
couple characters cut off. I've fixed this by using ws.WriteControl 
instead of ws.WriteMessage to write errors, as gorilla does in 
their example app.

- Use writeControl to write error messages to the client
- Stop the spinner if there is an error present
2018-09-05 10:28:59 -07:00
Dennis Adjei-Baah 127e496444
Make sidebar scrollbar independently (#1572)
For a better UI experience, the sidebar should be able to scroll independently from the detail view. This PR allows both the sidebar and the detail view to scroll independently.

fixes #1547

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-04 15:51:16 -07:00
Kevin Lingerfelt b5ff29c8aa
Add data plane check to validate proxy version (#1574)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-04 15:22:38 -07:00
Risha Mars 7f4fc308af
Link the resources in the Tap and Top tables to their detail pages (#1569)
Introduces a new helper, ResourceLink, that makes a prefixed link to the 
Resource Detail pages.
2018-08-31 15:53:02 -07:00
Kevin Lingerfelt c7a79da89c
Add data plane check to validate proxies are ready (#1570)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-31 15:51:57 -07:00
Risha Mars 0eaa9e4952
Tap the entire namespace if no resource is specified (#1558)
- Use an ant Select instead of Autocomplete for resource list, so that the user
can see all available tappable resources 

- Fix bug where the authority autocomplete wasn't showing any options 

- Adds "namespace/" as an option in the resource selection dropdown

This required some weird handling because we allow requests of the form
linkerd tap namespace/linkerd (taps namespace linkerd)
but not
linkerd tap namespace --namespace linkerd (does not work as intended, 
taps every namespace)
2018-08-31 15:50:48 -07:00
Risha Mars a0a2adc52c
Don't start a Top in the resource detail page if the resource is unmeshed (#1563)
Don't start a Top in the resource detail page if the resource is unmeshed.
Instead, show a call to action showing how to add linkerd.
2018-08-31 14:04:59 -07:00
Risha Mars f396459033
Remove the SocialLinks from the Sidebar (#1565)
Remove twitter, github and slack links from the sidebar.
The "Update Linkerd" menu item will still show up if there's an update.
The "Update now" button will also still show.
2018-08-31 12:34:56 -07:00
Risha Mars 41e5a76355
Update CHANGES.md for v18.8.4 release (#1562)
2018-08-30 14:09:43 -07:00
Risha Mars 249b51f950
Increase MaxRps in Tap server, remove default setting from Web (#1560)
Increase the MaxRps on the tap server to 100 RPS.

The max RPS for tap/top was increased for the CLI in #1531, but we were
still manually setting this to 1 RPS in the Web UI and Web server.

Remove the pervasive setting of MaxRps to 1 in the web frontend and server
2018-08-30 13:37:37 -07:00
Risha Mars d0c5dbd386
Fix query string in version url (#1559)
s/?/&
2018-08-30 10:57:02 -07:00
Risha Mars d3544d4064
Fix sidebar resource names not appearing (#1556)
In #1536 I removed the entire title attribute instead of removing 
the spans in the title, causing the titles to not appear.

Re-add the title text.
2018-08-29 15:57:17 -07:00
Thomas Rampelberg b0d027aeef
Allow cookies for versioncheck (#1551)
2018-08-29 15:49:57 -07:00
Risha Mars dbaf4bd3a4
Update CHANGES.md for v18.8.3 release (#1549)
2018-08-29 12:25:21 -07:00
Risha Mars 78fa120cae
Don't use prefixed fetch for version check (#1548)
In #1540 I moved the version check code and was using our prefixed version of
fetch. This is unnecessary because the version check URL doesn't depend on the
dashboard URL prefix.

TLDR Don't use prefixed fetch for version check
2018-08-29 12:17:14 -07:00
Risha Mars f9b27c7ef2
Miscellaneous small web UI fixes (#1536)
A bunch of small items. 

This branch:
- filters out un-meshed resources from the Tap and Top autocompletes
- removes an un-rendered title attribute from the sidebar menu items
- formats latency in Tap with a comma
- prevents the grafana link from showing if there are 0 pods in a deployment
2018-08-29 10:56:07 -07:00
Sebastian Tiedtke dc4c28345a
Better visual distinction of inline code snippet in mesh completion message (#1539)
When the mesh completion message calls to action it prints a CLI command to copy&paste. It's visually hard to separate message from the command snippet which is what this commit fixes.

Flipped background and font color to create a better visual distinction

Successfully ran web app test suite

Signed-off-by: Sebastian Tiedtke <sebastiantiedtke@gmail.com>
2018-08-28 14:37:38 -07:00
Risha Mars 77ddd142c3
Perform linkerd version check once upon page load (#1540)
Previously, we included a version check in the server polling loop, which meant
we were hitting the version check endpoint once every 10 seconds from the
sidebar. This PR moves that check out of the loop so that we only hit it once,
upon pageload.

This PR also includes

- some whitespace fixes 
- a fix for a console error we were triggering with our tests
2018-08-28 14:36:11 -07:00
Alex Leong 0f7d684ca9
Increase default max-rps for tap and top (#1531)
The default value for the max-rps argument to the tap and top commands is an overly conservative 1rps.  This causes the data to come in very slowly and much data to be discarded.  Furthermore, because tap requests are windowed to 10 seconds, this causes long pauses between updates.

We fix this in two ways.  Firstly we reduce the window size to 1s so that updates will come in at least once per second, even when the actual RPS of the data path is extremely high.  Secondly, we increase the default max-rps parameter from 1 to 100.  This allows tap to paint an accurate picture of the data much more quickly and sidesteps some sampling bias that happens when the max-rps is low.

In general, tap events tend to happen in bursts.  For example, one request in may trigger one or more requests out.  Likewise, a single upstream event may trigger several requests to the tapped pod in quick succession.  Sampling bias will occur when the max-rps is less than the actual rps and when the tap event limit subdivides these event bursts (biasing towards the first few events in the burst).  The greater the max-rps, the less the effects of this bias.

Fixes #1525 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-08-28 14:16:39 -07:00
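The sampling bias described in #1531 above can be reproduced with a toy rate limiter. This is not the tap server's implementation; `sampleByWindow` and the event shape are hypothetical, and the limiter simply keeps the first `limit` events in each window and drops the rest.

```javascript
// Hypothetical sketch: keep at most `limit` events per window of
// `windowMs` milliseconds, dropping the rest.
// Assumes `events` is sorted by ascending timestamp.
function sampleByWindow(events, limit, windowMs) {
  const kept = [];
  let windowStart = null;
  let countInWindow = 0;
  for (const event of events) {
    // Start a new window once the current one has elapsed.
    if (windowStart === null || event.timestamp - windowStart >= windowMs) {
      windowStart = event.timestamp;
      countInWindow = 0;
    }
    if (countInWindow < limit) {
      kept.push(event);
      countInWindow += 1;
    }
  }
  return kept;
}

// Each burst is one inbound request followed by two outbound requests.
const bursts = [
  { timestamp: 0, name: "in" },
  { timestamp: 1, name: "out-a" },
  { timestamp: 2, name: "out-b" },
  { timestamp: 1000, name: "in" },
  { timestamp: 1001, name: "out-a" },
  { timestamp: 1002, name: "out-b" },
];
```

With `sampleByWindow(bursts, 1, 1000)` only the two `in` events survive, so the outbound requests never appear in the sample; with a limit of 100 all six events are kept.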
Risha Mars 136b9cc7c1
Add linkerd check flag to run data plane checks (#1528)
Adds a --proxy flag to the linkerd check CLI command which will run 
to-be-implemented data plane checks
2018-08-28 10:16:24 -07:00
Risha Mars fff09c5d06
Only tap pods that are meshed (#1535)
Previously, we would tap any resource's pods, regardless of whether the pods
were meshed or not. We can't actually tap non-meshed pods, so I'm adding a check
that will filter out non-meshed pods from the pods that tap watches.

Previous behaviour:
When attempting to tap a non-meshed pod, it would establish
a watch on the pods, but then never return any results. In the CLI you could
just cancel it with Ctrl-C. In the web, clicking Stop would send a
WebSocket.close(1000) but wouldn't actually close the connection... 

Behaviour after change :
If no pods under the specified resource are meshed, it'll
return an error of no pods being found to tap
2018-08-28 09:59:52 -07:00
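A rough sketch of the filtering described in #1535 above. The tap server is not written in JavaScript, and how it decides that a pod is meshed is not stated in the commit message; this example assumes, purely for illustration, that a meshed pod is one with a container named `linkerd-proxy`. The function names and the error text are also hypothetical.

```javascript
// Assumption for illustration only: a pod counts as meshed when it has
// a container named "linkerd-proxy".
function isPodMeshed(pod) {
  return (pod.spec.containers || []).some(c => c.name === "linkerd-proxy");
}

// Keep only meshed pods; fail fast instead of watching pods that can
// never return tap results.
function podsToTap(pods) {
  const meshed = pods.filter(isPodMeshed);
  if (meshed.length === 0) {
    throw new Error("no pods found to tap");
  }
  return meshed;
}
```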
Dennis Adjei-Baah 097632a2f0
Add kubernetes style sidebar (#1500)
Linkerd CLI's "look and feel" is similar to Kubernetes kubectl CLI. Linkerd's dashboard can be extended to match Kubernetes dashboard UI.

This PR serves as a starting point for this work. The new sidebar shows all resources from all namespaces on initial page load. Resources can be filtered to show only items in a given namespace. The sidebar displays authority, deployment, service, and pod resources. We may need to think about whether it is necessary to show all resource types. Some resources, i.e. authorities, contain a large cardinality of resource details and may not be very useful to a user.

fixes #1449

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-08-27 12:59:37 -07:00
Risha Mars 27e52a6cc0
Add ReadinessProbe and LivenessProbe to injected proxy containers (#1530)
Adds basic probes to the linkerd-proxy containers injected by linkerd inject.

- Currently the Readiness and Liveness probes are configured to be the same. 
- I haven't supplied a periodSeconds, but the default is 10.
- I also set the initialDelaySeconds to 10, but that might be a bit high.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
2018-08-27 11:55:17 -07:00
Alex Leong 1f42996889
Document tps-reports (#1509)
It's not obvious from the name what the tps-reports API endpoint does.

Added a few comments to clarify.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-08-24 13:26:35 -07:00
Risha Mars 1d3580ba0c
Add success rate visuals to the octopus graph (#1519)
Add gauge chart to octopus cards
2018-08-24 10:10:27 -07:00
Kevin Lingerfelt de71132c21
Remove doc dir in favor of linkerd/website repo (#1511)
* rm doc dir in favor of linkerd/website repo
* Add note about doc source code in README

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-23 21:53:09 -07:00
Risha Mars 63be9b1a3d
Add top link to sidebar, fix js error (#1508)
- Add a link to Top in the sidebar
- Fix a console error caused by having duplicate react keys
2018-08-22 18:19:17 -07:00
Risha Mars 3fde755a8f
Add top request table to resource detail page (#1507)
Includes a substantial refactor of Top.jsx to move the websocket
and top-request-aggregation code into a self-contained module
so that this code can be shared by /top and by each resource
detail page.

(This refactor also helps separate concerns in that
page, since that page also makes 10 second requests to the stat
api to populate the autocompletes in the form.)

The TopModule uses the startTap prop to figure out whether it
should start a websocket connection and make a tap request
when mounted. (This is because the resource detail pages
start tapping immediately upon load, whereas /top can only
start once you've entered a query.)

I've removed the spinner and the awaitingWebSocketConnection
state field because that now belongs in the top module. I think a
similar refactor of tap would be good before we re-add it.
2018-08-22 18:18:35 -07:00
Kevin Lingerfelt 211fca1806
Update CHANGES.md for v18.8.2 release (#1506)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-22 12:59:20 -07:00
Kevin Lingerfelt 4450a7536d
Add --wait flag for CLI check and dashboard commands (#1503)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-22 12:56:42 -07:00