There are two variables we use to control the volume of Top output,
maxRowsToDisplay, which controls how many rows are in the table, and
maxRowsToStore, which controls the size of the event index we keep in memory for
aggregating results.
Previously, we were only keeping in index maxRowsToDisplay rows, which for the
Resource Detail page was 10 (which is really small for high traffic rest-y
resource traffic - it causes rows to be deleted from the index too soon, and
then causes the data in the table to change a lot). Change this to store
maxRowsToStore rows, and also bump this to 50. This allows us to store results
for longer, and also ensures more consistent data over time.
Another fix for the appearance of the Top columns is to add fixed widths to the
metrics. This will prevent the table from wobbling from side to side.
A bunch of web UI tweaks:
- Add a small success rate chart to the metrics tables
- Improve latency formatting for seconds latencies
- Rename upstream/downstream to inbound/outbound
- Make Top table look consistent with rest of tables on page
- Fix widths of metrics column columns so that tables align
Consolidate the source and destination columns into one column,
and add a direction column (To/From) so the user knows if the
displayed resource is src/dst.
* Use Tap data on Resource Detail page to display unmeshed resources
that send traffic to the specified resource.
* Don't update neighbors on every websocket recv; this causes too much rendering.
Instead, store in internal variable and update with the api results.
This branch uses the src data from tap to discern which unmeshed resources are
sending traffic to the specified resource. We then show this resource in the
octopus graph.
Note that tap is sampled data, so it's possible for an unmeshed resource to not
show up. Also, because we won't know about the resource until it appears in the
Tap results, results could pop into the chart at any time.
Add a Top page to the linkerd web UI. This is the web equivalent of #1435.
I've used the same fields as in the current implementation.
This branch also includes some slight refactors to the Tap code to enable code reuse.
The request processing logic is pretty similar to that in Tap.jsx, except that we can
immediately discard the result once we receive the response end and aggregate
that result into the top results. So the index of tap results will tend to be smaller
(unless they're long running requests like streaming). But we also add a similar
index of aggregated Top results, and discard oldest results if top has been
running for a long time.
* Add a Top page to the web UI
* Refactor Tap event parsing into common util code
* Small refactors to the TapQueryForm and the CliCmd display to accomodate Top
* Collate tap events based on the ID (src, dst, stream)
* Also refactor keying of req/rsp/end into requestInit/responseInit/responseEnd for clarity
* Use pod labels when present in top
* Fix bug where src/dst were switched in the Tap display table