Linkerd CLI's "look and feel" is similar to Kubernetes kubectl CLI. Linkerd's dashboard can be extended to match Kubernetes dashboard UI.
This PR serves as a starting point for this work. The new sidebar shows all resources from all namespaces on initial page load. Resources can be filtered to show only items in a given namespace. The sidebar displays authority, deployment, service and, pod resources. We may need to think about whether it is necessary to show all resources types. Some resources, i.e. authorities, contain a large cardinality of resource details and may not be very useful to a user.
fixes#1449
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Add a basic top graph depicting the current resource's stats
and it's upstreams and downstreams.
Also add upstreams and downstreams tables for this resource
This will be styled more later, but just getting the basic components
and data onto the page.
Add a pod table to the Resource Detail page showing metrics
for pods belonging to a resource.
In the future, I think we'll modify the stat summary endpoint to
take multiple resources as arguments, and have the resource detail page
first query for the pods associated with the resource and then
query for stats for those pods.
See #1467 for discussion.
This PR also modifies the queries to not use the withREST component, in anticipation of the above changes.
This PR started out as a PR to link to our Resource Detail dashboard in
addition to grafana in the resource list pages, but I decided to refactor
the way we deal with our svgs since I was here.
This branch:
- modifies the GrafanaLink component to consist of the grafana icon
that links to grafana adds links to the ResourceDetail page in all our metrics tables
- adds a jsx component we can use to wrap svgs so that we don't get
annoying 404s on images that we have to handle
- remove the relative paths hack for images
- removes unused svg files in /img
Currently conduit stat outputs a column that shows the number of meshed pods in the resource being
queried. The web UI does not have this information about meshed pod state.
This commit adds a meshed column for better UI parity with the stat command.
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Add a Top page to the linkerd web UI. This is the web equivalent of #1435.
I've used the same fields as in the current implementation.
This branch also includes some slight refactors to the Tap code to enable code reuse.
The request processing logic is pretty similar to that in Tap.jsx, except that we can
immediately discard the result once we receive the response end and aggregate
that result into the top results. So the index of tap results will tend to be smaller
(unless they're long running requests like streaming). But we also add a similar
index of aggregated Top results, and discard oldest results if top has been
running for a long time.
* Add a Top page to the web UI
* Refactor Tap event parsing into common util code
* Small refactors to the TapQueryForm and the CliCmd display to accomodate Top
* Collate tap events based on the ID (src, dst, stream)
* Also refactor keying of req/rsp/end into requestInit/responseInit/responseEnd for clarity
* Use pod labels when present in top
* Fix bug where src/dst were switched in the Tap display table
Tap.jsx is really large and contains a lot of logic that pertains only to the Tap Query Form.
This PR tries to separate the concerns of the form and the query display from the main
Tap querying and rendering logic.
This will also allow us to easily reuse this form/CLI formatting for the Top page.
Changes in this PR:
* moves all the code for the form into its own component (TapQueryForm)
* moves the code that displays the current query into its own component (TapQueryCliCmd)
* formats the current tap query as the equivalent command line format that you
can paste into a terminal
* Changing the statusText to be an object with more fields, then displaying them in the ErrorBanner
Signed-off-by: Adam Christian <adam@buoyant.io>
Refactoring karma tests and propTypes and defaultProps per the code review from @rmars
Signed-off-by: Adam Christian <adam@buoyant.io>
Changing the default message to pass the ServiceMeshTest ErrorBanner assertion
Revert "Changing the default message to pass the ServiceMeshTest ErrorBanner assertion"
This reverts commit 2415b7099b03ad7a8deda9f67218bb531111b3ec.
Fixing the failing karma unit tests because the statusMessage wasn't being properly passed into the component rendering stub context
Signed-off-by: Adam Christian <adam@buoyant.io>
merging master in
Signed-off-by: Adam Christian <adam@buoyant.io>
* Export api error type independently from ApiHelpers
Signed-off-by: Adam Christian <adam@buoyant.io>
* Make use of the Web UI to render tap events in a table
- Return JSON tap events instead of the command line output
- Experiment with a different way of rendering the EventList
- changed the default width back to 100% of the screen because this
table does not look great squished
* This commit adds an application topology graph within the namespace tab. As a developer / operator one would like to see an overview of the services running to identify dependencies. Adding this graph gives Linkerd2 users a good overview of service dependencies.
* networkgraphtest added
Fixes: #924
Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>
This PR starts removing all references to the word "Conduit" in the web UI.
In the interest of not making huge changes all at once, I'll gradually start moving away
from the usage of "conduit" in the Web UI. For example, there are a lot of components that
have conduit in their names but they don't need to.
This branch is mostly component / variable names. There should be no visible changes except
the spinner is no longer a Conduit spinner.
See #1262 for visible branding changes.
- Rename ConduitLink to PrefixedLink
- Remove ConduitSpinner in favour of antd.Spin
- Remove css classnames that are conduit- centered
- Parameterize the current Product Name so that it's easier to change in the future
Tracking ticket: linkerd/linkerd#2018
- Return pod uptimes from the GetPods endpoint
- Adds filtering by namespace to api.GetPods
- Adds a --namespace filter to conduit get pods
- Adds pod uptimes to the controller component toolitps on the ServiceMesh page
- Moves the ServiceMesh page back to using /api/pods
Adds the ability to query by a new non-kubernetes resource type, "authorities",
in the StatSummary api.
This includes an extensive refactor of stat_summary.go to deal with non-kubernetes
resource types.
- Add documentation to Resource in the public api so we can use it for authority
- Handle non-k8s resource requests in the StatSummary endpoint
- Rewrite stat summary fetching and parsing to handle non-k8s resources
- keys stat summary metric handling by Resource instead of a generated string
- Adds authority to the CLI
- Adds /authorities to the Web UI
- Adds some more stat integration and unit tests
Don't allow the CLI or Web UI to request named resources if --all-namespaces is used.
This follows kubectl, which also does not allow requesting named resources
over all namespaces.
This PR also updates the Web API's behaviour to be in line with the CLI's.
Both will now default to the default namespace if no namespace is specified.
* Display proxy container errors in the Web UI
Add an error modal to display pod errors
Add icon to data tables to indicate errors are present
Display errors on the Service Mesh Overview Page and all the resource pages
Common blacklists have `/api/stat` in them. This causes the dashboard to not load.
`/api/tps-reports` is not in any blacklists, suggests what this route does and is slightly tongue in cheek. Fixes#970
* Web: remove ns column from tables on individual ns page
* Add prop types and tests for MetricsTable component
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Add propType validation
When refactoring components, it is hard to know what is required and isn't.
Adds propTypes to the existing components and enables eslint errors for anything
moving forward. This should keep us documenting the API for components.
* Remove extra newline
As part of the HOC + Context merges, ResourceList missed out on the api injection and errors out on the Namespaces tab.
Wrap the returned HOC in `withContext` to make sure it is there, no matter where it is in the tree. (Fixes#1034)
* Add a HOC for the REST API tooling
We're copying and duplicating logic all over the place with components that need to talk to the API.
Moves most of the REST API tooling into a HOC that can be used by other components. Now, a component can use `withREST`, pass in the promises that it would like resolved and receive the responses as props.
* Show PageHeader whether there's an error or not
* Hiding page header during loading
* Test updates to work with namespace restructuring
Previously, we would filter out stats coming from Conduit itself and from the kube-*
namespaces on some views in the Web UI. Remove this filtering, so that we display
all the resource information we get back from the Stat API. (Fixes#997)
On the Resource pages, the call to action would show up when there were no
metrics present, but that's actually not actionable by the user. Instead, I'm
going to show a blank table with a "no s detected" message.
* Remove special-case filtering out of kube-* namespaces, and conduit namespaces
* Remove the call to action for no metrics
* Linkify the namespace column for the resource pages
* Add an app-wide context for global props.
We've been passing the `api` object down from the top of the react tree. With
16.x, there's now the ability to have context that can inject anywhere in the
tree. This creates a top level context provider that contains most of the global
variables we've been using (api, appData, ...). It subsequently cleans up some
of the routes and nested components.
- Bumps `react-dom` to 16.3.2 (to match `react`).
- Adds `enzyme-context-patch` for now. This is fixed in enzyme master, but there
has not been a release yet. Needs to be removed when that is fixed.
* Use a default inside appData for controllerNamespace
* Update syntax of if to use curly brackets
- Update the `response_total` prometheus query of the StatSummary endpoint to also
break queries out by a `meshed` label.
- Add a 'Secured' column to the web UI/CLI stat displays, which indicate the percentage of traffic
starting and ending in the mesh
This meshed label is used in the CLI/Web UI to display a column of the percentage of traffic that
starts/ends in the mesh. (Which is a proxy indicator for whether that traffic is 'secured' when we
add TLS by default for intra mesh requests).
The `meshed` label is not yet added anywhere, so until it is supplied by the proxy, all traffic will
show up as 0% secured in the web/CLI.
This PR modifies the Namespace page in the web UI to replace the 3 existing api calls
with a single call.
* Consolidate calls to /metrics to use the new resource type all
* Simplify urlsForResource, add comment with assumptions
Add namespaces as a top level resource in the Web UI
This PR does the following:
- Replace the deployments table in the service mesh page with namespaces
- Add a Namespaces index page that lists all namespaces and their stats
- Add an individual namespace page showing all resources for that namespace
- Make the incomplete mesh message more generic to any resource type
- Revamp rest of service mesh page to move off ListPods
* Modify the Stat endpoint to also return the count of failed pods
* Add comments explaining pod count stats
* Rename total pod count to running pod count
This is to support the service mesh overview page, as I'd like to include an indicator of
failed pods there.
* Add a namespace column to the metrics tables, support long resource names
* Add a test for GrafanaLink
* Change the PodList.jsx component to not use the ListPods api
* Add a Replication Controllers page in the Web UI
@siggy pointed out that we don't need to use the PodsList api any more, since the new stats endpoint (#671) includes meshedPodCount and totalPodCount, which is all we need to determine whether the deployment/rc has been added to the mesh (which is what we were using ListPods to determine).
This PR modifies deployments to not use the pods api any more, and adds a Replication Controllers page. This page is quite similar to the Deployments page in logic, so I've made a PodOwnersList component to share the code.
I haven't added Replication Controllers to the Service Mesh page yet, because that page does require a list of component pods. Also, we don't need the calls to Prometheus for the Service Mesh page, so I don't want to use the existing stat apis for it. I figure that is a large enough change for a separate PR.
* Expose pod stats in CLI, web UI, and Grafana
* Fix js api helpers test
* Add outbound traffic stats to pod dashboard
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The public-api previously only permitted 4 hard-coded time windows:
10s, 1m, 10m, 1h. This was primarily a relic of the recently removed
telemetry system.
Modify the public-api to validate the time string, but allow for any
window size, which is then passed through to Prometheus.
Fixes#686
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Remove the telemetry service
The telemetry service is no longer needed, now that prometheus scrapes
metrics directly from proxies, and the public-api talks directly to
prometheus. In this branch I'm removing the service itself as well as
all of the telemetry protobuf, and updating the conduit install command
to no longer install the service. I'm also removing the old version of
the stat command, which required the telemetry service, and renaming the
statsummary command to stat.
* Fix time window tests
* Remove deprecated controller scrape config
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Link to Grafana from Conduit Dashboard
Previously the only way to access the Grafana dashboards was via direct
link, provided by the `conduit dashboard` command.
Add Grafana links throughout the Conduit Dashboard, next to all
Deployment objects. This change also modifies the behavior of the
ConduitLink helper, to enable linking to other deployments proxied by
the `conduit dashboard` command.
Part of #420
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* review feedback
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* review feedback, fix console, remove absolute
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* fix pod status and count display in control plane dashboard section:
- the control plane would show terminated and stale deployments in the UI, this is confusing and might indicate errors
- this filters out temrinated and failed component deploys from the UI
- it is to note that pending deploys will still be counted and represented with a greyed out status dot
- Fixes: #606
Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>
Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>
* Display more decimal points for truncated numbers, add hover info
* Filter completed pods out of web UI
* Decrease the polling interval from 10s to 2s
* Add more detailed pod categorization based on status
* Tweak filtering of pods, tweak explanations in status table
- reduce row spacing on tables to make them more compact
- Rename TabbedMetricsTable to MetricsTable since it's not tabbed any more
- Format latencies greater than 1000ms as seconds
- Make sidebar collapsible
- poll the /pods endpoint from the sidebar in order to refresh the list of deployments in the autocomplete
- display the conduit namespace in the service mesh details table
- Use floats rather than Col for more responsive layout (fixes#224)
This PR updates the web UI to remove the pod detail page, and to remove the links to that page from pod names in metrics tables. It also removes the `pods` option from `conduit stat`, and the `sourcePod` and `targetPod` fields from the controller API proto's `MetricMetadata` message.
I've updated the `conduit stat` tests to reflect these changes, and manually verified the web UI changes.
Closes#261
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Control metricsWindow from root of app
- Add buttons [currently hidden] on metrics pages to control window of metrics requests
- Consolidate metricsWindow usage (stop passing it around)
- Add a ConduitLink component so we can stop passing around pathPrefix
- Add tests for ApiHelpers
* Hide the time window buttons; fix bug in absolute links
* Add a note explaining why metricWindow buttons are disabled
* Convert ConduitLink in to a component that wraps another
* Revive scatterplot: re-add scatterplot to Deployments page
Tried to make some UI improvements to address previous problems:
* added a hover bar and tooltip that displays all of the nodes under the bar,
in descending order of successRate (to correspond with their order in the chart)
* the tooltip looked weird in the empty state so I also added the max/min latencies
observed there
Also cleans up the Deployments page a little when there are not any "least healthy deployments".
* Previously, the sidebar tooltip would still render the last
highlighted nodes' information when the dots updated. Fix that
by selecting a datapoint to highlight when the dots update.
* Add overlay tooltip with names of highlighted nodes
* Align the node labels with the node, except in cases of label overlap
Signed-off-by: Risha Mars <mars@buoyant.io>
* Add /paths page that shows rollup metrics by path
* Clean up ApiHelpers a bit
Adds ability to sort by column in the tabbed metrics table (to make a TabbedMetricsTable sortable, set sortable={true})
Adds a page, accessible via /paths that shows a table of all paths, with their request/success/latency metrics. I haven't exposed it in the sidebar as it doesn't have design treatment.
* Make TabbedMetricsTable in charge of fetching timeseries
All of the parent components in charge of fetching timeseries data
don't actually use them, but pass them to this table. It would simplify
a lot of the parents if this component handled the ts fetching too.
This introduces a urlsForResource helper in ApiHelpers to allow
us to consolidate url generating in the app to one place. It also
associates the appropriate groupBy for various fetches.
Signed-off-by: Risha Mars <mars@buoyant.io>