Closes#9554
Removed the "Configuration" section in the sidebar, along with the "Traffic Splits" item under it, while cleaning all the javascript logic dealing with traffic split data, except for the Service table, where the traffic split weights should still be visible.
04a66ba added a `ReadHeaderTimeout` to our HTTP servers (at gosec's
insistence). We chose a fairly arbitrary timeout of 10s. This
configuration causes any connection that has been idle for 10s to be
torn down by the server. Unfortunately, this timeout value matches the
default Kubernetes probe interval and the default linkerd-viz scrape
interval. This can cause probes to race the timeout so that the
connection is healthy from the proxy's point of view and a request is
sent on the connection exactly as the server drops the connection.
These request failures cause controller success rate to appear degraded.
To correct this, this change raises the timeout to 15s so that the
timeout no longer matches the default probe interval.
The proxy's HTTP client is supposed to [retry] requests that encounter
this type of error. We should follow up by doing more research into why
that is not occurring in this situation.
[retry]: https://docs.rs/hyper/0.14.20/hyper/client/struct.Builder.html#method.retry_canceled_requests
Signed-off-by: Oliver Gould <ver@buoyant.io>
Newer versions of golangci-lint flag `http.Server` instances that do not
set a `ReadHeaderTimeout` as being vulnerable to "slowloris" attacks,
wherein clients initiate requests that hold connections open
indefinitely.
This change sets a `ReadHeaderTimeout` of 10s. This timeout is fairly
conservative so that clients can eagerly create connections, but is
still constrained enough that these connections won't remain open
indefinitely.
This change also updates kubert to v0.9.1, which instruments a header
read timeout on the policy admission server.
Signed-off-by: Oliver Gould <ver@buoyant.io>
Fixes#7429
Currently Linkerd assumes that the Grafana instance is hosted
on-cluster. Some users would like to use externally hosted Grafana
instances, such as Grafana Cloud or AWS Managed Grafana.
In general users will have multiple Linkerd clusters with dashboards
in the same Grafana workspace, so we need to introduce a prefix for the
Grafana dashboard UID's so that they remain unique.
This PR adds two new viz config values, `grafana.uidPrefix` and
`grafana.externalUrl`.
When grafana.uidPrefix is set, it will insert the user-supplied prefix
in the URL's for the Grafana dashboards. When grafana.externalUrl is
set, its value will be used for the links to Grafana dashboards instead
of using the grafana reverse proxy.
Signed-off-by: Jack Gill <jack.gill@elationhealth.com>
When running `linkerd check -o short` there can still be formatting issues:
- When there are no core warnings but there are extension warnings, a newline is printed at the start of the output
- When there are warnings, there is no newline printed between the warnings and the result
Below you can see the extra newline (before `Linkerd extensions checks`) and the lack of a newline on line before `Status check results ...`.
Old:
```shell
$ linkerd check -o short
Linkerd extensions checks
=========================
linkerd-viz
-----------
...
Status check results are √
```
New:
```shell
$ linkerd check -o short
Linkerd extensions checks
=========================
linkerd-viz
-----------
...
Status check results are √
```
---
This fixes the above issues by moving the newline printing to the end of a category—which right now is Core and Extension.
If there is no output for either, then no newline is printed. This results in no stray newlines when running in short output and there are no warnings.
```shell
$ linkerd check -o short
Status check results are √
```
If there is output for a category, then the category handles the newline printing itself meaning we don't need to track if a newline needs to be printed _before_ a category or _before_ the results.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Add services to web dashboard
Adds services as a new screnn on a web dashboard. Services screen is
implemented as a separate func, because it is different from
pods/jobs/etc.
Services view will not have pods or edges section because this
information is not available.
Signed-off-by: Krzysztof Dryś <krzysztofdrys@gmail.com>
Introduce new table in the control plane view of the dashboard to display installed extensions.
Signed-off-by: Sanni Michael <sannimichaelse@gmail.com>
This change adds a new "Extensions" page to the Linkerd dashboard. This
page lists all known built-in and third party extensions that can be
installed in a cluster.
Fixes#6568
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Removed `Version` API from the public-api
This is a sibling PR to #5993, and it's the second step towards removing the `linkerd-controller` pod.
This one deals with a replacement for the `Version` API, fetching instead the `linkerd-config` CM and retrieving the `LinkerdVersion` value.
## Changes to the public-api
- Removal of the `publicPb.ApiClient` entry from the `Client` interface
- Removal of the `publicPb.ApiServer` entry from the `Server` interface
- Removal of the `Version` and related methods from `client.go`, `grpc_server.go` and `http_server.go`
## Changes to `linkerd version`
- Removal of all references to the public API.
- Call `healthcheck.GetServerVersion` to retrieve the version
## Changes to `linkerd check`
- Removal of the "can query the control API" check from the "linkerd-api" section
- Addition of a new "can retrieve the control plane version" check under the "control-plane-version" section
## Changes to `linkerd-web`
- The version is now retrieved from the `linkerd-config` CM instead of a public-API call.
- Removal of all references to the public API.
- Removal of the `data-go-version` global attribute on the dashboard, which wasn't being used.
## Other changes
- Added `ValuesFromConfigMap` function in `values.go` to convert the `linkerd-config` CM into a `*Values` struct instance
- Removal of the `public` protobuf
- Refactor 'linkerd repair' to use the refactored 'healthcheck.GetServerVersion()' function
* Hide the "Gateway" sidebar link
This commit hides the "Gateway" sidebar link in the dashboard if the
`linkerd-multicluster` extension is not installed. If a user happens to navigate to
the Gateway page anyway, we display a CTA (Call to Action) that informs
the user that they would need to run the multicluster install command.
This change includes a new endpoint in the dashboard server; `GET
/api/extensions`. This endpoint returns the namespace an extension
is installed in when passing in extension name. The dashboard uses
this endpoint to detect whether it needs to hide the navigation link
and whether to display the CTA.
Fixes#5330
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
* Separate observability API
Closes#5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
* Introduce multicluster gateway api handler in web api server
* Added MetricsUtil for Gateway metrics
* Added gateway api helper
* Added Gateway Component
Updated metricsTable component to support gateway metrics
Added handler for gateway
Fixes#4601
Signed-off-by: Tharun <rajendrantharun@live.com>
Regenerated protobuf files, using version 1.4.2 that was upgraded from
1.3.2 with the proxy-api update in #4614.
As of v1.4 protobuf messages are disallowed to be copied (because they
hold a mutex), so whenever a message is passed to or returned from a
function we need to use a pointer.
This affects _mostly_ test files.
This is required to unblock #4620 which is adding a field to the config
protobuf.
## Motivation
Full background: #2074#2074 was recently reopened because a user reported an error that occurs when
refreshing an already opened dashboard after the dashboard build has changed.
This can occur when upgrading or downgrading.
#2074 explores a larger issue about a redirection that occurs when loading the
dashboard JS. However, the actual issue that users are experiencing happens
because `index_bundle.js` is being cached when it should not be.
Even if the hash of the JS bundle changes, users can see (on the current edge)
that browsers do in fact cache `index_bundle.js`.
The easiest way I reproduced this was:
1. Install `edge-19.12.3`
2. `linkerd dashboard` (and keep the tab open)
2. Uninstall `edge-19.12.3`
3. Install `stable-2.5.0`
4. `linkerd dashboard`
5. Refresh in all browsers: Users will observe the `edge-19.12.3` dashboard
still renders (with all of it's new additions) even though `stable-2.5.0` is
installed with it's older theme.
Below are screenshots of Safari and Firefox caching the file. Chrome was not as
easy to reproduce:
*Safari*

*Firefox*

## Solution
This change only changes the response header when requesting `index_bundle.js`
from the server to ensure caching does not take place; mainly `no-cache` is
changed to `no-store` and `must-revalidate` is now included.
`no-store` and `must-revalidate` are redundant on some browsers but both
required to cover all browsers (and versions).
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource.
Closes#3614Closes#3630Closes#3584Closes#3585
Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
This PR allows the dashboard to query for a resource's definition in YAML
format, if the boolean `queryForDefinition` in the `ResourceDetail` component is
set to true.
This change to the web API and the dashboard component was made for a future
redesigned dashboard detail page. At present, `queryForDefinition` is set to
false and there is no visible change to the user with this PR.
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com> Signed-off-by: Cintia
Sanchez Garcia <cynthiasg@icloud.com>
`linkerd check` can now be run from the dashboard in the `/controlplane` view.
Once the check results are received, they are displayed in a modal in a similar
style to the CLI output.
Closes#3613
* DNS rebinding protection for the dashboard
Fixes#3083 and replacement for #3629
This adds a new parameter to the `linkerd-web` container `enforcedHost`
that establishes the regexp that the Host header must enforce, otherwise
it returns an error.
This parameter will be hard-coded for now, in `linkerd-web`'s deployment
yaml.
Note this also protects the dashboard because that's proxied from
`linkerd-web`.
Also note this means the usage of `linkerd dashboard --address` will
require the user to change that parameter in the deployment yaml (or
have Kustomize do it).
How to test:
- Run `linkerd dashboard`
- Go to http://rebind.it:8080/manager.html and change the target port to
50750
- Click on “Start Attack” and wait for a minute.
- The response from the dashboard will be returned, showing an 'Invalid
Host header' message returned by the dashboard. If the attack would have
succeeded then the dashboard's html would be shown instead.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
This reverts commit edd3b1f6d4.
This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy. We will reintroduce this change once the config plan is straightened out.
Signed-off-by: Alex Leong <alex@buoyant.io>
### Motivation
PR #3167 introduced the tap APIService and migrated `linkerd tap` to use it.
Subsequent PRs (#3186 and #3187) updated `linkerd top` and `linkerd profile
--tap` to use the tap APIService. This PR moves the web's Go server to now also
use the tap APIService instead of the public API. It also ensures an error
banner is shown to the user when unauthorized taps fail via `linkerd top`
command in *Overview* and *Top*, and `linkerd tap` command in *Tap*.
### Details
The majority of these changes are focused around piping through the HTTP error
that occurs and making sure the error banner generated displays the error
message explaining to view the tap RBAC docs.
`httpError` is now public (`HTTPError`) and the error message generated is short
enough to fit in a control frame (explained [here](https://github.com/linkerd/linkerd2/blob/kleimkuhler%2Fweb-tap-apiserver/web/srv/api_handlers.go#L173-L175)).
### Testing
The error we are testing for only occurs when the linkerd-web service account is
not authorzied to tap resources. Unforutnately that is not the case on Docker
For Mac (assuming that is what you use locally), so you'll need to test on a
different cluster. I chose a GKE cluster made through the GKE console--not made
through cluster-utils because it adds cluster-admin.
Checkout the branch locally and `bin/docker-build` or `ares-build` if you have
it setup. It should produce a linkerd with the version `git-04e61786`. I have
already pushed the dependent components, so you won't need to `bin/docker-push
git-04e61786`.
Install linkerd on this GKE cluster and try to run `tap` or `top` commands via
the web. You should see the following errors:
### Tap

### Top

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Continue of #2950.
I decided to check for the `clusterDomain` in the config map in web server main for the same reasons as as pointed out here https://github.com/linkerd/linkerd2/pull/3113#discussion_r306935817
It decouples the server implementations from the config.
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
Set the following headers on every dashboard response:
- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: SAMEORIGIN`
- `X-XSS-Protection: 1; mode=block`
Fixes#3082
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds an Edges table to the resource detail view that shows the source,
destination name and identity for proxied connections to and from the resource
shown.
Closes#2327.
This PR creates a "Community" menu item on the dashboard sidebar that, when clicked, displays an iFrame of a page on linkerd.io. A yellow badge appears on the menu item if there has been an update since the user last clicked the "Community" menu item. This is calculated by comparing a date in the user's localStorage to a JSON feed at linkerd.io.
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.
This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.
A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
combining some check categories, as we can always assume cluster-wide
requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
make a decision based on cluster-wide vs. namespace-wide access.
Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.
Reverts #1721
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.
Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.
Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.
TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
In #2195 we introduced `linkerd endpoints` on the CLI. I would like similar
information to be on the web.
This PR adds an api endpoint at `/api/endpoints`, and introduces a new debugging
pagethat shows a table of endpoints, available at `/debug`
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.
Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.
Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.
Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.
Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.
Update dependencies and dockerfile hashes
Add DaemonSet support to tap and top commands
Fixes of #2006
Signed-off-by: Zak Knill <zrjknill@gmail.com>
JavaScript assets could be cached across Linkerd releases, showing an
out of date ui, or a broken page.
Modify the webpack build pipeline to add a hash to the JS bundle
filename. Move all logic around webpack-dev-server state from Go into
JS, via a templatized index_bundle.js file, generated at build time.
Disable caching of index_bundle.js in Go, via a `Cache-Control` header.
Fixes#1996
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Proxy grafana requests through web service
* Fix -grafana-addr default, clarify -api-addr flag
* Fix version check in grafana dashboards
* Fix comment typo
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Commit 1: Enable lint check for comments
Part of #217. Follow up from #1982 and #2018.
A subsequent commit will fix the ci failure.
Commit 2: Address all comment-related linter errors.
This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
be removed
This PR does not:
- Modify, move, or remove any code
- Modify existing comments
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds an endpoint, at /profiles/new that allows you to input a service name and
namespace, and download a service profile yaml template.
This will enable future work, where we can add more of the yaml customization via
a form in the dashboard, and use that data to help the user configure routes.
Adds a (currently not displayed in sidebar, but available at /routes) page to
mirror the current functionality of `linkerd routes <service>`. So far, this is just a
barebones form and table, but it works.
Adds a /api/routes path and handler to the api to receive TopRoutes requests from the web.
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).
Accessible from the web UI via http://localhost:8084/api/services
Adds a new page that shows all namespaces in an accordion. This will replace
ServiceMesh as the default landing page.
The page will request stats for all namespaces, and then pick the first meshed
namespace that's not the linkerd namespace to auto-expand in the accordion.
This branch also updates the definition of "added to the mesh" in the frontend
to be runningPodCount > 0 && meshedPodCount > 0 (previously, it was
runningPodCount === meshedPodCount, which would count resources with no pods as
"added").
I've also moved the link to /namespaces out of the top-level sidebar and into
the Resources sub-menu.