Commit Graph

858 Commits

Author SHA1 Message Date
Alex Leong c6536996f7
Add linkerd repair command (#5636)
In the stable-2.9.0, stable-2.9.1, and stable-2.9.2 releases, the `linkerd-config-overrides` secret is missing the `linkerd.io/control-plane-ns` label.  This means that if a `linkerd upgrade` is performed to one of these versions using the `--prune` flag, then the secret will be deleted.  Missing this secret will prevent any further upgrades.

We add a `linkerd repair` command which recreates the `linkerd-config-overrides` secret by fetching the installed values from the `linkerd-config` configmap and then re-populating the redacted identity values from the `linkerd-identity-issuer` secret.

Usage:

```bash
linkerd repair | kubectl apply -f -
```

To test:
```
# Set Linkerd version to stable-2.8.0
> linkerd install | kubectl apply -f -
# Set Linkerd version to stable-2.9.1
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command fails)
# Set Linkerd version to HEAD
> linkerd repair | kubectl apply -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command succeeds)
> linkerd check
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-02-04 16:47:04 -08:00
Tarun Pothulapati b521091ca7
cli: make `linkerd uninstall` fail when injected pods are present (#5642)
* cli: make `linkerd uninstall` fail when injected pods are present

Fixes #5622

This PR updates the `linkerd uninstall` cmd to check if there
are any injected pods and fails if there are any. This also
provides `--force` flag to skip this check.

pods from namespaces with prefix `linkerd` are skipped
so as to not error out for control-plane and extension
pods.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-03 12:31:30 +05:30
Kevin Leimkuhler df0ce24b12
viz: add viz profile command (#5621)
## What this changes

This adds a `viz profile` command that outputs a service profile based off tap
data. It is identical—but fixes—the current `profile --tap` command.

Additionally, it removes the `--tap` flag from the `profile` command since this
depends on the Viz extension being installed in order to tap a service.

## Why

The `profile --tap` command is currently broken since it depends on the Viz
extension being installed, but the `profile` command is part of the core
install.

Closes #5613

Unblocks #5545

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-01 19:02:46 -05:00
Takumi Sue 77add64860
Remove extra three dashes from helm templates (#5628)
(Background information)
In our company we are checking the sops-encrypted Linkerd manifest into GitHub repository,
and I came across the following problem.

---

Three dashes mean the start of the YAML document (or the end of the
directive).
https://yaml.org/spec/1.2/spec.html#id2800132

If there are only comments between `---`, the document is empty.
Assume the file which include an empty document at the top of itself.

```yaml
---
# foo
---
apiVersion: v1
kind: Namespace
metadata:
  name: foo
---
# bar
---
apiVersion: v1
kind: Namespace
metadata:
  name: bar
```

When we encrypt and decrypt it with [sops](https://github.com/mozilla/sops), the empty document will be
converted to `{}`.

```yaml
{}
---
apiVersion: v1
kind: Namespace
metadata:
    name: foo
---
apiVersion: v1
kind: Namespace
metadata:
    name: bar
```

It is invalid as k8s manifest ([apiVersion not set, kind not set]).

```
error validating data: [apiVersion not set, kind not set]
```

---

I'm afraid that it's sops's problem (at least partly), but anyhow this modification is enough harmless I think.
Thank you.

Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2021-02-01 10:51:34 -05:00
Matei David 0ce9e84a94
Introduce V1 to CRDs and Mutating Hooks (#5603)
*Closes #5484*
 ### Changes
---
*Overview*:
 * Update golden files and make necessary spec changes
 * Update test files for viz
 * Add v1 to healthcheck and uninstall
 * Fix link-crd clusterDomain field validation

- To update to v1, I had to change crd schemas to be version-based (i.e each version has to declare its own schema). I noticed an error in the link-crd (`targetClusterDomain` was `targetDomainName`). Also, additionalPrinterColumns are also version-dependent as a field now.

- For `admissionregistration` resources I had to add an additional `admissionReviewVersions` field -- I included `v1` and `v1beta1`.

- In `healthcheck.go` and `resources.go` (used by `uninstall`) I had to make some changes to the client-go versions (i.e from `v1beta1` to `v1` for admissionreg and apiextension) so that we don't see any warning messages when uninstalling or when we do any install checks. 

I tested again different cli and k8s versions to have a bit more confidence in the changes (in addition to automated tests), hope the cases below will be enough, if not let me know and I can test further.

### Tests

Linkerd local build CLI + k8s 1.19+
`install/check/mc-check/mc-install/mc-link/viz-install/viz-check/uninstall/`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-b0fd2ec8
Server version: unavailable

$ bin/linkerd install | kubectl apply -f -
- no errors, no version warnings - 

$ bin/linkerd check --expected-version git-b0fd2ec8
Status check results are :tick:

# MC

$ bin/linkerd mc install | k apply -f - 
- no erros, no version warnings - 

$ bin/linkerd mc check
Status check results are :tick:

$ bin/linkerd mc link foo | k apply -f -   # test crd creation
# had a validation error here because the schema had targetDomainName instead of targetClusterDomain
# changed, rebuilt cli, re-installed mc, tried command again
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
...

# VIZ
$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd uninstall | k delete -f -
- no errors, no version warnings - 
```

Linkerd local build CLI + k8s 1.17
`check-pre/install/mc-check/mc-install/mc-link/viz-install/viz-check`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17-rc1+k3s1", GitCommit:"e8c9484078bc59f2cd04f4018b095407758073f5", GitTreeState:"clean", BuildDate:"2021-01-14T06:20:56Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-3d2d4df1 # made changes to link-crd after prev test case
Server version: unavailable

$ bin/linkerd check --pre --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:

$ bin/linkerd install | k apply -f -
- no errors, no version warnings -

$ bin/linkerd check --expected-version git-3d2d4df1
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings - 

$ bin/linkerd mc check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc link --cluster-name foo | k apply -f -
bin/linkerd mc link --cluster-name foo | k apply -f -
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created

# VIZ

$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check
- no errors, no version warnings -
- hangs up indefinitely after linkerd-viz can talk to Kubernetes
```

Linkerd edge (21.1.3) CLI + k8s 1.17 (already installed)
`check`
```
$ linkerd version
Client version: edge-21.1.3
Server version: git-3d2d4df1

$ linkerd check
- no errors -
- warnings: mismatch between cli & control plane, control plane not up to date (both expected) -
Status check results are :tick:
```

Linkerd stable (2.9.2) CLI + k8s 1.17 (already installed)
`check/uninstall`
```
$ linkerd version
Client version: stable-2.9.2
Server version: git-3d2d4df1

$ linkerd check
× control plane ClusterRoles exist
    missing ClusterRoles: linkerd-linkerd-tap
    see https://linkerd.io/checks/#l5d-existence-cr for hints

Status check results are ×

# viz wasn't installed, hence the error, installing viz didn't help since
# the res is named `viz-tap` now
# moving to uninstall

$ linkerd uninstall | k delete -f -
- no warnings, no errors - 
```

_Note_: I used `go test ./cli/cmd/... --generate` which is why there are so many changes 😨 

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-02-01 09:18:13 -05:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Tarun Pothulapati a95efe2db1
tests: update newly added golden tests (#5588)
#5507 added new golden tests but missed some updates from other PRs
that got merged meanwhile.

This branch updates those golden tests with those changes

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-21 23:47:36 +05:30
Tarun Pothulapati d0d2e0ea7a
cli: add helm customization flags to core install (#5507)
* cli: add helm customization flags to core install

Fixes #5506

This branch adds helm way of customization through
 `set`, `set-string`, `values`, `set-files` flags for
`linkerd install` cmd along with unit tests.

For this to work, the helm v3 engine rendering helpers
had to be used instead of our own wrapper type.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-21 22:49:50 +05:30
Andrew Seigner 9c80d4d2a1
Fix `linkerd metrics` resource selector (#5567)
The `linkerd metrics` command was selecting pods based on owner resource
names. If multiple owners existed with the same name (for example
`sts/web`, `deploy/web`), additional pods would be incorrectly included
in the output.

Fix the pod selector code to validate pods have owner references to the
given workload/owner.

Before:
```
$ linkerd metrics -n emojivoto deploy/web|grep POD
  # POD web-0 (1 of 3)
  # POD web-d9ffd684f-gnbcx (2 of 3)
  # POD web-fs6l7 (3 of 3)
```

After:
```
$ bin/go-run cli metrics -n emojivoto deploy/web|grep POD
  # POD web-d9ffd684f-gnbcx (1 of 1)
```

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2021-01-21 11:57:27 -05:00
Matei David c63fbdf0e4
Introduce OpenAPIV3 validation for CRDs (#5573)
* Introduce OpenAPIV3 validation for CRDs

* Add validation to link crd
* Add validation to sp using kube-gen
* Add openapiv3 under schema fields in specific versions
* Modify fields to rid spec of yaml errors
* Add top level validation for all three CRDs

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-01-21 11:56:28 -05:00
Naseem 2cc96d4ab9
fix alertmanagers casing (one word) (#5377)
fixes #5371

Signed-off-by: naseemkullah <naseem@transit.app>
2021-01-21 11:55:24 -05:00
Kevin Leimkuhler e7f2a3fba3
viz: add tap-injector (#5540)
## What this changes

This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.

If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.

This also removes the environment variable for explicitly disabling tap through
an environment variable. Tap status for a proxy is now determined only be the
presence or absence of the tap service name environment variable.

Closes #5326

## How it changes

### tap-injector

The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added:

- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace

### LINKERD2_PROXY_TAP_DISABLED

Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only be if the
tap-injector adds or does not add the `LINKERD2_PROXY_TAP_SVC_NAME` environment
variable.

### controller image

The tap-injector has been added to the controller image's several startup
commands which determines what it will do in the cluster.

As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-21 11:24:08 -05:00
Eugene Formanenko 535a36af7c
Add log-format flag to control plane components (#5537)
Fixes #5536

Signed-off-by: Eugene Formanenko <mo4islona@gmail.com>
2021-01-15 10:51:32 -05:00
Alejandro Pedraza d7e4f901e6
Detect default ns for metrics, identity, routes and profile subcommands (#5530)
* Detect default ns for metrics and profile subcommands

Followup to #5485, fixes remaining cases for #5524

Properly detect the default namespace given `kubeConfigPath` and
`kubeContext` for the `metrics`, `identity`, `routes` and `profile` subcommands.

Also gets rid once and for all of the `defaultNamespace` global var.
2021-01-15 08:51:26 -05:00
Alejandro Pedraza cf143f2068
Revert "Add default ns detection to 'linkerd identity' and fixed --namespace description"
This reverts commit 5966d7c6b6.
2021-01-14 10:09:20 -05:00
Alejandro Pedraza 5966d7c6b6
Add default ns detection to 'linkerd identity' and fixed --namespace description 2021-01-14 10:07:22 -05:00
Alejandro Pedraza dd9ea0aef4
Helm template helpers cleanup (#5514)
Removed Helm template files no longer used, as well as some helper
functions.
2021-01-14 09:05:31 -05:00
Tarun Pothulapati eeaf4a5359
viz: make viz cmds available at root (#5525)
* viz: make viz cmds available at root

Fixes #5523

This branch makes viz commands that were previously available
under root to be available at both places i.e `linkerd` and
`linkerd viz`.

We also show a depreciated notice when ran under root, asking
to use them with the `viz` prefix.

This also updates all the help messages to address these cmds
as `linkerd viz xyz` instead of `linkerd xyz`

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-14 13:14:25 +05:30
Alejandro Pedraza f3b1ebfa99
Separate observability API (#5510)
* Separate observability API

Closes #5312

This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.

- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.

The other changes in the go files are just changes in the imports to point to the new protobufs.

Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
2021-01-13 14:34:54 -05:00
Tarun Pothulapati 4c3d002501
viz: move sub-cmds using viz extension under viz cmd (#5485)
* viz: move sub-cmds using viz extension under viz cmd

Fixes #5327 , #5524 

This branch moves the following commands, under the `linkerd viz`
cmd as they use the viz extension to perform the job.

- dashboard
- edges
- routes
- stat
- tap
- top

This also creates a new pkg `public-api` which fecilitates
interaction and communication with public-api to be used
across extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
2021-01-13 12:11:25 +05:30
Tarun Pothulapati 836c077898
viz: add render golden tests (#5433)
* viz: add render golden tests

This branch adds golden tests for the viz install. This would be
useful to track changes in render as more changes are added.

This also moves the common code that is used across extensions
to generate diffs into `testutil` to be able to be used widely.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-12 11:59:16 +05:30
Piyush Singariya b5dddf5daf
service profile generation work offline using --ignore-cluster (#5482)
Closes #5401 

* offline profile generation with --ignore-cluster
* validation added for ignoreCluster and service profile with tap data
Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
2021-01-09 10:23:25 -08:00
Mitch Hulscher 462fe32ef2
fix(linkerd2-cni): execute container preStop command `kill` command as shell builtin (#5453)
The container-image `ghcr.io/linkerd/cni-plugin:stable-2.9.1` does not contain the `kill` command as an executable. Instead, it is available as a shell built-in. In its current state, Kubernetes emits error events whenever linkerd2-cni pods are terminated because the `kill` command can not be found.

Signed-off-by: Mitch Hulscher <mitch.hulscher@lib.io>
2021-01-07 10:24:24 -05:00
Tarun Pothulapati 68c02d82d1
healthcheck: simplify Checker construction with a builder (#5475)
Currently, Each new instance of `Checker` type have to manually
set all the fields with the `NewChecker()`, even though most
use-cases are fine with the defaults.

This branch makes this simpler by using the Builder pattern, so
that the users of `Checker` can override the defaults by using
specific field methods when needed. Thus simplifying the code.

This also removes some of the methods that were specific to tests,
and replaces them with the currently used ones.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-06 14:32:39 -08:00
Jimil Desai bce3547e9b
CLI: Introduced `identity` command to fetch tls-certificates for a pod (#5408)
CLI: Introduced `identity` command to fetch tls-certificates for a pod (#4459)


Modified and added a new cli command, which initiates a sni-tls session to the proxy's admin port and returns the certificate.

Usage:
- `linkerd identity pod/<pod-name>`   : fetches certificate from the specified pod
- `linkerd identity -l app=svc/emoji`      : fetches certificate from all pods with label app=svc/emoji

Signed-off-by: Jimil Desai <jimildesai42@gmail.com>
2021-01-06 16:27:05 -05:00
Raphael Taylor-Davies c9d789156c
Add PodDisruptionBudgets to control plane (#5398) (#5406)
Closes #5398

* Add PodDisruptionBudget to controller deployments
* Add .yaml to editorconfig

Signed-off-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
2021-01-06 09:19:15 -05:00
Tarun Pothulapati 2087c95dd8
viz: move some components into linkerd-viz (#5340)
* viz: move some components into linkerd-viz

This branch moves the grafana,prometheus,web, tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.

The components in viz are not injected during install time, and
will go through the injector. The `viz install` does not have any
cli flags to customize the install directly but instead follow the Helm
way of customization by using flags such as 
`set`, `set-string`, `values`, `set-files`.

**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
 fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.

**Testing**

```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -

# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -

# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```

What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-23 20:17:31 +05:30
Josh Soref 84a9fc9b53
Fix description to match command (#5431)
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-12-22 15:18:51 -08:00
Kevin Leimkuhler f6c8d27d83
Add mulitcluster check command (#5410)
## What

This change moves the `linkerd check --multicluster` functionality under it's
own multicluster subcommand: `linkerd multicluster check`.

There should be no functional changes as a result of this change. `linkerd
check` no longer checks for anything multicluster related and the
`--multicluster` flag has been removed.

## Why

Closes #5208

The bulk of these changes are moving all the multicluster checks from
`pkg/healthcheck` into the multicluster package.

Doing this completely separates it from core Linkerd. It still uses
`pkg/healtcheck` when possible, but anything that is used only by `multicluster
check` has been moved.

**Note the the `kubernetes-api` and `linkerd-existence` checks are run.**

These checks are required for setting up the Linkerd health checker. They set
the health checker's `kubeAPI`, `linkerdConfig`, and `apiClient` fields.

These could be set manually so that the only check the user sees is
`linkerd-multicluster`, but I chose not to do this.

If any of the setting functions errors, it would just tell the user to run
`linkerd check` and ensure the installation is correct. I find the user error
handling to be better by including these required checks since they should be
run in the first place.

## How to test

Installing Linkerd and multicluster should result in a basic check output:

```
$ bin/linkerd install |kubectl apply -f -
..
$ bin/linkerd check
..
$ bin/linkerd multicluster install |kubectl apply -f -
..
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists


Status check results are √
```

After linking a cluster:

```
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * k3d-y
√ remote cluster access credentials are valid
        * k3d-y
√ clusters share trust anchors
        * k3d-y
√ service mirror controller has required permissions
        * k3d-y
√ service mirror controllers are running
        * k3d-y
× all gateway mirrors are healthy
        probe-gateway-k3d-y.linkerd-multicluster mirrored from cluster [k3d-y] has no endpoints
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints

Status check results are ×
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-21 15:50:17 -05:00
Alejandro Pedraza 7471752fc4
Add missing label to linkerd-config-overrides secret (#5407)
Fixes #5385 (second bug in there)

Added missing label `linkerd.io/control-plane-ns=linkerd` that all the
control plane resources must have, that is passed to `kubectl apply
--prune`
2020-12-18 13:47:19 -05:00
Kevin Leimkuhler 7c0843a823
Add opaque ports to destination service updates (#5294)
## Summary

This changes the destination service to start indicating whether a profile is an
opaque protocol or not.

Currently, profiles returned by the destination service are built by chaining
together updates coming from watching Profile and Traffic Split updates.

With this change, we now also watch updates to Opaque Port annotations on pods
and namespaces; if an update occurs this is now included in building a profile
update and is sent to the client.

## Details

Watching updates to Profiles and Traffic Splits is straightforward--we watch
those resources and if an update occurs on one associated to a service we care
about then the update is passed through.

For Opaque Ports this is a little different because it is an annotation on pods
or namespaces. To account for this, we watch the endpoints that we should care
about.

### When host is a Pod IP

When getting the profile for a Pod IP, we check for the opaque ports annotation
on the pod and the pod's namespace. If one is found, we'll indicate if the
profile is an opaque protocol if the requested port is in the annotation.

We do not subscribe for updates to this pod IP. The only update we really care
about is if the pod is deleted and this is already handled by the proxy.

### When host is a Service

When getting the profile for a Service, we subscribe for updates to the
endpoints of that service. For any ports set in the opaque ports annotation on
any of the pods, we check if the requested port is present.

Since the endpoints for a service can be added and removed, we do subscribe for
updates to the endpoints of the service.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-18 12:38:59 -05:00
Alejandro Pedraza d661054795
Fix CLI install/upgrade overriding settings in HA (#5399)
Fixes #5385

## The problems

- `linkerd install --ha` isn't honoring flags
- `linkerd upgrade --ha` is overridding existing configs silently or failing with an error
- *Upgrading HA instances from before 2.9 to version 2.9.1 results in configs being overridden silently, or the upgrade fails with an error*

## The cause

The change in #5358 attempted to fix `linkerd install --ha` that was only applying some of the `values-ha.yaml` defaults, by calling `charts.NewValues(true)` and merging that with the values built from `values.yaml` overriden by the flags. It turns out the `charts.NewValues()` implementation was by itself merging against `values.yaml` and as a result any flag was getting overridden by its default.

This also happened when doing `linkerd upgrade --ha` on an existing instance, which could result in silently overriding settings, or it could also fail loudly like for example when upgrading set up that has an external issuer (in this case the issuer cert won't be able to be read during upgrade and an error would occur as described in #5385).

Finally, when doing `linkerd upgrade` (no --ha flag) on an HA install from before 2.9 results in configs getting overridden as well (silently or with an error) because in order to generate the `linkerd-config-overrides` secret, the original install flags are retrieved from `linkerd-config` via the `loadStoredValuesLegacy()` function which then effectively ends up performing a `linkerd upgrade` with all the flags used for `linkerd install` and falls into the same trap as above.

## The fix

In `values.go` the faulting merging logic is not used anymore, so now `NewValues()` only returns the default values from `values.yaml` and doesn't require an argument anymore. It calls `readDefaults()` which now only returns the appropriate values depending on whether we're on HA or not.
There's a new function `MergeHAValues()` that merges `values-ha.yaml` into the current values (it doesn't look into `values.yaml` anymore), which is only used when processing the `--ha` flag in `options.go`.

## How to test

To replicate the issue try setting a custom setting and check it's not applied:
```bash
linkerd install --ha --controller-log level debug | grep log.level
- -log-level=info
```

## Followup

This wasn't caught because we don't have HA integration tests. Now that our test infra is based on k3d, it should be easy to make such a test using a cluster with multiple nodes. Either that or issuing `linkerd install --ha` with additional configs and compare against a golden file.
2020-12-18 12:11:52 -05:00
Tarun Pothulapati 589f36c4c2
jaeger: add check sub command (#5295)
* jaeger: add check sub command

This adds a new `linkerd jaeger check` command to have checks w.r.t
jaeger extension. This is similar to that of the `linkerd check` cmd.
As jaeger is a separate package, It was a bit complex for this to work
as not all types and fields from healthcheck pkg are public, Helper
funcs were used to mitigate this.

This has the following changes:

- Adds a new `check.go` file under the jaeger extension pkg
- Moves some commonly needed funcs and types from `cli/cmd/check.go`
  and `pkg/healthcheck/health.go` into
  `pkg/healthcheck/healthcheck_output.go`.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-17 00:26:34 +05:30
Alex Leong 74950e9407
Add jaeger uninstall command (#5353)
Add a `linkerd jaeger uninstall` command which prints the linkerd-jaeger extension resources so that they can be deleted.  This is similar to the `linkerd uninstall` command.

```
> bin/linkerd jaeger uninstall | k delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
clusterrolebinding.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
mutatingwebhookconfiguration.admissionregistration.k8s.io "linkerd-proxy-mutator-webhook-config" deleted
namespace "linkerd-jaeger" deleted
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-14 15:48:44 -08:00
Tarun Pothulapati c19cfd71a1
upgrades: make webhooks restart if TLS creds are updated (#5349)
* upgrades: make webhooks restart if TLS creds are updated

Fixes #5231

Currently, we do not re-use the TLS certs during upgrades, which
means that the secrets are updated while the webhooks are still
paired with the older ones, causing the webhook requests to fail.

This can be solved by making webhooks be restarted whenever there
is a change in the certs. This can be performed by storing the hash
of the `*-rbac` file, which contains the secrets, thus making the
pod templates change whenever there is an update to the certs thus
making restarts required.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-10 11:56:53 -05:00
Simon Weald cae4add8d0
Update RBAC API versions to avoid deprecations (#5332)
When testing the `linkerd2-cni` chart with `ct`, it flags up usage
of some deprecated apiVersions.

This PR aligns the RBAC API group across all resources in the chart.

---

Signed-off-by: Simon Weald <glitchcrab-github@simonweald.com>
2020-12-09 15:56:25 -05:00
Alejandro Pedraza 35612ae268
`linkerd install --ha` was only partially applying HA config (#5358)
* `linkerd install --ha` was only partially applying HA config

Fixes #5342

`values-ha.yml` contains the specific config for HA, but only the proxy
resources controller replicas settings were applied. This PR adds
EnablePodAntiafinity, WebhookFailurePolicy and all the resource settings
for the other CP pods.

Also the `--controller-replicas` flag is moved after the HA flags so it
can override the HA settings.

Finally, some comments no longer relevant were removed.

## How to test

Perform `linkerd install --ha` and make sure the values in
`values-ha.yml` are propagated correctly in the produced yaml.

## 2.9.1

After merging to `main`, this should be cherry-picked into the
`release/stable-2.9` branch.

Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-09 15:23:37 -05:00
Alex Leong cdc57d1af0
Use linkerd-jaeger extension for control plane tracing (#5299)
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:

* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests

We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension.  To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.

Note that for now, only the control plane components send traces; the proxies in the control plane do not.  This is because the linkerd-jaeger injector is not yet available.  However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.

I tested this by doing the following:

1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-08 14:34:26 -08:00
Tarun Pothulapati 72a0ca974d
extension: Separate multicluster chart and binary (#5293)
Fixes #5257

This branch movies mc charts and cli level code to a new
top level directory. None of the logic is changed.

Also, moves some common types into `/pkg` so that they
are accessible both to the main cli and extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:36:10 -08:00
Kevin Leimkuhler 5c39c9b44d
Fix namespace flag in install help text (#5322)
The shorthand flag for `--linkerd-namespace` is `-L` not `-l`.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-03 14:33:15 -05:00
Björn Wenzel 0ee18eb168
Allow Multicluster Service to be non LoadBalancer ServiceType (#5307)
Signed-off-by: Björn Wenzel <bjoern.wenzel@dbschenker.com>
2020-12-03 13:03:49 -05:00
jiraguha ed1dff5366
Fix CLI zsh completion (#5285)
Signed-off-by: jpiraguha <jiraguha@gmail.com>
2020-12-02 15:37:57 -05:00
Alejandro Pedraza 94574d4003
Add automatic readme generation for charts (#5316)
* Add automatic readme generation for charts

The current readmes for each chart is generated
manually and doesn't contain all the information available.

Utilize helm-docs to automatically fill out readme.mds
for the helm charts by pulling metadata from values.yml.

Fixes #4156

Co-authored-by: GMarkfjard <gabma047@student.liu.se>
2020-12-02 14:37:45 -05:00
Alejandro Pedraza 9cbfb08a38
Bump proxy-init to v1.3.8 (#5283) 2020-11-27 09:07:34 -05:00
Tarun Pothulapati e7f4c31257
extension: Add new jaeger binary (#5278)
* extension: Add new jaeger binary

This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as
`linkerd install` VFS logic expects charts to be present in `/charts`
directory, This command gets its own static pkg to generate its own
VFS for its chart.

This covers only the install part of the command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-25 20:10:35 +05:30
Tarun Pothulapati fafeee38d7
Upgrade to Helm v3 sdk (#4878)
Fixes #4874

This branch upgrades Helm sdk from v2 to v3 *without any functionaly
changes*, just replacing types with newer API's.

This should not effect our current support for Helm v2 as we did not
change any of the underlying tempaltes(which work with Helm v2). This
works becuase we did not use any of the API's that read the Chart
metadata (which are the only ones changed from v2 to v3) and currently
manually load files and pass ito the sdk.

This PR should provide a great point to start more of the newer Helm v3
API's including for the upgrade workflow thus allowing us to make
Linkerd CLI more simpler.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-23 12:46:41 -08:00
hodbn 92eb174e06
Add safe accessor for Global in linkerd-config (#5269)
CLI crashes if linkerd-config contains unexpected values.

Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.

Add test for linkerd-config data without Global.

Fixes #5215

Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>
2020-11-23 12:45:58 -08:00
Devdutt Shenoi 8ab6177c1f
Simplified boolean expression (#5251)
Simplified boolean expressions in routes command

Signed-off-by: Devdutt Shenoi <devdutt@outlook.in>
2020-11-20 09:37:06 -05:00
Takumi Sue 53afc7dbc4
Fix an odd indent (and test data) (#5262)
Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2020-11-20 09:34:40 -05:00
Tarun Pothulapati b389054d53
cli: Don't check for SAN in root and intermediate certs (#5237)
As discussed in #5228, it is not correct for root and intermediate
certs to have SAN. This PR updates the check to not verify the
intermediate issuer cert with the identity dns name (which checks with
SAN and not CN as the the `verify` func is used to verify leaf certs and
not root and intermediate certs). This PR also avoids setting a SAN
field when generating certs in the `install` command.

Fixes #5228
2020-11-18 15:30:39 -08:00