Commit Graph

31 Commits

wangchenglong01 4c1b51a6f2
Condition is always 'false' because 'err' is always 'nil' (#5982)
Remove unnecessary err check.
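
The pattern being removed looks roughly like this (a hypothetical illustration, not the actual linkerd2 code):

```go
package main

import (
	"fmt"
	"strings"
)

// trimName never returns a non-nil error, so a caller's
// `if err != nil` branch is provably dead code; that is the
// kind of finding this commit removes.
func trimName(s string) (string, error) {
	return strings.TrimSpace(s), nil
}

func main() {
	name, err := trimName("  linkerd  ")
	if err != nil { // condition is always 'false' because 'err' is always 'nil'
		panic(err)
	}
	fmt.Println(name)
}
```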

Signed-off-by: Cookie Wang <wangchl01@inspur.com>
2021-04-05 10:49:12 +05:30
Tarun Pothulapati 7ab8255855
multicluster: make service mirror honour `requeueLimit` (#5969)
* multicluster: make service mirror honour `requeueLimit`

Fixes #5374

Currently, whenever the `gatewayAddress` changes, the service
mirror component keeps trying to `repairEndpoints` (which is
invoked every `repairPeriod`). This behavior is fine and
expected, but as the service mirror does not honor `requeueLimit`,
it keeps requeuing the same event and retrying with no limit.

The condition that we use to limit requeues,
`if (rcsw.eventsQueue.NumRequeues(event) < rcsw.requeueLimit)`, does
not work for the following reason:

- For the queue to actually track requeues, `AddRateLimited` has to be
  used instead of `Add`; only then does `NumRequeues` return the actual
  number of requeues for a specific event.

This change updates the requeuing logic to use `AddRateLimited` instead
of `Add`.
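
For illustration, here is a minimal, self-contained sketch of the difference, using client-go's `workqueue` package directly (the `process` function and the event value are stand-ins, not the actual service-mirror code):

```go
package main

import (
	"errors"
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

const requeueLimit = 3

// process stands in for repairEndpoints and always fails, like
// the unresolvable-gateway case in the logs below.
func process(event interface{}) error {
	return errors.New("error resolving 'foobar': no such host")
}

func main() {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	queue.Add("RepairEndpoints")

	for {
		event, shutdown := queue.Get()
		if shutdown {
			return
		}
		if err := process(event); err != nil {
			if queue.NumRequeues(event) < requeueLimit {
				// AddRateLimited (unlike Add) increments the counter
				// that NumRequeues reads, so the limit check works.
				queue.AddRateLimited(event)
				fmt.Printf("Requeues: %d, Limit: %d for event %v (will retry)\n",
					queue.NumRequeues(event), requeueLimit, event)
			} else {
				fmt.Printf("Error processing %v (giving up)\n", event)
				queue.Forget(event) // reset the per-event requeue count
				queue.ShutDown()
			}
		}
		queue.Done(event)
	}
}
```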

After these changes, the logs in the service mirror are as follows:

```bash
time="2021-03-30T16:52:31Z" level=info msg="Received: OnAddCalled: {svc: Service: {name: grafana, namespace: linkerd-viz, annotations: [[linkerd.io/created-by=linkerd/helm git-0e2ecd7b]], labels [[linkerd.io/extension=viz]]}}" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 1, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 2, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 3, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (giving up): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 0, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 1, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 2, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 3, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (giving up): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote

```

As seen, `RepairEndpoints` is called every `repairPeriod`, which
is 1 minute by default. Whenever a failure happens, it is retried,
but now the failures are tracked and the event is given up once it
reaches the `requeueLimit`, which is 3 by default.

This also fixes the requeuing logic for all types of events,
not just `repairEndpoints`.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-04-01 11:16:57 +05:30
Tarun Pothulapati 36084c6958
helm: add NOTES.txt for extension charts (#5870)
Currently, no `NOTES.txt` gets printed out after an extension is
installed through Helm, as is done for the core chart. This updates
the viz and jaeger charts to include one, along with instructions to
view the dashboard.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-03-09 15:31:18 -05:00
Alejandro Pedraza 571f505b6b
Move CP check after the readiness check (#5848)
* Move CP check after the readiness check

Moved the `can initialize client` and `can query the control plane API`
checks from the `linkerd-existence` section to `linkerd-api`, because
they require the `linkerd-controller` pod to not just be "Running" but
to actually be ready.

This was causing `linkerd check` to show some port-forwarding warnings
when run right after install.

This also allowed getting rid of the `CheckPublicAPIClientOrExit`
function and using `CheckPublicAPIClientOrRetryOrExit` directly (better
naming punted for later), which was refactored so it always runs the
`linkerd-api` checks before retrieving the client.

Other changes:

- Temporarily disabled `upgrade-edge` test because the latest edge has this readiness check issue
- Have the upgrade tests do proper pruning (stolen from @Pothulapati's #5673 😉 )
- Added missing label to tap SA (fixes #5850)
- Complete tap-injector Service selector
2021-03-01 19:47:25 -05:00
Dennis Adjei-Baah 15d1809bd0
Remove linkerd prefix from extension resources (#5803)
* Remove linkerd prefix from extension resources

This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2021-02-25 11:01:31 -05:00
Alex Leong 167c823297
Add support for CLI extensions (#5762)
As described in https://github.com/linkerd/linkerd2/pull/5692, this PR adds support for CLI extensions.

Calling `linkerd foo` (if `foo` is not an existing Linkerd command) will now search the current PATH for an executable named `linkerd-foo` and invoke it with the current arguments.

* All arguments and flags will be passed to the extension command
* The Linkerd command itself will not process any flags
* To simplify parsing, flags are not allowed before the extension name
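
A minimal sketch of how such PATH-based dispatch can work (illustrative, not the actual linkerd CLI code):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// Dispatch `linkerd foo <args...>` to an executable named
// `linkerd-foo` found on the PATH, forwarding all remaining
// arguments and flags unparsed.
func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: linkerd <extension> [args...]")
		os.Exit(1)
	}
	name := os.Args[1]
	path, err := exec.LookPath("linkerd-" + name)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error: unknown command %q for \"linkerd\"\n", name)
		os.Exit(1)
	}
	cmd := exec.Command(path, os.Args[2:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		os.Exit(1)
	}
}
```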

e.g. with an executable called `linkerd-foo` on my PATH:

```console
> linkerd foo install
Welcome to Linkerd foo!
Got: install
> linkerd foo --context=prod install
Welcome to Linkerd foo!
Got: --context=prod install
> linkerd --context=prod foo install
Cannot accept flags before Linkerd extension name
> linkerd bar install
Error: unknown command "bar" for "linkerd"
Run 'linkerd --help' for usage.
```

We also update `linkerd check` to invoke `linkerd <extension> check` for each extension found installed on the current cluster. A check warning is emitted if the extension command is not found on the PATH.

e.g. with both `linkerd.io/extension=foo` and `linkerd.io/extension=bar` extensions installed on the cluster:

```console
> linkerd check
[...]
Linkerd extensions checks
=========================

Welcome to Linkerd foo!
Got: check --as-group=[] --cni-namespace=linkerd-cni --help=false --linkerd-cni-enabled=false --linkerd-namespace=linkerd --output=table --pre=false --proxy=false --verbose=false --wait=5m0s

linkerd-bar
-----------
‼ Linkerd extension command linkerd-bar exists

Status check results are ‼
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-02-24 13:26:21 -08:00
Kevin Leimkuhler 5bd5db6524
Revert "Rename multicluster annotation prefix and move when possible (#5771)" (#5813)
This reverts commit f9ab867cbc which renamed the
multicluster label name from `mirror.linkerd.io` to `multicluster.linkerd.io`.

While this change was made to follow similar namings in other extensions, it
complicates the multicluster upgrade process due to the secret creation.

`mirror.linkerd.io` is not an important enough label to change, and keeping
it will allow a smoother upgrade process for `stable-2.10.x`.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-24 12:54:52 -05:00
Kevin Leimkuhler ff93d2d317
Mirror opaque port annotations on services (#5770)
This change introduces an opaque ports annotation watcher that will send
destination profile updates when a service has its opaque ports annotation
change.

The user-facing change introduced by this is that the opaque ports annotation is
now required on services when using the multicluster extension. This is because
the service mirror will create mirrored services in the source cluster, and
destination lookups in the source cluster need to discover that the workloads in
the target cluster are opaque protocols.

### Why

Closes #5650

### How

The destination server now has a new opaque ports annotation watcher. When a
client subscribes to updates for a service name or cluster IP, the `GetProfile`
method creates a profile translator stack that passes updates through resource
adaptors such as: traffic split adaptor, service profile adaptor, and now opaque
ports adaptor.

When the annotation on a service changes, the update is passed through to the
client where the `opaque_protocol` field will either be set to true or false.

A few scenarios to consider are:

  - If the annotation is removed from the service, the client should receive
    an update with no opaque ports set.
  - If the service is deleted, the stream stays open so the client should
    receive an update with no opaque ports set.
  - If the service has the annotation added, the client should receive that
    update.
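
To make those scenarios concrete, here is a small sketch of the annotation-to-update mapping (the `translate` helper is an illustrative stand-in for the watcher's logic, and the annotation key is assumed):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// Assumed annotation key that the watcher keys on.
const opaquePortsAnnotation = "config.linkerd.io/opaque-ports"

// translate mirrors the scenarios above: a present annotation
// yields its ports; a missing annotation (removed, or service
// deleted) yields an update with no opaque ports set, while the
// stream stays open.
func translate(svc *corev1.Service) (ports string, ok bool) {
	if svc == nil {
		return "", false // service deleted: update with no opaque ports
	}
	ports, ok = svc.Annotations[opaquePortsAnnotation]
	return ports, ok
}

func main() {
	svc := &corev1.Service{}
	svc.Annotations = map[string]string{opaquePortsAnnotation: "8080"}
	if p, ok := translate(svc); ok {
		fmt.Println("update: opaque ports", p) // client sets opaque_protocol
	}
	if _, ok := translate(nil); !ok {
		fmt.Println("update: no opaque ports set")
	}
}
```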

### Testing

Unit tests have been added for the watcher as well as the destination server.

An integration test has been added that tests the opaque port annotation on a
service.

For manual testing, using the destination server scripts is easiest:

```
# install Linkerd

# start the destination server
$ go run controller/cmd/main.go destination -kubeconfig ~/.kube/config

# Create a service or namespace with the annotation and inject it

# get the destination profile for that service and observe the opaque protocol field
$ go run controller/script/destination-client/main.go -method getProfile -path test-svc.default.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]                                              
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-23 13:36:17 -05:00
Piyush Singariya 9295b4778c
Fix for CLI printing command usage for API Errors for multiple commands (#5768)
Problem: the CLI prints command usage for multiple commands in case of API errors.
Solution: print the error and then exit using `os.Exit(1)`, to avoid Cobra printing usage.

Closes #5058

Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
2021-02-19 11:08:27 -05:00
Alejandro Pedraza b53dc3b400
Removed "do-not-edit" entries from values.yaml files (#5758)
Fixes #5574 and supersedes #5660

- Removed from all the `values.yaml` files all those "do not edit" entries for annotation/label names, hard-coding them in the templates instead.
- The `values.go` files got simplified as a result.
- The `created-by` annotation was also refactored into a reusable partial. This means we had to add a `partials` dependency to multicluster.
2021-02-19 09:17:45 -05:00
Kevin Leimkuhler f9ab867cbc
Rename multicluster annotation prefix and move when possible (#5771)
This renames the multicluster annotation prefix from `mirror.linkerd.io` to
`multicluster.linkerd.io` in order to reflect other extension naming patterns.

Additionally, it moves labels only used in the Multicluster extension into their
own labels file—again to reflect other extensions.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-18 17:10:33 -05:00
Tarun Pothulapati e16697d49f
cli: make jaeger and multicluster installs wait for core cp (#5767)
* cli: make jaeger and multicluster installs wait for core cp

This PR updates the jaeger and multicluster installs to wait
for the core control-plane to be up before moving to the rendering
logic. This prevents these components from being installed before
the injector is up and running correctly.

`--skip-checks` has been added to jaeger to skip these checks. The
same has not been added to `multicluster`, as that install fails when
no core control plane is present.

This PR also cleans up the extra core control-plane check that we have for `viz install`.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-19 00:37:18 +05:30
Oliver Gould 6dc7efd704
docker: Access container images via cr.l5d.io (#5756)
We've created a custom domain, `cr.l5d.io`, that redirects to `ghcr.io`
(using `scarf.sh`). This custom domain allows us to swap the underlying
container registry without impacting users. It also provides us with
important metrics about container usage, without collecting PII like IP
addresses.

This change updates our Helm charts and CLIs to reference this custom
domain. The integration test workflow now refers to the new domain,
while the release workflow continues to use the `ghcr.io/linkerd` registry
for the purpose of publishing images.
2021-02-17 14:31:54 -08:00
Piyush Singariya 76e00cae02
Multicluster: Uninstall multicluster without Linkerd control plane (#5744)
Problem

If the main Linkerd control plane has been uninstalled, it is no longer possible to uninstall the multicluster extension.

```
$ bin/linkerd mc uninstall | k delete -f -
Error: you need Linkerd to be installed in order to install multicluster addons
Usage:
  linkerd multicluster uninstall [flags]
```
Solution

Fetch resources with the label `linkerd.io/extension: linkerd-multicluster` and delete them.
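
A minimal sketch of the label-based discovery behind this (listing namespaces only, for illustration; the real command renders every labeled resource for `kubectl delete`):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig; no Linkerd
	// control plane is required for this lookup.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// Select everything stamped with the extension label.
	opts := metav1.ListOptions{LabelSelector: "linkerd.io/extension=linkerd-multicluster"}
	nsList, err := client.CoreV1().Namespaces().List(context.Background(), opts)
	if err != nil {
		panic(err)
	}
	for _, ns := range nsList.Items {
		fmt.Println("would delete namespace:", ns.Name)
	}
}
```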

Closes #5624

Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
2021-02-17 13:05:05 -05:00
Alejandro Pedraza cbdd1cab03
Increase min k8s version to 1.16 (#5741)
... in order to support the bumped CRD and webhook config versions made
in #5603. In #5688 we asked if there were any concerns. None so far.
2021-02-15 13:03:14 -05:00
Tarun Pothulapati a393c42536
values: removal of .global field (#5699)
* values: removal of .global field

Fixes #5425

With the new extension model, we no longer need the `Global` field,
as we don't rely on chart dependencies anymore. This helps us further
clean up Values and makes configuration simpler.

To make upgrades and usage of the new CLI with older configs work,
we add a new method called `config.RemoveGlobalFieldIfPresent` that
is used in the upgrade and `FetchCurrentConfiguration` paths to remove
the global field and attach its child nodes at the top level if global
is present. This is verified by `TestFetchCurrentConfiguration`'s older
test case that has the global field.
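
A sketch of what such a transformation can look like (the function shape here is illustrative, not the actual `config` package signature):

```go
package main

import (
	"fmt"

	"sigs.k8s.io/yaml"
)

// removeGlobalFieldIfPresent lifts the children of .global to
// the top level and then drops .global itself, so older configs
// keep working with the new Values layout.
func removeGlobalFieldIfPresent(raw []byte) ([]byte, error) {
	var values map[string]interface{}
	if err := yaml.Unmarshal(raw, &values); err != nil {
		return nil, err
	}
	if global, ok := values["global"].(map[string]interface{}); ok {
		for k, v := range global {
			if _, exists := values[k]; !exists {
				values[k] = v // attach child node at the top level
			}
		}
		delete(values, "global")
	}
	return yaml.Marshal(values)
}

func main() {
	old := []byte("global:\n  clusterDomain: cluster.local\nenablePSP: false\n")
	out, err := removeGlobalFieldIfPresent(old)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out)) // clusterDomain is now a top-level field
}
```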

We also don't yet remove `.global` in some Helm stable-upgrade tests,
so that the initial install keeps working.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-11 23:38:34 +05:30
Dennis Adjei-Baah e4069b47e0
Run extension checks when linkerd check is invoked (#5647)
* Run extension checks when linkerd check is invoked

This change allows the `linkerd check` command to also run any known
linkerd extension checks for extensions that have been installed in the
cluster. It does this by first querying for any namespace that has the
label `linkerd.io/extension` and then running the subcommands for
`jaeger`, `multicluster` or `viz`. For extensions that aren't part of
the Linkerd extension suite, it runs basic namespace healthchecks.
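
As a sketch, the discovery-and-dispatch flow might look like this (using a fake clientset so it runs standalone; the print statements stand in for the real check runners):

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/fake"
)

// runExtensionChecks finds namespaces carrying the
// linkerd.io/extension label, runs the known suite subcommand for
// each, and falls back to basic namespace checks for the rest.
func runExtensionChecks(ctx context.Context, client kubernetes.Interface) error {
	nsList, err := client.CoreV1().Namespaces().List(ctx,
		metav1.ListOptions{LabelSelector: "linkerd.io/extension"})
	if err != nil {
		return err
	}
	for _, ns := range nsList.Items {
		switch name := ns.Labels["linkerd.io/extension"]; name {
		case "jaeger", "multicluster", "viz":
			fmt.Printf("running %s checks\n", name)
		default:
			fmt.Printf("running basic namespace checks for %s\n", ns.Name)
		}
	}
	return nil
}

func main() {
	client := fake.NewSimpleClientset(&corev1.Namespace{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "linkerd-viz",
			Labels: map[string]string{"linkerd.io/extension": "viz"},
		},
	})
	if err := runExtensionChecks(context.Background(), client); err != nil {
		panic(err)
	}
}
```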

Fixes #5233
2021-02-11 10:50:16 -06:00
Takumi Sue 77add64860
Remove extra three dashes from helm templates (#5628)
(Background information)
At our company we check the sops-encrypted Linkerd manifest into a GitHub repository,
and I came across the following problem.

---

Three dashes mark the start of a YAML document (or the end of the
directives).
https://yaml.org/spec/1.2/spec.html#id2800132

If there are only comments between `---` markers, the document is empty.
Consider a file that includes an empty document at the top:

```yaml
---
# foo
---
apiVersion: v1
kind: Namespace
metadata:
  name: foo
---
# bar
---
apiVersion: v1
kind: Namespace
metadata:
  name: bar
```

When we encrypt and decrypt it with [sops](https://github.com/mozilla/sops), the empty document will be
converted to `{}`.

```yaml
{}
---
apiVersion: v1
kind: Namespace
metadata:
    name: foo
---
apiVersion: v1
kind: Namespace
metadata:
    name: bar
```

The result is invalid as a k8s manifest:

```
error validating data: [apiVersion not set, kind not set]
```

---

I'm afraid this is (at least partly) sops's problem, but in any case this modification is harmless enough, I think.
Thank you.

Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2021-02-01 10:51:34 -05:00
Matei David 0ce9e84a94
Introduce V1 to CRDs and Mutating Hooks (#5603)
*Closes #5484*
 ### Changes
---
*Overview*:
 * Update golden files and make necessary spec changes
 * Update test files for viz
 * Add v1 to healthcheck and uninstall
 * Fix link-crd clusterDomain field validation

- To update to v1, I had to change CRD schemas to be version-based (i.e. each version has to declare its own schema). I noticed an error in the link-crd (`targetClusterDomain` was `targetDomainName`). `additionalPrinterColumns` is also a version-dependent field now.

- For `admissionregistration` resources I had to add an additional `admissionReviewVersions` field -- I included `v1` and `v1beta1`.

- In `healthcheck.go` and `resources.go` (used by `uninstall`) I had to make some changes to the client-go versions (i.e from `v1beta1` to `v1` for admissionreg and apiextension) so that we don't see any warning messages when uninstalling or when we do any install checks. 

I tested against different CLI and k8s versions to have a bit more confidence in the changes (in addition to the automated tests); hope the cases below are enough, and if not, let me know and I can test further.

### Tests

Linkerd local build CLI + k8s 1.19+
`install/check/mc-check/mc-install/mc-link/viz-install/viz-check/uninstall/`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-b0fd2ec8
Server version: unavailable

$ bin/linkerd install | kubectl apply -f -
- no errors, no version warnings - 

$ bin/linkerd check --expected-version git-b0fd2ec8
Status check results are :tick:

# MC

$ bin/linkerd mc install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd mc check
Status check results are :tick:

$ bin/linkerd mc link foo | k apply -f -   # test crd creation
# had a validation error here because the schema had targetDomainName instead of targetClusterDomain
# changed, rebuilt cli, re-installed mc, tried command again
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
...

# VIZ
$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd uninstall | k delete -f -
- no errors, no version warnings - 
```

Linkerd local build CLI + k8s 1.17
`check-pre/install/mc-check/mc-install/mc-link/viz-install/viz-check`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17-rc1+k3s1", GitCommit:"e8c9484078bc59f2cd04f4018b095407758073f5", GitTreeState:"clean", BuildDate:"2021-01-14T06:20:56Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-3d2d4df1 # made changes to link-crd after prev test case
Server version: unavailable

$ bin/linkerd check --pre --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:

$ bin/linkerd install | k apply -f -
- no errors, no version warnings -

$ bin/linkerd check --expected-version git-3d2d4df1
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings - 

$ bin/linkerd mc check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc link --cluster-name foo | k apply -f -
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created

# VIZ

$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check
- no errors, no version warnings -
- hangs indefinitely after 'linkerd-viz can talk to Kubernetes'
```

Linkerd edge (21.1.3) CLI + k8s 1.17 (already installed)
`check`
```
$ linkerd version
Client version: edge-21.1.3
Server version: git-3d2d4df1

$ linkerd check
- no errors -
- warnings: mismatch between cli & control plane, control plane not up to date (both expected) -
Status check results are :tick:
```

Linkerd stable (2.9.2) CLI + k8s 1.17 (already installed)
`check/uninstall`
```
$ linkerd version
Client version: stable-2.9.2
Server version: git-3d2d4df1

$ linkerd check
× control plane ClusterRoles exist
    missing ClusterRoles: linkerd-linkerd-tap
    see https://linkerd.io/checks/#l5d-existence-cr for hints

Status check results are ×

# viz wasn't installed, hence the error, installing viz didn't help since
# the res is named `viz-tap` now
# moving to uninstall

$ linkerd uninstall | k delete -f -
- no warnings, no errors - 
```

_Note_: I used `go test ./cli/cmd/... --generate` which is why there are so many changes 😨 

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-02-01 09:18:13 -05:00
Alex Leong dd8e5fc5bc
Rename extension charts to linkerd-* (#5552)
For consistency we rename the extension charts to a common naming scheme:

linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link

We also make the chart files and chart readmes a bit more uniform.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-26 16:20:49 -08:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains in use by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface (see the sketch after this list).
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the logger in multicluster/cmd/root.go so that it displays messages when --verbose is used.
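
The `healthcheck.Runner` abstraction mentioned above, sketched from this description (the exact method set in `pkg/healthcheck` may differ):

```go
package healthcheck

// CheckResult is a minimal stand-in for a single check outcome.
type CheckResult struct {
	Category    string
	Description string
	Err         error
}

// CheckObserver receives one result per executed check.
type CheckObserver func(result *CheckResult)

// Runner abstracts over the core and viz healthcheckers: both run
// their configured checks, report each result to the observer, and
// return whether everything succeeded.
type Runner interface {
	RunChecks(observer CheckObserver) bool
}
```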
2021-01-21 18:26:38 -05:00
Matei David c63fbdf0e4
Introduce OpenAPIV3 validation for CRDs (#5573)
* Introduce OpenAPIV3 validation for CRDs

* Add validation to link crd
* Add validation to sp using kube-gen
* Add openapiv3 under schema fields in specific versions
* Modify fields to rid spec of yaml errors
* Add top level validation for all three CRDs

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-01-21 11:56:28 -05:00
Tarun Pothulapati 3b755e5c1d
multicluster: add helm customization flags for install (#5534)
* multicluster: add helm customization flags

This branch updates the multicluster install flow to use the
helm engine directly instead of our own chart wrapper. This
also adds the helm customization flags.

```bash
$ ./bin/go-run cli mc install --set namespace=l5d-mc | grep l5d-mc
github.com/linkerd/linkerd2/multicluster/cmd
github.com/linkerd/linkerd2/cli/cmd
  name: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
    mirror.linkerd.io/gateway-identity: linkerd-gateway.l5d-mc.serviceaccount.identity.linkerd.cluster.local
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
```

* add customization flags even for link cmd

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-20 11:29:42 +05:30
Alejandro Pedraza f3b1ebfa99
Separate observability API (#5510)
* Separate observability API

Closes #5312

This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.

- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.proto`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.

The other changes in the go files are just changes in the imports to point to the new protobufs.

Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
2021-01-13 14:34:54 -05:00
Tarun Pothulapati 2b6c5e807d
multicluster: Add removed non-lb ServiceType logic (#5473)
As #5307 and #5293 went in during the same time frame, some of the logic
added in #5307 got lost during the merge. (Oops, sorry!)

The same logic has been added back. The MC refactor PR #5293 moved
all the logic from `multicluster.go` into cmd-specific files, losing
the changes #5307 had made there, while the changes added
in `multicluster/values.go` and the template files still remained.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-07 23:52:37 +05:30
Tarun Pothulapati 68c02d82d1
healthcheck: simplify Checker construction with a builder (#5475)
Currently, each new instance of the `Checker` type has to manually
set all the fields with `NewChecker()`, even though most
use cases are fine with the defaults.

This branch makes this simpler by using the Builder pattern, so
that users of `Checker` can override the defaults with specific
field methods when needed, simplifying the code.
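
As a sketch of the pattern (with illustrative, lower-cased names rather than the exact linkerd2 method set):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// checker shows the builder idea: the constructor fills in
// defaults, and each with* method overrides one field and returns
// the receiver so that calls can be chained.
type checker struct {
	description   string
	fatal         bool
	retryDeadline time.Time
	check         func(context.Context) error
}

func newChecker(description string) *checker {
	return &checker{description: description} // defaults: non-fatal, no retry
}

func (c *checker) fatalOnError() *checker { c.fatal = true; return c }

func (c *checker) withRetryDeadline(t time.Time) *checker {
	c.retryDeadline = t
	return c
}

func (c *checker) withCheck(f func(context.Context) error) *checker {
	c.check = f
	return c
}

func main() {
	c := newChecker("control plane pods are ready").
		fatalOnError().
		withRetryDeadline(time.Now().Add(5 * time.Minute)).
		withCheck(func(ctx context.Context) error { return nil })
	fmt.Println(c.description, "fatal:", c.fatal)
}
```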

This also removes some of the methods that were specific to tests,
and replaces them with the currently used ones.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-06 14:32:39 -08:00
Kevin Leimkuhler f6c8d27d83
Add multicluster check command (#5410)
## What

This change moves the `linkerd check --multicluster` functionality under its
own multicluster subcommand: `linkerd multicluster check`.

There should be no functional changes as a result of this change. `linkerd
check` no longer checks for anything multicluster related and the
`--multicluster` flag has been removed.

## Why

Closes #5208

The bulk of these changes are moving all the multicluster checks from
`pkg/healthcheck` into the multicluster package.

Doing this completely separates it from core Linkerd. It still uses
`pkg/healthcheck` when possible, but anything that is used only by `multicluster
check` has been moved.

**Note that the `kubernetes-api` and `linkerd-existence` checks are still run.**

These checks are required for setting up the Linkerd health checker. They set
the health checker's `kubeAPI`, `linkerdConfig`, and `apiClient` fields.

These could be set manually so that the only check the user sees is
`linkerd-multicluster`, but I chose not to do this.

If any of the setting functions errors, it would just tell the user to run
`linkerd check` and ensure the installation is correct. I find the user error
handling to be better by including these required checks since they should be
run in the first place.

## How to test

Installing Linkerd and multicluster should result in a basic check output:

```
$ bin/linkerd install |kubectl apply -f -
..
$ bin/linkerd check
..
$ bin/linkerd multicluster install |kubectl apply -f -
..
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists


Status check results are √
```

After linking a cluster:

```
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * k3d-y
√ remote cluster access credentials are valid
        * k3d-y
√ clusters share trust anchors
        * k3d-y
√ service mirror controller has required permissions
        * k3d-y
√ service mirror controllers are running
        * k3d-y
× all gateway mirrors are healthy
        probe-gateway-k3d-y.linkerd-multicluster mirrored from cluster [k3d-y] has no endpoints
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints

Status check results are ×
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-21 15:50:17 -05:00
Alejandro Pedraza 131e270d5a
Don't swallow error when MC gateway hostname can't be resolved (#5362)
* Don't swallow error when MC gateway hostname can't be resolved

Ref #5343

When none of the gateway addresses is resolvable, propagate the error as
a retryable error so it gets retried and logged. Don't create the
mirrored resources if there's no success after the retries.
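
A sketch of the retryable-error pattern being described (the type name and shape are illustrative, not the exact service-mirror code):

```go
package main

import (
	"errors"
	"fmt"
)

// RetryableError wraps resolution failures so the event loop
// retries (and logs) them instead of silently dropping them.
type RetryableError struct{ Inner []error }

func (e RetryableError) Error() string {
	return fmt.Sprintf("Inner errors: %v", e.Inner)
}

func resolve(addr string) error {
	return fmt.Errorf("error resolving %q: no such host", addr) // stub: always fails
}

// resolveGatewayAddresses propagates an error only when none of
// the addresses resolved; the caller then requeues the event and
// skips creating the mirrored resources.
func resolveGatewayAddresses(addrs []string) error {
	var errs []error
	for _, a := range addrs {
		if err := resolve(a); err != nil {
			errs = append(errs, err)
		}
	}
	if len(addrs) > 0 && len(errs) == len(addrs) {
		return RetryableError{Inner: errs}
	}
	return nil
}

func main() {
	err := resolveGatewayAddresses([]string{"foobar"})
	var re RetryableError
	if errors.As(err, &re) {
		fmt.Println("will retry:", re)
	}
}
```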
2020-12-11 09:58:44 -05:00
Alejandro Pedraza 02b456087d
Stop publishing the linkerd2-multicluster-link chart (#5365)
Closes #5348

That chart generates the service mirror resources and related RBAC, but
doesn't generate the credentials secret nor the Link CR which require
go-client logic not available from sheer Helm templates.

This PR stops publishing that chart, and adds a comment to its README
about it.
2020-12-11 08:55:50 -05:00
Kevin Leimkuhler 15dc97c70e
add some missing helm values for multicluster setup (#5346)
Original description:

> **Subject**
> Add missing helm values for multicluster setup
> 
> **Problem**
> When rendering this chart without the linkerd CLI, the two variables are missing and the rendering will generate empty values.
> This produces the following gateway identity, that is also used in the gateway link command to generate the link crd:
> 
> ```
> mirror.linkerd.io/gateway-identity: linkerd-gateway.linkerd-multicluster.serviceaccount.identity..
> ```
> 
> **Solution**
> Add the values as defaults to the helm chart values.yaml file. If the cli is used they are overwritten by the following parameters:
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L197
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L196

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Björn Wenzel <bjoern.wenzel@dbschenker.com>
2020-12-08 10:27:16 -05:00
Tarun Pothulapati 72a0ca974d
extension: Separate multicluster chart and binary (#5293)
Fixes #5257

This branch moves the mc charts and CLI-level code to a new
top-level directory. None of the logic is changed.

Also moves some common types into `/pkg` so that they
are accessible to both the main CLI and extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:36:10 -08:00