* multicluster: make service mirror honour `requeueLimit`
Fixes #5374
Currently, whenever the `gatewayAddress` changes, the service mirror
component keeps trying to repair endpoints (`RepairEndpoints` is invoked
every `repairPeriod`). That behavior is fine and expected, but because
the service mirror does not honor `requeueLimit`, it keeps requeuing the
same event and retries with no limit.
The condition we use to limit requeues,
`if (rcsw.eventsQueue.NumRequeues(event) < rcsw.requeueLimit)`, does
not work for the following reason:
- For the queue to actually track requeues, `AddRateLimited` has to be
  used instead; only then does `NumRequeues` return the real number of
  requeues for a specific event.
This change updates the requeuing logic to use `AddRateLimited` instead
of `Add`.
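For illustration, here is a minimal sketch of the difference using client-go's `workqueue` package (the event is a placeholder string and `requeueLimit` is hard-coded; this is not the actual service mirror code):
```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	const requeueLimit = 3
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

	event := "RepairEndpoints" // placeholder for the real event type

	for {
		if queue.NumRequeues(event) < requeueLimit {
			fmt.Printf("Requeues: %d, Limit: %d (will retry)\n", queue.NumRequeues(event), requeueLimit)
			// AddRateLimited records the requeue with the rate limiter, so
			// NumRequeues grows; plain Add would leave it at 0 forever and
			// the limit check above would never trip.
			queue.AddRateLimited(event)
		} else {
			fmt.Println("giving up")
			queue.Forget(event) // reset the per-item requeue counter
			break
		}
	}
}
```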
After these changes, the service mirror logs look as follows:
```bash
time="2021-03-30T16:52:31Z" level=info msg="Received: OnAddCalled: {svc: Service: {name: grafana, namespace: linkerd-viz, annotations: [[linkerd.io/created-by=linkerd/helm git-0e2ecd7b]], labels [[linkerd.io/extension=viz]]}}" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 1, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 2, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=info msg="Requeues: 3, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:52:31Z" level=error msg="Error processing RepairEndpoints (giving up): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 0, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 1, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 2, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (will retry): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Received: RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=warning msg="Error resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=info msg="Requeues: 3, Limit: 3 for event RepairEndpoints" apiAddress="https://172.18.0.4:6443" cluster=remote
time="2021-03-30T16:53:31Z" level=error msg="Error processing RepairEndpoints (giving up): Inner errors:\n\tError resolving 'foobar': lookup foobar on 10.43.0.10:53: no such host" apiAddress="https://172.18.0.4:6443" cluster=remote
```
As seen, `RepairEndpoints` is called every `repairPeriod`, which is
1 minute by default. Whenever a failure happens it is retried, but the
failures are now tracked and the event is given up on once it reaches
the `requeueLimit`, which is 3 by default.
This also fixes the requeuing logic for all types of events, not just
`RepairEndpoints`.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Currently, no `NOTES` are printed out after an extension is installed
through Helm, as we do for the core chart. This updates the viz and
jaeger charts to include them, along with instructions to view the
dashboard.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Move CP check after the readiness check
Moved the `can initialize client` and `can query the control plane API`
checks from the `linkerd-existence` section to the `linkerd-api` section
because they require the `linkerd-controller` pod to not just be "Running"
but to actually be ready.
This was causing `linkerd check` to show some port-forwarding warnings
when run right after install.
This also allowed getting rid of the `CheckPublicAPIClientOrExit` function
and using `CheckPublicAPIClientOrRetryOrExit` directly (better naming
punted for later), which was refactored so it always runs the
`linkerd-api` checks before retrieving the client.
Other changes:
- Temporarily disabled `upgrade-edge` test because the latest edge has this readiness check issue
- Have the upgrade tests do proper pruning (stolen from @Pothulapati's #5673 😉)
- Added missing label to tap SA (fixes #5850)
- Complete tap-injector Service selector
* Remove linkerd prefix from extension resources
This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
As described in https://github.com/linkerd/linkerd2/pull/5692, this PR adds support for CLI extensions.
Calling `linkerd foo` (if `foo` is not an existing Linkerd command) will now search the current PATH for an executable named `linkerd-foo` and invoke it with the current arguments.
* All arguments and flags will be passed to the extension command
* The Linkerd command itself will not process any flags
* To simplify parsing, flags are not allowed before the extension name
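As a rough illustration of the dispatch, here is a hedged sketch (the `runExtension` helper is hypothetical, not the actual Linkerd implementation):
```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runExtension searches the PATH for "linkerd-<name>" and, if found,
// invokes it with all remaining arguments and flags passed through untouched.
func runExtension(name string, args []string) error {
	path, err := exec.LookPath("linkerd-" + name)
	if err != nil {
		return fmt.Errorf("unknown command %q for \"linkerd\"", name)
	}
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: linkerd <extension> [args...]")
		os.Exit(1)
	}
	// e.g. `linkerd foo install` -> exec `linkerd-foo install`
	if err := runExtension(os.Args[1], os.Args[2:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```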
e.g. with an executable called `linkerd-foo` on my PATH:
```console
> linkerd foo install
Welcome to Linkerd foo!
Got: install
> linkerd foo --context=prod install
Welcome to Linkerd foo!
Got: --context=prod install
> linkerd --context=prod foo install
Cannot accept flags before Linkerd extension name
> linkerd bar install
Error: unknown command "bar" for "linkerd"
Run 'linkerd --help' for usage.
```
We also update `linkerd check` to invoke `linkerd <extension> check` for each extension found installed on the current cluster. A check warning is emitted if the extension command is not found on the path.
e.g. with both `linkerd.io/extension=foo` and `linkerd.io/extension=bar` extensions installed on the cluster:
```console
> linkerd check
[...]
Linkerd extensions checks
=========================
Welcome to Linkerd foo!
Got: check --as-group=[] --cni-namespace=linkerd-cni --help=false --linkerd-cni-enabled=false --linkerd-namespace=linkerd --output=table --pre=false --proxy=false --verbose=false --wait=5m0s
linkerd-bar
-----------
‼ Linkerd extension command linkerd-bar exists
Status check results are ‼
```
Signed-off-by: Alex Leong <alex@buoyant.io>
This reverts commit f9ab867cbc which renamed the
multicluster label name from `mirror.linkerd.io` to `multicluster.linkerd.io`.
While this change was made to follow similar namings in other extensions, it
complicates the multicluster upgrade process due to the secret creation.
`mirror.linkerd.io` is not an important enough label to warrant changing, and
keeping it will allow a smoother upgrade process for `stable-2.10.x`.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This change introduces an opaque ports annotation watcher that will send
destination profile updates when a service has its opaque ports annotation
change.
The user facing change introduced by this is that the opaque ports annotation is
now required on services when using the multicluster extension. This is because
the service mirror will create mirrored services in the source cluster, and
destination lookups in the source cluster need to discover that the workloads in
the target cluster are opaque protocols.
### Why
Closes #5650
### How
The destination server now has a new opaque ports annotation watcher. When a
client subscribes to updates for a service name or cluster IP, the `GetProfile`
method creates a profile translator stack that passes updates through resource
adaptors such as: traffic split adaptor, service profile adaptor, and now opaque
ports adaptor.
When the annotation on a service changes, the update is passed through to the
client where the `opaque_protocol` field will either be set to true or false.
A few scenarios to consider are:
- If the annotation is removed from the service, the client should receive
an update with no opaque ports set.
- If the service is deleted, the stream stays open so the client should
receive an update with no opaque ports set.
- If the service has the annotation added, the client should receive that
update.
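As a rough sketch of how those scenarios play out when reading the annotation (the helper and its parsing are illustrative only; the annotation key assumed here is Linkerd's `config.linkerd.io/opaque-ports`):
```go
package main

import (
	"fmt"
	"strconv"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// opaquePorts is an illustrative helper: a deleted service (nil) or a
// service without the annotation yields an empty set, i.e. an update with
// no opaque ports; otherwise the comma-separated port list is parsed.
func opaquePorts(svc *corev1.Service) map[uint32]struct{} {
	ports := map[uint32]struct{}{}
	if svc == nil {
		return ports // service deleted: update with no opaque ports
	}
	raw, ok := svc.Annotations["config.linkerd.io/opaque-ports"]
	if !ok {
		return ports // annotation removed: update with no opaque ports
	}
	for _, p := range strings.Split(raw, ",") {
		if port, err := strconv.ParseUint(strings.TrimSpace(p), 10, 32); err == nil {
			ports[uint32(port)] = struct{}{}
		}
	}
	return ports
}

func main() {
	svc := &corev1.Service{}
	svc.Annotations = map[string]string{"config.linkerd.io/opaque-ports": "25,3306"}
	fmt.Println(opaquePorts(svc)) // map[25:{} 3306:{}]
}
```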
### Testing
Unit tests have been added for the watcher as well as the destination server.
An integration test has been added that tests the opaque port annotation on a
service.
For manual testing, using the destination server scripts is easiest:
```
# install Linkerd
# start the destination server
$ go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
# Create a service or namespace with the annotation and inject it
# get the destination profile for that service and observe the opaque protocol field
$ go run controller/script/destination-client/main.go -method getProfile -path test-svc.default.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000}
INFO[0000]
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000}
INFO[0000]
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Problem: the CLI prints command usage for multiple commands in case of API errors
Solution: print the error and then exit using `os.Exit(1)` to avoid Cobra printing the usage
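A minimal sketch of the pattern (the command and `callAPI` here are placeholders, not the actual Linkerd command code):
```go
package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

// callAPI stands in for a control plane API call that may fail.
func callAPI() error { return fmt.Errorf("API error") }

func main() {
	cmd := &cobra.Command{
		Use: "example",
		RunE: func(cmd *cobra.Command, args []string) error {
			if err := callAPI(); err != nil {
				// Print the error ourselves and exit immediately so Cobra
				// never gets a chance to echo the command usage.
				fmt.Fprintln(os.Stderr, err)
				os.Exit(1)
			}
			return nil
		},
	}
	if err := cmd.Execute(); err != nil {
		os.Exit(1)
	}
}
```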
Closes #5058
Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
Fixes #5574 and supersedes #5660
- Removed from all the `values.yaml` files all those "do not edit" entries for annotation/label names, hard-coding them in the templates instead.
- The `values.go` files got simplified as a result.
- The `created-by` annotation was also refactored into a reusable partial. This means we had to add a `partials` dependency to multicluster.
This renames the multicluster annotation prefix from `mirror.linkerd.io` to
`multicluster.linkerd.io` in order to reflect other extension naming patterns.
Additionally, it moves labels only used in the Multicluster extension into their
own labels file—again to reflect other extensions.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* cli: make jaeger and multicluster installs wait for core cp
This PR updates the jaeger and multicluster installs to wait
for the core control-plane to be up before moving to the rendering
logic. This prevents these components from being installed before
the injector is up and running correctly.
`--skip-checks` has been added to jaeger to skip these checks. The same
has not been added to `multicluster`, as its install fails when no core
control plane is present.
This PR also cleans up an extra core control-plane check that we had for
`viz install`.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
We've created a custom domain, `cr.l5d.io`, that redirects to `ghcr.io`
(using `scarf.sh`). This custom domain allows us to swap the underlying
container registry without impacting users. It also provides us with
important metrics about container usage, without collecting PII like IP
addresses.
This change updates our Helm charts and CLIs to reference this custom
domain. The integration test workflow now refers to the new domain,
while the release workflow continues to use the `ghcr.io/linkerd` registry
for the purpose of publishing images.
Problem
If the main Linkerd control plane has been uninstalled, it is no longer possible to uninstall the multicluster extension.
```
$ bin/linkerd mc uninstall | k delete -f -
Error: you need Linkerd to be installed in order to install multicluster addons
Usage:
linkerd multicluster uninstall [flags]
```
Solution
Fetch resources with the label `linkerd.io/extension: linkerd-multicluster` and delete them
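A hedged sketch of the idea, limited to namespaces for brevity (the helper name and scope are illustrative; the real command also covers the other labeled resource kinds):
```go
package uninstall

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteLabeledNamespaces removes every namespace carrying the extension
// label; this works even when the core control plane is already gone.
func deleteLabeledNamespaces(ctx context.Context, client kubernetes.Interface) error {
	opts := metav1.ListOptions{LabelSelector: "linkerd.io/extension=linkerd-multicluster"}
	namespaces, err := client.CoreV1().Namespaces().List(ctx, opts)
	if err != nil {
		return err
	}
	for _, ns := range namespaces.Items {
		if err := client.CoreV1().Namespaces().Delete(ctx, ns.Name, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```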
Closes #5624
Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
* values: removal of .global field
Fixes #5425
With the new extension model, we no longer need the `Global` field, as we
don't rely on chart dependencies anymore. This helps us further clean up
Values and makes configuration simpler.
To make upgrades and usage of the new CLI with older configs work, we add
a new method called `config.RemoveGlobalFieldIfPresent` that is used in
the upgrade and `FetchCurrentConfiguration` paths to remove the global
field and attach its child nodes if global is present. This is verified by
the older `TestFetchCurrentConfiguration` test that has the global field.
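A hedged sketch of the idea behind `config.RemoveGlobalFieldIfPresent`, using a plain values map (the real helper may operate on the stored configuration differently, so this is illustrative only):
```go
package main

import "fmt"

// removeGlobalField lifts the children of a "global" key to the top level
// of the values map and drops the key itself, so configs written by older
// versions keep working with the new flattened layout.
func removeGlobalField(values map[string]interface{}) {
	global, ok := values["global"].(map[string]interface{})
	if !ok {
		return // no global field present: nothing to do
	}
	for k, v := range global {
		if _, exists := values[k]; !exists {
			values[k] = v // attach child node at the top level
		}
	}
	delete(values, "global")
}

func main() {
	values := map[string]interface{}{
		"global": map[string]interface{}{"clusterDomain": "cluster.local"},
		"other":  true,
	}
	removeGlobalField(values)
	fmt.Println(values) // map[clusterDomain:cluster.local other:true]
}
```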
We also don't yet remove `.global` in some Helm stable-upgrade tests, so
that the initial install works.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Run extension checks when linkerd check is invoked
This change allows the `linkerd check` command to also run any known
linkerd extension commands that have been installed in the cluster. It
does this by first querying for any namespace that has the label selector
`linkerd.io/extension` and then running the subcommands for either
`jaeger`, `multicluster`, or `viz`. This change runs basic namespace
healthchecks for extensions that aren't part of the Linkerd extension suite.
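As a rough, hedged sketch of that flow (the helper, binary path handling, and error handling are illustrative, not the actual implementation):
```go
package check

import (
	"context"
	"os"
	"os/exec"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// runExtensionChecks finds namespaces labeled as extensions and runs
// `linkerd <extension> check` for each one found.
func runExtensionChecks(ctx context.Context, client kubernetes.Interface, linkerdBin string) error {
	namespaces, err := client.CoreV1().Namespaces().List(ctx, metav1.ListOptions{
		LabelSelector: "linkerd.io/extension",
	})
	if err != nil {
		return err
	}
	for _, ns := range namespaces.Items {
		ext := ns.Labels["linkerd.io/extension"]
		cmd := exec.Command(linkerdBin, ext, "check")
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return err
		}
	}
	return nil
}
```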
Fixes #5233
(Background information)
In our company we check the sops-encrypted Linkerd manifest into a GitHub
repository, and I came across the following problem.
---
Three dashes mark the start of a YAML document (or the end of the
directives).
https://yaml.org/spec/1.2/spec.html#id2800132
If there are only comments between `---` markers, the document is empty.
Consider a file that includes an empty document at the top:
```yaml
---
# foo
---
apiVersion: v1
kind: Namespace
metadata:
  name: foo
---
# bar
---
apiVersion: v1
kind: Namespace
metadata:
  name: bar
```
When we encrypt and decrypt it with [sops](https://github.com/mozilla/sops), the empty document will be
converted to `{}`.
```yaml
{}
---
apiVersion: v1
kind: Namespace
metadata:
  name: foo
---
apiVersion: v1
kind: Namespace
metadata:
  name: bar
```
It is invalid as a k8s manifest ([apiVersion not set, kind not set]):
```
error validating data: [apiVersion not set, kind not set]
```
---
I suspect this is (at least partly) sops's problem, but in any case this modification is harmless enough, I think.
Thank you.
Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
*Closes #5484*
### Changes
---
*Overview*:
* Update golden files and make necessary spec changes
* Update test files for viz
* Add v1 to healthcheck and uninstall
* Fix link-crd clusterDomain field validation
- To update to v1, I had to change the CRD schemas to be version-based (i.e. each version has to declare its own schema). I noticed an error in the link-crd (`targetClusterDomain` was `targetDomainName`). Additionally, `additionalPrinterColumns` is now a version-dependent field.
- For `admissionregistration` resources I had to add an additional `admissionReviewVersions` field -- I included `v1` and `v1beta1`.
- In `healthcheck.go` and `resources.go` (used by `uninstall`) I had to make some changes to the client-go versions (i.e from `v1beta1` to `v1` for admissionreg and apiextension) so that we don't see any warning messages when uninstalling or when we do any install checks.
I tested against different CLI and k8s versions to have a bit more confidence in the changes (in addition to the automated tests); hope the cases below are enough, if not let me know and I can test further.
### Tests
Linkerd local build CLI + k8s 1.19+
`install/check/mc-check/mc-install/mc-link/viz-install/viz-check/uninstall/`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
$ bin/linkerd version
Client version: git-b0fd2ec8
Server version: unavailable
$ bin/linkerd install | kubectl apply -f -
- no errors, no version warnings -
$ bin/linkerd check --expected-version git-b0fd2ec8
Status check results are :tick:
# MC
$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd mc check
Status check results are :tick:
$ bin/linkerd mc link foo | k apply -f - # test crd creation
# had a validation error here because the schema had targetDomainName instead of targetClusterDomain
# changed, rebuilt cli, re-installed mc, tried command again
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
...
# VIZ
$ bin/linkerd viz install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd viz check
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd uninstall | k delete -f -
- no errors, no version warnings -
```
Linkerd local build CLI + k8s 1.17
`check-pre/install/mc-check/mc-install/mc-link/viz-install/viz-check`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17-rc1+k3s1", GitCommit:"e8c9484078bc59f2cd04f4018b095407758073f5", GitTreeState:"clean", BuildDate:"2021-01-14T06:20:56Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
$ bin/linkerd version
Client version: git-3d2d4df1 # made changes to link-crd after prev test case
Server version: unavailable
$ bin/linkerd check --pre --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd check --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd mc check
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd mc link --cluster-name foo | k apply -f -
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
# VIZ
$ bin/linkerd viz install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd viz check
- no errors, no version warnings -
- hangs up indefinitely after linkerd-viz can talk to Kubernetes
```
Linkerd edge (21.1.3) CLI + k8s 1.17 (already installed)
`check`
```
$ linkerd version
Client version: edge-21.1.3
Server version: git-3d2d4df1
$ linkerd check
- no errors -
- warnings: mismatch between cli & control plane, control plane not up to date (both expected) -
Status check results are :tick:
```
Linkerd stable (2.9.2) CLI + k8s 1.17 (already installed)
`check/uninstall`
```
$ linkerd version
Client version: stable-2.9.2
Server version: git-3d2d4df1
$ linkerd check
× control plane ClusterRoles exist
missing ClusterRoles: linkerd-linkerd-tap
see https://linkerd.io/checks/#l5d-existence-cr for hints
Status check results are ×
# viz wasn't installed, hence the error, installing viz didn't help since
# the res is named `viz-tap` now
# moving to uninstall
$ linkerd uninstall | k delete -f -
- no warnings, no errors -
```
_Note_: I used `go test ./cli/cmd/... --generate` which is why there are so many changes 😨
Signed-off-by: Matei David <matei.david.35@gmail.com>
For consistency we rename the extension charts to a common naming scheme:
- linkerd-viz -> linkerd-viz (unchanged)
- jaeger -> linkerd-jaeger
- linkerd2-multicluster -> linkerd-multicluster
- linkerd2-multicluster-link -> linkerd-multicluster-link
We also make the chart files and chart READMEs a bit more uniform.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set up the logger in multicluster/cmd/root.go so that it displays messages when --verbose is used
* Introduce OpenAPIV3 validation for CRDs
* Add validation to link crd
* Add validation to sp using kube-gen
* Add openapiv3 under schema fields in specific versions
* Modify fields to rid spec of yaml errors
* Add top level validation for all three CRDs
Signed-off-by: Matei David <matei.david.35@gmail.com>
* multicluster: add helm customization flags
This branch updates the multicluster install flow to use the
helm engine directly instead of our own chart wrapper. This
also adds the helm customization flags.
```bash
tarun in dev in on k3d-deep (default) linkerd2 on tarun/mc-helm-flags [$+?] via v1.15.4
./bin/go-run cli mc install --set namespace=l5d-mc | grep l5d-mc
github.com/linkerd/linkerd2/multicluster/cmd
github.com/linkerd/linkerd2/cli/cmd
name: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
mirror.linkerd.io/gateway-identity: linkerd-gateway.l5d-mc.serviceaccount.identity.linkerd.cluster.local
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
```
* add customization flags even for link cmd
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Separate observability API
Closes #5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.proto`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto`, except for the `Version` stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
As #5307 and #5293 landed in the same time frame, some of the logic
added in #5307 got lost during the merge (oops, sorry!).
The same logic has been added back. The MC refactor PR #5293 moved all
the logic from `multicluster.go` into command-specific files, losing the
changes added in #5307, while the changes added in
`multicluster/values.go` and the template files still remained.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Currently, each new instance of the `Checker` type has to manually set
all the fields with `NewChecker()`, even though most use-cases are fine
with the defaults.
This branch makes this simpler by using the builder pattern, so that
users of `Checker` can override the defaults with specific field methods
when needed, thus simplifying the code.
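For illustration, a minimal sketch of what such a builder-style API looks like (the field names and methods here are illustrative, not the exact `Checker` API):
```go
package main

import "time"

// Checker holds the settings for a single health check.
type Checker struct {
	description   string
	fatal         bool
	retryDeadline time.Time
}

// NewChecker returns a Checker with sensible defaults (non-fatal, no retry).
func NewChecker(description string) *Checker {
	return &Checker{description: description}
}

// Fatal marks the check as fatal; returning the receiver allows chaining.
func (c *Checker) Fatal() *Checker {
	c.fatal = true
	return c
}

// WithRetryDeadline sets how long the check may be retried.
func (c *Checker) WithRetryDeadline(t time.Time) *Checker {
	c.retryDeadline = t
	return c
}

func main() {
	// Callers only override what they need:
	_ = NewChecker("control plane pods are ready").
		Fatal().
		WithRetryDeadline(time.Now().Add(5 * time.Minute))
}
```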
This also removes some of the methods that were specific to tests,
and replaces them with the currently used ones.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## What
This change moves the `linkerd check --multicluster` functionality under its
own multicluster subcommand: `linkerd multicluster check`.
There should be no functional changes as a result of this change. `linkerd
check` no longer checks for anything multicluster related and the
`--multicluster` flag has been removed.
## Why
Closes #5208
The bulk of these changes are moving all the multicluster checks from
`pkg/healthcheck` into the multicluster package.
Doing this completely separates it from core Linkerd. It still uses
`pkg/healthcheck` when possible, but anything that is used only by `multicluster
check` has been moved.
**Note that the `kubernetes-api` and `linkerd-existence` checks are run.**
These checks are required for setting up the Linkerd health checker. They set
the health checker's `kubeAPI`, `linkerdConfig`, and `apiClient` fields.
These could be set manually so that the only check the user sees is
`linkerd-multicluster`, but I chose not to do this.
If any of the setting functions errors, it would just tell the user to run
`linkerd check` and ensure the installation is correct. I find the user error
handling to be better by including these required checks since they should be
run in the first place.
## How to test
Installing Linkerd and multicluster should result in a basic check output:
```
$ bin/linkerd install |kubectl apply -f -
..
$ bin/linkerd check
..
$ bin/linkerd multicluster install |kubectl apply -f -
..
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-multicluster
--------------------
√ Link CRD exists
Status check results are √
```
After linking a cluster:
```
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
* k3d-y
√ remote cluster access credentials are valid
* k3d-y
√ clusters share trust anchors
* k3d-y
√ service mirror controller has required permissions
* k3d-y
√ service mirror controllers are running
* k3d-y
× all gateway mirrors are healthy
probe-gateway-k3d-y.linkerd-multicluster mirrored from cluster [k3d-y] has no endpoints
see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints
Status check results are ×
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Don't swallow error when MC gateway hostname can't be resolved
Ref #5343
When none of the gateway addresses is resolvable, propagate the error as
a retryable error so it gets retried and logged. Don't create the
mirrored resources if there's no success after the retries.
Closes #5348
That chart generates the service mirror resources and related RBAC, but
doesn't generate the credentials secret nor the Link CR which require
go-client logic not available from sheer Helm templates.
This PR stops publishing that chart, and adds a comment to its README
about it.
Original description:
> **Subject**
> Add missing helm values for multicluster setup
>
> **Problem**
> When executing this without the linkerd command the two variables are missing and the rendering will generate empty values.
> This produces the following gateway identity, that is also used in the gateway link command to generate the link crd:
>
> ```
> mirror.linkerd.io/gateway-identity: linkerd-gateway.linkerd-multicluster.serviceaccount.identity..
> ```
>
> **Solution**
> Add the values as defaults to the helm chart values.yaml file. If the cli is used they are overwritten by the following parameters:
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L197
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L196
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Björn Wenzel <bjoern.wenzel@dbschenker.com>
Fixes #5257
This branch moves the mc charts and CLI-level code to a new top-level
directory. None of the logic is changed.
It also moves some common types into `/pkg` so that they are accessible
both to the main CLI and to extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>