In the stable-2.9.0, stable-2.9.1, and stable-2.9.2 releases, the `linkerd-config-overrides` secret is missing the `linkerd.io/control-plane-ns` label. This means that if a `linkerd upgrade` is performed to one of these versions using the `--prune` flag, then the secret will be deleted. Missing this secret will prevent any further upgrades.
We add a `linkerd repair` command which recreates the `linkerd-config-overrides` secret by fetching the installed values from the `linkerd-config` configmap and then re-populating the redacted identity values from the `linkerd-identity-issuer` secret.
Usage:
```bash
linkerd repair | kubectl apply -f -
```
To test:
```
# Set Linkerd version to stable-2.8.0
> linkerd install | kubectl apply -f -
# Set Linkerd version to stable-2.9.1
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command fails)
# Set Linkerd version to HEAD
> linkerd repair | kubectl apply -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command succeeds)
> linkerd check
```
Signed-off-by: Alex Leong <alex@buoyant.io>
* cli: make `linkerd uninstall` fail when injected pods are present
Fixes#5622
This PR updates the `linkerd uninstall` cmd to check if there
are any injected pods and fails if there are any. This also
provides `--force` flag to skip this check.
pods from namespaces with prefix `linkerd` are skipped
so as to not error out for control-plane and extension
pods.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## What this changes
This adds a `viz profile` command that outputs a service profile based off tap
data. It is identical—but fixes—the current `profile --tap` command.
Additionally, it removes the `--tap` flag from the `profile` command since this
depends on the Viz extension being installed in order to tap a service.
## Why
The `profile --tap` command is currently broken since it depends on the Viz
extension being installed, but the `profile` command is part of the core
install.
Closes#5613
Unblocks #5545
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
(Background information)
In our company we are checking the sops-encrypted Linkerd manifest into GitHub repository,
and I came across the following problem.
---
Three dashes mean the start of the YAML document (or the end of the
directive).
https://yaml.org/spec/1.2/spec.html#id2800132
If there are only comments between `---`, the document is empty.
Assume the file which include an empty document at the top of itself.
```yaml
---
# foo
---
apiVersion: v1
kind: Namespace
metadata:
name: foo
---
# bar
---
apiVersion: v1
kind: Namespace
metadata:
name: bar
```
When we encrypt and decrypt it with [sops](https://github.com/mozilla/sops), the empty document will be
converted to `{}`.
```yaml
{}
---
apiVersion: v1
kind: Namespace
metadata:
name: foo
---
apiVersion: v1
kind: Namespace
metadata:
name: bar
```
It is invalid as k8s manifest ([apiVersion not set, kind not set]).
```
error validating data: [apiVersion not set, kind not set]
```
---
I'm afraid that it's sops's problem (at least partly), but anyhow this modification is enough harmless I think.
Thank you.
Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
*Closes #5484*
### Changes
---
*Overview*:
* Update golden files and make necessary spec changes
* Update test files for viz
* Add v1 to healthcheck and uninstall
* Fix link-crd clusterDomain field validation
- To update to v1, I had to change crd schemas to be version-based (i.e each version has to declare its own schema). I noticed an error in the link-crd (`targetClusterDomain` was `targetDomainName`). Also, additionalPrinterColumns are also version-dependent as a field now.
- For `admissionregistration` resources I had to add an additional `admissionReviewVersions` field -- I included `v1` and `v1beta1`.
- In `healthcheck.go` and `resources.go` (used by `uninstall`) I had to make some changes to the client-go versions (i.e from `v1beta1` to `v1` for admissionreg and apiextension) so that we don't see any warning messages when uninstalling or when we do any install checks.
I tested again different cli and k8s versions to have a bit more confidence in the changes (in addition to automated tests), hope the cases below will be enough, if not let me know and I can test further.
### Tests
Linkerd local build CLI + k8s 1.19+
`install/check/mc-check/mc-install/mc-link/viz-install/viz-check/uninstall/`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
$ bin/linkerd version
Client version: git-b0fd2ec8
Server version: unavailable
$ bin/linkerd install | kubectl apply -f -
- no errors, no version warnings -
$ bin/linkerd check --expected-version git-b0fd2ec8
Status check results are :tick:
# MC
$ bin/linkerd mc install | k apply -f -
- no erros, no version warnings -
$ bin/linkerd mc check
Status check results are :tick:
$ bin/linkerd mc link foo | k apply -f - # test crd creation
# had a validation error here because the schema had targetDomainName instead of targetClusterDomain
# changed, rebuilt cli, re-installed mc, tried command again
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
...
# VIZ
$ bin/linkerd viz install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd viz check
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd uninstall | k delete -f -
- no errors, no version warnings -
```
Linkerd local build CLI + k8s 1.17
`check-pre/install/mc-check/mc-install/mc-link/viz-install/viz-check`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17-rc1+k3s1", GitCommit:"e8c9484078bc59f2cd04f4018b095407758073f5", GitTreeState:"clean", BuildDate:"2021-01-14T06:20:56Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
$ bin/linkerd version
Client version: git-3d2d4df1 # made changes to link-crd after prev test case
Server version: unavailable
$ bin/linkerd check --pre --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd check --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd mc check
- no errors, no version warnings -
Status check results are :tick:
$ bin/linkerd mc link --cluster-name foo | k apply -f -
bin/linkerd mc link --cluster-name foo | k apply -f -
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
# VIZ
$ bin/linkerd viz install | k apply -f -
- no errors, no version warnings -
$ bin/linkerd viz check
- no errors, no version warnings -
- hangs up indefinitely after linkerd-viz can talk to Kubernetes
```
Linkerd edge (21.1.3) CLI + k8s 1.17 (already installed)
`check`
```
$ linkerd version
Client version: edge-21.1.3
Server version: git-3d2d4df1
$ linkerd check
- no errors -
- warnings: mismatch between cli & control plane, control plane not up to date (both expected) -
Status check results are :tick:
```
Linkerd stable (2.9.2) CLI + k8s 1.17 (already installed)
`check/uninstall`
```
$ linkerd version
Client version: stable-2.9.2
Server version: git-3d2d4df1
$ linkerd check
× control plane ClusterRoles exist
missing ClusterRoles: linkerd-linkerd-tap
see https://linkerd.io/checks/#l5d-existence-cr for hints
Status check results are ×
# viz wasn't installed, hence the error, installing viz didn't help since
# the res is named `viz-tap` now
# moving to uninstall
$ linkerd uninstall | k delete -f -
- no warnings, no errors -
```
_Note_: I used `go test ./cli/cmd/... --generate` which is why there are so many changes 😨
Signed-off-by: Matei David <matei.david.35@gmail.com>
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
#5507 added new golden tests but missed some updates from other PRs
that got merged meanwhile.
This branch updates those golden tests with those changes
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* cli: add helm customization flags to core install
Fixes#5506
This branch adds helm way of customization through
`set`, `set-string`, `values`, `set-files` flags for
`linkerd install` cmd along with unit tests.
For this to work, the helm v3 engine rendering helpers
had to be used instead of our own wrapper type.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The `linkerd metrics` command was selecting pods based on owner resource
names. If multiple owners existed with the same name (for example
`sts/web`, `deploy/web`), additional pods would be incorrectly included
in the output.
Fix the pod selector code to validate pods have owner references to the
given workload/owner.
Before:
```
$ linkerd metrics -n emojivoto deploy/web|grep POD
# POD web-0 (1 of 3)
# POD web-d9ffd684f-gnbcx (2 of 3)
# POD web-fs6l7 (3 of 3)
```
After:
```
$ bin/go-run cli metrics -n emojivoto deploy/web|grep POD
# POD web-d9ffd684f-gnbcx (1 of 1)
```
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Introduce OpenAPIV3 validation for CRDs
* Add validation to link crd
* Add validation to sp using kube-gen
* Add openapiv3 under schema fields in specific versions
* Modify fields to rid spec of yaml errors
* Add top level validation for all three CRDs
Signed-off-by: Matei David <matei.david.35@gmail.com>
## What this changes
This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.
If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.
This also removes the environment variable for explicitly disabling tap through
an environment variable. Tap status for a proxy is now determined only be the
presence or absence of the tap service name environment variable.
Closes#5326
## How it changes
### tap-injector
The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added:
- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace
### LINKERD2_PROXY_TAP_DISABLED
Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only be if the
tap-injector adds or does not add the `LINKERD2_PROXY_TAP_SVC_NAME` environment
variable.
### controller image
The tap-injector has been added to the controller image's several startup
commands which determines what it will do in the cluster.
As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Detect default ns for metrics and profile subcommands
Followup to #5485, fixes remaining cases for #5524
Properly detect the default namespace given `kubeConfigPath` and
`kubeContext` for the `metrics`, `identity`, `routes` and `profile` subcommands.
Also gets rid once and for all of the `defaultNamespace` global var.
* viz: make viz cmds available at root
Fixes#5523
This branch makes viz commands that were previously available
under root to be available at both places i.e `linkerd` and
`linkerd viz`.
We also show a depreciated notice when ran under root, asking
to use them with the `viz` prefix.
This also updates all the help messages to address these cmds
as `linkerd viz xyz` instead of `linkerd xyz`
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Separate observability API
Closes#5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
* viz: move sub-cmds using viz extension under viz cmd
Fixes#5327 , #5524
This branch moves the following commands, under the `linkerd viz`
cmd as they use the viz extension to perform the job.
- dashboard
- edges
- routes
- stat
- tap
- top
This also creates a new pkg `public-api` which fecilitates
interaction and communication with public-api to be used
across extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
* viz: add render golden tests
This branch adds golden tests for the viz install. This would be
useful to track changes in render as more changes are added.
This also moves the common code that is used across extensions
to generate diffs into `testutil` to be able to be used widely.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Closes#5401
* offline profile generation with --ignore-cluster
* validation added for ignoreCluster and service profile with tap data
Signed-off-by: Piyush Singariya <piyushsingariya@gmail.com>
The container-image `ghcr.io/linkerd/cni-plugin:stable-2.9.1` does not contain the `kill` command as an executable. Instead, it is available as a shell built-in. In its current state, Kubernetes emits error events whenever linkerd2-cni pods are terminated because the `kill` command can not be found.
Signed-off-by: Mitch Hulscher <mitch.hulscher@lib.io>
Currently, Each new instance of `Checker` type have to manually
set all the fields with the `NewChecker()`, even though most
use-cases are fine with the defaults.
This branch makes this simpler by using the Builder pattern, so
that the users of `Checker` can override the defaults by using
specific field methods when needed. Thus simplifying the code.
This also removes some of the methods that were specific to tests,
and replaces them with the currently used ones.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
CLI: Introduced `identity` command to fetch tls-certificates for a pod (#4459)
Modified and added a new cli command, which initiates a sni-tls session to the proxy's admin port and returns the certificate.
Usage:
- `linkerd identity pod/<pod-name>` : fetches certificate from the specified pod
- `linkerd identity -l app=svc/emoji` : fetches certificate from all pods with label app=svc/emoji
Signed-off-by: Jimil Desai <jimildesai42@gmail.com>
* viz: move some components into linkerd-viz
This branch moves the grafana,prometheus,web, tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.
The components in viz are not injected during install time, and
will go through the injector. The `viz install` does not have any
cli flags to customize the install directly but instead follow the Helm
way of customization by using flags such as
`set`, `set-string`, `values`, `set-files`.
**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.
**Testing**
```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -
# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -
# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```
What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## What
This change moves the `linkerd check --multicluster` functionality under it's
own multicluster subcommand: `linkerd multicluster check`.
There should be no functional changes as a result of this change. `linkerd
check` no longer checks for anything multicluster related and the
`--multicluster` flag has been removed.
## Why
Closes#5208
The bulk of these changes are moving all the multicluster checks from
`pkg/healthcheck` into the multicluster package.
Doing this completely separates it from core Linkerd. It still uses
`pkg/healtcheck` when possible, but anything that is used only by `multicluster
check` has been moved.
**Note the the `kubernetes-api` and `linkerd-existence` checks are run.**
These checks are required for setting up the Linkerd health checker. They set
the health checker's `kubeAPI`, `linkerdConfig`, and `apiClient` fields.
These could be set manually so that the only check the user sees is
`linkerd-multicluster`, but I chose not to do this.
If any of the setting functions errors, it would just tell the user to run
`linkerd check` and ensure the installation is correct. I find the user error
handling to be better by including these required checks since they should be
run in the first place.
## How to test
Installing Linkerd and multicluster should result in a basic check output:
```
$ bin/linkerd install |kubectl apply -f -
..
$ bin/linkerd check
..
$ bin/linkerd multicluster install |kubectl apply -f -
..
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-multicluster
--------------------
√ Link CRD exists
Status check results are √
```
After linking a cluster:
```
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
* k3d-y
√ remote cluster access credentials are valid
* k3d-y
√ clusters share trust anchors
* k3d-y
√ service mirror controller has required permissions
* k3d-y
√ service mirror controllers are running
* k3d-y
× all gateway mirrors are healthy
probe-gateway-k3d-y.linkerd-multicluster mirrored from cluster [k3d-y] has no endpoints
see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints
Status check results are ×
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes#5385 (second bug in there)
Added missing label `linkerd.io/control-plane-ns=linkerd` that all the
control plane resources must have, that is passed to `kubectl apply
--prune`
## Summary
This changes the destination service to start indicating whether a profile is an
opaque protocol or not.
Currently, profiles returned by the destination service are built by chaining
together updates coming from watching Profile and Traffic Split updates.
With this change, we now also watch updates to Opaque Port annotations on pods
and namespaces; if an update occurs this is now included in building a profile
update and is sent to the client.
## Details
Watching updates to Profiles and Traffic Splits is straightforward--we watch
those resources and if an update occurs on one associated to a service we care
about then the update is passed through.
For Opaque Ports this is a little different because it is an annotation on pods
or namespaces. To account for this, we watch the endpoints that we should care
about.
### When host is a Pod IP
When getting the profile for a Pod IP, we check for the opaque ports annotation
on the pod and the pod's namespace. If one is found, we'll indicate if the
profile is an opaque protocol if the requested port is in the annotation.
We do not subscribe for updates to this pod IP. The only update we really care
about is if the pod is deleted and this is already handled by the proxy.
### When host is a Service
When getting the profile for a Service, we subscribe for updates to the
endpoints of that service. For any ports set in the opaque ports annotation on
any of the pods, we check if the requested port is present.
Since the endpoints for a service can be added and removed, we do subscribe for
updates to the endpoints of the service.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes#5385
## The problems
- `linkerd install --ha` isn't honoring flags
- `linkerd upgrade --ha` is overridding existing configs silently or failing with an error
- *Upgrading HA instances from before 2.9 to version 2.9.1 results in configs being overridden silently, or the upgrade fails with an error*
## The cause
The change in #5358 attempted to fix `linkerd install --ha` that was only applying some of the `values-ha.yaml` defaults, by calling `charts.NewValues(true)` and merging that with the values built from `values.yaml` overriden by the flags. It turns out the `charts.NewValues()` implementation was by itself merging against `values.yaml` and as a result any flag was getting overridden by its default.
This also happened when doing `linkerd upgrade --ha` on an existing instance, which could result in silently overriding settings, or it could also fail loudly like for example when upgrading set up that has an external issuer (in this case the issuer cert won't be able to be read during upgrade and an error would occur as described in #5385).
Finally, when doing `linkerd upgrade` (no --ha flag) on an HA install from before 2.9 results in configs getting overridden as well (silently or with an error) because in order to generate the `linkerd-config-overrides` secret, the original install flags are retrieved from `linkerd-config` via the `loadStoredValuesLegacy()` function which then effectively ends up performing a `linkerd upgrade` with all the flags used for `linkerd install` and falls into the same trap as above.
## The fix
In `values.go` the faulting merging logic is not used anymore, so now `NewValues()` only returns the default values from `values.yaml` and doesn't require an argument anymore. It calls `readDefaults()` which now only returns the appropriate values depending on whether we're on HA or not.
There's a new function `MergeHAValues()` that merges `values-ha.yaml` into the current values (it doesn't look into `values.yaml` anymore), which is only used when processing the `--ha` flag in `options.go`.
## How to test
To replicate the issue try setting a custom setting and check it's not applied:
```bash
linkerd install --ha --controller-log level debug | grep log.level
- -log-level=info
```
## Followup
This wasn't caught because we don't have HA integration tests. Now that our test infra is based on k3d, it should be easy to make such a test using a cluster with multiple nodes. Either that or issuing `linkerd install --ha` with additional configs and compare against a golden file.
* jaeger: add check sub command
This adds a new `linkerd jaeger check` command to have checks w.r.t
jaeger extension. This is similar to that of the `linkerd check` cmd.
As jaeger is a separate package, It was a bit complex for this to work
as not all types and fields from healthcheck pkg are public, Helper
funcs were used to mitigate this.
This has the following changes:
- Adds a new `check.go` file under the jaeger extension pkg
- Moves some commonly needed funcs and types from `cli/cmd/check.go`
and `pkg/healthcheck/health.go` into
`pkg/healthcheck/healthcheck_output.go`.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Add a `linkerd jaeger uninstall` command which prints the linkerd-jaeger extension resources so that they can be deleted. This is similar to the `linkerd uninstall` command.
```
> bin/linkerd jaeger uninstall | k delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
clusterrolebinding.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
mutatingwebhookconfiguration.admissionregistration.k8s.io "linkerd-proxy-mutator-webhook-config" deleted
namespace "linkerd-jaeger" deleted
```
Signed-off-by: Alex Leong <alex@buoyant.io>
* upgrades: make webhooks restart if TLS creds are updated
Fixes#5231
Currently, we do not re-use the TLS certs during upgrades, which
means that the secrets are updated while the webhooks are still
paired with the older ones, causing the webhook requests to fail.
This can be solved by making webhooks be restarted whenever there
is a change in the certs. This can be performed by storing the hash
of the `*-rbac` file, which contains the secrets, thus making the
pod templates change whenever there is an update to the certs thus
making restarts required.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When testing the `linkerd2-cni` chart with `ct`, it flags up usage
of some deprecated apiVersions.
This PR aligns the RBAC API group across all resources in the chart.
---
Signed-off-by: Simon Weald <glitchcrab-github@simonweald.com>
* `linkerd install --ha` was only partially applying HA config
Fixes#5342
`values-ha.yml` contains the specific config for HA, but only the proxy
resources controller replicas settings were applied. This PR adds
EnablePodAntiafinity, WebhookFailurePolicy and all the resource settings
for the other CP pods.
Also the `--controller-replicas` flag is moved after the HA flags so it
can override the HA settings.
Finally, some comments no longer relevant were removed.
## How to test
Perform `linkerd install --ha` and make sure the values in
`values-ha.yml` are propagated correctly in the produced yaml.
## 2.9.1
After merging to `main`, this should be cherry-picked into the
`release/stable-2.9` branch.
Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:
* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests
We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension. To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.
Note that for now, only the control plane components send traces; the proxies in the control plane do not. This is because the linkerd-jaeger injector is not yet available. However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.
I tested this by doing the following:
1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#5257
This branch movies mc charts and cli level code to a new
top level directory. None of the logic is changed.
Also, moves some common types into `/pkg` so that they
are accessible both to the main cli and extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Add automatic readme generation for charts
The current readmes for each chart is generated
manually and doesn't contain all the information available.
Utilize helm-docs to automatically fill out readme.mds
for the helm charts by pulling metadata from values.yml.
Fixes#4156
Co-authored-by: GMarkfjard <gabma047@student.liu.se>
* extension: Add new jaeger binary
This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as
`linkerd install` VFS logic expects charts to be present in `/charts`
directory, This command gets its own static pkg to generate its own
VFS for its chart.
This covers only the install part of the command
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#4874
This branch upgrades Helm sdk from v2 to v3 *without any functionaly
changes*, just replacing types with newer API's.
This should not effect our current support for Helm v2 as we did not
change any of the underlying tempaltes(which work with Helm v2). This
works becuase we did not use any of the API's that read the Chart
metadata (which are the only ones changed from v2 to v3) and currently
manually load files and pass ito the sdk.
This PR should provide a great point to start more of the newer Helm v3
API's including for the upgrade workflow thus allowing us to make
Linkerd CLI more simpler.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
CLI crashes if linkerd-config contains unexpected values.
Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.
Add test for linkerd-config data without Global.
Fixes#5215
Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>
As discussed in #5228, it is not correct for root and intermediate
certs to have SAN. This PR updates the check to not verify the
intermediate issuer cert with the identity dns name (which checks with
SAN and not CN as the the `verify` func is used to verify leaf certs and
not root and intermediate certs). This PR also avoids setting a SAN
field when generating certs in the `install` command.
Fixes#5228