Follow-up to #13770 and #13781, based on the branch `alpeb/multicluster-chart-manage-smc-probes`.
Addresses CLI tasks listed in #13768
- References in `multicluster/cmd/install.go` the new templates added in #13770, so that `linkerd mc install` also installs the model that #13770 implemented for Helm only.
- Adds a new `--service-mirror` flag to `linkerd mc link`; when set to `false`, the command only outputs the Link CR and the credentials secret. The default is `true`, so the new model is opt-in (see the sketch after this list).
- Refactors `linkerd mc check` so that it checks for resources from both the old and new models.
- Extends the image-registry override logic to cover the new controllers as well.
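A minimal sketch of opting into the new model; the kubectl contexts and the cluster name are assumptions:
```bash
# Emit only the Link CR and the credentials secret (no per-link service
# mirror deployment), then apply it to the source cluster.
linkerd --context=target multicluster link --cluster-name target --service-mirror=false \
  | kubectl --context=source apply -f -
```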
When the service mirror controller detects a service in the remote cluster which matches the federated service selector (`mirror.linkerd.io/federated=member` by default), it will add that service to the federated service in the local cluster named `<svc>-federated`, creating this service if it does not already exist. To join a service to a federated service, it is added to the `multicluster.linkerd.io/remote-discovery` annotation on the federated service, which contains a comma-separated list of values of the form `<svc>@<cluster>`. When a remote service no longer exists or no longer matches the federated service selector, it is removed from the federated service by removing it from the `multicluster.linkerd.io/remote-discovery` annotation.
We also add a new `local-service-mirror` deployment to the linkerd-multicluster extension, which watches the local cluster for any services that match the federated service selector. Any matching services in the local cluster will be added to the federated service by setting the `multicluster.linkerd.io/local-discovery` annotation on the federated service to the local service name.
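To illustrate the mechanism (the service, cluster, and context names below are assumptions):
```bash
# Mark a service in the remote cluster as a member of a federated service
# (names are illustrative).
kubectl --context=remote label svc/web mirror.linkerd.io/federated=member

# The service mirror controller then joins it to the local `web-federated`
# service, maintaining annotations along the lines of:
#   multicluster.linkerd.io/remote-discovery: web@remote
#   multicluster.linkerd.io/local-discovery: web
```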
Signed-off-by: Alex Leong <alex@buoyant.io>
We add support for the `--output/-o` flag in `linkerd install` and related commands. The supported output formats are yaml (default) and json. Kubectl accepts both of these formats, which means the output can be piped into kubectl regardless of which output format is used.
The full list of install-related commands that gain json support is:
* linkerd install
* linkerd prune
* linkerd upgrade
* linkerd uninstall
* linkerd viz install
* linkerd viz prune
* linkerd viz uninstall
* linkerd multicluster install
* linkerd multicluster prune
* linkerd multicluster uninstall
* linkerd jaeger install
* linkerd jaeger prune
* linkerd jaeger uninstall
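For example, the JSON output can be piped into kubectl exactly like the default yaml output:
```bash
# Render the core install as JSON and apply it directly.
linkerd install -o json | kubectl apply -f -
```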
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #9364
Since probes are automatically authorized, Linkerd extensions no longer need admin Server resources in order for probes to be authorized. We therefore remove them.
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes #9676
This adds the `pod-security.kubernetes.io/enforce` label as described in [Pod Security Admission labels for namespaces](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces).
PSA gives us three different possible values (policies or modes): [privileged, baseline and restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/).
For non-CNI mode, the proxy-init container relies on being granted the NET_RAW and NET_ADMIN capabilities, which means those pods can only run under the `privileged` policy. For CNI mode, on the other hand, we can enforce the `restricted` policy by setting some defaults on the containers' `securityContext`, as done in this PR.
Also note this change adds the `cniEnabled` entry to the `values.yaml` file of all the extension charts, which determines which policy to use.
Final note: this includes the fix from #9717; otherwise an empty gateway UID prevents the pod from being created under the `restricted` policy.
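For reference, a rough manual equivalent of the enforce label being added here; the namespace name is an assumption, and the charts derive the policy level from `cniEnabled`:
```bash
# Enforce the restricted Pod Security Standard on an extension namespace
# (namespace name is illustrative).
kubectl label namespace linkerd-multicluster pod-security.kubernetes.io/enforce=restricted --overwrite
```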
## How to test
As this is only enforced as of k8s 1.25, here are the instructions to run 1.25 with k3d using Calico as CNI:
```bash
# launch k3d with k8s v1.25, with no flannel CNI
$ k3d cluster create --image='+v1.25' --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0'
# install Calico
$ k apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml
# load all the images
$ bin/image-load --k3d proxy controller policy-controller web metrics-api tap cni-plugin jaeger-webhook
# install linkerd-cni
$ bin/go-run cli install-cni|k apply -f -
# install linkerd-crds
$ bin/go-run cli install --crds|k apply -f -
# install linkerd-control-plane in CNI mode
$ bin/go-run cli install --linkerd-cni-enabled|k apply -f -
# Pods should come up without issues. You can also try the viz and jaeger extensions.
# Try removing one of the securityContext entries added in this PR, and the Pod
# won't come up. You should be able to see the PodSecurity error in the associated
# ReplicaSet.
```
To test the multicluster extension using CNI, check this [gist](https://gist.github.com/alpeb/4cbbd5ad87538b9e0d39a29b4e3f02eb) with a patch to run the multicluster integration test with CNI in k8s 1.25.
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.
This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
[el]: https://github.com/polyfloyd/go-errorlint
Signed-off-by: Oliver Gould <ver@buoyant.io>
Fixes #6584 #6620 #7405
# Namespace Removal
With this change, the `namespace.yaml` template is rendered only for CLI installs, not for Helm, and likewise for the `namespace:` entry in the namespaced objects (via a new `partials.namespace` helper).
The `installNamespace` and `namespace` entries in `values.yaml` have been removed.
In the templates where the namespace is required, we moved from `.Values.namespace` to `.Release.Namespace`, which Helm fills in automatically. For the CLI, `install.go` now explicitly defines the contents of the `Release` map alongside `Values`.
The proxy-injector has a new `linkerd-namespace` argument since the namespace is no longer persisted in the `linkerd-config` ConfigMap and therefore has to be passed in. To pass it further down to `injector.Inject()` without modifying the `Handler` signature, a closure was used.
------------
Update: Merged-in #6638: Similar changes for the `linkerd-viz` chart:
Stop rendering `namespace.yaml` in the `linkerd-viz` chart.
The additional change here is the addition of the `namespace-metadata.yaml` template (and its RBAC), _not_ rendered in CLI installs, which is a Helm `post-install` hook consisting of a Job that adds the required annotations and labels to the viz namespace using a PATCH request against kube-api. The script first checks whether the namespace is missing the `annotations`/`labels` entries altogether, in which case it adds extra ops to the patch to create them.
---------
Update: Merged-in the approved #6643, #6665 and #6669 which address the `linkerd2-cni`, `linkerd-multicluster` and `linkerd-jaeger` charts.
Additional changes from what's already mentioned above:
- Removes the install-namespace option from `linkerd install-cni`, which isn't present in `linkerd install` or `linkerd viz install` anyway, and would add some complexity to support.
- Added a dependency on the `partials` chart to the `linkerd-multicluster-link` chart, so that we can make use of the `partials.namespace` helper.
- We no longer have the restriction that the multicluster objects must live in a separate namespace from linkerd. It's still good practice, and it remains the default for the CLI install, but I removed that validation.
Finally, as a side-effect, the `linkerd mc allow` subcommand was fixed; it had apparently been broken for a while:
```console
$ linkerd mc allow --service-account-name foobar
Error: template: linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml:16:7: executing "linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml" at <include "partials.annotations.created-by" $>: error calling include: template: no template "partials.annotations.created-by" associated with template "gotpl"
```
---------
Update: see helm/helm#5465, which describes the current best practice.
# Core Helm Charts Split
This removes the `linkerd2` chart and replaces it with the `linkerd-crds` and `linkerd-control-plane` charts. Note that the viz and other extension charts are not affected by this change.
Also note the original `values.yaml` file has been split between the two charts accordingly.
### UX
```console
$ helm install linkerd-crds --namespace linkerd --create-namespace linkerd/linkerd-crds
...
# certs.yaml should contain identityTrustAnchorsPEM and the identity issuer values
$ helm install linkerd-control-plane --namespace linkerd -f certs.yaml linkerd/linkerd-control-plane
```
### Upgrade
As explained in #6635, this is a breaking change. Users will have to uninstall the `linkerd2` chart and install these two, and eventually roll out the proxies (they should continue to work during the transition anyway).
### CLI
The CLI install/upgrade code was updated to pick the templates from these new charts, but the CLI UX remains the same as before.
### Other changes
- The `linkerd-crds` and `linkerd-control-plane` charts now carry a version scheme independent of linkerd's own versioning, as explained in #7405.
- These charts are Helm v3, which is reflected in the `Chart.yaml` entries and in the removal of the `requirements.yaml` files.
- In the integration tests, the `helm-chart` arg was replaced with `helm-charts`, which contains the path `./charts` and is used to build the paths for both charts.
### Followups
- Now it's possible to add a `ServiceProfile` instance for Destination in the `linkerd-control-plane` chart.
* Add high availability mode to multicluster gateway
Currently the multicluster components cannot be deployed in a high-availability mode, which makes them brittle given the dynamic nature of Kubernetes resources.
This change adds an `--ha` flag to the `linkerd multicluster install` command, which configures the charts to add a PodDisruptionBudget and to set multiple replicas and anti-affinity on the gateway deployment.
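A quick sketch of the new flag in use:
```bash
# Install the multicluster components in high-availability mode.
linkerd multicluster install --ha | kubectl apply -f -
```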
Signed-off-by: Crevil <bjoern.soerensen@gmail.com>
Currently there are no tests of the output of the multicluster install command, unlike the install and viz install commands. This makes changes error-prone and hard to validate. This work is motivated by the addition of an HA mode to the multicluster components discussed in #7082.
This change adds two test cases and refactors the install command to look like viz install, making it easily testable. In practice this means that the body of the command is moved into an install function. Here we extract external data, e.g. values, and delegate the values to a render function that handles the actual rendering.
This is a non-functional change, and the output used for the install_default.golden file is based on the main branch to validate this.
Signed-off-by: Crevil <bjoern.soerensen@gmail.com>
Fixes #7030
The `service-mirror` Server selects all service mirror pods from all links because it uses the label selector `linkerd.io/control-plane-component: linkerd-service-mirror`. Thus it is not necessary to create a Server for each Link, and doing so results in duplicate policy resources when you have multiple Links.
We move these policy resources into the base multicluster chart so that they are installed as part of `linkerd multicluster install` instead of `linkerd multicluster link`. These policies will apply to all links.
Signed-off-by: Alex Leong <alex@buoyant.io>
Ref #6813
This adds the necessary Server and ServerAuthorization resources to the
viz, multicluster and jaeger extensions, for them to properly work when
using a default-deny policy (installing linkerd with `--set
policyController.defaultAllowPolicy=deny`).
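For context, this is the scenario these resources target, i.e. a default-deny install of the core control plane (the same `--set` value quoted above):
```bash
# Install linkerd with a default-deny policy; the Server/ServerAuthorization
# resources added here are what let the extensions keep working in this mode.
linkerd install --set policyController.defaultAllowPolicy=deny | kubectl apply -f -
```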
The added resources include the policy for the admin servers (for k8s liveness and readiness probes), which requires granting access to all unauthenticated clients.
When a component shares its main service port with its admin server port (e.g. Grafana and Prometheus), this means we can't properly lock down access to the main service, unfortunately.
Also note that traffic coming from the kube-api (for the tap api-server and the webhooks: tap-injector, jaeger-injector) requires leaving those ports wide open as well.
The multicluster gateway has a policy that only allows traffic into the `linkerd-proxy` port from meshed identities. The source cluster also hits the gateway on the probe port, but the proxy's `linkerd-admin` port doesn't support policy at the moment.
Other changes:
- Added missing `containerPort` entry in jaeger's `tracing.yaml`
template.
- Added policy for smoke-test-terminus in the install integration tests, which will serve for the default-deny integration test that will follow.
* Add missing psp for extensions
This change fixes an issue where the `viz`, `jaeger` and `multicluster` extensions did not have `podsecuritypolicy` Roles. This prevented the extensions from being installed on a cluster that has pod security enabled.
Fixes #6122
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Add nodePorts option and group gateway options
For LoadBalancer and NodePort type services it should be possible to set a specific node port, but this wasn't possible yet. Also, the gateway options were not grouped nicely.
This commit adds nodePorts to the multicluster helm chart. It also groups all gateway options under the gateway group.
Note: this is a breaking change.
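A sketch of setting the regrouped gateway values at install time; the exact key names under `gateway` are assumptions, so check the chart's `values.yaml` for the authoritative ones:
```bash
# Pin the gateway's node ports (key names are assumed for illustration).
linkerd multicluster install \
  --set gateway.nodePort=30080 \
  --set gateway.probe.nodePort=30081 | kubectl apply -f -
```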
Signed-off-by: Peter Smit <peter.smit@inscripta.io>
* Remove nginx container from mc gateway pod
Fixes #5444
Replaced the nginx container with a pause container, and removed all the config boilerplate associated with nginx.
The probe now targets the proxy's readiness endpoint, port 4191, path `/ready`.
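A quick way to eyeball the endpoint the probe now targets; the namespace and deployment name below are assumptions:
```bash
# Port-forward the gateway proxy's admin port and hit the readiness endpoint
# (namespace and deployment name are illustrative).
kubectl -n linkerd-multicluster port-forward deploy/linkerd-gateway 4191:4191 &
curl -s http://localhost:4191/ready
```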
## Upgrade note
When multicluster gets updated in both the source and target clusters this won't cause downtime, because this change only affects the probing. The `Links` will have to be re-generated, of course. Until that is done, `linkerd mc gateways` will show the target cluster as not alive, but that shouldn't affect existing connections.
* Move CP check after the readiness check
Moved the `can initialize client` and `can query the control plane API` checks from the `linkerd-existence` section to the `linkerd-api` section, because they require the `linkerd-controller` pod to not just be "Running" but to actually be ready.
This was causing `linkerd check` to show some port-forwarding warnings when run right after install.
This also allowed getting rid of the `CheckPublicAPIClientOrExit` function and directly using `CheckPublicAPIClientOrRetryOrExit` (better naming punted for later), which was refactored so it always runs the `linkerd-api` checks before retrieving the client.
Other changes:
- Temporarily disabled the `upgrade-edge` test because the latest edge has this readiness check issue
- Have the upgrade tests do proper pruning (stolen from @Pothulapati's #5673 😉)
- Added missing label to the tap SA (fixes #5850)
- Complete tap-injector Service selector
Fixes #5574 and supersedes #5660
- Removed the "do not edit" entries for annotation/label names from all the `values.yaml` files, hard-coding them in the templates instead.
- The `values.go` files got simplified as a result.
- The `created-by` annotation was also refactored into a reusable partial. This means we had to add a `partials` dependency to multicluster.
* cli: make jaeger and multicluster installs wait for core cp
This PR updates the jaeger and multicluster installs to wait
for the core control-plane to be up before moving to the rendering
logic. This prevents these components from being installed before
the injector is up and running correctly.
`--skip-checks` has been added to jaeger to skip these checks. The same has not been added to `multicluster`, as that install fails anyway when no core control plane is present.
This PR also cleans up the extra core control-plane check that we had for `viz install`.
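A minimal sketch of the new escape hatch for the jaeger extension:
```bash
# Skip the core control-plane checks before rendering the jaeger manifests.
linkerd jaeger install --skip-checks | kubectl apply -f -
```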
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* values: removal of .global field
Fixes #5425
With the new extension model, we no longer need the `Global` field as we don't rely on chart dependencies anymore. This helps us further clean up Values and makes configuration simpler.
To make upgrades and the use of the new CLI with older configs work, we add a new method called `config.RemoveGlobalFieldIfPresent`, used in the upgrade and `FetchCurrentConfiguration` paths, which removes the global field and re-attaches its child nodes if it is present. This is verified by `TestFetchCurrentConfiguration`'s older test case that has the global field.
We also don't yet remove `.global` in some Helm stable-upgrade tests, so that the initial install works.
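To illustrate the effect on configuration, values that used to live under `.global` are now set at the top level (the key below is just one example of such a value):
```bash
# Previously this would have been --set global.clusterDomain=cluster.local
linkerd install --set clusterDomain=cluster.local | kubectl apply -f -
```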
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
For consistency we rename the extension charts to a common naming scheme:
linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link
We also make the chart files and chart readmes a bit more uniform.
Signed-off-by: Alex Leong <alex@buoyant.io>
* multicluster: add helm customization flags
This branch updates the multicluster install flow to use the
helm engine directly instead of our own chart wrapper. This
also adds the helm customization flags.
```bash
tarun in dev in on k3d-deep (default) linkerd2 on tarun/mc-helm-flags [$+?] via v1.15.4
./bin/go-run cli mc install --set namespace=l5d-mc | grep l5d-mc
github.com/linkerd/linkerd2/multicluster/cmd
github.com/linkerd/linkerd2/cli/cmd
name: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
mirror.linkerd.io/gateway-identity: linkerd-gateway.l5d-mc.serviceaccount.identity.linkerd.cluster.local
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
namespace: l5d-mc
```
* add customization flags even for link cmd
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
As #5307 and #5293 went in during the same time frame, some of the logic added in #5307 got lost during the merge (oops, sorry!).
The same logic has been added back. The MC refactor PR #5293 moved all the logic from `multicluster.go` into cmd-specific files, so the changes that #5307 made there were lost, while the changes added in `multicluster/values.go` and the template files still remained.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes #5257
This branch moves the mc charts and CLI-level code to a new top-level directory. None of the logic is changed.
Also, it moves some common types into `/pkg` so that they are accessible to both the main CLI and extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>