We want to begin to add support for the GRPCRoute resource type
to the policy controller. The GRPCRoute resource was introduced in
gateway-api v0.7.
In preparation for this upcoming work, this commit updates our Gateway
API CRD dependencies to 0.7.1, including the experimental GRPCRoute
resource type.
For consistency, we should export our CRD templates in the same way we
export normal control plane templates and extensions.
Signed-off-by: Matei David <matei@buoyant.io>
The `errMsgCannotInitializeClient` format string did not include a
trailing newline, unlike other error messages. This caused the user's
prompt to appear immediately after the error without a newline.
Add a trailing newline to `errMsgCannotInitializeClient`.
Before:
```
$ linkerd install --context not-a-cluster
Unable to install the Linkerd control plane. Cannot connect to the Kubernetes cluster:
error configuring Kubernetes API client: context "not-a-cluster" does not exist
You can use the --ignore-cluster flag if you just want to generate the installation config.$
```
After:
```
$ bin/go-run cli install --context not-a-cluster
Unable to install the Linkerd control plane. Cannot connect to the Kubernetes cluster:
error configuring Kubernetes API client: context "not-a-cluster" does not exist
You can use the --ignore-cluster flag if you just want to generate the installation config.
$
```
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
To support mesh expansion, the control plane needs to read configuration
associated with an external instance (i.e. a VM) for the purpose of
service and inbound authorization policy discovery.
This change introduces a new CRD that supports the required
configuration options. The resource supports:
* a list of workload IPs (with a generic format to support ipv4 now and ipv6
in the future)
* a set of mesh TLS settings (SNI and identity)
* a set of ports exposed by the workload
* a set of status conditions
---------
Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
Updates the policy-controller to watch `httproute.gateway.networking.k8s.io` resources in addition to watching `httproute.policy.linkerd.io` resources. Routes of either or both types can be returned in policy responses and will be appropriately identified by the `group` field on their metadata. Furthermore we update the Status of these resources to correctly reflect when they are accepted.
We add the `httproute.gateway.networking.k8s.io` CRD to the Linkerd installed CRD list and add the appropriate RBAC to the policy controller so that it may watch these resources.
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes#9676
This adds the `pod-security.kubernetes.io/enforce` label as described in [Pod Security Admission labels for namespaces](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces).
PSA gives us three different possible values (policies or modes): [privileged, baseline and restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/).
For non-CNI mode, the proxy-init container relies on granting the NET_RAW and NET_ADMIN capabilities, which places those pods under the `restricted` policy. OTOH for CNI mode we can enforce the `restricted` policy, by setting some defaults on the containers' `securityContext` as done in this PR.
Also note this change also adds the `cniEnabled` entry in the `values.yaml` file for all the extension charts, which determines what policy to use.
Final note: this includes the fix from #9717, otherwise an empty gateway UID prevents the pod to be created under the `restricted` policy.
## How to test
As this is only enforced as of k8s 1.25, here are the instructions to run 1.25 with k3d using Calico as CNI:
```bash
# launch k3d with k8s v1.25, with no flannel CI
$ k3d cluster create --image='+v1.25' --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0'
# install Calico
$ k apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml
# load all the images
$ bin/image-load --k3d proxy controller policy-controller web metrics-api tap cni-plugin jaeger-webhook
# install linkerd-cni
$ bin/go-run cli install-cni|k apply -f -
# install linkerd-crds
$ bin/go-run cli install --crds|k apply -f -
# install linkerd-control-plane in CNI mode
$ bin/go-run cli install --linkerd-cni-enabled|k apply -f -
# Pods should come up without issues. You can also try the viz and jaeger extensions.
# Try removing one of the securityContext entries added in this PR, and the Pod
# won't come up. You should be able to see the PodSecurity error in the associated
# ReplicaSet.
```
To test the multicluster extension using CNI, check this [gist](https://gist.github.com/alpeb/4cbbd5ad87538b9e0d39a29b4e3f02eb) with a patch to run the multicluster integration test with CNI in k8s 1.25.
When CNI plugins run in ebpf mode, they may rewrite the packet
destination when doing socket-level load balancing (i.e in the
`connect()` call). In these cases, skipping `443` on the outbound side
for control plane components becomes redundant; the packet is re-written
to target the actual Kubernetes API Server backend (which typically
listens on port `6443`, but may be overridden when the cluster is
created).
This change adds port `6443` to the list of skipped ports for control
plane components. On the linkerd-cni plugin side, the ports are
non-configurable. Whenever a pod with the control plane component label
is handled by the plugin, we look-up the `kubernetes` service in the
default namespace and append the port values (of both ClusterIP and
backend) to the list.
On the initContainer side, we make this value configurable in Helm and
provide a sensible default (`443,6443`). Users may override this value
if the ports do not correspond to what they have in their cluster. In
the CLI, if no override is given, we look-up the service in the same way
that we do for linkerd-cni; if failures are encountered we fallback to
the default list of ports from the values file.
Closes#9817
Signed-off-by: Matei David <matei@buoyant.io>
Add PodMonitor resources to the Helm chart
With an external Prometheus setup installed using prometheus-operator the Prometheus instance scraping can be configured using Service/PodMonitor resources.
By adding PodMonitor resource into Linkerd Helm chart we can mimic the configuration of bundled Prometheus, see https://github.com/linkerd/linkerd2/blob/main/viz/charts/linkerd-viz/templates/prometheus.yaml#L47-L151, that comes with linkerd-viz extension. The PodMonitor resources are based on https://github.com/linkerd/website/issues/853#issuecomment-913234295 which are proven to be working. The only problem we face is that bundled Grafana charts will need to look at different jobs when querying metrics.
When enabled by `podMonitor.enabled` value in the Helm chart, PodMonitor for Linkerd resources should be installed alongside the Linkerd and Linkerd metrics should be present in the Prometheus.
Fixes#6596
Signed-off-by: Martin Odstrcilik <martin.odstrcilik@gmail.com>
As discussed in #8944, Linkerd's current use of the
`gateway.networking.k8s.io` `HTTPRoute` CRD is not a spec-compliant use
of the Gateway API, because we don't support some "core" features of the
Gateway API that don't make sense in Linkerd's use-case. Therefore,
we've chosen to replace the `gateway.networking.k8s.io` `HTTPRoute` CRD
with our own `HTTPRoute` CRD in the `policy.linkerd.io` API group, which
removes the unsupported features.
PR #8949 added the Linkerd versions of those CRDs, but did not remove
support for the Gateway API CRDs. This branch removes the Gateway API
CRDs from the policy controller and `linkerd install`/Helm charts.
The various helper functions for converting the Gateway API resource
binding types from `k8s-gateway-api` to the policy controller's internal
representation is kept in place, but the actual use of that code in the
indexer is disabled. This way, we can add support for the Gateway API
CRDs again easily. Similarly, I've kept the validation code for Gateway
API types in the policy admission controller, but the admission
controller no longer actually tries to validate those resources.
Depends on #8949Closes#8944
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Our use of the `gateway.networking.k8s.io` types is not compliant with
the gateway API spec in at least a few ways:
1. We do not support the `Gateway` types. This is considered a "core"
feature of the `HTTPRoute` type.
2. We do not currently update `HTTPRoute` status fields as dictated by
the spec.
3. Our use of Linkerd-specific `parentRef` types may not work well with
the gateway project's admission controller (untested).
Issue #8944 proposes solving this by replacing our use of
`gateway.networking.k8s.io`'s `HTTPRoute` type with our own
`policy.linkerd.io` version of the same type. That issue suggests that
the new `policy.linkerd.io` types be added separately from the change
that removes support for the `gateway.networking.k8s.io` versions, so
that the migration can be done incrementally.
This branch does the following:
* Add new `HTTPRoute` CRDs. These are based on the
`gateway.networking.k8s.io` CRDs, with the following changes:
- The group is `policy.linkerd.io`,
- The API version is `v1alpha1`,
- `backendRefs` fields are removed, as Linkerd does not support them,
- filter types Linkerd does not support (`RequestMirror` and
`ExtensionRef`), are removed.
* Add Rust bindings for the new `policy.linkerd.io` versions of
`HTTPRoute` types in `linkerd-policy-controller-k8s-api`.
The Rust bindings define their own versions of the `HttpRoute`,
`HttpRouteRule`, and `HttpRouteFilter` types, because these types'
structures are changed from the Gateway API versions (due to the
removal of unsupported filter types and fields). For other types,
which are identical to the upstream Gateway API versions (such as the
various match types and filter types), we re-export the existing
bindings from the `k8s-gateway-api`crate to minimize duplication.
* Add conversions to `InboundRouteBinding` from the `policy.linkerd.io`
`HTTPRoute` types.
When possible, I tried to factor out the code that was shared between
the conversions for Linkerd's `HTTPRoute` types and the upstream
Gateway API versions.
* Implement `kubert`'s `IndexNamespacedResource` trait for
`linkerd_policy_controller_k8s_api::policy::HttpRoute`, so that the
policy controller can index both versions of the `HTTPRoute` CRD.
* Adds validation for `policy.linkerd.io` `HTTPRoute`s to the policy
controller's validating admission webhook.
* Updated the policy controller tests to test both versions of
`HTTPRoute`.
## Notes
A couple questions I had about this approach:
- Is re-using bindings from the `k8s-gateway-api` crate appropriate
here, when the type has not changed from the Gateway API version? If
not, I can change this PR to vendor those types as well, but it will
result in a lot more code duplication.
- Right now, the indexer stores all `HTTPRoute`s in the same index.
This means that applying a `policy.linkerd.io` version of `HTTPRoute`
and then applying the Gateway API version with the same ns/name will
update the same value in the index. Is this what we want? I wasn't
entirely sure...
See #8944.
Fixes#8660
We add the HttpRoute CRD to the CRDs installed with `linkerd install --crds` and `linkerd upgrade --crds`. You can use the `--set installHttpRoute=false` to skip installing this CRD.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#8373
We update `CheckCustomResourceDefinitions` so that it not only checks for the existence of the CRDs, but also ensures that they contain the latest version of each CRD. Note that this means that we'll need to keep this list of CRD versions in `CheckCustomResourceDefinitions` in sync with the actual CRD versions in the templates. We also add this check to `linkerd upgrade` when the `--crds` flag is not provided. This means that users who are upgrading will be required to run `linkerd upgrade --crds` first if they don't have the latest version of any of the CRDs.
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
This change follows on 4f3c374, which split the install logic for CRDs
and the core control plane, by splitting the upgrade logic for the CRDs
and the core control plane.
Signed-off-by: Oliver Gould <ver@buoyant.io>
We currently have singular `install` and `render` functions, each of
which takes a `crds` bool that completely alters the behavior of the
function. This change splits this behavior into distinct functions so
we have `installCRDs`/`renderCRDs` and `installControlPlane`/
`renderControlPlane`.
Signed-off-by: Oliver Gould <ver@buoyant.io>
Fixes#8364
When `linkerd install` is called with the `--ignore-cluster`, we pass `nil` for the `k8sAPI`. This causes a panic when using this client for validation. We add a conditional so that we skip this validation when the `k8sAPI` is `nil`.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes: #8173
In order to support having custom resources in the default Linkerd installation, it is necessary to add a separate install step to install CRDs before the core install. The Linkerd Helm charts already accomplish this by having CRDs in a separate chart.
We add this functionality to the CLI by adding a `--crds` flag to `linkerd install` and `linkerd upgrade` which outputs manifests for the CRDs only and remove the CRD manifests when the `--crds` flag is not set. To avoid a compounding of complexity, we remove the `config` and `control-plane` stages from install/upgrade. The effect of this is that we drop support for splitting up an install by privilege level (cluster admin vs Linkerd admin).
The Linkerd install flow is now always a 2-step process where `linkerd install --crds` must be run first to install CRDs only and then `linkerd install` is run to install everything else. This more closely aligns the CLI install flow with the Helm install flow where the CRDs are a separate chart. Attempting to run `linkerd install` before the CRDs are installed will result in a helpful error message.
Similarly, upgrade is also a 2-step process of `linkerd upgrade --crds` follow by `linkerd upgrade`.
Signed-off-by: Alex Leong <alex@buoyant.io>
Issue #7709 proposes new Custom Resource types to support generalized
authorization policies:
- `AuthorizationPolicy`
- `MeshTLSAuthentication`
- `NetworkAuthentication`
This change introduces these CRDs to the default linkerd installation
(via the `linkerd-crds` chart) and updates the policy controller's
to handle these resource types. The policy admission controller
validates that these resource reference only suppported types.
This new functionality is tested at multiple levels:
* `linkerd-policy-controller-k8s-index` includes unit tests for the
indexer to test how events update the index;
* `linkerd-policy-test` includes integration tests that run in-cluster
to validate that the gRPC API updates as resources are manipulated;
* `linkerd-policy-test` includes integration tests that exercise the
admission controller's resource validation; and
* `linkerd-policy-test` includes integration tests that ensure that
proxies honor authorization resources.
This change does NOT update Linkerd's control plane and extensions to
use these new authorization primitives. Furthermore, the `linkerd` CLI
does not yet support inspecting these new resource types. These
enhancements will be made in followup changes.
Signed-off-by: Oliver Gould <ver@buoyant.io>
In preparation for new policy CRD resources, this change adds end-to-end
tests to validate policy enforcement for `ServerAuthorization`
resources.
In adding these tests, it became clear that the OpenAPI validation for
`ServerAuthorization` resources is too strict. Various `oneof`
constraints have been removed in favor of admission controller
validation. These changes are semantically compatible and do not
necessitate an API version change.
The end-to-end tests work by creating `curl` pods that call an `nginx`
pod. In order to test network policies, the `curl` pod may be created
before the nginx pod, in which case an init container blocks execution
until a `curl-lock` configmap is deleted from the cluster. If the
configmap is not present to begin with, no blocking occurs.
Signed-off-by: Oliver Gould <ver@buoyant.io>
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.
This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
[el]: https://github.com/polyfloyd/go-errorlint
Signed-off-by: Oliver Gould <ver@buoyant.io>
Currently both the `Server` and `ServerAuthorization` CRDs are defined
in a single file. As additional CRDs are introduced, this becomes
unwieldy to navigate.
This change splits `policy-crd.yaml` into `policy/sever.yaml` and
`policy/serverauthorization.yaml`. It also renames
`serviceprofile-crd.yaml` to `serviceprofile.yaml` (since it's already
under the `linkerd-crds` chart).
No functional changes.
Signed-off-by: Oliver Gould <ver@buoyant.io>
When installing Linkerd on a cluster with the Docker container runtime, `proxyInit.runAsRoot` but be set to `true` in order for Linkerd to operate. This is checked two different ways: `linkerd check --pre` and `linkerd check`.
#7457 discussed if it's better to emit this as a warning or error, but after some further discussion it makes more sense as a `linkerd install` runtime error so that a user cannot miss this configuration.
It still remains as part of `linkerd check` in case more nodes are added that do not satisfy this condition, or Linkerd is installed through Helm.
```sh
$ linkerd install
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
$ linkerd install --set proxyInit.runAsRoot=false
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
$ linkerd install --set proxyInit.runAsRoot=""
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
$ linkerd install --set proxyInit.runAsRoot=true
...
$ linkerd install --set proxyInit.runAsRoot=1
...
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Fixes#6584#6620#7405
# Namespace Removal
With this change, the `namespace.yaml` template is rendered only for CLI installs and not Helm, and likewise the `namespace:` entry in the namespace-level objects (using a new `partials.namespace` helper).
The `installNamespace` and `namespace` entries in `values.yaml` have been removed.
There in the templates where the namespace is required, we moved from `.Values.namespace` to `.Release.Namespace` which is filled-in automatically by Helm. For the CLI, `install.go` now explicitly defines the contents of the `Release` map alongside `Values`.
The proxy-injector has a new `linkerd-namespace` argument given the namespace is no longer persisted in the `linkerd-config` ConfigMap, so it has to be passed in. To pass it further down to `injector.Inject()` without modifying the `Handler` signature, a closure was used.
------------
Update: Merged-in #6638: Similar changes for the `linkerd-viz` chart:
Stop rendering `namespace.yaml` in the `linkerd-viz` chart.
The additional change here is the addition of the `namespace-metadata.yaml` template (and its RBAC), _not_ rendered in CLI installs, which is a Helm `post-install` hook, consisting on a Job that executes a script adding the required annotations and labels to the viz namespace using a PATCH request against kube-api. The script first checks if the namespace doesn't already have an annotations/labels entries, in which case it has to add extra ops in that patch.
---------
Update: Merged-in the approved #6643, #6665 and #6669 which address the `linkerd2-cni`, `linkerd-multicluster` and `linkerd-jaeger` charts.
Additional changes from what's already mentioned above:
- Removes the install-namespace option from `linkerd install-cni`, which isn't found in `linkerd install` nor `linkerd viz install` anyways, and it would add some complexity to support.
- Added a dependency on the `partials` chart to the `linkerd-multicluster-link` chart, so that we can tap on the `partials.namespace` helper.
- We don't have any more the restriction on having the muticluster objects live in a separate namespace than linkerd. It's still good practice, and that's the default for the CLI install, but I removed that validation.
Finally, as a side-effect, the `linkerd mc allow` subcommand was fixed; it has been broken for a while apparently:
```console
$ linkerd mc allow --service-account-name foobar
Error: template: linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml:16:7: executing "linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml" at <include "partials.annotations.created-by" $>: error calling include: template: no template "partials.annotations.created-by" associated with template "gotpl"
```
---------
Update: see helm/helm#5465 describing the current best-practice
# Core Helm Charts Split
This removes the `linkerd2` chart, and replaces it with the `linkerd-crds` and `linkerd-control-plane` charts. Note that the viz and other extension charts are not concerned by this change.
Also note the original `values.yaml` file has been split into both charts accordingly.
### UX
```console
$ helm install linkerd-crds --namespace linkerd --create-namespace linkerd/linkerd-crds
...
# certs.yaml should contain identityTrustAnchorsPEM and the identity issuer values
$ helm install linkerd-control-plane --namespace linkerd -f certs.yaml linkerd/linkerd-control-plane
```
### Upgrade
As explained in #6635, this is a breaking change. Users will have to uninstall the `linkerd2` chart and install these two, and eventually rollout the proxies (they should continue to work during the transition anyway).
### CLI
The CLI install/upgrade code was updated to be able to pick the templates from these new charts, but the CLI UX remains identical as before.
### Other changes
- The `linkerd-crds` and `linkerd-control-plane` charts now carry a version scheme independent of linkerd's own versioning, as explained in #7405.
- These charts are Helm v3, which is reflected in the `Chart.yaml` entries and in the removal of the `requirements.yaml` files.
- In the integration tests, replaced the `helm-chart` arg with `helm-charts` containing the path `./charts`, used to build the paths for both charts.
### Followups
- Now it's possible to add a `ServiceProfile` instance for Destination in the `linkerd-control-plane` chart.
Now, that SMI functionality is fully being moved into the
[linkerd-smi](www.github.com/linkerd/linkerd-smi) extension, we can
stop supporting its functionality by default.
This means that the `destination` component will stop reacting
to the `TrafficSplit` objects. When `linkerd-smi` is installed,
It does the conversion of `TrafficSplit` objects to `ServiceProfiles`
that destination components can understand, and will react accordingly.
Also, Whenever a `ServiceProfile` with traffic splitting is associated
with a service, the same information (i.e splits and weights) is also
surfaced through the `UI` (in the new `services` tab) and the `viz cmd`.
So, We are not really loosing any UI functionality here.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
We've implemented new CRDs to support configuration and discovery of
inbound policies. This change imports these CRDs from
linkerd/polixy@25af9b5 and adds them to the default Linkerd installation:
- `servers.policy.linkerd.io`
- `serverauthorizations.policy.linkerd.io`
Fixes#6452
We add a `linkerd-identity-trust-roots` ConfigMap which contains the configured trust root bundle. The proxy template partial is modified so that core control plane components load this bundle from the configmap through the downward API.
The identity controller is updated to mount this new configmap as a volume read the trust root bundle at startup.
Similarly, the proxy-injector also mounts this new configmap. For each pod it injects, it reads the trust root bundle file and sets it on the injected pod.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Remove the `linkerd-controller` pod
Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.
## Removals
- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put that under `/pkg/prometheus/test_helper.go`, which is now by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.
## Health checks
- Removed the `can initialize the client` check given such client is no longer needed. The `linkerd-api` section was left with only the check `control pods are ready`, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.
## Other changes
- Fixed the Control Plane section of the dashboard, taking account the disappearance of `linkerd-controller` and previously, of `linkerd-sp-validator`.
* Move `sp-validator` into the `destination` pod
Fixes#5195
The webhook secrets remain the same, as do the `profileValidator` settings in `values.yaml`, so this doesn't pose any upgrading challenges.
* Schedule heartbeat 10 mins after install
... for the Helm installation method, thus aligning it with the CLI
installation method, to reduce the midnight peak on the receiving end.
The logic added into the chart is now reused by the CLI as well.
Also, set `concurrencyPolicy=Replace` so that when a job fails and it's
retried, the retries get canceled when the next scheduled job is triggered.
Finally, the go client only failed when the connection failed;
successful connections with a non 200 response status were considered
successful and thus the job wasn't retried. Fixed that as well.
* cli: update err msg when control-plane already exists.
Fixes#5889
Currently, The error msg that is sent when control-plane already
exists suggests to use `--ignore-cluster`. We should instead
suggest users to use `linkerd upgrade`.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#5782
`linkerd install` was checking for the existence of the
`linkerd-config-overrides` secret which hasn't been available till
recent versions. Changed this to check for the usual `linkerd-config`
ConfigMap. Uses a straight k8s API call for simplicity.
* install: persist helm override flags for upgrades
Fixes#5646
Currently, Overrides passed through helm flags are not being
persisted and hence lost w.r.t upgrades.
This PR fixes this by passing the using the right final `Values`
instead of using the one without the helm override flags.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* values: removal of .global field
Fixes#5425
With the new extension model, We no longer need `Global` field
as we don't rely on chart dependencies anymore. This helps us
further cleanup Values, and make configuration more simpler.
To make upgrades and the usage of new CLI with older config work,
We add a new method called `config.RemoveGlobalFieldIfPresent` that
is used in the upgrade and `FetchCurrentConfiguration` paths to remove
global field and attach its child nodes if global is present. This is verified
by the `TestFetchCurrentConfiguration`'s older test that has the global
field.
We also don't yet remove .global in some helm stable-upgrade tests for
the initial install to work.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
In the stable-2.9.0, stable-2.9.1, and stable-2.9.2 releases, the `linkerd-config-overrides` secret is missing the `linkerd.io/control-plane-ns` label. This means that if a `linkerd upgrade` is performed to one of these versions using the `--prune` flag, then the secret will be deleted. Missing this secret will prevent any further upgrades.
We add a `linkerd repair` command which recreates the `linkerd-config-overrides` secret by fetching the installed values from the `linkerd-config` configmap and then re-populating the redacted identity values from the `linkerd-identity-issuer` secret.
Usage:
```bash
linkerd repair | kubectl apply -f -
```
To test:
```
# Set Linkerd version to stable-2.8.0
> linkerd install | kubectl apply -f -
# Set Linkerd version to stable-2.9.1
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command fails)
# Set Linkerd version to HEAD
> linkerd repair | kubectl apply -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command succeeds)
> linkerd check
```
Signed-off-by: Alex Leong <alex@buoyant.io>
* cli: add helm customization flags to core install
Fixes#5506
This branch adds helm way of customization through
`set`, `set-string`, `values`, `set-files` flags for
`linkerd install` cmd along with unit tests.
For this to work, the helm v3 engine rendering helpers
had to be used instead of our own wrapper type.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* viz: move some components into linkerd-viz
This branch moves the grafana,prometheus,web, tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.
The components in viz are not injected during install time, and
will go through the injector. The `viz install` does not have any
cli flags to customize the install directly but instead follow the Helm
way of customization by using flags such as
`set`, `set-string`, `values`, `set-files`.
**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.
**Testing**
```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -
# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -
# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```
What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#5385 (second bug in there)
Added missing label `linkerd.io/control-plane-ns=linkerd` that all the
control plane resources must have, that is passed to `kubectl apply
--prune`
Fixes#5385
## The problems
- `linkerd install --ha` isn't honoring flags
- `linkerd upgrade --ha` is overridding existing configs silently or failing with an error
- *Upgrading HA instances from before 2.9 to version 2.9.1 results in configs being overridden silently, or the upgrade fails with an error*
## The cause
The change in #5358 attempted to fix `linkerd install --ha` that was only applying some of the `values-ha.yaml` defaults, by calling `charts.NewValues(true)` and merging that with the values built from `values.yaml` overriden by the flags. It turns out the `charts.NewValues()` implementation was by itself merging against `values.yaml` and as a result any flag was getting overridden by its default.
This also happened when doing `linkerd upgrade --ha` on an existing instance, which could result in silently overriding settings, or it could also fail loudly like for example when upgrading set up that has an external issuer (in this case the issuer cert won't be able to be read during upgrade and an error would occur as described in #5385).
Finally, when doing `linkerd upgrade` (no --ha flag) on an HA install from before 2.9 results in configs getting overridden as well (silently or with an error) because in order to generate the `linkerd-config-overrides` secret, the original install flags are retrieved from `linkerd-config` via the `loadStoredValuesLegacy()` function which then effectively ends up performing a `linkerd upgrade` with all the flags used for `linkerd install` and falls into the same trap as above.
## The fix
In `values.go` the faulting merging logic is not used anymore, so now `NewValues()` only returns the default values from `values.yaml` and doesn't require an argument anymore. It calls `readDefaults()` which now only returns the appropriate values depending on whether we're on HA or not.
There's a new function `MergeHAValues()` that merges `values-ha.yaml` into the current values (it doesn't look into `values.yaml` anymore), which is only used when processing the `--ha` flag in `options.go`.
## How to test
To replicate the issue try setting a custom setting and check it's not applied:
```bash
linkerd install --ha --controller-log level debug | grep log.level
- -log-level=info
```
## Followup
This wasn't caught because we don't have HA integration tests. Now that our test infra is based on k3d, it should be easy to make such a test using a cluster with multiple nodes. Either that or issuing `linkerd install --ha` with additional configs and compare against a golden file.
* upgrades: make webhooks restart if TLS creds are updated
Fixes#5231
Currently, we do not re-use the TLS certs during upgrades, which
means that the secrets are updated while the webhooks are still
paired with the older ones, causing the webhook requests to fail.
This can be solved by making webhooks be restarted whenever there
is a change in the certs. This can be performed by storing the hash
of the `*-rbac` file, which contains the secrets, thus making the
pod templates change whenever there is an update to the certs thus
making restarts required.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* extension: Add new jaeger binary
This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as
`linkerd install` VFS logic expects charts to be present in `/charts`
directory, This command gets its own static pkg to generate its own
VFS for its chart.
This covers only the install part of the command
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#4874
This branch upgrades Helm sdk from v2 to v3 *without any functionaly
changes*, just replacing types with newer API's.
This should not effect our current support for Helm v2 as we did not
change any of the underlying tempaltes(which work with Helm v2). This
works becuase we did not use any of the API's that read the Chart
metadata (which are the only ones changed from v2 to v3) and currently
manually load files and pass ito the sdk.
This PR should provide a great point to start more of the newer Helm v3
API's including for the upgrade workflow thus allowing us to make
Linkerd CLI more simpler.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
CLI crashes if linkerd-config contains unexpected values.
Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.
Add test for linkerd-config data without Global.
Fixes#5215
Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>