329 Commits

Author SHA1 Message Date
Matei David f05d1e9e26
feat(helm): default proxy-init resource requests to proxy values (#12741)
Default values for `linkerd-init` (resources allocated) are not always
the right fit. We offer default values to ensure proxy-init does not get
in the way of QOS Guaranteed (`linkerd-init` resource limits and
requests cannot be configured in any other way).

Instead of using default values that can be overridden, we can re-use
the proxy's configuration values. For the pod to be QOS Guaranteed, the
values for the proxy have to be set anyway. If we re-use the same
values for proxy-init we can ensure we'll always request the same amount
of CPU and memory as needed.

* `linkerd-init` now defaults to the proxy's values
* when the proxy has an annotation configuration for resource requests,
  it also impacts `linkerd-init`
* Helm chart and docs have been updated to reflect the missing values.
* tests now no longer use `ProxyInit.Resources`

UPGRADE NOTE:
- Deprecates `proxyInit.resources` field in the Helm values.
  - It will be a no-op if specified (no hard failures)

Closes #11320

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-06-24 12:37:47 +01:00
Alejandro Pedraza b59149388f
Bump proxy-init to v2.4.1 and cni-plugin to v1.5.1 (#12711)
Those releases ensure that when IPv6 is enabled, the series of ip6tables commands succeed. If they fail, the proxy-init/linkerd-cni containers should fail as well, instead of ignoring errors.

See linkerd/linkerd2-proxy-init#388
2024-06-13 17:15:41 -05:00
Nico Feulner 3d674599b3
make group ID configurable (#11924)
Fixes #11773

Make the proxy's GID configurable via `proxy.gid` which defaults to `-1`, in which case the GID is not set.
Also added the ability to set the GID for proxy-init and the core and extension controllers.

---------

Signed-off-by: Nico Feulner <nico.feulner@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-05-23 15:54:21 -05:00
Alejandro Pedraza 6db4bd667c
Fix issues with native sidecars (#12453)
Closes #12395

Failing to iterate over init containers as well as regular containers for finding the proxy in various parts of the code when the proxy is injected as a native sidecar resulted in:

- `Get` Destination API failing in the presence of opaque ports
- The injector failing to detect already injected pods
- Various CLI issues

This PR is split into the following commits addressing each issue separately:

a8ebe76e3 - Fix injection check for existing sidecars
44e9625e0 - Fix 'linkerd uninject'
62694965d - Fix 'linkerd version --proxy'
42dbdaddf - Fix 'linkerd identity'
39db823fe - Fix 'linkerd check'
7359f371d - Fix 'linkerd dg proxy-metrics'
f8f73c47c - Fix destination controller
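
The root cause described above (looking for the proxy only among regular containers) can be sketched as follows. This is a minimal illustration, not the code from the PR: `Container` and `PodSpec` are simplified stand-ins for the Kubernetes types, and the `linkerd-proxy` container name is assumed here.

```go
package main

import "fmt"

// Container and PodSpec are simplified stand-ins for the Kubernetes types.
type Container struct {
	Name string
}

type PodSpec struct {
	InitContainers []Container
	Containers     []Container
}

// proxyContainerName is an assumption made for this sketch.
const proxyContainerName = "linkerd-proxy"

// findProxy searches the regular containers first and then the init
// containers, which is where a native sidecar is declared.
func findProxy(spec PodSpec) (Container, bool) {
	for _, group := range [][]Container{spec.Containers, spec.InitContainers} {
		for _, c := range group {
			if c.Name == proxyContainerName {
				return c, true
			}
		}
	}
	return Container{}, false
}

func main() {
	// A pod injected with the proxy as a native sidecar (hypothetical example).
	pod := PodSpec{
		InitContainers: []Container{{Name: "linkerd-init"}, {Name: "linkerd-proxy"}},
		Containers:     []Container{{Name: "app"}},
	}
	_, found := findProxy(pod)
	fmt.Println("proxy found:", found)
}
```

A lookup that only walked `Containers` would report this pod as not injected, which matches the symptoms listed in the commit message.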
2024-04-26 14:38:01 -05:00
Matei David 4ce461e967
Bump proxy-init and CNI plugin versions (#12462)
A new release has been cut for both. The new release adds a new `GID`
feature that allows iptables to skip traffic originating from a process
running under the specified GID. The CNI plugin also includes a fix for
native sidecar containers.

* Bump proxy-init from `v2.3.0` to `v2.4.0`
* Bump CNI plugin from `v1.4.0` to `v1.5.0`

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-04-19 10:50:28 +01:00
Alejandro Pedraza 8b0e55ab38
Upgrade to proxy-init:v2.3.0 and linkerd-cni:1.4.0 (#12361)
2024-04-02 11:38:53 -05:00
Alex Leong a82bf9cb16
Remove kube-system injection check (#12263)
Fixes: https://github.com/linkerd/linkerd2/issues/12233

When Linkerd is installed in HA mode, linkerd check warns if the admission-webhooks=disabled annotation on kube-system is not set BUT the admission webhooks already exclude kube-system, making the annotation no longer necessary.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-03-15 11:06:51 -07:00
Alejandro Pedraza 796bb85323
Bump proxy-init to v2.2.4 (#11988)
Upgraded Alpine to 3.19.0, and other various dependencies bumps.
2024-01-26 09:28:14 -08:00
Matei David be41965cd2
Bump CNI plugin version to v1.3.0 in Go pkg (#11912)
We released a new version of the CNI plugin. The chart has been updated
to reference the new version, however, some of the tests and the Go
`version` pkg still reference the old version (v1.2.2). When installing
through the CLI, I noticed that even though the chart value renders an
image for the new repair controller, the image used is still v1.2.2, and
as such, the container won't be started due to a missing binary.

This change bumps the version to v1.3.0 everywhere.

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-12 13:49:01 +00:00
Matei David 2cc13d776b
Introduce a new check for extension namespace configuration (#11629)
* Introduce a new check for extension namespace configuration

Linkerd's extension model requires each namespace that "owns" an
extension to be labelled with the extension name. Core extensions in
particular strictly follow this pattern. For example, the namespace viz
is installed in would be labelled with `linkerd.io/extension=viz`.

The extension label is used by the CLI in many different instances. It is used
in checks, it is used in uninstalls, and so on. Whenever a namespace
contains a duplicate label value (e.g. two namespaces are registered as
the owner of "viz") we introduce undefined behaviour. Extension checks
or uninstalls may or may not work correctly. These issues are not
straightforward to debug. Misconfiguration can be introduced due to a
variety of reasons.

This change adds a new "core" category (`linkerd-extension-checks`) and
a new checker that asserts all extension namespaces are configured
properly. There are two reasons why this has been made a "core"
extension:

* Extensions may have their own health checking library. It is hard to
  share a common abstraction here without duplicating the logic. For
  example, viz imports the healthchecking package whereas the
  multicluster extension has its own. A dedicated core check will work
  better with all extensions that opt-in to use linkerd's extension
  label.
* Being part of the core checks means this is going to run before any of
  the other extension checks do which might improve visibility.

The change is straightforward; if an extension value is used for the
label key more than once across the cluster, the check issues a warning
along with the namespaces the label key and value tuple exists on.
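
A rough sketch of the duplicate detection described in this paragraph, assuming the namespaces and their `linkerd.io/extension` label values have already been fetched into a map. The function name and input shape are hypothetical and not taken from the healthcheck package.

```go
package main

import (
	"fmt"
	"sort"
)

// duplicateExtensions takes a map of namespace name to the value of its
// linkerd.io/extension label and returns, for every label value used by
// more than one namespace, the sorted list of those namespaces.
func duplicateExtensions(nsToExt map[string]string) map[string][]string {
	byExt := make(map[string][]string)
	for ns, ext := range nsToExt {
		byExt[ext] = append(byExt[ext], ns)
	}
	dups := make(map[string][]string)
	for ext, namespaces := range byExt {
		if len(namespaces) > 1 {
			sort.Strings(namespaces)
			dups[ext] = namespaces
		}
	}
	return dups
}

func main() {
	// Hypothetical cluster state: two namespaces claim the "viz" extension.
	dups := duplicateExtensions(map[string]string{
		"linkerd-viz":    "viz",
		"viz-copy":       "viz",
		"linkerd-jaeger": "jaeger",
	})
	for ext, namespaces := range dups {
		fmt.Printf("warning: extension %q is claimed by namespaces %v\n", ext, namespaces)
	}
}
```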

This should be followed-up with a docs change.

Closes #11509

Signed-off-by: Matei David <matei@buoyant.io>
2023-11-20 16:27:10 +00:00
Táskai Dominik 64b66f9218
Add informative error messages when there is no internet connection. (#11377)
When the Linkerd CLI is unable to access the internet, it will encounter
a DNS error when trying to discover the latest Linkerd releases from linkerd.io.

This change handles this DNS resolution error explicitly so that users receive
a more informative error message.
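
One way to detect this case in Go is to unwrap the error with `errors.As` and look for a `*net.DNSError`. The sketch below is illustrative only; the function names and the message text are assumptions, not the wording used by the CLI.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// explainLookupFailure returns a friendlier error when err wraps a DNS
// resolution failure, and returns err unchanged otherwise.
func explainLookupFailure(err error) error {
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		return fmt.Errorf("failed to resolve %q, check your internet connection: %w", dnsErr.Name, err)
	}
	return err
}

// simulatedDNSFailure builds a wrapped error similar to what an HTTP client
// returns when a host name cannot be resolved (hypothetical example data).
func simulatedDNSFailure() error {
	return fmt.Errorf("version check failed: %w", &net.DNSError{
		Err:        "no such host",
		Name:       "linkerd.io",
		IsNotFound: true,
	})
}

func main() {
	fmt.Println(explainLookupFailure(simulatedDNSFailure()))
}
```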

Fixes #11349

Signed-off-by: Dominik Táskai <dominik.taskai@leannet.eu>
Co-authored-by: Dominik Táskai <dominik.taskai@leannet.eu>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2023-09-26 10:23:56 -07:00
Alejandro Pedraza ec1c898bd9
Bump proxy-init:v2.2.3 and cni-plugin:v1.2.2 (#11399)
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/proxy-init%2Fv2.2.3
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/cni-plugin%2Fv1.2.2

Updated to use go 1.21
2023-09-21 11:16:37 -05:00
Takumi Sue 26dfc6a3be
Check cli version match only for running pods (#11295)
Fixes #11280

Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2023-09-11 15:30:15 -07:00
Matei David c0da3b95bc
Bump CNI plugin and proxy-init versions (#11348)
* Bump CNI plugin to v1.2.1
* Bump proxy-init to v2.2.2

Both dependencies include a fix for CVE-2023-2603. Since alpine is used
as the runtime image, there is a security vulnerability detected in the
produced images (due to an issue with libcap). The alpine images have
been bumped to address the CVE.

Signed-off-by: Matei David <matei@buoyant.io>
2023-09-07 16:27:13 +01:00
Alex Leong 368b63866d
Add support for remote discovery (#11224)
Adds support for remote discovery to the destination controller.

When the destination controller gets a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` label, this is an indication that the destination controller should discover the endpoints for this service from a remote cluster.  The destination controller will look for a remote cluster which has been linked to it (using the `linkerd multicluster link` command) with that name.  It will look at the `multicluster.linkerd.io/remote-service` label for the service name to look up in that cluster.  It then streams back the endpoint data for that remote service.

Since we now have multiple client-go informers for the same resource types (one for the local cluster and one for each linked remote cluster) we add a `cluster` label onto the prometheus metrics for the informers and EndpointWatchers to ensure that each of these components' metrics are correctly tracked and don't overwrite each other.

---------

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-08-11 09:31:45 -07:00
Abhijeet Gaurav bca15f59ed
Removed hostNetwork: true from linkerd-cni Helm chart templates (#11158)
Problem - Currently, the Linkerd CNI Helm chart templates set hostNetwork: true, which is unnecessary and less secure.

Solution - Removed hostNetwork: true from linkerd-cni Helm chart templates

PR Fixes #11141 
---------

Signed-off-by: Abhijeet Gaurav <abhijeetdav24aug@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2023-08-03 09:53:57 -05:00
Alex Leong b7e0be6e67
Policy controller watches gateway-api HttpRoutes (#11042)
Updates the policy-controller to watch `httproute.gateway.networking.k8s.io` resources in addition to watching `httproute.policy.linkerd.io` resources.  Routes of either or both types can be returned in policy responses and will be appropriately identified by the `group` field on their metadata.  Furthermore we update the Status of these resources to correctly reflect when they are accepted.

We add the `httproute.gateway.networking.k8s.io` CRD to the Linkerd installed CRD list and add the appropriate RBAC to the policy controller so that it may watch these resources.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-07-03 15:48:49 -07:00
Alejandro Pedraza 040481cd80
linkerd-cni v1.2.0 (#10973)
This release stops using the "interface" mode, and instead waits until
another CNI plugin drops a proper network config and then appends the
linkerd CNI config to it. This avoids having pods start before proper
networking is established in the node.
2023-06-02 09:10:04 -05:00
Eliza Weisman 4c7a9ab157
cni-plugin: v1.1.3 (#10855)
This release of the CNI plugin changes the base runtime Docker image
from `debian:bullseye-slim` to `alpine:3.17.3`.

---

* cni: use `scratch` as the base runtime docker image (linkerd/linkerd2-proxy-init/pull/237)
* cni: change base runtime image from `scratch` to `alpine` (linkerd/linkerd2-proxy-init#238)
2023-05-04 17:15:09 -07:00
Alejandro Pedraza f57c925ecb
Bump cni-plugin to v1.1.1 (#10780)
Fixed an incompatibility issue with the AWS CNI addon in EKS that was
preventing pods from acquiring networking after scaling up nodes.

Credits to @frimik for providing a diagnosis and fix, and to @JonKusz for the detailed repro
2023-04-20 12:21:09 -05:00
Alejandro Pedraza 0c202bf17b
Bump linkerd2-proxy-init packages (#10678)
proxy-init v2.2.1:
* Sanitize `subnets-to-ignore` flag
* Dep bumps

cni-plugin v1.1.0:
* Add support for the `config.linkerd.io/skip-subnets` annotation
* Dep bumps

validator v0.1.2:
* Dep bumps

Also, `linkerd-network-validator` is now released wrapped in a tar file, so this PR also amends `Dockerfile-proxy` to account for that.
2023-04-04 18:07:03 -05:00
Andrew Seigner e71266f2c9
cli: Support running `check` on CLI-only extensions (#10588)
The existing `linkerd check` command runs extension checks based on extension namespaces already on-cluster. This approach does not permit running extension checks without cluster-side components.

Introduce "CLI Checks". These extensions run as part of `linkerd check`, if they satisfy the following criteria:
1) executable in PATH
2) prefixed by `linkerd-`
3) supports an `_extension-metadata` subcommand, that outputs self-identifying
   JSON, for example:
   ```
   $ linkerd-foo _extension-metadata
   {
     "name": "linkerd-foo",
     "checks": "always"
   }
   ```
4) The `name` value from `_extension-metadata` must match the filename. And `checks` must equal `always`.

If a CLI Check is found that also would have run as an on-cluster extension check, it is run as a CLI Check only.
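
Criteria 2 to 4 above can be sketched as a small validation function. This is an illustration under assumptions: the struct and function names are hypothetical, and criterion 1 (the file being an executable in PATH) is left to the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// extensionMetadata mirrors the JSON shown in the example output above.
type extensionMetadata struct {
	Name   string `json:"name"`
	Checks string `json:"checks"`
}

// isCLICheck reports whether the executable at path, whose
// `_extension-metadata` output is metadataJSON, qualifies as a CLI Check:
// the filename is prefixed by "linkerd-", the reported name matches the
// filename, and checks equals "always".
func isCLICheck(path string, metadataJSON []byte) bool {
	filename := filepath.Base(path)
	if !strings.HasPrefix(filename, "linkerd-") {
		return false
	}
	var meta extensionMetadata
	if err := json.Unmarshal(metadataJSON, &meta); err != nil {
		return false
	}
	return meta.Name == filename && meta.Checks == "always"
}

func main() {
	out := []byte(`{"name": "linkerd-foo", "checks": "always"}`)
	fmt.Println(isCLICheck("/usr/local/bin/linkerd-foo", out))
}
```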

Fixes #10544
2023-03-29 12:07:36 -07:00
cui fliter 8c6de42210
all: fix some comments (#10387)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-03-01 11:47:02 +00:00
Steve Jenson 44424466c1
linkerd-cni: add new release to the build (#10209)
wind the new linkerd-cni build through the build. refactor image, version, and pullPolicy into an Image object.

Signed-off-by: Steve Jenson <stevej@buoyant.io>
2023-02-08 13:54:35 -08:00
Steve Jenson 1e8d96509b
Upgrading proxy-init from v2.1.0 to v2.2.0 this time without JSON formatting (#10234)
Signed-off-by: Steve Jenson <stevej@buoyant.io>
2023-02-01 11:53:02 -08:00
Alejandro Pedraza 7428d4aa51
Removed dupe imports (#10049)
* Removed dupe imports

My IDE (vim-gopls) has been complaining for a while, so I decided to take
care of it. Found via
[staticcheck](https://github.com/dominikh/go-tools)

* Add stylecheck to go-lint checks
2023-01-10 14:34:56 -05:00
Alejandro Pedraza 6247730141
Refactor `linkerd check` calls in the integration tests (#9989)
* Refactor `linkerd check` calls in the integration tests

Extracted logic into the new file `testutil/test_helper_check.go` which exposes the functions `TestCheckPre`, `TestCheck` and `TestCheckProxy`.

`linkerd check --output json` is called so its output is properly captured without the need of golden files.

Besides checking that there are no errors (although warnings are allowed), we check that the expected check categories are returned.

The plan is to leverage this in #9856 when re-enabling the helm-upgrade test.
2022-12-21 12:14:43 -05:00
Alejandro Pedraza faf0ff62f7
Add support for Pod Security Admission (#9719)
Closes #9676

This adds the `pod-security.kubernetes.io/enforce` label as described in [Pod Security Admission labels for namespaces](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces).

PSA gives us three different possible values (policies or modes): [privileged, baseline and restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/).

For non-CNI mode, the proxy-init container relies on granting the NET_RAW and NET_ADMIN capabilities, which places those pods under the `privileged` policy. On the other hand, for CNI mode we can enforce the `restricted` policy by setting some defaults on the containers' `securityContext`, as done in this PR.

Note that this change also adds the `cniEnabled` entry in the `values.yaml` file for all the extension charts, which determines what policy to use.

Final note: this includes the fix from #9717, otherwise an empty gateway UID prevents the pod from being created under the `restricted` policy.

## How to test

As this is only enforced as of k8s 1.25, here are the instructions to run 1.25 with k3d using Calico as CNI:

```bash
# launch k3d with k8s v1.25, with no flannel CNI
$ k3d cluster create --image='+v1.25' --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0'

# install Calico
$ k apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml

# load all the images
$ bin/image-load --k3d proxy controller policy-controller web metrics-api tap cni-plugin jaeger-webhook

# install linkerd-cni
$ bin/go-run cli install-cni|k apply -f -

# install linkerd-crds
$ bin/go-run cli install --crds|k apply -f -

# install linkerd-control-plane in CNI mode
$ bin/go-run cli install --linkerd-cni-enabled|k apply -f -

# Pods should come up without issues. You can also try the viz and jaeger extensions.
# Try removing one of the securityContext entries added in this PR, and the Pod
# won't come up. You should be able to see the PodSecurity error in the associated
# ReplicaSet.
```

To test the multicluster extension using CNI, check this [gist](https://gist.github.com/alpeb/4cbbd5ad87538b9e0d39a29b4e3f02eb) with a patch to run the multicluster integration test with CNI in k8s 1.25.
2022-12-19 10:23:46 -05:00
Alejandro Pedraza 4ea8ab21dc
edge-22.11.3 change notes (#9884)
* edge-22.11.3 change notes

Besides the notes, this corrects a small point in `RELEASE.md`, and
bumps the proxy-init image tag to `v2.1.0`. Note that the entry under
`go.mod` wasn't bumped because moving it past v2 requires changes on
`linkerd2-proxy-init`'s `go.mod` file, and we're gonna drop that
dependency soon anyways. Finally, all the charts got their patch version
bumped, except for `linkerd2-cni` that got its minor bumped because of
the tolerations default change.

## edge-22.11.3

This edge release fixes connection errors to pods using a `hostPort` different
than their `containerPort`. Also the `network-validator` init container improves
its logging, and the `linkerd-cni` DaemonSet now gets deployed in all nodes by
default.

* Fixed `destination` service to properly discover targets using a `hostPort`
  different than their `containerPort`, which was causing 502 errors
* Upgraded the `network-validator` with better logging allowing users to
  determine whether failures occur as a result of their environment or the tool
  itself
* Added default `Exists` toleration to the `linkerd-cni` DaemonSet, allowing it
  to be deployed in all nodes by default, regardless of taints

Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-11-23 14:35:20 -05:00
Alejandro Pedraza 1998f12791
Fix "cluster networks contains all services" fails with services with no ClusterIP (#9662)
Fixes #9661

This excludes any service with no ClusterIP from this check, which
includes the services of type ExternalName.
2022-10-24 09:33:39 -05:00
ziollek ca685f78ad
Fixes #9616 remove kubectl version check (#9623)
* Fixes #9616 remove kubectl version check

Signed-off-by: tomasz.ziolkowski <e.prace@gmail.com>
2022-10-19 15:27:11 -05:00
Alex Leong dba0a985d8
Add check that clusterip services are in the cluster networks (#9567)
The root cause of https://github.com/linkerd/linkerd2/issues/9521 was that there were clusterip Services which were not in Linkerd's cluster networks.  This means that Linkerd was not performing discovery when connecting to these services and therefore was not doing mTLS.  This issue was difficult to detect and diagnose.

We add a check which verifies that all clusterIP services in the cluster have their clusterIP in the cluster networks.  This is very similar to the existing check which verifies that all pods have a podIP in the cluster networks.
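
The containment test at the heart of this check can be sketched with the standard `net` package. The function name and input shape below are hypothetical. Skipping services without a ClusterIP reflects the later fix in #9662 rather than this commit.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// servicesOutsideNetworks returns the sorted names of services whose
// ClusterIP is not contained in any of the given cluster network CIDRs.
// Services with no ClusterIP ("" or "None") are skipped.
func servicesOutsideNetworks(clusterNetworks []string, clusterIPs map[string]string) ([]string, error) {
	var networks []*net.IPNet
	for _, cidr := range clusterNetworks {
		_, ipNet, err := net.ParseCIDR(cidr)
		if err != nil {
			return nil, fmt.Errorf("invalid cluster network %q: %w", cidr, err)
		}
		networks = append(networks, ipNet)
	}

	var outside []string
	for name, clusterIP := range clusterIPs {
		if clusterIP == "" || clusterIP == "None" {
			continue
		}
		ip := net.ParseIP(clusterIP)
		if ip == nil {
			return nil, fmt.Errorf("service %q has an invalid ClusterIP %q", name, clusterIP)
		}
		contained := false
		for _, n := range networks {
			if n.Contains(ip) {
				contained = true
				break
			}
		}
		if !contained {
			outside = append(outside, name)
		}
	}
	sort.Strings(outside)
	return outside, nil
}

func main() {
	// Hypothetical services and cluster networks.
	outside, err := servicesOutsideNetworks(
		[]string{"10.0.0.0/8"},
		map[string]string{"web": "10.96.0.10", "legacy": "192.168.1.5", "headless": "None"},
	)
	fmt.Println(outside, err)
}
```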

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-10-10 11:38:59 -07:00
Matei David 75673f7922
Bump proxy-init to v2.0.0 (#9179)
* Bump proxy-init to v2.0.0

New release of proxy-init.

Updated:

* Helm values to use v2.0.0 of proxy-init
* Helm docs
* Tests

Note: go dependencies have not been updated since the new version will
break API compatibility with older versions (source files have been
moved, see issue for more details).

Closes #9164

Signed-off-by: Matei David <matei@buoyant.io>
Signed-off-by: Oliver Gould <ver@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-08-17 11:48:27 +01:00
Dani Baeyens 074f5e6cdf
Allows RSA signed trust anchors on linkerd cli (#7771) (#8868)
* Allows RSA signed trust anchors on linkerd cli (#7771)

Linkerd currently forces using an ECDSA P-256
issuer certificate along with an ECDSA trust
anchor. However, it is cryptographically valid
to have an ECDSA P-256 issuer certificate issued
by an RSA signed CA.

CheckCertAlgoRequirements checks if CA cert uses
ECDSA or RSA 2048/4096 signing algorithm.
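
A check of this kind could look roughly like the sketch below, which inspects a public key such as the one found in the `PublicKey` field of an `x509.Certificate`. It is not the actual `CheckCertAlgoRequirements` implementation: the function name, error messages, and key-generating helpers are assumptions made for illustration.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/rsa"
	"fmt"
)

// checkCAKeyAlgo accepts ECDSA keys and RSA keys of 2048 or 4096 bits,
// and rejects every other key type or size.
func checkCAKeyAlgo(pub crypto.PublicKey) error {
	switch k := pub.(type) {
	case *ecdsa.PublicKey:
		return nil
	case *rsa.PublicKey:
		if bits := k.N.BitLen(); bits != 2048 && bits != 4096 {
			return fmt.Errorf("unsupported RSA key size: %d bits", bits)
		}
		return nil
	default:
		return fmt.Errorf("unsupported public key type %T", pub)
	}
}

// The helpers below only generate throwaway keys for the demonstration.
func newECDSAKey() crypto.PublicKey {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	return &key.PublicKey
}

func newRSAKey(bits int) crypto.PublicKey {
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		panic(err)
	}
	return &key.PublicKey
}

func newEd25519Key() crypto.PublicKey {
	pub, _, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub
}

func main() {
	fmt.Println("ECDSA P-256:", checkCAKeyAlgo(newECDSAKey()))
	fmt.Println("RSA 2048:", checkCAKeyAlgo(newRSAKey(2048)))
	fmt.Println("Ed25519:", checkCAKeyAlgo(newEd25519Key()))
}
```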

Fixes #7771

Signed-off-by: Baeyens, Daniel <daniel.baeyens@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2022-08-08 08:04:24 -05:00
Alejandro Pedraza e80a791777
Allow initializing a k8s namespace-scoped API (#8751)
* Allow initializing a k8s namespace-scoped API

This allows reusing the `k8s.API` informers by other projects that don't
necessarily have cluster-wide permissions.
2022-08-04 09:14:26 -05:00
Agrim Prasad 2281e6bb69
healthcheck: ignore Terminated state for pods (#9002)
`Terminated` status pods have been evicted pending an imminent node shutdown, and thus would not have a proxy sidecar. 

Similar to https://github.com/linkerd/linkerd2/pull/8032 but also addresses `Terminated` status in addition to the `NodeShutdown` status.

Ref: https://kubernetes.io/docs/concepts/architecture/nodes/#graceful-node-shutdown
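
The filtering can be sketched as below. `Pod` is a simplified stand-in for the Kubernetes type, and reading the status from a single `Reason` string is an assumption of this sketch rather than a description of the healthcheck code.

```go
package main

import "fmt"

// Pod is a simplified stand-in; Reason is assumed to mirror the reason
// recorded in the pod's status.
type Pod struct {
	Name   string
	Reason string
}

// skipValidation reports whether a pod was evicted ahead of a node shutdown
// and therefore cannot be expected to have a running proxy.
func skipValidation(p Pod) bool {
	return p.Reason == "NodeShutdown" || p.Reason == "Terminated"
}

// podsToValidate drops the pods that should be ignored by the healthcheck.
func podsToValidate(pods []Pod) []Pod {
	var kept []Pod
	for _, p := range pods {
		if !skipValidation(p) {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	pods := []Pod{
		{Name: "web-1"},
		{Name: "web-2", Reason: "NodeShutdown"},
		{Name: "web-3", Reason: "Terminated"},
	}
	for _, p := range podsToValidate(pods) {
		fmt.Println("validating", p.Name)
	}
}
```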

Signed-off-by: Agrim Prasad <AgrimPrasad@users.noreply.github.com>

Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-07-26 12:14:41 -07:00
Matei David 59734271d3
Bump proxy-init to v1.6.2 (#8989)
This change bumps the proxy-init version from v1.6.1 to the latest
version, v1.6.2. As part of the new release, proxy-init now adds
net_admin and net_raw sys caps to xtables-nft-multi so that nftables
mode can be used without requiring root privileges.

* Bump go.mod
* Bump version in helm values
* Bump version in misc files
* Bump version in code

Signed-off-by: Matei David <matei@buoyant.io>
2022-07-25 18:40:06 +03:00
Kevin Leimkuhler 2442ca07bf
Parse Pod labels for owning Deployment instead of name (#8920)
Closes #8916

When a random Pod (meshed or not) is created in the `linkerd`, `linkerd-viz`, or
`linkerd-jaeger` namespaces their respective `check` subcommands can fail.

We parse Pod names for their owning Deployment by assuming the Pod name has a
randomized suffix. For example, the `linkerd-destination` Deployment creates the
`linkerd-destination-58c57dd675-7tthr` Pod. We split the name on `-` and take
the first two parts (`["linkerd", "destination"]`); those first two parts make
up the Deployment name.

Now, if a random Pod is created in the namespace with the name `test`, we apply
that same logic but hit a runtime error when trying to get the first two parts
of the split. `test` did not split at all since it contains no `-` and therefore
we error with `slice bounds out of range`.

To fix this, we now use the fact that all Linkerd components have a
`linkerd.io/control-plane-component` or `component` label with a value that is
the owning Deployment. This allows us to avoid any extra parsing logic and just
look at a single label value.

Additionally, some of these checks get all the Pods in a namespace with the
`GetPodsByNamespace` method but we don't always need something so general. In
the places where we are checking specifically for Linkerd components, we can
narrow this further by using the expected LabelSelector such as
`linkerd.io/extension=viz`.
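
The difference between the two approaches can be sketched as follows. Both function names are hypothetical and the label value is illustrative. The first function reproduces the failure mode described above; the second only reads the labels named in this message.

```go
package main

import (
	"fmt"
	"strings"
)

// deploymentFromName reproduces the old approach: split the Pod name on "-"
// and keep the first two parts. For a name without "-", such as "test",
// parts[:2] panics with "slice bounds out of range".
func deploymentFromName(podName string) string {
	parts := strings.Split(podName, "-")
	return strings.Join(parts[:2], "-")
}

// deploymentFromLabels reads the owning component from the Pod's labels
// instead, so no assumptions about the name format are needed.
func deploymentFromLabels(labels map[string]string) (string, bool) {
	for _, key := range []string{"linkerd.io/control-plane-component", "component"} {
		if v, ok := labels[key]; ok && v != "" {
			return v, true
		}
	}
	return "", false
}

func main() {
	fmt.Println(deploymentFromName("linkerd-destination-58c57dd675-7tthr"))

	// A random Pod named "test" has neither label and is simply ignored.
	_, ok := deploymentFromLabels(map[string]string{"app": "test"})
	fmt.Println("is a Linkerd component:", ok)

	// Illustrative label value for a control plane Pod.
	name, ok := deploymentFromLabels(map[string]string{"linkerd.io/control-plane-component": "destination"})
	fmt.Println(name, ok)
}
```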

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-07-19 12:14:55 -06:00
Matei David b3ec9111d2
Bump proxy-init version to v1.6.1 (#8913)
Release v1.6.1 of proxy-init adds support for iptables-nft. This change
bumps up the proxy-init version used in code, chart values, and golden
files.

* Update go.mod dep
* Update CNI plugin with new opts
* Update proxy-init ref in golden files and chart values
* Update policy controller CI workflow

Signed-off-by: Matei David <matei@buoyant.io>
2022-07-18 13:03:26 -07:00
Alex Leong df177e67eb
Add HttpRoute CRD (#8675)
Fixes #8660

We add the HttpRoute CRD to the CRDs installed with `linkerd install --crds` and `linkerd upgrade --crds`.  You can use the `--set installHttpRoute=false` to skip installing this CRD.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-29 09:50:23 -07:00
Alex Leong d8dab60b02
change podCIDR check to use pod ips (#8557)
Fixes #8555

We remove the "cluster networks can be verified" which checks that the podCIDR field exists on nodes and replace it with a "cluster networks contains all pods" check.  This looks at all the pods in the cluster and verifies that each pod's IP is contained in the cluster networks.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-03 16:48:01 -07:00
Alex Leong be2733b2b1
Add --crds flag to linkerd check (#8499)
Fixes #8372

Add a `--crds` flag to `linkerd check`.  This flag causes `linkerd check` to validate that the Linkerd  CRDs have been installed, and will wait until the check succeeds.  This way, `linkerd check --crds` can be used after `linkerd install --crds` and before `linkerd install` to ensure the CRDs have been installed successfully and to avoid race conditions where `linkerd install` could potentially attempt to create custom resources for which the CRD does not yet exist.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-05-31 17:04:11 -07:00
Alex Leong 8038aa24d6
linkerd upgrade should require CRDs be installed (#8413)
Fixes #8373

We update `CheckCustomResourceDefinitions` so that it not only checks for the existence of the CRDs, but also ensures that they contain the latest version of each CRD.  Note that this means that we'll need to keep this list of CRD versions in `CheckCustomResourceDefinitions` in sync with the actual CRD versions in the templates.  We also add this check to `linkerd upgrade` when the `--crds` flag is not provided.  This means that users who are upgrading will be required to run `linkerd upgrade --crds` first if they don't have the latest version of any of the CRDs.
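
The idea of checking for a required version, and not only for existence, can be sketched as below. This is not the real `CheckCustomResourceDefinitions`: the function name, the input maps, and the CRD names and versions used in `main` are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// missingCRDVersions compares the version required for each CRD with the
// versions found on the cluster and returns the sorted "name/version"
// entries that are missing. A CRD that is absent entirely is reported too.
func missingCRDVersions(required map[string]string, installed map[string][]string) []string {
	var missing []string
	for name, version := range required {
		found := false
		for _, v := range installed[name] {
			if v == version {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, name+"/"+version)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Illustrative data: one CRD is installed at an older version only.
	required := map[string]string{
		"servers.policy.linkerd.io":    "v1beta1",
		"httproutes.policy.linkerd.io": "v1alpha1",
	}
	installed := map[string][]string{
		"servers.policy.linkerd.io":    {"v1alpha1"},
		"httproutes.policy.linkerd.io": {"v1alpha1"},
	}
	if missing := missingCRDVersions(required, installed); len(missing) > 0 {
		fmt.Println("run `linkerd upgrade --crds` first; missing:", missing)
	}
}
```

Keeping the required versions in one map makes the synchronization burden mentioned above explicit: the map has to be updated whenever a CRD template gains a new version.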

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-05-16 10:25:46 -07:00
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00
Oliver Gould fa8ddb4801
Use go-test/deep for comparisons in tests (#8427)
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-05 09:31:07 -07:00
Alex Leong 6762dd28ac
Add --crds flag to install/upgrade and remove config/control-plane stages (#8251)
Fixes: #8173 

In order to support having custom resources in the default Linkerd installation, it is necessary to add a separate install step to install CRDs before the core install.  The Linkerd Helm charts already accomplish this by having CRDs in a separate chart.

We add this functionality to the CLI by adding a `--crds` flag to `linkerd install` and `linkerd upgrade` which outputs manifests for the CRDs only and remove the CRD manifests when the `--crds` flag is not set.  To avoid a compounding of complexity, we remove the `config` and `control-plane` stages from install/upgrade.  The effect of this is that we drop support for splitting up an install by privilege level (cluster admin vs Linkerd admin).

The Linkerd install flow is now always a 2-step process where `linkerd install --crds` must be run first to install CRDs only and then `linkerd install` is run to install everything else.  This more closely aligns the CLI install flow with the Helm install flow where the CRDs are a separate chart.  Attempting to run `linkerd install` before the CRDs are installed will result in a helpful error message.

Similarly, upgrade is also a 2-step process of `linkerd upgrade --crds` followed by `linkerd upgrade`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-04-28 09:36:14 -07:00
Michał Romanowski 88b8da50d2
Introduce node affinity support for linkerd pods (#8137)
In order to restrict pods to run only on arbitrarily chosen nodes, affinities
or tolerations can be used. Currently, Linkerd only supports tolerations,
which are applied to pods and allow them to be scheduled on nodes with
matching "taints".

Certain environments and workflows lean more towards affinity instead of
tolerations to determine preferred or required scheduling. This change
introduces a new "nodeAffinity" field so that users may specify affinity
rules for scheduling Linkerd pods.

Closes #8136

Signed-off-by: Michal Romanowski <michal.rom089@gmail.com>
2022-04-15 11:24:16 +01:00
Kevin Leimkuhler b9001ba6b7
cli: skip pods that have `NodeShutdown` status (#8032)
Closes #8010.

Pods that have `NodeShutdown` status should be skipped during validation as they will not have a running proxy container.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-09 09:27:43 -07:00
Kevin Leimkuhler 67bcd8f642
Add `gosec` and `errcheck` lints (#7954)
Closes #7826

This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.

A significant number of these lints have been excluded by the various `exclude-rules` rules added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.

Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.

Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-03 10:09:51 -07:00
Alex Leong 6ec0ef1576
Call cancel sooner to cleanup WithTimeout timer (#7957)
The `WithTimeout` documentation states:

> Canceling this context releases resources associated with it, so code should call cancel as soon as the operations running in this Context complete

We only use the context for calling `c.check`, therefore we can call cancel immediately after `c.check` completes to free resources associated with the timeout.  This prevents potentially holding on to (and accumulating) timeout related resources for the entire duration of the loop.
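
The pattern can be sketched as follows. The loop below is a simplified stand-in for the checker's retry loop; `runChecks`, `alwaysOK`, and `tooSlow` are hypothetical names introduced for this illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type checkFunc func(ctx context.Context) error

// runChecks runs check once per attempt, each with its own timeout, and
// returns how many attempts succeeded.
func runChecks(attempts int, timeout time.Duration, check checkFunc) int {
	succeeded := 0
	for i := 0; i < attempts; i++ {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := check(ctx)
		// Release the timer right away. A `defer cancel()` here would only
		// run when runChecks returns, so timers would pile up across
		// iterations of the loop.
		cancel()
		if err == nil {
			succeeded++
		}
	}
	return succeeded
}

// alwaysOK succeeds as long as the context has not expired yet.
func alwaysOK(ctx context.Context) error {
	return ctx.Err()
}

// tooSlow takes longer than any short timeout and gives up when the
// context expires.
func tooSlow(ctx context.Context) error {
	select {
	case <-time.After(time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	fmt.Println(runChecks(3, time.Second, alwaysOK))
	fmt.Println(runChecks(2, time.Millisecond, tooSlow))
}
```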

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-02-24 10:12:59 -08:00