Commit Graph

983 Commits

Author SHA1 Message Date
Kevin Leimkuhler d6c33e9743
Unset `policyValidator.keyPEM` in `linkerd-config` (#8827)
Closes #8823 

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-07-07 20:53:37 -06:00
Alex Leong 120f91ca2c
Add validation for HTTPRoute (#8730)
Fixes #8665

We add validation for HTTPRoute resources to the policy admission controller.  We validate that any HTTPRoute which has a Server as a parent_ref does not have unsupported filters.  For the moment we do not support any HTTP filters.  As we add support for HTTP filter types, we should update the validator accordingly.

Signed-off-by: Alex Leong <alex@buoyant.io>

Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-07-07 16:14:26 -07:00
Alex Leong e84a27506a
Relax Server proxyProtocol validation (#8655)
Fixes #8564

Removes the enum of allowed values from the proxyProtocol field in the Server CRD.  Instead, we rely on the admission controller to validate this field.

Before:

```
The Server "myserver" is invalid: spec.proxyProtocol: Unsupported value: "invalid": supported values: "unknown", "HTTP/1", "HTTP/2", "gRPC", "opaque", "TLS"
```

After:

```
k apply -f myserver.yml                              
Error from server: error when creating "myserver.yml": admission webhook "linkerd-policy-validator.linkerd.io" denied the request: unknown variant `invalid`, expected one of `unknown`, `HTTP/1`, `HTTP/2`, `gRPC`, `opaque`, `TLS`
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-07-01 18:51:03 -07:00
Alex Leong df177e67eb
Add HttpRoute CRD (#8675)
Fixes #8660

We add the HttpRoute CRD to the CRDs installed with `linkerd install --crds` and `linkerd upgrade --crds`.  You can use `--set installHttpRoute=false` to skip installing this CRD.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-29 09:50:23 -07:00
dependabot[bot] aaff8a74e9
build(deps): bump github.com/spf13/cobra from 1.4.0 to 1.5.0 (#8717)
* build(deps): bump github.com/spf13/cobra from 1.4.0 to 1.5.0

Bumps [github.com/spf13/cobra](https://github.com/spf13/cobra) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/spf13/cobra/releases)
- [Commits](https://github.com/spf13/cobra/compare/v1.4.0...v1.5.0)

---
updated-dependencies:
- dependency-name: github.com/spf13/cobra
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Relax completion tests

Signed-off-by: Alex Leong <alex@buoyant.io>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
2022-06-24 10:47:28 -07:00
Kevin Leimkuhler 7e8167b7b8
Fix Docker runtime check to happen only during install (#8667)
Closes #8583 

Even though #7468 moved the Docker container runtime check from `linkerd check --pre` to a `linkerd install` runtime error, we still do a dry run of the installation so that we can render the control plane manifests. Therefore, we still hit this check, which results in not being able to run `linkerd check --pre` when nodes are using the Docker container runtime. This fixes the issue by introducing a `dryRun` flag that we check beforehand.

```shell
❯ kubectl get nodes docker-desktop -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}'
docker://20.10.16

❯ bin/linkerd check --pre
Linkerd core checks
===================
...
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-06-22 08:04:24 -06:00
Alex Leong b7a0b8adb4
Bump minimum kubernetes version to 1.21 (#8647)
Fixes #8592

Increase the minimum supported kubernetes version from 1.20 to 1.21.  This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources.  This fixes deprecation warnings about these resources printed by the CLI.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-14 15:15:28 -07:00
Matei David 574cd49b3a
Include pod probe ports in inbound proxy config (#8645)
The injector configures the proxy with a set of known inbound ports
which are used (by the proxy) to discover inbound server configuration.
The list of ports is derived from the pod's container ports; container
ports may be optional and thus not present. The proxy supports dynamic
discovery of additional ports at runtime but since they are lazy,
additional ports may be dropped or updated long after pod start-up.

To ensure HTTP probes are handled correctly, this change introduces new
functionality to configure the list of inbound ports for the proxy with
any ports targeted by healthcheck probes, as long as they are HTTP, and
even if they are not present in the containerPorts configuration.
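
As an illustration of the port collection described above, here is a minimal Go sketch. The `container` and `probe` types and the `inboundPorts` helper are hypothetical and are not the injector's actual code; real probes can also target named ports, which this sketch ignores.

```go
package main

import (
	"fmt"
	"sort"
)

// probe is a hypothetical, simplified healthcheck probe.
// httpPort is 0 when the probe is not an HTTP probe.
type probe struct {
	httpPort int
}

// container is a hypothetical, simplified container spec.
type container struct {
	ports     []int // declared containerPorts, possibly empty
	liveness  *probe
	readiness *probe
}

// inboundPorts returns the declared container ports plus any ports
// targeted by HTTP probes, deduplicated and sorted.
func inboundPorts(containers []container) []int {
	seen := map[int]bool{}
	var out []int
	add := func(p int) {
		if p > 0 && !seen[p] {
			seen[p] = true
			out = append(out, p)
		}
	}
	for _, c := range containers {
		for _, p := range c.ports {
			add(p)
		}
		for _, pr := range []*probe{c.liveness, c.readiness} {
			if pr != nil {
				add(pr.httpPort)
			}
		}
	}
	sort.Ints(out)
	return out
}

func main() {
	pod := []container{{
		ports:    []int{8080},
		liveness: &probe{httpPort: 9090}, // not listed in containerPorts
	}}
	fmt.Println(inboundPorts(pod))
}
```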

This change also introduces additional liveness (or readiness) probes to
the current injector webhook test fixtures in order to assert that
injected pods will always have their healthcheck target ports included
in the proxy's configuration.

Closes #8638

Signed-off-by: Matei David <matei@buoyant.io>
2022-06-13 18:33:56 +01:00
Alex Leong be2733b2b1
Add --crds flag to linkerd check (#8499)
Fixes #8372

Add a `--crds` flag to `linkerd check`.  This flag causes `linkerd check` to validate that the Linkerd  CRDs have been installed, and will wait until the check succeeds.  This way, `linkerd check --crds` can be used after `linkerd install --crds` and before `linkerd install` to ensure the CRDs have been installed successfully and to avoid race conditions where `linkerd install` could potentially attempt to create custom resources for which the CRD does not yet exist.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-05-31 17:04:11 -07:00
Alex Leong 7dfebf3588
Add safe-to-evict annotation to control plane components (#8524)
Fixes #4067

We add the `cluster-autoscaler.kubernetes.io/safe-to-evict: "true"` annotation to the Linkerd control plane components.  This annotation tells the cluster autoscaler that even though the control plane components have volume mounts, it is okay to evict them (subject to pod disruption constraints).  This is because we only use the volume as temporary storage for certificates and do not need to persist that data.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-05-31 16:09:12 -07:00
Kevin Leimkuhler 5d0e676f0b
Remove linkerd-viz dependency from linkerd-multicluster `gateways` command (#8467)
This changes linkerd-multicluster's `gateways` command to use the service mirror component's `/metrics` endpoint so that there is no longer a dependency on linkerd-viz. The dependency on linkerd-viz is leftover from when those components were part of the default installation meaning that we could always rely on the Prometheus component being present.

Now, the `gateways` command starts a port-forward to each service mirror component (for each linked cluster) and queries the `/metrics` endpoint for the `gateway_alive` and `gateway_latency` metrics. It then queries the local cluster for the number of mirror services that correspond to the target cluster of that service mirror. Using these three data points, it creates the output table for the command.
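
A rough sketch of the metrics lookup, assuming the `/metrics` body is in the Prometheus text format. The `parseGauges` helper and the `target_cluster_name` label are assumptions made for illustration; only the `gateway_alive` and `gateway_latency` metric names come from this change. The sketch ignores optional sample timestamps and keeps only the last sample seen per metric name.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample is a hand-written example body; the label name is hypothetical.
const sample = `# TYPE gateway_alive gauge
gateway_alive{target_cluster_name="k3d-x"} 1
# TYPE gateway_latency gauge
gateway_latency{target_cluster_name="k3d-x"} 2
`

// parseGauges scans a text-format metrics body and returns the value of
// each requested metric name. Comment lines and unparsable lines are skipped.
func parseGauges(body string, names ...string) map[string]float64 {
	wanted := make(map[string]bool, len(names))
	for _, n := range names {
		wanted[n] = true
	}
	out := make(map[string]float64)
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		idx := strings.LastIndex(line, " ")
		if idx < 0 {
			continue
		}
		series, raw := line[:idx], line[idx+1:]
		name := series
		if brace := strings.Index(series, "{"); brace >= 0 {
			name = series[:brace]
		}
		if !wanted[name] {
			continue
		}
		value, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			continue
		}
		out[name] = value
	}
	return out
}

func main() {
	gauges := parseGauges(sample, "gateway_alive", "gateway_latency")
	fmt.Println(gauges["gateway_alive"] == 1, gauges["gateway_latency"])
}
```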

### Output changes

Currently the `gateways` command displays the P50, P95, and P99 latencies for each gateway

```shell
$ linkerd multicluster gateways 
CLUSTER  ALIVE    NUM_SVC  LATENCY_P50  LATENCY_P95  LATENCY_P99  
k3d-x    True           1          1ms          3ms          3ms  
k3d-z    True           0          1ms          3ms          3ms
```

With this change, we now just show the last observed latency. This involved adding the `gateway_latency` metric Gauge — different from the current latencies Observer.

```shell
$ linkerd multicluster gateways
CLUSTER  ALIVE    NUM_SVC      LATENCY  
k3d-x    True           1          2ms  
k3d-z    True           0          3ms
```

This is because I have not found a Prometheus common library for taking the parsed metrics from `/metrics` and turning that into a histogram yet; I think we should be able to do this but I'm leaving as a follow-up for now.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-31 13:51:38 -06:00
Alex Leong 8038aa24d6
linkerd upgrade should require CRDs be installed (#8413)
Fixes #8373

We update `CheckCustomResourceDefinitions` so that it not only checks for the existence of the CRDs, but also ensures that they contain the latest version of each CRD.  Note that this means that we'll need to keep this list of CRD versions in `CheckCustomResourceDefinitions` in sync with the actual CRD versions in the templates.  We also add this check to `linkerd upgrade` when the `--crds` flag is not provided.  This means that users who are upgrading will be required to run `linkerd upgrade --crds` first if they don't have the latest version of any of the CRDs.

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-05-16 10:25:46 -07:00
Alex Leong 893fa78671
Split HA functionality into multiple configurable values (#8445)
Some autoscalers, namely Karpenter, don't allow podAntiAffinity and the enablePodAntiAffinity flag is
currently overloaded with other HA requirements. This commit splits out the PDB and updateStrategy
configuration into separate value inputs.

Fixes #8062

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Evan Hines <evan@firebolt.io>
2022-05-10 09:49:58 -07:00
Oliver Gould 33c1d610ad
test: Diff structured YAML when possible (#8432)
When we compare generated manifests against fixtures, we do a simple
string comparison to compare output. The diffed data can be pretty hard
to understand.

This change adds a new test helper, `DiffTestYAML` that parses strings
as arbitrary YAML data structures and uses `deep.Equal` to generate a
diff of the datastructures.

Now, when a test fails, we'll get output like:

```
install_test.go:244: YAML mismatches install_output.golden:
	slice[32].map[spec].map[template].map[spec].map[containers].slice[3].map[image]: PolicyControllerImageName:PolicyControllerVersion != SomeOtherImage:PolicyControllerVersion
```

While testing this, it became apparent that several of our generated
golden files were not actually valid YAML, due to the `LinkerdVersion`
value being unset. This has been fixed.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-10 08:40:29 -07:00
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00
Oliver Gould 7d1e4a6953
refactor: Split CRD & Control Plane upgrade logic (#8423)
This change follows on 4f3c374, which split the install logic for CRDs
and the core control plane, by splitting the upgrade logic for the CRDs
and the core control plane.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-04 16:11:48 -07:00
Matei David 3112b85b6a
Introduce file watch to CNI installer (#8299)
Introduce fs watch for cni installer

Our CNI installer script is prone to race conditions, especially when a
node is rebooted, or restarted. Order of configuration should not matter
and our CNI plugin should attach to other plugins (i.e chain to them) or
run standalone when applicable. In order to be more flexible, we
introduce a filesystem watcher through inotifywait to react to changes
in the cni config directory. We react to changes based on SHAs.

Linkerd's CNI plugin should append configuration when at least one other
file exists, but if multiple files exist, the CNI plugin should not have
to make a decision on whether that's the current file to append itself
to. As a result, most of the logic in this commit revolves around the
assumption that whatever file we detect has been created should be
injected with Linkerd's config -- the rest is up to the host.

In addition, we introduce a sleep in the cni preStop hook, switch to
using bash, and add procps to get access to ps and pgrep.

Closes #8070

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2022-05-04 12:03:59 +01:00
Oliver Gould 4f3c374bb7
refactor: Split CRD and control-plane installation (#8401)
We currently have singular `install` and `render` functions, each of
which takes a `crds` bool that completely alters the behavior of the
function. This change splits this behavior into distinct functions so
we have `installCRDs`/`renderCRDs` and `installControlPlane`/
`renderControlPlane`.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-03 17:21:27 -07:00
Alex Leong 820fac758c
Fix panic in install --ignore-cluster (#8377)
Fixes #8364 

When `linkerd install` is called with the `--ignore-cluster` flag, we pass `nil` for the `k8sAPI`.  This causes a panic when using this client for validation.  We add a conditional so that we skip this validation when the `k8sAPI` is `nil`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-05-02 12:06:48 -07:00
Alejandro Pedraza 0238950868
edge-22.4.1 change notes (#8362)
## edge-22.4.1

In order to support having custom resources in the default Linkerd installation,
the CLI install flow is now always a 2-step process where `linkerd install
--crds` must be run first to install CRDs only and then `linkerd install` is run
to install everything else. This more closely aligns the CLI install flow with
the Helm install flow where the CRDs are a separate chart. This also applies to
`linkerd upgrade`. Also, the `config` and `control-plane` sub-commands have been
removed from both `linkerd install` and `linkerd upgrade`.

On the proxy side, this release fixes an issue where proxies would not honor the
cluster's opaqueness settings for non-pod/service addresses. This could cause
protocol detection to be performed, for instance, when using off-cluster
databases.

This release also disables the use of regexes in Linkerd log filters (i.e., as
set by `LINKERD2_PROXY_LOG`). Malformed log directives could, in theory, cause a
proxy to stop responding.

The `helm.sh/chart` label in some of the CRDs had its formatting fixed, which
avoids issues when installing/upgrading through external tools that make use of
it, such as recent versions of Flux.

* Added `--crds` flag to install/upgrade and remove config/control-plane stages
* Allowed the `AuthorizationPolicy` CRD to have an empty
  `requiredAuthenticationRefs` entry that allows all traffic
* Introduced `nodeAffinity` config in all the charts for enhanced control on the
  pods scheduling (thanks @michalrom089!)
* Introduced `resources`, `nodeSelector` and `tolerations` configs in the
  `linkerd-multicluster-link` chart for enhanced control on the service mirror
  deployment (thanks @utay!)
* Fixed formatting of the `helm.sh/chart` label in CRDs
* Updated container base images from buster to bullseye
* Added support for spaces in the `config.linkerd.io/opaque-ports` annotation
2022-04-28 13:43:11 -05:00
Alex Leong 6762dd28ac
Add --crds flag to install/upgrade and remove config/control-plane stages (#8251)
Fixes: #8173 

In order to support having custom resources in the default Linkerd installation, it is necessary to add a separate install step to install CRDs before the core install.  The Linkerd Helm charts already accomplish this by having CRDs in a separate chart.

We add this functionality to the CLI by adding a `--crds` flag to `linkerd install` and `linkerd upgrade` which outputs manifests for the CRDs only and remove the CRD manifests when the `--crds` flag is not set.  To avoid a compounding of complexity, we remove the `config` and `control-plane` stages from install/upgrade.  The effect of this is that we drop support for splitting up an install by privilege level (cluster admin vs Linkerd admin).

The Linkerd install flow is now always a 2-step process where `linkerd install --crds` must be run first to install CRDs only and then `linkerd install` is run to install everything else.  This more closely aligns the CLI install flow with the Helm install flow where the CRDs are a separate chart.  Attempting to run `linkerd install` before the CRDs are installed will result in a helpful error message.

Similarly, upgrade is also a 2-step process of `linkerd upgrade --crds` followed by `linkerd upgrade`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-04-28 09:36:14 -07:00
Michał Romanowski 88b8da50d2
Introduce node affinity support for linkerd pods (#8137)
In order to restrict pods to run only on arbitrarily chosen nodes, affinities
or tolerations can be used. Currently, Linkerd only supports tolerations,
which are applied to pods and allow them to be scheduled on nodes with
matching "taints".

Certain environments and workflows lean more towards affinity instead of
tolerations to determine preferred or required scheduling. This change
introduces a new "nodeAffinity" field so that users may specify affinity
rules for scheduling Linkerd pods.

Closes #8136

Signed-off-by: Michal Romanowski <michal.rom089@gmail.com>
2022-04-15 11:24:16 +01:00
Kevin Leimkuhler bb8737b912
Add change notes for `edge-22.3.5` (#8182)
This edge release introduces new policy CRDs that allow for more generalized
authorization policies.

The `AuthorizationPolicy` CRD authorizes clients that satisfy all the required
authentications to communicate with the Linkerd `Server` that it targets.
Required authentications are specified through the new `MeshTLSAuthentication`
and `NetworkAuthentication` CRDs.

A `MeshTLSAuthentication` defines a list of authenticated client IDs—specified
directly by proxy identity strings or referencing resources such as
`ServiceAccount`s.

A `NetworkAuthentication` defines a list of client networks that will be
authenticated.

Additionally, to support the new CRDs, policy-related labels have been changed
to better categorize policy metrics. A `srv_kind` label has been introduced
which splits the current `srv_name` value—formatted as `kind:name`—into separate
labels. The `saz_name` label has been removed and is replaced by the new
`authz_kind` and `authz_name` labels.

* Introduced the `srv_kind` label which allowed splitting the value of the
  current `srv_name` label
* Removed the `saz_name` label and replaced it with the new `authz_kind` and
  `authz_name` labels
* Fixed an issue in the destination controller where an update would not be sent
  after an endpoint was discovered for a currently empty service
* Introduced the following custom resource types to support generalized
  authorization policies: `AuthorizationPolicy`, `MeshTLSAuthentication`,
  `NetworkAuthentication`
* Deprecated the `--proxy-version` flag (thanks @importhuman!)
* Updated linkerd-viz to use new policy CRDs

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-31 15:59:32 -06:00
Ujjwal Goyal 786c9cf14a
cli: Deprecate proxy-version flag (#8027)
Fixes #7939 

Signed-off-by: Ujjwal Goyal <importujjwal@gmail.com>

Co-authored-by: Matei David <matei.david.35@gmail.com>
2022-03-30 14:44:19 -07:00
Oliver Gould c1a1430d1a
Introduce AuthorizationPolicy CRDs (#8007)
Issue #7709 proposes new Custom Resource types to support generalized
authorization policies:

- `AuthorizationPolicy`
- `MeshTLSAuthentication`
- `NetworkAuthentication`

This change introduces these CRDs to the default linkerd installation
(via the `linkerd-crds` chart) and updates the policy controller
to handle these resource types. The policy admission controller
validates that these resources reference only supported types.

This new functionality is tested at multiple levels:

* `linkerd-policy-controller-k8s-index` includes unit tests for the
  indexer to test how events update the index;
* `linkerd-policy-test` includes integration tests that run in-cluster
  to validate that the gRPC API updates as resources are manipulated;
* `linkerd-policy-test` includes integration tests that exercise the
  admission controller's resource validation; and
* `linkerd-policy-test` includes integration tests that ensure that
  proxies honor authorization resources.

This change does NOT update Linkerd's control plane and extensions to
use these new authorization primitives. Furthermore, the `linkerd` CLI
does not yet support inspecting these new resource types. These
enhancements will be made in followup changes.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-03-30 12:26:45 -07:00
Oliver Gould 00954d71c6
policy: Add end-to-end ServerAuthorization tests (#8155)
In preparation for new policy CRD resources, this change adds end-to-end
tests to validate policy enforcement for `ServerAuthorization`
resources.

In adding these tests, it became clear that the OpenAPI validation for
`ServerAuthorization` resources is too strict. Various `oneof`
constraints have been removed in favor of admission controller
validation. These changes are semantically compatible and do not
necessitate an API version change.

The end-to-end tests work by creating `curl` pods that call an `nginx`
pod. In order to test network policies, the `curl` pod may be created
before the nginx pod, in which case an init container blocks execution
until a `curl-lock` configmap is deleted from the cluster. If the
configmap is not present to begin with, no blocking occurs.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-03-28 14:03:24 -07:00
Alex Leong 47105d5eb6
edge-22.3.4 (#8141)
* Disabled pprof endpoints on Linkerd control plane components by default
* Fixed an issue where mirror service endpoints of headless services were always
  ready regardless of gateway liveness
* Added server side validation for ServerAuthorization resources
* Fixed an "origin not allowed" issue when using the latest Grafana with the
  Linkerd Viz extension

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-03-24 13:44:49 -07:00
Kevin Leimkuhler 388f14f48f
allow pprof to be configurable via helm flags (#8090)
Follow-up to #8087 that allows pprof to be enabled via the `--set
enablePprof=true` flag.

Each control plane component spawns its own admin server, so each of these
receives its own `enable-pprof` flag. When `enablePprof=true`, it is passed
through to each component so that when it launches its admin server, its pprof
endpoints are enabled.

A note on the templating: `-enable-pprof={{.Values.enablePprof | default
false}}`. `false` values are not rendered by Helm so without the `... | default
false}}`, it tries to pass the flag as `-enable-pprof=""` which results in an
error. Inlining this felt better than conditionally passing the flag with

```yaml
{{ if .Values.enablePprof -}}
-enable-pprof={{.Values.enablePprof}}
{{ end -}}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-22 14:31:04 -06:00
Oliver Gould c445c72d61
policy: Validate ServerAuthorization resources (#8076)
`ServerAuthorization` resources are not currently validated by the
admission controller.

This change enables validation for `ServerAuthorization` resources,
based on changes to the admission controller proposed as a part of
linkerd/linkerd2#8007. This admission controller is generalized to
support arbitrary resource types. The `ServerAuthorization` validation
currently only ensures that network blocks are valid CIDRs and that they
are coherent. We use the new _schemars_ feature of `ipnet` v2.4.0 to
support using IpNet data structures directly in the custom resource
type bindings.

This change also adds an integration test to validate that the admission
controller behaves as expected.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-03-16 14:03:21 -07:00
Kevin Leimkuhler d5e58f214d
add changes for edge-22.3.3 (#8072)
This edge release ensures that in multicluster installations, mirror service
endpoints have their readiness tied to gateway liveness. When the gateway for a
target cluster is not alive, the endpoints that point to it on a source cluster
will properly indicate that they are not ready.

* Fixed tap controller logging errors that were susceptible to log forgery by
  ensuring special characters are escaped
* Fixed issue where mirror service endpoints were always ready regardless of
  gateway liveness
* Removed unused `namespace` entry in `linkerd-control-plane` chart

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-15 12:44:30 -07:00
Matei David a7b8a5b66b
edge-22.3.2 (#8048)
* edge-22.3.2

This edge release includes a few fixes and quality of life improvements. An
issue has been fixed in the proxy allowing HTTP Upgrade requests to work
through multi-cluster gateways, and the init container's resource limits and
requests have been revised. Additionally, more Go linters have been enabled and
improvements have been made to the devcontainer.

* Changed `linkerd-init` resource (CPU/memory) limits and requests to ensure by
  default the init container does not break a pod's `Guaranteed` QOS class
* Added a new check condition to skip pods whose status is `NodeShutdown`
  during validation as they will not have a proxy container
* Fixed an issue that would prevent proxies from sending HTTP Upgrade requests
  (used in websockets) through multi-cluster gateways

Signed-off-by: Matei David <matei@buoyant.io>

Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2022-03-11 17:26:22 +00:00
Kevin Leimkuhler fc2032fb8e
enable `staticcheck` (#8037)
Closes #7881 

This makes the rest of the necessary fixes to satisfy the `staticcheck` lint.

The only class of lints that are being skipped are those related to deprecated tap code. There was some discussion on the original change started by @adleong about whether this is _actually_ deprecated [here](https://github.com/linkerd/linkerd2/pull/3240#discussion_r313634584); it doesn't look like we ever came back around to fully removing it, but I don't think it should be a blocker for enabling the lint right now.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-10 15:43:35 -08:00
Kevin Leimkuhler 3222778191
Match linkerd-init CPU/memory requests/limits (#7989)
Closes #7980 

A pod is considered `Burstable` instead of `Guaranteed` if there exists at least one container in the pod that specifies CPU/memory limits/requests that do not match.

The `linkerd-init` container falls into this category meaning that even if all other containers in a Pod have matching CPU/memory limits/requests, the Pod will not be considered `Guaranteed` because of `linkerd-init`'s hardcoded values.

This changes the values to match, meaning that `linkerd-init` will not be the culprit container if a Pod is not considered `Guaranteed`. Raising the requests—instead of lowering the limits—felt like the safer option here. This means that the container will now always be guaranteed these amounts _and_ will never use more.

[Docs](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed) explain this in more detail.
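
A simplified sketch of the rule, for illustration only. The `resources` type and `isGuaranteed` function are hypothetical, the values in `main` are made up rather than `linkerd-init`'s real settings, and the comparison is on raw strings, so equivalent quantities written differently (or requests defaulted from limits by Kubernetes) are not handled.

```go
package main

import "fmt"

// resources holds a container's CPU/memory requests and limits as the
// raw strings from the manifest (for example "100m" or "20Mi").
type resources struct {
	cpuRequest, cpuLimit string
	memRequest, memLimit string
}

// matches reports whether both limits are set and each request equals
// its limit.
func (r resources) matches() bool {
	return r.cpuLimit != "" && r.memLimit != "" &&
		r.cpuRequest == r.cpuLimit && r.memRequest == r.memLimit
}

// isGuaranteed is a simplified version of the QoS rule: every container
// in the pod, init containers included, must have matching values.
func isGuaranteed(containers []resources) bool {
	if len(containers) == 0 {
		return false
	}
	for _, c := range containers {
		if !c.matches() {
			return false
		}
	}
	return true
}

func main() {
	app := resources{cpuRequest: "1", cpuLimit: "1", memRequest: "1Gi", memLimit: "1Gi"}
	// Illustrative values only: an init container whose requests are
	// lower than its limits drags the whole pod down to Burstable.
	mismatched := resources{cpuRequest: "10m", cpuLimit: "100m", memRequest: "10Mi", memLimit: "50Mi"}
	matched := resources{cpuRequest: "100m", cpuLimit: "100m", memRequest: "50Mi", memLimit: "50Mi"}

	fmt.Println(isGuaranteed([]resources{app, mismatched}))
	fmt.Println(isGuaranteed([]resources{app, matched}))
}
```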

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-08 15:30:03 -07:00
cpretzer 2065e817fa
Changes for edge-22.3.1 (#8002)
## edge-22.3.1

This edge release includes updates to dependencies, CI, and rust 1.59.0. It also
includes changes to the `linkerd-jaeger` chart to ensure that namespace labels
are preserved and adds support for `imagePullSecrets`, along with improvements
to the multicluster and policy functionality.

* Added note to `multicluster link` command to clarify that the link is
  one-direction
* Introduced `imagePullSecrets` to Jaeger Helm chart
* Updated Rust to v1.59.0
* Fixed a bug where labels can be overwritten in the `linkerd-jaeger` chart
* Fixed broken mirrored headless services after `repairEndpoints` runs
* Updated `Server` CRD to handle an empty `PodSelector`

Signed-off-by: Charles Pretzer <charles@buoyant.io>
2022-03-03 14:00:11 -07:00
Kevin Leimkuhler 67bcd8f642
Add `gosec` and `errcheck` lints (#7954)
Closes #7826

This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.

A significant number of these lints have been excluded by the various `exclude-rules` rules added to `.golangci.yml`. These include operations on files that generally do not fail such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.

Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.

Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-03 10:09:51 -07:00
Oliver Gould d4543cd86e
policy: Use a `kubert::Runtime` (#7961)
`kubert` provides a runtime utility that helps reduce boilerplate around
process lifecycle management, construction of admin and HTTPS servers,
etc.

The admission controller server preserves the certificate reloading
functionality introduced in 96131b5 and updates the utility to read both
RSA and PKCS8 keys to close #7963.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-03-01 16:29:32 -08:00
Kevin Leimkuhler af34bbd017
Add changes for edge-22.2.4 (#7962)
## edge-22.2.4

 This edge release continues to address several security related lints and
 ensures they are checked by CI.

 * Add `linkerd check` warning for clusters that cannot verify their
   `clusterNetworks` due to Nodes missing the `podCIDR` field
 * Changed `Server` CRD to allow having an empty `PodSelector`
 * Modified `linkerd inject` to only support `https` URLs to mitigate security
   risks
 * Fixed potential goroutine leak in the port forwarding used by several CLI
   commands and control plane components
 * Fixed timeouts in the policy validator which could lead to failures if
   `failurePolicy` was set to `Fail`

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-02-24 18:48:22 -07:00
Alex Leong 2a0084c6ac
Replace time.After with time.NewTimer to avoid memory leaks (#7956)
The timer created by calling `time.After` is not cleaned up until the timer fires.  Repeated calls to `time.After` in a loop create multiple timers which can accumulate if the loop runs faster than the timeout.

We replace `time.After` with `time.NewTimer` and explicitly call `Stop` when we are finished to clean up the timer.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-02-24 15:34:52 -08:00
Oliver Gould 27d89a8e3a
cli: Only allow HTTPS URLs with `inject` (#7940)
The CLI may access manifests over insecure channels, potentially
allowing MITM-attacks to run arbitrary code.

This change modifies `inject` to only support `https` URLs to mitigate
this risk.

This change addresses a security review finding (`TOB-LKDTM-4`).
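
For illustration, a scheme check of this kind could look like the sketch below. The `checkManifestURL` helper is hypothetical and assumes the argument has already been identified as a URL rather than a local file path.

```go
package main

import (
	"fmt"
	"net/url"
)

// checkManifestURL rejects any URL whose scheme is not https.
func checkManifestURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid URL %q: %w", raw, err)
	}
	if u.Scheme != "https" {
		return fmt.Errorf("only https URLs are supported, got scheme %q", u.Scheme)
	}
	return nil
}

func main() {
	fmt.Println(checkManifestURL("https://example.com/deploy.yml"))
	fmt.Println(checkManifestURL("http://example.com/deploy.yml"))
}
```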

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-23 09:26:27 -08:00
Alejandro Pedraza a268ff11c9
Allow `Server` CRD to have empty `PodSelector` (#7925)
Fixes #7904

Allow the `Server` CRD to have the `PodSelector` entry be an empty object, by removing the `omitempty` tag from its Go type definition and the `oneof` section in the CRD. No update to the CRD version is required, as this is a backward-compatible change; overriding the CRD was tested and worked fine.

Also added some unit tests to confirm podSelector conditions are ANDed, and some minor refactorings in the `Selector` constructors.

Co-authored-by: Oliver Gould <ver@buoyant.io>
2022-02-23 13:45:34 +00:00
Oliver Gould 425a43def5
Enable gocritic linting (#7906)
[gocritic][gc] helps to enforce some consistency and check for potential
errors. This change applies linting changes and enables gocritic via
golangci-lint.

[gc]: https://github.com/go-critic/go-critic

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-17 22:45:25 +00:00
Alex Leong 2a4c84db3e
edge-22.2.3 (#7911)
This edge release fixes some `Instant`-related proxy panics that occur on Amazon
Linux. It also includes many behind the scenes improvements to the project's
CI and linting.

* Removed the `--controller-image-version` install flag to simplify the way that
  image versions are handled. The controller image version can be set using the
  `--set linkerdVersion` flag or Helm value
* Lowercased logs and removed redundant lines from the Linkerd2 proxy init
  container
* Prevented the proxy from logging spurious errors when its pod does not define
  any container ports
* Added workarounds to reduce the likelihood of `Instant`-related proxy panics
  that occur on Amazon Linux

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-02-17 13:51:08 -08:00
Matei David 0d59864033
Remove usage of controllerImageVersion values field (#7883)
Remove usage of controllerImageVersion values field

This change removes the unused `controllerImageVersion` field, first
from the tests, and then from the actual chart values structure. Note
that at this point in time, it is impossible to use
`--controller-image-version` through Helm, yet it still seems to be
working for the CLI.

* We configure the charts to use `linkerdVersionValue` instead of
  `controlPlaneImageVersion` (or default to it where appropriate).
* We add the stringslicevar flag (i.e. `--set`) to the flagset we use in
  upgrade tests. This means instead of testing value overrides through a
  dedicated flag, we can now make use of `--set` in upgrade tests. We
  first set the linkerdVersionValue in the install option and then
  override the policy controller image version and the linkerd
  controller image version to test flags work as expected.
* We remove hardcoded values from the healthcheck test.
* We remove the field from the chart values struct.

Signed-off-by: Matei David <matei@buoyant.io>
2022-02-17 15:19:08 +00:00
Matei David 3606972bac
Bump linkerd2-proxy-init to v1.5.3 (#7899)
* Bump linkerd2-proxy-init to v1.5.3

Signed-off-by: Matei David <matei@buoyant.io>
2022-02-17 12:40:48 +00:00
Oliver Gould f5876c2a98
go: Enable `errorlint` checking (#7885)
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection are wrapping-aware.

This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
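
As a rough illustration of what wrapping-aware handling means, the sketch below wraps with `%w` and inspects with `errors.Is` and `errors.As` rather than `==` or a type assertion. The `lookup` function, `errNotFound`, and `lookupError` are hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error.
var errNotFound = errors.New("not found")

// lookupError is a hypothetical typed error that carries the failing key.
type lookupError struct {
	Key string
	Err error
}

func (e *lookupError) Error() string { return fmt.Sprintf("key %q: %v", e.Key, e.Err) }
func (e *lookupError) Unwrap() error { return e.Err }

// lookup always fails, wrapping the typed error with %w so that callers can
// still reach the underlying errors.
func lookup(key string) error {
	return fmt.Errorf("lookup failed: %w", &lookupError{Key: key, Err: errNotFound})
}

// isNotFound uses errors.Is, which walks the wrap chain; a plain
// `err == errNotFound` comparison would be false here.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

// failedKey uses errors.As instead of a type assertion, which would fail
// because the outermost error is the fmt.Errorf wrapper.
func failedKey(err error) (string, bool) {
	var le *lookupError
	if errors.As(err, &le) {
		return le.Key, true
	}
	return "", false
}

func main() {
	err := lookup("server-a")
	fmt.Println(err == errNotFound) // prints "false"
	fmt.Println(isNotFound(err))    // prints "true"
	key, ok := failedKey(err)
	fmt.Println(key, ok) // prints "server-a true"
}
```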

[el]: https://github.com/polyfloyd/go-errorlint

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-16 18:32:19 -07:00
Kevin Leimkuhler 9342b0e98d
Always render `LINKERD2_PROXY_INBOUND_PORTS` env var in install/inject output (#7893)
Closes #7816.

With this change, the `LINKERD2_PROXY_INBOUND_PORTS` env var is always rendered in install/inject output. This means that if a workload does not expose any ports, then the env var is rendered as the empty string.

Coupled with linkerd/linkerd2-proxy#1478, no error is printed upon proxy startup.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-02-16 17:20:21 +00:00
Alejandro Pedraza df311fd8ca
Edge-22.2.2 change notes (#7860)
* Edge-22.2.2 change notes

## edge-22.2.2

This edge release updates the jaeger extension to be available on ARM
architectures as well, and applies some security-oriented amendments.

* Upgraded jaeger and the opentelemetry-collector to their latest versions,
  which now support ARM architectures
* Fixed `linkerd multicluster check` which was reporting false warnings
* Started enforcing TLS v1.2 as a minimum in the webhook servers
* Had the identity controller emit SHA256 certificate fingerprints in its
  logs/events, instead of MD5
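
The last two items can be sketched in Go with the standard library. This is only an illustration of the techniques named in the notes; `webhookTLSConfig` and `fingerprint` are hypothetical helpers, and the hex output format is an assumption.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"fmt"
)

// webhookTLSConfig is a hypothetical helper returning a server TLS config
// that refuses handshakes below TLS v1.2.
func webhookTLSConfig() *tls.Config {
	return &tls.Config{MinVersion: tls.VersionTLS12}
}

// fingerprint is a hypothetical helper returning the hex-encoded SHA-256
// digest of a certificate's DER bytes.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

func main() {
	cfg := webhookTLSConfig()
	fmt.Println(cfg.MinVersion == tls.VersionTLS12) // prints "true"

	// Placeholder bytes stand in for a real certificate's DER encoding.
	fmt.Println(len(fingerprint([]byte("placeholder")))) // prints "64"
}
```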
2022-02-10 18:06:23 -05:00
Oliver Gould 863a51c2af
identity: Document use of `InsecureSkipVerify` (#7835)
The CLI's diagnostic command that dumps a proxy's certificate
information does not (and should not) verify the proxy's certificate.

This change documents why verification is disabled.
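
A documented use of `InsecureSkipVerify` might read like the following sketch. The helper name, the comment wording, and the `//nolint:gosec` directive are assumptions for illustration and are not copied from the change.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// diagnosticTLSConfig is a hypothetical helper that builds the TLS config
// for a client that only dumps the certificate a proxy presents.
func diagnosticTLSConfig(serverName string) *tls.Config {
	return &tls.Config{
		ServerName: serverName,
		// Verification is deliberately disabled: this connection is used
		// only to read and print the peer's certificate, and is assumed not
		// to be trusted for anything else.
		InsecureSkipVerify: true, //nolint:gosec
	}
}

func main() {
	cfg := diagnosticTLSConfig("proxy.example.svc")
	fmt.Println(cfg.InsecureSkipVerify) // prints "true"
}
```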

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-07 19:16:24 -08:00
Oliver Gould 9f659145a7
Split policy CRDs into separate files (#7787)
Currently both the `Server` and `ServerAuthorization` CRDs are defined
in a single file. As additional CRDs are introduced, this becomes
unwieldy to navigate.

This change splits `policy-crd.yaml` into `policy/server.yaml` and
`policy/serverauthorization.yaml`. It also renames
`serviceprofile-crd.yaml` to `serviceprofile.yaml` (since it's already
under the `linkerd-crds` chart).

No functional changes.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-04 14:04:41 -08:00
Kevin Leimkuhler e79bd72dbd
Add 2 minutes linkerd-await timeout (#7778)
If the proxy doesn't become ready, `linkerd-await` never succeeds
and the proxy's logs don't become accessible.

This change adds a default 2-minute timeout so that pod startup
continues despite the proxy failing to become ready. `linkerd-await`
fails and `kubectl` will report that a post start hook failed.
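
The bounded wait described above can be sketched in Go as a readiness poll with a deadline. This illustrates the idea only and is not how `linkerd-await` itself is implemented; `awaitReady`, its parameters, and `errNotReady` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error returned when the deadline passes.
var errNotReady = errors.New("proxy did not become ready before the timeout")

// awaitReady polls ready every interval and gives up after timeout, so the
// caller can fail and let startup continue instead of blocking forever.
func awaitReady(timeout, interval time.Duration, ready func() bool) error {
	deadline := time.NewTimer(timeout)
	defer deadline.Stop()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		if ready() {
			return nil
		}
		select {
		case <-deadline.C:
			return errNotReady
		case <-ticker.C:
			// Poll again on the next loop iteration.
		}
	}
}

func main() {
	// A check that never succeeds stands in for a proxy that fails to start.
	// A short timeout is used here in place of the 2-minute default.
	err := awaitReady(100*time.Millisecond, 10*time.Millisecond, func() bool { return false })
	fmt.Println(errors.Is(err, errNotReady)) // prints "true"
}
```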

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-02-03 17:23:06 -08:00