Long ago, the policy-controller image shipped with a distroless base image, but
we have since been able to remove all runtime dependencies and ship with a
scratch image. There's no reason to manage this binary separately from the rest
of the controller.
This change moves the controller/Dockerfile to Dockerfile.controller, and it is
updated to subsume the policy-controller/Dockerfile.
This should *not* impact users, except to reduce the overhead of extra image
pulls.
BREAKING CHANGE: with this change, we no longer ship a separate
policy-controller image.
* chore(helm)!: change iptables default mode to `nft`
This change sets `nft` as the new default for the `proxyInit.iptablesMode`
and `iptablesMode` values in the linkerd-control-plane and linkerd2-cni
helm charts. This doesn't imply any change in user-facing behavior.
This was prompted by EKS with k8s 1.33 no longer supporting the iptables
legacy mode.
Further testing on multiple platforms with different k8s versions revealed that nft mode is now broadly supported.
Upgrading via Helm will apply the new default, unless the initial
install explicitly set the legacy mode.
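For reference, the new defaults expressed as values overrides (set `legacy` instead if an existing install depends on the old mode):
```yaml
# linkerd-control-plane values
proxyInit:
  iptablesMode: nft   # new default; use "legacy" to keep the previous behavior
# linkerd2-cni values
iptablesMode: nft     # new default here as well
```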
Handling whitespace in helm templates is far from trivial!
e82bac8bd9 first syncs all the golden files with the current manifest templates. Given that these golden files are compared _semantically_ and not _textually_, differences in non-significant content like empty lines don't generate test failures, and accidental inclusion of empty lines won't trigger updates of golden files via `go test ./... -update`. So this syncing required some manual work.
5504b8e762 then makes the following updates to the manifests, and updates the golden files again accordingly:
* Empty line after the `nodeSelector`:
Caused by the `linkerd.affinity` partial template after it, which might produce no output at all, so including this partial might or might not require a trailing newline. We wrap it in a `with` block to address that (see the sketch after this list).
* Empty line after the `LINKERD2_PROXY_OUTBOUND_METRICS_HOSTNAME_LABELS` env var:
Caused by the comment `{{- /* Configure inbound and outbound parameters, e.g. for HTTP/2 servers. */}}`. The whitespace suppression should go at the end instead, to align with the other suppression operators surrounding this comment.
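As an illustration of the `with` wrapping mentioned above, a sketch of the pattern (indentation handling omitted; the real template may differ) — the body, and its surrounding newline, is only emitted when the partial actually produces output:
```
{{- with include "linkerd.affinity" . }}
{{ . }}
{{- end }}
```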
There are other instances of whitespace lines that this didn't address.
Previously, configuring tracing required the `linkerd-jaeger` extension to be installed. This comes with its own set of issues, namely managing yet another component of Linkerd.
Since the tracing config is basically just environment variables and one volume mount, hoisting them up into the main control plane helm chart for the proxy injector to handle is likely the simplest and most maintainable path going forward.
The injector from the `linkerd-jaeger` extension will still work for the time being, and these configs are interoperable with it to some degree as long as the injector in the extension is disabled. A follow-up to this PR should add documentation around this and how users can migrate from the extension to these configs.
Signed-off-by: Scott Fleener <scott@buoyant.io>
Kubernetes 1.33 introduced a warning for duplicate port names across
containers in the same Pod. This change renames several container ports
in the Linkerd control-plane and CNI charts to ensure unique identifiers
across all containers, eliminating these warnings.
- destination:
  - grpc → dest-grpc
  - admin-http → dest-admin
- sp-validator:
  - admin-http → spval-admin
- policy-controller:
  - grpc → policy-grpc
  - admin-http → policy-admin
- identity:
  - grpc → ident-grpc
  - admin-http → ident-admin
- proxy-injector:
  - admin-http → injector-admin
- linkerd2-cni:
  - admin-http → repair-admin
BREAKING CHANGE: Consumers must update any references to the renamed ports in
Service definitions, probes, monitoring rules, and related configuration.
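For example, a Service (or monitoring scrape config) that referenced a renamed container port by name needs the new identifier; a hypothetical before/after for the policy controller (the port number is illustrative):
```yaml
# Before
ports:
- name: grpc
  port: 8090
  targetPort: grpc
# After: the container port is now named policy-grpc
ports:
- name: grpc
  port: 8090
  targetPort: policy-grpc
```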
* Add pod IP via downward API to trace attributes
Provide additional attributes for tracing associations
Modified the helm templates to add the pod IP, and the jaeger injector to add it to the
standard attributes
Deploying should show the IP as a trace attribute
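A sketch of the downward-API wiring in the injected proxy container (the env var name is illustrative, not necessarily the one the injector uses):
```yaml
env:
- name: _pod_ip              # illustrative name; surfaced as a trace attribute
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
```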
Fixes #13980
Signed-off-by: Justin <justin@sphinxdefense.com>
* fix: Update proxy-injector controller test goldens
These aren't updated automatically by the `go test ./cli/cmd/... --update` command, so they have to be updated manually.
Signed-off-by: Scott Fleener <scott@buoyant.io>
---------
Signed-off-by: Justin <justin@sphinxdefense.com>
Signed-off-by: Scott Fleener <scott@buoyant.io>
Co-authored-by: Scott Fleener <scott@buoyant.io>
Now that the default tracing protocol is OpenTelemetry, this changes the default port for traces to the OpenTelemetry port on the collector instead of the OpenCensus one.
The current default port, combined with the default trace protocol of OpenTelemetry, is broken: it causes traces to be sent to a collector port that expects OpenCensus traces. This is technically a breaking change, but it is less breaking than the change of the default protocol to OpenTelemetry.
More explicitly, if a user only used the defaults, this change brings them from a broken state to a working state. If a user brings their own tracing infrastructure with a custom collector address, this change doesn't affect them at all. The only users that may be broken by this are ones that explicitly set the protocol to OpenCensus, but we expect this to be rare as OpenCensus as a protocol has been sunset for a few years now.
To avoid getting into a bad state, we add the following checks to `linkerd install --crds`:
* If the GatewayAPI CRDs are not present on the cluster and `installGatewayAPI` is false, then we report an error that the gateway API is a requirement for Linkerd
* If the GatewayAPI CRDs are present on the cluster and not installed by Linkerd but `installGatewayAPI` is true, then we report an error that this would cause a conflict
Linkerd proxies no longer emit `hostname` labels for outbound policy metrics (due to their potential for high cardinality).
This change adds Helm templates and annotations to control this behavior, allowing users to opt-in to these outbound hostname labels.
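A hedged sketch of the opt-in via values (the key name is an assumption, not taken from this change; it ultimately drives the `LINKERD2_PROXY_OUTBOUND_METRICS_HOSTNAME_LABELS` env var mentioned elsewhere in these notes):
```yaml
proxy:
  metrics:
    hostnameLabels: true   # assumed key name
```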
Signed-off-by: Scott Fleener <scott@buoyant.io>
The proxy.cores helm value is overly restrictive: it enforces a hard upper
limit. In some scenarios, a fixed limit is not practical: for example, when the
proxy is meshing an application that configures no limits.
This change replaces the proxy.cores value with a new proxy.runtime.workers
structure, with members:
- `minimum`: configures the minimum number of worker threads a proxy may use.
- `maximumCPURatio`: optionally allows the proxy to use a larger
number of CPUs, relative to the number of available CPUs on the node.
So with a minimum of 2 and a ratio of 0.1, a proxy would run 2 worker threads (the minimum) on an 8-core node, but allocate 10 worker threads on a 96-core node.
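In values form (a sketch using the member names above):
```yaml
proxy:
  runtime:
    workers:
      minimum: 2            # never fewer than 2 worker threads
      maximumCPURatio: 0.1  # up to 10% of the node's CPUs, rounded up
```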
When the `config.linkerd.io/proxy-cpu-limit` is used, that will continue to set
the maximum number of worker threads to a fixed value.
When it is not set, however, the minimum worker pool size is derived from the
`config.linkerd.io/proxy-cpu-request`.
An additional `config.linkerd.io/proxy-cpu-ratio-limit` annotation is introduced
to allow workload-level configuration.
A follow up to https://github.com/linkerd/linkerd2/pull/13699, this default-enables the config option introduced in that PR. Now, all traffic between meshed pods should flow to the proxy's inbound port.
Signed-off-by: Scott Fleener <scott@buoyant.io>
The Helm function `default` treats a boolean false value as unset and falls back to the default even when the value is explicitly set. When rendering CRDs during install or upgrade, this can cause Linkerd to fall back to the `installGatewayAPI` value even when `enableHttpRoutes` is explicitly set to false.
We replace the `default` function with a ternary which checks if the key is present. We also add tests for both CLI and Helm.
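A minimal sketch of the pattern (not the exact template text):
```
{{- /* `default` would treat an explicit `false` as unset, so check for the key instead */ -}}
{{- $enableHttpRoutes := ternary .Values.enableHttpRoutes .Values.installGatewayAPI (hasKey .Values "enableHttpRoutes") -}}
{{- if $enableHttpRoutes }}
# ... render the HTTPRoute CRDs ...
{{- end }}
```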
Signed-off-by: Alex Leong <alex@buoyant.io>
We add the `helm.sh/resource-policy: keep` annotation to Gateway API CRDs installed by Linkerd. This is to enable Linkerd to stop managing these CRDs in the future without deleting them and causing downtime during upgrades.
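Concretely, each Gateway API CRD manifest gains:
```yaml
metadata:
  annotations:
    helm.sh/resource-policy: keep
```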
Signed-off-by: Alex Leong <alex@buoyant.io>
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/cni-plugin%2Fv1.6.2
> Fixed shutdown issue
>
> This release fixes an issue introduced in v1.6.0 where the linkerd-cni
> pod was failing to complete its cleanup tasks during shutdown, leaving
> the linkerd-cni active but potentially with revoked permissions, thus
> interfering with the proper startup of pods in the node.
We upgrade the gateway API CRDs in the linkerd-crd chart to gateway API release v1.1.1 experimental. This CRD version includes HTTPRoute v1beta1 AND v1, which means it is compatible both with the current policy-controller, which reads v1beta1, and with the upcoming policy-controller changes that will cause it to read v1. Similarly, it includes GRPCRoute v1alpha2 and v1.
Tested by installing Linkerd edge-25.2.1 and then upgrading the CRDs. CRDs were upgraded without any impact to the running control plane.
Signed-off-by: Alex Leong <alex@buoyant.io>
These values are useful as fields for correlating OpenTelemetry traces. A corresponding proxy change will be needed to emit these fields in said traces.
Signed-off-by: Scott Fleener <scott@buoyant.io>
This change removes the `policy` entry from the cni config template, which isn't used. That contained a `__SERVICEACCOUNT_TOKEN__` placeholder, which was coupling this config file with the `ZZZ-linkerd-cni-kubeconfig` file generated by linkerd-cni. In linkerd/linkerd2-proxy-init#440 we add support for detecting changes in the mounted serviceaccount token file (see #12573), and the current change facilitates that effort.
Co-authored-by: Oliver Gould <ver@buoyant.io>
Fixes #13389
Values added:
- `destinationController.podAnnotations`
  - annotations only for `linkerd-destination`
- `identity.podAnnotations`
  - annotations only for `linkerd-identity`
- `proxyInjector.podAnnotations`
  - annotations only for `linkerd-proxy-injector`
Each deployment's `podAnnotations` take precedence over the global ones by means of [mergeOverwrite](https://helm.sh/docs/chart_template_guide/function_list/#mergeoverwrite-mustmergeoverwrite).
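For example (the annotation keys are illustrative; the value keys are the ones listed above):
```yaml
podAnnotations:
  example.com/owner: platform        # applied to all control-plane pods
destinationController:
  podAnnotations:
    example.com/owner: networking    # wins for linkerd-destination only, via mergeOverwrite
```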
Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
The policy container is configured differently than all other controllers: other
controllers configure an `initialDelaySeconds` on their `livenessProbe` but not
on their `readinessProbe`. The policy container, however, includes this
configuration on its `readinessProbe` but not on its `livenessProbe`.
This commit fixes the policy container to match the other controllers.
This reduces pod readiness time from 20s to 4s.
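A sketch of the change (probe paths, port name, and delay are illustrative):
```yaml
# Before: the delay sat on the readinessProbe, postponing readiness
readinessProbe:
  httpGet:
    path: /ready
    port: admin-http
  initialDelaySeconds: 10
# After: the delay moves to the livenessProbe, matching the other controllers
livenessProbe:
  httpGet:
    path: /live
    port: admin-http
  initialDelaySeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: admin-http
```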
This adds the status stanza to the HTTPLocalRateLimitPolicy CRD, and implements its handling in the policy status controller.
For the controller to accept an HTTPLocalRateLimitPolicy CR it checks that:
- The targetRef is an existing Server
- If there are multiple HTTPLocalRateLimitPolicy CRs pointing to the same server, only accept the oldest one, or if created at the same time, the first in alphabetical order (that logic was moved from the inbound indexer to the status controller).
6cd7dc22c: Update RL CRD and RBAC to allow patching its status
69aee0129: Update golden files
60f25b716: Implement status handling for HTTPLocalRateLimitPolicy CRD
fc99d3adf: Update existing unit tests
0204acf65: New unit tests
## Examples
Not accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: RateLimitReasonAlreadyExists
    status: "False"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```
Accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: Accepted
    status: "True"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```
In a previous PR (#13246) we introduced an egress networks namespace that is used to create `EgressNetwork` objects that affect all client workloads.
This change makes this namespace configurable through helm values. Additionally, we unify the naming convention of the arguments to use **egress** as opposed to **external**
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
## Subject
Disables "automountServiceAccountToken"; instead, the token is manually mounted as a projected volume where necessary.
## Problem
By default, Kubernetes enables "automountServiceAccountToken" for all pods.
This poses a security risk, as pods might get kube-api permissions unintentionally.
More specifically, this fails security compliance tests:
- https://learn.microsoft.com/en-us/azure/governance/policy/samples/built-in-policies
- https://www.azadvertizer.net/azpolicyadvertizer/kubernetes_block-automount-token.html
## Solution
Disable "automountServiceAccountToken", create a projected volume for the token, and mount it on the relevant containers.
## Validation
Linkerd pods are able to access the k8s API and work as expected (same as before).
Fixes #13108
---------
Signed-off-by: Aran Shavit <Aranshavit@gmail.com>
This PR adds an `EgressNetwork` CRD, whose purpose is to describe networks that are external to the cluster.
In addition to that it also adds `TLSRoute` and `TCPRoute` gateway api CRDs.
Most of the work in this change is focused on introducing these CRDs and correctly setting their status based on route specificity rules described in: https://gateway-api.sigs.k8s.io/geps/gep-1426/#route-types.
Notable changes include:
- ability to attach TCP and TLS routes to both `EgressNetworks` and `Service` objects
- implemented conflict resolutions between routes
- admission validation on the newly introduced resources
- module + integration tests
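For orientation, a hypothetical `EgressNetwork` resource might look roughly like this (the API version and field names are assumptions, not taken from this change):
```yaml
apiVersion: policy.linkerd.io/v1alpha1   # assumed
kind: EgressNetwork
metadata:
  name: internet
  namespace: linkerd-egress
spec:
  trafficPolicy: Allow   # assumed field describing how traffic to this network is treated
```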
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This release fixes an issue where deploying the linkerd-cni daemonset on a node that had hit the inotify limit would silently fail. Now the problem is caught and the pod enters a crash loop until the limit is no longer exceeded.
A default value of 30s is enough for the Linux TCP stack to complete about 7 packet retransmissions; after that, the RTO (retransmission timeout) grows rapidly and there is little point in waiting longer.
Setting TCP_USER_TIMEOUT only between the linkerd-proxy and the outside world is enough, since connections to containers in the same pod are more stable and reliable.
Fixes #13023
Signed-off-by: UsingCoding <extendedmoment@outlook.com>
* Dual-stack support for ExternalWorkloads
This changes the `workloadIPs.maxItems` field in the ExternalWorkload CRD from `1` to `2`, to accommodate an IPv4 and IPv6 pair. This is a backwards-compatible change, so there's no need to bump the CRD version.
The control plane already supports this, so this change is mainly about expansions to the unit tests to also account for the double stack case.
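An illustrative fragment of an ExternalWorkload carrying both address families (the API version and surrounding fields are assumptions):
```yaml
apiVersion: workload.linkerd.io/v1beta1   # assumed
kind: ExternalWorkload
metadata:
  name: vm-gateway
spec:
  workloadIPs:            # maxItems is now 2
  - ip: 192.0.2.10
  - ip: 2001:db8::10
```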
## Problem
When the IPv6 stack in Linux is disabled, the proxy will crash at startup.
## Repro
In a Linux machine, disable IPv6 networking through the `net.ipv6.conf.*` sysctl kernel tunables, and restart the system:
- In /etc/sysctl.conf add:
```
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```
- In /etc/default/grub set:
```
GRUB_CMDLINE_LINUX="ipv6.disable=1"
```
- Don't forget to update grub before rebooting:
```
sudo update-grub
```
In a default k3d cluster, install Linkerd. You should see the following error in any proxy log:
```
thread 'main' panicked at /__w/linkerd2-proxy/linkerd2-proxy/linkerd/app/src/lib.rs:245:14:
Failed to bind inbound listener: Os { code: 97, kind: Uncategorized, message: "Address family not supported by protocol" }
```
## Cause
Even if a k8s cluster didn't support IPv6, we were counting on the nodes having an IPv6 stack, which allowed us to bind the inbound proxy to [::] (although not the outbound proxy to [::1], as seen in GKE). This was the case in the major cloud providers we tested, but it turns out there are folks running nodes with IPv6 disabled, so we have to cater to that case as well.
## Fix
The current change undoes some of the changes from 7cbe2f5ca6 (for the proxy config), 7cbe2f5ca6 (for the policy controller) and 66034099d9 (for linkerd-cni), binding back again to 0.0.0.0 unless `disableIPv6` is false.
The Linkerd proxy suppresses all logging of HTTP headers at debug level or higher unless the `proxy.logHTTPHeaders` Helm value is set to `insecure`. However, even when this value is not set, HTTP headers can still be logged if the log level is set to trace.
We update the log string we use to disable logging of HTTP headers from `linkerd_proxy_http::client[{headers}]=off` to the more general `[{headers}]=off,[{request}]=off`. This will disable any logging which includes a `headers` or `request` field, which has the effect of disabling the logging of headers at trace level as well. As before, these logs can be re-enabled by setting `proxy.logHTTPHeaders=insecure`.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Configure network-validator and repair-controller to work with IPv6
Fixes #12864
The linkerd-cni network-validator container was binding to the IPv4 wildcard and connecting to an IPv4 address. This wasn't breaking things in IPv6 clusters but it was only validating the iptables rules and not the ip6tables ones. This change introduces logic to use addresses according to the value of `disableIPv6`. If IPv6 is enabled, then the ip6tables rules would get exercised. Note that a more complete change would also exercise both iptables and ip6tables, but for now we're defaulting to ip6tables.
The same was true for repair-controller, but because the IPv4 wildcard was used for the admin server, in IPv6 clusters the kubelet wasn't able to reach the probe endpoints and the container was failing. In this case the fix is to just have the admin server bind to `[::]`, which works for both IPv4 and IPv6 clusters.
Followup to #12844
This new field defines the default policy for Servers, i.e. if a request doesn't match the policy associated with a Server then this policy applies. The values are the same as for `proxy.defaultInboundPolicy` and the `config.linkerd.io/default-inbound-policy` annotation (all-unauthenticated, all-authenticated, cluster-authenticated, cluster-unauthenticated, deny), plus a new value "audit". The default is "deny", thus remaining backwards-compatible.
This field is also exposed as an additional printer column.
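A hedged example of a Server using the new field (the field name and API version are assumptions; the allowed values are those listed above):
```yaml
apiVersion: policy.linkerd.io/v1beta3   # assumed
kind: Server
metadata:
  name: web-http
spec:
  podSelector:
    matchLabels:
      app: web
  port: http
  accessPolicy: audit   # assumed field name; behaves as "deny" when unset
```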
The `shortnames: []` field on the `ExternalWorkload` CRD causes tools
like ArgoCD to report applications out of sync.
Remove the empty field completely from the CRD manifest.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The default resource values for `linkerd-init` are not always the right fit. We offer defaults to ensure proxy-init does not get in the way of the Guaranteed QoS class (`linkerd-init` resource limits and requests cannot be configured in any other way).
Instead of using hard-coded default values, we can re-use the proxy's configuration values. For the pod to be in the Guaranteed QoS class, the values for the proxy have to be set anyway. If we re-use the same values for proxy-init we can ensure we always request the same amount of CPU and memory as needed.
* `linkerd-init` now defaults to the proxy's values
* when the proxy has an annotation configuration for resource requests,
it also impacts `linkerd-init`
* Helm chart and docs have been updated to reflect the missing values.
* tests now no longer use `ProxyInit.Resources`
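For example, a workload annotated with the existing proxy resource annotations now gets the same requests and limits on `linkerd-init` (values are arbitrary):
```yaml
metadata:
  annotations:
    config.linkerd.io/proxy-cpu-request: "100m"
    config.linkerd.io/proxy-cpu-limit: "100m"
    config.linkerd.io/proxy-memory-request: "20Mi"
    config.linkerd.io/proxy-memory-limit: "20Mi"
    # linkerd-init inherits these same values, keeping the pod eligible
    # for the Guaranteed QoS class
```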
UPGRADE NOTE:
- Deprecates `proxyInit.resources` field in the Helm values.
- It will be a no-op if specified (no hard failures)
Closes #11320
---------
Signed-off-by: Matei David <matei@buoyant.io>
We add support for GrpcRoute resources in the policy-controller's status controller. This means that the policy controller will watch GrpcRoute resources in the cluster and keep their status up to date, in the same way that it currently does for HttpRoute resources.
Signed-off-by: Alex Leong <alex@buoyant.io>
The `ext-namespace-metadata-linkerd-config` Role is the only resource in
the base Linkerd install that doesn't include the
`linkerd.io/control-plane-ns` label, and that appears to be an
oversight.
This change adds the missing label for consistency.
Signed-off-by: Kevin Ingelman <ki@buoyant.io>
The proxy may expose a /shutdown HTTP endpoint on its admin server that may be used by `linkerd-await --shutdown` to trigger proxy shutdown after a process completes. If an application has an SSRF vulnerability, however, an attacker could use this endpoint to trigger proxy shutdown, causing a denial of service. This admin endpoint is only useful with linkerd-await; and this functionality is supplanted by Kubernetes Native Sidecars.
To address this potential issue, this change disables the proxy's shutdown admin endpoint by default. A Helm value is introduced to support enabling the endpoint cluster-wide, and the `config.linkerd.io/proxy-admin-shutdown: enabled` annotation may be set to enable it on an individual workload.
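To re-enable the endpoint on a single workload that still relies on `linkerd-await --shutdown`:
```yaml
metadata:
  annotations:
    config.linkerd.io/proxy-admin-shutdown: enabled
```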
Signed-off-by: Alex Leong <alex@buoyant.io>
Followup to linkerd/linkerd2-proxy#2872 , where we swapped the
trust-dns-resolver with the hickory-resolver dependency. This change
updates the default log level setting for the proxy to account for
that.
Those releases ensure that when IPv6 is enabled, the series of ip6tables commands succeed. If they fail, the proxy-init/linkerd-cni containers should fail as well, instead of ignoring errors.
See linkerd/linkerd2-proxy-init#388
Fixes #12620
When the Linkerd proxy log level is set to `debug` or higher, the proxy logs HTTP headers which may contain sensitive information.
While we want to avoid logging sensitive data by default, logging of HTTP headers can be a helpful debugging tool. Therefore, we add a `proxy.logHTTPHeaders` Helm value which prevents the logging of HTTP headers when set to false. Its default value is false, so headers cannot be logged unless users opt in.
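A values sketch of the opt-in (the `insecure` setting is the one referenced in the related entry earlier in these notes):
```yaml
proxy:
  logHTTPHeaders: insecure   # headers remain unlogged unless this is set
```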
Signed-off-by: Alex Leong <alex@buoyant.io>