1289 Commits

Author SHA1 Message Date
Alejandro Pedraza 81eb2c8618
Simplify cni config (#13407)
This change removes the `policy` entry from the cni config template, which isn't used. That contained a `__SERVICEACCOUNT_TOKEN__` placeholder, which was coupling this config file with the `ZZZ-linkerd-cni-kubeconfig` file generated by linkerd-cni. In linkerd/linkerd2-proxy-init#440 we add support for detecting changes in the mounted serviceaccount token file (see #12573), and the current change facilitates that effort.

Co-authored-by: Oliver Gould <ver@buoyant.io>
2024-12-10 14:59:02 -05:00
Takumi Sue a20fc0bfa1
feat(helm): Allow specifying podAnnotations per deployment (#13388)
Fixes #13389

Values added:

- `destinationController.podAnnotations`
  - annotations only for `linkerd-destination`
- `identity.podAnnotations`
  - annotations only for `linkerd-identity`
- `proxyInjector.podAnnotations`
  - annotations only for `linkerd-proxy-injector`

Each deployment's podAnnotations take precedence over the global ones by means of [mergeOverwrite](https://helm.sh/docs/chart_template_guide/function_list/#mergeoverwrite-mustmergeoverwrite).
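
The precedence rule above can be illustrated with a minimal sketch. It is not the chart's template code: it models the merge as a flat dictionary update, and the function name and annotation keys are hypothetical.

```python
def merge_pod_annotations(global_annotations, deployment_annotations):
    """Return the annotations for one deployment; per-deployment keys win."""
    merged = dict(global_annotations)       # start from the global values
    merged.update(deployment_annotations)   # per-deployment values overwrite
    return merged


# Hypothetical annotation keys, used only for illustration.
global_annotations = {"example.com/team": "platform", "example.com/scrape": "false"}
destination_annotations = {"example.com/scrape": "true"}
print(merge_pod_annotations(global_annotations, destination_annotations))
```

Keys that only exist globally are kept, and keys set on both levels take the per-deployment value.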

Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2024-12-10 11:48:59 -08:00
Oliver Gould 17b2692d58
build(deps): bump linkerd/dev from v43 to v44 (#13428)
* docker.io/library/golang from 1.22 to 1.23
* gotestsum from 0.4.2 to 1.12.0
* protoc-gen-go from 1.28.1 to 1.35.2
* protoc-gen-go-grpc from 1.2 to 1.5.1
* docker.io/library/rust from 1.76.0 to 1.83.0
* cargo-deny from 0.14.11 to 0.16.3
* cargo-nextest from 0.9.67 to 0.9.85
* cargo-tarpaulin from 0.27.3 to 0.31.3
* just from 1.24.0 to 1.37.0
* yq from 4.33.3 to 4.44.5
* markdownlint-cli2 from 0.10.0 to 0.15.0
* shellcheck from 0.9.0 to 0.10.0
* actionlint from 1.6.26 to 1.7.4
* protoc from 3.20.3 to 29.0
* step from 0.25.2 to 0.28.2
* kubectl from 1.29.2 to 1.31.3
* k3d from 5.6.0 to 5.7.5
* k3s image shas
* helm from 3.14.1 to 3.16.3
* helm-docs from 1.12.0 to 1.14.2
2024-12-06 11:38:36 -08:00
Oliver Gould 82c47a9794
fix(policy): fix policy readiness probe delay (#13380)
The policy container is configured differently than all other controllers: other
controllers configure an `initialDelaySeconds` on their `livenessProbe` but not
on their `readinessProbe`. The policy container, however, includes this
configuration on its `readinessProbe` but not on its `livenessProbe`.

This commit fixes the policy container to match the other controllers.

This reduces pod readiness time from 20s to 4s.
2024-11-25 10:36:49 -05:00
Derek Brown 80e444edbd
lint: fix docker build warnings (#13351)
Docker builds emit a warning because the casing of 'FROM' and 'as' doesn't match. Fix this everywhere.

Signed-off-by: Derek Brown <6845676+DerekTBrown@users.noreply.github.com>
2024-11-20 08:44:45 -05:00
Alex Leong 09ee0d41fc
Allow diagnostics endpoints command to receive more than one message (#13285)
The `linkerd diagnostics endpoints` command initiates a `Get` lookup to the destination controller to get the set of endpoints for a destination.  This is a streaming response API and the command takes only the first response message and displays it.  However, the full current state of endpoints may be split across multiple messages, resulting in an incomplete list of endpoints displayed.

We instead read continuously from the response stream for a short amount of time (5 seconds) before displaying the full set of endpoints received.
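
A rough sketch of the "read for a short window, then display" idea, assuming each message is a list of endpoint strings that only adds endpoints. The function name, the message shape, and the injectable `clock` are assumptions for illustration and are not the CLI's actual code.

```python
import time


def collect_endpoints(stream, window_seconds=5.0, clock=time.monotonic):
    """Accumulate endpoints from every message that arrives within the window."""
    deadline = clock() + window_seconds
    seen = []
    for message in stream:
        if clock() >= deadline:
            break                      # window elapsed: stop reading
        for endpoint in message:
            if endpoint not in seen:   # keep first-seen order, drop repeats
                seen.append(endpoint)
    return seen


# Two messages that together describe the full endpoint set.
messages = [["10.0.0.1:8080"], ["10.0.0.2:8080", "10.0.0.1:8080"]]
print(collect_endpoints(messages))
```

This only checks the deadline between messages; a real streaming client would also need a deadline on the stream itself so that a quiet stream does not block past the window.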

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-11-14 09:30:49 -08:00
Alejandro Pedraza ce6c172749
Implement status handling for HTTPLocalRateLimitPolicy CRD (#13314)
This adds the status stanza to the HTTPLocalRateLimitPolicy CRD, and implements its handling in the policy status controller.

For the controller to accept an HTTPLocalRateLimitPolicy CR it checks that:
- The targetRef is an existing Server
- If there are multiple HTTPLocalRateLimitPolicy CRs pointing to the same server, only accept the oldest one, or if created at the same time, the first in alphabetical order (that logic was moved from the inbound indexer to the status controller).
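
The tie-breaking rule in the list above can be sketched as follows. This is an illustration only, not the status controller's code; it assumes creation timestamps are RFC 3339 strings in UTC (which sort lexicographically), and the function and policy names are hypothetical.

```python
def pick_accepted(policies):
    """policies: list of (name, creation_timestamp) tuples targeting one Server.

    The oldest policy wins; for equal timestamps the alphabetically first name wins.
    """
    return min(policies, key=lambda p: (p[1], p[0]))[0]


candidates = [
    ("web-rl-b", "2024-11-12T23:10:05Z"),
    ("web-rl-a", "2024-11-12T23:10:05Z"),
    ("web-rl-c", "2024-11-13T00:00:00Z"),
]
print(pick_accepted(candidates))
```

Every other policy targeting the same Server would get a status with `Accepted` set to `"False"`.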

6cd7dc22c: Update RL CRD and RBAC to allow patching its status
69aee0129: Update golden files
60f25b716: Implement status handling for HTTPLocalRateLimitPolicy CRD
fc99d3adf: Update existing unit tests
0204acf65: New unit tests

## Examples

Not accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: RateLimitReasonAlreadyExists
    status: "False"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```

Accepted CR:
```yaml
...
status:
  conditions:
  - lastTransitionTime: "2024-11-12T23:10:05Z"
    message: ""
    reason: Accepted
    status: "True"
    type: Accepted
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
```
2024-11-13 17:31:12 -05:00
Alejandro Pedraza a6057bb0e4
Remove empty `shortnames: []` from HTTPLocalRateLimitPolicy (#13297)
ArgoCD is not compatible with an empty `shortnames` entry.
(We encountered this same issue in the past with ExternalWorkloads: #12793)

Fixes #13295
2024-11-11 09:18:27 -08:00
Alejandro Pedraza caf8e82e7a
feat(policy): add HTTPLocalRateLimitPolicy (#13231)
This adds the HTTPLocalRateLimitPolicy CRD, which is indexed by the policy controller and exposed by the inbound API.

- 81ebc08bd: HTTPLocalRateLimitPolicy CRD and related changes
- 01afd2304: policy controller central changes
- b09892529: rust tests updates and additions
- 2f455973c: golden files updates.

## Testing

In a cluster with linkerd and emojivoto injected, deploy these resources:

```yaml
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
spec:
  # permissive policy, so we don't require setting up authz
  accessPolicy: all-unauthenticated
  podSelector:
    matchLabels:
      app: web-svc
  port: http
  proxyProtocol: HTTP/1
```
```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPLocalRateLimitPolicy
metadata:
  namespace: emojivoto
  name: web-rl
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: web-http
  total:
    requestsPerSecond: 100
  identity:
    requestsPerSecond: 20
  overrides:
  - requestsPerSecond: 10
    clientRefs:
    - kind: ServiceAccount
      namespace: emojivoto
      name: default
```

```console
$ kubectl -n emojivoto get httplocalratelimitpolicies.policy.linkerd.io
NAME     TARGET_KIND   TARGET_NAME   TOTAL_RPS   IDENTITY_RPS
web-rl   Server        web-http      100         20
```

Then see how the RL policy is exposed at the inbound API under the protocol section, with `linkerd dg policy -n emojivoto po/web-85f6fb8564-jp67d 8080`:

```yaml
...
protocol:
  Kind:
    Http1:
      local_rate_limit:
        identity:
          requestsPerSecond: 20
        metadata:
          Kind:
            Resource:
              group: policy.linkerd.io
              kind: httplocalratelimitpolicy
              name: web-rl
        overrides:
        - clients:
            identities:
            - name: default.emojivoto.serviceaccount.identity.linkerd.cluster.local
          limit:
            requestsPerSecond: 10
        total:
          requestsPerSecond: 100
...
```
2024-11-08 16:24:24 -08:00
Zahari Dichev 4b10157a5d
policy: Make global egress network namespace configurable (#13250)
In a previous PR (#13246) we introduced an egress networks namespace that is used to create `EgressNetwork` objects that affect all client workloads.

This change makes this namespace configurable through Helm values. Additionally, we unify the naming convention of the arguments to use **egress** rather than **external**.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-11-01 12:11:23 +02:00
Aran Shavit 351cc68b10
Manually mount serviceAccount token (#13186)
Subject
Disables "automountServiceAccountToken", instead manually mounts it as a projected volume where necessary

Problem
By default, kubernetes enables "automountServiceAccountToken" for all pods.
This poses a security risk, as pods might get kube-api permissions unintentionally.
More specifically, this fails security compliance tests:
https://learn.microsoft.com/en-us/azure/governance/policy/samples/built-in-policies
https://www.azadvertizer.net/azpolicyadvertizer/kubernetes_block-automount-token.html

Solution
Disable "automountServiceAccountToken", create a projected volume for the token, and mount it on the relevant containers

Validation
Linkerd pods are able to access k8s API, work as expected (same as before)

Fixes #13108 
---------

Signed-off-by: Aran Shavit <Aranshavit@gmail.com>
2024-10-22 13:55:01 -05:00
Zahari Dichev 3e2f31dc7a
Add `EgressNetwork` and routes statuses (#13181)
This PR adds an `EgressNetwork` CRD, whose purpose is to describe networks that are external to the cluster.
In addition to that it also adds `TLSRoute` and `TCPRoute` gateway api CRDs.

Most of the work in this change is focused on introducing these CRDs and correctly setting their status based on route specificity rules described in: https://gateway-api.sigs.k8s.io/geps/gep-1426/#route-types.

Notable changes include: 

- ability to attach TCP and TLS routes to both `EgressNetworks` and `Service` objects
- implemented conflict resolutions between routes
- admission validation on the newly introduced resources
- module + integration tests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-10-19 18:40:32 +03:00
patest-dev 40c85713b6
Update deprecation warning text for v1beta1 Server (#13188)
* Update deprecation text for v1beta1 Server
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-10-18 15:51:36 -05:00
Alejandro Pedraza 64130e1eb5
Bump linkerd-cni to v1.5.2 (#13198)
This release fixes the issue that when the node had hit the inotify limit, deploying the linkerd-cni daemonset would silently fail. Now the problem is caught and the pod enters a crash loop until the limit is no longer surpassed.
2024-10-17 18:18:20 -07:00
Vadim Makerov 005a3a470e
Implement providing configuration for TCP_USER_TIMEOUT to linkerd-proxy (#13024)
The default value of 30s is enough for the Linux TCP stack to complete about 7 packet retransmissions. After about 7 retransmissions the RTO (retransmission timeout) grows rapidly, so it does not make much sense to wait any longer.

Setting TCP_USER_TIMEOUT only on connections between linkerd-proxy and the outside world is enough, since connections to containers in the same pod are more stable and reliable.
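
For readers unfamiliar with the option, here is a small sketch of setting `TCP_USER_TIMEOUT` on a socket. It is not the proxy's implementation (the proxy is written in Rust); the helper names are hypothetical, and the option is Linux-specific.

```python
import socket

# Linux defines TCP_USER_TIMEOUT as option 18; Python exposes it on Linux builds.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def user_timeout_millis(seconds):
    """The kernel expects TCP_USER_TIMEOUT in milliseconds."""
    return int(seconds * 1000)


def set_user_timeout(sock, seconds):
    """Apply the timeout to a TCP socket (Linux only) and return the value set."""
    millis = user_timeout_millis(seconds)
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, millis)
    return millis


print(user_timeout_millis(30))
```

On Linux, calling `set_user_timeout(sock, 30)` on a TCP socket asks the kernel to drop the connection when transmitted data stays unacknowledged for 30 seconds.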

Fixes #13023

Signed-off-by: UsingCoding <extendedmoment@outlook.com>
2024-10-03 16:08:01 +00:00
Maxime Brunet 260fd19a2a
build: add image source label to all Dockerfiles (#13042)
This change adds the org.opencontainers.image.source label to all Dockerfiles.

It allows tools to find the source repository for the produced images.

Signed-off-by: Maxime Brunet <max@brnt.mx>
2024-09-10 11:25:32 -07:00
Alejandro Pedraza 6271350725
Remove redundant dashes from identity manifest template (#13022)
Fixes #13021
2024-09-09 11:48:36 -05:00
kristjankullerkann-wearemp 6f7d3a4425
Add support to modify liveness and readiness probe timeouts on control plane containers (#13002)
Signed-off-by: Kristjan Kullerkann <kristjan.kullerkann@wearemp.com>
2024-09-04 14:26:57 -05:00
Alejandro Pedraza 567288a060
Dual-stack support for ExternalWorkloads (#12965)
* Dual-stack support for ExternalWorkloads

This changes the `workloadIPs.maxItems` field in the ExternalWorkload CRD from `1` to `2`, to accommodate an IPv4 and IPv6 pair. This is a backwards-compatible change, so there's no need to bump the CRD version.

The control plane already supports this, so this change is mainly about expansions to the unit tests to also account for the dual-stack case.
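
As an illustration of what an "IPv4 and IPv6 pair" means in practice, the sketch below checks a list of workload IPs. The CRD change itself only raises `maxItems`; requiring at least one address and at most one address per family is an assumption made here, and the function name is hypothetical.

```python
import ipaddress


def validate_workload_ips(ips):
    """Accept one or two addresses, with at most one per IP family."""
    if len(ips) not in (1, 2):
        return False
    # ip_address raises ValueError for strings that are not IP addresses.
    versions = [ipaddress.ip_address(ip).version for ip in ips]
    return len(set(versions)) == len(versions)


print(validate_workload_ips(["10.0.0.5", "fd00:10:244::5"]))
```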
2024-08-30 13:23:56 -05:00
Alex Leong 366ab94519
Add viz stat-inbound and viz stat-outbound commands (#12994)
We add two new commands to the linkerd viz extension: `linkerd viz stat-inbound` and `linkerd viz stat-outbound`.  These commands are meant as replacements for the `linkerd viz stat`.  The `linkerd viz stat` command provides stats when ServiceProfiles are used whereas the new commands provide stats when xRoute resources are used.  Either command can be used when no xRoute or ServiceProfile is used but the new commands include several improvements:

* Inbound and outbound stats are clearly separated into different commands rather than being contextual based on flag combinations
* Route level and backend level stats are displayed together in a tree-view in `linkerd viz stat-outbound` to easily see the effects of retries, timeouts, and traffic splitting

```
> linkerd viz stat-outbound -n schlep deploy                  
NAME         SERVICE      ROUTE           TYPE       BACKEND    SUCCESS   RPS  LATENCY_P50  LATENCY_P95  LATENCY_P99  TIMEOUTS  RETRIES  
client-http  schlep:80    schlep-default  HTTPRoute             100.00%  1.00         31ms        387ms        478ms     0.00%    6.25%  
                          └───────────────────────►  schlep:80   93.75%  1.07         16ms         88ms         98ms     1.56%           
client-grpc  schlep:8080  schlep-default  GRPCRoute              98.31%  0.98         36ms        425ms        485ms     0.00%    0.00%  
                          ├───────────────────────►  fail:8080   96.88%  0.53         12ms         24ms         25ms     0.00%           
                          └───────────────────────►  good:8080  100.00%  0.45         25ms         95ms         99ms     0.00%
```

```
> linkerd viz stat-inbound -n schlep deploy
NAME         SERVER          ROUTE      TYPE  SUCCESS   RPS  LATENCY_P50  LATENCY_P95  LATENCY_P99  
client-grpc  [default]:4191  [default]        100.00%  0.10          2ms          3ms          3ms  
client-grpc  [default]:4191  probe            100.00%  0.20          0ms          1ms          1ms  
client-http  [default]:4191  [default]        100.00%  0.10          2ms          2ms          2ms  
client-http  [default]:4191  probe            100.00%  0.20          0ms          1ms          1ms  
server-fail  [default]:4191  probe            100.00%  0.20          0ms          1ms          1ms  
server-fail  [default]:4191  [default]        100.00%  0.10          2ms          2ms          2ms  
server-fail  [default]:8080  [default]         94.87%  1.30          0ms          1ms          1ms  
server-good  [default]:4191  [default]        100.00%  0.10          0ms          1ms          1ms  
server-good  [default]:4191  probe            100.00%  0.20          0ms          1ms          1ms  
server-good  [default]:8080  [default]        100.00%  0.73          8ms         92ms         98ms  
server-slow  [default]:4191  [default]        100.00%  0.10          0ms          1ms          1ms  
server-slow  [default]:4191  probe            100.00%  0.20          0ms          1ms          1ms
```

Unlike the `linkerd viz stat` command, these commands query prometheus directly rather than going through the intermediary of the metrics-api.  If prometheus is enabled in linkerd-viz, these commands will use a port-forward to connect to that prometheus instance.  If an external prometheus is configured, these commands will attempt to use that prometheus URL; however note that the prometheus URL must be reachable from where the CLI is executed for this to work.  This can be overridden by a `--prometheusURL` flag.

JSON and table output are both supported.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-08-29 12:31:16 -07:00
Alejandro Pedraza 332c4efa8c
Only bind to IPv6 addresses when disableIPv6=false (#12938)
## Problem

When the IPv6 stack in Linux is disabled, the proxy will crash at startup.

## Repro

In a Linux machine, disable IPv6 networking through the `net.ipv6.conf.*` sysctl kernel tunables, and restart the system:

- In /etc/sysctl.conf add:
```
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```

- In /etc/default/grub set:
```
GRUB_CMDLINE_LINUX="ipv6.disable=1"
```

- Don't forget to update grub before rebooting:
```
sudo update-grub
```

In a default k3d cluster, install Linkerd. You should see the following error in any proxy log:

```
thread 'main' panicked at /__w/linkerd2-proxy/linkerd2-proxy/linkerd/app/src/lib.rs:245:14:
Failed to bind inbound listener: Os { code: 97, kind: Uncategorized, message: "Address family not supported by protocol" }
```

## Cause

Even if a k8s cluster didn't support IPv6, we were counting on the nodes having an IPv6 stack, which allowed us to bind the inbound proxy to [::] (although not the outbound proxy to [::1], as seen in GKE). This was the case in the major cloud providers we tested, but it turns out there are folks running nodes with IPv6 disabled, so we have to cater for that case as well.

## Fix

The current change undoes some of the changes from 7cbe2f5ca6 (for the proxy config), 7cbe2f5ca6 (for the policy controller) and 66034099d9 (for linkerd-cni), binding back again to 0.0.0.0 unless `disableIPv6` is false.
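
The resulting behavior can be summarized with a small sketch. The addresses come from the commit messages in this log; the function name and the dictionary shape are hypothetical, and ports are left out on purpose.

```python
def proxy_listen_hosts(disable_ipv6):
    """Return the hosts the proxy listeners bind to for a given disableIPv6 value."""
    if disable_ipv6:
        # IPv4-only: works even when the node has no IPv6 stack.
        return {"inbound": ["0.0.0.0"], "outbound": ["127.0.0.1"]}
    # IPv6 enabled: the wildcard [::] captures both IPv4 and IPv6 traffic,
    # while the loopback needs both families listed explicitly.
    return {"inbound": ["[::]"], "outbound": ["127.0.0.1", "[::1]"]}


print(proxy_listen_hosts(disable_ipv6=True))
print(proxy_listen_hosts(disable_ipv6=False))
```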
2024-08-05 13:29:55 -05:00
Alex Leong 53619175b2
Make GrpcRoute watches optional (#12917)
Since the GrpcRoute CRD is not included in the Gateway API version 0.7.0 standard, we cannot rely on it being present in all clusters.  Therefore, we make the GrpcRoute watch in the policy controller optional, conditioned on if the API resource exists.  In fact, we do this for all HttpRoute as well.

This means that if either of these CRDs is not present on the cluster at the time that the policy controller is started, we will not initiate watches on that type and it will not be possible to use those resources for policy until the CRD is installed and the policy controller is restarted.  We log at startup if any watches are skipped in this way.

Furthermore, we relax the `linkerd check` validation of CRDs to not require that gateway API CRDs are present.
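
A minimal sketch of the optional-watch decision described above, assuming the set of installed resource kinds is already known at startup. The function name and the log wording are hypothetical; the policy controller itself is written in Rust.

```python
def plan_watches(installed_kinds, optional_kinds=("HTTPRoute", "GRPCRoute")):
    """Split optional kinds into watches to start and watches to skip."""
    started, skipped = [], []
    for kind in optional_kinds:
        if kind in installed_kinds:
            started.append(kind)
        else:
            skipped.append(kind)
    for kind in skipped:
        # Skipped kinds stay unavailable until the CRD is installed and the
        # controller is restarted.
        print(f"skipping watch for {kind}: CRD not installed")
    return started, skipped


print(plan_watches({"HTTPRoute"}))
```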

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-07-31 12:03:50 -07:00
Alex Leong 59fbf7b01d
feat(proxy): Disable all header and request logging (#12903)
The Linkerd proxy suppresses all logging of HTTP headers at debug level or higher unless the `proxy.logHTTPHeaders` Helm value is set to `insecure`.  However, even when this value is not set, HTTP headers can still be logged if the log level is set to trace.

We update the log string we use to disable logging of HTTP headers from `linkerd_proxy_http::client[{headers}]=off` to the more general `[{headers}]=off,[{request}]=off`.  This will disable any logging which includes a `headers` or `request` field.  This has the effect of disabling the logging of headers at trace level as well.  As before, these logs can be re-enabled by setting `proxy.logHTTPHeaders=insecure`.
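
As a sketch of how such a filter string might be assembled: the directive text is taken from the message above, while the function name, the example log level, and joining the parts with a comma are assumptions rather than the chart's actual template.

```python
HEADER_FILTER = "[{headers}]=off,[{request}]=off"


def proxy_log_filter(level, log_http_headers=None):
    """Append the header/request filter unless logging was explicitly re-enabled."""
    if log_http_headers == "insecure":
        return level
    return f"{level},{HEADER_FILTER}"


print(proxy_log_filter("warn,linkerd=info"))
print(proxy_log_filter("warn,linkerd=info", log_http_headers="insecure"))
```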

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-07-31 11:31:25 -07:00
Alejandro Pedraza 66034099d9
Configure network-validator and repair-controller to work with IPv6 (#12874)
* Configure network-validator and repair-controller to work with IPv6

Fixes #12864

The linkerd-cni network-validator container was binding to the IPv4 wildcard and connecting to an IPv4 address. This wasn't breaking things in IPv6 clusters but it was only validating the iptables rules and not the ip6tables ones. This change introduces logic to use addresses according to the value of `disableIPv6`. If IPv6 is enabled, then the ip6tables rules would get exercised. Note that a more complete change would also exercise both iptables and ip6tables, but for now we're defaulting to ip6tables.

The same applied to repair-controller, but given that the IPv4 wildcard was used for the admin server, in IPv6 clusters the kubelet wasn't able to reach the probe endpoints and the container was failing. In this case the fix is just to have the admin server bind to `[::]`, which works for both IPv4 and IPv6 clusters.
2024-07-24 09:56:41 -05:00
Alejandro Pedraza 71291fe7bc
Add `accessPolicy` field to Server CRD (#12845)
Followup to #12844

This new field defines the default policy for Servers, i.e. if a request doesn't match the policy associated to a Server then this policy applies. The values are the same as for `proxy.defaultInboundPolicy` and the `config.linkerd.io/default-inbound-policy` annotation (all-unauthenticated, all-authenticated, cluster-authenticated, cluster-unauthenticated, deny), plus a new value "audit". The default is "deny", thus remaining backwards-compatible.

This field is also exposed as an additional printer column.
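
The allowed values and the default can be captured in a short sketch. The value list is taken from the message above; the function name and the choice to raise `ValueError` on unknown values are hypothetical.

```python
ACCESS_POLICIES = {
    "all-unauthenticated",
    "all-authenticated",
    "cluster-authenticated",
    "cluster-unauthenticated",
    "deny",
    "audit",
}


def resolve_access_policy(value=None):
    """Return the effective accessPolicy, defaulting to "deny" when unset."""
    if value is None:
        return "deny"
    if value not in ACCESS_POLICIES:
        raise ValueError(f"unsupported accessPolicy: {value}")
    return value


print(resolve_access_policy())
print(resolve_access_policy("audit"))
```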
2024-07-22 09:01:09 -05:00
Alejandro Pedraza aeadb63340
New "audit" value for default inbound policy (#12844)
* New "audit" value for default inbound policy

As a preliminary for audit-mode support, this change just adds "audit" to the allowed values for the `proxy.defaultInboundPolicy` helm entry, and to the `--default-inbound-policy` flag for the install CLI. It also adds it to the allowed values for the `config.linkerd.io/default-inbound-policy` annotation.
2024-07-17 15:54:27 -05:00
Andrew Seigner 0719b11666
Remove empty `shortnames` from ExternalWorkload (#12793)
The `shortnames: []` field on the `ExternalWorkload` CRD causes tools
like ArgoCD to report applications out of sync.

Remove the empty field completely from the CRD manifest.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2024-07-01 11:36:16 -07:00
Matei David f05d1e9e26
feat(helm): default proxy-init resource requests to proxy values (#12741)
Default values for `linkerd-init` (resources allocated) are not always
the right fit. We offer default values to ensure proxy-init does not get
in the way of QOS Guaranteed (`linkerd-init` resource limits and
requests cannot be configured in any other way).

Instead of using default values that can be overridden, we can re-use
the proxy's configuration values. For the pod to be QOS Guaranteed, the
values for the proxy have to be set anyway. If we re-use the same
values for proxy-init we can ensure we'll always request the same amount
of CPU and memory as needed.

* `linkerd-init` now defaults to the proxy's values
* when the proxy has an annotation configuration for resource requests,
  it also impacts `linkerd-init`
* Helm chart and docs have been updated to reflect the missing values.
* tests now no longer use `ProxyInit.Resources`

UPGRADE NOTE:
- Deprecates `proxyInit.resources` field in the Helm values.
  - It will be a no-op if specified (no hard failures)

Closes #11320

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-06-24 12:37:47 +01:00
Alex Leong 1785592091
Manage GrpcRoute resource status (#12748)
We add support for GrpcRoute resources in the policy-controller's status controller.  This means that the policy controller will watch GrpcRoute resources in the cluster and keep their status up to date, in the same way that it currently does for HttpRoute resources.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-21 06:48:33 -07:00
Kevin Ingelman 10c2c140ee
Add missing `linkerd.io/control-plane-ns` label (#12742)
The `ext-namespace-metadata-linkerd-config` Role is the only resource in
the base Linkerd install that doesn't include the
`linkerd.io/control-plane-ns` label, and that appears to be an
oversight.

This change adds the missing label for consistency.

Signed-off-by: Kevin Ingelman <ki@buoyant.io>
2024-06-18 12:55:08 -05:00
Alex Leong 35fb2d6d11
feat!: Add config to disable proxy /shutdown admin endpoint (#12705)
The proxy may expose a /shutdown HTTP endpoint on its admin server that may be used by `linkerd-await --shutdown` to trigger proxy shutdown after a process completes. If an application has an SSRF vulnerability, however, an attacker could use this endpoint to trigger proxy shutdown, causing a denial of service. This admin endpoint is only useful with linkerd-await; and this functionality is supplanted by Kubernetes Native Sidecars.

To address this potential issue, this change disables the proxy's `/shutdown` admin endpoint by default. A Helm value is introduced to support enabling the endpoint cluster-wide, and the `config.linkerd.io/proxy-admin-shutdown: enabled` annotation may be set to enable it on an individual workload.
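
A small sketch of how the cluster-wide default and the per-workload annotation could combine. The annotation name comes from the message above; the function name and treating any annotation value other than `enabled` as disabled are assumptions.

```python
SHUTDOWN_ANNOTATION = "config.linkerd.io/proxy-admin-shutdown"


def shutdown_endpoint_enabled(cluster_default=False, annotations=None):
    """A workload annotation, when present, overrides the cluster-wide default."""
    value = (annotations or {}).get(SHUTDOWN_ANNOTATION)
    if value is None:
        return cluster_default
    return value == "enabled"


print(shutdown_endpoint_enabled())
print(shutdown_endpoint_enabled(annotations={SHUTDOWN_ANNOTATION: "enabled"}))
```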

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-14 09:55:15 -07:00
Alejandro Pedraza 80a1803c7f
Properly set log level for hickory dependency in proxy (#12722)
Followup to linkerd/linkerd2-proxy#2872 , where we swapped the
trust-dns-resolver with the hickory-resolver dependency. This change
updates the default log level setting for the proxy to account for
that.
2024-06-14 08:32:12 -07:00
John Howard ca1ded55f3
Fix typo in diagnostics command (#12723)
Signed-off-by: John Howard <john.howard@solo.io>
2024-06-14 08:31:36 -07:00
Alejandro Pedraza b59149388f
Bump proxy-init to v2.4.1 and cni-plugin to v1.5.1 (#12711)
Those releases ensure that when IPv6 is enabled, the series of ip6tables commands succeed. If they fail, the proxy-init/linkerd-cni containers should fail as well, instead of ignoring errors.

See linkerd/linkerd2-proxy-init#388
2024-06-13 17:15:41 -05:00
Alex Leong e0fe0248d5
Add config to disable HTTP proxy logging (#12665)
Fixes #12620

When the Linkerd proxy log level is set to `debug` or higher, the proxy logs HTTP headers which may contain sensitive information.

While we want to avoid logging sensitive data by default, logging of HTTP headers can be a helpful debugging tool.  Therefore, we add a `proxy.logHTTPHeaders` Helm value which prevents the logging of HTTP headers when set to false.  This value defaults to false, so headers cannot be logged unless users opt in.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-11 17:46:54 -07:00
Alex Leong 3bc12b7807
Add json output to install and related commands (#12641)
We add support for the `--output/-o` flag in linkerd install and related commands. The supported output formats are yaml (default) and json. Kubectl is able to accept both of these formats which means that the output can be piped into kubectl regardless of which output format is used.

The full list of install related commands which we add json support to is:

* linkerd install
* linkerd prune
* linkerd upgrade
* linkerd uninstall
* linkerd viz install
* linkerd viz prune
* linkerd viz uninstall
* linkerd multicluster install
* linkerd multicluster prune
* linkerd multicluster uninstall
* linkerd jaeger install
* linkerd jaeger prune
* linkerd jaeger uninstall

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-04 12:43:23 -07:00
Alejandro Pedraza b028db78a2
Make IPv6 support opt-in in linkerd-cni (#12663)
Followup to 7dbafb26c8 where we made IPv6
support opt-in for the control plane and proxy init. This follows suit
doing the same for linkerd-cni.
2024-05-31 11:47:45 +01:00
Nico Feulner 3d674599b3
make group ID configurable (#11924)
Fixes #11773

Make the proxy's GID configurable via `proxy.gid`, which defaults to `-1`, in which case the GID is not set.
Also added the ability to set the GID for proxy-init and the core and extension controllers.
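
The "`-1` means unset" convention can be sketched like this. It assumes the group ends up in the Kubernetes `runAsGroup` security context field, which the message does not state; the function name and the example IDs are hypothetical.

```python
def security_context(uid, gid=-1):
    """Build a container securityContext; the group is only set when gid is not -1."""
    context = {"runAsUser": uid}
    if gid != -1:
        context["runAsGroup"] = gid
    return context


print(security_context(2102))
print(security_context(2102, gid=2102))
```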

---------

Signed-off-by: Nico Feulner <nico.feulner@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-05-23 15:54:21 -05:00
dependabot[bot] d42432914d
build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0 (#12593)
* build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0

Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.63.2 to 1.64.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.63.2...v1.64.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

I've replaced all the `grpc.Dial` calls with `grpc.NewClient`. There was one call `grpc.DialContext(ctx,...)` in `viz/tap/api/grpc_server.go` that also got replaced with `grpc.NewClient`, which loses the ability to pass `ctx`, but that doesn't matter: as we're not using `WithBlock(true)`, that context wasn't being accounted for when we were using `DialContext()` anyway.

https://github.com/grpc/grpc-go/blob/v1.64.0/clientconn.go#L216-L242

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-05-22 14:40:04 -05:00
Alex Leong 5a67e83ff5
Add json output format support to linkerd profile command (#12611)
Add an `-o/--output` flag to the `linkerd profile` command which outputs ServiceProfile manifests.  The supported output formats are yaml (default) and json.  Both of these formats are supported by kubectl and either one can be piped into `kubectl apply`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-05-21 16:37:14 -07:00
Alex Leong 10b1a7af6a
Add support for json output in inject and uninject commands (#12600)
We add support for the `--output/-o` flag in `linkerd inject` and `linkerd uninject` commands.  The supported output formats are yaml (default) and json.  Kubectl is able to accept both of these formats which means that `linkerd inject` and `linkerd uninject` output can be piped into kubectl regardless of which output format is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-05-21 16:36:53 -07:00
Alex Leong 2ed9ce4056
Add --token flag to diagnostics policy command (#12613)
Add a flag to the `linkerd diagnostics policy` command for being able to specify the context token sent in the policy request.  This allows the diagnostic command to get the policy as it would be seen by clients in particular namespaces.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-05-20 11:17:42 -07:00
Alejandro Pedraza 7dbafb26c8
Make IPv6 support opt-in (#12576)
This changes the default of the Helm value `disableIPv6` to `true`.
Additionally, the proxy's `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS` env var
is now set accordingly to that value.

This addresses an incompatibility with GKE introduced in last week's
edge (`edge-24.5.1`): in default IPv4-only nodes in GKE clusters the
proxy can't bind to `::1`, so we have to make IPv6 opt-in to avoid
surprises.
2024-05-09 09:09:26 -05:00
Mark S e641d0701d
feat(crds): Update gateway-api to 0.7.1 (#12507)
We want to begin to add support for the GRPCRoute resource type
to the policy controller. The GRPCRoute resource was introduced in
gateway-api v0.7.

In preparation for this upcoming work, this commit updates our Gateway
API CRD dependencies to 0.7.1, including the experimental GRPCRoute
resource type.
2024-05-06 13:29:43 -07:00
Alejandro Pedraza 5114e8e45a
Fix `linkerd dg endpoints` to work with IPv6 (#12541)
```
# BEFORE
$ bin/linkerd dg endpoints family-server.default.svc.cluster.local:8080
NAMESPACE   IP        PORT   POD                              SERVICE
default     0.0.0.0   8080   family-server-7cf95b6b89-x6cb9   family-server.default

# AFTER
$ bin/linkerd dg endpoints family-server.default.svc.cluster.local:8080
NAMESPACE   IP                      PORT   POD                              SERVICE
default     fd00:10:244::5          8080   family-server-7cf95b6b89-x6cb9   family-server.default
```
2024-05-02 14:39:43 -05:00
Alejandro Pedraza 137eac9df3
Add IPv6 support for the destination controller (#12428)
Services in dual-stack mode result in the creation of two EndpointSlices, one for each IP family. Before this change, the Get Destination API would nondeterministically return the address for any of those ES, depending on which one was processed last by the controller because they would overwrite each other.

As part of the ongoing effort to support IPv6/dual-stack networks, this change fixes that behavior giving preference to IPv6 addresses whenever a service exposes both families.

There is a new set of unit tests in server_ipv6_test.go, and the TestEndpointTranslatorForPods tests include a couple of new cases to test the interaction with zone filtering.
Also, the server unit tests were updated to separate the tests and resources dealing with the IPv4/IPv6/dual-stack cases.
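
The address-family preference described in this commit can be sketched as below. This is not the destination controller's Go code; it assumes the EndpointSlices have already been grouped by their address type, and the function name is hypothetical.

```python
def preferred_addresses(addresses_by_family):
    """addresses_by_family: mapping from "IPv4"/"IPv6" to a list of addresses.

    IPv6 addresses are preferred whenever the service exposes both families.
    """
    return addresses_by_family.get("IPv6") or addresses_by_family.get("IPv4", [])


dual_stack = {"IPv4": ["10.42.0.5"], "IPv6": ["fd00:10:244::5"]}
print(preferred_addresses(dual_stack))
```

Because the choice depends only on the family, the result no longer changes with the order in which the slices are processed.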
2024-05-02 14:39:05 -05:00
Alejandro Pedraza 7cbe2f5ca6
Enable forwarding IPv6 connections through the proxy (#12495)
As part of the ongoing effort to support IPv6/dual-stack networks, this change
enables the proxy to properly forward IPv6 connections:

- Adds the new `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS` environment variable when
  injecting the proxy. This is supported as of proxy v2.228.0 which was just
  pulled into the linkerd2 repo in #2d5085b56e465ef56ed4a178dfd766a3e16a631d.
  This adds the IPv6 loopback address (`[::1]`) to the IPv4 one (`127.0.0.1`)
  so the proxy can forward outbound connections received via IPv6. The injector
  will still inject `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR` to support the rare
  case where the `proxy.image.version` value is overridden with an older
  version. The new proxy still considers that variable, but it's superseded by
  the new one. The old variable is considered deprecated and should be removed
  in the future.
- The values for `LINKERD2_PROXY_CONTROL_LISTEN_ADDR`,
  `LINKERD2_PROXY_ADMIN_LISTEN_ADDR` and `LINKERD2_PROXY_INBOUND_LISTEN_ADDR`
  have been updated to point to the IPv6 wildcard address (`[::]`) instead of
  the IPv4 one (`0.0.0.0`) for the same reason. Unlike with the loopback
  address, the IPv6 wildcard address suffices to capture both IPv4 and IPv6
  traffic.
- The endpoint translator's `getInboundPort()` has been updated to properly
  parse the IPv6 loopback address retrieved from the proxy container manifest.
  A unit test was added to validate the behavior.
2024-05-02 16:39:19 +01:00
Andrew Seigner 42820d886f
Update `values.go` to better align with Helm (#12534)
Some fields in the Helm templates are not represented in `values.go`. This causes data loss when valid Helm values are unmarshalled into a `linkerd2.Values` struct.

Update `linkerd2.Values` to include additional fields already represented in the Helm templates.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2024-05-01 16:18:01 -07:00
Oliver Gould aef8a02426
feat(destination): Add meshed HTTP/2 keep-alive settings (#12504)
This commit adds destination controller configuration that enables default
keep-alives for meshed HTTP/2 clients.

This is accomplished by encoding the raw protobuf message structure into the
helm values, and then encoding that as JSON in the destination controller's
command-line options. This allows operators to set any supported HTTP/2 client
configuration without having to modify the destination controller.
2024-04-30 19:35:30 +00:00
Oliver Gould 246c62e7d3
feat: Configure default HTTP/2 server keep-alives (#12498)
HTTP/2 keep-alives enable HTTP/2 servers to issue PING messages to clients to
ensure that the connections are still healthy. This is especially useful when
the OS loses a FIN packet, which causes the connection resources to be held by
the proxy until we attempt to write to the socket again. Keep-alives provide a
mechanism to issue periodic writes on the socket to test the connection health.

5760ed2 updated the proxy to support HTTP/2 server configuration and 9e241a7
updated the proxy's Helm templating to support arbitrary configuration.
Together, these changes enable configuring default HTTP/2 server keep-alives
for all data planes in a cluster.

The default keep-alive configuration uses a 10s interval and a 3s timeout. These
are chosen to be conservative enough that they don't trigger false positive
timeouts, but frequent enough that these connections may be garbage collected
in a timely fashion.
2024-04-29 13:51:21 -07:00