Commit Graph

584 Commits

Author SHA1 Message Date
Oliver Gould f0d861ea9b
edge-24.2.4 (#12113)
* Updated the ExternalWorkload CRD to v1beta1, renaming the meshTls field to
  meshTLS ([#12098])
* Updated the proxy to address some logging and metrics inconsistencies
  ([#12099])
2024-02-20 11:47:59 -08:00
Matei David 98e38a66b6
Rename meshTls to meshTLS in ExternalWorkload CRD (#12098)
The ExternalWorkload resource we introduced has a minor naming
inconsistency; `Tls` in `meshTls` is not capitalised. Other resources
that we have (e.g. authentication resources) capitalise TLS (and so does
Go, which follows a similar naming convention).

We fix this in the workload resource by changing the field's name and
bumping the version to `v1beta1`.

Upgrading the control plane version will continue to work without
downtime. However, if an ExternalWorkload resource already exists, the policy
controller will not completely initialise. It will not enter a crashloop backoff,
but it will also not become ready until the resource is edited or
deleted.
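
As a rough illustration of the rename, a resource written against the new version might look like the sketch below. The resource name, namespace, identity, and IP are illustrative placeholders; the only parts taken from this change are the `v1beta1` version and the `meshTLS` spelling.

```yaml
# Sketch only: names, namespace, and IP are illustrative placeholders.
apiVersion: workload.linkerd.io/v1beta1   # bumped from v1alpha1
kind: ExternalWorkload
metadata:
  name: external-workload
  namespace: mixed-env
spec:
  meshTLS:   # previously spelled meshTls
    identity: spiffe://root.linkerd.cluster.local/external-workload
    serverName: external-workload.cluster.local
  ports:
  - name: http
    port: 80
    protocol: TCP
  workloadIPs:
  - ip: 172.22.0.5
```

Existing resources that still use the old spelling would need to be edited or recreated in this shape.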

Signed-off-by: Matei David <matei@buoyant.io>
2024-02-20 11:00:13 -08:00
Alex Leong 42cbf8fdc7
edge 24.2.3 (#12087)
* Allowed the `MutatingWebhookConfig` timeout value to be configured ([#12028])
  (thanks @mikebell90)
* Added a counter for items dropped from destination controller workqueue
  ([#12079])
* Fixed a spurious `linkerd check` error when using container images with
  digests ([#12059])
* Fixed an issue where inbound policy could be incorrect after certain policy
  resources are deleted ([#12088])

[#12028]: https://github.com/linkerd/linkerd2/pull/12028
[#12079]: https://github.com/linkerd/linkerd2/pull/12079
[#12059]: https://github.com/linkerd/linkerd2/pull/12059
[#12088]: https://github.com/linkerd/linkerd2/pull/12088

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-02-16 12:27:34 -08:00
Alejandro Pedraza 6142e52af0
Add `additionalArgs` helm Settings (#12081)
Add `additionalArgs` helm settings to the destination and policy controller manifests alongside the existing `experimentalArgs` ones.
2024-02-15 14:27:04 -05:00
Alejandro Pedraza 9ac1caaf1b
Add `additionalEnv` helm settings (#12080)
Add `additionalEnv` helm settings to the proxy and controller manifests
alongside the existing `experimentalEnv` ones.
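
A values-file sketch of how this might be used is shown below. It assumes `additionalEnv` takes the same list shape as `experimentalEnv`, which is not spelled out here; the variable names and values are placeholders.

```yaml
# Sketch only: assumes additionalEnv mirrors the experimentalEnv list shape.
proxy:
  additionalEnv:
  - name: EXAMPLE_PROXY_VAR        # placeholder name
    value: example-value
destinationController:
  additionalEnv:
  - name: EXAMPLE_CONTROLLER_VAR   # placeholder name
    value: example-value
```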
2024-02-15 14:26:45 -05:00
Michael Bell 24d308d42e
Allow `MutatingWebhookConfig` timeout value to be configured (#12028)
The proxy injector's admission request timeout is set to the Kubernetes default
10 second value. If the proxy injector does not write out a response within
this time frame, the `webhookFailurePolicy` configured on the webhook will be
used by the API Server.

In certain situations, it would help to have the timeout value configurable.
This change introduces a new Helm value for the `proxyInjector` that allows the
webhook config timeout duration to be overridden.
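
For context, the timeout in question is the standard `timeoutSeconds` field of a Kubernetes `MutatingWebhookConfiguration`. The fragment below is a minimal sketch of such an object, not the chart's rendered output: the object, webhook, and service names are assumptions, and the name of the new Helm value is left out because it is not stated here.

```yaml
# Sketch only: metadata, webhook, and service names are assumed for illustration.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-proxy-injector-webhook-config
webhooks:
- name: proxy-injector.example.io
  admissionReviewVersions: ["v1"]
  sideEffects: None
  clientConfig:
    service:
      name: example-proxy-injector
      namespace: linkerd
      path: /
  # Kubernetes accepts 1 to 30 seconds here and defaults to 10.
  timeoutSeconds: 20
  # Applied by the API server when the webhook does not answer in time.
  failurePolicy: Ignore
```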

---------

Signed-off-by: Michael Bell <mbell@opentable.com>
Signed-off-by: Michael Bell <mikebell90@users.noreply.github.com>
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Alex Leong <alex@buoyant.io>
2024-02-09 12:13:06 +00:00
Matei David 3073c406f3
edge-24.2.2 (#12053)
This release addresses some issues in the destination service that could cause
it to behave unexpectedly when processing updates.

* Fixed a race condition in the destination service that could cause panics
  under very specific conditions ([#12022]; fixes [#12010])
* Changed how updates to a `Server` selector are handled in the destination
  service. When a `Server` that marks a port as opaque no longer selects a
  resource, the resource's opaqueness will be reverted to default settings
  ([#12031]; fixes [#11995])
* Introduced Helm configuration values for liveness and readiness probe
  timeouts and delays ([#11458]; fixes [#11453]) (thanks @jan-kantert!)

[#12010]: https://github.com/linkerd/linkerd2/issues/12010
[#12022]: https://github.com/linkerd/linkerd2/pull/12022
[#11995]: https://github.com/linkerd/linkerd2/issues/11995
[#12031]: https://github.com/linkerd/linkerd2/pull/12031
[#11453]: https://github.com/linkerd/linkerd2/issues/11453
[#11458]: https://github.com/linkerd/linkerd2/pull/11458

Signed-off-by: Matei David <matei@buoyant.io>
2024-02-09 11:19:14 +00:00
jan-kantert af402a35ff
Introduce Helm configuration for probe timeout and delays (#11458)
In certain cases (e.g. high CPU load) kubelets can be slow to read readiness
and liveness responses. Linkerd is configured with a default timeout of `1s`
for its probes. To prevent injected pod restarts under high load, this
change makes probe timeouts configurable.
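
The settings ultimately map onto the standard Kubernetes probe fields on the injected container. The fragment below is a sketch of what a rendered container spec could look like with a longer timeout; the container name, probe paths, and port are assumptions rather than values taken from this change.

```yaml
# Sketch only: the container name, probe paths, and port are assumptions.
containers:
- name: linkerd-proxy
  livenessProbe:
    httpGet:
      path: /live
      port: 4191
    initialDelaySeconds: 10
    timeoutSeconds: 5   # Kubernetes defaults to 1 when unset
  readinessProbe:
    httpGet:
      path: /ready
      port: 4191
    initialDelaySeconds: 2
    timeoutSeconds: 5
```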

---------

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Matei David <matei@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-02-08 18:05:53 +00:00
Alejandro Pedraza bcbcf43c78
Change notes for edge-24.2.1 (#12029)
This edge release contains performance and stability improvements to the
Destination controller, and continues stabilizing support for ExternalWorkloads.

* Reduced the load on the Destination controller by only processing Server
  updates on workloads affected by the Server ([#12017])
* Changed how the Destination controller reacts to target clusters (in
  multicluster pod-to-pod mode) whose Server CRD is outdated: skip them and log
  an error instead of panicking ([#12008])
* Improved the leader election of the ExternalWorkloads Endpoints controller to
  avoid missing events ([#12021])
* Improved naming of EndpointSlices generated by ExternalWorkloads ([#12016])
* Restricted the number of IPs an ExternalWorkload can have ([#12026])
2024-02-02 12:45:04 -05:00
Matei David d820a236f0
Restrict the number of IPs an ExternalWorkload can have (#12026)
A Kubernetes pod may be assigned at [most one IP address][pod-docs] for each supported protocol (i.e. IPv6 and IPv4), without the use of specialised CNIs or network configurations. When processing addresses in an endpoint, we will only ever use one address.

ExternalWorkload resources have a generic workloadIPs field that allows any number of addresses to be added. We want the behaviour to be similar to a pod -- only one address (of each protocol) should be used for routing.

We restrict the CRD server-side validation to allow only one IP address. Since we do not yet support IPv6, this will ensure that two IPv4 addresses will not be declared by the same workload. Once IPv6 support lands, or once we have a dedicated validator, we can relax the CRD validation.
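
A limit like this is typically expressed with the OpenAPI `maxItems` keyword in the CRD's validation schema, which is what yields a "must have at most 1 items" rejection from the API server. The fragment below is a sketch of that idea, not a copy of the actual CRD; the surrounding schema layout is assumed.

```yaml
# Sketch only: a fragment of a CRD's openAPIV3Schema, layout assumed.
properties:
  spec:
    type: object
    properties:
      workloadIPs:
        type: array
        maxItems: 1   # reject resources that list more than one address
        items:
          type: object
          properties:
            ip:
              type: string
```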

[pod-docs]: https://pkg.go.dev/k8s.io/kubernetes@v1.29.1/pkg/apis/core#PodStatus

### How to test

* Install Linkerd after building the branch (just the crds will do, `linkerd install --crds`).
* Try to apply the following resource:

```yaml
apiVersion: workload.linkerd.io/v1alpha1
kind: ExternalWorkload
metadata:
  labels:
    app: legacy
  name: external-workload-invalid
  namespace: mixed-env
spec:
  meshTls:
    identity: spiffe://root.linkerd.cluster.local/external-workload-invalid
    serverName: external-workload-invalid.cluster.local
  ports:
  - name: http
    port: 80
    protocol: TCP
  workloadIPs:
  - ip: 172.22.0.5
  - ip: 172.22.0.6
status:
  conditions:
  - lastTransitionTime: "2024-01-24T11:53:43Z"
    message: This workload is alive
    reason: Alive
    status: "True"
    type: Ready
```

* Expect the creation to fail

> The ExternalWorkload "external-workload-invalid" is invalid: spec.workloadIPs: Too many: 2: must have at most 1 items

*Edit*: going to open up a separate PR for the refactor.

Signed-off-by: Matei David <matei@buoyant.io>
2024-02-02 08:20:57 -05:00
Oliver Gould 4e4ff03255
edge-24.1.3 (#11994)
* proxy: v2.220.0

* build(deps): bump itertools from 0.10.5 to 0.11.0 (linkerd/linkerd2-proxy#2594)
* build(deps): bump async-trait from 0.1.68 to 0.1.75 (linkerd/linkerd2-proxy#2595)
* pool: Decompose the pool and balancer crates (linkerd/linkerd2-proxy#2597)
* balance: Move endpoint state gauge into balancer (linkerd/linkerd2-proxy#2598)
* cargo: Remove cyclic meshtls dependency (linkerd/linkerd2-proxy#2602)
* build(deps): bump mime from 0.3.16 to 0.3.17 (linkerd/linkerd2-proxy#2599)
* build(deps): bump parking_lot_core from 0.9.5 to 0.9.9 (linkerd/linkerd2-proxy#2600)
* build(deps): bump prost-build from 0.12.1 to 0.12.3 (linkerd/linkerd2-proxy#2601)
* outbound: Update route backend metrics implementation (linkerd/linkerd2-proxy#2603)
* deps: Update to indexmap v2 (linkerd/linkerd2-proxy#2604)
* build(deps): bump actions/download-artifact from 3.0.2 to 4.1.0 (linkerd/linkerd2-proxy#2569)
* deps: h2 v0.3.22 (linkerd/linkerd2-proxy#2605)
* tracing: Ensure that INFO-level spans are preserved (linkerd/linkerd2-proxy#2611)
* build(deps): bump serde from 1.0.185 to 1.0.193 (linkerd/linkerd2-proxy#2606)
* build(deps): bump tokio-boring from 3.0.4 to 3.1.0 (linkerd/linkerd2-proxy#2607)
* build(deps): bump deranged from 0.3.10 to 0.3.11 (linkerd/linkerd2-proxy#2608)
* build(deps): bump axum from 0.6.11 to 0.6.20 (linkerd/linkerd2-proxy#2609)
* build(deps): bump proc-macro2 from 1.0.69 to 1.0.74 (linkerd/linkerd2-proxy#2610)
* build(deps): bump ahash from 0.8.6 to 0.8.7 (linkerd/linkerd2-proxy#2612)
* build(deps): bump cc from 1.0.79 to 1.0.83 (linkerd/linkerd2-proxy#2613)
* build(deps): bump scopeguard from 1.1.0 to 1.2.0 (linkerd/linkerd2-proxy#2614)
* build(deps): bump io-lifetimes from 1.0.10 to 1.0.11 (linkerd/linkerd2-proxy#2616)
* build(deps): bump pem from 3.0.2 to 3.0.3 (linkerd/linkerd2-proxy#2615)
* build(deps): bump anyhow from 1.0.76 to 1.0.79 (linkerd/linkerd2-proxy#2619)
* build(deps): bump socket2 from 0.4.9 to 0.5.5 (linkerd/linkerd2-proxy#2622)
* build(deps): bump libfuzzer-sys from 0.4.6 to 0.4.7 (linkerd/linkerd2-proxy#2620)
* build(deps): bump tempfile from 3.5.0 to 3.6.0 (linkerd/linkerd2-proxy#2621)
* build(deps): bump ryu from 1.0.13 to 1.0.16 (linkerd/linkerd2-proxy#2623)
* identity: Update metrics to follow OpenMetrics best practices (linkerd/linkerd2-proxy#2617)
* build(deps): bump tokio from 1.34.0 to 1.35.1 (linkerd/linkerd2-proxy#2627)
* build(deps): bump tracing from 0.1.37 to 0.1.40 (linkerd/linkerd2-proxy#2628)
* build(deps): bump slab from 0.4.8 to 0.4.9 (linkerd/linkerd2-proxy#2629)
* build(deps): bump unicode-bidi from 0.3.11 to 0.3.14 (linkerd/linkerd2-proxy#2630)
* build(deps): bump tokio-stream from 0.1.12 to 0.1.14 (linkerd/linkerd2-proxy#2632)
* build(deps): bump boring-sys from 3.0.4 to 3.1.0 (linkerd/linkerd2-proxy#2633)
* build(deps): bump rcgen from 0.11.3 to 0.12.0 (linkerd/linkerd2-proxy#2635)
* build(deps): bump trust-dns-resolver from 0.22.0 to 0.23.2 (linkerd/linkerd2-proxy#2631)
* build(deps): bump memchr from 2.6.4 to 2.7.1 (linkerd/linkerd2-proxy#2637)
* build(deps): bump pin-project from 1.0.12 to 1.1.3 (linkerd/linkerd2-proxy#2638)
* build(deps): bump futures from 0.3.28 to 0.3.30 (linkerd/linkerd2-proxy#2639)
* build(deps): bump rangemap from 1.3.0 to 1.4.0 (linkerd/linkerd2-proxy#2640)
* build(deps): bump actions/download-artifact from 4.1.0 to 4.1.1 (linkerd/linkerd2-proxy#2636)
* build(deps): bump thingbuf from 0.1.3 to 0.1.4 (linkerd/linkerd2-proxy#2642)
* build(deps): bump rustix from 0.36.16 to 0.36.17 (linkerd/linkerd2-proxy#2643)
* build(deps): bump httpdate from 1.0.2 to 1.0.3 (linkerd/linkerd2-proxy#2645)
* build(deps): bump num_cpus from 1.15.0 to 1.16.0 (linkerd/linkerd2-proxy#2646)
* Change inbound port check log level to debug. (linkerd/linkerd2-proxy#2625)
* docs: Fix bad reference link (linkerd/linkerd2-proxy#2647)
* identity: add spire identity client (linkerd/linkerd2-proxy#2580)
* config:add spire client config (linkerd/linkerd2-proxy#2641)
* discovery: consume server_name and UriLikeIdentity from proto (linkerd/linkerd2-proxy#2618)
* build(deps): bump h2 from 0.3.22 to 0.3.24 (linkerd/linkerd2-proxy#2660)
* build(deps): bump procfs from 0.15.1 to 0.16.0 (linkerd/linkerd2-proxy#2649)
* build(deps): bump async-trait from 0.1.75 to 0.1.77 (linkerd/linkerd2-proxy#2650)
* build(deps): bump semver from 1.0.20 to 1.0.21 (linkerd/linkerd2-proxy#2651)
* build(deps): bump smallvec from 1.10.0 to 1.13.1 (linkerd/linkerd2-proxy#2661)
* build(deps): bump either from 1.8.1 to 1.9.0 (linkerd/linkerd2-proxy#2652)
* build(deps): bump actions/upload-artifact from 4.0.0 to 4.2.0 (linkerd/linkerd2-proxy#2658)
* build(deps): bump shlex from 1.1.0 to 1.3.0 (linkerd/linkerd2-proxy#2664)
* build(deps): bump DavidAnson/markdownlint-cli2-action (linkerd/linkerd2-proxy#2656)
* build(deps): bump EmbarkStudios/cargo-deny-action from 1.5.5 to 1.5.10 (linkerd/linkerd2-proxy#2665)
* build(deps): bump serde from 1.0.193 to 1.0.195 (linkerd/linkerd2-proxy#2670)
* build(deps): bump clang-sys from 1.6.0 to 1.7.0 (linkerd/linkerd2-proxy#2668)
* build(deps): bump zerocopy from 0.7.31 to 0.7.32 (linkerd/linkerd2-proxy#2666)
* build(deps): bump unicode-ident from 1.0.6 to 1.0.12 (linkerd/linkerd2-proxy#2667)
* build(deps): bump actions/upload-artifact from 4.2.0 to 4.3.0 (linkerd/linkerd2-proxy#2671)
* build(deps): bump prettyplease from 0.2.15 to 0.2.16 (linkerd/linkerd2-proxy#2673)
* build(deps): bump getrandom from 0.2.8 to 0.2.12 (linkerd/linkerd2-proxy#2674)
* build(deps): bump which from 4.4.0 to 4.4.2 (linkerd/linkerd2-proxy#2675)
* build(deps): bump sharded-slab from 0.1.4 to 0.1.7 (linkerd/linkerd2-proxy#2676)
* build(deps): bump EmbarkStudios/cargo-deny-action from 1.5.10 to 1.5.11 (linkerd/linkerd2-proxy#2672)
* build(deps): bump tj-actions/changed-files from 41.0.1 to 42.0.0 (linkerd/linkerd2-proxy#2657)

Signed-off-by: Oliver Gould <ver@buoyant.io>

* Bump helm version

* +changes

* Update CHANGES.md

Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>

---------

Signed-off-by: Oliver Gould <ver@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-01-26 11:14:32 -08:00
Alejandro Pedraza 796bb85323
Bump proxy-init to v2.2.4 (#11988)
Upgraded Alpine to 3.19.0, and other various dependencies bumps.
2024-01-26 09:28:14 -08:00
Matei David dbd72cc283
Relax validation for ExternalWorkload Status fields (#11979)
ExternalWorkload resources require that status condition has almost all of its
fields set (with the exception of a date field). The original inspiration for
this design was the HTTPRoute object.

When using the resource, it is more practical to handle many of the fields as
optional; it is cumbersome to fill out the fields when creating an
ExternalWorkload. We change the settings to be in-line with a [Pod] object
instead.

[Pod]:
7d1a2f7a73/core/v1/types.go (L3063-L3084)


---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-24 14:12:32 +00:00
Alex Leong 38777c7b0b
edge-24.1.2 (#11951)
This edge release incrementally improves support for ExternalWorkload resources
throughout the control plane.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-01-19 10:31:49 -08:00
Matei David 983fc55abc
Introduce new external endpoints controller (#11905)
For mesh expansion, we need to register an ExternalWorkload's service
membership. Service memberships describe which Service objects an
ExternalWorkload is part of (i.e. which service can be used to route
traffic to an external endpoint).

Service membership will allow the control plane to discover
configuration associated with an external endpoint when performing
discovery on a service target.

To build these memberships, we introduce a new controller to the
destination service, responsible for watching Service and
ExternalWorkload objects, and for writing out EndpointSlice objects for
each Service that selects one or more external endpoints.

As a first step, we add a new externalworkload module and a new controller in
the module that watches services and workloads. In a follow-up change,
the ExternalEndpointManager will additionally perform
the necessary reconciliation by writing EndpointSlice objects.

Since Linkerd's control plane may run in HA, we also add a lease object
that will be used by the manager. When a lease is claimed, a flag is
turned on in the manager to let it know it may perform writes.
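
For readers unfamiliar with leases, the object being claimed is a standard `coordination.k8s.io/v1` `Lease`. The sketch below shows roughly what a claimed lease looks like; the lease name, namespace, holder identity, and timings are assumptions and are not taken from this change.

```yaml
# Sketch only: the Lease name, namespace, holder, and timings are assumptions.
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: example-endpoints-controller-write
  namespace: linkerd
spec:
  # Identity of the replica that currently holds the lease and may write.
  holderIdentity: linkerd-destination-abc123
  leaseDurationSeconds: 30
  renewTime: "2024-01-17T12:00:00.000000Z"
```

Only the replica named in `holderIdentity` would have its write flag turned on; the others keep watching but do not write.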

A more compact list of changes:
* Add a new externalworkload module
* Add an EndpointsController in the module along with necessary mechanisms to watch resources.
* Add RBAC rules to the destination service:
  * Allow policy and destination to read ExternalWorkload objects
  * Allow destination to create / update / read Lease objects

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-17 12:15:28 +00:00
Matei David b5f384f55e
Index ExternalWorkload resources in the policy controller (#11940)
ExternalWorkload resources represent configuration associated
with a process (or a group of processes) that are foreign to a Kubernetes
cluster. It allows Linkerd to read / write and store configuration for mesh
expansion. Since VMs will be able to receive inbound traffic from a variety of
resources, the proxy should be able to dynamically discover inbound
authorisation policies.

This change introduces a set of callbacks in the indexer that will apply (or
delete) ExternalWorkload resources. In addition, we ensure that
ExternalWorkloads can be processed in a similar fashion to pods (where
applicable, of course) with respect to server matching and defaulting. To serve
discovery requests for a VM, the policy controller will now also start a
watcher for external workloads and allow requests to reference an
`external_workload` target.

A quick list of changes:

* ExternalWorkloads can now be indexed in the inbound (policy) index.
* Renamed the pod module in the inbound index to be more generic ("workload");
  the module has some re-usable building blocks that we can use for external
  workloads.
* Moved common functions (e.g. building a default inbound server) around to
  share what's already been done without abstracting more or introducing
  generics.
* Changed gRPC target types to a tuple of `(Workload, port)` from a tuple of
  `(String, String, port)`.
* Added RBAC to watch external workloads.

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-17 10:43:43 +00:00
Andrew Seigner b9546af08f
helm: Use k8s `EnvVar` for `proxy.ExperimentalEnv` (#11923)
PR #11874 introduced a `proxy.ExperimentalEnv` setting, allowing
arbitrary name+value environment variables on proxies. This name+value
pairing was a subset of k8s' environment variables, specifically, it did
not allow for `valueFrom.configMapKeyRef` and related fields. PR #11908
introduced this pattern in the ControlPlane containers.

Modify `proxy.ExperimentalEnv` to behave identically to k8s' native
`EnvVar`, allowing settings such as:
```
--set proxy.experimentalEnv[0].name=LINKERD2_PROXY_DEFROBINATION
--set proxy.experimentalEnv[0].valueFrom.configMapKeyRef.key=extreme-key
--set proxy.experimentalEnv[0].valueFrom.configMapKeyRef.name=extreme-config
```

Context:
https://github.com/linkerd/linkerd2/pull/11908#issuecomment-1888945793

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2024-01-15 10:03:11 +00:00
Matei David af823dcddf
edge-24.1.1 (#11922)
This edge release introduces a number of different fixes and improvements. More
notably, it introduces a new `cni-repair-controller` binary to the CNI plugin
image. The controller will automatically restart pods that have not received
their iptables configuration.

* Removed shortnames from Tap API resources to avoid colliding with existing
  Kubernetes resources ([#11816]; fixes [#11784])
* Introduced a new ExternalWorkload CRD to support upcoming mesh expansion
  feature ([#11805])
* Changed `MeshTLSAuthentication` resource validation to allow SPIFFE URI
  identities ([#11882])
* Introduced a new `cni-repair-controller` to the `linkerd-cni` DaemonSet to
  automatically restart misconfigured pods that are missing iptables rules
  ([#11699]; fixes [#11073])
* Fixed a `"duplicate metrics"` warning in the multicluster service-mirror
  component ([#11875]; fixes [#11839])
* Added metric labels and weights to `linkerd diagnostics endpoints` json
  output ([#11889])
* Changed how `Server` updates are handled in the destination service. The
  change will ensure that during a cluster resync, consumers won't be
  overloaded by redundant updates ([#11907])
* Changed `linkerd install` error output to add a newline when a Kubernetes
  client cannot be successfully initialised ([#11917])

[#11816]: https://github.com/linkerd/linkerd2/pull/11816
[#11784]: https://github.com/linkerd/linkerd2/issues/11784
[#11805]: https://github.com/linkerd/linkerd2/pull/11805
[#11882]: https://github.com/linkerd/linkerd2/pull/11882
[#11699]: https://github.com/linkerd/linkerd2/pull/11699
[#11073]: https://github.com/linkerd/linkerd2/issues/11073
[#11875]: https://github.com/linkerd/linkerd2/pull/11875
[#11839]: https://github.com/linkerd/linkerd2/issues/11839
[#11889]: https://github.com/linkerd/linkerd2/pull/11889
[#11907]: https://github.com/linkerd/linkerd2/pull/11907
[#11917]: https://github.com/linkerd/linkerd2/pull/11917

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-12 18:12:22 +00:00
Andrew Seigner f1f536761a
helm: Add `experimentalEnv` settings (#11908)
Enable setting arbitrary environment variables on all Go-based Control
Plane containers

For example, via CLI:

```
--set destinationController.experimentalEnv[0].name=ENV_VAR,destinationController.experimentalEnv[0].value=ENV_VAL
```

Via values file:

```
destinationController:
  experimentalEnv:
  - name: DEST_ENV_VAR
    value: DEST_ENV_VAL
  - name: DEST_CONFIG_MAP_VAR
    valueFrom:
      configMapKeyRef:
        key: dest_config_map_key
        name: dest-config-map-name
heartbeat:
  experimentalEnv:
identity:
  experimentalEnv:
proxyInjector:
  experimentalEnv:
spValidator:
  experimentalEnv:

serviceMirrorExperimentalEnv:
```

Relates to #11862 and #11874

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2024-01-12 14:17:15 +00:00
Zahari Dichev abb9d819a0
policy: use json encoded string to represent policy token (#11910)
Currently, the value that is put in the `LINKERD2_PROXY_POLICY_WORKLOAD` env var has the format of `pod_ns:pod_name`. This PR changes the format of the policy token into a json struct, so it can encode the type of workload and not only its location. For now, we add an additional `external_workload` type.
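
To make the format change concrete, the sketch below shows how the env var might look on a proxy container before and after. The JSON key names and the pod values are assumptions for illustration; this description does not state the exact schema.

```yaml
# Sketch only: the JSON key names are assumptions, not confirmed by this change.
env:
- name: LINKERD2_PROXY_POLICY_WORKLOAD
  # Old format was a plain "pod_ns:pod_name" string, e.g. "emojivoto:web-0".
  # New format is a JSON object, so the workload type can be encoded too.
  value: '{"ns":"emojivoto","pod":"web-0"}'
```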


Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-01-11 22:15:29 +02:00
Matei David 3f4925bfdb
Improve server-side validation for ExternalWorkload (#11900)
We introduced an ExternalWorkload CRD along with bindings for mesh
expansion. Currently, the CRD allows users to create ExternalWorkload
resources without adding a meshTls strategy.

This change adds some more validation restrictions to the CRD definition
(i.e. server side validation). When a meshTls strategy is used, we
require both identity and serverName to be present. We also mark meshTls
as the only required field in the spec. Every ExternalWorkload regardless
of the direction of its traffic must have it set.

WorkloadIPs and ports now become optional to allow resources to be
created only to configure outbound discovery (VM to workload)
and inbound policy discovery (VM).
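
These rules can be pictured as `required` lists in the CRD's validation schema. The fragment below is a sketch of that structure under an assumed schema layout, not the actual CRD definition.

```yaml
# Sketch only: a fragment of a CRD's openAPIV3Schema, layout assumed.
properties:
  spec:
    type: object
    required:
    - meshTls          # the only required field in the spec
    properties:
      meshTls:
        type: object
        required:      # both must be present when meshTls is set
        - identity
        - serverName
        properties:
          identity:
            type: string
          serverName:
            type: string
      workloadIPs:     # optional
        type: array
        items:
          type: object
      ports:           # optional
        type: array
        items:
          type: object
```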

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-01-11 10:04:39 +00:00
Zahari Dichev 5e32446111
policy: add externalWorkloadSelector to Server resource (#11899)
This PR adds the ability for a `Server` resource to select over `ExternalWorkload`
resources in addition to `Pods`. For the time being, only one of these selector types
can be specified. This has been realized via incrementing the version of the resource
to `v1beta2`
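
A `Server` using the new selector might look like the sketch below. The resource name, namespace, labels, and port are illustrative placeholders, and the selector is assumed to take the same `matchLabels` form as a pod selector.

```yaml
# Sketch only: names, labels, and the port are illustrative placeholders.
apiVersion: policy.linkerd.io/v1beta2
kind: Server
metadata:
  name: legacy-http
  namespace: mixed-env
spec:
  # Use either externalWorkloadSelector or podSelector, not both.
  externalWorkloadSelector:
    matchLabels:
      app: legacy
  port: http
  proxyProtocol: HTTP/1
```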

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-01-09 13:26:20 +02:00
Alejandro Pedraza 55d1049b73
Add cni-repair-controller to linkerd-cni DaemonSet (#11699)
Followup to linkerd/linkerd2-proxy-init#306
Fixes linkerd/linkerd2#11073

This adds the `reinitialize-pods` container to the `linkerd-cni`
DaemonSet, along with its config in `values.yaml`.

Also the `linkerd-cni`'s version is bumped, to contain the new binary
for this controller.
2024-01-05 09:28:43 -08:00
Zahari Dichev 6f3a6461b9
policy: allow spiffe ids in `MeshTLSAuthentication` (#11882)
This change enables the use of SPIFFE identities in `MeshTLSAuthentication`.
To make that happen validation of the identities field on the CRD has been moved
to the policy controller admission webhook. Apart from a more clear expression
of the constraints that a SPIFFE ID needs to meet, this approach allows for
richer error messages. Note that the DNS validation is still based on a regex.
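
As an illustration, a `MeshTLSAuthentication` that names a SPIFFE identity could be written as in the sketch below. The resource name, namespace, and SPIFFE ID are placeholders.

```yaml
# Sketch only: the resource name, namespace, and SPIFFE ID are placeholders.
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: external-clients
  namespace: mixed-env
spec:
  identities:
  - "spiffe://root.linkerd.cluster.local/external-workload"
```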

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-01-05 17:00:54 +02:00
Matei David 31e1334f9e
Introduce ExternalWorkload CRD (#11805)
To support mesh expansion, the control plane needs to read configuration
associated with an external instance (i.e. a VM) for the purpose of
service and inbound authorization policy discovery.

This change introduces a new CRD that supports the required
configuration options. The resource supports:

* a list of workload IPs (with a generic format to support ipv4 now and ipv6
  in the future)
* a set of mesh TLS settings (SNI and identity)
* a set of ports exposed by the workload
* a set of status conditions

---------

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2024-01-05 11:35:38 +00:00
Oliver Gould 7cb0e3645a
helm: Add experimentalArgs to destination and policy controllers (#11862)
When debugging controller behavior, it may be desirable to run a controller
with additional command-line flags that aren't explicitly referenced in
our values.yaml.

This change adds support for undocumented experimentalArgs values that
can be set on the policyController and destinationController parent
scopes.
2024-01-04 10:36:44 +00:00
Oliver Gould 3a2d164a5d
helm: Add proxy.experimentalEnv settings (#11874)
When working with experimental proxy features that are not yet exposed via
control plane APIs, it can be convenient to set additional environment variables
on proxies.

To support this, we add an undocumented `proxy.experimentalEnv` value:

    --set proxy.experimentalEnv.LINKERD2_PROXY_DEFROBINATION=extreme
2024-01-04 10:28:16 +00:00
Oliver Gould 9972fd630d
edge-23.12.4 (#11843)
This edge release includes fixes and improvements to the destination
controller's endpoint resolution API.

* Fixed an issue in the control plane where discovery for pod IP addresses could
  hang indefinitely ([#11815])
* Updated the proxy to enforce time limits on control plane response streams so
  that proxies more naturally distribute load over control plane replicas
  ([#11837])
* Fixed the policy controller's service metadata responses so that proxy logs
  and metrics have informative values ([#11842])
2023-12-28 06:54:31 -08:00
Oliver Gould 04f2ce511a
inject: Configure proxy stream lifetime limits (#11837)
linkerd/linkerd2-proxy#2587 adds configuration parameters that bound the
lifetime and idle times of control plane streams. This change helps to
mitigate imbalanced control plane replica usage and to generally prevent
scenarios where a stream becomes "stuck," as has been observed when a
control plane replica is unhealthy.

This change adds helm values to control this behavior. Default values
are provided.
2023-12-27 16:24:33 -08:00
Alex Leong 8ed1735200
edge-23.12.3 (#11806)
This edge release contains improvements to the logging and diagnostics of the
destination controller.

* Added a control plane metric to count errors talking to the Kubernetes API
  ([#11774])
* Fixed an issue causing spurious destination controller error messages for
  profile lookups on unmeshed pods with a port in the default opaque list ([#11550])

[#11774]: https://github.com/linkerd/linkerd2/pull/11774
[#11550]: https://github.com/linkerd/linkerd2/pull/11550

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-12-20 15:09:42 -08:00
Alejandro Pedraza 913e118bc8
edge-23.12.2 change notes (#11764)
## edge-23.12.2

This edge release includes a restructuring of the proxy's balancer along with
accompanying new metrics. The new minimum supported Kubernetes version is 1.22.

* Restructured the proxy's balancer ([#11750]): balancer changes may now occur
  independently of request processing. Fail-fast circuit breaking is enforced on
  the balancer's queue so that requests can't get stuck in a queue indefinitely.
  This new balancer is instrumented with new metrics: request (in-queue) latency
  histograms, failfast states, discovery updates counts, and balancer endpoint
  pool sizes.
* Changed how the policy controller updates HTTPRoute status so that it doesn't
  affect statuses from other non-linkerd controllers ([#11705]; fixes [#11659])

[#11750]: https://github.com/linkerd/linkerd2/pull/11750
[#11705]: https://github.com/linkerd/linkerd2/pull/11705
[#11659]: https://github.com/linkerd/linkerd2/pull/11659
2023-12-14 18:56:52 -05:00
Oliver Gould 5f100b3195
Bump min Kubernetes API to v1.22 (#11737)
New versions of the k8s-openapi crate drop support for Kubernetes 1.21.
Kubernetes v1.22 has been considered EOL by the upstream project since
2022-07-08. Major cloud providers have EOL'd it as well (GKE's current
MSKV is 1.24).

This change updates the MSKV to v1.22. It also updates the max version
in _test-helpers.sh to v1.28.
2023-12-11 12:15:56 -08:00
Matei David d0ca071bed
edge-23.12.1 (#11675)
This edge release introduces new configuration values in the identity
controller for client-go's `QPS` and `Burst` settings. Default values for these
settings have also been raised from `5` (QPS) and `10` (Burst) to `100` and
`200` respectively.

* Added `namespaceSelector` fields for the tap-injector and jaeger-injector
  webhooks. The webhooks are now configured to skip `kube-system` by default
  ([#11649]; fixes [#11647]) (thanks @mikutas!)
* Added the ability to configure client-go's `QPS` and `Burst` settings in the
  identity controller ([#11644])
* Improved client-go logging visibility throughout the control plane's
  components ([#11632])
* Introduced `PodDisruptionBudgets` in the linkerd-viz Helm chart for tap and
  tap-injector ([#11628]; fixes [#11248]) (thanks @mcharriere!)
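
For the `namespaceSelector` item above, skipping `kube-system` is commonly done by matching on the namespace name label in the webhook configuration. The sketch below shows one way this could look; the object, webhook, and service names are assumptions, and the exact selector used by the charts is not stated here.

```yaml
# Sketch only: object, webhook, and service names are assumed.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-tap-injector-webhook-config
webhooks:
- name: tap-injector.example.io
  admissionReviewVersions: ["v1"]
  sideEffects: None
  clientConfig:
    service:
      name: example-tap-injector
      namespace: linkerd-viz
      path: /
  namespaceSelector:
    matchExpressions:
    # kubernetes.io/metadata.name is set automatically on every namespace.
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - kube-system
```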

[#11649]: https://github.com/linkerd/linkerd2/pull/11649
[#11647]: https://github.com/linkerd/linkerd2/issues/11647
[#11644]: https://github.com/linkerd/linkerd2/pull/11644
[#11632]: https://github.com/linkerd/linkerd2/pull/11632
[#11628]: https://github.com/linkerd/linkerd2/pull/11628
[#11248]: https://github.com/linkerd/linkerd2/issues/11248

Signed-off-by: Matei David <matei@buoyant.io>
2023-12-01 10:30:41 +00:00
Alejandro Pedraza 2d716299a1
Add ability to configure client-go's `QPS` and `Burst` settings (#11644)
* Add ability to configure client-go's `QPS` and `Burst` settings

## Problem and Symptoms

When a very large number of proxies request identity in a short period of time (e.g. during large node scaling events), the identity controller will attempt to validate the tokens sent by the proxies at a rate surpassing client-go's default request rate threshold. This triggers client-side throttling, which delays the proxies' initialization and can even fail their startup (after a 2m timeout). The identity controller will surface this through log entries like this:

```
time="2023-11-08T19:50:45Z" level=error msg="error validating token for web.emojivoto.serviceaccount.identity.linkerd.cluster.local: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline"
```

## Solution

Client-go's default `QPS` is 5 and `Burst` is 10. This PR exposes those settings as entries in `values.yaml` with defaults of 100 and 200 respectively. Note this only applies to the identity controller, as it's the only controller performing direct requests to the `kube-apiserver` in a hot path. The other controllers mostly rely on informers, and direct calls are sporadic.
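
A values-file override might look like the sketch below. The key names under `identity` are assumptions made for illustration, since the exact entries are not named here; only the default numbers come from this description.

```yaml
# Sketch only: the key names under identity are assumptions.
identity:
  kubeAPI:
    clientQPS: 100     # client-go default is 5
    clientBurst: 200   # client-go default is 10
```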

## Observability

The `QPS` and `Burst` settings used are exposed both as a log entry as soon as the controller starts, and in the new metric gauges `http_client_qps` and `http_client_burst`.

## Testing

You can use the following K6 script, which simulates 6k calls to the `Certify` service during one minute from emojivoto's web pod. Before running this you need to:

- Put the identity.proto and [all the other proto files](https://github.com/linkerd/linkerd2-proxy-api/tree/v0.11.0/proto) in the same directory.
- Edit the [checkRequest](https://github.com/linkerd/linkerd2/blob/edge-23.11.3/pkg/identity/service.go#L266) function and add logging statements to figure out the `token` and `csr` entries you can use here; they will be logged as soon as a web pod starts.

```javascript
import { Client, Stream } from 'k6/experimental/grpc';
import { sleep } from 'k6';

const client = new Client();
client.load(['.'], 'identity.proto');

// This always holds:
// req_num = (1 / req_duration ) * duration * VUs
// Given req_duration (0.5s) test duration (1m) and the target req_num (6k), we
// can solve for the required VUs:
// VUs = req_num * req_duration / duration
// VUs = 6000 * 0.5 / 60 = 50
export const options = {
  scenarios: {
    identity: {
      executor: 'constant-vus',
      vus: 50,
      duration: '1m',
    },
  },
};

export default () => {
  client.connect('localhost:8080', {
    plaintext: true,
  });

  const stream = new Stream(client, 'io.linkerd.proxy.identity.Identity/Certify');

  // Replace with your own token
  let token = "ZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNkluQjBaV1pUZWtaNWQyVm5OMmxmTTBkV2VUTlhWSFpqTmxwSmJYRmtNMWRSVEhwNVNHWllhUzFaZDNNaWZRLmV5SmhkV1FpT2xzaWFXUmxiblJwZEhrdWJEVmtMbWx2SWwwc0ltVjRjQ0k2TVRjd01EWTRPVFk1TUN3aWFXRjBJam94TnpBd05qQXpNamt3TENKcGMzTWlPaUpvZEhSd2N6b3ZMMnQxWW1WeWJtVjBaWE11WkdWbVlYVnNkQzV6ZG1NdVkyeDFjM1JsY2k1c2IyTmhiQ0lzSW10MVltVnlibVYwWlhNdWFXOGlPbnNpYm1GdFpYTndZV05sSWpvaVpXMXZhbWwyYjNSdklpd2ljRzlrSWpwN0ltNWhiV1VpT2lKM1pXSXRPRFUxT1dJNU4yWTNZeTEwYldJNU5TSXNJblZwWkNJNklqaGlZbUV5WWpsbExXTXdOVGN0TkRnMk1TMWhNalZsTFRjelpEY3dOV1EzWmpoaU1TSjlMQ0p6WlhKMmFXTmxZV05qYjNWdWRDSTZleUp1WVcxbElqb2lkMlZpSWl3aWRXbGtJam9pWm1JelpUQXlNRE10TmpZMU55MDBOMk0xTFRoa09EUXRORGt6WXpBM1lXUTJaak0zSW4xOUxDSnVZbVlpT2pFM01EQTJNRE15T1RBc0luTjFZaUk2SW5ONWMzUmxiVHB6WlhKMmFXTmxZV05qYjNWdWREcGxiVzlxYVhadmRHODZkMlZpSW4wLnlwMzAzZVZkeHhpamxBOG1wVjFObGZKUDB3SC03RmpUQl9PcWJ3NTNPeGU1cnNTcDNNNk96VWR6OFdhYS1hcjNkVVhQR2x2QXRDRVU2RjJUN1lKUFoxVmxxOUFZZTNvV2YwOXUzOWRodUU1ZDhEX21JUl9rWDUxY193am9UcVlORHA5ZzZ4ZFJNcW9reGg3NE9GNXFjaEFhRGtENUJNZVZ6a25kUWZtVVZwME5BdTdDMTZ3UFZWSlFmNlVXRGtnYkI1SW9UQXpxSmcyWlpyNXBBY3F5enJ0WE1rRkhSWmdvYUxVam5sN1FwX0ljWm8yYzJWWk03T2QzRjIwcFZaVzJvejlOdGt3THZoSEhSMkc5WlNJQ3RHRjdhTkYwNVR5ZC1UeU1BVnZMYnM0ZFl1clRYaHNORjhQMVk4RmFuNjE4d0x6ZUVMOUkzS1BJLUctUXRUNHhWdw==";
  // Replace with your own CSR
  let csr = "MIIBWjCCAQECAQAwRjFEMEIGA1UEAxM7d2ViLmVtb2ppdm90by5zZXJ2aWNlYWNjb3VudC5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAATKjgVXu6F+WCda3Bbq2ue6m3z6OTMfQ4Vnmekmvirip/XGyi2HbzRzjARnIzGlG8wo4EfeYBtd2MBCb50kP8F8oFkwVwYJKoZIhvcNAQkOMUowSDBGBgNVHREEPzA9gjt3ZWIuZW1vaml2b3RvLnNlcnZpY2VhY2NvdW50LmlkZW50aXR5LmxpbmtlcmQuY2x1c3Rlci5sb2NhbDAKBggqhkjOPQQDAgNHADBEAiAM7aXY8MRs/EOhtPo4+PRHuiNOV+nsmNDv5lvtJt8T+QIgFP5JAq0iq7M6ShRNkRG99ZquJ3L3TtLWMNVTPvqvvUE=";

  const data = {
    identity: "web.emojivoto.serviceaccount.identity.linkerd.cluster.local",
    token: token,
    certificate_signing_request: csr,
  };
  stream.write(data);

  // This request takes around 2ms, so this sleep will mostly determine its final duration
  sleep(0.5);
};
```

This results in the following report:

```
scenarios: (100.00%) 1 scenario, 50 max VUs, 1m30s max duration (incl. graceful stop):
           * identity: 50 looping VUs for 1m0s (gracefulStop: 30s)

     data_received................: 6.3 MB 104 kB/s
     data_sent....................: 9.4 MB 156 kB/s
     grpc_req_duration............: avg=2.14ms   min=873.93µs med=1.9ms    max=12.89ms  p(90)=3.13ms   p(95)=3.86ms
     grpc_streams.................: 6000   99.355331/s
     grpc_streams_msgs_received...: 6000   99.355331/s
     grpc_streams_msgs_sent.......: 6000   99.355331/s
     iteration_duration...........: avg=503.16ms min=500.8ms  med=502.64ms max=532.36ms p(90)=504.05ms p(95)=505.72ms
     iterations...................: 6000   99.355331/s
     vus..........................: 50     min=50      max=50
     vus_max......................: 50     min=50      max=50

running (1m00.4s), 00/50 VUs, 6000 complete and 0 interrupted iterations
```

With the old defaults (QPS=5 and Burst=10), the latencies would be much higher and the number of completed requests much lower.
2023-11-28 15:25:05 -05:00
Eliza Weisman 6a260fa69f
edge-23.11.4 (#11642)
## edge-23.11.4

This edge release introduces support for the native sidecar containers
entering beta support in Kubernetes 1.29. This improves the startup and
shutdown ordering for the proxy relative to other containers, fixing the
long-standing shutdown issue with injected `Job`s. Furthermore, traffic
from other `initContainer`s can now be proxied by Linkerd.

In addition, this edge release includes Helm chart improvements, and
improvements to the multicluster extension.

* Added a new `config.alpha.linkerd.io/proxy-enable-native-sidecar`
  annotation and `Proxy.NativeSidecar` Helm option that causes the proxy
  container to run as an init-container (thanks @teejaded!) (#11465;
  fixes #11461)
* Fixed broken affinity rules for the multicluster `service-mirror` when
  running in HA mode (#11609; fixes #11603)
* Added a new check to `linkerd check` that ensures all extension
  namespaces are configured properly (#11629; fixes #11509)
* Updated the Prometheus Docker image used by the `linkerd-viz`
  extension to v2.48.0, resolving a number of CVEs in older Prometheus
  versions (#11633)
* Added `nodeAffinity` to `deployment` templates in the `linkerd-viz`
  and `linkerd-jaeger` Helm charts (thanks @naing2victor!) (#11464;
  fixes #10680)
2023-11-22 12:55:12 -08:00
TJ Miller 1b37e1989f
Add native sidecar support (#11465)
* Add native sidecar support

Kubernetes will be providing beta support for native sidecar containers in version 1.29.  This feature improves network proxy sidecar compatibility for jobs and initContainers.

Introduce a new annotation config.alpha.linkerd.io/proxy-enable-native-sidecar and configuration option Proxy.NativeSidecar that causes the proxy container to run as an init-container.

Fixes: #11461

Signed-off-by: TJ Miller <millert@us.ibm.com>
2023-11-22 12:23:24 -05:00
Alex Leong d341b6acce
edge-23.11.3 (#11627)
This edge release fixes a bug where Linkerd could cause EOF errors during bursts
of TCP connections.

* Fixed a bug where the `linkerd multicluster link` command's
  `--gateway-addresses` flag was not respected when a remote gateway exists
  ([#11564])
* proxy: Increased DEFAULT_OUTBOUND_TCP_QUEUE_CAPACITY to prevent EOF errors
  during bursts of TCP connections

[#11564]: https://github.com/linkerd/linkerd2/pull/11564

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-11-17 10:54:42 -08:00
Alejandro Pedraza 4018b2ffbe
Change notes for edge-23.11.2 (#11600)
## edge-23.11.2

This edge release contains observability improvements and bug fixes to the
Destination controller, and a refinement to the multicluster gateway resolution
logic.

* Fixed an issue where the Destination controller could stop processing service
  profile updates, if a proxy subscribed to those updates stops reading them;
  this is a followup to the issue [#11491] fixed in [edge-23.10.3] ([#11546])
* In the Destination controller, added informer lag histogram metrics to track
  whenever the Kubernetes objects watched by the controller are falling behind
  the state in the kube-apiserver ([#11534])
* In the multicluster service mirror, extended the target gateway resolution
  logic to take into account all the possible IPs a hostname might resolve to,
  rather than just the first one (thanks @MrFreezeex!) ([#11499])
* Added probes to the debug container to appease environments requiring probes
  for all containers ([#11308])

[edge-23.10.3]: https://github.com/linkerd/linkerd2/releases/tag/edge-23.10.3
[#11546]: https://github.com/linkerd/linkerd2/pull/11546
[#11534]: https://github.com/linkerd/linkerd2/pull/11534
[#11499]: https://github.com/linkerd/linkerd2/pull/11499
[#11308]: https://github.com/linkerd/linkerd2/pull/11308
2023-11-09 18:24:10 -05:00
deusxanima 33dffe6ce2
added liveness and readiness probes to debug container partials template (#11308)
Signed-off-by: Alen Haric <aharic88@gmail.com>
2023-11-03 11:55:50 -05:00
Matei David 774eb067be
Fix annotation names and partials field (#11560)
Prior to setting an enormous value to disable protocol detection, the
field was meant to be configurable. In the refactor, the annotation name
stayed the same instead of reflecting the change in the contract (i.e.
not configurable but toggled). Additionally, there were two types in the
proxy partials.

Signed-off-by: Matei David <matei@buoyant.io>
2023-11-02 11:53:52 -07:00
Oliver Gould 14beb8970d
edge-23.11.1 (#11558)
This edge release fixes two bugs in the Destination controller that could cause
outbound connections to hang indefinitely.

* helm: Introduce configurable values for protocol detection ([#11536])
* destination: Fix GetProfiles error when address is opaque and unmeshed ([#11556])
* destination: Return NotFound for unknown pod names ([#11540])
* proxy: Log controller errors at WARN
* proxy: Fix grpc_status metric labels for inbound traffic

[#11536]: https://github.com/linkerd/linkerd2/pull/11536
[#11556]: https://github.com/linkerd/linkerd2/pull/11556
[#11540]: https://github.com/linkerd/linkerd2/pull/11540
2023-11-02 09:02:26 -07:00
Matei David 1e6a019b31
Introduce configurable values for protocol detection (#11536)
This change allows users to configure protocol detection timeout values
(outbound and inbound). Certain environments may find that protocol
detection inhibits debugging and makes it harder to reason about a
client's behaviour. In such cases (among others) it may be desirable to
change the default protocol detection timeout to a higher value than the
default 10s.

Through this change, users may configure their timeout values either
with install-time settings or through annotations; this follows our
usual proxy configuration model. The proxy uses different timeout values
for the inbound and outbound stacks (even though they use the same
default value) and this change respects that by adding two separate
fields.

Signed-off-by: Matei David <matei@buoyant.io>
2023-11-02 14:03:50 +00:00
Matei David 798c5d9787
edge-23.10.4 (#11543)
This edge release includes a fix for the `ServiceProfile` CRD resource schema.
The schema incorrectly required `not` response matches to be arrays, while the
in-cluster validator parsed `not` response matches as objects. In addition, an
issue has been fixed in `linkerd profile`. When used with the `--open-api`
flag, it would not strip trailing slashes when generating a resource from
swagger specifications.

* Fixed an issue where trailing slashes wouldn't be stripped when generating
  `ServiceProfile` resources through `linkerd profile --open-api` ([#11519])
* Fixed an issue in the `ServiceProfile` CRD schema. The schema incorrectly
  required that a `not` response match should be an array, which the service
  profile validator rejected since it expected an object. The schema has been
  updated to properly indicate that `not` values should be an object ([#11510];
  fixes [#11483])
* Improved logging in the destination controller by adding the client pod's
  name to the logging context. This will improve visibility into the messages
  sent and received by the control plane from a specific proxy ([#11532])
* Fixed an issue in the destination controller where the metadata API would not
  initialize a `Job` informer. The destination controller uses the metadata API
  to retrieve `Job` metadata, and relies mostly on informers. Without an
  initialized informer, an error message would be logged, and the controller
  relied on direct API calls ([#11541]; fixes [#11531])

[#11541]: https://github.com/linkerd/linkerd2/pull/11541
[#11532]: https://github.com/linkerd/linkerd2/pull/11532
[#11531]: https://github.com/linkerd/linkerd2/issues/11531
[#11519]: https://github.com/linkerd/linkerd2/pull/11519
[#11510]: https://github.com/linkerd/linkerd2/pull/11510
[#11483]: https://github.com/linkerd/linkerd2/issues/11483

Signed-off-by: Matei David <matei@buoyant.io>
2023-10-27 22:14:28 +01:00
Alex Leong 4e7a588a2c
Add pod name to context token and logging (#11532)
When the destination controller logs about receiving or sending messages to a data plane proxy, there is no information in the log about which data plane pod it is communicating with.  This can make it difficult to diagnose issues which span the data plane and control plane.

We add a `pod` field to the context token that proxies include in requests to the destination controller.  We add this pod name to the logging context so that it shows up in log messages.  In order to accomplish this, we had to plumb through logging context in a few places where it previously had not been.  This gives us a more complete logging context and more information in each log message.

An example log message with this fuller logging context is:

```
time="2023-10-24T00:14:09Z" level=debug msg="Sending destination add: add:{addrs:{addr:{ip:{ipv4:183762990}  port:8080}  weight:10000  metric_labels:{key:\"control_plane_ns\"  value:\"linkerd\"}  metric_labels:{key:\"deployment\"  value:\"voting\"}  metric_labels:{key:\"pod\"  value:\"voting-7475cb974c-2crt5\"}  metric_labels:{key:\"pod_template_hash\"  value:\"7475cb974c\"}  metric_labels:{key:\"serviceaccount\"  value:\"voting\"}  tls_identity:{dns_like_identity:{name:\"voting.emojivoto.serviceaccount.identity.linkerd.cluster.local\"}}  protocol_hint:{h2:{}}}  metric_labels:{key:\"namespace\"  value:\"emojivoto\"}  metric_labels:{key:\"service\"  value:\"voting-svc\"}}" addr=":8086" component=endpoint-translator context-ns=emojivoto context-pod=web-767f4484fd-wmpvf remote="10.244.0.65:52786" service="voting-svc.emojivoto.svc.cluster.local:8080"
```

Note the `context-pod` field.

Additionally, we have tested this when no pod field is included in the context token (e.g. when handling requests from a pod which does not yet add this field) and confirmed that the `context-pod` log field is empty, but no errors occur.
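
When diagnosing issues from these logs, the `context-pod` field can be pulled out programmatically. The following is a rough sketch of a `key=value` parser for log lines in the format shown above; the `parseLogfmt` helper and the shortened sample line are assumptions made for illustration, not part of Linkerd.

```javascript
// Parse a text log line of key=value pairs into an object.
// Values are either double-quoted strings (with backslash escapes) or bare tokens.
function parseLogfmt(line) {
  const fields = {};
  const re = /([\w-]+)=("(?:[^"\\]|\\.)*"|\S*)/g;
  let m;
  while ((m = re.exec(line)) !== null) {
    let value = m[2];
    if (value.startsWith('"')) {
      // Strip the surrounding quotes and unescape \" and \\ sequences.
      value = value.slice(1, -1).replace(/\\(.)/g, '$1');
    }
    fields[m[1]] = value;
  }
  return fields;
}

// Shortened, hypothetical variant of the log message above.
const line = 'time="2023-10-24T00:14:09Z" level=debug msg="Sending destination add" ' +
  'addr=":8086" component=endpoint-translator context-ns=emojivoto ' +
  'context-pod=web-767f4484fd-wmpvf remote="10.244.0.65:52786"';

const fields = parseLogfmt(line);
console.log(fields['context-pod']); // web-767f4484fd-wmpvf
```

A line from a proxy that does not yet send the pod field would yield an empty string for `context-pod` in this sketch, provided the field is still printed with an empty value.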

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-25 13:48:42 -07:00
Alex Leong cca3cf8005
Fix response class schema (#11510)
Fixes #11483

Service profile's response class schema indicates that a `not` response match should be an array.  This is incorrect and parsing of the response class will fail if an array is provided.  

Update the schema to properly indicate that `not`'s value should be an object.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-20 15:17:16 -07:00
Eliza Weisman 166c94f2d4
edge-23.10.3 (#11513)
## edge-23.10.3

This edge release fixes issues in the proxy and destination controller which can
result in Linkerd proxies sending traffic to stale endpoints. In addition, it
contains other bugfixes and updates dependencies to include patches for the
security advisories [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3 and GHSA-c827-hfw6-qwvm.

* Fixed an issue where the Destination controller could stop processing
  changes in the endpoints of a destination, if a proxy subscribed to that
  destination stops reading service discovery updates. This issue results in
  proxies attempting to send traffic for that destination to stale endpoints
  ([#11483], fixes [#11480], [#11279], and [#10590])
* Fixed a regression introduced in stable-2.13.0 where proxies would not
  terminate unused service discovery watches, exerting backpressure on the
  Destination controller which could cause it to become stuck
  ([linkerd2-proxy#2484] and [linkerd2-proxy#2486])
* Added `INFO`-level logging to the proxy when endpoints are added or removed
  from a load balancer. These logs are enabled by default, and can be disabled
  by [setting the proxy log level][proxy-log-level] to
  `warn,linkerd=info,linkerd_proxy_balance=warn` or similar
  ([linkerd2-proxy#2486])
* Fixed a regression where the proxy rendered `grpc_status` metric labels as a
  string rather than as the numeric status code ([linkerd2-proxy#2480]; fixes
  [#11449])
* Added missing `imagePullSecrets` to `linkerd-jaeger` ServiceAccount ([#11504])
* Updated the control plane's dependency on the `google.golang.org/grpc` Go
  package to include patches for [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3 ([#11496])
* Updated dependencies on `rustix` to include patches for GHSA-c827-hfw6-qwvm
  ([linkerd2-proxy#2488] and [#11512]).

[#10590]: https://github.com/linkerd/linkerd2/issues/10590
[#11279]: https://github.com/linkerd/linkerd2/issues/11279
[#11483]: https://github.com/linkerd/linkerd2/issues/11483
[#11449]: https://github.com/linkerd/linkerd2/issues/11449
[#11480]: https://github.com/linkerd/linkerd2/issues/11480
[#11504]: https://github.com/linkerd/linkerd2/issues/11504
[#11512]: https://github.com/linkerd/linkerd2/issues/11512
[linkerd2-proxy#2480]: https://github.com/linkerd/linkerd2-proxy/pull/2480
[linkerd2-proxy#2484]: https://github.com/linkerd/linkerd2-proxy/pull/2484
[linkerd2-proxy#2486]: https://github.com/linkerd/linkerd2-proxy/pull/2486
[linkerd2-proxy#2488]: https://github.com/linkerd/linkerd2-proxy/pull/2488
[proxy-log-level]: https://linkerd.io/2.14/tasks/modifying-proxy-log-level/
[CVE-2023-44487]: https://github.com/advisories/GHSA-qppj-fm5r-hxr3
2023-10-19 15:21:46 -07:00
Alejandro Pedraza cd2c88ec34
edge-23.10.2 change notes (#11482)
## edge-23.10.2

This edge release includes a fix addressing an issue during upgrades for
instances not relying on automated webhook certificate management (like
cert-manager provides).

* Added a `checksum/config` annotation to the destination and proxy injector
  deployment manifests, to force restarting those workloads whenever their
  webhook secrets change during upgrade (thanks @iAnomaly!) ([#11440])
* Fixed policy controller error when deleting a Gateway API HTTPRoute resource
  ([#11471])

[#11440]: https://github.com/linkerd/linkerd2/pull/11440
[#11471]: https://github.com/linkerd/linkerd2/pull/11471
2023-10-12 17:17:23 -05:00
Cameron Boulton a6ea765d39
Restart destination, proxy-injector controllers on config change. (#11440)
Fixes #6940

Added a `checksum/config` annotation to the destination, proxy-injector and tap-injector workloads, whose value is calculated as the SHA256 of the template file containing the TLS cert they depend on. This is necessary so that every time those other files change (they get re-generated on every upgrade or config update via `linkerd upgrade`), the workloads change as well.
We had this in place before, but dropped it by mistake during the 2.12 Helm chart migration.
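
The mechanism can be sketched outside of Helm as follows. This Node.js snippet is illustrative only: `configChecksum` and the sample template contents are hypothetical, and the actual chart computes the value inside its templates.

```javascript
const crypto = require('crypto');

// Hash the rendered contents of the template that holds the TLS cert.
// Any change to the cert yields a different annotation value, which changes
// the pod template and therefore triggers a rollout of the workload.
function configChecksum(renderedTemplate) {
  return crypto.createHash('sha256').update(renderedTemplate, 'utf8').digest('hex');
}

// Hypothetical rendered secrets before and after an upgrade.
const before = configChecksum('tls.crt: AAAA\n');
const after = configChecksum('tls.crt: BBBB\n');

console.log(before.length);    // 64 hex characters
console.log(before !== after); // true
```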

Signed-off-by: Cameron Boulton <cameron.boulton@calm.com>
2023-10-05 11:00:33 -07:00
Alex Leong 094890cfa4
edge-23.10.1 (#11454)
This edge release adds additional configurability to Linkerd's viz and
multicluster extensions.

* Added a `podAnnotations` Helm value to allow adding additional annotations to
  the Linkerd-Viz Prometheus Deployment ([#11365]) (thanks @cemenson)
* Added `imagePullSecrets` Helm values to the multicluster chart so that it can
  be installed in an air-gapped environment. ([#11285]) (thanks @lhaussknecht)

[#11365]: https://github.com/linkerd/linkerd2/issues/11365
[#11285]: https://github.com/linkerd/linkerd2/issues/11285

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-04 14:43:39 -07:00
Oliver Gould bc97b02169
edge-23.9.4 (#11441)
This edge release makes Linkerd even better.

* Added a controlPlaneVersion override to the `linkerd-control-plane` Helm chart
  to support including SHA256 image digests in Linkerd manifests (thanks
  @cromulentbanana!) ([#11406])
* Improved `linkerd viz check` to attempt to validate that the Prometheus scrape
  interval will work well with the CLI and Web query parameters ([#11376])
* Improved CLI error handling to print differentiated error information when
  versioncheck.linkerd.io cannot be resolved (thanks @dtaskai) ([#11377])
* Fixed an issue where the destination controller would not update pod metadata
  for profile resolutions for a pod accessed via the host network (e.g.
  HostPort endpoints) ([#11334]).
* Added a validating webhook config for httproutes.gateway.networking.k8s.io
  resources (thanks @mikutas!) ([#11150])
* Introduced a new `multicluster check --timeout` flag to limit the time
  allowed for Kubernetes API calls (thanks @moki1202) ([#11420])

[#11150]: https://github.com/linkerd/linkerd2/pull/11150
[#11334]: https://github.com/linkerd/linkerd2/pull/11334
[#11376]: https://github.com/linkerd/linkerd2/pull/11376
[#11377]: https://github.com/linkerd/linkerd2/pull/11377
[#11406]: https://github.com/linkerd/linkerd2/pull/11406
[#11420]: https://github.com/linkerd/linkerd2/pull/11420
2023-09-29 07:46:16 -07:00