## edge-20.12.2
* Fixed an issue where the `proxy-injector` and `sp-validator` did not refresh
their certs automatically when provided externally—like through cert-manager
* Added support for override flags to the `jaeger install` command to allow
setting Helm values when installing the Linkerd-jaeger extension
* Added missing Helm values to the multicluster chart (thanks @DaspawnW!)
* Moved tracing functionality to the `linkerd-jaeger` extension
* Fixed various issues in developer shell scripts (thanks @joakimr-axis!)
* Fixed an issue where `install --ha` was only partially applying the high
availability config
* Updated RBAC API versions in the CNI chart (thanks @glitchcrab!)
* Fixed an issue where TLS credentials were changed during upgrades but the
Linkerd webhooks did not restart, leaving them using older credentials and
failing requests
* Stopped publishing the multicluster link chart as its primary use case is in
the `multicluster link` command and not being installed through Helm
* Added service mirror error logs for when the multicluster gateway's hostname
cannot be resolved.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Don't swallow error when MC gateway hostname can't be resolved
Ref #5343
When none of the gateway addresses is resolvable, propagate the error as
a retryable error so it gets retried and logged. Don't create the
mirrored resources if there's no success after the retries.
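The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the service mirror's actual code: `RetryableError`, `resolveGateway` and `staticLookup` are hypothetical names, and the lookup function is injected so the sketch runs without touching DNS.

```go
package main

import "fmt"

// RetryableError marks a failure the caller is expected to retry.
// The type is an assumption made for this sketch.
type RetryableError struct{ Inner error }

func (e RetryableError) Error() string { return "retryable: " + e.Inner.Error() }

// resolveGateway tries every gateway hostname and collects the addresses
// that resolved. It fails, with a RetryableError, only when none resolved.
func resolveGateway(hosts []string, lookup func(string) ([]string, error)) ([]string, error) {
	var addrs []string
	var errs []error
	for _, h := range hosts {
		ips, err := lookup(h)
		if err != nil {
			errs = append(errs, err)
			continue
		}
		addrs = append(addrs, ips...)
	}
	if len(addrs) == 0 {
		return nil, RetryableError{Inner: fmt.Errorf("no gateway address could be resolved: %v", errs)}
	}
	return addrs, nil
}

// staticLookup is a stand-in resolver backed by a fixed table (demo only).
func staticLookup(table map[string][]string) func(string) ([]string, error) {
	return func(host string) ([]string, error) {
		if ips, ok := table[host]; ok {
			return ips, nil
		}
		return nil, fmt.Errorf("lookup %s: no such host", host)
	}
}

func main() {
	lookup := staticLookup(map[string][]string{})
	_, err := resolveGateway([]string{"gateway.example.com"}, lookup)
	_, retry := err.(RetryableError)
	fmt.Println("retry:", retry, "err:", err)
}
```

A caller would requeue the event while the error is a `RetryableError`, and skip creating mirrored resources once the retries are exhausted.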
Closes #5348
That chart generates the service mirror resources and related RBAC, but
doesn't generate the credentials secret nor the Link CR which require
go-client logic not available from sheer Helm templates.
This PR stops publishing that chart, and adds a comment to its README
about it.
Moved the `collectorSvcAccount` and `collectorSvcAddr` values in
`values.yaml` under the `webhook` section, given it's the injector that
will make use of that, and to not confuse with the SA and address for
the collector that is provided by default (the injector could point to a
different collector than that one).
* upgrades: make webhooks restart if TLS creds are updated
Fixes #5231
Currently, we do not re-use the TLS certs during upgrades, which
means that the secrets are updated while the webhooks are still
paired with the older ones, causing the webhook requests to fail.
This can be solved by restarting the webhooks whenever the certs
change. To do so, we store the hash of the `*-rbac` file, which
contains the secrets, in the pod templates. Any update to the certs
then changes the pod templates, which in turn triggers a restart.
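The idea can be illustrated with a small sketch. It assumes the hash is a SHA-256 over the rendered file contents; the function names are hypothetical and the real charts compute this inside the Helm templates rather than in Go.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checksum returns the hex-encoded SHA-256 of a rendered manifest.
func checksum(rendered []byte) string {
	sum := sha256.Sum256(rendered)
	return hex.EncodeToString(sum[:])
}

// needsRestart reports whether the checksum stored in the pod template
// would change, which is what makes Kubernetes roll the pods.
func needsRestart(oldRendered, newRendered []byte) bool {
	return checksum(oldRendered) != checksum(newRendered)
}

func main() {
	before := []byte("tls.crt: old-cert\ntls.key: old-key\n")
	after := []byte("tls.crt: new-cert\ntls.key: new-key\n")
	fmt.Println("restart needed:", needsRestart(before, after))
}
```

Because the checksum lives in the pod template, an unchanged secret yields an identical template and no restart.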
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When testing the `linkerd2-cni` chart with `ct`, it flags up usage
of some deprecated apiVersions.
This PR aligns the RBAC API group across all resources in the chart.
---
Signed-off-by: Simon Weald <glitchcrab-github@simonweald.com>
* `linkerd install --ha` was only partially applying HA config
Fixes #5342
`values-ha.yml` contains the specific config for HA, but only the proxy
resources and controller replicas settings were applied. This PR adds
EnablePodAntiAffinity, WebhookFailurePolicy and all the resource settings
for the other CP pods.
Also the `--controller-replicas` flag is moved after the HA flags so it
can override the HA settings.
Finally, some comments no longer relevant were removed.
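The ordering point can be shown with a short sketch: whichever setting is applied last wins. The struct, the function names and the concrete values below are placeholders for illustration, not the CLI's real types.

```go
package main

import "fmt"

// installValues is a trimmed-down, hypothetical stand-in for the chart values.
type installValues struct {
	ControllerReplicas    uint
	EnablePodAntiAffinity bool
	WebhookFailurePolicy  string
}

// applyHA mimics merging the HA values over the defaults.
// The concrete values are placeholders.
func applyHA(v *installValues) {
	v.ControllerReplicas = 3
	v.EnablePodAntiAffinity = true
	v.WebhookFailurePolicy = "Fail"
}

// buildValues applies the HA settings first and the explicit replicas flag
// last, so the flag wins. replicas == 0 means the flag was not set.
func buildValues(ha bool, replicas uint) installValues {
	v := installValues{ControllerReplicas: 1, WebhookFailurePolicy: "Ignore"}
	if ha {
		applyHA(&v)
	}
	if replicas != 0 {
		v.ControllerReplicas = replicas
	}
	return v
}

func main() {
	fmt.Printf("%+v\n", buildValues(true, 5))
}
```

Had the replicas flag been applied before `applyHA`, the HA default would silently overwrite the user's choice.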
## How to test
Perform `linkerd install --ha` and make sure the values in
`values-ha.yml` are propagated correctly in the produced yaml.
## 2.9.1
After merging to `main`, this should be cherry-picked into the
`release/stable-2.9` branch.
Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
- Do unzip check but don't install; leave installation to user
- Move unzip check to bin/protoc that actually uses unzip
- Make sure the protoc scripts can be called from any directory
Fixes #5337
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:
* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests
We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension. To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.
Note that for now, only the control plane components send traces; the proxies in the control plane do not. This is because the linkerd-jaeger injector is not yet available. However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.
I tested this by doing the following:
1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane
Signed-off-by: Alex Leong <alex@buoyant.io>
Original description:
> **Subject**
> Add missing helm values for multicluster setup
>
> **Problem**
> When executing this without the linkerd command the two variables are missing and the rendering will generate empty values.
> This produces the following gateway identity, that is also used in the gateway link command to generate the link crd:
>
> ```
> mirror.linkerd.io/gateway-identity: linkerd-gateway.linkerd-multicluster.serviceaccount.identity..
> ```
>
> **Solution**
> Add the values as defaults to the helm chart values.yaml file. If the cli is used they are overwritten by the following parameters:
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L197
> * https://github.com/linkerd/linkerd2/blob/main/cli/cmd/multicluster.go#L196
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Björn Wenzel <bjoern.wenzel@dbschenker.com>
* bin/shellcheck-all was missing some files
`bin/shellcheck-all` identifies what files to check by filtering by the
`text/x-shellscript` mime-type, which only applies to files with a
shebang pointing to bash. We had a number of files with a
`#!/usr/bin/env sh` shebang that (at least in Ubuntu, where `sh` points
to `dash`) only expose a `text/plain` mime-type, so they were not
being checked.
This fixes that issue by replacing the filter in `bin/shellcheck-all`, using a simple grep over the file shebang instead of using the `file` command.
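As an illustration of filtering by shebang rather than by mime-type, here is a small sketch. The script itself uses grep; the regular expression below is an assumption about what counts as a shell shebang, not the pattern from `bin/shellcheck-all`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// shellShebang matches sh and bash shebangs, with or without /usr/bin/env.
// The exact pattern is an assumption made for this sketch.
var shellShebang = regexp.MustCompile(`^#!\s*(/usr)?/bin/(env\s+)?(ba)?sh\b`)

// isShellScript inspects only the first line of the file contents.
func isShellScript(content string) bool {
	firstLine := strings.SplitN(content, "\n", 2)[0]
	return shellShebang.MatchString(firstLine)
}

func main() {
	for _, src := range []string{
		"#!/usr/bin/env sh\necho hi\n",
		"#!/bin/bash\necho hi\n",
		"#!/usr/bin/env python\nprint('hi')\n",
	} {
		fmt.Printf("%q -> %v\n", strings.SplitN(src, "\n", 2)[0], isShellScript(src))
	}
}
```

Looking at the shebang directly avoids depending on how the `file` command classifies `sh` scripts on a given distribution.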
This changes the install-pr script to work with k3d.
Additionally, it now only installs the CLI; it no longer installs Linkerd on the
cluster. This was removed because most of the time when installing a Linkerd
version from a PR, some extra installation configuration is required and I was
always commenting out that final part of the script.
`--context` was changed to `--cluster` since we no longer need a context value,
only the name of the cluster we are loading the images into.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes #5257
This branch moves mc charts and cli level code to a new
top level directory. None of the logic is changed.
Also, moves some common types into `/pkg` so that they
are accessible both to the main cli and extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This change adds the `set`, `set-string`, `values`, `set-files`,
etc. flags, which are used to override the default values. This is
similar to how Helm works.
This also updates the install workflow to directly use Helm v3
pkg for chart loading and generation, without having to use
our chart type, etc.
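To make the override semantics concrete, here is a deliberately simplified sketch of how a `key.path=value` expression can be merged into nested values. It is not the Helm implementation: it skips type coercion, lists and escaping, and `applySet` is a hypothetical helper. The `webhook.collectorSvcAddr` key is the one mentioned for the jaeger chart; the addresses are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// applySet merges a single "a.b.c=value" override into nested values.
// Simplification: the value is always stored as a string.
func applySet(values map[string]interface{}, expr string) error {
	key, val, ok := strings.Cut(expr, "=")
	if !ok || key == "" {
		return fmt.Errorf("invalid override %q, expected key=value", expr)
	}
	parts := strings.Split(key, ".")
	cur := values
	for _, p := range parts[:len(parts)-1] {
		next, isMap := cur[p].(map[string]interface{})
		if !isMap {
			// Create (or replace) the intermediate level.
			next = map[string]interface{}{}
			cur[p] = next
		}
		cur = next
	}
	cur[parts[len(parts)-1]] = val
	return nil
}

func main() {
	values := map[string]interface{}{
		"webhook": map[string]interface{}{"collectorSvcAddr": "default-collector:1234"},
	}
	if err := applySet(values, "webhook.collectorSvcAddr=my-collector:5678"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(values)
}
```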
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Have webhooks refresh their certs automatically
Partially fixes #5272
In 2.9 we introduced the ability to provide the certs for `proxy-injector` and `sp-validator` through some external means like cert-manager, through the new helm setting `externalSecret`.
We forgot, however, to have those services watch for changes in their secrets, so whenever the secrets were rotated the services would fail with a cert error, and the only workaround was to restart those pods so they picked up the new secrets.
This addresses that by first abstracting out `FsCredsWatcher` from the identity controller, which now lives under `pkg/tls`.
The webhook's logic in `launcher.go` no longer reads the certs before starting the HTTPS server. That now happens in `server.go` which, like the identity controller, receives events from `FsCredsWatcher` and updates `Server.cert`. We're leveraging `http.Server.TLSConfig.GetCertificate`, which lets us provide a function that returns the current cert for every incoming TLS handshake.
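A minimal sketch of the `GetCertificate` pattern follows. `certStore`, `newTLSConfig` and `placeholderCert` are hypothetical names used only for illustration; the watcher that calls `update` is left out, and the placeholder certs are not valid for a real handshake.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"sync"
)

// certStore holds the certificate the server currently presents.
type certStore struct {
	mu   sync.RWMutex
	cert *tls.Certificate
}

// update swaps in a new certificate, e.g. after a filesystem event.
func (s *certStore) update(c *tls.Certificate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cert = c
}

// getCertificate has the signature expected by tls.Config.GetCertificate.
func (s *certStore) getCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.cert == nil {
		return nil, fmt.Errorf("no certificate loaded yet")
	}
	return s.cert, nil
}

// newTLSConfig wires the store into a TLS config for an http.Server.
func newTLSConfig(s *certStore) *tls.Config {
	return &tls.Config{GetCertificate: s.getCertificate}
}

// placeholderCert builds a dummy certificate value for the demo only.
func placeholderCert(raw byte) *tls.Certificate {
	return &tls.Certificate{Certificate: [][]byte{{raw}}}
}

func main() {
	store := &certStore{}
	cfg := newTLSConfig(store)
	store.update(placeholderCert(1))
	first, _ := cfg.GetCertificate(nil)
	store.update(placeholderCert(2)) // simulated rotation
	second, _ := cfg.GetCertificate(nil)
	fmt.Println("cert changed without restart:", first != second)
}
```

Since the lookup happens on each handshake, a rotated certificate is served to new connections without restarting the server.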
### How to test
```bash
# Create some root cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca.crt ca.key \
--profile root-ca --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# configure injector's caBundle to be that root cert
$ cat > linkerd-overrides.yaml << EOF
proxyInjector:
externalSecret: true
caBundle: |
< ca.crt contents>
EOF
# Install linkerd. The injector won't start until we create the secret below
$ bin/linkerd install --controller-log-level debug --config linkerd-overrides.yaml | k apply -f -
# Generate an intermediate cert with a short lifespan
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# Create the secret using that intermediate cert
$ kubectl create secret tls \
linkerd-proxy-injector-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# start following the injector log
$ k -n linkerd logs -f -l linkerd.io/control-plane-component=proxy-injector -c proxy-injector
# Inject emojivoto. The pods should be injected normally
$ bin/linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
# Wait about 5 minutes and delete a pod
$ k -n emojivoto delete po -l app=emoji-svc
# You'll see it won't be injected, and something like "remote error: tls: bad certificate" will appear in the injector logs.
# Regenerate the intermediate cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# Delete the secret and recreate it
$ k -n linkerd delete secret linkerd-proxy-injector-k8s-tls
$ kubectl create secret tls \
linkerd-proxy-injector-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# Wait a couple of minutes and you'll see some filesystem events in the injector log along with a "Certificate has been updated" entry
# Then delete the pod again and you'll see it gets injected this time
$ k -n emojivoto delete po -l app=emoji-svc
```
This edge release continues the work of decoupling non-core Linkerd components
by moving more tracing related functionality into the Linkerd-jaeger extension.
* Continued work on moving tracing functionality from the main control plane
into the `linkerd-jaeger` extension
* Fixed a potential panic in the proxy when looking up a socket's peer address
while under high load
* Added automatic readme generation for charts (thanks @GMarkfjard!)
* Fixed zsh completion for the CLI (thanks @jiraguha!)
* Added support for multicluster gateways of types other than LoadBalancer
(thanks @DaspawnW!)
Signed-off-by: Alex Leong <alex@buoyant.io>
This release updates the proxy's `*ring*` dependency to pick up the
latest changes from BoringSSL.
Additionally, we've audited uses of non-cryptographic random number
generators in the proxy to ensure that each balancer/router initializes
its own RNG state.
---
* Audit uses of SmallRng (linkerd/linkerd2-proxy#757)
* Update *ring* to 0.6.19 (linkerd/linkerd2-proxy#758)
* metrics: Support the Summary metric type (linkerd/linkerd2-proxy#756)
The namespace that Linkerd extensions are installed into is configurable. This can make it difficult to know which extensions are installed and where they are located. We add a `linkerd.io/extension` namespace label to easily enumerate and locate Linkerd extensions. This can be used, for example, to enable certain features only when certain extensions are installed. All new Linkerd extensions should include this namespace label.
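A sketch of how such a label could be used to enumerate extensions is shown below. It works on plain structs instead of the Kubernetes client, and the label values and namespace names are made-up examples.

```go
package main

import "fmt"

// extensionLabel is the namespace label described above.
const extensionLabel = "linkerd.io/extension"

// namespace is a minimal stand-in for a Kubernetes Namespace object.
type namespace struct {
	Name   string
	Labels map[string]string
}

// extensionNamespaces maps each extension name found in the label to the
// namespace that carries it.
func extensionNamespaces(all []namespace) map[string]string {
	found := map[string]string{}
	for _, ns := range all {
		if ext, ok := ns.Labels[extensionLabel]; ok {
			found[ext] = ns.Name
		}
	}
	return found
}

func main() {
	cluster := []namespace{
		{Name: "default"},
		{Name: "tracing", Labels: map[string]string{extensionLabel: "jaeger"}},
	}
	fmt.Println(extensionNamespaces(cluster))
}
```

With a real cluster, the same filter is a label selector on the namespace list, which works regardless of the namespace name the user picked.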
Signed-off-by: Alex Leong <alex@buoyant.io>
* Add automatic readme generation for charts
The current readmes for each chart are generated
manually and don't contain all the information available.
Utilize helm-docs to automatically fill out the README.md files
for the helm charts by pulling metadata from values.yml.
Fixes #4156
Co-authored-by: GMarkfjard <gabma047@student.liu.se>
This branch adds a `jaeger dashboard` sub-command which is used
to view the jaeger dashboard. It follows the same logic/pattern
as `linkerd-dashboard` and provides the same flags.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This release removes a potential panic: it was assumed that looking up a
socket's peer address was infallible, but in practice this call can
fail when a host is under high load. Now these failures only impact the
connection-level task and not the whole proxy process.
Also, the `process_cpu_seconds_total` metric is now exposed as a float
so that its value may include fractional seconds with 10ms granularity.
---
* io: Make peer_addr fallible (linkerd/linkerd2-proxy#755)
* metrics: Expose process_cpu_seconds_total as a float (linkerd/linkerd2-proxy#754)
* Jaeger injector mutating webhook
Closes #5231. This is based off of the `alex/sep-tracing` branch.
This webhook injects the `LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR`,
`LINKERD2_PROXY_TRACE_COLLECTOR_SVC_NAME` and
`LINKERD2_PROXY_TRACE_ATTRIBUTES_PATH` environment vars into the proxy
spec when a pod is created, as well as the podinfo volume and its mount.
If any of these are found to be present already in the pod spec, it
exits without applying a patch.
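The "skip if already present" rule can be sketched as follows. It only models the three environment variables, not the podinfo volume, and `envVar` and `injectTraceEnv` are hypothetical names. The example values in `main` are made up.

```go
package main

import "fmt"

// envVar is a minimal stand-in for a container environment variable.
type envVar struct{ Name, Value string }

// traceEnvNames are the variables named in the description above.
var traceEnvNames = []string{
	"LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR",
	"LINKERD2_PROXY_TRACE_COLLECTOR_SVC_NAME",
	"LINKERD2_PROXY_TRACE_ATTRIBUTES_PATH",
}

// injectTraceEnv returns the patched env and true, or the untouched env and
// false when any of the trace variables is already set on the proxy.
func injectTraceEnv(existing []envVar, addr, name, attrsPath string) ([]envVar, bool) {
	for _, e := range existing {
		for _, n := range traceEnvNames {
			if e.Name == n {
				return existing, false
			}
		}
	}
	patched := append([]envVar(nil), existing...)
	patched = append(patched,
		envVar{traceEnvNames[0], addr},
		envVar{traceEnvNames[1], name},
		envVar{traceEnvNames[2], attrsPath},
	)
	return patched, true
}

func main() {
	env := []envVar{{"LINKERD2_PROXY_LOG", "info"}}
	patched, changed := injectTraceEnv(env, "collector.example:55678", "collector.example", "/var/run/podinfo/labels")
	fmt.Println(changed, len(patched))
}
```

Bailing out when any variable exists keeps the webhook idempotent, which matters when it can be invoked more than once for the same pod.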
The `values.yaml` file has been expanded to include config for this
webhook. In particular, one can define a `namespaceSelector` and/or an
`objectSelector` to filter which pods this webhook will act on.
The config entries in `values.yaml` for `collectorSvcAddr` and
`collectorSvcAccount` can be overridden with the
`config.linkerd.io/trace-collector` and
`config.alpha.linkerd.io/trace-collector-service-account` annotations at
the namespace or pod spec level.
## How to test:
```bash
docker build . -t ghcr.io/linkerd/jaeger-webhook:0.0.1 -f jaeger/proxy-mutator/Dockerfile
k3d image import ghcr.io/linkerd/jaeger-webhook:0.0.1
bin/helm-build
linkerd install
helm install jaeger jaeger/charts/jaeger
linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
kubectl -n emojivoto get po -l app=emoji-svc -oyaml | grep -A1 TRACE
```
## Reinvocation policy
The webhookconfig resource is configured with `reinvocationPolicy:
IfNeeded` so that if the tracing injector gets triggered before the
proxy injector, it will get triggered a second time after the proxy
injector runs so it can act on the injected proxy. By default this won't
be necessary because the webhooks run in alphabetical order (this is not
documented in k8s docs though) so
`linkerd-proxy-injector-webhook-config` will run before
`linkerd-proxy-mutator-webhook-config`. In order to test the
reinvocation mechanism, you can change the name of the former so it gets
called first.
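To see why the default order works out, the names can simply be sorted. This sketch only demonstrates the lexicographic comparison; that Kubernetes invokes mutating webhooks in this order is the undocumented behaviour mentioned above, not something the code proves.

```go
package main

import (
	"fmt"
	"sort"
)

// invocationOrder returns the names sorted alphabetically, leaving the
// input slice untouched.
func invocationOrder(names []string) []string {
	ordered := append([]string(nil), names...)
	sort.Strings(ordered)
	return ordered
}

func main() {
	fmt.Println(invocationOrder([]string{
		"linkerd-proxy-mutator-webhook-config",
		"linkerd-proxy-injector-webhook-config",
	}))
}
```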
I versioned the webhook image as `0.0.1`, but we can decide to align
that with linkerd's main version tag.
This edge release improves the proxy's support for high-traffic workloads. It also
contains the first steps towards decoupling non-core Linkerd components, the
first iteration being a new `linkerd jaeger` sub-command for installing tracing.
Please note this is still a work in progress.
* Addressed some issues reported around clients seeing max-concurrency errors by
increasing the default in-flight request limit to 100K pending requests
* Have the proxy appropriately set `content-type` when synthesizing gRPC error
responses
* Bumped the `proxy-init` image to `v1.3.8` which is based off of
`buster-20201117-slim` to reduce potential security vulnerabilities
* No longer panic in rare cases when `linkerd-config` doesn't have an entry for
`Global` configs (thanks @hodbn!)
* Work in progress: the `/jaeger` directory now contains the charts and commands
for installing the tracing component.
* extension: Add new jaeger binary
This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as the
`linkerd install` VFS logic expects charts to be present in the `/charts`
directory, this command gets its own static pkg to generate its own
VFS for its chart.
This covers only the install part of the command.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes #5230
This PR moves tracing into a jaeger chart with no proxy injection
templates. We still keep the dependency on partials, as we could use
common templates like resources, etc from there.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This release addresses some issues reported around clients seeing
max-concurrency errors by increasing the default in-flight request limit
to 100K pending requests.
Additionally, the proxy now sets an appropriate content-type when
synthesizing gRPC error responses.
---
* style: fix some random clippy lints (linkerd/linkerd2-proxy#749)
* errors: Set `content-type` for synthesized grpc errors (linkerd/linkerd2-proxy#750)
* concurrency-limit: Drop permit on readiness (linkerd/linkerd2-proxy#751)
* Increase the default buffer capacity to 100K (linkerd/linkerd2-proxy#752)
* Change default max-in-flight and buffer-capacity (linkerd/linkerd2-proxy#753)
Fixes #4874
This branch upgrades the Helm SDK from v2 to v3 *without any functional
changes*, just replacing types with the newer APIs.
This should not affect our current support for Helm v2, as we did not
change any of the underlying templates (which work with Helm v2). This
works because we did not use any of the APIs that read the chart
metadata (which are the only ones changed from v2 to v3) and currently
load files manually and pass them into the SDK.
This PR should provide a good starting point for using more of the newer
Helm v3 APIs, including for the upgrade workflow, thus allowing us to make
the Linkerd CLI simpler.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
CLI crashes if linkerd-config contains unexpected values.
Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.
Add test for linkerd-config data without Global.
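The accessor pattern can be sketched like this. The `All` and `Global` types and the `GetGlobal` method are simplified, hypothetical stand-ins for the real config types.

```go
package main

import "fmt"

// Global is a cut-down stand-in for the global config section.
type Global struct {
	LinkerdNamespace string
}

// All is a cut-down stand-in for the parsed linkerd-config data.
type All struct {
	Global *Global
}

// GetGlobal never returns nil: it initializes an empty Global on first
// access, so callers can read or write fields without a nil check.
func (a *All) GetGlobal() *Global {
	if a.Global == nil {
		a.Global = &Global{}
	}
	return a.Global
}

func main() {
	cfg := &All{} // linkerd-config without a Global entry
	fmt.Printf("namespace=%q\n", cfg.GetGlobal().LinkerdNamespace)
}
```

Direct field access such as `cfg.Global.LinkerdNamespace` would dereference a nil pointer here, which is the kind of crash the accessor avoids.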
Fixes #5215
Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>
This adds additional tests for the destination service that assert `GetProfile`
behavior when the path is an IP address.
1. Assert that when the path is a cluster IP, the configured service profile is
returned.
2. Assert that when the path is a pod IP, the endpoint field is populated in the
service profile returned.
3. Assert that when the path is not a cluster or pod IP, the default service
profile is returned.
4. Assert that when path is a pod IP with or without the controller annotation,
the endpoint has or does not have a protocol hint
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Refactor webhook framework to allow webhooks to define their flags
Pulled the flag parsing logic out of `launcher.go` and moved it into the `Main` methods of the webhooks (under `controller/cmd/proxy-injector/main.go` and `controller/cmd/sp-validator/main.go`), so that individual webhooks themselves can define the flags they want to use.
Also no longer require that webhooks have cluster-wide access.
Finally, renamed the type `webhook.handlerFunc` to `webhook.Handler` so it can be exported. This will be used in the upcoming jaeger webhook.
## edge-20.11.4
* Fixed an issue in the destination service where endpoints always included a
protocol hint, regardless of the controller label being present or not
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This fixes an issue where the protocol hint is always set on endpoint responses.
We now check the right value which determines if the pod has the required label.
A test for this has been added to #5266.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This release changes error handling to tear down the server-side
connection when an unexpected error is encountered.
Additionally, the outbound TCP routing stack can now skip redundant
service discovery lookups when profile responses include endpoint
information.
Finally, the cache implementation has been updated to reduce latency by
removing unnecessary buffers.
---
* h2: enable HTTP/2 keepalive PING frames (linkerd/linkerd2-proxy#737)
* actions: Add timeouts to GitHub actions (linkerd/linkerd2-proxy#738)
* outbound: Skip endpoint resolution on profile hint (linkerd/linkerd2-proxy#736)
* Add a FromStr for dns::Name (linkerd/linkerd2-proxy#746)
* outbound: Avoid redundant TCP endpoint resolution (linkerd/linkerd2-proxy#742)
* cache: Make the cache cloneable with RwLock (linkerd/linkerd2-proxy#743)
* http: Teardown serverside connections on error (linkerd/linkerd2-proxy#747)
As discussed in #5228, it is not correct for root and intermediate
certs to have a SAN. This PR updates the check to not verify the
intermediate issuer cert with the identity DNS name (which checks against
the SAN and not the CN, as the `verify` func is used to verify leaf certs
and not root and intermediate certs). This PR also avoids setting a SAN
field when generating certs in the `install` command.
Fixes #5228
Context: #5209
This updates the destination service to set the `Endpoint` field in `GetProfile`
responses.
The `Endpoint` field is only set if the IP maps to a Pod--not a Service.
Additionally in this scenario, the default Service Profile is used as the base
profile so no other significant fields are set.
### Examples
```
# GetProfile for an IP that maps to a Service
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.43.222.0:9090
INFO[0000] fully_qualified_name:"linkerd-prometheus.linkerd.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"linkerd-prometheus.linkerd.svc.cluster.local.:9090" weight:10000}
```
Before:
```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}}
```
After:
```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524692}} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"fast-1"} metric_labels:{key:"pod" value:"fast-1-5cc87f64bc-9hx7h"} metric_labels:{key:"pod_template_hash" value:"5cc87f64bc"} metric_labels:{key:"serviceaccount" value:"default"} tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
```
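The `ipv4:170524692` value in the output above is the pod address packed into a 32-bit integer in network byte order. A small helper (hypothetical, for reading such output) converts it back to dotted notation:

```go
package main

import "fmt"

// ipv4String unpacks a big-endian 32-bit integer into dotted-quad form.
func ipv4String(v uint32) string {
	return fmt.Sprintf("%d.%d.%d.%d", byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

func main() {
	fmt.Println(ipv4String(170524692)) // prints 10.42.0.20
}
```

The result matches the `-path 10.42.0.20` argument used in the query.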
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Consolidate integration tests under k3d
Fixes #5007
Simplified integration tests by moving all to k3d. Previously things were running in Kind, except for the multicluster tests, which implied some extra complexity in the supporting scripts.
Removed the KinD config files under `test/integration/configs`, as config is now passed as flags into the `k3d` command.
Also renamed `kind_integration.yml` to `integration_tests.yml`
Test skipping logic under ARM was also simplified.