This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
The misspellings have been reported at 0d56327e6f (commitcomment-51603624)
The action reports that the changes in this PR would make it happy: 03a9c310aa
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
This release adds support for retrying HTTP/2 requests with small (<64KB)
message bodies, allowing the proxy to properly buffer message bodies when
responses are classified as a failure. Documentation on how to configure
retries can be found [here](https://linkerd.io/2.10/tasks/configuring-retries/).
This release also modifies the proxy's identity subsystem to instantiate a
client on-demand so client connections are not retained continually. Also
included in this release are various bug fixes and improvements as well as
expanding support for resource-aware tab completion in the jaeger and
multicluster CLI extensions.
* Added support for specifying a `gateway-port` flag for the `multicluster link`
command (thanks @psmit!)
* Added support for Kubernetes resource aware tab completion for `jaeger` and
`multicluster` commands
* Fixed an issue where `viz`, `jaeger` and `multicluster` extensions could not
be installed on `PodSecurityPolicy`-enabled clusters
* Fixed an issue where `linkerd check --proxy` could incorrectly report
out-of-date proxy versions because of an incorrect regex (thanks @aryan9600!)
* Added support for the proxy to retry HTTP/2 requests with message bodies
<= 64KB
* Modified the proxy's controller stack to create new client connections
on-demand
* Fixed Viz's `uninstall` command to remove viz installations that used the
legacy `linkerd.io/extension: linkerd-viz` label (thanks @jsoref!)
* Expanded the "linkerd-existence" health check to also check for the
destination pod readiness
Followup to #6209, fixes #6169
Added flag `--images preload` to `bin/image-load` (which is also surfaced in `bin/tests`) to pull docker images before loading them into the cluster. This is used in the `release.yml` workflow so k3d doesn't pull the images itself from the public registry, which has proven to be less efficient.
I tested not-preloading vs preloading in this [fork](https://github.com/alpeb/linkerd2/runs/2719971781?check_suite_focus=true), which measures how long it takes to install linkerd and wait for `linkerd check` to complete in both scenarios. Without preloading it took 1m54s; with preloading it took 1m10s. The difference was consistent across multiple re-runs.
Prior to 15d1809bd0, the name for viz objects was
`linkerd-viz` and people may still want to uninstall them using the current version of linkerd.
To support this, an additional code path is included when the fast path fails to
suggest things to remove.
Tested with my broken cluster.
Fixes #6213
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
This changes the data plane and control plane checks to only consider pods that
are in a running state.
This is currently an issue in scenarios when upgrading and an old pod is still
terminating while the upgraded pod is already running. Running the `check`
command will report that the old pod has an unexpected proxy version even though
it is terminating; it should not warn in this case.
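As a rough illustration of the filtering described above, the following Go sketch skips any pod that is not running before comparing proxy versions. The `pod` type, its fields, and the status strings are hypothetical simplifications, not the types the `check` command actually uses.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for a Kubernetes pod.
type pod struct {
	Name         string
	Status       string // assumed values: "Running", "Terminating", ...
	ProxyVersion string
}

// outdatedProxies returns the names of running pods whose proxy version
// differs from expected. Pods in any other state are ignored.
func outdatedProxies(pods []pod, expected string) []string {
	var names []string
	for _, p := range pods {
		if p.Status != "Running" {
			continue // e.g. an old pod that is still terminating
		}
		if p.ProxyVersion != expected {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	pods := []pod{
		{Name: "web-old", Status: "Terminating", ProxyVersion: "edge-21.5.3"},
		{Name: "web-new", Status: "Running", ProxyVersion: "edge-21.6.1"},
	}
	// The terminating pod is skipped, so nothing is reported.
	fmt.Println(outdatedProxies(pods, "edge-21.6.1"))
}
```

With this shape, the old pod no longer triggers a version warning while it shuts down.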
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Supersedes #6196
Waaaay back when the `destination` container was moved out of the
"public-api" pod, we failed to add the new pod into the list of pods to
be checked in the `control plane pods are ready` health check.
`linkerd viz install` uses that check before proceeding to install. So
if that happens too quickly we risk trying to run viz components without
a `destination` pod being ready, which was causing the viz pods to
fail and restart sporadically.
Most integration tests use `test/integration/install_test.go` to set up
the cluster, which calls `kubectl rollout status` after the core
install, which blocks on full readiness before installing viz.
But the `uninstall` test does its own install and doesn't have that
rollout blocking, so that's why we were seeing those pod restarts only
for that test!
Addresses #6169 in part.
Improve `bin/image-load` commands to receive as arguments the images we
want to load (any from proxy, controller, web, metrics-api, grafana, debug, cni-plugin, jaeger-webhook or tap).
If nothing is passed, it loads them all, as it currently does.
Note that now in order to specify a target cluster different than the default (`k3s-default` for k3d and `kind` for KinD) you need to use the new `--cluster` flag.
The current method can fail to return the version, if for example, the
proxy image used is hosted on a private registry.
Split the image string using ":" as a delimiter and return the last
element instead of the second element, as it _should_ always be the
image version.
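A minimal Go sketch of the approach described above, assuming the image reference is a plain `name:tag` string. The function name and the sample image strings are illustrative, not taken from the actual change.

```go
package main

import (
	"fmt"
	"strings"
)

// imageVersion returns the last ":"-separated element of an image
// reference, which should be the tag even when the registry host
// includes a port.
func imageVersion(image string) string {
	parts := strings.Split(image, ":")
	return parts[len(parts)-1]
}

func main() {
	// With a private registry on a custom port, the second element would
	// be "5000/linkerd/proxy"; the last element is the tag.
	fmt.Println(imageVersion("registry.example.com:5000/linkerd/proxy:edge-21.6.1"))
	fmt.Println(imageVersion("cr.l5d.io/linkerd/proxy:stable-2.10.2"))
}
```

Taking the last element keeps the common case working while also handling registry hosts that contain a colon.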
Fixes #6154
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
This release adds support for retrying messages with small (<64KB)
bodies. Now when retry policies specify retries for POST messages, etc,
the proxy will properly buffer and resubmit these message bodies when
responses are classified as a failure.
This release also modifies the proxy's identity subsystem to instantiate
a client on-demand so client connections are not retained continually.
The identity client is typically used only once per day, so there's no
need to maintain these resources continually.
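The buffering idea can be sketched in Go as follows. This is only an illustration under stated assumptions: the proxy itself is not written in Go, the names `maxRetryBody`, `bufferForRetry`, and `canRetry` are hypothetical, and a real proxy would keep streaming an oversized body rather than discarding it.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// maxRetryBody mirrors the 64KB limit mentioned above (assumed inclusive).
const maxRetryBody = 64 * 1024

// bufferForRetry reads at most limit bytes from body. It returns the
// buffered bytes and true when the whole body fits, or nil and false when
// the body is larger than limit and therefore cannot be replayed.
func bufferForRetry(body io.Reader, limit int64) ([]byte, bool, error) {
	// Read one byte past the limit so an oversized body can be detected.
	buf, err := io.ReadAll(io.LimitReader(body, limit+1))
	if err != nil {
		return nil, false, err
	}
	if int64(len(buf)) > limit {
		return nil, false, nil
	}
	return buf, true, nil
}

// canRetry reports whether an in-memory payload fits within limit.
func canRetry(payload string, limit int64) bool {
	_, ok, err := bufferForRetry(strings.NewReader(payload), limit)
	return err == nil && ok
}

func main() {
	fmt.Println(canRetry(`{"id": 1}`, maxRetryBody))                      // small body: retryable
	fmt.Println(canRetry(strings.Repeat("x", maxRetryBody+1), maxRetryBody)) // too large: not retryable
}
```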
---
* retry: retry requests with <=64KB bodies (linkerd/linkerd2-proxy#1017)
* retry: only buffer request bodies when retries are enabled (linkerd/linkerd2-proxy#1020)
* control: Build the identity client each time its used (linkerd/linkerd2-proxy#1021)
* Restrict permissions in CI
Fixes #6171
Added a restrictive workflow-level `permissions: contents: read`
statement to all the workflows to limit the `GITHUB_TOKEN` privileges,
overriding it with laxer permissions where required.
Addresses #6169 in part.
Launch k3d with the `--no-hostip` option and with `local-storage`
and `metrics-server` disabled, as they are never required.
Also disable the load balancer and the `servicelb` and `traefik`
components in all tests except the `multicluster` test.
* Add missing psp for extensions
This change fixes an issue where the `viz`, `jaeger` and `multicluster`
extensions did not have `podsecuritypolicy` Roles. This causes an issue
where the extensions aren't able to be installed on a cluster that has
pod security enabled.
Fixes #6122
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Create gateway-port option for mc link (#6175)
In case of a NodePort service type, there might be non-kubernetes load
balancers that map a public port to the selected NodePorts. This is not
something that is detectable, so a flag needs to be provided to specify
the public port during the `mc link` command.
Note that the probe port can already be set using `--set
gateway.probe.port 1234`.
Signed-off-by: Peter Smit <peter.smit@inscripta.io>
Increase the timeout when running `linkerd viz check`, in particular to keep
upgrade tests from failing, given that the prometheus pod takes a while to
come up and replace the old one.
This edge release contains various improvements to the Viz and Jaeger install
charts, along with bug fixes in the CLI and destination. This release also
adds Kubernetes-aware autocompletion to all viz commands and makes
ServiceProfiles part of the default `viz install`.
Finally, the proxy has been updated to continue supporting requests without
`l5d-dst-override` in ingress-mode proxies, to no longer include query parameters
in the OpenCensus trace spans, and to prevent timeouts with controller clients
of components with more than one replica.
* Separated protocol hint setting from H2 upgrades in destination profile
response, thus preventing `hint.OpaqueTransport` field from not being set when
H2 upgrades are disabled
* Updated OpenCensus trace spans for HTTP requests to no longer include query
parameters (thanks @aatarasoff!)
* Reverted [linkerd/linkerd2-proxy#992](https://github.com/linkerd/linkerd2-proxy/pull/992)
to support requests without `l5d-dst-override` in ingress-mode proxies
* Fixed an issue in the proxy to prevent timeouts with controller clients
of components with more than one replica
* Fixed `linkerd check --proxy` failure with pods that are part of Jobs
* Updated `viz install` to also include ServiceProfiles of its components.
As a side-effect, `linkerd diagnostics install-sp` cmd has been removed
* Added support for Kubernetes resource aware tab completion for all
viz commands
* Updated destination to prefer `ServiceProfile.dstOverrides` over
`TrafficSplit` when both are present for a service
* Added toggle flags for `collector` and `jaeger` components in the
jaeger extension (thanks @tarvip!)
* Added support for setting `nodeselector`, `toleration` fields for components
in the Viz extension (thanks @aatarasoff!)
* Fixed a templating issue in Viz, making `podAnnotations` field
work with prometheus
* Updated Golang version to 1.16.4
* Removed unnecessary `--addon-overwrite` flag in `linkerd upgrade`
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Controller clients of components with more than one replica could fail
to drive all connections to completion. This could result in timeouts
showing up in logs, but would not have prevented proxies from
communicating with controllers. linkerd/linkerd2#6146
* linkerd/linkerd2-proxy#992 made the `l5d-dst-override` header required
for ingress-mode proxies. This behavior has been reverted so that
requests without this header are forwarded to their original
destination.
* OpenCensus trace spans for HTTP requests no longer include query
parameters.
---
* ci: Update/pin action dependencies (linkerd/linkerd2-proxy#1012)
* control: Ensure endpoints are driven to readiness (linkerd/linkerd2-proxy#1014)
* Make span name without query string (linkerd/linkerd2-proxy#1013)
* ingress: Restore original dst address routing (linkerd/linkerd2-proxy#1016)
* ci: Restict permissions in Actions (linkerd/linkerd2-proxy#1019)
* Forbid unsafe code in most module (linkerd/linkerd2-proxy#1018)
The owners in that file match the entries in MAINTAINERS.md.
On the next release build this will upload that file into our Helm repo,
and we'll be able to claim ownership of the chart at artifacthub.io.
Once we do that we'll get a `repositoryID` we can add into that file, to
obtain the "Verified publisher" badge.
This change also removes a couple of no longer relevant OWNERS files.
While debugging the latest CI flakiness I only saw problems at
retrieving the images, which aren't surfaced as such unless you
explicitly ask for k8s events. After doing so I also found things like:
```
Warning FreeDiskSpaceFailed node/k3d-target-server-0 failed to garbage collect required amount of images. Wanted to free 5491729203 bytes, but freed 0 bytes
```
and entries in the k3d log such as:
```
I0526 15:21:25.045203 11 image_gc_manager.go:304] [imageGCManager]: Disk usage on image filesystem is at 87% which is over the high threshold (85%). Trying to free 5491729203 bytes down to the low threshold (80%).
E0526 15:21:25.046709 11 kubelet.go:1214] Image garbage collection failed multiple times in a row: failed to garbage collect required amount of images. Wanted to free 5491729203 bytes, but freed 0 bytes
```
So this PR expands the removal of the `images-archives.zip` file and the
docker images pruning to all tests, not just *-deep ones. 🙏
This updates the destination to prefer `serviceprofiles.dstOverrides`
over `trafficsplits`. This matters because a ServiceProfile should
take precedence over a TrafficSplit when both are present for a
service.
This also makes integration testing the `smi-adaptor` easier.
This also adds unit tests in the `traffic_split_adaptor` to check
for the same.
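The preference rule can be sketched in Go like this. The `weightedDst` type and `chooseOverrides` function are hypothetical simplifications for illustration and do not reflect the destination controller's actual types.

```go
package main

import "fmt"

// weightedDst is a hypothetical, simplified weighted destination.
type weightedDst struct {
	Authority string
	Weight    uint32
}

// chooseOverrides prefers the overrides declared in the ServiceProfile and
// only falls back to the TrafficSplit backends when the profile has none.
func chooseOverrides(fromProfile, fromSplit []weightedDst) []weightedDst {
	if len(fromProfile) > 0 {
		return fromProfile
	}
	return fromSplit
}

func main() {
	profile := []weightedDst{{Authority: "web-v2.default.svc.cluster.local:80", Weight: 100}}
	split := []weightedDst{{Authority: "web-v1.default.svc.cluster.local:80", Weight: 100}}

	// Both are present, so the ServiceProfile overrides win.
	fmt.Println(chooseOverrides(profile, split)[0].Authority)
	// No profile overrides, so the TrafficSplit backends are used.
	fmt.Println(chooseOverrides(nil, split)[0].Authority)
}
```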
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
PR #6120 added flags to disable and enable jaeger, and opencensus
collector.
The Helm indentation was not set correctly, which seems to
add unnecessary newlines.
This PR fixes that while also adding new tests, to test
and track the manifests with these options.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Our web development environment has a dependency on webpack-dev-server,
which pulls in bonjour, which pulls in multicast-dns, which depends on
dns-packet, which has a [critical security vulnerability][1].
webpack/webpack-dev-server#3340 tracks fixing this dependency, but it
appears that the bonjour project is no longer maintained.
This change works around this issue by patching the multicast-dns
dependency to pull in a fixed version. This could potentially break mdns
functionality in our development environment, but we probably don't even
use this functionality.
Dependencies are bad.
[1]: https://nvd.nist.gov/vuln/detail/CVE-2021-23386
* add shell completion for all viz subcommands
This change is part of #5981 and follows up from #6091. It adds resource
aware shell completion for all viz subcommands and their related flags
where applicable.
Related to #5981
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* reviewer feedback
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Improve linkerd-jaeger so that it is possible to exclude all-in-one
Jaeger installation. This is useful when pointing `.Values.collector.jaegerAddr` to existing
Jaeger. Furthermore, this change makes the collector optional as well.
Signed-off-by: Tarvi Pillessaar <tarvip@gmail.com>
Because of changes in control-plane and extensions, the current
serviceprofiles in `install-sp` are no longer relevant. Most of the
current ones should be moved over into the viz extension.
This leaves the core cp with just the `linkerd-dst` serviceprofile.
Fixes #6084
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Manifests are prone to breakage, as Helm indentation
changes and the like are easy to get wrong. This PR adds
unit tests to the render logic, so that changes to the
output manifests are tracked.
This follows the same pattern as that of other render
unit tests in core cp, viz, etc.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When `linkerd check --proxy` is run in a given namespace, the check
command ensures that all pods within that namespace have a running proxy
and that the pod is in a `Running` status.
This, however, is not necessarily true when dealing with CronJobs or Jobs
as these resources create pods that run their containers for a short period
of time and then terminate. When these pods terminate, they get placed
in a `Completed` phase and their sidecars are no longer running. This
causes an issue in `linkerd check --proxy` because that command expects
that pods will always be in a running state.
This change effectively skips checking the proxy status of a pod if that
pod is in a `Completed` phase, as such a pod no longer has an actively
running proxy.
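A minimal Go sketch of that skip, assuming a simplified view of a pod. The `pod` type, its fields, and the status strings are hypothetical and stand in for whatever the check command reads from the Kubernetes API.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for a Kubernetes pod.
type pod struct {
	Name         string
	Status       string // assumed values: "Running", "Completed", "Pending", ...
	ProxyRunning bool
}

// checkProxies returns an error for the first pod that is expected to have
// a running proxy but does not. Completed pods (for example, from Jobs or
// CronJobs) are skipped because their sidecars have already exited.
func checkProxies(pods []pod) error {
	for _, p := range pods {
		if p.Status == "Completed" {
			continue
		}
		if p.Status != "Running" {
			return fmt.Errorf("pod %q is not running (status %s)", p.Name, p.Status)
		}
		if !p.ProxyRunning {
			return fmt.Errorf("proxy in pod %q is not running", p.Name)
		}
	}
	return nil
}

func main() {
	pods := []pod{
		{Name: "nightly-job-abc", Status: "Completed", ProxyRunning: false},
		{Name: "web-1", Status: "Running", ProxyRunning: true},
	}
	// The completed Job pod is ignored, so the check passes.
	fmt.Println(checkProxies(pods))
}
```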
Fixes #6128
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
While uncommon, if H2 upgrades are disabled it's possible for an opaque workload
to not have its hint.OpaqueTransport field set in its destination profile
response.
This changes the H2 upgrade enabled check to be specific for setting the
hint.Protocol while allowing hint.OpaqueTransport to be set independent of
that value.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Oliver Gould <ver@buoyant.io>
This PR updates the Integration tests workflow to skip running
them for markdown updates.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Oliver Gould <ver@buoyant.io>
Go 1.16.4 includes a fix for a denial-of-service in net/http: golang/go#45710
Go's error file-line formatting changed in 1.16.3, so this change
updates tests to only do suffix matching on these error strings.
CodeQL flags our debug-logging of user and group header values. While
this information isn't strictly sensitive, it isn't really necessary,
either--we haven't used any of this logged information for practical
diagnostics in years, if then. And because these header names are read
as configuration and not hardcoded, there's no way for us to be really
_sure_ that these headers are safe to log. So, I propose that we just
stop logging them and log the header names instead :).
While we're here, we should use `Header.Values()` instead of direct
slice access. `Values()` canonicalizes the header name into the
canonical MIME format ("Train-Case"), so the lookup is
case-insensitive.
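The difference between direct access and `Header.Values()` can be seen in a small Go sketch. The helper `countValues` and the header name `x-remote-user` are illustrative assumptions, not names from the actual change.

```go
package main

import (
	"fmt"
	"net/http"
)

// countValues stores one value under the header name stored, then reports
// how many values a lookup by name finds via direct map access and via
// Header.Values.
func countValues(stored, name string) (direct, viaValues int) {
	h := http.Header{}
	h.Add(stored, "alice") // Add canonicalizes the key, e.g. "X-Remote-User"
	return len(h[name]), len(h.Values(name))
}

func main() {
	// A header name read from configuration may not be in canonical form.
	name := "x-remote-user"
	direct, viaValues := countValues(name, name)

	// Direct map access misses the canonicalized key; Values finds it.
	fmt.Printf("direct=%d values=%d\n", direct, viaValues) // prints "direct=0 values=1"

	// Log the header name rather than its values.
	fmt.Printf("found %d value(s) for header %q\n", viaValues, name)
}
```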
[1]: https://codeql.github.com/codeql-query-help/go/go-clear-text-logging/
While we continue to deal with Actions CI timeouts, this adds an additional
timeout to the helm upgrade command in the install integration tests.
This was encountered in this CI run.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
GitHub's CodeQL runs static analyzers on Go and JavaScript. This change
adds a CodeQL action to be run regularly on our code.
This change also updates actions/checkout to v2.3.4.
This edge release updates the proxy-init container to check whether the iptables
rules have already been added, which prevents errors if the proxy-init container
is restarted. Also, the `viz stat` command now has tab completion for Kubernetes
resources, saving you precious keystrokes! Finally, the proxy has been updated
with several fixes and improvements.
* Added instructions to `build.md` for using a locally built proxy
(thanks @jroper!)
* Added support for Kubernetes resource aware tab completion to the `viz stat`
command
* Updated `proxy-init` to skip configuring firewall if rules exist
* Fixed `viz uninstall` to delete all RBAC objects (thanks @aryan9600!)
* Improved diagnostics for rejected profile discovery
* Added the `l5d-client-id` header on mutually-authenticated inbound requests so
that applications can discover the client's identity.
* Reduced proxy resource usage when there are no profiles
* Changed the admin server to assume all meshed connections are HTTP/2 and fail
connections when that is not the case
* Updated the proxy to require the `l5d-dst-override` header on outbound
requests when the proxy is in ingress-mode
* Removed support for TCP-forwarding in ingress-mode
* Skip configuring firewall if rules exist
This change fixes an issue where `proxy-init` fails if
`PROXY_INIT_*` chains already exist in the pod's iptables. This then
causes the pod to never start, because proxy-init keeps exiting with a
non-zero exit code and never finishes successfully.
In this change, we capture the output of the `iptables-save` command and
then check to see if the output contains the `PROXY_INIT_*` chains. If
it does, we exit and log a warning stating that the chains already
exist.
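The check can be sketched in Go as below. This is an assumption-laden illustration rather than the actual `proxy-init` code: the function name `chainsExist` and the sample output are hypothetical, and the sketch relies on `iptables-save` declaring each chain on a line beginning with `:`.

```go
package main

import (
	"fmt"
	"strings"
)

// chainsExist reports whether iptables-save output declares any chain whose
// name starts with "PROXY_INIT_". Chain declarations are lines of the form
// ":CHAIN_NAME POLICY [packets:bytes]".
func chainsExist(saveOutput string) bool {
	for _, line := range strings.Split(saveOutput, "\n") {
		if strings.HasPrefix(line, ":PROXY_INIT_") {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative output; a real run would capture this from iptables-save.
	out := "*nat\n" +
		":PREROUTING ACCEPT [0:0]\n" +
		":PROXY_INIT_OUTPUT - [0:0]\n" +
		":PROXY_INIT_REDIRECT - [0:0]\n" +
		"COMMIT\n"

	if chainsExist(out) {
		fmt.Println("warning: PROXY_INIT chains already exist, skipping firewall configuration")
		return
	}
	fmt.Println("configuring firewall")
}
```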
Fixes #5786
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Uninstalling the viz extension would not remove a few RBAC objects that
were created while installing the extension.
Fetch the required RBAC objects and add them to the list of resources to
be uninstalled.
Related to #6062.
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>