* Bump proxy-init to v1.3.2
Bumped `proxy-init` version to v1.3.2, fixing an issue with `go.mod`
(linkerd/linkerd2-proxy-init#9).
This is a non-user-facing fix.
## Motivation
I noticed the Go language server stopped working in VS Code and narrowed it
down to `go build ./...` failing with the following:
```
❯ go build ./...
go: github.com/linkerd/stern@v0.0.0-20190907020106-201e8ccdff9c: parsing go.mod: go.mod:3: usage: go 1.23
```
This change updates `linkerd/stern` version with changes made in
linkerd/stern#3 to fix this issue.
This does not depend on #4170, but it is also needed in order to completely
fix `go build ./...`
## Motivation
After #4147 added the `install-pr` script, installing PRs into existing
clusters does not work if that cluster is a KinD cluster
Changing the script to be able to use KinD, and specifically automate `kind
load` would be helpful!
## Solution
The script can now be used in the following ways.
```
❯ bin/install-pr --help
Install Linkerd with the changes made in a GitHub Pull Request.
Usage:
--context: The name of the kubeconfig context to use
# Install Linkerd into the current cluster
bin/install-pr 1234
# Install Linkerd into the current KinD cluster
bin/install-pr [-k|--kind] 1234
# Install Linkerd into the 'kind-pr-1234' KinD cluster
bin/install-pr [-k|--kind] --context kind-pr-1234 1234
```
The script assumes that the cluster (KinD or not) has already been created. If
the cluster is a KinD cluster, the `-k|--kind` flag should be passed.
If the `--context` flag is not passsed, the install defaults to the current
context (`kubectl config current-context`).
I also added a [`-h|--help]` option that describes how to use the script.
This change removes the target port requirement when resolving ports in the dst service. Based on the comments, it seems that we need to have a target port defined in the port spec in order to resolve to the port in the Endpoints. In reality if target port is note defined when creating the service, k8s will set the port and the target port to the same value. Seems to me that checking for the targetPort to be different than 0, is a no-op.
Signed-off-by: Zahari Dichev zaharidichev@gmail.com
## Motivation
Testing #4167 has revealed some `linkerd check` failures that occur only
because the checks happen too quickly after cluster creation or install. If
retried, they pass on the second time.
Some checkers already handle this with the `retryDeadline` field. If a checker
does not set this field, there is no retry.
## Solution
Add retries to the `l5d-existence-replicasets`
`l5d-existence-unschedulable-pods` checks so that these checks do not fail
during a chained cluster creation > install > check process.
We add the `linkerd alpha clients` command which displays client side metrics from each of a resource's clients. This allows you to see who all of your clients are and see what your resource's metrics look like from your clients' point of view. Since these metrics are measured on the client-side, they include network latency.
```
> linkerd alpha clients deploy/web -n emojivoto
FROM TO SUCCESS RPS LATENCY_P50 LATENCY_P90 LATENCY_P99
vote-bot.emojivoto web 97.50% 2.0rps 4ms 5ms 5ms
```
Signed-off-by: Alex Leong <alex@buoyant.io>
This change is in a similar vein to #4052 which provided support for
configuring service profile retries via a vendor extension of
`x-linkerd-retryable`, when generating from an openapi specification.
This change is very similar to the final version of that pull request,
and adds a timeout value based on `x-linkerd-timeout`.
At this point I believe that if the timeout is not specified then the
default provided by linkerd of 30s will apply anyway, but won't
explicitly be reflected in the service profile, which I'm somewhat okay
with as a current state, but I think there's a potential future
improvement that the default timeout is always shown when generating
from an open api spec, but that's more to make it clear and obvious that
that timeout exists.
Signed-off-by: Lewis Cowper <lewis.cowper@googlemail.com>
This release builds on changes in the prior release to ensure that
balancers process updates eagerly.
Cache capacity limitations have been removed; and services now fail
eagerly, rather than making all requests wait for the timeout to expire.
Also, a bug was fixed in the way the `LINKERD2_PROXY_LOG` env variable
is parsed.
---
* Introduce a backpressure-propagating buffer (linkerd/linkerd2-proxy#451)
* trace: update tracing-subscriber to 0.2.3 (linkerd/linkerd2-proxy#455)
* timeout: Introduce FailFast, Idle, and Probe middlewares (linkerd/linkerd2-proxy#452)
* cache: Let services self-evict (linkerd/linkerd2-proxy#456)
## Motivation
#4147 adds a script for setting up a local cluster that uses the images built
from the changes introduced in a forked PR. This would be useful for all PRs.
In order to install Linkerd from a PR into a local cluster, the images still
need to be built at some point. If you happen to have SSH config setup for our
Packet host, you can pull them from there. That is not very
accessible--requiring that someone adds you as a user--so we can take a
similar approach to forked PRs.
## Solution
All PRs now make an artifact directory that is uploaded as part of the KinD
integration tests. This way, the `install-pr` script can use those images no
matter if the PR is a fork or not.
We use curl for fetching remote files in our `bin` scripts. Replace the use of `wget` with `curl` in `bin/shellcheck` for consistency.
Signed-off-by: Alex Leong <alex@buoyant.io>
# Install PR
This script takes a Github pull request number as an argument, downloads the
docker images from the pull request's artifacts, pushes them, and installs
them on your Kubernetes cluster. Requires a Github personal access token
in the $GITHUB_TOKEN environment variable.
Signed-off-by: Alex Leong <alex@buoyant.io>
* use custom all values for top line dashboard
* convert remaining allValue params to wildcard glob
Signed-off-by: Matt Miller <mamiller@rosettastone.com>
## Motivation
Closes#4140
Automatically create new edge release tag:
```
❯ bin/create-release-tag edge
edge-20.3.2 tag created and signed.
tag: edge-20.3.2
To push tag, run:
git push origin edge-20.3.2
```
Validate new stable release tag:
```
❯ bin/create-release-tag stable 2.7.1
stable-2.7.1 tag created and signed.
tag: stable-2.7.1
To push tag, run:
git push origin stable-2.7.1
```
## Solution
The release tag script now takes a release channel argument. If the release
channel argument is `stable`, a second argument is required for the version.
If the release channel is `edge`, the script gets the current edge version and
creates a new edge version with the current year: `YY`, month: `MM`, and
increments the current month minor if it is not a new month.
If the release channel is `stable`, the script will only validate the version.
Example error cases:
```
❯ bin/create-release-tag
Error: create-release-tag accepts 1 or 2 arguments
Usage:
create-release-tag edge
create-release-tag stable x.x.x
```
```
❯ bin/create-release-tag foo
Error: valid release channels: edge, stable
Usage:
bin/create-release-tag edge
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag edge 2.7.1
Error: accepts 1 argument
Usage:
bin/create-release-tag edge
```
```
❯ bin/create-release-tag stable
Error: accepts 2 arguments
Usage:
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag stable 2.7
Error: version reference incorrect
Usage:
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag stable 2.7.1.1
Error: version reference incorrect
Usage:
bin/create-release-tag stable 2.4.8
```
More helpful error messages when the `linkerd alpha stat` command fails. For example, when the user is not authorized:
```
> linkerd alpha stat deploy/web -n emojivoto --as obama@buoyant.io
Error: deployments.metrics.smi-spec.io "web" is forbidden: User "obama@buoyant.io" cannot get resource "deployments" in API group "metrics.smi-spec.io" in the namespace "emojivoto"
```
When an error is encountered on the server:
```
> linkerd alpha stat deploy -n emojivoto
Error: Unauthorized client certificate. Check configuration and try again.
```
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR introduces the `linkerd alpha stat` command which will eventually replace the `linkerd stat` command. This command functions in a similar way, but with slightly different arguments and is implemented using the smi-metrics API. This means that access to metrics can be controlled with RBAC.
See the `linkerd alpha stat` help text for full details, or try one of these commands:
* `linkerd alpha stat -n emojivoto deploy/web`
* `linkerd alpha stat -n emojivoto deploy`
* `linkerd alpha stat -n emojivoto deploy/web --to deploy/emoji`
Signed-off-by: Alex Leong <alex@buoyant.io>
* proxy: v2.88.0
This release includes a significant internal change to how backpressure
is handled in the proxy. These changes fix a class of bugs related to discovery
staleness, and it should be rarer to encounter "dispatch timeout"
errors.
---
* orig-proto: Be more flexible to stack placement (linkerd/linkerd2-proxy#444)
* Remove Clone requirement from controller clients (linkerd/linkerd2-proxy#449)
* server: Simplify HTTP server type constraints (linkerd/linkerd2-proxy#450)
* Overhaul buffering & caching to better-support backpressure (linkerd/linkerd2-proxy#453)
This is a followup to #4129, fixing this warning:
```
In ./bin/create-release-tag line 32:
tmp=$(. "$bindir"/_release.sh; extract_release_notes)
^-------------------^ SC2119: Use
extract_release_notes "$@" if function's
$1 should mean script's $1.
```
In order to use functions in bash that use optional arguments that don't
generate this warning, we have to disable the SC2120 check, as explained here:
https://github.com/koalaman/shellcheck/wiki/SC2120#exceptions
Extracted the logic to pull the latest release notes, out of
`bin/create-release-tag` into `bin/_release.sh` so that it can be reused
in the `release.yml` workflow, which needs to use that inside
`gh_release` when creating the github release in order to have prettier
markup release notes instead of a plaintext message pulled out of the tag
message.
The new extracted function also receives an optional argument with the
name of the file to put the release notes into, because the `body_path`
parameter in `softprops/action-gh-release` doesn't work with dynamic
vars.
Finally, now the `website_publish` job will only launch until the `gh_release`
has succeeded.
Unit tests that exercise most of the code in cluster_watcher.go. Essentially the whole cluster mirroring machinary can be tought of as a function that takes remote cluster state, local cluster state, and modification events and as a result it either modifies local cluster state or issues new events onto the queue. This is what these tests are trying to model. I think this covers a lot of the logic there. Any suggestions for other edge cases are welcome.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
The `/namespaces` page in the web dashboard was rendering broken Grafana
links, containing an extra `var-namespace=` param, for example:
```
/grafana/dashboard/db/linkerd-namespace?var-namespace=&var-namespace=emojivoto
```
Root cause was the `GrafanaLink` component taking both `resource` and
`namespace` properties, but not special-casing when
`resource === 'namespace' && namespace === ''`.
Modify the `GrafanaLink` component to omit the `var-namespace` param
when a `namespace` property is not provided.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #4117 was root-caused with the help of `shellcheck`.
This change introduces a `bin/shellcheck` script, and adds it to CI. In
CI, many checks are disabled to allow it to pass. This will at least
prevent introduction of new classes of shell issue, and should motivate
re-enabling more checks over time.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`bin/fetch-proxy` was failing on Linux:
```bash
$ bin/fetch-proxy
linkerd2-proxy-v2.87.0/
linkerd2-proxy-v2.87.0/LICENSE
linkerd2-proxy-v2.87.0/bin/
linkerd2-proxy-v2.87.0/bin/linkerd2-proxy
bin/fetch-proxy: 31: [: Linux: unexpected operator
/home/siggy/code/linkerd2/target/proxy/linkerd2-proxy-v2.87.0
```
Also in CI:
https://github.com/linkerd/linkerd2/runs/473746447?check_suite_focus=true#step:5:32
Unfortunately `bin/fetch-proxy` still returned a zero exit status, because
`set -e` does not apply to commands that are part of `if` statements.
From https://ss64.com/bash/set.html:
```
-e Exit immediately if a simple command exits with a non-zero status, unless
the command that fails is part of an until or while loop, part of an
if statement, part of a && or || list, or if the command's return status
is being inverted using !. -o errexit
```
Fortunately when the `if` command failed, it fell through to the `else` clause
for Linux, and copied `linkerd-proxy` successfully.
Root cause was a `==` instead of `=`. `shellcheck` confirms, and also
recommends quoting:
```bash
$ shellcheck bin/fetch-proxy
In bin/fetch-proxy line 31:
if [ $(uname) == "Darwin" ]; then
^-- SC2046: Quote this to prevent word splitting.
^-- SC2039: In POSIX sh, == in place of = is undefined.
```
Apply `shellcheck` recommendations.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
When adding an action we can quickly vet it and fix it to a sha. Whereas
if we use a tag, the 3rd party can change the code and retag it without us noticing
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Adds the SMI metrics API to the Linkerd install flow. This installs the SMI metrics controller deployment, the SMI metrics ApiService object, and supporting RBAC, and config resources.
This is the first step toward having Linkerd consume the SMI metrics API in the CLI and web dashboard.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Check Extension api server Authentication
* Added Checks and tests for extension api-server authentication
* Fixed Failing Static Checks
* Updated the golden file
Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
## edge-20.2.3
This release introduces the first optional add-on `tracing`, added through the
new add-on model!
The existing optional `tracing` components Jaeger and OpenCensus can now be
installed as add-on components.
There will be more information to come about the new add-on model, but please
refer to the details of [#3955](https://github.com/linkerd/linkerd2/pull/3955) for how to get started.
* CLI
* Added the `linkerd diagnostics` command to get metrics only from the
control plane, excluding metrics from the data plane proxies (thanks
@srv-twry!)
* Added the `linkerd install --prometheus-image` option for installing a
custom Prometheus image (thanks @christyjacob4!)
* Fixed an issue with `linkerd upgrade` where changes to the `Namespace`
object were ignored (thanks @supra08!)
* Controller
* Added the `tracing` add-on which installs Jaeger and OpenCensus as add-on
components (thanks @Pothulapati!!)
* Proxy
* Increased the inbound router's default capacity from 100 to 10k to
accommodate environments that have a high cardinality of virtual hosts
served by a single pod
* Web UI
* Fixed styling in the CallToAction banner (thanks @aliariff!)
This adds a message after running the `create-release-script` that I intended to
add as part of the initial PR. Example output:
```
❯ bin/create-release-tag $TAG tag created and signed.
tag: edge-93.1.1
To push tag, run:
git push origin edge-93.1.1
```
Fixes#4105
In my local machine, `linkerd stat` was not returning traffic up until
the 17th try or so. Which explains why the 20s timeout was a bit too
close to the limit and this test was failing sometimes. So I increased
the timeout up to 40s and I'm also adding stderr to the error message.
* Add spacing in CallToAction banner
The call to action banner in control plane page is missing some spacing.
The CSS is defined but not yet used. So the solution is to add the class name to the corresponding banner.
After its merged the banner will have more space.
Fixes#3690
* Remove unused css
Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
Added the `website_publish_check` job, triggered upon edge/stable tag pushes,
that verifies that the install script at the website has indeed been
updated with the corresponding edge/stable version. It performs the
check every 5 seconds, 10 times, before giving up and failing the build
run. Tested fine in my fork 👍
* Moves Common templates needed to partials
As add-ons re-use the partials helm chart, all the templates needed by multiple charts should be present in partials
This commit also updates the helm tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* add tracing add-on helm chart
Tracing sub-chart includes open-census and jaeger components as a sub-chart which can be enabled as needed
* Updated Install path to also install add-ons
This includes new interface for add-ons to implement, with example tracing implementation
* Updates Linkerd install path to also install add-ons
Changes include:
- Adds an optional Linkerd Values configmap which stores add-on configuration when add-ons are present.
- Updates Linkerd install path to check for add-ons and render their sub-charts.
- Adds a install Option called config, which is used to pass confiugration for add-ons.
- Uses a fork of mergo, to over-write default Values with the Values struct generated from config.
* Updates the upgrade path about add-ons.
Upgrade path now checks for the linkerd-values cm, and overwrites the default values with it, if present.
It then checks the config option, for any further overwrites
* Refactor linkerd-values and re-update tests
also adds relevant nil checks
* Refactor code to fix linting issues
* Fixes an error with linkerd-config global values
Also refactors the linkerd-values cm to work the same with helm
* Fix a nil pointer issue for tests
* Updated Tracing add-on chart meta-data
Also introduced a defaultGetFiles method for add-ons
* Add add-on/charts to gitignore
* refactor gitignore for chart deps
* Moves sub-charts to /charts directly
* Refactor linkerd values cm
* Add comment in linkerd-values
* remove extra controlplanetracing flag
* Support Stages deployment for add-ons along with tests
* linting fix
* update tracing rbac
* Removes the need for add-on Interface
- Uses helm loading capabiltiies to get info about add-ons
- Uses reflection to not have to unnecessarily add checks for each add-on type
* disable tracing flag
* Remove dep on forked mergo
- Re-use merge from helm
* Re-use helm's merge
* Override the chartDir path during tests
* add error check
* Updated the dependency iteration code
Currently, the charts directory, will not have the deps in the repo. So, Code is updated to read the dependencies from requirements.yaml
and use that info to read templates from the relevant add-ons directory.
* Hard Code add-ons name
* Remove struct details for add-ons
- As we don't use fields of a add-on struct, we don't have them to be typed. Instead we can just use the `enabled` flag using reflection
- Users can just use map[string]interface{} as the add-on type.
* update unit tests
* linting fix
* Rename flag to addon-config
* Use Chart loading logic
- This code uses chart loading to read the files and keep in a vfs.
- Once we have those files read we will then use them for generation of sub-charts.
* Go fmt fix
* Update the linkerd-values cm to use second level field
* Add relevant unit tests for mergeRaw
* linting fix
* Move addon tests to a new file
* Fix golden files
* remove addon install unit test
* Refactor sub-chart load logic
* Add install tracing unit test
* golden file update for tracing install
* Update golden files to reflect another pr changes
* Move addon-config flag to recordFlagSet
* add relevant tracing enabled checks
* linting fix
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## Motivation
A goal of the release automation project is to automate the website publish
that publishes a new install script that uses the new release version.
linkerd/website#668 removed the hard coded versions from the repo and moved
the version update into the `make publish` command.
That workflow now needs to be triggered by a release in `linkerd2`.
## Solution
Once `kind_integration_tests` and `cloud_integration_test` pass, a job runs
using the [repository-dispatch](https://github.com/marketplace/actions/repository-dispatch) action to create a repository dispatch event in
`linkerd/website`.
This dispatch event will (current PR: linkerd/website#670) trigger the
publish workflow.
## Testing
Tested in my fork [here](https://github.com/kleimkuhler/linkerd2/actions/runs/45165789)
## Additional steps needed
A new `RELEASE_TOKEN` secret needs to be added to this repo. It should be the
personal access token of [l5d-bot](https://github.com/l5d-bot) with `repo` access.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* New CI job to automatically generate the Github release
Fixes#4083
New `gh_release` job in the `release.yml` for creating the release in
Github and uploading the CLI binaries for each platform, along with
their checksum files.
This job only gets triggered upon successful docker images building and
pushing, and kind and cloud integration tests passing.
The Helm chart deploying job gets now triggered upon the success of this
new job.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Update control-plane-namespace label
Upgrade command ignores changes to the namespace object
Add linkerd.io/control-plane-ns=linkerd label to the control-plane namespace
Fixes#3958
* Add controlPlaneNamespace label to namespace.yaml
* Modify tests for updated controlPlaneNamespace label
* Fix faulty values.yaml value
* Localize reference for controlPlaneNamespace label in kubernetes_helper.go
Signed-off-by: Supratik Das <rick.das08@gmail.com>
Fixes#4082
This tested fine in a fork, under various scenarios combining:
- Modify markup file in root dir
- Modify markup file in subdir
- Modify non-markup file
- In master
- In a PR