## Problem
#4557 changed the name of the function that `helm_upgrade_integration_tests`
uses.
`install_stable()` was renamed to `latest_release_channel()` and now takes an
argument for specifying either `edge` or `stable`.
`run_helm_upgrade_test` is a function used by the helm upgrade integration test
and was not properly updated to use `latest_release_channel()`.
This silently passed integration tests because `run_helm_upgrade_test` started
passing an empty string for the version to upgrade from, which results in the
default behavior of `install_test.go`--and therefore still passes.
## Solution
`run_helm_upgrade_test` now uses `latest_release_channel()` and passes the
proper argument.
Additionally, it checks that the version returned from
`latest_release_channel()` is not empty. If it is empty, it exits the test. This
ensures something like this does not happen in the future.
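The empty-version guard can be sketched roughly as follows. This is a minimal illustration, not the actual `run_helm_upgrade_test` code: `require_upgrade_version` is a hypothetical helper name, and it assumes only that the version to upgrade from arrives as a plain string.

```bash
# Hypothetical helper: refuse to continue when the version to upgrade
# from is empty, instead of silently falling back to default behavior.
require_upgrade_version() {
  local version="$1"
  if [ -z "$version" ]; then
    echo "error: could not determine the version to upgrade from" >&2
    return 1
  fi
  printf '%s\n' "$version"
}
```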
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Fix install-pr script
* Add image-archives path to commands to use the files
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Signed-off-by: Charles Pretzer <charles@buoyant.io>
Co-authored-by: Charles Pretzer <charles@buoyant.io>
This adds an integration test for upgrading from the latest edge to the current
build.
Closes #4471
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* CI steps for Chocolatey package - take 2
Followup to #4205, supersedes #4205
This adds:
- A new job psscript-analyzer into the `statics_checks.yml`
workflow for linting the Chocolatey Powershell script.
- A new `choco_pack` job in the `release.yml` workflow for
updating the Chocolatey spec file and generating the
package. This is only triggered for stable releases. It requires
a windows runner in order to run the choco tooling (in theory
it should have worked on a linux runner but in practice it
didn't).
- The `Create release` step was updated to upload the generated package,
if present.
- The source file path in `bin/win/linkerd.nuspec` was updated
to make this work.
* Name nupkg file accordingly to the other release assets
- Add quotes where missing, to handle whitespace & co.
- Use single quotes for non-expansion strings.
- Fix quotes where the current ones would cause errors.
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
The version of `uname` on Darwin doesn't support the `-o` flag, resulting in an error message when running the `bin/lint` script.
We add an if-branch to short-circuit the `uname -o` call if running on Darwin.
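A minimal sketch of that kind of branch, assuming only that `uname -s` prints `Darwin` on macOS. `os_name` is a hypothetical helper and not the code in `bin/lint`:

```bash
# Hypothetical helper: print an operating system name without calling
# `uname -o` on Darwin, where that flag is not supported.
os_name() {
  local kernel
  kernel=$(uname -s)
  if [ "$kernel" = "Darwin" ]; then
    printf '%s\n' "$kernel"
  else
    uname -o
  fi
}
```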
Signed-off-by: Alex Leong <alex@buoyant.io>
Failures in `bin/_test-run` from commands other than `go test`
aren't currently properly reported, in part because CI's bash default is
to have `set -e`, which terminates the script and just outputs
`##[error]Process completed with exit code 2.`, like
[here](https://github.com/linkerd/linkerd2/pull/4496/checks?check_run_id=720720352#step:14:116):
```
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
× no unschedulable pods
linkerd-controller-6c77c7ffb8-w8wh5: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-destination-6767d88f7f-rcnbq: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-grafana-76c76fcfb9-pdhfb: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-identity-5bcf97d6c8-q6rll: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-prometheus-6b95c56b44-hd9m6: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-proxy-injector-58d794ff9-jf7cj: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-sp-validator-6c5f999bfb-qg252: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-tap-6fdf84fc65-6txvr: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
linkerd-web-8484fbd867-nm8z2: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
see https://linkerd.io/checks/#l5d-existence-unschedulable-pods for hints
Status check results are ×
[error]Process completed with exit code 2.
```
I've made the following changes to `bin/_test-run` to generate better
messages and Github annotations when an error occurs:
- Unset `set -e` so that errors don't immediately exit the script, which
would prevent us from properly formatting them.
- Removed many of the `exit_on_err` calls after `go test` calls because
those output enough information already (they were not being used
anyway in CI because of `set -e`). Instead, `run_test` now exits
upon a `go test` error.
- Added `exit_on_err` calls right after non-`go-test` commands to
properly report their failure.
- Refactored the `exit_on_err` function so that it generates a Github
error annotation upon failure.
- Removed `trap` in `install_stable`, since the OS should be able to
handle GC for stuff under `/tmp`.
Also, I've changed the exit 2 code from `linkerd check` when it fails,
to exit code 1.
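As a rough illustration of the refactored error handling, the sketch below emits a GitHub Actions `::error::` workflow command and then exits with the failing status. It is an assumption about the general shape, not the actual `exit_on_err` from `bin/_test-run`, and the message text is made up.

```bash
# Sketch: call right after a command; if that command failed, print a
# GitHub Actions error annotation and exit with the same status.
exit_on_err() {
  local exit_code=$?
  if [ "$exit_code" -ne 0 ]; then
    printf '::error::%s\n' "$1"
    exit "$exit_code"
  fi
}
```

It would be called immediately after a non-`go test` command, for example `some_command; exit_on_err 'some_command failed'`, where `some_command` is a placeholder. This only works when `set -e` is not active, since otherwise the script would exit before the function is reached.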
Quoting the list of directories passed to `goimports` was causing the list to be interpreted as a single argument which was stopping `bin/fmt` from working.
Instead, use `read` to split the list of directories into an array.
Also fix up incorrect formatting that has crept in while `bin/fmt` has been broken.
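The difference can be sketched like this. The directory names are hypothetical, and `count_args` stands in for `goimports` so the example has no external dependency. It needs bash, since arrays and `<<<` are not POSIX sh features.

```bash
#!/usr/bin/env bash
# Hypothetical space-separated list of directories to format.
dirs_list="cli controller pkg"

# Stand-in for goimports: report how many arguments were received.
count_args() { echo "$#"; }

# Quoted, the whole list is passed as one argument.
count_args "$dirs_list"      # prints 1

# Split the list into an array, then expand it as separate arguments.
read -r -a dirs <<< "$dirs_list"
count_args "${dirs[@]}"      # prints 3
```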
Signed-off-by: Alex Leong <alex@buoyant.io>
This change adds the `allow` and `link` commands, effectively enabling a cluster to have more than one set of credentials that allow it to be mirrored.
Fix #4461
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
In #4436 `head_root_tag()` was changed to replace `sed` with a
bash-native substitution. This assumes bash is our shell, which is the
case in `bin/_tag.sh` but not in `bin/root-tag` which calls it, and
which has a `sh` shebang that in Ubuntu points to dash instead of bash,
which breaks with the new bash-native substitution. So this change makes
the bash shebang explicit in this file.
* When installation test fails, fetch logs and events
Re #4371
When a test fails in `./test/install_test.go`, trigger the `TestLogs`
and `TestEvents` tests in a separate process in order to output any
unexpected logs/events that might have caused the initial test failure.
For instance, currently we're sporadically experiencing pod restarts.
Instead of ignoring them, this might help provide us with the real
underlying cause.
The nice and clean markdownlint scripts use no bash-specific
functionality. Hence they could be run with /bin/sh instead. On e.g.
Debian-based systems /bin/sh is dash which has 1/10 of bash's footprint.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
* Run shellcheck for all shell scripts in repository
Update the shellcheck command in static_checks.yml to not only scan the
contents of ./bin, but search for all files with mimetype
text/x-shellscript and feed them to shellcheck.
Certainly, this is a tad more time-consuming than just scanning one
directory, but it is still quite fast and prevents any
new scripts from flying under the radar.
(Also, there is no need to exclude *.nuspec or *.ps1 from the find
command as they do not have the text/x-shellscript mimetype.)
Change-Id: I7433d231e8a315df65c03ee8765914e782057343
Signed-off-by: Joakim Roubert <joakimr@axis.com>
* Updates after review comment
Move shellcheck of all scripts to own script that is then called by
static_checks.yml as suggested by @kleimkuhler.
Also updated sources for helm-build and kind-load so that the
new shellcheck-all script can be called from any directory.
Change-Id: I9e82230459cb843c4143ec979c93060f424baed8
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
* Bump KinD to 0.8.1
This brings us K8s 1.18, which is in theory passing all the integration
tests. Currently the tracing one is failing just because of downtime at
quay.io, which hosts the nginx-ingress image.
Re #4382
Delete variable `os` that is not used. The golangci-lint downloader script does its own extensive platform lookup before downloading the selected binary.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
## Motivation
linkerd/rfc#22
## Solution
Use the [markdown-lint-action](https://github.com/marketplace/actions/markdown-linting-action) to lint all `.md` files for all pull requests
and pushes to master.
This action uses the default rules outlined in [markdownlint
package](https://github.com/DavidAnson/markdownlint/blob/master/doc/Rules.md).
The additional rules that are added are explained below:
- Ignore line length lints for code blocks
- Ignore line length lints for tables
- Allow duplicate sub-headers in sibling headers (e.g. allowing multiple ##
Significant headers in `CHANGES.md` as long as they are part of separate
release headers)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
## Motivation
As mentioned in the [Testing RFC](https://github.com/linkerd/rfc/blob/master/design/0003-isolated-integration-tests.md#constraints):
> The integration test setup checks require that certain conditions are
> satisfied by the given cluster. A surprising condition is that no
> pre-existing Linkerd installation resource may exist; if it does then it is
> deleted.
## Solution
`init_test_run` which runs before integration tests start will now exit the
script if any Linkerd resources exist on the cluster.
Example bad path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...
Linkerd resources exist on cluster:
pod/hello-6b6b5d644d-xrnhn
pod/hello-slow-cooker-h8xn2
pod/world-fc8f457b7-gj7wq
pod/gateway-676fd64cb9-j57k6
pod/hello-c767bf764-cbdqh
pod/hello-slow-cooker-fqmxr
pod/slow-cooker-ftxdx
pod/t1-855c678bdd-vdg96
pod/t2-76989f94d4-d5fv8
pod/t3-75c8877797-hfwgc
pod/world-6784d4f65c-cn6vl
replicaset.apps/gateway-676fd64cb9
replicaset.apps/hello-c767bf764
replicaset.apps/t1-855c678bdd
replicaset.apps/t2-76989f94d4
replicaset.apps/t3-75c8877797
replicaset.apps/world-6784d4f65c
job.batch/hello-slow-cooker
job.batch/slow-cooker
Help:
Run [/home/kevin/Projects/linkerd/linkerd2/bin/test-cleanup]
Specify a cluster context [/home/kevin/Projects/linkerd/linkerd2/bin/test-run /home/kevin/Projects/linkerd/linkerd2/target/cli/linux/linkerd [l5d-integration] [context]]
exit
```
Example good path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...[ok]
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Pass grep output through xargs.
Use `${0%/*}` instead of `$bindir` since the variable `bindir` exists in
_tag.sh too and then triggers the shellcheck variable modified warning.
Script uses no bash features and can thus be a POSIX /bin/sh script.
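For reference, `${0%/*}` removes the shortest suffix matching `/*` from the script path, which leaves the directory part. A minimal sketch, with `dir_of` as a hypothetical helper that applies the same expansion to an arbitrary path:

```bash
# Hypothetical helper: strip the last path component with a POSIX
# parameter expansion, as ${0%/*} does for the script's own path.
dir_of() {
  printf '%s\n' "${1%/*}"
}

dir_of "bin/root-tag"   # prints bin
```

If the path contains no slash, the expansion leaves it unchanged.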
Signed-off-by: Joakim Roubert <joakimr@axis.com>
shellcheck will not accept the string DO since it is not sure whether it is a misspelled do command or a string with DO. Explicitly quoting it will mitigate this.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
The SC1090 "Can't follow non-constant source" issue is addressed in the way suggested in shellcheck's documentation; the source paths are pointed out in shellcheck comments. By adding the bin dir to the -P shellcheck CLI parameter, we avoid having to state the bin directory in each and every script file.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Remove superfluous echo commands in assignments.
Add quotes.
Simplify the for loops that shellcheck didn't like.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Upgraded to Helm v3.2.1 from v2.16.1, getting rid of Tiller and making
other simplifications.
Note that the version placeholder in the `values.yaml` files had to be
changed from `{version}` to `linkerdVersionValue` because the former
confuses Helm v3.
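A sketch of how such a placeholder could be substituted at build time. The `sed` call and the `render_values` helper are assumptions for illustration; only the `linkerdVersionValue` placeholder name comes from the change described above.

```bash
# Hypothetical helper: print a values file with the version placeholder
# replaced. `|` is used as the sed delimiter; the version is assumed not
# to contain `|`, `&` or backslashes.
render_values() {
  local version="$1" values_file="$2"
  sed "s|linkerdVersionValue|${version}|g" "$values_file"
}
```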
* Increase timeout for Helm cleanup in integration tests
Tests were failing sporadically, waiting for the Helm namespace to get
cleaned up. I verified that it is getting cleaned up, but taking more
time sometimes.
* Add Linkerd CLI Chocolatey Package
This PR partially fixes #3063 by building a Chocolatey package for Linkerd2's Windows CLI.
It adds the build scripts for the Linkerd Chocolatey package and is based on discussions in
https://github.com/linkerd/linkerd2/pull/3921
Signed-off-by: Animesh Narayan Dangwal <animesh.leo@gmail.com>
Second part of #4176
Added extra Jest reporter when running js tests from CI, which will send
to stdout a GH annotation for each test failure, something like:
```
::error file=/home/alpeb/src/forks/linkerd2/web/app/js/components/Navigation.test.jsx::Navigation › checks state when versions do not match
```
See the [health
metrics RFC](https://github.com/linkerd/rfc/blob/master/design/0002-ci-health-metrics.md) for more context.
Fixes #4298
Since we started using annotated tags for releases (because they
need to be signed), `bin/root-tag` will append `^0` to them when used
after checking out a release tag. E.g.:
```
$ bin/root-tag
edge-20.4.4^0
```
which breaks version checking by the CLI.
This PR removes that trailing `^0` whenever it's present
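One way to drop the suffix is a parameter expansion, sketched below. `strip_peel_suffix` is a hypothetical name, and this is not necessarily how `bin/root-tag` implements it.

```bash
# Hypothetical helper: drop a trailing "^0" from a tag name, if present.
strip_peel_suffix() {
  printf '%s\n' "${1%^0}"
}

strip_peel_suffix 'edge-20.4.4^0'   # prints edge-20.4.4
```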
#4195 relaxed the clock skew check to match the Kubernetes 1.17 default
heartbeat interval.
This is the same issue that was preventing an update to the `kind` version
used.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Bug in `linkerd uninstall` when attempting to delete PSP
We were using the wrong apiVersion for PSP in `linkerd uninstall`'s
output, which prevented that resource from being removed:
```
$ linkerd uninstall | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller"
deleted
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination"
deleted
...
mutatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-proxy-injector-webhook-config" deleted
validatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-sp-validator-webhook-config" deleted
namespace "linkerd" deleted
error: unable to recognize "uninstall.yml": no matches for kind
"PodSecurityPolicy" in version "extensions/v1beta1"
$ kubectl get psp -oname
podsecuritypolicy.policy/linkerd-linkerd-control-plane
```
I've also replaced the uninstall integration test with a new separate
suite that performs the installation, waits for it to be ready,
uninstalls, and then confirms `linkerd check --pre` returns as expected.
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0. Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.
This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4. This keeps all of our transitive dependencies synchronized on one version of client-go.
This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code. I took this opportunity to update our code generation script to properly use the version of code-generator from `go.mod` rather than a hardcoded SHA.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Fix bin/kind-load for pull requests
Followup to #4212
External PRs were failing because:
1) The image tarballs weren't being loaded from the `images-archives`
directory
2) Concurrent calls to `bin/kind` were attempting to download the KinD
binary simultaneously, resulting in a "text file busy" error. To avoid
that, now we just call `bin/kind` synchronously one time beforehand.
Fixes #4206. Followup to #4167.
Extract common logic to load images into KinD, from `bin/kind-load`, `bin/install-pr`, `.github/workflows/kind_integration.yml` and `.github/workflows/release.yml`.
Besides removing the duplication, `bin/kind-load` will benefit in performance by having each image be loaded in parallel.
```
Load into KinD the images for Linkerd's proxy, controller, web, grafana, debug and cni-plugin.
Usage:
bin/kind-load [--images] [--images-host ssh://linkerd-docker]
Examples:
# Load images from the local docker instance
bin/kind-load
# Load images from tar files located in the current directory
bin/kind-load --images
# Retrieve images from a remote docker instance and then load them into KinD
bin/kind-load --images --images-host ssh://linkerd-docker
Available Commands:
--images: use 'kind load image-archive' to load the images from local .tar files in the current directory.
--images-host: the argument to this option is used as the remote docker instance from which images are first retrieved
(using 'docker save') to be then loaded into KinD. This command requires --images.
```
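The parallel loading mentioned above could look roughly like this. It is a sketch, not the code in `bin/kind-load`: `load_image` is a hypothetical stand-in that only prints a line where the real script would load an image into KinD.

```bash
# Hypothetical stand-in for loading a single image into KinD.
load_image() {
  echo "loaded $1"
}

# Start one background job per image, then wait for all of them and
# report failure if any job failed.
load_images_in_parallel() {
  local image pids="" pid status=0
  for image in "$@"; do
    load_image "$image" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

load_images_in_parallel proxy controller web
```

The order of the output lines is not deterministic, since the jobs run concurrently.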
Currently the release tag regex matches against arguments that have `edge` or
`stable` as a substring.
It should only match against arguments that are either `edge` or `stable`.
For example, the graceful error handling is not triggered for the following:
```
❯ bin/create-release-tag edge-20.3.3
bin/create-release-tag: line 92: release_tag: unbound variable
```
This PR fixes the regex so that the above results in graceful error handling.
```
❯ bin/create-release-tag edge-20.3.3
Error: valid release channels: edge, stable
Usage:
bin/create-release-tag edge
bin/create-release-tag stable 2.4.8
```
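The distinction between a substring match and an exact match comes down to anchoring the pattern. A minimal sketch using `grep -E`, where `is_release_channel` is a hypothetical helper rather than the script's actual code:

```bash
# Sketch: succeed only when the argument is exactly `edge` or `stable`.
# The ^ and $ anchors are what reject substrings such as `edge-20.3.3`.
is_release_channel() {
  printf '%s\n' "$1" | grep -Eq '^(edge|stable)$'
}
```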
* Upgrade golangci-lint to v1.23.8
This should help with some timeouts we're seeing in CI.
I fixed some new warnings found in `inject.go` and `uninject.go`.
Also we now have to explicitly disable linting `/controller/gen`.
The linter was also complaining that in `/pkg/k8s/fake.go` the
`spClient.Interface` and `tsclient.Interface` returned in the function
`newFakeClientSetsFromManifests()` aren't used, but I opted to ignore
that to leave them available for future tests.
## Motivation
After #4147 added the `install-pr` script, installing PRs into existing
clusters does not work if that cluster is a KinD cluster
Changing the script to be able to use KinD, and specifically automate `kind
load` would be helpful!
## Solution
The script can now be used in the following ways.
```
❯ bin/install-pr --help
Install Linkerd with the changes made in a GitHub Pull Request.
Usage:
--context: The name of the kubeconfig context to use
# Install Linkerd into the current cluster
bin/install-pr 1234
# Install Linkerd into the current KinD cluster
bin/install-pr [-k|--kind] 1234
# Install Linkerd into the 'kind-pr-1234' KinD cluster
bin/install-pr [-k|--kind] --context kind-pr-1234 1234
```
The script assumes that the cluster (KinD or not) has already been created. If
the cluster is a KinD cluster, the `-k|--kind` flag should be passed.
If the `--context` flag is not passed, the install defaults to the current
context (`kubectl config current-context`).
I also added a `[-h|--help]` option that describes how to use the script.
We use curl for fetching remote files in our `bin` scripts. Replace the use of `wget` with `curl` in `bin/shellcheck` for consistency.
Signed-off-by: Alex Leong <alex@buoyant.io>
# Install PR
This script takes a Github pull request number as an argument, downloads the
docker images from the pull request's artifacts, pushes them, and installs
them on your Kubernetes cluster. Requires a Github personal access token
in the $GITHUB_TOKEN environment variable.
Signed-off-by: Alex Leong <alex@buoyant.io>
## Motivation
Closes #4140
Automatically create new edge release tag:
```
❯ bin/create-release-tag edge
edge-20.3.2 tag created and signed.
tag: edge-20.3.2
To push tag, run:
git push origin edge-20.3.2
```
Validate new stable release tag:
```
❯ bin/create-release-tag stable 2.7.1
stable-2.7.1 tag created and signed.
tag: stable-2.7.1
To push tag, run:
git push origin stable-2.7.1
```
## Solution
The release tag script now takes a release channel argument. If the release
channel argument is `stable`, a second argument is required for the version.
If the release channel is `edge`, the script gets the current edge version and
creates a new edge version with the current year: `YY`, month: `MM`, and
increments the current month minor if it is not a new month.
If the release channel is `stable`, the script will only validate the version.
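The edge version bump can be sketched as below. It assumes the `edge-YY.M.N` tag format seen in the examples above, and that the counter restarts at 1 in a new month, which the description does not state explicitly. `next_edge_tag` is a hypothetical helper; the year and month are passed in as arguments rather than read from `date` so the logic is easy to follow.

```bash
# Hypothetical helper: compute the next edge tag from the current one.
# Arguments: current tag (e.g. edge-20.3.1), two-digit year, month.
next_edge_tag() {
  local current="${1#edge-}" yy="$2" mm="$3"
  local cur_yy cur_mm cur_minor rest minor
  cur_yy=${current%%.*}      # 20
  rest=${current#*.}         # 3.1
  cur_mm=${rest%%.*}         # 3
  cur_minor=${rest#*.}       # 1
  if [ "$cur_yy" = "$yy" ] && [ "$cur_mm" = "$mm" ]; then
    minor=$((cur_minor + 1)) # same month: increment the counter
  else
    minor=1                  # assumption: new month restarts at 1
  fi
  printf 'edge-%s.%s.%s\n' "$yy" "$mm" "$minor"
}

next_edge_tag edge-20.3.1 20 3   # prints edge-20.3.2
```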
Example error cases:
```
❯ bin/create-release-tag
Error: create-release-tag accepts 1 or 2 arguments
Usage:
create-release-tag edge
create-release-tag stable x.x.x
```
```
❯ bin/create-release-tag foo
Error: valid release channels: edge, stable
Usage:
bin/create-release-tag edge
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag edge 2.7.1
Error: accepts 1 argument
Usage:
bin/create-release-tag edge
```
```
❯ bin/create-release-tag stable
Error: accepts 2 arguments
Usage:
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag stable 2.7
Error: version reference incorrect
Usage:
bin/create-release-tag stable 2.4.8
```
```
❯ bin/create-release-tag stable 2.7.1.1
Error: version reference incorrect
Usage:
bin/create-release-tag stable 2.4.8
```
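The last two error cases suggest that a stable version must have exactly three numeric parts. A sketch of such a check, with `is_valid_stable_version` as a hypothetical helper and the exact pattern as an assumption:

```bash
# Sketch: accept only versions of the form x.y.z, where each part is a
# non-negative integer.
is_valid_stable_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}
```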
This is a followup to #4129, fixing this warning:
```
In ./bin/create-release-tag line 32:
tmp=$(. "$bindir"/_release.sh; extract_release_notes)
^-------------------^ SC2119: Use
extract_release_notes "$@" if function's
$1 should mean script's $1.
```
In order to use functions in bash that use optional arguments that don't
generate this warning, we have to disable the SC2120 check, as explained here:
https://github.com/koalaman/shellcheck/wiki/SC2120#exceptions
Extracted the logic to pull the latest release notes, out of
`bin/create-release-tag` into `bin/_release.sh` so that it can be reused
in the `release.yml` workflow, which needs to use that inside
`gh_release` when creating the github release in order to have prettier
markup release notes instead of a plaintext message pulled out of the tag
message.
The new extracted function also receives an optional argument with the
name of the file to put the release notes into, because the `body_path`
parameter in `softprops/action-gh-release` doesn't work with dynamic
vars.
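A function with an optional output-file argument could be shaped like the sketch below. It is hypothetical in several ways: it takes the changelog path as a first argument, and it assumes each release section starts with a line beginning with `## `. The real function in `bin/_release.sh` may differ.

```bash
# Hypothetical sketch: print the first `## ` section of a changelog, or
# write it to the file named by the optional second argument.
extract_release_notes() {
  local changes_file="$1" out_file="${2:-}" notes
  notes=$(awk '/^## /{n++} n>=2{exit} n==1{print}' "$changes_file")
  if [ -n "$out_file" ]; then
    printf '%s\n' "$notes" > "$out_file"
  else
    printf '%s\n' "$notes"
  fi
}
```

Writing to a file gives the workflow a fixed path to pass on, instead of a value computed at run time.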
Finally, the `website_publish` job will now only launch after the `gh_release`
job has succeeded.
PR #4117 was root-caused with the help of `shellcheck`.
This change introduces a `bin/shellcheck` script, and adds it to CI. In
CI, many checks are disabled to allow it to pass. This will at least
prevent introduction of new classes of shell issue, and should motivate
re-enabling more checks over time.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`bin/fetch-proxy` was failing on Linux:
```bash
$ bin/fetch-proxy
linkerd2-proxy-v2.87.0/
linkerd2-proxy-v2.87.0/LICENSE
linkerd2-proxy-v2.87.0/bin/
linkerd2-proxy-v2.87.0/bin/linkerd2-proxy
bin/fetch-proxy: 31: [: Linux: unexpected operator
/home/siggy/code/linkerd2/target/proxy/linkerd2-proxy-v2.87.0
```
Also in CI:
https://github.com/linkerd/linkerd2/runs/473746447?check_suite_focus=true#step:5:32
Unfortunately `bin/fetch-proxy` still returned a zero exit status, because
`set -e` does not apply to commands that are part of `if` statements.
From https://ss64.com/bash/set.html:
```
-e Exit immediately if a simple command exits with a non-zero status, unless
the command that fails is part of an until or while loop, part of an
if statement, part of a && or || list, or if the command's return status
is being inverted using !. -o errexit
```
Fortunately when the `if` command failed, it fell through to the `else` clause
for Linux, and copied `linkerd-proxy` successfully.
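The behavior can be reproduced in isolation. In this sketch, `false` stands in for the failing `[` test; because it is the condition of an `if`, `set -e` does not stop the script and the `else` branch runs. `errexit_demo` is a hypothetical name.

```bash
# Runs in a subshell so that `set -e` does not leak into the caller.
errexit_demo() (
  set -e
  if false; then
    echo "then branch"
  else
    echo "else branch"
  fi
  echo "still running"
)

errexit_demo
```

This prints `else branch` followed by `still running`.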
Root cause was a `==` instead of `=`. `shellcheck` confirms, and also
recommends quoting:
```bash
$ shellcheck bin/fetch-proxy
In bin/fetch-proxy line 31:
if [ $(uname) == "Darwin" ]; then
^-- SC2046: Quote this to prevent word splitting.
^-- SC2039: In POSIX sh, == in place of = is undefined.
```
Apply `shellcheck` recommendations.
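Applied to a comparison like the one on line 31, the POSIX-compatible form quotes the command substitution and uses a single `=`. A minimal sketch, with `os_is` as a hypothetical helper name:

```bash
# Hypothetical helper: compare the kernel name against the argument
# using `=` and a quoted command substitution, both valid in POSIX sh.
os_is() {
  [ "$(uname)" = "$1" ]
}

if os_is "Darwin"; then
  echo "running on macOS"
else
  echo "not running on macOS"
fi
```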
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This adds a message after running the `create-release-tag` script that I intended to
add as part of the initial PR. Example output:
```
❯ bin/create-release-tag $TAG tag created and signed.
tag: edge-93.1.1
To push tag, run:
git push origin edge-93.1.1
```
## Motivation
Creating a release tag is a manual process that is prone to error by the
release responsible member.
Additionally, the automated release project will require that a message is
included that is a copy of the recent `CHANGES.md` changes.
These steps can be scripted so that the member will just need to run a release
script.
## Solution
A `bin/create-release-tag` script will:
- Take a `$TAG` argument (maybe can remove this in the future) to use as the
tag name
- Pull out the top section of `CHANGES.md` to use as the commit message
- Create a tag with `$TAG` name and release changes as the message
## Example
```
$ TAG="edge-20.2.3"
$ bin/create-release-tag $TAG
$ git push origin $TAG
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Have the preliminary setup for the Helm integration tests use
`bin/helm-build` instead of directly calling `helm dependency update`.
This allows testing `bin/helm-build` itself, and also lints the linkerd2
and linkerd2-cni charts (the latter lint call is being added as well in this
PR).
* bin/helm-build automatically updates version in values.yaml
Have the Helm charts building script (`bin/helm-build`) update the
linkerd version in the `values.yaml` files according to the tagged
version, thus removing the need of doing this manually on every release.
This is akin to the update we do in `version.go` at CLI build time.
Note that `shellcheck` is issuing some warnings about this script, but
that's on code that was already there, so that will be handled in a
followup PR.
* Allow CI to run concurrent builds in master
Fixes #3911
Refactors the `cloud_integration` test to run in separate GKE clusters
that are created and torn down on the fly.
It leverages a new "gcloud" github action that is also used to set up
gcloud in other build steps (`docker_deploy` and `chart_deploy`).
The action also generates unique names for those clusters, based on the
git commit SHA and `run_id`, a recently introduced variable that is
unique per CI run and available to all the jobs.
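One way to derive such a name, as a sketch only: the `testing-` prefix, the 7-character SHA abbreviation and the `cluster_name` helper are all assumptions, not what the action actually does.

```bash
# Hypothetical helper: build a cluster name from a commit SHA and a CI
# run id so that concurrent runs on the same SHA do not collide.
cluster_name() {
  local sha="$1" run_id="$2" short_sha
  short_sha=$(printf '%s' "$sha" | cut -c1-7)
  printf 'testing-%s-%s\n' "$short_sha" "$run_id"
}
```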
This fixes part of #3635 in that CI runs on the same SHA don't interfere
with one another (in the `cloud_integration` test; still to do for
`kind_integration`).
The "gcloud" GH action is hosted under its own repo in https://github.com/linkerd/linkerd2-action-gcloud
In light of the breaking changes we are introducing to the Helm chart and the convoluted upgrade process (see linkerd/website#647) an integration test can be quite helpful. This simply installs latest stable through helm install and then upgrades to the current head of the branch.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Upgrade `kind` to v0.6.1
Fixes #3852
Upgraded `/bin/kind` to pull v0.6.1.
Also have `workflow.yml` use `KUBECONFIG` explicitly for setting the
location of the config file, now that `kind get kubeconfig-path` has
been deprecated (check
https://github.com/kubernetes-sigs/kind/releases/tag/v0.6.0 for detailed
info).
Note that in the build server the kind binary for this version is
`kind-0.6.1`, leaving the `kind` binary still pointing to v0.5.1 while
this gets merged and all the PR branches get this.
Fixes #3801
This will package and build the `linkerd2-cni` chart from the
`charts/linkerd2-cni` directory and update our Helm Hub's `index.yaml`
file to index it.
This will only be run in the `chart_deploy` job of our Github Actions
when an edge/stable tag is pushed.
Once that happens, users will be able to install the chart with a
command like:
```
helm install linkerd-edge/linkerd2-cni
```
Docs update will follow.
* Inject preStop hook into the proxy sidecar container to stop it last
This commit adds support for a Graceful Shutdown technique that is used
by some Kubernetes administrators while the more perspective
configuration is being discussed in
https://github.com/kubernetes/kubernetes/issues/65502
The problem is that the RollingUpdate strategy does not guarantee that all
traffic will be sent to a new pod _before_ the previous pod is removed.
Internally, Kubernetes is an event-driven system, and when a pod is
terminating, several processes can receive the event simultaneously.
If an Ingress Controller gets the event too late or processes it
more slowly than Kubernetes removes the pod from its Service, user requests
will continue flowing into a black hole.
According to [the documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods):
> 1. If one of the Pod’s containers has defined a `preStop` hook,
> it is invoked inside of the container. If the `preStop` hook is still
> running after the grace period expires, step 2 is then invoked with
> a small (2 second) extended grace period.
>
> 2. The container is sent the `TERM` signal. Note that not all
> containers in the Pod will receive the `TERM` signal at the same time
> and may each require a preStop hook if the order in which
> they shut down matters.
This commit adds support for the `preStop` hook that can be configured
in three forms:
1. As command line argument `--wait-before-exit-seconds` for
`linkerd inject` command.
2. As `linkerd2` Helm chart value `Proxy.WaitBeforeExitSeconds`.
3. As `config.alpha.linkerd.io/wait-before-exit-seconds` annotation.
If configured, it will add the following `preStop` hook to the proxy container
definition:
```yaml
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep {{.Values.Proxy.WaitBeforeExitSeconds}}
```
To get the most benefit from the option, the main container should have
its own `preStop` hook with a `sleep` command whose period is shorter
than the one set for the proxy sidecar. Neither of them may be longer
than the `terminationGracePeriodSeconds` configured for the
entire pod.
An example of a rendered Kubernetes resource where
`.Values.Proxy.WaitBeforeExitSeconds` is equal to `40`:
```yaml
# application container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 20
# linkerd-proxy container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 40
terminationGracePeriodSeconds: 160 # for entire pod
```
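The ordering constraint between the two sleeps and the pod's grace period can be expressed as a small check. This is only a sketch: `check_prestop_order` is a hypothetical helper, not something Linkerd provides, and the values in the usage line are taken from the example above.

```bash
# Hypothetical check: the application sleep must be shorter than the
# proxy sleep, and the proxy sleep must fit in the pod's grace period.
check_prestop_order() {
  local app_sleep="$1" proxy_sleep="$2" grace_period="$3"
  if [ "$app_sleep" -ge "$proxy_sleep" ]; then
    echo "application sleep must be shorter than proxy sleep" >&2
    return 1
  fi
  if [ "$proxy_sleep" -gt "$grace_period" ]; then
    echo "proxy sleep exceeds terminationGracePeriodSeconds" >&2
    return 1
  fi
}

check_prestop_order 20 40 160 && echo "ok"
```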
Fixes #3747
Signed-off-by: Eugene Glotov <kivagant@gmail.com>
* Enable cert rotation test to work with dynamic namespaces
This PR adds support for dynamic cert generation when running the cert rotation integration tests. This avoids baking the namespace into the certificate CN, thereby allowing us to run these tests on the clouds.
The tests in #3775 were failing because the second secret holding the issuer cert replacement was a leaf cert and not a root/intermediary cert capable of signing the CSRs. This is what the replacement cert looked like:
```bash
$ k -n l5d-integration-external-issuer get secrets linkerd-identity-issuer-new -ojson | jq '.data|.["tls.crt"]' | tr -d '"' | base64 -d | step certificate inspect -
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 2 (0x2)
Signature Algorithm: ECDSA-SHA256
Issuer: CN=identity.l5d-integration-external-issuer.cluster.local
Validity
Not Before: Dec 6 19:16:08 2019 UTC
Not After : Dec 5 19:16:28 2020 UTC
Subject: CN=identity.l5d-integration-external-issuer.cluster.local
Subject Public Key Info:
Public Key Algorithm: ECDSA
Public-Key: (256 bit)
X:
93:d5:fa:f8:d1:44:4f:9a:8c:aa:0c:9e:4f:98:a3:
8d:28:d9:cc:f2:74:4c:5f:76:14:52:47:b9:fb:c9:
a3:33
Y:
d2:04:74:95:2e:b4:78:28:94:8a:90:b2:fb:66:1b:
e7:60:e5:02:48:d2:02:0e:4d:9e:4f:6f:e9:0a:d9:
22:78
Curve: P-256
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Subject Alternative Name:
DNS:identity.l5d-integration-external-issuer.cluster.local
Signature Algorithm: ECDSA-SHA256
30:46:02:21:00:f6:93:2f:10:ba:eb:be:bf:77:1a:2d:68:e6:
04:17:a4:b4:2a:05:80:f7:c5:f7:37:82:7b:b7:9c:a1:66:6a:
e1:02:21:00:b3:65:06:37:49:06:1e:13:98:7c:cf:f9:71:ce:
5a:55:de:f6:1b:83:85:b0:a8:88:b7:cf:21:d1:16:f2:10:f9
```
For it to be a root/intermediate cert it should have had `CA:TRUE` under the `X509v3 extensions` section.
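A quick way to check for that marker is to grep the inspected certificate text. The sketch below assumes the inspection output is piped in on stdin; `is_ca_cert_text` is a hypothetical helper name, not something the test suite provides.

```sh
# is_ca_cert_text: reads human-readable certificate text on stdin
# (for example from `openssl x509 -noout -text`) and succeeds only if it
# contains the CA:TRUE basic constraint.
is_ca_cert_text() {
    grep -q 'CA:TRUE'
}
```

For example, `openssl x509 -in issuer.crt -noout -text | is_ca_cert_text` would fail for the leaf cert shown above, since its extensions list no `CA:TRUE` entry (`issuer.crt` is a placeholder file name).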
Why did the test pass sometimes? When it did pass for me, I could see in the linkerd-identity proxy logs something like:
```
ERR! [ 320.964592s] linkerd2_proxy_identity::certify Received invalid ceritficate: invalid certificate: UnknownIssuer
```
So the cert retrieved from identity was still invalid, but for some reason the proxy sometimes keeps going despite that. And when one deleted the linkerd-identity pod, its proxy wouldn't come up at all, also showing that error.
With the changes from this branch, we no longer see that error in the logs and after deleting the linkerd-identity pod it comes back gracefully.
* Enable cert rotation test to work with dynamic namespaces
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Address comments
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Address further comments
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Fix whitespace path handling in non-docker (build) scripts
Handling of whitespace paths was not fully implemented; this patch adds
the missing pieces. Also, only use bash where bash-specific
functionality is used/needed.
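As a reminder of why the quoting matters, here is a minimal POSIX sh sketch (the path and the `count_args` helper are made up for illustration) showing how an unquoted expansion splits a path that contains a space:

```sh
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

# Hypothetical path containing a space.
path='/tmp/my project/bin'

# Unquoted: the shell splits the value on whitespace into two words.
unquoted=$(count_args $path)

# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$path")
```

With the default `IFS`, `unquoted` ends up as `2` and `quoted` as `1`, which is why every path expansion in the scripts needs double quotes.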
Signed-off-by: Joakim Roubert <joakimr@axis.com>
- Added cleanup step at the end of all integration tests.
- Disable external_issuer_integration_tests in cloud_tests due to
namespace issue. Running this via `kind` tests is sufficient for now.
- Set a flakey test to `Skip`, relates to #3332.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes #3566
As explained in #3566, as of Go 1.13 there's a strict check that ensures a dependency's timestamp matches its sha (as declared in go.mod). Our smi-sdk dependency has a problem with that, which got resolved later on, but more work would be required to upgrade that dependency. In the meantime, a quick pair of replace statements at the bottom of go.mod fixes the issue.
This patch sends the proxy settings to docker build if present.
Without this, the docker build will fail on apt-get update on a
system that is behind a proxy.
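One way to forward the settings is to turn each proxy variable that is set into a `--build-arg` pair. The sketch below is an assumption about the approach, not the patch itself; `proxy_build_args` is a hypothetical helper name.

```sh
# proxy_build_args: prints "--build-arg" and "NAME=VALUE" on separate lines
# for every proxy variable that is set and non-empty in the environment.
proxy_build_args() {
    for name in http_proxy https_proxy no_proxy; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s\n%s=%s\n' --build-arg "$name" "$value"
        fi
    done
}
```

A build script could then run something like `docker build $(proxy_build_args) .`, relying on word splitting, which is acceptable here only because proxy URLs do not normally contain whitespace.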
Change-Id: I3fcbad4d9a9c30e5f0a00f03c6d8629ed8cc97b0
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Handling of whitespace paths was not fully implemented; this patch adds
the missing pieces. Also, only use bash where bash-specific
functionality is used/needed.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
* fetch-proxy: Make POSIX compatible
* fetch-proxy: Update old comment to match current behavior
Getting the directory where the script resides can easily be done
without bash-specific functionality, and hence the script can be POSIX
compatible.
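A minimal sketch of the POSIX approach, assuming the script path is available as `$0` (the `script_dir` function name is hypothetical and used only for illustration):

```sh
# script_dir PATH: prints the absolute directory containing PATH using only
# POSIX features (no BASH_SOURCE, no readlink -f).
script_dir() {
    case $1 in
        */*) dir=${1%/*} ;;   # strip the final path component
        *)   dir=. ;;         # no slash: the script is in the current directory
    esac
    # ${dir:-/} covers paths like "/script", where stripping leaves "".
    # CDPATH is cleared so cd resolves relative to the current directory only.
    (CDPATH= cd -- "${dir:-/}" && pwd)
}
```

Inside a script this would typically be used as `bindir=$(script_dir "$0")`; the quoting keeps it working for paths that contain whitespace.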
Change-Id: I30bd69dccbc950bdce3dc5da4bea279305a7b1f9
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Getting the directory where the script resides can easily be done
without bash-specific functionality, and hence the script can be POSIX
compatible. Also add the missing pieces for handling paths with
whitespace.
Change-Id: Ie2e867929be0322e476342438d9cf4a3d36f58f1
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Each proxy release tag now includes a message.
This change updates the git-commit-proxy-version script to include this
message in the commit message in this repo.
* Keep old releases in Helm repo index
When building the Helm repo index file, keep the references to the old
releases. Also rename and keep the old index file in case
something goes wrong when generating the new one.
Fixes #3561
CI currently enforces formatting rules by using the fmt linter of golangci-lint, which is invoked from the bin/lint script. However, it doesn't seem possible to use golangci-lint as a formatter, only as a linter that checks formatting. This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golangci-lint, depending on which formatter you use and which specific revision of that formatter you use.
In this change we stop using golangci-lint for format checking. We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files. This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project. We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.
Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor directory as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected
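The second workaround boils down to treating any output from the listing command as a failure. A rough sketch of that idea follows; `fail_if_output` is a hypothetical name, and the real `bin/fmt` script may be structured differently.

```sh
# fail_if_output CMD [ARGS...]: runs a command that lists offending files
# and converts non-empty output into a non-zero exit status.
fail_if_output() {
    out=$("$@") || return 1
    if [ -n "$out" ]; then
        printf 'files need formatting:\n%s\n' "$out" >&2
        return 1
    fi
}
```

In CI this could wrap `goimports -l` over an explicit list of non-vendored, non-generated paths, for example `fail_if_output goimports -l cli controller` (the directory names here are placeholders).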
Signed-off-by: Alex Leong <alex@buoyant.io>
* Clean username before using as docker image tag
* Allow Alphanumerics instead of just alphabets in docker image tag
Incorporate Alex's suggestions
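A minimal sketch of the cleaning step, assuming the goal is to drop every character that is not alphanumeric (`clean_username` is a hypothetical name; the actual script may allow more characters):

```sh
# clean_username NAME: prints NAME with every non-alphanumeric character
# removed, so the result is safe to embed in a docker image tag.
clean_username() {
    printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:]'
}
```

For instance, `clean_username 'j.doe_42@corp'` prints `jdoe42corp`.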
Fixes #3570
Signed-off-by: Saurav Tiwary <srv.twry@gmail.com>
Followup to #2990, which refactored `linkerd endpoints` to use the
`Destination.Get` API instead of the `Discovery.Endpoints` API, leaving
the Discovery with no implemented methods. This PR removes all the Discovery
code leftovers.
Fixes #3499
## Summary
[kind](https://github.com/kubernetes-sigs/kind) has been a helpful tool for running local Kubernetes clusters and
testing linkerd builds. Once images are built with `bin/docker-build`, the
images must be loaded into the kind cluster.
This script should be run after `bin/docker-build` and will load the images into
the specified kind cluster.
Example:
```
$ bin/docker-build
$ kind get clusters # show available clusters to load images on to
kleimkuhler
$ bin/kind-load kleimkuhler
$ ./target/cli/linux/linkerd install | kubectl apply -f -
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Have CI push the Helm artifacts into GCS
- Added missing OWNERS and README files
- Added maintainers section to Chart.yaml
- Changed NOTES.txt so it points to the installation of the CLI
- Set the proxy-init version to v1.1.0 in values.yaml
Ref #3256
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Last changes before submitting to the Helm incubator
- Added missing OWNERS and README files
- Added maintainers section to Chart.yaml
- Changed NOTES.txt so it points to the installation of the CLI
- Set the proxy-init version to v1.1.0 in values.yaml
- Added missing ProfileValidator vars, and add 'do not edit' comment to the Identity.Issuer.CrtExpiryAnnotation value
- Added new self-hosted repo
- Added option to bin/helm-build
- Added DisableHeartBeat to README
Ref #3256
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>