302 Commits

Author SHA1 Message Date
Alex Leong d22dda0917
Fix integration tests to run better on ARM (#5101)
In order for the integration tests to run successfully on a dedicated ARM cluster, two small changes are necessary:

* We need to skip the multicluster test since this test uses two separate clusters (source and target)
* We need to properly uninstall the multicluster helm chart during cleanup.

With these changes, I was able to successfully run the integration tests on a dedicated ARM cluster.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-21 09:27:56 -07:00
Zahari Dichev 60d8f34095
avoid waiting when creating calico cluster with kind (#5064)
Currently the --wait flag times out when creating a calico cluster. The result is that we end up waiting for 5 minutes to simply emit a warning and continue. Instead we can check the readiness of some k8s components to ensure our cluster is up and running and avoid the delay.
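
As a rough illustration of polling for readiness instead of relying on `--wait`, the sketch below retries a check command a bounded number of times. The `wait_until` helper and the `kubectl get --raw /readyz` check in the comment are assumptions made for this example, not the code from this change.

```bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # No need to sleep after the final failed attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Possible call site (assumes kubectl points at the new cluster):
# wait_until 30 2 kubectl get --raw /readyz
```

Bounding the number of attempts keeps a broken cluster from stalling the run for the full five minutes.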

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-10-12 18:26:00 +03:00
Oliver Gould 5f694513bd
bin/tests: Improve argument parsing (#5060)
The `bin/tests` script takes command-line arguments, but it requires
that all arguments are specified before the linkerd binary path; and it
silently ignores flags that follow the linkerd binary. Furthermore,
unexpected flags may be incorrectly parsed as the linkerd binary path.

This changes argument parsing to be more flexible about ordering; and it
prints the full usage error when unexpected flags are encountered.
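
A minimal sketch of order-independent parsing follows. The `parse_args` function, the global variable names, and the restriction to the `--name` flag are assumptions for illustration; the real `bin/tests` script may be structured differently.

```bash
# Hypothetical sketch: accept flags before or after the linkerd binary path.
# Sets the globals test_name and linkerd_path; returns 64 on a usage error.
parse_args() {
  test_name=''
  linkerd_path=''
  while [ $# -gt 0 ]; do
    case "$1" in
      --name)
        # The flag needs a value
        if [ $# -lt 2 ]; then
          echo 'Error: --name requires an argument' >&2
          return 64
        fi
        test_name=$2
        shift 2
        ;;
      -*)
        # Unknown flags are rejected instead of being taken as the path
        echo "Error: unexpected flag: $1" >&2
        return 64
        ;;
      *)
        if [ -n "$linkerd_path" ]; then
          echo "Error: unexpected argument: $1" >&2
          return 64
        fi
        linkerd_path=$1
        shift
        ;;
    esac
  done
  if [ -z "$linkerd_path" ]; then
    echo 'Error: path to linkerd binary is required' >&2
    return 64
  fi
}
```

With this shape, `parse_args /path/to/linkerd --name upgrade` and `parse_args --name upgrade /path/to/linkerd` yield the same result.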
2020-10-09 07:27:22 -07:00
Alejandro Pedraza e1772ae183
Fixed releases.yaml by pulling images directly from ghcr.io (#5035)
Previously, `releases.yaml` was trying to load images into the kind
clusters. That failed because those images were already in `ghcr.io`
and not in the local docker cache, but the failure was masked.
Unmasking that failure revealed some flaws that this change addresses:

- In `bin/_test-helpers.sh` (used by `bin/tests`), modified the `images`
arg to accept `docker(default)|archive|skip`, for determining how to
load the images into the cluster (if loading them at all)
- In `bin/image-load`, changed arg `images` to `archive` which is more
descriptive.
- Have `kind_integration.yml` call `bin/tests --images archive`.
- Have `release.yml` call `bin/tests --images skip`.
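
The three accepted values can be validated with a small `case` statement. This is only a sketch: the `images_mode` helper is a hypothetical name, not a function from the scripts above.

```bash
# Hypothetical helper: decide how images reach the cluster.
# Accepts docker (the default), archive or skip; anything else is an error.
images_mode() {
  local mode=${1:-docker}
  case "$mode" in
    docker|archive|skip)
      echo "$mode"
      ;;
    *)
      echo "Error: --images must be one of docker, archive or skip" >&2
      return 64
      ;;
  esac
}
```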
2020-10-02 08:05:17 -05:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Alejandro Pedraza e8f0724a71
Multicluster integration test (#4998)
This implements the run_multicluster_test() function in bin/_test-helpers.sh.

The idea is to create two clusters (source and target) using k3d, with linkerd and multicluster support in both, plus emojivoto (without vote-bot) in target, and vote-bot in source.
We then link the clusters and make sure traffic is flowing.

Detailed sequence:

1. Create certificates.
2. Install linkerd along with multicluster support in the target cluster.
3. Run the target1 test: install emojivoto in the target cluster (without vote-bot).
4. Run `linkerd mc link` on the target cluster.
5. Install linkerd along with multicluster support in the source cluster.
6. Apply the link resource in the source cluster.
7. Run the source test: check that `linkerd mc gateways` returns the target cluster link, and install only emojivoto's vote-bot in the source cluster. Note that vote-bot's yaml defines the web-svc service as web-svc-target.emojivoto:80.
8. Run the target2 test: make sure web-svc in the target cluster is receiving requests.
2020-09-26 05:26:23 -05:00
Alejandro Pedraza b50ae6290d
Add support for k3d in integration tests (#4994)
* Add support for k3d in integration tests

KinD doesn't support setting LoadBalancer services out of the box. It can be added with some additional work, but it seems the solutions are not cross-platform.

K3d on the other hand facilitates this, so we'll be using k3d clusters for the multicluster integration test.

The current change lays the groundwork by generalizing some of the integration test operations that were hard-coded to KinD.

- Added `bin/k3d` to wrap the setup and running of a pinned version of `k3d`.
- Refactored `bin/_test-helpers.sh` to account for tests to be run in either KinD or k3d.
- Renamed `bin/kind-load` to `bin/image-load` and made it more generic to load images for both KinD (default) and k3d. Also got rid of the no longer used `--images-host` option.
- Added a placeholder for the new `multicluster` test in the lists in `bin/_test-helpers.sh`. It starts by setting up two k3d clusters.

* Refactor handling of the `--multicluster` flag in integration tests (#4995)

Followup to #4994, based off of that branch (`alpeb/k3d-tests`).
This is more preliminary work ahead of the more complete multicluster integration test.

- Removed the `--multicluster` flag from all the tests we had in `bin/_test-helpers.sh`, so only the new "multicluster" integration test will make use of that. Also got rid of the `TestUninstallMulticluster()` test in `install_test.go` to keep the multicluster stuff around, needed for the more complete multicluster test that will be implemented in a followup PR.
- Added "multicluster" to the list of tests in the `kind_integration.yml` workflow.
- For now, this new "multicluster" test in `run_multicluster_test()` is just running the install tests (`test/integration/install_test.go`) with the `--multicluster` flag.

Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-09-25 16:33:17 -05:00
Alejandro Pedraza 0f869f2e50
Ability for int tests to use external certs generated with openssl (#4997)
Adds bin/certs-openssl, which creates self-signed root cert/key and issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl and to avoid having to install step.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
2020-09-25 11:25:29 -05:00
Tarun Pothulapati 3d900ccc19
Integration test for smi-metrics (#4844)
* Integration test for smi-metrics

This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.

Currently, we store the SMI Helm package locally and run the test against it so
that our CI does not break. We will periodically update the package
based on newer releases of SMI-Metrics.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 22:49:20 +05:30
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`

This is different from the CNI integration tests that we run in
cloud integration, which perform the CNI-level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
Alejandro Pedraza 51100606ca
Delete multicluster resources in `bin/test-cleanup` (#4983)
When a test fails partway through the
`./tests/integration/install_test.go` suite, multicluster resources can
be left over, which `./bin/test-cleanup` wasn't removing.

This was affecting the ARM integration tests, which require good cleanup
since they use a non-transient cluster.
2020-09-18 07:38:46 -05:00
Alejandro Pedraza ccf027c051
Push docker images to ghcr.io instead of gcr.io (#4953)
* Push docker images to ghcr.io instead of gcr.io

The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.

Note that besides the changes to cloud_integration.yml and release.yml, the upgrade-stable integration test now runs `linkerd upgrade --addon-overwrite` to reset the add-on settings, because in stable-2.8.1 the Grafana image was pegged to gcr.io/linkerd-io/grafana in linkerd-config-addons. This will need to be mentioned in the 2.9 upgrade notes.

Also the egress integration test has a debug container that now is pegged to the edge-20.9.2 tag.

Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
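
To illustrate the effect of that substitution on a single image reference, here is a small sketch. The `rewrite_registry` wrapper and the sample image name are made up for the example; only the `sed` expression comes from the text above.

```bash
# Rewrite an image reference from the old registry path to the new one.
# The sed expression is the one quoted in the commit message.
rewrite_registry() {
  sed 's/gcr.io\/linkerd-io/ghcr.io\/linkerd/'
}

# Illustrative input only
echo 'gcr.io/linkerd-io/proxy:edge-20.9.2' | rewrite_registry
# prints ghcr.io/linkerd/proxy:edge-20.9.2
```

References that already point at `ghcr.io/linkerd` pass through unchanged, because they do not contain the `linkerd-io` path segment.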
2020-09-10 15:16:24 -05:00
Alejandro Pedraza 9bf34ebc4e
Fixed helm cleanup in `./bin/test-cleanup` (#4944)
`./bin/test-cleanup` was trying to remove the
resources with the label `linkerd.io/is-test-helm` which we're not
using. Instead, we simply call `helm delete` on the appropriate helm
releases.

This is required for a clean cleanup after the ARM integration test, whose
cluster is just cleaned by this script at the end and is not torn down.
2020-09-08 12:20:14 -05:00
Ali Ariff 5186383c81
Add ARM64 Integration Test (#4897)
* Add ARM64 Integration Test

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-28 10:38:40 -07:00
Zahari Dichev 2e7c00aa37
Diff generated code from proto files (#4863)
Add a static check that ensures the generated files from the proto definitions have not changed. 

Fix #4669

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-08-18 11:44:33 +03:00
Ali Ariff 492b0fe093
Created ./bin/_os.sh lib for os-arch detection (#4880)
And refactored `./bin/linkerd` and `./bin/build-cli-bin` to use it.
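
A sketch of what such OS and architecture detection could look like is shown below. The function names and the exact set of recognized values are assumptions; the actual `./bin/_os.sh` may differ.

```bash
# Hypothetical sketch of OS and architecture detection.
# Both functions take the relevant `uname` output as an argument so that
# every branch can be exercised without running on that platform.
os_name() {
  case "$1" in
    Darwin) echo darwin ;;
    Linux) echo linux ;;
    MINGW*|MSYS*|CYGWIN*) echo windows ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

arch_name() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv*) echo arm ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Typical call site:
# os=$(os_name "$(uname -s)")
# arch=$(arch_name "$(uname -m)")
```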

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-14 09:59:52 -05:00
Kevin Leimkuhler c0826dcedc
Choose the right architecture in bin/linkerd script (#4867)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-08-13 11:03:08 -07:00
Ali Ariff 66d2c6b74b
Fix build-cli-bin (#4876)
Fix `bin/build-cli-bin` on Linux to put the binary in the correctly named architecture directory.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-13 09:21:14 -07:00
Ali Ariff ae8bb0e26e
Release ARM CLI artifacts (#4841)
* When releasing, build and upload the amd64, arm64 and arm architectures builds for the CLI
* Refactored `Dockerfile-bin` so it has separate stages for single and multi arch builds. The latter stage is only used for releases.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-11 09:25:58 -05:00
Ali Ariff 61d7dedd98
Build ARM docker images (#4794)
Build ARM docker images in the release workflow.

# Changes:
- Add new env keys `DOCKER_MULTIARCH` and `DOCKER_PUSH`. When set, the workflow builds multi-arch images and pushes them to the registry. See https://github.com/docker/buildx/issues/59 for why they must be pushed to the registry.
- Usage of `crazy-max/ghaction-docker-buildx` is necessary because it is already configured to perform cross-compilation (using QEMU), so we can use it instead of setting that up manually.
- Usage of `buildx` makes the default global platform arguments available. (See: https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope)

# Follow-up:
- Releasing the CLI binary for the ARM architecture. The docker images resulting from these changes are already built for ARM. Still, we need further adjustments, such as how to retrieve those binaries and name them correctly as part of the GitHub Release artifacts.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-05 11:14:01 -07:00
Alejandro Pedraza a1be60aea1
Reenable `upgrade-edge` integration test (#4821)
Followup to #4797

That test was temporarily disabled until the prometheus check in
`linkerd check` got fixed in #4797 and made it into edge-20.7.5
2020-07-31 12:11:32 -05:00
Carol A. Scott eec8905660
Add i18n library to Linkerd dashboard (#4803)
This PR adds the LinguiJS project to the Linkerd dashboard for i18n and
translation. It is a precursor to adding translations to the dashboard. Only
two components have been translated in this PR, to allow reviewers to evaluate
the ease of use. A second PR will add translations for the remaining components.
2020-07-30 09:09:59 -07:00
Alejandro Pedraza 2aea2221ed
Fixed `linkerd check` not finding Prometheus (#4797)
* Fixed `linkerd check` not finding Prometheus

## The Problem

`linkerd check` run right after install is failing because it can't find the Prometheus Pod.

## The Cause

The "control plane pods are ready" check used to verify the existence of all the control plane pods, blocking until all the pods were ready.

Since #4724, Prometheus is no longer included in that check because it's checked separately as an add-on. An unintended consequence is that when the ensuing "control plane self-check" is triggered, Prometheus might not be ready yet and the check fails because it doesn't do retries.

## The Fix

The "control plane self-check" uses a gRPC call (it's the only check that does that) and those weren't designed with retries in mind.

This PR adds retry functionality to the `runCheckRPC()` function, making sure the final output remains the same.

It also temporarily disables the `upgrade-edge` integration test because after installing edge-20.7.4 `linkerd check` will fail because of this.
2020-07-27 11:54:03 -05:00
Ali Ariff 05439d0dc4
CI: Remove Base image (#4782)
Removed the dependency on the base image, and instead install the needed packages in the Dockerfiles for debug and CNI.
Also removed some obsolete info from BUILD.md

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-07-23 17:00:12 -05:00
Alex Leong d540e16c8b
Make service mirror controller per target cluster (#4710)
This PR moves the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31. For fuller context, please see that RFC.

Basic multicluster functionality works here including:
* `linkerd mc install` installs the Link CRD but not any service mirror controllers
* `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link
* The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints.
* The `linkerd mc gateways` command lists all linked target clusters, their liveness, and probe latencies.
* The `linkerd check` multicluster checks have been updated for the new architecture.  Several checks have been rendered obsolete by the new architecture and have been removed.

The following are known issues requiring further work:
* The service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror. It does not yet support configuring a label selector.
* An unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707
* An mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-07-23 14:32:50 -07:00
Alejandro Pedraza 5e789ba152
Migrate CI to docker buildx and other improvements (#4765)
* Migrate CI to docker buildx and other improvements

## Motivation
- Improve build times in forks, especially when rerunning builds because of some flaky test.
- Start using `docker buildx` to pave the way for multiplatform builds.

## Performance improvements
These timings were taken for the `kind_integration.yml` workflow when we merged and reran the lodash bump PR (#4762).

Before these improvements:
- when merging: `24:18`
- when rerunning after merge (docker cache warm): `19:00`
- when running the same changes in a fork (no docker cache): `32:15`

After these improvements:
- when merging: `25:38`
- when rerunning after merge (docker cache warm): `19:25`
- when running the same changes in a fork (docker cache warm): `19:25`

As explained below, non-forks and forks now use the same cache, so the important takeaway is that forks will always start with a warm cache and we'll no longer see long build times like the `32:15` above.
The downside is a slight increase in the build times for non-forks (up to a little more than a minute, depending on the case).

## Build containers in parallel
The `docker_build` job in the `kind_integration.yml`, `cloud_integration.yml` and `release.yml` workflows relied on running `bin/docker-build` which builds all the containers in sequence. Now each container is built in parallel using a matrix strategy.

## New caching strategy
CI now uses `docker buildx` for building the container images, which allows using an external cache source for builds, a location in the filesystem in this case. That location gets cached using actions/cache, using the key `${{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}` and the restore key `${{ runner.os }}-buildx-${{ matrix.target }}-`.

For example when building the `web` container, its image and all the intermediary layers get cached under the key `Linux-buildx-web-git-abc0123`. When that has been cached in the `main` branch, that cache will be available to all the child branches, including forks. If a new branch in a fork asks for a key like `Linux-buildx-web-git-def456`, the key won't be found during the first CI run, but the system falls back to the key `Linux-buildx-web-git-abc0123` from `main` and so the build will start with a warm cache (more info about how keys are matched in the [actions/cache docs](https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)).
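
The fallback described above can be modelled roughly as follows. This is an illustrative model of exact-then-prefix matching, not the actions/cache implementation; `lookup_cache` is a hypothetical name, and the sketch assumes the stored keys are passed newest first.

```bash
# Illustrative model only.
# Usage: lookup_cache <key> <restore-prefix> <stored keys, newest first...>
lookup_cache() {
  local key=$1 prefix=$2 stored
  shift 2
  # An exact hit wins
  for stored in "$@"; do
    if [ "$stored" = "$key" ]; then
      echo "$stored"
      return 0
    fi
  done
  # Otherwise fall back to the first stored key that starts with the prefix
  for stored in "$@"; do
    case "$stored" in
      "$prefix"*)
        echo "$stored"
        return 0
        ;;
    esac
  done
  return 1
}

lookup_cache Linux-buildx-web-git-def456 Linux-buildx-web- Linux-buildx-web-git-abc0123
# prints Linux-buildx-web-git-abc0123
```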

## Packet host no longer needed
To benefit from the warm caches in both non-forks and forks, as just explained, we had to stop doing the builds in Packet; everything now runs in the GitHub runner VMs.
As a result there's no longer separate logic for non-forks and forks in the workflow files; `kind_integration.yml` was greatly simplified but `cloud_integration.yml` and `release.yml` got a little bigger in order to use the actions artifacts as a repository for the images built. This bloat will be fixed when support for [composite actions](https://github.com/actions/runner/blob/users/ethanchewy/compositeADR/docs/adrs/0549-composite-run-steps.md) lands in github.

## Local builds
You are still able to run `bin/docker-build` or any of the `docker-build.*` scripts. To make use of buildx, run those same scripts after setting the env var `DOCKER_BUILDKIT=1`. Using buildx requires that you have installed it, as instructed [here](https://github.com/docker/buildx).

## Other
- A new script `bin/docker-cache-prune` is used to remove unused images from the cache. Without that the cache grows constantly and we can rapidly hit the 5GB limit (when the limit is attained the oldest entries get evicted).
- The `go-deps` dockerfile base image was changed from `golang:1.14.2` (ubuntu based) to `golang:1.14.2-alpine`, also to conserve cache space.

# Addressed separately in #4875:

Got rid of the `go-deps` image and instead added something similar on top of all the Dockerfiles dealing with `go`, as a first stage for those Dockerfiles. That continues to serve as a way to pre-populate go's build cache, which speeds up the builds in the subsequent stages. That build should in theory be rebuilt automatically only when `go.mod` or `go.sum` change, and now we don't require running `bin/update-go-deps-shas`. That script was removed along with all the logic elsewhere that used it, including the `go_dependencies` job in the `static_checks.yml` github workflow.

The list of modules preinstalled was moved from `Dockerfile-go-deps` to a new script `bin/install-deps`. I couldn't find a way to generate that list dynamically, so whenever a slow-to-compile dependency is found, we have to make sure it's included in that list.

Although this simplifies the dev workflow, note that the real motivation behind this was a limitation in buildx's `docker-container` driver that forbids us from depending on images that haven't been pushed to a registry, so we have to resort to building the dependencies as a first stage in the Dockerfiles.
2020-07-22 14:27:45 -05:00
Ali Ariff d457178f43
Fetch proxy with specific arch (#4739)
https://github.com/linkerd/linkerd2-proxy/pull/593 changed the proxy
release process to produce platform-specific binaries.

This change modifies the bin/fetch-proxy script to fetch amd64-specific
binaries. The proxy version has been updated to v1.104.1, which includes
no code changes since v1.104.0.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-07-13 17:48:34 -07:00
Alejandro Pedraza 2e4b6cc782
Fix cloud integration workflow (#4750)
The `tests` variable wasn't being properly initialized, which resulted
in the `helm-deep` tests being repeated, and without cleanup in between,
the attempt to create resources that were already there caused an error.
2020-07-13 16:44:20 -05:00
Alejandro Pedraza 873bd61324
Helm integration deep tests (#4728)
This creates a new integration test target that launches the deep suite,
using a linkerd instance installed through Helm.

I've added a `global.proxyInit.ignoreInboundPorts=1234,5678` override
during install and enhanced the injection test to catch problems like
what we saw in #4679.
2020-07-10 14:48:49 -05:00
Kevin Leimkuhler 5d400f5bcd
Fix deep integration test (#4709)
This fixes the deep integration test, which currently only calls `run_test` for
the `edges` integration test.

This occurs because `run_test "${tests[@]}"` will pass an entire array of
filenames when `run_test` only expects *one* filename.

The solution is to loop through `tests` and call `run_test` for each file.
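
The difference between the two forms can be seen in a small sketch. The `run_test` stand-in and the test names other than `edges` are made up for the example.

```bash
# Stand-in for the real run_test, which expects exactly one filename.
run_test() {
  echo "running $1"
}

# Illustrative list; only `edges` is named in the text above.
tests=(edges stat tap)

# Buggy form: run_test receives every filename but only looks at "$1".
# run_test "${tests[@]}"

# Fixed form: one call per file.
for t in "${tests[@]}"; do
  run_test "$t"
done
```

With the buggy form, only `running edges` would be printed; the loop prints one line per test.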

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-07-09 13:42:52 -07:00
Suraj Deshmukh d7dbe9cbff
Fix spelling mistakes using codespell (#4700)
The spelling mistakes were found with the following command and then
fixed:

```
codespell --skip CHANGES.md,.git,go.sum,\
    controller/cmd/service-mirror/events_formatting.go,\
    controller/cmd/service-mirror/cluster_watcher_test_util.go,\
    SECURITY_AUDIT.pdf,.gcp.json.enc,web/app/img/favicon.png \
    --ignore-words-list=aks,uint,ans,files\' --check-filenames \
    --check-hidden
```

Signed-off-by: Suraj Deshmukh <surajd.service@gmail.com>
2020-07-07 17:07:22 -05:00
Alejandro Pedraza 9908b2b8b2
Re-enable custom domain integration test (#4722)
The function triggering the test for k8s custom cluster domain was
misnamed, and thus the test wasn't being run.

This also adds some extra error handling to catch this and other
potential issues.
2020-07-07 16:27:46 -05:00
Tarun Pothulapati cf34a14985
Add a Windows Linkerd cli Test (#4653)
This PR adds a new CLI test that checks whether the installation YAMLs are
generated correctly on Windows. This matters because file paths differ
between Windows and Linux, and any code that uses the wrong format
might cause the chart generation commands to fail on Windows.

This creates a separate workflow for both release and integration.

Also, all the existing integration tests are moved into
/tests/integration to separate them from /test/cli, as this test does not fall
under the integration tests category.
2020-07-02 23:13:57 +05:30
Kevin Leimkuhler 4372ed56dd
Isolate tests by cluster and make run interface simpler (#4593)
## Summary

Change the default behavior of integration tests to be isolated by cluster.
Additionally, make running one or all tests easier than the current process.

These changes are explained more in the [Testing
RFC](https://github.com/linkerd/rfc/blob/master/design/0004-isolated-integration-tests.md)

## Changes

This is a script used only by Linkerd developers, but there are many useful
usage examples and explanations in the `bin/tests --help` output:

```
Run Linkerd integration tests.

Optionally specify one of the following tests: [upgrade helm helm-upgrade uninstall deep external-issuer]

Usage:
    tests [--images] [--images-host ssh://linkerd-docker] [--name test-name] [--skip-kind-create] /path/to/linkerd

Examples:
    # Run all tests in isolated clusters
    tests /path/to/linkerd

    # Run single test in isolated clusters
    tests --name test-name /path/to/linkerd

    # Skip KinD cluster creation and run all tests in default cluster context
    tests --skip-kind-create /path/to/linkerd

    # Load images from tar files located under the 'image-archives' directory
    # Note: This is primarly for CI
    tests --images /path/to/linkerd

    # Retrieve images from a remote docker instance and then load them into KinD
    # Note: This is primarly for CI
    tests --images --images-host ssh://linkerd-docker /path/to/linkerd

Available Commands:
    --name: the argument to this option is the specific test to run
    --skip-kind-create: skip KinD cluster creation step and run tests in an existing cluster.
    --images: (Primarily for CI) use 'kind load image-archive' to load the images from local .tar files in the current directory.
    --images-host: (Primarily for CI) the argument to this option is used as the remote docker instance from which images are first retrieved (using 'docker save') to be then loaded into KinD. This command requires --images.
```

### Run all tests

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run single test (upgrade for example):

Current:

```bash
. bin/_test-run.sh
init_test_run $PWD/bin/linkerd
upgrade_integration_tests
```

New:

```bash
bin/tests --name upgrade $PWD/bin/linkerd
```

### Run tests in isolated KinD clusters

Current: Not possible without running single tests in newly created clusters
manually

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run tests in isolated namespaces on an existing cluster

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests --skip-kind-create $PWD/bin/linkerd
```

## CI

`kind_integration` has been updated so that it does not create a KinD cluster as
part of its test setup.

`cloud_integration` passes the `--skip-kind-create` flag so that the tests are
run serially in a non-KinD cluster.


Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-24 17:06:29 -04:00
Alejandro Pedraza d842a97cb2
Update CI and docs to reference `main` branch (#4662)
Files changed:

```
.github/PULL_REQUEST_TEMPLATE.md
.github/workflows/cloud_integration.yml
.github/workflows/kind_integration.yml
.github/workflows/release.yml
.github/workflows/static_checks.yml
.github/workflows/unit_tests.yml
BUILD.md
CONTRIBUTING.md
bin/test-scale
bin/win/linkerd.nuspec
```
2020-06-24 12:39:22 -07:00
Zahari Dichev 904f146558
Multicluster install integration test (#4540)
This PR adds multicluster components to the integration tests.

The existing tests have been modified to pass the `--multicluster` flag so that the entire integration test suite runs with multicluster components.

Currently, the upgrade tests do not have multicluster components installed, but this will be done in a follow-up PR. 

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-24 14:32:22 -04:00
Oliver Gould ca1a9f66d2
go-run: Move temporary binary into `target` directory (#4657)
The `bin/go-run` script generates a temporary binary, stored in the root
of the repository.

This change moves it into `target/` so that it is included in the
.dockerignore, and so that the repo can be cleaned easily by removing
the `target/` directory.
2020-06-23 15:55:34 -07:00
Joakim Roubert 8d19b4055b
Improve shellscript portability by using /bin/env (#4628)
Using `/bin/env` increases portability for the shell scripts (and often using `/bin/env` is requested by e.g. Mac users). This would also facilitate testing scripts with different Bash versions via the Bash containers, as they have bash in `/usr/local` and not `/bin`. Using `/bin/env`, there is no need to change the script when testing. (I assume the latter was behind c301ea214b (diff-ecec5e3a811f60bc2739019004fa35b0), which would not happen using `/bin/env`.)

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-06-19 15:49:29 -04:00
cpretzer 84a29b9612
Prevent kind-load from resolving TAG when images are provided (#4634)
* Update the logic for resolving the tag based on script parameters
Signed-off-by: Charles Pretzer <charles@buoyant.io>
2020-06-19 11:29:50 -07:00
Joakim Roubert 82e91382b7
test-cleanup: Make populate_array() bash 3-friendly (#4627)
Fixes #4621

Legacy versions of bash (used in e.g. Mac OS) do not have the [nameref](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameters.html) functionality.
This patch replaces the use of that in the `populate_array` function and uses a bash 3-friendly way of handling this instead.

([Kubernetes](https://github.com/kubernetes/kubernetes) developers will recognize this bash 3-friendly way from [kube::util::read-array](d8febccacf/hack/lib/util.sh (L755-L770)) in the Kubernetes code base.)
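
One way to fill an array by name without namerefs is sketched below. This is an independent illustration built on `eval` and `printf '%q'`, not the code from this patch or from Kubernetes.

```bash
# Sketch: fill the array named by $1 with the lines read from stdin,
# without `declare -n` (namerefs) or `mapfile`, which bash 3 lacks.
# Caveat: the target array must not be called name, i or line.
populate_array() {
  local name=$1 i=0 line
  unset -v "$name"
  while IFS= read -r line; do
    # printf %q quotes the value so that eval assigns it verbatim
    eval "$name[$i]=$(printf '%q' "$line")"
    i=$((i + 1))
  done
}

populate_array pods < <(printf '%s\n' 'pod a' 'pod-b')
echo "${#pods[@]}"   # prints 2
```

Process substitution is used for the input so that the function runs in the current shell and the array survives the call.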

Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-06-18 17:35:34 -04:00
Alejandro Pedraza 0404703b9e
Fix `bin/update-go-deps-shas` in Ubuntu (#4632)
Explicitly shebang `bin/update-go-deps-shas` with `#!/bin/bash` instead
of `#!/bin/sh` because the latter points to `dash` in most Ubuntu-based
distros, and the script's `bin/_tag.sh` dependency requires bash.
2020-06-18 12:03:04 -05:00
Kevin Leimkuhler d5591f07ac
Fix helm upgrade test (#4622)
## Problem

#4557 changed the name of the function that `helm_upgrade_integration_tests`
uses.

`install_stable()` was renamed to `latest_release_channel()` and now takes an
argument for specifying either `edge` or `stable`.

`run_helm_upgrade_test` is a function used by the helm upgrade integration test
and was not properly updated to use `latest_release_channel()`.

This silently passed integration tests because `run_helm_upgrade_test` started
passing an empty string for the version to upgrade from, which results in the
default behavior of `install_test.go`--and therefore still passes.

## Solution

`run_helm_upgrade_test` now uses `latest_release_channel()` and passes the
proper argument.

Additionally, it checks that the version returned from
`latest_release_channel()` is not empty. If it is empty, it exits the test. This
ensures something like this does not happen in the future.
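
A guard of this kind could look like the sketch below. The `require_version` helper is a hypothetical name for illustration; it is not the code added to `run_helm_upgrade_test`.

```bash
# Hypothetical guard: fail loudly when the resolved version is empty,
# instead of silently falling back to default behavior.
require_version() {
  local channel=$1 version=$2
  if [ -z "$version" ]; then
    echo "Error: could not resolve the latest $channel version" >&2
    return 1
  fi
  echo "$version"
}

# Possible call site:
# version=$(latest_release_channel stable)
# require_version stable "$version" >/dev/null || exit 1
```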

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-18 12:22:15 -04:00
Kevin Leimkuhler f6bd722e2c
Fix install-pr script (#4610)
* Fix install-pr script
* Add image-archives path to commands to use the files

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Signed-off-by: Charles Pretzer <charles@buoyant.io>
Co-authored-by: Charles Pretzer <charles@buoyant.io>
2020-06-17 21:32:01 -07:00
Kevin Leimkuhler b0765c4361
Add integration test for upgrading from edge (#4557)
This adds an integration test for upgrading from the latest edge to the current
build.

Closes #4471

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-16 09:18:52 -07:00
Alejandro Pedraza d10ed2aa5e
CI steps for Chocolatey package - take 2 (#4536)
* CI steps for Chocolatey package - take 2

Followup to #4205, supersedes #4205

This adds:

- A new `psscript-analyzer` job in the `static_checks.yml`
workflow for linting the Chocolatey Powershell script.
- A new `choco_pack` job in the `release.yml` workflow for
updating the Chocolatey spec file and generating the
package. This is only triggered for stable releases. It requires
a windows runner in order to run the choco tooling (in theory
it should have worked on a linux runner but in practice it
didn't).
- The `Create release` step was updated to upload the generated package,
if present.
- The source file path in `bin/win/linkerd.nuspec` was updated
to make this work.

* Name nupkg file consistently with the other release assets
2020-06-15 16:42:50 -05:00
Joakim Roubert 57f321b14b
Use buster for base and web images too (#4567)
Requires setting iptables-legacy as the iptables provider.

Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-06-15 10:49:26 -07:00
Joakim Roubert 544d484784
bin/test-cleanup: Fix shellcheck issues (#4421)
Fix shellcheck issues

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-06-03 14:35:12 -04:00
Joakim Roubert 903fb0fcad
Fix quotes in shellscripts (#4406)
- Add quotes where missing, to handle whitespace and the like.
- Use single quotes for non-expansion strings.
- Fix quotes where the current ones would cause errors.
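
A small sketch of why these quoting rules matter. The helper `count_args` and
the variable names are made up for illustration and do not come from the
scripts touched by this change.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

name='release notes.txt'

# Unquoted: the shell splits the value on whitespace into two arguments.
# (Deliberately unquoted here to show the problem.)
count_args $name

# Quoted: the value is passed through as a single argument.
count_args "$name"

# Single quotes suit strings that must not be expanded at all.
literal='$HOME stays literal here'
echo "$literal"
```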

Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-06-02 16:44:38 -04:00
Alex Leong 5635f7377f
Fix uname flags for darwin in bin/lint (#4490)
The version of `uname` on Darwin doesn't support the `-o` flag, resulting in an error message when running the `bin/lint` script.

We add an if-branch to short-circuit the `uname -o` call if running on Darwin.
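
A minimal sketch of such a short-circuit, assuming the kernel name from
`uname -s` is used to detect Darwin. The function name `os_name` is
hypothetical, and the actual branch in `bin/lint` may be written differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the operating system name, avoiding `uname -o`
# on Darwin, where the commit reports the flag as unsupported.
os_name() {
  local kernel
  kernel=$(uname -s)
  if [ "$kernel" = 'Darwin' ]; then
    # Short-circuit: never reach `uname -o` on Darwin.
    echo 'Darwin'
  else
    uname -o
  fi
}

os_name
```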

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-06-02 13:02:07 -07:00
Alejandro Pedraza 571626d524
CI: properly report errors from commands (#4514)
Failures in `bin/_test-run` from commands other than `go test`
aren't currently reported properly, in part because CI's bash default is
to have `set -e`, which terminates the script and just outputs
`##[error]Process completed with exit code 2.`, as seen
[here](https://github.com/linkerd/linkerd2/pull/4496/checks?check_run_id=720720352#step:14:116):

```
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
× no unschedulable pods
    linkerd-controller-6c77c7ffb8-w8wh5: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-destination-6767d88f7f-rcnbq: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-grafana-76c76fcfb9-pdhfb: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-identity-5bcf97d6c8-q6rll: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-prometheus-6b95c56b44-hd9m6: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-proxy-injector-58d794ff9-jf7cj: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-sp-validator-6c5f999bfb-qg252: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-tap-6fdf84fc65-6txvr: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    linkerd-web-8484fbd867-nm8z2: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
    see https://linkerd.io/checks/#l5d-existence-unschedulable-pods for hints

Status check results are ×
[error]Process completed with exit code 2.
```

I've made the following changes to `bin/_test-run` to generate better
messages and Github annotations when an error occurs:

- Unset `set -e` so that errors no longer exit the script immediately,
which would prevent us from properly formatting them.
- Removed many of the `exit_on_err` calls after `go test` calls because
those output enough information already (they were not being used
in CI anyway because of `set -e`). Instead, `run_test` now exits
upon a `go test` error.
- Added `exit_on_err` calls right after non-`go-test` commands to
properly report their failure.
- Refactored the `exit_on_err` function so that it generates a Github
error annotation upon failure.
- Removed `trap` in `install_stable`, since the OS should be able to
handle GC for stuff under `/tmp`.
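
The error-reporting approach in the list above can be sketched roughly as
follows. This is an illustrative reconstruction, not the actual `exit_on_err`
from `bin/_test-run`: the function body and messages are assumptions, and the
only external convention relied on is the GitHub Actions `::error::` workflow
command.

```shell
#!/usr/bin/env bash

# Make sure `set -e` is off, so a failing command does not terminate the
# script before the failure can be reported.
set +e

# Hypothetical sketch: if the command run just before this call failed, emit
# a GitHub Actions error annotation and exit with that command's status.
exit_on_err() {
  local exit_code=$?
  if [ "$exit_code" -ne 0 ]; then
    echo "::error::$*"
    exit "$exit_code"
  fi
}

# A successful command: exit_on_err does nothing.
true
exit_on_err 'this message is not printed'

# A failing command, run in a subshell so this demo keeps going afterwards.
( false; exit_on_err 'setup command failed' )
echo "subshell exit status: $?"
```

Capturing `$?` on the first line of the function is what ties the report to
the command that ran immediately before the call, so the call has to follow
that command directly.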

Also, I've changed the exit code of `linkerd check` on failure from 2
to 1.
2020-06-01 15:57:33 -05:00