Commit Graph

331 Commits

Author SHA1 Message Date
Alex Leong dd8e5fc5bc
Rename extension charts to linkerd-* (#5552)
For consistency we rename the extension charts to a common naming scheme:

linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link

We also make the chart files and chart readmes a bit more uniform.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-26 16:20:49 -08:00
Dennis Adjei-Baah ae2b3499b0
Tweak error message in web script (#5596)
The `bin/web run` script sets up a local environment for linkerd
dashboard development. This script port-forwards an existing linkerd
controller and a grafana instance in a local kubernetes cluster. When
running the command with just the linkerd control plane  and no
linkerd viz extension the error message is shown below.
```
'Controller is not running. Have you installed Linkerd?'
```

This error message is a little misleading because the controller is
installed when running this after `linkerd install`. The issue here is
that the script checks for a Grafana instance but the error message it
displays when it can't find a Grafana pod is that the controller isn't
install. The error message should instead notify the developer that
Linkerd Viz is not installed.

This change modifies the error message so it is more clear.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2021-01-26 11:21:44 -06:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompaigning http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Oliver Gould d2ae5a8117
build: Remove the DOCKER_TRACE environment variable (#5583)
Our build scripts hide docker's output by default and only pass through
output when DOCKER_TRACE is set. Practically everyone else tends to use
DOCKER_TRACE=1 persistently. And, recently, GitHub Actions stopped
working with `/dev/stderr`

This change removes the DOCKER_TRACE environment variable so that output
is always emitted as it would when invoking docker directly.
2021-01-20 22:09:47 -08:00
Alejandro Pedraza 3365e98f13
Have 'bin/test-cleanup' clean the viz helm release (#5542)
This is needed for the tests in the ARM box to pass.
2021-01-15 00:40:37 +05:30
Alejandro Pedraza f3b1ebfa99
Separate observability API (#5510)
* Separate observability API

Closes #5312

This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.

- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.prot`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` Stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.

The other changes in the go files are just changes in the imports to point to the new protobufs.

Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
2021-01-13 14:34:54 -05:00
Yashvardhan Kukreja 06dccac35b
cleanup: utilise linkerd uninstall to concisely delete all the linkerd resources involved in the test (#5522)
The linkerd uninstall command is able to remove a lot of the test resources used in CI but it ends up leaving the test namespaces though.
Still, the test-cleanup script can be cleaned down to a good level by getting rid of the populate_array function.

Hence, this commits adds a one-liner, alongside linkerd uninstall, to deal with the deletion of all the test namespaces and the resources instead of using the big chunk of populate_array function.

Fixes #5497

Signed-off-by: Yashvardhan Kukreja <yash.kukreja.98@gmail.com>
2021-01-12 18:51:27 -05:00
Alejandro Pedraza 898de71098
Enable upgrade integration tests for ARM (#5513)
This enables the `helm-upgrade` and `upgrade-stable` integration tests,
that were disabled because the previous versions didn't have ARM
support, but now 2.9 does.
2021-01-11 17:34:05 -05:00
Kevin Leimkuhler 308a1f3ff3
Use linkerd path in test-cleanup (#5498)
## What this fixes

When clusters are cleaned up after tests in CI, the `bin/test-cleanup` script is
responsible for clearing the cluster of all testing resources.

Right now this does not work as expected because the script uses the `linkerd`
binary instead of the Linkerd path that is passed in to the `tests` script.

There are cases where different binaries have different uninstall behavior and
the script can complete with an incomplete uninstallation.

## How it fixes

`test-cleanup` now takes a linkerd path argument. This is used to specify the
Linkerd binary that should be used when running in the `uninstall` commands.

This value is passed through from the `tests` invocation which means that in CI,
the same binary is used for running tests as well as cleaning up the cluster.

Additionally, specifying the k8s context has now moved from an argument to the
`--context` flag. This is similar to how `tests` script works because it's not
always required.

## How to use

Shown here:

``` $ bin/test-cleanup -h Cleanup Linkerd integration tests.

Usage:
    test-cleanup [--context k8s_context] /path/to/linkerd

Examples:
    # Cleanup tests in non-default context test-cleanup --context k8s_context
    /path/to/linkerd

Available Commands:
    --context: use a non-default k8s context
```

## edge-21.1.1

This edge release introduces a new "opaque transport" feature that allows the
proxy to securely transport server-speaks-first and otherwise opaque TCP
traffic. Using the `config.linkerd.io/opaque-ports` annotation on pods and
namespaces, users can configure ports that should skip the proxy's protocol
detection.

Additionally, a new `linkerd-viz` extension has been introduced that separates
the installation of the Grafana, Prometheus, web, and tap components. This
extension closely follows the Jaeger and multicluster extensions; users can
`install` and `uninstall` with the `linkerd viz ..` command as well as configure
for HA with the `--ha` flag.

The `linkerd viz install` command does not have any cli flags to customize the
install directly, but instead follows the Helm way of customization by using
flags such as `set`, `set-string`, `values`, `set-files`.

Finally, a new `/shutdown` admin endpoint that may only be accessed over the
loopback network has been added. This allows batch jobs to gracefully terminate
the proxy on completion. The `linkerd-await` utility can be used to automate
this.

* Added a new `linkerd multicluster check` command to validate that the
  `linkerd-multicluster` extension is working correctly
* Fixed description in the `linkerd edges` command (thanks @jsoref!)
* Moved the Grafana, Prometheus, web, and tap components into a new Viz chart,
  following the same extension model that multicluster and Jaeger follow
* Introduced a new "opaque transport" feature that allows the proxy to securely
  transport server-speaks-first and otherwise opaque TCP traffic
* Removed the check comparing the `ca.crt` field in the identity issuer secret
  and the trust anchors in the Linkerd config; these values being different is
  not a failure case for the `linkerd check` command (thanks @cypherfox!)
* Removed the Prometheus check from the `linkerd check` command since it now
  depends on a component that is installed with the Viz extension
* Fixed error messages thrown by the cert checks in `linkerd check` (thanks
  @pradeepnnv!)
* Added PodDisruptionBudgets to the control plane components so that they cannot
  be all terminated at the same time during disruptions (thanks @tustvold!)
* Fixed an issue that displayed the wrong `linkerd.io/proxy-version` when it is
  overridden by annotations (thanks @mateiidavid!)
* Added support for custom registries in the `linkerd-viz` helm chart (thanks
  @jimil749!)
* Renamed `proxy-mutator` to `jaeger-injector` in the `linkerd-jaeger` extension
* Added a new `/shutdown` admin endpoint that may only be accessed over the
  loopback network allowing batch jobs to gracefully terminate the proxy on
  completion
* Introduced the `linkerd identity` command, used to fetch the TLS certificates
  for injected pods (thanks @jimil749)
* Fixed an issue with the CNI plugin where it was incorrectly terminating and
  emitting error events (thanks @mhulscher!)
* Re-added support for non-LoadBalancer service types in the
  `linkerd-multicluster` extension

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-08 15:24:14 -05:00
Kevin Leimkuhler 71fd10b887
Uninstall the viz and jaeger extensions (#5494)
Use the `uninstall` command for the viz and jaeger extensions to ensure clusters
are cleaned up properly after tests

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-08 08:39:25 -05:00
Alejandro Pedraza 57460bdc42
Cleanup dependencies in bin/helm-build (#5491)
Chart dependencies are added as tarballs under the chart's `chart`
subdirectory. When we move chart dependencies around this can leave
stale dependencies behind, ensuing havoc. This PR removes those deps
before calling `helm dep up`.
2021-01-06 18:00:34 -05:00
Alex Leong 790be8d972
Rename proxy-mutator to jaeger-injector (#5351)
The name `proxy-mutator` is too generic.  In particular, several different linkerd extensions will have mutating webhooks which mutate the proxy sidecar, the MutatingWebhookConfiguration resource is cluster scoped, and each one needs a unique name.

We use the `jaeger-injector` name instead.  This gives us a pattern to follow for future webhooks as well (e.g. `tap-injector` etc.)

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-06 10:00:07 -08:00
Alejandro Pedraza 6b1a3d4541
Upgrade k3d to v3.4.0 (#5483)
While using k3d v3.0.2 using 3 nodes and installing linkerd in HA I've
seen errors like
```
Error from server: error when creating "STDIN": rpc error: code = Unknown desc = database is locked
```
Which doesn't happen on v3.4.0.

This brings though by default k8s v1.19, which is producing some
warnings in `linkerd check` like:
```
W0106 11:09:39.204081  292603 warnings.go:67] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
```
That only affects stderr so the tests still pass, but needs to be
addressed in a followup.
2021-01-06 12:00:38 -05:00
Alejandro Pedraza 93ec23d2c7
Update bin/web script to take into account linkerd-viz (#5442)
The `get-pod` and `port-forward` functions continue to assume
deployments like grafana still live under the `linkerd` namespace.
This expands the definition of those functions to be able to specify the
namespace.

These changes can be solely tested by running `bin/web dev` (follow the
instructions in `BUILD.md` for the preliminaries needed).
2021-01-06 11:40:27 -05:00
Tarun Pothulapati 2087c95dd8
viz: move some components into linkerd-viz (#5340)
* viz: move some components into linkerd-viz

This branch moves the grafana,prometheus,web, tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.

The components in viz are not injected during install time, and
will go through the injector. The `viz install` does not have any
cli flags to customize the install directly but instead follow the Helm
way of customization by using flags such as 
`set`, `set-string`, `values`, `set-files`.

**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
 fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.

**Testing**

```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -

# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -

# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```

What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-23 20:17:31 +05:30
Alejandro Pedraza 02b456087d
Stop publishing the linkerd2-multicluster-link chart (#5365)
Closes #5348

That chart generates the service mirror resources and related RBAC, but
doesn't generate the credentials secret nor the Link CR which require
go-client logic not available from sheer Helm templates.

This PR stops publishing that chart, and adds a comment to its README
about it.
2020-12-11 08:55:50 -05:00
Joakim Roubert 377b38f0bf
bin/protoc-diff: Don't assume Debian and don't install unzip (#5347)
- Do unzip check but don't install; leave installation to user
- Move unzip check to bin/protoc that actually uses unzip
- Make sure the protoc scripts can be called from any directory

Fixes #5337

Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-12-09 09:12:38 -05:00
Alejandro Pedraza 48666a7673
bin/shellcheck-all was missing some files (#5335)
* bin/shellcheck-all was missing some files

`bin/shellcheck-all` identifies what files to check by filtering by the
`text/x-shellscript` mime-type, which only applies to files with a
shebang pointing to bash. We had a number of files with a
`#!/usr/bin/env sh` shebang that (at least in Ubuntu given `sh` points
to `dash`) only exposes a `text/plain` mime-type, thus they were not
being checked.

This fixes that issue by replacing the filter in `bin/shellcheck-all`, using a simple grep over the file shebang instead of using the `file` command.
2020-12-08 09:30:52 -05:00
Kevin Leimkuhler a456d03621
Change script to work with k3d and install only CLI (#5333)
This changes the install-pr script to work with k3d.

Additionally, it now only installs the CLI; it no longer installs Linkerd on the
cluster. This was removed because most of the time when installing a Linkerd
version from a PR, some extra installation configuration is required and I was
always commenting out that final part of the script.

`--context` was changed to `--cluster` since we no longer need a context value,
only the cluster name which we are loading the images in to.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-07 11:35:01 -05:00
Tarun Pothulapati 72a0ca974d
extension: Separate multicluster chart and binary (#5293)
Fixes #5257

This branch movies mc charts and cli level code to a new
top level directory. None of the logic is changed.

Also, moves some common types into `/pkg` so that they
are accessible both to the main cli and extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:36:10 -08:00
Alejandro Pedraza 94574d4003
Add automatic readme generation for charts (#5316)
* Add automatic readme generation for charts

The current readmes for each chart is generated
manually and doesn't contain all the information available.

Utilize helm-docs to automatically fill out readme.mds
for the helm charts by pulling metadata from values.yml.

Fixes #4156

Co-authored-by: GMarkfjard <gabma047@student.liu.se>
2020-12-02 14:37:45 -05:00
Alejandro Pedraza 6fb35b0af7
Jaeger injector mutating webhook (#5276)
* Jaeger injector mutating webhook

Closes #5231. This is based off of the `alex/sep-tracing` branch.

This webhook injects the `LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR`,
`LINKERD2_PROXY_TRACE_COLLECTOR_SVC_NAME` and
`LINKERD2_PROXY_TRACE_ATTRIBUTES_PATH` environment vars into the proxy
spec when a pod is created, as well as the podinfo volume and its mount.
If any of these are found to be present already in the pod spec, it
exits without applying a patch.

The `values.yaml` file has been expanded to include config for this
webhook. In particular, one can define a `namespaceSelector` and/or a
`objectSelector` to filter which pods will this webhook act on.

The config entries in `values.yam` for `collectorSvcAddr` and
`collectorSvcAccount` can be overriden with the
`config.linkerd.io/trace-collector` and
`config.alpha.linkerd.io/trace-collector-service-account` annotation at
the namespace or pod spec level.

## How to test:
```bash
docker build . -t ghcr.io/linkerd/jaeger-webhook:0.0.1 -f
jaeger/proxy-mutator/Dockerfile
k3d image import ghcr.io/linkerd/jaeger-webhook:0.0.1
bin/helm-build
linkerd install
helm install jaeger jaeger/charts/jaeger
linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
kubectl -n emojivoto get po -l app=emoji-svc -oyaml | grep -A1 TRACE
```

## Reinvocation policy
The webhookconfig resource is configured with `reinvocationPolicy:
IfNeeded` so that if the tracing injector gets triggered before the
proxy injector, it will get triggered a second time after the proxy
injector runs so it can act on the injected proxy. By default this won't
be necessary because the webhooks run in alphabetical order (this is not
documented in k8s docs though) so
`linkerd-proxy-injector-webhook-config` will run before
`linkerd-proxy-mutator-webhook-config`. In order to test the
reinvocation mechanism, you can change the name of the former so it gets
called first.

I versioned the webhook image as `0.0.1`, but we can decide to align
that with linkerd's main version tag.
2020-11-27 12:25:28 -05:00
Tarun Pothulapati e7f4c31257
extension: Add new jaeger binary (#5278)
* extension: Add new jaeger binary

This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as
`linkerd install` VFS logic expects charts to be present in `/charts`
directory, This command gets its own static pkg to generate its own
VFS for its chart.

This covers only the install part of the command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-25 20:10:35 +05:30
Alex Leong 0f20b0572e
tracing: new jaeger independent helm chart (#5275)
Fixes #5230

This PR moves tracing into a jaeger chart with no proxy injection
templates. We still keep the dependency on partials, as we could use
common templates like resources, etc from there.

Signed-off-by: Tarun Pothulapati tarunpothulapati@outlook.com
2020-11-24 09:45:16 -08:00
Alejandro Pedraza deca7ede08
Consolidate integration tests under k3d (#5245)
* Consolidate integration tests under k3d

Fixes #5007

Simplified integration tests by moving all to k3d. Previously things were running in Kind, except for the multicluster tests, which implied some extra complexity in the supporting scripts.

Removed the KinD config files under `test/integration/configs`, as config is now passed as flags into the `k3d` command.

Also renamed `kind_integration.yml` to `integration_tests.yml`

Test skipping logic under ARM was also simplified.
2020-11-18 14:33:16 -05:00
Alex Leong 2e5087b9f6
Update install-pr to use image-load script (#5226)
The `bin/kind-load` script no longer exists and has been renamed to `bin/image-load`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-11-13 14:38:20 -08:00
Alejandro Pedraza 7a0c64a9e2
Update BUILD.md with multiarch stuff and some extras (#5199)
* Update BUILD.md with multiarch stuff and some extras

Adds to `BUILD.md` a new section `Publishing images` explaining the
workflow for testing custom builds.

Also updates and gives more precision to the section `Building CLI for
development`.

Finally, a new `Multi-architecture builds` section is added.

This PR also removes `SUPPORTED_ARCHS` from `bin/docker-build-cli` that
is no longer used.

Note I'm leaving some references to Minikube. I might change that in a
separate PR to point to k3d if we manage to migrate the KinD stuff to
k3d.
2020-11-12 09:36:54 -05:00
Alejandro Pedraza 5de85d9320
Don't skip pushing images if tag is already there (#5128)
gcr.io has an issue that it's not possible to update multi-arch images
(see eclipse/che#16983 and open-policy-agent/gatekeeper#665).

We're now relying on ghcr.io instead, which I verified doesn't have this
bug, so we can stop skipping these pushes.
2020-10-23 13:13:42 -05:00
Alejandro Pedraza 8a41c9a3dd
Ignore cni-calico-deep integration tests for ARM (#5125)
Calico CNI is not supported on ARM, so this test should be conditionally run based on the `RUN_ARM_TEST` environment variable
2020-10-22 21:14:04 -07:00
Alex Leong d22dda0917
Fix integration tests to run better on ARM (#5101)
In order for the integration tests to run successfully on a dedicated ARM cluster, two small changes are necessary:

* We need to skip the multicluster test since this test uses two separate clusters (source and target)
* We need to properly uninstall the multicluster helm chart during cleanup.

With these changes, I was able to successfully run the integration tests on a dedicated ARM cluster.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-21 09:27:56 -07:00
Zahari Dichev 60d8f34095
avoid waiting when creating calico cluster with kind (#5064)
Currently the --wait flag times out when creating a calico cluster. The result is that we end up waiting for 5 minutes to simply emit a warning and continue. Instead we can check the readiness of some k8s components to ensure our cluster is up and running and avoid the delay.

Signed-off-by: Zahari Dichev zaharidichev@gmail.com
2020-10-12 18:26:00 +03:00
Oliver Gould 5f694513bd
bin/tests: Improve argument parsing (#5060)
The `bin/tests` script takes command-line arguments, but it requires
that all arguments are specified before the linkerd binary path; and it
silently ignores flags that follow the linkerd binary. Furthermore,
unexpected flags may be incorrectly parsed as the linkerd binary path.

This changes argument parsing to be more flexible about ordering; and it
prints the full usage error when unexpected flags are encountered.
2020-10-09 07:27:22 -07:00
Alejandro Pedraza e1772ae183
Fixed releases.yaml by pulling images directly from ghcr.io (#5035)
Previously, `releases.yaml` was trying to load images into the kind
clusters but that failed because those images were already in `ghcr.io`
and not in the local docker cache, but that failure was masked.
Unmasking that failure revealed some flaws that this change addresses:

- In `bin/_test_helpers` (used by `bin/tests`), modified the `images`
arg to accept `docker(default)|archive|skip`, for determining how to
load the images into the cluster (if loading them at all)
- In `bin/image-load`, changed arg `images` to `archive` which is more
descriptive.
- Have `kind_integration.yml` call `bin/tests --images archive`.
- Have `release.yml` call `bin/tests --images skip`.
2020-10-02 08:05:17 -05:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Alejandro Pedraza e8f0724a71
Multicluster integration test (#4998)
This implements the run_multicluster_test() function in bin/_test-helpers.sh.

The idea is to create two clusters (source and target) using k3d, with linkerd and multicluster support in both, plus emojivoto (without vote-bot) in target, and vote-bot in source.
We then link the clusters and make sure traffic is flowing.

Detailed sequence:

Create certficates.
Install linkerd along with multicluster support in the target cluster.
Run the target1 test: install emojivoto in the target cluster (without vote-bot).
Run linkerd mc link on the target cluster.
Install linkerd along with multicluster support in the source cluster.
Apply the link resource in the source cluster.
Run the source test: Check linkerd mc gateways returns the target cluster link, and only install emojivoto's vote-bot in the source cluster. Note vote-bot's yaml defines the web-svc service as web-svc-target.emojivoto:80
Run the target2 test: Make sure web-svc in the target cluster is receiving requests.
2020-09-26 05:26:23 -05:00
Alejandro Pedraza b50ae6290d
Add support for k3d in integration tests (#4994)
* Add support for k3d in integration tests

KinD doesn't support setting LoadBalancer services out of the box. It can be added with some additional work, but it seems the solutions are not cross-platform.

K3d on the other hand facilitates this, so we'll be using k3d clusters for the multicluster integration test.

The current change sets the ground by generalizing some of the integration tests operations that were hard-coded to KinD.

- Added `bin/k3d` to wrap the setup and running of a pinned version of `k3d`.
- Refactored `bin/_test-helpers.sh` to account for tests to be run in either KinD or k3d.
- Renamed `bin/kind-load` to `bin/image-load` and make it more generic to load images for both KinD (default) and k3d. Also got rid of the no longer used `--images-host` option.
- Added a placeholder for the new `multicluster` test in the lists in `bin/_test-helpers.sh`. It starts by setting up two k3d clusters.

* Refactor handling of the `--multicluster` flag in integration tests (#4995)

Followup to #4994, based off of that branch (`alpeb/k3d-tests`).
This is more preliminary work previous to the more complete multicluster integration test.

- Removed the `--multicluster` flag from all the tests we had in `bin/_test-helpers.sh`, so only the new "multicluster" integration test will make use of that. Also got rid of the `TestUninstallMulticluster()` test in `install_test.go` to keep the multicluster stuff around, needed for the more complete multicluster test that will be implemented in a followup PR.
- Added "multicluster" to the list of tests in the `kind_integration.yml` workflow.
- For now, this new "multicluster" test in `run_multicluster_test()` is just running the install tests (`test/integration/install_test.go`) with the `--multicluster` flag.

Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-09-25 16:33:17 -05:00
Alejandro Pedraza 0f869f2e50
Ability for int tests to use external certs generated with openssl (#4997)
Adds bin/certs-openssl, which creates self-signed root cert/key and issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl and to avoid having to install step.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
2020-09-25 11:25:29 -05:00
Tarun Pothulapati 3d900ccc19
Integration test for smi-metrics (#4844)
* Integration test for smi-metrics

This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.

Currently, We store the SMI Helm pkg locally and run the test on top, so 
That our CI does not break and we will periodically update the package
based on the newer releases of SMI-Metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 22:49:20 +05:30
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`

This is different from the CNI integration tests that we run in
cloud integration which performs the CNI level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
Alejandro Pedraza 51100606ca
Delete multicluster resources in `bin/test-cleanup` (#4983)
When some test failed in the middle of the
`./tests/integration/install_test.go` suite, multicluster resources can
be left-over, which `./bin/test-cleanup` wasn't removing.

This was affecting the ARM integration tests, that require good cleanup
since they use a non-transient cluster.
2020-09-18 07:38:46 -05:00
Alejandro Pedraza ccf027c051
Push docker images to ghcr.io instead of gcr.io (#4953)
* Push docker images to ghcr.io instead of gcr.io

The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.

Note that besides the changes to cloud_integration.yml and release.yml, there was a change to the upgrade-stable integration test so that we do linkerd upgrade --addon-overwrite to reset the addons settings because in stable-2.8.1 the Grafana image was pegged to gcr.io/linkerd-io/grafana in linkerd-config-addons. This will need to be mentioned in the 2.9 upgrade notes.

Also the egress integration test has a debug container that now is pegged to the edge-20.9.2 tag.

Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
2020-09-10 15:16:24 -05:00
Alejandro Pedraza 9bf34ebc4e
Fixed helm cleanup in `./bin/test-cleanup` (#4944)
`./bin/test-cleanup` was trying to remove the
resources with the label `linkerd.io/is-test-helm` which we're not
using. Instead, we simply call `helm delete` on the appropriate helm
releases.

This is required for a clean cleanup after the ARM integration test, whose
cluster is just cleaned by this script at the end and is not torn down.
2020-09-08 12:20:14 -05:00
Ali Ariff 5186383c81
Add ARM64 Integration Test (#4897)
* Add ARM64 Integration Test

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-28 10:38:40 -07:00
Zahari Dichev 2e7c00aa37
Diff generated code from proto files (#4863)
Add a static check that ensures the generated files from the proto definitions have not changed. 

Fix #4669

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-08-18 11:44:33 +03:00
Ali Ariff 492b0fe093
Created ./bin/_os.sh lib for os-arch detection (#4880)
And refactored `./bin/linkerd` and `./bin/build-cli-bin` to use it.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-14 09:59:52 -05:00
Kevin Leimkuhler c0826dcedc
Choose the right architecture in bin/linkerd script (#4867)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-08-13 11:03:08 -07:00
Ali Ariff 66d2c6b74b
Fix build-cli-bin (#4876)
Fix `bin/build-cli-bin` on Linux to put the binary in the correctly named architecture directory.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-13 09:21:14 -07:00
Ali Ariff ae8bb0e26e
Release ARM CLI artifacts (#4841)
* When releasing, build and upload the amd64, arm64 and arm architectures builds for the CLI
* Refactored `Dockerfile-bin` so it has separate stages for single and multi arch builds. The latter stage is only used for releases.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-11 09:25:58 -05:00
Ali Ariff 61d7dedd98
Build ARM docker images (#4794)
Build ARM docker images in the release workflow.

# Changes:
- Add a new env key `DOCKER_MULTIARCH` and `DOCKER_PUSH`. When set, it will build multi-arch images and push them to the registry. See https://github.com/docker/buildx/issues/59 for why it must be pushed to the registry.
- Usage of `crazy-max/ghaction-docker-buildx ` is necessary as it already configured with the ability to perform cross-compilation (using QEMU) so we can just use it, instead of manually set up it.
- Usage of `buildx` now make default global arguments. (See: https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope)

# Follow-up:
- Releasing the CLI binary file in ARM architecture. The docker images resulting from these changes already build in the ARM arch. Still, we need to make another adjustment like how to retrieve those binaries and to name it correctly as part of Github Release artifacts.

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-05 11:14:01 -07:00
Alejandro Pedraza a1be60aea1
Reenable `upgrade-edge` integration test (#4821)
Followup to #4797

That test was temporarily disabled until the prometheus check in
`linkerd check` got fixed in #4797 and made it into edge-20.7.5
2020-07-31 12:11:32 -05:00