Commit Graph

103 Commits

Author SHA1 Message Date
Kevin Leimkuhler a11012819c
Add opaque ports namespace inheritance to pods (#5941)
### What

When a namespace has the opaque ports annotation, pods and services should
inherit it if they do not have one themselves. Currently, services do this but
pods do not. This can lead to surprising behavior where services are correctly
marked as opaque, but pods are not.

This changes the proxy-injector so that it now passes down the opaque ports
annotation to pods from their namespace if they do not have their own annotation
set. Closes #5736.

### How

The proxy-injector webhook receives admission requests for pods and services.
Regardless of the resource kind, it now checks if the resource should inherit
the opaque ports annotation from its namespace. It should inherit it if the
namespace has the annotation but the resource does not.

If the resource should inherit the annotation, the webhook creates an annotation
patch which is only responsible for adding the opaque ports annotation.

After generating the annotation patch, it checks if the resource is injectable.
From here there are a few scenarios:

1. If no annotation patch was created and the resource is not injectable, then
   admit the request with no changes. Examples of this are services with no OP
   annotation and inject-disabled pods with no OP annotation.
2. If the resource is a pod and it is injectable, create a patch that includes
   the proxy and proxy-init containers—as well as any other annotations and
   labels.
3. The above two scenarios lead to a patch being generated at this point, so no
   matter the resource the patch is returned.

### UI changes

Resources are now reported to either be "injected", "skipped", or "annotated".

The first pass at this PR worked around the fact that injection reports consider
services and namespaces injectable. This is not accurate because they don't have
pod templates that could be injected; they can however be annotated.

To fix this, an injection report now considers resources "annotatable" and uses
this to clean up some logic in the `inject` command, as well as avoid a more
complex proxy-injector webhook.

What's cool about this is it fixes some `inject` command output that would label
resources as "injected" when they were not even mutated. For example, namespaces
were always reported as being injected even if annotations were not added. Now,
it will properly report that a namespace has been "annotated" or "skipped".

### Tests

For testing, unit tests and integration tests have been added. Manual testing
can be done by installing linkerd with `debug` controller log levels, and
tailing the proxy-injector's app container when creating pods or services.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-03-29 19:41:15 -04:00
Alejandro Pedraza 9a191fbd7b
Get rid of `CheckDeployments()` in integration tests (#5942)
`CheckDeployments()` verified that some deployment had the appropriate
number of replicas in the Ready state. `CheckPods()` does the same, plus
checking if there were restarts (a single restart returns
`RestartCountError` which only triggers a warning on the calling side,
more restarts trigger a regular error). We were always calling the
former followed by a call to the latter which is superfluous, so we're
getting rid of the latter.

Also, the `testutil.DeploySpec` struct had a `Containers` field for
checking the name of containers, but that wasn't always checked and
didn't really represent a potential error that wouldn't be clearly
manifested otherwise (like in golden files), so that was simplified as
well.
2021-03-23 13:53:38 -05:00
Alejandro Pedraza 711124159a
Fix helm-upgrade and upgrade-stable integration tests (#5891)
* Fix helm-upgrade integration test

Update `install_test.go` now that the upgrade test is done from 2.10.
This also implied installing viz right after core.

Refactored `HelmInstallPlain()` into `HelmCmdPlain()` to work for both
helm install and upgrade.

* Add expected heartbeat config entry to the upgrade-stable test, and remove testCheckCommand arg (for lint)
2021-03-12 08:20:45 -05:00
Alejandro Pedraza af5aa70e68
Helm integration tests improvements, and other flakiness fixes to CI (#5857)
- Get rid of all the custom settings passed through `--set` during `helm install`, and instead let the defaults mechanisms in the templates to  kick-in
- Before installing `linkerd-viz` through Helm, wait on _all_ the core components to be ready (this is what might have been causing the restarts seen in CI)
- Show full output when `linkerd jaeger check` fails, and do some cleanup before triggering the tracing tests. But then I decided to temporarily disable that test till we figure out what's the deal.
2021-03-02 20:11:00 -05:00
Alejandro Pedraza 571f505b6b
Move CP check after the readiness check (#5848)
* Move CP check after the readiness check

Moved the `can initialize client` and `can query the control plane API`
checks from the `linkerd-existence` section to the `linkerd-api` because
they required the `linkerd-controller` pod to not just be "Running" but
actually be ready.

This was causing `linkerd check` to show some port-forwarding warnings
when ran right after install.

This also allowed getting rid of the `CheckPublicAPIClientOrExit` function
and directly use `CheckPublicAPIClientOrRetryOrExit` (better naming
punted for later) which was refactored so it always runs the
`linkerd-api` checks before retrieving the client.

Other changes:

- Temporarily disabled `upgrade-edge` test because the latest edge has this readiness check issue
- Have the upgrade tests do proper pruning (stolen for @Pothulapati's #5673 😉 )
- Added missing label to tap SA (fixes #5850)
- Complete tap-injector Service selector
2021-03-01 19:47:25 -05:00
Dennis Adjei-Baah 15d1809bd0
Remove linkerd prefix from extension resources (#5803)
* Remove linkerd prefix from extension resources

This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2021-02-25 11:01:31 -05:00
Tarun Pothulapati ed65a96f43
tests: enable external-prometheus-deep with cleanup logic (#5779)
This PR enables the temporarily disabled external-prometheus-deep
integration test. This also fixes the clean up issue by essentially
moving the external-prometheus resources into `external-prometheus`
which has the relevant annotation required to be deleted by
`bin/test-cleanup`

This PR also updates the test resource label to be `test.linkerd.io/is-test-data-plane` from
`linkerd.io/is-test-data-plane` to prevent `linkerd inject` from removing it.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-22 22:23:31 +05:30
Tarun Pothulapati ac67a2c5d7
tests: add exnternal-prometheus integration test (#5720)
* tests: add exnternal-prometheus integration test

Fixes #5659

Though most control-plane components dont differntiate on default
vs external prometheus, There have been issues w.r.t CLI and external
prometheus i.e check, etc.

This PR adds a e2e deep integration tests w.r.t external pronmetheus
thereby running viz and non-viz cmds integration tests on a linkerd-viz
ewith external prometheus instance

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-16 23:20:58 +05:30
Tarun Pothulapati a393c42536
values: removal of .global field (#5699)
* values: removal of .global field

Fixes #5425

With the new extension model, We no longer need `Global` field
as we don't rely on chart dependencies anymore. This helps us
further cleanup Values, and make configuration more simpler.

To make upgrades and the usage of new CLI with older config work,
We add a new method called `config.RemoveGlobalFieldIfPresent` that
is used in the upgrade and `FetchCurrentConfiguration` paths to remove
global field and attach its child nodes if global is present. This is verified
by the `TestFetchCurrentConfiguration`'s older test that has the global
field.

We also don't yet remove .global in some helm stable-upgrade tests for
the initial install to work.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-11 23:38:34 +05:30
Dennis Adjei-Baah e4069b47e0
Run extension checks when linkerd check is invoked (#5647)
* Run extension checks when linkerd check is invoked

This change allows the linkerd check command to also run any known
linkerd extension commands that have been installed in the cluster. It
does this by first querying for any namespace that has the label
selector `linkerd.io/extension` and then runs the subcommands for either
`jaeger`, `multicluster` or `viz`. This change runs basic namespace
healthchecks for extensions that aren't part of the Linkerd extension suite.

Fixes #5233
2021-02-11 10:50:16 -06:00
Mayank Shah 96e078421c
CLI: Remove the `--disable-tap` flag from inject (#5671)
Fixes https://github.com/linkerd/linkerd2/issues/5664

- Remove `--disable-flag` from `inject`
-  Move `config.linkerd.io/disable-tap` to `viz.linkerd.io/disable-tap`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2021-02-11 10:01:53 -05:00
Alex Leong dd8e5fc5bc
Rename extension charts to linkerd-* (#5552)
For consistency we rename the extension charts to a common naming scheme:

linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link

We also make the chart files and chart readmes a bit more uniform.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-26 16:20:49 -08:00
Kevin Leimkuhler e7f2a3fba3
viz: add tap-injector (#5540)
## What this changes

This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.

If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.

This also removes the environment variable for explicitly disabling tap through
an environment variable. Tap status for a proxy is now determined only be the
presence or absence of the tap service name environment variable.

Closes #5326

## How it changes

### tap-injector

The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added:

- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace

### LINKERD2_PROXY_TAP_DISABLED

Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only be if the
tap-injector adds or does not add the `LINKERD2_PROXY_TAP_SVC_NAME` environment
variable.

### controller image

The tap-injector has been added to the controller image's several startup
commands which determines what it will do in the cluster.

As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-21 11:24:08 -05:00
Tarun Pothulapati 4c3d002501
viz: move sub-cmds using viz extension under viz cmd (#5485)
* viz: move sub-cmds using viz extension under viz cmd

Fixes #5327 , #5524 

This branch moves the following commands, under the `linkerd viz`
cmd as they use the viz extension to perform the job.

- dashboard
- edges
- routes
- stat
- tap
- top

This also creates a new pkg `public-api` which fecilitates
interaction and communication with public-api to be used
across extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
2021-01-13 12:11:25 +05:30
Tarun Pothulapati 836c077898
viz: add render golden tests (#5433)
* viz: add render golden tests

This branch adds golden tests for the viz install. This would be
useful to track changes in render as more changes are added.

This also moves the common code that is used across extensions
to generate diffs into `testutil` to be able to be used widely.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-12 11:59:16 +05:30
Alex Leong 790be8d972
Rename proxy-mutator to jaeger-injector (#5351)
The name `proxy-mutator` is too generic.  In particular, several different linkerd extensions will have mutating webhooks which mutate the proxy sidecar, the MutatingWebhookConfiguration resource is cluster scoped, and each one needs a unique name.

We use the `jaeger-injector` name instead.  This gives us a pattern to follow for future webhooks as well (e.g. `tap-injector` etc.)

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-06 10:00:07 -08:00
Tarun Pothulapati 2087c95dd8
viz: move some components into linkerd-viz (#5340)
* viz: move some components into linkerd-viz

This branch moves the grafana,prometheus,web, tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.

The components in viz are not injected during install time, and
will go through the injector. The `viz install` does not have any
cli flags to customize the install directly but instead follow the Helm
way of customization by using flags such as 
`set`, `set-string`, `values`, `set-files`.

**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
 fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.

**Testing**

```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -

# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -

# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```

What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-23 20:17:31 +05:30
Alex Leong cdc57d1af0
Use linkerd-jaeger extension for control plane tracing (#5299)
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:

* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests

We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension.  To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.

Note that for now, only the control plane components send traces; the proxies in the control plane do not.  This is because the linkerd-jaeger injector is not yet available.  However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.

I tested this by doing the following:

1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-08 14:34:26 -08:00
Alejandro Pedraza 948aa23b2a
Remove logs comparisons in integration tests (#5223)
The rare cases where these tests were useful don't make up for the burden of
maintaing them, having different k8s version change the messages and
having unexpected warnings come up that didn't affect the final
convergence of the system.

With this we also revert the indirection added back in #4538 that
fetched unmatched warnings after a test had failed.
2020-11-13 16:00:16 -05:00
Tarun Pothulapati 1fe70dc16d
cli: remove logs subcommand and tests (#5203)
Fixes #5191

The logs command adds a external dependency that we forked to work but
does not fit within linkerd's core set of responsibilities. Hence, This
is being removed.

For capabilities like this, The Kubernetes plugin ecosystem has better
and well maintained tools that can be used.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-12 11:23:36 -08:00
Oliver Gould 25e49433fd
Do not permit cluster networks to be overridden per-pod (#5111)
In #5110 the `global.proxy.destinationGetNetworks` configuration is
renamed to `global.clusterNetworks` to better reflect its purpose.

The `config.linkerd.io/proxy-destination-get-networks` annotation allows
this configuration to be overridden per-workload, but there's no real use
case for this. I don't think we want to support this value differing
between pods in a cluster. No good can come of it.

This change removes support for the `proxy-destination-get-networks`
annotation.
2020-10-21 09:34:13 -07:00
Oliver Gould d0bce594ea
Remove defunct proxy config variables (#5109)
The proxy no longer honors DESTINATION_GET variables, as profile lookups
inform when endpoint resolution is performed.  Also, there is no longer
a router capacity limit.
2020-10-20 16:13:53 -07:00
Oliver Gould 60a742ab56
tests: Consolidate TestHelper.LinkerdRun error handling (#5057)
Most invocations of `TestHelper.LinkerdRun` don't actually need the stderr
output except to encode it in the error message. This changes this helper
to return an error that includes the full invoked command and error message.

Invocations that need direct access to stderr must call `TestHelper.PipeToLinkerdRun`
2020-10-15 14:57:03 -07:00
Zahari Dichev de8855c096
More comprehensive injection integration test (#5049)
The purpose of this test is to validate that the auto injector configures the proxy and the additional containers according to the specified config.

This is done by providing a helper that can generate the desired annotations and later inspect an injected pod in order to determine that every bit of configuration has been accounted for. This test is to provide further assurance that #5036 did not introduce any regressions.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-10-14 19:31:56 +03:00
Oliver Gould 8efec3e696
test: Relax regex for Endpoint Slices update error (#5083)
We've encountered errors like the following in CI:

```
Error updating Endpoint Slices for Service linkerd/linkerd-proxy-injector: Error updating linkerd-proxy-injector-27vgh EndpointSlice for Service linkerd/linkerd-proxy-injector: endpointslices.discovery.k8s.io "linkerd-proxy-injector-27vgh" not found
```

There is a regex to prevent similar errors from failing a test, but it
is too restrictive. This change relaxes the regex to ignore all errors
of this kind ("updating Endpoint Slices for Service").
2020-10-14 09:16:30 -07:00
Zahari Dichev ffa7157907
Add warnings for failed secred mounts to expect warnings in it (#5053)
Seems that Helm is cleaning orphaned resources. Pods that depend on them seem to be not upgraded on time, causing some warnings to be emitted and he CI process to fail

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-10-08 15:06:03 -05:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Alejandro Pedraza 0f869f2e50
Ability for int tests to use external certs generated with openssl (#4997)
Adds bin/certs-openssl, which creates self-signed root cert/key and issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl and to avoid having to install step.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
2020-09-25 11:25:29 -05:00
Tarun Pothulapati 3d900ccc19
Integration test for smi-metrics (#4844)
* Integration test for smi-metrics

This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.

Currently, We store the SMI Helm pkg locally and run the test on top, so 
That our CI does not break and we will periodically update the package
based on the newer releases of SMI-Metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 22:49:20 +05:30
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`

This is different from the CNI integration tests that we run in
cloud integration which performs the CNI level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
Mayank Shah 25fe7237ae
conformance validation: move `tap_test.go` test helpers to `testutil` (#4800)
* Refactor `tap` test helpers

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-07-28 13:12:25 -07:00
Matei David 1c197b14e7
Change destination context token format (#4771)
Add a new structure on the destination controller side to keep track of contextual information.
The token format has been changed from ns:<namespace> to a JSON format so that more variables can be
encdoed in the token. As part of this PR, a new field 'nodeName' has been added to help with service
topologies.

Fixes #4498

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-07-27 09:49:48 -07:00
Alex Leong d540e16c8b
Make service mirror controller per target cluster (#4710)
This PR removes the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31.  For fuller context, please see that RFC.

Basic multicluster functionality works here including:
* `linkerd mc install` installs the Link CRD but not any service mirror controllers
* `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link
* The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints.
* The `linkerd mc gateways` command lists all linked target clusters, their liveliness, and probe latences.
* The `linkerd check` multicluster checks have been updated for the new architecture.  Several checks have been rendered obsolete by the new architecture and have been removed.

The following are known issues requiring further work:
* the service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror.  it does not yet support configuring a label selector.
* an unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707
* an mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-07-23 14:32:50 -07:00
Matei David 8b85716eb8
Introduce install flag for EndpointSlices (#4740)
EndpointSlices have been made opt-in due to their experimental nature. This PR
introduces a new install flag 'enableEndpointSlices' that will allow adopters to
specify in their cli install or helm install step whether they would like to
use endpointslices as a resource in the destination service, instead of the
endpoints k8s resource.

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-07-15 09:53:04 -07:00
Alejandro Pedraza 873bd61324
Helm integration deep tests (#4728)
This creates a new integration test target that launches the deep suite,
using a linkerd instance installed through Helm.

I've added a `global.proxyInit.ignoreInboundPorts=1234,5678` override
during install and enhanced the injection test to catch problems like
what we saw in #4679.
2020-07-10 14:48:49 -05:00
Mayank Shah f00c17e52a
conformance validation: Refactor install test helpers (#4681)
* Refactor install test helpers

- Move testResourcesPostInstall to testutil.TestResourcesPostInstall
- Move exerciseTestAppEndpoint to testutil.ExerciseTestAppEndpoint

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>

* Trigger CI

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-06-29 14:29:41 -07:00
Kevin Leimkuhler 4372ed56dd
Isolate tests by cluster and make run interface simpler (#4593)
## Summary

Change the default behavior of integration tests to be isolated by cluster.
Additionally, make running one or all tests easier than the current process.

These changes are explained more in the [Testing
RFC](https://github.com/linkerd/rfc/blob/master/design/0004-isolated-integration-tests.md)

## Changes

This is a script used only by Linkerd developers, but there is a lot of useful
usage examples and explanations in `bin/tests --help` output:

```
Run Linkerd integration tests.

Optionally specify one of the following tests: [upgrade helm helm-upgrade uninstall deep external-issuer]

Usage:
    tests [--images] [--images-host ssh://linkerd-docker] [--name test-name] [--skip-kind-create] /path/to/linkerd

Examples:
    # Run all tests in isolated clusters
    tests /path/to/linkerd

    # Run single test in isolated clusters
    tests --name test-name /path/to/linkerd

    # Skip KinD cluster creation and run all tests in default cluster context
    tests --skip-kind-create /path/to/linkerd

    # Load images from tar files located under the 'image-archives' directory
    # Note: This is primarly for CI
    tests --images /path/to/linkerd

    # Retrieve images from a remote docker instance and then load them into KinD
    # Note: This is primarly for CI
    tests --images --images-host ssh://linkerd-docker /path/to/linkerd

Available Commands:
    --name: the argument to this option is the specific test to run
    --skip-kind-create: skip KinD cluster creation step and run tests in an existing cluster.
    --images: (Primarily for CI) use 'kind load image-archive' to load the images from local .tar files in the current directory.
    --images-host: (Primarily for CI) the argument to this option is used as the remote docker instance from which images are first retrieved (using 'docker save') to be then loaded into KinD. This command requires --images.
```

### Run all tests

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run single test (upgrade for example):

Current:

```bash
. bin/_test-run.sh
init_test_run $PWD/bin/linkerd
upgrade_integration_tests
```

New:

```bash
bin/tests --name upgrade $PWD/bin/linkerd
```

### Run tests in isolated KinD clusters

Current: Not possible without running single tests in newly created clusters
manually

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run tests in isolated namespaces on an existing cluster

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests --skip-kind-create $PWD/bin/linkerd
```

## CI

`kind_integration` has been updated so that it does not create a KinD cluster as
part of its test setup.

`cloud_integration` passes the `--skip-kind-create` flag so that the tests are
run serially in a non-KinD cluster.


Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-24 17:06:29 -04:00
Zahari Dichev 904f146558
Multicluster install integration test (#4540)
This PR adds multicluster components to the integration tests.

The existing tests have been modified to pass the `--multicluster` flag so that the entire integration test suite runs with multicluster components.

Currently, the upgrade tests do not have multicluster components installed, but this will be done in a follow-up PR. 

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-24 14:32:22 -04:00
Lutz Behnke 846d2f11d4
Add support for Helm configuration of per-component proxy resources requests and limits (#4226)
Signed-off-by: Lutz Behnke <lutz.behnke@finleap.com>
2020-06-24 12:54:27 -05:00
Mayank Shah 7f29717a64
Refactor helper functions from `inject` integration tests (#4644)
move `applyPatch` `useTestImageTag`, `validateInject``getProxyContainers` as global functions to be used!
2020-06-22 23:15:52 +05:30
Alejandro Pedraza d02359e094
Int tests: Warn (instead of erroring) upon pod restarts, part two (#4637)
In #4595 we stopped failing integration tests whenever a pod restarted
just once, which is being caused by containerd/containerd#4068.

But we forgot to remove the warning event corresponding to that
containerd failure, and such unexpected event continues to fail the
tests. So this change adds that event to the list of expected ones.
2020-06-18 15:50:51 -05:00
Mayank Shah 90041371d1
Update function signature of `testutil.NewGenericTestHelper` (#4591)
Adds parameters like kubernetesHelper, k8scontext, etc to the NewGenericTestHelper func allowing it to be more general, and to be able to be usable through linkerd2-conformance
2020-06-18 16:53:27 +05:30
Alejandro Pedraza c8c5980d63
Integration tests: Warn (instead of erroring) upon pod restarts (#4623)
* Integration tests: Warn (instead of erroring) upon pod restarts

Fixes #4595

Don't have integration tests fail whenever a pod is detected to have
restarted just once. For now we'll be just logging this out and creating
a warning annotation for it.
2020-06-18 06:08:05 -05:00
Mayank Shah 6174b194fe
conformance validation: add new helper to `testutil` (#4532)
Adds a new helper function to make TestHelper initialization more relaxed for linkerd2-conformance and other test use-cases.
2020-06-12 10:45:42 +05:30
Alejandro Pedraza c0afb443d2
Fix mechanism to fetch logs/events upon test failures (#4538)
Followup to #4522

This removes the `controlPlaneInstalled` var in `bin/install_test.go`
that flagged whether the control plane was already present in the series
of tests, whose intention was to avoid fetching the logs/events when the CP wasn't yet
there. That was done under the assumption `TestMain()` would feed that
flag to the runner for each individual test function, but it turns out
`TestMain()` only runs once per test file, and so
`controlPlaneInstalled` remained with its initial value `false`.

So now logs/events are fetched always, even if the control plane is not
there. If the CP is absent and we try fetching, we only see a `didn't
find any client-go entries` message.
2020-06-04 09:11:30 -05:00
Alejandro Pedraza e607fc9247
Fetch logs/events when integration test fails, not only for install tests (#4522)
* Fetch logs/events when integration test fails, not only for install tests

## Motivation

Mainly to know what caused containers to not start (or to restart), like in #4285

## Implementation

Followup to #4410, where we fetched unexpected logs/events when a test failed in `test/install_test.go`; now we're expanding that behavior to every integration test.

For that, we replace in each `TestMain()`:

```go
os.Exit(m.Run())
```

with

```go
os.Exit(testutil.Run(m, TestHelper, true))

```

where `testutil.Run()` executes the tests and fetches the logs/events if the tests failed.

Also extracted the log/event fetching and matching into its own separate file.

* Appease linter

* For external_issuer_integration_tests controlPlaninstalled wasn't being set
2020-06-01 16:48:55 -05:00
Alejandro Pedraza d0d97e9426
Upgrade to Helm v3 (#4373)
Upgraded to Helm v3.2.1 from v2.16.1, getting rid of Tiller and making
other simplifications.

Note that the version placeholder in the `values.yaml` files had to be
changed from `{version}` to `linkerdVersionValue` because the former
confuses Helm v3.
2020-05-14 12:11:47 -05:00
Alejandro Pedraza f62a2e6ee4
Refactor integration tests to use annotations functions (#4341)
* Refactor integration tests to use annotations functions

First part of #4176

Replaced all the `t.Error`/`t.Fatal` calls in the integration with the
new functions defined in `testutil/annotations.go` as described in #4292,
in order for the errors to produce Github annotations.

Most of these calls have now two strings: one containing a generic error
message and another with a more specific message. The former is what
will be aggregated and seen in the CI reports at
[linkerd2-ci-metrics](https://github.com/linkerd/linkerd2-ci-metrics).

Other changes:

- Improved the annotation generator in `annotations.go` so that the
  message includes the name of the test.
- When a failure from `RetryFor` occurs, log the original timeout so
  we can consider incrementing it when the failure is persistent.
2020-05-08 08:41:42 -05:00
Alejandro Pedraza 2cd48bc488
Go test failure message wrappers to create GH Annotations (#4292)
* Go test failure message wrappers to create GH Annotations

First part of #4176

## Problem

Failures in go tests need to be properly formatted as Github annotations
so that we can fetch them through Github's API for aggregation and
analysis.

## Solution

A wrapper for error messages has been created in `testutil/annotations.go`.
The idea is that instead of throwing test failures like this:

```go
t.Failf("error retrieving data;\nExpected: %#v\nActual: %#v", expected,
actual)
```

We'd throw them like this:
```go
testutil.AnnotationFatalf("error retrieving data", "error retrieving data;\nExpected: %#v\nActual: %#v", expected,
actual)
```

That will continue reporting the error as before (when using `go test`
or another test runner), but as a side-effect it will also send to
stdout something like:

```
::error file=pkg/inject_test.go,line=133::error retrieving data
```
Which becomes a GH annotation, visible in the CI run summary screen.

The fist string art is used to have the GH annotation be a generic error message
that can be aggregated and counted across multiple test runs. If `testutil.Fatalf(str, args...)`
is called instead, the original error message will be used.

Note that that the output will be produced only when the env var
`GH_ANNOTATION` is set (which will when tests are triggered from a
Github Actions workflow).

Besides `testutil/annotation.go` and its accompanying unit test file,
other changes were made in other tests as examples, the plan being that
in a further PR _all_ the tests will use these wrappers.
2020-05-01 16:16:06 -05:00
Alejandro Pedraza 322ba5fd2f
`linkerd uninstall` errors when attempting to delete PSP (#4234)
* Bug in `linkerd uninstall` when attempting to delete PSP

We were using a wrong apiVersion for PSP in `linkerd uninstall`'s
output, which avoids removing that resource:

```
$ linkerd uninstall | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller"
deleted
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination"
deleted
...
mutatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-proxy-injector-webhook-config" deleted
validatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-sp-validator-webhook-config" deleted
namespace "linkerd" deleted
error: unable to recognize "uninstall.yml": no matches for kind
"PodSecurityPolicy" in version "extensions/v1beta1"

$ kubectl get psp -oname
podsecuritypolicy.policy/linkerd-linkerd-control-plane
```

I've also replaced the uninstall integration test with a new separate
suite that performs the installation, waits for it to be ready,
uninstalls, and then confirms `linkerd check --pre` returns as expected.
2020-04-07 11:01:11 -05:00