88 Commits

Author SHA1 Message Date
Alejandro Pedraza b21686a9be
IPv6/dual-stack integration tests (#12575)
* IPv6 integration tests

This adds a new test `TestDualStack` to the deep suite that ensures requests to a dual-stack service are always routed to the IPv6 endpoint.

It also amends other tests in the suite for them to work in IPv6-only clusters:

- skipports: replaced the booksapp with emojivoto, given the servers in the former don't bind to IPv6 addresses
- endpoints: amended the regexes to include IPv6 addresses
- localhost: bumped nginx for it to bind to the IPv6 loopback as well

Note that the `TestDualStack` test is disabled by default because GitHub runners don't support IPv6. To run it locally, first deploy a dual-stack cluster via:

```
kind create cluster --config test/integration/deep/kind-dualstack.yml
```
(for testing IPv6-only clusters, use the `kind-ipv6.yml` config)

Then load the images and trigger the test with:

```
bin/tests --name deep-dual-stack --skip-cluster-create $PWD/target/cli/linux-amd64/linkerd
```
2024-05-28 16:00:26 -05:00
Alejandro Pedraza 1f9fa44e01
Add native sidecar deep integration test (#12452)
Added the test `deep-native-sidecar` which runs the `deep` test with the
new flag `--native-sidecar`.

Also replaced the final `WaitRollout` call in `install_test.go` with a
`linkerd check` call, which also lets us verify that the command is
working as intended.
2024-04-30 15:30:00 -05:00
Matei David 21046ab9ff
Skip `multicluster-gateways-endpoints` for links with no gateways (#11447)
The multicluster extension has always allowed the extension to be
installed without a gateway; the idea being that users would provide
their own. With p2p, we extended this to allow links that do not specify
a gateway at all, but in the process we missed changing a key check
-- `multicluster-gateways-endpoints` -- that asserts all links have a
probe service.

Without a gateway on the other end, a link will not have a probe spec
(or a gateway address), so it makes no sense to run this check: there
will never be a probe service created in the source cluster. To fix this
issue, we skip the check when the link lacks either a gateway address
or a probe spec.
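
A minimal sketch of the skip condition described above. The `link` type, its field names, and `needsProbeServiceCheck` are hypothetical stand-ins for illustration, not the actual multicluster types:

```go
package main

import "fmt"

// link is a hypothetical, trimmed-down stand-in for a multicluster Link.
type link struct {
	TargetClusterName string
	GatewayAddress    string // empty when the link has no gateway
	ProbePath         string // stand-in for the probe spec; empty when absent
}

// needsProbeServiceCheck reports whether the gateway-endpoints check applies.
// Without a gateway address or a probe spec no probe service is ever
// created, so the check is skipped for that link.
func needsProbeServiceCheck(l link) bool {
	return l.GatewayAddress != "" && l.ProbePath != ""
}

func main() {
	links := []link{
		{TargetClusterName: "east", GatewayAddress: "10.0.0.1", ProbePath: "/ready"},
		{TargetClusterName: "west"}, // gateway-less link
	}
	for _, l := range links {
		if !needsProbeServiceCheck(l) {
			fmt.Printf("%s: skipped (no gateway)\n", l.TargetClusterName)
			continue
		}
		fmt.Printf("%s: checking probe service\n", l.TargetClusterName)
	}
}
```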

Fixes #11428

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2023-10-18 10:38:40 +01:00
dependabot[bot] e5830ad05b
build(deps): bump linkerd/dev from 39 to 40 (#10825)
* build(deps): bump linkerd/dev from 39 to 40

Bumps [linkerd/dev](https://github.com/linkerd/dev) from 39 to 40.
- [Release notes](https://github.com/linkerd/dev/releases)
- [Commits](https://github.com/linkerd/dev/compare/v39...v40)

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
2023-05-09 10:57:19 -07:00
Matei David 0fcf84939f
Remove viz dependency in multicluster tests (#10609)
Our multicluster integration tests used to depend on viz. Viz was used
to check the state of the gateways (`linkerd multicluster gateways`
required it). Since this is no longer the case, we can remove this
dependency to get back a few seconds of execution times (multicluster
tests are famously slow).

---------

Signed-off-by: Matei David <matei@buoyant.io>
2023-03-30 15:11:32 +01:00
Alejandro Pedraza 72589f0e53
Reenable `helm-upgrade` integration test (#10047)
Supersedes #9856, now that the `linkerd check` logic in the integration tests got cleaned up via #9989.

The helm-upgrade test had been commented-out when we jumped to the new 2.12 helm charts. It can be used again to test upgrades from 2.12.x.

- Some of the logic in `test/integration/install/install_test.go` still hadn't considered the need to upgrade both the `linkerd-crds` and `linkerd-control-plane` charts, so that got fixed.
- Removed references to the now-deprecated `linkerd2` chart.
- Improved the `helm_cleanup()` function by uninstalling the charts in reverse order (extensions first, core last). We delete the namespaces afterwards because helm sometimes doesn't remove them, and so we shouldn't fail if we attempt to delete one that is already gone. Also removed unneeded `kubectl wait`s because `kubectl delete ns` should be blocking.
2023-01-10 09:33:11 -05:00
Alex Leong 768e04dd7e
Update endpoints watcher to not fetch pods for removed endpoints (#10013)
Fixes #10003

When endpoints are removed from an EndpointSlice resource, the destination controller builds a list of addresses to remove.  However, if any of the removed endpoints have a Pod as their targetRef, we will attempt to fetch that pod to build the address to remove.  If that pod has already been removed from the informer cache, this will fail and the endpoint will be skipped in the list of endpoints to be removed.  This results in stale endpoints being stuck in the address set and never being removed.

We update the endpoint watcher to construct only a list of endpoint IDs for endpoints to remove, rather than fetching the entire pod object.  Since we no longer attempt to fetch the pod, this operation is now infallible and endpoints will no longer be skipped during removal.

We also add a `TestEndpointSliceScaleDown` test to exercise this.
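
The idea can be sketched as follows. The `endpointID` and `endpoint` types and both function names are simplified, hypothetical stand-ins, not the destination controller's actual types:

```go
package main

import "fmt"

// endpointID and endpoint are hypothetical, simplified types.
type endpointID struct {
	Namespace string
	Name      string
}

type endpoint struct {
	Namespace string
	TargetRef string // name of the Pod the endpoint points at
}

// idsToRemove builds removal keys from the endpoint data alone. It performs
// no cache lookup, so it cannot fail and no endpoint is skipped.
func idsToRemove(removed []endpoint) []endpointID {
	ids := make([]endpointID, 0, len(removed))
	for _, ep := range removed {
		ids = append(ids, endpointID{Namespace: ep.Namespace, Name: ep.TargetRef})
	}
	return ids
}

// remainingAfterRemoval deletes the removed endpoints from the address set
// and returns how many addresses are left.
func remainingAfterRemoval(addresses map[endpointID]string, removed []endpoint) int {
	for _, id := range idsToRemove(removed) {
		delete(addresses, id)
	}
	return len(addresses)
}

func main() {
	addresses := map[endpointID]string{
		{Namespace: "emojivoto", Name: "web-0"}: "10.42.0.5",
		{Namespace: "emojivoto", Name: "web-1"}: "10.42.0.6",
	}
	// web-1 may already be gone from the informer cache; that no longer matters.
	removed := []endpoint{{Namespace: "emojivoto", TargetRef: "web-1"}}
	fmt.Println(remainingAfterRemoval(addresses, removed))
}
```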

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-01-03 10:04:02 -08:00
Alejandro Pedraza e6fa5a7156
Replace usage of io/ioutil package (#9613)
`io/ioutil` has been deprecated since Go 1.16 and the linter started to
complain about it.
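
For reference, a short sketch of the usual one-to-one replacements in the standard library. The function names `roundTrip` and `readAll` are made up for this example and do not come from the change itself:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a temporary directory and reads it back, using
// the os/io functions that replace their io/ioutil counterparts.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "example-") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "out.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// readAll drains a reader into a string.
func readAll(r io.Reader) (string, error) {
	b, err := io.ReadAll(r) // was ioutil.ReadAll
	return string(b), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)

	all, err := readAll(strings.NewReader("world"))
	fmt.Println(all, err)
}
```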
2022-10-13 12:10:58 -05:00
Eliza Weisman f6c6ff965c
inject: fix --default-inbound-policy not setting annotation (#9197)
Depends on #9195

Currently, `linkerd inject --default-inbound-policy` does not set the
`config.linkerd.io/default-inbound-policy` annotation on the injected
resource(s).

The `inject` command does _try_ to set that annotation if it's set in
the `Values` generated by `proxyFlagSet`:
14d1dbb3b7/cli/cmd/inject.go (L485-L487)

...but, the flag in the proxy `FlagSet` doesn't set
`Values.Proxy.DefaultInboundPolicy`, it sets
`Values.PolicyController.DefaultAllowPolicy`:
7c5e3aaf40/cli/cmd/options.go (L375-L379)

This is because the flag set is shared across `linkerd inject` and
`linkerd install` subcommands, and in `linkerd install`, we want to set
the default policy for the whole cluster by configuring the policy
controller. In `linkerd inject`, though, we want to add the annotation
to the injected pods only.

This branch fixes this issue by changing the flag so that it sets the
`Values.Proxy.DefaultInboundPolicy` instead of the
`Values.PolicyController.DefaultAllowPolicy` value. In `linkerd
install`, we then set `Values.PolicyController.DefaultAllowPolicy` based
on the value of `Values.Proxy.DefaultInboundPolicy`, while in `inject`,
we will now actually add the annotation.
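
A rough sketch of the resulting flow. The `values` struct is a trimmed-down, hypothetical mirror of the fields named above, and `setFlag`, `forInstall` and `forInject` are illustrative names only:

```go
package main

import "fmt"

// values is a hypothetical, trimmed-down mirror of the Values fields above.
type values struct {
	Proxy            struct{ DefaultInboundPolicy string }
	PolicyController struct{ DefaultAllowPolicy string }
}

const policyAnnotation = "config.linkerd.io/default-inbound-policy"

// setFlag models the shared --default-inbound-policy flag: it now writes
// only the proxy value.
func setFlag(v *values, policy string) {
	v.Proxy.DefaultInboundPolicy = policy
}

// forInstall derives the cluster-wide policy controller default from the
// proxy value.
func forInstall(v *values) {
	v.PolicyController.DefaultAllowPolicy = v.Proxy.DefaultInboundPolicy
}

// forInject returns the annotation to add to an injected workload, or nil
// when the flag was not set.
func forInject(v *values) map[string]string {
	if v.Proxy.DefaultInboundPolicy == "" {
		return nil
	}
	return map[string]string{policyAnnotation: v.Proxy.DefaultInboundPolicy}
}

func main() {
	var v values
	setFlag(&v, "deny")

	forInstall(&v)
	fmt.Println(v.PolicyController.DefaultAllowPolicy)
	fmt.Println(forInject(&v))
}
```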

This branch is based on PR #9195, which adds validation to reject
invalid values for `--default-inbound-policy`, rather than on `main`.
This is because the validation code added in that PR had to be moved
around a bit, since it now needs to validate the
`Values.Proxy.DefaultInboundPolicy` value rather than the
`Values.PolicyController.DefaultAllowPolicy` value. I thought using
#9195 as a base branch was better than basing this on `main` and then
having to resolve merge conflicts later. When that PR merges, this can 
be rebased onto `main`.

Fixes #9168
2022-08-18 17:16:27 -07:00
Oliver Gould c3d327f9c7
ci: Unify integration test workflows (#8964)
The policy integration tests build some of the same container images
that are built in the integration test workflow. We can avoid this
duplication while still making the policy test execution optional.

This change moves `integration_tests.yml` and `policy_controller.yml`
into `integration.yml`. The policy test workflows now re-use some of the
`just` recipes used for development. This change also separates the
tests that need Viz components so that, eventually, they can be executed
more selectively.

Furthermore, this change updates the docker-build action to take a tag
as a parameter, rather than *setting* a TAG environment variable. This
change tries to simplify the use of environment variables by setting a
single tag output in both the integration and release workflows.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-07-25 15:51:09 -07:00
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00
Matei David 61b75509da
Refactor multicluster test install (#8139)
This change continues the work from #7403 by refactoring the
multicluster tests in order to install components programmatically.

As part of this change, we now generate certificates (a CA and a shared
issuer) in code, and add a few utilities to manage different Kubernetes
contexts; a few examples are `KubectlApplyWithContext` and a function to
re-initialise the clientset with an arbitrary context.

A few bits and pieces have also been changed as I went through this, such
as applying entire files as opposed to reading manifests in memory
before piping them to kubectl.

Some other changes:
* remove logic from test runner script that set up multicluster
* add a more rigorous check test after linking source to target cluster
* remove `target1`, `source` and `target_statefulset` tests
* consolidated previous tests in one file

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-04-20 10:59:12 +01:00
Kevin Leimkuhler 67bcd8f642
Add `gosec` and `errcheck` lints (#7954)
Closes #7826

This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.

A significant number of these lints have been excluded by the various `exclude-rules` rules added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.

Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.

Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-03 10:09:51 -07:00
Matei David 044ffaa45e
Move `upgrade` integration tests in their own files (#7934)
This change hoists all of the upgrade logic into dedicated files.

Upgrade tests are now done programmatically (at the expense of repeating a lot of code between upgrade-edge and upgrade-stable). This (in theory) will remove the dependency on an external test-runner and special arguments for the TestHelper. Upgrade tests can now be run through go's runner, provided a Kubernetes cluster has been provisioned: `go test ./test/integration/upgrade-<release-channel>/...`

Signed-off-by: Matei David <matei@buoyant.io>
2022-02-25 15:12:09 +00:00
Alex Leong 2a0084c6ac
Replace time.After with time.NewTimer to avoid memory leaks (#7956)
The timer created by calling `time.After` is not cleaned up until the timer fires.  Repeated calls to `time.After` in a loop create multiple timers which can accumulate if the loop runs faster than the timeout.

We replace `time.After` with `time.NewTimer` and explicitly call `Stop` when we are finished to clean up the timer.
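
A minimal sketch of the pattern, assuming a simple receive-with-timeout helper. `receiveWithTimeout` and `shortTimeout` are names made up for this example:

```go
package main

import (
	"fmt"
	"time"
)

const shortTimeout = 10 * time.Millisecond

// receiveWithTimeout waits for one value on ch and gives up after timeout.
func receiveWithTimeout(ch <-chan int, timeout time.Duration) (int, bool) {
	timer := time.NewTimer(timeout)
	// Stop releases the timer as soon as we return, rather than when it
	// would have fired.
	defer timer.Stop()

	select {
	case v := <-ch:
		return v, true
	case <-timer.C:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)
	for i := 0; i < 3; i++ {
		if i == 1 {
			ch <- i
		}
		// Each iteration creates one timer and stops it before the next
		// iteration begins, so timers do not pile up.
		v, ok := receiveWithTimeout(ch, shortTimeout)
		fmt.Println(i, v, ok)
	}
}
```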

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-02-24 15:34:52 -08:00
Oliver Gould f5876c2a98
go: Enable `errorlint` checking (#7885)
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.

This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
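
As an illustration of what wrapping-aware handling looks like, here is a small sketch. `loadConfig`, `isMissing` and the file path are invented for the example and are not taken from the codebase:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// loadConfig wraps the underlying error with %w so that callers can still
// inspect it.
func loadConfig(path string) error {
	if _, err := os.ReadFile(path); err != nil {
		return fmt.Errorf("loading config %q: %w", path, err)
	}
	return nil
}

// isMissing uses errors.Is, which walks the wrap chain. A plain
// `err == fs.ErrNotExist` comparison would be false for a wrapped error.
func isMissing(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := loadConfig("/nonexistent/linkerd-example.yaml")
	fmt.Println(isMissing(err))

	// errors.As replaces a direct type assertion such as err.(*fs.PathError).
	var pathErr *fs.PathError
	if errors.As(err, &pathErr) {
		fmt.Println(pathErr.Path)
	}
}
```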

[el]: https://github.com/polyfloyd/go-errorlint

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-16 18:32:19 -07:00
Oliver Gould dcb0636ac1
Skip viz policy test on missing success rate (#7892)
When the flakey policy test can't produce a success rate, let's just
mark the test as skipped instead of failing CI.

Relates to #7590

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-15 16:45:37 -08:00
Matei David e46f7b4be2
Allow integration tests to run in parallel (#7773)
Go's test runner (`go test`) can be non-deterministic with the order in
which it runs the tests. Tests in Go seem to be always
run in parallel, but the specifics here differ depending on the
available CPU.

We can take advantage of parallelism here to get better timing on our
tests, however, we need to block the start of each test until the
control plane (or extension) pods are ready. In each `TestMain`, we
block until the pods are ready.

Signed-off-by: Matei David <matei@buoyant.io>
2022-02-07 15:21:57 +00:00
Matei David a0e2d12516
Isolate integration tests into suites (#7721)
This change re-works how tests are organised, by splitting each
integration test based on its functionality. Each "suite" is a
directory that comprises tests that will be specific to that suite;
deep tests will test the control plane, viz tests will target the
monitoring stack, and so on.

As part of this change, aside from moving the tests, the GH workflow
file has also been changed to remove old test names, and add (or
replace) new ones where appropriate.

To have a fully functioning viz suite, a new flag has been added to the
TestHelper to discern between tests that make use of the monitoring
stack and tests that do not. The external suite also has a duplicated
`edges` test so we can make sure interactions with external prometheus
don't break.

Signed-off-by: Matei David <matei@buoyant.io>
2022-02-02 17:19:08 +00:00
Alejandro Pedraza 67dfebb259
Stop shipping grafana-based image (#7567)
* Stop shipping grafana-based image

Fixes #6045 #7358

With this change we stop building a Grafana-based image preloaded with the Linkerd Grafana dashboards.

Instead, we'll recommend that users install Grafana themselves, and we provide a file `grafana/values.yaml` with a default config that points to all the same Grafana dashboards we had, which are now hosted in https://grafana.com/orgs/linkerd/dashboards .

The new file `grafana/README.md` contains instructions for installing the official Grafana Helm chart, and mentions other available methods.

The `grafana.enabled` flag has been removed, and `grafanaUrl` has been moved to `grafana.url`. This will help consolidate other grafana settings that might emerge, in particular when #7429 gets addressed.

## Dashboards definitions changes

The dashboard definitions under `grafana/dashboards` (which should be kept in sync with what's published in https://grafana.com/orgs/linkerd/dashboards), got updated, adding the `__inputs`, `__elements` and `__requires` entries at the beginning, that were required in order to be published.
2022-01-11 14:47:40 -05:00
Alejandro Pedraza f9f3ebefa9
Remove namespace from charts and split them into `linkerd-crd` and `linkerd-control-plane` (#6635)
Fixes #6584 #6620 #7405

# Namespace Removal

With this change, the `namespace.yaml` template is rendered only for CLI installs and not Helm, and likewise the `namespace:` entry in the namespace-level objects (using a new `partials.namespace` helper).

The `installNamespace` and `namespace` entries in `values.yaml` have been removed.

In the templates where the namespace is required, we moved from `.Values.namespace` to `.Release.Namespace`, which is filled in automatically by Helm. For the CLI, `install.go` now explicitly defines the contents of the `Release` map alongside `Values`.

The proxy-injector has a new `linkerd-namespace` argument given the namespace is no longer persisted in the `linkerd-config` ConfigMap, so it has to be passed in. To pass it further down to `injector.Inject()` without modifying the `Handler` signature, a closure was used.
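
The closure approach can be sketched like this. The `handler` type, `inject` and `newInjectHandler` are simplified, hypothetical stand-ins; the real `Handler` and `injector.Inject()` signatures differ:

```go
package main

import "fmt"

// handler is a hypothetical stand-in for the webhook Handler signature,
// which has no parameter for the Linkerd namespace.
type handler func(request string) (string, error)

// inject is a hypothetical stand-in for injector.Inject(), taking the
// namespace as an extra argument.
func inject(linkerdNamespace, request string) (string, error) {
	return fmt.Sprintf("injected %s using control plane in %q", request, linkerdNamespace), nil
}

// newInjectHandler captures the namespace in a closure, so the returned
// function still matches the handler type.
func newInjectHandler(linkerdNamespace string) handler {
	return func(request string) (string, error) {
		return inject(linkerdNamespace, request)
	}
}

func main() {
	h := newInjectHandler("linkerd")
	out, err := h("pod/web")
	fmt.Println(out, err)
}
```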

------------
Update: Merged-in #6638: Similar changes for the `linkerd-viz` chart:

Stop rendering `namespace.yaml` in the `linkerd-viz` chart.

The additional change here is the addition of the `namespace-metadata.yaml` template (and its RBAC), _not_ rendered in CLI installs, which is a Helm `post-install` hook, consisting of a Job that executes a script adding the required annotations and labels to the viz namespace using a PATCH request against kube-api. The script first checks whether the namespace already has annotations/labels entries; if it doesn't, it has to add extra ops in that patch.
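
To illustrate why the extra ops are needed, here is a sketch that assumes the PATCH is a JSON Patch (RFC 6902), where adding a key under a missing parent object fails. This is not the actual script; `patchOp`, `escapeKey`, `annotationPatch` and the annotation key are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// patchOp is one JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// escapeKey applies JSON Pointer escaping (RFC 6901) so that keys containing
// "/" or "~" are valid path segments.
func escapeKey(key string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
}

// annotationPatch builds the ops that set one annotation. When the namespace
// has no annotations entry yet, an extra op creates the parent object first.
func annotationPatch(hasAnnotations bool, key, value string) []patchOp {
	var ops []patchOp
	if !hasAnnotations {
		ops = append(ops, patchOp{Op: "add", Path: "/metadata/annotations", Value: map[string]string{}})
	}
	ops = append(ops, patchOp{Op: "add", Path: "/metadata/annotations/" + escapeKey(key), Value: value})
	return ops
}

func main() {
	// "example.com/owner" is a made-up annotation key.
	body, err := json.Marshal(annotationPatch(false, "example.com/owner", "viz"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```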

---------
Update: Merged-in the approved #6643, #6665 and #6669 which address the `linkerd2-cni`, `linkerd-multicluster` and `linkerd-jaeger` charts. 

Additional changes from what's already mentioned above:
- Removes the install-namespace option from `linkerd install-cni`, which isn't found in `linkerd install` nor `linkerd viz install` anyway, and it would add some complexity to support.
- Added a dependency on the `partials` chart to the `linkerd-multicluster-link` chart, so that we can tap on the `partials.namespace` helper.
- We no longer have the restriction that the multicluster objects must live in a separate namespace from linkerd. It's still good practice, and that's the default for the CLI install, but I removed that validation.


Finally, as a side-effect, the `linkerd mc allow` subcommand was fixed; it has been broken for a while apparently:

```console
$ linkerd mc allow --service-account-name foobar
Error: template: linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml:16:7: executing "linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml" at <include "partials.annotations.created-by" $>: error calling include: template: no template "partials.annotations.created-by" associated with template "gotpl"
```
---------
Update: see helm/helm#5465 describing the current best-practice

# Core Helm Charts Split

This removes the `linkerd2` chart, and replaces it with the `linkerd-crds` and `linkerd-control-plane` charts. Note that the viz and other extension charts are not concerned by this change.

Also note the original `values.yaml` file has been split into both charts accordingly.

### UX

```console
$ helm install linkerd-crds --namespace linkerd --create-namespace linkerd/linkerd-crds
...
# certs.yaml should contain identityTrustAnchorsPEM and the identity issuer values
$ helm install linkerd-control-plane --namespace linkerd -f certs.yaml linkerd/linkerd-control-plane
```

### Upgrade

As explained in #6635, this is a breaking change. Users will have to uninstall the `linkerd2` chart and install these two, and eventually roll out the proxies (they should continue to work during the transition anyway).

### CLI

The CLI install/upgrade code was updated to be able to pick the templates from these new charts, but the CLI UX remains identical as before.

### Other changes

- The `linkerd-crds` and `linkerd-control-plane` charts now carry a version scheme independent of linkerd's own versioning, as explained in #7405.
- These charts are Helm v3, which is reflected in the `Chart.yaml` entries and in the removal of the `requirements.yaml` files.
- In the integration tests, replaced the `helm-chart` arg with `helm-charts` containing the path `./charts`, used to build the paths for both charts.

### Followups

- Now it's possible to add a `ServiceProfile` instance for Destination in the `linkerd-control-plane` chart.
2021-12-10 15:53:08 -05:00
Alejandro Pedraza e8aa640085
New integration test default-policy-deny (#6931)
Closes #6813, followup to #6846

This runs `./test/integration/install_test.go`, installing linkerd with
`--set policyController.defaultAllowPolicy=deny` and subsequently
installing viz, making sure everything is alright.
2021-09-23 07:54:53 -05:00
Tarun Pothulapati 174c50a235
integrationtests: check for stat output of policy resources (#6890)
This PR adds a new `policy` integration test suite
to check the output of the `srv` and `saz` commands and
make sure they have the expected format.

This is mainly needed to make sure things work as expected
with the stat cmd output without having to do manual testing.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-09-17 16:37:44 +05:30
Tarun Pothulapati 45478b6db8
viz: support `stat` on new policy resources (#6785)
Fixes #6733

As policy resources provide a grouping, statistics summaries should
also be allowed on these groupings, which are useful to the user. Their
being port-specific provides a good way to break down these metrics
further.

This PR adds support for policy resources, i.e. `server` and `serverauthorization`,
on the `stat` command.

## Changes

This adds a new path in the `stat_summary.go` file to handle policy
objects. I tried to see if we could re-use some of the other paths,
but some of the labels seem to differ and hence a different path
had to be created. We can try to refactor and merge them though.

We support both request and TCP metrics for the `server` resource
while only the former with `serverauthorization` resources
as metrics are generated in this manner.

This also adds these policy objects into the `k8s` package to
register them as known resources.

For both the policy resources, `--from` doesn't work as these
metrics are not exposed from outbound, and there is no way to
query about the client workload from the inbound metrics. `--to`
is supported to get metrics specifically for a destination workload.
(just like on a service)

## Testing

```bash
> curl -sL https://run.linkerd.io/emojivoto.yml | linkerd inject --proxy-log-level debug - | kubectl apply -f -

> kubectl apply -f 897de1a8d5/emojivoto-policy.yml


# Initial values
on  kind-kind  linkerd2 on 🌱 taru [📦📝🤷‍] via 🐼 v1.16.7 via  via ❄️  impure (shell)
➜ ./bin/go-run cli viz stat srv -A -owide                                                                                                         ~/work/linkerd2
NAMESPACE   NAME          UNAUTHORIZED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
emojivoto   emoji-grpc          0.0rps   100.00%   1.8rps           1ms           1ms           3ms          1         188.6B/s         2072.9B/s
emojivoto   prom                0.0rps         -        -             -             -             -          -                -                 -
emojivoto   voting-grpc         0.0rps    80.70%   0.9rps           1ms           2ms           3ms          1          91.4B/s           52.7B/s
emojivoto   web-http            0.0rps    90.68%   2.0rps           2ms          10ms          28ms          1         153.7B/s         4509.4B/s

# After changing the `emoji-grpc` authz
on  kind-kind  linkerd2 on 🌱 taru [📦📝🤷‍] via 🐼 v1.16.7 via  via ❄️  impure (shell) took 2s
➜ ./bin/go-run cli viz stat srv -A -owide                                                                                                         ~/work/linkerd2
NAMESPACE   NAME          UNAUTHORIZED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
emojivoto   emoji-grpc          0.3rps   100.00%   1.1rps           0ms           0ms           0ms          1         156.5B/s         1282.4B/s
emojivoto   prom                0.0rps         -        -             -             -             -          -                -                 -
emojivoto   voting-grpc         0.0rps    87.88%   0.6rps           0ms           0ms           0ms          1          53.5B/s           31.5B/s
emojivoto   web-http            0.0rps    61.18%   1.4rps           1ms           2ms           2ms          1         110.2B/s         2195.7B/s

# after changing the `web-http` authz

on  kind-kind  linkerd2 on 🌱 taru [📦📝🤷‍] via 🐼 v1.16.7 via  via ❄️  impure (shell)
➜ ./bin/go-run cli viz stat srv -A -owide                                                                                                         ~/work/linkerd2
NAMESPACE   NAME          UNAUTHORIZED   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
emojivoto   emoji-grpc          0.0rps         -     -             -             -             -          -                -                 -
emojivoto   prom                0.0rps         -     -             -             -             -          -                -                 -
emojivoto   voting-grpc         0.0rps         -     -             -             -             -          -                -                 -
emojivoto   web-http            1.0rps         -     -             -             -             -          -                -                 -

> linkerd  viz stat srv/emoji-grpc -n emojivoto -owide
NAME         SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
emoji-grpc        100.00%   2.0rps           1ms           1ms           1ms          1         199.9B/s         2208.0B/s

> linkerd  viz stat srv/web-http -n emojivoto -owide
NAME      SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
web-http         94.02%   1.9rps           4ms           9ms          10ms          1         152.7B/s         4505.9B/s

> linkerd  viz stat srv -n emojivoto -o wide                                                      
NAME          MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN   READ_BYTES/SEC   WRITE_BYTES/SEC
emoji-grpc         -   100.00%   2.0rps           1ms           1ms           1ms          1         201.6B/s         2209.8B/s
prom               -         -        -             -             -             -          -                -                 -
voting-grpc        -    86.21%   1.0rps           1ms           1ms           1ms          1          98.3B/s           55.9B/s
web-http           -    91.67%   2.0rps           3ms           8ms          10ms          1         157.7B/s         4600.3B/s


> linkerd  viz stat serverauthorization/web-public -n emojivoto
NAME       MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99  
web-http        -    89.83%   2.0rps           3ms           9ms          10ms

> linkerd viz stat saz -n emojivoto
NAME          AUTHORIZATION     MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
emoji-grpc    emoji-grpc             -   100.00%   2.0rps           1ms           1ms           1ms
prom          prom-prometheus        -         -        -             -             -             -
voting-grpc   voting-grpc            -    89.83%   1.0rps           1ms           1ms           1ms
web-http      web-public             -    94.96%   2.0rps           1ms           5ms           9ms

> linkerd viz stat saz/web-public -n emojivoto                                                 
NAME       AUTHORIZATION   MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
web-http   web-public           -    90.00%   2.0rps           1ms           5ms           9ms
```

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-09-15 10:59:36 +05:30
Kevin Leimkuhler 945eaf57e9
Only check proxy version of running pods (#6212)
This changes the data plane and control plane checks to only consider pods that
are in a running state.

This is currently an issue in scenarios when upgrading and an old pod is still
terminating while the upgraded pod is already running. Running the `check`
command will report that the old pod has an unexpected proxy version even though
it is terminating; it should not warn in this case.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-06-03 10:59:07 -04:00
Kevin Leimkuhler 8c66aafe7b
Increase all testing timeouts to 60 minutes (#6134)
2021-05-17 16:49:57 -04:00
Tarun Pothulapati 8db6398442
checks: add proxy checks for core cp and extension pods (#5673)
* checks: add proxy checks for core cp and extension pods

Fixes #5623

This PR adds proxy checks for control-plane and extension pods
when the respective checks are run. This can make sure proxies
are working correctly and are able to communicate.

Currently, the following checks are added:

- proxy status checks
- proxy certificate checks
- proxy version checks

These are the same data-plane proxy checks that were already
present.

These checks result in errors in most cases under integration
tests, because newer versions are available online. This is handled by
templating the check golden files and checking for the known error.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-04-22 11:39:52 +05:30
Alejandro Pedraza 6980e45e1d
Remove the `linkerd-controller` pod (#6039)
* Remove the `linkerd-controller` pod

Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.

## Removals

- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put that under `/pkg/prometheus/test_helper.go`, which is now used by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.

## Health checks

- Removed the `can initialize the client` check given such client is no longer needed. The `linkerd-api` section was left with only the check `control pods are ready`, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.

## Other changes

- Fixed the Control Plane section of the dashboard, taking into account the disappearance of `linkerd-controller` and, previously, of `linkerd-sp-validator`.
2021-04-19 09:57:45 -05:00
Alejandro Pedraza 6b8c17eef1
Move `sp-validator` into the `destination` pod (#5936)
* Move `sp-validator` into the `destination` pod

Fixes #5195

The webhook secrets remain the same, as do the `profileValidator` settings in `values.yaml`, so this doesn't pose any upgrading challenges.
2021-04-16 08:04:11 -05:00
Alejandro Pedraza fae96c56d2
In the `helm-upgrade` int test, install the proper viz chart version (#6001)
The `helm-upgrade` test was installing the current viz chart version from `viz/charts/linkerd-viz` instead of the latest stable published one, before attempting to perform the helm upgrade.

As seen below, we're passing `--helm-chart` and `--helm-stable-chart` as parameters, but only `--viz-helm-chart` for viz:

```console
Test script: [install_test.go] Params: [--helm-path=/home/alpeb/src/pr/version2/linkerd2/bin/helm --helm-chart=/home/alpeb/src/pr/version2/linkerd2/charts/linkerd2 --viz-helm-chart=/home/alpeb/src/pr/version2/linkerd2/viz/charts/linkerd-viz --helm-stable-chart=linkerd/linkerd2 --helm-release=helm-test --upgrade-helm-from-version=stable-2.10.0]
```

So this PR adds a new `--viz-helm-stable-chart` parameter that instructs which version to install before performing the upgrade.

This is affecting #6000 which has changes in the viz chart that are going undetected and are failing the `helm-upgrade` test.
2021-04-08 13:39:33 -05:00
Alejandro Pedraza 9a191fbd7b
Get rid of `CheckDeployments()` in integration tests (#5942)
`CheckDeployments()` verified that some deployment had the appropriate
number of replicas in the Ready state. `CheckPods()` does the same, plus
checking if there were restarts (a single restart returns
`RestartCountError` which only triggers a warning on the calling side,
more restarts trigger a regular error). We were always calling the
former followed by a call to the latter, which is superfluous, so we're
getting rid of the former.

Also, the `testutil.DeploySpec` struct had a `Containers` field for
checking the name of containers, but that wasn't always checked and
didn't really represent a potential error that wouldn't be clearly
manifested otherwise (like in golden files), so that was simplified as
well.
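
As a rough illustration of the restart handling described above, here is a minimal sketch. It is not the actual `testutil` code: the fields of `RestartCountError` and the helpers `checkRestarts` and `severity` are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// RestartCountError reuses the error name mentioned above; its fields are
// assumed for this sketch.
type RestartCountError struct {
	Pod      string
	Restarts int
}

func (e *RestartCountError) Error() string {
	return fmt.Sprintf("pod %s restarted %d time(s)", e.Pod, e.Restarts)
}

// checkRestarts is a hypothetical helper: zero restarts pass, exactly one
// restart yields a RestartCountError, more than one yields a regular error.
func checkRestarts(pod string, restarts int) error {
	switch {
	case restarts == 0:
		return nil
	case restarts == 1:
		return &RestartCountError{Pod: pod, Restarts: restarts}
	default:
		return fmt.Errorf("pod %s restarted %d times", pod, restarts)
	}
}

// severity shows how a caller might downgrade a single restart to a warning.
func severity(err error) string {
	var rce *RestartCountError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &rce):
		return "warning"
	default:
		return "error"
	}
}

func main() {
	for _, n := range []int{0, 1, 3} {
		fmt.Printf("restarts=%d -> %s\n", n, severity(checkRestarts("web-0", n)))
	}
}
```

Using a dedicated error type lets the calling side decide the severity with `errors.As`, without parsing error strings.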
2021-03-23 13:53:38 -05:00
Alejandro Pedraza 711124159a
Fix helm-upgrade and upgrade-stable integration tests (#5891)
* Fix helm-upgrade integration test

Update `install_test.go` now that the upgrade test is done from 2.10.
This also implied installing viz right after core.

Refactored `HelmInstallPlain()` into `HelmCmdPlain()` to work for both
helm install and upgrade.

* Add expected heartbeat config entry to the upgrade-stable test, and remove testCheckCommand arg (for lint)
2021-03-12 08:20:45 -05:00
Alejandro Pedraza af5aa70e68
Helm integration tests improvements, and other flakiness fixes to CI (#5857)
- Get rid of all the custom settings passed through `--set` during `helm install`, and instead let the default mechanisms in the templates kick in
- Before installing `linkerd-viz` through Helm, wait on _all_ the core components to be ready (this is what might have been causing the restarts seen in CI)
- Show full output when `linkerd jaeger check` fails, and do some cleanup before triggering the tracing tests. But then I decided to temporarily disable that test till we figure out what's the deal.
2021-03-02 20:11:00 -05:00
Dennis Adjei-Baah 15d1809bd0
Remove linkerd prefix from extension resources (#5803)
* Remove linkerd prefix from extension resources

This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2021-02-25 11:01:31 -05:00
Tarun Pothulapati ac67a2c5d7
tests: add external-prometheus integration test (#5720)
* tests: add external-prometheus integration test

Fixes #5659

Though most control-plane components don't differentiate between the default
and an external Prometheus, there have been issues with the CLI and an external
Prometheus, e.g. in `check`.

This PR adds an e2e deep integration test for external Prometheus,
thereby running the viz and non-viz command integration tests on a linkerd-viz
install with an external Prometheus instance.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-16 23:20:58 +05:30
Tarun Pothulapati a393c42536
values: removal of .global field (#5699)
* values: removal of .global field

Fixes #5425

With the new extension model, we no longer need the `Global` field,
as we don't rely on chart dependencies anymore. This helps us
further clean up Values and makes configuration simpler.

To make upgrades and the use of the new CLI with older configs work,
we add a new method called `config.RemoveGlobalFieldIfPresent` that
is used in the upgrade and `FetchCurrentConfiguration` paths to remove
global field and attach its child nodes if global is present. This is verified
by the `TestFetchCurrentConfiguration`'s older test that has the global
field.

We also don't yet remove .global in some helm stable-upgrade tests for
the initial install to work.
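
The transformation described above (remove `global` and attach its child nodes to the parent) can be sketched as follows. This is not the code of `config.RemoveGlobalFieldIfPresent`: the helper name `liftGlobal`, the use of a plain map instead of serialized values, the sample keys, and the conflict rule are all assumptions of this example.

```go
package main

import "fmt"

// liftGlobal is a hypothetical stand-in for the transformation described
// above: if a "global" map is present, its children are attached to the top
// level and the "global" key is removed. On a key clash the child of "global"
// wins (an assumption of this sketch). A "global" value that is not a map is
// simply dropped.
func liftGlobal(values map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(values))
	for k, v := range values {
		if k != "global" {
			out[k] = v
		}
	}
	global, ok := values["global"].(map[string]interface{})
	if !ok {
		return out // no usable global field: nothing to lift
	}
	for k, v := range global {
		out[k] = v
	}
	return out
}

func main() {
	// Illustrative values only, shaped like an older config with .global.
	old := map[string]interface{}{
		"global": map[string]interface{}{
			"namespace":     "linkerd",
			"clusterDomain": "cluster.local",
		},
		"controllerReplicas": 1,
	}
	fmt.Println(liftGlobal(old))
}
```

The function returns a new map rather than mutating its input, so the same older config can be reused across test cases.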

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-11 23:38:34 +05:30
Dennis Adjei-Baah e4069b47e0
Run extension checks when linkerd check is invoked (#5647)
* Run extension checks when linkerd check is invoked

This change allows the linkerd check command to also run any known
linkerd extension commands that have been installed in the cluster. It
does this by first querying for any namespace that has the label
selector `linkerd.io/extension` and then runs the subcommands for either
`jaeger`, `multicluster` or `viz`. This change runs basic namespace
healthchecks for extensions that aren't part of the Linkerd extension suite.

Fixes #5233
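
The dispatch described above can be sketched roughly as below. The `Namespace` type and the `planChecks` helper are hypothetical, and the sketch assumes that the value of the `linkerd.io/extension` label is the extension name; the real command queries the cluster instead of taking a slice.

```go
package main

import "fmt"

// Namespace is a simplified stand-in for a Kubernetes namespace object.
type Namespace struct {
	Name   string
	Labels map[string]string
}

const extensionLabel = "linkerd.io/extension"

// knownExtensions lists the extensions named above that have their own
// check subcommand.
var knownExtensions = map[string]bool{"jaeger": true, "multicluster": true, "viz": true}

// planChecks is a hypothetical helper: for every namespace carrying the
// extension label it picks either the extension's own check subcommand or,
// for extensions outside the Linkerd suite, basic namespace healthchecks.
func planChecks(namespaces []Namespace) []string {
	var plan []string
	for _, ns := range namespaces {
		ext, ok := ns.Labels[extensionLabel]
		if !ok {
			continue // not an extension namespace
		}
		if knownExtensions[ext] {
			plan = append(plan, fmt.Sprintf("linkerd %s check", ext))
		} else {
			plan = append(plan, fmt.Sprintf("basic namespace healthchecks for %s", ns.Name))
		}
	}
	return plan
}

func main() {
	namespaces := []Namespace{
		{Name: "linkerd-viz", Labels: map[string]string{extensionLabel: "viz"}},
		{Name: "acme-ext", Labels: map[string]string{extensionLabel: "acme"}},
		{Name: "default", Labels: map[string]string{}},
	}
	for _, step := range planChecks(namespaces) {
		fmt.Println(step)
	}
}
```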
2021-02-11 10:50:16 -06:00
Alex Leong dd8e5fc5bc
Rename extension charts to linkerd-* (#5552)
For consistency we rename the extension charts to a common naming scheme:

linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link

We also make the chart files and chart readmes a bit more uniform.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-26 16:20:49 -08:00
Tarun Pothulapati 2087c95dd8
viz: move some components into linkerd-viz (#5340)
* viz: move some components into linkerd-viz

This branch moves the grafana, prometheus, web and tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.

The components in viz are not injected during install time, and
will go through the injector. The `viz install` command does not have any
CLI flags to customize the install directly, but instead follows the Helm
way of customization by using flags such as
`set`, `set-string`, `values`, `set-files`.

**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
 fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.

**Testing**

```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -

# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -

# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```

What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-23 20:17:31 +05:30
Alejandro Pedraza 948aa23b2a
Remove logs comparisons in integration tests (#5223)
The rare cases where these tests were useful don't make up for the burden of
maintaining them, having different k8s versions change the messages and
having unexpected warnings come up that didn't affect the final
convergence of the system.

With this we also revert the indirection added back in #4538 that
fetched unmatched warnings after a test had failed.
2020-11-13 16:00:16 -05:00
Tarun Pothulapati 1fe70dc16d
cli: remove logs subcommand and tests (#5203)
Fixes #5191

The logs command adds an external dependency that we forked to make it work, but
that does not fit within linkerd's core set of responsibilities. Hence, it
is being removed.

For capabilities like this, the Kubernetes plugin ecosystem has better
and well-maintained tools that can be used.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-12 11:23:36 -08:00
Oliver Gould 60a742ab56
tests: Consolidate TestHelper.LinkerdRun error handling (#5057)
Most invocations of `TestHelper.LinkerdRun` don't actually need the stderr
output except to encode it in the error message. This changes this helper
to return an error that includes the full invoked command and error message.

Invocations that need direct access to stderr must call `TestHelper.PipeToLinkerdRun`.
2020-10-15 14:57:03 -07:00
Zahari Dichev de8855c096
More comprehensive injection integration test (#5049)
The purpose of this test is to validate that the auto injector configures the proxy and the additional containers according to the specified config.

This is done by providing a helper that can generate the desired annotations and later inspect an injected pod in order to determine that every bit of configuration has been accounted for. This test is to provide further assurance that #5036 did not introduce any regressions.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-10-14 19:31:56 +03:00
Alejandro Pedraza 0f869f2e50
Ability for int tests to use external certs generated with openssl (#4997)
Adds `bin/certs-openssl`, which creates a self-signed root cert/key and an issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl, and to avoid having to install `step`.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
2020-09-25 11:25:29 -05:00
Tarun Pothulapati 3d900ccc19
Integration test for smi-metrics (#4844)
* Integration test for smi-metrics

This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.

Currently, we store the SMI Helm package locally and run the test on top of it, so
that our CI does not break, and we will periodically update the package
based on newer releases of SMI-Metrics.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 22:49:20 +05:30
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind, which is at `cni-calico.yaml`.

This is different from the CNI integration tests that we run in
cloud integration which performs the CNI level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
Alex Leong d540e16c8b
Make service mirror controller per target cluster (#4710)
This PR moves the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31.  For fuller context, please see that RFC.

Basic multicluster functionality works here including:
* `linkerd mc install` installs the Link CRD but not any service mirror controllers
* `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link
* The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints.
* The `linkerd mc gateways` command lists all linked target clusters, their liveness, and probe latencies.
* The `linkerd check` multicluster checks have been updated for the new architecture.  Several checks have been rendered obsolete by the new architecture and have been removed.

The following are known issues requiring further work:
* the service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror.  it does not yet support configuring a label selector.
* an unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707
* an mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-07-23 14:32:50 -07:00
Alejandro Pedraza 873bd61324
Helm integration deep tests (#4728)
This creates a new integration test target that launches the deep suite,
using a linkerd instance installed through Helm.

I've added a `global.proxyInit.ignoreInboundPorts=1234,5678` override
during install and enhanced the injection test to catch problems like
what we saw in #4679.
2020-07-10 14:48:49 -05:00
Kevin Leimkuhler 4372ed56dd
Isolate tests by cluster and make run interface simpler (#4593)
## Summary

Change the default behavior of integration tests to be isolated by cluster.
Additionally, make running one or all tests easier than the current process.

These changes are explained more in the [Testing
RFC](https://github.com/linkerd/rfc/blob/master/design/0004-isolated-integration-tests.md)

## Changes

This is a script used only by Linkerd developers, but there are many useful
usage examples and explanations in the `bin/tests --help` output:

```
Run Linkerd integration tests.

Optionally specify one of the following tests: [upgrade helm helm-upgrade uninstall deep external-issuer]

Usage:
    tests [--images] [--images-host ssh://linkerd-docker] [--name test-name] [--skip-kind-create] /path/to/linkerd

Examples:
    # Run all tests in isolated clusters
    tests /path/to/linkerd

    # Run single test in isolated clusters
    tests --name test-name /path/to/linkerd

    # Skip KinD cluster creation and run all tests in default cluster context
    tests --skip-kind-create /path/to/linkerd

    # Load images from tar files located under the 'image-archives' directory
    # Note: This is primarly for CI
    tests --images /path/to/linkerd

    # Retrieve images from a remote docker instance and then load them into KinD
    # Note: This is primarly for CI
    tests --images --images-host ssh://linkerd-docker /path/to/linkerd

Available Commands:
    --name: the argument to this option is the specific test to run
    --skip-kind-create: skip KinD cluster creation step and run tests in an existing cluster.
    --images: (Primarily for CI) use 'kind load image-archive' to load the images from local .tar files in the current directory.
    --images-host: (Primarily for CI) the argument to this option is used as the remote docker instance from which images are first retrieved (using 'docker save') to be then loaded into KinD. This command requires --images.
```

### Run all tests

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run single test (upgrade for example):

Current:

```bash
. bin/_test-run.sh
init_test_run $PWD/bin/linkerd
upgrade_integration_tests
```

New:

```bash
bin/tests --name upgrade $PWD/bin/linkerd
```

### Run tests in isolated KinD clusters

Current: Not possible without running single tests in newly created clusters
manually

New:

```bash
bin/tests $PWD/bin/linkerd
```

### Run tests in isolated namespaces on an existing cluster

Old:

```bash
bin/test-run $PWD/bin/linkerd
```

New:

```bash
bin/tests --skip-kind-create $PWD/bin/linkerd
```

## CI

`kind_integration` has been updated so that it does not create a KinD cluster as
part of its test setup.

`cloud_integration` passes the `--skip-kind-create` flag so that the tests are
run serially in a non-KinD cluster.


Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-24 17:06:29 -04:00
Zahari Dichev 904f146558
Multicluster install integration test (#4540)
This PR adds multicluster components to the integration tests.

The existing tests have been modified to pass the `--multicluster` flag so that the entire integration test suite runs with multicluster components.

Currently, the upgrade tests do not have multicluster components installed, but this will be done in a follow-up PR. 

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-24 14:32:22 -04:00