Commit Graph

2080 Commits

Author SHA1 Message Date
Alejandro Pedraza 301429ea9b
Bump KinD to 0.8.1 (#4445)
* Bump KinD to 0.8.1

This brings us K8s 1.18, which is in theory passing all the integration
tests. Currently the tracing one is failing just because of the quay.io
downtime, that hosts the nginx-ingress image.

Re #4382
2020-05-20 14:46:05 -05:00
Alex Leong 9cd4557644
Properly show the meshed count for non-selector services (#4446)
When viewing the output of `linkerd stat` for services which do not have a selector (such as services created by the service-mirror, for example) the meshed count column shows the total number which exist, even though the service actually selects no pods at all.

We update the StatSummary implementation to account for services which have no selector.

Additionally, we update the logic of the `--unmeshed` flag.  When the `--unmeshed` flag is not set, we typically skip rows for unmeshed resources because those resources would have no stats.  This is not appropriate to do when the `--from` flag is also set because in this case, metrics are not collected on the target resource but are instead collected on the client-side.  This means that stats can be present, even for unmeshed resources and these resources should still be displayed, even if the `--unmeshed` flag is not set.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-05-20 10:08:27 -07:00
Tarun Pothulapati be664571c1
Separate grafana image tag in template (#4395)
Separates grafana image field into image.name, image.version and also moves controllerImageVersion to global
2020-05-20 22:27:19 +05:30
Joakim Roubert 960ce556ba
bin/_log.sh: Add shebang to please shellcheck (#4437)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-20 09:55:51 -07:00
Zahari Dichev 31e33d18d3
Enable service mirroring to work in private networks (#4440)
This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway.

Fix #4411

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-20 19:48:36 +03:00
Joakim Roubert de1b5d5a81
install-cni.sh: Fix shellcheck issues (#4405)
Where cat and echo are actually not needed, they have been removed.

Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-05-20 09:29:14 -07:00
Zahari Dichev 6574f124a7
Restrict Service mirror RBACs (#4426)
This PR introduces a few changes that were requested after a bit of service mirror reviewing.

- we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in
- we unify the namespace namings so all multicluster resources are installedi n `linkerd-multicluster` on both clusters
- fixed checks to account for changes

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-20 17:08:01 +03:00
Joakim Roubert ef67cbed38
bin/lint: Fix shellcheck issue (#4434)
Delete variable `os` that is not used. The golangci-lint downloader script does its own extensive platform lookup before downloading the selected binary.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 23:23:25 -07:00
Kevin Leimkuhler d99c1486ba
Lint all markdown files in CI (#4402)
## Motivation

linkerd/rfc#22

## Solution

Use the [markdown-lint-action](https://github.com/marketplace/actions/markdown-linting-action) to lint all `.md` files for all pull requests
and pushes to master.

This action uses the default rules outlined in [markdownlint
package](https://github.com/DavidAnson/markdownlint/blob/master/doc/Rules.md).

The additional rules are added are explained below:
- Ignore line length lints for code blocks
- Ignore line length lints for tables
- Allow duplicate sub-headers in sibling headers (e.g. allowing multiple ##
  Significant headers in `CHANGES.md` as long as they are part of separate
  release headers)

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-19 23:03:50 -07:00
Joakim Roubert 30ba9a1261
bin/fmt: Fix shellcheck issue (#4438)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 14:49:28 -07:00
Joakim Roubert 6f1654a65d
bin/_tag.sh: Fix shellcheck issues (#4436)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 14:49:07 -07:00
Kevin Leimkuhler a457936045
Exit tests if linkerd resources exist (#4397)
## Motivation

As mentioned in the [Testing RFC](https://github.com/linkerd/rfc/blob/master/design/0003-isolated-integration-tests.md#constraints):
> The integration test setup checks require that certain conditions are
> satisfied by the given cluster. A surprising condition is that no
> pre-existing Linkerd installation resource may exist; if it does then it is
> deleted.

## Solution

`init_test_run` which runs before integration tests start will now exit the
script if any Linkerd resources exist on the cluster.

Example bad path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...
Linkerd resources exist on cluster:

pod/hello-6b6b5d644d-xrnhn
pod/hello-slow-cooker-h8xn2
pod/world-fc8f457b7-gj7wq
pod/gateway-676fd64cb9-j57k6
pod/hello-c767bf764-cbdqh
pod/hello-slow-cooker-fqmxr
pod/slow-cooker-ftxdx
pod/t1-855c678bdd-vdg96
pod/t2-76989f94d4-d5fv8
pod/t3-75c8877797-hfwgc
pod/world-6784d4f65c-cn6vl
replicaset.apps/gateway-676fd64cb9
replicaset.apps/hello-c767bf764
replicaset.apps/t1-855c678bdd
replicaset.apps/t2-76989f94d4
replicaset.apps/t3-75c8877797
replicaset.apps/world-6784d4f65c
job.batch/hello-slow-cooker
job.batch/slow-cooker

Help:
    Run [/home/kevin/Projects/linkerd/linkerd2/bin/test-cleanup]
    Specify a cluster context [/home/kevin/Projects/linkerd/linkerd2/bin/test-run /home/kevin/Projects/linkerd/linkerd2/target/cli/linux/linkerd [l5d-integration] [context]]
exit
```

Example good path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...[ok]
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-19 13:59:27 -07:00
Joakim Roubert b2082712b5
bin/update-go-deps-shas: Fix shellcheck issues (#4435)
Pass grep output through xargs.
Use `${0%/*}` instead of `$bindir `since the variable `bindir` exists in
_tag.sh too and then triggers the shellcheck variable modifed warning.
Script uses no bash features and can thus be a POSIX /bin/sh script.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 13:00:34 -07:00
Tarun Pothulapati 5f37a9f7fa
Add global.grafanaUrl for linking existing grafana use-case (#4381)
adds global.grafanaUrl for Bring your own Grafana use-case, with configuration in `linkerd-config-addons`
2020-05-20 00:56:31 +05:30
Joakim Roubert 406107bc87
bin/_docker.sh: Fix shellcheck issues (#4433)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 10:39:41 -07:00
Joakim Roubert 113ccbc9c6
shellcheck: Bump to version 0.7.1 (#4439)
This includes the new download location since the old one is deprecated.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-19 10:24:55 -07:00
Kevin Leimkuhler b407196549
Lint all markdown files (#4403)
## Motivation

Necessary lints for #4402

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-19 09:59:26 -07:00
Joakim Roubert 56484ade8d
bin/test-clouds: Fix shellcheck issues (#4423)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 13:42:46 -07:00
Joakim Roubert 3ef358bb2f
bin/protoc-go.sh: Fix shellcheck error (#4420)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 13:13:41 -07:00
Oliver Gould 4d0d9a55f4
multicluster: Do not use 80 for gateway port (#4428)
Using port `80` opens up services to all sorts of unwanted internet
traffic and, furthermore, we don't even want serve HTTP on this port
since we are always employing Linkerd's mTLS.

This changes the gateway's `incomingPort` to 4180 and the `probePort` to
4181 to fit into Linkerd's other port range being in 41XX.
2020-05-18 22:43:13 +03:00
Joakim Roubert 68e25f2c11
bin/test-clouds-cleanup: Fix shellcheck issues (#4422)
shellcheck will not accept the string DO since it is not sure whether it is a misspelled do command or a string with DO. Explicitly quoting it will mitigate this.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 11:53:24 -07:00
Joakim Roubert cc1279b4ba
Fix SC1090 shellcheck issues in shell script files (#4417)
The SC1090 "Can't follow non-constant source" issue is addressed in the way suggested in shellcheck's documentation; the source paths are pointed out in shellcheck comments. By adding the bin dir to the -P shellcheck CLI parameter, we avoid having to state the bin directory in each and every script file.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 11:32:14 -07:00
Tarun Pothulapati 7eb1c953b0
jaeger and grafana propTypes nit and tests (#4365)
Makes the jaeger and grafana propTypes consistent along with tests.
2020-05-18 23:24:17 +05:30
Joakim Roubert 55326a61ac
bin/web: Fix shellcheck issues (#4425)
Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 10:46:28 -07:00
Joakim Roubert 9c639dc3b7
bin/test-scale: Fix shellcheck issues (#4424)
Remove superfluous echo commands in assignments.
Add quotes.
Simplify the for loops that shellcheck didn't like.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 10:41:49 -07:00
Joakim Roubert 5eba710f54
bin/mkube: Update according to shellcheck suggestions (#4419)
Also clean up sed Windows path filtering.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 10:03:42 -07:00
Joakim Roubert 1e8bfed83f
bin/fmt: Use sort -u instead of sort | uniq (#4418)
No need to pipe output to another program when the functionality
exists in sort.

Signed-off-by: Joakim Roubert <joakimr@axis.com>
2020-05-18 09:52:53 -07:00
Kevin Leimkuhler 659756e93f
Bump golangci-lint version (#4356)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-15 16:22:17 -07:00
Joakim Roubert 0b58a56637
Use -n instead of ! -z in shell scripts (#4404)
Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>
2020-05-15 14:03:06 -05:00
Alejandro Pedraza 41ce6ccdbd
Release notes for edge-20.5.3 (#4396) 2020-05-14 14:25:35 -05:00
Tarun Pothulapati e91dbda287
Add health checks for grafana add-on (#4321)
* Add health checks for grafana add-on

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update testCheck command and fixes

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix checkContainersRunnning function

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* linting fix

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test golden files

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use hc.ControlPlanePods instead of k8s API

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use hc.controlPLanePods directly

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove unnecessary comments

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* proper comments

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update pod checks to use retries

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add values key check

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-05-14 23:18:43 +05:30
Zahari Dichev 4176580a0f
Threadsafe buffering listener (#4359)
* Add thread safety to watcher tests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 20:45:41 +03:00
Alejandro Pedraza 8b0122bf94
Refactor CNI integration tests to use annotations functions (#4363)
Followup to #4341

Replaced all the `t.Error`/`t.Fatal` calls in the integration tests with the
new functions defined in `testutil/annotations.go` as described in #4292,
in order for the errors to produce Github annotations.

This piece takes care of the CNI integration test suite.

This also enables the annotations for these and the general integration
tests, by setting the `GH_ANNOTATIONS` environment variable in the
workflows whose flakiness we're interested on catching: Kind
integration, Cloud integration and Release.

Re #4176
2020-05-14 12:13:07 -05:00
Alejandro Pedraza d0d97e9426
Upgrade to Helm v3 (#4373)
Upgraded to Helm v3.2.1 from v2.16.1, getting rid of Tiller and making
other simplifications.

Note that the version placeholder in the `values.yaml` files had to be
changed from `{version}` to `linkerdVersionValue` because the former
confuses Helm v3.
2020-05-14 12:11:47 -05:00
Kevin Leimkuhler dc5ca1a754
Check that ActualSuccess is greater than 0 in ServiceProfiles test (#4384)
#4217 suggests a retries integration test, but this is already tested as part
of the ServiceProfiles test.

In order to fix this issue, an extra check has been added to the assertion of
the `ActualSuccess` value. It now asserts the value is both greater than 0 and
less than 100.

Closes #4217

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-14 09:56:39 -07:00
Zahari Dichev 115bab9868
Fix gateway update problems (#4388)
* Fix gateway update problems

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 10:59:30 -05:00
Zahari Dichev ef1a2c2b10
Multicluster dashboard for traffic metrics (#4178)
This change adds labels to endpoints that target remote services. It also adds a Grafana dashboard that can be used to monitor multicluster traffic.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 17:48:27 +03:00
Oliver Gould bfe02490ad
proxy: v2.97.0 (#4392)
This release adds special handling for I/O errors in HTTP responses so
that an `errno` label is included to describe the underlying errors
in the proxy's metrics.

---

* Add an `i/o` error label to http metrics (linkerd/linkerd2-proxy#512)
2020-05-13 16:07:12 -07:00
Tarun Pothulapati 640699edda
web: replace jaeger svg with a smaller one (#4360)
* replace jaeger svg with smaller one

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* increase height and width

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* decrease height and width

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-05-13 13:30:42 +05:30
Zahari Dichev fb588e79b5
edge-20.5.2 changes (#4379)
* CLI
  * Added a section to the `linkerd check` that validates that all
    clusters part of a multicluster setup have compatible trust anchors
  * Modified the `inkerd cluster export-service` command to work by
    transforming yaml instead of modifying cluster state
  * Added functionality that allows the `linkerd cluster export-service`
    command to operate on lists of services   
* Controller
  * Changed the multicluster gateway to always require TLS on connections
    originating from outside the cluster
  * Removed admin server timeouts from control plane components, thereby
    fixing a bug that can cause liveness checks to fail
* Helm
  * Moved Grafana templates into a separate add-on chart    
* Proxy
  * Improved latency under high-concurrency use cases.  

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-12 20:46:01 +03:00
Zahari Dichev 01894c700f
Make export-service handle k8s lists (#4370)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-12 09:11:36 +03:00
Oliver Gould a12073d6a8
proxy: v2.96.0 (#4374)
This release reduces latency and CPU consumption, especially for high-
concurrency use cases.

---

* Add middleware that rejects connections with no identity (linkerd/linkerd2-proxy#507)
* Buffer requests while the service is pending (linkerd/linkerd2-proxy#511)
2020-05-11 15:27:25 -07:00
Tarun Pothulapati 45ccc24a89
Move grafana templates into a separate sub-chart as a add-on (#4320)
* adds grafana manifests as a sub-chart

- moves grafana templates into its own chart
- implement add-on interface Grafana struct
- also add relevant conditions for grafana

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove redundant grafana fields in Values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update golden files

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix values issue

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove extra grafanaImage value

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add add-on upgrade tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix golden file tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add grafana field to linkerd-config-addons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* Don't apply nil configuration

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update golden files

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* make checks relaxed for grafana

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test to not test on grafana

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update TestServiceAccountsMatch to contain extra members

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* replace map[string]interface{} with Grafana for better readability

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update golden files

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-05-11 22:22:14 +05:30
Zahari Dichev fd59ce532d
Add better logging to service mirror controller (#4361)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-11 10:30:16 +03:00
Zahari Dichev edd9b654a7
Make gateway require TLS for incoming requests (#4339)
Make gateway require TLS for incoming requests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-11 10:07:48 +03:00
Zahari Dichev 3008f1f87f
Add check for validating that remote clusters share the same trust an… (#4311)
Add check for validating that remote clusters share the same trust anchors

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-11 09:59:15 +03:00
Alex Leong a3c42b1380
remove admin server timeouts (#4350)
The Linkerd control plane components' admin servers have an idle connection timeout of 10 seconds.  This means that they will close connections which have been idle for 10 seconds.  These components are also configured with a 10 second period for liveness checks.  This introduces a race condition where connections will be idle for approximately 10 seconds between liveness checks and can idle out, potentially causing the next liveness check to fail.

We remove the idle timeout so that the connection stays alive.
2020-05-08 12:59:43 -07:00
Alejandro Pedraza f62a2e6ee4
Refactor integration tests to use annotations functions (#4341)
* Refactor integration tests to use annotations functions

First part of #4176

Replaced all the `t.Error`/`t.Fatal` calls in the integration with the
new functions defined in `testutil/annotations.go` as described in #4292,
in order for the errors to produce Github annotations.

Most of these calls have now two strings: one containing a generic error
message and another with a more specific message. The former is what
will be aggregated and seen in the CI reports at
[linkerd2-ci-metrics](https://github.com/linkerd/linkerd2-ci-metrics).

Other changes:

- Improved the annotation generator in `annotations.go` so that the
  message includes the name of the test.
- When a failure from `RetryFor` occurs, log the original timeout so
  we can consider incrementing it when the failure is persistent.
2020-05-08 08:41:42 -05:00
Kevin Leimkuhler d7ca4f886c
Add changes for edge-20.5.1 (#4348)
## edge-20.5.1

* CLI
  * Fixed all commands to use kubeconfig's default namespace if specified
    (thanks @Matei207!)
  * Added multicluster checks to the `linkerd check` command
  * Hid development flags in the `linkerd install` command for release builds
* Controller
  * Added ability to configure Prometheus Altermanager as well as recording
    and alerting rules on the Linkerd Prometheus (thanks @naseemkullah!)
  * Added ability to add more commandline flags to the Prometheus command
    (thanks @naseemkullah!)
* Web UI
  * Fixed TrafficSplit detail page not loading
  * Added Jaeger links to the dashboard when the tracing addon is enabled
* Proxy
  * Modified internal buffering to avoid idling out services as a request
    arrives, fixing failures for requests that are sent exactly once per
    minute--such as Prometheus scrapes

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-05-07 17:21:16 -07:00
Alejandro Pedraza 1a2eaf29dc
Flaky tests: increase timeout for 'linkerd edges' (#4353)
The `linkerd edges` test was being flaky, so gave more slack for it to
succeed.
2020-05-07 18:24:32 -05:00