* Bump KinD to 0.8.1
This brings us K8s 1.18, under which all the integration tests pass, in
theory. Currently the tracing test is failing only because of downtime at
quay.io, which hosts the nginx-ingress image.
Re #4382
When viewing the output of `linkerd stat` for services which do not have a selector (such as services created by the service-mirror, for example) the meshed count column shows the total number which exist, even though the service actually selects no pods at all.
We update the StatSummary implementation to account for services which have no selector.
Additionally, we update the logic of the `--unmeshed` flag. When the `--unmeshed` flag is not set, we typically skip rows for unmeshed resources because those resources would have no stats. This is not appropriate to do when the `--from` flag is also set because in this case, metrics are not collected on the target resource but are instead collected on the client-side. This means that stats can be present, even for unmeshed resources and these resources should still be displayed, even if the `--unmeshed` flag is not set.
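The row-filtering rule described above can be sketched as a small predicate. This is a minimal illustration, not the actual StatSummary code: the `statFlags` type, its fields, and `showRow` are hypothetical names introduced here.

```go
package main

import "fmt"

// statFlags models the two CLI flags discussed above. The type and its
// field names are hypothetical, not taken from the Linkerd source.
type statFlags struct {
	unmeshed bool // --unmeshed was passed
	from     bool // --from was passed
}

// showRow reports whether a row should be displayed for a resource with
// the given number of meshed pods.
func showRow(flags statFlags, meshedPods int) bool {
	if meshedPods > 0 {
		return true // meshed resources are always shown
	}
	// Unmeshed resources are shown when explicitly requested, or when
	// --from is set, because the stats then come from the client side.
	return flags.unmeshed || flags.from
}

func main() {
	fmt.Println(showRow(statFlags{}, 0))           // false
	fmt.Println(showRow(statFlags{from: true}, 0)) // true
}
```

With neither flag set, an unmeshed resource is skipped; setting `--from` alone is enough to keep it in the output.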
Signed-off-by: Alex Leong <alex@buoyant.io>
This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway.
Fix #4411
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR introduces a few changes that were requested after a bit of service mirror reviewing.
- we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in
- we unify the namespace naming so all multicluster resources are installed in `linkerd-multicluster` on both clusters
- fixed checks to account for changes
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Delete variable `os` that is not used. The golangci-lint downloader script does its own extensive platform lookup before downloading the selected binary.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
## Motivation
linkerd/rfc#22
## Solution
Use the [markdown-lint-action](https://github.com/marketplace/actions/markdown-linting-action) to lint all `.md` files for all pull requests
and pushes to master.
This action uses the default rules outlined in [markdownlint
package](https://github.com/DavidAnson/markdownlint/blob/master/doc/Rules.md).
The additional rules that are added are explained below:
- Ignore line length lints for code blocks
- Ignore line length lints for tables
- Allow duplicate sub-headers in sibling headers (e.g. allowing multiple ##
Significant headers in `CHANGES.md` as long as they are part of separate
release headers)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
## Motivation
As mentioned in the [Testing RFC](https://github.com/linkerd/rfc/blob/master/design/0003-isolated-integration-tests.md#constraints):
> The integration test setup checks require that certain conditions are
> satisfied by the given cluster. A surprising condition is that no
> pre-existing Linkerd installation resource may exist; if it does then it is
> deleted.
## Solution
`init_test_run` which runs before integration tests start will now exit the
script if any Linkerd resources exist on the cluster.
Example bad path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...
Linkerd resources exist on cluster:
pod/hello-6b6b5d644d-xrnhn
pod/hello-slow-cooker-h8xn2
pod/world-fc8f457b7-gj7wq
pod/gateway-676fd64cb9-j57k6
pod/hello-c767bf764-cbdqh
pod/hello-slow-cooker-fqmxr
pod/slow-cooker-ftxdx
pod/t1-855c678bdd-vdg96
pod/t2-76989f94d4-d5fv8
pod/t3-75c8877797-hfwgc
pod/world-6784d4f65c-cn6vl
replicaset.apps/gateway-676fd64cb9
replicaset.apps/hello-c767bf764
replicaset.apps/t1-855c678bdd
replicaset.apps/t2-76989f94d4
replicaset.apps/t3-75c8877797
replicaset.apps/world-6784d4f65c
job.batch/hello-slow-cooker
job.batch/slow-cooker
Help:
Run [/home/kevin/Projects/linkerd/linkerd2/bin/test-cleanup]
Specify a cluster context [/home/kevin/Projects/linkerd/linkerd2/bin/test-run /home/kevin/Projects/linkerd/linkerd2/target/cli/linux/linkerd [l5d-integration] [context]]
exit
```
Example good path:
```
Checking the linkerd binary...[ok]
Checking if there is a Kubernetes cluster available...[ok]
Checking if Linkerd resources exist on cluster...[ok]
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Pass grep output through xargs.
Use `${0%/*}` instead of `$bindir` since the variable `bindir` exists in
_tag.sh too and then triggers the shellcheck variable modified warning.
Script uses no bash features and can thus be a POSIX /bin/sh script.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Using port `80` opens up services to all sorts of unwanted internet
traffic and, furthermore, we don't even want to serve HTTP on this port
since we are always employing Linkerd's mTLS.
This changes the gateway's `incomingPort` to 4180 and the `probePort` to
4181 to fit into Linkerd's other port range being in 41XX.
shellcheck will not accept the string DO since it is not sure whether it is a misspelled do command or a string with DO. Explicitly quoting it will mitigate this.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
The SC1090 "Can't follow non-constant source" issue is addressed in the way suggested in shellcheck's documentation; the source paths are pointed out in shellcheck comments. By adding the bin dir to the -P shellcheck CLI parameter, we avoid having to state the bin directory in each and every script file.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Remove superfluous echo commands in assignments.
Add quotes.
Simplify the for loops that shellcheck didn't like.
Signed-off-by: Joakim Roubert <joakimr@axis.com>
Followup to #4341
Replaced all the `t.Error`/`t.Fatal` calls in the integration tests with the
new functions defined in `testutil/annotations.go` as described in #4292,
in order for the errors to produce GitHub annotations.
This piece takes care of the CNI integration test suite.
This also enables the annotations for these and the general integration
tests, by setting the `GH_ANNOTATIONS` environment variable in the
workflows whose flakiness we're interested in catching: Kind
integration, Cloud integration and Release.
Re #4176
Upgraded to Helm v3.2.1 from v2.16.1, getting rid of Tiller and making
other simplifications.
Note that the version placeholder in the `values.yaml` files had to be
changed from `{version}` to `linkerdVersionValue` because the former
confuses Helm v3.
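As a rough illustration of how such a plain-string placeholder can be filled in, the sketch below substitutes `linkerdVersionValue` with a concrete version. The function name `renderValues` and the sample YAML key are assumptions made for this example, not the project's actual rendering code.

```go
package main

import (
	"fmt"
	"strings"
)

// versionPlaceholder is the placeholder named in the text above.
const versionPlaceholder = "linkerdVersionValue"

// renderValues substitutes the placeholder with a concrete version.
// The function name is hypothetical.
func renderValues(valuesYAML, version string) string {
	return strings.ReplaceAll(valuesYAML, versionPlaceholder, version)
}

func main() {
	// The key below is a made-up sample, not copied from values.yaml.
	in := "linkerdVersion: linkerdVersionValue\n"
	fmt.Print(renderValues(in, "edge-20.5.1")) // linkerdVersion: edge-20.5.1
}
```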
#4217 suggests a retries integration test, but this is already tested as part
of the ServiceProfiles test.
In order to fix this issue, an extra check has been added to the assertion of
the `ActualSuccess` value. It now asserts the value is both greater than 0 and
less than 100.
Closes #4217
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This change adds labels to endpoints that target remote services. It also adds a Grafana dashboard that can be used to monitor multicluster traffic.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This release adds special handling for I/O errors in HTTP responses so
that an `errno` label is included to describe the underlying errors
in the proxy's metrics.
---
* Add an `i/o` error label to http metrics (linkerd/linkerd2-proxy#512)
* CLI
* Added a section to the `linkerd check` that validates that all
clusters part of a multicluster setup have compatible trust anchors
* Modified the `linkerd cluster export-service` command to work by
transforming yaml instead of modifying cluster state
* Added functionality that allows the `linkerd cluster export-service`
command to operate on lists of services
* Controller
* Changed the multicluster gateway to always require TLS on connections
originating from outside the cluster
* Removed admin server timeouts from control plane components, thereby
fixing a bug that can cause liveness checks to fail
* Helm
* Moved Grafana templates into a separate add-on chart
* Proxy
* Improved latency under high-concurrency use cases
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This release reduces latency and CPU consumption, especially for high-
concurrency use cases.
---
* Add middleware that rejects connections with no identity (linkerd/linkerd2-proxy#507)
* Buffer requests while the service is pending (linkerd/linkerd2-proxy#511)
The Linkerd control plane components' admin servers have an idle connection timeout of 10 seconds. This means that they will close connections which have been idle for 10 seconds. These components are also configured with a 10 second period for liveness checks. This introduces a race condition where connections will be idle for approximately 10 seconds between liveness checks and can idle out, potentially causing the next liveness check to fail.
We remove the idle timeout so that the connection stays alive.
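A minimal sketch of an admin server without an idle timeout, using Go's standard `net/http`. The constructor name `newAdminServer`, the `/ping` path, and the port are assumptions for illustration and are not taken from the Linkerd source.

```go
package main

import (
	"fmt"
	"net/http"
)

// newAdminServer builds an HTTP server with no idle connection timeout.
// The function name is hypothetical.
func newAdminServer(addr string, handler http.Handler) *http.Server {
	return &http.Server{
		Addr:    addr,
		Handler: handler,
		// IdleTimeout and ReadTimeout are deliberately left at zero.
		// With both at zero, net/http never closes a keep-alive
		// connection for being idle, so a connection reused by a
		// periodic liveness check cannot idle out between checks.
	}
}

func main() {
	mux := http.NewServeMux()
	// The /ping path is a made-up example endpoint.
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "pong")
	})
	srv := newAdminServer(":9990", mux) // the port is an assumption
	fmt.Println(srv.IdleTimeout)        // 0s
}
```

Note that in `net/http` a zero `IdleTimeout` falls back to `ReadTimeout`, so both must be zero for idle connections to stay open indefinitely.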
* Refactor integration tests to use annotations functions
First part of #4176
Replaced all the `t.Error`/`t.Fatal` calls in the integration tests with the
new functions defined in `testutil/annotations.go` as described in #4292,
in order for the errors to produce GitHub annotations.
Most of these calls now have two strings: one containing a generic error
message and another with a more specific message. The former is what
will be aggregated and seen in the CI reports at
[linkerd2-ci-metrics](https://github.com/linkerd/linkerd2-ci-metrics).
Other changes:
- Improved the annotation generator in `annotations.go` so that the
message includes the name of the test.
- When a failure from `RetryFor` occurs, log the original timeout so
we can consider incrementing it when the failure is persistent.
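The two-message pattern can be sketched as below. This is not the code in `testutil/annotations.go`: the function names, parameters, and the exact message layout are assumptions. It relies only on the GitHub Actions `::error file=...,line=...::message` workflow command and on the `GH_ANNOTATIONS` environment variable mentioned in these changes.

```go
package main

import (
	"fmt"
	"os"
)

// annotation formats a GitHub Actions error workflow command that carries
// the test name and the generic message. Name and layout are hypothetical.
func annotation(file string, line int, testName, generic string) string {
	return fmt.Sprintf("::error file=%s,line=%d::%s - %s", file, line, testName, generic)
}

// report always prints the specific message for the test log, and prints
// the annotation only when GH_ANNOTATIONS is set in the environment.
func report(file string, line int, testName, generic, specific string) {
	fmt.Println(specific)
	if os.Getenv("GH_ANNOTATIONS") != "" {
		fmt.Println(annotation(file, line, testName, generic))
	}
}

func main() {
	// The file, line, test name, and messages are made-up sample values.
	report("install_test.go", 42, "TestInstall",
		"failed to install", "failed to install: timed out after 30s")
}
```

Keeping the generic message free of run-specific details is what lets identical failures be grouped when the annotations are aggregated.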
## edge-20.5.1
* CLI
* Fixed all commands to use kubeconfig's default namespace if specified
(thanks @Matei207!)
* Added multicluster checks to the `linkerd check` command
* Hid development flags in the `linkerd install` command for release builds
* Controller
* Added ability to configure Prometheus Alertmanager as well as recording
and alerting rules on the Linkerd Prometheus (thanks @naseemkullah!)
* Added ability to add more commandline flags to the Prometheus command
(thanks @naseemkullah!)
* Web UI
* Fixed TrafficSplit detail page not loading
* Added Jaeger links to the dashboard when the tracing addon is enabled
* Proxy
* Modified internal buffering to avoid idling out services as a request
arrives, fixing failures for requests that are sent exactly once per
minute--such as Prometheus scrapes
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>