2027 Commits

Author SHA1 Message Date
EMEHINOLA Idowu 920c742a68
Idowu Emehinola <idowu@deimos.co.za> (#4246)
Signed-off-by: hydeenoble <hydeenoble39@gmail.com>
2020-04-12 09:46:26 -07:00
Zahari Dichev 26c14d3c66
Detect changes in addresses when getting updates in endpoints watcher (#4104)
Detect changes in addresses when getting updates in endpoints watcher

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-10 11:42:39 +03:00
Alex Leong 7b9d475ffc
Gate SMI-Metrics behind an install flag (#4240)
This change adds a `--smi-metrics` install flag which controls if the SMI-metrics controller and associated RBAC and APIService resources are installed.  The flag defaults to false and is hidden.

We plan to remove this flag or default it to true if and when the SMI-Metrics integration graduates from experimental.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-09 14:34:08 -07:00
Tarun Pothulapati d35a98cb2b
Fix routes wide output formatting for empty values (#4239)
* use wider template string for empty values when -o wide

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-07 14:35:55 -05:00
Alejandro Pedraza 322ba5fd2f
`linkerd uninstall` errors when attempting to delete PSP (#4234)
* Bug in `linkerd uninstall` when attempting to delete PSP

We were using the wrong apiVersion for PSP in `linkerd uninstall`'s
output, which prevented that resource from being removed:

```
$ linkerd uninstall | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller"
deleted
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination"
deleted
...
mutatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-proxy-injector-webhook-config" deleted
validatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-sp-validator-webhook-config" deleted
namespace "linkerd" deleted
error: unable to recognize "uninstall.yml": no matches for kind
"PodSecurityPolicy" in version "extensions/v1beta1"

$ kubectl get psp -oname
podsecuritypolicy.policy/linkerd-linkerd-control-plane
```

I've also replaced the uninstall integration test with a new separate
suite that performs the installation, waits for it to be ready,
uninstalls, and then confirms `linkerd check --pre` returns as expected.
2020-04-07 11:01:11 -05:00
Zahari Dichev d6460cf0fb
Update upgrade test certs (#4236)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-06 20:15:06 +03:00
Alejandro Pedraza be558b6869
Don't include git SHA in cloud_integration_tests names - part 2 (#4231)
Followup to #4230

I forgot to make the same change to the `release.yml` workflow
2020-04-02 19:43:27 -05:00
Alejandro Pedraza 7a6e7c3b38
edge-20.4.1 (#4229)
* edge-20.4.1

This release introduces some cool new functionalities, all provided by our
awesome community of contributors! Also two bugs were fixed that were introduced
since edge-20.3.2.

* CLI
  * Added `linkerd uninstall` command to uninstall the control plane (thanks
    @Matei207!)
  * Fixed a bug causing `linkerd routes -o wide` to not show the proper actual
    success rate
* Controller
  * Fail proxy injection if the pod spec has `automountServiceAccountToken`
    disabled (thanks @mayankshah1607!)
* Web UI
  * Added a route dashboard to Grafana (thanks @lundbird!)
* Proxy
  * Fixed a bug causing the proxy's inbound to spuriously return 503 timeouts
2020-04-02 18:54:46 -05:00
Alejandro Pedraza 84a9e2a807
Don't include git SHA in cloud_integration_tests namespaces (#4230)
The `cloud_integration_tests` job was creating its tests under
namespaces containing the git SHA. This is a left-over from when all the
tests ran in the same cluster, which is no longer the case, and thus no
longer needed.

This fixes the [current CI
failure](https://github.com/linkerd/linkerd2/runs/556330879?check_suite_focus=true#step:6:24)
in master.
2020-04-02 18:22:00 -05:00
Oliver Gould 5ad3a4f72c
proxy: v2.91.0 (#4228)
This release fixes a bug introduced in v2.89.0 that could cause spurious
timeouts for inbound proxies that handle HTTP requests for many distinct
domains.

---

* inbound: Do not cache per-endpoint services (linkerd/linkerd2-proxy#469)
2020-04-02 14:48:45 -07:00
Matei David fee70c064b
Add uninstall cmd functionality to cli (#3622) (#4200)
Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-04-02 12:35:39 -05:00
Alex Leong d8eebee4f7
Upgrade to client-go 0.17.4 and smi-sdk-go 0.3.0 (#4221)
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0.  Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.

This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4.  This keeps all of our transitive dependencies synchronized on one version of client-go.

This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code.  I took this opportunity to update our code generation script to properly use the version of code-generator from `go.mod` rather than a hardcoded SHA.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-01 10:07:23 -07:00
Alejandro Pedraza 0c8171d466
Fix bin/kind-load for pull requests (#4222)
* Fix bin/kind-load for pull requests

Followup to #4212

External PRs were failing because:

1) The image tarballs weren't being loaded from the `images-archives`
directory
2) Concurrent calls to `bin/kind` were attempting to download the KinD
binary simultaneously, resulting in a "text file busy" error. To avoid
that, now we just call `bin/kind` synchronously one time beforehand.
2020-04-01 12:04:24 -05:00
Mayank Shah 4429c1a5b1
Update inject to handle `automountServiceAccountToken: false` (#4145)
* Handle automountServiceAccountToken

Return error during inject if pod spec has `automountServiceAccountToken: false`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-04-01 09:39:49 -05:00
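The inject-time check from #4145 can be pictured with a minimal Python sketch. This is not the Go code from the commit: `InjectError`, `check_injectable`, and the dict-based pod spec are hypothetical stand-ins, and treating a missing field as acceptable is an assumption.

```python
class InjectError(Exception):
    """Raised when a pod spec cannot be injected (hypothetical name)."""


def check_injectable(pod_spec: dict) -> None:
    """Reject pod specs that explicitly disable service account token mounting.

    Assumption: only an explicit `False` is rejected; a missing field passes.
    """
    if pod_spec.get("automountServiceAccountToken") is False:
        raise InjectError(
            "automountServiceAccountToken is set to false; refusing to inject"
        )
```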
Oliver Gould 2b8f1b27c2
proxy: v2.90.0 (#4218)
This release restores the `route_actual_response_total` metric, which is
needed for `linkerd routes -o wide`.

---

* Update test certificates (linkerd/linkerd2-proxy#460)
* Use strong_count instead of upgrade on weak Arcs in cache (linkerd/linkerd2-proxy#459)
* Wire authority override coming from discovery (linkerd/linkerd2-proxy#462)
* Update integration tests certs (linkerd/linkerd2-proxy#465)
* Add a `mock-orig-dst` feature flag (linkerd/linkerd2-proxy#466)
* http-metrics: Make latency export optional (linkerd/linkerd2-proxy#467)
* Restore the route_actual_response_total metric (linkerd/linkerd2-proxy#468)
2020-03-31 15:02:26 -07:00
Alejandro Pedraza 22f1606b73
Extract common logic in scripts and CI to load images into KinD (#4212)
Fixes #4206 Followup to #4167

Extract common logic to load images into KinD, from `bin/kind-load`, `bin/install-pr`, `.github/workflows/kind_integration.yml` and `.github/workflows/release.yml`.

Besides removing the duplication, this improves the performance of `bin/kind-load` by loading the images in parallel.

```
Load into KinD the images for Linkerd's proxy, controller, web, grafana, debug and cni-plugin.

Usage:
    bin/kind-load [--images] [--images-host ssh://linkerd-docker]

Examples:

    # Load images from the local docker instance
    bin/kind-load

    # Load images from tar files located in the current directory
    bin/kind-load --images

    # Retrieve images from a remote docker instance and then load them into KinD
    bin/kind-load --images --images-host ssh://linkerd-docker

Available Commands:
    --images: use 'kind load image-archive' to load the images from local .tar files in the current directory.
    --images-host: the argument to this option is used as the remote docker instance from which images are first retrieved
                   (using 'docker save') to be then loaded into KinD. This command requires --images.
```
2020-03-30 16:28:28 -05:00
Alex Lundberg 0d4d2dca65
Add route dashboard to grafana instance (#4155)
* Add route dashboard to grafana instance

Fixes #1737

Signed-off-by: alex lundberg <alex.lundberg@commonbond.co>
2020-03-27 09:16:00 -05:00
Alex Leong 27a4c8a073
edge-20.3.4 (#4204)
This release introduces several fixes and improvements to the CLI.

* CLI
  * Added support for kubectl-style label selectors in many CLI commands (thanks
    @mayankshah1607!)
  * Fixed the path regex in service profiles generated from proto files without
    a package name (thanks @amariampolskiy!)
  * Fixed an error when injecting Cronjobs that have no metadata
  * Relaxed the clock skew check to match the default node heartbeat interval
    on Kubernetes 1.17 and made this check a warning
  * Fixed a bug where the linkerd-smi-metrics pod could not be created on
    clusters with pod security policy enabled
* Internal
  * Upgraded tracing components to more recent versions and improved resource
    defaults (thanks @Pothulapati!)

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-26 14:54:12 -07:00
Alejandro Pedraza 573060bacc
New test for checking SA lists are synced (#4201)
Followup to #4193

This is to verify that the list of SA installed, as well as the list of
SA in the linkerd-psp RoleBinding match the list of expected SA defined
in `healthcheck.go`.
2020-03-26 12:54:31 -05:00
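The idea behind the test in #4201 is a three-way comparison of ServiceAccount lists. A rough Python sketch follows; the function name and the return shape are hypothetical, and the actual test lives in the Go code base.

```python
def sa_list_mismatches(expected, installed, psp_subjects):
    """Compare the expected ServiceAccounts with the installed ones and with
    the subjects of the PSP RoleBinding.

    Returns a dict of differences; an empty dict means all lists agree.
    """
    expected = set(expected)
    installed = set(installed)
    psp_subjects = set(psp_subjects)
    problems = {}
    if installed != expected:
        # Symmetric difference: names present in only one of the two lists.
        problems["installed"] = sorted(installed ^ expected)
    if psp_subjects != expected:
        problems["psp"] = sorted(psp_subjects ^ expected)
    return problems
```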
Naseem 5c1a02fe30
Add Transit to adopters. (#4202)
Signed-off-by: Naseem <naseem@transit.app>
2020-03-25 18:45:23 -07:00
Alejandro Pedraza 0a4df947e6
Add missing PSP for linkerd-smi-metrics (#4193)
The linkerd-smi-metrics ServiceAccount wasn't hooked into linkerd's PSP
resource, which resulted in the linkerd-smi-metrics ReplicaSet failing
to spawn pods:

```
Error creating: pods "linkerd-smi-metrics-574f57ffd4-" is forbidden:
unable to validate against any pod security policy: []
```
2020-03-25 14:28:35 -05:00
Zahari Dichev 10ecd8889e
Set auth override (#4160)
Set AuthOverride when present on endpoints annotation

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-25 10:56:36 +02:00
Alex Leong 8a5984ba8f
Relax clock skew check (#4195)
Fixes #3943

The Linkerd clock skew check requires that all nodes in the cluster have reported a heartbeat within (approximately) the last minute.  However, in Kubernetes 1.17, the default heartbeat interval is 5 minutes.  This means that the clock skew check will often fail in Kubernetes 1.17 clusters.

We relax the check to only require that heartbeats have been detected in the past 5 minutes, matching the default heartbeat interval in Kubernetes 1.17.  We also switch this check to be a warning so that clusters which are configured with longer heartbeat intervals don't see this as a fatal error.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-24 14:19:17 -07:00
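The relaxed check in #4195 amounts to comparing each node's last heartbeat against a five-minute window and reporting the result as a warning. The sketch below illustrates that logic in Python under stated assumptions: the function names, the input mapping, and the result dict are all hypothetical and do not mirror the Go implementation.

```python
from datetime import datetime, timedelta, timezone

# Five minutes, the Kubernetes 1.17 default heartbeat interval named in the commit.
HEARTBEAT_WINDOW = timedelta(minutes=5)


def stale_nodes(last_heartbeats, now=None, window=HEARTBEAT_WINDOW):
    """Return the sorted names of nodes whose last heartbeat is older than `window`.

    `last_heartbeats` maps node name to a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, seen in last_heartbeats.items() if now - seen > window)


def clock_skew_check(last_heartbeats, now=None):
    """Report stale nodes as a warning rather than a fatal error."""
    stale = stale_nodes(last_heartbeats, now)
    return {"ok": not stale, "severity": "warning", "stale_nodes": stale}
```

With the previous one-minute threshold (`window=timedelta(minutes=1)`), a node that last reported two minutes ago would already count as stale, which is the failure mode the commit describes.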
Alejandro Pedraza d6c588f683
Add missing SAs to linkerd check (#4194)
* Add missing SAs to linkerd check

This adds the service accounts `linkerd-destination` and
`linkerd-smi-metrics` that were missing from the "control plane
ServiceAccounts exist" check.
2020-03-24 12:50:54 -05:00
Alex Leong e3bffb31a1
Add more owners of Dockerfiles (#4192)
Fixes #4179 

Changes to Go dependencies will touch all Dockerfiles in the repo which requires approval from the codeowners of each subdirectory.

We revise the codeowners to add more owners for the Dockerfiles so that approval is not required from the subdirectory owners specifically.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-24 09:46:14 -07:00
Alejandro Pedraza eb322dc420
Fix error when injecting Cronjobs that have no metadata (#4180)
When injecting a Cronjob with no
`spec.jobTemplate.spec.template.metadata` we were getting the following
error:

```
Error transforming resources: jsonpatch add operation does not apply:
doc is missing path:
"/spec/jobTemplate/spec/template/metadata/annotations"
```

This only happens to Cronjobs because other workloads force having at
least a label there that is used in `spec.selector` (at least as of v1
workloads).

With this fix, if no metadata is detected, then we add it in the json patch when
injecting, prior to adding the injection annotation.

I've added a couple of new unit tests, one that verifies that this
doesn't remove metadata contents in Cronjobs that do have that metadata,
and another one that tests injection in Cronjobs that don't have
metadata (which I verified failed prior to this fix).
2020-03-23 14:49:50 -05:00
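The fix in #4180 relies on a property of JSON Patch: an `add` operation fails when the parent of the target path does not exist, so missing parents have to be added first. The following Python sketch shows that ordering. It is an illustration only; `build_patch` and the annotation key used in the usage note are hypothetical, and the real logic is in the Go injector.

```python
TEMPLATE_PATH = "/spec/jobTemplate/spec/template"


def build_patch(cronjob: dict, annotations: dict) -> list:
    """Build JSON Patch ops that add `annotations` to a CronJob pod template.

    Missing parent objects are created first, and existing metadata is
    never overwritten.
    """
    template = cronjob["spec"]["jobTemplate"]["spec"]["template"]
    ops = []
    metadata = template.get("metadata")
    if metadata is None:
        ops.append({"op": "add", "path": TEMPLATE_PATH + "/metadata", "value": {}})
        metadata = {}
    if metadata.get("annotations") is None:
        ops.append(
            {"op": "add", "path": TEMPLATE_PATH + "/metadata/annotations", "value": {}}
        )
    for key, value in annotations.items():
        # "~" and "/" inside a key must be escaped in a JSON Pointer (RFC 6901).
        escaped = key.replace("~", "~0").replace("/", "~1")
        ops.append(
            {
                "op": "add",
                "path": f"{TEMPLATE_PATH}/metadata/annotations/{escaped}",
                "value": value,
            }
        )
    return ops
```

For a CronJob whose template has no `metadata`, `build_patch(cronjob, {"example.io/injected": "true"})` yields three operations: one for `metadata`, one for `annotations`, and one for the annotation itself.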
amariampolskiy a46fa05fd7
Generate correct path regex for proto files without package name (#4098)
Signed-off-by: amariampolskiy <amariampolskiy@pushwoosh.com>
2020-03-23 14:21:42 -05:00
dependabot[bot] ab75bcdf07
Bump acorn from 5.7.3 to 5.7.4 in /web/app (#4172)
Bumps [acorn](https://github.com/acornjs/acorn) from 5.7.3 to 5.7.4.
- [Release notes](https://github.com/acornjs/acorn/releases)
- [Commits](https://github.com/acornjs/acorn/compare/5.7.3...5.7.4)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-03-20 11:07:06 -05:00
Mayank Shah 963b9b049a
Add kubectl-style label selectors (#4120)
* Update tap, routes and top commands to support label selectors

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-03-20 10:45:06 -05:00
Kevin Leimkuhler 29db6c12a1
Fix script argument regex (#4188)
Currently the release tag regex matches against arguments that have `edge` or
`stable` as a substring.

It should only match against arguments that are either `edge` or `stable`.

For example, the graceful error handling is not triggered for the following:
```
❯ bin/create-release-tag edge-20.3.3
bin/create-release-tag: line 92: release_tag: unbound variable
```

This PR fixes the regex so that the above results in graceful error handling.

```
❯ bin/create-release-tag edge-20.3.3
Error: valid release channels: edge, stable
Usage:
    bin/create-release-tag edge
    bin/create-release-tag stable 2.4.8
```
2020-03-19 15:13:17 -07:00
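The difference between a substring match and an exact match, which is the bug fixed in #4188, can be shown in a few lines. The real script is a shell script; this Python sketch only illustrates the matching behaviour, and both function names are hypothetical.

```python
import re

CHANNEL = re.compile(r"edge|stable")


def loosely_matches(arg: str) -> bool:
    """Substring match: true for any argument that merely contains a channel name."""
    return CHANNEL.search(arg) is not None


def is_release_channel(arg: str) -> bool:
    """Exact match: true only when the whole argument is `edge` or `stable`."""
    return CHANNEL.fullmatch(arg) is not None
```

Here `loosely_matches("edge-20.3.3")` is true while `is_release_channel("edge-20.3.3")` is false, so only the second form lets the script reach its usage message for that input.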
Tarun Pothulapati 8d64f4e135
Bump Versions of Trace components (#4182)
* Bump Versions of Tracing components
- Jaeger to 1.17.1
- OpenCensus Collector to 0.1.11
* More sane defaults of jaeger resources

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-03-19 16:42:21 -05:00
Eliza Weisman fcc8700be3
Update changelog for edge-20.3.3 (#4187)
## edge-20.3.3

This release introduces new experimental CLI commands for querying metrics using
the Service Mesh Interface (SMI) and for multi-cluster support via service
mirroring.

If you would like to learn more about service mirroring or SMI, or are
interested in experimenting with these features, please join us in
[Linkerd Slack](https://slack.linkerd.io) for help and feedback.

* CLI
  * Added experimental `linkerd cluster` commands for managing multi-cluster
    service mirroring
  * Added the experimental `linkerd alpha clients` command, which uses the
    smi-metrics API to display client-side metrics from each of a resource's
    clients
  * Added retries to some `linkerd check` checks to prevent spurious failures
    when run immediately after cluster creation or Linkerd installation
2020-03-19 12:48:44 -07:00
Zahari Dichev 40a063878d
Service mirror CLI (#4070)
Multicluster CLI tools

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-19 20:08:11 +02:00
Alex Leong 8f82f8c241
Upgrade smi-metrics to v0.2.1 (#4186)
This version contains a fix for a bug that was rejecting all requests on clusters configured with an empty list of allowed client names.  Because smi-metrics is an apiservice, this was also preventing namespaces from terminating.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-19 11:03:09 -07:00
Alejandro Pedraza 1cbc26a2c1
Upgrade golangci-lint to v1.23.8 (#4181)
* Upgrade golangci-lint to v1.23.8

This should help with some timeouts we're seeing in CI.

I fixed some new warnings found in `inject.go` and `uninject.go`.
Also we now have to explicitly disable linting `/controller/gen`.

The linter was also complaining that in `/pkg/k8s/fake.go` the
`spClient.Interface` and `tsclient.Interface` returned in the function
`newFakeClientSetsFromManifests()` aren't used, but I opted to ignore
that to leave them available for future tests.
2020-03-18 09:13:19 -05:00
Alejandro Pedraza 8f79e07ee2
Bump proxy-init to v1.3.2 (#4170)
* Bump proxy-init to v1.3.2

Bumped `proxy-init` version to v1.3.2, fixing an issue with `go.mod`
(linkerd/linkerd2-proxy-init#9).
This is a non-user-facing fix.
2020-03-17 14:49:25 -05:00
Kevin Leimkuhler 10db65bcb3
Update linkerd/stern to fix go.mod parsing (#4173)
## Motivation

I noticed the Go language server stopped working in VS Code and narrowed it
down to `go build ./...` failing with the following:

```
❯ go build ./...
go: github.com/linkerd/stern@v0.0.0-20190907020106-201e8ccdff9c: parsing go.mod: go.mod:3: usage: go 1.23
```

This change updates `linkerd/stern` version with changes made in
linkerd/stern#3 to fix this issue.

This does not depend on #4170, but it is also needed in order to completely
fix `go build ./...`
2020-03-17 11:16:18 -07:00
Kevin Leimkuhler 6369cffacc
Add KinD option to `install-pr` script (#4167)
## Motivation

After #4147 added the `install-pr` script, installing PRs into existing
clusters does not work if that cluster is a KinD cluster.

Changing the script to be able to use KinD, and specifically automate `kind
load` would be helpful!

## Solution

The script can now be used in the following ways.

```
❯ bin/install-pr --help
Install Linkerd with the changes made in a GitHub Pull Request.

Usage:
    --context: The name of the kubeconfig context to use

    # Install Linkerd into the current cluster
    bin/install-pr 1234

    # Install Linkerd into the current KinD cluster
    bin/install-pr [-k|--kind] 1234

    # Install Linkerd into the 'kind-pr-1234' KinD cluster
    bin/install-pr [-k|--kind] --context kind-pr-1234 1234
```

The script assumes that the cluster (KinD or not) has already been created. If
the cluster is a KinD cluster, the `-k|--kind` flag should be passed.

If the `--context` flag is not passed, the install defaults to the current
context (`kubectl config current-context`).

I also added a `[-h|--help]` option that describes how to use the script.
2020-03-17 10:54:33 -07:00
Zahari Dichev 2db307ee91
Remove target port requirement in port resolution (#4174)
This change removes the target port requirement when resolving ports in the dst service. Based on the comments, it seems that we need to have a target port defined in the port spec in order to resolve to the port in the Endpoints. In reality, if the target port is not defined when creating the service, k8s will set the port and the target port to the same value. It seems to me that checking for the targetPort to be different than 0 is a no-op.

Signed-off-by: Zahari Dichev zaharidichev@gmail.com
2020-03-16 23:04:08 +02:00
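The reasoning in #4174 rests on Kubernetes defaulting `targetPort` to `port` when it is omitted. A small Python sketch of port resolution under that assumption is shown below. The function name and the dict layout are hypothetical, named (string) target ports are returned unchanged, and returning the requested port when no spec matches is an assumption of this sketch.

```python
def resolve_target_port(service_ports, port):
    """Map a Service port to the port the backing pods listen on.

    `service_ports` is a list of dicts shaped like entries of a Service's
    `spec.ports`.
    """
    for spec in service_ports:
        if spec["port"] == port:
            # A stored Service always carries a targetPort, because the API
            # server fills it in from `port`; the fallback mirrors that default.
            return spec.get("targetPort", port)
    # Assumption: an unknown port resolves to itself.
    return port
```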
Kevin Leimkuhler e5b0ea28d4
Add retries to certain `linkerd check` checkers (#4171)
## Motivation

Testing #4167 has revealed some `linkerd check` failures that occur only
because the checks happen too quickly after cluster creation or install. If
retried, they pass on the second time.

Some checkers already handle this with the `retryDeadline` field. If a checker
does not set this field, there is no retry.

## Solution

Add retries to the `l5d-existence-replicasets` and
`l5d-existence-unschedulable-pods` checks so that these checks do not fail
during a chained cluster creation > install > check process.
2020-03-16 13:15:42 -07:00
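The retry behaviour described in #4171 (retry until a deadline, or run once when no deadline is set) can be sketched as follows. This is a generic Python illustration, not the Go `retryDeadline` implementation; `run_with_retry` and its parameters are hypothetical.

```python
import time


def run_with_retry(check, retry_deadline=None, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Run `check` until it succeeds or `retry_deadline` has passed.

    `retry_deadline` is a value on the `clock` timeline. When it is None the
    check runs exactly once and any failure propagates immediately.
    """
    while True:
        try:
            return check()
        except Exception:
            if retry_deadline is None or clock() >= retry_deadline:
                raise
            sleep(interval)
```

A check that fails only because the cluster is not ready yet passes on a later attempt when a deadline is supplied, and fails immediately when it is not.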
Thomas Rampelberg 18b6e4a723
Update maintainers and codeowners (#4166)
* Update maintainers and codeowners

* Drop wildcard
2020-03-12 15:48:19 -07:00
Alex Leong 794abfe0d4
Add alpha clients command (#4157)
We add the `linkerd alpha clients` command which displays client side metrics from each of a resource's clients.  This allows you to see who all of your clients are and see what your resource's metrics look like from your clients' point of view.  Since these metrics are measured on the client-side, they include network latency.

```
> linkerd alpha clients deploy/web -n emojivoto
FROM                TO   SUCCESS        RPS  LATENCY_P50  LATENCY_P90  LATENCY_P99
vote-bot.emojivoto  web   97.50%     2.0rps          4ms          5ms          5ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-12 13:45:34 -07:00
Zahari Dichev 7c0e6a86c7
Add changes for edge-20.3.2 (#4164)
Add changes for edge-20.3.2

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-12 15:36:34 +02:00
cpretzer 0b8272bdf5
concatenate additional url paths to protohttp.TapReqToUrl (#4151)
* concatenate additional url paths to protohttp.TapReqToUrl

Signed-off-by: Charles Pretzer <charles@buoyant.io>

* fix formatting
2020-03-11 10:34:05 -07:00
Alex Lundberg 80935f0ba7
update helm chart to use proxyInit version v1.3.1 (#4153)
Update helm chart to use proxyInit version v1.3.1
2020-03-11 09:58:07 -07:00
Lewis Cowper 5ca9bc6db5
Support configuration of service profile timeouts (#4072)
This change is in a similar vein to #4052 which provided support for
configuring service profile retries via a vendor extension of
`x-linkerd-retryable`, when generating from an openapi specification.

This change is very similar to the final version of that pull request,
and adds a timeout value based on `x-linkerd-timeout`.

At this point I believe that if the timeout is not specified then the
default provided by linkerd of 30s will apply anyway, but won't
explicitly be reflected in the service profile, which I'm somewhat okay
with as a current state, but I think there's a potential future
improvement that the default timeout is always shown when generating
from an open api spec, but that's more to make it clear and obvious that
that timeout exists.

Signed-off-by: Lewis Cowper <lewis.cowper@googlemail.com>
2020-03-10 13:22:26 -07:00
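The mapping described in #4072 can be sketched in Python: an OpenAPI operation's vendor extensions become route fields, and a route without `x-linkerd-timeout` simply carries no explicit timeout. The function name and the route naming scheme are hypothetical, and the output dict only approximates the shape of a service profile route.

```python
def route_from_operation(method, path_regex, operation):
    """Build a service-profile-style route dict from one OpenAPI operation.

    `operation` is the parsed operation object, including vendor extensions.
    """
    route = {
        "name": f"{method} {path_regex}",  # hypothetical naming scheme
        "condition": {"method": method, "pathRegex": path_regex},
    }
    if operation.get("x-linkerd-retryable"):
        route["isRetryable"] = True
    timeout = operation.get("x-linkerd-timeout")
    if timeout is not None:
        # Only an explicit value is written; otherwise the default applies
        # without being reflected in the generated profile.
        route["timeout"] = timeout
    return route
```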
Oliver Gould bbca18492e
proxy: v2.89.0 (#4163)
This release builds on changes in the prior release to ensure that
balancers process updates eagerly.

Cache capacity limitations have been removed; and services now fail
eagerly, rather than making all requests wait for the timeout to expire.

Also, a bug was fixed in the way the `LINKERD2_PROXY_LOG` env variable
is parsed.

---

* Introduce a backpressure-propagating buffer (linkerd/linkerd2-proxy#451)
* trace: update tracing-subscriber to 0.2.3 (linkerd/linkerd2-proxy#455)
* timeout: Introduce FailFast, Idle, and Probe middlewares (linkerd/linkerd2-proxy#452)
* cache: Let services self-evict (linkerd/linkerd2-proxy#456)
2020-03-10 13:02:44 -07:00
Kevin Leimkuhler 88cafa36c6
Upload artifacts for all PRs (#4159)
## Motivation

#4147 adds a script for setting up a local cluster that uses the images built
from the changes introduced in a forked PR. This would be useful for all PRs.

In order to install Linkerd from a PR into a local cluster, the images still
need to be built at some point. If you happen to have SSH config setup for our
Packet host, you can pull them from there. That is not very
accessible--requiring that someone adds you as a user--so we can take a
similar approach to forked PRs.

## Solution

All PRs now make an artifact directory that is uploaded as part of the KinD
integration tests. This way, the `install-pr` script can use those images no
matter if the PR is a fork or not.
2020-03-10 12:44:27 -07:00
Alex Leong df59448046
Use curl (#4162)
We use curl for fetching remote files in our `bin` scripts.  Replace the use of `wget` with `curl` in `bin/shellcheck` for consistency.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-10 12:39:12 -07:00
Alex Lundberg c8dd369afd
update helm chart README with enforcedHostRegexp and controllerImageV… (#4154)
* update helm chart README with enforcedHostRegexp and controllerImageVersion

Signed-off-by: alex lundberg <alex.lundberg@commonbond.co>
2020-03-10 14:19:58 -05:00