2635 Commits

Author SHA1 Message Date
Alejandro Pedraza a14f3f4eec
Add the 'tapInjector.logLevel' value. (#5713)
Fixes #5686

Test:
```bash
$ linkerd viz install --set tapInjector.logLevel=debug | k apply -f -

# and then when creating a pod we can see debug log entries such as:

time="2021-02-10T16:19:28Z" level=debug msg="admission request: &AdmissionRequest{UID:c5e95e8d-...
```
2021-02-11 09:21:41 -05:00
Alejandro Pedraza e887cc79ea
Replace Addons section with Extensions in main chart README (#5714)
Fixes #5704
2021-02-11 09:20:27 -05:00
Oliver Gould 8f2d01c5c0
Use fully-qualified DNS names in proxy configuration (#5707)
Pods with unusual DNS configurations may not be able to resolve the
control plane's domain names. We can avoid search path shenanigans by
adding a trailing dot to these names.
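
The trailing-dot idea can be sketched as follows. This is an illustration in Python only, not the actual change (which touches the proxy configuration); the helper name `fully_qualify` is hypothetical.

```python
def fully_qualify(name: str) -> str:
    """Return `name` with a trailing dot so resolvers treat it as absolute.

    A name that already ends in "." is returned unchanged; any other name
    gets a single dot appended, which tells the resolver not to apply the
    search path from resolv.conf.
    """
    return name if name.endswith(".") else name + "."
```

With this helper, a name such as `svc.ns.svc.cluster.local` becomes `svc.ns.svc.cluster.local.`, and applying it twice has no further effect.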
2021-02-10 08:27:35 -08:00
Oliver Gould 5437e23706
proxy: v2.132.0 (#5705)
This release updates the proxy to address [RUSTSEC-2021-0020][1].

This release also updates the proxy's gateway mode to support arbitrary
TCP traffic so that multicluster gateways are no longer limited to HTTP
traffic.

[1]: https://rustsec.org/advisories/RUSTSEC-2021-0020

---

* outbound: Split outbound::server into multiple modules (linkerd/linkerd2-proxy#899)
* Simplify opencensus stack composition (linkerd/linkerd2-proxy#900)
* Simplify outbound stack composition with builders (linkerd/linkerd2-proxy#901)
* Simplify inbound stack composition (linkerd/linkerd2-proxy#902)
* Remove outbound loop prevention (linkerd/linkerd2-proxy#903)
* tracing: update to 0.1.23, remove `tracing-futures` (linkerd/linkerd2-proxy#904)
* Update Hyper for RUSTSEC-2021-0020 (linkerd/linkerd2-proxy#905)
* Improve stack builder ergonomics (linkerd/linkerd2-proxy#906)
* outbound: Unify the TCP logical stack (linkerd/linkerd2-proxy#907)
* gateway: Transport TCP connections (linkerd/linkerd2-proxy#909)
2021-02-10 07:47:52 -08:00
Alex Leong 2d2ebb255b
Addition of K3 Business Technologies (#5691)
Signed-off-by: Alex Leong <alex@buoyant.io>
2021-02-10 10:10:22 -05:00
Nathan J Mehl 3534a902ce
Fix spelling issue w/ sidecarContainers key in linkerd-viz helm chart (#5674)
The `sidecarContainers` chart value key is misspelled as
`sideCarContainers` in the default values file and the README. Anyone
who uses the misspelled key will be confused and unhappy to see that
their sidecar container is not added to the pod, because the Helm
template is looking for `.Values.prometheus.sidecarContainers`.

Change the capital `C` in `sideCar` to lower-case, as the template expects. :)
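
The failure mode comes from key lookups being case-sensitive. A rough Python sketch of the effect, assuming a values structure shaped like the chart's (the `exporter` container and the `sidecars` helper are made up for illustration):

```python
# Hypothetical user-supplied values that use the misspelled key.
user_values = {"prometheus": {"sideCarContainers": [{"name": "exporter"}]}}


def sidecars(values: dict) -> list:
    """Look the key up the way a case-sensitive template would."""
    return values.get("prometheus", {}).get("sidecarContainers", [])
```

Here `sidecars(user_values)` returns an empty list: the misspelled key is silently ignored rather than rejected, which is why the sidecar never shows up in the pod.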

Signed-off-by: Nathan J. Mehl <n@oden.io>
2021-02-09 13:42:43 -05:00
Kevin Leimkuhler 75fcc9d623
Move tap from core into Viz extension (#5651)
Closes #5545.

This change moves all tap and tap-injector code into the viz directory. 

The tap and tap-injector components now also use a new tap image—separating
these components from the controller image that they are currently part of. This
means the controller image has removed all its build dependencies related to
tap.

Finally, the tap Protobuf has been separated from the metrics-api and moved into
its own `.proto` file and gen directory. This introduces a clear split between
metrics-api and tap Protobuf.

There is no change in behavior for the `viz tap` command.

### Reviewing

#### Docker images

All the bin directory scripts should be updated to build and load the tap image.
All the CI workflows should be updated to build and push the tap image.

#### Controller and pkg directories

This is primarily deletions. Most of the deleted code in this directory is now
in the tap directory of the Viz extension.

#### viz/tap

This is where all the tap-related code now lives. New files are
mostly moved from the controller and pkg directories. Imports have all been
updated to point at the right locations and Protobuf.

The Protobuf here is taken from metrics-api and contains all tap-related
Protobuf.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-09 12:43:21 -05:00
Dean Lindqvist Todevski 3542269e78
Add Zimpler to the adopters list. (#5700)
Signed-off-by: Dean Lindqvist Todevski <dean@zimpler.com>
2021-02-09 10:49:23 -05:00
Alejandro Pedraza a04b30d2ab
Simplify SelfCheck API (#5665)
Fixes #5575

Now that only viz makes use of the `SelfCheck` api, merged the `healthcheck.proto` into `viz.proto`.

Also removed the "checkRPC" functionality, which handled multiple API responses and was only used by `SelfCheck`, because the extra complexity was not warranted. Reverted to the plain vanilla "check", which just concatenates the error responses.
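
The "concatenate the error responses" approach can be sketched like this. It is a simplified Python illustration, not the Go code in this change; `self_check` and the probe callables are hypothetical names.

```python
from typing import Callable, List, Optional


def self_check(probes: List[Callable[[], Optional[str]]]) -> Optional[str]:
    """Run every probe and join the error messages, one per line.

    Each probe returns an error string on failure or None on success.
    The result is None when all probes succeed.
    """
    errors = [msg for msg in (probe() for probe in probes) if msg]
    return "\n".join(errors) if errors else None
```

A single check result therefore carries zero, one, or several error lines, which matches the shape of the failure outputs shown in this commit message.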

## Success Output

```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
√ viz extension self-check
```

## Failure Examples

Failure when viz fails to connect to the k8s api:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
    Error calling the Kubernetes API: someerror
    see https://linkerd.io/checks/#l5d-api-control-api for hints

Status check results are ×
```

Failure when viz fails to connect to Prometheus:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
    Error calling Prometheus from the control plane: someerror
    see https://linkerd.io/checks/#l5d-api-control-api for hints

Status check results are ×
```

Failure when viz fails to connect to both the k8s api and Prometheus:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
    Error calling the Kubernetes API: someerror
    Error calling Prometheus from the control plane: someerror
    see https://linkerd.io/checks/#l5d-api-control-api for hints

Status check results are ×
```
2021-02-05 10:13:45 -05:00
Tarun Pothulapati 704ed00a49
viz: make checks aware of prom and grafana being optional (#5627)
Fixes #5618

Currently, the linkerd-viz checks fail whenever an external
Prometheus is being used, as those checks are not aware of
Prometheus and Grafana being optional.

This commit fixes this by making the Prometheus and Grafana
checks separate and non-fatal. These checks can also be made
dynamic and run only if those components are available.

This commit also adds some of the missing resource checks,
especially that of the new `metrics-api`, into the viz checks.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-05 11:26:44 +05:30
Alex Leong c6536996f7
Add linkerd repair command (#5636)
In the stable-2.9.0, stable-2.9.1, and stable-2.9.2 releases, the `linkerd-config-overrides` secret is missing the `linkerd.io/control-plane-ns` label.  This means that if a `linkerd upgrade` is performed to one of these versions using the `--prune` flag, then the secret will be deleted.  Missing this secret will prevent any further upgrades.

We add a `linkerd repair` command which recreates the `linkerd-config-overrides` secret by fetching the installed values from the `linkerd-config` configmap and then re-populating the redacted identity values from the `linkerd-identity-issuer` secret.
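
The repopulation step can be pictured roughly as below. This is a Python sketch under stated assumptions: the `REDACTED` marker, the `identity` key, and the flat field layout are illustrative and do not reflect the real structure of the configmap or the secret.

```python
import copy

REDACTED = "REDACTED"  # assumed placeholder for a redacted value


def repair_values(installed: dict, issuer: dict) -> dict:
    """Rebuild override values from installed values plus issuer secrets.

    Starts from a copy of the installed values, then fills every identity
    field that is missing or redacted with the value from the issuer data.
    """
    repaired = copy.deepcopy(installed)
    identity = repaired.setdefault("identity", {})
    for key, value in issuer.items():
        if identity.get(key, REDACTED) == REDACTED:
            identity[key] = value
    return repaired
```

Fields that were never redacted are left untouched, and the input dictionaries are not modified.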

Usage:

```bash
linkerd repair | kubectl apply -f -
```

To test:
```
# Set Linkerd version to stable-2.8.0
> linkerd install | kubectl apply -f -
# Set Linkerd version to stable-2.9.1
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command fails)
# Set Linkerd version to HEAD
> linkerd repair | kubectl apply -f -
# Set Linkerd version to stable-2.9.2
> linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f -
(Command succeeds)
> linkerd check
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-02-04 16:47:04 -08:00
Alejandro Pedraza 59eca5bb82
Use full-length actions SHAs in CI (#5668)
GitHub now requires that actions pinned to a SHA use the full-length SHA,
otherwise CI throws a warning like:
```
actions/checkout@722adc6 looks like the shortened version of a commit
SHA. Referencing actions by the short SHA will be disabled soon. Please
see
https://docs.github.com/en/actions/learn-github-actions/security-hardening-for-github-actions#using-third-party-actions.
```
2021-02-04 16:51:53 -05:00
Alejandro Pedraza 565f32d429
## edge-21.2.1 (#5667)
This edge release continues improving the proxy's diagnostics and also avoids
timing out when the HTTP protocol detection fails. Additionally, old resource
versions were upgraded to avoid warnings in k8s v1.19. Finally, it comes with
lots of CLI improvements detailed below.

* Improved the proxy's diagnostic metrics to help us get better insights into
  services that are in fail-fast
* Improved the proxy's HTTP protocol detection to prevent timeout errors
* Upgraded CRD and webhook config resources to get rid of warnings in k8s v1.19
  (thanks @mateiidavid!)
* Added viz components into the Linkerd Health Grafana charts
* Had the tap injector add a `viz.linkerd.io/tap-enabled` annotation when
  injecting a pod, which allowed providing clearer feedback for the `linkerd
  tap` command
* Had the jaeger injector add a `jaeger.linkerd.io/tracing-enabled` annotation
  when injecting a pod, which also allowed providing better feedback for the
  `linkerd jaeger check` command
* Improved the `linkerd uninstall` command so it fails gracefully when there
  still are injected resources in the cluster (a `--force` flag was provided
  too)
* Moved the `linkerd profile --tap` functionality into a new command `linkerd
  viz profile --tap`, given tap now belongs to the viz extension
* Expanded the `linkerd viz check` command to include data-plane checks
* Cleaned-up YAML in templates that was incompatible with SOPS (thanks
  @tkms0106!)
2021-02-04 15:10:26 -05:00
Alejandro Pedraza 7cd5bf9e10
bin/test-cleanup reordering (#5666)
After #5642 `linkerd uninstall` will fail if there still are injected
pods in the cluster. So uninstall should happen last.
2021-02-04 14:01:01 -05:00
Alejandro Pedraza b8ed799372
Include viz components in Prom scrapes, fix Linkerd Health charts (#5656)
Fixes #5429

Expanded the `linkerd-controller` Prometheus scraping config so it also includes the `linkerd-viz` namespace. Also simplified the first relabelling config there removing the `_meta_kubernetes_pod_label_linkerd_io_control_plane_component` source label that wasn't serving any purpose. Just by its own, that extra scraping now allows having non-empty Go charts at the bottom of the `Linkerd Health` charts for the viz components.

Additionally, the `namespace-viz` variable was added into `health.json` which then is leveraged in the queries for the `Control-Plane Traffic` and `Control-Plane TCP Metrics` charts to include the viz pods.

Finally in that same file the queries for the `Data-Plane Telemetry` section were simplified by removing the filter on the `control_plane_ns` label which was redundant.
2021-02-04 09:40:23 -05:00
William Morgan 511ba69f5c
add initial steering committee members (#5640)
Signed-off-by: William Morgan <william@buoyant.io>

Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2021-02-04 08:25:09 -06:00
Kevin Leimkuhler 228d8e9e95
Add tracing enabled annotation (#5643)
This change adds the `jaeger.linkerd.io/tracing-enabled` annotation which is
automatically added by the Jaeger extension's `jaeger-injector`.

All pods that receive this annotation have also had the required environment
variables and volume/volume mounts added by the injector.

The purpose of this annotation is that it will allow `jaeger check` to check for
the presence of this annotation instead of needing to look at the proxy
containers directly. If this annotation is not present on pods, `jaeger check`
can warn users that tracing is not configured for those pods. This is similar to
`viz check` warning users that tap is not configured, which was recently added in #5602.

Closes #5632

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-03 14:05:15 -05:00
Tarun Pothulapati b521091ca7
cli: make `linkerd uninstall` fail when injected pods are present (#5642)
Fixes #5622

This PR updates the `linkerd uninstall` command to check whether there
are any injected pods, and to fail if there are. It also
provides a `--force` flag to skip this check.

Pods from namespaces with the prefix `linkerd` are skipped
so as to not error out for control-plane and extension
pods.
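
The check described above amounts to a filter over the cluster's pods. A minimal Python sketch, assuming pods are given as `(namespace, name, injected)` tuples (a made-up representation; the real command inspects Kubernetes pod objects):

```python
from typing import Iterable, List, Tuple

Pod = Tuple[str, str, bool]  # (namespace, name, injected), hypothetical shape


def blocking_pods(pods: Iterable[Pod], force: bool = False) -> List[str]:
    """Return the injected pods that should make an uninstall fail.

    Pods in namespaces whose name starts with "linkerd" are ignored, and
    force=True skips the check entirely.
    """
    if force:
        return []
    return [
        f"{namespace}/{name}"
        for namespace, name, injected in pods
        if injected and not namespace.startswith("linkerd")
    ]
```

An empty result means the uninstall can proceed; a non-empty one lists the pods to report to the user.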

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-03 12:31:30 +05:30
Oliver Gould 10411d02bb
proxy: v2.131.0 (#5657)
This release changes HTTP protocol detection to prevent timeout errors
in two ways:

1. HTTP detection no longer blocks until a newline is read. We've
   reverted to relying on a single read to make a determination.
2. Detection timeouts are no longer terminal. When a timeout is
   encountered, we continue forwarding the connection as an opaque TCP
   connection.

These changes may lead to false-negatives--we may fail to detect some
HTTP streams--but it should prevent many avoidable detection errors.
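
The two changes can be sketched together as follows. This is a Python illustration of the idea only: the proxy is written in Rust and its real heuristics differ, and `detect_protocol`, the method list, and the `read_once` callable are assumptions made for the example.

```python
from typing import Callable

HTTP1_METHODS = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ", b"OPTIONS ", b"PATCH ")
H2_PREFACE = b"PRI * HTTP/2.0"


def detect_protocol(read_once: Callable[[], bytes]) -> str:
    """Classify a connection from a single read.

    `read_once` returns the first bytes read, or raises TimeoutError. A
    timeout is not an error: the connection is treated as opaque TCP.
    """
    try:
        data = read_once()
    except TimeoutError:
        return "opaque"
    if data.startswith(H2_PREFACE):
        return "h2"
    if data.startswith(HTTP1_METHODS):
        return "http/1"
    # A short first read (e.g. b"GE") lands here: a false negative.
    return "opaque"
```

The last branch shows where the false negatives mentioned above come from: a single short read that splits the request line is forwarded as opaque TCP instead of failing the connection.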

This release also makes improvements for multicluster gateways,
improving caching so that profile lookups are only performed
once per target service.
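
A per-target cache of this kind can be sketched in a few lines. This is a generic Python illustration, not the proxy's implementation; `ProfileCache` and its `lookup` callable are hypothetical, and eviction is left out.

```python
class ProfileCache:
    """Memoize profile lookups so each target is resolved only once."""

    def __init__(self, lookup):
        self._lookup = lookup  # function that performs the real lookup
        self._cache = {}

    def get(self, target):
        # Only call the underlying lookup on the first request per target.
        if target not in self._cache:
            self._cache[target] = self._lookup(target)
        return self._cache[target]
```

Repeated calls to `get` with the same target return the stored result without calling `lookup` again.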

Diagnostic `stack_*` metrics have been moved so that they track
underlying services, ignoring fail-fast. This should help us get better
insights into services that are in fail-fast.

Finally, the opencensus exporter has been improved to ensure that trace
events are flushed if the trace buffer is not filled within a timeout.

---

* actions: Update actions to use full SHAs (linkerd/linkerd2-proxy#885)
* http: Parameterize normalize_uri::DefaultAuthority (linkerd/linkerd2-proxy#886)
* http: Parameterize the HTTP server (linkerd/linkerd2-proxy#887)
* opencensus: rewrite span exporter using async/await (linkerd/linkerd2-proxy#789)
* Update http::Insert to use `Param` (linkerd/linkerd2-proxy#889)
* Update crate dependencies (linkerd/linkerd2-proxy#892)
* stack: Make the router fallible (linkerd/linkerd2-proxy#888)
* Track stack metrics within failfast (linkerd/linkerd2-proxy#891)
* outbound: Avoid building balancers when no concrete name (linkerd/linkerd2-proxy#890)
* inbound: Cache HTTP gateways per destination (linkerd/linkerd2-proxy#893)
* Reorganize the gateway crate (linkerd/linkerd2-proxy#897)
* Bias HTTP detection towards availability (linkerd/linkerd2-proxy#894)
* inbound: Use ALPN to determine transport header (linkerd/linkerd2-proxy#895)
* detect: Return unknown protocol on detection timeout (linkerd/linkerd2-proxy#896)
* Extract protocol detection into the gateway crate (linkerd/linkerd2-proxy#898)
2021-02-02 18:08:22 -08:00
Kevin Leimkuhler df0ce24b12
viz: add viz profile command (#5621)
## What this changes

This adds a `viz profile` command that outputs a service profile based off tap
data. It behaves like the current `profile --tap` command, but fixes it.

Additionally, it removes the `--tap` flag from the `profile` command since this
depends on the Viz extension being installed in order to tap a service.

## Why

The `profile --tap` command is currently broken since it depends on the Viz
extension being installed, but the `profile` command is part of the core
install.

Closes #5613

Unblocks #5545

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-01 19:02:46 -05:00
Simon Weald 590d152566
add Giant Swarm to adopters (#5645)
This PR adds Giant Swarm as an adopter 🥳

---

Signed-off-by: Simon Weald <glitchcrab-github@simonweald.com>
2021-02-01 16:33:45 -05:00
Tarun Pothulapati cd2e911be3
viz: add data-plane and prometheus healthchecks (#5602)
Fixes #5325

This branch adds the remaining healthchecks for the viz extension,
i.e.:

- Data-plane metrics check in Prometheus
- `--proxy` mode, which also checks for tap injections based
  on annotations.

For this, the following changes were needed:
- `Category.ID` is made public so that toggling `--proxy` is
  possible
- The tap env key is made a field so that it can be re-used for
  checks

This also simplifies `viz.NewHealthChecker` by removing the need to
pass categoryIDs, instead using
`hc.appendCategories` directly at the caller to add the
required categories. This is possible by dividing the vizCategories
into separate functions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-01 23:01:13 +05:30
Takumi Sue 77add64860
Remove extra three dashes from helm templates (#5628)
(Background information)
At our company we check the sops-encrypted Linkerd manifest into a GitHub repository,
and I came across the following problem.

---

Three dashes mark the start of a YAML document (or the end of the
directives).
https://yaml.org/spec/1.2/spec.html#id2800132

If there are only comments between two `---` markers, the document is empty.
Consider a file that includes an empty document at the top:

```yaml
---
# foo
---
apiVersion: v1
kind: Namespace
metadata:
  name: foo
---
# bar
---
apiVersion: v1
kind: Namespace
metadata:
  name: bar
```

When we encrypt and decrypt it with [sops](https://github.com/mozilla/sops), the empty document will be
converted to `{}`.

```yaml
{}
---
apiVersion: v1
kind: Namespace
metadata:
    name: foo
---
apiVersion: v1
kind: Namespace
metadata:
    name: bar
```

This is invalid as a k8s manifest, because the `{}` document sets neither `apiVersion` nor `kind`:

```
error validating data: [apiVersion not set, kind not set]
```
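
To find such comment-only documents in a rendered manifest, something like the following Python sketch could be used. It is a naive line-based split that assumes `---` only ever appears as a document marker on its own line; it is not a YAML parser, and the function names are made up.

```python
def split_documents(manifest: str) -> list:
    """Split a multi-document YAML string into lists of lines."""
    docs = [[]]
    for line in manifest.splitlines():
        if line.rstrip() == "---":
            docs.append([])
        else:
            docs[-1].append(line)
    # Drop the empty chunk that precedes a leading "---" marker.
    if not docs[0]:
        docs.pop(0)
    return docs


def empty_document_indexes(manifest: str) -> list:
    """Return indexes of documents holding only comments or blank lines."""
    return [
        i
        for i, doc in enumerate(split_documents(manifest))
        if all(not l.strip() or l.lstrip().startswith("#") for l in doc)
    ]
```

Run against the first example above, this reports the `# foo` and `# bar` documents (indexes 0 and 2) as empty, which are the ones that end up as `{}` or get dropped.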

---

I'm afraid that this is (at least partly) a sops problem, but in any case I think this modification is harmless enough.
Thank you.

Signed-off-by: Takumi Sue <u630868b@alumni.osaka-u.ac.jp>
2021-02-01 10:51:34 -05:00
Matei David 0ce9e84a94
Introduce V1 to CRDs and Mutating Hooks (#5603)
*Closes #5484*
 ### Changes
---
*Overview*:
 * Update golden files and make necessary spec changes
 * Update test files for viz
 * Add v1 to healthcheck and uninstall
 * Fix link-crd clusterDomain field validation

- To update to v1, I had to change the CRD schemas to be version-based (i.e. each version has to declare its own schema). I noticed an error in the link-crd (`targetClusterDomain` was `targetDomainName`). Also, `additionalPrinterColumns` is now a version-dependent field as well.

- For `admissionregistration` resources I had to add an additional `admissionReviewVersions` field -- I included `v1` and `v1beta1`.

- In `healthcheck.go` and `resources.go` (used by `uninstall`) I had to make some changes to the client-go versions (i.e. from `v1beta1` to `v1` for admissionreg and apiextension) so that we don't see any warning messages when uninstalling or when we do any install checks.

I tested against different CLI and k8s versions to have a bit more confidence in the changes (in addition to the automated tests). I hope the cases below are enough; if not, let me know and I can test further.

### Tests

Linkerd local build CLI + k8s 1.19+
`install/check/mc-check/mc-install/mc-link/viz-install/viz-check/uninstall/`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-b0fd2ec8
Server version: unavailable

$ bin/linkerd install | kubectl apply -f -
- no errors, no version warnings - 

$ bin/linkerd check --expected-version git-b0fd2ec8
Status check results are :tick:

# MC

$ bin/linkerd mc install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd mc check
Status check results are :tick:

$ bin/linkerd mc link foo | k apply -f -   # test crd creation
# had a validation error here because the schema had targetDomainName instead of targetClusterDomain
# changed, rebuilt cli, re-installed mc, tried command again
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created
...

# VIZ
$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd uninstall | k delete -f -
- no errors, no version warnings - 
```

Linkerd local build CLI + k8s 1.17
`check-pre/install/mc-check/mc-install/mc-link/viz-install/viz-check`
```
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.17-rc1+k3s1", GitCommit:"e8c9484078bc59f2cd04f4018b095407758073f5", GitTreeState:"clean", BuildDate:"2021-01-14T06:20:56Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

$ bin/linkerd version
Client version: git-3d2d4df1 # made changes to link-crd after prev test case
Server version: unavailable

$ bin/linkerd check --pre --expected-version git-3d2d4df1
- no errors, no version warnings -
Status check results are :tick:

$ bin/linkerd install | k apply -f -
- no errors, no version warnings -

$ bin/linkerd check --expected-version git-3d2d4df1
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc install | k apply -f -
- no errors, no version warnings - 

$ bin/linkerd mc check 
- no errors, no version warnings - 
Status check results are :tick:

$ bin/linkerd mc link --cluster-name foo | k apply -f -
secret/cluster-credentials-foo created
link.multicluster.linkerd.io/foo created

# VIZ

$ bin/linkerd viz install | k apply -f - 
- no errors, no version warnings - 

$ bin/linkerd viz check
- no errors, no version warnings -
- hangs up indefinitely after linkerd-viz can talk to Kubernetes
```

Linkerd edge (21.1.3) CLI + k8s 1.17 (already installed)
`check`
```
$ linkerd version
Client version: edge-21.1.3
Server version: git-3d2d4df1

$ linkerd check
- no errors -
- warnings: mismatch between cli & control plane, control plane not up to date (both expected) -
Status check results are :tick:
```

Linkerd stable (2.9.2) CLI + k8s 1.17 (already installed)
`check/uninstall`
```
$ linkerd version
Client version: stable-2.9.2
Server version: git-3d2d4df1

$ linkerd check
× control plane ClusterRoles exist
    missing ClusterRoles: linkerd-linkerd-tap
    see https://linkerd.io/checks/#l5d-existence-cr for hints

Status check results are ×

# viz wasn't installed, hence the error, installing viz didn't help since
# the res is named `viz-tap` now
# moving to uninstall

$ linkerd uninstall | k delete -f -
- no warnings, no errors - 
```

_Note_: I used `go test ./cli/cmd/... --generate` which is why there are so many changes 😨 

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-02-01 09:18:13 -05:00
Kevin Leimkuhler 964a190069
viz: only tap pods that have tap explicitly enabled (#5608)
## What this changes

This allows the tap controller to inform `tap` users when pods either have tap
disabled or tap is not enabled yet.

## Why

When a user taps a resource that has not been admitted by the Viz extension's
`tap-injector`, tap is not explicitly disabled but it is also not enabled.
Therefore, the `tap` command hangs and provides no feedback to the user.

Closes #5544

## How

A new `viz.linkerd.io/tap-enabled` annotation is introduced which is
automatically added by the Viz extension's `tap-injector`. This annotation is
added to a pod when it is able to be tapped; this means that the pod and the
pod's namespace do not have the `config.linkerd.io/disable-tap` annotation
added.

When a user attempts to tap a resource, the tap controller now looks for this
new annotation; if the annotation is present on the pod then that pod is
tappable.

If the annotation is not present or tap is explicitly disabled, an error is
returned.
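
The decision described above can be sketched as a small classifier. This Python illustration only checks for the presence of the two annotations named in this message and ignores their values; the `tap_status` helper and its return strings are hypothetical.

```python
TAP_ENABLED = "viz.linkerd.io/tap-enabled"
TAP_DISABLED = "config.linkerd.io/disable-tap"


def tap_status(pod_annotations: dict, namespace_annotations: dict) -> str:
    """Classify a pod as "disabled", "tappable", or "not-enabled"."""
    # Tap explicitly disabled on the pod or its namespace wins.
    if TAP_DISABLED in pod_annotations or TAP_DISABLED in namespace_annotations:
        return "disabled"
    # The tap-injector admitted the pod and marked it as tappable.
    if TAP_ENABLED in pod_annotations:
        return "tappable"
    # Neither disabled nor admitted by the injector.
    return "not-enabled"
```

Only pods classified as `"tappable"` would be tapped; the other two outcomes correspond to the two error messages introduced by this change.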

## UI changes

Multiple errors can now occur when trying to tap a resource:

1. There are no pods for the resource.
2. There are pods for the resource, but tap is disabled via pod or namespace
   annotation.
3. There are pods for the resource, but tap is not yet enabled because the
   `tap-injector` did not admit the resource.

Errors are now handled as shown below:

Tap is disabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
```

Tap is not enabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap not enabled; try restarting resource so that it can be injected
```

There is a mix of pods with tap disabled and tap not enabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
pods found with tap not enabled; try restarting resource so that it can be injected
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-28 17:37:45 -05:00
Alex Leong 07a2f07471
edge-21.1.4 (#5633)
This edge release continues to polish the Linkerd extension model and improves
the robustness of the opaque transport.

* Improved the consistency of behavior of the `check` commands between
  Linkerd extensions
* Fixed an issue where Linkerd extension commands could be run before the
  extension was fully installed
* Renamed some extension Helm charts for consistency:
  * jaeger -> linkerd-jaeger
  * linkerd2-multicluster -> linkerd-multicluster
  * linkerd2-multicluster-link -> linkerd-multicluster-link
* Fixed an issue that could cause the inbound proxy to fail meshed HTTP/1
  requests from older proxies (from the stable-2.8.x vintage)
* Changed opaque-port transport to be advertised via ALPN so that new proxies
  will not initiate opaque-transport connections to proxies from prior
  edge releases
* Added inbound proxy transport metrics with `tls="passthru"` when forwarding
  non-mesh TLS connections
* Thanks to @hs0210 for adding new unit tests!

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-28 11:28:07 -08:00
Oliver Gould fdd36357d5
proxy: v2.130.1 (#5625)
First and foremost, this release fixes an issue that could cause the
inbound proxy to fail meshed HTTP/1 requests from older proxies (from
the stable-2.8.x vintage).

Additionally, this release changes how opaque-port transport works, in
preparation for TCP multicluster functionality: now servers must
advertise support for the transport header via ALPN. Clients will only
send a transport header when the server advertises support for ALPN.
This means that new proxies will not initiate opaque-transport
connections to proxies from prior edge releases.
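
The client-side rule can be sketched as follows. This is a Python illustration, not the proxy's Rust code; the protocol identifier below is an assumption made for the example, and `negotiated_alpn` stands for whatever the TLS library reports after the handshake (for instance the return value of `ssl.SSLSocket.selected_alpn_protocol()`, which is `None` when nothing was negotiated).

```python
from typing import Optional

# Assumed ALPN protocol name used to advertise transport-header support.
TRANSPORT_HEADER_ALPN = "transport.l5d.io/v1"


def should_send_transport_header(negotiated_alpn: Optional[str]) -> bool:
    """Send the transport header only if the server advertised support."""
    return negotiated_alpn == TRANSPORT_HEADER_ALPN
```

A server from a prior edge release never advertises the protocol, so the function returns `False` and the client falls back to a connection without the transport header.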

Finally, inbound proxies may now report transport metrics with
`tls="passthru"` when forwarding non-mesh TLS connections.

---

* metrics: add `target_addr` labels to HTTP metrics (linkerd/linkerd2-proxy#866)
* inbound: Handle direct connections with a dedicated stack (linkerd/linkerd2-proxy#863)
* inbound: Avoid HTTP detection when a transport header is present (linkerd/linkerd2-proxy#867)
* Update tokio to v1.1.0 (linkerd/linkerd2-proxy#870)
* admin: stackify admin server (linkerd/linkerd2-proxy#868)
* tls: Report SNI values for non-Linkerd TLS (linkerd/linkerd2-proxy#869)
* admin: Record transport & HTTP metrics (linkerd/linkerd2-proxy#871)
* test: Disable tracing-subscriber by default (linkerd/linkerd2-proxy#873)
* inbound: Split stack into modules (linkerd/linkerd2-proxy#872)
* Improve diagnostics around the SwitchReady module (linkerd/linkerd2-proxy#875)
* Use TLS ALPN to negotiate transport header support (linkerd/linkerd2-proxy#874)
* stack: Introduce the Param trait (linkerd/linkerd2-proxy#876)
* transport-header: Encode session protocol (linkerd/linkerd2-proxy#877)
* transport: Add a ConnectAddr parameter type (linkerd/linkerd2-proxy#879)
* profiles: Use a LogicalAddr param type (linkerd/linkerd2-proxy#878)
* stack: Replace switch with Filter and NewEither (linkerd/linkerd2-proxy#880)
* inbound: normalize URIs after downgrading to HTTP/1 (linkerd/linkerd2-proxy#881)
2021-01-28 08:01:51 -08:00
Hu Shuai 5e3d5190c3
Add unit test for pkg/healthcheck/sidecar.go (#5609)
Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-27 16:56:14 -05:00
William Morgan c79d36d63a
add STEERING.md (#5607)
Signed-off-by: William Morgan <william@buoyant.io>
2021-01-27 09:39:32 -06:00
Alex Leong dd8e5fc5bc
Rename extension charts to linkerd-* (#5552)
For consistency we rename the extension charts to a common naming scheme:

linkerd-viz -> linkerd-viz (unchanged)
jaeger -> linkerd-jaeger
linkerd2-multicluster -> linkerd-multicluster
linkerd2-multicluster-link -> linkerd-multicluster-link

We also make the chart files and chart readmes a bit more uniform.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-26 16:20:49 -08:00
Tarun Pothulapati 9756b3f8f1
extensions: make subcmds check/wait for respective extensions (#5566)
This commit updates the extension subcommands to check and wait
for the respective extensions to be up before running them.

The same healthcheck pkg and the respective extension checks
are used for the check/wait logic.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-26 23:01:25 +05:30
Dennis Adjei-Baah ae2b3499b0
Tweak error message in web script (#5596)
The `bin/web run` script sets up a local environment for linkerd
dashboard development. This script port-forwards an existing linkerd
controller and a grafana instance in a local kubernetes cluster. When
running the command with just the linkerd control plane and no
linkerd viz extension, the following error message is shown:
```
'Controller is not running. Have you installed Linkerd?'
```

This error message is a little misleading because the controller is
installed when running this after `linkerd install`. The issue here is
that the script checks for a Grafana instance, but the error message it
displays when it can't find a Grafana pod says that the controller isn't
installed. The error message should instead notify the developer that
Linkerd Viz is not installed.

This change modifies the error message so that it is clearer.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2021-01-26 11:21:44 -06:00
Andrew Seigner 700b4c5cb5
Move @siggy to emeritus maintainer (#5597)
Emeritus (adj): having retired but allowed to retain their title as an
honor

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2021-01-23 07:26:47 -08:00
Alex Leong 964ce11559
Update generated serviceprofile code (#5580)
I ran `bin/update-codegen.sh` to update the generated code to include the opaque ports in the generated deepcopy function for service profiles.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-22 14:34:49 -08:00
Tarun Pothulapati 4f0601e632
jaeger: cli and check logic cleanup (#5564)
This branch removes some unnecessary logic, making the
check logic similar to that of other extensions, namely viz.

Includes the following cleanups:

- Remove `namespace` flag in jaeger CLI and make the fetching logic
dynamic and use it in check and dashboard.
- Use `hc.KubeAPIClient` instead of creating our own in jaeger check.
- Move injection checks up before we run the readiness checks

This change also adds a new check that the extension namespace exists for
jaeger.

It also updates the integration tests to run the check commands.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-22 23:31:35 +05:30
cpretzer fcb71de428
Changes for edge-21.1.3 (#5590)
Signed-off-by: Charles Pretzer <charles@buoyant.io>
2021-01-21 17:06:10 -08:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and move things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Tarun Pothulapati 288fbefe02
viz: cleanup helm values.yaml (#5546)
* viz: cleanup helm values.yaml

This branch fixes some nits around naming of default variables,
i.e. it replaces the usage of global with default.

Renames globalLogLevel to defaultLogLevel and globalUID to
defaultUID along with some chart README updates.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-22 00:48:16 +05:30
Tarun Pothulapati a95efe2db1
tests: update newly added golden tests (#5588)
#5507 added new golden tests but missed some updates from other PRs
that were merged in the meantime.

This branch updates those golden tests with those changes.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-21 23:47:36 +05:30
Tarun Pothulapati d0d2e0ea7a
cli: add helm customization flags to core install (#5507)
* cli: add helm customization flags to core install

Fixes #5506

This branch adds Helm-style customization through the
`set`, `set-string`, `values` and `set-files` flags for the
`linkerd install` cmd, along with unit tests.

For this to work, the helm v3 engine rendering helpers
had to be used instead of our own wrapper type.
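
As a rough illustration of what a `set`-style flag does to chart values, the sketch below applies a single `key.path=value` override to a nested values map. It is not the Helm engine's parser, which also handles lists, escaping and typed values; the function name `setValue` and the `example.*` keys are hypothetical and do not come from the linkerd charts.

```go
package main

import (
	"fmt"
	"strings"
)

// setValue applies one "a.b.c=value" override to a nested values map.
// Hypothetical helper: it only understands dotted keys and string values.
func setValue(values map[string]interface{}, expr string) error {
	parts := strings.SplitN(expr, "=", 2)
	if len(parts) != 2 || parts[0] == "" {
		return fmt.Errorf("invalid override %q, expected key=value", expr)
	}
	keys := strings.Split(parts[0], ".")
	cur := values
	// Walk (and create, if needed) the intermediate maps.
	for _, k := range keys[:len(keys)-1] {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			next = map[string]interface{}{}
			cur[k] = next
		}
		cur = next
	}
	// Overwrite the leaf; sibling keys are left untouched.
	cur[keys[len(keys)-1]] = parts[1]
	return nil
}

func main() {
	// Stand-in for chart defaults (hypothetical keys).
	values := map[string]interface{}{
		"example": map[string]interface{}{"logLevel": "info", "replicas": "1"},
	}
	if err := setValue(values, "example.logLevel=debug"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(values)
}
```

The override replaces only the addressed leaf, which is why defaults that are not mentioned on the command line survive the merge.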

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-21 22:49:50 +05:30
Andrew Seigner 9c80d4d2a1
Fix `linkerd metrics` resource selector (#5567)
The `linkerd metrics` command was selecting pods based on owner resource
names. If multiple owners existed with the same name (for example
`sts/web`, `deploy/web`), additional pods would be incorrectly included
in the output.

Fix the pod selector code to validate pods have owner references to the
given workload/owner.

Before:
```
$ linkerd metrics -n emojivoto deploy/web|grep POD
  # POD web-0 (1 of 3)
  # POD web-d9ffd684f-gnbcx (2 of 3)
  # POD web-fs6l7 (3 of 3)
```

After:
```
$ bin/go-run cli metrics -n emojivoto deploy/web|grep POD
  # POD web-d9ffd684f-gnbcx (1 of 1)
```
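
The idea behind the fix can be sketched as follows: instead of matching pods by owner name alone, match on both the owner's kind and name. The types below are simplified stand-ins rather than the Kubernetes API types, the owner kinds assigned to each pod are assumed for illustration, and the sketch flattens the ownership chain (in a real cluster a Deployment's pods are owned through a ReplicaSet).

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for a Kubernetes owner reference.
type OwnerRef struct {
	Kind string
	Name string
}

// Pod is a simplified stand-in for a Kubernetes pod.
type Pod struct {
	Name   string
	Owners []OwnerRef
}

// ownedBy reports whether the pod has an owner reference that matches
// both the kind and the name of the requested workload.
func ownedBy(p Pod, kind, name string) bool {
	for _, o := range p.Owners {
		if o.Kind == kind && o.Name == name {
			return true
		}
	}
	return false
}

// selectPods keeps only the pods owned by the given workload.
func selectPods(pods []Pod, kind, name string) []Pod {
	var out []Pod
	for _, p := range pods {
		if ownedBy(p, kind, name) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Three pods whose owners all share the name "web" (kinds assumed).
	pods := []Pod{
		{Name: "web-0", Owners: []OwnerRef{{Kind: "StatefulSet", Name: "web"}}},
		{Name: "web-d9ffd684f-gnbcx", Owners: []OwnerRef{{Kind: "Deployment", Name: "web"}}},
		{Name: "web-fs6l7", Owners: []OwnerRef{{Kind: "DaemonSet", Name: "web"}}},
	}
	for _, p := range selectPods(pods, "Deployment", "web") {
		fmt.Println(p.Name)
	}
}
```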

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2021-01-21 11:57:27 -05:00
Matei David c63fbdf0e4
Introduce OpenAPIV3 validation for CRDs (#5573)
* Introduce OpenAPIV3 validation for CRDs

* Add validation to link crd
* Add validation to sp using kube-gen
* Add openapiv3 under schema fields in specific versions
* Modify fields to rid spec of yaml errors
* Add top level validation for all three CRDs

Signed-off-by: Matei David <matei.david.35@gmail.com>
2021-01-21 11:56:28 -05:00
Naseem 2cc96d4ab9
fix alertmanagers casing (one word) (#5377)
fixes #5371

Signed-off-by: naseemkullah <naseem@transit.app>
2021-01-21 11:55:24 -05:00
Kevin Leimkuhler e7f2a3fba3
viz: add tap-injector (#5540)
## What this changes

This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.

If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.

This also removes the environment variable that explicitly disabled tap.
Tap status for a proxy is now determined only by the presence or absence
of the tap service name environment variable.

Closes #5326

## How it changes

### tap-injector

The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added:

- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace
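
The three conditions above can be sketched as a single predicate. This is an illustrative sketch, not the tap-injector's code: the `Pod` and `Container` types are simplified stand-ins, the proxy container name `linkerd-proxy` is an assumption, and the annotation key `example.io/disable-tap` is a hypothetical placeholder for whatever annotation the extension actually reads.

```go
package main

import "fmt"

const (
	proxyContainerName   = "linkerd-proxy"               // assumed container name
	tapSvcEnv            = "LINKERD2_PROXY_TAP_SVC_NAME" // from the description above
	disableTapAnnotation = "example.io/disable-tap"      // hypothetical annotation key
)

// Container and Pod are simplified stand-ins for the Kubernetes types.
type Container struct {
	Name string
	Env  map[string]string
}

type Pod struct {
	Annotations map[string]string
	Containers  []Container
}

// shouldInject reports whether the tap service name variable should be
// added to the pod's proxy container.
func shouldInject(pod Pod, nsAnnotations map[string]string) bool {
	// Tap disabled on the pod or on its namespace: take no action.
	if pod.Annotations[disableTapAnnotation] == "true" ||
		nsAnnotations[disableTapAnnotation] == "true" {
		return false
	}
	for _, c := range pod.Containers {
		if c.Name != proxyContainerName {
			continue
		}
		// Already mutated if the variable is present on the proxy.
		_, mutated := c.Env[tapSvcEnv]
		return !mutated
	}
	// No proxy container: take no action.
	return false
}

func main() {
	proxy := Container{Name: proxyContainerName, Env: map[string]string{}}
	app := Container{Name: "app"}

	meshed := Pod{Containers: []Container{app, proxy}}
	unmeshed := Pod{Containers: []Container{app}}

	fmt.Println("meshed pod:", shouldInject(meshed, nil))
	fmt.Println("unmeshed pod:", shouldInject(unmeshed, nil))
}
```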

### LINKERD2_PROXY_TAP_DISABLED

Now that tap is an extension of Linkerd and not a core component, it no longer
makes sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only by whether the
tap-injector adds the `LINKERD2_PROXY_TAP_SVC_NAME` environment variable.

### controller image

The tap-injector has been added as one of the controller image's startup
commands, which determine what the container does in the cluster.

As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-21 11:24:08 -05:00
Oliver Gould 6f954c3823
proxy: v2.129.0 (#5581)
This release improves diagnostics about the proxy's failfast state:

* Warnings are now emitted when the failfast state is entered;
* The "max concurrency exhausted" gRPC message has been changed to
  more-clearly indicate a failfast state error; and
* Failfast recovery has been made more robust, ensuring that a service
  can recover independently of new requests being received.

Furthermore, metric labeling has been improved:

* TCP server metrics are now annotated with the original `target_addr`;
* The `tls` label is now set to true for inbound TLS connections that
  lack a client ID. This is mostly helpful to clarify inbound metrics on
  the `identity` controller;
* Outbound `tls` metrics could be reported incorrectly when a proxy was
  configured to not use identity. This has been corrected.

Finally, socket-level errors now include a _client_ or _server_ prefix
to indicate which side of the proxy encountered the error.

---

* stack: remove `map_response` (linkerd/linkerd2-proxy#835)
* replace `RequestFilter` with Tower's upstream impl (linkerd/linkerd2-proxy#842)
* tracing: fix incorrect field format when logging in JSON (linkerd/linkerd2-proxy#845)
* replace `FutureService` with Tower's upstream impl (linkerd/linkerd2-proxy#839)
* integration: improve tracing in tests (linkerd/linkerd2-proxy#846)
* service-profiles: Prevent Duration coercion panics (linkerd/linkerd2-proxy#844)
* inbound: Separate HTTP server logic from protocol detection (linkerd/linkerd2-proxy#843)
* Correct gRPC 'max-concurrency exhausted' error messages (linkerd/linkerd2-proxy#847)
* Update tonic to v0.4 (linkerd/linkerd2-proxy#849)
* failfast: Improve diagnostic logging (linkerd/linkerd2-proxy#848)
* Update the base docker image (linkerd/linkerd2-proxy#850)
* stack: Implement Clone for ResultService (linkerd/linkerd2-proxy#851)
* Ensure services in failfast can become ready (linkerd/linkerd2-proxy#858)
* tests: replace string matching on metrics with parsing (linkerd/linkerd2-proxy#859)
* Decouple tls::accept from TcpStream (linkerd/linkerd2-proxy#853)
* metrics: Handle NoPeerIdFromRemote properly (linkerd/linkerd2-proxy#857)
* metrics: Reorder metrics labels (linkerd/linkerd2-proxy#856)
* Rename tls::accept to tls::server (linkerd/linkerd2-proxy#854)
* Annotate socket-level errors with a scope (linkerd/linkerd2-proxy#852)
* test: reduce repetition in metrics tests (linkerd/linkerd2-proxy#860)
* tls: Disambiguate client and server identities (linkerd/linkerd2-proxy#855)
* Update to tower v0.4.4 (linkerd/linkerd2-proxy#864)
* Update cargo dependencies (linkerd/linkerd2-proxy#865)
* metrics: add `target_addr` label for accepted transport metrics (linkerd/linkerd2-proxy#861)
* outbound: Strip endpoint identity when disabled (linkerd/linkerd2-proxy#862)

---

The opaque-ports test has been updated to reflect proxy metrics changes.
2021-01-21 06:52:38 -08:00
Oliver Gould d2ae5a8117
build: Remove the DOCKER_TRACE environment variable (#5583)
Our build scripts hide docker's output by default and only pass through
output when DOCKER_TRACE is set. Practically everyone else tends to use
DOCKER_TRACE=1 persistently. And, recently, GitHub Actions stopped
working with `/dev/stderr`.

This change removes the DOCKER_TRACE environment variable so that output
is always emitted as it would when invoking docker directly.
2021-01-20 22:09:47 -08:00
Hu Shuai 08439f1f6e
Add unit test for pkg/charts/charts.go (#5565)
Add tests for MergeMap

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-20 09:55:01 -05:00
Tarun Pothulapati 3b755e5c1d
multicluster: add helm customization flags for install (#5534)
* multicluster: add helm customization flags

This branch updates the multicluster install flow to use the
helm engine directly instead of our own chart wrapper. This
also adds the helm customization flags.

```bash
$ ./bin/go-run cli mc install --set namespace=l5d-mc | grep l5d-mc
github.com/linkerd/linkerd2/multicluster/cmd
github.com/linkerd/linkerd2/cli/cmd
  name: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
    mirror.linkerd.io/gateway-identity: linkerd-gateway.l5d-mc.serviceaccount.identity.linkerd.cluster.local
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
  namespace: l5d-mc
```

* add customization flags even for link cmd

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-20 11:29:42 +05:30
Hu Shuai 37472c566f
Fix typos. (#5563)
Fix spelling: accomodate
Fix spelling: conenctions

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
2021-01-19 15:28:57 -08:00
Risha Mars 1a5a8c0cf2
Move @rmars to emeritus maintainer (#5562)
Signed-off-by: Risha Mars <mars@buoyant.io>
2021-01-19 13:54:05 -08:00