Commit Graph

273 Commits

Author SHA1 Message Date
Tarun Pothulapati b521091ca7
cli: make `linkerd uninstall` fail when injected pods are present (#5642)
* cli: make `linkerd uninstall` fail when injected pods are present

Fixes #5622

This PR updates the `linkerd uninstall` cmd to check if there
are any injected pods and fails if there are any. This also
provides `--force` flag to skip this check.

pods from namespaces with prefix `linkerd` are skipped
so as to not error out for control-plane and extension
pods.
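
The check described above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: the `pod` type and the `checkNoInjectedPods` helper are hypothetical names, and real code would list pods through the Kubernetes API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// pod is a hypothetical, minimal stand-in for a Kubernetes pod.
type pod struct {
	Namespace string
	Name      string
	Injected  bool // true when the pod carries a Linkerd proxy
}

// checkNoInjectedPods returns an error naming every injected pod found
// outside namespaces prefixed with "linkerd". Passing force=true skips
// the check entirely, mirroring the `--force` flag.
func checkNoInjectedPods(pods []pod, force bool) error {
	if force {
		return nil
	}
	var found []string
	for _, p := range pods {
		if strings.HasPrefix(p.Namespace, "linkerd") {
			continue // control-plane and extension pods are skipped
		}
		if p.Injected {
			found = append(found, p.Namespace+"/"+p.Name)
		}
	}
	if len(found) > 0 {
		return errors.New("injected pods present: " + strings.Join(found, ", "))
	}
	return nil
}

func main() {
	pods := []pod{
		{Namespace: "linkerd-viz", Name: "web", Injected: true},
		{Namespace: "default", Name: "app", Injected: true},
	}
	fmt.Println(checkNoInjectedPods(pods, false))
	fmt.Println(checkNoInjectedPods(pods, true))
}
```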

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-02-03 12:31:30 +05:30
Kevin Leimkuhler df0ce24b12
viz: add viz profile command (#5621)
## What this changes

This adds a `viz profile` command that outputs a service profile based off tap
data. It behaves like the current `profile --tap` command, with the breakage fixed.

Additionally, it removes the `--tap` flag from the `profile` command since this
depends on the Viz extension being installed in order to tap a service.

## Why

The `profile --tap` command is currently broken since it depends on the Viz
extension being installed, but the `profile` command is part of the core
install.

Closes #5613

Unblocks #5545

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-01 19:02:46 -05:00
Kevin Leimkuhler 964a190069
viz: only tap pods that have tap explicitly enabled (#5608)
## What this changes

This allows the tap controller to inform `tap` users when pods either have tap
disabled or tap is not enabled yet.

## Why

When a user taps a resource that has not been admitted by the Viz extension's
`tap-injector`, tap is not explicitly disabled but it is also not enabled.
Therefore, the `tap` command hangs and provides no feedback to the user.

Closes #5544

## How

A new `viz.linkerd.io/tap-enabled` annotation is introduced which is
automatically added by the Viz extension's `tap-injector`. This annotation is
added to a pod when it is able to be tapped; this means that the pod and the
pod's namespace do not have the `config.linkerd.io/disable-tap` annotation
added.

When a user attempts to tap a resource, the tap controller now looks for this
new annotation; if the annotation is present on the pod then that pod is
tappable.

If the annotation is not present or tap is explicitly disabled, an error is
returned.
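
The decision described in this section could look roughly like the sketch below. The two annotation names come from the text above; the `classifyPod` helper, the `tapStatus` type, and the assumption that a value of `"true"` marks tap as disabled are illustrative only.

```go
package main

import "fmt"

const (
	tapEnabledAnnotation = "viz.linkerd.io/tap-enabled"
	disableTapAnnotation = "config.linkerd.io/disable-tap"
)

// tapStatus describes whether a pod can be tapped.
type tapStatus string

const (
	statusTappable   tapStatus = "tappable"
	statusDisabled   tapStatus = "disabled"
	statusNotEnabled tapStatus = "not enabled"
)

// classifyPod is a hypothetical helper that applies the rules from the
// commit message: an explicit disable wins, otherwise the presence of the
// tap-enabled annotation on the pod makes it tappable.
func classifyPod(podAnnotations, nsAnnotations map[string]string) tapStatus {
	// Assumption: a value of "true" marks tap as disabled.
	if podAnnotations[disableTapAnnotation] == "true" || nsAnnotations[disableTapAnnotation] == "true" {
		return statusDisabled
	}
	if _, ok := podAnnotations[tapEnabledAnnotation]; ok {
		return statusTappable
	}
	return statusNotEnabled
}

func main() {
	admitted := map[string]string{tapEnabledAnnotation: "true"}
	fmt.Println(classifyPod(admitted, nil))
	fmt.Println(classifyPod(nil, nil))
}
```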

## UI changes

Multiple errors can now occur when trying to tap a resource:

1. There are no pods for the resource.
2. There are pods for the resource, but tap is disabled via pod or namespace
   annotation.
3. There are pods for the resource, but tap is not yet enabled because the
   `tap-injector` did not admit the resource.

Errors are now handled as shown below:

Tap is disabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
```

Tap is not enabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap not enabled; try restarting resource so that it can be injected
```

There are a mix of pods with tap disabled or tap not enabled:

```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
pods found with tap not enabled; try restarting resource so that it can be injected
```
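
One way to assemble the messages shown above is sketched here. The message strings are copied from the output examples; the `noPodsError` helper and its counting parameters are hypothetical and may not match how the tap controller reports these cases.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// noPodsError builds a multi-line error for a resource with no tappable
// pods. disabled and notEnabled are counts of pods in each state; a hint
// line is added only for states that were actually observed.
func noPodsError(resource string, disabled, notEnabled int) error {
	lines := []string{"no pods to tap for " + resource}
	if disabled > 0 {
		lines = append(lines, "pods found with tap disabled via the config.linkerd.io/disable-tap annotation")
	}
	if notEnabled > 0 {
		lines = append(lines, "pods found with tap not enabled; try restarting resource so that it can be injected")
	}
	return errors.New(strings.Join(lines, "\n"))
}

func main() {
	fmt.Println("Error:", noPodsError("deployment/test", 2, 1))
}
```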

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-28 17:37:45 -05:00
Oliver Gould fdd36357d5
proxy: v2.130.1 (#5625)
First and foremost, this release fixes an issue that could cause the
inbound proxy to fail meshed HTTP/1 requests from older proxies (from
the stable-2.8.x vintage).

Additionally, this release changes how opaque-port transport works, in
preparation for TCP multicluster functionality: now servers must
advertise support for the transport header via ALPN. Clients will only
send a transport header when the server advertises support for ALPN.
This means that new proxies will not initiate opaque-transport
connections to proxies from prior edge releases.
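
The negotiation rule can be illustrated with a small sketch. The proxy itself is written in Rust; this Go version only shows the decision, and the protocol identifier `transport.l5d.io/v1` is an assumed placeholder rather than a value taken from this release note.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// transportHeaderProto is an assumed ALPN protocol identifier used only
// for this sketch.
const transportHeaderProto = "transport.l5d.io/v1"

// clientConfig offers the transport-header protocol during the handshake.
func clientConfig() *tls.Config {
	return &tls.Config{NextProtos: []string{transportHeaderProto}}
}

// sendTransportHeader reports whether the client should write a transport
// header: only when the server selected the protocol via ALPN.
func sendTransportHeader(negotiated string) bool {
	return negotiated == transportHeaderProto
}

func main() {
	fmt.Println(clientConfig().NextProtos)
	// After a real handshake, the selected protocol is available on the
	// connection state; here the state is constructed by hand.
	state := tls.ConnectionState{NegotiatedProtocol: transportHeaderProto}
	fmt.Println(sendTransportHeader(state.NegotiatedProtocol))
	// An older server that negotiates nothing gets no transport header.
	fmt.Println(sendTransportHeader(""))
}
```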

Finally, inbound proxies may now report transport metrics with
`tls="passthru"` when forwarding non-mesh TLS connections.

---

* metrics: add `target_addr` labels to HTTP metrics (linkerd/linkerd2-proxy#866)
* inbound: Handle direct connections with a dedicated stack (linkerd/linkerd2-proxy#863)
* inbound: Avoid HTTP detection when a transport header is present (linkerd/linkerd2-proxy#867)
* Update tokio to v1.1.0 (linkerd/linkerd2-proxy#870)
* admin: stackify admin server (linkerd/linkerd2-proxy#868)
* tls: Report SNI values for non-Linkerd TLS (linkerd/linkerd2-proxy#869)
* admin: Record transport & HTTP metrics (linkerd/linkerd2-proxy#871)
* test: Disable tracing-subscriber by default (linkerd/linkerd2-proxy#873)
* inbound: Split stack into modules (linkerd/linkerd2-proxy#872)
* Improve diagnostics around the SwitchReady module (linkerd/linkerd2-proxy#875)
* Use TLS ALPN to negotiate transport header support (linkerd/linkerd2-proxy#874)
* stack: Introduce the Param trait (linkerd/linkerd2-proxy#876)
* transport-header: Encode session protocol (linkerd/linkerd2-proxy#877)
* transport: Add a ConnectAddr parameter type (linkerd/linkerd2-proxy#879)
* profiles: Use a LogicalAddr param type (linkerd/linkerd2-proxy#878)
* stack: Replace switch with Filter and NewEither (linkerd/linkerd2-proxy#880)
* inbound: normalize URIs after downgrading to HTTP/1 (linkerd/linkerd2-proxy#881)
2021-01-28 08:01:51 -08:00
Tarun Pothulapati 4f0601e632
jaeger: cli and check logic cleanup (#5564)
This branch removes unnecessary logic, making the check logic
similar to that of other extensions, namely viz.

Includes the following cleanups:

- Remove `namespace` flag in jaeger CLI and make the fetching logic
dynamic and use it in check and dashboard.
- Use `hc.KubeAPIClient` instead of creating our own in jaeger check.
- Move injection checks up before we run the readiness checks

This change adds a new extension namespace exist check for
jaeger.

Also updates integration tests to run the check commands.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-22 23:31:35 +05:30
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Kevin Leimkuhler e7f2a3fba3
viz: add tap-injector (#5540)
## What this changes

This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.

If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.

This also removes the environment variable for explicitly disabling tap through
an environment variable. Tap status for a proxy is now determined only by the
presence or absence of the tap service name environment variable.

Closes #5326

## How it changes

### tap-injector

The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added:

- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace
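
The three conditions above can be sketched as a single mutation function. The `pod` and `container` types, the `injectTap` helper, and the container name `linkerd-proxy` are assumptions made for illustration; a real webhook would operate on Kubernetes pod specs and emit a patch.

```go
package main

import "fmt"

const (
	proxyContainerName   = "linkerd-proxy" // assumed proxy container name
	tapSvcNameEnv        = "LINKERD2_PROXY_TAP_SVC_NAME"
	disableTapAnnotation = "config.linkerd.io/disable-tap"
)

// container and pod are hypothetical, simplified stand-ins.
type container struct {
	Name string
	Env  map[string]string
}

type pod struct {
	Annotations map[string]string
	Containers  []container
}

// injectTap adds the tap service name to the proxy container's environment
// and reports whether the pod was mutated.
func injectTap(p *pod, nsAnnotations map[string]string, svcName string) bool {
	// Assumption: a value of "true" marks tap as disabled.
	if p.Annotations[disableTapAnnotation] == "true" || nsAnnotations[disableTapAnnotation] == "true" {
		return false
	}
	for i := range p.Containers {
		c := &p.Containers[i]
		if c.Name != proxyContainerName {
			continue
		}
		if _, ok := c.Env[tapSvcNameEnv]; ok {
			return false // already mutated
		}
		if c.Env == nil {
			c.Env = map[string]string{}
		}
		c.Env[tapSvcNameEnv] = svcName
		return true
	}
	return false // no Linkerd proxy container
}

func main() {
	p := &pod{Containers: []container{{Name: "app"}, {Name: proxyContainerName}}}
	// "tap.example" is a placeholder service name.
	fmt.Println(injectTap(p, nil, "tap.example"))
	fmt.Println(injectTap(p, nil, "tap.example"))
}
```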

### LINKERD2_PROXY_TAP_DISABLED

Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only by whether the
tap-injector adds or does not add the `LINKERD2_PROXY_TAP_SVC_NAME` environment
variable.

### controller image

The tap-injector has been added to the controller image's startup commands,
which determine what the image will do in the cluster.

As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-21 11:24:08 -05:00
Oliver Gould 6f954c3823
proxy: v2.129.0 (#5581)
This release improves diagnostics about the proxy's failfast state:

* Warnings are now emitted when the failfast state is entered;
* The "max concurrency exhausted" gRPC message has been changed to
  more-clearly indicate a failfast state error; and
* Failfast recovery has been made more robust, ensuring that a service
  can recover independently of new requests being received.

Furthermore, metric labeling has been improved:

* TCP server metrics are now annotated with the original `target_addr`;
* The `tls` label is now set to true for inbound TLS connections that
  lack a client ID. This is mostly helpful to clarify inbound metrics on
  the `identity` controller;
* Outbound `tls` metrics could be reported incorrectly when a proxy was
  configured to not use identity. This has been corrected.

Finally, socket-level errors now include a _client_ or _server_ prefix
to indicate which side of the proxy encountered the error.

---

* stack: remove `map_response` (linkerd/linkerd2-proxy#835)
* replace `RequestFilter` with Tower's upstream impl (linkerd/linkerd2-proxy#842)
* tracing: fix incorrect field format when logging in JSON (linkerd/linkerd2-proxy#845)
* replace `FutureService` with Tower's upstream impl (linkerd/linkerd2-proxy#839)
* integration: improve tracing in tests (linkerd/linkerd2-proxy#846)
* service-profiles: Prevent Duration coercion panics (linkerd/linkerd2-proxy#844)
* inbound: Separate HTTP server logic from protocol detection (linkerd/linkerd2-proxy#843)
* Correct gRPC 'max-concurrency exhausted' error messages (linkerd/linkerd2-proxy#847)
* Update tonic to v0.4 (linkerd/linkerd2-proxy#849)
* failfast: Improve diagnostic logging (linkerd/linkerd2-proxy#848)
* Update the base docker image (linkerd/linkerd2-proxy#850)
* stack: Implement Clone for ResultService (linkerd/linkerd2-proxy#851)
* Ensure services in failfast can become ready (linkerd/linkerd2-proxy#858)
* tests: replace string matching on metrics with parsing (linkerd/linkerd2-proxy#859)
* Decouple tls::accept from TcpStream (linkerd/linkerd2-proxy#853)
* metrics: Handle NoPeerIdFromRemote properly (linkerd/linkerd2-proxy#857)
* metrics: Reorder metrics labels (linkerd/linkerd2-proxy#856)
* Rename tls::accept to tls::server (linkerd/linkerd2-proxy#854)
* Annotate socket-level errors with a scope (linkerd/linkerd2-proxy#852)
* test: reduce repetition in metrics tests (linkerd/linkerd2-proxy#860)
* tls: Disambiguate client and server identities (linkerd/linkerd2-proxy#855)
* Update to tower v0.4.4 (linkerd/linkerd2-proxy#864)
* Update cargo dependencies (linkerd/linkerd2-proxy#865)
* metrics: add `target_addr` label for accepted transport metrics (linkerd/linkerd2-proxy#861)
* outbound: Strip endpoint identity when disabled (linkerd/linkerd2-proxy#862)

---

The opaque-ports test has been updated to reflect proxy metrics changes.
2021-01-21 06:52:38 -08:00
Tarun Pothulapati 4c3d002501
viz: move sub-cmds using viz extension under viz cmd (#5485)
* viz: move sub-cmds using viz extension under viz cmd

Fixes #5327, #5524

This branch moves the following commands, under the `linkerd viz`
cmd as they use the viz extension to perform the job.

- dashboard
- edges
- routes
- stat
- tap
- top

This also creates a new pkg `public-api` which facilitates
interaction and communication with public-api to be used
across extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
2021-01-13 12:11:25 +05:30
Tarun Pothulapati ff841d54fc
viz: add a retry check for core control-plane pods before install (#5434)
* viz: add a retry check for core control-plane pods before install

This commit adds a new check so that `viz install` waits till
the control-plane pods are up. For this to work, the `prometheus`
sub-system check in control-plane self-check has been removed,
as we re-use healthchecks to perform this.
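
A retrying check of this kind can be sketched generically as below. The `retry` helper, the `errNotReady` value, and the attempt and delay parameters are hypothetical; the actual change reuses the existing healthcheck library rather than a loop like this one.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a placeholder error for a failed readiness check.
var errNotReady = errors.New("control plane pods are not ready")

// retry runs check up to attempts times, sleeping delay between tries,
// and returns the last error if no attempt succeeds. attempts should be
// at least 1.
func retry(attempts int, delay time.Duration, check func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = check(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady // simulate pods that are still starting
		}
		return nil
	})
	fmt.Println(calls, err)
}
```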

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-01-07 23:52:09 +05:30
Kevin Leimkuhler b85928e73c
Enable dashboard test (#5486)
This test was never broken. My best guess is that CI was not merging with the
latest `main` as we have recently noticed, so this was an issue that was fixed
by #5458

Closes #5478

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-06 17:49:16 -05:00
Oliver Gould 93f43ff462
test: Re-enable proxy logs (#5488)
Proxy logs are disabled in tests. This makes it difficult to inspect
proxies after failed tests. This change re-enables the default proxy
logs in tests.
2021-01-06 12:39:31 -08:00
Alex Leong 790be8d972
Rename proxy-mutator to jaeger-injector (#5351)
The name `proxy-mutator` is too generic.  In particular, several different linkerd extensions will have mutating webhooks which mutate the proxy sidecar, the MutatingWebhookConfiguration resource is cluster scoped, and each one needs a unique name.

We use the `jaeger-injector` name instead.  This gives us a pattern to follow for future webhooks as well (e.g. `tap-injector` etc.)

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-01-06 10:00:07 -08:00
Tarun Pothulapati e04647fb8d
remove prom check for public-api self-check (#5436)
Currently, public-api is part of the core control-plane where
the prom check fails when run before the viz extension is installed.
This change comments out that check. Once the metrics API is moved into
viz, this check could become part of it, or directly part of
`linkerd viz check`.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-05 17:22:39 -05:00
Alejandro Pedraza d3d7f4e2e2
Destination should return `OpaqueTransport` hint when annotation matches resolved target port (#5458)
The destination service now returns `OpaqueTransport` hint when the annotation
matches the resolved target port. This is different from the current behavior
which always sets the hint when a proxy is present.

Closes #5421

This happens by changing the endpoint watcher to set a pod's opaque port
annotation in certain cases. If the pod already has an annotation, then its
value is used. If the pod has no annotation, then it checks the namespace that
the endpoint belongs to; if it finds an annotation on the namespace then it
overrides the pod's annotation value with that.
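
The precedence rule in the paragraph above can be sketched as follows. The annotation key `config.linkerd.io/opaque-ports` and the `effectiveOpaquePorts` helper are assumptions for illustration; the sketch covers only which annotation value is used, not how it is matched against the target port.

```go
package main

import "fmt"

// opaquePortsAnnotation is an assumed annotation key for this sketch.
const opaquePortsAnnotation = "config.linkerd.io/opaque-ports"

// effectiveOpaquePorts returns the annotation value that applies to a pod:
// the pod's own annotation if present, otherwise the namespace's.
func effectiveOpaquePorts(podAnnotations, nsAnnotations map[string]string) (string, bool) {
	if v, ok := podAnnotations[opaquePortsAnnotation]; ok {
		return v, true
	}
	v, ok := nsAnnotations[opaquePortsAnnotation]
	return v, ok
}

func main() {
	ns := map[string]string{opaquePortsAnnotation: "3306"}
	pod := map[string]string{opaquePortsAnnotation: "11211"}
	fmt.Println(effectiveOpaquePorts(pod, ns))
	fmt.Println(effectiveOpaquePorts(nil, ns))
	fmt.Println(effectiveOpaquePorts(nil, nil))
}
```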

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-05 14:54:55 -05:00
Tarun Pothulapati 2087c95dd8
viz: move some components into linkerd-viz (#5340)
* viz: move some components into linkerd-viz

This branch moves the grafana, prometheus, web, and tap components
into a new viz chart, following the same extension model that
multi-cluster and jaeger follow.

The components in viz are not injected during install time, and
will go through the injector. `viz install` does not have any
CLI flags to customize the install directly but instead follows the Helm
way of customization by using flags such as
`set`, `set-string`, `values`, `set-files`.

**Changes Include**
- Move `grafana`, `prometheus`, `web`, `tap` templates into viz extension.
- Remove all add-on related charts, logic and tests w.r.t CLI & Helm.
- Clean up `linkerd2/values.go` & `linkerd2/values.yaml` to not contain
 fields related to viz components.
- Update `linkerd check` Healthchecks to not check for viz components.
- Create a new top level `viz` directory with CLI logic and Helm charts.
- Clean fields in the `viz/Values.yaml` to be in the `<component>.<property>`
model. Ex: `prometheus.resources`, `dashboard.image.tag`, etc so that it is
consistent everywhere.

**Testing**

```bash
# Install the Core Linkerd Installation
./bin/linkerd install | k apply -f -

# Wait for the proxy-injector to be ready
# Install the Viz Extension
./bin/linkerd cli viz install | k apply -f -

# Customized Install
./bin/linkerd cli viz install --set prometheus.enabled=false | k apply -f -
```

What is not included in this PR:
- Move of Controller from core install into the viz extension.
- Simplification and refactoring of the core chart i.e removing `.global`, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-23 20:17:31 +05:30
Kevin Leimkuhler 2c78cf9255
Remove count from opaque ports tcp metric (#5422)
We need to test for the presence of the TCP metric labels, not the exact count.
This change removes the count of `1` so that it can match any count.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-22 12:10:05 -05:00
Kevin Leimkuhler f6c8d27d83
Add multicluster check command (#5410)
## What

This change moves the `linkerd check --multicluster` functionality under its
own multicluster subcommand: `linkerd multicluster check`.

There should be no functional changes as a result of this change. `linkerd
check` no longer checks for anything multicluster related and the
`--multicluster` flag has been removed.

## Why

Closes #5208

The bulk of these changes are moving all the multicluster checks from
`pkg/healthcheck` into the multicluster package.

Doing this completely separates it from core Linkerd. It still uses
`pkg/healthcheck` when possible, but anything that is used only by `multicluster
check` has been moved.

**Note that the `kubernetes-api` and `linkerd-existence` checks are run.**

These checks are required for setting up the Linkerd health checker. They set
the health checker's `kubeAPI`, `linkerdConfig`, and `apiClient` fields.

These could be set manually so that the only check the user sees is
`linkerd-multicluster`, but I chose not to do this.

If any of the setting functions errors, it would just tell the user to run
`linkerd check` and ensure the installation is correct. I find the user error
handling to be better by including these required checks since they should be
run in the first place.

## How to test

Installing Linkerd and multicluster should result in a basic check output:

```
$ bin/linkerd install |kubectl apply -f -
..
$ bin/linkerd check
..
$ bin/linkerd multicluster install |kubectl apply -f -
..
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists


Status check results are √
```

After linking a cluster:

```
$ bin/linkerd multicluster check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * k3d-y
√ remote cluster access credentials are valid
        * k3d-y
√ clusters share trust anchors
        * k3d-y
√ service mirror controller has required permissions
        * k3d-y
√ service mirror controllers are running
        * k3d-y
× all gateway mirrors are healthy
        probe-gateway-k3d-y.linkerd-multicluster mirrored from cluster [k3d-y] has no endpoints
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints

Status check results are ×
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-21 15:50:17 -05:00
Kevin Leimkuhler 7c0843a823
Add opaque ports to destination service updates (#5294)
## Summary

This changes the destination service to start indicating whether a profile is an
opaque protocol or not.

Currently, profiles returned by the destination service are built by chaining
together updates coming from watching Profile and Traffic Split updates.

With this change, we now also watch updates to Opaque Port annotations on pods
and namespaces; if an update occurs this is now included in building a profile
update and is sent to the client.

## Details

Watching updates to Profiles and Traffic Splits is straightforward--we watch
those resources and if an update occurs on one associated to a service we care
about then the update is passed through.

For Opaque Ports this is a little different because it is an annotation on pods
or namespaces. To account for this, we watch the endpoints that we should care
about.

### When host is a Pod IP

When getting the profile for a Pod IP, we check for the opaque ports annotation
on the pod and the pod's namespace. If one is found, we indicate that the
profile is an opaque protocol when the requested port is in the annotation.

We do not subscribe for updates to this pod IP. The only update we really care
about is if the pod is deleted and this is already handled by the proxy.
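
The port lookup for a Pod IP can be sketched like this. It assumes the annotation value is a comma-separated list of port numbers, which is not stated in the text above; `parsePorts` and `isOpaque` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePorts turns an annotation value such as "3306, 11211" into a set
// of ports. Assumption: the value is a comma-separated list of numbers.
func parsePorts(annotation string) (map[uint16]struct{}, error) {
	ports := make(map[uint16]struct{})
	for _, field := range strings.Split(annotation, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		n, err := strconv.ParseUint(field, 10, 16)
		if err != nil {
			return nil, fmt.Errorf("invalid port %q: %w", field, err)
		}
		ports[uint16(n)] = struct{}{}
	}
	return ports, nil
}

// isOpaque reports whether the requested port appears in the annotation,
// in which case the profile would be marked as an opaque protocol.
func isOpaque(annotation string, port uint16) (bool, error) {
	ports, err := parsePorts(annotation)
	if err != nil {
		return false, err
	}
	_, ok := ports[port]
	return ok, nil
}

func main() {
	opaque, err := isOpaque("3306, 11211", 3306)
	fmt.Println(opaque, err)
}
```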

### When host is a Service

When getting the profile for a Service, we subscribe for updates to the
endpoints of that service. For any ports set in the opaque ports annotation on
any of the pods, we check if the requested port is present.

Since the endpoints for a service can be added and removed, we do subscribe for
updates to the endpoints of the service.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-18 12:38:59 -05:00
Alex Leong cdc57d1af0
Use linkerd-jaeger extension for control plane tracing (#5299)
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:

* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests

We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension.  To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.

Note that for now, only the control plane components send traces; the proxies in the control plane do not.  This is because the linkerd-jaeger injector is not yet available.  However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.

I tested this by doing the following:

1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-08 14:34:26 -08:00
Alejandro Pedraza deca7ede08
Consolidate integration tests under k3d (#5245)
* Consolidate integration tests under k3d

Fixes #5007

Simplified integration tests by moving all to k3d. Previously things were running in Kind, except for the multicluster tests, which implied some extra complexity in the supporting scripts.

Removed the KinD config files under `test/integration/configs`, as config is now passed as flags into the `k3d` command.

Also renamed `kind_integration.yml` to `integration_tests.yml`

Test skipping logic under ARM was also simplified.
2020-11-18 14:33:16 -05:00
Alejandro Pedraza 948aa23b2a
Remove logs comparisons in integration tests (#5223)
The rare cases where these tests were useful don't make up for the burden of
maintaining them, having different k8s versions change the messages and
having unexpected warnings come up that didn't affect the final
convergence of the system.

With this we also revert the indirection added back in #4538 that
fetched unmatched warnings after a test had failed.
2020-11-13 16:00:16 -05:00
Tarun Pothulapati e4c354985c
cli: Remove get cmd and relevant tests (#5202)
Fixes #5190

`linkerd get` is not used currently and works only for pods. As per the
issue, it can be removed. This branch removes the command and
also the associated unit and integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-12 11:19:46 -08:00
Alex Leong da194f5dc3
Warn when webhook certificates near expiry (#5155)
Fixes #5149 

Before:

```
linkerd-webhooks-and-apisvc-tls
-------------------------------
× tap API server has valid cert
    certificate will expire on 2020-10-28T20:22:32Z
    see https://linkerd.io/checks/#l5d-tap-cert-valid for hints
```

After:

```
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ tap API server has valid cert
‼ tap API server cert is valid for at least 60 days
    certificate will expire on 2020-10-28T20:22:32Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ proxy-injector webhook has valid cert
‼ proxy-injector cert is valid for at least 60 days
    certificate will expire on 2020-10-29T18:17:03Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ sp-validator webhook has valid cert
‼ sp-validator cert is valid for at least 60 days
    certificate will expire on 2020-10-28T20:21:34Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
```
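
The split between a hard validity check and a softer expiry warning can be sketched as below. The 60-day window comes from the check names above; the `checkCert` helper and its string-based signature are hypothetical and chosen only to keep the example small.

```go
package main

import (
	"fmt"
	"time"
)

// expiryWarningWindow matches the "valid for at least 60 days" check.
const expiryWarningWindow = 60 * 24 * time.Hour

// checkCert takes RFC 3339 timestamps for the certificate's expiry and the
// current time. valid is false once the certificate has expired;
// expiringSoon is true when a still-valid certificate expires within the
// warning window.
func checkCert(notAfter, now string) (valid, expiringSoon bool, err error) {
	expiry, err := time.Parse(time.RFC3339, notAfter)
	if err != nil {
		return false, false, err
	}
	current, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return false, false, err
	}
	valid = current.Before(expiry)
	expiringSoon = valid && expiry.Sub(current) < expiryWarningWindow
	return valid, expiringSoon, nil
}

func main() {
	fmt.Println(checkCert("2020-10-28T20:22:32Z", "2020-10-01T00:00:00Z"))
}
```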

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-30 11:48:51 -07:00
Alejandro Pedraza 177669b377
Remove code refs to controllerImageVersion (#5119)
Followup to #5100

We had both `controllerImageVersion` and `global.controllerImageVersion`
configs, but only the latter was taken into account in the chart
templates, so this change removes all references to the former.
2020-10-21 13:40:25 -05:00
Oliver Gould 25e49433fd
Do not permit cluster networks to be overridden per-pod (#5111)
In #5110 the `global.proxy.destinationGetNetworks` configuration is
renamed to `global.clusterNetworks` to better reflect its purpose.

The `config.linkerd.io/proxy-destination-get-networks` annotation allows
this configuration to be overridden per-workload, but there's no real use
case for this. I don't think we want to support this value differing
between pods in a cluster. No good can come of it.

This change removes support for the `proxy-destination-get-networks`
annotation.
2020-10-21 09:34:13 -07:00
Oliver Gould f0820bdfbf
inject: Use 'quote' function in proxy template (#5107)
As described in #5105, it's not currently possible to set the proxy log
level to `off`. The proxy injector's template does not quote the log
level value, and so the `off` value is handled as `false`. Thanks, YAML.

This change updates the proxy template to use helm's `quote` function
throughout, replacing manually quoted values and fixing the quoting for
the log level value.
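
The effect of quoting can be seen in a small template sketch. It uses Go's `text/template` with `strconv.Quote` registered under the name `quote` as a stand-in for Helm's function; the `render` helper and the `LogLevel` key are hypothetical. An unquoted `off` in the rendered YAML is read as a boolean by YAML 1.1 parsers, while the quoted form stays a string.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/template"
)

// render executes tmpl with a single LogLevel value. The "quote" function
// is a stand-in for Helm's quote, implemented here with strconv.Quote.
func render(tmpl, level string) (string, error) {
	funcs := template.FuncMap{"quote": strconv.Quote}
	t, err := template.New("proxy").Funcs(funcs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]string{"LogLevel": level}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	templates := []string{
		"value: {{.LogLevel}}",         // renders an unquoted scalar
		"value: {{.LogLevel | quote}}", // renders a quoted string
	}
	for _, tmpl := range templates {
		out, err := render(tmpl, "off")
		fmt.Println(out, err)
	}
}
```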

We also remove the default logFormat value, as the default is specified
in values.yaml.
2020-10-20 15:36:10 -07:00
Oliver Gould 4f16a234aa
Add a default set of ports to bypass the proxy (#5093)
The proxy has a default, hardcoded set of ports on which it doesn't do
protocol detection (25, 587, 3306 -- all of which are server-first
protocols). In a recent change, this default set was removed from
the outbound proxy, since there was no way to configure it to anything
other than the default set. I had thought that there was a default set
applied to proxy-init, but this appears to not be the case.

This change adds these ports to the default Helm values to restore the
prior behavior.

I have also elected to include 443 in this set, as it is generally our
recommendation to avoid proxying HTTPS traffic, since the proxy provides
very little value on these connections today.

Additionally, the memcached port 11211 is skipped by default, as clients
do not issue any sort of preamble that is immediately detectable.

These defaults may change in the future, but seem like good choices for
the 2.9 release.
2020-10-16 11:53:41 -07:00
Alex Leong 9701f1944e
Stop rendering addon config (#5078)
The linkerd-addon-config is no longer used and can be safely removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-16 11:07:51 -07:00
Oliver Gould 60a742ab56
tests: Consolidate TestHelper.LinkerdRun error handling (#5057)
Most invocations of `TestHelper.LinkerdRun` don't actually need the stderr
output except to encode it in the error message. This changes this helper
to return an error that includes the full invoked command and error message.

Invocations that need direct access to stderr must call `TestHelper.PipeToLinkerdRun`.
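
The pattern described above (fold the invoked command and its stderr into the returned error) can be sketched in a language-neutral way. This is not the `TestHelper` code; `run` and `CommandError` are hypothetical names.

```python
import shlex
import subprocess
import sys


class CommandError(RuntimeError):
    """Carries the full command line and stderr of a failed invocation."""


def run(*args):
    """Run a command and return its stdout; raise CommandError on failure."""
    proc = subprocess.run(args, capture_output=True, text=True)
    if proc.returncode != 0:
        cmd = " ".join(shlex.quote(a) for a in args)
        raise CommandError(
            f"'{cmd}' exited with status {proc.returncode}:\n{proc.stderr.strip()}"
        )
    return proc.stdout


# Callers handle one error value instead of juggling stdout, stderr, and status.
try:
    run(sys.executable, "-c", "import sys; sys.exit('boom')")
except CommandError as err:
    print(err)
```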
2020-10-15 14:57:03 -07:00
Oliver Gould 222c11400b
tests: Set proxy log to linkerd=debug (#5081)
The proxy log level `linkerd2_proxy=debug` only enables logging from a
few proxy modules. We should instead use the more general
`linkerd=debug`.
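
One way to see why the shorter target is more general: assuming the log filter matches each directive against the start of a module path, a sketch of that matching looks like the following. The module paths and the `off` default are hypothetical, chosen only for illustration.

```python
def parse_filter(spec):
    """Parse 'target=level,target=level' into (target, level) pairs."""
    directives = []
    for part in spec.split(","):
        target, _, level = part.strip().partition("=")
        directives.append((target, level))
    return directives


def level_for(module_path, spec, default="off"):
    """Return the level of the longest directive whose target prefixes module_path."""
    best = None
    for target, level in parse_filter(spec):
        if module_path.startswith(target):
            if best is None or len(target) > len(best[0]):
                best = (target, level)
    return best[1] if best else default


# Hypothetical module paths, for illustration only.
modules = ["linkerd2_proxy::main", "linkerd2_app_core::svc", "linkerd2_dns::resolver"]
for module in modules:
    print(module,
          level_for(module, "linkerd2_proxy=debug"),
          level_for(module, "linkerd=debug"))
```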
2020-10-14 15:31:03 -07:00
Alex Leong 9553fbcd75
Skip SMI metrics integration test on arm (#5086)
The SMI metrics image does not yet support arm, so we must skip the SMI metrics integration test when running on arm.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-14 14:10:33 -07:00
Zahari Dichev de8855c096
More comprehensive injection integration test (#5049)
The purpose of this test is to validate that the auto injector configures the proxy and the additional containers according to the specified config.

This is done by providing a helper that can generate the desired annotations and later inspect an injected pod in order to determine that every bit of configuration has been accounted for. This test is to provide further assurance that #5036 did not introduce any regressions.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-10-14 19:31:56 +03:00
Oliver Gould 5e7e7e6477
proxy: v2.115.0 (#5076)
This release fixes several recent regressions:

1. The proxy could incorrectly emit inbound requests with absolute-form
   URIs.
2. Inbound tap metadata did not include source addresses or identities.
3. Gateway requests included the incorrect port in the
   `l5d-dst-canonical` header.
4. Gateway requests never included a `Host` header.

Furthermore, support for the
`LINKERD2_PROXY_OUTBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` environment
variable has been removed in anticipation of control plane changes that
will provide this configuration via service profiles. This configuration
is never set by the proxy injector, so this change does not pose any
issues with regard to compatibility.

---

* metrics: Coerce targets to metric labels by-reference (linkerd/linkerd2-proxy#706)
* outbound: Unify TCP & HTTP target types (linkerd/linkerd2-proxy#707)
* inbound: Fix source tap annotations (linkerd/linkerd2-proxy#712)
* trace-context: Simplify implementation with async (linkerd/linkerd2-proxy#710)
* outbound: Use profile to inform protocol detection (linkerd/linkerd2-proxy#708)
* inbound: Fix URI normalization for orig-proto requests (linkerd/linkerd2-proxy#713)
* outbound: more TCP tests, test cleanup (linkerd/linkerd2-proxy#711)
* gateway: Ensure proper outbound metadata (linkerd/linkerd2-proxy#715)
2020-10-14 08:11:17 -07:00
Tarun Pothulapati 1e7bb1217d
Update Injection to use new linkerd-config.values (#5036)
This PR updates the injection logic (both CLI and proxy-injector)
to use the `Values` struct instead of the protobuf Config, as part of
our move to remove the protobuf.

This does not touch any of the flags or install-related code.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

Co-authored-by: Alex Leong <alex@buoyant.io>
2020-10-07 09:54:34 -07:00
Alex Leong 1784f0643e
Add linkerd-config-overrides secret (#4911)
This PR adds a new secret to the output of `linkerd install` called `linkerd-config-overrides`.  This is the first step towards simplifying the configuration of the linkerd install and upgrade flow through the CLI.  This secret contains the subset of the values.yaml which have been overridden.  In other words, the subset of values which differ from their default values.  The idea is that this will give us a simpler way to produce the `linkerd upgrade` output while still persisting options set during install.  This will eventually replace the `linkerd-config` configmap entirely.

This PR only adds and populates the new secret.  The secret is not yet read or used anywhere.  Subsequent PRs will update individual control plane components to accept their configuration through flags and will update the `linkerd upgrade` flow to use this secret instead of the `linkerd-config` configmap.

This secret is only generated by the CLI and is not present or required when installing or upgrading with Helm.

Here are sample contents of the secret, base64 decoded.  Note that identity tls context is saved as an override so that it can be persisted across updates.  Since these fields contain private key material, this object must be a secret.  This secret is only used for upgrades and thus only the CLI needs to be able to read it.  We will not create any RBAC bindings to grant service accounts access to this secret.

```
global:
  identityTrustAnchorsPEM: |
    -----BEGIN CERTIFICATE-----
    MIIBhDCCASmgAwIBAgIBATAKBggqhkjOPQQDAjApMScwJQYDVQQDEx5pZGVudGl0
    eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMjAwODI1MjMzMTU3WhcNMjEwODI1
    MjMzMjE3WjApMScwJQYDVQQDEx5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9j
    YWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQ0e7IPBlVZ03TL8UVlODllbh8b
    2pcM5mbtSGgpX9z0l3n5M70oHn715xu2szh63oBjPl2ZfOA5Bd43cJIksONQo0Iw
    QDAOBgNVHQ8BAf8EBAMCAQYwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMC
    MA8GA1UdEwEB/wQFMAMBAf8wCgYIKoZIzj0EAwIDSQAwRgIhAI7Sy8P+3TYCJBlK
    pIJSZD4lGTUyXPD4Chl/FwWdFfvyAiEA6AgCPbNCx1dOZ8RpjsN2icMRA8vwPtTx
    oSfEG/rBb68=
    -----END CERTIFICATE-----
heartbeatSchedule: '42 23 * * * '
identity:
  issuer:
    crtExpiry: "2021-08-25T23:32:17Z"
    tls:
      crtPEM: |
        -----BEGIN CERTIFICATE-----
        MIIBhDCCASmgAwIBAgIBATAKBggqhkjOPQQDAjApMScwJQYDVQQDEx5pZGVudGl0
        eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMjAwODI1MjMzMTU3WhcNMjEwODI1
        MjMzMjE3WjApMScwJQYDVQQDEx5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9j
        YWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQ0e7IPBlVZ03TL8UVlODllbh8b
        2pcM5mbtSGgpX9z0l3n5M70oHn715xu2szh63oBjPl2ZfOA5Bd43cJIksONQo0Iw
        QDAOBgNVHQ8BAf8EBAMCAQYwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMC
        MA8GA1UdEwEB/wQFMAMBAf8wCgYIKoZIzj0EAwIDSQAwRgIhAI7Sy8P+3TYCJBlK
        pIJSZD4lGTUyXPD4Chl/FwWdFfvyAiEA6AgCPbNCx1dOZ8RpjsN2icMRA8vwPtTx
        oSfEG/rBb68=
        -----END CERTIFICATE-----
      keyPEM: |
        -----BEGIN EC PRIVATE KEY-----
        MHcCAQEEIJaqjoDnqkKSsTqJMGeo3/1VMfJTBsMEuMWYzdJVxIhToAoGCCqGSM49
        AwEHoUQDQgAENHuyDwZVWdN0y/FFZTg5ZW4fG9qXDOZm7UhoKV/c9Jd5+TO9KB5+
        9ecbtrM4et6AYz5dmXzgOQXeN3CSJLDjUA==
        -----END EC PRIVATE KEY-----
```
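
The "subset of values which differ from their defaults" idea can be sketched as a recursive diff over nested maps, paired with a merge that reapplies the overrides on upgrade. This is a minimal illustration, not the CLI's implementation; `compute_overrides`, `apply_overrides`, and the default values shown are hypothetical.

```python
def compute_overrides(defaults, values):
    """Return the nested subset of `values` that differs from `defaults`."""
    overrides = {}
    for key, value in values.items():
        if key not in defaults:
            overrides[key] = value
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            nested = compute_overrides(defaults[key], value)
            if nested:
                overrides[key] = nested
        elif value != defaults[key]:
            overrides[key] = value
    return overrides


def apply_overrides(defaults, overrides):
    """Merge stored overrides onto a (possibly newer) set of defaults."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical defaults; the overridden values mirror the sample above.
defaults = {
    "heartbeatSchedule": "0 0 * * *",
    "identity": {"issuer": {"crtExpiry": "", "exampleUnchanged": "20s"}},
}
installed = {
    "heartbeatSchedule": "42 23 * * * ",
    "identity": {
        "issuer": {"crtExpiry": "2021-08-25T23:32:17Z", "exampleUnchanged": "20s"}
    },
}

overrides = compute_overrides(defaults, installed)
print(overrides)
print(apply_overrides(defaults, overrides) == installed)
```

Storing only the diff means new defaults in a later release still take effect on upgrade, while anything set explicitly at install time is preserved.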

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-09-29 08:01:36 -07:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Alejandro Pedraza e8f0724a71
Multicluster integration test (#4998)
This implements the run_multicluster_test() function in bin/_test-helpers.sh.

The idea is to create two clusters (source and target) using k3d, with linkerd and multicluster support in both, plus emojivoto (without vote-bot) in target, and vote-bot in source.
We then link the clusters and make sure traffic is flowing.

Detailed sequence:

1. Create certificates.
2. Install linkerd along with multicluster support in the target cluster.
3. Run the target1 test: install emojivoto in the target cluster (without vote-bot).
4. Run `linkerd mc link` on the target cluster.
5. Install linkerd along with multicluster support in the source cluster.
6. Apply the link resource in the source cluster.
7. Run the source test: check that `linkerd mc gateways` returns the target cluster link, and install only emojivoto's vote-bot in the source cluster. Note that vote-bot's yaml defines the web-svc service as web-svc-target.emojivoto:80.
8. Run the target2 test: make sure web-svc in the target cluster is receiving requests.
2020-09-26 05:26:23 -05:00
Alejandro Pedraza b50ae6290d
Add support for k3d in integration tests (#4994)
* Add support for k3d in integration tests

KinD doesn't support setting LoadBalancer services out of the box. It can be added with some additional work, but it seems the solutions are not cross-platform.

K3d on the other hand facilitates this, so we'll be using k3d clusters for the multicluster integration test.

The current change sets the ground by generalizing some of the integration tests operations that were hard-coded to KinD.

- Added `bin/k3d` to wrap the setup and running of a pinned version of `k3d`.
- Refactored `bin/_test-helpers.sh` to account for tests to be run in either KinD or k3d.
- Renamed `bin/kind-load` to `bin/image-load` and make it more generic to load images for both KinD (default) and k3d. Also got rid of the no longer used `--images-host` option.
- Added a placeholder for the new `multicluster` test in the lists in `bin/_test-helpers.sh`. It starts by setting up two k3d clusters.

* Refactor handling of the `--multicluster` flag in integration tests (#4995)

Followup to #4994, based off of that branch (`alpeb/k3d-tests`).
This is more preliminary work leading up to the more complete multicluster integration test.

- Removed the `--multicluster` flag from all the tests we had in `bin/_test-helpers.sh`, so only the new "multicluster" integration test will make use of that. Also got rid of the `TestUninstallMulticluster()` test in `install_test.go` to keep the multicluster stuff around, needed for the more complete multicluster test that will be implemented in a followup PR.
- Added "multicluster" to the list of tests in the `kind_integration.yml` workflow.
- For now, this new "multicluster" test in `run_multicluster_test()` is just running the install tests (`test/integration/install_test.go`) with the `--multicluster` flag.

Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-09-25 16:33:17 -05:00
Alejandro Pedraza 0f869f2e50
Ability for int tests to use external certs generated with openssl (#4997)
Adds bin/certs-openssl, which creates self-signed root cert/key and issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl and to avoid having to install step.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
2020-09-25 11:25:29 -05:00
Tarun Pothulapati 3d900ccc19
Integration test for smi-metrics (#4844)
* Integration test for smi-metrics

This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.

Currently, we store the SMI Helm package locally and run the test on top of it,
so that our CI does not break. We will periodically update the package
based on newer releases of SMI-Metrics.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 22:49:20 +05:30
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`

This is different from the CNI integration tests that we run in
cloud integration, which perform the CNI-level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
OlivierB f599bf9b10
Helm chart - linkerd2-collector : enable jaeger receiver (#4783)
Fixes #4778

Signed-off-by: Olivier Boudet <o.boudet@gmail.com>
2020-09-21 12:17:04 -07:00
Alejandro Pedraza ccf027c051
Push docker images to ghcr.io instead of gcr.io (#4953)
* Push docker images to ghcr.io instead of gcr.io

The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.

Besides the changes to `cloud_integration.yml` and `release.yml`, the upgrade-stable integration test now runs `linkerd upgrade --addon-overwrite` to reset the add-on settings, because in stable-2.8.1 the Grafana image was pegged to `gcr.io/linkerd-io/grafana` in `linkerd-config-addons`. This will need to be mentioned in the 2.9 upgrade notes.

The egress integration test also has a debug container that is now pegged to the `edge-20.9.2` tag.

Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
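
For readers more comfortable outside of sed, the same substitution can be sketched as below. The function name `rewrite_image` is hypothetical, and the tag in the usage line is only an example; unlike the sed expression, the dot is escaped here so that it matches only a literal `.`.

```python
import re

# Matches the old registry prefix; the dot is escaped to match literally.
_OLD_REGISTRY = re.compile(r"gcr\.io/linkerd-io")


def rewrite_image(ref):
    """Rewrite an image reference from the old registry path to the new one."""
    return _OLD_REGISTRY.sub("ghcr.io/linkerd", ref)


print(rewrite_image("gcr.io/linkerd-io/grafana:stable-2.8.1"))
```

Applying the function to an already rewritten reference leaves it unchanged, so the replacement is safe to run more than once.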
2020-09-10 15:16:24 -05:00
Oliver Gould 7ee638bb0c
inject: Configure the proxy to discover profiles for unnamed services (#4960)
The proxy performs endpoint discovery for unnamed services, but not
service profiles.

The destination controller and proxy have been updated to support
lookups for unnamed services in linkerd/linkerd2#4727 and
linkerd/linkerd2-proxy#626, respectively.

This change modifies the injection template so that the
`proxy.destinationGetNetworks` configuration enables profile
discovery for all networks on which endpoint discovery is permitted.
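
A minimal sketch of what "permitted networks" means in practice: a lookup is attempted only when the target address falls inside one of the configured CIDR ranges. The network list below is an assumption for illustration (private IPv4 ranges), and `parse_networks` and `discovery_permitted` are hypothetical names, not the proxy's API.

```python
import ipaddress

# Assumed example value; the real configuration may differ.
EXAMPLE_NETWORKS = "10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"


def parse_networks(spec):
    """Parse a comma-separated list of CIDR blocks."""
    return [ipaddress.ip_network(n.strip()) for n in spec.split(",") if n.strip()]


def discovery_permitted(addr, networks):
    """Return True if `addr` falls within any configured network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


networks = parse_networks(EXAMPLE_NETWORKS)
for addr in ("10.1.2.3", "8.8.8.8"):
    print(addr, discovery_permitted(addr, networks))
```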
2020-09-10 12:44:00 -07:00
Zahari Dichev 084bb678c7
Perform TLS checks on injector, sp validator and tap (#4924)
* Check sp-validator,proxy-injector and tap certs

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-09-10 11:21:23 -05:00
Zahari Dichev 77c88419b8
Make destination and identity services headless (#4923)
* Make destination and identity svcs headless

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-09-02 14:53:38 -05:00
Ali Ariff 5186383c81
Add ARM64 Integration Test (#4897)
* Add ARM64 Integration Test

Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>
2020-08-28 10:38:40 -07:00
Tarun Pothulapati c9c5d97405
Remove SMI-Metrics charts and commands (#4843)
Fixes #4790

This PR removes both the SMI-Metrics templates along with the
experimental sub-commands. This also removes pkg `smi-metrics`
as there is no direct use of it without the commands.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-08-24 14:35:33 -07:00
Josh Soref 72aadb540f
Spelling (#4872)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at aaf440489e (commitcomment-41423663)

The action reports that the changes in this PR would make it happy: 5b82c6c5ca

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-08-12 21:59:50 -07:00