747 Commits

Author SHA1 Message Date
Naseem 6aa1e76096
Allow config of prometheus alertmanagers, rules and extra args. (#4220)
This allows end-user flexibility for options such as log format. Rather than bubbling up each possible config option into helm values, extra arguments provide more flexibility.

Adding the prometheusAlertmanagers value allows configuring a list of statically targeted alertmanager instances.
Use rule configmaps for prometheus rules. They take a list of {name,subPath,configMap} values and mount them accordingly. Provided that subpaths end with _rules.yml or _rules.yaml, they should be loaded by prometheus as per prometheus.yml's rule_files content.
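
The naming rule above can be sketched in Go. This is only an illustration: the globs are assumed from the two suffixes named in this message (the literal `rule_files` entries are not shown here), and `isRuleFile` is a hypothetical helper, not part of the chart.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// ruleGlobs is an assumption: it mirrors the suffixes named in the commit
// message, not the literal rule_files entries from prometheus.yml.
var ruleGlobs = []string{"*_rules.yml", "*_rules.yaml"}

// isRuleFile reports whether the base name of subPath matches one of the globs.
func isRuleFile(subPath string) bool {
	base := filepath.Base(subPath)
	for _, g := range ruleGlobs {
		if ok, err := filepath.Match(g, base); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"alerting_rules.yml", "recording_rules.yaml", "alerts.yml"} {
		fmt.Printf("%s -> %v\n", p, isRuleFile(p))
	}
}
```

A subPath such as `alerts.yml` would be mounted but, under these assumed globs, not loaded as a rule file.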

Signed-off-by: Naseem <naseem@transit.app>
2020-05-04 14:06:10 -05:00
Alex Leong 40b921508f
Inject LINKERD2_PROXY_DESTINATION_GET_NETWORKS proxy variable (#4300)
Fixes #3807

By setting the LINKERD2_PROXY_DESTINATION_GET_NETWORKS environment variable, we configure the Linkerd proxy to do destination lookups for authorities which are IP addresses in the private network range.  This allows us to get destination metadata, including identity, for HTTP requests which target an IP address in the cluster (Prometheus metrics scrape requests, for example).
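
As a rough sketch of the membership check this implies, the Go snippet below tests whether an IP authority falls inside a set of networks. The RFC 1918 ranges are an assumption, since this message only says "private network range" and does not give the variable's exact value; `shouldLookup` is a hypothetical name, and the proxy itself is not written in Go.

```go
package main

import (
	"fmt"
	"net"
)

// privateNetworks lists the RFC 1918 ranges. This is an assumption: the
// commit message does not state the exact value of
// LINKERD2_PROXY_DESTINATION_GET_NETWORKS.
var privateNetworks = mustParseCIDRs("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")

func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

// shouldLookup reports whether addr is an IP address inside one of the
// configured networks, and so would get a destination lookup.
func shouldLookup(addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false // not an IP address; named authorities are out of scope here
	}
	for _, n := range privateNetworks {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	for _, a := range []string{"10.1.2.3", "172.20.0.1", "8.8.8.8"} {
		fmt.Printf("%s -> %v\n", a, shouldLookup(a))
	}
}
```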

This change allowed us to update the "direct edges" test which ensures that the edges command produces correct output for traffic which is addressed directly to a pod IP.

We also re-enabled the "linkerd stat" integration tests which had been disabled while the destination service did not yet support these types of IP queries.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-30 11:22:24 -07:00
Zahari Dichev 00f17d2ed6
Make export-service non side-effecting (#4307)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-30 17:11:55 +03:00
Tarun Pothulapati e75c6580ec
refactor TestRenderHelm to not need addOn list (#4297)
- rather than passing the list of add-ons, they can instead be built from the values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-29 23:44:30 +05:30
Zahari Dichev 5149152ef3
Multicluster gateway and remote setup command (#4265)
Add multicluster gateway and setup command

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-29 20:33:23 +03:00
Zahari Dichev 17dacf5548
Add gateways command, allowing the retrieval of gateway stats (#4241)
Add gateways command, allowing the retrieval of gateway stats

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-27 13:55:01 +03:00
Zahari Dichev 09262ebd72
Add liveness checks and metrics for multicluster gateway (#4233)
Add liveness checks for gateway

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-27 13:06:58 +03:00
Alejandro Pedraza dacf87e084
Added missing annotations to addon test fixtures (#4286)
Followup to #4271

Add missing annotation `linkerd.io/workload-ns: linkerd` in the
addons test fixtures, introduced by the downward API work from #4199
2020-04-23 16:15:16 -05:00
Tarun Pothulapati 60ffd1c2a2
Support Multi-stage install with Add-Ons (#4271)
* Support Multi-stage install with Add-Ons
* add upgrade tests for add-ons
* add multi stage upgrade unit tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-23 14:40:58 -05:00
Tarun Pothulapati 2b1cbc6fc1
charts: Using downwardAPI to mount labels to the proxy container (#4199)
* use downward API to mount labels to the proxy container as a volume
* add namespace as a label to the pod
* add a trace inject test
* add downwardAPI for controlPlaneTracing
* add controlPlaneTracing condition to volumeMounts
* update add-ons to have workload-ns
* add workload-ns label to control-plane components

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-22 10:33:51 -05:00
Alejandro Pedraza b00a84126d
Some `linkerd stat` test failures were being hidden (#4272)
* Some `linkerd stat` test failures were being hidden

`linkerd stat` was doing an early `os.Exit(0)` when no traffic was
found, which prevented `go test` from reporting any test failure that ended in
that code path.

This was hiding a mismatch in the golden files for HA after the
introduction of the rolling update strategy (#4267), and the failure of
`linkerd stat trafficsplit` not returning results unless `--unmeshed` is
used. For the latter, I added the flag to the tests in order to temporarily pass
them, but the underlying issue remains to be fixed in a separate
PR.
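
The general pattern for avoiding this kind of hidden failure can be sketched as follows. This is not the code from this PR: `renderStats` and `errNoTraffic` are hypothetical names, and the point is only that helpers return an error while the entry point alone decides how the process ends.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoTraffic is a hypothetical sentinel for the "no traffic found" case.
var errNoTraffic = errors.New("no traffic found")

// renderStats is a hypothetical stand-in for the stat rendering code. It
// returns an error instead of calling os.Exit, so a test that calls it can
// still reach its assertions and report failures.
func renderStats(rows []string) (string, error) {
	if len(rows) == 0 {
		return "", errNoTraffic
	}
	out := ""
	for _, r := range rows {
		out += r + "\n"
	}
	return out, nil
}

func main() {
	out, err := renderStats(nil)
	if err != nil {
		// Only the outermost entry point decides how the process ends.
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Print(out)
}
```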
2020-04-21 14:52:09 -05:00
Kevin Leimkuhler 2c38f228f7
Add MeshedPodCount field to TS resource rows (#4273)
The addition of the `--unmeshed` flag changed the rendering behavior of the
`stat` command so that resources with 0 meshed pods are not displayed by
default.

Rendering is based off the row's `MeshedPodCount` field which is currently not
set by `func trafficSplitResourceQuery`. This change sets that field now so
that in rendering, the trafficsplit resource is rendered in the output.

The reason for this not showing up in testing is addressed by #4272 where the
`stat` command behavior for no traffic is changed.

The following now works without `--unmeshed` flag being passed:

```
❯ bin/linkerd stat -A ts
NAMESPACE   NAME                    APEX          LEAF          WEIGHT   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
default     backend-traffic-split   backend-svc   backend-svc     500m         -     -             -             -             -
default     backend-traffic-split   backend-svc   failing-svc        0         -     -             -             -             -
```
2020-04-21 10:23:35 -07:00
Alex Leong 5d3862c120
Use /live for liveness probe (#4270)
Fixes #3984

We use the new `/live` admin endpoint in the Linkerd proxy for liveness probes instead of the `/metrics` endpoint.  This endpoint returns a much smaller payload.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-17 14:53:32 -07:00
Tarun Pothulapati 8e56166774
Refactor AddOn Installation (#4247)
* refactor add-ons install code

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-16 15:24:55 -05:00
Alex Leong e962bf1968
Improve proxy version diagnostics (#4244)
It can be difficult to know which versions of the proxy are running in your cluster, especially when you have pods running at multiple different proxy versions.

We add two pieces of CLI functionality to assist with this:

The `linkerd check --proxy` command will now list all data plane pods which are not up-to-date rather than just printing the first one it encounters:

```
‼ data plane is up-to-date
    Some data plane pods are not running the current version:
	* default/books-84958fff5-95j75 (git-ca760bdd)
	* default/authors-57c6dc9b47-djldq (git-ca760bdd)
	* default/traffic-85f58ccb66-vxr49 (git-ca760bdd)
	* default/release-name-smi-metrics-899c68958-5ctpz (git-ca760bdd)
	* default/webapp-6975dc796f-2ngh4 (git-ca760bdd)
	* default/webapp-6975dc796f-z4bc4 (git-ca760bdd)
	* emojivoto/voting-54ffc5787d-wj6cp (git-ca760bdd)
	* emojivoto/vote-bot-7b54d6999b-57srw (git-ca760bdd)
	* emojivoto/emoji-5cb99f85d8-5bhvm (git-ca760bdd)
	* emojivoto/web-7988674b8b-zfvvm (git-ca760bdd)
	* default/webapp-6975dc796f-d2fbc (git-ca760bdd)
	* default/curl (git-7f6bbc73)
    see https://linkerd.io/checks/#l5d-data-plane-version for hints
```

The `linkerd version` command now supports a `--proxy` flag which will list all proxy versions running in the cluster and the number of pods running each version:

```
linkerd version --proxy
Client version: dev-7b9d475f-alex
Server version: edge-20.4.1
Proxy versions:
	edge-20.4.1 (10 pods)
	git-ca760bdd (11 pods)
	git-7f6bbc73 (1 pods)
```
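
The per-version tally shown above amounts to a simple count. A minimal Go sketch, assuming a hypothetical map of pod name to proxy version as input (the real command gathers this from the cluster):

```go
package main

import (
	"fmt"
	"sort"
)

// countByVersion tallies pods per proxy version. The input map of pod name
// to version is a hypothetical shape chosen for this sketch.
func countByVersion(podVersions map[string]string) map[string]int {
	counts := make(map[string]int)
	for _, v := range podVersions {
		counts[v]++
	}
	return counts
}

func main() {
	pods := map[string]string{
		"default/curl":                     "git-7f6bbc73",
		"emojivoto/web-7988674b8b-zfvvm":   "git-ca760bdd",
		"emojivoto/emoji-5cb99f85d8-5bhvm": "git-ca760bdd",
	}
	counts := countByVersion(pods)

	versions := make([]string, 0, len(counts))
	for v := range counts {
		versions = append(versions, v)
	}
	sort.Strings(versions) // sorted only to keep the output deterministic

	fmt.Println("Proxy versions:")
	for _, v := range versions {
		fmt.Printf("\t%s (%d pods)\n", v, counts[v])
	}
}
```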

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-16 11:28:19 -07:00
Alejandro Pedraza 7d07504b5b
Upgrade crashes proxy-init when skipping ports (#4258)
Fixes #4257

This was introduced in 2.7.0. When performing an upgrade on an
installation having used `--skip-outbound-ports` or
`--skip-inbound-ports`, the upgrade picks those values from the
ConfigMap, parses them wrongly, and then when proxy-init picks them the
iptables commands fail.
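
To illustrate why strict parsing matters here, the Go sketch below validates a port list before it could reach iptables. It assumes a comma-separated list of single ports; `parsePorts` is a hypothetical helper and not the code changed by this fix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePorts converts a comma-separated list such as "25,3306" into port
// numbers, rejecting anything that is not an integer in 1..65535.
// The comma-separated format is an assumption made for this sketch.
func parsePorts(s string) ([]uint16, error) {
	if strings.TrimSpace(s) == "" {
		return nil, nil
	}
	var ports []uint16
	for _, field := range strings.Split(s, ",") {
		field = strings.TrimSpace(field)
		n, err := strconv.ParseUint(field, 10, 16)
		if err != nil || n == 0 {
			return nil, fmt.Errorf("invalid port %q", field)
		}
		ports = append(ports, uint16(n))
	}
	return ports, nil
}

func main() {
	// The second input is a made-up malformed value, e.g. a slice that was
	// stringified with Go's default formatting.
	for _, in := range []string{"25,3306", "[25 3306]"} {
		ports, err := parsePorts(in)
		fmt.Printf("%q -> %v, err=%v\n", in, ports, err)
	}
}
```

Failing at parse time surfaces the problem during the upgrade rather than later, when proxy-init runs.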

I've also improved one of the upgrade unit tests to include these flags,
and confirmed it failed before this fix.
2020-04-15 07:11:15 -05:00
Kevin Leimkuhler 0d235694af
Add `unmeshed` flag to stat command (#4254)
## Motivation

Introduces an `unmeshed` flag to the `stat` command so that users can opt-in
to viewing unmeshed resources in the `stat` output.

This changes the existing behavior of the `stat` command such that unmeshed
resources no longer render by default in the output.

Before:

```
❯ bin/linkerd stat -A deploy
NAMESPACE     NAME                     MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
kube-system   coredns                     0/1         -        -             -             -             -          -
kube-system   local-path-provisioner      0/1         -        -             -             -             -          -
kube-system   metrics-server              0/1         -        -             -             -             -          -
kube-system   traefik                     0/1         -        -             -             -             -          -
linkerd       linkerd-controller          1/1   100.00%   0.3rps           1ms           2ms           2ms          2
linkerd       linkerd-destination         1/1   100.00%   0.3rps           1ms           1ms           1ms         11
...
```

After:

```
❯ bin/linkerd stat -A deploy
NAMESPACE   NAME                     MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
linkerd     linkerd-controller          1/1   100.00%   0.3rps           1ms           1ms           1ms          2
linkerd     linkerd-destination         1/1   100.00%   0.3rps           1ms           2ms           2ms         13
...
```

Closes #3871

## Solution

Using the meshed pod count in the stat response, resources with a count of `0`
are not rendered in the table.
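
A minimal Go sketch of that filter, using a hypothetical trimmed-down `row` type rather than the actual stat response types:

```go
package main

import "fmt"

// row is a hypothetical, trimmed-down stat row used only for this sketch.
type row struct {
	Name           string
	MeshedPodCount uint64
}

// visibleRows drops rows with zero meshed pods unless showUnmeshed is set.
func visibleRows(rows []row, showUnmeshed bool) []row {
	var out []row
	for _, r := range rows {
		if r.MeshedPodCount == 0 && !showUnmeshed {
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []row{{"coredns", 0}, {"linkerd-controller", 1}}
	for _, r := range visibleRows(rows, false) {
		fmt.Println(r.Name)
	}
}
```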

The `-l`/`--selector` flag does not work for all resource types, so applying a
default label does not solve this problem. While it works for pods, it does
not work for deployments, as `linkerd.io/inject` is an annotation that
cannot be selected on.

I did not think a shorthand flag was necessary for this. I do not think users
will commonly pass this flag to the `stat` command, and I didn't think adding
an additional short flag such as `u` was necessary.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-04-14 20:24:29 -07:00
Alex Leong 7b9d475ffc
Gate SMI-Metrics behind an install flag (#4240)
This change adds a `--smi-metrics` install flag which controls if the SMI-metrics controller and associated RBAC and APIService resources are installed.  The flag defaults to false and is hidden.

We plan to remove this flag or default it to true if and when the SMI-Metrics integration graduates from experimental.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-09 14:34:08 -07:00
Tarun Pothulapati d35a98cb2b
Fix routes wide output formatting for empty values (#4239)
* use wider template string for empty values when -o wide

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-04-07 14:35:55 -05:00
Alejandro Pedraza 322ba5fd2f
`linkerd uninstall` errors when attempting to delete PSP (#4234)
* Bug in `linkerd uninstall` when attempting to delete PSP

We were using the wrong apiVersion for PSP in `linkerd uninstall`'s
output, which prevented that resource from being removed:

```
$ linkerd uninstall | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller"
deleted
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination"
deleted
...
mutatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-proxy-injector-webhook-config" deleted
validatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-sp-validator-webhook-config" deleted
namespace "linkerd" deleted
error: unable to recognize "uninstall.yml": no matches for kind
"PodSecurityPolicy" in version "extensions/v1beta1"

$ kubectl get psp -oname
podsecuritypolicy.policy/linkerd-linkerd-control-plane
```

I've also replaced the uninstall integration test with a new separate
suite that performs the installation, waits for it to be ready,
uninstalls, and then confirms `linkerd check --pre` returns as expected.
2020-04-07 11:01:11 -05:00
Zahari Dichev d6460cf0fb
Update upgrade test certs (#4236)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-06 20:15:06 +03:00
Matei David fee70c064b
Add uninstall cmd functionality to cli (#3622) (#4200)
Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-04-02 12:35:39 -05:00
Alex Leong d8eebee4f7
Upgrade to client-go 0.17.4 and smi-sdk-go 0.3.0 (#4221)
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0.  Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.

This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4.  This keeps all of our transitive dependencies synchronized on one version of client-go.

This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code.  I took this opportunity to update our code generation script to properly use the version of code-generator from `go.mod` rather than a hardcoded SHA.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-01 10:07:23 -07:00
Mayank Shah 4429c1a5b1
Update inject to handle `automountServiceAccountToken: false` (#4145)
* Handle automountServiceAccountToken

Return error during inject if pod spec has `automountServiceAccountToken: false`
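
The check can be sketched in Go as below. `podSpec` is a hypothetical stand-in that carries only the relevant field (a nil pointer meaning "not set", as in the Kubernetes PodSpec), and `checkInjectable` is an illustrative name rather than the function added here.

```go
package main

import (
	"errors"
	"fmt"
)

// podSpec is a hypothetical stand-in for the Kubernetes PodSpec type; only
// the field relevant here is included. A nil pointer means "not set".
type podSpec struct {
	AutomountServiceAccountToken *bool
}

var errAutomountDisabled = errors.New("automountServiceAccountToken is set to false")

// checkInjectable returns an error only when the field is explicitly false.
func checkInjectable(spec podSpec) error {
	if spec.AutomountServiceAccountToken != nil && !*spec.AutomountServiceAccountToken {
		return errAutomountDisabled
	}
	return nil
}

func main() {
	disabled := false
	fmt.Println(checkInjectable(podSpec{AutomountServiceAccountToken: &disabled}))
	fmt.Println(checkInjectable(podSpec{}))
}
```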

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-04-01 09:39:49 -05:00
Alejandro Pedraza 0a4df947e6
Add missing PSP for linkerd-smi-metrics (#4193)
The linkerd-smi-metrics ServiceAccount wasn't hooked into linkerd's PSP
resource, which resulted in the linkerd-smi-metrics ReplicaSet failing
to spawn pods:

```
Error creating: pods "linkerd-smi-metrics-574f57ffd4-" is forbidden:
unable to validate against any pod security policy: []
```
2020-03-25 14:28:35 -05:00
Alejandro Pedraza eb322dc420
Fix error when injecting Cronjobs that have no metadata (#4180)
When injecting a Cronjob with no
`spec.jobTemplate.spec.template.metadata` we were getting the following
error:

```
Error transforming resources: jsonpatch add operation does not apply:
doc is missing path:
"/spec/jobTemplate/spec/template/metadata/annotations"
```

This only happens to Cronjobs because other workloads force having at
least a label there that is used in `spec.selector` (at least as of v1
workloads).

With this fix, if no metadata is detected, then we add it in the json patch when
injecting, prior to adding the injection annotation.
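
A minimal sketch of that idea in Go, assuming the patch is built as a list of JSON Patch operations. `buildPatch` and `patchOp` are hypothetical names; the actual injector code is not shown in this message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// patchOp is one JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// escape encodes a key for use in a JSON Pointer (RFC 6901).
func escape(key string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
}

// buildPatch returns the operations needed to add one annotation. Missing
// parent objects are created first, because a JSON Patch "add" fails when
// the parent of the target path does not exist.
func buildPatch(hasMetadata, hasAnnotations bool, key, value string) []patchOp {
	const base = "/spec/jobTemplate/spec/template/metadata"
	var ops []patchOp
	if !hasMetadata {
		ops = append(ops, patchOp{"add", base, map[string]interface{}{}})
	}
	if !hasMetadata || !hasAnnotations {
		ops = append(ops, patchOp{"add", base + "/annotations", map[string]string{}})
	}
	ops = append(ops, patchOp{"add", base + "/annotations/" + escape(key), value})
	return ops
}

func main() {
	ops := buildPatch(false, false, "linkerd.io/inject", "enabled")
	b, err := json.MarshalIndent(ops, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```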

I've added a couple of new unit tests, one that verifies that this
doesn't remove metadata contents in Cronjobs that do have that metadata,
and another one that tests injection in Cronjobs that don't have
metadata (which I verified it failed prior to this fix).
2020-03-23 14:49:50 -05:00
Mayank Shah 963b9b049a
Add kubectl-style label selectors (#4120)
* Update tap, routes and top commands to support label selectors

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-03-20 10:45:06 -05:00
Tarun Pothulapati 8d64f4e135
Bump Versions of Trace components (#4182)
* Bump Versions of Tracing components
- Jaeger to 1.17.1
- OpenCensus Collector to 0.1.11
* More sane defaults of jaeger resources

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-03-19 16:42:21 -05:00
Zahari Dichev 40a063878d
Service mirror CLI (#4070)
Multicluster CLI tools

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-19 20:08:11 +02:00
Alex Leong 8f82f8c241
Upgrade smi-metrics to v0.2.1 (#4186)
This version contains a fix for a bug that was rejecting all requests on clusters configured with an empty list of allowed client names.  Because smi-metrics is an apiservice, this was also preventing namespaces from terminating.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-19 11:03:09 -07:00
Alejandro Pedraza 1cbc26a2c1
Upgrade golangci-lint to v1.23.8 (#4181)
* Upgrade golangci-lint to v1.23.8

This should help with some timeouts we're seeing in CI.

I fixed some new warnings found in `inject.go` and `uninject.go`.
Also we now have to explicitly disable linting `/controller/gen`.

The linter was also complaining that in `/pkg/k8s/fake.go` the
`spClient.Interface` and `tsclient.Interface` returned in the function
`newFakeClientSetsFromManifests()` aren't used, but I opted to ignore
that to leave them available for future tests.
2020-03-18 09:13:19 -05:00
Alejandro Pedraza 8f79e07ee2
Bump proxy-init to v1.3.2 (#4170)
* Bump proxy-init to v1.3.2

Bumped `proxy-init` version to v1.3.2, fixing an issue with `go.mod`
(linkerd/linkerd2-proxy-init#9).
This is a non-user-facing fix.
2020-03-17 14:49:25 -05:00
Alex Leong 794abfe0d4
Add alpha clients command (#4157)
We add the `linkerd alpha clients` command which displays client side metrics from each of a resource's clients.  This allows you to see who all of your clients are and see what your resource's metrics look like from your clients' point of view.  Since these metrics are measured on the client-side, they include network latency.

```
> linkerd alpha clients deploy/web -n emojivoto
FROM                TO   SUCCESS        RPS  LATENCY_P50  LATENCY_P90  LATENCY_P99
vote-bot.emojivoto  web   97.50%     2.0rps          4ms          5ms          5ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-12 13:45:34 -07:00
Alex Leong cfae4d6432
Add -A flag to alpha stat (#4142)
Add an `--all-namespaces` flag to `linkerd alpha stat`.  This flag ignores the value of the `--namespace` flag and looks up resources across all namespaces.

Some example usage:

```
> linkerd alpha stat po -A
NAMESPACE  NAME                                    SUCCESS        RPS  LATENCY_P50  LATENCY_P90  LATENCY_P99
default    curl                                    100.00%     0.6rps          1ms          1ms          1ms
emojivoto  emoji-ffd474b7b-nq8wc                   100.00%     2.0rps          1ms          1ms          1ms
emojivoto  vote-bot-74c4867dc6-d5j4d                90.00%     2.0rps          3ms          4ms          4ms
emojivoto  voting-6b69659f5b-6hpvx                  78.95%     0.9rps          1ms          1ms          1ms
emojivoto  web-6cfccddd6b-vrq2q                     92.86%     5.6rps          1ms          3ms          4ms
linkerd    linkerd-controller-54bbb5d485-4p9w2     100.00%     0.3rps          1ms          1ms          1ms
linkerd    linkerd-destination-69fb65c4fb-7mthj    100.00%     0.3rps          1ms          1ms          1ms
linkerd    linkerd-grafana-ffc4d969-gf5cz          100.00%     0.3rps          1ms          2ms          2ms
linkerd    linkerd-identity-6456988769-tbkx9       100.00%     0.3rps          1ms          1ms          1ms
linkerd    linkerd-prometheus-5469d5d8fd-kskc6     100.00%     2.5rps          1ms          2ms          3ms
linkerd    linkerd-proxy-injector-658f8c4cd-pfgbt  100.00%     0.3rps          1ms          1ms          1ms
linkerd    linkerd-smi-metrics-86567c5ff4-dh7rn          -     0.0rps          0ms          0ms          0ms
linkerd    linkerd-sp-validator-54c8d7dcf9-wq6jv   100.00%     0.3rps          1ms          2ms          2ms
linkerd    linkerd-tap-574b74c964-cwm6l            100.00%     0.3rps          1ms          1ms          1ms
linkerd    linkerd-web-577755788d-95slx            100.00%     0.3rps          1ms          1ms          1ms
```

```
> linkerd alpha stat po/curl --to po
FROM  TO                              SUCCESS        RPS  LATENCY_P50  LATENCY_P90  LATENCY_P99
curl  web-6cfccddd6b-vrq2q.emojivoto  100.00%     0.9rps          1ms          2ms          2ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-09 13:29:25 -07:00
Alex Leong 9408dc7fe1
Add linkerd alpha stat command (#4130)
This PR introduces the `linkerd alpha stat` command which will eventually replace the `linkerd stat` command.  This command functions in a similar way, but with slightly different arguments and is implemented using the smi-metrics API.  This means that access to metrics can be controlled with RBAC.

See the `linkerd alpha stat` help text for full details, or try one of these commands:

* `linkerd alpha stat -n emojivoto deploy/web`
* `linkerd alpha stat -n emojivoto deploy`
* `linkerd alpha stat -n emojivoto deploy/web --to deploy/emoji`

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-05 15:23:14 -08:00
Zahari Dichev edd7fd203d
Service Mirroring Component (#4028)
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-02 21:16:08 +02:00
Alex Leong 71d6a00faa
Include SMI metrics as part of Linkerd install (#4109)
Adds the SMI metrics API to the Linkerd install flow.  This installs the SMI metrics controller deployment, the SMI metrics ApiService object, and supporting RBAC, and config resources.

This is the first step toward having Linkerd consume the SMI metrics API in the CLI and web dashboard.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-03-02 10:11:16 -08:00
arminbuerkle 65eae40b6a
Remove envoy, contour restrictions (#4092)
* Remove envoy, contour restrictions

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2020-03-02 09:18:51 -05:00
Tarun Pothulapati 948dc22a34
Tracing Add-on For Linkerd (#3955)
* Moves Common templates needed to partials

As add-ons re-use the partials helm chart, all the templates needed by multiple charts should be present in partials
This commit also updates the helm tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add tracing add-on helm chart

Tracing sub-chart includes open-census and jaeger components as a sub-chart which can be enabled as needed

* Updated Install path to also install add-ons

This includes a new interface for add-ons to implement, with an example tracing implementation

* Updates Linkerd install path to also install add-ons

Changes include:
 - Adds an optional Linkerd Values configmap which stores add-on configuration when add-ons are present.
 - Updates Linkerd install path to check for add-ons and render their sub-charts.
 - Adds an install option called config, which is used to pass configuration for add-ons.
 - Uses a fork of mergo, to over-write default Values with the Values struct generated from config.

* Updates the upgrade path for add-ons.

Upgrade path now checks for the linkerd-values cm, and overwrites the default values with it, if present.
It then checks the config option for any further overrides.
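
The layering described above (defaults, then the stored linkerd-values cm, then the config option) can be sketched as a recursive map merge. This is a simplified illustration, not the mergo or helm merge used by the change, and the `tracing` keys are hypothetical.

```go
package main

import "fmt"

// mergeValues overlays src onto dst recursively: nested maps are merged key
// by key, and any other value in src replaces the one in dst. Neither input
// is modified.
func mergeValues(dst, src map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(dst))
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		srcMap, srcIsMap := v.(map[string]interface{})
		dstMap, dstIsMap := out[k].(map[string]interface{})
		if srcIsMap && dstIsMap {
			out[k] = mergeValues(dstMap, srcMap)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// All keys below are made up for illustration.
	defaults := map[string]interface{}{
		"tracing": map[string]interface{}{"enabled": false, "collector": "default"},
	}
	stored := map[string]interface{}{ // e.g. read back from the linkerd-values cm
		"tracing": map[string]interface{}{"enabled": true},
	}
	fromConfig := map[string]interface{}{ // e.g. passed through the config option
		"tracing": map[string]interface{}{"collector": "custom"},
	}
	final := mergeValues(mergeValues(defaults, stored), fromConfig)
	fmt.Println(final)
}
```

Applying the layers in this order means later sources win only for the keys they actually set.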

* Refactor linkerd-values and re-update tests
also adds relevant nil checks
* Refactor code to fix linting issues
* Fixes an error with linkerd-config global values

Also refactors the linkerd-values cm to work the same with helm

* Fix a nil pointer issue for tests
* Updated Tracing add-on chart meta-data
Also introduced a defaultGetFiles method for add-ons

* Add add-on/charts to gitignore
* refactor gitignore for chart deps
* Moves sub-charts to /charts directly
* Refactor linkerd values cm
* Add comment in linkerd-values
* remove extra controlplanetracing flag
* Support Stages deployment for add-ons along with tests
* linting fix
* update tracing rbac
* Removes the need for add-on Interface
- Uses helm loading capabilities to get info about add-ons
- Uses reflection to not have to unnecessarily add checks for each add-on type

* disable tracing flag
* Remove dep on forked mergo
- Re-use merge from helm

* Re-use helm's merge
* Override the chartDir path during tests
* add error check
* Updated the dependency iteration code

Currently, the charts directory will not have the deps in the repo. So the code is updated to read the dependencies from requirements.yaml
and use that info to read templates from the relevant add-ons directory.

* Hard Code add-ons name
* Remove struct details for add-ons

- As we don't use the fields of an add-on struct, they don't need to be typed. Instead we can just read the `enabled` flag using reflection
- Users can just use map[string]interface{} as the add-on type.

* update unit tests
* linting fix
* Rename flag to addon-config
* Use Chart loading logic
- This code uses chart loading to read the files and keep in a vfs.
- Once we have those files read we will then use them for generation of sub-charts.

* Go fmt fix
* Update the linkerd-values cm to use second level field
* Add relevant unit tests for mergeRaw
* linting fix
* Move addon tests to a new file
* Fix golden files
* remove addon install unit test
* Refactor sub-chart load logic
* Add install tracing unit test
* golden file update for tracing install
* Update golden files to reflect another pr changes
* Move addon-config flag to recordFlagSet
* add relevant tracing enabled checks
* linting fix

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-02-26 10:15:04 -08:00
Supratik Das d9956f3b35
Update control-plane-namespace label (#4061)
* Update control-plane-namespace label

Upgrade command ignores changes to the namespace object

Add linkerd.io/control-plane-ns=linkerd label to the control-plane namespace

Fixes #3958

* Add controlPlaneNamespace label to namespace.yaml
* Modify tests for updated controlPlaneNamespace label
* Fix faulty values.yaml value
* Localize reference for controlPlaneNamespace label in kubernetes_helper.go

Signed-off-by: Supratik Das <rick.das08@gmail.com>
2020-02-24 12:57:28 -08:00
Christy Jacob f9b940e89d
Support for custom prometheus registry (#4041)
* feat: added prometheus Registry Option for install command
* chore: draft commit
* Draft for custom prometheus image
* Support for custom prometheus image

This PR adds support to override the default prometheus image name and use custom image names in private repositories

* Added default Prometheus Image from values.yaml

The default can be overridden by the argument given in installOptions

* chore: fixed failing check
* Fixed failing check
* Updated the tests as per the new flag
* Air-gapped installation for prometheus-image
* Air Gapped installation for Prometheus Image
* Added regex for prometheus repository/image cli option

Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
2020-02-24 09:59:29 -08:00
Saurav Tiwary 1c19e314b7
Linkerd CLI command to get control plane diagnostics (#4050)
* CLI command to fetch control plane metrics
Fixes #3116
* Add GetResponse method to return http GET response
* Implemented timeouts using waitgroups
* Refactor metrics command by extracting common code to metrics_diagnostics_util
* Refactor diagnostics to remove code duplication
* Update portforward_test for NewContainerMetricsForward function
* Lint code
* Incorporate Alex's suggestions
* Lint code
* fix minor errors
* Add unit test for getAllContainersWithPort
* Update metrics and diagnostics to store results in a buffer and print once
* Incorporate Ivan's suggestions
* consistent error handling inside diagnostics
* add coloring for the output
* spawn goroutines for each pod instead of each container
* switch back to unbuffered channel
* remove coloring in the output
* Add a long description of the command
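
The concurrency steps listed above (one goroutine per pod, an unbuffered channel, a WaitGroup and a timeout) can be sketched roughly as follows. `fetchAll` and `result` are hypothetical names, and the real command's structure may differ.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// result is a hypothetical container for one pod's metrics.
type result struct {
	Pod     string
	Metrics string
	Err     error
}

// fetchAll runs fetch once per pod, each in its own goroutine, and returns
// whatever results arrive before the timeout expires.
func fetchAll(pods []string, timeout time.Duration, fetch func(pod string) (string, error)) []result {
	results := make(chan result) // unbuffered: each send waits for the collector
	stop := make(chan struct{})  // closed on return so late senders give up
	defer close(stop)

	var wg sync.WaitGroup
	for _, pod := range pods {
		wg.Add(1)
		go func(pod string) {
			defer wg.Done()
			m, err := fetch(pod)
			select {
			case results <- result{pod, m, err}:
			case <-stop:
			}
		}(pod)
	}
	// Close results once every goroutine has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()
	var out []result
	for {
		select {
		case r, ok := <-results:
			if !ok {
				return out
			}
			out = append(out, r)
		case <-timer.C:
			return out
		}
	}
}

func main() {
	fake := func(pod string) (string, error) { return "metrics for " + pod, nil }
	for _, r := range fetchAll([]string{"linkerd-web", "linkerd-tap"}, time.Second, fake) {
		fmt.Println(r.Pod, r.Metrics, r.Err)
	}
}
```

Results arrive in completion order, so callers that need stable output would sort them before printing.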

Signed-off-by: Saurav Tiwary <srv.twry@gmail.com>
2020-02-24 09:09:54 -08:00
Supratik Das 42efc1da01
Improve kubectl apply format by removing misplaced message (#4053)
* Improve kubectl apply format by removing misplaced message

Fixes #2956

Also separate stderr messages with a new line

Signed-off-by: Supratik Das <rick.das08@gmail.com>
2020-02-20 10:36:36 -05:00
Mayank Shah 7cff974a79
cli: handle panic caused by `linkerd metrics` port-forward failure (#4007)
* cli: handle `linkerd metrics` port-forward gracefully

- add return for routine in func `Init()` in case of error
- add return from func `getMetrics()` if error from `portforward.Init()`

* Remove select block at pkg/k8s/portforward.go

- It is now the caller's responsibility to call pf.Stop()

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-19 21:44:37 -08:00
Mayank Shah 3c3a4a5f5d
cli: Add label selector flag for `stat` (#4040)
* Update `linkerd-namespace` shorthand to `L`
* Add --selector (-l) flag for `stat`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-17 13:40:07 -05:00
Kohsheen Tiku 19806e3626
Scroll functionality for linkerd top deploy/linkerd-web (#4011)
* Table obtained from linkerd top is not scrollable.

Added scroll functionality for the table.

Fixes #2558

Signed-off-by: Kohsheen Tiku <kohsheen.t@gmail.com>
2020-02-17 11:17:43 -05:00
Zahari Dichev 3538944d03
Unify trust anchors terminology (#4047)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:12:46 +02:00
Mayank Shah c1b683147a
Update identity to make certs more diagnosable (#3990)
Update identity controller to make issuer certificates diagnosable if
cert validity is causing error

    - Add expiry time in identity log message
    - Add current time in identity log message
    - Emit k8s event with appropriate message


Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-13 11:21:41 +02:00
Zahari Dichev 20f8da0e61
Remove experimental from CNI (#4038)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-11 20:10:55 +02:00
Zahari Dichev 9b29a915d3
Improve cni resources labels (#4032)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-11 12:10:08 +02:00
Zahari Dichev c609564dc8
Add helm upgrade integration test (#3976)
In light of the breaking changes we are introducing to the Helm chart and the convoluted upgrade process (see linkerd/website#647) an integration test can be quite helpful. This simply installs latest stable through helm install and then upgrades to the current head of the branch.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-04 08:27:46 +02:00
Zahari Dichev deefeeec52
Rename no init container second take (#3972)
This is a second attempt at #3956, as it got merged into the wrong branch

Fixes #3930

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-24 12:52:55 -08:00
Zahari Dichev 113c23bdf6
Fix helm list of ports not rendering correctly (#3957)
There was a problem that caused helm install to not reflect the proper list of ignored inbound and outbound ports. Namely if you supply just one port, that would not get reflected.

To reproduce do a: 

```
 helm install \
       --name=linkerd2 \
       --set-file global.identityTrustAnchorsPEM=ca.crt \
       --set-file identity.issuer.tls.crtPEM=issuer.crt \
       --set-file identity.issuer.tls.keyPEM=issuer.key \
       --set identity.issuer.crtExpiry=2021-01-14T14:21:43Z \
       --set-string global.proxyInit.ignoreInboundPorts="6666" \
       linkerd-edge/linkerd2
```


Check your config: 

```bash
 $ kubectl get configmap -n linkerd -oyaml | grep ignoreInboundPort
 "ignoreInboundPorts":[],
```
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-24 09:41:11 -08:00
Kevin Leimkuhler 53baecb382
Changes for edge-20.1.3 (#3966)
## edge-20.1.3

* CLI
  * Introduced `linkerd check --pre --linkerd-cni-enabled`, used when the CNI
    plugin is used, to check it has been properly installed before proceeding
    with the control plane installation
  * Added support for the `--as-group` flag so that users can impersonate
    groups for Kubernetes operations (thanks @mayankshah160!)
* Controller
  * Fixed an issue where an override of the Docker registry was not being
    applied to debug containers (thanks @javaducky!)
  * Added check for the Subject Alternate Name attributes to the API server
    when access restrictions have been enabled (thanks @javaducky!)
  * Added support for arbitrary pod labels so that users can leverage the
    Linkerd provided Prometheus instance to scrape for their own labels
    (thanks @daxmc99!)
  * Fixed an issue with CNI config parsing

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-01-23 16:55:21 -08:00
Zahari Dichev a9d38189fb Fix CNI config parsing (#3953)
This PR addresses the problem introduced after #3766.

Fixes #3941 

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-23 09:55:04 -08:00
Mayank Shah 60ac0d5527 Add `as-group` CLI flag (#3952)
Add CLI flag --as-group that can impersonate group for k8s operations

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-01-22 16:38:31 +02:00
Dax McDonald 5b75a2176f Add scraping of arbitrary pod labels (#3833)
This allows for users of Linkerd to leverage the Prometheus instance
deployed by the mesh for their metric needs. With support for pod labels
outside of the Linkerd metrics users are able to scrape metrics
based upon their own labels.

Signed-off-by: Dax McDonald <dax@rancher.com>
2020-01-22 09:55:26 +02:00
Paul Balogh dabee12b93 Fix issue for debug containers when using custom Docker registry (#3873)
**Subject**
Fixes bug where override of Docker registry was not being applied to debug containers (#3851)

**Problem**
Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image.

**Solution**
This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images.

This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option.

**Validation**
Several new unit tests have been created to confirm functionality.  In addition, the following workflows were run through:

### Standard Workflow with Custom Registry
This workflow installs Linkerd control plane based upon a custom registry, then injecting the debug sidecar into a service.

* Start with a k8s instance having no Linkerd installation
* Build all images locally using `bin/docker-build`
* Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6`
* Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section
* Request injection of the debug image into an available container.  I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Custom Registry Override at Injection
This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Standard Workflow with Default Registry
This workflow is the typical workflow which utilizes the standard Linkerd image registry.

* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/
* Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -`
* Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section
* Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Default Registry at Injection
This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

Fixes issue #3851 

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2020-01-17 10:18:03 -08:00
Zahari Dichev e30b9a9c69
Add checks for CNI plugin (#3903)
As part of the effort to remove the "experimental" label from the CNI plugin, this PR introduces cni checks to `linkerd check`

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-17 12:11:19 +02:00
Mayank Shah b94e03a8a6 Remove empty fields from generated configs (#3886)
Fixes
- https://github.com/linkerd/linkerd2/issues/2962
- https://github.com/linkerd/linkerd2/issues/2545

### Problem
Field omissions for workload objects are not respected while marshaling to JSON.

### Solution
After digging a bit into the code, I came to realize that while marshaling, workload objects have empty structs as values for various fields which would rather be omitted. As of now, the standard library `encoding/json` does not support zero values of structs with the `omitempty` tag. The relevant issue can be found [here](https://github.com/golang/go/issues/11939). To tackle this problem, the object declaration should have _pointer-to-struct_ as a field type instead of _struct_ itself. However, this approach would be out of scope as the workload object declaration is handled by the k8s library.

I was able to find a drop-in replacement for the `encoding/json` library which supports zero value of structs with the `omitempty` tag. It can be found [here](https://github.com/clarketm/json). I have made use of this library to implement a simple filter-like functionality that removes empty tags once a YAML with empty tags is generated, leaving the previously existing methods unaffected.
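
The `omitempty` behavior described above can be reproduced with the standard library alone. The following is a minimal sketch; the `Resources`, `ByValue` and `ByPointer` types are made up for illustration and are not the Kubernetes workload types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Resources is a hypothetical nested struct, standing in for a nested
// field of a workload object.
type Resources struct {
	Limits map[string]string `json:"limits,omitempty"`
}

// ByValue holds the nested struct by value. encoding/json never treats a
// struct value as "empty", so omitempty has no effect on this field.
type ByValue struct {
	Name      string    `json:"name"`
	Resources Resources `json:"resources,omitempty"`
}

// ByPointer holds a pointer-to-struct. A nil pointer counts as empty,
// so omitempty drops the field.
type ByPointer struct {
	Name      string     `json:"name"`
	Resources *Resources `json:"resources,omitempty"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(ByValue{Name: "web"}))   // {"name":"web","resources":{}}
	fmt.Println(marshal(ByPointer{Name: "web"})) // {"name":"web"}
}
```

The by-value variant is what produces the empty `{}` entries in generated configs; the pointer variant is the alternative the description rules out because the type declarations live in the k8s library.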

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-01-13 10:02:24 -08:00
Zahari Dichev d259b23e8b
Add check to ensure kube-system has the needed annotations (HA) (#3731)
Adds a check to ensure kube-system namespace has `config.linkerd.io/admission-webhooks:disabled`

Fixes #3721

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-10 10:03:13 +02:00
Alex Leong 93a81dce97
Change default proxy log level to "warn,linkerd=info" (#3908)
Fixes #3901 

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-09 14:22:06 -08:00
Tarun Pothulapati 03982d8837 move more values to global (#3892)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-01-09 14:57:43 -05:00
Alex Leong 3b2c1eb540
Respect registry override during inject (#3879)
Fixes https://github.com/linkerd/linkerd2/issues/3878

If the `--registry` flag is provided to Linkerd without the `--proxy-image` or `--init-image` flags, the `--registry` flag is ignored and not applied to the existing values for the proxy or init images pulled from the configmap.

We now override the registry with the value from the `--registry` flag regardless of which other flags are provided.
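
As a rough illustration of the intended behavior, the sketch below applies a registry override to an image reference regardless of any other flags. The helper name `overrideRegistry` and the "keep only the last path segment" rule are assumptions made for this example, not the code in this change.

```go
package main

import (
	"fmt"
	"strings"
)

// overrideRegistry is a hypothetical helper: it keeps the final path
// segment of image (the image name) and prefixes it with registry.
// An empty registry leaves the image untouched.
func overrideRegistry(image, registry string) string {
	if registry == "" {
		return image
	}
	// LastIndex returns -1 when there is no "/", so name is the whole image.
	name := image[strings.LastIndex(image, "/")+1:]
	return strings.TrimSuffix(registry, "/") + "/" + name
}

func main() {
	// Image value as it might be stored in the configmap, plus a --registry value.
	fmt.Println(overrideRegistry("gcr.io/linkerd-io/proxy", "javaducky.com/linkerd-io"))
	fmt.Println(overrideRegistry("gcr.io/linkerd-io/proxy-init", ""))
}
```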

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-08 15:54:09 -08:00
Zahari Dichev 287900a686
Unify issuance lifetime name (#3887)
Due to wrong snake casing, the issuance lifetime setting was not reflected when installing through Helm. This commit fixes that problem.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-08 09:58:20 +02:00
Tarun Pothulapati 42b0c0f1a1 Bump prometheus version to 2.15.2 (#3876)
* bump prometheus version to 2.15.0
* update golden files
* update helm tests
* update to prometheus 2.15.1
* update to prometheus 2.15.2

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-01-07 19:44:57 -08:00
Tarun Pothulapati eac06b973c Move common values to global (#3839)
* move values to global in template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update inject and cli

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update unit tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove controllerImageVersion from global

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* move identity out of global

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update var name and comments

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update bin and helm tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update helm readme

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix proxy config

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix proxy config indentation

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* more linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove unnecessary lines

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-01-06 14:31:41 -08:00
Alejandro Pedraza f39d4c5275
Fix `linkerd-cni` Helm chart (#3866)
* The `linkerd-cni` chart should set proper annotations/labels for the namespace

When installing through Helm, the `linkerd-cni` chart will (by default)
install itself under the same namespace ("linkerd") into which the `linkerd`
chart will be installed afterwards. So it needs to set up the proper annotations and labels.

* Fix Helm install when disabling init containers

To install linkerd using Helm after having installed linkerd's CNI plugin, one needs to `--set noInitContainer=true`.
But to determine whether to use init containers or not, we weren't
evaluating that, but instead `Values.proxyInit`, which is indeed null
when installing through the CLI but not when installing with Helm. So
init containers were being set despite having passed `--set
noInitContainers=true`.
2020-01-06 13:02:27 -05:00
Tarun Pothulapati 576c2bece6 Fix Helm templating bugs, left-over smaller-cases (#3869)
* update flags to smaller
* add tests for the same
* fix control plane trace flag
* add tests for controlplane tracing install

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-12-30 13:34:15 -05:00
Paul Balogh 2cd2ecfa30 Enable mixed configuration of skip-[inbound|outbound]-ports (#3766)
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)

This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.

* Bump versions for proxy-init to v1.3.0
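
A minimal sketch of validating a mixed list of ports and inclusive ranges while keeping the entries as strings, as described above. The function names, the accepted port bounds (1 to 65535) and the error messages are assumptions for illustration; the actual `parseAndValidate` may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePort converts s to a port number in 1..65535 (assumed bounds).
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil || n < 1 || n > 65535 {
		return 0, fmt.Errorf("invalid port %q", s)
	}
	return n, nil
}

// validatePorts checks a comma-separated list of ports and inclusive
// ranges such as "6666,8000-8100". Valid entries are returned unchanged
// as strings; expanding ranges is left to the consumer.
func validatePorts(spec string) ([]string, error) {
	var out []string
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		bounds := strings.SplitN(item, "-", 2)
		lo, err := parsePort(bounds[0])
		if err != nil {
			return nil, err
		}
		if len(bounds) == 2 {
			hi, err := parsePort(bounds[1])
			if err != nil {
				return nil, err
			}
			if hi < lo {
				return nil, fmt.Errorf("invalid range %q: upper bound below lower bound", item)
			}
		}
		out = append(out, item)
	}
	return out, nil
}

func main() {
	ports, err := validatePorts("6666,8000-8100")
	fmt.Println(ports, err)

	_, err = validatePorts("9000-8000")
	fmt.Println(err)
}
```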

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2019-12-20 09:32:13 -05:00
Sergio C. Arteaga 7886938f4f Classify some gRPC status codes as non-errors (#3736)
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-19 15:22:43 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in an `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip. `IPWatcher` maintains a mapping of clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
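
The clusterIP-to-service translation can be pictured with the sketch below. It is a simplified illustration only: `ServiceID`, `ipIndex`, `newIPIndex` and `resolve` are hypothetical names, and the real `IPWatcher` also tracks pod IPs and forwards subscriptions to the `EndpointsWatcher`.

```go
package main

import "fmt"

// ServiceID identifies a Kubernetes service by namespace and name.
type ServiceID struct {
	Namespace, Name string
}

// ipIndex is a hypothetical stand-in for the clusterIP-to-service mapping.
type ipIndex struct {
	byClusterIP map[string]ServiceID
}

// newIPIndex builds an index from a clusterIP-to-service map.
func newIPIndex(m map[string]ServiceID) *ipIndex {
	return &ipIndex{byClusterIP: m}
}

// resolve translates an IP into the service that owns it, if any.
// A caller would then subscribe to that service id instead of the IP.
func (i *ipIndex) resolve(ip string) (ServiceID, bool) {
	id, ok := i.byClusterIP[ip]
	return id, ok
}

func main() {
	idx := newIPIndex(map[string]ServiceID{
		"10.96.0.12": {Namespace: "emojivoto", Name: "voting-svc"},
	})
	if id, ok := idx.resolve("10.96.0.12"); ok {
		fmt.Printf("subscribe to %s/%s\n", id.Namespace, id.Name)
	}
	if _, ok := idx.resolve("10.96.0.99"); !ok {
		fmt.Println("no service owns this IP")
	}
}
```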

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Andrew Seigner 537bc76f2f
Add recommended k8s labels to control-plane (#3847)
The Kubernetes docs recommend a common set of labels for resources:
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/#labels

Add the following 3 labels to all control-plane workloads:
```
app.kubernetes.io/name: controller # or destination, etc
app.kubernetes.io/part-of: Linkerd
app.kubernetes.io/version: edge-X.Y.Z
```

Fixes #3816

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-12-19 09:00:55 -08:00
Eugene Glotov 748da80409 Inject preStop hook into the proxy sidecar container to stop it last (#3798)
* Inject preStop hook into the proxy sidecar container to stop it last

This commit adds support for a graceful shutdown technique that is used
by some Kubernetes administrators while a more promising
configuration is being discussed in
https://github.com/kubernetes/kubernetes/issues/65502

The problem is that the RollingUpdate strategy does not guarantee that all
traffic will be sent to a new pod _before_ the previous pod is removed.
Kubernetes is internally an event-driven system, and when a pod is
terminating, several processes can receive the event simultaneously.
If an Ingress Controller gets the event too late, or processes it more
slowly than Kubernetes removes the pod from its Service, user requests
will continue flowing into a black hole.

According [to the documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods)

> 1. If one of the Pod’s containers has defined a `preStop` hook,
> it is invoked inside of the container. If the `preStop` hook is still
> running after the grace period expires, step 2 is then invoked with
> a small (2 second) extended grace period.
>
> 2. The container is sent the `TERM` signal. Note that not all
> containers in the Pod will receive the `TERM` signal at the same time
> and may each require a preStop hook if the order in which
> they shut down matters.

This commit adds support for the `preStop` hook that can be configured
in three forms:

1. As command line argument `--wait-before-exit-seconds` for
  `linkerd inject` command.

2. As `linkerd2` Helm chart value `Proxy.WaitBeforeExitSeconds`.

3. As `config.alpha.linkerd.io/wait-before-exit-seconds` annotation.

If configured, it will add the following `preStop` hook to the proxy container
definition:

```yaml
lifecycle:
  preStop:
    exec:
      command:
        - /bin/bash
        - -c
        - sleep {{.Values.Proxy.WaitBeforeExitSeconds}}
```

To achieve max benefit from the option, the main container should have
its own `preStop` hook with the `sleep` command inside which has
a smaller period than is set for the proxy sidecar. And none of them
must be bigger than `terminationGracePeriodSeconds` configured for the
entire pod.

An example of a rendered Kubernetes resource where
`.Values.Proxy.WaitBeforeExitSeconds` is equal to `40`:

```yaml
       # application container
        lifecycle:
          preStop:
            exec:
              command:
                - /bin/bash
                - -c
                - sleep 20

        # linkerd-proxy container
        lifecycle:
          preStop:
            exec:
              command:
                - /bin/bash
                - -c
                - sleep 40
    terminationGracePeriodSeconds: 160 # for entire pod
```
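
The timing constraints above (application sleep shorter than the proxy sleep, neither longer than the pod's grace period) can be written as a small check. This is a sketch only; `checkShutdownOrder` is a hypothetical helper and is not part of Linkerd.

```go
package main

import "fmt"

// checkShutdownOrder verifies the preStop timing constraints, all in
// seconds: the application must stop sleeping before the proxy does,
// and the proxy sleep must fit inside terminationGracePeriodSeconds.
func checkShutdownOrder(appSleep, proxySleep, gracePeriod int) error {
	if appSleep >= proxySleep {
		return fmt.Errorf("application sleep (%ds) must be shorter than proxy sleep (%ds)", appSleep, proxySleep)
	}
	if proxySleep > gracePeriod {
		return fmt.Errorf("proxy sleep (%ds) exceeds terminationGracePeriodSeconds (%ds)", proxySleep, gracePeriod)
	}
	return nil
}

func main() {
	// Values from the rendered resource above: 20s app, 40s proxy, 160s grace period.
	fmt.Println(checkShutdownOrder(20, 40, 160))
	// Swapped sleeps: the proxy would exit before the application.
	fmt.Println(checkShutdownOrder(40, 20, 160))
}
```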

Fixes #3747

Signed-off-by: Eugene Glotov <kivagant@gmail.com>
2019-12-18 16:58:14 -05:00
Sergio C. Arteaga 56c8a1429f Increase the comprehensiveness of check --pre (#3701)
* Increase the comprehensiveness of check --pre

Closes #3224

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-18 13:27:32 -05:00
Tarun Pothulapati efb1101bdb Switch to smaller-case values in linkerd2-cni (#3827)
* update linkerd2-cni templates and cli
* update readme and docs
* update helm unit tests
* update helm build script
* use smaller case linkerd version

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-12-16 15:09:57 -08:00
Zahari Dichev f88b55e36e Tls certs checks (#3813)
* Added checks for cert correctness
* Add warning checks for approaching expiration
* Add unit tests
* Improve unit tests
* Address comments
* Address more comments
* Prevent upgrade from breaking proxies when issuer cert is overwritten (#3821)
* Address more comments
* Add gate to upgrade cmd that checks that all proxies' roots work with the identity issuer that we are updating to
* Address comments
* Enable use of upgrade to modify both roots and issuer at the same time

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-16 14:49:32 -08:00
Tarun Pothulapati 2f492a77fb Switch to Smaller-Case in Linkerd2 and Partials Charts (#3823)
* update linkerd2, partials charts
* support install and inject workflow
* update helm docs
* update comments in values
* update helm tests
* update comments in test

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-12-13 14:48:07 -05:00
Zahari Dichev a98fe03c5e
Consolidate certificates validation logic (#3810)
* Consolidate certificates validation logic

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Add test for upgrading trust anchors when using external cert manager

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Add logic to ensure issuer cert is CA

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Fix golden file

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-13 10:01:55 +02:00
Sergio C. Arteaga 7f0213d534 Fix upgrade unit tests golden files (#3815)
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-11 14:27:18 -05:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
2019-12-11 10:02:37 -08:00
Alejandro Pedraza 2d12b88145
Pods with non empty securitycontext capabilities fail to be injected (#3806)
* Pods with non empty securitycontext capabilities fail to be injected

Followup to #3744

The `_capabilities.tpl` template got its variables scope changed in
`Values.Proxy`, which caused inject to fail when security context
capabilities were detected.

Discovered when testing injecting the nginx ingress controller.
2019-12-10 14:36:14 -05:00
Alejandro Pedraza d21fda12db
Added unit test for injecting debug sidecar into CP deployment (#3786)
* Added unit test for injecting debug sidecar into CP deployment

I realized this was missing when testing #3774 (superseded by #3784).
2019-12-10 13:45:48 -05:00
Zahari Dichev 0313f10baa
Move CNI template to helm (#3581)
* Create helm chart for the CNI plugin

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Add helm install tests for the CNI plugin

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Add readme for the CNI helm chart

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Fix integration tests

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Remove old cni-plugin.yaml

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Add trace partial template

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address more comments

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-10 13:21:28 +02:00
Zahari Dichev 7e98128782 Fix upgrade unit tests golden files (#3805)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-09 13:34:08 -08:00
Zahari Dichev 7cc3815d49
Add issuer file flags to upgrade command (#3771)
* Add identity-issuer-certificate-file and identity-issuer-key-file to upgrade command

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Implement logic to use identity-trust-anchors-file flag to update the anchors

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Address remarks

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-09 21:18:03 +02:00
Alejandro Pedraza b4d27f9d82
No need for `processYAML()` in `install` (#3784)
* No need for `processYAML()` in `install`

Since `install` uses helm to do its proxy injection, there's no need to
call `processYAML`. This also fixes an issue discovered in #3687 where
we started supporting injection of cronjobs, and even though `linkerd`'s
namespace is flagged to skip automatic injection it was being injected.

This replaces #3773 as it's a much simpler approach.
2019-12-09 09:32:14 -05:00
Zahari Dichev e5f75a8c3d
Add validation to ensure stat time window is at least 15s (#3720)
* Add stat time window minimum of 10s

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-12-04 08:12:01 +02:00
Zahari Dichev 36609c88b8
Error on conflicting stat options (--namespace and --all-namespaces) (#3719)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-18 13:05:44 +02:00
Zahari Dichev ef2007a933
Add helm version annotation to tap,injector and sp-validator (#3673)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-15 20:42:19 +02:00
Zahari Dichev a6ff442789
Traffic split integration test (#3649)
* Traffic split integration test

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Display placeholder when there is no basic stats data

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-13 21:14:34 +02:00
Alejandro Pedraza 4b6254b52e
Replaced `uuid` with `uid` from linkerd-config resource (#3694)
* Replaced `uuid` with `uid` from linkerd-config resource

Fixes #3621

Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.

I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.

Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` ConfigMap.

I also added an integration test to verify the stability of the uid.
2019-11-13 13:56:01 -05:00
Sergio C. Arteaga eff1714a08 Add `linkerd check` to dashboard (#3656)
`linkerd check` can now be run from the dashboard in the `/controlplane` view.
Once the check results are received, they are displayed in a modal in a similar
style to the CLI output.

Closes #3613
2019-11-12 12:37:36 -08:00
Eugene Glotov 2941ddb7f5 Support Dashboard replicas (#2899) (#3633)
This PR makes it possible to increase the number of web dashboard replicas.

Follows up #2899

Signed-off-by: Eugene Glotov <kivagant@gmail.com>
2019-11-12 11:00:23 -08:00
Zahari Dichev 038900c27e Remove destination container from controller (#3661)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-08 14:40:25 -08:00
Tarun Pothulapati f18e27b115 use appsv1 api in identity (#3682)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-11-06 15:06:09 -08:00
Alejandro Pedraza 1c879ac430
Added simplified service name to list of allowed hosts for linkerd-web (#3674)
Followup to linkerd/website#573
2019-11-06 10:27:55 -05:00
Mayank Shah e91f2020db Update uninject command to handle namespaces (Fixes #3648) (#3668)
* Add support for uninject command to uninject namespace configs
* Add relevant unit tests in cli/cmd/uninject_test.go

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2019-11-04 22:21:47 -08:00
StupidScience 5958111533 WIP: Added annotations parsing and doc generation (#3564)
* rework annotations doc generation from godoc parsing to map[string]string and get rid of unused yaml tags
* move annotations doc function from pkg/k8s to cli/cmd

Signed-off-by: StupidScience <tonysignal@gmail.com>
2019-11-04 14:55:50 -08:00
Zahari Dichev 86854ac845
Control plane debug (#3507)
* Add cmd to inject debug sidecar for l5d components only

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Revert "Add cmd to inject debug sidecar for l5d components only"

This reverts commit 50b8b3577e.

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Stop uninjecting metadata from control plane components

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Ensure inject can be run on control plane components only if --manual is present

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-04 18:56:35 +02:00
Alejandro Pedraza bd8d47226d
DNS rebinding protection for the dashboard (#3644)
* DNS rebinding protection for the dashboard

Fixes #3083 and replacement for #3629

This adds a new parameter, `enforcedHost`, to the `linkerd-web` container.
It sets the regexp that the Host header must match; otherwise
it returns an error.
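
A minimal sketch of this kind of Host header enforcement, written as plain `net/http` middleware. The regexp, the 400 status code and the names `allowedHosts`, `enforceHost` and `statusFor` are assumptions for illustration; only the 'Invalid Host header' message comes from the description in this commit.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// allowedHosts is an example pattern (assumed, not Linkerd's default):
// localhost or 127.0.0.1, with an optional port.
var allowedHosts = regexp.MustCompile(`^(localhost|127\.0\.0\.1)(:\d+)?$`)

// enforceHost rejects requests whose Host header does not match allowed.
func enforceHost(allowed *regexp.Regexp, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !allowed.MatchString(r.Host) {
			http.Error(w, "Invalid Host header", http.StatusBadRequest)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends an in-memory request with the given Host through the
// middleware and returns the resulting status code.
func statusFor(host string) int {
	handler := enforceHost(allowedHosts, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodGet, "http://"+host+"/", nil)
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("localhost:50750")) // request passes through
	fmt.Println(statusFor("rebind.it:50750")) // rejected by the Host check
}
```

A DNS rebinding attack reaches the server with the attacker's hostname in the Host header, which is why matching on that header is enough to reject it.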

This parameter will be hard-coded for now, in `linkerd-web`'s deployment
yaml.

Note this also protects the dashboard because that's proxied from
`linkerd-web`.

Also note this means the usage of `linkerd dashboard --address` will
require the user to change that parameter in the deployment yaml (or
have Kustomize do it).

How to test:
- Run `linkerd dashboard`
- Go to http://rebind.it:8080/manager.html and change the target port to
50750
- Click on “Start Attack” and wait for a minute.
- The response from the dashboard will be returned, showing an 'Invalid
Host header' message returned by the dashboard. If the attack had
succeeded, the dashboard's HTML would be shown instead.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-10-31 11:51:25 -05:00
Mayank Shah ec848d4ef3 Add inject support for namespace configs (Fix #3255) (#3607)
* Add inject support for namespaces(Fix #3255)

* Add relevant unit tests (including overridden annotations)

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2019-10-30 10:18:01 -05:00
Tarun Pothulapati 015ea9e17a Control Plane Trace configuration (#3539)
* add Control Plane Trace config
* remove collector and jaeger templates
* add linting fixes
* add trace tpl to helm tests
* add build docs to enable tracing
* fix the install command
* remove sampling
* add templated namespace
* simplify config and use templating
* hide the tracing flag
* add correct link
* fix the link

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-10-25 11:42:30 -07:00
Alejandro Pedraza d3d8266c63
If tap source IP matches many running pods then only show the IP (#3513)
* If tap source IP matches many running pods then only show the IP

When an unmeshed source ip matched more than one running pod, tap was
showing the names of all those pods, even though they didn't necessarily
originate the connection. This could be reproduced when using a pod
network add-on such as Calico.

With this change, if a node matches, return it, otherwise we proceed to look for a matching pod. If exactly one running pod matches we return it. Otherwise we return just the IP.
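
The lookup order described above can be sketched as follows. `resolveSource` and its map-based inputs are hypothetical simplifications; the actual tap code works against the Kubernetes API objects rather than plain maps.

```go
package main

import "fmt"

// resolveSource labels a source IP following the order described above:
// a matching node wins; otherwise exactly one matching running pod is
// reported; in every other case only the IP itself is returned.
func resolveSource(ip string, nodes map[string]string, runningPods map[string][]string) string {
	if node, ok := nodes[ip]; ok {
		return "node/" + node
	}
	if pods := runningPods[ip]; len(pods) == 1 {
		return "pod/" + pods[0]
	}
	return ip
}

func main() {
	nodes := map[string]string{"10.0.0.5": "worker-1"}
	pods := map[string][]string{
		"10.1.0.7": {"web-abc"},
		"10.1.0.8": {"web-def", "api-xyz"}, // ambiguous: two running pods share the IP
	}
	fmt.Println(resolveSource("10.0.0.5", nodes, pods))
	fmt.Println(resolveSource("10.1.0.7", nodes, pods))
	fmt.Println(resolveSource("10.1.0.8", nodes, pods))
}
```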

Fixes #3103
2019-10-25 12:38:11 -05:00
Zahari Dichev 0017f9a60a Cert manager support (#3600)
* Add support for --identity-issuer-mode flag to install cmd
* Change flag to be a bool
* Read correct data from identity when external issuer is used
* Add ability for identity service to dynamically reload certs
* Fix failing tests
* Minor refactor
* Load trust anchors from identity issuer secret
* Make identity service actually watch for issuer certs updates
* Add some testing around cmd line identity options validation
* Add tests ensuring that identity service loads issuer
* Take into account external-issuer flag during upgrade + tests
* Fix failing upgrade test
* Address initial review feedback
* Address further review feedback on cli and helm
* Do not persist --identity-external-issuer
* Some improvements to identity service
* Bring back persistence of external issuer flag
* Address more feedback
* Update dockerfiles shas
* Publishing k8s events on issuer certs rotation
* Ensure --ignore-cluster+external issuer is not supported
* Update go-deps shas
* Transition to identity issuer scheme based configuration
* Use k8s consts for secret file names

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-10-24 13:15:14 -07:00
Tarun Pothulapati 78b6f42ea7 Add Collector Flags for inject cmd (#3588)
* add flags to inject cmd
* add trace flags to readme
* use ns from pod

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-10-24 10:16:13 -07:00
Andrew Seigner 3b3dfa701c
Faster `linkerd install --ignore-cluster` (#3568)
The `linkerd install` `--ignore-cluster` and `--skip-checks` flags
enable generating install manifests without a connection to a k8s
cluster. Unfortunately these flags were only checked after attempted
connections to a k8s cluster were made. This satisfied the use case of
`linkerd install` "ignoring" the state of the cluster, but for
environments not connected to a cluster, the user would have to wait for
30s timeouts before getting the manifests.

Modify `linkerd install` and its subcommands to pre-emptively check for
`--ignore-cluster` and `--skip-checks`. This decreases `linkerd install
--ignore-cluster` from ~30s to ~1s, and `linkerd install control-plane
--ignore-cluster --skip-checks` from ~60s to ~1s.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-10-21 11:05:16 -07:00
Saurav Tiwary d95a469a60 Correct definition of Less function in CLI's metrics command(#3533) (#3534)
Fixes #3533

Signed-off-by: Saurav Tiwary <srv.twry@gmail.com>
2019-10-15 14:21:10 -07:00
Bruno M. Custódio df48873da8 Make '--cluster-domain' an install-only flag. (#3496)
This PR aims at preventing `--cluster-domain` from being changed during `linkerd upgrade`. I am not sure this is all that is necessary, but it can probably be at least a good start. 🙂 Closes #3454.

Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-10-15 14:09:33 -07:00
Ivan Sim cf69dedf9c
Re-add the destination container to the controller spec (#3540)
* Re-add the destination container to the controller spec

This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.

On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.

* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-08 10:49:40 -07:00
Alejandro Pedraza c5d68ecb16
Add missing nodeSelector in Destination deployment (#3527)
Fixes #3526
2019-10-04 12:47:55 -05:00
Kevin Leimkuhler a3a240e0ef
Add TapEvent headers and trailers to the tap protobuf (#3410)
### Motivation

In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390

### Solution

The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.

These values are set by reading the corresponding fields off of the proxy's tap
events.

The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33

Closes #3262

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-29 09:54:37 -07:00
Bruno M. Custódio caddda8e48 Add support for a node selector in the Helm chart. (#3275)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-27 10:36:37 -07:00
Alex Leong 4799baa8e2
Revert "Trace Control Plane components using OC (#3461)" (#3484)
This reverts commit edd3b1f6d4.

This is a temporary revert of #3461 while we sort out some details of how this should be configured and how it should interact with configuring a trace collector on the Linkerd proxy.  We will reintroduce this change once the config plan is straightened out.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-09-26 11:56:44 -07:00
Tarun Pothulapati edd3b1f6d4 Trace Control Plane components using OC (#3461)
* add exporter config for all components

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add cmd flags wrt tracing

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp tracing to web server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add flags to the tap deployment

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add trace flags to install and upgrade command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add linkerd prefix to svc names

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp transport to API Internal Client

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix goimport linting errors

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp handler to tap http server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* review and fix tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use common template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use Initialize

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix sample flag

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add verbose info reg flags

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-26 08:11:48 -07:00
Alejandro Pedraza 6568929028
Add --disable-heartbeat flag for linkerd install|upgrade (#3439)
Fixes #278

Add `linkerd install|upgrade --disable-heartbeat` flag, and have
`linkerd check` check for the heartbeat's SA only if it's enabled.

Also added those flags into the `linkerd upgrade -h` examples.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-25 15:53:36 -05:00
Tarun Pothulapati 096668d62c make public-api use the right destination address (#3476)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-25 15:24:56 -05:00
Daniel Mangum fa01b49998 proxy injector: mwc match expressions admission-webhooks disabled (#3460)
When running linkerd in HA mode, a cluster can be broken by bringing down the proxy-injector.

Add a label to MWC namespace selector that skips any namespace.

Fixes #3346

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
2019-09-24 19:28:16 -07:00
arminbuerkle 09114d4b08 Add cluster domain cli flag (#3360)
* Add custom cluster domain cli flag
* Fetch cluster domain from config map
* Add cluster domain cli flag only where necessary

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-09-19 16:08:50 -07:00
Kevin Leimkuhler c62c90870e
Add JSON output to tap command (#3434)
Replaces #3411 

### Motivation

It is a little tough to filter/read the current tap output. As headers are being
added to tap, the output is starting to get difficult to consume. Take a peek at
#3262 for an example. It would be nice to have some more machine readable output
that can be sliced and diced with tools such as jq.

### Solution

A new output option has been added to the `linkerd tap` command that returns the
JSON encoding of tap events.

The default output is line-oriented; `-o wide` appends the request's target
resource type to the line-oriented tap events.

In order to display certain values in a more human-readable form, a tap event
display struct has been introduced. This struct maps public API `TapEvent`s
directly to a private `tapEvent`. This struct offers a flatter JSON structure
than the protobuf JSON rendering. It also can format certain fields--such as
addresses--better than the JSON protobuf marshaler.
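
As a rough illustration of the display-struct idea, the sketch below converts a protobuf-style address into a flat struct before marshaling. The `apiAddress` and `displayEndpoint` types and the `toDisplay` function are hypothetical stand-ins, not the types from the PR; the only assumption taken from the output shown here is that endpoints render as `ip` and `port` fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// apiAddress is a hypothetical stand-in for a protobuf-style address
// that carries an IPv4 address as a packed integer.
type apiAddress struct {
	IPv4 uint32
	Port uint32
}

// displayEndpoint is the flat, human-readable form intended for JSON output.
type displayEndpoint struct {
	IP   string `json:"ip"`
	Port uint32 `json:"port"`
}

// toDisplay unpacks the integer address into dotted-quad notation.
func toDisplay(a apiAddress) displayEndpoint {
	ip := net.IPv4(byte(a.IPv4>>24), byte(a.IPv4>>16), byte(a.IPv4>>8), byte(a.IPv4))
	return displayEndpoint{IP: ip.String(), Port: a.Port}
}

func main() {
	addr := apiAddress{IPv4: 0x0A3C0131, Port: 9994} // 10.60.1.49
	out, err := json.Marshal(toDisplay(addr))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the conversion in one function means the wire types stay untouched while the CLI controls how each field is rendered.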

Closes #3390

**Default**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web
req id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics
rsp id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs
end id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B
```

**Wide**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide
req id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd
rsp id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd
end id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd
```

**JSON**:
*Edit: Flattened `Method` and `Scheme` formatting*
```
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "ip-masq-agent",
      "namespace": "kube-system",
      "pod": "ip-masq-agent-4d5s9",
      "serviceaccount": "ip-masq-agent",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "requestInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "method": "GET",
    "scheme": "",
    "authority": "10.60.1.49:9994",
    "path": "/ready"
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "calico-node",
      "namespace": "kube-system",
      "pod": "calico-node-bbrjq",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 644820
    },
    "httpStatus": 200
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "deployment": "calico-typha",
      "namespace": "kube-system",
      "pod": "calico-typha-59cb487c49-8247r",
      "pod_template_hash": "59cb487c49",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseEndEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 790898
    },
    "sinceResponseInit": {
      "nanos": 146078
    },
    "responseBytes": 3,
    "grpcStatusCode": 0
  }
}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-19 09:34:49 -07:00
Alejandro Pedraza 1653f88651
Put the destination controller into its own deployment (#3407)
* Put the destination controller into its own deployment

Fixes #3268

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-18 13:41:06 -05:00
Eugene Glotov 616131467c Allow to disable Namespace installation with Helm (#3412) (#3413)
If the namespace is controlled by an external tool or can't be installed
with Helm, disable its installation
Fixes #3412

Signed-off-by: Eugene Glotov <kivagant@gmail.com>
2019-09-17 12:25:35 -05:00
Ivan Sim 4d89c52113 Update Prometheus config to keep only needed cadvisor metrics (#3401)
* Update prometheus cadvisor config to only keep container resources metrics

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Drop unused large metric

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Fix unit test

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Siggy's feedback

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Fix unit test

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-09-17 10:17:49 -07:00
Alejandro Pedraza 1e2810c431
Trim certs and keys in the Helm charts (#3421)
* Trim certs and keys in the Helm charts

Fixes #3419

When installing through the CLI the installation will fail if the certs
are malformed, so this only concerns the Helm templates.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-11 20:47:38 -05:00
Andrew Seigner c5a85e587c
Update to client-go v12.0.0, forked stern (#3387)
The repo depended on an old version of client-go. It also depended on
stern, which itself depended on an old version of client-go, making
client-go upgrade non-trivial.

Update the repo to client-go v12.0.0, and also replace stern with a
fork.

This fork of stern includes the following changes:
- updated to use Go Modules
- updated to use client-go v12.0.0
- fixed log line interleaving:
  - https://github.com/wercker/stern/issues/96
  - based on:
    - 8723308e46

Fixes #3382

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-10 11:04:29 -07:00
Andrew Seigner 7f59caa7fc
Bump proxy-init to 1.2.0 (#3397)
Pulls in latest proxy-init:
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/v1.2.0

This also bumps a dependency on cobra, which provides more complete zsh
completion.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-09 09:06:14 -07:00
Bruno M. Custódio 8fec756395 Add '--address' flag to 'linkerd dashboard'. (#3274)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-05 10:56:10 -07:00
陈谭军 a30882ef22 remove the duplicate word (#3385)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-09-04 20:13:55 -07:00
Alena Varkockova d369029909 Emit error when cannot connect to kubernetes (#3327)
Introduce CategoryError

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-09-04 17:34:53 -07:00
Alejandro Pedraza 17dd9bf6bc
Couple of injection events fixes (#3363)
* Couple of injection events fixes

When generating events in quick succession against the same target, client-go issues a PATCH request instead of a POST, so we need the extra RBAC permission.

Also we have an informer on pods, so we also need the "watch" permission
for them, whose omission was causing an error entry in the logs.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-04 11:57:20 -05:00
Alejandro Pedraza acbab93ca8
Add support for k8s 1.16 (#3364)
Fixes #3356

1.16 removes some api groups that were already deprecated. From k8s blog
post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):

```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
v1.16.
    Migrate to the policy/v1beta1 API, available since v1.10. Existing
    persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
    Migrate to the apps/v1 API, available since v1.9. Existing persisted
    data can be retrieved/updated via the apps/v1 API.
```

Previous PRs had already made this change at the Helm templates level,
but we still needed to do it at the API calls and tests.

The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-04 09:59:55 -05:00
arminbuerkle 5c38f38a02 Allow custom cluster domains in remaining backends (#3278)
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetching cluster domain errors separately
* Use custom cluster domain for traffic split adaptor

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-27 10:01:36 -07:00
Alejandro Pedraza 02efb46e45
Have the proxy-injector emit events upon injection/skipping injection (#3316)
* Have the proxy-injector emit events upon injection/skipping injection

Fixes #3253

Have the proxy-injector emit an event whenever an injection happens, or
when injection is skipped for some reason (also added that reason into
the proxy-injector logs). The level is associated to the parent workload
(it can't be associated to the pod because at this point the pod hasn't
been persisted).

The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.

Related changes:

- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-26 13:34:36 -05:00
Carol A. Scott 089836842a
Add unit test for edges API endpoint (#3306)
Fixes #3052.

Adds a unit test for the edges API endpoint. To maintain a consistent order for
testing, the returned rows in api/public/edges.go are now sorted.
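
Sorting rows before returning them is what makes the test output comparable. A minimal sketch of that idea follows; the `edge` type and the choice of sort keys are assumptions for illustration, not the fields used in api/public/edges.go.

```go
package main

import (
	"fmt"
	"sort"
)

// edge is a hypothetical, simplified row type.
type edge struct {
	Src string
	Dst string
}

// sortEdges orders rows by source, then destination, so repeated calls
// over the same data always produce the same sequence.
func sortEdges(rows []edge) {
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].Src != rows[j].Src {
			return rows[i].Src < rows[j].Src
		}
		return rows[i].Dst < rows[j].Dst
	})
}

func main() {
	rows := []edge{{"web", "emoji"}, {"vote-bot", "web"}, {"web", "books"}}
	sortEdges(rows)
	fmt.Println(rows)
}
```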
2019-08-23 09:28:02 -07:00
Ivan Sim 954a45f751
Fix broken unit and integration tests (#3303)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-21 18:52:19 -07:00
arminbuerkle e7d303e03f Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES (#3277)
* Fix missing `clusterDomain` in render RenderTapOutputProfile
* Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES env variable

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-21 14:28:30 -07:00
Ivan Sim 183e42e4cd
Merge the CLI 'installValues' type with Helm 'Values' type (#3291)
* Rename template-values.go
* Define new constructor of charts.Values type
* Move all Helm values related code to the pkg/charts package
* Bump dependency
* Use '/' in filepath to remain compatible with VFS requirement
* Add unit test to verify Helm YAML output
* Alejandro's feedback
* Add unit test for Helm YAML validation (HA)

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-20 19:26:38 -07:00
Alejandro Pedraza 99ddc66461
Always use forward-slash when interacting with the VFS (#3284)
* Always use forward-slash when interacting with the VFS

Fixes #3283

Our VFS implementation relies on `net/http.FileSystem` which always
expects `/` regardless of the OS.
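
A small sketch of the distinction: `path.Join` always uses `/`, whereas `filepath.Join` uses the OS separator and needs `filepath.ToSlash` before the result is handed to an `http.FileSystem`. The `vfsPath` helper and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// vfsPath joins elements with forward slashes regardless of the host OS.
func vfsPath(elem ...string) string {
	return path.Join(elem...)
}

func main() {
	// Safe on every OS: path.Join never emits a backslash.
	fmt.Println(vfsPath("templates", "namespace.yaml"))

	// An OS-native path must be converted before it is used as a VFS key.
	native := filepath.Join("templates", "namespace.yaml")
	fmt.Println(filepath.ToSlash(native))
}
```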

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-19 11:10:21 -05:00
cpretzer 4e92064f3b
Add a flag to install-cni command to configure iptables wait flag (#3066)
Signed-off-by: Charles Pretzer <charles@buoyant.io>
2019-08-15 12:58:18 -07:00
Kevin Leimkuhler cc3c53fa73
Remove tap from public API and associated test infrastructure (#3240)
### Summary

After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastructure related to tap, can be removed.

This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf files were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files.

This draft passes `go test` as well as a local run of the integration tests.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-14 17:27:37 -04:00
Andrew Seigner 3b55e2e87d
Add container cpu and mem to heartbeat requests (#3238)
PR #3217 re-introduced container metrics collection to
linkerd-prometheus. This enabled linkerd-heartbeat to collect mem and
cpu metrics at the container-level.

Add container cpu and mem metrics to heartbeat requests. For each of
(destination, prometheus, linkerd-proxy), collect maximum memory and p95
cpu.

Concretely, this introduces 7 new query params to heartbeat requests:
- p99-handle-us
- max-mem-linkerd-proxy
- max-mem-destination
- max-mem-prometheus
- p95-cpu-linkerd-proxy
- p95-cpu-destination
- p95-cpu-prometheus
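
For illustration only, the parameters above could be encoded into a query string with `net/url` as sketched here. The `heartbeatQuery` helper and the sample values are hypothetical, and no endpoint is assumed.

```go
package main

import (
	"fmt"
	"net/url"
)

// heartbeatQuery encodes metric name/value pairs as a URL query string.
// url.Values.Encode sorts keys, so the output is deterministic.
func heartbeatQuery(metrics map[string]string) string {
	v := url.Values{}
	for name, value := range metrics {
		v.Set(name, value)
	}
	return v.Encode()
}

func main() {
	// Sample values are made up for the example.
	q := heartbeatQuery(map[string]string{
		"p99-handle-us":         "50000",
		"max-mem-destination":   "104857600",
		"p95-cpu-linkerd-proxy": "0.02",
	})
	fmt.Println(q)
}
```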

Part of #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-14 12:04:08 -07:00
Carol A. Scott 00437709eb
Add trafficsplit metrics to CLI (#3176)
This PR adds `trafficsplit` as a supported resource for the `linkerd stat` command. Users can type `linkerd stat ts` to see the apex and leaf services of their trafficsplits, as well as metrics for those leaf services.
2019-08-14 10:30:57 -07:00
Ivan Sim 4d01e3720e
Update install and upgrade code to use the new helm charts (#3229)
* Delete symlink to old Helm chart
* Update 'install' code to use common Helm template structs
* Remove obsolete TLS assets functions.

These are now handled by Helm functions inside the templates

* Read defaults from values.yaml and values-ha.yaml
* Ensure that webhooks TLS assets are retained during upgrade
* Fix a few bugs in the Helm templates (see bullet points):
* Merge the way the 'install' ha and non-ha options are handled into one function
* Honor the 'NoInitContainer' option in the components templates
* Control plane mTLS will not be disabled if identity context in the
config map is empty. The data plane mTLS will still be automatically disabled
if the context is nil.
* Resolve test failures from rebase with master
* Fix linter issues
* Set service account mount path read-only field
* Add TLS variables of the webhooks and tap to values.yaml

During upgrade, these secrets are preserved to ensure they remain synced
with the CA bundle in the webhook configurations. These Helm variables are used
to override the defaults in the templates.

* Remove obsolete 'chart' folder
* Fix bugs in templates
* Handle missing webhooks and tap TLS assets during upgrade

When upgrading from an older version that doesn't have these secrets, fall back to letting Helm
create them by creating an empty charts.TLS struct.

* Revert the selector labels of webhooks to be compatible with that in 2.4

In 2.4, the proxy injector and profile validator webhooks already have their selector labels defined.
Since these attributes are immutable, the recent change to these selectors introduced by the Helm chart work will cause upgrade to fail.

* Alejandro's feedback
* Siggy's feedback
* Removed redundant unexported custom types

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-13 14:16:24 -07:00
Alejandro Pedraza 1e82f62d6e
Fix uninject (#3236)
Now that we inject at the pod level by default, `linkerd uninject` should remove the `linkerd.io/inject: enabled`
annotation. Also added a test for that.

Fix #3156

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-13 15:06:21 -05:00
ethan b4b2a44299 cleanup: stat.go help message words correction (#3226)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-12 10:05:26 -07:00
Thomas Rampelberg ca5b4fab2e
Add container metrics and grafana dashboard (#3217)
* Add container metrics and grafana dashboard

* Review cleanup

* Update templates
2019-08-12 08:03:57 -07:00
Andrew Seigner 43bc175ea9
Enable tap-admin ClusterRole privileges for `*` (#3214)
The `linkerd-linkerd-tap-admin` ClusterRole had `watch` privileges on
`*/tap` resources. This disallowed non-namespaced tap requests of the
form: `/apis/tap.linkerd.io/v1alpha1/watch/namespaces/linkerd/tap`,
because that URL structure is interpreted by the Kubernetes API as
watching a resource of type `tap` within the linkerd namespace, rather
than tapping the linkerd namespace.

Modify `linkerd-linkerd-tap-admin` to have `watch` privileges on `*`,
enabling any request of the form
`/apis/tap.linkerd.io/v1alpha1/watch/namespaces/linkerd/*` to succeed.

Fixes #3212

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 12:04:03 -07:00
Andrew Seigner 0ff39ddf8d
Introduce tap-admin ClusterRole, web privs flag (#3203)
The web dashboard will be migrating to the new Tap APIService, which
requires RBAC privileges to access.

Introduce a new ClusterRole, `linkerd-linkerd-tap-admin`, which gives
cluster-wide tap privileges. Also introduce a new ClusterRoleBinding,
`linkerd-linkerd-web-admin` which binds the `linkerd-web` service
account to the new tap ClusterRole. This ClusterRoleBinding is enabled
by default, but may be disabled via a new `linkerd install` flag
`--restrict-dashboard-privileges`.

Fixes #3177

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 10:28:35 -07:00
Alejandro Pedraza 54b2103bba
Fix bug in service profile name generation (#3209)
Followup to #3148

Wrong args order in call to `profiles.RenderOpenAPI` was generating an
invalid service profile name.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-07 18:51:32 -05:00
Alejandro Pedraza 3ae653ae92
Refactor proxy injection to use Helm charts (#3200)
* Refactor proxy injection to use Helm charts

Fixes #3128

A new chart `/charts/patch` was created, that generates the JSON patch
payload that is to be returned to the k8s API when doing the injection
through the proxy injector, and it's also leveraged by the `linkerd
inject --manual` CLI.

The VFS was used by `linkerd install` to access the old chart under
`/chart`. Now the proxy injection also uses the Helm charts to generate
the JSON patch (see above) so we've moved the VFS from `cli/static` to a
new common place under `/pkg/charts/static`, and the new root for the VFS is
now `/charts`.

`linkerd install` hasn't yet migrated to use the new charts (that'll
happen in #3127), so the only change in that regard was the creation of
`/charts/chart` which is a symlink pointing to `/chart` that
`install.go` now uses, so that the VFS contains both the old and new
charts, as a temporary measure.

You can see that `/bin/Dockerfile-bin`, `/controller/Dockerfile` and
`/bin/build-cli-bin` do now `go generate` pointing to the new location
(and the `go generate` annotation was moved from `/cli/main.go` to
`pkg/charts/static/templates.go`).

The symlink trick doesn't work when building the binaries through
Docker, so `/bin/Dockerfile-bin` replaces the symlink with an actual
copy of `/chart`.

Also note that in `/controller/Dockerfile` we now need to include the
`prod` tag in `go install` like we do in `/bin/Dockerfile-bin` so that
the proxy injector does use the VFS instead of the local file system.

- The common logic to parse a chart has been moved from `install.go` to
`/pkg/charts/util.go`.
- The special ENV var in the proxy for "outbound router capacity" that
only applies to the Prometheus pod is now handled directly in the proxy
partial and all the associated go code could be removed.
- The `patch.go` lib for generating the JSON patch in go along
with its tests `patch_test.go` are no longer needed.
- Lots of functions in `/pkg/inject/inject.go` got removed/simplified
with their logic being moved into the charts themselves. As a
consequence lots of things in `inject_test.go` became irrelevant.
- Moved `template-values.go` from `/pkg/inject` to `pkg/charts` as that
contains the go structs representation of the chart variables that
will be leveraged in #3127.

Don't forget to run `/bin/helm.sh` whenever you make changes to charts
;-)

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-07 17:32:37 -05:00
Tarun Pothulapati 0cbba0b03e Setting SuccessfulJobHistoryLimit to 0 for CronJobs (#3193)
* setting successful job history limit to 0

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-08-07 16:59:14 -05:00
arminbuerkle e3d68da1dc Allow setting custom cluster domain in service profiles (#3148)
Continuation of #2950.

I decided to check for the `clusterDomain` in the config map in web server main for the same reasons as pointed out here https://github.com/linkerd/linkerd2/pull/3113#discussion_r306935817

It decouples the server implementations from the config.

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-07 09:49:54 -07:00
Andrew Seigner 0565955428
Update `linkerd profile --tap` to Tap APIService (#3187)
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.

This change migrates `linkerd profile --tap` to the new Tap APIService.

Depends on #3186
Fixes #3169

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-02 12:44:58 -07:00
Andrew Seigner a185cae55b
Update `linkerd top` to use Tap APIService (#3186)
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.

This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two
goroutines could both call `close(done)` on exit.
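
One common way to guard against a double close is `sync.Once`, sketched below. This is not necessarily how the PR resolved the panic; the `stopper` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// stopper wraps a done channel so that it can be closed safely
// from more than one goroutine.
type stopper struct {
	once sync.Once
	done chan struct{}
}

func newStopper() *stopper {
	return &stopper{done: make(chan struct{})}
}

// Stop closes the done channel at most once, however often it is called.
func (s *stopper) Stop() {
	s.once.Do(func() { close(s.done) })
}

func main() {
	s := newStopper()
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop() // both goroutines may race here without panicking
		}()
	}
	wg.Wait()
	<-s.done // returns immediately because the channel is closed
	fmt.Println("done closed exactly once")
}
```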

Fixes #3168

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-02 11:34:22 -07:00
Andrew Seigner a59c1dd32d
Introduce tap APIService, update `linkerd tap` (#3167)
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.

This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.

This change also modifies the `linkerd tap` command to make requests
against the new APIService.

The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET  /apis
GET  /apis/tap.linkerd.io
GET  /apis/tap.linkerd.io/v1alpha1
GET  /healthz
GET  /healthz/log
GET  /healthz/ping
GET  /metrics
GET  /openapi/v2
GET  /version

Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.

This change introduces the following resources into the default Linkerd
install:
- Global
  - APIService/v1alpha1.tap.linkerd.io
  - ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
  - Secret/linkerd-tap-tls
- `kube-system` namespace:
  - RoleBinding/linkerd-linkerd-tap-auth-reader

Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller

Fixes #2725, #3162, #3172

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-01 14:02:45 -07:00
Andrew Seigner 9a672dd5a9
Introduce `linkerd --as` flag for impersonation (#3173)
Similar to `kubectl --as`, global flag across all linkerd subcommands
which sets a `ImpersonationConfig` in the Kubernetes API config.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-31 16:05:33 -07:00
Andrew Seigner a8830b2323
Set heartbeat cronjobs to not restart on failure (#3174)
The heartbeat cronjob specified `restartPolicy: OnFailure`. In cases
where failure was non-transient, such as if a cluster did not have
internet access, this would continuously restart and fail.

Change the heartbeat cronjob to `restartPolicy: Never`, as a failed job
has no user-facing impact.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-31 13:51:13 -07:00
Kevin Leimkuhler 8d9cfbf670
Inject Tap service name into proxy PodSpec (#3155)
### Summary

In order for Pods' tap servers to start authorizing tap clients, the tap server
must be able to check client names against the expected tap service name.

This change injects the `LINKERD2_PROXY_TAP_SVC_NAME` into proxy PodSpecs.

### Details

The tap servers on the individual resources being tapped should be able to
verify that the client is the tap service. `LINKERD2_PROXY_TAP_SVC_NAME` is
now injected as an environment variable in the proxies so that each proxy can
check this value against the client name of the TLS connection. Currently, this
environment variable goes unused. There is an open PR (linkerd2-proxy#290) to
use it in the proxy, but this change is *not* dependent on that merging first.

Note: The variable is not injected if tap is disabled.
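
The conditional injection could look roughly like the sketch below. The helper `proxy_env`, the `tap_disabled` parameter, and the example service name are hypothetical stand-ins for illustration, not the injector's actual code.

```python
TAP_SVC_NAME_ENV = "LINKERD2_PROXY_TAP_SVC_NAME"


def proxy_env(base_env, tap_svc_name, tap_disabled=False):
    """Return the proxy env list, adding the tap service name unless tap is disabled."""
    env = list(base_env)  # copy so the caller's list is not mutated
    if not tap_disabled:
        env.append({"name": TAP_SVC_NAME_ENV, "value": tap_svc_name})
    return env


base = [{"name": "LINKERD2_PROXY_LOG", "value": "info"}]
# Hypothetical identity string for the tap service.
enabled = proxy_env(base, "tap.example.identity")
disabled = proxy_env(base, "tap.example.identity", tap_disabled=True)
print([e["name"] for e in enabled])
print([e["name"] for e in disabled])
```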

### Testing

Test output has been updated with the newly injected environment variable.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-07-29 15:05:45 -07:00
Tarun Pothulapati 2ba2dea6a6 Added Resource Limits when ha is Configured (#3092)
* increased ha resource limits
* added resource limits to proxy when HA
* update golden files in cmd/main

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-07-26 09:46:36 -07:00
Alejandro Pedraza 8c07223f3b
Remove unused argument (#3149)
Removed unused argument in the `GetPatch()` function in
`pkg/inject/inject.go`

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2019-07-26 11:39:25 -05:00
Cody Vandermyn 808fa381f9 A Slightly More Restrictive PSP (#3085)
* Adds more PSP restrictions
* Update test fixtures
* Updates PSP to be conditional on initContainer

- The proxy-init container runs as root and needs the PSP to allow this
user when there is an init container.

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2019-07-24 10:12:33 -07:00
Andrew Seigner 889a4a0578
Introduce -A as a shorthand for --all-namespaces (#3125)
kubectl introduced `-A` as a shorthand for `--all-namespaces` in
`v1.14.0`:
https://github.com/kubernetes/kubernetes/pull/72006

Update linkerd cli's `edges`, `get`, and `stat` commands to match this
convention.
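
To illustrate the convention only, here is a sketch using Python's `argparse` rather than the Go flag library the CLI is built on. The parser name `stat` is just a label; both spellings set the same boolean.

```python
import argparse

parser = argparse.ArgumentParser(prog="stat")
# One option with two spellings: the short -A and the long --all-namespaces.
parser.add_argument(
    "-A",
    "--all-namespaces",
    action="store_true",
    help="query resources across all namespaces",
)

short_form = parser.parse_args(["-A"])
long_form = parser.parse_args(["--all-namespaces"])
default = parser.parse_args([])
print(short_form.all_namespaces, long_form.all_namespaces, default.all_namespaces)
```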

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-24 07:50:22 -07:00
Andrew Seigner 64ed8e4a74
Introduce Cluster Heartbeat cronjob (#3056)
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify what
versions of Linkerd are currently in use and what environments it is
being run in, which helps prioritize testing and backports.

Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.

Example check URL:
https://versioncheck.linkerd.io/version.json?
  install-time=1562761177&
  k8s-version=v1.15.0&
  meshed-pods=8&
  rps=3&
  source=heartbeat&
  uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
  version=git-b97ee9f7
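
A minimal sketch of how such a check URL and a daily schedule might be assembled. The helper names `heartbeat_url` and `heartbeat_schedule` are hypothetical, and deriving the cron expression from the install time plus 5 minutes is an assumption about one way to get the behavior described above.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

BASE_URL = "https://versioncheck.linkerd.io/version.json"


def heartbeat_url(params):
    """Encode the parameters, sorted by key, onto the version check URL."""
    return BASE_URL + "?" + urlencode(sorted(params.items()))


def heartbeat_schedule(install_time):
    """Cron expression that fires daily, first at install time + 5 minutes."""
    first_run = install_time + timedelta(minutes=5)
    return f"{first_run.minute} {first_run.hour} * * *"


url = heartbeat_url(
    {
        "install-time": 1562761177,
        "k8s-version": "v1.15.0",
        "meshed-pods": 8,
        "rps": 3,
        "source": "heartbeat",
        "uuid": "cc4bb700-3314-426a-9f0f-ec588b9df020",
        "version": "git-b97ee9f7",
    }
)
schedule = heartbeat_schedule(datetime(2019, 7, 10, 12, 19, 37))
print(url)
print(schedule)
```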

Fixes #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 17:12:30 -07:00
Andrew Seigner 48a69cb88a
Bump Prometheus to 2.11.1, Grafana to 6.2.5 (#3123)
- set `disable_sanitize_html` in `grafana.ini`.
- make all text box dimensions whole integers to fix dropdown issue,
  reported in:
  https://github.com/linkerd/linkerd2/issues/2955#issuecomment-503085444
- rev all dashboards to `schemaVersion` 18 for Grafana 6.2.5
- `prometheus-benchmark.json` based on:
  https://grafana.com/grafana/dashboards/9761
- `prometheus.json` based on:
  69c93e6401/public/app/plugins/datasource/prometheus/dashboards/prometheus_2_stats.json
- `grafana.json` based on:
  85aed0276e/public/app/plugins/datasource/prometheus/dashboards/grafana_stats.json

Fixes #2955

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 13:37:56 -07:00
Alex Leong d6ef9ea460
Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078)
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller.  Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.

To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2.  This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource.  If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can.  This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.  

Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation).  In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-23 11:46:31 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constraints on an authority: instead of requiring a
`svc.cluster.local` suffix, it only requires `svc` as the third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set
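
The relaxed authority check from the first item above can be sketched as follows. The function `is_service_authority` is a hypothetical name, and the parsing is deliberately simplified (for instance, it does not handle IPv6 literals).

```python
def is_service_authority(authority):
    """Accept `<name>.<namespace>.svc[.<cluster domain>][:port]` authorities."""
    host = authority.split(":", 1)[0]  # drop an optional port
    labels = host.split(".")
    # Only the third label must be `svc`; the cluster domain after it is free-form.
    return len(labels) >= 3 and labels[2] == "svc"


for candidate in (
    "web.emojivoto.svc.cluster.local:8080",
    "web.emojivoto.svc.example.org",
    "web.emojivoto.pod.cluster.local",
    "example.com:80",
):
    print(candidate, is_service_authority(candidate))
```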

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong c8b34a8cab
Add pod status to linkerd check (#3065)
When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff).

We add pod status to the output to make this more immediately obvious.

Fixes #2877 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-18 15:56:19 -07:00
Tarun Pothulapati fcec1cfb8a Added Anti Affinity when HA is configured (#2893)
* Added Anti Affinity when HA is configured
* Move check to validate()
* Test output with anti-affinity when ha upgrade
* Add anti-affinity to identity deployment
* made host anti-affinity default when ha
* Define affinity template in a separate file

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-07-18 10:03:25 -07:00
Alejandro Pedraza ba9fd70892
`linkerd upgrade config` bombs when installation had a flag (#3097)
When installing with some of the flags that persist in install, e.g.
`linkerd install --ha`, and then running `linkerd upgrade config`, a nil
pointer error is thrown.

Fixes #3094

`newCmdUpgradeConfig()` was passing `flags` as nil because
`linkerd upgrade config` doesn't expose any flags for the subcommand,
but it turns out they're still needed down the call stack in
`setFlagsFromInstall` to reuse the flags persisted during install.

I also added a new unit test catching this.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-07-18 09:09:01 -05:00
Carol A. Scott ee1a111993
Updating CLI output for `linkerd edges` (#3048)
This PR improves the CLI output for `linkerd edges` to reflect the latest API
changes. 

Source and destination namespaces for each edge are now shown by default. The
`MSG` column has been replaced with `Secured` and contains a green checkmark or
the reason for no identity. A new `-o wide` flag shows the identity of client
and server if known.
2019-07-17 12:23:34 -07:00
Jonathan Juares Beber 2dcbde08b3 Show pod status more clearly (#1967) (#2989)
During operations with `linkerd stat`, the actual pod status is sometimes
unclear.

This commit introduces a method in the `k8s` package that gets the pod status,
based on [`kubectl` logic](33a3e325f7/pkg/printers/internalversion/printers.go (L558-L640)),
to expose the `STATUS` column for pods. It also changes the stat command
in the `cli` package, adding a column when the resource type is a Pod.
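
A heavily simplified sketch of the idea: start from the pod phase and let a container's waiting or terminated reason override it. This is not the full `kubectl` logic (it ignores init containers and deletion timestamps, among other cases), and the function name `pod_status` is hypothetical.

```python
def pod_status(pod):
    """Derive a display status from a pod dict shaped like the Kubernetes API object."""
    status = pod.get("status", {})
    reason = status.get("reason") or status.get("phase", "Unknown")
    for container in status.get("containerStatuses", []):
        state = container.get("state", {})
        waiting = state.get("waiting") or {}
        terminated = state.get("terminated") or {}
        # A waiting or terminated reason is more informative than the phase.
        if waiting.get("reason"):
            reason = waiting["reason"]
        elif terminated.get("reason"):
            reason = terminated["reason"]
    return reason


pending = {
    "status": {
        "phase": "Pending",
        "containerStatuses": [{"state": {"waiting": {"reason": "ImagePullBackOff"}}}],
    }
}
running = {
    "status": {
        "phase": "Running",
        "containerStatuses": [{"state": {"running": {}}}],
    }
}
print(pod_status(pending), pod_status(running))
```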

Fixes #1967

Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
2019-07-10 12:44:44 -07:00
Andrew Seigner 7756828ae6
Update install failure message to list resources (#3050)
The existing `linkerd install` error message for existing resources was
shared with `linkerd check`. Given the different contexts, the messaging
made more sense for `linkerd check` than for `linkerd install`.

Modify the error messaging for `linkerd install` to print a bare list
of existing resources, and provide instructions for proceeding.

For example:
```bash
$ linkerd install
Unable to install the Linkerd control plane. It appears that there is an existing installation:

clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity

If you are sure you'd like to have a fresh install, remove these resources with:

    linkerd install --ignore-cluster | kubectl delete -f -

Otherwise, you can use the --ignore-cluster flag to overwrite the existing global resources.
```

Fixes #3045

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-09 20:21:19 +02:00
Andrew Seigner 9e09bd5e98
Mark High Availability as non-experimental (#3049)
Fixes #2419

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-09 20:20:28 +02:00
Andrew Seigner 94fa653cf3
Fix `linkerd check` missing uuid on version check (#3040)
PR #2603 modified the web process to read the UUID from the
`linkerd-config` ConfigMap rather than from a command line flag. The
`linkerd check` command relied on that command line flag to retrieve the
UUID as part of its version check.

Modify `linkerd check` to correctly retrieve the UUID from
`linkerd-config`. Also refactor `linkerd-config` retrieval and parsing
code to be shared between healthcheck, install, and upgrade.

Relates to #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-05 19:39:13 +02:00
Tarun Pothulapati eb7f9866af Fix inject with path and add tests (#3038)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-07-05 09:26:25 -05:00
Alejandro Pedraza 53e589890d
Have `linkerd endpoints` use `Destination.Get` (#2990)
* Have `linkerd endpoints` use `Destination.Get`

Fixes #2885

We're refactoring `linkerd endpoints` so it hits the `Destination.Get`
endpoint directly, instead of relying on the Discovery service.

For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.

I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.

Analogously, I added a `destinationServer` struct that mimics
`tapServer`.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-07-03 09:11:03 -05:00
Ivan Sim 7e1c14e783
Add the 'linkerd.io/control-plane-ns' label to the Traffic Split CRD (#3026)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-07-02 15:46:25 -07:00
Andrew Seigner 902978fe48
Rename debug annotation to enable-debug-sidecar (#3016)
Linkerd's CLI flags all match 1:1 with their `config.linkerd.io/*`
annotation counterparts, except `--enable-debug-sidecar`, which
corresponded to `config.linkerd.io/debug`. Additionally, the Linkerd
docs assume this 1:1 mapping.

Rename the `config.linkerd.io/debug` annotation to
`config.linkerd.io/enable-debug-sidecar`.
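
With the rename in place, the 1:1 mapping can be expressed mechanically. The helper `annotation_for_flag` below is a hypothetical illustration, not a function from the codebase.

```python
ANNOTATION_PREFIX = "config.linkerd.io/"


def annotation_for_flag(flag):
    """Map a long CLI flag such as `--enable-debug-sidecar` to its annotation key."""
    if not flag.startswith("--"):
        raise ValueError(f"expected a long flag, got {flag!r}")
    return ANNOTATION_PREFIX + flag[2:]


print(annotation_for_flag("--enable-debug-sidecar"))
```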

Relates to https://github.com/linkerd/website/issues/381

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-02 20:01:52 +02:00
Carol A. Scott a504e8c2d8
Expand and improve edges API endpoint (#3007)
Updates functionality of `linkerd edges`, including a new `--all-namespaces`
flag and returning namespace information for SRC and DST resources.
2019-06-28 15:46:04 -07:00
Alex Leong 27373a8b78
Add traffic splitting to destination profiles (#2931)
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).

We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to.  A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.
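
Conceptually, the merge step turns each TrafficSplit backend into a weighted destination override. The sketch below assumes a simplified TrafficSplit shape and override fields named `authority` and `weight`; the function `to_dst_overrides` and its parameters are hypothetical and do not mirror the actual adaptor.

```python
def to_dst_overrides(traffic_split, namespace, port, cluster_domain="cluster.local"):
    """Convert TrafficSplit backends into a list of weighted destination overrides."""
    overrides = []
    for backend in traffic_split["spec"]["backends"]:
        authority = f"{backend['service']}.{namespace}.svc.{cluster_domain}:{port}"
        overrides.append({"authority": authority, "weight": backend["weight"]})
    return overrides


split = {
    "spec": {
        "service": "web",
        "backends": [
            {"service": "web-v1", "weight": 900},
            {"service": "web-v2", "weight": 100},
        ],
    }
}
overrides = to_dst_overrides(split, namespace="default", port=8080)
for override in overrides:
    print(override)
```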

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-28 13:19:47 -07:00
Tarun Pothulapati 7db058f096 linkerd inject from remote URL (#2988)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-28 09:47:33 -07:00
Tarun Pothulapati 5c5ec6d816 add admin port label to proxy-injector and sp-validator (#2984)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-27 17:25:49 -05:00
Ivan Sim 866fe6fa5e
Introduce global resources checks to install and multi-stage install (#2987)
* Introduce new checks to determine existence of global resources and the
'linkerd-config' config map.
* Update pre-check to check for existence of global resources

This ensures that multiple control planes can't be installed into
different namespaces.

* Update integration test clean-up script to delete psp and crd

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-27 09:59:12 -07:00
Alejandro Pedraza 73740fb503
Simplify port-forwarding code (#2976)
* Simplify port-forwarding code

Simplifies the establishment of a port-forwarding by moving the common
logic into `PortForward.Init()`

Stemmed from this
[comment](https://github.com/linkerd/linkerd2/pull/2937#discussion_r295078800)

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-26 11:14:57 -05:00
Andrew Seigner 81790b6735 Bump Prometheus to v2.10.0 (#2979)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-06-21 12:51:31 -07:00
Tarun Pothulapati a3ce06bd80 Add sideEffects field to Webhooks (#2963)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-21 11:06:10 -07:00
Ivan Sim 435fe861d0
Label all Linkerd resources (#2971)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-20 09:44:30 -07:00
Ivan Sim e2e976cce9
Add `NET_RAW` capability to the proxy-init container (#2969)
Also, update control plane PSP to match linkerd/website#94

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-19 19:34:37 -07:00
Dennis Adjei-Baah 694ba9c2cb
Revert add namespace name to MWC (#2946)
* revert add namespace name to MWC
2019-06-14 15:26:34 -07:00
Alejandro Pedraza 7fc6c195ad
Set MWC and VWC failure policy to 'fail' in HA mode only (#2943)
Fixes #2927

Also moved `TestInstallSP` after `TestCheckPostInstall` so we're sure
the validating webhook is ready before installing a service profile.

Signed-off-by: Alejandro Pedraza Borrero <alejandro@buoyant.io>
2019-06-14 11:50:59 -05:00
Alejandro Pedraza 28025eeb56
Remove UPDATE event from the mutating webhook config (#2919)
Fixes #2889

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-13 15:42:47 -05:00
Alejandro Pedraza e9bf014d34
Remove MWVC RBAC from webhook configs (#2925)
Fixes #2890

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-13 15:42:00 -05:00
Dennis Adjei-Baah 035ba6ae87
update sp-validator MWC golden test file (#2938)
2019-06-13 13:39:24 -07:00
Dennis Adjei-Baah 8aef9280dd
add namespace name to MWC (#2905)
When installing multiple control planes, the mutatingwebhookconfiguration of the first control plane gets overwritten by any subsequent control plane install. This is caused by the fixed name given to the mutatingwebhookconfiguration manifest at install time.

This commit adds in the namespace to the manifest so that there is a unique configuration for each control plane.

Fixes #2887
2019-06-13 12:15:43 -07:00
Ivan Sim ecc4465cd1
Introduce Control Plane's PSP and RBAC resources into Helm templates (#2920)
* Add control plane and CNI PSP and RBAC resources
* Add the '--linkerd-cni-enabled' flag to the multi-stage install subcommands

This flag ensures that the NET_ADMIN capability is omitted from the control
plane's PSP during 'install config' and the proxy-init containers aren't
injected during 'install control-plane'.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-12 20:18:46 -07:00
Alejandro Pedraza 8416d326c2
If HA, set the webhooks failure policy to 'Fail' (#2906)
* If HA, set the webhooks failure policy to 'Fail'

I'm adding to the linkerd namespace a new label
`linkerd.io/is-control-plane: true` that is used in the webhook configs'
selector to skip the proxy injector for this namespace. This avoids
running into the timing issues described in #2852.
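
The relevant webhook settings might be sketched as below. The `Ignore` value for non-HA installs and the `DoesNotExist` selector expression are assumptions about how this could be expressed; `webhook_settings` is a hypothetical helper.

```python
CONTROL_PLANE_LABEL = "linkerd.io/is-control-plane"


def webhook_settings(ha):
    """Return the failure policy and namespace selector for a webhook config."""
    return {
        # Fail closed only in HA mode, where the webhook has multiple replicas.
        "failurePolicy": "Fail" if ha else "Ignore",
        # Skip any namespace carrying the control plane label.
        "namespaceSelector": {
            "matchExpressions": [
                {"key": CONTROL_PLANE_LABEL, "operator": "DoesNotExist"}
            ]
        },
    }


print(webhook_settings(ha=True)["failurePolicy"])
print(webhook_settings(ha=False)["failurePolicy"])
```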

Fixes #2852

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-11 13:11:54 -05:00
Cody Vandermyn 33de3574ee Correctly set securityContext values on injection (#2911)
The patch provided by @ihcsim applies correct values for the securityContext during injection, namely: `allowPrivilegeEscalation = false`, `readOnlyRootFilesystem = true`, and the capabilities are copied from the primary container. Additionally, the proxy-init container securityContext has been updated with appropriate values.
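
As a sketch of the values described above, the function below builds a proxy securityContext and copies capabilities from the primary container. The name `proxy_security_context` and the dict shapes are hypothetical simplifications of the injector's patch logic.

```python
import copy


def proxy_security_context(primary_container):
    """Build the proxy's securityContext, copying capabilities from the primary container."""
    context = {
        "allowPrivilegeEscalation": False,
        "readOnlyRootFilesystem": True,
    }
    capabilities = primary_container.get("securityContext", {}).get("capabilities")
    if capabilities is not None:
        # Deep copy so later edits to the proxy do not touch the primary container.
        context["capabilities"] = copy.deepcopy(capabilities)
    return context


primary = {"name": "app", "securityContext": {"capabilities": {"drop": ["ALL"]}}}
proxy_context = proxy_security_context(primary)
print(proxy_context)
```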

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2019-06-11 10:34:30 -07:00
Dan 24bbd7c64b Ensure Prometheus log level is lowercase (#2823) (#2870)
Signed-off-by: Daniel Baranowski <daniel.baranowski@infinityworks.com>
2019-06-07 09:57:08 -07:00
Alejandro Pedraza 66eb829e5a
Fix HA during upgrade (#2900)
* Fix HA during upgrade

If we have a Linkerd installation with HA, and then we do `linkerd
upgrade` without specifying `--ha`, the replicas will get set back to 1,
yet the resource requests will keep their HA values.

Desired behavior: `linkerd install --ha` adds the `ha` value into the
linkerd-config, so it should be used during upgrade even if `--ha` is
not passed to `linkerd upgrade`.
Note that we can still run `linkerd upgrade --ha=false` to disable HA.
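
The desired precedence can be sketched as a small resolver: an explicitly passed flag wins, otherwise the value persisted in linkerd-config is used. The function `resolve_ha` is a hypothetical illustration, with `None` standing for "flag not passed".

```python
def resolve_ha(flag_value, persisted_ha):
    """Pick the effective HA setting for an upgrade."""
    if flag_value is None:
        # --ha was not passed: fall back to the value stored at install time.
        return persisted_ha
    return flag_value


print(resolve_ha(None, persisted_ha=True))   # upgrade without --ha keeps HA
print(resolve_ha(False, persisted_ha=True))  # --ha=false disables HA
```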

This is a prerequisite to address #2852

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-06 17:27:27 -05:00
Alejandro Pedraza 74ca92ea25
Split proxy-init into separate repo (#2824)
Split proxy-init into separate repo

Fixes #2563

The new repo is https://github.com/linkerd/linkerd2-proxy-init, and I
tagged the latest there `v1.0.0`.

Here, I've removed the `/proxy-init` dir and pinned the injected
proxy-init version to `v1.0.0` in the injector code and tests.

`/cni-plugin` depends on proxy-init, so I updated the import paths
there, and could verify CNI is still working (there is some flakiness
but unrelated to this PR).

For consistency, I added a `--init-image-version` flag to `linkerd
inject` along with its corresponding override config annotation.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-03 16:24:05 -05:00