Commit Graph

35 Commits

Author SHA1 Message Date
Ivan Sim cf69dedf9c
Re-add the destination container to the controller spec (#3540)
* Re-add the destination container to the controller spec

This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.

On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.

* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-08 10:49:40 -07:00
Alejandro Pedraza c5d68ecb16
Add missing nodeSelector in Destination deployment (#3527)
Fixes #3526
2019-10-04 12:47:55 -05:00
Bruno M. Custódio caddda8e48 Add support for a node selector in the Helm chart. (#3275)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-27 10:36:37 -07:00
Tarun Pothulapati 096668d62c make public-api use the right destination address (#3476)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-25 15:24:56 -05:00
Alejandro Pedraza 1653f88651
Put the destination controller into its own deployment (#3407)
* Put the destination controller into its own deployment

Fixes #3268

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-18 13:41:06 -05:00
Ivan Sim 4d89c52113 Update Prometheus config to keep only needed cadvisor metrics (#3401)
* Update prometheus cadvisor config to only keep container resources metrics

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Drop unused large metric

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Fix unit test

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Siggy's feedback

Signed-off-by: Ivan Sim <ivan@buoyant.io>

* Fix unit test

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-09-17 10:17:49 -07:00
Alejandro Pedraza 1e2810c431
Trim certs and keys in the Helm charts (#3421)
* Trim certs and keys in the Helm charts

Fixes #3419

When installing through the CLI the installation will fail if the certs
are malformed, so this only concerns the Helm templates.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-11 20:47:38 -05:00
Andrew Seigner 7f59caa7fc
Bump proxy-init to 1.2.0 (#3397)
Pulls in latest proxy-init:
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/v1.2.0

This also bumps a dependency on cobra, which provides more complete zsh
completion.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-09 09:06:14 -07:00
arminbuerkle 5c38f38a02 Allow custom cluster domains in remaining backends (#3278)
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetchting cluster domain errors separately
* Use custom cluster domain for traffic split adaptor

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-27 10:01:36 -07:00
arminbuerkle e7d303e03f Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES (#3277)
* Fix missing `clusterDomain` in render RenderTapOutputProfile
* Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES env variable

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-21 14:28:30 -07:00
Ivan Sim 183e42e4cd
Merge the CLI 'installValues' type with Helm 'Values' type (#3291)
* Rename template-values.go
* Define new constructor of charts.Values type
* Move all Helm values related code to the pkg/charts package
* Bump dependency
* Use '/' in filepath to remain compatible with VFS requirement
* Add unit test to verify Helm YAML output
* Alejandro's feedback
* Add unit test for Helm YAML validation (HA)

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-20 19:26:38 -07:00
cpretzer 4e92064f3b
Add a flag to install-cni command to configure iptables wait flag (#3066)
Signed-off-by: Charles Pretzer <charles@buoyant.io>
2019-08-15 12:58:18 -07:00
Kevin Leimkuhler cc3c53fa73
Remove tap from public API and associated test infrastructure (#3240)
### Summary

After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastrucure related to tap can be removed.

This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf fils were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files.

This draft passes `go test` as well as a local run of the integration tests.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-14 17:27:37 -04:00
Ivan Sim 4d01e3720e
Update install and upgrade code to use the new helm charts (#3229)
* Delete symlink to old Helm chart
* Update 'install' code to use common Helm template structs
* Remove obsolete TLS assets functions.

These are now handle by Helm functions inside the templates

* Read defaults from values.yaml and values-ha.yaml
* Ensure that webhooks TLS assets are retained during upgrade
* Fix a few bugs in the Helm templates (see bullet points):
* Merge the way the 'install' ha and non-ha options are handled into one function
* Honor the 'NoInitContainer' option in the components templates
* Control plane mTLS will not be disabled if identity context in the
config map is empty. The data plane mTLS will still be automatically disabled
if the context is nil.
* Resolve test failures from rebase with master
* Fix linter issues
* Set service account mount path read-only field
* Add TLS variables of the webhooks and tap to values.yaml

During upgrade, these secrets are preserved to ensure they remain synced
wih the CA bundle in the webhook configurations. These Helm variables are used
to override the defaults in the templates.

* Remove obsolete 'chart' folder
* Fix bugs in templates
* Handle missing webhooks and tap TLS assets during upgrade

When upgrading from an older version that don't have these secrets, fallback to let Helm
create them by creating an empty charts.TLS struct.

* Revert the selector labels of webhooks to be compatible with that in 2.4

In 2.4, the proxy injector and profile validator webhooks already have their selector labels defined.
Since these attributes are immutable, the recent change to these selectors introduced by the Helm chart work will cause upgrade to fail.

* Alejandro's feedback
* Siggy's feedback
* Removed redundant unexported custom types

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-13 14:16:24 -07:00
Thomas Rampelberg ca5b4fab2e
Add container metrics and grafana dashboard (#3217)
* Add container metrics and grafana dashboard

* Review cleanup

* Update templates
2019-08-12 08:03:57 -07:00
Tarun Pothulapati 0cbba0b03e Setting SuccessfulJobHistoryLimit to 0 for CronJobs (#3193)
* setting successful job history limit to 0

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-08-07 16:59:14 -05:00
Andrew Seigner a59c1dd32d
Introduce tap APIService, update `linkerd tap` (#3167)
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.

This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.

This change also modifies the `linkerd tap` command to make requests
against the new APIService.

The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET  /apis
GET  /apis/tap.linkerd.io
GET  /apis/tap.linkerd.io/v1alpha1
GET  /healthz
GET  /healthz/log
GET  /healthz/ping
GET  /metrics
GET  /openapi/v2
GET  /version

Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.

This change introduces the following resources into the default Linkerd
install:
- Global
  - APIService/v1alpha1.tap.linkerd.io
  - ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
  - Secret/linkerd-tap-tls
- `kube-system` namespace:
  - RoleBinding/linkerd-linkerd-tap-auth-reader

Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller

Fixes #2725, #3162, #3172

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-01 14:02:45 -07:00
Andrew Seigner a8830b2323
Set heartbeat cronjobs to not restart on failure (#3174)
The heartbeat cronjob specified `restartPolicy: OnFailure`. In cases
where failure was non-transient, such as if a cluster did not have
internet access, this would continuously restart and fail.

Change the heartbeat cronjob to `restartPolicy: Never`, as a failed job
has no user-facing impact.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-31 13:51:13 -07:00
Kevin Leimkuhler 8d9cfbf670
Inject Tap service name into proxy PodSpec (#3155)
### Summary

In order for Pods' tap servers to start authorizing tap clients, the tap server
must be able to check client names against the expected tap service name.

This change injects the `LINKERD2_PROXY_TAP_SVC_NAME` into proxy PodSpecs.

### Details

The tap servers on the individual resources being tapped should be able to
verify that the client is the tap service. The `LINKERD2_PROXY_TAP_SVC_NAME` is
now injected as an environment variable in the proxies so that it can check this
value against the client name of the TLS connection. Currently, this environment
will go unused. There is an open PR (linkerd2-proxy#290) to use this variable in
the proxy, but this is *not* dependent on that merging first. 

Note: The variable is not injected if tap is disabled.

### Testing

Test output has been updated with the newly injected environment variable.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-07-29 15:05:45 -07:00
Andrew Seigner 64ed8e4a74
Introduce Cluster Heartbeat cronjob (#3056)
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify what
versions of Linkerd are currently in use and what environments it is
being run in, which helps prioritize testing and backports.

Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.

Example check URL:
https://versioncheck.linkerd.io/version.json?
  install-time=1562761177&
  k8s-version=v1.15.0&
  meshed-pods=8&
  rps=3&
  source=heartbeat&
  uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
  version=git-b97ee9f7

Fixes #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 17:12:30 -07:00
Andrew Seigner 48a69cb88a
Bump Prometheus to 2.11.1, Grafana to 6.2.5 (#3123)
- set `disable_sanitize_html` in `grafana.ini`.
- make all text box dimensions whole integers to fix dropdown issue,
  reported in:
  https://github.com/linkerd/linkerd2/issues/2955#issuecomment-503085444
- rev all dashboards to `schemaVersion` 18 for Grafana 6.2.5
- `prometheus-benchmark.json` based on:
  https://grafana.com/grafana/dashboards/9761
- `prometheus.json` based on:
  69c93e6401/public/app/plugins/datasource/prometheus/dashboards/prometheus_2_stats.json
- `grafana.json` based on:
  85aed0276e/public/app/plugins/datasource/prometheus/dashboards/grafana_stats.json

Fixes #2955

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 13:37:56 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Tarun Pothulapati 5c5ec6d816 add admin port label to proxy-injector and sp-validator (#2984)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-27 17:25:49 -05:00
Andrew Seigner 81790b6735 Bump Prometheus to v2.10.0 (#2979)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-06-21 12:51:31 -07:00
Tarun Pothulapati a3ce06bd80 Add sideEffects field to Webhooks (#2963)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-21 11:06:10 -07:00
Ivan Sim 435fe861d0
Label all Linkerd resources (#2971)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-20 09:44:30 -07:00
Ivan Sim e2e976cce9
Add `NET_RAW` capability to the proxy-init container (#2969)
Also, update control plane PSP to match linkerd/website#94

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-19 19:34:37 -07:00
Cody Vandermyn 33de3574ee Correctly set securityContext values on injection (#2911)
The patch provided by @ihcsim applies correct values for the securityContext during injection, namely: `allowPrivilegeEscalation = false`, `readOnlyRootFilesystem = true`, and the capabilities are copied from the primary container. Additionally, the proxy-init container securityContext has been updated with appropriate values.

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2019-06-11 10:34:30 -07:00
Alejandro Pedraza 74ca92ea25
Split proxy-init into separate repo (#2824)
Split proxy-init into separate repo

Fixes #2563

The new repo is https://github.com/linkerd/linkerd2-proxy-init, and I
tagged the latest there `v1.0.0`.

Here, I've removed the `/proxy-init` dir and pinned the injected
proxy-init version to `v1.0.0` in the injector code and tests.

`/cni-plugin` depends on proxy-init, so I updated the import paths
there, and could verify CNI is still working (there is some flakiness
but unrelated to this PR).

For consistency, I added a `--init-image-version` flag to `linkerd
inject` along with its corresponding override config annotation.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-03 16:24:05 -05:00
Ivan Sim 5a5f8bbfe8
Install MWC and VWC During Installation (#2806)
* Update helm charts to include webhooks config and TLS secret
* Update the webhooks to read the secret cert and key
* Update webhooks to not recreate config on restart
* Ensure upgrade preserve existing secrets
* Revert the change to rename the webhook configs

The renaming change breaks upgrade, where the new webhook configs conflict with
the existing ones. The older resources  aren't deleted during upgrade because
they are dynamically created.

* Make the secret volume read-only
* Remove unnecessary exported getter functions
* Remove obsolete mwc and vwc templates

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-05-20 12:43:50 -07:00
Dennis Adjei-Baah a0fa1dff59
Move tap service into its own pod. (#2773)
* Split tap into its own pod in the control plane

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2019-05-15 16:28:44 -05:00
Ivan Sim 714035fee9
Define default resource spec for proxy-init init container (#2763)
Fixes #2750 

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-04-29 11:41:05 -07:00
Andrew Seigner be60b37e93
Group Web and Grafana ServiceAccounts with RBAC (#2756)
All ServiceAccounts are intended to be grouped together with other RBAC
resources, particularly for `linkerd install config` output. Grafana and
Web ServiceAccounts were still included with their respective
Deployments.

Group Grafana and Web ServiceAccounts with other RBAC resources.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-25 17:33:05 -07:00
Alejandro Pedraza 53bb7c47f6
Make the auto-injector required and removed proxy-auto-inject flag (#2733)
Make the auto-injector required and removed proxy-auto-inject flag

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-04-24 13:06:51 -05:00
Andrew Seigner b2b4780430
Introduce install stages (#2719)
This change introduces two named parameters for `linkerd install`, split
by privilege:
- `linkerd install config`
  - Namespace
  - ClusterRoles
  - ClusterRoleBindings
  - CustomResourceDefinition
  - ServiceAccounts
- `linkerd install control-plane`
  - ConfigMaps
  - Secrets
  - Deployments
  - Services

Comprehensive `linkerd install` is still supported.

TODO:
- `linkerd check` support
- `linkerd upgrade` support
- integration tests

Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-23 14:52:34 -07:00