Commit Graph

132 Commits

Author SHA1 Message Date
Tarun Pothulapati 2184928813 Wire up stats for Jobs (#2416)
Support for Jobs in stat/tap/top cli commands

Part of #2007

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-03-01 17:16:54 -08:00
Andrew Seigner d08dcb0a37
Skip outbound port 443 in control-plane (#2411)
linkerd/linkerd2#2349 introduced a `SelfSubjectAccessReview` check at
startup, to determine whether each control-plane component should
establish Kubernetes watches cluster-wide or namespace-wide. If this
check occurs before the linkerd-proxy sidecar is ready, it fails, and
the control-plane component restarts.

This change configures each control-plane pod to skip outbound port 443
when injecting the proxy, allowing the control-plane to connect to
Kubernetes regardless of the `linkerd-proxy` state.

A longer-term fix should involve a more robust control-plane startup,
that is resilient to failed Kubernetes API requests. An even longer-term
fix could involve injecting `linkerd-proxy` as a Kubernetes "sidecar"
container, when that becomes available.

Workaround for #2407

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-27 15:23:19 -08:00
Kevin Lingerfelt 40076c4de2
Remove namespace from serviceprofile CRD in install config (#2409)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-27 14:29:47 -08:00
Oliver Gould ab90263461
destination: Only return TLS identities when appropriate (#2371)
As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.

This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.

The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.

Fixes #2217
2019-02-27 12:18:39 -08:00
Andrew Seigner ec5a0ca8d9
Authorization-aware control-plane components (#2349)
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.

Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.

Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.

TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-26 11:54:52 -08:00
Ivan Sim 1e2e2bf53c
Install the Linkerd global and proxy config maps (#2344)
Also, some protobuf updates:

* Rename `api_port` to match recent changes in CLI code.
* Remove the `cni` message because it won't be used.
* Remove `registry` field from proto types. This helps to avoid having to workaround edge cases like fully-qualified image name in different format, and overriding user-specified Linkerd version etc.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-22 15:28:21 -08:00
Oliver Gould 4ed84f0c0a
Split install template into component-specific files (#2313)
chart/templates/base.yaml is nearly 800 lines and contains the
kubernetes configurations for the marjority of the control plane.
Furthermore, its contents are not particularly organized (for example,
the prometheus RBAC bindings are in the middle of the controller's
configuration).

The size and complexity of this file makes it especially daunting to
introduce new functionality.

In order to make the situation easier to understand and change, this
splits base.yaml into several new template files: namespace, controller,
serviceprofile, and prometheus, and grafana. The `tls.yaml` template has
been renamed `ca.yaml`, since it installs the `linkerd-ca` resources.

This change also makes the comments uniform, adding a "header" to each
logical component.

Fixes #2154
2019-02-18 15:31:17 -08:00
Oliver Gould 71ce786dd3
Rename linkerd-proxy-api to linkerd-destination (#2281)
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).

With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).

Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-15 15:11:04 -08:00
Andrew Seigner a9b9908908
Bump Prometheus to v2.7.1, Grafana to 5.4.3 (#2242)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-13 11:27:15 -08:00
Ivan Sim f6e75ec83a
Add statefulsets to the dashboard and CLI (#2234)
Fixes #1983

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-08 15:37:44 -08:00
Alex Leong 5b054785e5
Read service profiles from client or server namespace instead of control namespace (#2200)
Fixes #2077 

When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace.  This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.

Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace.  If a service profile exists in both namespaces, the client namespace takes priority.  In this way, clients may override the behavior dictated by the service.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-07 14:51:43 -08:00
Andrew Seigner 907f01fba6
Improve ServiceProfile validation in linkerd check (#2218)
The `linkerd check` command was doing limited validation on
ServiceProfiles.

Make ServiceProfile validation more complete, specifically validate:
- types of all fields
- presence of required fields
- presence of unknown fields
- recursive fields

Also move all validation code into a new `Validate` function in the
profiles package.

Validation of field types and required fields is handled via
`yaml.UnmarshalStrict` in the `Validate` function. This motivated
migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml.

Fixes #2190
2019-02-07 14:35:47 -08:00
Oliver Gould 44e31f0f67
Configure proxy keepalives via the environment (#2193)
In linkerd/linkerd2-proxy#186, the proxy supports configuration of TCP
keepalive values.

This change sets `LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE` and
`LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE` to 10s when injecting the
proxy, so that remote connections are configured with a keepalive.

This configuration is NOT yet exposed through the CLI. This may be done
in a followup, if necessary.

Fixes #1949
2019-02-04 16:16:43 -08:00
Eliza Weisman 846975a190
Remove proxy bind timeout from CLIs (#2017)
This branch removes the `--proxy-bind-timeout` flag from the 
`linkerd inject` and `linkerd install` CLI commands, and the
`LINKERD2_PROXY_BIND_TIMEOUT` environment variable from their output.
This is in preparation for removing that timeout from the proxy (as
described in #2013). 

I thought it was prudent to remove this from the CLIs before removing it
from the proxy, so we can't create a situation where the CLIs produce
output that results in broken proxy containers.

Fixes #2013

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2019-01-24 15:34:09 -08:00
zak 8c413ca38b Wire up stats commands for daemonsets (#2006) (#2086)
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.

Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.

Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.

Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.

Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.

Update dependencies and dockerfile hashes

Add DaemonSet support to tap and top commands

Fixes of #2006

Signed-off-by: Zak Knill <zrjknill@gmail.com>
2019-01-24 14:34:13 -08:00
Andrew Seigner c9ac77cd7c
Introduce version consistency checks (#2130)
Version checks were not validating that the cli version matched the
control plane or data plane versions.

Add checks via the `linkerd check` command to validate the cli is
running the same version as the control and data plane.

Also add types around `channel-version` string parsing and matching. A
consequence being that during development `version.Version` changes from
`undefined` to `dev-undefined`.

Fixes #2076

Depends on linkerd/website#101

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-23 16:54:43 -08:00
Kevin Lingerfelt ed3fbd75f3
Setup port-forwarding for linkerd dashboard command (#2052)
* Setup port-forwarding for linkerd dashboard command
* Output port-forward logs when --verbose flag is set

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-01-10 10:16:08 -08:00
Alena Varkockova 172398292d Add validation to CRD for service profiles (#2024)
* Add validation to CRD for service profiles

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>

* Use properties instead of oneof

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-10 07:04:40 -08:00
Kevin Lingerfelt a27bb2e0ce
Proxy grafana requests through web service (#2039)
* Proxy grafana requests through web service
* Fix -grafana-addr default, clarify -api-addr flag
* Fix version check in grafana dashboards
* Fix comment typo

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-01-04 16:07:57 -08:00
Thomas Rampelberg 612ebe2b81
Add rules_file loading into prometheus (#1966) 2018-12-20 11:47:13 -08:00
Kevin Lingerfelt 0866bb2a41
Remove runAsGroup field from security context settings (#1986)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 15:12:13 -08:00
Kevin Lingerfelt 86e95b7ad3
Disable serivce profiles in single-namespace mode (#1980)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 14:37:18 -08:00
Cody Vandermyn d847f66ec5 Create new service accounts for linkerd-web and linkerd-grafana. Chan… (#1978)
* Create new service accounts for linkerd-web and linkerd-grafana. Change 'serviceAccount:' to 'serviceAccountName:'
* Use dynamic namespace name

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2018-12-12 18:10:50 -08:00
Cody Vandermyn aa5e5f42eb Use an emptyDir for Prometheus and Grafana (#1971)
* Allow input of a volume name for prometheus and grafana
* Make Prometheus and Grafana volume names 'data' by default and disallow user editing via cli flags
* Remove volume name from options

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2018-12-12 15:54:03 -08:00
Cody Vandermyn 8e4d9d2ef6 add securityContext with runAsUser: {{.ControllerUID}} to the various cont… (#1929)
* add securityContext with runAsUser: {{.ProxyUID}} to the various containers in the install template
* Update golden to reflect new additions
* changed to a different user id than the proxy user id
* Added a controller-uid install option
* change the port that the proxy-injector runs
* The initContainers needs to be run as the root user.
* move security contexts to container level

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2018-12-11 11:51:28 -08:00
Alex Leong cbb196066f
Support service profiles for external authorities (#1928)
Add support for service profiles created on external (non-service) authorities.  For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`.

This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to lookup a service profile for any authority.  We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem.

We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup, regardless of if the authority looks like a Kubernetes service name or not.  To simplify this, support for multiple resolves (which was unused) was removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 14:32:59 -08:00
Oliver Gould 12ec5cf922
install: Add a -disable-h2-upgrade flag (#1926)
The proxy-api service _always_ suggests that two meshed pods communicate
via HTTP/2 (i.e. via transparent protocol upgrading, if necessary).
This can complicate debugging and diagnostics at times, so it's
important that we have a way to deploy linkerd without this auto-upgrade
behavior.

This change adds a `-disable-h2-upgrade` flag to the `linkerd install`
command that disables transparent upgrading for the whole cluster.
2018-12-05 12:50:47 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner ad2366f208
Revert proxy readiness initialDelaySeconds change (#1912)
Reverts part of #1899 to workaround readiness failures.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-04 14:27:55 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Andrew Seigner d121071f87
Adjust proxy, Prometheus, and Grafana probes (#1899)
* Adjust proxy, Prometheus, and Grafana probes

High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.

Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.

Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s

Fixes #1804

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 10:41:11 -08:00
Ben Lambert 297cb570f2 Added a --ha flag to install CLI (#1852)
This change allows some advised production config to be applied to the install of the control plane.
Currently this runs 3x replicas of the controller and adds some pretty sane requests to each of the components + containers of the control plane.

Fixes #1101

Signed-off-by: Ben Lambert <ben@blam.sh>
2018-11-20 23:03:59 -05:00