Private k8s clusters, such as the private GKE clusters offered by Google
Cloud, cannot be reached through the current API proxy method.
This commit uses the port forwarding feature already developed.
Also modify dashboard command to not fall back to ephemeral port.
Signed-off-by: Jack Price <jackprice@outlook.com>
This change reintroduces identity hinting to the destination service.
The Get endpoint includes identities for pods that are injected with an
identity-mode of "default" and have the same linkerd control plane.
A `serviceaccount` label is now also added to destination response
metadata so that it's accessible in prometheus and tap.
The new proxy has changed its configuration as follows:
- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.
Identity injection is **NOT** configured by this branch.
This change introduces a new Identity service implementation for the
`io.linkerd.proxy.identity.Identity` gRPC service.
The `pkg/identity` contains a core, abstract implementation of the service
(generic over both the CA and (Kubernetes) Validator interfaces).
`controller/identity` includes a concrete implementation that uses the
Kubernetes TokenReview API to validate serviceaccount tokens when
issuing certificates.
This change does **NOT** alter installation or runtime to include the
identity service. This will be included in a follow-up.
The proxy's TLS implementation has changed to use a new _Identity_ controller.
In preparation for this, the `--tls=optional` CLI flag has been removed
from install and inject; and the `ca` controller has been deleted. Metrics
and UI treatments for TLS have **not** been removed, as they will continue to
be valuable for the new Identity system.
With the removal of the old identity scheme, the Destination service's proxy
ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable
locality awareness.
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.
This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.
A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
combining some check categories, as we can always assume cluster-wide
requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
make a decision based on cluster-wide vs. namespace-wide access.
Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.
Reverts #1721
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
## Problem
When an object has no previous route metrics, we do not generate a table for
that object.
The reasoning behind this was for reducing output of the following command:
```
$ linkerd routes deploy --to deploy/foo
```
For each deployment object, if it has no previous traffic to `deploy/foo`, then
a table would not be generated for it.
However, the behavior we see with that indicates there is an error even when a
Service Profile is installed:
```
$ linkerd routes deploy deploy/foo
Error: No Service Profiles found for selected resources
```
## Solution
Always generate a stat table for the queried resource object.
## Validation
I deployed [booksapp](https://github.com/buoyantIO/booksapp) with the `traffic`
deployment removed and Service Profiles installed.
Without the fix, `linkerd routes deploy/webapp` displays an error because there
has been no traffic to `deploy/webapp` without the `traffic` deployment.
With the fix, the following output is generated:
```
ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
GET / webapp 0.00% 0.0rps 0ms 0ms 0ms
GET /authors/{id} webapp 0.00% 0.0rps 0ms 0ms 0ms
GET /books/{id} webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors/{id}/delete webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books/{id}/delete webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms
```
Closes#2328
Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
linkerd/linkerd2#2428 modified SelfSubjectAccessReview behavior to no
longer paper-over failed ServiceProfile checks, assuming that
ServiceProfiles will be required going forward. There was a lingering
ServiceProfile check in the web's startup that started failing due to
this change, as the web component does not have (and should not need)
ServiceProfile access. The check was originally implemented to inform
the web component whether to expect "single namespace" mode or
ServiceProfile support.
Modify the web's initialization to always expect ServiceProfile support.
Also remove single namespace integration test
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.
This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
We were depending on an untagged version of prometheus/client_golang
from Feb 2018.
This bumps our dependency to v0.9.2, from Dec 2018.
Also, this is a prerequisite to #1488.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.
This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.
The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.
Fixes#2217
Previously, the update-handling logic was spread across several very
small functions that were only called within this file. I've
consolidated this logic into endpointListener.Update so that all of the
debug logging can be instrumented in one place without having to iterate
over lists multiple times.
Also, I've fixed the formatting of IP addresses in some places.
Logs now look as follows:
msg="Establishing watch on endpoint linkerd-prometheus.linkerd:9090" component=endpoints-watcher
msg="Subscribing linkerd-prometheus.linkerd:9090 exists=true" component=service-port id=linkerd-prometheus.linkerd target-port=admin-http
msg="Update: add=1; remove=0" component=endpoint-listener namespace=linkerd service=linkerd-prometheus
msg="Update: add: addr=10.1.1.160; pod=linkerd-prometheus-7bbc899687-nd9zt; addr:<ip:<ipv4:167838112 > port:9090 > weight:1 metric_labels:<key:\"control_plane_ns\" value:\"linkerd\" > metric_labels:<key:\"deployment\" value:\"linkerd-prometheus\" > metric_labels:<key:\"pod\" value:\"linkerd-prometheus-7bbc899687-nd9zt\" > metric_labels:<key:\"pod_template_hash\" value:\"7bbc899687\" > protocol_hint:<h2:<> > " component=endpoint-listener namespace=linkerd service=linkerd-prometheus
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.
Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.
Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.
TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.
TCP stats are now displayed on the Resource Detail pages.
The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).
With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).
Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
# Problem
When a route does not specify a timeout, the proxy-api defaults to the default
timeout and logs an error:
```
time="2019-02-13T16:29:12Z" level=error msg="failed to parse duration for route POST /io.linkerd.proxy.destination.Destination/GetProfile: time: invalid duration"
```
# Solution
We now check if a route timeout is blank. If it is not set, it is set to
`DefaultRouteTimeout`. If it is set, we try to parse it into a `Duration`.
A request was made to improve logging to include the service profile and
namespace as well.
# Validation
With valid service profiles installed, edit the `.yaml` to include an invalid
`timeout`:
```
...
name: GET /
timeout: foo
```
We should now see the following errors:
```
proxy-api time="2019-02-13T22:27:32Z" level=error msg="failed to parse duration for route 'GET /' in service profile 'webapp.default.svc.cluster.local' in namespace 'default': time: invalid duration foo"
```
This error does not show up when `timeout` is blank.
Fixes#2276
Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.
This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.
Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes#2077
When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace. This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.
Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace. If a service profile exists in both namespaces, the client namespace takes priority. In this way, clients may override the behavior dictated by the service.
Signed-off-by: Alex Leong <alex@buoyant.io>
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes#2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds the ability to generate a service profile by running a tap for a configurable
amount of time, and using the route results from the routes seen during the tap.
e.g. `linkerd profile web --tap deploy/web -n emojivoto --tap-duration 2s`
Fixes#2042
Adds a new field to service profile routes called `timeout`. Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead. If the timeout field is not specified, a default timeout of 10 seconds is used.
Signed-off-by: Alex Leong <alex@buoyant.io>
When `GetProfiles` is called for a destination that does not have a service profile, the proxy-api service does not return any messages until a service profile is created for that service. This can be interpreted as hanging, and can make it difficult to calculate response latency metrics.
Change the behavior of the API to always return a service profile message immediately. If the service does not have a service profile, the default service profile is returned.
Signed-off-by: Alex Leong <alex@buoyant.io>
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.
Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.
Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.
Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.
Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.
Update dependencies and dockerfile hashes
Add DaemonSet support to tap and top commands
Fixes of #2006
Signed-off-by: Zak Knill <zrjknill@gmail.com>
Fixes#2119
When Linkerd is installed in single-namespace mode, the public-api container panics when it attempts to access watch service profiles.
In single-namespace mode, we no longer watch service profiles and return an informative error when the TopRoutes API is called.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
Fixes#1875
This change improves the `linkerd routes` command in a number of important ways:
* The restriction on the type of the `--to` argument is lifted and any resource type can now be used. Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.
```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
GET / webapp 100.00% 0.5rps 50ms 180ms 196ms
GET /authors/{id} webapp 100.00% 0.5rps 100ms 900ms 980ms
GET /books/{id} webapp 100.00% 0.9rps 38ms 93ms 99ms
POST /authors webapp 100.00% 0.5rps 35ms 48ms 50ms
POST /authors/{id}/delete webapp 100.00% 0.5rps 83ms 180ms 196ms
POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms
POST /books webapp 45.16% 2.1rps 75ms 425ms 485ms
POST /books/{id}/delete webapp 100.00% 0.5rps 30ms 90ms 98ms
POST /books/{id}/edit webapp 56.00% 0.8rps 92ms 875ms 975ms
[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms
```
This is all made possible by a shift in the way we handle the destination resource. When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource. We then fetch all service profiles for those services and display the routes from those serivce profiles.
This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.
Signed-off-by: Alex Leong <alex@buoyant.io>
This is a followup branch from #2023:
- delete `proxy/client.go`, move code to `destination-client`
- move `RenderTapEvent` and stat functions from `util` to `cmd`
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Commit 1: Enable lint check for comments
Part of #217. Follow up from #1982 and #2018.
A subsequent commit will fix the ci failure.
Commit 2: Address all comment-related linter errors.
This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
be removed
This PR does not:
- Modify, move, or remove any code
- Modify existing comments
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Add parameter to stats API to skip retrieving Prometheus stats
Used by the dashboard to populate list of resources.
Fixes#1022
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated
* Add test for SkipStats=true
Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats
* Fix previous test
Rename snake case fields to camel case in service profile spec. This improves the way they are rendered when the `kubectl describe` command is used.
Signed-off-by: Alex Leong <alex@buoyant.io>