Commit Graph

126 Commits

Author SHA1 Message Date
harsh jain 976bc40345 Fixes #2607: Remove TLS from stat (#2613)
Removes the TLS percentages from the stat command in the CLI.
2019-04-04 10:37:42 -07:00
Oliver Gould 91c5f07650
proxy: Upgrade to identity-capable proxy (#2524)
The new proxy has changed its configuration as follows:

- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
  just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.

Identity injection is **NOT** configured by this branch.
2019-03-19 14:20:39 -07:00
Kevin Leimkuhler 229e33e79e
cli: Always display stat tables for all routes (#2466)
## Problem

When an object has no previous route metrics, we do not generate a table for
that object.

The reasoning behind this was for reducing output of the following command:

```
$ linkerd routes deploy --to deploy/foo
```

For each deployment object, if it has no previous traffic to `deploy/foo`, then
a table would not be generated for it.

However, the behavior we see with that indicates there is an error even when a
Service Profile is installed:

```
$ linkerd routes deploy deploy/foo
Error: No Service Profiles found for selected resources
```

## Solution

Always generate a stat table for the queried resource object.

## Validation

I deployed [booksapp](https://github.com/buoyantIO/booksapp) with the `traffic`
deployment removed and Service Profiles installed.

Without the fix, `linkerd routes deploy/webapp` displays an error because there
has been no traffic to `deploy/webapp` without the `traffic` deployment.

With the fix, the following output is generated:

```
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /authors/{id}            webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /books/{id}              webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors                webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/delete    webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/delete      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/edit        webapp     0.00%   0.0rps           0ms           0ms           0ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

Closes #2328

Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
2019-03-11 14:17:20 -07:00
Andrew Seigner d4fdbe4991
Fix web init to not check for ServiceProfiles (#2470)
linkerd/linkerd2#2428 modified SelfSubjectAccessReview behavior to no
longer paper-over failed ServiceProfile checks, assuming that
ServiceProfiles will be required going forward. There was a lingering
ServiceProfile check in the web's startup that started failing due to
this change, as the web component does not have (and should not need)
ServiceProfile access. The check was originally implemented to inform
the web component whether to expect "single namespace" mode or
ServiceProfile support.

Modify the web's initialization to always expect ServiceProfile support.

Also remove single namespace integration test

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 15:20:46 -08:00
Alejandro Pedraza 0da851842b
Public API endpoint `Config()` (#2455)
Public API endpoint `Config()`

Retrieves Global and Proxy configurations.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-03-07 17:37:46 -05:00
Andrew Seigner 8da2cd3fd4
Require cluster-wide k8s API access (#2428)
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.

This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.

Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 10:23:18 -08:00
Andrew Seigner 206ff685e2
Bump Prometheus client to v0.9.2 (#2388)
We were depending on an untagged version of prometheus/client_golang
from Feb 2018.

This bumps our dependency to v0.9.2, from Dec 2018.

Also, this is a prerequisite to #1488.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-05 10:31:16 -08:00
Tarun Pothulapati 2184928813 Wire up stats for Jobs (#2416)
Support for Jobs in stat/tap/top cli commands

Part of #2007

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-03-01 17:16:54 -08:00
Andrew Seigner 9f748d2d2e
lint: Enable unparam (#2369)
unparam reports unused function parameters:
https://github.com/mvdan/unparam

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-27 10:34:02 -08:00
Andrew Seigner ec5a0ca8d9
Authorization-aware control-plane components (#2349)
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.

Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.

Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.

TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-26 11:54:52 -08:00
Andrew Seigner 25e462352d
lint: Enable goimports (#2366)
goimports checks import lines, adding missing ones and removing
unreferenced ones:
https://godoc.org/golang.org/x/tools/cmd/goimports

It also requires named imports for packages whose
import paths don't match their package names:
- https://github.com/golang/go/issues/28428
- https://go-review.googlesource.com/c/tools/+/145699/

Also standardized named imports of common Kubernetes packaages.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 15:51:10 -08:00
Andrew Seigner 35a0b652f2
lint: Enable goconst (#2365)
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 12:00:03 -08:00
Risha Mars 80b6e41d5d
Modify StatSummary to also return TCP stats (#2262)
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.

TCP stats are now displayed on the Resource Detail pages.

The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
2019-02-25 10:37:39 -08:00
Oliver Gould f7435800da
lint: Enable scopelint (#2364)
[scopelint][scopelint] detects a nasty reference-scoping issue in loops.

[scopelint]: https://github.com/kyoh86/scopelint
2019-02-24 08:59:51 -08:00
Andrew Seigner cc3ff70f29
Enable `unused` linter (#2357)
`unused` checks Go code for unused constants, variables, functions, and
types.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-23 11:05:39 -08:00
Kevin Lingerfelt 5384ca8c97
Add discovery package for managing discovery API (#2317)
* Add discovery package for managing discovery API
* Fix typo in destination server comment

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-18 16:38:04 -08:00
Oliver Gould 71ce786dd3
Rename linkerd-proxy-api to linkerd-destination (#2281)
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).

With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).

Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-15 15:11:04 -08:00
Andrew Seigner 2305974202
Introduce golangci-lint tooling, fixes (#2239)
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.

This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.

Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-13 11:16:28 -08:00
Ivan Sim f6e75ec83a
Add statefulsets to the dashboard and CLI (#2234)
Fixes #1983

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-08 15:37:44 -08:00
Alex Leong 5b054785e5
Read service profiles from client or server namespace instead of control namespace (#2200)
Fixes #2077 

When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace.  This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.

Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace.  If a service profile exists in both namespaces, the client namespace takes priority.  In this way, clients may override the behavior dictated by the service.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-07 14:51:43 -08:00
Andrew Seigner 72812baf99
Introduce Discovery API and endpoints command (#2195)
The Proxy API service lacked introspection of its internal state.

Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server

Also wire up a new `linkerd endpoints` command.

Fixes #2165

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-07 14:02:21 -08:00
Alena Varkockova 2691dda5ce Add possibility to filter by owner and label in ListPods (#2161)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-28 18:50:29 -08:00
zak 8c413ca38b Wire up stats commands for daemonsets (#2006) (#2086)
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.

Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.

Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.

Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.

Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.

Update dependencies and dockerfile hashes

Add DaemonSet support to tap and top commands

Fixes of #2006

Signed-off-by: Zak Knill <zrjknill@gmail.com>
2019-01-24 14:34:13 -08:00
Alex Leong 32efab41b5
Fix panic when routes is called in single-namespace mode (#2123)
Fixes #2119 

When Linkerd is installed in single-namespace mode, the public-api container panics when it attempts to access watch service profiles.

In single-namespace mode, we no longer watch service profiles and return an informative error when the TopRoutes API is called.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-23 16:47:05 -08:00
Alena Varkockova 28f662c9c6 Introduce resource selector and deprecate namespace field for ListPods (#2025)
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-23 10:35:55 -08:00
Alex Leong a562f8b9fd
Improve routes command to list all routes (#2066)
Fixes #1875 

This change improves the `linkerd routes` command in a number of important ways:

* The restriction on the type of the `--to` argument is lifted and any resource type can now be used.  Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.

```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp   100.00%   0.5rps          50ms         180ms         196ms
GET /authors/{id}            webapp   100.00%   0.5rps         100ms         900ms         980ms
GET /books/{id}              webapp   100.00%   0.9rps          38ms          93ms          99ms
POST /authors                webapp   100.00%   0.5rps          35ms          48ms          50ms
POST /authors/{id}/delete    webapp   100.00%   0.5rps          83ms         180ms         196ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp    45.16%   2.1rps          75ms         425ms         485ms
POST /books/{id}/delete      webapp   100.00%   0.5rps          30ms          90ms          98ms
POST /books/{id}/edit        webapp    56.00%   0.8rps          92ms         875ms         975ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

This is all made possible by a shift in the way we handle the destination resource.  When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource.  We then fetch all service profiles for those services and display the routes from those serivce profiles.  

This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-16 17:15:35 -08:00
Alex Leong 771542dde2
Add support for retries (#2038) 2019-01-16 14:13:48 -08:00
Andrew Seigner 1c302182ef
Enable lint check for comments (#2023)
Commit 1: Enable lint check for comments

Part of #217. Follow up from #1982 and #2018.

A subsequent commit will fix the ci failure.

Commit 2: Address all comment-related linter errors.

This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
  be removed

This PR does not:
- Modify, move, or remove any code
- Modify existing comments

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 14:03:59 -08:00
Kevin Lingerfelt f1b0983f72
Add go linting to CI config (#2018)
* Add go linting to CI config
* Fix lint warnings
* Add note about bin/lint script in TEST.md

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-20 15:33:09 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Alex Leong cb3fa1245b
Remove TLS column from routes command output (#1956)
Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-14 21:52:49 -08:00
Alejandro Pedraza 8c67bfbcc6 Add parameter to stats API to skip retrieving Prometheus stats (#1871)
* Add parameter to stats API to skip retrieving Prometheus stats

Used by the dashboard to populate list of resources.

Fixes #1022

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated

* Add test for SkipStats=true

Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats

* Fix previous test
2018-12-10 16:48:12 -08:00
Kevin Lingerfelt 0f8bcc9159
Controller: wait for caches to sync before opening listeners (#1958)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-07 11:15:45 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 73836f05cf
Update proxy version and use canonicalized dst (#1866)
The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag).  Inbound queries (i.e. ones without the `--from` flag) never return any metrics.

We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-26 17:20:07 -08:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method the the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Kevin Leimkuhler c68693e820
Fix stat filtering for `--from` queries (#1856)
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.

Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).

# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.

# Validation
Tests have been updated to reflect the expected expected destination labels now appended in `--from` queries.

Fixes #1766 

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2018-11-14 10:52:27 -08:00
Alejandro Pedraza 37bc8a69db Added support for json output in `linkerd stat` (#1749)
Added support for json output in `linkerd stat` through a new (-o|--output)=json option.

Fixes #1417

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-10-15 14:10:48 -07:00
Risha Mars 31a396b631
Fix incorrect test wording (#1767) 2018-10-15 12:07:06 -07:00
Alex Leong 1fe19bf3ce
Add ServiceProfile support to k8s utilities (#1758)
Updates to the Kubernetes utility code in `/controller/k8s` to support interacting with ServiceProfiles.

This makes use of the code generated client added in #1752 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-12 09:35:11 -07:00
Alena Varkockova 5a853e8990 Use ListPods always for data plane HC (#1701)
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-10-02 11:45:01 -07:00
Kevin Lingerfelt b5ff29c8aa
Add data plane check to validate proxy version (#1574)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-04 15:22:38 -07:00
Kevin Lingerfelt e97be1f5da
Move all healthcheck-related code to pkg/healthcheck (#1492)
* Move all healthcheck-related code to pkg/healthcheck
* Fix failed check formatting
* Better version check wording

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-20 16:50:22 -07:00
Kevin Lingerfelt 00a0572098
Better CLI error messages when control plane is unavailable (#1428)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-09 15:40:41 -07:00
Kevin Lingerfelt bd19e8aaff
Update prometheus to only scrape proxies in the same mesh (#1402)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-06 12:05:55 -07:00
Alex Leong 3e1f35913b
Read all bytes of message length header (#1394)
The `reader.Read` method only reads as many bytes as are currently available from reader.  When reading the 4 byte message length header, if not all 4 of those bytes are available, `Read` will only read the available bytes and return.  This causes alignment issues when the message body is read and there are still unread header bytes in the reader.  These bytes will appear at the beginning of the message body and cause a crash when the message is unmarshalled.

Use `io.ReadFull` to ensure that we read all 4 of the message length header bytes.

Fixes #1287 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-08-02 10:45:49 -07:00
Risha Mars fef896011f Add more filters to the web UI tap form (#1371)
* Update ant to 3.7.2
* Add autocomplete of namespaces/resources to Tap in web ui
  * Add form fields for authority/path/method/rps/scheme
  * Add the ability to clear error messages to the error banner
* Add error listener to ws object
2018-07-31 15:48:53 -07:00