Commit Graph

62 Commits

Author SHA1 Message Date
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00
Alex Leong 24792cfd1c
Remove core dependency on viz (#6497)
Fixes #5589 

The core control plane has a dependency on the viz package in order to use the `BuildResource` function.  This "backwards" dependency means that the viz source code needs to be included in core docker-builds and is bad for code hygiene.

We move the `BuildResource` function into the viz package.  In `cli/cmd/metrics.go` we replace a call to `BuildResource` with a call directly to `CanonicalResourceNameFromFriendlyName`.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-07-19 14:28:45 -07:00
Alejandro Pedraza 6980e45e1d
Remove the `linkerd-controller` pod (#6039)
* Remove the `linkerd-controller` pod

Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.

## Removals

- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put that under `/pkg/prometheus/test_helper.go`, which is now used by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.

## Health checks

- Removed the `can initialize the client` check given such client is no longer needed. The `linkerd-api` section was left with only the check `control pods are ready`, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.

## Other changes

- Fixed the Control Plane section of the dashboard, taking into account the disappearance of `linkerd-controller` and, previously, of `linkerd-sp-validator`.
2021-04-19 09:57:45 -05:00
Kevin Leimkuhler 75fcc9d623
Move tap from core into Viz extension (#5651)
Closes #5545.

This change moves all tap and tap-injector code into the viz directory. 

The tap and tap-injector components now also use a new tap image—separating
these components from the controller image that they are currently part of. This
means the controller image has removed all its build dependencies related to
tap.

Finally, the tap Protobuf has been separated from the metrics-api and moved into
its own `.proto` file and gen directory. This introduces a clear split between
metrics-api and tap Protobuf.

There is no change in behavior for the `viz tap` command.

### Reviewing

#### Docker images

All the bin directory scripts should be updated to build and load the tap image.
All the CI workflows should be updated to build and push the tap image.

#### Controller and pkg directories

This is primarily deletions. Most of the deleted code in this directory is now
in the tap directory of the Viz extension.

#### viz/tap

This is the location that all the tap related code now lives in. New files are
mostly moved from the controller and pkg directories. Imports have all been
updated to point at the right locations and Protobuf.

The Protobuf here is taken from metrics-api and contains all tap-related
Protobuf.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-09 12:43:21 -05:00
Alejandro Pedraza 8ac5360041
Extract from public-api all the Prometheus dependencies, and moves things into a new viz component 'linkerd-metrics-api' (#5560)
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used
2021-01-21 18:26:38 -05:00
Alejandro Pedraza f3b1ebfa99
Separate observability API (#5510)
* Separate observability API

Closes #5312

This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.

- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.proto`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.

The other changes in the go files are just changes in the imports to point to the new protobufs.

Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
2021-01-13 14:34:54 -05:00
Kevin Leimkuhler eff50936bf
Fix --all-namespaces flag handling (#5085)
## Motivations

Closes #5080

## Solution

When the `--all-namespaces` (`-A`) flag is set for the `linkerd edges` command,
ignore the `namespace` value set by default or `-n`.

This is similar to the behavior of `kubectl`: `kubectl get -A -n linkerd pods`
shows pods in all namespaces.
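
The flag precedence described above can be sketched in Go as follows. This is a minimal illustration, not the code from the PR: `effectiveNamespace` is a hypothetical helper, and the use of the empty string to mean "all namespaces" is an assumption borrowed from the Kubernetes convention.

```go
package main

import "fmt"

// effectiveNamespace returns the namespace to query (hypothetical helper).
// When allNamespaces is set, the namespace given by default or by -n is
// ignored and the empty string is returned, standing for "all namespaces".
func effectiveNamespace(namespace string, allNamespaces bool) string {
	if allNamespaces {
		return ""
	}
	return namespace
}

func main() {
	// -A together with -n linkerd: -A wins.
	fmt.Printf("%q\n", effectiveNamespace("linkerd", true))
	// No -A: the default (or -n) namespace is used.
	fmt.Printf("%q\n", effectiveNamespace("default", false))
}
```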

### Behavior changes

With linkerd and emojivoto installed, this results in:

Before:

```
❯ linkerd edges -A pods
No edges found.
```

After:

```
❯ linkerd edges -A pods
SRC                                   DST                                       SRC_NS      DST_NS      SECURED       
vote-bot-6cb9cb9569-wl6w5             web-5d69bcfdb7-mxf8f                      emojivoto   emojivoto   √  
web-5d69bcfdb7-mxf8f                  emoji-7dc976587b-rb9c5                    emojivoto   emojivoto   √  
web-5d69bcfdb7-mxf8f                  voting-bdf4f778c-pjkjg                    emojivoto   emojivoto   √  
linkerd-prometheus-68d6897d75-ghmgm   emoji-7dc976587b-rb9c5                    linkerd     emojivoto   √  
linkerd-prometheus-68d6897d75-ghmgm   vote-bot-6cb9cb9569-wl6w5                 linkerd     emojivoto   √  
linkerd-prometheus-68d6897d75-ghmgm   voting-bdf4f778c-pjkjg                    linkerd     emojivoto   √  
linkerd-prometheus-68d6897d75-ghmgm   web-5d69bcfdb7-mxf8f                      linkerd     emojivoto   √  
linkerd-controller-7d965cf78d-qw6xj   linkerd-prometheus-68d6897d75-ghmgm       linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-controller-7d965cf78d-qw6xj       linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-destination-74dbb9c46b-nkxgh      linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-grafana-5d9fb67dc6-sn2l8          linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-identity-c875b5d58-b756v          linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-proxy-injector-767b55988d-n9r6f   linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-sp-validator-6c8df84fb9-4w8kc     linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-tap-777fbf7656-p87dm              linkerd     linkerd     √  
linkerd-prometheus-68d6897d75-ghmgm   linkerd-web-546c9444b5-68xpx              linkerd     linkerd     √
```

`linkerd edges -A -n linkerd pods` results in all edges as well (the result
above).

The behavior of `linkerd edges pods` does not change and shows edges in the
`default` namespace.

```
❯ linkerd edges pods
No edges found.
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-10-16 16:49:10 -04:00
Josh Soref 72aadb540f
Spelling (#4872)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at aaf440489e (commitcomment-41423663)

The action reports that the changes in this PR would make it happy: 5b82c6c5ca

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-08-12 21:59:50 -07:00
Tharun Rajendran e24c323bf9
Gateway Metrics in Dashboard (#4717)
* Introduce multicluster gateway api handler in web api server
* Added MetricsUtil for Gateway metrics
* Added gateway api helper
* Added Gateway Component

Updated metricsTable component to support gateway metrics
Added handler for gateway

Fixes #4601

Signed-off-by: Tharun <rajendrantharun@live.com>
2020-07-27 12:43:54 -07:00
Zahari Dichev 73010149ce
Do not treat evicted pods as failed in healthchecks (#4732)
When a k8s pod is evicted its Phase is set to Failed and the reason is set to Evicted. Because in the ListPods method of the public API we only transmit the phase and treat it as Status, the healthchecks assume such evicted data plane pods to be failed. Since this check is retryable, the result is that linkerd check --proxy appears to hang when there are evicted pods. As @adleong correctly pointed out here, the presence of evicted pods is not something that should make the checks fail.

This change modifies the public API to set the Pod.Status to "Evicted" for evicted pods. The healthchecks are also modified to not treat evicted pods as error cases.
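
A rough Go sketch of the status mapping described above, under stated assumptions: the `pod` struct is a hypothetical stand-in for the relevant Kubernetes pod status fields, and `podStatus` and `isFailure` are illustrative names, not functions from the Linkerd code base.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for the phase and reason reported in a
// Kubernetes pod status.
type pod struct {
	Phase  string
	Reason string
}

// podStatus reports "Evicted" for evicted pods instead of the raw "Failed"
// phase; every other pod keeps its phase as its status.
func podStatus(p pod) string {
	if p.Phase == "Failed" && p.Reason == "Evicted" {
		return "Evicted"
	}
	return p.Phase
}

// isFailure is how a health check might decide whether a status is an error.
// Evicted pods are deliberately not counted as failures.
func isFailure(status string) bool {
	return status == "Failed"
}

func main() {
	evicted := pod{Phase: "Failed", Reason: "Evicted"}
	crashed := pod{Phase: "Failed", Reason: ""}
	fmt.Println(podStatus(evicted), isFailure(podStatus(evicted)))
	fmt.Println(podStatus(crashed), isFailure(podStatus(crashed)))
}
```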

Fix #4690

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-07-09 14:22:27 +03:00
Alejandro Pedraza aea541d6f9
Upgrade generated protobuf files to v1.4.2 (#4673)
Regenerated protobuf files, using version 1.4.2 that was upgraded from
1.3.2 with the proxy-api update in #4614.

As of v1.4 protobuf messages are disallowed to be copied (because they
hold a mutex), so whenever a message is passed to or returned from a
function we need to use a pointer.
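
To illustrate the pointer requirement, here is a small Go sketch. The `message` type is a hypothetical stand-in for a generated protobuf message, assuming only what the text says (that it holds a mutex); it is not generated code. Copying such a struct by value is what `go vet`'s `copylocks` check reports.

```go
package main

import (
	"fmt"
	"sync"
)

// message is a hypothetical stand-in for a generated protobuf message that
// holds a mutex and therefore must not be copied.
type message struct {
	mu   sync.Mutex
	Name string
}

// newMessage returns a pointer, so callers never copy the embedded mutex.
func newMessage(name string) *message {
	return &message{Name: name}
}

// rename takes the message by pointer; taking it by value would copy the
// mutex and the change would be lost to the caller.
func rename(m *message, name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.Name = name
}

func main() {
	m := newMessage("before")
	rename(m, "after")
	fmt.Println(m.Name)
}
```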

This affects _mostly_ test files.

This is required to unblock #4620 which is adding a field to the config
protobuf.
2020-06-26 09:36:48 -05:00
Mayank Shah 963b9b049a
Add kubectl-style label selectors (#4120)
* Update tap, routes and top commands to support label selectors

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-03-20 10:45:06 -05:00
Mayank Shah 3c3a4a5f5d
cli: Add label selector flag for `stat` (#4040)
* Update `linkerd-namespace` shorthand to `L`
* Add --selector (-l) flag for `stat`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-17 13:40:07 -05:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
2019-12-11 10:02:37 -08:00
Zahari Dichev e5f75a8c3d
Add validation to ensure stat time window is at least 15s (#3720)
* Add stat time window minimum of 10s

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-12-04 08:12:01 +02:00
Kevin Leimkuhler a3a240e0ef
Add TapEvent headers and trailers to the tap protobuf (#3410)
### Motivation

In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390

### Solution

The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.

These values are set by reading the corresponding fields off of the proxy's tap
events.

The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33

Closes #3262

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-29 09:54:37 -07:00
Kevin Leimkuhler c62c90870e
Add JSON output to tap command (#3434)
Replaces #3411 

### Motivation

It is a little tough to filter/read the current tap output. As headers are being
added to tap, the output is starting to get difficult to consume. Take a peek at
#3262 for an example. It would be nice to have some more machine readable output
that can be sliced and diced with tools such as jq.

### Solution

A new output option has been added to the `linkerd tap` command that returns the
JSON encoding of tap events.

The default output is line oriented; `-o wide` appends the request's target
resource type to the line-oriented tap events.

In order to display certain values in a more human readable form, a tap event
display struct has been introduced. This struct maps public API `TapEvent`s
directly to a private `tapEvent`. This struct offers a flatter JSON structure
than the protobuf JSON rendering. It can also format certain fields, such as
addresses, better than the JSON protobuf marshaler.
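
The display-struct idea can be sketched in Go roughly as below. All types here (`apiAddr`, `apiEvent`, `displayAddr`, `displayEvent`) and the `toDisplay` function are hypothetical simplifications, not the types from the PR; the point is only that a separate struct with JSON tags gives a flatter shape and lets addresses be rendered as readable strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// apiAddr and apiEvent are hypothetical stand-ins for the public API types,
// where an IP address is carried as raw bytes.
type apiAddr struct {
	IP   []byte
	Port uint32
}

type apiEvent struct {
	Source      apiAddr
	Destination apiAddr
	Method      string
}

// displayAddr and displayEvent are the private display types that are
// marshaled to JSON.
type displayAddr struct {
	IP   string `json:"ip"`
	Port uint32 `json:"port"`
}

type displayEvent struct {
	Source      displayAddr `json:"source"`
	Destination displayAddr `json:"destination"`
	Method      string      `json:"method"`
}

// toDisplay maps an API event to its display form, formatting the raw
// address bytes as dotted-quad strings.
func toDisplay(e apiEvent) displayEvent {
	return displayEvent{
		Source:      displayAddr{IP: net.IP(e.Source.IP).String(), Port: e.Source.Port},
		Destination: displayAddr{IP: net.IP(e.Destination.IP).String(), Port: e.Destination.Port},
		Method:      e.Method,
	}
}

func main() {
	e := apiEvent{
		Source:      apiAddr{IP: []byte{10, 1, 6, 146}, Port: 36976},
		Destination: apiAddr{IP: []byte{10, 1, 6, 148}, Port: 9994},
		Method:      "GET",
	}
	out, err := json.MarshalIndent(toDisplay(e), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```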

Closes #3390

**Default**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web
req id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics
rsp id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs
end id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B
```

**Wide**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide
req id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd
rsp id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd
end id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd
```

**JSON**:
*Edit: Flattened `Method` and `Scheme` formatting*
```
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "ip-masq-agent",
      "namespace": "kube-system",
      "pod": "ip-masq-agent-4d5s9",
      "serviceaccount": "ip-masq-agent",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "requestInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "method": "GET",
    "scheme": "",
    "authority": "10.60.1.49:9994",
    "path": "/ready"
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "calico-node",
      "namespace": "kube-system",
      "pod": "calico-node-bbrjq",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 644820
    },
    "httpStatus": 200
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "deployment": "calico-typha",
      "namespace": "kube-system",
      "pod": "calico-typha-59cb487c49-8247r",
      "pod_template_hash": "59cb487c49",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseEndEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 790898
    },
    "sinceResponseInit": {
      "nanos": 146078
    },
    "responseBytes": 3,
    "grpcStatusCode": 0
  }
}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-19 09:34:49 -07:00
Andrew Seigner f98bc27a38
Fix invalid `l5d-require-id` for some tap requests (#3210)
PR #3154 introduced an `l5d-require-id` header to Tap requests. That
header string was constructed based on the TapByResourceRequest, which
includes 3 notable fields (type, name, namespace). For namespace-level
requests (via commands like `linkerd tap ns linkerd`), type ==
`namespace`, name == `linkerd`, and namespace == "". This special casing
for namespace-level requests yielded invalid `l5d-require-id` headers,
for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`.

Fix `l5d-require-id` string generation to account for namespace-level
requests. The bulk of this change is tap unit test updates to validate
the fix.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 09:42:11 -07:00
Carol A. Scott a504e8c2d8
Expand and improve edges API endpoint (#3007)
Updates functionality of `linkerd edges`, including a new `--all-namespaces`
flag and returning namespace information for SRC and DST resources.
2019-06-28 15:46:04 -07:00
Carol A. Scott 042086142a
Adding an edges command to the CLI (#2808)
Adds an edges command to the CLI. `linkerd edges` displays connections between resources, and Linkerd proxy identities. Currently this feature will only display edges where both the client identity and server identity are known. The next step will be to display edges for which identity is not known and/or one-sided traffic such as Prometheus and tap requests.
2019-05-15 13:59:27 -07:00
Tarun Pothulapati 2184928813 Wire up stats for Jobs (#2416)
Support for Jobs in stat/tap/top cli commands

Part of #2007

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-03-01 17:16:54 -08:00
Andrew Seigner 9f748d2d2e
lint: Enable unparam (#2369)
unparam reports unused function parameters:
https://github.com/mvdan/unparam

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-27 10:34:02 -08:00
Andrew Seigner 25e462352d
lint: Enable goimports (#2366)
goimports checks import lines, adding missing ones and removing
unreferenced ones:
https://godoc.org/golang.org/x/tools/cmd/goimports

It also requires named imports for packages whose
import paths don't match their package names:
- https://github.com/golang/go/issues/28428
- https://go-review.googlesource.com/c/tools/+/145699/

Also standardized named imports of common Kubernetes packages.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 15:51:10 -08:00
Risha Mars 80b6e41d5d
Modify StatSummary to also return TCP stats (#2262)
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.

TCP stats are now displayed on the Resource Detail pages.

The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
2019-02-25 10:37:39 -08:00
Andrew Seigner 2305974202
Introduce golangci-lint tooling, fixes (#2239)
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.

This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.

Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-13 11:16:28 -08:00
Ivan Sim f6e75ec83a
Add statefulsets to the dashboard and CLI (#2234)
Fixes #1983

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-08 15:37:44 -08:00
Andrew Seigner 72812baf99
Introduce Discovery API and endpoints command (#2195)
The Proxy API service lacked introspection of its internal state.

Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server

Also wire up a new `linkerd endpoints` command.

Fixes #2165

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-07 14:02:21 -08:00
Risha Mars e531655d26
Add a --tap flag to the linkerd profile command (#2139)
Adds the ability to generate a service profile by running a tap for a configurable 
amount of time, and using the route results from the routes seen during the tap.

e.g. `linkerd profile web --tap deploy/web -n emojivoto --tap-duration 2s`
2019-02-06 12:43:16 -08:00
zak 8c413ca38b Wire up stats commands for daemonsets (#2006) (#2086)
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.

Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.

Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.

Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.

Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.

Update dependencies and dockerfile hashes

Add DaemonSet support to tap and top commands

Fixes #2006

Signed-off-by: Zak Knill <zrjknill@gmail.com>
2019-01-24 14:34:13 -08:00
Alex Leong a562f8b9fd
Improve routes command to list all routes (#2066)
Fixes #1875 

This change improves the `linkerd routes` command in a number of important ways:

* The restriction on the type of the `--to` argument is lifted and any resource type can now be used.  Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.

```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp   100.00%   0.5rps          50ms         180ms         196ms
GET /authors/{id}            webapp   100.00%   0.5rps         100ms         900ms         980ms
GET /books/{id}              webapp   100.00%   0.9rps          38ms          93ms          99ms
POST /authors                webapp   100.00%   0.5rps          35ms          48ms          50ms
POST /authors/{id}/delete    webapp   100.00%   0.5rps          83ms         180ms         196ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp    45.16%   2.1rps          75ms         425ms         485ms
POST /books/{id}/delete      webapp   100.00%   0.5rps          30ms          90ms          98ms
POST /books/{id}/edit        webapp    56.00%   0.8rps          92ms         875ms         975ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

This is all made possible by a shift in the way we handle the destination resource.  When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource.  We then fetch all service profiles for those services and display the routes from those service profiles.
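
The service lookup described above can be sketched as a label-selector match. This is an illustration only: the `service` and `pod` structs and the `servicesFor` function are hypothetical, and the real implementation queries the k8s API rather than in-memory slices.

```go
package main

import "fmt"

// service and pod are hypothetical, minimal stand-ins for the Kubernetes
// objects: a service selects pods by label.
type service struct {
	Name     string
	Selector map[string]string
}

type pod struct {
	Labels map[string]string
}

// selects reports whether every key/value in selector is present in labels.
// An empty selector is treated as selecting nothing.
func selects(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// servicesFor returns the names of the services that include at least one of
// the given pods, in the order the services were passed in.
func servicesFor(pods []pod, services []service) []string {
	var names []string
	for _, svc := range services {
		for _, p := range pods {
			if selects(svc.Selector, p.Labels) {
				names = append(names, svc.Name)
				break
			}
		}
	}
	return names
}

func main() {
	pods := []pod{{Labels: map[string]string{"app": "webapp"}}}
	services := []service{
		{Name: "webapp", Selector: map[string]string{"app": "webapp"}},
		{Name: "books", Selector: map[string]string{"app": "books"}},
	}
	fmt.Println(servicesFor(pods, services))
}
```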

This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-16 17:15:35 -08:00
Andrew Seigner a91c77d0bf
Followups from lint/comment changes (#2032)
This is a followup branch from #2023:
- delete `proxy/client.go`, move code to `destination-client`
- move `RenderTapEvent` and stat functions from `util` to `cmd`

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 15:28:09 -08:00
Andrew Seigner 1c302182ef
Enable lint check for comments (#2023)
Commit 1: Enable lint check for comments

Part of #217. Follow up from #1982 and #2018.

A subsequent commit will fix the ci failure.

Commit 2: Address all comment-related linter errors.

This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should move or
  be removed

This PR does not:
- Modify, move, or remove any code
- Modify existing comments

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 14:03:59 -08:00
Kevin Lingerfelt f1b0983f72
Add go linting to CI config (#2018)
* Add go linting to CI config
* Fix lint warnings
* Add note about bin/lint script in TEST.md

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-20 15:33:09 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Alejandro Pedraza 8c67bfbcc6 Add parameter to stats API to skip retrieving Prometheus stats (#1871)
* Add parameter to stats API to skip retrieving Prometheus stats

Used by the dashboard to populate list of resources.

Fixes #1022

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated

* Add test for SkipStats=true

Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats

* Fix previous test
2018-12-10 16:48:12 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.
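
A minimal Go sketch of the special-case mapping described above. `promLabel` is a hypothetical name; the assumption is simply that each resource type maps to a Prometheus label of the same name, except `job`.

```go
package main

import "fmt"

// promLabel returns the Prometheus label used for a Kubernetes resource type
// (hypothetical helper). "job" is mapped to "k8s_job" so that it does not
// collide with Prometheus' own "job" label.
func promLabel(resourceType string) string {
	if resourceType == "job" {
		return "k8s_job"
	}
	return resourceType
}

func main() {
	for _, t := range []string{"deployment", "job"} {
		fmt.Printf("%s -> %s\n", t, promLabel(t))
	}
}
```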

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Oliver Gould 926395f616
tap: Include route labels in tap events (#1902)
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
rt_ labels in tap output (when -o wide is used).
2018-12-03 13:52:47 -08:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method to the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC works, and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Alejandro Pedraza bbcf5a8c9f
Allow stat summary to query for multiple resources (#1841)
* Refactor util.BuildResource so it can deal with multiple resources

First step to address #1487: Allow stat summary to query for multiple
resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update the stat cli help text to explain the new multi resource querying ability

Proposal for #1487: Allow stat summary to query for multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow stat summary to query for multiple resources

Implement this ability by issuing parallel requests to requestStatsFromAPI()
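
One way the parallel fan-out could look, as a sketch only. `statRow` and `fetchAll` are hypothetical names, and the `fetch` callback stands in for `requestStatsFromAPI()`, whose real signature is not shown in this log.

```go
package main

import "sync"

// statRow is a hypothetical stand-in for one API response.
type statRow struct {
	Resource string
	Err      error
}

// fetchAll issues one request per resource concurrently and returns the
// results in the same order as the input slice.
func fetchAll(resources []string, fetch func(string) error) []statRow {
	results := make([]statRow, len(resources))
	var wg sync.WaitGroup
	for i, res := range resources {
		wg.Add(1)
		go func(i int, res string) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			results[i] = statRow{Resource: res, Err: fetch(res)}
		}(i, res)
	}
	// Block until every request has finished.
	wg.Wait()
	return results
}
```

Writing each result into a slot indexed by input position keeps the output order stable regardless of which request completes first.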

Proposal for #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update tests as part of multi-resource support in `linkerd stat` (#1487)

- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` called with multiple resources should keep an ordering (#1487)

Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Extra validations for `linkerd stat` with multiple resources (#1487)

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` resource grouping, ordering and name prefixing (#1487)

- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow `linkerd stat` to be called with multiple resources

A few final refactorings as per code review.

Fixes #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-11-14 10:44:04 -08:00
Eliza Weisman efabd90ff7
Fix missing ns/svc labels in metadata hydrated by Tap server (#1496)
Fixes #1493.

When the tap server hydrates metadata for the source or destination peer
of a Tap event from the peer's IP address, it doesn't currently add a
namespace label. However, destinations labeled by the proxy do have such
a label.

This is because the tap server currently gets the hydrated labels from
the `GetPodLabels` function, which is also used by the Destination
service for labeling the individual endpoints in a `WeightedAddrSet`
response. However, the Destination service also adds some labels to all
the endpoints in the set, including the namespace and service, so
`GetPodLabels` doesn't return these labels. However, when the tap server
uses that function, it does not add the service or namespace labels.

This branch fixes this issue by adding those labels to the Tap event 
after calling `GetPodLabels`. In addition, it fixes a missing space 
between the `src/dst_res` and `src/dst_ns` labels in Tap CLI output
with the `-o wide` flag set. This issue was introduced during the 
review of #1437, but was missed at the time because the namespace label
wasn't being set correctly.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-20 18:09:34 -07:00
Eliza Weisman b8434d60d4
Add resource metadata to Tap CLI output (#1437)
Closes #1170.

This branch adds a `-o wide` (or `--output wide`) flag to the Tap CLI.
Passing this flag adds `src_res` and `dst_res` elements to the Tap
output, as described in #1170. These use the metadata labels in the tap
event to describe what Kubernetes resource the source and destination
peers belong to, based on what resource type is being tapped, and fall
back to pods if either peer is not a member of the specified resource
type.

In addition, when the resource type is not `namespace`, `src_ns` and
`dst_ns` elements are added, which show what namespaces the source
and destination peers are in. For peers which are not in the Kubernetes
cluster, none of these labels are displayed.

The source metadata added in #1434 is used to populate the `src_res` and
`src_ns` fields.

Also, this branch includes some refactoring to how tap output is
formatted.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-20 14:25:26 -07:00
Eliza Weisman bf7fc12f5c
Add source metadata to Tap server tap events (#1434)
The `TapEvent` protobuf contains two maps, `DestinationMeta` and
`SourceMeta`. The `DestinationMeta` contains all the metadata provided
by the proxy that originated the event (ultimately originating from the
Destination service), while the `SourceMeta` currently only contains the
source connection's TLS status.

This branch modifies the Tap server to hydrate the same set of metadata
from the source IP address, when the source was within the cluster. It
does this by adding an indexer of pod IPs to pods to its k8s API client,
and looking up IPs against this index. If a pod was found, the extra
metadata is added to the tap event sent to the client.
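
The lookup described above can be sketched with a plain map. This is an illustration under stated assumptions, not the Tap server's code: the `pod` type, `buildPodIPIndex`, `sourceMeta`, and the label keys are all hypothetical, and the real change adds an indexer to the k8s API client rather than a hand-built map.

```go
package main

// pod is a hypothetical, trimmed-down view of a Kubernetes pod.
type pod struct {
	Name      string
	Namespace string
	IP        string
}

// podIPIndex maps a pod IP address to the pod that owns it.
type podIPIndex map[string]pod

// buildPodIPIndex indexes pods by IP, skipping pods with no IP assigned yet.
func buildPodIPIndex(pods []pod) podIPIndex {
	idx := make(podIPIndex, len(pods))
	for _, p := range pods {
		if p.IP == "" {
			continue
		}
		idx[p.IP] = p
	}
	return idx
}

// sourceMeta returns extra metadata for a source IP, or nil when the IP does
// not belong to a known pod (for example, a source outside the cluster).
// The label keys used here are assumptions.
func (idx podIPIndex) sourceMeta(ip string) map[string]string {
	p, ok := idx[ip]
	if !ok {
		return nil
	}
	return map[string]string{"pod": p.Name, "namespace": p.Namespace}
}
```

A nil result models the case where the source was not within the cluster, so no metadata is added to the event.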

This branch also changes the client so that if a source pod name was
provided in the metadata, it prints the pod name rather than the IP
address for the `src` field in its output. This mimics what is currently
done for the `dst` field in tap output. Furthermore, the added source
metadata will be necessary for adding src resource types to tap output
(see issue #1170).

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-13 13:25:14 -07:00
Risha Mars ec3c861743
Enable Tap from the Web UI (#1356)
Adds a tap endpoint in the web api that communicates with the dashboard 
via websockets.
I've moved a bunch of code from the cli tap.go into utils so that the code 
can be shared between web and CLI. I think we should consider making the 
display more suited to web, but in the short term, reusing the CLI's 
rendering of tap events works.

Adds a Tap page in the Web UI that you can use to make tap requests. 
The form currently only allows you to enter a resource and namespace, 
other filters coming in a follow-up branch.
2018-07-24 14:23:42 -04:00
Kevin Lingerfelt 4b9700933a
Update prometheus labels to match k8s resource names (#1355)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-07-23 15:45:05 -07:00
Kevin Lingerfelt e5cce1abaf
Rename CLI from conduit to linkerd (#1312)
* Rename CLI binary
* Update integration tests for new binary name
* Rename --conduit-namespace flag, change default ns
* Rename occurrences of conduit in rest of CLI
* Rename inject and install components
* Remove conduit occurrences in docker files
* Additional miscellaneous cleanup
* Move protobuf definitions to linkerd2 package
* Rename conduit.io labels to use linkerd.io
* Rename conduit-managed segment to linkerd-managed
* Fix conduit references in web project

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-07-12 17:14:07 -07:00
Oliver Gould 941cad4a9c
Migrate build infrastructure to linkerd2 (#1298)
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
  github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
  binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
  github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
2018-07-09 15:38:38 -07:00
Risha Mars 68586fe697
Add the ability to query stats by authority (#1181)
Adds the ability to query by a new non-kubernetes resource type, "authorities",
in the StatSummary api.

This includes an extensive refactor of stat_summary.go to deal with non-kubernetes 
resource types.

- Add documentation to Resource in the public api so we can use it for authority
- Handle non-k8s resource requests in the StatSummary endpoint
- Rewrite stat summary fetching and parsing to handle non-k8s resources
- Key stat summary metric handling by Resource instead of a generated string
- Adds authority to the CLI
- Adds /authorities to the Web UI
- Adds some more stat integration and unit tests
2018-06-28 14:31:44 -07:00
Risha Mars 0ff1bb4ad8
Don't allow stat requests for named resources in --all-namespaces (#1163)
Don't allow the CLI or Web UI to request named resources if --all-namespaces is used.

This follows kubectl, which also does not allow requesting named resources
over all namespaces.

This PR also updates the Web API's behaviour to be in line with the CLI's. 
Both will now default to the default namespace if no namespace is specified.
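
A sketch of the validation rule described above. The function name `resolveNamespace` and the error text are hypothetical; only the two behaviours (rejecting named resources with --all-namespaces, and defaulting to the default namespace) come from this commit message.

```go
package main

import "errors"

// resolveNamespace validates a stat request and returns the namespace to
// query. An empty return value means "all namespaces".
func resolveNamespace(resourceName, namespace string, allNamespaces bool) (string, error) {
	if allNamespaces {
		if resourceName != "" {
			// Mirrors kubectl: a named resource needs a single namespace.
			return "", errors.New("a named resource cannot be requested across all namespaces")
		}
		return "", nil
	}
	if namespace == "" {
		// No namespace given: fall back to the default namespace.
		return "default", nil
	}
	return namespace, nil
}
```

Applying the same function in both the CLI and the Web API would be one way to keep their behaviour in line.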
2018-06-20 12:59:31 -07:00
Andrew Seigner a0a9a42e23
Implement Public API and Tap on top of Lister (#835)
public-api and tap were both using their own implementations of
the Kubernetes Informer/Lister APIs.

This change factors out all Informer/Lister usage into the Lister
module. This also introduces a new `Lister.GetObjects` method.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-04-24 18:10:48 -07:00