Commit Graph

333 Commits

Author SHA1 Message Date
Zahari Dichev 6c3922a7f1
Probe manager simplification (#4510)
There are a few notable things happening in this PR: 

- the probe manager has been decoupled from the cluster_watcher. Now its only responsibility is to watch for mirrored gateways beeing created and to probe them. This means that probes are initiated for all gateways no matter whether there are mirrored services being paired
- the number of paired services is derived from the existing services in the cluster rather than being published as a metric by the prober
- there are no events being exchanged between the cluster watcher and the probe manager

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-01 14:41:29 -07:00
Zahari Dichev 4176580a0f
Threadsafe buffering listener (#4359)
* Add thread safety to watcher tests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 20:45:41 +03:00
Zahari Dichev ef1a2c2b10
Multicluster dashboard for traffic metrics (#4178)
This change adds labels to endpoints that target remote services. It also adds a Grafana dashboard that can be used to monitor multicluster traffic.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 17:48:27 +03:00
Alex Leong 8fbaa3ef9b
Don't send NoEndpoints during pod updates for ip watches (#4338)
When the proxy has an IP watch on a pod and the destination controller gets a pod update event, the destination controller sends a NoEndpoints message to all listeners followed by an Add with the new pod state.  This can result in the proxy's load balancer being briefly empty and could result in failing requests in the period.  

Since consecutive Add events with the same address will override each other, we can simply send the Adds without needing to clear the previous state with a NoEndpoints message.
2020-05-07 16:10:17 -07:00
Zahari Dichev 5149152ef3
Multicluster gateway and remote setup command (#4265)
Add multicluster gateway and setup command

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-29 20:33:23 +03:00
Zahari Dichev 17dacf5548
Add gateways command, allowing the retrieval of gateway stats (#4241)
Add gateways command, allowing the retrieval of gateway stats

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-27 13:55:01 +03:00
Zahari Dichev 26c14d3c66
Detect changes in addresses when getting updates in endpoints watcher (#4104)
Detect changes in addresses when getting updates in endpoints watcher

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-10 11:42:39 +03:00
Alex Leong d8eebee4f7
Upgrade to client-go 0.17.4 and smi-sdk-go 0.3.0 (#4221)
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0.  Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.

This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4.  This keeps all of our transitive dependencies synchronized on one version of client-go.

This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code.  I took this opportunity to update our code generation script to properly use the version of code-generater from `go.mod` rather than a hardcoded SHA.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-01 10:07:23 -07:00
Zahari Dichev 10ecd8889e
Set auth override (#4160)
Set AuthOverride when present on endpoints annotation

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-25 10:56:36 +02:00
Mayank Shah 963b9b049a
Add kubectl-style label selectors (#4120)
* Update tap, routes and top commands to support label selectors

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-03-20 10:45:06 -05:00
Zahari Dichev 2db307ee91
Remove target port requirement in port resolution (#4174)
This change removes the target port requirement when resolving ports in the dst service. Based on the comments, it seems that we need to have a target port defined in the port spec in order to resolve to the port in the Endpoints. In reality if target port is note defined when creating the service, k8s will set the port and the target port to the same value. Seems to me that checking for the targetPort to be different than 0, is a no-op.

Signed-off-by: Zahari Dichev zaharidichev@gmail.com
2020-03-16 23:04:08 +02:00
Zahari Dichev caf4e61daf
Enable identitiy on endpoints not associated with pods (#4134)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-09 20:55:57 +02:00
Zahari Dichev edd7fd203d
Service Mirroring Component (#4028)
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-02 21:16:08 +02:00
Mayank Shah 3c3a4a5f5d
cli: Add label selector flag for `stat` (#4040)
* Update `linkerd-namespace` shorthand to `L`
* Add --selector (-l) flag for `stat`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-17 13:40:07 -05:00
Zahari Dichev 6fa9407318
Ensure we get the correct type out of Informer Deletion events (#4034)
Ensure we get what we expect when receiving DELETE events from the k8s Informer api

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:15:24 +02:00
Alex Leong ec51434eb9
Show traffic split metrics from sources in all namespaces (#3967)
Fixes #3562 

When a pod in one namespace sends traffic to a service which is the apex of a traffic split in another namespace, that traffic is not displayed in the `linkerd stat trafficsplit` output.  This is because when we do a Prometheus query for traffic to the traffic split, we supply a Prometheus label selector to only select traffic sources in the namespace of the traffic split.

Since any pod in any namespace can send traffic to the apex service of a traffic split, we must look at all possible sources of traffic, not just the ones in the same namespace.

Before:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m         -     -             -             -             -
webapp-split   webapp   webapp-2     100m         -     -             -             -             -
```

After:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m    80.00%   1.4rps          31ms          99ms        2530ms
webapp-split   webapp   webapp-2     100m    60.00%   0.2rps          35ms          93ms          99ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-12 09:21:59 -08:00
Alejandro Pedraza 3ba66f6f9d
Fix flakey TestGetProfiles (#3965)
Fixes #3332

Fixes the very rare test failure
```
--- FAIL: TestGetProfiles (0.33s)
    --- FAIL: TestGetProfiles/Returns_server_profile (0.11s)
            server_test.go:228: Expected 1 or 2 updates but got 3:
            [retry_budget:<retry_ratio:0.2 min_retries_per_second:10
            ttl:<seconds:10 > >  routes:<condition:<path:<regex:"/a/b/c"
            > > metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > >
            routes:<condition:<path:<regex:"/a/b/c" > >
            metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > > ]
            FAIL
            FAIL  github.com/linkerd/linkerd2/controller/api/destination
            0.624s
```
that occurs when a third unexpected stream update occurs, when the fake
API takes more time to notify its listeners about the resources created.

For all the nasty details check #3332
2020-02-07 19:43:29 -05:00
Alejandro Pedraza afb93cddc8
Use `t.Name()` instead of `t.Name` in tests (#3970)
Use `t.Name()` instead of `t.Name` when retrieving the name of tests.
This was causing an error to be added in the log:
```
output: logrus_error="can not add field \"test\"
```

Followup to
[comment](https://github.com/linkerd/linkerd2/pull/3965#discussion_r370387990)
2020-01-27 09:17:19 -05:00
Paul Balogh dabee12b93 Fix issue for debug containers when using custom Docker registry (#3873)
**Subject**
Fixes bug where override of Docker registry was not being applied to debug containers (#3851)

**Problem**
Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image.

**Solution**
This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images.

This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option.

**Validation**
Several new unit tests have been created to confirm functionality.  In addition, the following workflows were run through:

### Standard Workflow with Custom Registry
This workflow installs Linkerd control plane based upon a custom registry, then injecting the debug sidecar into a service.

* Start with a k8s instance having no Linkerd installation
* Build all images locally using `bin/docker-build`
* Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6`
* Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section
* Request injection of the debug image into an available container.  I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Custom Registry Override at Injection
This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Standard Workflow with Default Registry
This workflow is the typical workflow which utilizes the standard Linkerd image registry.

* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/
* Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -`
* Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section
* Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Default Registry at Injection
This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

Fixes issue #3851 

Signed-off-by: Paul Balogh javaducky@gmail.com
2020-01-17 10:18:03 -08:00
Alex Leong 93a81dce97
Change default proxy log level to "warn,linkerd=info" (#3908)
Fixes #3901 

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-09 14:22:06 -08:00
Paul Balogh 2cd2ecfa30 Enable mixed configuration of skip-[inbound|outbound]-ports (#3766)
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)

This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.

* Bump versions for proxy-init to v1.3.0

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2019-12-20 09:32:13 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating a `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip.   `IPWatcher` maintains a mapping up clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Dax McDonald 3088f404ce Upgrade prometheus to v1.2.1 (#3541)
Signed-off-by: Dax McDonald <dax@rancher.com>
2019-12-11 15:26:16 -08:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
2019-12-11 10:02:37 -08:00
Zahari Dichev e5f75a8c3d
Add validation to ensure stat time window is at least 15s (#3720)
* Add stat time window minimum of 10s

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-12-04 08:12:01 +02:00
Alex Leong 0026103362 Unit and integration test fixups (#3730)
- Added cleanup step at the end of all integration tests.
- Disable external_issuer_integration_tests in cloud_tests due to
  namespace issue. Running this via `kind` tests is sufficient for now.
- Set a flakey test to `Skip`, relates to #3332.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-11-15 03:40:42 -08:00
Tarun Pothulapati f3deee01b6 Trace Control plane Components with OC (#3495)
* add trace flags and initialisation
* add ocgrpc handler to newgrpc
* add ochttp handler to linkerd web
* add flags to linkerd web
* add ochttp handler to prometheus handler initialisation
* add ochttp clients for components
* add span for prometheus query
* update godep sha
* fix reviews
* better commenting
* add err checking
* remove sampling
* add check in main
* move to pkg/trace

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-10-18 12:19:13 -07:00
Alex Leong 3dcff52b9f
Switch from using golangci fmt to using goimports (#3555)
CI currently enforcing formatting rules by using the fmt linter of golang-ci-lint which is invoked from the bin/lint script.  However it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting.  This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint depending on which formatter you use and which specific revision of that formatter you use.  

In this change we stop using golang-ci-lint for format checking.  We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files.  This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project.  We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.

Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor director as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-16 13:56:11 -07:00
Johannes Hansen f880e71fcd The linkerd proxy does not work with headless services (#3470)
* The linkerd proxy does not work with headless services (i.e. endpoints not referencing a pod).

Changed endpoints_watcher to also return endpoints with no targetref.

Fixes #3308

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>

* Fix panic in endpoint_translator

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>
2019-10-15 14:56:41 -07:00
Alejandro Pedraza 3de35ccc58
Remove Discovery service leftovers (#3500)
Followup to #2990, which refactored `linkerd endpoints` to use the
`Destination.Get` API instead of the `Discovery.Endpoints` API, leaving
the Discovery with no implented methods. This PR removes all the Discovery
code leftovers.

Fixes #3499
2019-10-15 11:20:21 -05:00
Kevin Leimkuhler a3a240e0ef
Add TapEvent headers and trailers to the tap protobuf (#3410)
### Motivation

In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390

### Solution

The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.

These values are set by reading the corresponding fields off of the proxy's tap
events.

The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33

Closes #3262

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-29 09:54:37 -07:00
Alex Leong 4799baa8e2
Revert "Trace Control Plane components using OC (#3461)" (#3484)
This reverts commit edd3b1f6d4.

This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy.  We will reintroduce this change once the config plan is straightened out.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-09-26 11:56:44 -07:00
Tarun Pothulapati edd3b1f6d4 Trace Control Plane components using OC (#3461)
* add exporter config for all components

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add cmd flags wrt tracing

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp tracing to web server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add flags to the tap deployment

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add trace flags to install and upgrade command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add linkerd prefix to svc names

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp trasport to API Internal Client

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix goimport linting errors

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp handler to tap http server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* review and fix tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use common template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use Initialize

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix sample flag

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add verbose info reg flags

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-26 08:11:48 -07:00
Kevin Leimkuhler c62c90870e
Add JSON output to tap command (#3434)
Replaces #3411 

### Motivation

It is a little tough to filter/read the current tap output. As headers are being
added to tap, the output is starting to get difficult to consume. Take a peek at
#3262 for an example. It would be nice to have some more machine readable output
that can be sliced and diced with tools such as jq.

### Solution

A new output option has been added to the `linkerd tap` command that returns the
JSON encoding of tap events.

The default output is line oriented; `-o wide` appends the request's target
resource type to the tap line oriented tap events.

In order display certain values in a more human readable form, a tap event
display struct has been introduced. This struct maps public API `TapEvent`s
directly to a private `tapEvent`. This struct offers a flatter JSON structure
than the protobuf JSON rendering. It also can format certain field--such as
addresses--better than the JSON protobuf marshaler.

Closes #3390

**Default**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web
req id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics
rsp id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs
end id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B
```

**Wide**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide
req id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd
rsp id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd
end id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd
```

**JSON**:
*Edit: Flattened `Method` and `Scheme` formatting*
```
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "ip-masq-agent",
      "namespace": "kube-system",
      "pod": "ip-masq-agent-4d5s9",
      "serviceaccount": "ip-masq-agent",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "requestInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "method": "GET",
    "scheme": "",
    "authority": "10.60.1.49:9994",
    "path": "/ready"
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "calico-node",
      "namespace": "kube-system",
      "pod": "calico-node-bbrjq",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 644820
    },
    "httpStatus": 200
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "deployment": "calico-typha",
      "namespace": "kube-system",
      "pod": "calico-typha-59cb487c49-8247r",
      "pod_template_hash": "59cb487c49",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseEndEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 790898
    },
    "sinceResponseInit": {
      "nanos": 146078
    },
    "responseBytes": 3,
    "grpcStatusCode": 0
  }
}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-19 09:34:49 -07:00
Bruno M. Custódio 8fec756395 Add '--address' flag to 'linkerd dashboard'. (#3274)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-05 10:56:10 -07:00
Alejandro Pedraza acbab93ca8
Add support for k8s 1.16 (#3364)
Fixes #3356

1.16 removes some api groups that were already deprecated. From k8s blog
post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):

```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
v1.16.
    Migrate to the policy/v1beta1 API, available since v1.10. Existing
    persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
    Migrate to the apps/v1 API, available since v1.9. Existing persisted
    data can be retrieved/updated via the apps/v1 API.
```

Previous PRs had already made this change at the Helm templates level,
but we still needed to do it at the API calls and tests.

The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-04 09:59:55 -05:00
陈谭军 e281fb3410 fix-up grammar (#3351)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-08-30 08:09:36 -07:00
Alejandro Pedraza fd248d3755
Undo refactoring from #3316 (#3331)
Thus fixing `linkerd edges` and the dashboard topology graph

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 13:37:54 -05:00
Alejandro Pedraza 5d7499dc84
Avoid the dashboard requesting stats when not needed (#3338)
* Avoid the dashboard requesting stats when not needed

Create an alternative to `urlsForResource` called
`urlsForResourceNoStats` that makes use of the `skip_stats` parameter in
the stats API (created in #1871) that doesn't query Prometheus when not needed.

When testing using the dashboard looking at the linkerd namespace,
queries per second went down from 2874 to 2756, a 4% decrease.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 05:52:44 -05:00
arminbuerkle 5c38f38a02 Allow custom cluster domains in remaining backends (#3278)
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetchting cluster domain errors separately
* Use custom cluster domain for traffic split adaptor

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-27 10:01:36 -07:00
Alejandro Pedraza 02efb46e45
Have the proxy-injector emit events upon injection/skipping injection (#3316)
* Have the proxy-injector emit events upon injection/skipping injection

Fixes #3253

Have the proxy-injector emit an event whenever a injection happens, or
when injection is skipped for some reason (also added that reason into
the proxy-injector logs). The level is associated to the parent workload
(it can't be associated to the pod because at this point the pod hasn't
been persisted).

The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.

Related changes:

- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-26 13:34:36 -05:00
Carol A. Scott 089836842a
Add unit test for edges API endpoint (#3306)
Fixes #3052.

Adds a unit test for the edges API endpoint. To maintain a consistent order for
testing, the returned rows in api/public/edges.go are now sorted.
2019-08-23 09:28:02 -07:00
Guangming Wang 70d85d2065 Cleanup: fix some typos in code comment (#3296)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-21 09:40:43 -07:00
Oliver Gould ee79d5d324
destination: Reorganize authority-parsing (#3244)
In preparation for #3242, the destination controller will need to
support a broader set of valid authorities including IP addresses.

This change modifies the destination controller's authority-parsing code
so that the is-this-a-kubernete-service-name decision is decoupled from
parsing of authorities into their consituent parts.

The `Get` API now explicitly handles IP address names, though it
currently fails all such resolutions.
2019-08-21 07:19:42 -07:00
Carol A. Scott bc8fef7ba9 Sorting the expected response for trafficsplit rows so it is always in consistent row order (#3280) 2019-08-19 10:10:26 -07:00
Carol A. Scott 9c62b65c6a
Adding trafficsplit test to stat_summary_test.go (#3252)
This PR adds a test for trafficsplits to stat_summary_test.go.

Because the test requires a consistent order for returned rows, trafficsplit
rows in stat_summary.go are now sorted by apex + leaf name before being
returned.
2019-08-14 14:48:46 -07:00
Kevin Leimkuhler cc3c53fa73
Remove tap from public API and associated test infrastructure (#3240)
### Summary

After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastrucure related to tap can be removed.

This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf fils were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files.

This draft passes `go test` as well as a local run of the integration tests.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-14 17:27:37 -04:00
Carol A. Scott 00437709eb
Add trafficsplit metrics to CLI (#3176)
This PR adds `trafficsplit` as a supported resource for the `linkerd stat` command. Users can type `linkerd stat ts` to see the apex and leaf services of their trafficsplits, as well as metrics for those leaf services.
2019-08-14 10:30:57 -07:00
Andrew Seigner f98bc27a38
Fix invalid `l5d-require-id` for some tap requests (#3210)
PR #3154 introduced an `l5d-require-id` header to Tap requests. That
header string was constructed based on the TapByResourceRequest, which
includes 3 notable fields (type, name, namespace). For namespace-level
requests (via commands like `linkerd tap ns linkerd`), type ==
`namespace`, name == `linkerd`, and namespace == "". This special casing
for namespace-level requests yielded invalid `l5d-require-id` headers,
for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`.

Fix `l5d-require-id` string generation to account for namespace-level
requests. The bulk of this change is tap unit test updates to validate
the fix.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 09:42:11 -07:00
Andrew Seigner a59c1dd32d
Introduce tap APIService, update `linkerd tap` (#3167)
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.

This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.

This change also modifies the `linkerd tap` command to make requests
against the new APIService.

The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET  /apis
GET  /apis/tap.linkerd.io
GET  /apis/tap.linkerd.io/v1alpha1
GET  /healthz
GET  /healthz/log
GET  /healthz/ping
GET  /metrics
GET  /openapi/v2
GET  /version

Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.

This change introduces the following resources into the default Linkerd
install:
- Global
  - APIService/v1alpha1.tap.linkerd.io
  - ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
  - Secret/linkerd-tap-tls
- `kube-system` namespace:
  - RoleBinding/linkerd-linkerd-tap-auth-reader

Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller

Fixes #2725, #3162, #3172

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-01 14:02:45 -07:00
Alex Leong ab7226cbcd
Return invalid argument for external name services (#3120)
Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498

When the Linkerd proxy sends a query for a Kubernetes external name service to the destination service, the destination service returns `NoEndpoints: exists=false` because an external name service has no endpoints resource.  Due to a change in the proxy's fallback logic, this no longer causes the proxy to fallback to either DNS or SO_ORIG_DST and instead fails the request.  The net effect is that Linkerd fails all requests to external name services.

We change the destination service to instead return `InvalidArgument` for external name services.  This causes the proxy to fallback to SO_ORIG_DST instead of failing the request.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-29 16:31:22 -07:00
Andrew Seigner 51b33ad53c
Fix nil pointer dereference in endpoints watcher (#3147)
The destination service's endpoints watcher assumed every `Endpoints`
object contained a `TargetRef`. This field is optional, and in cases
such as the default `ep/kubernetes` object, `TargetRef` is nil, causing
a nil pointer dereference.

Fix endpoints watcher to check for `TargetRef` prior to dereferencing.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-25 17:11:56 -07:00
Alex Leong 3c4a0e4381
Make authorities in destination overrides absolute (#3137)
Fixes #3136 

When the destination service sends a destination profile with a traffic split to the proxy, the override destination authorities are absolute but do no contain a trailing dot.  e.g. "bar.ns.svc.cluster.local:80".  However, NameAddrs which have undergone canonicalization in the proxy will include the trailing dot.  When a traffic split includes the apex service as one of the overrides, the original apex NameAddr will have the trailing dot and the override will not.  Since these two NameAddrs are not identical, they will go into two distinct slots in the proxy's concrete dst router.  This will cause two services to be created for the same destination which will cause the stats clobbering described in the linked issue.

We change the destination service to always return absolute dst overrides including the trailing dot.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 17:08:40 -07:00
Alex Leong e538a05ce2
Add support for stateful sets (#3113)
We add support for looking up individual pods in a stateful set with the destination service.  This allows Linkerd to correctly proxy requests which address individual pods.  The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.

Fixes #2266 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 14:09:46 -07:00
Andrew Seigner 64ed8e4a74
Introduce Cluster Heartbeat cronjob (#3056)
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify what
versions of Linkerd are currently in use and what environments it is
being run in, which helps prioritize testing and backports.

Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.

Example check URL:
https://versioncheck.linkerd.io/version.json?
  install-time=1562761177&
  k8s-version=v1.15.0&
  meshed-pods=8&
  rps=3&
  source=heartbeat&
  uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
  version=git-b97ee9f7

Fixes #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 17:12:30 -07:00
Alex Leong d6ef9ea460
Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078)
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller.  Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.

To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2.  This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource.  If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can.  This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.  

Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation).  In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-23 11:46:31 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong c8b34a8cab
Add pod status to linkerd check (#3065)
When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff).

We add pod status to the output to make this more immediately obvious.

Fixes #2877 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-18 15:56:19 -07:00
Carol A. Scott ee1a111993
Updating CLI output for `linkerd edges` (#3048)
This PR improves the CLI output for `linkerd edges` to reflect the latest API
changes. 

Source and destination namespaces for each edge are now shown by default. The
`MSG` column has been replaced with `Secured` and contains a green checkmark or
the reason for no identity. A new `-o wide` flag shows the identity of client
and server if known.
2019-07-17 12:23:34 -07:00
Jonathan Juares Beber 2dcbde08b3 Show pod status more clearly (#1967) (#2989)
During operations with `linkerd stat` sometimes it's not clear the actual
pod status.

This commit introduces a method, to the `k8s`package, getting the pod status,
based on [`kubectl` logic](33a3e325f7/pkg/printers/internalversion/printers.go (L558-L640))
to expose the `STATUS` column for pods . Also, it changes the stat command
on the` cli` package adding a column when the resource type is a Pod.

Fixes #1967

Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
2019-07-10 12:44:44 -07:00
Jonathan Juares Beber e2211f5f77 Introduces owner references verification for pods (#3027)
When getting pods for specific kubernetes resources, the usage of just
labels, as a selector, generates wrong outputs. Once, two resources can use
the same label selector and manage distinct pods, a new mechanism to check
pods for a given resource it's needed. More details on #2932.

This commit introduces a verification through the pod owner references
`UID`s, comparing with the given resource's. Additional logic is needed
when handling `Deployments` since it creates a `ReplicaSet` and this last
one is the actual pod's owner. No verification is done in case of
`Services`.

Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
2019-07-10 12:44:24 -07:00
Alex Leong 92ddffa3c2
Add prometheus metrics for watchers (#3022)
To give better visibility into the inner workings of the kubernetes watchers in the destination service, we add some prometheus metrics.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-08 11:50:26 -07:00
Alejandro Pedraza 53e589890d
Have `linkerd endpoints` use `Destination.Get` (#2990)
* Have `linkerd endpoints` use `Destination.Get`

Fixes #2885

We're refactoring `linkerd endpoints` so it hits
directly the `Destination.Get` endpoint, instead of relying on the
Discovery service.

For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.

I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.

Analogously, I added a `destinationServer` struct that mimics
`tapServer`.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-07-03 09:11:03 -05:00
Carol A. Scott de635d3fcf
Allow `edges` to handle requests from multiple namespaces to one resource (#3025)
This PR fixes a bug in the edges command where if src_resources from two
different namespaces sent requests to the same dst_resource, the original
src_identity was overwritten.
2019-07-02 12:31:15 -07:00
Carol A. Scott a504e8c2d8
Expand and improve edges API endpoint (#3007)
Updates functionality of `linkerd edges`, including a new `--all-namespaces`
flag and returning namespace information for SRC and DST resources.
2019-06-28 15:46:04 -07:00
Alex Leong 27373a8b78
Add traffic splitting to destination profiles (#2931)
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).

We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to.  A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-28 13:19:47 -07:00
Alejandro Pedraza 73740fb503
Simplify port-forwarding code (#2976)
* Simplify port-forwarding code

Simplifies the establishment of a port-forwarding by moving the common
logic into `PortForward.Init()`

Stemmed from this
[comment](https://github.com/linkerd/linkerd2/pull/2937#discussion_r295078800)

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-26 11:14:57 -05:00
Alejandro Pedraza 8988a5723f
Have `GetOwnerKindAndName` be able to skip the cache (#2972)
* Have `GetOwnerKindAndName` be able to skip the cache

Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset got just
created and might not be in ready in the cache yet.

Fixes #2738

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-20 12:58:15 -05:00
Alex Leong 06a69f69c5
Refactor destination service (#2786)
This is a major refactor of the destination service.  The goals of this refactor are to simplify the code for improved maintainability.  In particular:

* Remove the "resolver" interfaces.  These were a holdover from when our decision tree was more complex about how to handle different kinds of authorities.  The current implementation only accepts fully qualified kubernetes service names and thus this was an unnecessary level of indirection.
* Moved the endpoints and profile watchers into their own package for a more clear separation of concerns.  These watchers deal only in Kubernetes primitives and are agnostic to how they are used.  This allows a cleaner layering when we use them from our gRPC service.
* Renamed the "listener" types to "translator" to make it more clear that the function of these structs is to translate kubernetes updates from the watcher to gRPC messages.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-04 15:01:16 -07:00
Carol A. Scott 8c496e3d0d
Adding unit test for CLI edges command (#2837)
Adds a unit test for the `linkerd edges` command.
2019-05-28 13:51:45 -07:00
Carol A. Scott bb2921a3d9
Verify in Prometheus edges query that data for a specific resource type exists (#2826)
Adds a check to Prometheus `edges` queries to verify that data for the requested
resource type exists. Previously, if Prometheus could not find request data for the
requested resource type, it would skip that label and still return data for
other labels in the `by` clause, leading to an incorrect response.
2019-05-15 16:03:48 -07:00
Carol A. Scott 042086142a
Adding an edges command to the CLI (#2808)
Adds an edges command to the CLI. `linkerd edges` displays connections between resources, and Linkerd proxy identities. Currently this feature will only display edges where both the client identity and server identity are known. The next step will be to display edges for which identity is not known and/or one-sided traffic such as Prometheus and tap requests.
2019-05-15 13:59:27 -07:00
Carol A. Scott 87e69bf885
Adding edges endpoint to public API (#2793)
This change adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies. This PR only includes changes to the controller and protobuf.
2019-05-09 09:30:11 -07:00
Jack Price f758a9e428 Use port-forwarding for linkerd CLIs (#2757)
Private k8s clusters, such as the private GKE clusters offered by Google
Cloud, cannot be reached through the current API proxy method.

This commit uses the port forwarding feature already developed.

Also modify dashboard command to not fall back to ephemeral port.

Signed-off-by: Jack Price <jackprice@outlook.com>
2019-05-02 14:41:26 +02:00
harsh jain 976bc40345 Fixes #2607: Remove TLS from stat (#2613)
Removes the TLS percentages from the stat command in the CLI.
2019-04-04 10:37:42 -07:00
Oliver Gould da0330743f
Provide peer Identities via the Destination API (#2537)
This change reintroduces identity hinting to the destination service.
The Get endpoint includes identities for pods that are injected with an
identity-mode of "default" and have the same linkerd control plane.

A `serviceaccount` label is now also added to destination response
metadata so that it's accessible in prometheus and tap.
2019-03-22 09:19:14 -07:00
Oliver Gould 91c5f07650
proxy: Upgrade to identity-capable proxy (#2524)
The new proxy has changed its configuration as follows:

- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
  just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.

Identity injection is **NOT** configured by this branch.
2019-03-19 14:20:39 -07:00
Oliver Gould 790c13b3b2
Introduce the Identity controller implementation (#2521)
This change introduces a new Identity service implementation for the
`io.linkerd.proxy.identity.Identity` gRPC service.

The `pkg/identity` contains a core, abstract implementation of the service
(generic over both the CA and (Kubernetes) Validator interfaces).

`controller/identity` includes a concrete implementation that uses the
Kubernetes TokenReview API to validate serviceaccount tokens when
issuing certificates.

This change does **NOT** alter installation or runtime to include the
identity service. This will be included in a follow-up.
2019-03-19 13:58:45 -07:00
Oliver Gould 81f645da66
Remove `--tls=optional` and `linkerd-ca` (#2515)
The proxy's TLS implementation has changed to use a new _Identity_ controller.

In preparation for this, the `--tls=optional` CLI flag has been removed
from install and inject; and the `ca` controller has been deleted. Metrics
and UI treatments for TLS have **not** been removed, as they will continue to
be valuable for the new Identity system.

With the removal of the old identity scheme, the Destination service's proxy
ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable
locality awareness.
2019-03-18 17:40:31 -07:00
Andrew Seigner e5d2460792
Remove single namespace functionality (#2474)
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.

This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.

A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
  combining some check categories, as we can always assume cluster-wide
  requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
  make a decision based on cluster-wide vs. namespace-wide access.
  Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.

Reverts #1721
Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-12 00:17:22 -07:00
Kevin Leimkuhler 229e33e79e
cli: Always display stat tables for all routes (#2466)
## Problem

When an object has no previous route metrics, we do not generate a table for
that object.

The reasoning behind this was for reducing output of the following command:

```
$ linkerd routes deploy --to deploy/foo
```

For each deployment object, if it has no previous traffic to `deploy/foo`, then
a table would not be generated for it.

However, the behavior we see with that indicates there is an error even when a
Service Profile is installed:

```
$ linkerd routes deploy deploy/foo
Error: No Service Profiles found for selected resources
```

## Solution

Always generate a stat table for the queried resource object.

## Validation

I deployed [booksapp](https://github.com/buoyantIO/booksapp) with the `traffic`
deployment removed and Service Profiles installed.

Without the fix, `linkerd routes deploy/webapp` displays an error because there
has been no traffic to `deploy/webapp` without the `traffic` deployment.

With the fix, the following output is generated:

```
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /authors/{id}            webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /books/{id}              webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors                webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/delete    webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/delete      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/edit        webapp     0.00%   0.0rps           0ms           0ms           0ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

Closes #2328

Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
2019-03-11 14:17:20 -07:00
Andrew Seigner d4fdbe4991
Fix web init to not check for ServiceProfiles (#2470)
linkerd/linkerd2#2428 modified SelfSubjectAccessReview behavior to no
longer paper-over failed ServiceProfile checks, assuming that
ServiceProfiles will be required going forward. There was a lingering
ServiceProfile check in the web's startup that started failing due to
this change, as the web component does not have (and should not need)
ServiceProfile access. The check was originally implemented to inform
the web component whether to expect "single namespace" mode or
ServiceProfile support.

Modify the web's initialization to always expect ServiceProfile support.

Also remove single namespace integration test

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 15:20:46 -08:00
Alejandro Pedraza 0da851842b
Public API endpoint `Config()` (#2455)
Public API endpoint `Config()`

Retrieves Global and Proxy configurations.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-03-07 17:37:46 -05:00
Andrew Seigner 8da2cd3fd4
Require cluster-wide k8s API access (#2428)
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.

This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.

Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 10:23:18 -08:00
Andrew Seigner 206ff685e2
Bump Prometheus client to v0.9.2 (#2388)
We were depending on an untagged version of prometheus/client_golang
from Feb 2018.

This bumps our dependency to v0.9.2, from Dec 2018.

Also, this is a prerequisite to #1488.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-05 10:31:16 -08:00
Tarun Pothulapati 2184928813 Wire up stats for Jobs (#2416)
Support for Jobs in stat/tap/top cli commands

Part of #2007

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-03-01 17:16:54 -08:00
Oliver Gould ab90263461
destination: Only return TLS identities when appropriate (#2371)
As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.

This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.

The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.

Fixes #2217
2019-02-27 12:18:39 -08:00
Andrew Seigner 9f748d2d2e
lint: Enable unparam (#2369)
unparam reports unused function parameters:
https://github.com/mvdan/unparam

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-27 10:34:02 -08:00
Oliver Gould c3f9ff8e32
Consolidate endpointListener.Update with logging (#2389)
Previously, the update-handling logic was spread across several very
small functions that were only called within this file. I've
consolidated this logic into endpointListener.Update so that all of the
debug logging can be instrumented in one place without having to iterate
over lists multiple times.

Also, I've fixed the formatting of IP addresses in some places.

Logs now look as follows:

    msg="Establishing watch on endpoint linkerd-prometheus.linkerd:9090" component=endpoints-watcher
    msg="Subscribing linkerd-prometheus.linkerd:9090 exists=true" component=service-port id=linkerd-prometheus.linkerd target-port=admin-http
    msg="Update: add=1; remove=0" component=endpoint-listener namespace=linkerd service=linkerd-prometheus
    msg="Update: add: addr=10.1.1.160; pod=linkerd-prometheus-7bbc899687-nd9zt; addr:<ip:<ipv4:167838112 > port:9090 > weight:1 metric_labels:<key:\"control_plane_ns\" value:\"linkerd\" > metric_labels:<key:\"deployment\" value:\"linkerd-prometheus\" > metric_labels:<key:\"pod\" value:\"linkerd-prometheus-7bbc899687-nd9zt\" > metric_labels:<key:\"pod_template_hash\" value:\"7bbc899687\" > protocol_hint:<h2:<> > " component=endpoint-listener namespace=linkerd service=linkerd-prometheus
2019-02-26 15:05:23 -08:00
Andrew Seigner ec5a0ca8d9
Authorization-aware control-plane components (#2349)
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.

Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.

Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.

TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-26 11:54:52 -08:00
Andrew Seigner 25e462352d
lint: Enable goimports (#2366)
goimports checks import lines, adding missing ones and removing
unreferenced ones:
https://godoc.org/golang.org/x/tools/cmd/goimports

It also requires named imports for packages whose
import paths don't match their package names:
- https://github.com/golang/go/issues/28428
- https://go-review.googlesource.com/c/tools/+/145699/

Also standardized named imports of common Kubernetes packaages.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 15:51:10 -08:00
Andrew Seigner 35a0b652f2
lint: Enable goconst (#2365)
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 12:00:03 -08:00
Risha Mars 80b6e41d5d
Modify StatSummary to also return TCP stats (#2262)
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.

TCP stats are now displayed on the Resource Detail pages.

The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
2019-02-25 10:37:39 -08:00
Oliver Gould f7435800da
lint: Enable scopelint (#2364)
[scopelint][scopelint] detects a nasty reference-scoping issue in loops.

[scopelint]: https://github.com/kyoh86/scopelint
2019-02-24 08:59:51 -08:00
Andrew Seigner cc3ff70f29
Enable `unused` linter (#2357)
`unused` checks Go code for unused constants, variables, functions, and
types.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-23 11:05:39 -08:00
Kevin Lingerfelt 5384ca8c97
Add discovery package for managing discovery API (#2317)
* Add discovery package for managing discovery API
* Fix typo in destination server comment

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-18 16:38:04 -08:00
Oliver Gould 71ce786dd3
Rename linkerd-proxy-api to linkerd-destination (#2281)
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).

With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).

Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-15 15:11:04 -08:00
Kevin Leimkuhler b2bbeb05ef
Issue 2276: Do not log error when timeout is blank (#2279)
# Problem

When a route does not specify a timeout, the proxy-api defaults to the default
timeout and logs an error:

```
time="2019-02-13T16:29:12Z" level=error msg="failed to parse duration for route POST /io.linkerd.proxy.destination.Destination/GetProfile: time: invalid duration"
```

# Solution

We now check if a route timeout is blank. If it is not set, it is set to
`DefaultRouteTimeout`. If it is set, we try to parse it into a `Duration`.

A request was made to improve logging to include the service profile and
namespace as well.

# Validation

With valid service profiles installed, edit the `.yaml` to include an invalid
`timeout`:

```
...
name: GET /
timeout: foo
```

We should now see the following errors:

```
proxy-api time="2019-02-13T22:27:32Z" level=error msg="failed to parse duration for route 'GET /' in service profile 'webapp.default.svc.cluster.local' in namespace 'default': time: invalid duration foo"
```

This error does not show up when `timeout` is blank.

Fixes #2276

Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
2019-02-14 17:09:02 -08:00
Andrew Seigner 2305974202
Introduce golangci-lint tooling, fixes (#2239)
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.

This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.

Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-13 11:16:28 -08:00
Ivan Sim f6e75ec83a
Add statefulsets to the dashboard and CLI (#2234)
Fixes #1983

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-08 15:37:44 -08:00
Alex Leong 030767d615
Refactor fallback profile listener to avoid repetition (#2228)
Refactor fallback profile listener to avoid repetition

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-08 14:24:10 -08:00
Alex Leong 5b054785e5
Read service profiles from client or server namespace instead of control namespace (#2200)
Fixes #2077 

When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace.  This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.

Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace.  If a service profile exists in both namespaces, the client namespace takes priority.  In this way, clients may override the behavior dictated by the service.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-07 14:51:43 -08:00
Andrew Seigner 72812baf99
Introduce Discovery API and endpoints command (#2195)
The Proxy API service lacked introspection of its internal state.

Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server

Also wire up a new `linkerd endpoints` command.

Fixes #2165

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-07 14:02:21 -08:00
Risha Mars e531655d26
Add a --tap flag to the linkerd profile command (#2139)
Adds the ability to generate a service profile by running a tap for a configurable 
amount of time, and using the route results from the routes seen during the tap.

e.g. `linkerd profile web --tap deploy/web -n emojivoto --tap-duration 2s`
2019-02-06 12:43:16 -08:00
Ye Ben f2ba17d366 fix some typos (#2194)
Signed-off-by: yeya24 <ben.ye@daocloud.io>
2019-02-02 23:03:54 -08:00
Alex Leong 3bd4231cec
Add support for timeouts in service profiles (#2149)
Fixes #2042 

Adds a new field to service profile routes called `timeout`.  Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead.  If the timeout field is not specified, a default timeout of 10 seconds is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-30 16:48:55 -08:00
Alena Varkockova 2691dda5ce Add possibility to filter by owner and label in ListPods (#2161)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-28 18:50:29 -08:00
Alex Leong d542571b65
GetProfiles should always respond with a (possibly empty) profile immediately (#2146)
When `GetProfiles` is called for a destination that does not have a service profile, the proxy-api service does not return any messages until a service profile is created for that service.  This can be interpreted as hanging, and can make it difficult to calculate response latency metrics.

Change the behavior of the API to always return a service profile message immediately.  If the service does not have a service profile, the default service profile is returned.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-24 15:22:14 -08:00
zak 8c413ca38b Wire up stats commands for daemonsets (#2006) (#2086)
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.

Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.

Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.

Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.

Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.

Update dependencies and dockerfile hashes

Add DaemonSet support to tap and top commands

Fixes of #2006

Signed-off-by: Zak Knill <zrjknill@gmail.com>
2019-01-24 14:34:13 -08:00
Alex Leong 32efab41b5
Fix panic when routes is called in single-namespace mode (#2123)
Fixes #2119 

When Linkerd is installed in single-namespace mode, the public-api container panics when it attempts to access watch service profiles.

In single-namespace mode, we no longer watch service profiles and return an informative error when the TopRoutes API is called.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-23 16:47:05 -08:00
Alena Varkockova 28f662c9c6 Introduce resource selector and deprecate namespace field for ListPods (#2025)
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-23 10:35:55 -08:00
Alex Leong a562f8b9fd
Improve routes command to list all routes (#2066)
Fixes #1875 

This change improves the `linkerd routes` command in a number of important ways:

* The restriction on the type of the `--to` argument is lifted and any resource type can now be used.  Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.

```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp   100.00%   0.5rps          50ms         180ms         196ms
GET /authors/{id}            webapp   100.00%   0.5rps         100ms         900ms         980ms
GET /books/{id}              webapp   100.00%   0.9rps          38ms          93ms          99ms
POST /authors                webapp   100.00%   0.5rps          35ms          48ms          50ms
POST /authors/{id}/delete    webapp   100.00%   0.5rps          83ms         180ms         196ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp    45.16%   2.1rps          75ms         425ms         485ms
POST /books/{id}/delete      webapp   100.00%   0.5rps          30ms          90ms          98ms
POST /books/{id}/edit        webapp    56.00%   0.8rps          92ms         875ms         975ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

This is all made possible by a shift in the way we handle the destination resource.  When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource.  We then fetch all service profiles for those services and display the routes from those serivce profiles.  

This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-16 17:15:35 -08:00
Alex Leong 771542dde2
Add support for retries (#2038) 2019-01-16 14:13:48 -08:00
Andrew Seigner a91c77d0bf
Followups from lint/comment changes (#2032)
This is a followup branch from #2023:
- delete `proxy/client.go`, move code to `destination-client`
- move `RenderTapEvent` and stat functions from `util` to `cmd`

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 15:28:09 -08:00
Andrew Seigner 1c302182ef
Enable lint check for comments (#2023)
Commit 1: Enable lint check for comments

Part of #217. Follow up from #1982 and #2018.

A subsequent commit will fix the ci failure.

Commit 2: Address all comment-related linter errors.

This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
  be removed

This PR does not:
- Modify, move, or remove any code
- Modify existing comments

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 14:03:59 -08:00
Kevin Lingerfelt f1b0983f72
Add go linting to CI config (#2018)
* Add go linting to CI config
* Fix lint warnings
* Add note about bin/lint script in TEST.md

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-20 15:33:09 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Alex Leong cb3fa1245b
Remove TLS column from routes command output (#1956)
Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-14 21:52:49 -08:00
Kevin Lingerfelt 86e95b7ad3
Disable serivce profiles in single-namespace mode (#1980)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 14:37:18 -08:00
Kevin Lingerfelt 00de48bd26
Fix proxy-api handling of named target ports (#1973)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-12 13:42:47 -08:00
Alejandro Pedraza 8c67bfbcc6 Add parameter to stats API to skip retrieving Prometheus stats (#1871)
* Add parameter to stats API to skip retrieving Prometheus stats

Used by the dashboard to populate list of resources.

Fixes #1022

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated

* Add test for SkipStats=true

Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats

* Fix previous test
2018-12-10 16:48:12 -08:00
Kevin Lingerfelt 0f8bcc9159
Controller: wait for caches to sync before opening listeners (#1958)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-07 11:15:45 -08:00
Alex Leong 04ed200e36
Rename path_regex to pathRegex (#1951)
Rename snake case fields to camel case in service profile spec.  This improves the way they are rendered when the `kubectl describe` command is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-06 11:51:33 -08:00
Alex Leong cbb196066f
Support service profiles for external authorities (#1928)
Add support for service profiles created on external (non-service) authorities.  For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`.

This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to lookup a service profile for any authority.  We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem.

We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup, regardless of if the authority looks like a Kubernetes service name or not.  To simplify this, support for multiple resolves (which was unused) was removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 14:32:59 -08:00
Oliver Gould 8f9bb711dd
proxy-api: Expose a flag to control auto-h2-upgrade (#1925)
When debugging issues, it's helpful to disable HTTP/2 upgrading to
simplify diagnostics.

This chagne adds an `enable-h2-ugprade` flag to _proxy-api_. When this
flag is set to false, the proxy-api will not suggest that meshed
endpoints are upgraded to use HTTP/2.

As a follow-up, a flag should be added to `install` to control how the
proxy-api is initialized.
2018-12-05 12:41:20 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Alex Leong 4f3e55e937
Rename path to path_regex in ServiceProfile CRD (#1923)
We rename path to path_regex in the ServiceProfile CRD to make it clear that this field accepts a regular expression. We also take this opportunity to remove unnecessary line anchors from regular expressions now that these anchors are added in the proxy.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 10:42:47 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Oliver Gould 926395f616
tap: Include route labels in tap events (#1902)
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
rt_ labels in tap output (when -o wide is used).
2018-12-03 13:52:47 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 73836f05cf
Update proxy version and use canonicalized dst (#1866)
The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag).  Inbound queries (i.e. ones without the `--from` flag) never return any metrics.

We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-26 17:20:07 -08:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method the the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Kevin Leimkuhler c68693e820
Fix stat filtering for `--from` queries (#1856)
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.

Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).

# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.

# Validation
Tests have been updated to reflect the expected expected destination labels now appended in `--from` queries.

Fixes #1766 

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2018-11-14 10:52:27 -08:00
Alejandro Pedraza bbcf5a8c9f Allow stat summary to query for multiple resources (#1841)
* Refactor util.BuildResource so it can deal with multiple resources

First step to address #1487: Allow stat summary to query for multiple
resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update the stat cli help text to explain the new multi resource querying ability

Propsal for #1487: Allow stat summary to query for multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow stat summary to query for multiple resources

Implement this ability by issuing parallel requests to requestStatsFromAPI()

Proposal for #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update tests as part of multi-resource support in `linkerd stat` (#1487)

- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` called with multiple resources should keep an ordering (#1487)

Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Extra validations for `linkerd stat` with multiple resources (#1487)

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` resource grouping, ordering and name prefixing (#1487)

- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow `linkerd stat` to be called with multiple resources

A few final refactorings as per code review.

Fixes #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-11-14 10:44:04 -08:00
Alex Leong 32d556e732
Improve ergonomics of service profile spec (#1828)
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.

* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition.  Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-31 12:00:22 -07:00
Alex Leong d8b5ebaa6d
Remove the proxy-api container (#1813)
A container called `proxy-api` runs in the Linkerd2 controller pod.  This container listens on port 8086 and serves the proxy-api but does nothing other than forward gRPC requests to the destination container which listens on port 8089.

We remove the proxy-api container altogether and change the destination container to listen on port 8086 instead of 8089.  The result is that clients still use the proxy-api by connecting to `proxy-api.<ns>.svc.cluster.local:8086` but the controller has one fewer containers.  This results in a simpler system that is easier to reason about.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 16:31:43 -07:00
Oliver Gould 0e91dbb18d
Implement GetProfile for the proxy-api service (#1801)
The `proxy-api` service included a stub implementation of `GetProfile`
instead of forwarding requests to the `destination` service.

This change fills in the proxy-api service's `GetProfile` implementation
to forward requests to the destination service.
2018-10-24 12:37:29 -07:00
Alejandro Pedraza 37bc8a69db Added support for json output in `linkerd stat` (#1749)
Added support for json output in `linkerd stat` through a new (-o|--output)=json option.

Fixes #1417

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-10-15 14:10:48 -07:00
Risha Mars 31a396b631
Fix incorrect test wording (#1767) 2018-10-15 12:07:06 -07:00
Alex Leong 1fe19bf3ce
Add ServiceProfile support to k8s utilities (#1758)
Updates to the Kubernetes utility code in `/controller/k8s` to support interacting with ServiceProfiles.

This makes use of the code generated client added in #1752 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-12 09:35:11 -07:00
Alena Varkockova 5a853e8990 Use ListPods always for data plane HC (#1701)
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-10-02 11:45:01 -07:00
Andrew Seigner c5a719da47
Modify inject to warn when file is un-injectable (#1603)
If an input file is un-injectable, existing inject behavior is to simply
output a copy of the input.

Introduce a report, printed to stderr, that communicates the end state
of the inject command. Currently this includes checking for hostNetwork
and unsupported resources.

Malformed YAML documents will continue to cause no YAML output, and return
error code 1.

This change also modifies integration tests to handle stdout and stderr separately.

example outputs...

some pods injected, none with host networking:

```
hostNetwork: pods do not use host networking...............................[ok]
supported: at least one resource injected..................................[ok]

Summary: 4 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
  deploy/vote-bot
```

some pods injected, one host networking:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true"
supported: at least one resource injected..................................[ok]

Summary: 3 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
```

no pods injected:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true"
supported: at least one resource injected..................................[warn] -- no supported objects found

Summary: 0 of 8 YAML document(s) injected
```

TODO: check for UDP and other init containers

Part of #1516

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-10 10:34:25 -07:00
Kevin Lingerfelt f884caf56d
Upgrade protobuf to v1.2.0 (#1591)
* Upgrade protobuf to v1.2.0
* Fix Gopkg.lock
* Switch linkerd2-proxy-api dep back to stable

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-06 11:36:29 -07:00
Kevin Lingerfelt b5ff29c8aa
Add data plane check to validate proxy version (#1574)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-04 15:22:38 -07:00
Eliza Weisman efabd90ff7
Fix missing ns/svc labels in metadata hydrated by Tap server (#1496)
Fixes #1493.

When the tap server hydrates metadata for the source or destination peer
of a Tap event from the peer's IP address, it doesn't currently add a
namespace label. However, destinations labeled by the proxy do have such
a label.

This is because the tap server currently gets the hydrated labels from
the `GetPodLabels` function, which is also used by the Destination
service for labeling the individual endpoints in a `WeightedAddrSet`
response. However, the Destination service also adds some labels to all
the endpoints in the set, including the namespace and service, so
`GetPodLabels` doesn't return these labels. However, when the tap server
uses that function, it does not add the service or namespace labels.

This branch fixes this issue by adding those labels to the Tap event 
after calling `GetPodLabels`. In addition, it fixes a missing space 
between the `src/dst_res` and `src/dst_ns` labels in Tap CLI output
with the `-o wide` flag set. This issue was introduced during the 
review of #1437, but was missed at the time because the namespace label
wasn't being set correctly.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-20 18:09:34 -07:00
Kevin Lingerfelt e97be1f5da
Move all healthcheck-related code to pkg/healthcheck (#1492)
* Move all healthcheck-related code to pkg/healthcheck
* Fix failed check formatting
* Better version check wording

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-20 16:50:22 -07:00
Eliza Weisman b8434d60d4
Add resource metadata to Tap CLI output (#1437)
Closes #1170.

This branch adds a `-o wide` (or `--output wide`) flag to the Tap CLI.
Passing this flag adds `src_res` and `dst_res` elements to the Tap
output, as described in #1170. These use the metadata labels in the tap
event to describe what Kubernetes resource the source and destination
peers belong to, based on what resource type is being tapped, and fall
back to pods if either peer is not a member of the specified resource
type.

In addition, when the resource type is not `namespace`, `src_ns` and
`dst_ns` elements are added, which show what namespaces the the source
and destination peers are in. For peers which are not in the Kubernetes
cluster, none of these labels are displayed.

The source metadata added in #1434 is used to populate the `src_res` and
`src_ns` fields.

Also, this branch includes some refactoring to how tap output is
formatted.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-20 14:25:26 -07:00
Eliza Weisman bf7fc12f5c
Add source metadata to Tap server tap events (#1434)
The `TapEvent` protobuf contains two maps, `DestinationMeta` and
`SourceMeta`. The `DestinationMeta` contains all the metadata provided
by the proxy that originated the event (ultimately originating from the
Destination service), while the `SourceMeta` currently only contains the
source connection's TLS status.

This branch modifies the Tap server to hydrate the same set of metadata
from the source IP address, when the source was within the cluster. It
does this by adding an indexer of pod IPs to pods to its k8s API client,
and looking up IPs against this index. If a pod was found, the extra
metadata is added to the tap event sent to the client.

This branch also changes the client so that if a source pod name was
provided in the metadata, it prints the pod name rather than the IP
address for the `src` field in its output. This mimics what is currently
done for the `dst` field in tap output. Furthermore, the added source
metadata will be necessary for adding src resource types to tap output
(see issue #1170).

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-08-13 13:25:14 -07:00
Kevin Lingerfelt 00a0572098
Better CLI error messages when control plane is unavailable (#1428)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-09 15:40:41 -07:00