Commit Graph

545 Commits

Author SHA1 Message Date
Paul Balogh b5e39bcbf7 Utilize Common Name or Subject Alternate Name for access checks (#3459) (#3949)
Subject
Utilize Common Name or Subject Alternate Name for access checks (#3459)

Problem
When access restrictions to API server have been enabled with the requestheader-allowed-names configuration, only the Common Name of the requestor certificate is being checked. This check should include the use of Subject Alternate Name attributes.

Solution
API server will now check the SAN attributes (DNS Names, Email Addresses, IP Addresses, and URIs) when determining accessibility for allowed names.

Fixes issue #3459

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2020-01-22 08:58:19 +02:00
Paul Balogh dabee12b93 Fix issue for debug containers when using custom Docker registry (#3873)
**Subject**
Fixes bug where override of Docker registry was not being applied to debug containers (#3851)

**Problem**
Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image.

**Solution**
This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images.

This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option.

**Validation**
Several new unit tests have been created to confirm functionality.  In addition, the following workflows were run through:

### Standard Workflow with Custom Registry
This workflow installs Linkerd control plane based upon a custom registry, then injecting the debug sidecar into a service.

* Start with a k8s instance having no Linkerd installation
* Build all images locally using `bin/docker-build`
* Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6`
* Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section
* Request injection of the debug image into an available container.  I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Custom Registry Override at Injection
This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Standard Workflow with Default Registry
This workflow is the typical workflow which utilizes the standard Linkerd image registry.

* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/
* Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -`
* Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section
* Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Default Registry at Injection
This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

Fixes issue #3851 

Signed-off-by: Paul Balogh javaducky@gmail.com
2020-01-17 10:18:03 -08:00
Mayank Shah b94e03a8a6 Remove empty fields from generated configs (#3886)
Fixes
- https://github.com/linkerd/linkerd2/issues/2962
- https://github.com/linkerd/linkerd2/issues/2545

### Problem
Field omissions for workload objects are not respected while marshaling to JSON.

### Solution
After digging a bit into the code, I came to realize that while marshaling, workload objects have empty structs as values for various fields which would rather be omitted. As of now, the standard library`encoding/json` does not support zero values of structs with the `omitemty` tag. The relevant issue can be found [here](https://github.com/golang/go/issues/11939). To tackle this problem, the object declaration should have _pointer-to-struct_ as a field type instead of _struct_ itself. However, this approach would be out of scope as the workload object declaration is handled by the k8s library.

I was able to find a drop-in replacement for the `encoding/json` library which supports zero value of structs with the `omitempty` tag. It can be found [here](https://github.com/clarketm/json). I have made use of this library to implement a simple filter like functionality to remove empty tags once a YAML with empty tags is generated, hence leaving the previously existing methods unaffected

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-01-13 10:02:24 -08:00
Alex Leong 93a81dce97
Change default proxy log level to "warn,linkerd=info" (#3908)
Fixes #3901 

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-09 14:22:06 -08:00
Paul Balogh 2cd2ecfa30 Enable mixed configuration of skip-[inbound|outbound]-ports (#3766)
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)

This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.

* Bump versions for proxy-init to v1.3.0

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2019-12-20 09:32:13 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating a `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip.   `IPWatcher` maintains a mapping up clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Sergio C. Arteaga a1141fc507 Cache StatSummary responses in dashboard web server (#3769)
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-17 09:15:00 -05:00
Dax McDonald 3088f404ce Upgrade prometheus to v1.2.1 (#3541)
Signed-off-by: Dax McDonald <dax@rancher.com>
2019-12-11 15:26:16 -08:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
2019-12-11 10:02:37 -08:00
Zahari Dichev e5f75a8c3d
Add validation to ensure stat time window is at least 15s (#3720)
* Add stat time window minimum of 10s

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-12-04 08:12:01 +02:00
Alejandro Pedraza cf9fa0a8c9
Removed calico logutils dependency, incompatible with go 1.13 (#3763)
* Removed calico logutils dependency, incompatible with go 1.13

Fixes #1153

Removed dependency on
`github.com/projectcalico/libcalico-go/lib/logutils` because it has
problems with go modules, as described in
projectcalico/libcalico-go#1153

Not a big deal since it was only used for modifying the plugin's log
format.
2019-11-29 09:19:11 -05:00
Zahari Dichev b83c3a2137
Add Responses to path items to satisfy kube apiserver (#3700)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-15 20:41:50 +02:00
Alex Leong 0026103362 Unit and integration test fixups (#3730)
- Added cleanup step at the end of all integration tests.
- Disable external_issuer_integration_tests in cloud_tests due to
  namespace issue. Running this via `kind` tests is sufficient for now.
- Set a flakey test to `Skip`, relates to #3332.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-11-15 03:40:42 -08:00
Alejandro Pedraza 4b6254b52e
Replaced `uuid` with `uid` from linkerd-config resource (#3694)
* Replaced `uuid` with `uid` from linkerd-config resource

Fixes #3621

Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.

I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.

Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` Config.

I also added an integration test to verify the stability of the uid.
2019-11-13 13:56:01 -05:00
Alejandro Pedraza 3324966702
Upgrade go to 1.13.4 (#3702)
Fixes #3566

As explained in #3566, as of go 1.13 there's a strict check that ensures a dependency's timestamp matches it's sha (as declared in go.mod). Our smi-sdk dependency has a problem with that that got resolved later on, but more work would be required to upgrade that dependency. In the meantime a quick pair of replace statements at the bottom of go.mod fix the issue.
2019-11-13 12:54:36 -05:00
Tarun Pothulapati f18e27b115 use appsv1 api in identity (#3682)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-11-06 15:06:09 -08:00
Alejandro Pedraza 0e8958cd07
Fixed bad identity string for target pod in tap (#3675)
* Fixed bad identity string for target pod in tap

Fixes #3506

Was using the cluster domain instead of the trust domain, which results
in an error when those domains differ.
2019-11-05 15:57:41 -05:00
Alejandro Pedraza 8cf4494e78
Add proxy-injector-injections count to heartbeat (#3655)
Fixes #3059
2019-10-31 11:09:00 -05:00
Alejandro Pedraza d3d8266c63
If tap source IP matches many running pods then only show the IP (#3513)
* If tap source IP matches many running pods then only show the IP

When an unmeshed source ip matched more than one running pod, tap was
showing the names for all those pods, even though the didn't necessary
originate the connection. This could be reproduced when using pod
network add-on such as Calico.

With this change, if a node matches, return it, otherwise we proceed to look for a matching pod. If exactly one running pod matches we return it. Otherwise we return just the IP.

Fixes #3103
2019-10-25 12:38:11 -05:00
Zahari Dichev 0017f9a60a Cert manager support (#3600)
* Add support for --identity-issuer-mode flag to install cmd
* Change flag to be a bool
* Read correct data form identity when external issuer is used
* Add ability for identity service to dynamically reload certs
* Fix failing tests
* Minor refactor
* Load trust anchors from identity issuer secret
* Make identity service actually watch for issuer certs updates
* Add some testing around cmd line identity options validation
* Add tests ensuring that identity service loads issuer
* Take into account external-issuer flag during upgrade + tests
* Fix failing upgrade test
* Address initial review feedback
* Address further review feedback on cli and helm
* Do not persist --identity-external-issuer
* Some improvements to identitiy service
* Bring back persistane of external issuer flag
* Address more feedback
* Update dockerfiles shas
* Publishing k8s events on issuer certs rotation
* Ensure --ignore-cluster+external issuer is not supported
* Update go-deps shas
* Transition to identity issuer scheme based configuration
* Use k8s consts for secret file names

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-10-24 13:15:14 -07:00
Andrew Seigner 0f9ea553d2 Add APIService fake clientset support (#3569)
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-10-21 12:12:19 -07:00
Tarun Pothulapati f3deee01b6 Trace Control plane Components with OC (#3495)
* add trace flags and initialisation
* add ocgrpc handler to newgrpc
* add ochttp handler to linkerd web
* add flags to linkerd web
* add ochttp handler to prometheus handler initialisation
* add ochttp clients for components
* add span for prometheus query
* update godep sha
* fix reviews
* better commenting
* add err checking
* remove sampling
* add check in main
* move to pkg/trace

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-10-18 12:19:13 -07:00
Alex Leong 3dcff52b9f
Switch from using golangci fmt to using goimports (#3555)
CI currently enforcing formatting rules by using the fmt linter of golang-ci-lint which is invoked from the bin/lint script.  However it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting.  This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint depending on which formatter you use and which specific revision of that formatter you use.  

In this change we stop using golang-ci-lint for format checking.  We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files.  This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project.  We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.

Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor director as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-16 13:56:11 -07:00
Johannes Hansen f880e71fcd The linkerd proxy does not work with headless services (#3470)
* The linkerd proxy does not work with headless services (i.e. endpoints not referencing a pod).

Changed endpoints_watcher to also return endpoints with no targetref.

Fixes #3308

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>

* Fix panic in endpoint_translator

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>
2019-10-15 14:56:41 -07:00
Alex Leong ef54d18bb7
Fallback to defaults when config cannot be loaded (#3530)
When running the destination controller locally, the Linkerd config files which are typically mounted from a configmap are not available.  To facilitate local development, we fall back to default values in this case instead of failing to start up.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-15 14:47:42 -07:00
Guangming Wang c59e9cf500 move t.Fatalf out of goroutine in server_test.go (#3490)
Subject
t.Fataf should not be called in goroutine

Problem

Solution
move t.Fatalf into testing func instead of its goroutine

Validation
unit test passed on my env

Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-10-15 14:22:10 -07:00
Alejandro Pedraza 3de35ccc58
Remove Discovery service leftovers (#3500)
Followup to #2990, which refactored `linkerd endpoints` to use the
`Destination.Get` API instead of the `Discovery.Endpoints` API, leaving
the Discovery with no implented methods. This PR removes all the Discovery
code leftovers.

Fixes #3499
2019-10-15 11:20:21 -05:00
Ivan Sim cf69dedf9c
Re-add the destination container to the controller spec (#3540)
* Re-add the destination container to the controller spec

This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.

On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.

* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-08 10:49:40 -07:00
Kevin Leimkuhler a3a240e0ef
Add TapEvent headers and trailers to the tap protobuf (#3410)
### Motivation

In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390

### Solution

The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.

These values are set by reading the corresponding fields off of the proxy's tap
events.

The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33

Closes #3262

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-29 09:54:37 -07:00
Ivan Sim 9f21c8b481
Introduce Tracing Annotations (#3481)
* Add the tracing environment variables to the proxy spec
* Add tracing event
* Remove unnecessary CLI change
* Update log message
* Handle single segment service name
* Use default service account if not provided

The injector doesn't read the defaults from the values.yaml

* Remove references to conf.workload.ownerRef in log messages

This nested field isn't always set.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-09-26 16:07:30 -07:00
Alex Leong 4799baa8e2
Revert "Trace Control Plane components using OC (#3461)" (#3484)
This reverts commit edd3b1f6d4.

This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy.  We will reintroduce this change once the config plan is straightened out.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-09-26 11:56:44 -07:00
Tarun Pothulapati edd3b1f6d4 Trace Control Plane components using OC (#3461)
* add exporter config for all components

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add cmd flags wrt tracing

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp tracing to web server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add flags to the tap deployment

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add trace flags to install and upgrade command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add linkerd prefix to svc names

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp trasport to API Internal Client

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix goimport linting errors

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp handler to tap http server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* review and fix tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use common template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use Initialize

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix sample flag

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add verbose info reg flags

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-26 08:11:48 -07:00
Tarun Pothulapati 139c64132d Make Identity use GRPC Server with Prom Metrics (#3457)
* make identity use grpc server with prom metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* linting fix

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-23 08:17:41 -07:00
Tarun Pothulapati 49d39e5a12 Instrumenting Proxy-Injector (#3354)
* add proxy injection prometheus counters

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* formatted injection reasons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update proxy injection report tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* keep the structure, and add global ownerKind

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* increase request count, when owner is nil

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add readable reasons using map

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add proxy config override annotations as labels

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove space for machine reasons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use correct proxy image override annotation

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add annotation_at label to prom metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* refactor disablebyannotation function

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-20 09:46:57 -07:00
Kevin Leimkuhler c62c90870e
Add JSON output to tap command (#3434)
Replaces #3411 

### Motivation

It is a little tough to filter/read the current tap output. As headers are being
added to tap, the output is starting to get difficult to consume. Take a peek at
#3262 for an example. It would be nice to have some more machine readable output
that can be sliced and diced with tools such as jq.

### Solution

A new output option has been added to the `linkerd tap` command that returns the
JSON encoding of tap events.

The default output is line oriented; `-o wide` appends the request's target
resource type to the tap line oriented tap events.

In order display certain values in a more human readable form, a tap event
display struct has been introduced. This struct maps public API `TapEvent`s
directly to a private `tapEvent`. This struct offers a flatter JSON structure
than the protobuf JSON rendering. It also can format certain field--such as
addresses--better than the JSON protobuf marshaler.

Closes #3390

**Default**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web
req id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics
rsp id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs
end id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B
```

**Wide**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide
req id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd
rsp id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd
end id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd
```

**JSON**:
*Edit: Flattened `Method` and `Scheme` formatting*
```
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "ip-masq-agent",
      "namespace": "kube-system",
      "pod": "ip-masq-agent-4d5s9",
      "serviceaccount": "ip-masq-agent",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "requestInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "method": "GET",
    "scheme": "",
    "authority": "10.60.1.49:9994",
    "path": "/ready"
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "calico-node",
      "namespace": "kube-system",
      "pod": "calico-node-bbrjq",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 644820
    },
    "httpStatus": 200
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "deployment": "calico-typha",
      "namespace": "kube-system",
      "pod": "calico-typha-59cb487c49-8247r",
      "pod_template_hash": "59cb487c49",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseEndEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 790898
    },
    "sinceResponseInit": {
      "nanos": 146078
    },
    "responseBytes": 3,
    "grpcStatusCode": 0
  }
}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-19 09:34:49 -07:00
Alejandro Pedraza 30ecddb965
Fix injector timeout under high load (#3442)
* Fix injector timeout under high load

Fixes #3358

When retrieving a pod owner, we were hitting the k8s API directly because
at injection time the informer might not have been informed about the
existence of the parent object.
Under a large number of injection requests this ended up in the k8s API requests
being throttled, the proxy-injector getting blocked and the webhook requests
timing out.

Now we'll hit the shared informer first, and hit the k8s API only when
the informer doesn't return anything. After a few injection requests for
the same owner, the informer should have been updated.

Testing:

Scaling an emoji deployment to 1000 replicas, and after waiting for a
couple of minutes:

Before:
```bash
# a portion of the pods doesn't get injected
$ kubectl-n emojivoto get po | grep ./1 | wc -l
109

kubectl -n kube-system logs -f kube-apiserver-minikube | grep
failing.*timeout
.... (lots of errors)
```

After:
```bash
# all the pods get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
0

kubectl -n kube-system logs -f kube-apiserver-minikube | grep
failing.*timeout
```
2019-09-18 17:58:38 -05:00
Andrew Seigner c5a85e587c
Update to client-go v12.0.0, forked stern (#3387)
The repo depended on an old version of client-go. It also depended on
stern, which itself depended on an old version of client-go, making
client-go upgrade non-trivial.

Update the repo to client-go v12.0.0, and also replace stern with a
fork.

This fork of stern includes the following changes:
- updated to use Go Modules
- updated to use client-go v12.0.0
- fixed log line interleaving:
  - https://github.com/wercker/stern/issues/96
  - based on:
    - 8723308e46

Fixes #3382

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-10 11:04:29 -07:00
Andrew Seigner 7f59caa7fc
Bump proxy-init to 1.2.0 (#3397)
Pulls in latest proxy-init:
https://github.com/linkerd/linkerd2-proxy-init/releases/tag/v1.2.0

This also bumps a dependency on cobra, which provides more complete zsh
completion.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-09 09:06:14 -07:00
Andrew Seigner d773a47dd3
Shrink controller Docker image from 315MB to 38MB (#3378)
The controller Docker image included 7 Go binaries (destination,
heartbeat, identity, proxy-injector, public-api, sp-validator, tap),
each roughly 35MB, with similar dependencies.

Change each controller binary into subcommands of a single `controller`
binary, decreasing the controller Docker image size from 315MB to 38MB.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-05 11:44:03 -07:00
Bruno M. Custódio 8fec756395 Add '--address' flag to 'linkerd dashboard'. (#3274)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-05 10:56:10 -07:00
Alejandro Pedraza acbab93ca8
Add support for k8s 1.16 (#3364)
Fixes #3356

1.16 removes some api groups that were already deprecated. From k8s blog
post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):

```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
v1.16.
    Migrate to the policy/v1beta1 API, available since v1.10. Existing
    persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
    Migrate to the apps/v1 API, available since v1.9. Existing persisted
    data can be retrieved/updated via the apps/v1 API.
```

Previous PRs had already made this change at the Helm templates level,
but we still needed to do it at the API calls and tests.

The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-04 09:59:55 -05:00
Andrew Seigner 90c547576d
Remove broken thrift dependency (#3370)
The repo depended on a (recently broken) thrift package:

```
github.com/linkerd/linkerd2
 -> contrib.go.opencensus.io/exporter/ocagent@v0.2.0
  -> go.opencensus.io@v0.17.0
   -> git.apache.org/thrift.git@v0.0.0-20180902110319-2566ecd5d999
```
... via this line in `controller/k8s`:

```go
_ "k8s.io/client-go/plugin/pkg/client/auth"
```

...which created a dependency on go.opencensus.io:

```bash
$ go mod why go.opencensus.io
...
github.com/linkerd/linkerd2/controller/k8s
k8s.io/client-go/plugin/pkg/client/auth
k8s.io/client-go/plugin/pkg/client/auth/azure
github.com/Azure/go-autorest/autorest
github.com/Azure/go-autorest/tracing
contrib.go.opencensus.io/exporter/ocagent
go.opencensus.io
```

Bump contrib.go.opencensus.io/exporter/ocagent from `v0.2.0` to
`v0.6.0`, creating this new dependency chain:

```
github.com/linkerd/linkerd2
 -> contrib.go.opencensus.io/exporter/ocagent@v0.6.0
  -> google.golang.org/api@v0.7.0
   -> go.opencensus.io@v0.21.0
```

Bumping our go.opencensus.io dependency from `v0.17.0` to `v0.21.0`
pulls in this commit:
ed3a3f0bf0 (diff-37aff102a57d3d7b797f152915a6dc16)

...which removes our dependency on github.com/apache/thrift

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-03 16:22:43 -07:00
陈谭军 e281fb3410 fix-up grammar (#3351)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-08-30 08:09:36 -07:00
Alejandro Pedraza fd248d3755
Undo refactoring from #3316 (#3331)
Thus fixing `linkerd edges` and the dashboard topology graph

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 13:37:54 -05:00
Alejandro Pedraza 5d7499dc84
Avoid the dashboard requesting stats when not needed (#3338)
* Avoid the dashboard requesting stats when not needed

Create an alternative to `urlsForResource` called
`urlsForResourceNoStats` that makes use of the `skip_stats` parameter in
the stats API (created in #1871) that doesn't query Prometheus when not needed.

When testing using the dashboard looking at the linkerd namespace,
queries per second went down from 2874 to 2756, a 4% decrease.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 05:52:44 -05:00
Alejandro Pedraza 368d16f23c
Fix auto-injecting pods and integration tests reporting (#3335)
* Fix auto-injecting pods and integration tests reporting

When creating an Event when auto-injection occurs (#3316) we try to
fetch the parent object to associate the event to it. If the parent
doesn't exist (like in the case of stand-alone pods) the event isn't
created. I had missed dealing with one part where that parent was
expected.

This also adds a new integration test that I verified fails before this
fix.

Finally, I removed from `_test-run.sh` some `|| exit_code=$?` that was
preventing the whole suite to report failure whenever one of the tests
in `/tests` failed.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-28 15:04:20 -05:00
arminbuerkle 5c38f38a02 Allow custom cluster domains in remaining backends (#3278)
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetchting cluster domain errors separately
* Use custom cluster domain for traffic split adaptor

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-27 10:01:36 -07:00
Alejandro Pedraza 02efb46e45
Have the proxy-injector emit events upon injection/skipping injection (#3316)
* Have the proxy-injector emit events upon injection/skipping injection

Fixes #3253

Have the proxy-injector emit an event whenever a injection happens, or
when injection is skipped for some reason (also added that reason into
the proxy-injector logs). The level is associated to the parent workload
(it can't be associated to the pod because at this point the pod hasn't
been persisted).

The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.

Related changes:

- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-26 13:34:36 -05:00
Carol A. Scott 089836842a
Add unit test for edges API endpoint (#3306)
Fixes #3052.

Adds a unit test for the edges API endpoint. To maintain a consistent order for
testing, the returned rows in api/public/edges.go are now sorted.
2019-08-23 09:28:02 -07:00
arminbuerkle e7d303e03f Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES (#3277)
* Fix missing `clusterDomain` in render RenderTapOutputProfile
* Add LINKERD2_PROXY_DESTINATION_GET_SUFFIXES env variable

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-21 14:28:30 -07:00
Oliver Gould cb276032f5
Require go 1.12.9 for controller builds (#3297)
Netflix recently announced a security advisory that identified several
Denial of Service attack vectors that can affect server implementations
of the HTTP/2 protocol, and has issued eight CVEs. [1]

Go is affected by two of the vulnerabilities (CVE-2019-9512 and
CVE-2019-9514) and so Linkerd components that serve HTTP/2 traffic are
also affected. [2]

These vulnerabilities allow untrusted clients to allocate an unlimited
amount of memory, until the server crashes. The Kubernetes Product
Security Committee has assigned this set of vulnerabilities with a CVSS
score of 7.5. [3]

[1] https://github.com/Netflix/security-bulletins/blob/master/advisories/third-party/2019-002.md
[2] https://golang.org/doc/devel/release.html#go1.12
[3] https://www.first.org/cvss/calculator/3.0#CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
2019-08-21 10:03:29 -07:00
Guangming Wang 70d85d2065 Cleanup: fix some typos in code comment (#3296)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-21 09:40:43 -07:00
Oliver Gould ee79d5d324
destination: Reorganize authority-parsing (#3244)
In preparation for #3242, the destination controller will need to
support a broader set of valid authorities including IP addresses.

This change modifies the destination controller's authority-parsing code
so that the is-this-a-kubernete-service-name decision is decoupled from
parsing of authorities into their consituent parts.

The `Get` API now explicitly handles IP address names, though it
currently fails all such resolutions.
2019-08-21 07:19:42 -07:00
Ivan Sim 183e42e4cd
Merge the CLI 'installValues' type with Helm 'Values' type (#3291)
* Rename template-values.go
* Define new constructor of charts.Values type
* Move all Helm values related code to the pkg/charts package
* Bump dependency
* Use '/' in filepath to remain compatible with VFS requirement
* Add unit test to verify Helm YAML output
* Alejandro's feedback
* Add unit test for Helm YAML validation (HA)

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-08-20 19:26:38 -07:00
Carol A. Scott bc8fef7ba9 Sorting the expected response for trafficsplit rows so it is always in consistent row order (#3280) 2019-08-19 10:10:26 -07:00
Kevin Leimkuhler c9c41e2e8a
Remove gRPC tap server listener from controller (#3276)
### Summary

As an initial attempt to secure the connection from clients to the gRPC tap
server on the tap Pod, the tap `addr` only listened on localhost.

As @adleong pointed out #3257, this was not actually secure because the inbound
proxy would establish a connection to localhost anyways.

This change removes the gRPC tap server listener and changes `TapByResource`
requests to interface with the server object directly.

From this, we know that all `TapByResourceRequests` have gone through the tap
APIServer and thus authorized by RBAC.

### Details

[NewAPIServer](ef90e0184f/controller/tap/apiserver.go (L25-L26)) now takes a [GRPCTapServer](f6362dfa80/controller/tap/server.go (L33-L34)) instead of a `pb.TapClient` so that
`TapByResource` requests can interact directly with the [TapByResource](f6362dfa80/controller/tap/server.go (L49-L50)) method.

`GRPCTapServer.TapByResource` now makes a private [grpcTapServer](ef90e0184f/controller/tap/handlers.go (L373-L374)) that satisfies
the [tap.TapServer](https://godoc.org/github.com/linkerd/linkerd2/controller/gen/controller/tap#TapServer) interface. Because this interface is satisfied, we can interact
with the tap server methods without spawning an additional listener.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-16 16:38:50 -04:00
cpretzer 4e92064f3b
Add a flag to install-cni command to configure iptables wait flag (#3066)
Signed-off-by: Charles Pretzer <charles@buoyant.io>
2019-08-15 12:58:18 -07:00
Carol A. Scott 9c62b65c6a
Adding trafficsplit test to stat_summary_test.go (#3252)
This PR adds a test for trafficsplits to stat_summary_test.go.

Because the test requires a consistent order for returned rows, trafficsplit
rows in stat_summary.go are now sorted by apex + leaf name before being
returned.
2019-08-14 14:48:46 -07:00
Kevin Leimkuhler cc3c53fa73
Remove tap from public API and associated test infrastructure (#3240)
### Summary

After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastrucure related to tap can be removed.

This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf fils were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files.

This draft passes `go test` as well as a local run of the integration tests.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-14 17:27:37 -04:00
Andrew Seigner 3b55e2e87d
Add container cpu and mem to heartbeat requests (#3238)
PR #3217 re-introduced container metrics collection to
linkerd-prometheus. This enabled linkerd-heartbeat to collect mem and
cpu metrics at the container-level.

Add container cpu and mem metrics to heartbeat requests. For each of
(destination, prometheus, linkerd-proxy), collect maximum memory and p95
cpu.

Concretely, this introduces 7 new query params to heartbeat requests:
- p99-handle-us
- max-mem-linkerd-proxy
- max-mem-destination
- max-mem-prometheus
- p95-cpu-linkerd-proxy
- p95-cpu-destination
- p95-cpu-prometheus

Part of #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-14 12:04:08 -07:00
Carol A. Scott 00437709eb
Add trafficsplit metrics to CLI (#3176)
This PR adds `trafficsplit` as a supported resource for the `linkerd stat` command. Users can type `linkerd stat ts` to see the apex and leaf services of their trafficsplits, as well as metrics for those leaf services.
2019-08-14 10:30:57 -07:00
Alex Leong 98b6b9e9ba
Check in gen deps (#3245)
Go dependencies which are only used by generated code had not previously been checked into the repo.  Because `go generate` does not respect the `-mod=readonly` flag, running `bin/linkerd` will add these dependencies and dirty the local repo.  This can interfere with the way version tags are generated.

To avoid this, we simply check these deps in.

Note that running `go mod tidy` will remove these again.  Thus, it is not recommended to run `go mod tidy`. 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-08-13 17:02:52 -07:00
Kevin Leimkuhler f0bd82f831
Update protobuf files (#3241)
### Summary

Changes from `bin/protoc-go.sh`

An existing [draft PR](https://github.com/linkerd/linkerd2/pull/3240) has a majority of its changes related to protobuf file
updates. In order to separate these changes out into more related components,
this PR updates the generated protobuf files so that #3240 can be rebased off
this and have a more manageable diff.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-13 16:58:49 -04:00
Kevin Leimkuhler 5d7662fd90
Update web server to use tap APIService (#3208)
### Motivation

PR #3167 introduced the tap APIService and migrated `linkerd tap` to use it.
Subsequent PRs (#3186 and #3187) updated `linkerd top` and `linkerd profile
--tap` to use the tap APIService. This PR moves the web's Go server to now also
use the tap APIService instead of the public API. It also ensures an error
banner is shown to the user when unauthorized taps fail via `linkerd top`
command in *Overview* and *Top*, and `linkerd tap` command in *Tap*.

### Details

The majority of these changes are focused around piping through the HTTP error
that occurs and making sure the error banner generated displays the error
message explaining to view the tap RBAC docs.

`httpError` is now public (`HTTPError`) and the error message generated is short
enough to fit in a control frame (explained [here](https://github.com/linkerd/linkerd2/blob/kleimkuhler%2Fweb-tap-apiserver/web/srv/api_handlers.go#L173-L175)).

### Testing

The error we are testing for only occurs when the linkerd-web service account is
not authorzied to tap resources. Unforutnately that is not the case on Docker
For Mac (assuming that is what you use locally), so you'll need to test on a
different cluster. I chose a GKE cluster made through the GKE console--not made
through cluster-utils because it adds cluster-admin.

Checkout the branch locally and `bin/docker-build` or `ares-build` if you have
it setup. It should produce a linkerd with the version `git-04e61786`. I have
already pushed the dependent components, so you won't need to `bin/docker-push
git-04e61786`.

Install linkerd on this GKE cluster and try to run `tap` or `top` commands via
the web. You should see the following errors:

### Tap

![web-tap-unauthorized](https://user-images.githubusercontent.com/4572153/62661243-51464900-b925-11e9-907b-29d7ca3f815d.png)

### Top

![web-top-unauthorized](https://user-images.githubusercontent.com/4572153/62661308-894d8c00-b925-11e9-9498-6c9d38b371f6.png)

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-08-08 10:18:32 -07:00
Andrew Seigner f98bc27a38
Fix invalid `l5d-require-id` for some tap requests (#3210)
PR #3154 introduced an `l5d-require-id` header to Tap requests. That
header string was constructed based on the TapByResourceRequest, which
includes 3 notable fields (type, name, namespace). For namespace-level
requests (via commands like `linkerd tap ns linkerd`), type ==
`namespace`, name == `linkerd`, and namespace == "". This special casing
for namespace-level requests yielded invalid `l5d-require-id` headers,
for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`.

Fix `l5d-require-id` string generation to account for namespace-level
requests. The bulk of this change is tap unit test updates to validate
the fix.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 09:42:11 -07:00
Alejandro Pedraza 3ae653ae92
Refactor proxy injection to use Helm charts (#3200)
* Refactor proxy injection to use Helm charts

Fixes #3128

A new chart `/charts/patch` was created, that generates the JSON patch
payload that is to be returned to the k8s API when doing the injection
through the proxy injector, and it's also leveraged by the `linkerd
inject --manual` CLI.

The VFS was used by `linkerd install` to access the old chart under
`/chart`. Now the proxy injection also uses the Helm charts to generate
the JSON patch (see above) so we've moved the VFS from `cli/static` to a
new common place under `/pkg/charts/static`, and the new root for the VFS is
now `/charts`.

`linkerd install` hasn't yet migrated to use the new charts (that'll
happen in #3127), so the only change in that regard was the creation of
`/charts/chart` which is a symlink pointing to `/chart` that
`install.go` now uses, so that the VFS contains both the old and new
charts, as a temporary measure.

You can see that `/bin/Dockerfile-bin`, `/controller/Dockerfile` and
`/bin/build-cli-bin` do now `go generate` pointing to the new location
(and the `go generate` annotation was moved from `/cli/main.go` to
`pkg/charts/static/templates.go`).

The symlink trick doesn't work when building the binaries through
Docker, so `/bin/Dockerfile-bin` replaces the symlink with an actual
copy of `/chart`.

Also note that in `/controller/Dockerfile` we now need to include the
`prod` tag in `go install` like we do in `/bin/Dockerfile-bin` so that
the proxy injector does use the VFS instead of the local file system.

- The common logic to parse a chart has been moved from `install.go` to
`/pkg/charts/util.go`.
- The special ENV var in the proxy for "outbound router capacity" that
only applies to the Prometheus pod is now handled directly in the proxy
partial and all the associated go code could be removed.
- The `patch.go` lib for generating the JSON patch in go along
with its tests `patch_test.go` are no longer needed.
- Lots of functions in `/pkg/inject/inject.go` got removed/simplified
with their logic being moved into the charts themselves. As a
consequence lots of things in `inject_test.go` became irrelevant.
- Moved `template-values.go` from `/pkg/inject` to `pkg/charts` as that
contains the go structs representation of the chart variables that
will be leveraged in #3127.

Don't forget to run `/bin/helm.sh` whenever you make changes to charts
;-)

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-07 17:32:37 -05:00
Andrew Seigner c253805042
Update tap authz error with doc URL (#3196)
When the Tap APIServer's SubjectAccessReview fails, return an error
message pointing the user to: https://linkerd.io/tap-rbac

Depends on https://github.com/linkerd/website/pull/450
Part of #3191

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-06 08:56:41 -07:00
Andrew Seigner 09fbea7165
Fix Tap resource/subresource APIGroupList (#3192)
The `/apis/tap.linkerd.io/v1alpha1` endpoint on the Tap APIServer
included tap resource/subresource pairs, such as `deployments/tap` and
`pods/tap`, but did not include the parent resources, such as
`deployments` and `pods`. This broke commands like `kubectl auth can-i`,
which expect the parent resource to exist.

Introduce parent resources for all tap-able subresources. Concretely, it
fixes this class of command:
```
$ kubectl auth can-i watch deployments.tap.linkerd.io/linkerd-grafana \
  --subresource=tap -n linkerd --as siggy@buoyant.io
Warning: the server doesn't have a resource type 'deployments' in group 'tap.linkerd.io'
no - no RBAC policy matched
```
Fixed:
```
$ kubectl auth can-i watch deployments.tap.linkerd.io/linkerd-grafana \
  --subresource=tap -n linkerd --as siggy@buoyant.io
yes
```

Additionally, when SubjectAccessReviews fail, return a 403 rather than
500.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-05 10:36:48 -07:00
Andrew Seigner a59c1dd32d
Introduce tap APIService, update `linkerd tap` (#3167)
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.

This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.

This change also modifies the `linkerd tap` command to make requests
against the new APIService.

The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET  /apis
GET  /apis/tap.linkerd.io
GET  /apis/tap.linkerd.io/v1alpha1
GET  /healthz
GET  /healthz/log
GET  /healthz/ping
GET  /metrics
GET  /openapi/v2
GET  /version

Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.

This change introduces the following resources into the default Linkerd
install:
- Global
  - APIService/v1alpha1.tap.linkerd.io
  - ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
  - Secret/linkerd-tap-tls
- `kube-system` namespace:
  - RoleBinding/linkerd-linkerd-tap-auth-reader

Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller

Fixes #2725, #3162, #3172

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-01 14:02:45 -07:00
Andrew Seigner 9a672dd5a9
Introduce `linkerd --as` flag for impersonation (#3173)
Similar to `kubectl --as`, global flag across all linkerd subcommands
which sets a `ImpersonationConfig` in the Kubernetes API config.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-31 16:05:33 -07:00
Kevin Leimkuhler 6cc52c3363
Add `l5d-require-id` header to Tap requests (#3154)
### Summary

In order for Pods' tap servers to start authorizing tap clients, the tap
controller must open TLS connections so that it can identity itself to the
server.

This change introduces the use of `l5d-require-id` header on outbound tap
requests.

### Details

When tap requests are made by the tap controller, the `Authority` header is an
IP address. The proxy does not attempt to do service discovery on such requests
and therefore the connection is over plaintext. By introducing the
`l5d-require-id` header the proxy can require a server name on the connection.
This allows the tap controller to identity itself as the client making tap
requests. The name value for the header can be made from the Pod Spec and tap
request, so the change is rather minimal.

#### Proxy Changes

* Update h2 to v0.1.26
* Properly fall back in the dst_router (linkerd/linkerd2-proxy#291)

### Testing

Unit tests for the header have not been added mainly because [no test
infrastructure currently exists](065c221858/controller/tap/server_test.go (L241)) to mock proxy requests. After talking with
@siggy a little about this, it makes to do in a separate change at some point
when behavior like this cannot be reliably tested through integration tests
either.

Integration tests do test this well, and will continue to do once
linkerd/linkerd2-proxy#290 lands.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-07-30 11:17:52 -07:00
Alex Leong ab7226cbcd
Return invalid argument for external name services (#3120)
Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498

When the Linkerd proxy sends a query for a Kubernetes external name service to the destination service, the destination service returns `NoEndpoints: exists=false` because an external name service has no endpoints resource.  Due to a change in the proxy's fallback logic, this no longer causes the proxy to fallback to either DNS or SO_ORIG_DST and instead fails the request.  The net effect is that Linkerd fails all requests to external name services.

We change the destination service to instead return `InvalidArgument` for external name services.  This causes the proxy to fallback to SO_ORIG_DST instead of failing the request.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-29 16:31:22 -07:00
Alejandro Pedraza 8c07223f3b
Remove unused argument (#3149)
Removed unused argument in the `GetPatch()` function in
`pkg/inject/inject.go`

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2019-07-26 11:39:25 -05:00
Andrew Seigner 51b33ad53c
Fix nil pointer dereference in endpoints watcher (#3147)
The destination service's endpoints watcher assumed every `Endpoints`
object contained a `TargetRef`. This field is optional, and in cases
such as the default `ep/kubernetes` object, `TargetRef` is nil, causing
a nil pointer dereference.

Fix endpoints watcher to check for `TargetRef` prior to dereferencing.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-25 17:11:56 -07:00
Andrew Seigner 18b74aa8a8
Introduce Go modules support (#2481)
The repo relied on `dep` for managing Go dependencies. Go 1.11 shipped
with Go modules support. Go 1.13 will be released in August 2019 with
module support enabled by default, deprecating GOPATH.

This change replaces `dep` with Go modules for dependency management.
All scripts, including Docker builds and ci, should work without any dev
environment changes.

To execute `go` commands directly during development, do one of the
following:
1. clone this repo outside of `GOPATH`; or
2. run `export GO111MODULE=on`

Summary of changes:
- Docker build scripts and ci set `-mod=readonly`, to ensure
  dependencies defined in `go.mod` are exactly what is used for the
  builds.
- Dependency updates to `go.mod` are accomplished by running
 `go build` and `go test` directly.
- `bin/go-run`, `bin/build-cli-bin`, and `bin/test-run` set
  `GO111MODULE=on`, permitting usage inside and outside of GOPATH.
- `gcr.io/linkerd-io/go-deps` tags hashed from `go.mod`.
- `bin/update-codegen.sh` still requires running from GOPATH,
  instructions added to BUILD.md.

Fixes #1488

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-25 14:41:38 -07:00
Alex Leong 3c4a0e4381
Make authorities in destination overrides absolute (#3137)
Fixes #3136 

When the destination service sends a destination profile with a traffic split to the proxy, the override destination authorities are absolute but do no contain a trailing dot.  e.g. "bar.ns.svc.cluster.local:80".  However, NameAddrs which have undergone canonicalization in the proxy will include the trailing dot.  When a traffic split includes the apex service as one of the overrides, the original apex NameAddr will have the trailing dot and the override will not.  Since these two NameAddrs are not identical, they will go into two distinct slots in the proxy's concrete dst router.  This will cause two services to be created for the same destination which will cause the stats clobbering described in the linked issue.

We change the destination service to always return absolute dst overrides including the trailing dot.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 17:08:40 -07:00
Alex Leong e538a05ce2
Add support for stateful sets (#3113)
We add support for looking up individual pods in a stateful set with the destination service.  This allows Linkerd to correctly proxy requests which address individual pods.  The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.

Fixes #2266 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 14:09:46 -07:00
Andrew Seigner 64ed8e4a74
Introduce Cluster Heartbeat cronjob (#3056)
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify what
versions of Linkerd are currently in use and what environments it is
being run in, which helps prioritize testing and backports.

Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.

Example check URL:
https://versioncheck.linkerd.io/version.json?
  install-time=1562761177&
  k8s-version=v1.15.0&
  meshed-pods=8&
  rps=3&
  source=heartbeat&
  uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
  version=git-b97ee9f7

Fixes #2961

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-23 17:12:30 -07:00
Alex Leong d6ef9ea460
Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078)
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller.  Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.

To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2.  This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource.  If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can.  This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.  

Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation).  In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-23 11:46:31 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong c8b34a8cab
Add pod status to linkerd check (#3065)
When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff).

We add pod status to the output to make this more immediately obvious.

Fixes #2877 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-18 15:56:19 -07:00
Carol A. Scott ee1a111993
Updating CLI output for `linkerd edges` (#3048)
This PR improves the CLI output for `linkerd edges` to reflect the latest API
changes. 

Source and destination namespaces for each edge are now shown by default. The
`MSG` column has been replaced with `Secured` and contains a green checkmark or
the reason for no identity. A new `-o wide` flag shows the identity of client
and server if known.
2019-07-17 12:23:34 -07:00
Alex Leong dc2b96f903
Sort slices to avoid flakeyness in tests (#3064)
The `TestGetServicesFor` is flaky because it compares two slices of services which are in a non-deterministic order.

To make this deterministic, we first sort the slices by name.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-16 16:03:21 -07:00
Alex Leong bdf5b46d09
Make the routes command traffic split aware (#3030)
The `linkerd routes` command gets the list of routes for a resource by checking which services that resource is a member of.  If a traffic split exists, it is possible for a resource to get traffic via a service that it is not a member of.  Specifically, a resource which is a member of a leaf service can get traffic to the apex service.  This means that even though the resource is serving routes associated with the apex service, these will not be displayed in the `linkerd routes` command.

We update `linkerd routes` to be traffic-split aware.  This means that when a traffic split exists, we consider resources which are members of a leaf service with non-zero weight to be members of the apex service for the purpose of determining which routes to display.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-10 12:45:35 -07:00
Jonathan Juares Beber 2dcbde08b3 Show pod status more clearly (#1967) (#2989)
During operations with `linkerd stat` sometimes it's not clear the actual
pod status.

This commit introduces a method, to the `k8s`package, getting the pod status,
based on [`kubectl` logic](33a3e325f7/pkg/printers/internalversion/printers.go (L558-L640))
to expose the `STATUS` column for pods . Also, it changes the stat command
on the` cli` package adding a column when the resource type is a Pod.

Fixes #1967

Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
2019-07-10 12:44:44 -07:00
Jonathan Juares Beber e2211f5f77 Introduces owner references verification for pods (#3027)
When getting pods for specific kubernetes resources, the usage of just
labels, as a selector, generates wrong outputs. Once, two resources can use
the same label selector and manage distinct pods, a new mechanism to check
pods for a given resource it's needed. More details on #2932.

This commit introduces a verification through the pod owner references
`UID`s, comparing with the given resource's. Additional logic is needed
when handling `Deployments` since it creates a `ReplicaSet` and this last
one is the actual pod's owner. No verification is done in case of
`Services`.

Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
2019-07-10 12:44:24 -07:00
Alex Leong 92ddffa3c2
Add prometheus metrics for watchers (#3022)
To give better visibility into the inner workings of the kubernetes watchers in the destination service, we add some prometheus metrics.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-08 11:50:26 -07:00
Alejandro Pedraza 53e589890d
Have `linkerd endpoints` use `Destination.Get` (#2990)
* Have `linkerd endpoints` use `Destination.Get`

Fixes #2885

We're refactoring `linkerd endpoints` so it hits
directly the `Destination.Get` endpoint, instead of relying on the
Discovery service.

For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.

I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.

Analogously, I added a `destinationServer` struct that mimics
`tapServer`.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-07-03 09:11:03 -05:00
Carol A. Scott de635d3fcf
Allow `edges` to handle requests from multiple namespaces to one resource (#3025)
This PR fixes a bug in the edges command where if src_resources from two
different namespaces sent requests to the same dst_resource, the original
src_identity was overwritten.
2019-07-02 12:31:15 -07:00
Carol A. Scott a504e8c2d8
Expand and improve edges API endpoint (#3007)
Updates functionality of `linkerd edges`, including a new `--all-namespaces`
flag and returning namespace information for SRC and DST resources.
2019-06-28 15:46:04 -07:00
Alex Leong 27373a8b78
Add traffic splitting to destination profiles (#2931)
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).

We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to.  A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-28 13:19:47 -07:00
Alejandro Pedraza 73740fb503
Simplify port-forwarding code (#2976)
* Simplify port-forwarding code

Simplifies the establishment of a port-forwarding by moving the common
logic into `PortForward.Init()`

Stemmed from this
[comment](https://github.com/linkerd/linkerd2/pull/2937#discussion_r295078800)

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-26 11:14:57 -05:00
Tarun Pothulapati a3ce06bd80 Add sideEffects field to Webhooks (#2963)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-06-21 11:06:10 -07:00
Alejandro Pedraza 8988a5723f
Have `GetOwnerKindAndName` be able to skip the cache (#2972)
* Have `GetOwnerKindAndName` be able to skip the cache

Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset got just
created and might not be in ready in the cache yet.

Fixes #2738

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-20 12:58:15 -05:00
Ivan Sim e2e976cce9
Add `NET_RAW` capability to the proxy-init container (#2969)
Also, update control plane PSP to match linkerd/website#94

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-06-19 19:34:37 -07:00
Cody Vandermyn 33de3574ee Correctly set securityContext values on injection (#2911)
The patch provided by @ihcsim applies correct values for the securityContext during injection, namely: `allowPrivilegeEscalation = false`, `readOnlyRootFilesystem = true`, and the capabilities are copied from the primary container. Additionally, the proxy-init container securityContext has been updated with appropriate values.

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2019-06-11 10:34:30 -07:00
Alex Leong c698d6bca1
Add support for TrafficSplits (#2897)
Add support for querying TrafficSplit resources through the common API layer. This is done by depending on the TrafficSplit client bindings from smi-sdk-go.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-11 10:04:42 -07:00
Alex Leong 06a69f69c5
Refactor destination service (#2786)
This is a major refactor of the destination service.  The goals of this refactor are to simplify the code for improved maintainability.  In particular:

* Remove the "resolver" interfaces.  These were a holdover from when our decision tree was more complex about how to handle different kinds of authorities.  The current implementation only accepts fully qualified kubernetes service names and thus this was an unnecessary level of indirection.
* Moved the endpoints and profile watchers into their own package for a more clear separation of concerns.  These watchers deal only in Kubernetes primitives and are agnostic to how they are used.  This allows a cleaner layering when we use them from our gRPC service.
* Renamed the "listener" types to "translator" to make it more clear that the function of these structs is to translate kubernetes updates from the watcher to gRPC messages.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-04 15:01:16 -07:00
Alejandro Pedraza 74ca92ea25
Split proxy-init into separate repo (#2824)
Split proxy-init into separate repo

Fixes #2563

The new repo is https://github.com/linkerd/linkerd2-proxy-init, and I
tagged the latest there `v1.0.0`.

Here, I've removed the `/proxy-init` dir and pinned the injected
proxy-init version to `v1.0.0` in the injector code and tests.

`/cni-plugin` depends on proxy-init, so I updated the import paths
there, and could verify CNI is still working (there is some flakiness
but unrelated to this PR).

For consistency, I added a `--init-image-version` flag to `linkerd
inject` along with its corresponding override config annotation.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-03 16:24:05 -05:00
Carol A. Scott 8c496e3d0d
Adding unit test for CLI edges command (#2837)
Adds a unit test for the `linkerd edges` command.
2019-05-28 13:51:45 -07:00
Ivan Sim 5a5f8bbfe8
Install MWC and VWC During Installation (#2806)
* Update helm charts to include webhooks config and TLS secret
* Update the webhooks to read the secret cert and key
* Update webhooks to not recreate config on restart
* Ensure upgrade preserve existing secrets
* Revert the change to rename the webhook configs

The renaming change breaks upgrade, where the new webhook configs conflict with
the existing ones. The older resources  aren't deleted during upgrade because
they are dynamically created.

* Make the secret volume read-only
* Remove unnecessary exported getter functions
* Remove obsolete mwc and vwc templates

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-05-20 12:43:50 -07:00
Carol A. Scott bb2921a3d9
Verify in Prometheus edges query that data for a specific resource type exists (#2826)
Adds a check to Prometheus `edges` queries to verify that data for the requested
resource type exists. Previously, if Prometheus could not find request data for the
requested resource type, it would skip that label and still return data for
other labels in the `by` clause, leading to an incorrect response.
2019-05-15 16:03:48 -07:00
Tarun Pothulapati 512dcd336f Added Controller Component Labels to webhooks (#2820)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-05-15 17:06:36 -05:00
Dennis Adjei-Baah a0fa1dff59
Move tap service into its own pod. (#2773)
* Split tap into its own pod in the control plane

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2019-05-15 16:28:44 -05:00
Carol A. Scott 042086142a
Adding an edges command to the CLI (#2808)
Adds an edges command to the CLI. `linkerd edges` displays connections between resources, and Linkerd proxy identities. Currently this feature will only display edges where both the client identity and server identity are known. The next step will be to display edges for which identity is not known and/or one-sided traffic such as Prometheus and tap requests.
2019-05-15 13:59:27 -07:00
Alejandro Pedraza 065c221858
Support for resources opting-out of tap (#2807)
Support for resources opting out of tap

Implements the `linkerd inject --disable-tap` flag (although hidden pending #2811) and the config override annotation `config.linkerd.io/disable-tap`.
Fixes #2778

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-05-10 14:17:23 -05:00
Carol A. Scott 87e69bf885
Adding edges endpoint to public API (#2793)
This change adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies. This PR only includes changes to the controller and protobuf.
2019-05-09 09:30:11 -07:00
Jack Price f758a9e428 Use port-forwarding for linkerd CLIs (#2757)
Private k8s clusters, such as the private GKE clusters offered by Google
Cloud, cannot be reached through the current API proxy method.

This commit uses the port forwarding feature already developed.

Also modify dashboard command to not fall back to ephemeral port.

Signed-off-by: Jack Price <jackprice@outlook.com>
2019-05-02 14:41:26 +02:00
Ivan Sim 714035fee9
Define default resource spec for proxy-init init container (#2763)
Fixes #2750 

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-04-29 11:41:05 -07:00
Andrew Seigner 0cfc8c6f1c
Introduce k8s apiextensions support (#2759)
CustomResourceDefinition parsing and retrieval is not available via
client-go's `kubernetes.Interface`, but rather via a separate
`k8s.io/apiextensions-apiserver` package.

Introduce support for CustomResourceDefintion object parsing and
retrieval. This change facilitates retrieval of CRDs from the k8s API
server, and also provides CRD resources as mock objects.

Also introduce a `NewFakeAPI` constructor, deprecating
`NewFakeClientSets`. Callers need no longer be concerned with discreet
clientsets (for k8s resources vs. CRDs vs. (eventually)
ServiceProfiles), and can instead use the unified `KubernetesAPI`.

Part of #2337, in service to multi-stage check.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-28 18:55:22 -07:00
Andrew Seigner ec540a882e
Consolidate k8s APIs (#2747)
Numerous codepaths have emerged that create k8s configs, k8s clients,
and make k8s api requests.

This branch consolidates k8s client creation and APIs. The primary
change migrates most codepaths to call `k8s.NewAPI` to instantiate a
`KubernetesAPI` struct from `pkg`. `KubernetesAPI` implements the
`kubernetes.Interface` (clientset) interface, and also persists a
`client-go` `rest.Config`.

Specific list of changes:
- removes manual GET requests from `k8s.KubernetesAPI`, in favor of
  clientsets
- replaces most calls to `k8s.GetConfig`+`kubernetes.NewForConfig` with
  a single `k8s.NewAPI`
- introduces a `timeout` param to `k8s.NewAPI`, currently only used by
  healthchecks
- removes `NewClientSet` in `controller/k8s/clientset.go` in favor of
  `k8s.NewAPI`
- removes `httpClient` and `clientset` from `HealthChecker`, use
  `KubernetesAPI` instead

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-25 11:31:38 -07:00
Ivan Sim cd37d3f0f5
Fall back to default built-in version if versions config are missing (#2745)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-04-24 19:49:18 -07:00
Alejandro Pedraza 53bb7c47f6
Make the auto-injector required and removed proxy-auto-inject flag (#2733)
Make the auto-injector required and removed proxy-auto-inject flag

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-04-24 13:06:51 -05:00
Alejandro Pedraza 62d9a80894
New `linkerd inject` default and manual modes (#2721)
Fixes #2720 and 2711 

This changes the default behavior of `linkerd inject` to not inject the
proxy but just the `linkerd.io/inject: enabled` annotation for the
auto-injector to pick it up (regardless of any namespace annotation).

A new `--manual` mode was added, which behaves as before, injecting
the proxy in the command output.

The unit tests are running with `--manual` to avoid any changes in the
fixtures.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-04-24 09:05:27 -05:00
Andrew Seigner 64d38572ae
Implement fallback logic for owner ref lookups (#2737)
The proxy-injector retrieves owner information when injecting pods. For
pods created via deployments, this requires a Pod -> ReplicaSet ->
Deployment lookup. There is a race condition where the injection happens
before the k8s informer client has indexed the new ReplicaSet.

If a ReplicaSet informer lookup initially fails, retry one time via a
get request. Also introduce logging to record the failure/retry, and
tests to validate `GetOwnerKindAndName` works with and without informer
indexing.

Fixes #2731 

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-23 14:39:18 -07:00
Ivan Sim 8d13084f94
Split the `linkerd-version` CLI flag into `control-plane-version` and `proxy-version` (#2702)
* The 'linkerd-version' CLI flag is renamed to 'control-plane-version'
* Add version field to proxy config
* Add the control plane version to the global config
* Unit test for init image version
* Use more specific control plane and proxy versions in unit tests

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-04-19 11:35:20 -07:00
Andrew Seigner 72287ae121
Don't use spinner in cli when run without a tty (#2716)
In some non-tty environments, the `linkerd check` spinner can render
unexpected control characters.

Disable the spinner when run without a tty.

Fixes #2700

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-04-18 09:31:56 -07:00
Ivan Sim 4e19827457
Allow identity to be disabled during inject on existing cluster (#2686)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-04-11 13:37:06 -07:00
Oliver Gould ba65bd8039
Switch UUID implementation (#2667)
The UUID implementation we use to generate install IDs is technically
not random enough for secure uses, which ours is not. To prevent
security scanners like SNYK from flagging this false-positive, let's
just switch to the other UUID implementation (Already in our
dependencies).
2019-04-08 10:58:02 -07:00
Alejandro Pedraza edb225069c
Add validation webhook for service profiles (#2623)
Add validation webhook for service profiles

Fixes #2075

Todo in a follow-up PRs: remove the SP check from the CLI check.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-04-05 16:10:47 -05:00
Kevin Lingerfelt 74e48ba301
Remove project injector's -no-init-container flag (#2635)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-04-04 11:09:47 -07:00
harsh jain 976bc40345 Fixes #2607: Remove TLS from stat (#2613)
Removes the TLS percentages from the stat command in the CLI.
2019-04-04 10:37:42 -07:00
Alejandro Pedraza f6fb865183
Enhance webhook unit tests by checking returned JSON patch (#2615)
Enhance webhook unit tests by checking returned JSON patch

Also have labels/annotations added during injection to be added in order

Fixes #2560

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-04-03 15:39:27 -05:00
Oliver Gould d74ca1bab0
cli: Introduce an upgrade command (#2564)
The `install` command errors when the deploy target contains an existing
Linkerd deployment. The `upgrade` command is introduced to reinstall or
reconfigure the Linkerd control plane.

Upgrade works as follows:

1. The controller config is fetched from the Kubernetes API. The Public
   API is not used, because we need to be able to reinstall the control
   plane when the Public API is not available; and we are not concerned
   about RBAC restrictions preventing the installer from reading the
   config (as we are for inject).

2. The install configuration is read, particularly the flags used during
   the last install/upgrade. If these flags were not set again during the
   upgrade, the previous values are used as if they were passed this time.
   The configuration is updated from the combination of these values,
   including the install configuration itself.

   Note that some flags, including the linkerd-version, are omitted
   since they are stored elsewhere in the configurations and don't make
   sense to track as overrides..

3. The issuer secrets are read from the Kubernetes API so that they can
   be re-used. There is currently no way to reconfigure issuer
   certificates. We will need to create _another_ workflow for
   updating these credentials.

4. The install rendering is invoked with values and config fetched from
   the cluster, synthesized with the new configuration.
2019-04-01 13:27:41 -07:00
Oliver Gould 655632191b
config: Store install parameters with global config (#2577)
When installing Linkerd, a user may override default settings, or may
explicitly configure defaults. Consider install options like `--ha
--controller-replicas=4` -- the `--ha` flag sets a new default value for
the controller-replicas, and then we override it.

When we later upgrade this cluster, how can we know how to configure the
cluster?

We could store EnableHA and ControllerReplicas configurations in the
config, but what if, in a later upgrade, the default value changes? How
can we know whether the user specified an override or just used the
default?

To solve this, we add an `Install` message into a new config.
This message includes (at least) the CLI flags used to invoke
install.

upgrade does not specify defaults for install/proxy-options fields and,
instead, uses the persisted install flags to populate default values,
before applying overrides from the upgrade invocation.

This change breaks the protobuf compatibility by altering the
`installation_uuid` field introduced in 9c442f6885.
Because this change was not yet released (even in an edge release), we
feel that it is safe to break.

Fixes https://github.com/linkerd/linkerd2/issues/2574
2019-03-29 10:04:20 -07:00
Ivan Sim ea07dd3938
Promote the shared injection check to the CLI and webhook (#2555)
Performing this check earlier helps to separate the specialized logic to the CLI
and webhook.
Any subsequent modification of this check logic to support config override of
existing meshed workload will be confined to the relevant component.
The shared lib can then focus only on config overrides.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-03-27 14:51:05 -07:00
Oliver Gould 24222da13b
install: Create auto-inject configuration (#2562)
When reading a Linkerd configuration, we cannot determine whether
auto-inject should be configured.

This change adds auto-inject configuration to the global config
structure. Currently, this configuration is effectively boolean,
determined by the presence of an empty value (versus a null).
2019-03-26 15:28:54 -07:00
Ivan Sim 9c5bb4ec0c
Convert CLI inject proxy options to annotations (#2547)
* Include the DisableExternalProfile option even if it's 'false'. The override logic depends on this option to assign different profile suffix.
* Check for proxy and init image overrides even when registry option is empty
* Append the config annotations to the pod's meta before creating the patch. This ensures that any configs provided via the CLI options are persisted as annotations before the configs override.
* Persist linkerd version CLI option

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-03-26 14:21:22 -07:00
Alejandro Pedraza 7efe385feb
Have the Webhook react to pod creation/update only (#2472)
Have the Webhook react to pod creation/update only

This was already working almost out-of-the-box, just had to:

- Change the webhook config so it watches pods instead of deployments
- Grant some extra ClusterRole permissions
- Add the piece that figures what's the OwnerReference and add the label
for it
- Manually inject service account mount paths
- Readd volumes tests

Fixes #2342 and #1751

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-03-26 11:53:56 -05:00
Oliver Gould 9c442f6885
Store install UUID in global config (#2561)
Currently, the install UUID is regenerated each time `install` is run.
When implementing cluster upgrades, it seems most appropriate to reuse
the prior UUID, rather than generate a new one.

To this end, this change stores an "Installation UUID" in the global
linkerd config.
2019-03-26 08:45:40 -07:00
Oliver Gould da0330743f
Provide peer Identities via the Destination API (#2537)
This change reintroduces identity hinting to the destination service.
The Get endpoint includes identities for pods that are injected with an
identity-mode of "default" and have the same linkerd control plane.

A `serviceaccount` label is now also added to destination response
metadata so that it's accessible in prometheus and tap.
2019-03-22 09:19:14 -07:00
Oliver Gould f02730a90d
Check the cluster's config for install & inject (#2535)
The introduction of identity in 0626fa37 created new state in the
control plane's configuration that must be considered when re-installing
the control plane or when injecting pods.

This change alters `install` to fail if it would seem to conflict with
an existing installation. This behavior may be disabled with the
`--ignore-cluster` flag.

Furthermore, `inject` now _requires_ that it can fetch a configuration
from the control plane in order to operate. Otherwise the
`--ignore-cluster` and `--disable-identity` flags must be specified.

This change does not actually instrument pods to use identity yet---it
lays the framework for proxy identity without changing the test fixture
output (besides a change to how identity HA is configured).

Fixes #2531
2019-03-21 12:49:46 -07:00
Oliver Gould 0626fa374a
install: Introduce the Identity controller (#2526)
https://github.com/linkerd/linkerd2/pull/2521 introduces an "Identity"
controller, but there is no way to include it in linkerd installation.

This change alters the `install` flow as follows:
- An Identity service is _always_ installed;
- Issuer credentials may be specified via the CLI;
- If no Issuer credentials are provided, they are generated each time `install` is called.
- Proxies are NOT configured to use the identity service.
- It's possible to override the credential generation logic---especially
  for tests---via install options that can be configured via the CLI.
2019-03-19 17:04:11 -07:00
Oliver Gould 91c5f07650
proxy: Upgrade to identity-capable proxy (#2524)
The new proxy has changed its configuration as follows:

- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
  just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.

Identity injection is **NOT** configured by this branch.
2019-03-19 14:20:39 -07:00
Oliver Gould 790c13b3b2
Introduce the Identity controller implementation (#2521)
This change introduces a new Identity service implementation for the
`io.linkerd.proxy.identity.Identity` gRPC service.

The `pkg/identity` contains a core, abstract implementation of the service
(generic over both the CA and (Kubernetes) Validator interfaces).

`controller/identity` includes a concrete implementation that uses the
Kubernetes TokenReview API to validate serviceaccount tokens when
issuing certificates.

This change does **NOT** alter installation or runtime to include the
identity service. This will be included in a follow-up.
2019-03-19 13:58:45 -07:00
Oliver Gould 81f645da66
Remove `--tls=optional` and `linkerd-ca` (#2515)
The proxy's TLS implementation has changed to use a new _Identity_ controller.

In preparation for this, the `--tls=optional` CLI flag has been removed
from install and inject; and the `ca` controller has been deleted. Metrics
and UI treatments for TLS have **not** been removed, as they will continue to
be valuable for the new Identity system.

With the removal of the old identity scheme, the Destination service's proxy
ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable
locality awareness.
2019-03-18 17:40:31 -07:00
Ivan Sim 468ad118f2
Support Auto-Inject Configs Overrides Via Annotations (#2471)
* Defined the config annotations as new constants in labels.go
* Introduced the getOverride() functions to override configs
* Introduced new accessors to abstract with type casting

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-03-14 08:42:12 -07:00
Andrew Seigner e5d2460792
Remove single namespace functionality (#2474)
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.

This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.

A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
  combining some check categories, as we can always assume cluster-wide
  requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
  make a decision based on cluster-wide vs. namespace-wide access.
  Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.

Reverts #1721
Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-12 00:17:22 -07:00
Kevin Leimkuhler 229e33e79e
cli: Always display stat tables for all routes (#2466)
## Problem

When an object has no previous route metrics, we do not generate a table for
that object.

The reasoning behind this was for reducing output of the following command:

```
$ linkerd routes deploy --to deploy/foo
```

For each deployment object, if it has no previous traffic to `deploy/foo`, then
a table would not be generated for it.

However, the behavior we see with that indicates there is an error even when a
Service Profile is installed:

```
$ linkerd routes deploy deploy/foo
Error: No Service Profiles found for selected resources
```

## Solution

Always generate a stat table for the queried resource object.

## Validation

I deployed [booksapp](https://github.com/buoyantIO/booksapp) with the `traffic`
deployment removed and Service Profiles installed.

Without the fix, `linkerd routes deploy/webapp` displays an error because there
has been no traffic to `deploy/webapp` without the `traffic` deployment.

With the fix, the following output is generated:

```
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /authors/{id}            webapp     0.00%   0.0rps           0ms           0ms           0ms
GET /books/{id}              webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors                webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/delete    webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/delete      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books/{id}/edit        webapp     0.00%   0.0rps           0ms           0ms           0ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

Closes #2328

Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
2019-03-11 14:17:20 -07:00
Andrew Seigner a42e8db45f
Quiet inject logging (#2483)
Manual and auto injection was logging the full patch JSON at the `Info`
level.

Modify injection to log the object type and name at the `Info` level,
and the full patch at the `Debug` level.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-11 10:39:10 -07:00
Andrew Seigner d4fdbe4991
Fix web init to not check for ServiceProfiles (#2470)
linkerd/linkerd2#2428 modified SelfSubjectAccessReview behavior to no
longer paper-over failed ServiceProfile checks, assuming that
ServiceProfiles will be required going forward. There was a lingering
ServiceProfile check in the web's startup that started failing due to
this change, as the web component does not have (and should not need)
ServiceProfile access. The check was originally implemented to inform
the web component whether to expect "single namespace" mode or
ServiceProfile support.

Modify the web's initialization to always expect ServiceProfile support.

Also remove single namespace integration test

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 15:20:46 -08:00
Alejandro Pedraza 0da851842b
Public API endpoint `Config()` (#2455)
Public API endpoint `Config()`

Retrieves Global and Proxy configurations.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-03-07 17:37:46 -05:00
Andrew Seigner 8da2cd3fd4
Require cluster-wide k8s API access (#2428)
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.

This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.

Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 10:23:18 -08:00
Aditya Sharma 3740aa238a Remove `--api-port` flag from the cli (#2429)
* Changed the protobuf definition to take out destinationApiPort entirely
* Store destinationAPIPort as a constant in pkg/inject.go

Fixes #2351

Signed-off-by: Aditya Sharma <hello@adi.run>
2019-03-06 15:54:12 -08:00
Alejandro Pedraza f155fb9a8f
Have `NewFakeClientSets()` not swallow errors when parsing YAML (#2454)
This helps catching bad YAMLs in test resources

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-03-06 13:53:04 -05:00
Ivan Sim 8f9473fbd7
Recreate the MWC when the proxy injector is restarted (#2431)
This ensures that the MWC always picks up the latest config template during version upgrade.
The removed `update()` method and RBAC permissions are superseded by @2163.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-03-05 15:53:37 -08:00
Andrew Seigner 206ff685e2
Bump Prometheus client to v0.9.2 (#2388)
We were depending on an untagged version of prometheus/client_golang
from Feb 2018.

This bumps our dependency to v0.9.2, from Dec 2018.

Also, this is a prerequisite to #1488.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-05 10:31:16 -08:00
Alejandro Pedraza ddf2e729ac
Injection consolidation (#2334)
- Created the pkg/inject package to hold the new injection shared lib.
- Extracted from `/cli/cmd/inject.go` and `/cli/cmd/inject_util.go`
the core methods doing the workload parsing and injection, and moved them into
`/pkg/inject/inject.go`. The CLI files should now deal only with
strictly CLI concerns, and applying the json patch returned by the new
lib.
- Proceeded analogously with `/cli/cmd/uninject.go` and
`/pkg/inject/uninject.go`.
- The `InjectReport` struct and helping methods were moved into
`/pkg/inject/report.go`
- Refactored webhook to use the new injection lib
- Removed linkerd-proxy-injector-sidecar-config ConfigMap
- Added the ability to add pod labels and annotations without having to
specify the already existing ones

Fixes #1748, #2289

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2019-03-05 08:38:56 -05:00
Tarun Pothulapati 2184928813 Wire up stats for Jobs (#2416)
Support for Jobs in stat/tap/top cli commands

Part of #2007

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-03-01 17:16:54 -08:00
Oliver Gould ab90263461
destination: Only return TLS identities when appropriate (#2371)
As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.

This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.

The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.

Fixes #2217
2019-02-27 12:18:39 -08:00