458 Commits

Author SHA1 Message Date
Zahari Dichev caf4e61daf
Enable identity on endpoints not associated with pods (#4134)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-09 20:55:57 +02:00
Zahari Dichev 72fc94b03c
Service mirroring tests (#4115)
Unit tests that exercise most of the code in cluster_watcher.go. Essentially, the whole cluster mirroring machinery can be thought of as a function that takes remote cluster state, local cluster state, and modification events, and as a result either modifies local cluster state or issues new events onto the queue. This is what these tests are trying to model. I think this covers a lot of the logic there. Any suggestions for other edge cases are welcome.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-04 20:17:21 +02:00
Zahari Dichev edd7fd203d
Service Mirroring Component (#4028)
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-02 21:16:08 +02:00
Christy Jacob 8111e54606
Check for extension server certificate (#4062)
* Check Extension api server Authentication
* Added Checks and tests for extension api-server authentication
* Fixed Failing Static Checks
* Updated the golden file

Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
2020-02-28 13:39:02 -08:00
Mayank Shah 3c3a4a5f5d
cli: Add label selector flag for `stat` (#4040)
* Update `linkerd-namespace` shorthand to `L`
* Add --selector (-l) flag for `stat`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-17 13:40:07 -05:00
Zahari Dichev 6fa9407318
Ensure we get the correct type out of Informer Deletion events (#4034)
Ensure we get what we expect when receiving DELETE events from the k8s Informer api
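
A minimal sketch of the kind of type check this implies. When a delete is missed, an informer can hand the handler a tombstone instead of the object itself. The `Service` and `DeletedFinalStateUnknown` types below are local stand-ins (the latter mirrors the tombstone type in `k8s.io/client-go/tools/cache`) so the example has no dependencies, and `serviceFromDelete` is a hypothetical helper, not the code from this PR.

```go
package main

import "fmt"

// Service is a hypothetical stand-in for the Kubernetes object being watched.
type Service struct{ Name string }

// DeletedFinalStateUnknown is a local stand-in mirroring the tombstone type
// from k8s.io/client-go/tools/cache.
type DeletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// serviceFromDelete unwraps the argument of a delete handler. It accepts
// either the object itself or a tombstone wrapping it, and rejects the rest.
func serviceFromDelete(obj interface{}) (*Service, error) {
	switch v := obj.(type) {
	case *Service:
		return v, nil
	case DeletedFinalStateUnknown:
		svc, ok := v.Obj.(*Service)
		if !ok {
			return nil, fmt.Errorf("tombstone %q holds unexpected type %T", v.Key, v.Obj)
		}
		return svc, nil
	default:
		return nil, fmt.Errorf("unexpected type %T in delete event", obj)
	}
}

func main() {
	tombstone := DeletedFinalStateUnknown{Key: "emojivoto/voting", Obj: &Service{Name: "voting"}}
	svc, err := serviceFromDelete(tombstone)
	fmt.Println(svc.Name, err)

	_, err = serviceFromDelete("not a service")
	fmt.Println(err)
}
```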

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:15:24 +02:00
Alex Leong ec51434eb9
Show traffic split metrics from sources in all namespaces (#3967)
Fixes #3562 

When a pod in one namespace sends traffic to a service which is the apex of a traffic split in another namespace, that traffic is not displayed in the `linkerd stat trafficsplit` output.  This is because when we do a Prometheus query for traffic to the traffic split, we supply a Prometheus label selector to only select traffic sources in the namespace of the traffic split.

Since any pod in any namespace can send traffic to the apex service of a traffic split, we must look at all possible sources of traffic, not just the ones in the same namespace.

Before:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m         -     -             -             -             -
webapp-split   webapp   webapp-2     100m         -     -             -             -             -
```

After:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m    80.00%   1.4rps          31ms          99ms        2530ms
webapp-split   webapp   webapp-2     100m    60.00%   0.2rps          35ms          93ms          99ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-12 09:21:59 -08:00
Alejandro Pedraza 3ba66f6f9d
Fix flakey TestGetProfiles (#3965)
Fixes #3332

Fixes the very rare test failure
```
--- FAIL: TestGetProfiles (0.33s)
    --- FAIL: TestGetProfiles/Returns_server_profile (0.11s)
            server_test.go:228: Expected 1 or 2 updates but got 3:
            [retry_budget:<retry_ratio:0.2 min_retries_per_second:10
            ttl:<seconds:10 > >  routes:<condition:<path:<regex:"/a/b/c"
            > > metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > >
            routes:<condition:<path:<regex:"/a/b/c" > >
            metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > > ]
            FAIL
            FAIL  github.com/linkerd/linkerd2/controller/api/destination
            0.624s
```
that occurs when a third unexpected stream update occurs, when the fake
API takes more time to notify its listeners about the resources created.

For all the nasty details check #3332
2020-02-07 19:43:29 -05:00
Dax McDonald 76d3285247
Use correct go module file syntax (#4021)
The correct syntax for the go module file is
go MAJOR.MINOR

Signed-off-by: Dax McDonald <dax@rancher.com>
2020-02-07 07:58:54 -08:00
Alejandro Pedraza afb93cddc8
Use `t.Name()` instead of `t.Name` in tests (#3970)
Use `t.Name()` instead of `t.Name` when retrieving the name of tests.
This was causing an error to be added in the log:
```
output: logrus_error="can not add field \"test\"
```

Followup to
[comment](https://github.com/linkerd/linkerd2/pull/3965#discussion_r370387990)
2020-01-27 09:17:19 -05:00
Kevin Leimkuhler 53baecb382
Changes for edge-20.1.3 (#3966)
## edge-20.1.3

* CLI
  * Introduced `linkerd check --pre --linkerd-cni-enabled`, used when the CNI
    plugin is used, to check it has been properly installed before proceeding
    with the control plane installation
  * Added support for the `--as-group` flag so that users can impersonate
    groups for Kubernetes operations (thanks @mayankshah160!)
* Controller
  * Fixed an issue where an override of the Docker registry was not being
    applied to debug containers (thanks @javaducky!)
  * Added check for the Subject Alternate Name attributes to the API server
    when access restrictions have been enabled (thanks @javaducky!)
  * Added support for arbitrary pod labels so that users can leverage the
    Linkerd provided Prometheus instance to scrape for their own labels
    (thanks @daxmc99!)
  * Fixed an issue with CNI config parsing

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-01-23 16:55:21 -08:00
Zahari Dichev a9d38189fb Fix CNI config parsing (#3953)
This PR addresses the problem introduced after #3766.

Fixes #3941 

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-23 09:55:04 -08:00
Mayank Shah 60ac0d5527 Add `as-group` CLI flag (#3952)
Add CLI flag --as-group that can impersonate group for k8s operations

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-01-22 16:38:31 +02:00
Paul Balogh b5e39bcbf7 Utilize Common Name or Subject Alternate Name for access checks (#3459) (#3949)
Subject
Utilize Common Name or Subject Alternate Name for access checks (#3459)

Problem
When access restrictions to API server have been enabled with the requestheader-allowed-names configuration, only the Common Name of the requestor certificate is being checked. This check should include the use of Subject Alternate Name attributes.

Solution
API server will now check the SAN attributes (DNS Names, Email Addresses, IP Addresses, and URIs) when determining accessibility for allowed names.

Fixes issue #3459

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2020-01-22 08:58:19 +02:00
Paul Balogh dabee12b93 Fix issue for debug containers when using custom Docker registry (#3873)
**Subject**
Fixes bug where override of Docker registry was not being applied to debug containers (#3851)

**Problem**
Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image.

**Solution**
This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images.

This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option.

**Validation**
Several new unit tests have been created to confirm functionality.  In addition, the following workflows were run through:

### Standard Workflow with Custom Registry
This workflow installs the Linkerd control plane based upon a custom registry, then injects the debug sidecar into a service.

* Start with a k8s instance having no Linkerd installation
* Build all images locally using `bin/docker-build`
* Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6`
* Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section
* Request injection of the debug image into an available container.  I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Custom Registry Override at Injection
This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Standard Workflow with Default Registry
This workflow is the typical workflow which utilizes the standard Linkerd image registry.

* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/
* Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -`
* Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section
* Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.

### Overriding the Default Registry at Injection
This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection.

* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry.  Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image.  Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.

Fixes issue #3851 

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2020-01-17 10:18:03 -08:00
Mayank Shah b94e03a8a6 Remove empty fields from generated configs (#3886)
Fixes
- https://github.com/linkerd/linkerd2/issues/2962
- https://github.com/linkerd/linkerd2/issues/2545

### Problem
Field omissions for workload objects are not respected while marshaling to JSON.

### Solution
After digging a bit into the code, I came to realize that while marshaling, workload objects have empty structs as values for various fields which would rather be omitted. As of now, the standard library `encoding/json` does not support zero values of structs with the `omitempty` tag. The relevant issue can be found [here](https://github.com/golang/go/issues/11939). To tackle this problem, the object declaration should have _pointer-to-struct_ as a field type instead of _struct_ itself. However, this approach would be out of scope as the workload object declaration is handled by the k8s library.

I was able to find a drop-in replacement for the `encoding/json` library which supports zero values of structs with the `omitempty` tag. It can be found [here](https://github.com/clarketm/json). I have made use of this library to implement a simple filter-like functionality to remove empty tags once a YAML with empty tags is generated, hence leaving the previously existing methods unaffected.
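
To illustrate the standard library behaviour described above, here is a small self-contained example. The type names are made up for the illustration and are not taken from the k8s library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Resources is a hypothetical nested struct with nothing set.
type Resources struct {
	Limits map[string]string `json:"limits,omitempty"`
}

// byValue embeds the struct directly; omitempty has no effect on it.
type byValue struct {
	Resources Resources `json:"resources,omitempty"`
}

// byPointer uses pointer-to-struct; a nil pointer is omitted.
type byPointer struct {
	Resources *Resources `json:"resources,omitempty"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(byValue{}))   // {"resources":{}}
	fmt.Println(marshal(byPointer{})) // {}
}
```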

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-01-13 10:02:24 -08:00
Alex Leong 93a81dce97
Change default proxy log level to "warn,linkerd=info" (#3908)
Fixes #3901 

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-09 14:22:06 -08:00
Paul Balogh 2cd2ecfa30 Enable mixed configuration of skip-[inbound|outbound]-ports (#3766)
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)

This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.

* Bump versions for proxy-init to v1.3.0
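
The idea of validating mixed ports and ranges while keeping them as strings could look roughly like the sketch below. The function names and the `lower-upper` range syntax are assumptions made for illustration, not the actual `parseAndValidate` code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validatePortOrRange checks a single entry such as "8080" or "3000-3010".
// The "lower-upper" syntax is an assumption made for this sketch.
func validatePortOrRange(entry string) error {
	bounds := strings.Split(entry, "-")
	if len(bounds) > 2 {
		return fmt.Errorf("%q is not a port or a port range", entry)
	}
	prev := 0
	for _, b := range bounds {
		port, err := strconv.Atoi(strings.TrimSpace(b))
		if err != nil || port < 1 || port > 65535 {
			return fmt.Errorf("%q is not a valid port in %q", b, entry)
		}
		if port < prev {
			return fmt.Errorf("range %q has its bounds reversed", entry)
		}
		prev = port
	}
	return nil
}

// validatePorts validates every entry but returns the original strings, so
// that expanding ranges is left to whatever consumes them later.
func validatePorts(entries []string) ([]string, error) {
	for _, e := range entries {
		if err := validatePortOrRange(e); err != nil {
			return nil, err
		}
	}
	return entries, nil
}

func main() {
	ports, err := validatePorts([]string{"8080", "3000-3010"})
	fmt.Println(ports, err)

	_, err = validatePorts([]string{"3010-3000"})
	fmt.Println(err)
}
```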

Signed-off-by: Paul Balogh <javaducky@gmail.com>
2019-12-20 09:32:13 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in an `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip. `IPWatcher` maintains a mapping of clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
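
As a simplified illustration of the wrapping described above: the types below are hypothetical stand-ins, not the real `IPWatcher` or `EndpointsWatcher`, and pod IP lookups are left out entirely.

```go
package main

import "fmt"

// ServiceID identifies a service by namespace and name.
type ServiceID struct{ Namespace, Name string }

// endpointsWatcher is a hypothetical stand-in for the wrapped watcher; it
// only records how many subscribers each service has.
type endpointsWatcher struct {
	subscriptions map[ServiceID]int
}

// Subscribe registers one more subscriber for the given service.
func (w *endpointsWatcher) Subscribe(id ServiceID) {
	w.subscriptions[id]++
}

// ipWatcher resolves a cluster IP to a service id and delegates the
// subscription to the wrapped endpoints watcher.
type ipWatcher struct {
	byClusterIP map[string]ServiceID
	endpoints   *endpointsWatcher
}

// Subscribe translates an IP subscription into a service subscription.
func (w *ipWatcher) Subscribe(ip string) error {
	id, ok := w.byClusterIP[ip]
	if !ok {
		return fmt.Errorf("no service with cluster IP %s", ip)
	}
	w.endpoints.Subscribe(id)
	return nil
}

func main() {
	web := ServiceID{Namespace: "emojivoto", Name: "web-svc"}
	ew := &endpointsWatcher{subscriptions: map[ServiceID]int{}}
	iw := &ipWatcher{
		byClusterIP: map[string]ServiceID{"10.96.0.12": web},
		endpoints:   ew,
	}

	fmt.Println(iw.Subscribe("10.96.0.12"), ew.subscriptions[web])
	fmt.Println(iw.Subscribe("10.96.0.99"))
}
```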

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Sergio C. Arteaga a1141fc507 Cache StatSummary responses in dashboard web server (#3769)
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-17 09:15:00 -05:00
Dax McDonald 3088f404ce Upgrade prometheus to v1.2.1 (#3541)
Signed-off-by: Dax McDonald <dax@rancher.com>
2019-12-11 15:26:16 -08:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
2019-12-11 10:02:37 -08:00
Zahari Dichev e5f75a8c3d
Add validation to ensure stat time window is at least 15s (#3720)
* Add stat time window minimum of 10s

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-12-04 08:12:01 +02:00
Alejandro Pedraza cf9fa0a8c9
Removed calico logutils dependency, incompatible with go 1.13 (#3763)
* Removed calico logutils dependency, incompatible with go 1.13

Fixes #1153

Removed dependency on
`github.com/projectcalico/libcalico-go/lib/logutils` because it has
problems with go modules, as described in
projectcalico/libcalico-go#1153

Not a big deal since it was only used for modifying the plugin's log
format.
2019-11-29 09:19:11 -05:00
Zahari Dichev b83c3a2137
Add Responses to path items to satisfy kube apiserver (#3700)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-15 20:41:50 +02:00
Alex Leong 0026103362 Unit and integration test fixups (#3730)
- Added cleanup step at the end of all integration tests.
- Disable external_issuer_integration_tests in cloud_tests due to
  namespace issue. Running this via `kind` tests is sufficient for now.
- Set a flakey test to `Skip`, relates to #3332.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-11-15 03:40:42 -08:00
Alejandro Pedraza 4b6254b52e
Replaced `uuid` with `uid` from linkerd-config resource (#3694)
* Replaced `uuid` with `uid` from linkerd-config resource

Fixes #3621

Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.

I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.

Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` Config.

I also added an integration test to verify the stability of the uid.
2019-11-13 13:56:01 -05:00
Alejandro Pedraza 3324966702
Upgrade go to 1.13.4 (#3702)
Fixes #3566

As explained in #3566, as of go 1.13 there's a strict check that ensures a dependency's timestamp matches its sha (as declared in go.mod). Our smi-sdk dependency has a problem with that check, which was resolved in a later version, but more work would be required to upgrade that dependency. In the meantime a quick pair of replace statements at the bottom of go.mod fixes the issue.
2019-11-13 12:54:36 -05:00
Tarun Pothulapati f18e27b115 use appsv1 api in identity (#3682)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-11-06 15:06:09 -08:00
Alejandro Pedraza 0e8958cd07
Fixed bad identity string for target pod in tap (#3675)
* Fixed bad identity string for target pod in tap

Fixes #3506

Was using the cluster domain instead of the trust domain, which results
in an error when those domains differ.
2019-11-05 15:57:41 -05:00
Alejandro Pedraza 8cf4494e78
Add proxy-injector-injections count to heartbeat (#3655)
Fixes #3059
2019-10-31 11:09:00 -05:00
Alejandro Pedraza d3d8266c63
If tap source IP matches many running pods then only show the IP (#3513)
* If tap source IP matches many running pods then only show the IP

When an unmeshed source ip matched more than one running pod, tap was
showing the names for all those pods, even though they didn't necessarily
originate the connection. This could be reproduced when using a pod
network add-on such as Calico.

With this change, if a node matches, return it, otherwise we proceed to look for a matching pod. If exactly one running pod matches we return it. Otherwise we return just the IP.
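
The lookup order described above can be sketched as follows. The `pod` type, the `resolveSource` function, and the `node/` and `pod/` prefixes are illustrative assumptions, not the actual tap code.

```go
package main

import "fmt"

// pod is a hypothetical, trimmed-down view of a pod.
type pod struct {
	Name    string
	IP      string
	Running bool
}

// resolveSource names the source of a connection from its IP:
//  1. a matching node wins;
//  2. otherwise exactly one matching running pod is returned;
//  3. otherwise (no match, or an ambiguous one) only the IP is returned.
func resolveSource(ip string, nodesByIP map[string]string, pods []pod) string {
	if node, ok := nodesByIP[ip]; ok {
		return "node/" + node
	}
	var matches []pod
	for _, p := range pods {
		if p.Running && p.IP == ip {
			matches = append(matches, p)
		}
	}
	if len(matches) == 1 {
		return "pod/" + matches[0].Name
	}
	return ip
}

func main() {
	nodes := map[string]string{"192.168.1.10": "worker-1"}
	pods := []pod{
		{Name: "web-1", IP: "10.1.0.5", Running: true},
		{Name: "job-a", IP: "10.1.0.9", Running: true},
		{Name: "job-b", IP: "10.1.0.9", Running: true},
	}
	fmt.Println(resolveSource("192.168.1.10", nodes, pods))
	fmt.Println(resolveSource("10.1.0.5", nodes, pods))
	fmt.Println(resolveSource("10.1.0.9", nodes, pods))
}
```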

Fixes #3103
2019-10-25 12:38:11 -05:00
Zahari Dichev 0017f9a60a Cert manager support (#3600)
* Add support for --identity-issuer-mode flag to install cmd
* Change flag to be a bool
* Read correct data from identity when external issuer is used
* Add ability for identity service to dynamically reload certs
* Fix failing tests
* Minor refactor
* Load trust anchors from identity issuer secret
* Make identity service actually watch for issuer certs updates
* Add some testing around cmd line identity options validation
* Add tests ensuring that identity service loads issuer
* Take into account external-issuer flag during upgrade + tests
* Fix failing upgrade test
* Address initial review feedback
* Address further review feedback on cli and helm
* Do not persist --identity-external-issuer
* Some improvements to identity service
* Bring back persistence of external issuer flag
* Address more feedback
* Update dockerfiles shas
* Publishing k8s events on issuer certs rotation
* Ensure --ignore-cluster+external issuer is not supported
* Update go-deps shas
* Transition to identity issuer scheme based configuration
* Use k8s consts for secret file names

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-10-24 13:15:14 -07:00
Andrew Seigner 0f9ea553d2 Add APIService fake clientset support (#3569)
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.

Update the fake clientset code to support APIService objects.

Fixes #3559

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-10-21 12:12:19 -07:00
Tarun Pothulapati f3deee01b6 Trace Control plane Components with OC (#3495)
* add trace flags and initialisation
* add ocgrpc handler to newgrpc
* add ochttp handler to linkerd web
* add flags to linkerd web
* add ochttp handler to prometheus handler initialisation
* add ochttp clients for components
* add span for prometheus query
* update godep sha
* fix reviews
* better commenting
* add err checking
* remove sampling
* add check in main
* move to pkg/trace

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-10-18 12:19:13 -07:00
Alex Leong 3dcff52b9f
Switch from using golangci fmt to using goimports (#3555)
CI currently enforces formatting rules by using the fmt linter of golang-ci-lint, which is invoked from the bin/lint script. However, it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting. This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint, depending on which formatter you use and which specific revision of that formatter you use.

In this change we stop using golang-ci-lint for format checking.  We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files.  This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project.  We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.

Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor directory as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-16 13:56:11 -07:00
Johannes Hansen f880e71fcd The linkerd proxy does not work with headless services (#3470)
* The linkerd proxy does not work with headless services (i.e. endpoints not referencing a pod).

Changed endpoints_watcher to also return endpoints with no targetref.

Fixes #3308

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>

* Fix panic in endpoint_translator

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>
2019-10-15 14:56:41 -07:00
Alex Leong ef54d18bb7
Fallback to defaults when config cannot be loaded (#3530)
When running the destination controller locally, the Linkerd config files which are typically mounted from a configmap are not available.  To facilitate local development, we fall back to default values in this case instead of failing to start up.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-15 14:47:42 -07:00
Guangming Wang c59e9cf500 move t.Fatalf out of goroutine in server_test.go (#3490)
Subject
t.Fatalf should not be called in a goroutine

Problem

Solution
move t.Fatalf into testing func instead of its goroutine

Validation
unit test passed on my env

Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-10-15 14:22:10 -07:00
Alejandro Pedraza 3de35ccc58
Remove Discovery service leftovers (#3500)
Followup to #2990, which refactored `linkerd endpoints` to use the
`Destination.Get` API instead of the `Discovery.Endpoints` API, leaving
the Discovery with no implemented methods. This PR removes all the Discovery
code leftovers.

Fixes #3499
2019-10-15 11:20:21 -05:00
Ivan Sim cf69dedf9c
Re-add the destination container to the controller spec (#3540)
* Re-add the destination container to the controller spec

This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.

On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.

* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-08 10:49:40 -07:00
Kevin Leimkuhler a3a240e0ef
Add TapEvent headers and trailers to the tap protobuf (#3410)
### Motivation

In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390

### Solution

The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.

These values are set by reading the corresponding fields off of the proxy's tap
events.

The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33

Closes #3262

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-29 09:54:37 -07:00
Ivan Sim 9f21c8b481
Introduce Tracing Annotations (#3481)
* Add the tracing environment variables to the proxy spec
* Add tracing event
* Remove unnecessary CLI change
* Update log message
* Handle single segment service name
* Use default service account if not provided

The injector doesn't read the defaults from the values.yaml

* Remove references to conf.workload.ownerRef in log messages

This nested field isn't always set.

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-09-26 16:07:30 -07:00
Alex Leong 4799baa8e2
Revert "Trace Control Plane components using OC (#3461)" (#3484)
This reverts commit edd3b1f6d4.

This is a temporary revert of #3461 while we sort out some details of how this should be configured and how it should interact with configuring a trace collector on the Linkerd proxy. We will reintroduce this change once the config plan is straightened out.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-09-26 11:56:44 -07:00
Tarun Pothulapati edd3b1f6d4 Trace Control Plane components using OC (#3461)
* add exporter config for all components

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add cmd flags wrt tracing

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp tracing to web server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add flags to the tap deployment

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add trace flags to install and upgrade command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add linkerd prefix to svc names

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp transport to API Internal Client

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix goimport linting errors

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add ochttp handler to tap http server

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* review and fix tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update test values

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use common template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use Initialize

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix sample flag

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add verbose info reg flags

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-26 08:11:48 -07:00
Tarun Pothulapati 139c64132d Make Identity use GRPC Server with Prom Metrics (#3457)
* make identity use grpc server with prom metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* linting fix

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-23 08:17:41 -07:00
Tarun Pothulapati 49d39e5a12 Instrumenting Proxy-Injector (#3354)
* add proxy injection prometheus counters

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* formatted injection reasons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update proxy injection report tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* keep the structure, and add global ownerKind

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* increase request count, when owner is nil

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add readable reasons using map

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add proxy config override annotations as labels

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove space for machine reasons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* use correct proxy image override annotation

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* add annotation_at label to prom metrics

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* refactor disablebyannotation function

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-09-20 09:46:57 -07:00
Kevin Leimkuhler c62c90870e
Add JSON output to tap command (#3434)
Replaces #3411 

### Motivation

It is a little tough to filter/read the current tap output. As headers are being
added to tap, the output is starting to get difficult to consume. Take a peek at
#3262 for an example. It would be nice to have some more machine readable output
that can be sliced and diced with tools such as jq.

### Solution

A new output option has been added to the `linkerd tap` command that returns the
JSON encoding of tap events.

The default output is line-oriented; `-o wide` appends the request's target
resource type to the line-oriented tap events.

In order to display certain values in a more human-readable form, a tap event
display struct has been introduced. This struct maps public API `TapEvent`s
directly to a private `tapEvent`. This struct offers a flatter JSON structure
than the protobuf JSON rendering. It can also format certain fields, such as
addresses, better than the JSON protobuf marshaler.

Closes #3390

**Default**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web
req id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics
rsp id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs
end id=5:0 proxy=in  src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B
```

**Wide**:
```
➜  linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide
req id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd
rsp id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd
end id=6:0 proxy=in  src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd
```

**JSON**:
*Edit: Flattened `Method` and `Scheme` formatting*
```
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "ip-masq-agent",
      "namespace": "kube-system",
      "pod": "ip-masq-agent-4d5s9",
      "serviceaccount": "ip-masq-agent",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "requestInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "method": "GET",
    "scheme": "",
    "authority": "10.60.1.49:9994",
    "path": "/ready"
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "daemonset": "calico-node",
      "namespace": "kube-system",
      "pod": "calico-node-bbrjq",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseInitEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 644820
    },
    "httpStatus": 200
  }
}
{
  "source": {
    "ip": "10.138.0.28",
    "port": 47078,
    "metadata": {
      "deployment": "calico-typha",
      "namespace": "kube-system",
      "pod": "calico-typha-59cb487c49-8247r",
      "pod_template_hash": "59cb487c49",
      "serviceaccount": "calico-sa",
      "tls": "not_provided_by_remote"
    }
  },
  "destination": {
    "ip": "10.60.1.49",
    "port": 9994,
    "metadata": {
      "control_plane_ns": "linkerd",
      "deployment": "linkerd-web",
      "namespace": "linkerd",
      "pod": "linkerd-web-6988999458-c6wpw",
      "pod_template_hash": "6988999458",
      "serviceaccount": "linkerd-web"
    }
  },
  "routeMeta": null,
  "proxyDirection": "INBOUND",
  "responseEndEvent": {
    "id": {
      "base": 0,
      "stream": 0
    },
    "sinceRequestInit": {
      "nanos": 790898
    },
    "sinceResponseInit": {
      "nanos": 146078
    },
    "responseBytes": 3,
    "grpcStatusCode": 0
  }
}
```

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2019-09-19 09:34:49 -07:00
Alejandro Pedraza 30ecddb965
Fix injector timeout under high load (#3442)
* Fix injector timeout under high load

Fixes #3358

When retrieving a pod owner, we were hitting the k8s API directly because
at injection time the informer might not have been informed about the
existence of the parent object.
Under a large number of injection requests this ended up in the k8s API requests
being throttled, the proxy-injector getting blocked and the webhook requests
timing out.

Now we'll hit the shared informer first, and hit the k8s API only when
the informer doesn't return anything. After a few injection requests for
the same owner, the informer should have been updated.

Testing:

Scale an emoji deployment to 1000 replicas and wait a couple of minutes:

Before:
```bash
# a portion of the pods doesn't get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
109

$ kubectl -n kube-system logs -f kube-apiserver-minikube | grep failing.*timeout
.... (lots of errors)
```

After:
```bash
# all the pods get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
0

$ kubectl -n kube-system logs -f kube-apiserver-minikube | grep failing.*timeout
```
2019-09-18 17:58:38 -05:00
Andrew Seigner c5a85e587c
Update to client-go v12.0.0, forked stern (#3387)
The repo depended on an old version of client-go. It also depended on
stern, which itself depended on an old version of client-go, making
client-go upgrade non-trivial.

Update the repo to client-go v12.0.0, and also replace stern with a
fork.

This fork of stern includes the following changes:
- updated to use Go Modules
- updated to use client-go v12.0.0
- fixed log line interleaving:
  - https://github.com/wercker/stern/issues/96
  - based on:
    - 8723308e46

Fixes #3382

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-10 11:04:29 -07:00