linkerd2

Commit Graph

Author	SHA1	Message	Date
Zahari Dichev	edd7fd203d	Service Mirroring Component (#4028 ) This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally. Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-03-02 21:16:08 +02:00
Mayank Shah	3c3a4a5f5d	cli: Add label selector flag for `stat` (#4040 ) * Update `linkerd-namespace` shorthand to `L` * Add --selector (-l) flag for `stat` Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-02-17 13:40:07 -05:00
Zahari Dichev	6fa9407318	Ensure we get the correct type out of Informer Deletion events (#4034 ) Ensure we get what we expect when receiving DELETE events from the k8s Informer api Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-02-15 10:15:24 +02:00
Alex Leong	ec51434eb9	Show traffic split metrics from sources in all namespaces (#3967 ) Fixes #3562 When a pod in one namespace sends traffic to a service which is the apex of a traffic split in another namespace, that traffic is not displayed in the `linkerd stat trafficsplit` output. This is because when we do a Prometheus query for traffic to the traffic split, we supply a Prometheus label selector to only select traffic sources in the namespace of the traffic split. Since any pod in any namespace can send traffic to the apex service of a traffic split, we must look at all possible sources of traffic, not just the ones in the same namespace. Before: ``` $ bin/linkerd stat ts NAME APEX LEAF WEIGHT SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 webapp-split webapp webapp 900m - - - - - webapp-split webapp webapp-2 100m - - - - - ``` After: ``` $ bin/linkerd stat ts NAME APEX LEAF WEIGHT SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 webapp-split webapp webapp 900m 80.00% 1.4rps 31ms 99ms 2530ms webapp-split webapp webapp-2 100m 60.00% 0.2rps 35ms 93ms 99ms ``` Signed-off-by: Alex Leong <alex@buoyant.io>	2020-02-12 09:21:59 -08:00
Alejandro Pedraza	3ba66f6f9d	Fix flakey TestGetProfiles (#3965 ) Fixes #3332 Fixes the very rare test failure ``` --- FAIL: TestGetProfiles (0.33s) --- FAIL: TestGetProfiles/Returns_server_profile (0.11s) server_test.go:228: Expected 1 or 2 updates but got 3: [retry_budget:<retry_ratio:0.2 min_retries_per_second:10 ttl:<seconds:10 > > routes:<condition:<path:<regex:"/a/b/c" > > metrics_labels:<key:"route" value:"route1" > timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2 min_retries_per_second:10 ttl:<seconds:10 > > routes:<condition:<path:<regex:"/a/b/c" > > metrics_labels:<key:"route" value:"route1" > timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2 min_retries_per_second:10 ttl:<seconds:10 > > ] FAIL FAIL github.com/linkerd/linkerd2/controller/api/destination 0.624s ``` that occurs when a third unexpected stream update occurs, when the fake API takes more time to notify its listeners about the resources created. For all the nasty details check #3332	2020-02-07 19:43:29 -05:00
Alejandro Pedraza	afb93cddc8	Use `t.Name()` instead of `t.Name` in tests (#3970 ) Use `t.Name()` instead of `t.Name` when retrieving the name of tests. This was causing an error to be added in the log: ``` output: logrus_error="can not add field \"test\" ``` Followup to [comment](https://github.com/linkerd/linkerd2/pull/3965#discussion_r370387990)	2020-01-27 09:17:19 -05:00
Paul Balogh	dabee12b93	Fix issue for debug containers when using custom Docker registry (#3873 ) Subject Fixes bug where override of Docker registry was not being applied to debug containers (#3851) Problem Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image. Solution This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images. This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option. Validation Several new unit tests have been created to confirm functionality. In addition, the following workflows were run through: ### Standard Workflow with Custom Registry This workflow installs Linkerd control plane based upon a custom registry, then injecting the debug sidecar into a service. * Start with a k8s instance having no Linkerd installation * Build all images locally using `bin/docker-build` * Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6` * Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -` * Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section * Request injection of the debug image into an available container. 
I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -` * Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap * Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f` * Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container. ### Overriding the Custom Registry Override at Injection This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection. * “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment * Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -` * Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry. Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image. Of note, the proxy and proxy-init images are still running the “original” override images. * As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container. ### Standard Workflow with Default Registry This workflow is the typical workflow which utilizes the standard Linkerd image registry. 
* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/ * Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -` * Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -` * Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section * Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -` * Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f` * Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container. ### Overriding the Default Registry at Injection This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection. * “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment * Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -` * Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry. Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image. 
Of note, the proxy and proxy-init images are still running the “original” override images. * As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container. Fixes issue #3851 Signed-off-by: Paul Balogh javaducky@gmail.com	2020-01-17 10:18:03 -08:00
Alex Leong	93a81dce97	Change default proxy log level to "warn,linkerd=info" (#3908 ) Fixes #3901 Signed-off-by: Alex Leong <alex@buoyant.io>	2020-01-09 14:22:06 -08:00
Paul Balogh	2cd2ecfa30	Enable mixed configuration of skip-[inbound|outbound]-ports (#3766 ) * Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752) * included tests for generated output given proxy-ignore configuration options * renamed "validate" method to "parseAndValidate" given mutation * updated documentation to denote inclusiveness of ranges * Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766) This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer. * Bump versions for proxy-init to v1.3.0 Signed-off-by: Paul Balogh <javaducky@gmail.com>	2019-12-20 09:32:13 -05:00
Alex Leong	03762cc526	Support pod ip and service cluster ip lookups in the destination service (#3595 ) Fixes #3444 Fixes #3443 ## Background and Behavior This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter. It returns the stream of endpoints, just as if `Get` had been called with the service's authority. This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections. When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity. Prior to this change, attempting to look up an ip address in the destination service would result in an `InvalidArgument` error. Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`. ## Implementation We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip. `IPWatcher` maintains a mapping of clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`. Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response. 
## Testing This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster: ``` go run controller/cmd/main.go destination -kubeconfig ~/.kube/config ``` Then lookups can be issued using the destination client: ``` go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086 ``` Service cluster ips and pod ips can be used as the `path` argument. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-12-19 09:25:12 -08:00
Dax McDonald	3088f404ce	Upgrade prometheus to v1.2.1 (#3541 ) Signed-off-by: Dax McDonald <dax@rancher.com>	2019-12-11 15:26:16 -08:00
Sergio C. Arteaga	cee8e3d0ae	Add CronJobs and ReplicaSets to dashboard and CLI (#3687 ) This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. Closes #3614 Closes #3630 Closes #3584 Closes #3585 Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com	2019-12-11 10:02:37 -08:00
Zahari Dichev	e5f75a8c3d	Add validation to ensure stat time window is at least 15s (#3720 ) * Add stat time window minimum of 10s Signed-off-by: zaharidichev <zaharidichev@gmail.com> * Address comments Signed-off-by: zaharidichev <zaharidichev@gmail.com>	2019-12-04 08:12:01 +02:00
Alex Leong	0026103362	Unit and integration test fixups (#3730 ) - Added cleanup step at the end of all integration tests. - Disable external_issuer_integration_tests in cloud_tests due to namespace issue. Running this via `kind` tests is sufficient for now. - Set a flakey test to `Skip`, relates to #3332. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-11-15 03:40:42 -08:00
Tarun Pothulapati	f3deee01b6	Trace Control plane Components with OC (#3495 ) * add trace flags and initialisation * add ocgrpc handler to newgrpc * add ochttp handler to linkerd web * add flags to linkerd web * add ochttp handler to prometheus handler initialisation * add ochttp clients for components * add span for prometheus query * update godep sha * fix reviews * better commenting * add err checking * remove sampling * add check in main * move to pkg/trace Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2019-10-18 12:19:13 -07:00
Alex Leong	3dcff52b9f	Switch from using golangci fmt to using goimports (#3555 ) CI currently enforces formatting rules by using the fmt linter of golang-ci-lint which is invoked from the bin/lint script. However it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting. This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint depending on which formatter you use and which specific revision of that formatter you use. In this change we stop using golang-ci-lint for format checking. We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files. This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project. We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting. Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`: * goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor directory as well as the generated Go sources * goimports returns a 0 exit code, even when formatting errors are detected Signed-off-by: Alex Leong <alex@buoyant.io>	2019-10-16 13:56:11 -07:00
Johannes Hansen	f880e71fcd	The linkerd proxy does not work with headless services (#3470 ) * The linkerd proxy does not work with headless services (i.e. endpoints not referencing a pod). Changed endpoints_watcher to also return endpoints with no targetref. Fixes #3308 Signed-off-by: Johannes Hansen <johannesh1980@gmail.com> * Fix panic in endpoint_translator Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>	2019-10-15 14:56:41 -07:00
Alejandro Pedraza	3de35ccc58	Remove Discovery service leftovers (#3500 ) Followup to #2990, which refactored `linkerd endpoints` to use the `Destination.Get` API instead of the `Discovery.Endpoints` API, leaving the Discovery with no implemented methods. This PR removes all the Discovery code leftovers. Fixes #3499	2019-10-15 11:20:21 -05:00
Kevin Leimkuhler	a3a240e0ef	Add TapEvent headers and trailers to the tap protobuf (#3410 ) ### Motivation In order to expose arbitrary headers through tap, headers and trailers should be read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s. This change should have no user facing changes as it just prepares the events for JSON output in linkerd/linkerd2#3390 ### Solution The public API has been updated with a headers field for `TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers field for `TapEvent_Http_ResponseEnd_`. These values are set by reading the corresponding fields off of the proxy's tap events. The proto changes are equivalent to the proto changes proposed in linkerd/linkerd2-proxy-api#33 Closes #3262 Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2019-09-29 09:54:37 -07:00
Alex Leong	4799baa8e2	Revert "Trace Control Plane components using OC (#3461 )" (#3484 ) This reverts commit `edd3b1f6d4`. This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy. We will reintroduce this change once the config plan is straightened out. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-09-26 11:56:44 -07:00
Tarun Pothulapati	edd3b1f6d4	Trace Control Plane components using OC (#3461 ) * add exporter config for all components Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add cmd flags wrt tracing Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add ochttp tracing to web server Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add flags to the tap deployment Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add trace flags to install and upgrade command Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add linkerd prefix to svc names Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add ochttp transport to API Internal Client Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * fix goimport linting errors Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add ochttp handler to tap http server Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * review and fix tests Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update test values Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * use common template Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update tests Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * use Initialize Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * fix sample flag Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add verbose info reg flags Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2019-09-26 08:11:48 -07:00
Kevin Leimkuhler	c62c90870e	Add JSON output to tap command (#3434 ) Replaces #3411 ### Motivation It is a little tough to filter/read the current tap output. As headers are being added to tap, the output is starting to get difficult to consume. Take a peek at #3262 for an example. It would be nice to have some more machine readable output that can be sliced and diced with tools such as jq. ### Solution A new output option has been added to the `linkerd tap` command that returns the JSON encoding of tap events. The default output is line oriented; `-o wide` appends the request's target resource type to the line-oriented tap events. In order to display certain values in a more human readable form, a tap event display struct has been introduced. This struct maps public API `TapEvent`s directly to a private `tapEvent`. This struct offers a flatter JSON structure than the protobuf JSON rendering. It also can format certain fields, such as addresses, better than the JSON protobuf marshaler. Closes #3390 Default: ``` ➜ linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web req id=5:0 proxy=in src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/metrics rsp id=5:0 proxy=in src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=3366µs end id=5:0 proxy=in src=10.1.6.146:36976 dst=10.1.6.148:9994 tls=not_provided_by_remote duration=132µs response-length=1505B ``` Wide: ``` ➜ linkerd2 git:(kleimkuhler/tap-json-output) linkerd -n linkerd tap deploy/linkerd-web -o wide req id=6:0 proxy=in src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :method=GET :authority=10.1.6.148:9994 :path=/ping dst_res=deploy/linkerd-web dst_ns=linkerd rsp id=6:0 proxy=in src=10.1.0.1:35394 dst=10.1.6.148:9994 tls=not_provided_by_remote :status=200 latency=1442µs dst_res=deploy/linkerd-web dst_ns=linkerd end id=6:0 proxy=in src=10.1.0.1:35394 dst=10.1.6.148:9994 
tls=not_provided_by_remote duration=88µs response-length=5B dst_res=deploy/linkerd-web dst_ns=linkerd ``` JSON: Edit: Flattened `Method` and `Scheme` formatting ``` { "source": { "ip": "10.138.0.28", "port": 47078, "metadata": { "daemonset": "ip-masq-agent", "namespace": "kube-system", "pod": "ip-masq-agent-4d5s9", "serviceaccount": "ip-masq-agent", "tls": "not_provided_by_remote" } }, "destination": { "ip": "10.60.1.49", "port": 9994, "metadata": { "control_plane_ns": "linkerd", "deployment": "linkerd-web", "namespace": "linkerd", "pod": "linkerd-web-6988999458-c6wpw", "pod_template_hash": "6988999458", "serviceaccount": "linkerd-web" } }, "routeMeta": null, "proxyDirection": "INBOUND", "requestInitEvent": { "id": { "base": 0, "stream": 0 }, "method": "GET", "scheme": "", "authority": "10.60.1.49:9994", "path": "/ready" } } { "source": { "ip": "10.138.0.28", "port": 47078, "metadata": { "daemonset": "calico-node", "namespace": "kube-system", "pod": "calico-node-bbrjq", "serviceaccount": "calico-sa", "tls": "not_provided_by_remote" } }, "destination": { "ip": "10.60.1.49", "port": 9994, "metadata": { "control_plane_ns": "linkerd", "deployment": "linkerd-web", "namespace": "linkerd", "pod": "linkerd-web-6988999458-c6wpw", "pod_template_hash": "6988999458", "serviceaccount": "linkerd-web" } }, "routeMeta": null, "proxyDirection": "INBOUND", "responseInitEvent": { "id": { "base": 0, "stream": 0 }, "sinceRequestInit": { "nanos": 644820 }, "httpStatus": 200 } } { "source": { "ip": "10.138.0.28", "port": 47078, "metadata": { "deployment": "calico-typha", "namespace": "kube-system", "pod": "calico-typha-59cb487c49-8247r", "pod_template_hash": "59cb487c49", "serviceaccount": "calico-sa", "tls": "not_provided_by_remote" } }, "destination": { "ip": "10.60.1.49", "port": 9994, "metadata": { "control_plane_ns": "linkerd", "deployment": "linkerd-web", "namespace": "linkerd", "pod": "linkerd-web-6988999458-c6wpw", "pod_template_hash": "6988999458", "serviceaccount": 
"linkerd-web" } }, "routeMeta": null, "proxyDirection": "INBOUND", "responseEndEvent": { "id": { "base": 0, "stream": 0 }, "sinceRequestInit": { "nanos": 790898 }, "sinceResponseInit": { "nanos": 146078 }, "responseBytes": 3, "grpcStatusCode": 0 } } ``` Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2019-09-19 09:34:49 -07:00
Bruno M. Custódio	8fec756395	Add '--address' flag to 'linkerd dashboard'. (#3274 ) Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>	2019-09-05 10:56:10 -07:00
Alejandro Pedraza	acbab93ca8	Add support for k8s 1.16 (#3364 ) Fixes #3356 1.16 removes some api groups that were already deprecated. From k8s blog post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/): ``` - PodSecurityPolicy: will no longer be served from extensions/v1beta1 in v1.16. Migrate to the policy/v1beta1 API, available since v1.10. Existing persisted data can be retrieved/updated via the policy/v1beta1 API. - DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16. Migrate to the apps/v1 API, available since v1.9. Existing persisted data can be retrieved/updated via the apps/v1 API. ``` Previous PRs had already made this change at the Helm templates level, but we still needed to do it at the API calls and tests. The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16 because the upgrade integration test tries to install linkerd 2.5 which is not compatible with 1.16. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-09-04 09:59:55 -05:00
陈谭军	e281fb3410	fix-up grammar (#3351 ) Signed-off-by: chentanjun <2799194073@qq.com>	2019-08-30 08:09:36 -07:00
Alejandro Pedraza	fd248d3755	Undo refactoring from #3316 (#3331 ) Thus fixing `linkerd edges` and the dashboard topology graph Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-08-29 13:37:54 -05:00
Alejandro Pedraza	5d7499dc84	Avoid the dashboard requesting stats when not needed (#3338 ) * Avoid the dashboard requesting stats when not needed Create an alternative to `urlsForResource` called `urlsForResourceNoStats` that makes use of the `skip_stats` parameter in the stats API (created in #1871) that doesn't query Prometheus when not needed. When testing using the dashboard looking at the linkerd namespace, queries per second went down from 2874 to 2756, a 4% decrease. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-08-29 05:52:44 -05:00
arminbuerkle	5c38f38a02	Allow custom cluster domains in remaining backends (#3278 ) * Set custom cluster domain in GetServiceProfileFor * Set custom cluster domain in tap server Move fetching cluster domain for tap server to cmd main * Handle fetching cluster domain errors separately * Use custom cluster domain for traffic split adaptor Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>	2019-08-27 10:01:36 -07:00
Alejandro Pedraza	02efb46e45	Have the proxy-injector emit events upon injection/skipping injection (#3316 ) * Have the proxy-injector emit events upon injection/skipping injection Fixes #3253 Have the proxy-injector emit an event whenever an injection happens, or when injection is skipped for some reason (also added that reason into the proxy-injector logs). The level is associated to the parent workload (it can't be associated to the pod because at this point the pod hasn't been persisted). The event recorder was set up at the `webhook/server.go` level and passed to the proxy-injector's `Inject` function. The sp-validator thus also has access to the event recorder, but for now it's not using it. Related changes: - Refactored `api.GetOwnerKindAndName()` to have it return a more generic object. - Refactored `report.Injectable()` to also have it return the reason why a workload is not injectable. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-08-26 13:34:36 -05:00
Carol A. Scott	089836842a	Add unit test for edges API endpoint (#3306 ) Fixes #3052. Adds a unit test for the edges API endpoint. To maintain a consistent order for testing, the returned rows in api/public/edges.go are now sorted.	2019-08-23 09:28:02 -07:00
Guangming Wang	70d85d2065	Cleanup: fix some typos in code comment (#3296 ) Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>	2019-08-21 09:40:43 -07:00
Oliver Gould	ee79d5d324	destination: Reorganize authority-parsing (#3244 ) In preparation for #3242, the destination controller will need to support a broader set of valid authorities including IP addresses. This change modifies the destination controller's authority-parsing code so that the is-this-a-kubernetes-service-name decision is decoupled from parsing of authorities into their constituent parts. The `Get` API now explicitly handles IP address names, though it currently fails all such resolutions.	2019-08-21 07:19:42 -07:00
Carol A. Scott	bc8fef7ba9	Sorting the expected response for trafficsplit rows so it is always in consistent row order (#3280 )	2019-08-19 10:10:26 -07:00
Carol A. Scott	9c62b65c6a	Adding trafficsplit test to stat_summary_test.go (#3252 ) This PR adds a test for trafficsplits to stat_summary_test.go. Because the test requires a consistent order for returned rows, trafficsplit rows in stat_summary.go are now sorted by apex + leaf name before being returned.	2019-08-14 14:48:46 -07:00
Kevin Leimkuhler	cc3c53fa73	Remove tap from public API and associated test infrastructure (#3240 ) ### Summary After the addition of the tap APIServer, all the logic related to tap in the public API no longer needs to be there. The servers and clients that are created but not used, as well as all the old testing infrastructure related to tap can be removed. This deprecates TapByResource and therefore required an update to the protobuf files with `bin/protoc-go.sh`. While the change to deprecate this method was extremely small, a lot of protobuf files were updated in the process. These changes to the code and protobuf files should probably remain coupled since `TapByResource` is officially deprecated in the public API, but a majority of the additions/deletions are related to those files. This draft passes `go test` as well as a local run of the integration tests. Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2019-08-14 17:27:37 -04:00
Carol A. Scott	00437709eb	Add trafficsplit metrics to CLI (#3176 ) This PR adds `trafficsplit` as a supported resource for the `linkerd stat` command. Users can type `linkerd stat ts` to see the apex and leaf services of their trafficsplits, as well as metrics for those leaf services.	2019-08-14 10:30:57 -07:00
Andrew Seigner	f98bc27a38	Fix invalid `l5d-require-id` for some tap requests (#3210 ) PR #3154 introduced an `l5d-require-id` header to Tap requests. That header string was constructed based on the TapByResourceRequest, which includes 3 notable fields (type, name, namespace). For namespace-level requests (via commands like `linkerd tap ns linkerd`), type == `namespace`, name == `linkerd`, and namespace == "". This special casing for namespace-level requests yielded invalid `l5d-require-id` headers, for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`. Fix `l5d-require-id` string generation to account for namespace-level requests. The bulk of this change is tap unit test updates to validate the fix. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-08-08 09:42:11 -07:00
Andrew Seigner	a59c1dd32d	Introduce tap APIService, update `linkerd tap` (#3167 ) The Tap Service enabled tapping of any meshed pod, regardless of user privilege. This change introduces a new Tap APIService. Kubernetes provides authentication and authorization of Tap requests, and then forwards requests to a new Tap APIServer, which implements a Kubernetes aggregated APIServer. The Tap APIServer authenticates the client TLS from Kubernetes, and authorizes the user via a SubjectAccessReview. This change also modifies the `linkerd tap` command to make requests against the new APIService. The Tap APIService implements these Kubernetes-style endpoints: POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap GET /apis GET /apis/tap.linkerd.io GET /apis/tap.linkerd.io/v1alpha1 GET /healthz GET /healthz/log GET /healthz/ping GET /metrics GET /openapi/v2 GET /version Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the `watch` verb is supported. Access is also available via subresources such as `deployments/tap` and `pods/tap`. This change introduces the following resources into the default Linkerd install: - Global - APIService/v1alpha1.tap.linkerd.io - ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator - `linkerd` namespace: - Secret/linkerd-tap-tls - `kube-system` namespace: - RoleBinding/linkerd-linkerd-tap-auth-reader Tasks not covered by this PR: - `linkerd top` - `linkerd dashboard` - `linkerd profile --tap` - removal of the unauthenticated tap controller Fixes #2725, #3162, #3172 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-08-01 14:02:45 -07:00
Alex Leong	ab7226cbcd	Return invalid argument for external name services (#3120 ) Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498 When the Linkerd proxy sends a query for a Kubernetes external name service to the destination service, the destination service returns `NoEndpoints: exists=false` because an external name service has no endpoints resource. Due to a change in the proxy's fallback logic, this no longer causes the proxy to fall back to either DNS or SO_ORIG_DST and instead fails the request. The net effect is that Linkerd fails all requests to external name services. We change the destination service to instead return `InvalidArgument` for external name services. This causes the proxy to fall back to SO_ORIG_DST instead of failing the request. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-29 16:31:22 -07:00
Andrew Seigner	51b33ad53c	Fix nil pointer dereference in endpoints watcher (#3147 ) The destination service's endpoints watcher assumed every `Endpoints` object contained a `TargetRef`. This field is optional, and in cases such as the default `ep/kubernetes` object, `TargetRef` is nil, causing a nil pointer dereference. Fix endpoints watcher to check for `TargetRef` prior to dereferencing. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-25 17:11:56 -07:00
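The nil check described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual linkerd2 code: `objectReference`, `endpointAddress`, and `podNameOf` are hypothetical stand-ins for the Kubernetes types and the watcher logic.

```go
package main

import "fmt"

// objectReference and endpointAddress are hypothetical stand-ins for the
// Kubernetes API types; only the fields needed for the example are modeled.
type objectReference struct {
	Kind string
	Name string
}

type endpointAddress struct {
	IP        string
	TargetRef *objectReference // optional: nil for objects such as ep/kubernetes
}

// podNameOf returns the name of the pod backing an address. It returns false
// when the address has no TargetRef or when the reference is not to a Pod,
// so the caller never dereferences a nil pointer.
func podNameOf(addr endpointAddress) (string, bool) {
	if addr.TargetRef == nil {
		return "", false
	}
	if addr.TargetRef.Kind != "Pod" {
		return "", false
	}
	return addr.TargetRef.Name, true
}

func main() {
	addrs := []endpointAddress{
		{IP: "10.0.0.1", TargetRef: &objectReference{Kind: "Pod", Name: "web-0"}},
		{IP: "10.0.0.2"}, // no TargetRef set
	}
	for _, a := range addrs {
		if name, ok := podNameOf(a); ok {
			fmt.Printf("%s -> pod %s\n", a.IP, name)
		} else {
			fmt.Printf("%s -> no pod reference\n", a.IP)
		}
	}
}
```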
Alex Leong	3c4a0e4381	Make authorities in destination overrides absolute (#3137 ) Fixes #3136 When the destination service sends a destination profile with a traffic split to the proxy, the override destination authorities are absolute but do not contain a trailing dot. e.g. "bar.ns.svc.cluster.local:80". However, NameAddrs which have undergone canonicalization in the proxy will include the trailing dot. When a traffic split includes the apex service as one of the overrides, the original apex NameAddr will have the trailing dot and the override will not. Since these two NameAddrs are not identical, they will go into two distinct slots in the proxy's concrete dst router. This will cause two services to be created for the same destination which will cause the stats clobbering described in the linked issue. We change the destination service to always return absolute dst overrides including the trailing dot. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-24 17:08:40 -07:00
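The trailing-dot normalization described in the commit above could look roughly like this. It is a sketch only: `absoluteAuthority` is a hypothetical helper name, and the real destination service may build its override authorities differently. The example uses only `net.SplitHostPort` and `net.JoinHostPort` from the Go standard library.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// absoluteAuthority is a hypothetical helper that makes the host part of an
// authority absolute by appending a trailing dot when one is missing.
func absoluteAuthority(authority string) string {
	host, port, err := net.SplitHostPort(authority)
	if err != nil {
		// No port could be split off: treat the whole string as the host.
		if strings.HasSuffix(authority, ".") {
			return authority
		}
		return authority + "."
	}
	if !strings.HasSuffix(host, ".") {
		host += "."
	}
	return net.JoinHostPort(host, port)
}

func main() {
	// The override without the dot and the canonicalized name with the dot
	// now map to the same string, so they would share one router slot.
	fmt.Println(absoluteAuthority("bar.ns.svc.cluster.local:80"))
	fmt.Println(absoluteAuthority("bar.ns.svc.cluster.local.:80"))
}
```

Because the helper is idempotent, applying it to an already absolute authority leaves it unchanged.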
Alex Leong	e538a05ce2	Add support for stateful sets (#3113 ) We add support for looking up individual pods in a stateful set with the destination service. This allows Linkerd to correctly proxy requests which address individual pods. The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`. Fixes #2266 Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-24 14:09:46 -07:00
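As an illustration of the authority structure named in the commit above, the following sketch splits `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>` into its parts. The `podAuthority` type and `parsePodAuthority` function are hypothetical, and the sketch assumes the default `cluster.local` domain; it is not the destination service's actual parser.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// podAuthority is a hypothetical type holding the parts of an authority that
// addresses one pod of a stateful set.
type podAuthority struct {
	Pod       string
	Service   string
	Namespace string
	Port      string
}

// parsePodAuthority parses <pod>.<service>.<namespace>.svc.cluster.local:<port>.
// It assumes the default cluster domain and rejects anything else.
func parsePodAuthority(authority string) (podAuthority, error) {
	host, port, err := net.SplitHostPort(authority)
	if err != nil {
		return podAuthority{}, err
	}
	labels := strings.Split(strings.TrimSuffix(host, "."), ".")
	if len(labels) != 6 || labels[3] != "svc" || labels[4] != "cluster" || labels[5] != "local" {
		return podAuthority{}, fmt.Errorf("not a pod authority: %q", authority)
	}
	return podAuthority{
		Pod:       labels[0],
		Service:   labels[1],
		Namespace: labels[2],
		Port:      port,
	}, nil
}

func main() {
	pa, err := parsePodAuthority("web-0.web.default.svc.cluster.local:8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pod=%s service=%s namespace=%s port=%s\n", pa.Pod, pa.Service, pa.Namespace, pa.Port)
}
```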
Andrew Seigner	64ed8e4a74	Introduce Cluster Heartbeat cronjob (#3056 ) `linkerd check`, the web dashboard, and Grafana all perform version checks to validate Linkerd is up to date. It's common for users to seldom execute these codepaths. This makes it difficult to identify what versions of Linkerd are currently in use and what environments it is being run in, which helps prioritize testing and backports. Introduce a `heartbeat` CronJob to the default Linkerd install. The cronjob executes every 24 hours, starting from 5 minutes after `linkerd install` is run. Example check URL: https://versioncheck.linkerd.io/version.json? install-time=1562761177& k8s-version=v1.15.0& meshed-pods=8& rps=3& source=heartbeat& uuid=cc4bb700-3314-426a-9f0f-ec588b9df020& version=git-b97ee9f7 Fixes #2961 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-23 17:12:30 -07:00
Alex Leong	d6ef9ea460	Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078 ) The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller. Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object. To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2. This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource. If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can. This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start. Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation). In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-23 11:46:31 -07:00
arminbuerkle	010efac24b	Allow custom cluster domain in controller components (#2950 ) * Allow custom cluster domain in destination watcher The change relaxes the constraints of an authority requiring a `svc.cluster.local` suffix to only require `svc` as the third part. A unit test could be added though the destination/server and endpoint watcher already test this behaviour. * Update proto to allow setting custom cluster domain Update golden templates * Allow setting custom domain in grpc, web server * Remove cluster domain flags from web srv and public api * Set defaultClusterDomain in validateAndBuild if none is set Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>	2019-07-23 08:59:41 -07:00
Alex Leong	c8b34a8cab	Add pod status to linkerd check (#3065 ) When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff). We add pod status to the output to make this more immediately obvious. Fixes #2877 Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-18 15:56:19 -07:00
Carol A. Scott	ee1a111993	Updating CLI output for `linkerd edges` (#3048 ) This PR improves the CLI output for `linkerd edges` to reflect the latest API changes. Source and destination namespaces for each edge are now shown by default. The `MSG` column has been replaced with `Secured` and contains a green checkmark or the reason for no identity. A new `-o wide` flag shows the identity of client and server if known.	2019-07-17 12:23:34 -07:00
Jonathan Juares Beber	2dcbde08b3	Show pod status more clearly (#1967 ) (#2989 ) During operations with `linkerd stat`, the actual pod status is sometimes unclear. This commit introduces a method in the `k8s` package that gets the pod status, based on [`kubectl` logic](`33a3e325f7/pkg/printers/internalversion/printers.go (L558-L640)`), to expose the `STATUS` column for pods. It also changes the stat command in the `cli` package, adding a column when the resource type is a Pod. Fixes #1967 Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>	2019-07-10 12:44:44 -07:00
Jonathan Juares Beber	e2211f5f77	Introduces owner references verification for pods (#3027 ) When getting pods for specific Kubernetes resources, using only labels as a selector produces incorrect output. Since two resources can use the same label selector while managing distinct pods, a new mechanism to check pods for a given resource is needed. More details on #2932. This commit introduces a verification through the pod owner references `UID`s, comparing them with the given resource's. Additional logic is needed when handling `Deployments`, since a Deployment creates a `ReplicaSet` and the latter is the actual pod's owner. No verification is done in case of `Services`. Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>	2019-07-10 12:44:24 -07:00
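The owner-reference check described in the commit above, including the extra hop through a `ReplicaSet` for Deployments, might be sketched like this. The `resource` type and both functions are hypothetical simplifications; real code would work with the Kubernetes object metadata rather than these local structs.

```go
package main

import "fmt"

// resource is a hypothetical, simplified view of a Kubernetes object: its
// name, its own UID, and the UIDs listed in its owner references.
type resource struct {
	Name   string
	UID    string
	Owners []string
}

// ownedBy reports whether any owner reference of r matches one of the UIDs.
func ownedBy(r resource, uids map[string]bool) bool {
	for _, uid := range r.Owners {
		if uids[uid] {
			return true
		}
	}
	return false
}

// podsForDeployment returns the names of pods whose owner is a ReplicaSet
// that is itself owned by the Deployment with the given UID.
func podsForDeployment(deployUID string, replicaSets, pods []resource) []string {
	rsUIDs := map[string]bool{}
	for _, rs := range replicaSets {
		if ownedBy(rs, map[string]bool{deployUID: true}) {
			rsUIDs[rs.UID] = true
		}
	}
	var names []string
	for _, p := range pods {
		if ownedBy(p, rsUIDs) {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	// Two Deployments could share a label selector; UIDs keep their pods apart.
	replicaSets := []resource{
		{Name: "a-rs", UID: "rs-a", Owners: []string{"deploy-a"}},
		{Name: "b-rs", UID: "rs-b", Owners: []string{"deploy-b"}},
	}
	pods := []resource{
		{Name: "a-1", UID: "pod-a1", Owners: []string{"rs-a"}},
		{Name: "b-1", UID: "pod-b1", Owners: []string{"rs-b"}},
	}
	fmt.Println(podsForDeployment("deploy-a", replicaSets, pods))
}
```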
Alex Leong	92ddffa3c2	Add prometheus metrics for watchers (#3022 ) To give better visibility into the inner workings of the kubernetes watchers in the destination service, we add some prometheus metrics. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-08 11:50:26 -07:00

221 Commits