linkerd2

Commit Graph

Author	SHA1	Message	Date
cpretzer	4e92064f3b	Add a flag to install-cni command to configure iptables wait flag (#3066 ) Signed-off-by: Charles Pretzer <charles@buoyant.io>	2019-08-15 12:58:18 -07:00
Andrew Seigner	9a672dd5a9	Introduce `linkerd --as` flag for impersonation (#3173 ) Similar to `kubectl --as`, global flag across all linkerd subcommands which sets a `ImpersonationConfig` in the Kubernetes API config. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-31 16:05:33 -07:00
Andrew Seigner	065dd3ec9d	Add "can create cronjobs" to linkerd check (#3133 ) PR #3056 introduced a cluster heartbeat cronjob to the Linkerd installation. This implies the user installing Linkerd requires the privileges to create CronJobs. Update `linkerd check` to validate the user has privileges necessary to create CronJobs. Fixes #3057 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-26 13:09:41 -07:00
Andrew Seigner	64ed8e4a74	Introduce Cluster Heartbeat cronjob (#3056 ) `linkerd check`, the web dashboard, and Grafana all perform version checks to validate Linkerd is up to date. It's common for users to seldom execute these codepaths. This makes it difficult to identify what versions of Linkerd are currently in use and what environments it is being run in, which helps prioritize testing and backports. Introduce a `heartbeat` CronJob to the default Linkerd install. The cronjob executes every 24 hours, starting from 5 minutes after `linkerd install` is run. Example check URL: https://versioncheck.linkerd.io/version.json? install-time=1562761177& k8s-version=v1.15.0& meshed-pods=8& rps=3& source=heartbeat& uuid=cc4bb700-3314-426a-9f0f-ec588b9df020& version=git-b97ee9f7 Fixes #2961 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-23 17:12:30 -07:00
Alex Leong	d6ef9ea460	Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078 ) The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller. Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object. To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2. This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource. If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can. This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start. Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation). In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-23 11:46:31 -07:00
Alex Leong	c8b34a8cab	Add pod status to linkerd check (#3065 ) When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff). We add pod status to the output to make this more immediately obvious. Fixes #2877 Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-18 15:56:19 -07:00
Andrew Seigner	5d0746ff91	Add NET_RAW to `linkerd check --pre` (#3055 ) `linkerd check --pre` validates that PSPs provide `NET_ADMIN`, but was not validating `NET_RAW`, despite `NET_RAW` being required by Linkerd's proxy-init container since #2969. Introduce a `has NET_RAW capability` check to `linkerd check --pre`. Fixes #3054 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-10 20:28:49 +02:00
Andrew Seigner	7756828ae6	Update install failure message to list resources (#3050 ) The existing `linkerd install` error message for existing resources was shared with `linkerd check`. Given the different contexts, the messaging made more sense for `linkerd check` than for `linkerd install`. Modify the error messaging for `linkerd install` to print a bare list of existing resources, and provide instructions for proceeding. For example: ```bash $ linkerd install Unable to install the Linkerd control plane. It appears that there is an existing installation: clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity If you are sure you'd like to have a fresh install, remove these resources with: linkerd install --ignore-cluster | kubectl delete -f - Otherwise, you can use the --ignore-cluster flag to overwrite the existing global resources. ``` Fixes #3045 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-09 20:21:19 +02:00
Andrew Seigner	94fa653cf3	Fix `linkerd check` missing uuid on version check (#3040 ) PR #2603 modified the web process to read the UUID from the `linkerd-config` ConfigMap rather than from a command line flag. The `linkerd check` command relied on that command line flag to retrieve the UUID as part of its version check. Modify `linkerd check` to correctly retrieve the UUID from `linkerd-config`. Also refactor `linkerd-config` retrieval and parsing code to be shared between healthcheck, install, and upgrade. Relates to #2961 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-05 19:39:13 +02:00
Andrew Seigner	7c87fd4498	Make ReplicaSet check more explicit. (#3017 ) The `linkerd check` for healthy ReplicaSets had a generic `control plane components ready` description, and a hint anchor to `l5d-existence-psp`. While a ReplicaSet failure could definitely occur due to psp, that hintAnchor was already in use by the "control plane PodSecurityPolicies exist" check. Rename the `control plane components ready` check to `control plane replica sets are ready`, and the hintAnchor from `l5d-existence-psp` to `l5d-existence-replicasets`. Relates to https://github.com/linkerd/website/issues/372. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-02 20:02:08 +02:00
Ivan Sim	866fe6fa5e	Introduce global resources checks to install and multi-stage install (#2987 ) * Introduce new checks to determine existence of global resources and the 'linkerd-config' config map. * Update pre-check to check for existence of global resources This ensures that multiple control planes can't be installed into different namespaces. * Update integration test clean-up script to delete psp and crd Signed-off-by: Ivan Sim <ivan@buoyant.io>	2019-06-27 09:59:12 -07:00
Andrew Seigner	2528e3d62d	Make NET_ADMIN check a warning, add PSP check (#2958 ) `linkerd check` validates whether PSP's exist, and if the caller has the `NET_ADMIN` capability. This check was previously failing if `NET_ADMIN` was not found, even in the case where the PSP admission controller was not running. Related, `linkerd install` now includes a PSP, so `linkerd check` should also validate that the caller can create PSP's. Modify the `NET_ADMIN` check to warn, but not fail, if PSP's are found but the caller does not have `NET_ADMIN`. Update the warning message to mention that this is only a problem if the PSP admission controller is running (and will only be a problem during injection, since #2920 handles control plane installation by adding its own PSP). Also introduce a check to validate the caller can create PSP's. Fixes #2884, #2849 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-06-20 17:58:26 +02:00
Alena Varkockova	7973715ee4	Output check result as json (#2666 ) * Add missing file, make linter happy * Changes from code review * Ivan's code review * PR review feedback * style fixes * Last PR comments Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2019-05-20 09:04:28 -07:00
Dennis Adjei-Baah	a0fa1dff59	Move tap service into its own pod. (#2773 ) * Split tap into its own pod in the control plane Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>	2019-05-15 16:28:44 -05:00
mg	b965d0d30e	Introduce pre-install healthcheck for clock skew (#2803 ) * Adding pre-install check for clock skew * Fixing lint error - time.Since * Update test data for clock skew check * Incorporating code review comments * Additional fix - clock skew test Signed-off-by: Matej Gera <matejgera@gmail.com>	2019-05-13 10:14:38 -07:00
Andrew Seigner	ad2f92662e	Fix check/dashboard failing from one pod when HA (#2764 ) The `linkerd check` and `linkerd dashboard` commands validate control plane pods are up via the `LinkerdAPIChecks` category of checks. These checks will fail if a single pod is not ready, even in HA mode. Modify the underlying `validateControlPlanePods` check to return successful if at least one pod per control plane component is ready. Fixes #2554 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-05-03 18:42:21 +02:00
Andrew Seigner	66494591e0	Multi-stage check support (#2765 ) Add support for `linkerd check config`. Validates the existence of the Linkerd Namespace, ClusterRoles, ClusterRoleBindings, ServiceAccounts, and CustomResourceDefinitions. Part of #2337 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-30 17:17:59 +01:00
Andrew Seigner	dec620f818	Fix `linkerd check --proxy` with default ns param (#2754 ) The `linkerd check --proxy` command checks for proxies in all namespaces, if the `--namespace` flag is not set. PR #2747 modified the behavior of `KubernetesAPI.NamespaceExists`. Previously it would succeed if given an empty string for a namespace. Now it fails with a `resource name may not be empty` error (for k8s server `v1.10.11`), or a not found error (for our fake test client). Modify the data plane proxy namespace check to return success if the namespace is not set. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-25 16:37:08 -07:00
Andrew Seigner	ec540a882e	Consolidate k8s APIs (#2747 ) Numerous codepaths have emerged that create k8s configs, k8s clients, and make k8s api requests. This branch consolidates k8s client creation and APIs. The primary change migrates most codepaths to call `k8s.NewAPI` to instantiate a `KubernetesAPI` struct from `pkg`. `KubernetesAPI` implements the `kubernetes.Interface` (clientset) interface, and also persists a `client-go` `rest.Config`. Specific list of changes: - removes manual GET requests from `k8s.KubernetesAPI`, in favor of clientsets - replaces most calls to `k8s.GetConfig`+`kubernetes.NewForConfig` with a single `k8s.NewAPI` - introduces a `timeout` param to `k8s.NewAPI`, currently only used by healthchecks - removes `NewClientSet` in `controller/k8s/clientset.go` in favor of `k8s.NewAPI` - removes `httpClient` and `clientset` from `HealthChecker`, use `KubernetesAPI` instead Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-25 11:31:38 -07:00
Andrew Seigner	b2b4780430	Introduce install stages (#2719 ) This change introduces two named parameters for `linkerd install`, split by privilege: - `linkerd install config` - Namespace - ClusterRoles - ClusterRoleBindings - CustomResourceDefinition - ServiceAccounts - `linkerd install control-plane` - ConfigMaps - Secrets - Deployments - Services Comprehensive `linkerd install` is still supported. TODO: - `linkerd check` support - `linkerd upgrade` support - integration tests Part of #2337 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-23 14:52:34 -07:00
Andrew Seigner	c7bd8167a9	Remove redundant k8s request in linkerd check (#2722 ) PR #2510 introduced some new checks into `linkerd check`. One set of checks was unnecessarily calling `GetPodsByNamespace` twice. Remove the redundant k8s API call. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-19 12:27:34 -07:00
Gaurav Kumar	392cab80fa	Add check for unschedulable pods and psp issues (#2510 ) Fixes #2465 * Add check for unschedulable pods and psp issues (#2465) * Return error reason and message on pod or node failure Signed-off-by: Gaurav Kumar <gaurav.kumar9825@gmail.com>	2019-04-19 11:42:22 -07:00
Alejandro Pedraza	edb225069c	Add validation webhook for service profiles (#2623 ) Add validation webhook for service profiles Fixes #2075 Todo in a follow-up PRs: remove the SP check from the CLI check. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-04-05 16:10:47 -05:00
Kevin Lingerfelt	0d4eb02835	Add identity pod to check, web, and integration tests (#2529 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2019-03-19 20:49:31 -07:00
Oliver Gould	81f645da66	Remove `--tls=optional` and `linkerd-ca` (#2515 ) The proxy's TLS implementation has changed to use a new _Identity_ controller. In preparation for this, the `--tls=optional` CLI flag has been removed from install and inject; and the `ca` controller has been deleted. Metrics and UI treatments for TLS have not been removed, as they will continue to be valuable for the new Identity system. With the removal of the old identity scheme, the Destination service's proxy ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable locality awareness.	2019-03-18 17:40:31 -07:00
Andrew Seigner	024a77ec16	Use correct resource names for authz checks (#2496 ) The `linkerd check` command was using TitleCase resource names (e.g. "ConfigMaps") for SelfSubjectAccessReview requests. These were not valid, they were only passing because SSARs requests return `allowed` for unknown resource types unless explicitly restricted. Modify the `linkerd check` authorization requests to use the correct resource names. Steps to reproduce: - default AKS cluster - running inside a pod ```bash $ kubectl proxy ``` Fails: ```bash $ curl -k -v -XPOST -d'{"kind":"SelfSubjectAccessReview","apiVersion":"authorization.k8s.io/v1","spec":{"resourceAttributes":{"namespace":"default","verb":"create","version":"v1","resource": "Namespace"}}}' -H "Accept: application/json, */*" -H "Content-Type: application/json" http://127.0.0.1:8001/apis/authorization.k8s.io/v1/selfsubjectaccessreviews ... { "kind": "SelfSubjectAccessReview", "apiVersion": "authorization.k8s.io/v1", "metadata": { "creationTimestamp": null }, "spec": { "resourceAttributes": { "namespace": "default", "verb": "create", "version": "v1", "resource": "Namespace" } }, "status": { "allowed": false } } ``` Works: ```bash curl -k -v -XPOST -d'{"kind":"SelfSubjectAccessReview","apiVersion":"authorization.k8s.io/v1","spec":{"resourceAttributes":{"namespace":"default","verb":"create","version":"v1","resource": "namespaces"}}}' -H "Accept: application/json, */*" -H "Content-Type: application/json" http://127.0.0.1:8001/apis/authorization.k8s.io/v1/selfsubjectaccessreviews ... { "kind": "SelfSubjectAccessReview", "apiVersion": "authorization.k8s.io/v1", "metadata": { "creationTimestamp": null }, "spec": { "resourceAttributes": { "namespace": "default", "verb": "create", "version": "v1", "resource": "namespaces" } }, "status": { "allowed": true, "reason": "RBAC: allowed by ClusterRoleBinding \"docker-build\" of ClusterRole \"docker-build\" to ServiceAccount \"docker-build/docker-build\"" } } ``` Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-03-14 10:20:09 -07:00
Andrew Seigner	e5d2460792	Remove single namespace functionality (#2474 ) linkerd/linkerd2#1721 introduced a `--single-namespace` install flag, enabling the control-plane to function within a single namespace. With the introduction of ServiceProfiles, and upcoming identity changes, this single namespace mode of operation is becoming less viable. This change removes the `--single-namespace` install flag, and all underlying support. The control-plane must have cluster-wide access to operate. A few related changes: - Remove `--single-namespace` from `linkerd check`, this motivates combining some check categories, as we can always assume cluster-wide requirements. - Simplify the `k8s.ResourceAuthz` API, as callers no longer need to make a decision based on cluster-wide vs. namespace-wide access. Components either have access, or they error out. - Modify the web dashboard to always assume ServiceProfiles are enabled. Reverts #1721 Part of #2337 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-03-12 00:17:22 -07:00
Andrew Seigner	d90fa16727	Introduce NET_ADMIN cli check (#2421 ) The `linkerd-init` container requires the NET_ADMIN capability to modify iptables. The `linkerd check` command was not verifying this. Introduce a `has NET_ADMIN capability` check, which does the following: 1) Lists all available PodSecurityPolicies, if none found, returns success 2) For each PodSecurityPolicy, validate one exists that: - the user has `use` access AND - provides `*` or `NET_ADMIN` capability A couple limitations to this approach: - It is testing whether the user running `linkerd check` has NET_ADMIN, but during installation time it will be the `linkerd-init` pod that requires NET_ADMIN. - It assumes the presence of PodSecurityPolicies in the cluster means the PodSecurityPolicy admission controller is installed. If the admission controller is not installed, but PSPs exist that restrict NET_ADMIN, `linkerd check` will incorrectly report the user does not have that capability. This PR also fixes the `can create CustomResourceDefinitions` check to not specify a namespace when doing a `create` check, as CRDs are cluster-wide. Fixes #1732 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-03-05 17:19:11 -08:00
Yan	4cd1f99e89	Check kubectl version as part of checks (#2358 ) Fixes #2354 Signed-off-by: Yan Babitski <yan.babitski@gmail.com>	2019-03-01 10:03:59 -08:00
Andrew Seigner	ec5a0ca8d9	Authorization-aware control-plane components (#2349 ) The control-plane components relied on a `--single-namespace` param, passed from `linkerd install` into each individual component, to determine which namespaces they were authorized to access, and whether to support ServiceProfiles. This command-line flag was redundant given the authorization rules encoded in the parent `linkerd install` output, via [Cluster]Role[Binding]s. Modify the control-plane components to query Kubernetes at startup to determine which namespaces they are authorized to access, and whether ServiceProfile support is available. This allows removal of the `--single-namespace` flag on the components. Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD. TODO: - Remove `--single-namespace` flag on `linkerd install`, part of #2164 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-26 11:54:52 -08:00
Andrew Seigner	25e462352d	lint: Enable goimports (#2366 ) goimports checks import lines, adding missing ones and removing unreferenced ones: https://godoc.org/golang.org/x/tools/cmd/goimports It also requires named imports for packages whose import paths don't match their package names: - https://github.com/golang/go/issues/28428 - https://go-review.googlesource.com/c/tools/+/145699/ Also standardized named imports of common Kubernetes packages. Part of #217 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-25 15:51:10 -08:00
Oliver Gould	f7435800da	lint: Enable scopelint (#2364 ) [scopelint][scopelint] detects a nasty reference-scoping issue in loops. [scopelint]: https://github.com/kyoh86/scopelint	2019-02-24 08:59:51 -08:00
Andrew Seigner	7fa7e962cb	Fix hint URLs not displaying for RPC checks (#2361 ) Hint URLs should display for all failed checks in `linkerd check`, but were not displaying for RPC checks. Fix `runCheckRPC` to pass along the hintAnchor to the check result. Also rename the second `can query the control plane API` to `control plane self-check`, as there were two checks with that name. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-23 13:35:32 -08:00
Andrew Seigner	2305974202	Introduce golangci-lint tooling, fixes (#2239 ) `golangci-lint` performs numerous checks on Go code, including golint, ineffassign, govet, and gofmt. This change modifies `bin/lint` to use `golangci-lint`, and replaces usage of golint and govet. Also perform a one-time gofmt cleanup: - `gofmt -s -w controller/` - `gofmt -s -w pkg/` Part of #217 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-13 11:16:28 -08:00
Andrew Seigner	1a60237a94	Update check command hint URLs to new alias (#2245 ) The existing hint URLs printing by `linkerd check` pointed to locations that would change if the linkerd.io website was reorganized. linkerd/website#148 introduces an alias for hint URLs at https://linkerd.io/checks/. This is the corresponding change to update `linkerd check` output. Depends on linkerd/website#148, relates to linkerd/website#146. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-11 11:00:16 -08:00
Alex Leong	3f333c2860	Validate service profiles in all namespaces (#2237 ) Fixes #2220 The service profile validation which is part of `linkerd check` only validates service profiles in the Linkerd namespace. Due to a recent change, service profiles now can exist in any namespace. Update the logic so that service profiles in all namespaces are validated. Additionally: * Relax validation of service profile names to support external names Signed-off-by: Alex Leong <alex@buoyant.io>	2019-02-11 09:52:47 -08:00
Andrew Seigner	907f01fba6	Improve ServiceProfile validation in linkerd check (#2218 ) The `linkerd check` command was doing limited validation on ServiceProfiles. Make ServiceProfile validation more complete, specifically validate: - types of all fields - presence of required fields - presence of unknown fields - recursive fields Also move all validation code into a new `Validate` function in the profiles package. Validation of field types and required fields is handled via `yaml.UnmarshalStrict` in the `Validate` function. This motivated migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml. Fixes #2190	2019-02-07 14:35:47 -08:00
Andrew Seigner	72812baf99	Introduce Discovery API and endpoints command (#2195 ) The Proxy API service lacked introspection of its internal state. Introduce a new gRPC Discovery API, implemented by two servers: 1) Proxy API Server: returns a snapshot of discovery state 2) Public API Server: pass-through to the Proxy API Server Also wire up a new `linkerd endpoints` command. Fixes #2165 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-07 14:02:21 -08:00
Alejandro Pedraza	2a7654ce78	Consolidate timeouts for `linkerd check` (#2191 ) Consolidate timeouts for `linkerd check` - Moved the creation of contexts from inside the methods targeted by the checks into a single place in the runCheck() and runCheckRPC() methods where the context is built using a hard-coded timeout of 30 seconds. - k8s' client-go doesn't allow passing along contexts, but it lets us set the Timeout manually. - Reworded the description for the --wait option. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-02-05 11:38:30 -05:00
Andrew Seigner	3a139d0202	Fix spelling on linkerd check link (#2197 ) Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-02-04 14:00:11 -08:00
Alex Leong	3bd4231cec	Add support for timeouts in service profiles (#2149 ) Fixes #2042 Adds a new field to service profile routes called `timeout`. Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead. If the timeout field is not specified, a default timeout of 10 seconds is used. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-01-30 16:48:55 -08:00
Andrew Seigner	5651e02496	Add hint URLs for all checks (#2159 ) linkerd/website#105 introduced a FAQ page, providing resolutions for all `linkerd check` failures. Update each check to reference its corresponding section in the FAQ. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-30 16:26:57 -08:00
Dennis Adjei-Baah	20efe14fef	Fix issue where linkerd check would panic with replicationcontroller pod name in control plane (#2140 ) When running Linkerd check with a control plane namespace that may contain an additional pod with a replication controller ID for pod names instead of a replicaSet ID, the check command panics because of an "index out of bounds" error. This PR adds a check to make sure that, when parsing pod names during the `checkControllerRunning` healthcheck, we only check for linkerd control plane pods and that the pod name results in four or more substrings when split on '-'. This prevents the check from panicking when encountering a replication controller ID pod name. Fixes #2084 Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>	2019-01-24 10:26:11 -08:00
Risha Mars	950c952d14	Add a less scary message to the user when retries are still in progress (#2141 ) When a failing check is being retried, we show the current err to the user. This can sometimes be unnecessarily alarming, as in the case of the control plane starting up. If a failing check is in the process of being retried, wait to show the final error message until the retries have completed.	2019-01-24 10:24:59 -08:00
Risha Mars	e7556d7edc	Check RoleBindings for specified single namespace only (#2142 ) Previously, we were doing the creation checks for both Roles/RoleBindings and ClusterRoles/ClusterRoleBindings for all namespaces, but in --single-namespace mode, we only need to check that these can be created in the control plane namespace.	2019-01-23 18:04:15 -08:00
Andrew Seigner	c9ac77cd7c	Introduce version consistency checks (#2130 ) Version checks were not validating that the cli version matched the control plane or data plane versions. Add checks via the `linkerd check` command to validate the cli is running the same version as the control and data plane. Also add types around `channel-version` string parsing and matching. A consequence being that during development `version.Version` changes from `undefined` to `dev-undefined`. Fixes #2076 Depends on linkerd/website#101 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-23 16:54:43 -08:00
Alena Varkockova	28f662c9c6	Introduce resource selector and deprecate namespace field for ListPods (#2025 ) * Introduce resource selector and deprecate namespace field for ListPods * Changes from code review * Properly deprecate the field * Do not check for nil * Fix the mockProm usage * Protoc changes revert * Changed from code review Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2019-01-23 10:35:55 -08:00
Andrew Seigner	92f2cd9b63	Update check and inject output (#2087 ) The outputs of the `check` and `inject` commands did not vary much between successful and failed executions, and were a bit verbose and challenging to parse. Reorganize output of `check` and `inject` commands, to provide more output when errors occur, and less output when successful. Specific changes: `linkerd check` - visually group checks by category - introduce `hintURL`'s, to provide doc links when checks fail - add spinners when retrying, remove additional retry lines - colored unicode characters to indicate success/warning/failure `linkerd inject` - modify default output to mirror `kubectl apply` - only output non-successful inject reports - support `--verbose` flag to output all inject reports Fixes #1471, #1653, #1656, #1739 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-16 15:14:14 -08:00
Andrew Seigner	dacd8819ff	Group checkers by category (#2083 ) The linkerd check command organized the various checks via loosely coupled category IDs, category names, and checkers themselves, all with ordering defined by consumers of this code. This change removes category IDs in favor of category names, groups all checkers by category, and enforces ordering at the HealthChecker level. Part of #1471, depends on #2078. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-14 18:01:32 -08:00
Andrew Seigner	04373414ef	Modify all health checks to be specified via enums (#2078 ) The set of health checks to be executed were dependent on a combination of check enums and boolean options. This change modifies the health checks to be governed strictly by a set of enums. Next steps: - tightly couple category IDs to names - tightly couple checks to their parent categories - programmatic control over check ordering Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-14 17:16:15 -08:00
Alejandro Pedraza	281ba37e6d	More granular control on checks made by CLI commands (#2033 ) Have the CLI commands `get`, `routes`, `stat`, `tap`and `top` perform a more limited set of checks Fixes #1854	2019-01-10 09:13:44 -05:00
Andrew Seigner	1c302182ef	Enable lint check for comments (#2023 ) Commit 1: Enable lint check for comments Part of #217. Follow up from #1982 and #2018. A subsequent commit will fix the ci failure. Commit 2: Address all comment-related linter errors. This change addresses all comment-related linter errors by doing the following: - Add comments to exported symbols - Make some exported symbols private - Recommend via TODOs that some exported symbols should move or be removed This PR does not: - Modify, move, or remove any code - Modify existing comments Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-01-02 14:03:59 -08:00
Radu M	07cbfe2725	Fix most golint issues that are not comment related (#1982 ) Signed-off-by: Radu Matei <radu@radu-matei.com>	2018-12-20 10:37:47 -08:00
Kevin Lingerfelt	86e95b7ad3	Disable service profiles in single-namespace mode (#1980 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-12-13 14:37:18 -08:00
Kevin Lingerfelt	fd44896644	Remove namespace definition from --single-namespace installs (#1974 ) * Remove namespace definition from --single-namespace installs * DRY up code in healthcheck.go Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-12-12 14:53:02 -08:00
Kevin Lingerfelt	37ae423bb3	Add linkerd- prefix to all objects in linkerd install (#1920 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-12-04 15:41:47 -08:00
Andrew Seigner	d121071f87	Adjust proxy, Prometheus, and Grafana probes (#1899 ) * Adjust proxy, Prometheus, and Grafana probes High `readinessProbe.initialDelaySeconds` values delayed the controller's readiness by up to 30s, preventing cli commands from succeeding shortly after control plane deployment. Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and Grafana to the default 0s. Also change `linkerd check` controller pod ordering to: controller, prometheus, web, grafana. Detailed probe changes: - proxy - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s - prometheus - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s - decrease `readinessProbe.timeoutSeconds` from 30s to 1s - decrease `livenessProbe.timeoutSeconds` from 30s to 1s - grafana - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s - decrease `readinessProbe.timeoutSeconds` from 30s to 1s - decrease `readinessProbe.failureThreshold` from 10 to 3 - increase `livenessProbe.initialDelaySeconds` from 0s to 30s Fixes #1804 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-12-03 10:41:11 -08:00
Kevin Lingerfelt	4547ba7f0a	Make permission checks non-fatal, add check for CRDs (#1859 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-11-14 10:29:04 -08:00
Alena Varkockova	fda834cf64	Allow retrying control plane API check (#1858 ) Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-11-13 10:52:50 -08:00
Alena Varkockova	38dfc5308f	Make version checks warning (#1844 ) Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-11-09 09:48:14 -08:00
Alex Leong	32d556e732	Improve ergonomics of service profile spec (#1828 ) We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API. * Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition. Doing so is equivalent to combining the fields with an "all" condition. * Rename "responses" to "response_classes" * Change "IsSuccess" to "is_failure" Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-31 12:00:22 -07:00
Alex Leong	82ca821e62	Use fqdn for service profile name (#1808 ) Service profiles must be named in the form `"<service>.<namespace>"`. This is inconsistent with the fully normalized domain name that the proxy sends to the controller. It also does not permit creating service profiles for non-Kubernetes services. We switch to requiring that service profiles must be named with the FQDN of their service. For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`. This change alone is not sufficient for allowing service profile for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services. Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kubernetes. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-29 14:35:42 -07:00
Alex Leong	6cffad277b	Make service profile validation a warning instead of an error (#1807 ) The existence of an invalid service profile causes `linkerd check` to fail. This means that it is not possible to open the Linkerd dashboard with the `linkerd dashboard` command. While service profile validation is useful, it should not lock users out. Add the ability to designate health checks as warnings. A failed warning health check will display a warning output in `linkerd check` but will not affect the overall success of the command. Switch the service profile validation to be a warning. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-26 13:28:10 -07:00
Alex Leong	f549868033	Fix integration test and docker build (#1790 ) Fix broken docker build by moving Service Profile conversion and validation into `/pkg`. Fix broken integration test by adding service profile validation output to `check`'s expected output. Testing done: * `gotest -v ./...` * `bin/docker-build` * `bin/test-run (pwd)/bin/linkerd` Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-19 10:23:34 -07:00
Alex Leong	5210b7b44a	Add check for service profile validation (#1775 ) Add a check to `linkerd check` which validates all service profile resources. In particular it checks: * does the service profile refer to an existent service * is the service profile valid Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-18 16:37:39 -07:00
Kevin Lingerfelt	46c887ca00	Add --single-namespace install flag for restricted permissions (#1721 ) * Add --single-namespace install flag for restricted permissions * Better formatting in install template * Mark --single-namespace and --proxy-auto-inject as experimental * Fix wording of --single-namespace check flag * Small healthcheck refactor Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-10-11 10:55:57 -07:00
Darko Radisic	6fee0f3c2b	Added --context flag to specify the context to use to talk to the Kubernetes apiserver (#1743 ) * Added --context flag to specify the context to use to talk to the Kubernetes apiserver * Fix tests that are failing * Updated context flag description Signed-off-by: Darko Radisic <ffd2subroutine@users.noreply.github.com>	2018-10-08 12:37:35 -07:00
Alena Varkockova	5a853e8990	Use ListPods always for data plane HC (#1701 ) * Use ListPods always for data plane HC * Missing changes in grpc_server.go * Address review comments * Read proxy version from spec Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-10-02 11:45:01 -07:00
Alena Varkockova	8ab9b4981b	Make wait flag configurable for check and dashboard (#1654 ) Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-09-19 10:42:29 -07:00
Alex Leong	e65a9617bd	Add can-i checks to linkerd check --pre (#1644 ) Add checks to `linkerd check --pre` to verify that the user has permission to create: * namespaces * serviceaccounts * clusterroles * clusterrolebindings * services * deployments * configmaps Signed-off-by: Alex Leong <alex@buoyant.io>	2018-09-17 11:31:10 -07:00
Andrew Seigner	c3150d2c90	`linkerd check` sends params on version check (#1642 ) The `linkerd check` command hits https://versioncheck.linkerd.io/version.json to check for the latest Linkerd version. This loses information, as that endpoint is intended to record current version, uuid, and source. Modify `linkerd check` to set `version`, `uuid`, and `source` parameters when performing a version check. Part of #1604. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-09-14 15:39:05 -07:00
Kevin Lingerfelt	f1b3827194	Bump default check retry time to 5 minutes (#1645 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-14 10:58:03 -07:00
Andrew Seigner	7c70531b8e	Add data plane check for metrics Prometheus (#1635 ) The `linkerd check` command was not validating whether data plane proxies were successfully reporting metrics to Prometheus. Introduce a new check that validates data plane proxies are found in Prometheus. This is made possible via the existing `ListPods` endpoint in the public API, which includes an `Added` field, indicating a pod's metrics were found in Prometheus. Fixes #1517 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-09-13 13:02:05 -07:00
Kevin Lingerfelt	b5ff29c8aa	Add data plane check to validate proxy version (#1574 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-04 15:22:38 -07:00
Kevin Lingerfelt	c7a79da89c	Add data plane check to validate proxies are ready (#1570 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-31 15:51:57 -07:00
Risha Mars	136b9cc7c1	Add linkerd check flag to run data plane checks (#1528 ) Adds a --proxy flag to the linkerd check CLI command which will run to-be-implemented data plane checks	2018-08-28 10:16:24 -07:00
Kevin Lingerfelt	4450a7536d	Add --wait flag for CLI check and dashboard commands (#1503 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-22 12:56:42 -07:00
Kevin Lingerfelt	49f6c4c770	Refactor healthcheck init and observe setup (#1502 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-22 12:30:45 -07:00
Kevin Lingerfelt	5fc63cde10	Add check for running pods in control plane namespace (#1498 ) * Add check for running pods in control plane namespace * Better pod validation logic Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-21 14:36:49 -07:00
Kevin Lingerfelt	53cd3b50d5	Add --pre flag for linkerd check command (#1497 ) * Add --pre flag for linkerd check command * Small adjustments to check help text Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-20 17:09:43 -07:00
Kevin Lingerfelt	e97be1f5da	Move all healthcheck-related code to pkg/healthcheck (#1492 ) * Move all healthcheck-related code to pkg/healthcheck * Fix failed check formatting * Better version check wording Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-20 16:50:22 -07:00
Oliver Gould	941cad4a9c	Migrate build infrastructure to linkerd2 (#1298 ) This PR begins to migrate Conduit to Linkerd2: * The proxy has been completely removed from this repo, and is now located at github.com/linkerd/linkerd2-proxy. * A `Dockerfile-proxy` has been added to fetch the most-recently published proxy binary from build.l5d.io. * Proxy-specific protobuf bindings have been moved to github.com/linkerd/linkerd2-proxy-api. * All docker images now use the gcr.io/linkerd-io registry. * `inject` now uses `LINKERD2_PROXY_` environment variables * Go paths have been updated to reflect the new (future) repo location.	2018-07-09 15:38:38 -07:00
Kevin Lingerfelt	fd3cfcb5d9	Move healthcheck proto to separate file, use throughout (#150 ) * Move healthcheck proto to separate file, use throughout Signed-off-by: Kevin Lingerfelt <kl@buoyant.io> * Remove Check message from healthcheck.proto Signed-off-by: Kevin Lingerfelt <kl@buoyant.io> * Standardize healthcheck protobuf import name Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-01-17 11:15:38 -08:00
Phil Calçado	e328db7e87	Adds conduit-api check for status command (#140 ) * Abstract Conduit API client from protobuf interface to add new features Signed-off-by: Phil Calcado <phil@buoyant.io> * Consolidate mock api clients Signed-off-by: Phil Calcado <phil@buoyant.io> * Add simple implementation of healthcheck for conduit api Signed-off-by: Phil Calcado <phil@buoyant.io> * Change NextSteps to FriendlyMessageToUser Signed-off-by: Phil Calcado <phil@buoyant.io> * Add grpc check for status on the client Signed-off-by: Phil Calcado <phil@buoyant.io> * Add simple server-side check for Conduit API Signed-off-by: Phil Calcado <phil@buoyant.io> * Fix feedback from PR Signed-off-by: Phil Calcado <phil@buoyant.io>	2018-01-12 15:35:22 -05:00
Phil Calçado	709de5a7b0	Moves k8s and conduit client code to /pkg (#103 ) * Rename constructor functions from MakeXyz to NewXyz As it is more commonly used in the codebase Signed-off-by: Phil Calcado <phil@buoyant.io> * Make Conduit client depend on KubernetesAPI Signed-off-by: Phil Calcado <phil@buoyant.io> * Move Conduit client and k8s logic to standard go package dir for internal libs Signed-off-by: Phil Calcado <phil@buoyant.io> * Move dependencies to /pkg Signed-off-by: Phil Calcado <phil@buoyant.io> * Make conduit client more testable Signed-off-by: Phil Calcado <phil@buoyant.io> * Remove unused config object Signed-off-by: Phil Calcado <phil@buoyant.io> * Add more test cases for marshalling Signed-off-by: Phil Calcado <phil@buoyant.io> * Move client back to controller Signed-off-by: Phil Calcado <phil@buoyant.io> * Sort imports Signed-off-by: Phil Calcado <phil@buoyant.io>	2018-01-04 10:10:10 -08:00

135 Commits