linkerd2

Author	SHA1	Message	Date
Kevin Leimkuhler	e5b0ea28d4	Add retries to certain `linkerd check` checkers (#4171 ) ## Motivation Testing #4167 has revealed some `linkerd check` failures that occur only because the checks happen too quickly after cluster creation or install. If retried, they pass on the second time. Some checkers already handle this with the `retryDeadline` field. If a checker does not set this field, there is no retry. ## Solution Add retries to the `l5d-existence-replicasets` and `l5d-existence-unschedulable-pods` checks so that these checks do not fail during a chained cluster creation > install > check process.	2020-03-16 13:15:42 -07:00
Christy Jacob	8111e54606	Check for extension server certificate (#4062 ) * Check Extension api server Authentication * Added Checks and tests for extension api-server authentication * Fixed Failing Static Checks * Updated the golden file Signed-off-by: Christy Jacob <christyjacob4@gmail.com>	2020-02-28 13:39:02 -08:00
Zahari Dichev	3538944d03	Unify trust anchors terminology (#4047 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-02-15 10:12:46 +02:00
Mayank Shah	c1b683147a	Update identity to make certs more diagnosable (#3990 ) Update identity controller to make issuer certificates diagnosable if cert validity is causing error - Add expiry time in identity log message - Add current time in identity log message - Emit k8s event with appropriate message Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-02-13 11:21:41 +02:00
Mayank Shah	6c6514f169	cli: Update 'check' command to validate HA configuration (#3942 ) Add check for number of control plane replicas for HA Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-02-07 19:07:11 +02:00
Zahari Dichev	26de5cf650	Trim space when comparing roots between the issuer secret and the config (#3982 ) This fix ensures that we ignore whitespace and newlines when checking that roots match between the Linkerd config map and the issuer secret (in the case of using external issuer + Helm). Fixes: #3907 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-01-28 11:21:01 -08:00
Mayank Shah	5fc83bc1c1	Update "linkerd-version" check (#3975 ) Update check command to throw a warning when it fails to determine latest version Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-01-28 09:41:19 +02:00
Anantha Krishnan	7f026c96f6	Added check for TapAPI service (#3689 ) Added check for TapAPI service Fixes #3462 Added a checker using `kube-aggregator` client Signed-off-by: Ananthakrishnan <kannan4mi3@gmail.com>	2020-01-27 20:07:07 +02:00
Zahari Dichev	deefeeec52	Rename no init container second take (#3972 ) This is a second attempt on #3956 as it got merged in the wrong branch Fixes #3930 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-01-24 12:52:55 -08:00
Mayank Shah	60ac0d5527	Add `as-group` CLI flag (#3952 ) Add CLI flag --as-group that can impersonate group for k8s operations Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-01-22 16:38:31 +02:00
Zahari Dichev	e30b9a9c69	Add checks for CNI plugin (#3903 ) As part of the effort to remove the "experimental" label from the CNI plugin, this PR introduces cni checks to `linkerd check` Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-01-17 12:11:19 +02:00
Zahari Dichev	d259b23e8b	Add check to ensure kube-system has the needed annotations (HA) (#3731 ) Adds a check to ensure kube-system namespace has `config.linkerd.io/admission-webhooks:disabled` Fixes #3721 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-01-10 10:03:13 +02:00
Zahari Dichev	c078b4ff8d	Add hint anchors for tls checks (#3853 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2019-12-20 11:02:02 +02:00
Sergio C. Arteaga	56c8a1429f	Increase the comprehensiveness of check --pre (#3701 ) * Increase the comprehensiveness of check --pre Closes #3224 Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>	2019-12-18 13:27:32 -05:00
Zahari Dichev	f88b55e36e	Tls certs checks (#3813 ) * Added checks for cert correctness * Add warning checks for approaching expiration * Add unit tests * Improve unit tests * Address comments * Address more comments * Prevent upgrade from breaking proxies when issuer cert is overwritten (#3821) * Address more comments * Add gate to upgrade cmd that checks that all proxies roots work with the identity issuer that we are updating to * Address comments * Enable use of upgrade to modify both roots and issuer at the same time Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2019-12-16 14:49:32 -08:00
Alena Varkockova	adb8117d78	Remove redundant serviceprofile check (#3718 ) Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2019-11-14 09:40:36 -05:00
Alejandro Pedraza	4b6254b52e	Replaced `uuid` with `uid` from linkerd-config resource (#3694 ) * Replaced `uuid` with `uid` from linkerd-config resource Fixes #3621 Removed the old `uuid` for identifying linkerd installations, and replaced it with the `uid` property from the `linkerd-config` ConfigMap. I tested that this `uid` remains the same by updating the config and also upgrading linkerd, using both the CLI and Helm. Note that this required granting `linkerd-web` RBAC access to the `linkerd-config` Config. I also added an integration test to verify the stability of the uid.	2019-11-13 13:56:01 -05:00
Sergio C. Arteaga	eff1714a08	Add `linkerd check` to dashboard (#3656 ) `linkerd check` can now be run from the dashboard in the `/controlplane` view. Once the check results are received, they are displayed in a modal in a similar style to the CLI output. Closes #3613	2019-11-12 12:37:36 -08:00
Zahari Dichev	a8170bd634	Add preinstall checks for deletion and creation of secrets (#3639 ) Signed-off-by: zaharidichev <zaharidichev@gmail.com>	2019-10-31 18:01:03 +02:00
Ivan Sim	ff69c29f5e	Add missing package to proxy Dockerfile (#3583 ) * Add missing package to proxy Dockerfile * Fix failing 'check' integration test * Trim whitespaces in certs comparison. Without this change, the integration test would fail because the trust anchor stored in the linkerd-config config map generated by the Helm renderer is stripped of the line breaks. See charts/linkerd2/templates/_config.tpl Signed-off-by: Ivan Sim <ivan@buoyant.io>	2019-10-15 15:51:26 -07:00
Rafael Fernández López	ba14dc3fc7	Health check: check if proxies trust anchors match configuration (#3524 ) * Health check: check if proxies trust anchors match configuration If Linkerd is reinstalled or if the trust anchors are modified while proxies are running on the cluster, they will contain an outdated `LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS` certificate. This changeset adds support for `linkerd check`, so it checks if there is any proxy running on the cluster, and performing the check against the configuration trust anchor. If there's a failure (considered a warning), `linkerd check` will notify the user about what pods are the offenders (and in what namespace each one is), and also a hint to remediate the issue (restarting the pods). * Add integration tests for proxy certificate check Fixes #3344 Signed-off-by: Rafael Fernández López <ereslibre@ereslibre.es>	2019-10-15 11:33:09 -07:00
Alejandro Pedraza	6568929028	Add --disable-heartbeat flag for linkerd install|upgrade (#3439 ) Fixes #278 Add `linkerd install|upgrade --disable-heartbeat` flag, and have `linkerd check` check for the heartbeat's SA only if it's enabled. Also added those flags into the `linkerd upgrade -h` examples. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-09-25 15:53:36 -05:00
Alena Varkockova	d369029909	Emit error when cannot connect to kubernetes (#3327 ) Introduce CategoryError Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2019-09-04 17:34:53 -07:00
Alejandro Pedraza	acbab93ca8	Add support for k8s 1.16 (#3364 ) Fixes #3356 1.16 removes some api groups that were already deprecated. From k8s blog post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/): ``` - PodSecurityPolicy: will no longer be served from extensions/v1beta1 in v1.16. Migrate to the policy/v1beta1 API, available since v1.10. Existing persisted data can be retrieved/updated via the policy/v1beta1 API. - DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16. Migrate to the apps/v1 API, available since v1.9. Existing persisted data can be retrieved/updated via the apps/v1 API. ``` Previous PRs had already made this change at the Helm templates level, but we still needed to do it at the API calls and tests. The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16 because the upgrade integration test tries to install linkerd 2.5 which is not compatible with 1.16. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-09-04 09:59:55 -05:00
陈谭军	e281fb3410	fix-up grammar (#3351 ) Signed-off-by: chentanjun <2799194073@qq.com>	2019-08-30 08:09:36 -07:00
cpretzer	4e92064f3b	Add a flag to install-cni command to configure iptables wait flag (#3066 ) Signed-off-by: Charles Pretzer <charles@buoyant.io>	2019-08-15 12:58:18 -07:00
Andrew Seigner	9a672dd5a9	Introduce `linkerd --as` flag for impersonation (#3173 ) Similar to `kubectl --as`, global flag across all linkerd subcommands which sets a `ImpersonationConfig` in the Kubernetes API config. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-31 16:05:33 -07:00
Andrew Seigner	065dd3ec9d	Add "can create cronjobs" to linkerd check (#3133 ) PR #3056 introduced a cluster heartbeat cronjob to the Linkerd installation. This implies the user installing Linkerd requires the privileges to create CronJobs. Update `linkerd check` to validate the user has privileges necessary to create CronJobs. Fixes #3057 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-26 13:09:41 -07:00
Andrew Seigner	64ed8e4a74	Introduce Cluster Heartbeat cronjob (#3056 ) `linkerd check`, the web dashboard, and Grafana all perform version checks to validate Linkerd is up to date. It's common for users to seldom execute these codepaths. This makes it difficult to identify what versions of Linkerd are currently in use and what environments it is being run in, which helps prioritize testing and backports. Introduce a `heartbeat` CronJob to the default Linkerd install. The cronjob executes every 24 hours, starting from 5 minutes after `linkerd install` is run. Example check URL: https://versioncheck.linkerd.io/version.json? install-time=1562761177& k8s-version=v1.15.0& meshed-pods=8& rps=3& source=heartbeat& uuid=cc4bb700-3314-426a-9f0f-ec588b9df020& version=git-b97ee9f7 Fixes #2961 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-23 17:12:30 -07:00
Alex Leong	d6ef9ea460	Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078 ) The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller. Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object. To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2. This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource. If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can. This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start. Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation). In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates. Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-23 11:46:31 -07:00
Alex Leong	c8b34a8cab	Add pod status to linkerd check (#3065 ) When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff). We add pod status to the output to make this more immediately obvious. Fixes #2877 Signed-off-by: Alex Leong <alex@buoyant.io>	2019-07-18 15:56:19 -07:00
Andrew Seigner	5d0746ff91	Add NET_RAW to `linkerd check --pre` (#3055 ) `linkerd check --pre` validates that PSPs provide `NET_ADMIN`, but was not validating `NET_RAW`, despite `NET_RAW` being required by Linkerd's proxy-init container since #2969. Introduce a `has NET_RAW capability` check to `linkerd check --pre`. Fixes #3054 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-10 20:28:49 +02:00
Andrew Seigner	7756828ae6	Update install failure message to list resources (#3050 ) The existing `linkerd install` error message for existing resources was shared with `linkerd check`. Given the different contexts, the messaging made more sense for `linkerd check` than for `linkerd install`. Modify the error messaging for `linkerd install` to print a bare list of existing resources, and provide instructions for proceeding. For example: ```bash $ linkerd install Unable to install the Linkerd control plane. It appears that there is an existing installation: clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity If you are sure you'd like to have a fresh install, remove these resources with: linkerd install --ignore-cluster | kubectl delete -f - Otherwise, you can use the --ignore-cluster flag to overwrite the existing global resources. ``` Fixes #3045 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-09 20:21:19 +02:00
Andrew Seigner	94fa653cf3	Fix `linkerd check` missing uuid on version check (#3040 ) PR #2603 modified the web process to read the UUID from the `linkerd-config` ConfigMap rather than from a command line flag. The `linkerd check` command relied on that command line flag to retrieve the UUID as part of its version check. Modify `linkerd check` to correctly retrieve the UUID from `linkerd-config`. Also refactor `linkerd-config` retrieval and parsing code to be shared between healthcheck, install, and upgrade. Relates to #2961 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-05 19:39:13 +02:00
Andrew Seigner	7c87fd4498	Make ReplicaSet check more explicit. (#3017 ) The `linkerd check` for healthy ReplicaSets had a generic `control plane components ready` description, and a hint anchor to `l5d-existence-psp`. While a ReplicaSet failure could definitely occur due to psp, that hintAnchor was already in use by the "control plane PodSecurityPolicies exist" check. Rename the `control plane components ready` check to `control plane replica sets are ready`, and the hintAnchor from `l5d-existence-psp` to `l5d-existence-replicasets`. Relates to https://github.com/linkerd/website/issues/372. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-07-02 20:02:08 +02:00
Ivan Sim	866fe6fa5e	Introduce global resources checks to install and multi-stage install (#2987 ) * Introduce new checks to determine existence of global resources and the 'linkerd-config' config map. * Update pre-check to check for existence of global resources This ensures that multiple control planes can't be installed into different namespaces. * Update integration test clean-up script to delete psp and crd Signed-off-by: Ivan Sim <ivan@buoyant.io>	2019-06-27 09:59:12 -07:00
Andrew Seigner	2528e3d62d	Make NET_ADMIN check a warning, add PSP check (#2958 ) `linkerd check` validates whether PSP's exist, and if the caller has the `NET_ADMIN` capability. This check was previously failing if `NET_ADMIN` was not found, even in the case where the PSP admission controller was not running. Related, `linkerd install` now includes a PSP, so `linkerd check` should also validate that the caller can create PSP's. Modify the `NET_ADMIN` check to warn, but not fail, if PSP's are found but the caller does not have `NET_ADMIN`. Update the warning message to mention that this is only a problem if the PSP admission controller is running (and will only be a problem during injection, since #2920 handles control plane installation by adding its own PSP). Also introduce a check to validate the caller can create PSP's. Fixes #2884, #2849 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-06-20 17:58:26 +02:00
Alena Varkockova	7973715ee4	Output check result as json (#2666 ) * Add missing file, make linter happy * Changes from code review * Ivan's code review * PR review feedback * style fixes * Last PR comments Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2019-05-20 09:04:28 -07:00
Dennis Adjei-Baah	a0fa1dff59	Move tap service into its own pod. (#2773 ) * Split tap into its own pod in the control plane Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>	2019-05-15 16:28:44 -05:00
mg	b965d0d30e	Introduce pre-install healthcheck for clock skew (#2803 ) * Adding pre-install check for clock skew * Fixing lint error - time.Since * Update test data for clock skew check * Incorporating code review comments * Additional fix - clock skew test Signed-off-by: Matej Gera <matejgera@gmail.com>	2019-05-13 10:14:38 -07:00
Andrew Seigner	ad2f92662e	Fix check/dashboard failing from one pod when HA (#2764 ) The `linkerd check` and `linkerd dashboard` commands validate control plane pods are up via the `LinkerdAPIChecks` category of checks. These checks will fail if a single pod is not ready, even in HA mode. Modify the underlying `validateControlPlanePods` check to return successful if at least one pod per control plane component is ready. Fixes #2554 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-05-03 18:42:21 +02:00
Andrew Seigner	66494591e0	Multi-stage check support (#2765 ) Add support for `linkerd check config`. Validates the existence of the Linkerd Namespace, ClusterRoles, ClusterRoleBindings, ServiceAccounts, and CustomResourceDefinitions. Part of #2337 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-30 17:17:59 +01:00
Andrew Seigner	dec620f818	Fix `linkerd check --proxy` with default ns param (#2754 ) The `linkerd check --proxy` command checks for proxies in all namespaces, if the `--namespace` flag is not set. PR #2747 modified the behavior of `KubernetesAPI.NamespaceExists`. Previously it would succeed if given an empty string for a namespace. Now it fails with a `resource name may not be empty` error (for k8s server `v1.10.11`), or a not found error (for our fake test client). Modify the data plane proxy namespace check to return success if the namespace is not set. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-25 16:37:08 -07:00
Andrew Seigner	ec540a882e	Consolidate k8s APIs (#2747 ) Numerous codepaths have emerged that create k8s configs, k8s clients, and make k8s api requests. This branch consolidates k8s client creation and APIs. The primary change migrates most codepaths to call `k8s.NewAPI` to instantiate a `KubernetesAPI` struct from `pkg`. `KubernetesAPI` implements the `kubernetes.Interface` (clientset) interface, and also persists a `client-go` `rest.Config`. Specific list of changes: - removes manual GET requests from `k8s.KubernetesAPI`, in favor of clientsets - replaces most calls to `k8s.GetConfig`+`kubernetes.NewForConfig` with a single `k8s.NewAPI` - introduces a `timeout` param to `k8s.NewAPI`, currently only used by healthchecks - removes `NewClientSet` in `controller/k8s/clientset.go` in favor of `k8s.NewAPI` - removes `httpClient` and `clientset` from `HealthChecker`, use `KubernetesAPI` instead Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-25 11:31:38 -07:00
Andrew Seigner	b2b4780430	Introduce install stages (#2719 ) This change introduces two named parameters for `linkerd install`, split by privilege: - `linkerd install config` - Namespace - ClusterRoles - ClusterRoleBindings - CustomResourceDefinition - ServiceAccounts - `linkerd install control-plane` - ConfigMaps - Secrets - Deployments - Services Comprehensive `linkerd install` is still supported. TODO: - `linkerd check` support - `linkerd upgrade` support - integration tests Part of #2337 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-23 14:52:34 -07:00
Andrew Seigner	c7bd8167a9	Remove redundant k8s request in linkerd check (#2722 ) PR #2510 introduced some new checks into `linkerd check`. One set of checks was unnecessarily calling `GetPodsByNamespace` twice. Remove the redundant k8s API call. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2019-04-19 12:27:34 -07:00
Gaurav Kumar	392cab80fa	Add check for unschedulable pods and psp issues (#2510 ) Fixes #2465 * Add check for unschedulable pods and psp issues (#2465) * Return error reason and message on pod or node failure Signed-off-by: Gaurav Kumar <gaurav.kumar9825@gmail.com>	2019-04-19 11:42:22 -07:00
Alejandro Pedraza	edb225069c	Add validation webhook for service profiles (#2623 ) Add validation webhook for service profiles Fixes #2075 Todo in a follow-up PRs: remove the SP check from the CLI check. Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>	2019-04-05 16:10:47 -05:00
Kevin Lingerfelt	0d4eb02835	Add identity pod to check, web, and integration tests (#2529 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2019-03-19 20:49:31 -07:00
Oliver Gould	81f645da66	Remove `--tls=optional` and `linkerd-ca` (#2515 ) The proxy's TLS implementation has changed to use a new _Identity_ controller. In preparation for this, the `--tls=optional` CLI flag has been removed from install and inject; and the `ca` controller has been deleted. Metrics and UI treatments for TLS have not been removed, as they will continue to be valuable for the new Identity system. With the removal of the old identity scheme, the Destination service's proxy ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable locality awareness.	2019-03-18 17:40:31 -07:00

110 Commits