Commit Graph

54 Commits

Author SHA1 Message Date
Alejandro Pedraza 2a7654ce78
Consolidate timeouts for `linkerd check` (#2191)
Consolidate timeouts for `linkerd check`

- Moved the creation of contexts from inside the methods targeted by the
checks into a single place in the runCheck() and runCheckRPC() methods
where the context is built using a hard-coded timeout of 30 seconds.
- k8s' client-go doesn't allow passing along contexts, but it let's us
setting the Timeout manually.
- Reworded the description for the --wait option.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-02-05 11:38:30 -05:00
Andrew Seigner 3a139d0202
Fix spelling on linkerd check link (#2197)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-04 14:00:11 -08:00
Alex Leong 3bd4231cec
Add support for timeouts in service profiles (#2149)
Fixes #2042 

Adds a new field to service profile routes called `timeout`.  Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead.  If the timeout field is not specified, a default timeout of 10 seconds is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-30 16:48:55 -08:00
Andrew Seigner 5651e02496
Add hint URLs for all checks (#2159)
linkerd/website#105 introduced a FAQ page, providing resolutions for all
`linkerd check` failures.

Update each check to reference its corresponding section in the FAQ.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-30 16:26:57 -08:00
Dennis Adjei-Baah 20efe14fef
Fix issue where linkerd check would panic with replicationcontroller pod name in control plane (#2140)
When running Linkerd check with a control plane namespace that may contain an additional pod with a replication controller ID for pod names instead of a replicaSet ID, the check command panics because of an "index out of bounds" error.

This PR adds a check to make sure that, when parsing pod names during the `checkControllerRunning` healthcheck, we only check for linkerd control plane pods and that the pod name results in four or more substrings when split on '-'. This prevents the check from panicking when encountering a replication controller ID pod name.

Fixes #2084 

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2019-01-24 10:26:11 -08:00
Risha Mars 950c952d14
Add a less scary message to the user when retries are still in progress (#2141)
When a failing check is being retried, we show the current err to the user. This
can sometimes be unnecessarily alarming, as in the case of the control plane
starting up.

If a failing check is in the process of being retried, wait to show the final
error message until the retries have completed.
2019-01-24 10:24:59 -08:00
Risha Mars e7556d7edc
Check RoleBindings for specified single namespace only (#2142)
Previously, we were doing the creation checks for both Roles/RoleBindings and
ClusterRoles/ClusterRoleBindings for all namespaces, but in --single-namespace
mode, we only need to check that these can be created in the control plane
namespace.
2019-01-23 18:04:15 -08:00
Andrew Seigner c9ac77cd7c
Introduce version consistency checks (#2130)
Version checks were not validating that the cli version matched the
control plane or data plane versions.

Add checks via the `linkerd check` command to validate the cli is
running the same version as the control and data plane.

Also add types around `channel-version` string parsing and matching. A
consequence being that during development `version.Version` changes from
`undefined` to `dev-undefined`.

Fixes #2076

Depends on linkerd/website#101

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-23 16:54:43 -08:00
Alena Varkockova 28f662c9c6 Introduce resource selector and deprecate namespace field for ListPods (#2025)
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-23 10:35:55 -08:00
Alejandro Pedraza bd1955c317 When injecting, perform an uninject as a first step (#2089)
* When injecting, perform an uninject as a first step

Fixes #1970

The fixture `inject_emojivoto_already_injected.input.yml` is no longer
rejected, so I created the corresponding golden file.

Note that we'll still forbid injection over resources already injected
with third party meshes (Istio, Contour), so now we have
`HasExisting3rdPartySidecars()` to detect that.

* Generalize HasExistingSidecars() to cater for both the auto-injector and
* Convert `linkerd uninject` result format to the one used in `linkerd inject`.
* More updates to the uninject reports. Revert changes to the HasExistingSidecars func.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-01-17 10:15:23 -08:00
Andrew Seigner 92f2cd9b63
Update check and inject output (#2087)
The outputs of the `check` and `inject` commands did not vary much
between successful and failed executions, and were a bit verbose and
challenging to parse.

Reorganize output of `check` and `inject` commands, to provide more
output when errors occur, and less output when successful.

Specific changes:

`linkerd check`
- visually group checks by category
- introduce `hintURL`'s, to provide doc links when checks fail
- add spinners when retrying, remove additional retry lines
- colored unicode characters to indicate success/warning/failure

`linkerd inject`
- modify default output to mirror `kubectl apply`
- only output non-successful inject reports
- support `--verbose` flag to output all inject reports

Fixes #1471, #1653, #1656, #1739

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-16 15:14:14 -08:00
Andrew Seigner dacd8819ff
Group checkers by category (#2083)
The linkerd check command organized the various checks via loosely
coupled category IDs, category names, and checkers themselves, all with
ordering defined by consumers of this code.

This change removes category IDs in favor of category names, groups all
checkers by category, and enforces ordering at the HealthChecker
level.

Part of #1471, depends on #2078.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-14 18:01:32 -08:00
Andrew Seigner 04373414ef
Modify all health checks to be specified via enums (#2078)
The set of health checks to be executed were dependent on a combination
of check enums and boolean options.

This change modifies the health checks to be governed strictly by a set
of enums.

Next steps:
- tightly couple category IDs to names
- tightly couple checks to their parent categories
- programmatic control over check ordering

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-14 17:16:15 -08:00
Alejandro Pedraza 281ba37e6d
More granular control on checks made by CLI commands (#2033)
Have the CLI commands `get`, `routes`, `stat`, `tap`and `top` perform a more limited set of checks 

Fixes #1854
2019-01-10 09:13:44 -05:00
Andrew Seigner 1c302182ef
Enable lint check for comments (#2023)
Commit 1: Enable lint check for comments

Part of #217. Follow up from #1982 and #2018.

A subsequent commit will fix the ci failure.

Commit 2: Address all comment-related linter errors.

This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
  be removed

This PR does not:
- Modify, move, or remove any code
- Modify existing comments

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 14:03:59 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Kevin Lingerfelt 86e95b7ad3
Disable serivce profiles in single-namespace mode (#1980)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 14:37:18 -08:00
Kevin Lingerfelt fd44896644
Remove namespace definition from --single-namespace installs (#1974)
* Remove namespace definition from --single-namespace installs
* DRY up code in healthcheck.go

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-12 14:53:02 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner d121071f87
Adjust proxy, Prometheus, and Grafana probes (#1899)
* Adjust proxy, Prometheus, and Grafana probes

High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.

Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.

Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s

Fixes #1804

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 10:41:11 -08:00
Kevin Lingerfelt 4547ba7f0a
Make permission checks non-fatal, add check for CRDs (#1859)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-11-14 10:29:04 -08:00
Alena Varkockova fda834cf64 Allow retrying control plane API check (#1858)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-13 10:52:50 -08:00
Alena Varkockova 38dfc5308f Make version checks warning (#1844)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-09 09:48:14 -08:00
Alex Leong 32d556e732
Improve ergonomics of service profile spec (#1828)
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.

* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition.  Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-31 12:00:22 -07:00
Alex Leong 82ca821e62
Use fqdn for service profile name (#1808)
Service profiles must be named in the form `"<service>.<namespace>"`.  This is inconsistent with the fully normalized domain name that the proxy sends to the controller.  It also does not permit creating service profiles for non-Kubernetes services.

We switch to requiring that service profiles must be named with the FQDN of their service.  For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`.

This change alone is not sufficient for allowing service profile for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services.  Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kuberenetes.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 14:35:42 -07:00
Alex Leong 6cffad277b
Make service profile validation a warning instead of an error (#1807)
The existence of an invalid service profile causes `linkerd check` to fail.  This means that it is not possible to open the Linkerd dashboard with the `linkerd dashboard` command.  While service profile validation is useful, it should not lock users out.

Add the ability to designate health checks as warnings.  A failed warning health check will display a warning output in `linkerd check` but will not affect the overall success of the command.  Switch the service profile validation to be a warning.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-26 13:28:10 -07:00
Alex Leong f549868033
Fix integration test and docker build (#1790)
Fix broken docker build by moving Service Profile conversion and validation into `/pkg`.

Fix broken integration test by adding service profile validation output to `check`'s expected output.

Testing done:
* `gotest -v ./...`
* `bin/docker-build`
* `bin/test-run (pwd)/bin/linkerd`

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-19 10:23:34 -07:00
Alex Leong 5210b7b44a
Add check for service profile validation (#1775)
Add a check to `linkerd check` which validates all service profile resources.  In particular it checks:
* does the service profile refer to an existent service
* is the service profile valid

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-18 16:37:39 -07:00
Kevin Lingerfelt 46c887ca00
Add --single-namespace install flag for restricted permissions (#1721)
* Add --single-namespace install flag for restricted permissions
* Better formatting in install template
* Mark --single-namespace and --proxy-auto-inject as experimental
* Fix wording of --single-namespace check flag
* Small healthcheck refactor

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-10-11 10:55:57 -07:00
Ivan Sim 4fba6aca0a Proxy init and sidecar containers auto-injection (#1714)
* Support auto sidecar-injection

1. Add proxy-injector deployment spec to cli/install/template.go
2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration
during the webhook's start-up process.
3. Add a new handler to the CA controller to create a new secret for the
webhook when a new MutatingWebhookConfiguration is created.
4. Declare a config map to store the proxy and proxy-init container
specs used during the auto-inject process.
5. Ignore namespace and pods that are labeled with
linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed
6. Add new flag to `linkerd install` to enable/disable proxy
auto-injection

Proposed implementation for #561.

* Resolve missing packages errors
* Move the auto-inject label to the pod level
* PR review items
* Move proxy-injector to its own deployment
* Ignore pods that already have proxy injected

This ensures the webhook doesn't error out due to proxy that are injected using the  command

* PR review items on creating/updating the MWC on-start
* Replace API calls to ConfigMap with file reads
* Fixed post-rebase broken tests
* Don't mutate the auto-inject label

Since we started using healhcheck.HasExistingSidecars() to ensure pods with
existing proxies aren't mutated, we don't need to use the auto-inject label as
an indicator.

This resolves a bug which happens with the kubectl run command where the deployment
is also assigned the auto-inject label. The mutation causes the pod auto-inject
label to not match the deployment label, causing kubectl run to fail.

* Tidy up unit tests
* Include proxy resource requests in sidecar config map
* Fixes to broken YAML in CLI install config

The ignore inbound and outbound ports are changed to string type to
avoid broken YAML caused by the string conversion in the uint slice.

Also, parameterized the proxy bind timeout option in template.go.

Renamed the sidecar config map to
'linkerd-proxy-injector-webhook-config'.

Signed-off-by: ihcsim <ihcsim@gmail.com>
2018-10-10 12:09:22 -07:00
Darko Radisic 6fee0f3c2b Added --context flag to specify the context to use to talk to the Kubernetes apiserver (#1743)
* Added --context flag to specify the context to use to talk to the Kubernetes apiserver
* Fix tests that are failing
* Updated context flag description

Signed-off-by: Darko Radisic <ffd2subroutine@users.noreply.github.com>
2018-10-08 12:37:35 -07:00
Alena Varkockova 5a853e8990 Use ListPods always for data plane HC (#1701)
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-10-02 11:45:01 -07:00
Alena Varkockova 8ab9b4981b Make wait flag configurable for check and dashboard (#1654)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-09-19 10:42:29 -07:00
Alex Leong e65a9617bd
Add can-i checks to linkerd check --pre (#1644)
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-09-17 11:31:10 -07:00
Andrew Seigner c3150d2c90
`linkerd check` sends params on version check (#1642)
The `linkerd check` parameter hits
https://versioncheck.linkerd.io/version.json to check for the latest
Linkerd version. This loses information, as that endpoint is intended to
record current version, uuid, and source.

Modify `linkerd check` to set `version`, `uuid`, and `source`
parameters when performing a version check.

Part of #1604.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-14 15:39:05 -07:00
Kevin Lingerfelt f1b3827194
Bump default check retry time to 5 minutes (#1645)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-14 10:58:03 -07:00
Andrew Seigner 7c70531b8e
Add data plane check for metrics Prometheus (#1635)
The `linkerd check` command was not validating whether data plane
proxies were successfully reporting metrics to Prometheus.

Introduce a new check that validates data plane proxies are found in
Prometheus. This is made possible via the existing `ListPods` endpoint
in the public API, which includes an `Added` field, indicating a pod's
metrics were found in Prometheus.

Fixes #1517

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-13 13:02:05 -07:00
Kevin Lingerfelt b5ff29c8aa
Add data plane check to validate proxy version (#1574)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-04 15:22:38 -07:00
Kevin Lingerfelt c7a79da89c
Add data plane check to validate proxies are ready (#1570)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-31 15:51:57 -07:00
Risha Mars 136b9cc7c1
Add linkerd check flag to run data plane checks (#1528)
Adds a --proxy flag to the linkerd check CLI command which will run 
to-be-implemented data plane checks
2018-08-28 10:16:24 -07:00
Kevin Lingerfelt 4450a7536d
Add --wait flag for CLI check and dashboard commands (#1503)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-22 12:56:42 -07:00
Kevin Lingerfelt 49f6c4c770
Refactor healthcheck init and observe setup (#1502)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-22 12:30:45 -07:00
Kevin Lingerfelt 5fc63cde10
Add check for running pods in control plane namepsace (#1498)
* Add check for running pods in control plane namepsace
* Better pod validation logic

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-21 14:36:49 -07:00
Kevin Lingerfelt 53cd3b50d5
Add --pre flag for linkerd check command (#1497)
* Add --pre flag for linkerd check command
* Small adjustments to check help text

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-20 17:09:43 -07:00
Kevin Lingerfelt e97be1f5da
Move all healthcheck-related code to pkg/healthcheck (#1492)
* Move all healthcheck-related code to pkg/healthcheck
* Fix failed check formatting
* Better version check wording

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-20 16:50:22 -07:00
Kevin Lingerfelt 00a0572098
Better CLI error messages when control plane is unavailable (#1428)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-08-09 15:40:41 -07:00
Kevin Lingerfelt e5cce1abaf
Rename CLI from conduit to linkerd (#1312)
* Rename CLI binary
* Update integration tests for new binary name
* Rename --conduit-namespace flag, change default ns
* Rename occurrences of conduit in rest of CLI
* Rename inject and install components
* Remove conduit occurrences in docker files
* Additional miscellaneous cleanup
* Move protobuf definitions to linkerd2 package
* Rename conduit.io labels to use linkerd.io
* Rename conduit-managed segment to linkerd-managed
* Fix conduit references in web project

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-07-12 17:14:07 -07:00
Oliver Gould 941cad4a9c
Migrate build infrastructure to linkerd2 (#1298)
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
  github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
  binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
  github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
2018-07-09 15:38:38 -07:00
Kevin Lingerfelt 11a4359e9a
Misc cleanup following the telemetry rewrite (#771)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-04-16 15:51:07 -07:00
Andy Hume 1a66e7f8f1 cli: reduce timeouts on API check requests (#586)
Applies timeout of 5s to check request contexts. This overrides
30s timeout applied at client transport level, and stops the
conduit check command from taking > 90s to complete.

Fixes #553

Signed-off-by: Andy Hume <andyhume@gmail.com>
2018-03-19 17:15:01 -07:00