The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.
Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.
Also update `bin/test-cleanup` to clean up the ServiceProfile CRD.
TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Hint URLs should display for all failed checks in `linkerd check`, but
were not displaying for RPC checks.
Fix `runCheckRPC` to pass along the hintAnchor to the check result.
Also rename the second `can query the control plane API` to
`control plane self-check`, as there were two checks with that name.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.
This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.
Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The existing hint URLs printed by `linkerd check` pointed to locations
that would change if the linkerd.io website was reorganized.
linkerd/website#148 introduces an alias for hint URLs at
https://linkerd.io/checks/. This is the corresponding change to update
`linkerd check` output.
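A minimal sketch of how a failed check could carry a hint URL built on the alias. The `checkResult` type, the `newCheckResult` helper, the `example-anchor` value, and appending the anchor as a URL fragment are assumptions for illustration, not the actual healthcheck code.

```go
package main

import (
	"errors"
	"fmt"
)

// hintBaseURL is the alias named above. Appending the anchor as a "#fragment"
// is an assumption made for this sketch.
const hintBaseURL = "https://linkerd.io/checks/"

// errExample is a placeholder failure used by the demo.
var errExample = errors.New("example failure")

// checkResult is a hypothetical, simplified result type.
type checkResult struct {
	Description string
	HintURL     string
	Err         error
}

// newCheckResult attaches a hint URL only when the check failed and an
// anchor is available.
func newCheckResult(description, hintAnchor string, err error) checkResult {
	result := checkResult{Description: description, Err: err}
	if err != nil && hintAnchor != "" {
		result.HintURL = hintBaseURL + "#" + hintAnchor
	}
	return result
}

func main() {
	r := newCheckResult("control plane self-check", "example-anchor", errExample)
	fmt.Printf("%s: %v\n    see %s for hints\n", r.Description, r.Err, r.HintURL)
}
```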
Depends on linkerd/website#148, relates to linkerd/website#146.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes #2220
The service profile validation which is part of `linkerd check` only validates service profiles in the Linkerd namespace. Due to a recent change, service profiles now can exist in any namespace.
Update the logic so that service profiles in all namespaces are validated.
Additionally:
* Relax validation of service profile names to support external names
Signed-off-by: Alex Leong <alex@buoyant.io>
The `linkerd check` command was doing limited validation on
ServiceProfiles.
Make ServiceProfile validation more complete, specifically validate:
- types of all fields
- presence of required fields
- presence of unknown fields
- recursive fields
Also move all validation code into a new `Validate` function in the
profiles package.
Validation of field types and required fields is handled via
`yaml.UnmarshalStrict` in the `Validate` function. This motivated
migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml.
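To illustrate what strict decoding buys, here is a sketch that uses the standard library's `encoding/json` with `DisallowUnknownFields` as a stand-in for `yaml.UnmarshalStrict`. It is not the `Validate` function itself; the `exampleSpec` type and its fields are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// exampleSpec is a hypothetical, much smaller stand-in for a profile spec.
type exampleSpec struct {
	Name      string `json:"name"`
	IsFailure bool   `json:"isFailure"`
}

// validateStrict rejects unknown fields and wrongly typed fields during
// decoding, then checks required fields explicitly.
func validateStrict(data []byte) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	var spec exampleSpec
	if err := dec.Decode(&spec); err != nil {
		return err
	}
	if spec.Name == "" {
		return errors.New("missing required field: name")
	}
	return nil
}

func main() {
	for _, doc := range []string{
		`{"name": "a"}`,
		`{"name": "a", "bogus": 1}`,
		`{"name": 5}`,
		`{}`,
	} {
		fmt.Printf("%-28s -> %v\n", doc, validateStrict([]byte(doc)))
	}
}
```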
Fixes #2190
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes #2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Consolidate timeouts for `linkerd check`
- Moved the creation of contexts from inside the methods targeted by the
checks into a single place in the runCheck() and runCheckRPC() methods
where the context is built using a hard-coded timeout of 30 seconds.
- k8s' client-go doesn't allow passing along contexts, but it lets us
set the Timeout manually.
- Reworded the description for the --wait option.
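The idea of building the context in one place can be sketched roughly as follows. The `checkFunc` type, the `slowCheck` helper, and the `demoTimeout` constant are hypothetical; only the 30 second value comes from this change.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

const (
	// requestTimeout mirrors the hard-coded 30 seconds described above.
	requestTimeout = 30 * time.Second
	// demoTimeout is a short value used only to demonstrate expiry quickly.
	demoTimeout = 20 * time.Millisecond
)

// checkFunc is a hypothetical check signature that receives its context
// from the runner instead of creating one itself.
type checkFunc func(ctx context.Context) error

// runCheck builds the context in a single place and hands it to the check.
func runCheck(timeout time.Duration, check checkFunc) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return check(ctx)
}

// slowCheck simulates a check that needs d to finish, unless the context
// expires first.
func slowCheck(d time.Duration) checkFunc {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	err := runCheck(demoTimeout, slowCheck(requestTimeout))
	fmt.Println("timed out:", errors.Is(err, context.DeadlineExceeded))
	fmt.Println("fast check error:", runCheck(requestTimeout, slowCheck(0)))
}
```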
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Fixes #2042
Adds a new field to service profile routes called `timeout`. Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead. If the timeout field is not specified, a default timeout of 10 seconds is used.
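A small sketch of the defaulting rule only (aborting the request and returning the 504 happen elsewhere). It assumes the field is written as a Go-style duration string such as `500ms`; the `routeTimeout` helper is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// defaultRouteTimeout is the 10 second default described above.
const defaultRouteTimeout = 10 * time.Second

// routeTimeout returns the timeout to apply to a route. An empty spec means
// the field was not set, so the default applies. Treating the value as a
// Go duration string is an assumption of this sketch.
func routeTimeout(spec string) (time.Duration, error) {
	if spec == "" {
		return defaultRouteTimeout, nil
	}
	d, err := time.ParseDuration(spec)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %v", spec, err)
	}
	return d, nil
}

func main() {
	for _, spec := range []string{"", "500ms", "soon"} {
		d, err := routeTimeout(spec)
		fmt.Printf("%q -> %v (err: %v)\n", spec, d, err)
	}
}
```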
Signed-off-by: Alex Leong <alex@buoyant.io>
linkerd/website#105 introduced a FAQ page, providing resolutions for all
`linkerd check` failures.
Update each check to reference its corresponding section in the FAQ.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
When `linkerd check` runs against a control plane namespace that contains an additional pod whose name carries a replication controller ID instead of a ReplicaSet ID, the command panics with an "index out of bounds" error.
This PR adds a check to make sure that, when parsing pod names during the `checkControllerRunning` healthcheck, we only check for linkerd control plane pods and that the pod name results in four or more substrings when split on '-'. This prevents the check from panicking when encountering a replication controller ID pod name.
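The guard can be sketched as below. The `linkerd-` prefix test and the `<prefix>-<component>-<replicaset id>-<pod id>` layout are assumptions about how control plane pods are named; `controlPlaneComponent` is a hypothetical helper, not the code in `checkControllerRunning`.

```go
package main

import (
	"fmt"
	"strings"
)

// controlPlaneComponent extracts the component name from a pod name shaped
// like "linkerd-<component>-<replicaset id>-<pod id>". It reports false for
// pods that are not control plane pods or that have fewer than four
// dash-separated parts, instead of indexing out of bounds.
func controlPlaneComponent(podName string) (string, bool) {
	if !strings.HasPrefix(podName, "linkerd-") {
		return "", false
	}
	parts := strings.Split(podName, "-")
	if len(parts) < 4 {
		return "", false
	}
	// Drop the leading prefix and the two trailing generated IDs.
	return strings.Join(parts[1:len(parts)-2], "-"), true
}

func main() {
	for _, name := range []string{
		"linkerd-controller-6f78cbd47-bc557",
		"linkerd-rc-x7k2p",
		"other-pod-abc-def",
	} {
		component, ok := controlPlaneComponent(name)
		fmt.Printf("%-36s component=%q ok=%v\n", name, component, ok)
	}
}
```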
Fixes #2084
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
When a failing check is being retried, we show the current err to the user. This
can sometimes be unnecessarily alarming, as in the case of the control plane
starting up.
If a failing check is in the process of being retried, wait to show the final
error message until the retries have completed.
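A rough sketch of the behavior: intermediate errors are swallowed and only the last one is reported. `runWithRetries` and `errNotReady` are hypothetical names, and the pause between attempts is omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReady is a placeholder for a transient failure, such as the control
// plane still starting up.
var errNotReady = errors.New("not ready")

// runWithRetries calls check up to attempts times. Errors from attempts that
// will be retried are not shown; report is called once, with the final
// error, only if every attempt failed.
func runWithRetries(attempts int, check func() error, report func(error)) bool {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = check(); err == nil {
			return true
		}
	}
	report(err)
	return false
}

func main() {
	calls := 0
	ok := runWithRetries(5, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	}, func(err error) { fmt.Println("final error:", err) })
	fmt.Printf("ok=%v after %d calls\n", ok, calls)
}
```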
Previously, we were doing the creation checks for both Roles/RoleBindings and
ClusterRoles/ClusterRoleBindings for all namespaces, but in --single-namespace
mode, we only need to check that these can be created in the control plane
namespace.
Version checks were not validating that the cli version matched the
control plane or data plane versions.
Add checks via the `linkerd check` command to validate the cli is
running the same version as the control and data plane.
Also add types around `channel-version` string parsing and matching. As a
consequence, during development `version.Version` changes from
`undefined` to `dev-undefined`.
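A sketch of what typed `channel-version` parsing and matching might look like. The `channelVersion` type and both helpers are hypothetical, and splitting on the first `-` is an assumption about the string format.

```go
package main

import (
	"fmt"
	"strings"
)

// channelVersion is a hypothetical typed form of a "channel-version" string.
type channelVersion struct {
	channel string
	version string
}

// parseChannelVersion splits on the first dash, so "dev-undefined" yields
// channel "dev" and version "undefined".
func parseChannelVersion(s string) (channelVersion, error) {
	parts := strings.SplitN(s, "-", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return channelVersion{}, fmt.Errorf("unsupported version format: %q", s)
	}
	return channelVersion{channel: parts[0], version: parts[1]}, nil
}

// matchVersions returns an error when the two strings do not parse to the
// same channel and version.
func matchVersions(expected, actual string) error {
	e, err := parseChannelVersion(expected)
	if err != nil {
		return err
	}
	a, err := parseChannelVersion(actual)
	if err != nil {
		return err
	}
	if e != a {
		return fmt.Errorf("is running version %s but expected %s", actual, expected)
	}
	return nil
}

func main() {
	fmt.Println(parseChannelVersion("dev-undefined"))
	fmt.Println(matchVersions("stable-2.1.0", "edge-19.1.2"))
}
```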
Fixes #2076
Depends on linkerd/website#101
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
* When injecting, perform an uninject as a first step
Fixes #1970
The fixture `inject_emojivoto_already_injected.input.yml` is no longer
rejected, so I created the corresponding golden file.
Note that we'll still forbid injection over resources already injected
with third party meshes (Istio, Contour), so now we have
`HasExisting3rdPartySidecars()` to detect that.
* Generalize HasExistingSidecars() to cater for both the auto-injector and
* Convert `linkerd uninject` result format to the one used in `linkerd inject`.
* More updates to the uninject reports. Revert changes to the HasExistingSidecars func.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The outputs of the `check` and `inject` commands did not vary much
between successful and failed executions, and were a bit verbose and
challenging to parse.
Reorganize output of `check` and `inject` commands, to provide more
output when errors occur, and less output when successful.
Specific changes:
`linkerd check`
- visually group checks by category
- introduce `hintURL`s, to provide doc links when checks fail
- add spinners when retrying, remove additional retry lines
- colored unicode characters to indicate success/warning/failure
`linkerd inject`
- modify default output to mirror `kubectl apply`
- only output non-successful inject reports
- support `--verbose` flag to output all inject reports
Fixes #1471, #1653, #1656, #1739
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The linkerd check command organized the various checks via loosely
coupled category IDs, category names, and checkers themselves, all with
ordering defined by consumers of this code.
This change removes category IDs in favor of category names, groups all
checkers by category, and enforces ordering at the HealthChecker
level.
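One way to picture the resulting structure: categories own their checkers, and the checker type iterates them in its own fixed order, whatever order the caller asks for. All type names, category names, and descriptions below are hypothetical.

```go
package main

import "fmt"

// checker is a hypothetical single health check.
type checker struct {
	description string
	check       func() error
}

// category groups checkers under a name; there is no separate category ID.
type category struct {
	name     string
	checkers []checker
}

// healthChecker owns the ordered list of categories.
type healthChecker struct {
	categories []category
}

// run executes the enabled categories in the order defined by the
// healthChecker, regardless of how the caller built the enabled set.
func (hc *healthChecker) run(enabled map[string]bool) []string {
	var ran []string
	for _, c := range hc.categories {
		if !enabled[c.name] {
			continue
		}
		for _, ch := range c.checkers {
			status := "ok"
			if err := ch.check(); err != nil {
				status = "failed"
			}
			ran = append(ran, c.name+": "+ch.description+": "+status)
		}
	}
	return ran
}

// newHealthChecker builds a small example with two placeholder categories.
func newHealthChecker() *healthChecker {
	pass := func() error { return nil }
	return &healthChecker{categories: []category{
		{name: "example-api", checkers: []checker{
			{description: "can reach the API", check: pass},
			{description: "can query the API", check: pass},
		}},
		{name: "example-version", checkers: []checker{
			{description: "is up to date", check: pass},
		}},
	}}
}

func main() {
	enabled := map[string]bool{"example-version": true, "example-api": true}
	for _, line := range newHealthChecker().run(enabled) {
		fmt.Println(line)
	}
}
```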
Part of #1471, depends on #2078.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The set of health checks to be executed was dependent on a combination
of check enums and boolean options.
This change modifies the health checks to be governed strictly by a set
of enums.
Next steps:
- tightly couple category IDs to names
- tightly couple checks to their parent categories
- programmatic control over check ordering
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Commit 1: Enable lint check for comments
Part of #217. Follow up from #1982 and #2018.
A subsequent commit will fix the ci failure.
Commit 2: Address all comment-related linter errors.
This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should move or
be removed
This PR does not:
- Modify, move, or remove any code
- Modify existing comments
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Adjust proxy, Prometheus, and Grafana probes
High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.
Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.
Detailed probe changes:
- proxy
- decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
- decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
- decrease `readinessProbe.timeoutSeconds` from 30s to 1s
- decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
- decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
- decrease `readinessProbe.timeoutSeconds` from 30s to 1s
- decrease `readinessProbe.failureThreshold` from 10 to 3
- increase `livenessProbe.initialDelaySeconds` from 0s to 30s
Fixes #1804
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.
* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition. Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"
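The "all" semantics for multiple fields can be sketched like this. The `requestMatch` type and its `Method` and `PathRegex` fields are hypothetical stand-ins for the real spec, and anchoring the regex to the whole path is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// requestMatch is a hypothetical condition with two optional fields. When
// both are set, a request must satisfy both, as if they were wrapped in an
// "all" condition.
type requestMatch struct {
	Method    string
	PathRegex string
}

// matches reports whether every field that is set matches the request.
// Fields left empty are ignored.
func (m requestMatch) matches(method, path string) (bool, error) {
	if m.Method != "" && !strings.EqualFold(m.Method, method) {
		return false, nil
	}
	if m.PathRegex != "" {
		// Anchoring to the full path is an assumption of this sketch.
		re, err := regexp.Compile("^(?:" + m.PathRegex + ")$")
		if err != nil {
			return false, err
		}
		if !re.MatchString(path) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	m := requestMatch{Method: "GET", PathRegex: `/books/\d+`}
	for _, req := range [][2]string{{"GET", "/books/42"}, {"POST", "/books/42"}, {"GET", "/authors"}} {
		ok, err := m.matches(req[0], req[1])
		fmt.Printf("%s %s -> %v (err: %v)\n", req[0], req[1], ok, err)
	}
}
```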
Signed-off-by: Alex Leong <alex@buoyant.io>
Service profiles must be named in the form `"<service>.<namespace>"`. This is inconsistent with the fully normalized domain name that the proxy sends to the controller. It also does not permit creating service profiles for non-Kubernetes services.
We switch to requiring that service profiles must be named with the FQDN of their service. For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`.
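For illustration, building the required name for a Kubernetes service could look like the following. `profileName` is a hypothetical helper, and the cluster domain is fixed to the `cluster.local` value quoted above.

```go
package main

import "fmt"

// profileName returns the FQDN form required for a Kubernetes service's
// profile: "<service>.<namespace>.svc.cluster.local".
func profileName(service, namespace string) string {
	return fmt.Sprintf("%s.%s.svc.cluster.local", service, namespace)
}

func main() {
	fmt.Println(profileName("books", "default"))
}
```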
This change alone is not sufficient for allowing service profiles for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services. Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kubernetes.
Signed-off-by: Alex Leong <alex@buoyant.io>
The existence of an invalid service profile causes `linkerd check` to fail. This means that it is not possible to open the Linkerd dashboard with the `linkerd dashboard` command. While service profile validation is useful, it should not lock users out.
Add the ability to designate health checks as warnings. A failed warning health check will display a warning output in `linkerd check` but will not affect the overall success of the command. Switch the service profile validation to be a warning.
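The effect on the overall result can be sketched as follows. The `result` type and `overallSuccess` helper are hypothetical simplifications of the healthcheck code.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidProfile is a placeholder failure for the demo.
var errInvalidProfile = errors.New("invalid service profile")

// result is a hypothetical outcome of one health check.
type result struct {
	description string
	err         error
	warning     bool // a failure here is displayed but does not fail the run
}

// overallSuccess is false only when a check that is not a warning failed.
func overallSuccess(results []result) bool {
	for _, r := range results {
		if r.err != nil && !r.warning {
			return false
		}
	}
	return true
}

func main() {
	results := []result{
		{description: "can query the control plane API"},
		{description: "no invalid service profiles", err: errInvalidProfile, warning: true},
	}
	fmt.Println("success:", overallSuccess(results))
}
```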
Signed-off-by: Alex Leong <alex@buoyant.io>
Fix broken docker build by moving Service Profile conversion and validation into `/pkg`.
Fix broken integration test by adding service profile validation output to `check`'s expected output.
Testing done:
* `gotest -v ./...`
* `bin/docker-build`
* `bin/test-run (pwd)/bin/linkerd`
Signed-off-by: Alex Leong <alex@buoyant.io>
Add a check to `linkerd check` which validates all service profile resources. In particular it checks:
* does the service profile refer to an existing service
* is the service profile valid
Signed-off-by: Alex Leong <alex@buoyant.io>
* Add --single-namespace install flag for restricted permissions
* Better formatting in install template
* Mark --single-namespace and --proxy-auto-inject as experimental
* Fix wording of --single-namespace check flag
* Small healthcheck refactor
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Support auto sidecar-injection
1. Add proxy-injector deployment spec to cli/install/template.go
2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration
during the webhook's start-up process.
3. Add a new handler to the CA controller to create a new secret for the
webhook when a new MutatingWebhookConfiguration is created.
4. Declare a config map to store the proxy and proxy-init container
specs used during the auto-inject process.
5. Ignore namespaces and pods that are labeled with
linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed
6. Add new flag to `linkerd install` to enable/disable proxy
auto-injection
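Item 5 above can be sketched as a label test. The label key and the two values come from the list; `skipInjection` and `shouldInject` are hypothetical helpers, not the webhook's actual code.

```go
package main

import "fmt"

// autoInjectLabel is the label key named in item 5.
const autoInjectLabel = "linkerd.io/auto-inject"

// skipInjection reports whether a set of labels opts out of injection,
// either explicitly ("disabled") or because it was already handled
// ("completed").
func skipInjection(labels map[string]string) bool {
	switch labels[autoInjectLabel] {
	case "disabled", "completed":
		return true
	}
	return false
}

// shouldInject is true only when neither the namespace nor the pod opts out.
func shouldInject(namespaceLabels, podLabels map[string]string) bool {
	return !skipInjection(namespaceLabels) && !skipInjection(podLabels)
}

func main() {
	ns := map[string]string{}
	fmt.Println(shouldInject(ns, map[string]string{"app": "web"}))
	fmt.Println(shouldInject(ns, map[string]string{autoInjectLabel: "completed"}))
}
```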
Proposed implementation for #561.
* Resolve missing packages errors
* Move the auto-inject label to the pod level
* PR review items
* Move proxy-injector to its own deployment
* Ignore pods that already have proxy injected
This ensures the webhook doesn't error out due to proxies that are injected using the command
* PR review items on creating/updating the MWC on-start
* Replace API calls to ConfigMap with file reads
* Fixed post-rebase broken tests
* Don't mutate the auto-inject label
Since we started using healthcheck.HasExistingSidecars() to ensure pods with
existing proxies aren't mutated, we don't need to use the auto-inject label as
an indicator.
This resolves a bug which happens with the kubectl run command where the deployment
is also assigned the auto-inject label. The mutation causes the pod auto-inject
label to not match the deployment label, causing kubectl run to fail.
* Tidy up unit tests
* Include proxy resource requests in sidecar config map
* Fixes to broken YAML in CLI install config
The ignore inbound and outbound ports are changed to string type to
avoid broken YAML caused by the string conversion in the uint slice.
Also, parameterized the proxy bind timeout option in template.go.
Renamed the sidecar config map to
'linkerd-proxy-injector-webhook-config'.
Signed-off-by: ihcsim <ihcsim@gmail.com>
* Added --context flag to specify the context to use to talk to the Kubernetes apiserver
* Fix tests that are failing
* Updated context flag description
Signed-off-by: Darko Radisic <ffd2subroutine@users.noreply.github.com>
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps
Signed-off-by: Alex Leong <alex@buoyant.io>
The `linkerd check` command hits
https://versioncheck.linkerd.io/version.json to check for the latest
Linkerd version. This loses information, as that endpoint is intended to
record current version, uuid, and source.
Modify `linkerd check` to set `version`, `uuid`, and `source`
parameters when performing a version check.
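A sketch of building the request URL with the three parameters. The endpoint and parameter names come from this message; `versionCheckURL` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// versionCheckBaseURL is the endpoint named above.
const versionCheckBaseURL = "https://versioncheck.linkerd.io/version.json"

// versionCheckURL adds version, uuid, and source as query parameters.
// url.Values.Encode sorts parameters by key and escapes the values.
func versionCheckURL(version, uuid, source string) string {
	q := url.Values{}
	q.Set("version", version)
	q.Set("uuid", uuid)
	q.Set("source", source)
	return versionCheckBaseURL + "?" + q.Encode()
}

func main() {
	// The uuid and source values are placeholders.
	fmt.Println(versionCheckURL("dev-undefined", "abc123", "cli"))
}
```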
Part of #1604.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd check` command was not validating whether data plane
proxies were successfully reporting metrics to Prometheus.
Introduce a new check that validates data plane proxies are found in
Prometheus. This is made possible via the existing `ListPods` endpoint
in the public API, which includes an `Added` field, indicating a pod's
metrics were found in Prometheus.
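The check can be pictured as a scan over the `ListPods` response for pods whose `Added` field is false. The `pod` struct below is a hypothetical reduction of the API type, and `validateDataPlanePods` is not the actual check function.

```go
package main

import (
	"fmt"
	"strings"
)

// pod is a hypothetical reduction of a ListPods entry: Added is true when
// the pod's metrics were found in Prometheus.
type pod struct {
	Name  string
	Added bool
}

// validateDataPlanePods returns an error naming every pod whose metrics
// were not found in Prometheus.
func validateDataPlanePods(pods []pod) error {
	var missing []string
	for _, p := range pods {
		if !p.Added {
			missing = append(missing, p.Name)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("data plane proxies not found in Prometheus: %s",
			strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	pods := []pod{{Name: "web-0", Added: true}, {Name: "web-1", Added: false}}
	fmt.Println(validateDataPlanePods(pods))
}
```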
Fixes #1517
Signed-off-by: Andrew Seigner <siggy@buoyant.io>