This change allows some advised production config to be applied to the install of the control plane.
Currently this runs 3x replicas of the controller and adds some pretty sane requests to each of the components + containers of the control plane.
Fixes#1101
Signed-off-by: Ben Lambert <ben@blam.sh>
Add a routes command which displays per-route stats for services that have service profiles defined.
This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command and much of the code was able to be refactored and shared.
Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Refactor util.BuildResource so it can deal with multiple resources
First step to address #1487: Allow stat summary to query for multiple
resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Update the stat cli help text to explain the new multi resource querying ability
Propsal for #1487: Allow stat summary to query for multiple resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Allow stat summary to query for multiple resources
Implement this ability by issuing parallel requests to requestStatsFromAPI()
Proposal for #1487
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Update tests as part of multi-resource support in `linkerd stat` (#1487)
- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* `linkerd stat` called with multiple resources should keep an ordering (#1487)
Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Extra validations for `linkerd stat` with multiple resources (#1487)
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* `linkerd stat` resource grouping, ordering and name prefixing (#1487)
- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Allow `linkerd stat` to be called with multiple resources
A few final refactorings as per code review.
Fixes#1487
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.
* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition. Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"
Signed-off-by: Alex Leong <alex@buoyant.io>
A container called `proxy-api` runs in the Linkerd2 controller pod. This container listens on port 8086 and serves the proxy-api but does nothing other than forward gRPC requests to the destination container which listens on port 8089.
We remove the proxy-api container altogether and change the destination container to listen on port 8086 instead of 8089. The result is that clients still use the proxy-api by connecting to `proxy-api.<ns>.svc.cluster.local:8086` but the controller has one fewer containers. This results in a simpler system that is easier to reason about.
Signed-off-by: Alex Leong <alex@buoyant.io>
Service profiles must be named in the form `"<service>.<namespace>"`. This is inconsistent with the fully normalized domain name that the proxy sends to the controller. It also does not permit creating service profiles for non-Kubernetes services.
We switch to requiring that service profiles must be named with the FQDN of their service. For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`.
This change alone is not sufficient for allowing service profile for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services. Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kuberenetes.
Signed-off-by: Alex Leong <alex@buoyant.io>
The existence of an invalid service profile causes `linkerd check` to fail. This means that it is not possible to open the Linkerd dashboard with the `linkerd dashboard` command. While service profile validation is useful, it should not lock users out.
Add the ability to designate health checks as warnings. A failed warning health check will display a warning output in `linkerd check` but will not affect the overall success of the command. Switch the service profile validation to be a warning.
Signed-off-by: Alex Leong <alex@buoyant.io>
Add a new CLI command: `linkerd profile --template` which outputs a sample service profile yaml. Users can edit this sample and then `kubectl apply` it to add a service profile. The sample serves as "documentation by example" of what service profiles may contain.
Example usage:
```bash
linkerd profile -n emojivoto --template web-svc > web-svc-profile.yaml
# edit web-svc-profile.yaml in your favorite editor
kubectl apply -f web-svc-profile.yaml
```
Signed-off-by: Alex Leong <alex@buoyant.io>
We implement the getProfiles method in the destination service. This method returns a stream of destination profiles for a given authority. It does this by looking up the ServiceProfile resource in the controller namespace named `<svc>.<ns>` where `<svc>` is the name of the service and `<ns>` is the namespace of the service.
This PR includes:
* Adding a ServiceProfile Custom Resource Definition to linkerd install
* A watch based implementation of the getProfiles method in the destination service, similar to the implementation of get.
* An update to the destination client script that allows querying the getProfiles method.
Signed-off-by: Alex Leong <alex@buoyant.io>
Added support for json output in `linkerd stat` through a new (-o|--output)=json option.
Fixes#1417
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Make room for columns in `linkerd top`
Make room for columns in `linkerd top`. Columns with data longer than some predetermined minimum length were stepping over each other.
Proposal for #1728
* Removed unneeded truncations
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
To support reading and writing of the ServiceProfile custom resource, we add a codegen'd Kubernetes client for this resource.
* Adding the ServiceProfile type and related boilerplate to /controller/gen/apis/serviceprofile. This boilerplate also contains directives that control how codegen works.
* A script in /hack which invokes codegen that generates Kubernetes client machinery for interacting with ServiceProfile resources. The majority of the generated code lives in /controller/gen/client.
* The above-mentioned generated code.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Add --single-namespace install flag for restricted permissions
* Better formatting in install template
* Mark --single-namespace and --proxy-auto-inject as experimental
* Fix wording of --single-namespace check flag
* Small healthcheck refactor
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Support auto sidecar-injection
1. Add proxy-injector deployment spec to cli/install/template.go
2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration
during the webhook's start-up process.
3. Add a new handler to the CA controller to create a new secret for the
webhook when a new MutatingWebhookConfiguration is created.
4. Declare a config map to store the proxy and proxy-init container
specs used during the auto-inject process.
5. Ignore namespace and pods that are labeled with
linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed
6. Add new flag to `linkerd install` to enable/disable proxy
auto-injection
Proposed implementation for #561.
* Resolve missing packages errors
* Move the auto-inject label to the pod level
* PR review items
* Move proxy-injector to its own deployment
* Ignore pods that already have proxy injected
This ensures the webhook doesn't error out due to proxy that are injected using the command
* PR review items on creating/updating the MWC on-start
* Replace API calls to ConfigMap with file reads
* Fixed post-rebase broken tests
* Don't mutate the auto-inject label
Since we started using healhcheck.HasExistingSidecars() to ensure pods with
existing proxies aren't mutated, we don't need to use the auto-inject label as
an indicator.
This resolves a bug which happens with the kubectl run command where the deployment
is also assigned the auto-inject label. The mutation causes the pod auto-inject
label to not match the deployment label, causing kubectl run to fail.
* Tidy up unit tests
* Include proxy resource requests in sidecar config map
* Fixes to broken YAML in CLI install config
The ignore inbound and outbound ports are changed to string type to
avoid broken YAML caused by the string conversion in the uint slice.
Also, parameterized the proxy bind timeout option in template.go.
Renamed the sidecar config map to
'linkerd-proxy-injector-webhook-config'.
Signed-off-by: ihcsim <ihcsim@gmail.com>
* Added --context flag to specify the context to use to talk to the Kubernetes apiserver
* Fix tests that are failing
* Updated context flag description
Signed-off-by: Darko Radisic <ffd2subroutine@users.noreply.github.com>
Horizontal Pod Autoscaling does not work when container definitions in pods do not all have resource requests, so here's the ability to add CPU + Memory requests to install + inject commands by proving proxy options --proxy-cpu + --proxy-memory
Fixes#1480
Signed-off-by: Ben Lambert <ben@blam.sh>
In linkerd/linkerd2-proxy#99, several proxy configuration variables were
deprecated.
This change updates the CLI to use the updated names to avoid
deprecation warnings during startup.
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps
Signed-off-by: Alex Leong <alex@buoyant.io>
* Remove wait option and make it a default for check
* Switch the wait default to true
* Wait by default also for dashboard
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
`linkerd inject` was not checking its input for known sidecars and
initContainers.
Modify `linkerd inject` to check for existing sidecars and
initContainers, specifically, Linkerd, Istio, and Contour.
Part of #1516
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes#1593
Add a `--hide-sources` flag to `linkerd top`. Setting this removes the source column from the output.
Signed-off-by: Alex Leong <alex@buoyant.io>
linkerd only routes TCP data, but `linkerd inject` does not warn when it
injects into pods with ports set to `protocol: UDP`.
Modify `linkerd inject` to warn when injected into a pod with
`protocol: UDP`. The Linkerd sidecar will still be injected, but the
stderr output will include a warning.
Also add stderr checking on all inject unit tests.
Part of #1516.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
If an input file is un-injectable, existing inject behavior is to simply
output a copy of the input.
Introduce a report, printed to stderr, that communicates the end state
of the inject command. Currently this includes checking for hostNetwork
and unsupported resources.
Malformed YAML documents will continue to cause no YAML output, and return
error code 1.
This change also modifies integration tests to handle stdout and stderr separately.
example outputs...
some pods injected, none with host networking:
```
hostNetwork: pods do not use host networking...............................[ok]
supported: at least one resource injected..................................[ok]
Summary: 4 of 8 YAML document(s) injected
deploy/emoji
deploy/voting
deploy/web
deploy/vote-bot
```
some pods injected, one host networking:
```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true"
supported: at least one resource injected..................................[ok]
Summary: 3 of 8 YAML document(s) injected
deploy/emoji
deploy/voting
deploy/web
```
no pods injected:
```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true"
supported: at least one resource injected..................................[warn] -- no supported objects found
Summary: 0 of 8 YAML document(s) injected
```
TODO: check for UDP and other init containers
Part of #1516
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The default value for the max-rps argument to the tap and top commands is an overly conservative 1rps. This causes the data to come in very slowly and much data to be discarded. Furthermore, because tap requests are windowed to 10 seconds, this causes long pauses between updates.
We fix this in two ways. Firstly we reduce the window size to 1s so that updates will come in at least once per second, even when the actual RPS of the data path is extremely high. Secondly, we increase the default max-rps parameter from 1 to 100. This allows tap to paint an accurate picture of the data much more quickly and sidesteps some sampling bias that happens when the max-rps is low.
In general, tap events tend to happen in bursts. For example, one request in may trigger one or more requests out. Likewise, a single upstream event may trigger several requests to the tapped pod in quick succession. Sampling bias will occur when the max-rps is less than the actual rps and when the tap event limit subdivides these event bursts (biasing towards the first few events in the burst). The greater the max-rps, the less the effects of this bias.
Fixes#1525
Signed-off-by: Alex Leong <alex@buoyant.io>
Previously, we would tap any resource's pods, regardless of whether the pods
were meshed or not. We can't actually tap non-meshed pods, so I'm adding a check
that will filter out non-meshed pods from the pods that tap watches.
Previous behaviour:
When attempting to hang a non meshed pod, it would establish
a watch on the pods, but then never return any results. In the CLI you could
just cancel it with Ctrl-C. In the web, clicking Stop would send a
WebSocket.close(1000) but wouldn't actually close the connection...
Behaviour after change :
If no pods under the specified resource are meshed, it'll
return an error of no pods being found to tap
Adds basic probes to the linkerd-proxy containers injected by linkerd inject.
- Currently the Readiness and Liveness probes are configured to be the same.
- I haven't supplied a periodSeconds, but the default is 10.
- I also set the initialDelaySeconds to 10, but that might be a bit high.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
Closes#1170.
This branch adds a `-o wide` (or `--output wide`) flag to the Tap CLI.
Passing this flag adds `src_res` and `dst_res` elements to the Tap
output, as described in #1170. These use the metadata labels in the tap
event to describe what Kubernetes resource the source and destination
peers belong to, based on what resource type is being tapped, and fall
back to pods if either peer is not a member of the specified resource
type.
In addition, when the resource type is not `namespace`, `src_ns` and
`dst_ns` elements are added, which show what namespaces the the source
and destination peers are in. For peers which are not in the Kubernetes
cluster, none of these labels are displayed.
The source metadata added in #1434 is used to populate the `src_res` and
`src_ns` fields.
Also, this branch includes some refactoring to how tap output is
formatted.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Upgrade to dep 0.5.0, go 1.10.3
* Remove existing dep binary if it's the wrong version
* Add version in filename of dep binary to prevent version conflicts
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
This an initial implementation of the `linkerd top` command. This command launches an ncurses style tabular view of current requests (using data from tap). Most of the command line arguments are the same as tap and allow selecting the resource to inspect and filtering which requests to view.
Fixes#1283
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes#776.
This branch adds the following validation to the `linkerd stat` command:
* The `--to` and `--from` flags are now mutually exclusive
* The `--to-namespace` and `--from-namespace` commands are also mutually
exclusive.
* The `namespace` resource type conflicts with the `--namespace`,
`--to-namespace`, and `--from-namespace` flags.
Examples:
```
$ bin/go-run cli/main.go stat deploy --to deploy/foo --from deploy/bar
Error: --to and --from flags are mutually exclusive
Usage:
linkerd stat [flags] (RESOURCE)
...
```
```
$ bin/go-run cli/main.go stat deploy --to-namespace foo --from-namespace bar
Error: --to-namespace and --from-namespace flags are mutually exclusive
Usage:
linkerd stat [flags] (RESOURCE)
...
```
```
$ bin/go-run cli/main.go stat namespace foo --namespace bar
Error: --namespace flag is incompatible with namespace resource type
Usage:
linkerd stat [flags] (RESOURCE)
...
```
```
$ bin/go-run cli/main.go stat ns --to-namespace bar
Error: --to-namespace flag is incompatible with namespace resource type
Usage:
linkerd stat [flags] (RESOURCE)
...
```
```
$ bin/go-run cli/main.go stat namespace --from-namespace bar
Error: --from-namespace flag is incompatible with namespace resource type
Usage:
linkerd stat [flags] (RESOURCE)
...
```
```
$ bin/go-run cli/main.go stat ns/foo --from-namespace bar
Error: --from-namespace flag is incompatible with namespace resource type
Usage:
linkerd stat [flags] (RESOURCE)
...
```
Signed-off-by: Eliza Weisman <eliza@buoyant.io>