Rename snake case fields to camel case in service profile spec. This improves the way they are rendered when the `kubectl describe` command is used.
Signed-off-by: Alex Leong <alex@buoyant.io>
Now that #1921 has merged, we can query for top routes for any resource,
not just services.
This PR adds a dropdown for all resources to the Top Routes query form.
It also adds a link to the Top Routes page in the sidebar.
Add the ability to create and download a service profile from the web UI.
This form will be displayed in the call to action if no route metrics are found.
If the `linkerd routes` command gets two routes with the same name, it will only display one of them, even if the routes are from different services. This is particularly obvious with the default `[UNKNOWN]` route.
We now display all routes, even if they have the same name.
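A minimal sketch of the idea, assuming rows are indexed by service and route name together rather than by route name alone. The `routeRow` and `routeKey` types are hypothetical and are not the CLI's actual data structures.

```go
package main

import "fmt"

// routeRow is a hypothetical stand-in for one row of per-route stats.
type routeRow struct {
	Service string
	Route   string
	Count   int
}

// routeKey identifies a row by service and route name together.
type routeKey struct {
	Service string
	Route   string
}

// indexRows keys rows by (service, route) so that identical route names
// on different services do not overwrite each other.
func indexRows(rows []routeRow) map[routeKey]routeRow {
	out := make(map[routeKey]routeRow, len(rows))
	for _, r := range rows {
		out[routeKey{Service: r.Service, Route: r.Route}] = r
	}
	return out
}

func main() {
	rows := []routeRow{
		{Service: "books", Route: "[UNKNOWN]", Count: 3},
		{Service: "authors", Route: "[UNKNOWN]", Count: 5},
	}
	// Both rows survive because their services differ.
	fmt.Println(len(indexRows(rows)))
}
```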
Signed-off-by: Alex Leong <alex@buoyant.io>
Add support for service profiles created on external (non-service) authorities. For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`.
This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to look up a service profile for any authority. We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem.
We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup regardless of whether the authority looks like a Kubernetes service name. To simplify this, support for multiple resolves (which was unused) was removed.
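To illustrate why a suffix of `.` covers every authority, here is a rough sketch of suffix matching on fully qualified names. The proxy itself is not written in Go and its real matching logic is not shown here; `matchesSuffix` is a hypothetical helper that assumes the authority carries no port.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesSuffix reports whether authority falls under any configured suffix.
// Names are compared in fully qualified form (with a trailing dot), so the
// root suffix "." matches every name.
func matchesSuffix(authority string, suffixes []string) bool {
	fqdn := strings.TrimSuffix(authority, ".") + "."
	for _, s := range suffixes {
		if s == "." {
			return true
		}
		s = strings.TrimSuffix(s, ".") + "."
		if fqdn == s || strings.HasSuffix(fqdn, "."+s) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matchesSuffix("linkerd.io", []string{"."}))
	fmt.Println(matchesSuffix("linkerd.io", []string{"svc.cluster.local."}))
}
```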
Signed-off-by: Alex Leong <alex@buoyant.io>
Adds the top routes metrics to the resource detail pages.
* Add a tabbed view to the resource detail page
* Add the ability to query top routes from the detail tabs
* Move ConfigureProfilesMsg to its own module
The proxy-api service _always_ suggests that two meshed pods communicate
via HTTP/2 (i.e. via transparent protocol upgrading, if necessary).
This can complicate debugging and diagnostics at times, so it's
important that we have a way to deploy linkerd without this auto-upgrade
behavior.
This change adds a `-disable-h2-upgrade` flag to the `linkerd install`
command that disables transparent upgrading for the whole cluster.
When debugging issues, it's helpful to disable HTTP/2 upgrading to
simplify diagnostics.
This change adds an `enable-h2-upgrade` flag to _proxy-api_. When this
flag is set to false, the proxy-api will not suggest that meshed
endpoints are upgraded to use HTTP/2.
As a follow-up, a flag should be added to `install` to control how the
proxy-api is initialized.
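A rough sketch of how such a flag could gate the hint, using Go's standard `flag` package. The `protocolHint` type and the `hintFor` and `parseEnableH2Upgrade` helpers are hypothetical and are not the proxy-api's actual code. The default of `true` is assumed from the existing always-upgrade behavior.

```go
package main

import (
	"flag"
	"fmt"
)

// protocolHint is a hypothetical stand-in for the hint attached to an endpoint.
type protocolHint string

const (
	hintNone protocolHint = ""
	hintH2   protocolHint = "h2"
)

// hintFor suggests HTTP/2 only for meshed endpoints, and only when
// upgrading is enabled.
func hintFor(meshed, enableH2Upgrade bool) protocolHint {
	if meshed && enableH2Upgrade {
		return hintH2
	}
	return hintNone
}

// parseEnableH2Upgrade reads the flag from args, defaulting to true.
func parseEnableH2Upgrade(args []string) (bool, error) {
	fs := flag.NewFlagSet("proxy-api", flag.ContinueOnError)
	enable := fs.Bool("enable-h2-upgrade", true, "suggest HTTP/2 upgrades between meshed endpoints")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enable, nil
}

func main() {
	enable, err := parseEnableH2Upgrade([]string{"-enable-h2-upgrade=false"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("hint=%q\n", hintFor(true, enable))
}
```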
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.
Signed-off-by: Alex Leong <alex@buoyant.io>
We rename path to path_regex in the ServiceProfile CRD to make it clear that this field accepts a regular expression. We also take this opportunity to remove unnecessary line anchors from regular expressions now that these anchors are added in the proxy.
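The effect of implicit anchoring can be shown with a short Go sketch. It only demonstrates why explicit `^` and `$` become redundant once the whole path must match. How the proxy adds its anchors is not shown here, and `matchWholePath` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchWholePath wraps pathRegex in anchors so it must match the entire path.
func matchWholePath(pathRegex, path string) (bool, error) {
	re, err := regexp.Compile("^(?:" + pathRegex + ")$")
	if err != nil {
		return false, err
	}
	return re.MatchString(path), nil
}

func main() {
	for _, p := range []string{"/books/123", "/books/123/edit"} {
		ok, err := matchWholePath(`/books/\d+`, p)
		if err != nil {
			panic(err)
		}
		fmt.Println(p, ok)
	}
}
```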
Signed-off-by: Alex Leong <alex@buoyant.io>
commit 68f42c337f2580f3b33ddab2e01540f6849d0d1a (HEAD -> master, origin/master)
Author: Oliver Gould <ver@buoyant.io>
Date: Tue Dec 4 07:45:20 2018 -0800
Log discovery updates in the outbound proxy (#153)
When debugging issues that users believe are related to discovery, it's
helpful to get a narrow set of logs out to determine whether the proxy
is observing discovery updates.
With this change, a user can inject the proxy with
```
LINKERD2_PROXY_LOG='warn,linkerd2_proxy=info,linkerd2_proxy::app::outbound::discovery=debug'
```
and the proxy's logs will include messages like:
```
DBUG voting-svc.emojivoto.svc.cluster.local:8080 linkerd2_proxy::app::outbound::discovery adding 10.233.70.98:8080
DBUG voting-svc.emojivoto.svc.cluster.local:8080 linkerd2_proxy::app::outbound::discovery removing 10.233.66.36:8080
```
This change also turns down some overly chatty INFO logging in main.
[web UI] Previously, we were specifying the order in which to display the CLI flags in the
QueryToCliCmd module. But this order is pretty standard for each command, and
I'd like to avoid hardcoding that list everywhere.
Move the handling of order into the QueryToCliCmd module.
Adds an endpoint, at /profiles/new that allows you to input a service name and
namespace, and download a service profile yaml template.
This will enable future work, where we can add more of the yaml customization via
a form in the dashboard, and use that data to help the user configure routes.
A Tap integration test fails and has been fixed by
linkerd/linkerd2-proxy#152.
This change bumps the proxy version to get this change, as well as an
upgrade to the `h2` library for bugfixes.
Filtering by Kubernetes job was not supported. Also, filtering by any unknown
type caused a panic.
Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.
Fix panic when unknown type specified as a `--from` or `--to` flag.
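A minimal sketch of the two fixes above, assuming a lookup from resource type to Prometheus label name that returns an error for unknown types instead of panicking. `promLabelFor` and the list of supported types are hypothetical.

```go
package main

import "fmt"

// promLabelFor returns the Prometheus label used to filter by resourceType.
// "job" is special-cased so it does not collide with Prometheus' own
// `job` label.
func promLabelFor(resourceType string) (string, error) {
	switch resourceType {
	case "job":
		return "k8s_job", nil
	case "deployment", "namespace", "pod", "service":
		// Assumed subset of supported types, for illustration only.
		return resourceType, nil
	default:
		return "", fmt.Errorf("unsupported resource type: %s", resourceType)
	}
}

func main() {
	for _, t := range []string{"job", "deployment", "bogus"} {
		label, err := promLabelFor(t)
		fmt.Println(t, label, err)
	}
}
```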
Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.
Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.
Set `--controller-log-level debug` in `install_test.go` for easier debugging.
Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.
Fixes #1872
Part of #627
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
rt_ labels in tap output (when -o wide is used).
Previously, we were passing in "tap" as the command name for both the tap and
top forms, resulting in the equivalent CLI command always being linkerd tap
regardless of whether you were in the Tap or Top view.
Fix this to correctly pass in tap or top depending on the page.
* Adjust proxy, Prometheus, and Grafana probes
High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.
Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.
Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s
Fixes #1804
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `--open-api` flag is an alternative to the `--template` flag for the `linkerd profile` command. It reads an OpenAPI specification file (also called a swagger file) and uses it to generate a corresponding service profile.
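As a rough illustration of one step in such a conversion, the sketch below turns an OpenAPI path template into a path regular expression. The translation rule (each `{param}` segment becomes `[^/]*`, everything else is escaped) is an assumption made for this example and is not confirmed as the command's exact behavior. `pathToRegex` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pathParam matches an OpenAPI path parameter such as {id}.
var pathParam = regexp.MustCompile(`\{[^/}]*\}`)

// pathToRegex escapes the literal parts of an OpenAPI path and replaces
// each {param} with a pattern matching a single path segment.
func pathToRegex(path string) string {
	var b strings.Builder
	last := 0
	for _, loc := range pathParam.FindAllStringIndex(path, -1) {
		b.WriteString(regexp.QuoteMeta(path[last:loc[0]]))
		b.WriteString("[^/]*")
		last = loc[1]
	}
	b.WriteString(regexp.QuoteMeta(path[last:]))
	return b.String()
}

func main() {
	fmt.Println(pathToRegex("/books/{id}"))
	fmt.Println(pathToRegex("/v1.0/users/{id}/posts"))
}
```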
Signed-off-by: Alex Leong <alex@buoyant.io>
Separates out the querying and table display of route data, so that this module
can be easily placed in other places in the UI.
Adds usability improvements to the routes query form at /routes:
- displays CLI equivalent
- adds dropdown with populated options for service / namespace
As part of this work, made TapQueryCliCmd more generic, so it can work
for other CLI commands besides tap/top.
When using `--proxy-auto-inject` with Kubernetes `v1.9.11`, we observed the auto
injector incorrectly merging list elements rather than inserting new
ones. This issue was not reproducible on `v1.10.3`.
For example, this input:
```
spec:
  template:
    spec:
      containers:
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```
Would yield:
```
spec:
  template:
    spec:
      containers:
      - name: linkerd-proxy
        command:
        - emojivoto-vote-bot
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```
This change replaces json patch specs like
`/spec/template/spec/containers/0` with
`/spec/template/spec/containers/-`. The former is intended to insert at
the beginning of a list, the latter at the end. This also simplifies the
code a bit and more closely aligns with the intent of injecting at the
end of lists.
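For reference, the difference between the two path forms follows the JSON Patch `add` semantics for arrays (RFC 6902): a numeric index inserts before that position, while `-` appends. The sketch below models only that array behavior on a slice of container names. `addToList` is a hypothetical helper, not the injector's code, and it does not reproduce the merging behavior observed on `v1.9.11`.

```go
package main

import (
	"fmt"
	"strconv"
)

// addToList models the array case of a JSON Patch "add" operation:
// a numeric index inserts before that position, "-" appends to the end.
func addToList(list []string, index, value string) ([]string, error) {
	if index == "-" {
		return append(append([]string{}, list...), value), nil
	}
	i, err := strconv.Atoi(index)
	if err != nil || i < 0 || i > len(list) {
		return nil, fmt.Errorf("invalid index %q", index)
	}
	out := make([]string, 0, len(list)+1)
	out = append(out, list[:i]...)
	out = append(out, value)
	out = append(out, list[i:]...)
	return out, nil
}

func main() {
	containers := []string{"vote-bot"}
	first, _ := addToList(containers, "0", "linkerd-proxy")
	last, _ := addToList(containers, "-", "linkerd-proxy")
	fmt.Println(first) // [linkerd-proxy vote-bot]
	fmt.Println(last)  // [vote-bot linkerd-proxy]
}
```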
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds a (currently not displayed in sidebar, but available at /routes) page to
mirror the current functionality of `linkerd routes <service>`. So far, this is just a
barebones form and table, but it works.
Adds a /api/routes path and handler to the api to receive TopRoutes requests from the web.
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).
Accessible from the web UI via http://localhost:8084/api/services
The `linkerd routes` command only supports outbound metrics queries (i.e. ones with the `--from` flag). Inbound queries (i.e. ones without the `--from` flag) never return any metrics.
We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
The tap server accesses protobuf fields directly instead of using the
`Get*()` accessors. The accessors are necessary to prevent dereferencing
a nil pointer and crashing the tap service.
Furthermore, these maps are explicitly initialized when `nil` to support
label hydration.
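The sketch below shows why the accessors matter, using hand-written types shaped like protobuf-generated Go code. `Event`, `Metadata`, and `hydrate` are hypothetical and are not the tap API's real messages. Generated getters check for a nil receiver, so a chain of `Get*()` calls is safe where direct field access would panic.

```go
package main

import "fmt"

// Metadata and Event are hypothetical messages mimicking generated code.
type Metadata struct {
	Labels map[string]string
}

type Event struct {
	Meta *Metadata
}

// GetMeta is nil-safe: it may be called on a nil *Event.
func (e *Event) GetMeta() *Metadata {
	if e != nil {
		return e.Meta
	}
	return nil
}

// GetLabels is nil-safe: it may be called on a nil *Metadata.
func (m *Metadata) GetLabels() map[string]string {
	if m != nil {
		return m.Labels
	}
	return nil
}

// hydrate adds a label, initializing the map first when it is nil,
// because writing to a nil map panics.
func hydrate(m *Metadata, key, value string) {
	if m.Labels == nil {
		m.Labels = make(map[string]string)
	}
	m.Labels[key] = value
}

func main() {
	var e *Event
	// e.Meta.Labels would panic here; the accessor chain does not.
	fmt.Println(len(e.GetMeta().GetLabels()))

	m := &Metadata{}
	hydrate(m, "pod", "web-1")
	fmt.Println(m.GetLabels()["pod"])
}
```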
This change allows some advised production config to be applied to the install of the control plane.
Currently this runs 3x replicas of the controller and adds some pretty sane requests to each of the components + containers of the control plane.
Fixes #1101
Signed-off-by: Ben Lambert <ben@blam.sh>
Add a routes command which displays per-route stats for services that have service profiles defined.
This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC is implemented, and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command and much of the code was able to be refactored and shared.
Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
When a pod sends requests to itself, the proxy properly redirects traffic from the originating container in the pod through the outbound listener of the proxy. Once the request lands on the inbound side, however, it skips the proxy and calls the original container that made the request. This can cause problems for containers that serve HTTP, as the proxy naively tries to initiate an HTTP/2 connection to the destination of a request. (See #1585 for a concrete example.)
This PR adds a new iptables rule, coupled with a proxy [change](https://github.com/linkerd/linkerd2-proxy/pull/122), to ensure that requests that occur in the aforementioned scenario always redirect to the inbound listener of the proxy first.
Fixes #1585
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.
Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).
# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.
# Validation
Tests have been updated to reflect the expected destination labels now appended in `--from` queries.
Fixes #1766
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Refactor util.BuildResource so it can deal with multiple resources
First step to address #1487: Allow stat summary to query for multiple
resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Update the stat cli help text to explain the new multi resource querying ability
Proposal for #1487: Allow stat summary to query for multiple resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Allow stat summary to query for multiple resources
Implement this ability by issuing parallel requests to requestStatsFromAPI()
Proposal for #1487
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Update tests as part of multi-resource support in `linkerd stat` (#1487)
- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* `linkerd stat` called with multiple resources should keep an ordering (#1487)
Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Extra validations for `linkerd stat` with multiple resources (#1487)
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* `linkerd stat` resource grouping, ordering and name prefixing (#1487)
- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Allow `linkerd stat` to be called with multiple resources
A few final refactorings as per code review.
Fixes #1487
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
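The parallel querying and stable ordering described in the commits above could be sketched roughly as follows. This is not the actual implementation: `fetchAll` is a hypothetical helper, and the `fetch` callback stands in for a call such as `requestStatsFromAPI()`. Each goroutine writes to its own slot, so results come back in the order the resources were given.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch once per resource concurrently and returns the
// results in the same order as the input slice.
func fetchAll(resources []string, fetch func(string) (string, error)) ([]string, error) {
	results := make([]string, len(resources))
	errs := make([]error, len(resources))
	var wg sync.WaitGroup
	for i, res := range resources {
		wg.Add(1)
		go func(i int, res string) {
			defer wg.Done()
			// Each goroutine owns index i, so no locking is needed.
			results[i], errs[i] = fetch(res)
		}(i, res)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	out, err := fetchAll([]string{"deploy", "po", "svc"}, func(r string) (string, error) {
		return "stats for " + r, nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```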
This commit removes duplicate logic that loads Kubernetes config and
replaces it with GetConfig from pkg/k8s. This also allows loading
config from default sources like $KUBECONFIG instead of explicitly
passing -kubeconfig option to controller components.
Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
When running integration tests in a Kubernetes cluster that sometimes takes a little longer to get pods ready, the integration tests fail too early because most tests have a retry timeout of 30 seconds.
This PR bumps up this retry timeout for `TestInstall` to 3 minutes. This gives the test enough time to download any new docker images that it needs to complete successfully and also reduces the need to have large timeout values for subsequent tests. This PR also refactors `CheckPods` to check that all containers in the pods for a deployment are in a `Ready` state. This also helps ensure that all docker images have been downloaded and the pods are in a good state.
Tests were run on the community cluster and all were successful.
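A minimal sketch of a retry loop with an overall timeout, of the kind the paragraph above describes. `retryFor` and `errNotReady` are hypothetical names and this is not the test helper's actual code. The short durations in `main` are for demonstration only.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error a readiness check might return.
var errNotReady = errors.New("pods not ready")

// retryFor calls check until it succeeds or the timeout would be exceeded,
// sleeping for interval between attempts.
func retryFor(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("timed out after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	err := retryFor(time.Second, 10*time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(attempts, err)
}
```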
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
This PR adds a check before the TestCheckProxy executes the CLI command to make sure the control plane has installed all its pods. This avoids the situation where the test would fail because the pods are retrying the data plane check. The PR's base is #1835 but will switch to master once that PR merges.
Relates to PR #1835
Fixes #1733
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
linkerd/linkerd2-proxy#116 removes the `classification` label for the
`tcp_close_total` metric because TCP sockets that close with an error
do not actually indicate any sort of failure -- many graceful shutdown
situations can still cause a socket error.
This change uses the `errno` label to enumerate tcp_close_total metrics.