1029 Commits

Author SHA1 Message Date
Andrew Seigner bef9479f57
Add input validation for profile command (#1934)
Fixes #1878

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-05 15:13:10 -08:00
Risha Mars 442685674b
Add a Create Service Profile dialog (#1933)
Add the ability to create and download a service profile from the web UI.

This form will be displayed in the call to action if no route metrics are found.
2018-12-05 15:08:10 -08:00
Alex Leong 7169eaef27
Stop routes with the same name from different services from clobbering each other (#1936)
If the `linkerd routes` command gets two routes with the same name, it will only display one of them, even if the routes are from different services.  This is particularly obvious with the default `[UNKNOWN]` route.

We now display all routes, even if they have the same name.
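
As a rough illustration of the clobbering described above (a sketch only, not the actual CLI code; the `row` and `routeKey` types and both helper functions are hypothetical), indexing rows by route name alone drops one of two identically named routes, while indexing by service and name keeps both:

```go
package main

import "fmt"

// row is a hypothetical stand-in for one line of `linkerd routes` output.
type row struct {
	Service string
	Route   string
	RPS     float64
}

// routeKey identifies a route by service and name, so identically named
// routes from different services do not collide.
type routeKey struct {
	Service string
	Route   string
}

// byName indexes rows by route name only; a later row overwrites an
// earlier row that has the same name.
func byName(rows []row) map[string]row {
	out := make(map[string]row)
	for _, r := range rows {
		out[r.Route] = r
	}
	return out
}

// byServiceAndName indexes rows by (service, route), keeping every row.
func byServiceAndName(rows []row) map[routeKey]row {
	out := make(map[routeKey]row)
	for _, r := range rows {
		out[routeKey{r.Service, r.Route}] = r
	}
	return out
}

func main() {
	rows := []row{
		{"web", "[UNKNOWN]", 1.5},
		{"voting", "[UNKNOWN]", 0.5},
	}
	fmt.Println(len(byName(rows)), len(byServiceAndName(rows)))
}
```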

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 15:05:19 -08:00
Alex Leong cbb196066f
Support service profiles for external authorities (#1928)
Add support for service profiles created on external (non-service) authorities.  For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`.

This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to lookup a service profile for any authority.  We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem.

We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup, regardless of if the authority looks like a Kubernetes service name or not.  To simplify this, support for multiple resolves (which was unused) was removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 14:32:59 -08:00
Risha Mars 5b26508f7c
Add a tabbed view to the resource detail page for Top and Routes (#1918)
Adds the top routes metrics to the resource detail pages.

* Add a tabbed view to the resource detail page
Add the ability to query top routes from the detail tabs

* Move ConfigureProfilesMsg to its own module
2018-12-05 13:55:31 -08:00
Oliver Gould f80f3892a0
proxy: bump version for bug fixes (#1935)
* 0065c137 profiles: Drive profile discovery on a daemon task (#156)
* b9ffbb7f Update h2 to v0.1.14
* 3ac6b72c Add basic tap integration tests (#154)
2018-12-05 13:23:26 -08:00
Oliver Gould 12ec5cf922
install: Add a -disable-h2-upgrade flag (#1926)
The proxy-api service _always_ suggests that two meshed pods communicate
via HTTP/2 (i.e. via transparent protocol upgrading, if necessary).
This can complicate debugging and diagnostics at times, so it's
important that we have a way to deploy linkerd without this auto-upgrade
behavior.

This change adds a `-disable-h2-upgrade` flag to the `linkerd install`
command that disables transparent upgrading for the whole cluster.
2018-12-05 12:50:47 -08:00
Oliver Gould 8f9bb711dd
proxy-api: Expose a flag to control auto-h2-upgrade (#1925)
When debugging issues, it's helpful to disable HTTP/2 upgrading to
simplify diagnostics.

This change adds an `enable-h2-upgrade` flag to _proxy-api_. When this
flag is set to false, the proxy-api will not suggest that meshed
endpoints are upgraded to use HTTP/2.

As a follow-up, a flag should be added to `install` to control how the
proxy-api is initialized.
2018-12-05 12:41:20 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Alex Leong 4f3e55e937
Rename path to path_regex in ServiceProfile CRD (#1923)
We rename path to path_regex in the ServiceProfile CRD to make it clear that this field accepts a regular expression. We also take this opportunity to remove unnecessary line anchors from regular expressions now that these anchors are added in the proxy.
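
To illustrate why explicit anchors become redundant once the consumer adds them, here is a minimal Go sketch. The proxy itself is not written in Go, and the `anchor` helper and the sample pattern are assumptions made purely for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// anchor wraps a pattern so that it must match the entire path.
// This is a hypothetical helper; it only mirrors the idea that anchors
// are added by the consumer rather than written in each profile.
func anchor(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + pattern + ")$")
}

func main() {
	re, err := anchor(`/books/\d+`)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("/books/42"))      // whole path matches
	fmt.Println(re.MatchString("/books/42/edit")) // trailing segment rejected
}
```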

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 10:42:47 -08:00
Risha Mars 7949a3355c
s/Request Rate/RPS (#1927) 2018-12-04 17:29:31 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner ad2366f208
Revert proxy readiness initialDelaySeconds change (#1912)
Reverts part of #1899 to work around readiness failures.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-04 14:27:55 -08:00
Oliver Gould ffa302eb6a
proxy: Update for debug logging (#1922)
commit 68f42c337f2580f3b33ddab2e01540f6849d0d1a (HEAD -> master, origin/master)
Author: Oliver Gould <ver@buoyant.io>
Date:   Tue Dec 4 07:45:20 2018 -0800

    Log discovery updates in the outbound proxy (#153)

    When debugging issues that users believe are related to discovery, it's
    helpful to get a narrow set of logs out to determine whether the proxy
    is observing discovery updates.

    With this change, a user can inject the proxy with
    ```
    LINKERD2_PROXY_LOG='warn,linkerd2_proxy=info,linkerd2_proxy::app::outbound::discovery=debug'
    ```
    and the proxy's logs will include messages like:

    ```
    DBUG voting-svc.emojivoto.svc.cluster.local:8080 linkerd2_proxy::app::outbound::discovery adding 10.233.70.98:8080
    DBUG voting-svc.emojivoto.svc.cluster.local:8080 linkerd2_proxy::app::outbound::discovery removing 10.233.66.36:8080
    ```

    This change also turns-down some overly chatty INFO logging in main.
2018-12-04 12:13:45 -08:00
Risha Mars 92d92d3f9b
Move the determining of display order into the cli query module (#1917)
[web UI] Previously, we were specifying the display order to display the cli flags in the
QueryToCliCmd module. But this order is pretty standard for each command, and
I'd like to avoid hardcoding that list everywhere.

Move the handling of order into the QueryToCliCmd module.
2018-12-04 10:08:59 -08:00
Risha Mars e8a39cd17e
Add ability to download a service profile template from the web UI (#1893)
Adds an endpoint, at /profiles/new that allows you to input a service name and
namespace, and download a service profile yaml template. 

This will enable future work, where we can add more of the yaml customization via 
a form in the dashboard, and use that data to help the user configure routes.
2018-12-03 16:48:43 -08:00
Oliver Gould baa7436cc7
Bump the proxy version to fix integration tests (#1914)
A Tap integration test fails and has been fixed by
linkerd/linkerd2-proxy#152.

This change bumps the proxy version to get this change, as well as an
upgrade to the `h2` library for bugfixes.
2018-12-03 16:30:35 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.
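
A minimal sketch of this special-case mapping, assuming a hypothetical `labelFor` helper and an abbreviated list of resource types (neither is taken from the actual change). Unknown types return an error instead of panicking:

```go
package main

import "fmt"

// labelFor returns the Prometheus label used to filter by a resource type.
// `job` maps to `k8s_job` so it cannot collide with Prometheus' own `job`
// label. The set of supported types here is illustrative only.
func labelFor(resourceType string) (string, error) {
	switch resourceType {
	case "job":
		return "k8s_job", nil
	case "deployment", "pod", "namespace":
		return resourceType, nil
	default:
		return "", fmt.Errorf("unsupported resource type: %s", resourceType)
	}
}

func main() {
	for _, t := range []string{"deployment", "job", "bogus"} {
		label, err := labelFor(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(t, "->", label)
	}
}
```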

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Oliver Gould 926395f616
tap: Include route labels in tap events (#1902)
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
rt_ labels in tap output (when -o wide is used).
2018-12-03 13:52:47 -08:00
Risha Mars dd44e10a58
Display the correct command name for the cli equivalent (#1907)
Previously, we were passing in "tap" as the command name for both the tap and
top forms, resulting in the equivalent CLI command always being linkerd tap
regardless of whether you were in the Tap or Top view.

Fix this to correctly pass in tap or top depending on the page.
2018-12-03 12:22:00 -08:00
Risha Mars c108cd260d
Upgrade to material-ui 3.6.1 (#1906) 2018-12-03 11:59:45 -08:00
Andrew Seigner d121071f87
Adjust proxy, Prometheus, and Grafana probes (#1899)
* Adjust proxy, Prometheus, and Grafana probes

High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.

Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.

Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s

Fixes #1804

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 10:41:11 -08:00
Alex Leong f9d66cf4de
Add --open-api option to linkerd profiles command (#1867)
The `--open-api` flag is an alternative to the `--template` flag for the `linkerd profile` command.  It reads an OpenAPI specification file (also called a swagger file) and uses it to generate a corresponding service profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-30 09:25:19 -08:00
Risha Mars 7a2689ce4a
Improve the top routes request form and code structure. (#1886)
Separates out the querying and table display of route data, so that this module
can be easily placed in other places in the UI. 

Adds usability improvements to the routes query form at /routes:
- displays CLI equivalent 
- adds dropdown with populated options for service / namespace 

As part of this work, made TapQueryCliCmd more generic, so it can work
for other CLI commands besides tap/top.
2018-11-29 10:20:53 -08:00
Andrew Seigner e9aa9114e1
Release notes for edge-18.11.3 (#1885)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-11-28 15:41:27 -08:00
Andrew Seigner 34d9eef03e
proxy injector: insert at end of arrays (#1881)
When using `--proxy-auto-inject` with Kubernetes `v1.9.11`, observed auto
injector incorrectly merging list elements rather than inserting new
ones. This issue was not reproducible on `v1.10.3`.

For example, this input:
```
spec:
  template:
    spec:
      containers:
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

Would yield:
```
spec:
  template:
    spec:
      containers:
      - name: linkerd-proxy
        command:
        - emojivoto-vote-bot
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

This change replaces json patch specs like
`/spec/template/spec/containers/0` with
`/spec/template/spec/containers/-`. The former is intended to insert at
the beginning of a list, the latter at the end. This also simplifies the
code a bit and more closely aligns with the intent of injecting at the
end of lists.
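
The difference between the two path forms can be sketched in Go. This is a simplified model of the JSON Patch (RFC 6902) `add` operation on a list of container names, not the injector's code; `addToList` is a hypothetical helper that only looks at the final path segment:

```go
package main

import (
	"fmt"
	"strconv"
)

// addToList models the JSON Patch "add" operation on an array, given the
// last path segment: a numeric index inserts before the element at that
// index, and "-" appends after the last element.
func addToList(list []string, segment, value string) ([]string, error) {
	if segment == "-" {
		return append(append([]string{}, list...), value), nil
	}
	i, err := strconv.Atoi(segment)
	if err != nil || i < 0 || i > len(list) {
		return nil, fmt.Errorf("invalid array index %q", segment)
	}
	out := make([]string, 0, len(list)+1)
	out = append(out, list[:i]...)
	out = append(out, value)
	out = append(out, list[i:]...)
	return out, nil
}

func main() {
	containers := []string{"vote-bot"}
	front, _ := addToList(containers, "0", "linkerd-proxy")
	back, _ := addToList(containers, "-", "linkerd-proxy")
	fmt.Println(front) // proxy placed before the existing container
	fmt.Println(back)  // proxy placed after the existing container
}
```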

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-11-28 14:21:18 -08:00
Alex Leong 835e34b500
Left-align the routes column and sort by route name (#1879)
Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-28 09:32:59 -08:00
Risha Mars d9539bcb37
Add the top routes feature to the dashboard UI (#1868)
Adds a (currently not displayed in sidebar, but available at /routes) page to
mirror the current functionality of `linkerd routes <service>`. So far, this is just a
barebones form and table, but it works.

Adds a /api/routes path and handler to the api to receive TopRoutes requests from the web.
2018-11-27 16:53:10 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 73836f05cf
Update proxy version and use canonicalized dst (#1866)
The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag).  Inbound queries (i.e. ones without the `--from` flag) never return any metrics.

We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-26 17:20:07 -08:00
Oliver Gould ba11698d4b
tap: Use nil-safe protobuf accessors (#1873)
The tap server accesses protobuf fields directly instead of using the
`Get*()` accessors. The accessors are necessary to prevent dereferencing
a nil pointer and crashing the tap service.

Furthermore, these maps are explicitly initialized when `nil` to support
label hydration.
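
The accessor pattern can be sketched with hand-written stand-ins for generated protobuf types. The `Event` and `Endpoint` types and the `hydrate` helper below are hypothetical; only the shape of the nil-checking getters follows the convention of Go protobuf generated code:

```go
package main

import "fmt"

// Endpoint and Event are hand-written stand-ins for generated message types.
type Endpoint struct {
	Labels map[string]string
}

type Event struct {
	Source *Endpoint
}

// GetSource is safe to call on a nil *Event.
func (e *Event) GetSource() *Endpoint {
	if e == nil {
		return nil
	}
	return e.Source
}

// GetLabels is safe to call on a nil *Endpoint.
func (ep *Endpoint) GetLabels() map[string]string {
	if ep == nil {
		return nil
	}
	return ep.Labels
}

// hydrate adds a label, initializing the map first when it is nil,
// because writing to a nil map panics.
func hydrate(ep *Endpoint, key, value string) {
	if ep.Labels == nil {
		ep.Labels = make(map[string]string)
	}
	ep.Labels[key] = value
}

func main() {
	var ev *Event
	// Direct field access (ev.Source.Labels) would panic here.
	fmt.Println(len(ev.GetSource().GetLabels()))

	ep := &Endpoint{}
	hydrate(ep, "deployment", "web")
	fmt.Println(ep.GetLabels()["deployment"])
}
```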
2018-11-26 14:14:28 -08:00
Ben Lambert 297cb570f2 Added a --ha flag to install CLI (#1852)
This change allows some advised production config to be applied to the install of the control plane.
Currently this runs 3x replicas of the controller and adds some pretty sane requests to each of the components + containers of the control plane.

Fixes #1101

Signed-off-by: Ben Lambert <ben@blam.sh>
2018-11-20 23:03:59 -05:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method to the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC works, and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Kevin Lingerfelt 4d4b1ebc89
Update CHANGES.md for edge-18.11.2 release (#1864)
* Update CHANGES.md for edge-18.11.2 release
* Add issue number for same src/dst issue

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-11-15 14:39:25 -08:00
Dennis Adjei-Baah 214540c823
Add new iptable rule for outbound traffic (#1863)
When a pod sends requests to itself, the proxy properly redirects traffic from the originating container in the pod through the outbound listener of the proxy. Once the request ends on the inbound side of the proxy, it skips the proxy and calls the original container that made the request. This can cause problems for containers that serve HTTP as the proxy naively tries to initiate an HTTP/2 connection to the destination of a request.  (See #1585 for a concrete example)

This PR adds a new iptable rule, coupled with a proxy [change](https://github.com/linkerd/linkerd2-proxy/pull/122), to ensure that requests that occur in the aforementioned scenario always redirect to the inbound listener of the proxy first.

fixes #1585

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-15 13:56:45 -08:00
Kevin Leimkuhler c68693e820
Fix stat filtering for `--from` queries (#1856)
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.

Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).

# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.

# Validation
Tests have been updated to reflect the expected destination labels now appended in `--from` queries.

Fixes #1766 

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2018-11-14 10:52:27 -08:00
Alejandro Pedraza bbcf5a8c9f Allow stat summary to query for multiple resources (#1841)
* Refactor util.BuildResource so it can deal with multiple resources

First step to address #1487: Allow stat summary to query for multiple
resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update the stat cli help text to explain the new multi resource querying ability

Proposal for #1487: Allow stat summary to query for multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow stat summary to query for multiple resources

Implement this ability by issuing parallel requests to requestStatsFromAPI()

Proposal for #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update tests as part of multi-resource support in `linkerd stat` (#1487)

- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` called with multiple resources should keep an ordering (#1487)

Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Extra validations for `linkerd stat` with multiple resources (#1487)

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` resource grouping, ordering and name prefixing (#1487)

- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow `linkerd stat` to be called with multiple resources

A few final refactorings as per code review.

Fixes #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-11-14 10:44:04 -08:00
Kevin Lingerfelt 4547ba7f0a
Make permission checks non-fatal, add check for CRDs (#1859)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-11-14 10:29:04 -08:00
Igor Zibarev 60bcdb15f9 controller: use GetConfig from pkg/k8s package (#1857)
This commit removes duplicate logic that loads Kubernetes config and
replaces it with GetConfig from pkg/k8s. This also allows loading
config from default sources like $KUBECONFIG instead of explicitly
passing -kubeconfig option to controller components.

Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
2018-11-13 14:41:31 -08:00
Alena Varkockova fda834cf64 Allow retrying control plane API check (#1858)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-13 10:52:50 -08:00
Igor Zibarev 9b3f67e3dc Fix comment typos (#1855)
Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
2018-11-12 15:27:22 -08:00
Alena Varkockova 38dfc5308f Make version checks warning (#1844)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-09 09:48:14 -08:00
Dennis Adjei-Baah 15e87bfd8d
Increase retry timeout for retryable tests, refactor RetryFor (#1835)
When running integration tests in a Kubernetes cluster that sometimes takes a little longer to get pods ready, the integration tests fail too early because most tests have a retry timeout of 30 seconds.

This PR bumps up this retry timeout for `TestInstall` to 3 minutes. This gives the test enough time to download any new docker images that it needs to complete successfully and also reduces the need to have large timeout values for subsequent tests. This PR also refactors `CheckPods` to check that all containers in the pods for a deployment are in a `Ready` state. This also helps ensure that all docker images have been downloaded and the pods are in a good state.

Tests were run on the community cluster and all were successful.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-06 16:03:58 -08:00
Dennis Adjei-Baah dfaf3b1e1b
bump proxy version to 5e0a15b (#1842)
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-06 13:20:52 -08:00
Dennis Adjei-Baah 1c79f0146c
add control plane check before running proxy test assertion (#1836)
This PR adds a check before the TestCheckProxy executes the CLI command to make sure the control plane has installed all its pods. This avoids the situation where the test would fail because the pods are retrying the data plane check. The PR's base is #1835 but will switch to master once that PR merges.

Relates to PR #1835
Fixes #1733

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-02 13:30:06 -07:00
Oliver Gould 747fd328e9
grafana: Show TCP closes by errno (#1839)
linkerd/linkerd2-proxy#116 removes the `classification` label for the
`tcp_close_total` metric because TCP sockets that close with an error
do not actually indicate any sort of failure -- many graceful shutdown
situations can still cause a socket error.

This change uses the `errno` label to enumerate tcp_close_total metrics.
2018-11-02 10:20:11 -07:00
Risha Mars a6c5f9820e
Update CHANGES.md for the edge-18.11.1 release (#1840)
* Update CHANGES.md for the edge-18.11.1 release
2018-11-01 14:02:12 -07:00
Risha Mars 6d8911090d
Rewrite octopus arm code to be more parameterized and flexible. (#1834)
As a result, displays better in the material UI version of the dashboard.
Also adds Success rate to data displayed on neighbour nodes.

* Rewrite octopus arm code to be more parameterized and flexible.
As a result, displays better in the material UI version of the dashboard.
* Add Success rate to data displayed on neighbour nodes
* Fix variability in grid spacing by fixing the max and min widths of the chart,
and by scrolling the overflow
* Center the octopus graph so it looks better at full width
* Also add padding, so that the drop shadows aren't cut off
2018-11-01 12:30:19 -07:00
Andrew Seigner f777f87924
Fix grid alignment in tap query form (#1833)
Fixes #1789

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-31 14:10:16 -07:00
Andrew Seigner 1eb93b670a
Replace some sidebar icons with Font Awesome (#1830)
This replaces a couple of the MaterialUI icons introduced in #1776 with
their original counterparts in Font Awesome, but wrapped in a MaterialUI
`Icon` tag. Also fix Linkerd logo padding in sidebar.

Part of #1781.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-31 13:42:55 -07:00