Commit Graph

959 Commits

Author SHA1 Message Date
Risha Mars c108cd260d
Upgrade to material-ui 3.6.1 (#1906) 2018-12-03 11:59:45 -08:00
Andrew Seigner d121071f87
Adjust proxy, Prometheus, and Grafana probes (#1899)
* Adjust proxy, Prometheus, and Grafana probes

High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.

Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.

Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s
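
The readiness values listed above can be laid out as plain data to see the effect on startup. This is an illustrative sketch only: the dictionary layout and the helper `earliest_ready_seconds` are hypothetical and not part of Linkerd. It relies only on the Kubernetes rule that no probe runs before `initialDelaySeconds` has elapsed.

```python
# Readiness probe settings before and after this change, copied from the
# list above. The dict layout is an assumption made for illustration.
READINESS_BEFORE = {
    "proxy": {"initialDelaySeconds": 10},
    "prometheus": {"initialDelaySeconds": 30, "timeoutSeconds": 30},
    "grafana": {"initialDelaySeconds": 30, "timeoutSeconds": 30, "failureThreshold": 10},
}
READINESS_AFTER = {
    "proxy": {"initialDelaySeconds": 0},
    "prometheus": {"initialDelaySeconds": 0, "timeoutSeconds": 1},
    "grafana": {"initialDelaySeconds": 0, "timeoutSeconds": 1, "failureThreshold": 3},
}


def earliest_ready_seconds(probes):
    """Lower bound on when every component can first report ready.

    Probing does not start until initialDelaySeconds has passed, so the
    component with the largest delay sets the floor.
    """
    return max(p["initialDelaySeconds"] for p in probes.values())


print(earliest_ready_seconds(READINESS_BEFORE))  # 30
print(earliest_ready_seconds(READINESS_AFTER))   # 0
```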

Fixes #1804

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 10:41:11 -08:00
Alex Leong f9d66cf4de
Add --open-api option to linkerd profiles command (#1867)
The `--open-api` flag is an alternative to the `--template` flag for the `linkerd profile` command.  It reads an OpenAPI specification file (also called a swagger file) and uses it to generate a corresponding service profile.
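
As a rough illustration of the idea, the sketch below walks the `paths` section of an OpenAPI document and emits one route entry per path and method. It is not the CLI's implementation: the function names and the output keys (`name`, `method`, `path_regex`) are hypothetical, and the real service profile format may differ.

```python
import re

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def path_to_regex(path):
    """Turn an OpenAPI path template such as /users/{id} into a regex string."""
    parts = re.split(r"(\{[^/{}]+\})", path)
    # Odd-indexed parts are the {param} placeholders captured by the split.
    return "".join("[^/]*" if i % 2 else re.escape(p) for i, p in enumerate(parts))


def routes_from_openapi(spec):
    """Build one hypothetical route entry per (path, method) pair in the spec."""
    routes = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method in sorted(item):
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            routes.append({
                "name": f"{method.upper()} {path}",
                "method": method.upper(),
                "path_regex": path_to_regex(path),
            })
    return routes


spec = {"paths": {"/users/{id}": {"get": {}, "delete": {}}, "/users": {"post": {}}}}
for route in routes_from_openapi(spec):
    print(route["name"])
```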

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-30 09:25:19 -08:00
Risha Mars 7a2689ce4a
Improve the top routes request form and code structure. (#1886)
Separates out the querying and table display of route data, so that this module
can be easily placed in other places in the UI. 

Adds usability improvements to the routes query form at /routes:
- displays CLI equivalent 
- adds dropdown with populated options for service / namespace 

As part of this work, made TapQueryCliCmd more generic, so it can work
for other CLI commands besides tap/top.
2018-11-29 10:20:53 -08:00
Andrew Seigner e9aa9114e1
Release notes for edge-18.11.3 (#1885)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-11-28 15:41:27 -08:00
Andrew Seigner 34d9eef03e
proxy injector: insert at end of arrays (#1881)
When using `--proxy-auto-inject` with Kubernetes `v1.9.11`, observed auto
injector incorrectly merging list elements rather than inserting new
ones. This issue was not reproducible on `v1.10.3`.

For example, this input:
```
spec:
  template:
    spec:
      containers:
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

Would yield:
```
spec:
  template:
    spec:
      containers:
      - name: linkerd-proxy
        command:
        - emojivoto-vote-bot
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

This change replaces json patch specs like
`/spec/template/spec/containers/0` with
`/spec/template/spec/containers/-`. The former is intended to insert at
the beginning of a list, the latter at the end. This also simplifies the
code a bit and more closely aligns with the intent of injecting at the
end of lists.
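
The difference between the two paths follows from the JSON Patch `add` operation (RFC 6902): a numeric index inserts before the element at that index, while `-` appends. The sketch below is a minimal stand-alone illustration of those semantics, not the injector's code; `json_patch_add` and `container_names` are hypothetical helpers, and JSON Pointer escapes are not handled.

```python
import copy


def json_patch_add(doc, path, value):
    """Apply one RFC 6902 "add" operation to nested dicts and lists (sketch).

    Handles object members, numeric array indexes (insert before that
    index) and "-" (append to the array). Returns a patched copy.
    """
    doc = copy.deepcopy(doc)
    *parents, last = path.lstrip("/").split("/")
    target = doc
    for token in parents:
        target = target[int(token)] if isinstance(target, list) else target[token]
    if isinstance(target, list):
        if last == "-":
            target.append(value)
        else:
            index = int(last)
            if index > len(target):
                raise IndexError("array index out of range")
            target.insert(index, value)
    else:
        target[last] = value
    return doc


def container_names(doc):
    return [c["name"] for c in doc["spec"]["template"]["spec"]["containers"]]


pod = {"spec": {"template": {"spec": {"containers": [{"name": "vote-bot"}]}}}}
proxy = {"name": "linkerd-proxy"}
base = "/spec/template/spec/containers/"

print(container_names(json_patch_add(pod, base + "0", proxy)))  # ['linkerd-proxy', 'vote-bot']
print(container_names(json_patch_add(pod, base + "-", proxy)))  # ['vote-bot', 'linkerd-proxy']
```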

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-11-28 14:21:18 -08:00
Alex Leong 835e34b500
Left-align the routes column and sort by route name (#1879)
Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-28 09:32:59 -08:00
Risha Mars d9539bcb37
Add the top routes feature to the dashboard UI (#1868)
Adds a (currently not displayed in sidebar, but available at /routes) page to
mirror the current functionality of `linkerd routes <service>`. So far, this is just a
barebones form and table, but it works.

Adds a /api/routes path and handler to the api to receive TopRoutes requests from the web.
2018-11-27 16:53:10 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 73836f05cf
Update proxy version and use canonicalized dst (#1866)
The `linkerd routes` command only supports outbound metrics queries (i.e. ones with the `--from` flag).  Inbound queries (i.e. ones without the `--from` flag) never return any metrics.

We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-26 17:20:07 -08:00
Oliver Gould ba11698d4b
tap: Use nil-safe protobuf accessors (#1873)
The tap server accesses protobuf fields directly instead of using the
`Get*()` accessors. The accessors are necessary to prevent dereferencing
a nil pointer and crashing the tap service.

Furthermore, these maps are explicitly initialized when `nil` to support
label hydration.
2018-11-26 14:14:28 -08:00
Ben Lambert 297cb570f2 Added a --ha flag to install CLI (#1852)
This change allows some advised production config to be applied to the install of the control plane.
Currently this runs 3x replicas of the controller and adds some pretty sane requests to each of the components + containers of the control plane.

Fixes #1101

Signed-off-by: Ben Lambert <ben@blam.sh>
2018-11-20 23:03:59 -05:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method to the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to the StatSummaries RPC, and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Kevin Lingerfelt 4d4b1ebc89
Update CHANGES.md for edge-18.11.2 release (#1864)
* Update CHANGES.md for edge-18.11.2 release
* Add issue number for same src/dst issue

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-11-15 14:39:25 -08:00
Dennis Adjei-Baah 214540c823
Add new iptables rule for outbound traffic (#1863)
When a pod sends requests to itself, the proxy properly redirects traffic from the originating container through the outbound listener of the proxy. On the inbound side, however, the request skips the proxy and goes straight to the original container that made the request. This can cause problems for containers that serve HTTP, as the proxy naively tries to initiate an HTTP/2 connection to the destination of a request.  (See #1585 for a concrete example.)

This PR adds a new iptables rule, coupled with a proxy [change](https://github.com/linkerd/linkerd2-proxy/pull/122), to ensure that requests in the aforementioned scenario always redirect to the inbound listener of the proxy first.

fixes #1585

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-15 13:56:45 -08:00
Kevin Leimkuhler c68693e820
Fix stat filtering for `--from` queries (#1856)
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we had just run `linkerd stat au`.

Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).

# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.

# Validation
Tests have been updated to reflect the expected destination labels now appended in `--from` queries.

Fixes #1766 

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2018-11-14 10:52:27 -08:00
Alejandro Pedraza bbcf5a8c9f Allow stat summary to query for multiple resources (#1841)
* Refactor util.BuildResource so it can deal with multiple resources

First step to address #1487: Allow stat summary to query for multiple
resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update the stat cli help text to explain the new multi resource querying ability

Proposal for #1487: Allow stat summary to query for multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow stat summary to query for multiple resources

Implement this ability by issuing parallel requests to requestStatsFromAPI()

Proposal for #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update tests as part of multi-resource support in `linkerd stat` (#1487)

- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` called with multiple resources should keep an ordering (#1487)

Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Extra validations for `linkerd stat` with multiple resources (#1487)

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` resource grouping, ordering and name prefixing (#1487)

- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.
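
The combination of parallel requests, stable table order, and type-prefixed names can be sketched as follows. This is not the CLI's Go code: `fetch_stats` is a hypothetical stand-in for the per-resource API request, the canned rows are made up, and alphabetical order is only an assumed stand-in for the fixed ordering the commits describe.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_stats(resource_type):
    """Hypothetical stand-in for one stats request; returns canned rows."""
    canned = {
        "deploy": [{"name": "web"}, {"name": "emoji"}],
        "po": [{"name": "web-6d8f"}],
    }
    return canned.get(resource_type, [])


def stat_many(resource_types, fetch=fetch_stats):
    """Query each resource type in parallel and return tables in a stable order."""
    order = sorted(set(resource_types))  # assumed ordering: alphabetical
    with ThreadPoolExecutor(max_workers=max(1, len(order))) as pool:
        # map() yields results in input order, whichever request finishes first.
        results = list(pool.map(fetch, order))
    prefix = len(order) > 1  # prepend the type only when several are requested
    tables = []
    for rtype, rows in zip(order, results):
        names = [f"{rtype}/{row['name']}" if prefix else row["name"] for row in rows]
        tables.append((rtype, names))
    return tables


print(stat_many(["po", "deploy"]))
print(stat_many(["deploy"]))
```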

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow `linkerd stat` to be called with multiple resources

A few final refactorings as per code review.

Fixes #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-11-14 10:44:04 -08:00
Kevin Lingerfelt 4547ba7f0a
Make permission checks non-fatal, add check for CRDs (#1859)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-11-14 10:29:04 -08:00
Igor Zibarev 60bcdb15f9 controller: use GetConfig from pkg/k8s package (#1857)
This commit removes duplicate logic that loads Kubernetes config and
replaces it with GetConfig from pkg/k8s. This also allows loading
config from default sources like $KUBECONFIG instead of explicitly
passing -kubeconfig option to controller components.

Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
2018-11-13 14:41:31 -08:00
Alena Varkockova fda834cf64 Allow retrying control plane API check (#1858)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-13 10:52:50 -08:00
Igor Zibarev 9b3f67e3dc Fix comment typos (#1855)
Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
2018-11-12 15:27:22 -08:00
Alena Varkockova 38dfc5308f Make version checks warning (#1844)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-11-09 09:48:14 -08:00
Dennis Adjei-Baah 15e87bfd8d
Increase retry timeout for retryable tests, refactor RetryFor (#1835)
When running integration tests in a Kubernetes cluster that sometimes takes a little longer to get pods ready, the tests fail too early because most of them have a retry timeout of 30 seconds.

This PR bumps up this retry timeout for `TestInstall` to 3 minutes. This gives the test enough time to download any new docker images that it needs to complete successfully and also reduces the need to have large timeout values for subsequent tests. This PR also refactors `CheckPods` to check that all containers in the pods for a deployment are in a `Ready` state. This also helps ensure that all docker images have been downloaded and the pods are in a good state.
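
The retry-until-timeout pattern described here can be sketched in a few lines. The helper below is a hypothetical Python analogue, not the test suite's `RetryFor`; its signature, the polling interval, and the `pods_ready` stub are assumptions for illustration.

```python
import time


def retry_for(timeout_seconds, fn, interval_seconds=1.0):
    """Call fn until it stops raising or the timeout would be exceeded.

    Returns fn's result on success and re-raises the last error on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            return fn()
        except Exception:
            if time.monotonic() + interval_seconds > deadline:
                raise
            time.sleep(interval_seconds)


attempts = []


def pods_ready():
    """Stub check that only succeeds on the third attempt."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("pods not ready")
    return "ready"


# A 3 minute budget, polled quickly here so the example finishes at once.
print(retry_for(3 * 60, pods_ready, interval_seconds=0.01))
```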

Tests were run on the community cluster and all were successful.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-06 16:03:58 -08:00
Dennis Adjei-Baah dfaf3b1e1b
bump proxy version to 5e0a15b (#1842)
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-06 13:20:52 -08:00
Dennis Adjei-Baah 1c79f0146c
add control plane check before running proxy test assertion (#1836)
This PR adds a check before the TestCheckProxy executes the CLI command to make sure the control plane has installed all its pods. This avoids the situation where the test would fail because the pods are retrying the data plane check. The PR's base is #1835 but will switch to master once that PR merges.

Relates to PR #1835
Fixes #1733

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-11-02 13:30:06 -07:00
Oliver Gould 747fd328e9
grafana: Show TCP closes by errno (#1839)
linkerd/linkerd2-proxy#116 removes the `classification` label for the
`tcp_close_total` metric because TCP sockets that close with an error
do not actually indicate any sort of failure -- many graceful shutdown
situations can still cause a socket error.

This change uses the `errno` label to enumerate tcp_close_total metrics.
2018-11-02 10:20:11 -07:00
Risha Mars a6c5f9820e
Update CHANGES.md for the edge-18.11.1 release (#1840)
* Update CHANGES.md for the edge-18.11.1 release
2018-11-01 14:02:12 -07:00
Risha Mars 6d8911090d
Rewrite octopus arm code to be more parameterized and flexible. (#1834)
As a result, displays better in the material UI version of the dashboard.
Also adds Success rate to data displayed on neighbour nodes.

* Rewrite octopus arm code to be more parameterized and flexible.
As a result, displays better in the material UI version of the dashboard.
* Add Success rate to data displayed on neighbour nodes
* Fix variability in grid spacing by fixing the max and min widths of the chart,
and by scrolling the overflow
* Center the octopus graph so it looks better at full width
* Also add padding, so that the drop shadows aren't cut off
2018-11-01 12:30:19 -07:00
Andrew Seigner f777f87924
Fix grid alignment in tap query form (#1833)
Fixes #1789

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-31 14:10:16 -07:00
Andrew Seigner 1eb93b670a
Replace some sidebar icons with Font Awesome (#1830)
This replaces a couple of the MaterialUI icons introduced in #1776 with
their original counterparts in Font Awesome, but wrapped in a MaterialUI
`Icon` tag. Also fix Linkerd logo padding in sidebar.

Part of #1781.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-31 13:42:55 -07:00
Andrew Seigner 9a49f96a9a
Fix popovers to be reachable and clickable (#1831)
The popover on the src/dst column in the top and tap tables disappeared
before a user could click on it.

Modify the popovers to be reachable, also reimplement them as activated
by mouse clicks rather than mouse over events, allowing the src/dst
column to be both clickable and provide an icon for popover.

Fixes #1784

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-31 13:42:34 -07:00
Alex Leong 32d556e732
Improve ergonomics of service profile spec (#1828)
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.

* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition.  Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"
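
The first point, that setting several fields at once behaves like an "all" condition, can be illustrated with a small evaluator. The field names `method`, `path_regex`, and `all` are hypothetical stand-ins and may not match the actual spec; only the equivalence is the point.

```python
import re


def matches(condition, request):
    """Evaluate a hypothetical request-match condition against a request dict.

    Every field that is set must match, which is the same as wrapping the
    fields in an explicit "all" list.
    """
    checks = []
    if "method" in condition:
        checks.append(request["method"] == condition["method"])
    if "path_regex" in condition:
        checks.append(re.fullmatch(condition["path_regex"], request["path"]) is not None)
    if "all" in condition:
        checks.append(all(matches(c, request) for c in condition["all"]))
    return all(checks)


combined = {"method": "GET", "path_regex": "/users/[^/]*"}
explicit = {"all": [{"method": "GET"}, {"path_regex": "/users/[^/]*"}]}

requests = [
    {"method": "GET", "path": "/users/42"},
    {"method": "POST", "path": "/users/42"},
    {"method": "GET", "path": "/books"},
]
for request in requests:
    print(matches(combined, request), matches(explicit, request))
```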

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-31 12:00:22 -07:00
Oliver Gould 557dca5a56
Upgrade to linkerd/linkerd2-proxy#f97239ba (#1829)
This change updates the proxy version to fix grpc failure
classification, per #1819.
2018-10-30 15:19:01 -07:00
Andrew Seigner 3cd13f2913
Fix sidebar not rendering to end of page (#1827)
Also make main content independently scrollable.

Part of #1781

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-30 11:18:42 -07:00
Risha Mars 1b2b02985b
Small tweaks to the metrics tables to make them denser (#1826)
Try to squish the metrics columns so that the data fits on the page without
having to scroll the table. This was mostly evident in the Tap and Top tables,
where a lot of the table content would be initially out of view.

This branch also includes an unrelated tiny fix for max error length, which had
been changed from 500 to 50 for testing, and had not been changed back
2018-10-29 18:27:02 -07:00
Alex Leong d8b5ebaa6d
Remove the proxy-api container (#1813)
A container called `proxy-api` runs in the Linkerd2 controller pod.  This container listens on port 8086 and serves the proxy-api but does nothing other than forward gRPC requests to the destination container which listens on port 8089.

We remove the proxy-api container altogether and change the destination container to listen on port 8086 instead of 8089.  The result is that clients still use the proxy-api by connecting to `proxy-api.<ns>.svc.cluster.local:8086` but the controller has one fewer container.  This results in a simpler system that is easier to reason about.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 16:31:43 -07:00
Alex Leong 82ca821e62
Use fqdn for service profile name (#1808)
Service profiles must be named in the form `"<service>.<namespace>"`.  This is inconsistent with the fully normalized domain name that the proxy sends to the controller.  It also does not permit creating service profiles for non-Kubernetes services.

We switch to requiring that service profiles must be named with the FQDN of their service.  For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`.
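
As a small illustration of the naming change, the helpers below build the new FQDN-style name and convert an old-style name to it. Both functions are hypothetical, and the `cluster_domain` parameter is an assumption; the text above only mentions `cluster.local`.

```python
def profile_name(service, namespace, cluster_domain="cluster.local"):
    """Return the FQDN-style profile name for a Kubernetes service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def upgrade_profile_name(old_name):
    """Convert an old "<service>.<namespace>" name to the FQDN form."""
    service, namespace = old_name.split(".")
    return profile_name(service, namespace)


print(upgrade_profile_name("web.emojivoto"))  # web.emojivoto.svc.cluster.local
```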

This change alone is not sufficient for allowing service profiles for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services.  Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kubernetes.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 14:35:42 -07:00
Alex Leong 622185a4dd
Send metric labels in profile API (#1800)
* Send metric labels in profile API

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 14:28:09 -07:00
Risha Mars f7ca589556
Upgrade material-ui to 3.3.2 (#1824)
* Upgrade material-ui to 3.3.2

* Re-add popover that was removed in #1814
2018-10-29 14:08:22 -07:00
Kevin Lingerfelt 07c861e39f
Revert proxy upgrade (#1818)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-10-26 15:42:37 -07:00
Risha Mars 8393aed0f7
Update CHANGES.md for edge-18.10.4 release (#1817)
* Update CHANGES.md for edge-18.10.4 release
2018-10-26 15:11:55 -07:00
Kevin Lingerfelt cf7a532e15
Re-add sortable column headers to tables in web UI (#1814)
* Re-add sortable column headers to tables in web UI
* Display sort icons on all sortable columns
* Disable src/dst popover in top table

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-10-26 14:58:01 -07:00
Risha Mars d2f847a484
Fix success rate not appearing in Top column (#1816)
* Fix success rate mini chart not appearing on Top event tables

* Fix success rates being left aligned
2018-10-26 14:40:27 -07:00
Kevin Lingerfelt c59f43d827
Bump proxy version to latest master (#1815)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-10-26 13:49:03 -07:00
Andrew Seigner c661b00f8e
Re-implement sidebar resource selectors (#1810)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-26 13:28:29 -07:00
Alex Leong 6cffad277b
Make service profile validation a warning instead of an error (#1807)
The existence of an invalid service profile causes `linkerd check` to fail.  This means that it is not possible to open the Linkerd dashboard with the `linkerd dashboard` command.  While service profile validation is useful, it should not lock users out.

Add the ability to designate health checks as warnings.  A failed warning health check will display a warning output in `linkerd check` but will not affect the overall success of the command.  Switch the service profile validation to be a warning.
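
The behaviour described above, where a failed warning is reported but does not fail the run, can be sketched as below. The `Check` type, its fields, and the check descriptions are hypothetical and do not mirror the actual health checker.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    description: str
    run: Callable[[], bool]  # returns True when the check passes
    warning: bool = False    # failed warnings are reported but not fatal


def run_checks(checks: List[Check]) -> bool:
    """Run every check, print its status, and return overall success."""
    success = True
    for check in checks:
        if check.run():
            status = "ok"
        elif check.warning:
            status = "warning"
        else:
            status = "FAIL"
            success = False
        print(f"{check.description}: {status}")
    return success


checks = [
    Check("control plane api reachable", lambda: True),
    Check("service profiles valid", lambda: False, warning=True),
]
print(run_checks(checks))  # True: the failed warning does not affect the result
```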

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-26 13:28:10 -07:00
Risha Mars 148d7bc608
Do some small alignment tweaks, fix Firefox rendering (#1809)
This branch:

- adds a "meshed" badge to the namespace overview page instead 
of a green checkmark (uses Chip)
- fixes aforementioned meshed indicator not showing up in firefox
- vertically centers the ( ! ) icons in the metrics tables
- vertically centers the dots in the metrics tables
2018-10-25 10:04:58 -07:00
Risha Mars 715e8ff2dc
Apply global theming to the dashboard using material-ui themes (#1799)
Try to standardize theming and colours throughout the app:

- Move Material UI theme definition into its own file
- Use theme colours in success rate charts
- Remove all colour definitions from styles.css
- Remove unused styles in styles.css
- Audit bare h tag usage throughout the app; replace with Typography
- Standardize the colours to the theme for Progress.jsx
- Use theme colour in Spinner
- Default to warning in meshed status table bar chart
2018-10-24 17:13:39 -07:00
Oliver Gould 0e91dbb18d
Implement GetProfile for the proxy-api service (#1801)
The `proxy-api` service included a stub implementation of `GetProfile`
instead of forwarding requests to the `destination` service.

This change fills in the proxy-api service's `GetProfile` implementation
to forward requests to the destination service.
2018-10-24 12:37:29 -07:00
Risha Mars ae23c43e0a
Upgrade dashboard js libraries (#1806)
* Upgrade moment to 2.22.2

* Upgrade material-ui/core to 3.3.1
2018-10-24 11:36:47 -07:00