Commit Graph

328 Commits

Author SHA1 Message Date
Andrew Seigner 1df1683b6a
Instrument k8s clients (#2243)
The control-plane's clients, specifically the Kubernetes clients, did
not provide telemetry information.

Introduce a `prometheus.ClientWithTelemetry` wrapper to instrument
arbitrary clients. Apply this wrapper to Kubernetes clients.

Fixes #2183

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-18 09:10:02 -08:00
Oliver Gould 71ce786dd3
Rename linkerd-proxy-api to linkerd-destination (#2281)
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).

With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).

Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-15 15:11:04 -08:00
Thomas Rampelberg f9d353ea22
Generate CLI docs for usage by the website (#2296)
* Generate CLI docs for usage by the website

* Update description to match existing commands

* Remove global
2019-02-15 13:28:31 -08:00
Kevin Leimkuhler b2bbeb05ef
Issue 2276: Do not log error when timeout is blank (#2279)
# Problem

When a route does not specify a timeout, the proxy-api defaults to the default
timeout and logs an error:

```
time="2019-02-13T16:29:12Z" level=error msg="failed to parse duration for route POST /io.linkerd.proxy.destination.Destination/GetProfile: time: invalid duration"
```

# Solution

We now check if a route timeout is blank. If it is not set, it is set to
`DefaultRouteTimeout`. If it is set, we try to parse it into a `Duration`.

A request was made to improve logging to include the service profile and
namespace as well.

# Validation

With valid service profiles installed, edit the `.yaml` to include an invalid
`timeout`:

```
...
name: GET /
timeout: foo
```

We should now see the following errors:

```
proxy-api time="2019-02-13T22:27:32Z" level=error msg="failed to parse duration for route 'GET /' in service profile 'webapp.default.svc.cluster.local' in namespace 'default': time: invalid duration foo"
```

This error does not show up when `timeout` is blank.

Fixes #2276

Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
2019-02-14 17:09:02 -08:00
Ivan Sim f383c9e1f2
Remove auto proxy inject 'Mutate' function tests (#2257)
The way these tests compare the hard-coded base64-encoded JSON
patches with those generated by the proxy injector, is extremely
brittle. Changing any of the proxy configuration causes these tests
to break, even though the proxy injector itself isn't affected.

Also, the AdmissionRequest and AdmissionResponse types are "boundary
objects" that are largely irrelevant to our code.

Fixes #2201 

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-14 11:55:19 -08:00
Alejandro Pedraza c78f105350
Upgrade Spinner to fix race condition (#2265)
Fixes #2264

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-02-14 09:51:25 -05:00
Andrew Seigner 2305974202
Introduce golangci-lint tooling, fixes (#2239)
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.

This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.

Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-13 11:16:28 -08:00
Kevin Lingerfelt 56c5ce6a31
Update auto-inject to set LINKERD2_PROXY_ID in all cases (#2267)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-12 11:08:06 -08:00
Kevin Lingerfelt 26aa771482
Fix auto-inject config when TLS is disabled (#2246)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-11 11:01:18 -08:00
Ivan Sim f6e75ec83a
Add statefulsets to the dashboard and CLI (#2234)
Fixes #1983

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-02-08 15:37:44 -08:00
Alex Leong 030767d615
Refactor fallback profile listener to avoid repetition (#2228)
Refactor fallback profile listener to avoid repetition

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-08 14:24:10 -08:00
Alejandro Pedraza 1ef25390ec
GetPodsFor() called for an ExternalName service shouldn't return any pods (#2226)
Running `linkerd routes` for some resource was returning, besides the data for the resource, additional rows for each `ExternalName` service in the namespace.

Fixes #2216

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-02-07 18:17:36 -05:00
Alex Leong 5b054785e5
Read service profiles from client or server namespace instead of control namespace (#2200)
Fixes #2077 

When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace.  This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.

Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace.  If a service profile exists in both namespaces, the client namespace takes priority.  In this way, clients may override the behavior dictated by the service.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-02-07 14:51:43 -08:00
Andrew Seigner 907f01fba6
Improve ServiceProfile validation in linkerd check (#2218)
The `linkerd check` command was doing limited validation on
ServiceProfiles.

Make ServiceProfile validation more complete, specifically validate:
- types of all fields
- presence of required fields
- presence of unknown fields
- recursive fields

Also move all validation code into a new `Validate` function in the
profiles package.

Validation of field types and required fields is handled via
`yaml.UnmarshalStrict` in the `Validate` function. This motivated
migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml.

Fixes #2190
2019-02-07 14:35:47 -08:00
Andrew Seigner 72812baf99
Introduce Discovery API and endpoints command (#2195)
The Proxy API service lacked introspection of its internal state.

Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server

Also wire up a new `linkerd endpoints` command.

Fixes #2165

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-07 14:02:21 -08:00
Kevin Lingerfelt a11b9933fc
Update auto-injector to require opt-in by namespace or pod (#2209)
* Update auto injector to require opt-in by namespace or pod
* Rename namespace fixtures

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-06 17:37:14 -08:00
Risha Mars e531655d26
Add a --tap flag to the linkerd profile command (#2139)
Adds the ability to generate a service profile by running a tap for a configurable 
amount of time, and using the route results from the routes seen during the tap.

e.g. `linkerd profile web --tap deploy/web -n emojivoto --tap-duration 2s`
2019-02-06 12:43:16 -08:00
Kevin Leimkuhler 66070c26f4
Introduce go generate to embed static templates (#2189)
# Problem
In order to switch Linkerd template rendering to use `.yaml` files, static
assets must be bundled in the Go binary for use by `linkerd install`.

# Solution
The solution should not affect the local development process of building and
testing.

[vfsgen](https://github.com/shurcooL/vfsgen) generates Go code that statically
implements the provided `http.FileSystem`. Paired with `go generate` and Go
[build tags](https://golang.org/pkg/go/build/), we can continue to use the
template files on disk when developing with no change required.

In `!prod` Go builds, the `cli/static/templates.go` file provides a
`http.FileSystem` to the local templates. In `prod` Go builds, `go generate
./cli` generates `cli/static/generated_templates.gogen.go` that statically
provides the template files.

When built with `-tags prod`, the executable will be built with the staticlly
generated file instead of the local files.

# Validation
The binaries were compiled locally with `bin/docker-build`. The binaries were
then tested with `bin/test-run (pwd)/target/cli/darwin/linkerd`. All tests
passed.

No change was required to successfully run `bin/go-run cli install`. No change
was required to run `bin/linkerd install`.

Fixes #2153

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2019-02-04 18:09:47 -08:00
Oliver Gould 44e31f0f67
Configure proxy keepalives via the environment (#2193)
In linkerd/linkerd2-proxy#186, the proxy supports configuration of TCP
keepalive values.

This change sets `LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE` and
`LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE` to 10s when injecting the
proxy, so that remote connections are configured with a keepalive.

This configuration is NOT yet exposed through the CLI. This may be done
in a followup, if necessary.

Fixes #1949
2019-02-04 16:16:43 -08:00
Oliver Gould 4798ad3f44
Use the proper controller identity when configuring pods with TLS (#2196)
Since 37ae423, deployments have been prefixed with linkerd-; however
the inject logic was not changed to take this into consideration when
constructing the controller's identity.

This means that the proxy's client to the control plane has been unable to
establish TLS'd communcation to the proxy-api. Previously, the proxy would
silently fall back to plaintext, but in master this behavior recently changed to
be stricter, so this bug will prevent the proxy from connecting to proxy-api
in any way.
2019-02-04 14:59:03 -08:00
Ye Ben f2ba17d366 fix some typos (#2194)
Signed-off-by: yeya24 <ben.ye@daocloud.io>
2019-02-02 23:03:54 -08:00
Kevin Lingerfelt 4c019c27c1
Add pod spec annotation to disable injection in CLI and auto-injector (#2187)
* Add pod spec annotation to disable injection in CLI and auto-injector
* Remove support for linkerd.io/auto-inject label entirely
* Update based on review feedback
* Fix issue with finding the namespace of deployments applied to the default ns

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-01 16:57:06 -08:00
Alex Leong 3bd4231cec
Add support for timeouts in service profiles (#2149)
Fixes #2042 

Adds a new field to service profile routes called `timeout`.  Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead.  If the timeout field is not specified, a default timeout of 10 seconds is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-30 16:48:55 -08:00
Alejandro Pedraza fe234cade1
Use `ca.NewCA()` for generating certs and keys for the proxy injector (#2163)
Use `ca.NewCA()` for generating certs and keys for the proxy injector

- Remove from CA controller everything that dealt with the
webhook/proxy-injector
- Remove no longer needed proxy-injector volumes for 'trust-anchors' and
'webhook-secrets'
- Remove from the proxy-injector the retrieval of the trust anchor and
secrets
- tls flag during install is no longer needed for auto-inject to work

Fixes #2095 and fixes #2166

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-01-30 16:04:33 -05:00
Cody Vandermyn 906c3cbfc5 WIP: CNI Plugin (#2071)
* Export RootOptions and BuildFirewallConfiguration so that the cni-plugin can use them.
* Created the cni-plugin based on istio-cni implementation
* Create skeleton files that need to be filled out.
* Create the install scripts and finish up plugin to write iptables
* Added in an integration test around the install_cni.sh and updated the script to handle the case where it isn't the only plugin. Removed the istio kubernetes.go file in favor of pkg/k8s; initial usage of this package; found and fixed the typo in the ClusterRole and ClusterRoleBinding; found the docker-build-cni-plugin script
* Corrected an incorrect name in the docker build file for cni-plugin
* Rename linkerd2-cni to linkerd-cni
* Fixup Dockerfile and clean up code a bit as well as logging statements.
* Update Gopkg.lock after master merge.
* Update test file to remove temporary tag.
* Fixed the command to run during the test while building up the docker run.
* Added attributions to applicable files; in the test file, use a different container for each test scenario and also print the docker logs to stdout when there is an error;
* Add the --no-init-container flag to install and inject. This flag will not output the initContainer and will add an annotation assuming that the cni will be used in this case.
* Update .travis.yml to build the cni-plugin docker image before running the tests.
* Workaround golint warnings.
* Create a new command to install the linkerd-cni plugin.
* Add the --no-init-container option to linkerd inject
* Use the setup ip tables annotation during the proxy auto inject webhook prevent/allow addition of an init container; move cni-plugin tests to the integration-test section of travis
* gate the cni-plugin tests with the -integration-tests flag; remove unnecessary deployment .yaml file.
* Incorporate PR Cleanup suggestions.
* Remove the SetupIPTablesLabel annotation and use config flags and the presence of the init container to determine whether the cni-plugin writes ip tables.
* Fix a logic bug in the cni-plugin code that prevented the iptables from being written; Address PR comments; make tests pass.
* Update go deps shas
* Changed the single file install-cni plugin filename to be .conf vs .conflist; Incorporated latest PR comments around spacing with the new renderer among others.
* Fix an issue with renaming .conf to .conflist when needed.
* Renamed some of the variables to try to make it more clear what is going on.
* Address final PR comments.
* Hide cni flags for the time being.

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2019-01-30 11:51:34 -08:00
Alena Varkockova 2691dda5ce Add possibility to filter by owner and label in ListPods (#2161)
Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-28 18:50:29 -08:00
Thomas Rampelberg ea61630f9d
Update client-go to 1.13.1 (#2160)
* Update client-go to 1.13.1

Fixes #2145

* Update Dockerfile-bin with new tag

* Update all the dockerfile tags

* Clean gopkg and do not apply cluster defaults

* Update for klog

* Match existing behavior with klog

* Add klog to gopkg.lock

* Update go-deps shas

* Update klog comment

* Update comment to be a non-sentence
2019-01-28 17:42:14 -08:00
Alex Leong 872e1bb026
Add --proto flag to linkerd profile command to read protobuf files (#2128)
Fixes #1425 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-25 11:15:20 -08:00
Kevin Leimkuhler f3325e7d81
Refactor `install`'s `render` output to be helm compatible (#2098)
# Problem
In order to refactor `install` to allow for a more flexible configuration, we
should start with the format of the YAML that it renders. Using the Helm
YAML format will make it easier add flexible configuration options in the
future. Currently, the rendered template that `install` produces does not
follow this format.

# Solution
Use the internals that Helm itself uses to render an inject template that
follows the same formatting rules. Helm's `template` cmd provides a good
outline of what is needed to make Linkerd's `install` cmd work as if it was
a Chart.

# Validation
There are no new tests, but there may not be anything to test at this stage.
This is a WIP PR towards the ultimate goal of `install` allowing a more
flexible configuration.

However, `install` now uses all the Helm `template` internals and therefore
satisfies the needed properties for Helm Charts.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2019-01-25 10:53:35 -08:00
Eliza Weisman 846975a190
Remove proxy bind timeout from CLIs (#2017)
This branch removes the `--proxy-bind-timeout` flag from the 
`linkerd inject` and `linkerd install` CLI commands, and the
`LINKERD2_PROXY_BIND_TIMEOUT` environment variable from their output.
This is in preparation for removing that timeout from the proxy (as
described in #2013). 

I thought it was prudent to remove this from the CLIs before removing it
from the proxy, so we can't create a situation where the CLIs produce
output that results in broken proxy containers.

Fixes #2013

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2019-01-24 15:34:09 -08:00
Alex Leong d542571b65
GetProfiles should always respond with a (possibly empty) profile immediately (#2146)
When `GetProfiles` is called for a destination that does not have a service profile, the proxy-api service does not return any messages until a service profile is created for that service.  This can be interpreted as hanging, and can make it difficult to calculate response latency metrics.

Change the behavior of the API to always return a service profile message immediately.  If the service does not have a service profile, the default service profile is returned.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-24 15:22:14 -08:00
zak 8c413ca38b Wire up stats commands for daemonsets (#2006) (#2086)
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.

Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.

Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.

Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.

Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.

Update dependencies and dockerfile hashes

Add DaemonSet support to tap and top commands

Fixes of #2006

Signed-off-by: Zak Knill <zrjknill@gmail.com>
2019-01-24 14:34:13 -08:00
Alex Leong 32efab41b5
Fix panic when routes is called in single-namespace mode (#2123)
Fixes #2119 

When Linkerd is installed in single-namespace mode, the public-api container panics when it attempts to access watch service profiles.

In single-namespace mode, we no longer watch service profiles and return an informative error when the TopRoutes API is called.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-23 16:47:05 -08:00
Alena Varkockova 28f662c9c6 Introduce resource selector and deprecate namespace field for ListPods (#2025)
* Introduce resource selector and deprecate namespace field for ListPods
* Changes from code review
* Properly deprecate the field
* Do not check for nil
* Fix the mockProm usage
* Protoc changes revert
* Changed from code review

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2019-01-23 10:35:55 -08:00
Dennis Adjei-Baah f9cd9366d9
Surface logs from control plane pods (#2037)
When debugging control plane issues or issues pertaining to a linkerd proxy, it can be cumbersome to get logs from affected containers quickly. 

This PR adds a new `logs` command to the Linkerd CLI to surface log lines from any container within linkerd's control plane. This feature relies heavily on [stern](https://github.com/wercker/stern), which already provides this behavior. This PR integrates this package into the Linkerd CLI to allow users to quickly retrieve logs whenever they run into issues when using Linkerd. 

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2019-01-16 19:24:42 -08:00
Alex Leong a562f8b9fd
Improve routes command to list all routes (#2066)
Fixes #1875 

This change improves the `linkerd routes` command in a number of important ways:

* The restriction on the type of the `--to` argument is lifted and any resource type can now be used.  Try `--to ns/books`, `--to po/webapp-ABCDEF`, `--to au/linkerd.io`, or even `--to svc`.
* All routes for the target will now be populated in the table, even if there are no Prometheus metrics for that route.
* [UNKNOWN] has been renamed to [DEFAULT]
* The `Service/Authority` column will now list `Service` in all cases except for when an authority target is explicitly requested.

```
$ linkerd routes deploy/traffic --to deploy/webapp
ROUTE                       SERVICE   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
GET /                        webapp   100.00%   0.5rps          50ms         180ms         196ms
GET /authors/{id}            webapp   100.00%   0.5rps         100ms         900ms         980ms
GET /books/{id}              webapp   100.00%   0.9rps          38ms          93ms          99ms
POST /authors                webapp   100.00%   0.5rps          35ms          48ms          50ms
POST /authors/{id}/delete    webapp   100.00%   0.5rps          83ms         180ms         196ms
POST /authors/{id}/edit      webapp     0.00%   0.0rps           0ms           0ms           0ms
POST /books                  webapp    45.16%   2.1rps          75ms         425ms         485ms
POST /books/{id}/delete      webapp   100.00%   0.5rps          30ms          90ms          98ms
POST /books/{id}/edit        webapp    56.00%   0.8rps          92ms         875ms         975ms
[DEFAULT]                    webapp     0.00%   0.0rps           0ms           0ms           0ms
```

This is all made possible by a shift in the way we handle the destination resource.  When we get a request with a `ToResource`, we use the k8s API to find all Services which include at least one pod belonging to that resource.  We then fetch all service profiles for those services and display the routes from those serivce profiles.  

This shift in thinking also precipitates a change in the TopRoutes API where we no longer need special cases for `ToAll` (which can be specified by `--to au`) or `ToAuthority` (which can be specified by `--to au/<authority>`) and instead can use a `ToResource` to handle all cases.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-01-16 17:15:35 -08:00
Andrew Seigner 92f2cd9b63
Update check and inject output (#2087)
The outputs of the `check` and `inject` commands did not vary much
between successful and failed executions, and were a bit verbose and
challenging to parse.

Reorganize output of `check` and `inject` commands, to provide more
output when errors occur, and less output when successful.

Specific changes:

`linkerd check`
- visually group checks by category
- introduce `hintURL`'s, to provide doc links when checks fail
- add spinners when retrying, remove additional retry lines
- colored unicode characters to indicate success/warning/failure

`linkerd inject`
- modify default output to mirror `kubectl apply`
- only output non-successful inject reports
- support `--verbose` flag to output all inject reports

Fixes #1471, #1653, #1656, #1739

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-16 15:14:14 -08:00
Alex Leong 771542dde2
Add support for retries (#2038) 2019-01-16 14:13:48 -08:00
Kevin Lingerfelt ed3fbd75f3
Setup port-forwarding for linkerd dashboard command (#2052)
* Setup port-forwarding for linkerd dashboard command
* Output port-forward logs when --verbose flag is set

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-01-10 10:16:08 -08:00
Andrew Seigner a91c77d0bf
Followups from lint/comment changes (#2032)
This is a followup branch from #2023:
- delete `proxy/client.go`, move code to `destination-client`
- move `RenderTapEvent` and stat functions from `util` to `cmd`

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 15:28:09 -08:00
Andrew Seigner 1c302182ef
Enable lint check for comments (#2023)
Commit 1: Enable lint check for comments

Part of #217. Follow up from #1982 and #2018.

A subsequent commit will fix the ci failure.

Commit 2: Address all comment-related linter errors.

This change addresses all comment-related linter errors by doing the
following:
- Add comments to exported symbols
- Make some exported symbols private
- Recommend via TODOs that some exported symbols should should move or
  be removed

This PR does not:
- Modify, move, or remove any code
- Modify existing comments

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-01-02 14:03:59 -08:00
Kevin Lingerfelt f1b0983f72
Add go linting to CI config (#2018)
* Add go linting to CI config
* Fix lint warnings
* Add note about bin/lint script in TEST.md

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-20 15:33:09 -08:00
Radu M 07cbfe2725 Fix most golint issues that are not comment related (#1982)
Signed-off-by: Radu Matei <radu@radu-matei.com>
2018-12-20 10:37:47 -08:00
Kevin Lingerfelt 10d5ebd064
Fix flaky certificate controller test (#2009)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-19 17:10:17 -08:00
Alex Leong cb3fa1245b
Remove TLS column from routes command output (#1956)
Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-14 21:52:49 -08:00
Kevin Lingerfelt 0866bb2a41
Remove runAsGroup field from security context settings (#1986)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 15:12:13 -08:00
Kevin Lingerfelt 86e95b7ad3
Disable serivce profiles in single-namespace mode (#1980)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-13 14:37:18 -08:00
Kevin Lingerfelt 00de48bd26
Fix proxy-api handling of named target ports (#1973)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-12 13:42:47 -08:00
Cody Vandermyn 8e4d9d2ef6 add securityContext with runAsUser: {{.ControllerUID}} to the various cont… (#1929)
* add securityContext with runAsUser: {{.ProxyUID}} to the various containers in the install template
* Update golden to reflect new additions
* changed to a different user id than the proxy user id
* Added a controller-uid install option
* change the port that the proxy-injector runs
* The initContainers needs to be run as the root user.
* move security contexts to container level

Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
2018-12-11 11:51:28 -08:00
Alejandro Pedraza 8c67bfbcc6 Add parameter to stats API to skip retrieving Prometheus stats (#1871)
* Add parameter to stats API to skip retrieving Prometheus stats

Used by the dashboard to populate list of resources.

Fixes #1022

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Prometheus queries check results were being ignored
* Refactor verifyPromQueries() to also test when no prometheus queries
should be generated

* Add test for SkipStats=true

Includes adding ability to public.GenStatSummaryResponse to not generate
basicStats

* Fix previous test
2018-12-10 16:48:12 -08:00
Kevin Lingerfelt 0f8bcc9159
Controller: wait for caches to sync before opening listeners (#1958)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-07 11:15:45 -08:00
Alex Leong 04ed200e36
Rename path_regex to pathRegex (#1951)
Rename snake case fields to camel case in service profile spec.  This improves the way they are rendered when the `kubectl describe` command is used.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-06 11:51:33 -08:00
Andrew Seigner bef9479f57
Add input validation for profile command (#1934)
Fixes #1878

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-05 15:13:10 -08:00
Alex Leong cbb196066f
Support service profiles for external authorities (#1928)
Add support for service profiles created on external (non-service) authorities.  For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`.

This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to lookup a service profile for any authority.  We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem.

We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup, regardless of if the authority looks like a Kubernetes service name or not.  To simplify this, support for multiple resolves (which was unused) was removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 14:32:59 -08:00
Oliver Gould 8f9bb711dd
proxy-api: Expose a flag to control auto-h2-upgrade (#1925)
When debugging issues, it's helpful to disable HTTP/2 upgrading to
simplify diagnostics.

This chagne adds an `enable-h2-ugprade` flag to _proxy-api_. When this
flag is set to false, the proxy-api will not suggest that meshed
endpoints are upgraded to use HTTP/2.

As a follow-up, a flag should be added to `install` to control how the
proxy-api is initialized.
2018-12-05 12:41:20 -08:00
Alex Leong 380ec52a39
Rework routes command to accept any resource (#1921)
We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 11:11:34 -08:00
Alex Leong 4f3e55e937
Rename path to path_regex in ServiceProfile CRD (#1923)
We rename path to path_regex in the ServiceProfile CRD to make it clear that this field accepts a regular expression. We also take this opportunity to remove unnecessary line anchors from regular expressions now that these anchors are added in the proxy.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-12-05 10:42:47 -08:00
Kevin Lingerfelt 37ae423bb3
Add linkerd- prefix to all objects in linkerd install (#1920)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-12-04 15:41:47 -08:00
Andrew Seigner ad2366f208
Revert proxy readiness initialDelaySeconds change (#1912)
Reverts part of #1899 to workaround readiness failures.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-04 14:27:55 -08:00
Andrew Seigner 37a5455445
Add filtering by job in stat, tap, top; fix panic (#1904)
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.

Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.

Fix panic when unknown type specified as a `--from` or `--to` flag.

Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.

Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.

Set `--controller-log-level debug` in `install_test.go` for easier debugging.

Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.

Fixes #1872
Part of #627

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 15:34:49 -08:00
Oliver Gould 926395f616
tap: Include route labels in tap events (#1902)
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
rt_ labels in tap output (when -o wide is used).
2018-12-03 13:52:47 -08:00
Andrew Seigner d121071f87
Adjust proxy, Prometheus, and Grafana probes (#1899)
* Adjust proxy, Prometheus, and Grafana probes

High `readinessProbe.initialDelaySeconds` values delayed the controller's
readiness by up to 30s, preventing cli commands from succeeding shortly after
control plane deployment.

Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and
Grafana to the default 0s. Also change `linkerd check` controller pod ordering
to: controller, prometheus, web, grafana.

Detailed probe changes:
- proxy
  - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s
- prometheus
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `livenessProbe.timeoutSeconds` from 30s to 1s
- grafana
  - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s
  - decrease `readinessProbe.timeoutSeconds` from 30s to 1s
  - decrease `readinessProbe.failureThreshold` from 10 to 3
  - increase `livenessProbe.initialDelaySeconds` from 0s to 30s

Fixes #1804

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-12-03 10:41:11 -08:00
Alex Leong f9d66cf4de
Add --open-api option to linkerd profiles command (#1867)
The `--open-api` flag is an alternative to the `--template` flag for the `linkerd profile` command.  It reads an OpenAPI specification file (also called a swagger file) and uses it to generate a corresponding service profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-30 09:25:19 -08:00
Andrew Seigner 34d9eef03e
proxy injector: insert at end of arrays (#1881)
When using `--proxy-auto-inject` with Kuberntes `v1.9.11`, observed auto
injector incorrectly merging list elements rather than inserting new
ones. This issue was not reproducible on `v1.10.3`.

For example, this input:
```
spec:
  template:
    spec:
      containers:
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

Would yield:
```
spec:
  template:
    spec:
      containers:
      - name: linkerd-proxy
        command:
        - emojivoto-vote-bot
      - name: vote-bot
        command:
        - emojivoto-vote-bot
```

This change replaces json patch specs like
`/spec/template/spec/containers/0` with
`/spec/template/spec/containers/-`. The former is intended to insert at
the beggining of a list, the latter at the end. This also simplifies the
code a bit and more closely aligns with the intent of injecting at the
end of lists.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-11-28 14:21:18 -08:00
Risha Mars f8583df4db
Add ListServices to controller public api (#1876)
Add a barebones ListServices endpoint, in support of autocomplete for services.
As we develop service profiles, this endpoint could probably be used to describe
more aspects of services (like, if there were some way to check whether a
service profile was enabled or not).

Accessible from the web UI via http://localhost:8084/api/services
2018-11-27 11:34:47 -08:00
Alex Leong 73836f05cf
Update proxy version and use canonicalized dst (#1866)
The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag).  Inbound queries (i.e. ones without the `--from` flag) never return any metrics.

We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-26 17:20:07 -08:00
Oliver Gould ba11698d4b
tap: Use nil-safe protobuf accessors (#1873)
The tap server accesses protobuf fields directly instead of using the
`Get*()` accessors. The accessors are necessary to prevent dereferencing
a nil pointer and crashing the tap service.

Furthermore, these maps are explicitly initialized when `nil` to support
label hydration.
2018-11-26 14:14:28 -08:00
Alex Leong 7a7f6b6ecb
Add TopRoutes method the the public api and route CLI command to consume it (#1860)
Add a routes command which displays per-route stats for services that have service profiles defined.

This change has three parts:
* A new public-api RPC called `TopRoutes` which serves per-route stat data about a service
* An implementation of TopRoutes in the public-api service.  This implementation reads per-route data from Prometheus.  This is very similar to how the StatSummaries RPC and much of the code was able to be refactored and shared.
* A new CLI command called `routes` which displays the per-route data in a tabular or json format.  This is very similar to the `stat` command and much of the code was able to be refactored and shared.

Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data.  This restriction will be lifted in an upcoming change once we add support for inbound route stats as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-11-19 12:20:30 -08:00
Kevin Leimkuhler c68693e820
Fix stat filtering for `--from` queries (#1856)
# Problem
When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`.

Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`).

# Solution
Destination query labels are now appended to `labels` so that those labels can be filtered on.

# Validation
Tests have been updated to reflect the expected expected destination labels now appended in `--from` queries.

Fixes #1766 

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2018-11-14 10:52:27 -08:00
Alejandro Pedraza bbcf5a8c9f Allow stat summary to query for multiple resources (#1841)
* Refactor util.BuildResource so it can deal with multiple resources

First step to address #1487: Allow stat summary to query for multiple
resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update the stat cli help text to explain the new multi resource querying ability

Propsal for #1487: Allow stat summary to query for multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow stat summary to query for multiple resources

Implement this ability by issuing parallel requests to requestStatsFromAPI()

Proposal for #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Update tests as part of multi-resource support in `linkerd stat` (#1487)

- Refactor stat_test.go to reuse the same logic in multiple tests, and
add cases and files for json output.
- Add a couple of cases to api_utils_test.go to test multiple resources
validation.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` called with multiple resources should keep an ordering (#1487)

Add SortedRes holding the order of resources to be followed when
querying `linkerd stat` with multiple resources

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Extra validations for `linkerd stat` with multiple resources (#1487)

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* `linkerd stat` resource grouping, ordering and name prefixing (#1487)

- Group together stats per resource type.
- When more than one resource, prepend name with type.
- Make sure tables always appear in the same order.

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>

* Allow `linkerd stat` to be called with multiple resources

A few final refactorings as per code review.

Fixes #1487

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-11-14 10:44:04 -08:00
Igor Zibarev 60bcdb15f9 controller: use GetConfig from pkg/k8s package (#1857)
This commit removes duplicate logic that loads Kubernetes config and
replaces it with GetConfig from pkg/k8s. This also allows to load
config from default sources like $KUBECONFIG instead of explicitly
passing -kubeconfig option to controller components.

Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
2018-11-13 14:41:31 -08:00
Alex Leong 32d556e732
Improve ergonomics of service profile spec (#1828)
We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API.

* Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition.  Doing so is equivalent to combining the fields with an "all" condition.
* Rename "responses" to "response_classes"
* Change "IsSuccess" to "is_failure"

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-31 12:00:22 -07:00
Alex Leong d8b5ebaa6d
Remove the proxy-api container (#1813)
A container called `proxy-api` runs in the Linkerd2 controller pod.  This container listens on port 8086 and serves the proxy-api but does nothing other than forward gRPC requests to the destination container which listens on port 8089.

We remove the proxy-api container altogether and change the destination container to listen on port 8086 instead of 8089.  The result is that clients still use the proxy-api by connecting to `proxy-api.<ns>.svc.cluster.local:8086` but the controller has one fewer containers.  This results in a simpler system that is easier to reason about.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 16:31:43 -07:00
Alex Leong 82ca821e62
Use fqdn for service profile name (#1808)
Service profiles must be named in the form `"<service>.<namespace>"`.  This is inconsistent with the fully normalized domain name that the proxy sends to the controller.  It also does not permit creating service profiles for non-Kubernetes services.

We switch to requiring that service profiles must be named with the FQDN of their service.  For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`.

This change alone is not sufficient for allowing service profile for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services.  Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kuberenetes.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 14:35:42 -07:00
Alex Leong 622185a4dd
Send metric labels in profile API (#1800)
* Send metric labels in profile API

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-29 14:28:09 -07:00
Oliver Gould 0e91dbb18d
Implement GetProfile for the proxy-api service (#1801)
The `proxy-api` service included a stub implementation of `GetProfile`
instead of forwarding requests to the `destination` service.

This change fills in the proxy-api service's `GetProfile` implementation
to forward requests to the destination service.
2018-10-24 12:37:29 -07:00
Alex Leong f549868033
Fix integration test and docker build (#1790)
Fix broken docker build by moving Service Profile conversion and validation into `/pkg`.

Fix broken integration test by adding service profile validation output to `check`'s expected output.

Testing done:
* `gotest -v ./...`
* `bin/docker-build`
* `bin/test-run (pwd)/bin/linkerd`

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-19 10:23:34 -07:00
Alex Leong 5210b7b44a
Add check for service profile validation (#1775)
Add a check to `linkerd check` which validates all service profile resources.  In particular it checks:
* does the service profile refer to an existent service
* is the service profile valid

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-18 16:37:39 -07:00
Alex Leong 43c22fe967
Implement getProfiles method in destination service (#1759)
We implement the getProfiles method in the destination service. This method returns a stream of destination profiles for a given authority. It does this by looking up the ServiceProfile resource in the controller namespace named `<svc>.<ns>` where `<svc>` is the name of the service and `<ns>` is the namespace of the service.

This PR includes:
* Adding a ServiceProfile Custom Resource Definition to linkerd install
* A watch based implementation of the getProfiles method in the destination service, similar to the implementation of get.
* An update to the destination client script that allows querying the getProfiles method.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-16 15:39:12 -07:00
Ivan Sim 1100c4fa8c Proxy injector must preserve the original pod template labels and annotations (#1765)
* Ensure that the proxy injector mutating webhook preserves the original labels
and annotations

The deployment's selector must also match the pod template labels in
newer version of Kubernetes.

This resolves issue #1756.

* Add the Linkerd labels to the deployment metadata during auto proxy
injection

* Remove selector match labels JSON patch from proxy injector

This isn't needed to resolve the selector label mismatch errors.

Signed-off-by: ihcsim <ihcsim@gmail.com>
2018-10-16 15:30:45 -07:00
Ivan Sim 2e1a984eb0 Change the proxy-init container ordering during auto proxy injection (#1763)
Appending proxy-init to the end of the list ensures that it won't
interfere with other init containers from accessing the network,
before the proxy container is created.

This resolves bug #1760

Signed-off-by: ihcsim <ihcsim@gmail.com>
2018-10-15 15:33:09 -07:00
Alejandro Pedraza 37bc8a69db Added support for json output in `linkerd stat` (#1749)
Added support for json output in `linkerd stat` through a new (-o|--output)=json option.

Fixes #1417

Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2018-10-15 14:10:48 -07:00
Risha Mars 31a396b631
Fix incorrect test wording (#1767) 2018-10-15 12:07:06 -07:00
Alex Leong 1fe19bf3ce
Add ServiceProfile support to k8s utilities (#1758)
Updates to the Kubernetes utility code in `/controller/k8s` to support interacting with ServiceProfiles.

This makes use of the code generated client added in #1752 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-12 09:35:11 -07:00
Alex Leong f1f5b49f59
Add generated Kubernetes client for ServiceProfile custom resource (#1752)
To support reading and writing of the ServiceProfile custom resource, we add a codegen'd Kubernetes client for this resource.

* Adding the ServiceProfile type and related boilerplate to /controller/gen/apis/serviceprofile. This boilerplate also contains directives that control how codegen works.
* A script in /hack which invokes codegen that generates Kubernetes client machinery for interacting with ServiceProfile resources. The majority of the generated code lives in /controller/gen/client.
* The above-mentioned generated code.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-10-11 11:43:35 -07:00
Kevin Lingerfelt 46c887ca00
Add --single-namespace install flag for restricted permissions (#1721)
* Add --single-namespace install flag for restricted permissions
* Better formatting in install template
* Mark --single-namespace and --proxy-auto-inject as experimental
* Fix wording of --single-namespace check flag
* Small healthcheck refactor

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-10-11 10:55:57 -07:00
Andrew Seigner 8f4240125e fix test failure, logrus api consistency (#1755)
`go test` was failing with
`Fatalf call has arguments but no formatting directives`

Fix test failure, make all logrus api calls consistent.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-11 10:44:32 -07:00
Ivan Sim 4fba6aca0a Proxy init and sidecar containers auto-injection (#1714)
* Support auto sidecar-injection

1. Add proxy-injector deployment spec to cli/install/template.go
2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration
during the webhook's start-up process.
3. Add a new handler to the CA controller to create a new secret for the
webhook when a new MutatingWebhookConfiguration is created.
4. Declare a config map to store the proxy and proxy-init container
specs used during the auto-inject process.
5. Ignore namespace and pods that are labeled with
linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed
6. Add new flag to `linkerd install` to enable/disable proxy
auto-injection

Proposed implementation for #561.

* Resolve missing packages errors
* Move the auto-inject label to the pod level
* PR review items
* Move proxy-injector to its own deployment
* Ignore pods that already have proxy injected

This ensures the webhook doesn't error out due to proxy that are injected using the  command

* PR review items on creating/updating the MWC on-start
* Replace API calls to ConfigMap with file reads
* Fixed post-rebase broken tests
* Don't mutate the auto-inject label

Since we started using healhcheck.HasExistingSidecars() to ensure pods with
existing proxies aren't mutated, we don't need to use the auto-inject label as
an indicator.

This resolves a bug which happens with the kubectl run command where the deployment
is also assigned the auto-inject label. The mutation causes the pod auto-inject
label to not match the deployment label, causing kubectl run to fail.

* Tidy up unit tests
* Include proxy resource requests in sidecar config map
* Fixes to broken YAML in CLI install config

The ignore inbound and outbound ports are changed to string type to
avoid broken YAML caused by the string conversion in the uint slice.

Also, parameterized the proxy bind timeout option in template.go.

Renamed the sidecar config map to
'linkerd-proxy-injector-webhook-config'.

Signed-off-by: ihcsim <ihcsim@gmail.com>
2018-10-10 12:09:22 -07:00
Ben Lambert 69cebae1a2 Added ability to configure sidecar CPU + Memory requests (#1731)
Horizontal Pod Autoscaling does not work when container definitions in pods do not all have resource requests, so here's the ability to add CPU + Memory requests to install + inject commands by proving proxy options --proxy-cpu + --proxy-memory

Fixes #1480

Signed-off-by: Ben Lambert <ben@blam.sh>
2018-10-08 10:51:29 -07:00
Andrew Seigner dccccebd79
Add LICENSE files to all Docker images (#1727)
To comply with certain environments, include our LICENSE file in all
Docker images.


Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-10-02 16:25:52 -07:00
Alena Varkockova 5a853e8990 Use ListPods always for data plane HC (#1701)
* Use ListPods always for data plane HC
* Missing changes in grpc_server.go
* Address review comments
* Read proxy version from spec

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-10-02 11:45:01 -07:00
Alena Varkockova 11c9b7425b Fix the debug message in endpoints watcher (#1658)
* Fix the debug message in endpoints watcher
* Use better method for converting

Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>
2018-09-20 13:03:45 -07:00
Alex Leong e65a9617bd
Add can-i checks to linkerd check --pre (#1644)
Add checks to `linkerd check --pre` to verify that the user has permission to create:
* namespaces
* serviceaccounts
* clusterroles
* clusterrolebindings
* services
* deployments
* configmaps

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-09-17 11:31:10 -07:00
Dennis Adjei-Baah 00d0a26a9c
Cleanly shutdown tap stream to data plane proxies (#1624)
Sometimes, the tap server causes the controller pod to restart after it receives this error.
This error arises when the Tap server does not close gRPC tap streams to proxies before the tap server terminates its streams to its upstream clients and causes the controller pod to restart.

This PR uses the request context from the initial TapByReource to help shutdown tap streams to the data plane proxies gracefully.

fixes #1504

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-09-12 15:00:19 -07:00
Andrew Seigner c5a719da47
Modify inject to warn when file is un-injectable (#1603)
If an input file is un-injectable, existing inject behavior is to simply
output a copy of the input.

Introduce a report, printed to stderr, that communicates the end state
of the inject command. Currently this includes checking for hostNetwork
and unsupported resources.

Malformed YAML documents will continue to cause no YAML output, and return
error code 1.

This change also modifies integration tests to handle stdout and stderr separately.

example outputs...

some pods injected, none with host networking:

```
hostNetwork: pods do not use host networking...............................[ok]
supported: at least one resource injected..................................[ok]

Summary: 4 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
  deploy/vote-bot
```

some pods injected, one host networking:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true"
supported: at least one resource injected..................................[ok]

Summary: 3 of 8 YAML document(s) injected
  deploy/emoji
  deploy/voting
  deploy/web
```

no pods injected:

```
hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true"
supported: at least one resource injected..................................[warn] -- no supported objects found

Summary: 0 of 8 YAML document(s) injected
```

TODO: check for UDP and other init containers

Part of #1516

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-09-10 10:34:25 -07:00
Kevin Lingerfelt f884caf56d
Upgrade protobuf to v1.2.0 (#1591)
* Upgrade protobuf to v1.2.0
* Fix Gopkg.lock
* Switch linkerd2-proxy-api dep back to stable

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-06 11:36:29 -07:00
Kevin Lingerfelt b5ff29c8aa
Add data plane check to validate proxy version (#1574)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-09-04 15:22:38 -07:00
Risha Mars 249b51f950
Increase MaxRps in Tap server, remove default setting from Web (#1560)
Increase the MaxRps on the tap server to 100 RPS.

The max RPS for tap/top was increased in for the CLI #1531, but we were
still manually setting this to 1 RPS in the Web UI and Web server.

Remove the pervasive setting of MaxRps to 1 in the web frontend and server
2018-08-30 13:37:37 -07:00
Alex Leong 0f7d684ca9
Increase default max-rps for tap and top (#1531)
The default value for the max-rps argument to the tap and top commands is an overly conservative 1rps.  This causes the data to come in very slowly and much data to be discarded.  Furthermore, because tap requests are windowed to 10 seconds, this causes long pauses between updates.

We fix this in two ways.  Firstly we reduce the window size to 1s so that updates will come in at least once per second, even when the actual RPS of the data path is extremely high.  Secondly, we increase the default max-rps parameter from 1 to 100.  This allows tap to paint an accurate picture of the data much more quickly and sidesteps some sampling bias that happens when the max-rps is low.

In general, tap events tend to happen in bursts.  For example, one request in may trigger one or more requests out.  Likewise, a single upstream event may trigger several requests to the tapped pod in quick succession.  Sampling bias will occur when the max-rps is less than the actual rps and when the tap event limit subdivides these event bursts (biasing towards the first few events in the burst).  The greater the max-rps, the less the effects of this bias.

Fixes #1525 

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-08-28 14:16:39 -07:00
Risha Mars fff09c5d06
Only tap pods that are meshed (#1535)
Previously, we would tap any resource's pods, regardless of whether the pods
were meshed or not. We can't actually tap non-meshed pods, so I'm adding a check
that will filter out non-meshed pods from the pods that tap watches.

Previous behaviour:
When attempting to hang a non meshed pod, it would establish
a watch on the pods, but then never return any results. In the CLI you could
just cancel it with Ctrl-C. In the web, clicking Stop would send a
WebSocket.close(1000) but wouldn't actually close the connection... 

Behaviour after change :
If no pods under the specified resource are meshed, it'll
return an error of no pods being found to tap
2018-08-28 09:59:52 -07:00