As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.
This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.
The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.
Fixes#2217
Previously, the update-handling logic was spread across several very
small functions that were only called within this file. I've
consolidated this logic into endpointListener.Update so that all of the
debug logging can be instrumented in one place without having to iterate
over lists multiple times.
Also, I've fixed the formatting of IP addresses in some places.
Logs now look as follows:
msg="Establishing watch on endpoint linkerd-prometheus.linkerd:9090" component=endpoints-watcher
msg="Subscribing linkerd-prometheus.linkerd:9090 exists=true" component=service-port id=linkerd-prometheus.linkerd target-port=admin-http
msg="Update: add=1; remove=0" component=endpoint-listener namespace=linkerd service=linkerd-prometheus
msg="Update: add: addr=10.1.1.160; pod=linkerd-prometheus-7bbc899687-nd9zt; addr:<ip:<ipv4:167838112 > port:9090 > weight:1 metric_labels:<key:\"control_plane_ns\" value:\"linkerd\" > metric_labels:<key:\"deployment\" value:\"linkerd-prometheus\" > metric_labels:<key:\"pod\" value:\"linkerd-prometheus-7bbc899687-nd9zt\" > metric_labels:<key:\"pod_template_hash\" value:\"7bbc899687\" > protocol_hint:<h2:<> > " component=endpoint-listener namespace=linkerd service=linkerd-prometheus
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.
Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.
Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.
TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds a flag, tcp_stats to the StatSummary request, which queries prometheus for TCP stats.
This branch returns TCP stats at /api/tps-reports when this flag is true.
TCP stats are now displayed on the Resource Detail pages.
The current queried TCP stats are:
tcp_open_connections
tcp_read_bytes_total
tcp_write_bytes_total
gosimple is a Go linter that specializes in simplifying code
Also fix one spelling error in `cred_test.go`
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Also, some protobuf updates:
* Rename `api_port` to match recent changes in CLI code.
* Remove the `cni` message because it won't be used.
* Remove `registry` field from proto types. This helps to avoid having to workaround edge cases like fully-qualified image name in different format, and overriding user-specified Linkerd version etc.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
In preparation for creating an Identity service that can chain off of an
existing CA, it's necessary to both (1) be able to create an
intermediate CA that can be used by the identity service and (2) be able
to load a CA from existing key material.
This changes the public API of the `tls` package to deal in actual key
types (rather than opaque blobs) and provides a set of helpers that can
be used to convert these credentials between common formats.
Define the global and proxy configs protobuf types that will be used by CLI install, inject and the proxy-injector.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
The control-plane's clients, specifically the Kubernetes clients, did
not provide telemetry information.
Introduce a `prometheus.ClientWithTelemetry` wrapper to instrument
arbitrary clients. Apply this wrapper to Kubernetes clients.
Fixes#2183
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).
With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).
Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
# Problem
When a route does not specify a timeout, the proxy-api defaults to the default
timeout and logs an error:
```
time="2019-02-13T16:29:12Z" level=error msg="failed to parse duration for route POST /io.linkerd.proxy.destination.Destination/GetProfile: time: invalid duration"
```
# Solution
We now check if a route timeout is blank. If it is not set, it is set to
`DefaultRouteTimeout`. If it is set, we try to parse it into a `Duration`.
A request was made to improve logging to include the service profile and
namespace as well.
# Validation
With valid service profiles installed, edit the `.yaml` to include an invalid
`timeout`:
```
...
name: GET /
timeout: foo
```
We should now see the following errors:
```
proxy-api time="2019-02-13T22:27:32Z" level=error msg="failed to parse duration for route 'GET /' in service profile 'webapp.default.svc.cluster.local' in namespace 'default': time: invalid duration foo"
```
This error does not show up when `timeout` is blank.
Fixes#2276
Signed-off-by: Kevin Leimkuhler <kevinl@buoyant.io>
The way these tests compare the hard-coded base64-encoded JSON
patches with those generated by the proxy injector, is extremely
brittle. Changing any of the proxy configuration causes these tests
to break, even though the proxy injector itself isn't affected.
Also, the AdmissionRequest and AdmissionResponse types are "boundary
objects" that are largely irrelevant to our code.
Fixes#2201
Signed-off-by: Ivan Sim <ivan@buoyant.io>
`golangci-lint` performs numerous checks on Go code, including golint,
ineffassign, govet, and gofmt.
This change modifies `bin/lint` to use `golangci-lint`, and replaces
usage of golint and govet.
Also perform a one-time gofmt cleanup:
- `gofmt -s -w controller/`
- `gofmt -s -w pkg/`
Part of #217
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Running `linkerd routes` for some resource was returning, besides the data for the resource, additional rows for each `ExternalName` service in the namespace.
Fixes#2216
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Fixes#2077
When looking up service profiles, Linkerd always looks for the service profile objects in the Linkerd control namespace. This is limiting because service owners who wish to create service profiles may not have write access to the Linkerd control namespace.
Instead, we have the control plane look for the service profile in both the client namespace (as read from the proxy's `proxy_id` field from the GetProfiles request and from the service's namespace. If a service profile exists in both namespaces, the client namespace takes priority. In this way, clients may override the behavior dictated by the service.
Signed-off-by: Alex Leong <alex@buoyant.io>
The `linkerd check` command was doing limited validation on
ServiceProfiles.
Make ServiceProfile validation more complete, specifically validate:
- types of all fields
- presence of required fields
- presence of unknown fields
- recursive fields
Also move all validation code into a new `Validate` function in the
profiles package.
Validation of field types and required fields is handled via
`yaml.UnmarshalStrict` in the `Validate` function. This motivated
migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml.
Fixes#2190
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes#2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Adds the ability to generate a service profile by running a tap for a configurable
amount of time, and using the route results from the routes seen during the tap.
e.g. `linkerd profile web --tap deploy/web -n emojivoto --tap-duration 2s`
# Problem
In order to switch Linkerd template rendering to use `.yaml` files, static
assets must be bundled in the Go binary for use by `linkerd install`.
# Solution
The solution should not affect the local development process of building and
testing.
[vfsgen](https://github.com/shurcooL/vfsgen) generates Go code that statically
implements the provided `http.FileSystem`. Paired with `go generate` and Go
[build tags](https://golang.org/pkg/go/build/), we can continue to use the
template files on disk when developing with no change required.
In `!prod` Go builds, the `cli/static/templates.go` file provides a
`http.FileSystem` to the local templates. In `prod` Go builds, `go generate
./cli` generates `cli/static/generated_templates.gogen.go` that statically
provides the template files.
When built with `-tags prod`, the executable will be built with the staticlly
generated file instead of the local files.
# Validation
The binaries were compiled locally with `bin/docker-build`. The binaries were
then tested with `bin/test-run (pwd)/target/cli/darwin/linkerd`. All tests
passed.
No change was required to successfully run `bin/go-run cli install`. No change
was required to run `bin/linkerd install`.
Fixes#2153
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
In linkerd/linkerd2-proxy#186, the proxy supports configuration of TCP
keepalive values.
This change sets `LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE` and
`LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE` to 10s when injecting the
proxy, so that remote connections are configured with a keepalive.
This configuration is NOT yet exposed through the CLI. This may be done
in a followup, if necessary.
Fixes#1949
Since 37ae423, deployments have been prefixed with linkerd-; however
the inject logic was not changed to take this into consideration when
constructing the controller's identity.
This means that the proxy's client to the control plane has been unable to
establish TLS'd communcation to the proxy-api. Previously, the proxy would
silently fall back to plaintext, but in master this behavior recently changed to
be stricter, so this bug will prevent the proxy from connecting to proxy-api
in any way.
* Add pod spec annotation to disable injection in CLI and auto-injector
* Remove support for linkerd.io/auto-inject label entirely
* Update based on review feedback
* Fix issue with finding the namespace of deployments applied to the default ns
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Fixes#2042
Adds a new field to service profile routes called `timeout`. Any requests to that route which take longer than the given timeout will be aborted and a 504 response will be returned instead. If the timeout field is not specified, a default timeout of 10 seconds is used.
Signed-off-by: Alex Leong <alex@buoyant.io>
Use `ca.NewCA()` for generating certs and keys for the proxy injector
- Remove from CA controller everything that dealt with the
webhook/proxy-injector
- Remove no longer needed proxy-injector volumes for 'trust-anchors' and
'webhook-secrets'
- Remove from the proxy-injector the retrieval of the trust anchor and
secrets
- tls flag during install is no longer needed for auto-inject to work
Fixes#2095 and fixes#2166
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Export RootOptions and BuildFirewallConfiguration so that the cni-plugin can use them.
* Created the cni-plugin based on istio-cni implementation
* Create skeleton files that need to be filled out.
* Create the install scripts and finish up plugin to write iptables
* Added in an integration test around the install_cni.sh and updated the script to handle the case where it isn't the only plugin. Removed the istio kubernetes.go file in favor of pkg/k8s; initial usage of this package; found and fixed the typo in the ClusterRole and ClusterRoleBinding; found the docker-build-cni-plugin script
* Corrected an incorrect name in the docker build file for cni-plugin
* Rename linkerd2-cni to linkerd-cni
* Fixup Dockerfile and clean up code a bit as well as logging statements.
* Update Gopkg.lock after master merge.
* Update test file to remove temporary tag.
* Fixed the command to run during the test while building up the docker run.
* Added attributions to applicable files; in the test file, use a different container for each test scenario and also print the docker logs to stdout when there is an error;
* Add the --no-init-container flag to install and inject. This flag will not output the initContainer and will add an annotation assuming that the cni will be used in this case.
* Update .travis.yml to build the cni-plugin docker image before running the tests.
* Workaround golint warnings.
* Create a new command to install the linkerd-cni plugin.
* Add the --no-init-container option to linkerd inject
* Use the setup ip tables annotation during the proxy auto inject webhook prevent/allow addition of an init container; move cni-plugin tests to the integration-test section of travis
* gate the cni-plugin tests with the -integration-tests flag; remove unnecessary deployment .yaml file.
* Incorporate PR Cleanup suggestions.
* Remove the SetupIPTablesLabel annotation and use config flags and the presence of the init container to determine whether the cni-plugin writes ip tables.
* Fix a logic bug in the cni-plugin code that prevented the iptables from being written; Address PR comments; make tests pass.
* Update go deps shas
* Changed the single file install-cni plugin filename to be .conf vs .conflist; Incorporated latest PR comments around spacing with the new renderer among others.
* Fix an issue with renaming .conf to .conflist when needed.
* Renamed some of the variables to try to make it more clear what is going on.
* Address final PR comments.
* Hide cni flags for the time being.
Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
* Update client-go to 1.13.1
Fixes#2145
* Update Dockerfile-bin with new tag
* Update all the dockerfile tags
* Clean gopkg and do not apply cluster defaults
* Update for klog
* Match existing behavior with klog
* Add klog to gopkg.lock
* Update go-deps shas
* Update klog comment
* Update comment to be a non-sentence
# Problem
In order to refactor `install` to allow for a more flexible configuration, we
should start with the format of the YAML that it renders. Using the Helm
YAML format will make it easier add flexible configuration options in the
future. Currently, the rendered template that `install` produces does not
follow this format.
# Solution
Use the internals that Helm itself uses to render an inject template that
follows the same formatting rules. Helm's `template` cmd provides a good
outline of what is needed to make Linkerd's `install` cmd work as if it was
a Chart.
# Validation
There are no new tests, but there may not be anything to test at this stage.
This is a WIP PR towards the ultimate goal of `install` allowing a more
flexible configuration.
However, `install` now uses all the Helm `template` internals and therefore
satisfies the needed properties for Helm Charts.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This branch removes the `--proxy-bind-timeout` flag from the
`linkerd inject` and `linkerd install` CLI commands, and the
`LINKERD2_PROXY_BIND_TIMEOUT` environment variable from their output.
This is in preparation for removing that timeout from the proxy (as
described in #2013).
I thought it was prudent to remove this from the CLIs before removing it
from the proxy, so we can't create a situation where the CLIs produce
output that results in broken proxy containers.
Fixes#2013
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
When `GetProfiles` is called for a destination that does not have a service profile, the proxy-api service does not return any messages until a service profile is created for that service. This can be interpreted as hanging, and can make it difficult to calculate response latency metrics.
Change the behavior of the API to always return a service profile message immediately. If the service does not have a service profile, the default service profile is returned.
Signed-off-by: Alex Leong <alex@buoyant.io>
DaemonSet stats are not currently shown in the cli stat command, web ui
or grafana dashboard. This commit adds daemonset support for stat.
Update stat command's help message to reference daemonsets.
Update the public-api to support stats for daemonsets.
Add tests for stat summary and api.
Add daemonset get/list/watch permissions to the linkerd-controller
cluster role that's created using the install command.
Update golden expectation test files for install command
yaml manifest output.
Update web UI with daemonsets
Update navigation, overview and pages to list daemonsets and the pods
associated to them.
Add daemonset paths to server, and ui apps.
Add grafana dashboard for daemonsets; a clone of the deployment
dashboard.
Update dependencies and dockerfile hashes
Add DaemonSet support to tap and top commands
Fixes of #2006
Signed-off-by: Zak Knill <zrjknill@gmail.com>