The `/apis/tap.linkerd.io/v1alpha1` endpoint on the Tap APIServer
included tap resource/subresource pairs, such as `deployments/tap` and
`pods/tap`, but did not include the parent resources, such as
`deployments` and `pods`. This broke commands like `kubectl auth can-i`,
which expect the parent resource to exist.
Introduce parent resources for all tappable subresources. Concretely, this
change fixes this class of command:
```
$ kubectl auth can-i watch deployments.tap.linkerd.io/linkerd-grafana \
--subresource=tap -n linkerd --as siggy@buoyant.io
Warning: the server doesn't have a resource type 'deployments' in group 'tap.linkerd.io'
no - no RBAC policy matched
```
Fixed:
```
$ kubectl auth can-i watch deployments.tap.linkerd.io/linkerd-grafana \
--subresource=tap -n linkerd --as siggy@buoyant.io
yes
```
Additionally, when SubjectAccessReviews fail, return a 403 rather than
500.
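As a minimal sketch of the new behavior (handler and function names here are
illustrative, not the exact patch), a denied `SubjectAccessReview` is treated
as a client authorization failure:
```go
package tap

import (
	"fmt"
	"net/http"

	authv1 "k8s.io/api/authorization/v1"
)

// authorized sketches the fix: a denied SubjectAccessReview means the
// client lacks RBAC permission, so the handler answers 403, not 500.
func authorized(w http.ResponseWriter, sar *authv1.SubjectAccessReview) bool {
	if !sar.Status.Allowed {
		msg := fmt.Sprintf("tap authorization denied: %s", sar.Status.Reason)
		http.Error(w, msg, http.StatusForbidden)
		return false
	}
	return true
}
```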
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.
This change migrates `linkerd profile --tap` to the new Tap APIService.
Depends on #3186
Fixes #3169
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.
This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two goroutines
could both call `close(done)` on exit.
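A minimal sketch of this class of fix, guarding the close with `sync.Once`
(illustrative, not the exact patch):
```go
package top

import "sync"

// watchDone coordinates shutdown across goroutines. Closing an already
// closed channel panics, so the close is guarded with sync.Once.
type watchDone struct {
	done chan struct{}
	once sync.Once
}

// close is safe to call from multiple goroutines; only the first call
// actually closes the channel.
func (w *watchDone) close() {
	w.once.Do(func() { close(w.done) })
}
```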
Fixes #3168
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
**Significant Update**
This edge release introduces a new tap APIService. The Kubernetes apiserver
authenticates the requesting tap user and then forwards tap requests to the new
tap APIServer. The `linkerd tap` command now makes requests against the
APIService.
With this release, users must be authorized via RBAC to use the `linkerd tap`
command. Specifically, `linkerd tap` requires the `watch` verb on all resources
in the `tap.linkerd.io/v1alpha1` APIGroup. More granular access is also
available via subresources such as `deployments/tap` and `pods/tap`.
* CLI
* Added a check to the `linkerd check` command to validate the user has
privileges necessary to create CronJobs
* Introduced the `linkerd --as` flag which allows users to impersonate another
user for Kubernetes operations
* The `linkerd tap` command now makes requests against the tap APIService
* Controller
* Added HTTP security headers on all dashboard responses
* Fixed nil pointer dereference in the destination service when an endpoint
does not have a `TargetRef`
* Added resource limits when HA is enabled
* Added RSA support to TLS libraries
* Updated the destination service to return `InvalidArgument` for external
name services so that the proxy does not immediately fail the request
* The `l5d-require-id` header is now set on tap requests so that a connection
is established over TLS
* Introduced the `APIService/v1alpha1.tap.linkerd.io` global resource
* Introduced the `ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator`
global resource
* Introduced the `Secret/linkerd-tap-tls` resource into the `linkerd`
namespace
* Introduced the `RoleBinding/linkerd-linkerd-tap-auth-reader` resource into
the `kube-system` namespace
* Proxy
* Added the `LINKERD2_PROXY_TAP_SVC_NAME` environment variable so that the tap
server attempts to authorize client identities
* Internal
* Replaced `dep` with Go modules for dependency management
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.
This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer validates the client TLS certificate
from Kubernetes, and authorizes the user via a SubjectAccessReview.
This change also modifies the `linkerd tap` command to make requests
against the new APIService.
The Tap APIService implements these Kubernetes-style endpoints:
```
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET  /apis
GET  /apis/tap.linkerd.io
GET  /apis/tap.linkerd.io/v1alpha1
GET  /healthz
GET  /healthz/log
GET  /healthz/ping
GET  /metrics
GET  /openapi/v2
GET  /version
```
Users are authorized against the new `tap.linkerd.io/v1alpha1` APIGroup via
RBAC. Only the `watch` verb is supported. Access is also available via
subresources such as `deployments/tap` and `pods/tap`.
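As a hedged illustration, a ClusterRole granting the granular form could be
constructed as follows (the role name is hypothetical, not part of the
install):
```go
package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Grants the watch verb on the tap subresource of deployments and
	// pods, the granular alternative to watch on all tap.linkerd.io
	// resources.
	cr := rbacv1.ClusterRole{
		TypeMeta: metav1.TypeMeta{
			APIVersion: "rbac.authorization.k8s.io/v1",
			Kind:       "ClusterRole",
		},
		ObjectMeta: metav1.ObjectMeta{Name: "tap-deployments-and-pods"},
		Rules: []rbacv1.PolicyRule{{
			APIGroups: []string{"tap.linkerd.io"},
			Resources: []string{"deployments/tap", "pods/tap"},
			Verbs:     []string{"watch"},
		}},
	}
	out, err := yaml.Marshal(cr)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```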
This change introduces the following resources into the default Linkerd
install:
- Global:
- APIService/v1alpha1.tap.linkerd.io
- ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
- Secret/linkerd-tap-tls
- `kube-system` namespace:
- RoleBinding/linkerd-linkerd-tap-auth-reader
Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller
Fixes #2725, #3162, #3172
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Similar to `kubectl --as`, this introduces a global flag across all linkerd
subcommands which sets an `ImpersonationConfig` in the Kubernetes API config.
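A minimal sketch of what the flag does under the hood, assuming client-go
(the function name is illustrative):
```go
package k8s

import (
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// configWithImpersonation loads a kubeconfig and sets the impersonated
// user, roughly what a global --as flag wires into every API client.
func configWithImpersonation(kubeconfig, asUser string) (*rest.Config, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}
	config.Impersonate = rest.ImpersonationConfig{UserName: asUser}
	return config, nil
}
```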
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The heartbeat cronjob specified `restartPolicy: OnFailure`. In cases
where failure was non-transient, such as if a cluster did not have
internet access, this would continuously restart and fail.
Change the heartbeat cronjob to `restartPolicy: Never`, as a failed job
has no user-facing impact.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
### Summary
In order for Pods' tap servers to start authorizing tap clients, the tap
controller must open TLS connections so that it can identify itself to the
server.
This change introduces the use of `l5d-require-id` header on outbound tap
requests.
### Details
When tap requests are made by the tap controller, the `Authority` header is an
IP address. The proxy does not attempt to do service discovery on such requests
and therefore the connection is over plaintext. By introducing the
`l5d-require-id` header the proxy can require a server name on the connection.
This allows the tap controller to identify itself as the client making tap
requests. The name value for the header can be derived from the Pod Spec and
tap request, so the change is rather minimal.
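A minimal sketch of setting the header on an outbound tap request (the
function name and the derivation of the identity value are illustrative):
```go
package tap

import (
	"fmt"
	"net/http"
)

// newTapRequest sketches the change: requests are addressed by pod IP, so
// the l5d-require-id header carries the expected server name, forcing the
// proxy to establish the connection over TLS to that identity.
func newTapRequest(podIP string, port int, requireID string) (*http.Request, error) {
	url := fmt.Sprintf("http://%s:%d/", podIP, port)
	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("l5d-require-id", requireID)
	return req, nil
}
```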
#### Proxy Changes
* Update h2 to v0.1.26
* Properly fall back in the dst_router (linkerd/linkerd2-proxy#291)
### Testing
Unit tests for the header have not been added, mainly because [no test
infrastructure currently exists](065c221858/controller/tap/server_test.go (L241)) to mock proxy requests. After talking with
@siggy a little about this, it makes sense to do this in a separate change at
some point, for cases where behavior like this cannot be reliably tested
through integration tests either.
Integration tests do exercise this well, and will continue to do so once
linkerd/linkerd2-proxy#290 lands.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498
When the Linkerd proxy sends a query for a Kubernetes ExternalName service to the destination service, the destination service returns `NoEndpoints: exists=false`, because an ExternalName service has no Endpoints resource. Due to a change in the proxy's fallback logic, this no longer causes the proxy to fall back to either DNS or SO_ORIG_DST, and instead fails the request. The net effect is that Linkerd fails all requests to ExternalName services.
We change the destination service to instead return `InvalidArgument` for ExternalName services. This causes the proxy to fall back to SO_ORIG_DST instead of failing the request.
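A sketch of the check, assuming the lookup has already resolved to a Service
object (the function name is illustrative):
```go
package destination

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	corev1 "k8s.io/api/core/v1"
)

// checkServiceType sketches the fix: an ExternalName service has no
// Endpoints object, so answer InvalidArgument, which the proxy treats as
// a cue to fall back to SO_ORIG_DST instead of failing the request.
func checkServiceType(svc *corev1.Service, authority string) error {
	if svc.Spec.Type == corev1.ServiceTypeExternalName {
		return status.Errorf(codes.InvalidArgument,
			"%s is an ExternalName service and cannot be resolved to endpoints", authority)
	}
	return nil
}
```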
Signed-off-by: Alex Leong <alex@buoyant.io>
### Summary
In order for Pods' tap servers to start authorizing tap clients, the tap server
must be able to check client names against the expected tap service name.
This change injects the `LINKERD2_PROXY_TAP_SVC_NAME` environment variable into proxy PodSpecs.
### Details
The tap servers on the individual resources being tapped should be able to
verify that the client is the tap service. `LINKERD2_PROXY_TAP_SVC_NAME` is
now injected as an environment variable into the proxies so that the proxy can
check this value against the client name of the TLS connection. Currently, this
environment variable goes unused. There is an open PR (linkerd2-proxy#290) to
use this variable in the proxy, but this change is *not* dependent on that
merging first.
Note: The variable is not injected if tap is disabled.
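A sketch of the injection, assuming the injector holds the proxy container
spec (the function name and identity value are illustrative):
```go
package inject

import corev1 "k8s.io/api/core/v1"

// addTapSvcName sketches the change: inject the expected tap client name
// so the proxy can compare it against the TLS client identity. Skipped
// entirely when tap is disabled.
func addTapSvcName(proxy *corev1.Container, tapDisabled bool) {
	if tapDisabled {
		return
	}
	proxy.Env = append(proxy.Env, corev1.EnvVar{
		Name: "LINKERD2_PROXY_TAP_SVC_NAME",
		// Illustrative value; the real name is derived from the install.
		Value: "linkerd-tap.linkerd.serviceaccount.identity.linkerd.cluster.local",
	})
}
```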
### Testing
Test output has been updated with the newly injected environment variable.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
PR #3056 introduced a cluster heartbeat cronjob to the Linkerd
installation. This implies the user installing Linkerd requires the
privileges to create CronJobs.
Update `linkerd check` to validate the user has privileges necessary to
create CronJobs.
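A sketch of the kind of pre-flight check this adds, assuming 2019-era
client-go signatures (the function name is illustrative):
```go
package healthcheck

import (
	authv1 "k8s.io/api/authorization/v1"
	"k8s.io/client-go/kubernetes"
)

// canCreateCronJobs asks the API server whether the current user may
// create CronJobs, the same question `kubectl auth can-i` would ask.
func canCreateCronJobs(client kubernetes.Interface, namespace string) (bool, error) {
	sar := &authv1.SelfSubjectAccessReview{
		Spec: authv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authv1.ResourceAttributes{
				Namespace: namespace,
				Verb:      "create",
				Group:     "batch",
				Resource:  "cronjobs",
			},
		},
	}
	resp, err := client.AuthorizationV1().SelfSubjectAccessReviews().Create(sar)
	if err != nil {
		return false, err
	}
	return resp.Status.Allowed, nil
}
```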
Fixes #3057
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Increased HA resource limits
* Added resource limits to the proxy when HA is enabled
* Updated golden files in cmd/main
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The destination service's endpoints watcher assumed every `Endpoints`
object contained a `TargetRef`. This field is optional, and in cases
such as the default `ep/kubernetes` object, `TargetRef` is nil, causing
a nil pointer dereference.
Fix endpoints watcher to check for `TargetRef` prior to dereferencing.
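A sketch of the guard, using types from `k8s.io/api/core/v1` (the function
name is illustrative):
```go
package destination

import corev1 "k8s.io/api/core/v1"

// podIDs sketches the fix: EndpointAddress.TargetRef is optional (it is
// nil on the default ep/kubernetes object), so check it before use.
func podIDs(subset corev1.EndpointSubset) []string {
	var ids []string
	for _, addr := range subset.Addresses {
		if addr.TargetRef == nil || addr.TargetRef.Kind != "Pod" {
			continue
		}
		ids = append(ids, addr.TargetRef.Namespace+"/"+addr.TargetRef.Name)
	}
	return ids
}
```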
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Set the following headers on every dashboard response:
- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: SAMEORIGIN`
- `X-XSS-Protection: 1; mode=block`
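A minimal sketch of the middleware shape (illustrative, not the exact
handler):
```go
package web

import "net/http"

// withSecurityHeaders wraps a handler so every dashboard response carries
// the security headers listed above.
func withSecurityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "SAMEORIGIN")
		h.Set("X-XSS-Protection", "1; mode=block")
		next.ServeHTTP(w, r)
	})
}
```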
Fixes #3082
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The repo relied on `dep` for managing Go dependencies. Go 1.11 shipped
with Go modules support. Go 1.13 will be released in August 2019 with
module support enabled by default, deprecating GOPATH.
This change replaces `dep` with Go modules for dependency management.
All scripts, including Docker builds and ci, should work without any dev
environment changes.
To execute `go` commands directly during development, do one of the
following:
1. clone this repo outside of `GOPATH`; or
2. run `export GO111MODULE=on`
Summary of changes:
- Docker build scripts and ci set `-mod=readonly`, to ensure
dependencies defined in `go.mod` are exactly what is used for the
builds.
- Dependency updates to `go.mod` are accomplished by running
`go build` and `go test` directly.
- `bin/go-run`, `bin/build-cli-bin`, and `bin/test-run` set
`GO111MODULE=on`, permitting usage inside and outside of GOPATH.
- `gcr.io/linkerd-io/go-deps` image tags are now hashed from `go.mod`.
- `bin/update-codegen.sh` still requires running from GOPATH,
instructions added to BUILD.md.
Fixes #1488
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes #3136
When the destination service sends a destination profile with a traffic split to the proxy, the override destination authorities are absolute but do not contain a trailing dot, e.g. "bar.ns.svc.cluster.local:80". However, NameAddrs which have undergone canonicalization in the proxy will include the trailing dot. When a traffic split includes the apex service as one of the overrides, the original apex NameAddr will have the trailing dot and the override will not. Since these two NameAddrs are not identical, they go into two distinct slots in the proxy's concrete dst router. This causes two services to be created for the same destination, which causes the stats clobbering described in the linked issue.
We change the destination service to always return absolute dst overrides including the trailing dot.
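A sketch of the normalization (the function name is illustrative):
```go
package destination

import "strings"

// absoluteAuthority sketches the fix: append the trailing dot to the host
// so overrides compare equal to canonicalized NameAddrs in the proxy,
// e.g. "bar.ns.svc.cluster.local:80" -> "bar.ns.svc.cluster.local.:80".
func absoluteAuthority(authority string) string {
	host, port := authority, ""
	if i := strings.LastIndex(authority, ":"); i >= 0 {
		host, port = authority[:i], authority[i:]
	}
	if !strings.HasSuffix(host, ".") {
		host += "."
	}
	return host + port
}
```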
Signed-off-by: Alex Leong <alex@buoyant.io>
We add support for looking up individual pods in a stateful set with the destination service. This allows Linkerd to correctly proxy requests which address individual pods. The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.
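A sketch of how such an authority can be decomposed, assuming the standard
`<pod>.<service>.<namespace>.svc.<cluster-domain>` shape (the function name
is illustrative):
```go
package destination

import "strings"

// parsePodAuthority splits a hostname of the form
// <pod>.<service>.<namespace>.svc.<cluster-domain> into its parts, so the
// destination service can look up the individual pod.
func parsePodAuthority(host string) (pod, service, namespace string, ok bool) {
	labels := strings.Split(host, ".")
	if len(labels) < 4 || labels[3] != "svc" {
		return "", "", "", false
	}
	return labels[0], labels[1], labels[2], true
}
```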
Fixes #2266
Signed-off-by: Alex Leong <alex@buoyant.io>
* Add more PSP restrictions
* Update test fixtures
* Make the PSP conditional on the initContainer
  - The proxy-init container runs as root, so the PSP must allow this
    user when there is an init container.
Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
The `_resources.yaml` partial hard-coded indentation, making it
cumbersome to use it in contexts that did not have a specific
indentation level.
Remove indentation from `_resources.yaml`, and instead specify required
indentation at the call site via `nindent`.
Fixes #3119
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
kubectl introduced `-A` as a shorthand for `--all-namespaces` in
`v1.14.0`:
https://github.com/kubernetes/kubernetes/pull/72006
Update linkerd cli's `edges`, `get`, and `stat` commands to match this
convention.
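A sketch of the flag registration with cobra (command and variable names are
illustrative):
```go
package cmd

import "github.com/spf13/cobra"

// newStatCmd sketches the change: "-A" is registered as the shorthand for
// --all-namespaces, matching the kubectl convention.
func newStatCmd() *cobra.Command {
	var allNamespaces bool
	cmd := &cobra.Command{Use: "stat"}
	cmd.PersistentFlags().BoolVarP(&allNamespaces, "all-namespaces", "A", false,
		"If present, query across all namespaces")
	return cmd
}
```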
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. Many users seldom execute these
codepaths, which makes it difficult to identify what versions of Linkerd are
currently in use and what environments it is being run in, information that
helps prioritize testing and backports.
Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.
Example check URL:
```
https://versioncheck.linkerd.io/version.json?
  install-time=1562761177&
  k8s-version=v1.15.0&
  meshed-pods=8&
  rps=3&
  source=heartbeat&
  uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
  version=git-b97ee9f7
```
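As a hedged illustration, the query could be assembled like this (parameter
names taken from the example above; the function name is illustrative):
```go
package heartbeat

import (
	"net/url"
	"strconv"
)

// checkURL assembles the version-check URL from the reported values.
func checkURL(installTime int64, k8sVersion, uuid, version string, meshedPods, rps int) string {
	v := url.Values{}
	v.Set("install-time", strconv.FormatInt(installTime, 10))
	v.Set("k8s-version", k8sVersion)
	v.Set("meshed-pods", strconv.Itoa(meshedPods))
	v.Set("rps", strconv.Itoa(rps))
	v.Set("source", "heartbeat")
	v.Set("uuid", uuid)
	v.Set("version", version)
	return "https://versioncheck.linkerd.io/version.json?" + v.Encode()
}
```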
Fixes #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller. Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.
To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2. This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource. If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can. This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.
Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation). In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Allow custom cluster domain in destination watcher
The change relaxes the constraint that an authority must have a
`svc.cluster.local` suffix, requiring only `svc` as the third DNS label
(see the sketch after this list). A unit test could be added, though the
destination/server and endpoint watcher tests already cover this behaviour.
* Update proto to allow setting custom cluster domain
Update golden templates
* Allow setting custom domain in grpc, web server
* Remove cluster domain flags from web srv and public api
* Set defaultClusterDomain in validateAndBuild if none is set
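A sketch of the relaxed check (the function name is illustrative):
```go
package watcher

import "strings"

// isKubernetesServiceHost accepts any authority whose third DNS label is
// "svc", rather than requiring the full "svc.cluster.local" suffix, so
// custom cluster domains pass.
func isKubernetesServiceHost(host string) bool {
	labels := strings.Split(strings.TrimSuffix(host, "."), ".")
	return len(labels) >= 3 && labels[2] == "svc"
}
```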
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
* Updates for the integration test script
1. Remove existing resources prior to starting the test
2. Remove existing resources post upgrade test
3. Fail fast if `install_test.go` fails
4. Don't perform cleanup if any of the tests fail, to allow for debugging
* Remove pre-test cleanup from .travis.yaml
This is now done in the bin/test-run script so that it can be shared
between l5d-bot and staging.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackoff).
We add pod status to the output to make this more immediately obvious.
Fixes #2877
Signed-off-by: Alex Leong <alex@buoyant.io>
* Added anti-affinity when HA is configured
* Moved check to validate()
* Tested output with anti-affinity on HA upgrade
* Added anti-affinity to the identity deployment
* Made host anti-affinity the default when HA is enabled
* Defined the affinity template in a separate file
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When installing using some of the flags that persist in install, e.g.
`linkerd install --ha`, and then running `linkerd upgrade config`, a nil
pointer error is thrown.
Fixes #3094
`newCmdUpgradeConfig()` was passing `flags` as nil because
`linkerd upgrade config` doesn't expose any flags for the subcommand, but it
turns out they're still needed down the call stack in
`setFlagsFromInstall` to reuse the flags persisted during install.
I also added a new unit test catching this.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
This PR improves the CLI output for `linkerd edges` to reflect the latest API
changes.
Source and destination namespaces for each edge are now shown by default. The
`MSG` column has been replaced with `Secured` and contains a green checkmark or
the reason for no identity. A new `-o wide` flag shows the identity of client
and server if known.
The `TestGetServicesFor` test is flaky because it compares two slices of services which are in a non-deterministic order.
To make this deterministic, we first sort the slices by name.
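A sketch of the fix, assuming a slice of `k8s.io/api/core/v1` services (the
function name is illustrative):
```go
package destination

import (
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// sortServicesByName orders a slice of services deterministically so the
// test comparison no longer depends on iteration order.
func sortServicesByName(services []*corev1.Service) {
	sort.Slice(services, func(i, j int) bool {
		return services[i].Name < services[j].Name
	})
}
```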
Signed-off-by: Alex Leong <alex@buoyant.io>
Integration tests may fail and leave behind namespaces that following
builds aren't able to clean up because the git sha is being included in
the namespace name, and the following builds don't know about those
shas.
This modifies the `test-cleanup` script to delete based on object labels
instead of relying on the objects' names, now that as of 2.4 all the
control plane components are labeled. Note that this will also remove
non-testing linkerd namespaces, but we were already partially doing that
because we were removing the cluster-level resources (CRDs,
webhook configs, clusterroles, clusterrolebindings, psp).
`test-cleanup` no longer receives a namespace name as an argument.
The data plane namespaces aren't labeled though, so I've added the
`linkerd.io/is-test-data-plane` label for them in
`CreateNamespaceIfNotExists()`, making sure all tests that need a
data plane explicitly call that method instead of creating the
namespace as a side effect in `KubectlApply()`.
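A sketch of label-based cleanup with 2019-era client-go signatures (the
function name and the label value "true" are assumptions):
```go
package testutil

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteTestNamespaces sketches label-based cleanup: remove namespaces
// carrying the test label rather than guessing names from git SHAs.
func deleteTestNamespaces(client kubernetes.Interface) error {
	namespaces, err := client.CoreV1().Namespaces().List(metav1.ListOptions{
		LabelSelector: "linkerd.io/is-test-data-plane=true",
	})
	if err != nil {
		return err
	}
	for _, ns := range namespaces.Items {
		if err := client.CoreV1().Namespaces().Delete(ns.Name, &metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```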
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The git-commit-proxy-version script attempted to link to the specific
SHAs. GitHub doesn't actually render these links, and the information is
redundant since we link the appropriate PR.
Each time we update the proxy from the linkerd2-proxy repo, we make the
change slightly differently. The bin/git-commit-proxy-version does all the
steps needed to update the proxy version up to and including making a
commit to this repo.
The proxy version is now stored in a .proxy-version file and is
consumed directly by Dockerfile-proxy, which simplifies both the
Dockerfile and the update process.
This script formats commit messages and emits output as follows:
```
commit c05198a851f69bdc7007974a0ef1f4c01c98d0ce (HEAD -> ver/proxy-update)
Author: Oliver Gould <ver@buoyant.io>
Date: Thu Jul 11 17:23:05 2019 +0000
proxy: Update to linkerd/linkerd2-proxy#3a3ec3b
* linkerd/linkerd2-proxy#0cc58cd fallback: Clarify fallback layering (linkerd/linkerd2-proxy#288)
* linkerd/linkerd2-proxy#b71349a Replace `log` and `env-logger` with `tracing` and `tracing-fmt` (linkerd/linkerd2-proxy#277)
* linkerd/linkerd2-proxy#3a3ec3b Use a constant-time load balancer (linkerd/linkerd2-proxy#266)
diff --git a/.proxy-version b/.proxy-version
index f81f40de..d7faa12d 100644
--- a/.proxy-version
+++ b/.proxy-version
@@ -1 +1 @@
-05b012d
+3a3ec3b
```