### Summary
With the addition of the tap APIServer, none of the tap logic in the public API is needed any longer. The servers and clients that are created but no longer used, as well as all the old tap testing infrastructure, can be removed.
This deprecates `TapByResource` and therefore required regenerating the protobuf files with `bin/protoc-go.sh`. While the change to deprecate the method was tiny, many protobuf files were updated in the process. These code and protobuf changes should remain coupled since `TapByResource` is now officially deprecated in the public API, but the majority of the additions/deletions are in those generated files.
This draft passes `go test` as well as a local run of the integration tests.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
PR #3217 re-introduced container metrics collection to
linkerd-prometheus. This enabled linkerd-heartbeat to collect memory and CPU metrics at the container level.
Add container CPU and memory metrics to heartbeat requests. For each of (destination, prometheus, linkerd-proxy), collect maximum memory and p95 CPU.
Concretely, this introduces 7 new query params to heartbeat requests (a sketch of assembling them follows the list):
- p99-handle-us
- max-mem-linkerd-proxy
- max-mem-destination
- max-mem-prometheus
- p95-cpu-linkerd-proxy
- p95-cpu-destination
- p95-cpu-prometheus
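A minimal sketch of how these params might be assembled, assuming a hypothetical `promQuery` helper and illustrative PromQL shapes rather than linkerd's actual queries:

```go
package main

import (
	"fmt"
	"net/url"
)

// promQuery stands in for executing a PromQL query against
// linkerd-prometheus and returning a scalar result; hypothetical.
func promQuery(q string) string { return "0" }

func main() {
	v := url.Values{}
	for _, c := range []string{"linkerd-proxy", "destination", "prometheus"} {
		// maximum memory working set per container (assumed query shape)
		v.Set("max-mem-"+c, promQuery(fmt.Sprintf(
			`max(container_memory_working_set_bytes{container=%q})`, c)))
		// p95 of the CPU usage rate per container (assumed query shape)
		v.Set("p95-cpu-"+c, promQuery(fmt.Sprintf(
			`quantile(0.95, rate(container_cpu_usage_seconds_total{container=%q}[5m]))`, c)))
	}
	fmt.Println(v.Encode())
}
```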
Part of #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR adds `trafficsplit` as a supported resource for the `linkerd stat` command. Users can type `linkerd stat ts` to see the apex and leaf services of their trafficsplits, as well as metrics for those leaf services.
Go dependencies which are only used by generated code had not previously been checked into the repo. Because `go generate` does not respect the `-mod=readonly` flag, running `bin/linkerd` will add these dependencies and dirty the local repo. This can interfere with the way version tags are generated.
To avoid this, we simply check these deps in.
Note that running `go mod tidy` will remove these again. Thus, it is not recommended to run `go mod tidy`.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Delete symlink to old Helm chart
* Update 'install' code to use common Helm template structs
* Remove obsolete TLS assets functions.
These are now handled by Helm functions inside the templates
* Read defaults from values.yaml and values-ha.yaml
* Ensure that webhooks TLS assets are retained during upgrade
* Fix a few bugs in the Helm templates (see bullet points):
* Merge the way the 'install' ha and non-ha options are handled into one function
* Honor the 'NoInitContainer' option in the components templates
* Control plane mTLS will not be disabled if the identity context in the
config map is empty. Data plane mTLS will still be automatically disabled
if the context is nil.
* Resolve test failures from rebase with master
* Fix linter issues
* Set service account mount path read-only field
* Add TLS variables of the webhooks and tap to values.yaml
During upgrade, these secrets are preserved to ensure they remain synced
with the CA bundle in the webhook configurations. These Helm variables are used
to override the defaults in the templates.
* Remove obsolete 'chart' folder
* Fix bugs in templates
* Handle missing webhooks and tap TLS assets during upgrade
When upgrading from an older version that doesn't have these secrets, fall back to letting Helm
create them by constructing an empty `charts.TLS` struct.
* Revert the selector labels of webhooks to be compatible with those in 2.4
In 2.4, the proxy injector and profile validator webhooks already have their selector labels defined.
Since these attributes are immutable, the recent change to these selectors introduced by the Helm chart work would cause upgrades to fail.
* Alejandro's feedback
* Siggy's feedback
* Removed redundant unexported custom types
Signed-off-by: Ivan Sim <ivan@buoyant.io>
Now that we inject at the pod level by default, `linkerd uninject` should remove the `linkerd.io/inject: enabled`
annotation. Also added a test for that.
Fixes #3156
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The `linkerd-linkerd-tap-admin` ClusterRole had `watch` privileges on
`*/tap` resources. This disallowed non-namespaced tap requests of the
form: `/apis/tap.linkerd.io/v1alpha1/watch/namespaces/linkerd/tap`,
because that URL structure is interpreted by the Kubernetes API as
watching a resource of type `tap` within the linkerd namespace, rather
than tapping the linkerd namespace.
Modify `linkerd-linkerd-tap-admin` to have `watch` privileges on `*`,
enabling any request of the form
`/apis/tap.linkerd.io/v1alpha1/watch/namespaces/linkerd/*` to succeed.
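Expressed as a client-go RBAC rule, the widened policy might look like this sketch (field values mirror the description above):

```go
package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
)

func main() {
	// Before: Resources was []string{"*/tap"}, which the API server read as
	// a "tap" subresource and rejected for namespace-level watch URLs.
	rule := rbacv1.PolicyRule{
		APIGroups: []string{"tap.linkerd.io"},
		Resources: []string{"*"},
		Verbs:     []string{"watch"},
	}
	fmt.Printf("%+v\n", rule)
}
```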
Fixes #3212
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The web dashboard will be migrating to the new Tap APIService, which
requires RBAC privileges to access.
Introduce a new ClusterRole, `linkerd-linkerd-tap-admin`, which gives
cluster-wide tap privileges. Also introduce a new ClusterRoleBinding,
`linkerd-linkerd-web-admin` which binds the `linkerd-web` service
account to the new tap ClusterRole. This ClusterRoleBinding is enabled
by default, but may be disabled via a new `linkerd install` flag
`--restrict-dashboard-privileges`.
Fixes #3177
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Follow-up to #3148
A wrong argument order in the call to `profiles.RenderOpenAPI` was generating an
invalid service profile name.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Refactor proxy injection to use Helm charts
Fixes #3128
A new chart, `/charts/patch`, was created that generates the JSON patch
payload returned to the k8s API when injecting through the proxy injector;
it's also leveraged by the `linkerd inject --manual` CLI.
The VFS was used by `linkerd install` to access the old chart under
`/chart`. Now the proxy injection also uses the Helm charts to generate
the JSON patch (see above) so we've moved the VFS from `cli/static` to a
new common place under `/pkg/charts/static`, and the new root for the VFS is
now `/charts`.
`linkerd install` hasn't yet migrated to use the new charts (that'll
happen in #3127), so the only change in that regard was the creation of
`/charts/chart` which is a symlink pointing to `/chart` that
`install.go` now uses, so that the VFS contains both the old and new
charts, as a temporary measure.
Note that `/bin/Dockerfile-bin`, `/controller/Dockerfile`, and
`/bin/build-cli-bin` now run `go generate` against the new location
(and the `go generate` annotation was moved from `/cli/main.go` to
`pkg/charts/static/templates.go`).
The symlink trick doesn't work when building the binaries through
Docker, so `/bin/Dockerfile-bin` replaces the symlink with an actual
copy of `/chart`.
Also note that in `/controller/Dockerfile` we now need to include the
`prod` tag in `go install`, like we do in `/bin/Dockerfile-bin`, so that
the proxy injector uses the VFS instead of the local file system.
- The common logic to parse a chart has been moved from `install.go` to
`/pkg/charts/util.go`.
- The special ENV var in the proxy for "outbound router capacity" that
only applies to the Prometheus pod is now handled directly in the proxy
partial and all the associated go code could be removed.
- The `patch.go` lib for generating the JSON patch in Go, along with its
tests in `patch_test.go`, is no longer needed.
- Lots of functions in `/pkg/inject/inject.go` got removed/simplified
with their logic being moved into the charts themselves. As a
consequence lots of things in `inject_test.go` became irrelevant.
- Moved `template-values.go` from `/pkg/inject` to `pkg/charts`, as it
contains the Go struct representation of the chart variables that
will be leveraged in #3127.
Don't forget to run `/bin/helm.sh` whenever you make changes to charts
;-)
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Continuation of #2950.
I decided to check for the `clusterDomain` in the config map in the web server's main for the same reasons as pointed out in https://github.com/linkerd/linkerd2/pull/3113#discussion_r306935817
It decouples the server implementations from the config.
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
PR #3167 introduced a Tap APIService, and migrated linkerd tap to it.
This change migrates `linkerd profile --tap` to the new Tap APIService.
Depends on #3186. Fixes #3169
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #3167 introduced a Tap APIService, and migrated `linkerd tap` to it.
This change migrates `linkerd top` to the new Tap APIService. It also
addresses a `panic: close of closed channel` issue, where two go
routines could both call `close(done)` on exit.
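One common guard against that double-close, sketched here with `sync.Once` (the PR's actual fix may be structured differently):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	done := make(chan struct{})
	var once sync.Once
	// Every exit path calls closeDone; the close itself runs exactly once,
	// avoiding "panic: close of closed channel".
	closeDone := func() { once.Do(func() { close(done) }) }

	go closeDone() // e.g. signal-handler path
	closeDone()    // e.g. stream-completion path
	<-done
	fmt.Println("shut down cleanly")
}
```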
Fixes #3168
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.
This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.
This change also modifies the `linkerd tap` command to make requests
against the new APIService.
The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET /apis
GET /apis/tap.linkerd.io
GET /apis/tap.linkerd.io/v1alpha1
GET /healthz
GET /healthz/log
GET /healthz/ping
GET /metrics
GET /openapi/v2
GET /version
Users are authorized to the new `tap.linkerd.io/v1alpha1` API via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.
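A sketch of the SubjectAccessReview step, assuming the pre-1.17 client-go `Create` signature; names and attributes here are illustrative rather than the APIServer's exact code:

```go
package tap

import (
	authzv1 "k8s.io/api/authorization/v1"
	"k8s.io/client-go/kubernetes"
)

// authorizeTap asks Kubernetes whether the (already authenticated) user may
// watch tap resources in the given namespace.
func authorizeTap(client kubernetes.Interface, user string, groups []string, ns string) (bool, error) {
	sar := &authzv1.SubjectAccessReview{
		Spec: authzv1.SubjectAccessReviewSpec{
			User:   user,
			Groups: groups,
			ResourceAttributes: &authzv1.ResourceAttributes{
				Group:     "tap.linkerd.io",
				Verb:      "watch",
				Namespace: ns,
				Resource:  "*",
			},
		},
	}
	res, err := client.AuthorizationV1().SubjectAccessReviews().Create(sar)
	if err != nil {
		return false, err
	}
	return res.Status.Allowed, nil
}
```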
This change introduces the following resources into the default Linkerd
install:
- Global:
- APIService/v1alpha1.tap.linkerd.io
- ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
- Secret/linkerd-tap-tls
- `kube-system` namespace:
- RoleBinding/linkerd-linkerd-tap-auth-reader
Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller
Fixes #2725, #3162, #3172
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Similar to `kubectl --as`, this adds a global flag across all linkerd
subcommands which sets an `ImpersonationConfig` in the Kubernetes API config.
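In client-go terms the flag plumbs through roughly like this sketch (helper name hypothetical):

```go
package k8s

import (
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// newConfig builds an API config, impersonating the given user when
// impersonate is non-empty, mirroring `kubectl --as`.
func newConfig(kubeconfigPath, impersonate string) (*rest.Config, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return nil, err
	}
	if impersonate != "" {
		// RBAC must grant the real user the "impersonate" verb on users.
		config.Impersonate = rest.ImpersonationConfig{UserName: impersonate}
	}
	return config, nil
}
```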
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The heartbeat cronjob specified `restartPolicy: OnFailure`. In cases
where failure was non-transient, such as if a cluster did not have
internet access, this would continuously restart and fail.
Change the heartbeat cronjob to `restartPolicy: Never`, as a failed job
has no user-facing impact.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
### Summary
In order for Pods' tap servers to start authorizing tap clients, the tap server
must be able to check client names against the expected tap service name.
This change injects the `LINKERD2_PROXY_TAP_SVC_NAME` into proxy PodSpecs.
### Details
The tap servers on the individual resources being tapped should be able to
verify that the client is the tap service. The `LINKERD2_PROXY_TAP_SVC_NAME` is
now injected as an environment variable in the proxies so that it can check this
value against the client name of the TLS connection. Currently, this environment
variable goes unused. There is an open PR (linkerd2-proxy#290) to use this variable in
the proxy, but this is *not* dependent on that merging first.
Note: The variable is not injected if tap is disabled.
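A sketch of the injection step, with a hypothetical helper; the actual identity name passed in is derived from the install's trust domain:

```go
package inject

import corev1 "k8s.io/api/core/v1"

// addTapSvcName appends the tap service's identity name to the proxy
// container's environment, unless tap is disabled.
func addTapSvcName(proxy *corev1.Container, tapSvcName string, tapDisabled bool) {
	if tapDisabled {
		return // the variable is not injected if tap is disabled
	}
	proxy.Env = append(proxy.Env, corev1.EnvVar{
		Name:  "LINKERD2_PROXY_TAP_SVC_NAME",
		Value: tapSvcName, // checked against the client name of the TLS connection
	})
}
```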
### Testing
Test output has been updated with the newly injected environment variable.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Increased HA resource limits
* Added resource limits to proxy when HA
* Updated golden files in cmd/main
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The repo relied on `dep` for managing Go dependencies. Go 1.11 shipped
with Go modules support. Go 1.13 will be released in August 2019 with
module support enabled by default, deprecating GOPATH.
This change replaces `dep` with Go modules for dependency management.
All scripts, including Docker builds and ci, should work without any dev
environment changes.
To execute `go` commands directly during development, do one of the
following:
1. clone this repo outside of `GOPATH`; or
2. run `export GO111MODULE=on`
Summary of changes:
- Docker build scripts and ci set `-mod=readonly`, to ensure
dependencies defined in `go.mod` are exactly what is used for the
builds.
- Dependency updates to `go.mod` are accomplished by running
`go build` and `go test` directly.
- `bin/go-run`, `bin/build-cli-bin`, and `bin/test-run` set
`GO111MODULE=on`, permitting usage inside and outside of GOPATH.
- `gcr.io/linkerd-io/go-deps` tags are hashed from `go.mod`.
- `bin/update-codegen.sh` still requires running from GOPATH,
instructions added to BUILD.md.
Fixes #1488
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Adds more PSP restrictions
* Update test fixtures
* Updates PSP to be conditional on initContainer
- The proxy-init container runs as root and needs the PSP to allow this
user when there is an init container.
Signed-off-by: Cody Vandermyn <cody.vandermyn@nordstrom.com>
kubectl introduced `-A` as a shorthand for `--all-namespaces` in
`v1.14.0`:
https://github.com/kubernetes/kubernetes/pull/72006
Update linkerd cli's `edges`, `get`, and `stat` commands to match this
convention.
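With cobra (which the linkerd CLI uses), the shorthand registers via `BoolVarP`; a minimal sketch:

```go
package main

import (
	"fmt"

	"github.com/spf13/cobra"
)

func main() {
	var allNamespaces bool
	cmd := &cobra.Command{
		Use: "stat",
		Run: func(cmd *cobra.Command, args []string) {
			fmt.Println("all namespaces:", allNamespaces)
		},
	}
	// BoolVarP registers both --all-namespaces and the -A shorthand.
	cmd.Flags().BoolVarP(&allNamespaces, "all-namespaces", "A", false,
		"If present, returns stats across all namespaces")
	cmd.Execute()
}
```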
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify which
versions of Linkerd are currently in use and in what environments it is
being run, information that helps prioritize testing and backports.
Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.
Example check URL:
https://versioncheck.linkerd.io/version.json?
install-time=1562761177&
k8s-version=v1.15.0&
meshed-pods=8&
rps=3&
source=heartbeat&
uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
version=git-b97ee9f7
Fixes #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller. Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.
To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2. This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource. If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can. This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.
Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation). In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Allow custom cluster domain in destination watcher
The change relaxes the constraint that an authority must have a
`svc.cluster.local` suffix, requiring only `svc` as the third DNS label
(see the sketch after this list).
A unit test could be added, though the destination server and endpoint
watcher tests already cover this behaviour.
* Update proto to allow setting custom cluster domain
Update golden templates
* Allow setting custom domain in grpc, web server
* Remove cluster domain flags from web srv and public api
* Set defaultClusterDomain in validateAndBuild if none is set
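A sketch of the relaxed authority check (function name and error text illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// parseK8sServiceAuthority requires only that the third DNS label is "svc";
// everything after it is treated as the (possibly custom) cluster domain.
func parseK8sServiceAuthority(host string) (name, namespace string, err error) {
	labels := strings.Split(host, ".")
	if len(labels) < 3 || labels[2] != "svc" {
		return "", "", fmt.Errorf("not a Kubernetes service authority: %s", host)
	}
	return labels[0], labels[1], nil
}

func main() {
	// Accepted even though the suffix is not "cluster.local".
	fmt.Println(parseK8sServiceAuthority("web.emojivoto.svc.my-domain.example"))
}
```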
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
When waiting for controller pods to be created or become ready, `linkerd check` doesn't offer any hints as to whether there has been an error (such as an ImagePullBackOff).
We add pod status to the output to make this more immediately obvious.
Fixes#2877
Signed-off-by: Alex Leong <alex@buoyant.io>
* Added anti-affinity when HA is configured
* Move check to validate()
* Test output with anti-affinity when ha upgrade
* Add anti-affinity to identity deployment
* Made host anti-affinity the default when HA
* Define affinity template in a separate file
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When installing with some of the flags that persist from install, e.g.
`linkerd install --ha`, and then running `linkerd upgrade config`, a nil
pointer error is thrown.
Fixes #3094
`newCmdUpgradeConfig()` was passing `flags` as nil because
`linkerd upgrade config` doesn't expose any flags for the subcommand,
but it turns out they're still needed down the call stack in
`setFlagsFromInstall` to reuse the flags persisted during install.
I also added a new unit test catching this.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
This PR improves the CLI output for `linkerd edges` to reflect the latest API
changes.
Source and destination namespaces for each edge are now shown by default. The
`MSG` column has been replaced with `Secured` and contains a green checkmark or
the reason for no identity. A new `-o wide` flag shows the identity of client
and server if known.
When running `linkerd stat` it's sometimes not clear what the actual
pod status is.
This commit introduces a method in the `k8s` package that derives the pod
status, based on the [`kubectl` logic](33a3e325f7/pkg/printers/internalversion/printers.go (L558-L640)),
to expose a `STATUS` column for pods. It also changes the stat command
in the `cli` package, adding that column when the resource type is a Pod (see the sketch below).
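A condensed sketch of such a status derivation; the kubectl logic linked above covers many more cases (init containers, deletion, etc.):

```go
package k8s

import corev1 "k8s.io/api/core/v1"

// PodStatus prefers the reason of a waiting or terminated container over the
// bare pod phase, so states like ImagePullBackOff surface in STATUS.
func PodStatus(pod corev1.Pod) string {
	status := string(pod.Status.Phase)
	if pod.Status.Reason != "" {
		status = pod.Status.Reason
	}
	for _, cs := range pod.Status.ContainerStatuses {
		switch {
		case cs.State.Waiting != nil && cs.State.Waiting.Reason != "":
			status = cs.State.Waiting.Reason
		case cs.State.Terminated != nil && cs.State.Terminated.Reason != "":
			status = cs.State.Terminated.Reason
		}
	}
	return status
}
```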
Fixes #1967
Signed-off-by: Jonathan Juares Beber <jonathanbeber@gmail.com>
The existing `linkerd install` error message for existing resources was
shared with `linkerd check`. Given the different contexts, the messaging
made more sense for `linkerd check` than for `linkerd install`.
Modify the error messaging for `linkerd install` to print a bare list
of existing resources, and provide instructions for proceeding.
For example:
```bash
$ linkerd install
Unable to install the Linkerd control plane. It appears that there is an existing installation:
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity
If you are sure you'd like to have a fresh install, remove these resources with:
linkerd install --ignore-cluster | kubectl delete -f -
Otherwise, you can use the --ignore-cluster flag to overwrite the existing global resources.
```
Fixes#3045
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
To give better visibility into the inner workings of the Kubernetes watchers in the destination service, we add some Prometheus metrics; a sketch of the idea follows.
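For illustration, watcher-level metrics of roughly this shape (names hypothetical, not necessarily the ones added):

```go
package watcher

import "github.com/prometheus/client_golang/prometheus"

var (
	// updates counts notifications pushed to subscribers per service.
	updates = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "endpoints_updates",
			Help: "Number of updates sent by the endpoints watcher.",
		},
		[]string{"namespace", "service"},
	)
	// subscribers tracks the current number of open subscriptions.
	subscribers = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "endpoints_subscribers",
			Help: "Current number of endpoints subscribers.",
		},
		[]string{"namespace", "service"},
	)
)

func init() {
	prometheus.MustRegister(updates, subscribers)
}
```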
Signed-off-by: Alex Leong <alex@buoyant.io>
PR #2603 modified the web process to read the UUID from the
`linkerd-config` ConfigMap rather than from a command line flag. The
`linkerd check` command relied on that command line flag to retrieve the
UUID as part of its version check.
Modify `linkerd check` to correctly retrieve the UUID from
`linkerd-config`. Also refactor `linkerd-config` retrieval and parsing
code to be shared between healthcheck, install, and upgrade.
Relates to #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Have `linkerd endpoints` use `Destination.Get`
Fixes #2885
We're refactoring `linkerd endpoints` so it hits the `Destination.Get`
endpoint directly, instead of relying on the Discovery service.
For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.
I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.
Analogously, I added a `destinationServer` struct that mimics
`tapServer`.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Linkerd's CLI flags all match 1:1 with their `config.linkerd.io/*`
annotation counterparts, except `--enable-debug-sidecar`, which
corresponded to `config.linkerd.io/debug`. Additionally, the Linkerd
docs assume this 1:1 mapping.
Rename the `config.linkerd.io/debug` annotation to
`config.linkerd.io/enable-debug-sidecar`.
Relates to https://github.com/linkerd/website/issues/381
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).
We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to. A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.
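The adaptor's core transformation, sketched with local stand-in types; the real ones are the SMI TrafficSplit and the proxy-api's weighted destinations:

```go
package main

import "fmt"

// WeightedDst mirrors the shape of a DstOverrides entry.
type WeightedDst struct {
	Authority string
	Weight    uint32
}

// toDstOverrides turns a TrafficSplit's leaf backends into weighted
// destination overrides on the profile.
func toDstOverrides(backends map[string]uint32, ns, clusterDomain string) []WeightedDst {
	var overrides []WeightedDst
	for svc, weight := range backends {
		overrides = append(overrides, WeightedDst{
			Authority: fmt.Sprintf("%s.%s.svc.%s", svc, ns, clusterDomain),
			Weight:    weight,
		})
	}
	return overrides
}

func main() {
	fmt.Println(toDstOverrides(map[string]uint32{"web-v1": 90, "web-v2": 10}, "emojivoto", "cluster.local"))
}
```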
Signed-off-by: Alex Leong <alex@buoyant.io>
* Introduce new checks to determine the existence of global resources and the
'linkerd-config' config map.
* Update pre-check to check for existence of global resources
This ensures that multiple control planes can't be installed into
different namespaces.
* Update integration test clean-up script to delete psp and crd
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Simplify port-forwarding code
Simplifies establishing a port-forward by moving the common
logic into `PortForward.Init()` (sketched below).
Stemmed from this
[comment](https://github.com/linkerd/linkerd2/pull/2937#discussion_r295078800)
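The shape of the consolidation, sketched with a stub type; the real `PortForward` lives in `pkg/k8s` and does the actual tunneling:

```go
package main

import (
	"fmt"
	"time"
)

// PortForward is a stub standing in for the real type in pkg/k8s.
type PortForward struct {
	readyCh chan struct{}
	errCh   chan error
}

// Init encapsulates the logic callers previously duplicated: start the
// forward, then block until the tunnel is ready or fails.
func (pf *PortForward) Init() error {
	go pf.forward()
	select {
	case <-pf.readyCh:
		return nil
	case err := <-pf.errCh:
		return fmt.Errorf("port-forward failed to initialize: %v", err)
	}
}

func (pf *PortForward) forward() {
	time.Sleep(10 * time.Millisecond) // simulate tunnel establishment
	close(pf.readyCh)
}

func main() {
	pf := &PortForward{readyCh: make(chan struct{}), errCh: make(chan error, 1)}
	if err := pf.Init(); err != nil {
		panic(err)
	}
	fmt.Println("tunnel ready")
}
```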
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>