Depends on https://github.com/linkerd/linkerd2-proxy-init/pull/10Fixes#4276
We add a `--close-wait-timeout` inject flag which configures the proxy-init container to run with `privileged: true` and to set `nf_conntrack_tcp_timeout_close_wait`.
Signed-off-by: Alex Leong <alex@buoyant.io>
When viewing the output of `linkerd stat` for services which do not have a selector (such as services created by the service-mirror, for example) the meshed count column shows the total number which exist, even though the service actually selects no pods at all.
We update the StatSummary implementation to account for services which have no selector.
Additionally, we update the logic of the `--unmeshed` flag. When the `--unmeshed` flag is not set, we typically skip rows for unmeshed resources because those resources would have no stats. This is not appropriate to do when the `--from` flag is also set because in this case, metrics are not collected on the target resource but are instead collected on the client-side. This means that stats can be present, even for unmeshed resources and these resources should still be displayed, even if the `--unmeshed` flag is not set.
Signed-off-by: Alex Leong <alex@buoyant.io>
This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway.
Fix#4411
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR introduces a few changes that were requested after a bit of service mirror reviewing.
- we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in
- we unify the namespace namings so all multicluster resources are installedi n `linkerd-multicluster` on both clusters
- fixed checks to account for changes
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Certain install flags are intended to help with Linkerd development and generally are not useful (and are potentially confusing) to users.
We hide these flags in release (edge or stable) builds of the CLI but show them in all other builds. The list of affected flags is:
* control-plane-version
* proxy-image
* proxy-version
* image-pull-policy
* init-image
* init-image-version
Signed-off-by: Alex Leong <alex@buoyant.io>
When using cli commands that work on namespaced resources in the cluster, the default namespace used by the cli is hardcoded to the default Kubernetes namespace (i.e 'default'). This update will allow cli commands that operate on namespaced resources to automatically infer what the name of the default namespace is, by taking the relevant default from the currently used Kubeconfig context. In short, this allows the omission of the -n flag in commands such as linkerd metrics, when working with resources that belong to a namespace that is set as default in the currently active context.
Validation was done manually by setting the default namespace of the currently used context, as well as through two integration tests that target the tap and get command respectively.
Signed-off-by: Matei David <matei.david.35@gmail.com>
This allows end user flexibility for options such as log format. Rather than bubbling up such possible config options into helm values, extra arguments provides more flexibility.
Add prometheusAlertmanagers value allows configuring a list of statically targetted alertmanager instances.
Use rule configmaps for prometheus rules. They take a list of {name,subPath,configMap} values and mounts them accordingly. Provided that subpaths end with _rules.yml or _rules.yaml they should be loaded by prometheus as per prometheus.yml's rule_files content.
Signed-off-by: Naseem <naseem@transit.app>
Fixes#3807
By setting the LINKERD2_PROXY_DESTINATION_GET_NETWORKS environment variable, we configure the Linkerd proxy to do destination lookups for authorities which are IP addresses in the private network range. This allows us to get destination metadata including identity for HTTP requests which target an IP address in the cluster, Prometheus metrics scrape requests, for example.
This change allowed us to update the "direct edges" test which ensures that the edges command produces correct output for traffic which is addressed directly to a pod IP.
We also re-enabled the "linkerd stat" integration tests which had been disabled while the destination service did not yet support these types of IP queries.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Support Multi-stage install with Add-Ons
* add upgrade tests for add-ons
* add multi stage upgrade unit tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* use downward API to mount labels to the proxy container as a volume
* add namespace as a label to the pod
* add a trace inject test
* add downwardAPi for controlplaneTracing
* add controlPlaneTracing condition to volumeMounts
* update add-ons to have workload-ns
* add workload-ns label to control-plane components
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Some `linkerd stat` test failures were being hidden
`linkerd stat` was doing an early `os.Exit(0)` when no traffic was
found, which avoided `go test` to report any test failure that ended in
that code path.
This was hiding a mismatch in the golden files for HA after the
introduction of the rolling update strategy (#4267), and the failure of
`linkerd stat trafficsplit` not returning results unless `--unmeshed` is
used. For the latter, I added the flag to the tests in order to temporarly pass
them, but the underlying issue remains to be fixed in a separate
PR.
The addition of the `--unmeshed` flag changed the rendering behavior of the
`stat` command so that resources with 0 meshed pods are not displayed by
default.
Rendering is based off the row's `MeshedPodCount` field which is currently not
set by `func trafficSplitResourceQuery`. This change sets that field now so
that in rendering, the trafficsplit resource is rendered in the output.
The reason for this not showing up in testing is addressed by #4272 where the
`stat` command behavior for no traffic is changed.
The following now works without `--unmeshed` flag being passed:
```
❯ bin/linkerd stat -A ts
NAMESPACE NAME APEX LEAF WEIGHT SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
default backend-traffic-split backend-svc backend-svc 500m - - - - -
default backend-traffic-split backend-svc failing-svc 0 - - - - -
```
Upgrade Linkerd's base docker image to use go 1.14.2 in order to stay modern.
The only code change required was to update a test which was checking the error message of a `crypto/x509.CertificateInvalidError`. The error message of this error changed between go versions. We update the test to not check for the specific error string so that this test passes regardless of go version.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#3984
We use the new `/live` admin endpoint in the Linkerd proxy for liveness probes instead of the `/metrics` endpoint. This endpoint returns a much smaller payload.
Signed-off-by: Alex Leong <alex@buoyant.io>
It can be difficult to know which versions of the proxy are running in your cluster, especially when you have pods running at multiple different proxy versions.
We add two pieces of CLI functionality to assist with this:
The `linkerd check --proxy` command will now list all data plane pods which are not up-to-date rather than just printing the first one it encounters:
```
‼ data plane is up-to-date
Some data plane pods are not running the current version:
* default/books-84958fff5-95j75 (git-ca760bdd)
* default/authors-57c6dc9b47-djldq (git-ca760bdd)
* default/traffic-85f58ccb66-vxr49 (git-ca760bdd)
* default/release-name-smi-metrics-899c68958-5ctpz (git-ca760bdd)
* default/webapp-6975dc796f-2ngh4 (git-ca760bdd)
* default/webapp-6975dc796f-z4bc4 (git-ca760bdd)
* emojivoto/voting-54ffc5787d-wj6cp (git-ca760bdd)
* emojivoto/vote-bot-7b54d6999b-57srw (git-ca760bdd)
* emojivoto/emoji-5cb99f85d8-5bhvm (git-ca760bdd)
* emojivoto/web-7988674b8b-zfvvm (git-ca760bdd)
* default/webapp-6975dc796f-d2fbc (git-ca760bdd)
* default/curl (git-7f6bbc73)
see https://linkerd.io/checks/#l5d-data-plane-version for hints
```
The `linkerd version` command now supports a `--proxy` flag which will list all proxy versions running in the cluster and the number of pods running each version:
```
linkerd version --proxy
Client version: dev-7b9d475f-alex
Server version: edge-20.4.1
Proxy versions:
edge-20.4.1 (10 pods)
git-ca760bdd (11 pods)
git-7f6bbc73 (1 pods)
```
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#4257
This was introduced in 2.7.0. When performing an upgrade on an
installation having used `--skip-outbound-ports` or
`--skip-inbound-ports`, the upgrade picks those values from the
ConfigMap, parses them wrongly, and then when proxy-init picks them the
iptables commands fail.
I've also improved one of the upgrade unit tests to include these flags,
and confirmed it failed before this fix.
## Motivation
Introduces an `unmeshed` flag to the `stat` command so that users can opt-in
to viewing unmeshed resources in the `stat` output.
This changes the existing behavior of the `stat` command such that unmeshed
resources no longer render by default in the output.
Before:
```
❯ bin/linkerd stat -A deploy
NAMESPACE NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN
kube-system coredns 0/1 - - - - - -
kube-system local-path-provisioner 0/1 - - - - - -
kube-system metrics-server 0/1 - - - - - -
kube-system traefik 0/1 - - - - - -
linkerd linkerd-controller 1/1 100.00% 0.3rps 1ms 2ms 2ms 2
linkerd linkerd-destination 1/1 100.00% 0.3rps 1ms 1ms 1ms 11
...
```
After:
```
❯ bin/linkerd stat -A deploy
NAMESPACE NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN
linkerd linkerd-controller 1/1 100.00% 0.3rps 1ms 1ms 1ms 2
linkerd linkerd-destination 1/1 100.00% 0.3rps 1ms 2ms 2ms 13
...
```
Closes#3871
## Solution
Using the meshed pod count in the stat response, resources with a count of `0`
are not rendered in the table.
The `-l`/`--selector` flag do not work for all resource types, so applying a
default label does not solve this problem. While it works for pods, it does
not work for deployments as the `linkerd.io/inject` is an annotation that
cannot be selected on.
I did not think a shorthand flag was necessary for this. I do not think users
will commonly pass this flag to the `stat` command, and I didn't think adding
an additional short flag such as `u` was necessary.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This change adds a `--smi-metrics` install flag which controls if the SMI-metrics controller and associated RBAC and APIService resources are installed. The flag defaults to false and is hidden.
We plan to remove this flag or default it to true if and when the SMI-Metrics integration graduates from experimental.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Bug in `linkerd uninstall` when attempting to delete PSP
We were using a wrong apiVersion for PSP in `linkerd uninstall`'s
output, which avoids removing that resource:
```
$ linkerd uninstall | kubectl delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller"
deleted
clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination"
deleted
...
mutatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-proxy-injector-webhook-config" deleted
validatingwebhookconfiguration.admissionregistration.k8s.io
"linkerd-sp-validator-webhook-config" deleted
namespace "linkerd" deleted
error: unable to recognize "uninstall.yml": no matches for kind
"PodSecurityPolicy" in version "extensions/v1beta1"
$ kubectl get psp -oname
podsecuritypolicy.policy/linkerd-linkerd-control-plane
```
I've also replaced the uninstall integration test with a new separate
suite that performs the installation, waits for it to be ready,
uninstalls, and then confirms `linkerd check --pre` returns as expected.
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0. Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.
This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4. This keeps all of our transitive dependencies synchronized on one version of client-go.
This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code. I took this opportunity to update our code generation script to properly use the version of code-generater from `go.mod` rather than a hardcoded SHA.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Handle automountServiceAccountToken
Return error during inject if pod spec has `automountServiceAccountToken: false`
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
The linkerd-smi-metrics ServiceAccount wasn't hooked into linkerd's PSP
resource, which resulted in the linkerd-smi-metrics ReplicaSet failing
to spawn pods:
```
Error creating: pods "linkerd-smi-metrics-574f57ffd4-" is forbidden:
unable to validate against any pod security policy: []
```
When injecting a Cronjob with no
`spec.jobTemplate.spec.template.metadata` we were getting the following
error:
```
Error transforming resources: jsonpatch add operation does not apply:
doc is missing path:
"/spec/jobTemplate/spec/template/metadata/annotations"
```
This only happens to Cronjobs because other workloads force having at
least a label there that is used in `spec.selector` (at least as of v1
workloads).
With this fix, if no metadata is detected, then we add it in the json patch when
injecting, prior to adding the injection annotation.
I've added a couple of new unit tests, one that verifies that this
doesn't remove metadata contents in Cronjobs that do have that metadata,
and another one that tests injection in Cronjobs that don't have
metadata (which I verified it failed prior to this fix).
This version contains an fix for a bug that was rejecting all requests on clusters configured with an empty list of allowed client names. Because smi-metrics is an apiservice, this was also preventing namespaces from terminating.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Upgrade golangci-lint to v1.23.8
This should help with some timeouts we're seeing in CI.
I fixed some new warnings found in `inject.go` and `uninject.go`.
Also we now have to explicitly disable linting `/controller/gen`.
The linter was also complaining that in `/pkg/k8s/fake.go` the
`spClient.Interface` and `tsclient.Interface` returned in the function
`newFakeClientSetsFromManifests()` aren't used, but I opted to ignore
that to leave them available for future tests.
* Bump proxy-init to v1.3.2
Bumped `proxy-init` version to v1.3.2, fixing an issue with `go.mod`
(linkerd/linkerd2-proxy-init#9).
This is a non-user-facing fix.
## Motivation
I noticed the Go language server stopped working in VS Code and narrowed it
down to `go build ./...` failing with the following:
```
❯ go build ./...
go: github.com/linkerd/stern@v0.0.0-20190907020106-201e8ccdff9c: parsing go.mod: go.mod:3: usage: go 1.23
```
This change updates `linkerd/stern` version with changes made in
linkerd/stern#3 to fix this issue.
This does not depend on #4170, but it is also needed in order to completely
fix `go build ./...`
We add the `linkerd alpha clients` command which displays client side metrics from each of a resource's clients. This allows you to see who all of your clients are and see what your resource's metrics look like from your clients' point of view. Since these metrics are measured on the client-side, they include network latency.
```
> linkerd alpha clients deploy/web -n emojivoto
FROM TO SUCCESS RPS LATENCY_P50 LATENCY_P90 LATENCY_P99
vote-bot.emojivoto web 97.50% 2.0rps 4ms 5ms 5ms
```
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR introduces the `linkerd alpha stat` command which will eventually replace the `linkerd stat` command. This command functions in a similar way, but with slightly different arguments and is implemented using the smi-metrics API. This means that access to metrics can be controlled with RBAC.
See the `linkerd alpha stat` help text for full details, or try one of these commands:
* `linkerd alpha stat -n emojivoto deploy/web`
* `linkerd alpha stat -n emojivoto deploy`
* `linkerd alpha stat -n emojivoto deploy/web --to deploy/emoji`
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Adds the SMI metrics API to the Linkerd install flow. This installs the SMI metrics controller deployment, the SMI metrics ApiService object, and supporting RBAC, and config resources.
This is the first step toward having Linkerd consume the SMI metrics API in the CLI and web dashboard.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Moves Common templates needed to partials
As add-ons re-use the partials helm chart, all the templates needed by multiple charts should be present in partials
This commit also updates the helm tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* add tracing add-on helm chart
Tracing sub-chart includes open-census and jaeger components as a sub-chart which can be enabled as needed
* Updated Install path to also install add-ons
This includes new interface for add-ons to implement, with example tracing implementation
* Updates Linkerd install path to also install add-ons
Changes include:
- Adds an optional Linkerd Values configmap which stores add-on configuration when add-ons are present.
- Updates Linkerd install path to check for add-ons and render their sub-charts.
- Adds a install Option called config, which is used to pass confiugration for add-ons.
- Uses a fork of mergo, to over-write default Values with the Values struct generated from config.
* Updates the upgrade path about add-ons.
Upgrade path now checks for the linkerd-values cm, and overwrites the default values with it, if present.
It then checks the config option, for any further overwrites
* Refactor linkerd-values and re-update tests
also adds relevant nil checks
* Refactor code to fix linting issues
* Fixes an error with linkerd-config global values
Also refactors the linkerd-values cm to work the same with helm
* Fix a nil pointer issue for tests
* Updated Tracing add-on chart meta-data
Also introduced a defaultGetFiles method for add-ons
* Add add-on/charts to gitignore
* refactor gitignore for chart deps
* Moves sub-charts to /charts directly
* Refactor linkerd values cm
* Add comment in linkerd-values
* remove extra controlplanetracing flag
* Support Stages deployment for add-ons along with tests
* linting fix
* update tracing rbac
* Removes the need for add-on Interface
- Uses helm loading capabiltiies to get info about add-ons
- Uses reflection to not have to unnecessarily add checks for each add-on type
* disable tracing flag
* Remove dep on forked mergo
- Re-use merge from helm
* Re-use helm's merge
* Override the chartDir path during tests
* add error check
* Updated the dependency iteration code
Currently, the charts directory, will not have the deps in the repo. So, Code is updated to read the dependencies from requirements.yaml
and use that info to read templates from the relevant add-ons directory.
* Hard Code add-ons name
* Remove struct details for add-ons
- As we don't use fields of a add-on struct, we don't have them to be typed. Instead we can just use the `enabled` flag using reflection
- Users can just use map[string]interface{} as the add-on type.
* update unit tests
* linting fix
* Rename flag to addon-config
* Use Chart loading logic
- This code uses chart loading to read the files and keep in a vfs.
- Once we have those files read we will then use them for generation of sub-charts.
* Go fmt fix
* Update the linkerd-values cm to use second level field
* Add relevant unit tests for mergeRaw
* linting fix
* Move addon tests to a new file
* Fix golden files
* remove addon install unit test
* Refactor sub-chart load logic
* Add install tracing unit test
* golden file update for tracing install
* Update golden files to reflect another pr changes
* Move addon-config flag to recordFlagSet
* add relevant tracing enabled checks
* linting fix
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Update control-plane-namespace label
Upgrade command ignores changes to the namespace object
Add linkerd.io/control-plane-ns=linkerd label to the control-plane namespace
Fixes#3958
* Add controlPlaneNamespace label to namespace.yaml
* Modify tests for updated controlPlaneNamespace label
* Fix faulty values.yaml value
* Localize reference for controlPlaneNamespace label in kubernetes_helper.go
Signed-off-by: Supratik Das <rick.das08@gmail.com>
* feat: added prometheus Registry Option for install command
* chore: draft commit
* Draft for custom prometheus image
* Support for custom prometheus image
This PR adds support to override the default prometheus image name and use custom image names in private repositories
* Added default Prometheus Image from values.yaml
The default can be overridden by the argument given in installOptions
* chore: fixed failing check
* Fixed fialing check
* Updated the tests as per the new flag
* Air-gapped installation for prometheus-image
* Air Gapped installation for Prometheus Image
* Added regex for prometheus repository/image cli option
Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
* CLI command to fetch control plane metrics
Fixes#3116
* Add GetResonse method to return http GET response
* Implemented timeouts using waitgroups
* Refactor metrics command by extracting common code to metrics_diagnostics_util
* Refactor diagnostics to remove code duplication
* Update portforward_test for NewContainerMetricsForward function
* Lint code
* Incorporate Alex's suggestions
* Lint code
* fix minor errors
* Add unit test for getAllContainersWithPort
* Update metrics and diagnostics to store results in a buffer and print once
* Incorporate Ivan's suggestions
* consistent error handling inside diagnostics
* add coloring for the output
* spawn goroutines for each pod instead of each container
* switch back to unbuffered channel
* remove coloring in the output
* Add a long description of the command
Signed-off-by: Saurav Tiwary <srv.twry@gmail.com>
* Improve kubectl apply format by removing misplaced message
Fixes#2956
Also separate stderr messages with a new line
Signed-off-by: Supratik Das <rick.das08@gmail.com>
* cli: handle `linkerd metrics` port-forward gracefully
- add return for routine in func `Init()` in case of error
- add return from func `getMetrics()` if error from `portforward.Init()`
* Remove select block at pkg/k8s/portforward.go
- It is now the caller's responsibility to call pf.Stop()
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
* Table obtained from linkerd top is not scrollable.
Added scroll functionality for the table.
Fixes#2558
Signed-off-by: Kohsheen Tiku <kohsheen.t@gmail.com>
Update identity controller to make issuer certificates diagnosable if
cert validity is causing error
- Add expiry time in identity log message
- Add current time in identity log message
- Emit k8s event with appropriate message
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
In light of the breaking changes we are introducing to the Helm chart and the convoluted upgrade process (see linkerd/website#647) an integration test can be quite helpful. This simply installs latest stable through helm install and then upgrades to the current head of the branch.
Signed-off-by: Zahari Dichev zaharidichev@gmail.com
There was a problem that caused helm install to not reflect the proper list of ignored inbound and outbound ports. Namely if you supply just one port, that would not get reflected.
To reproduce do a:
```
helm install \
--name=linkerd2 \
--set-file global.identityTrustAnchorsPEM=ca.crt \
--set-file identity.issuer.tls.crtPEM=issuer.crt \
--set-file identity.issuer.tls.keyPEM=issuer.key \
--set identity.issuer.crtExpiry=2021-01-14T14:21:43Z \
--set-string global.proxyInit.ignoreInboundPorts="6666" \
linkerd-edge/linkerd2
```
Check your config:
```bash
$ kubectl get configmap -n linkerd -oyaml | grep ignoreInboundPort
"ignoreInboundPorts":[],
```
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
## edge-20.1.3
* CLI
* Introduced `linkerd check --pre --linkerd-cni-enabled`, used when the CNI
plugin is used, to check it has been properly installed before proceeding
with the control plane installation
* Added support for the `--as-group` flag so that users can impersonate
groups for Kubernetes operations (thanks @mayankshah160!)
* Controller
* Fixed an issue where an override of the Docker registry was not being
applied to debug containers (thanks @javaducky!)
* Added check for the Subject Alternate Name attributes to the API server
when access restrictions have been enabled (thanks @javaducky!)
* Added support for arbitrary pod labels so that users can leverage the
Linkerd provided Prometheus instance to scrape for their own labels
(thanks @daxmc99!)
* Fixed an issue with CNI config parsing
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This allows for users of Linkerd to leverage the Prometheus instance
deployed by the mesh for their metric needs. With support for pod labels
outside of the Linkerd metrics users are able to scrape metrics
based upon their own labels.
Signed-off-by: Dax McDonald <dax@rancher.com>
**Subject**
Fixes bug where override of Docker registry was not being applied to debug containers (#3851)
**Problem**
Overrides for Docker registry are not being applied to debug containers and provide no means to correct the image.
**Solution**
This update expands the `data.proxy` configuration section within the Linkerd `ConfigMap` to maintain the overridden image name for debug containers at _install_-time similar to handling of the `proxy` and `proxyInit` images.
This change also enables the further override option of the registry for debug containers at _inject_-time given utilization of the `--registry` CLI option.
**Validation**
Several new unit tests have been created to confirm functionality. In addition, the following workflows were run through:
### Standard Workflow with Custom Registry
This workflow installs Linkerd control plane based upon a custom registry, then injecting the debug sidecar into a service.
* Start with a k8s instance having no Linkerd installation
* Build all images locally using `bin/docker-build`
* Create custom tags (using same version) for generated images, e.g. `docker tag gcr.io/linkerd-io/debug:git-a4ebecb6 javaducky.com/linkerd-io/debug:git-a4ebecb6`
* Install Linkerd with registry override `bin/linkerd install --registry=javaducky.com/linkerd-io | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap now contains the debug image name, pull policy, and version within the `data.proxy` section
* Request injection of the debug image into an available container. I used the Emojivoto voting service as described in https://linkerd.io/2/tasks/using-the-debug-container/ as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Once the deployment creates a new pod for the service, inspection should show that the container now includes the "linkerd-debug" container name based on the applicable override image seen previously within the ConfigMap
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.
### Overriding the Custom Registry Override at Injection
This builds upon the “Standard Workflow with Custom Registry” by overriding the Docker registry utilized for the debug container at the time of injection.
* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=gcr.io/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry. Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image. Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.
### Standard Workflow with Default Registry
This workflow is the typical workflow which utilizes the standard Linkerd image registry.
* Uninstall the Linkerd control plane using `bin/linkerd install --ignore-cluster | kubectl delete -f -` as described at https://linkerd.io/2/tasks/uninstall/
* Clean the Emojivoto environment using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl delete -f -` then reinstall using `curl -sL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -`
* Perform standard Linkerd installation as `bin/linkerd install | kubectl apply -f -`
* Once Linkerd has been fully initialized, you should be able to confirm that the `linkerd-config` ConfigMap references the default debug image of `gcr.io/linkerd-io/debug` within the `data.proxy` section
* Request injection of the debug image into an available container as `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar - | kubectl apply -f -`
* Debugging can also be verified by viewing debug container logs as `kubectl -n emojivoto logs deploy/voting linkerd-debug -f`
* Modifying the `config.linkerd.io/enable-debug-sidecar` annotation, setting to “false”, should show that the pod will be recreated no longer running the debug container.
### Overriding the Default Registry at Injection
This workflow builds upon the “Standard Workflow with Default Registry” by overriding the Docker registry utilized for the debug container at the time of injection.
* “Clean” the Emojivoto voting service by removing any Linkerd annotations from the deployment
* Request injection similar to before, except provide the `--registry` option as in `kubectl -n emojivoto get deploy/voting -o yaml | bin/linkerd inject --enable-debug-sidecar --registry=javaducky.com/linkerd-io - | kubectl apply -f -`
* Inspection of the deployment config should now show the override annotation for `config.linkerd.io/debug-image` having the debug container from the new registry. Viewing the running pod should show that the `linkerd-debug` container was injected and running the correct image. Of note, the proxy and proxy-init images are still running the “original” override images.
* As before, modifying the `config.linkerd.io/enable-debug-sidecar` annotation setting to “false”, should show that the pod will be recreated no longer running the debug container.
Fixes issue #3851
Signed-off-by: Paul Balogh javaducky@gmail.com
As part of the effort to remove the "experimental" label from the CNI plugin, this PR introduces cni checks to `linkerd check`
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Fixes
- https://github.com/linkerd/linkerd2/issues/2962
- https://github.com/linkerd/linkerd2/issues/2545
### Problem
Field omissions for workload objects are not respected while marshaling to JSON.
### Solution
After digging a bit into the code, I came to realize that while marshaling, workload objects have empty structs as values for various fields which would rather be omitted. As of now, the standard library`encoding/json` does not support zero values of structs with the `omitemty` tag. The relevant issue can be found [here](https://github.com/golang/go/issues/11939). To tackle this problem, the object declaration should have _pointer-to-struct_ as a field type instead of _struct_ itself. However, this approach would be out of scope as the workload object declaration is handled by the k8s library.
I was able to find a drop-in replacement for the `encoding/json` library which supports zero value of structs with the `omitempty` tag. It can be found [here](https://github.com/clarketm/json). I have made use of this library to implement a simple filter like functionality to remove empty tags once a YAML with empty tags is generated, hence leaving the previously existing methods unaffected
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
Adds a check to ensure kube-system namespace has `config.linkerd.io/admission-webhooks:disabled`
FIxes#3721
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Fixes https://github.com/linkerd/linkerd2/issues/3878
If the `--registry` flag is provided to Linkerd without the `--proxy-image` or `--init-image` flags, the `--registry` flag is ignored and not applied to the existing values for the proxy or init images pulled from the configmap.
We now override the registry with the value from the `--registry` flag regardless of which other flags are provided.
Signed-off-by: Alex Leong <alex@buoyant.io>
Due to wrong snake casing, lifetime setting lifetime issuance was not reflected when installing through helm. This commit solved that problem
Signed-off-by: Zahari Dichev zaharidichev@gmail.com
* The `linkerd-cni` chart should set proper annotations/labels for the namespace
When installing through Helm, the `linkerd-cni` chart will (by default)
install itself under the same namespace ("linkerd") that the `linkerd` chart will be
installed aftewards. So it needs to set up the proper annotations and labels.
* Fix Helm install when disabling init containers
To install linkerd using Helm after having installed linkerd's CNI plugin, one needs to `--set noInitContainer=true`.
But to determine whether to use init containers or not, we weren't
evaluating that, but instead `Values.proxyInit`, which is indeed null
when installing through the CLI but not when installing with Helm. So
init containers were being set despite having passed `--set
noInitContainers=true`.
* update flags to smaller
* add tests for the same
* fix control plane trace flag
* add tests for controlplane tracing install
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)
This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.
* Bump versions for proxy-init to v1.3.0
Signed-off-by: Paul Balogh <javaducky@gmail.com>
Fixes#3444Fixes#3443
## Background and Behavior
This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter. It returns the stream of endpoints, just as if `Get` had been called with the service's authority. This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections. When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.
Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.
Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.
## Implementation
We do this by creating a `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip. `IPWatcher` maintains a mapping up clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.
Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
## Testing
This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:
```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```
Then lookups can be issued using the destination client:
```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```
Service cluster ips and pod ips can be used as the `path` argument.
Signed-off-by: Alex Leong <alex@buoyant.io>
The Kubernetes docs recommend a common set of labels for resources:
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/#labels
Add the following 3 labels to all control-plane workloads:
```
app.kubernetes.io/name: controller # or destination, etc
app.kubernetes.io/part-of: Linkerd
app.kubernetes.io/version: edge-X.Y.Z
```
Fixes#3816
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Inject preStop hook into the proxy sidecar container to stop it last
This commit adds support for a Graceful Shutdown technique that is used
by some Kubernetes administrators while the more perspective
configuration is being discussed in
https://github.com/kubernetes/kubernetes/issues/65502
The problem is that RollingUpdate strategy does not guarantee that all
traffic will be sent to a new pod _before_ the previous pod is removed.
Kubernetes inside is an event-driven system and when a pod is being
terminating, several processes can receive the event simultaneously.
And if an Ingress Controller gets the event too late or processes it
slower than Kubernetes removes the pod from its Service, users requests
will continue flowing into the black whole.
According [to the documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods)
> 1. If one of the Pod’s containers has defined a `preStop` hook,
> it is invoked inside of the container. If the `preStop` hook is still
> running after the grace period expires, step 2 is then invoked with
> a small (2 second) extended grace period.
>
> 2. The container is sent the `TERM` signal. Note that not all
> containers in the Pod will receive the `TERM` signal at the same time
> and may each require a preStop hook if the order in which
> they shut down matters.
This commit adds support for the `preStop` hook that can be configured
in three forms:
1. As command line argument `--wait-before-exit-seconds` for
`linkerd inject` command.
2. As `linkerd2` Helm chart value `Proxy.WaitBeforeExitSeconds`.
2. As `config.alpha.linkerd.io/wait-before-exit-seconds` annotation.
If configured, it will add the following preHook to the proxy container
definition:
```yaml
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep {{.Values.Proxy.WaitBeforeExitSeconds}}
```
To achieve max benefit from the option, the main container should have
its own `preStop` hook with the `sleep` command inside which has
a smaller period than is set for the proxy sidecar. And none of them
must be bigger than `terminationGracePeriodSeconds` configured for the
entire pod.
An example of a rendered Kubernetes resource where
`.Values.Proxy.WaitBeforeExitSeconds` is equal to `40`:
```yaml
# application container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 20
# linkerd-proxy container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 40
terminationGracePeriodSeconds: 160 # for entire pod
```
Fixes#3747
Signed-off-by: Eugene Glotov <kivagant@gmail.com>
* Added checks for cert correctness
* Add warning checks for approaching expiration
* Add unit tests
* Improve unit tests
* Address comments
* Address more comments
* Prevent upgrade from breaking proxies when issuer cert is overwritten (#3821)
* Address more comments
* Add gate to upgrade cmd that checks that all proxies roots work with the identitiy issuer that we are updating to
* Address comments
* Enable use of upgarde to modify both roots and issuer at the same time
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource.
Closes#3614Closes#3630Closes#3584Closes#3585
Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
* Pods with non empty securitycontext capabilities fail to be injected
Followup to #3744
The `_capabilities.tpl` template got its variables scope changed in
`Values.Proxy`, which caused inject to fail when security context
capabilities were detected.
Discovered when testing injecting the nginx ingress controller.
* Add identity-issuer-certificate-file and identity-issuer-key-file to upgrade command
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Implement logic to use identity-trust-anchors-file flag to update the anchors
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Address remarks
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* No need for `processYAML()` in `install`
Since `install` uses helm to do its proxy injection, there's no need to
call `processYAML`. This also fixes an issue discovered in #3687 where
we started supporting injection of cronjobs, and even though `linkerd`'s
namespace is flagged to skip automatic injection it was being injected.
This replaces #3773 as it's a much more simpler approach.
* Removed calico logutils dependency, incompatible with go 1.13
Fixes#1153
Removed dependency on
`github.com/projectcalico/libcalico-go/lib/logutils` because it has
problems with go modules, as described in
projectcalico/libcalico-go#1153
Not a big deal since it was only used for modifying the plugin's log
format.
* Traffic split integration test
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Address comments
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Display placeholder when there is no basic stats data
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Replaced `uuid` with `uid` from linkerd-config resource
Fixes#3621
Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.
I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.
Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` Config.
I also added an integration test to verify the stability of the uid.
Fixes#3566
As explained in #3566, as of go 1.13 there's a strict check that ensures a dependency's timestamp matches it's sha (as declared in go.mod). Our smi-sdk dependency has a problem with that that got resolved later on, but more work would be required to upgrade that dependency. In the meantime a quick pair of replace statements at the bottom of go.mod fix the issue.
`linkerd check` can now be run from the dashboard in the `/controlplane` view.
Once the check results are received, they are displayed in a modal in a similar
style to the CLI output.
Closes#3613
* Add support for uninject command to uninject namespace configs
* Add relevant unit tests in cli/cmd/uninject_test.go
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
* rework annotations doc generation from godoc parsing to map[string]string and get rid of unused yaml tags
* move annotations doc function from pkg/k8s to cli/cmd
Signed-off-by: StupidScience <tonysignal@gmail.com>
* Add cmd to inject debug sidecar for l5d components only
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Revert "Add cmd to inject debug sidecar for l5d components only"
This reverts commit 50b8b3577e.
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Stop uninjecting metadata from control plane components
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Ensure inject can be run on control plane components only if --manual is present
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* DNS rebinding protection for the dashboard
Fixes#3083 and replacement for #3629
This adds a new parameter to the `linkerd-web` container `enforcedHost`
that establishes the regexp that the Host header must enforce, otherwise
it returns an error.
This parameter will be hard-coded for now, in `linkerd-web`'s deployment
yaml.
Note this also protects the dashboard because that's proxied from
`linkerd-web`.
Also note this means the usage of `linkerd dashboard --address` will
require the user to change that parameter in the deployment yaml (or
have Kustomize do it).
How to test:
- Run `linkerd dashboard`
- Go to http://rebind.it:8080/manager.html and change the target port to
50750
- Click on “Start Attack” and wait for a minute.
- The response from the dashboard will be returned, showing an 'Invalid
Host header' message returned by the dashboard. If the attack would have
succeeded then the dashboard's html would be shown instead.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Add inject support for namespaces(Fix#3255)
* Add relevant unit tests (including overridden annotations)
Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
* If tap source IP matches many running pods then only show the IP
When an unmeshed source ip matched more than one running pod, tap was
showing the names for all those pods, even though the didn't necessary
originate the connection. This could be reproduced when using pod
network add-on such as Calico.
With this change, if a node matches, return it, otherwise we proceed to look for a matching pod. If exactly one running pod matches we return it. Otherwise we return just the IP.
Fixes#3103
* Add support for --identity-issuer-mode flag to install cmd
* Change flag to be a bool
* Read correct data form identity when external issuer is used
* Add ability for identity service to dynamically reload certs
* Fix failing tests
* Minor refactor
* Load trust anchors from identity issuer secret
* Make identity service actually watch for issuer certs updates
* Add some testing around cmd line identity options validation
* Add tests ensuring that identity service loads issuer
* Take into account external-issuer flag during upgrade + tests
* Fix failing upgrade test
* Address initial review feedback
* Address further review feedback on cli and helm
* Do not persist --identity-external-issuer
* Some improvements to identitiy service
* Bring back persistane of external issuer flag
* Address more feedback
* Update dockerfiles shas
* Publishing k8s events on issuer certs rotation
* Ensure --ignore-cluster+external issuer is not supported
* Update go-deps shas
* Transition to identity issuer scheme based configuration
* Use k8s consts for secret file names
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
The `linkerd upgrade --from-manifests` command supports reading the
manifest output via `linkerd install`. PR #3167 introduced a tap
APIService object into `linkerd install`, but the manifest-reading code
in fake.go was never updated to support this new object kind.
Update the fake clientset code to support APIService objects.
Fixes#3559
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd install` `--ignore-cluster` and `--skip-checks` flags
enable generating install manifests without a connection to a k8s
cluster. Unfortunately these flags were only checked after attempted
connections to a k8s cluster were made. This satisfied the use case of
`linkerd install` "ignoring" the state of the cluster, but for
environments not connected to a cluster, the user would have to wait for
30s timeouts before getting the manifests.
Modify `linkerd install` and its subcommands to pre-emptively check for
`--ignore-cluster` and `--skip-checks`. This decreases `linkerd install
--ignore-cluster` from ~30s to ~1s, and `linkerd install control-plane
--ignore-cluster --skip-checks` from ~60s to ~1s.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
CI currently enforcing formatting rules by using the fmt linter of golang-ci-lint which is invoked from the bin/lint script. However it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting. This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint depending on which formatter you use and which specific revision of that formatter you use.
In this change we stop using golang-ci-lint for format checking. We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files. This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project. We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.
Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor director as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR aims at preventing `--cluster-domain` from being changed during `linkerd upgrade`. I am not sure this is all that is necessary, but it can probably be at least a good start. 🙂Closes#3454.
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
* Re-add the destination container to the controller spec
This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.
On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.
* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure
Signed-off-by: Ivan Sim <ivan@buoyant.io>
### Motivation
In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change should have no user facing changes as it just prepares the events
for JSON output in linkerd/linkerd2#3390
### Solution
The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and trailers
field for `TapEvent_Http_ResponseEnd_`.
These values are set by reading the corresponding fields off of the proxy's tap
events.
The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33
Closes#3262
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
This reverts commit edd3b1f6d4.
This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy. We will reintroduce this change once the config plan is straightened out.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#278
Add `linkerd install|upgrade --disable-heartbeat` flag, and have
`linkerd check` check for the heartbeat's SA only if it's enabled.
Also added those flags into the `linkerd upgrade -h` examples.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
When running linkerd in HA mode, a cluster can be broken by bringing down the proxy-injector.
Add a label to MWC namespace selctor that skips any namespace.
Fixes#3346
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>