linkerd2

Commit Graph

Author	SHA1	Message	Date
Paul Balogh	62d54838b8	Ensure and update debug image during upgrade (#4823 ) Some installations upgrading from versions prior to 2.7.x may have missing debug image name and version. This fix ensures that the default values are in place for this scenario and additionally upgrades the version of debug image with the control plane version. Signed-off-by: Paul Balogh <javaducky@gmail.com>	2020-08-05 11:39:29 -07:00
Ali Ariff	61d7dedd98	Build ARM docker images (#4794 ) Build ARM docker images in the release workflow. # Changes: - Add a new env key `DOCKER_MULTIARCH` and `DOCKER_PUSH`. When set, it will build multi-arch images and push them to the registry. See https://github.com/docker/buildx/issues/59 for why it must be pushed to the registry. - Usage of `crazy-max/ghaction-docker-buildx ` is necessary as it already configured with the ability to perform cross-compilation (using QEMU) so we can just use it, instead of manually set up it. - Usage of `buildx` now make default global arguments. (See: https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope) # Follow-up: - Releasing the CLI binary file in ARM architecture. The docker images resulting from these changes already build in the ARM arch. Still, we need to make another adjustment like how to retrieve those binaries and to name it correctly as part of Github Release artifacts. Signed-off-by: Ali Ariff <ali.ariff12@gmail.com>	2020-08-05 11:14:01 -07:00
Alex Leong	381f237f69	Add multicluster unlink command (#4802 ) Fixes #4707 In order to remove a multicluster link, we add a `linkerd multicluster unlink` command which produces the yaml necessary to delete all of the resources associated with a `linkerd multicluster link`. These are: * the link resource * the service mirror controller deployment * the service mirror controller's RBAC * the probe gateway mirror for this link * all mirror services for this link This command follows the same pattern as the `linkerd uninstall` command in that its output is expected to be piped to `kubectl delete`. The typical usage of this command is: ``` linkerd --context=source multicluster unlink --cluster-name=foo \| kubectl --context=source delete -f - ``` This change also fixes the shutdown lifecycle of the service mirror controller by properly having it listen for the shutdown signal and exit its main loop. A few alternative designs were considered: I investigated using owner references as suggested [here](https://github.com/linkerd/linkerd2/issues/4707#issuecomment-653494591) but it turns out that owner references must refer to resources in the same namespace (or to cluster scoped resources). This was not feasible here because a service mirror controller can create mirror services in many different namespaces. I also considered having the service mirror controller delete the mirror services that it created during its own shutdown. However, this could lead to scenarios where the controller is killed before it finishes deleting the services that it created. It seemed more reliable to have all the deletions happen from `kubectl delete`. Since this is the case, we avoid having the service mirror controller delete mirror services, even when the link is deleted, to avoid the race condition where the controller and CLI both attempt to delete the same mirror services and one of them fails with a potentially alarming error message. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-08-04 16:21:59 -07:00
Rajat Jindal	cbcc305b78	Fixes unit tests when default kubeconfig namespace is not "default" (#4825 ) Use defaultNamespace from kubeconfig context Fixes #4779 Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>	2020-08-03 11:27:02 -07:00
cpretzer	670caaf8ff	Update to proxy-init v1.3.4 (#4815 ) Signed-off-by: Charles Pretzer <charles@buoyant.io>	2020-07-30 15:58:58 -05:00
Alex Leong	a1543b33e3	Add support for service-mirror selectors (#4795 ) * Add selector support Signed-off-by: Alex Leong <alex@buoyant.io> * Removed unused labels Signed-off-by: Alex Leong <alex@buoyant.io>	2020-07-30 10:07:14 -07:00
Tarun Pothulapati	6307868f3d	bump prometheus to the latest v2.19.3 (#4811 ) * bump prometheus to the latest v2.19.3 latest prometheus version shows a lot of decrease in the memory usage and other benefits Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-07-30 12:06:59 -05:00
Alexander Berger	4ffea3ba08	CNI add support for priorityClassName (#4742 ) * CNI add support for priorityClassName As requested in #2981 one should be able to optionally define a priorityClassName for the linkerd2 pods. With this commit support for priorityClassName is added to the CNI plugin helm chart as well as to the cli command for installing the CNI plugin. Also added an `installNamespace` Helm option for the CNI installation. Implements part of #2981. Signed-off-by: alex.berger@nexiot.ch <alex.berger@nexiot.ch>	2020-07-30 10:43:06 -05:00
Naseem	96f662dfac	replace linkerd.io/helm-release-version annotation (#4645 ) Replace mechanism to automatically roll deployments when secret content changes. As per https://helm.sh/docs/howto/charts_tips_and_tricks/\#automatically-roll-deployments this is a recommended approach to dealing with config changes and rolling deployments upon them. Signed-off-by: Naseem <naseem@transit.app>	2020-07-28 21:48:17 -05:00
Tarun Pothulapati	c68ab23ab2	Add global.prometheusUrl field for byop use-case (#4390 ) This pr adds `globa.prometheusUrl` field which will be used to configure publlic-api, hearbeat, grafana, etc (i,e query path) to use a external Prometheus.	2020-07-28 12:26:34 +05:30
Matt Miller	fc33b9b9aa	support overriding inbound and outbound connect timeouts. (#4759 ) * support overriding inbound and outbound connect timeouts. * add validation on user provided TCP connect timeouts * convert valid time values into ms Signed-off-by: Matt Miller <mamiller@rosettastone.com>	2020-07-27 13:56:21 -07:00
memory	d2f547d812	Add sidecar container support for linkerd-prometheus helm chart (#4761 ) * Add sidecar container support for linkerd-prometheus Adds a new setting to the Prometheus' Helm config, allowing adding any kind of sidecar containers to the main container. The specific use case that inspired this was for exporting data from Prometheus to external systems (e.g. cloudwatch, stackdriver, datadog) using a process that watches the prometheus write-ahead log (WAL). Signed-off-by: Nathan J. Mehl <n@oden.io>	2020-07-27 14:26:37 -05:00
Matei David	1c197b14e7	Change destination context token format (#4771 ) Add a new structure on the destination controller side to keep track of contextual information. The token format has been changed from ns:<namespace> to a JSON format so that more variables can be encdoed in the token. As part of this PR, a new field 'nodeName' has been added to help with service topologies. Fixes #4498 Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-07-27 09:49:48 -07:00
Alex Leong	d540e16c8b	Make service mirror controller per target cluster (#4710 ) This PR removes the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31. For fuller context, please see that RFC. Basic multicluster functionality works here including: * `linkerd mc install` installs the Link CRD but not any service mirror controllers * `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link * The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints. * The `linkerd mc gateways` command lists all linked target clusters, their liveliness, and probe latences. * The `linkerd check` multicluster checks have been updated for the new architecture. Several checks have been rendered obsolete by the new architecture and have been removed. The following are known issues requiring further work: * the service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror. it does not yet support configuring a label selector. * an unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707 * an mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708 Signed-off-by: Alex Leong <alex@buoyant.io>	2020-07-23 14:32:50 -07:00
Alejandro Pedraza	5e789ba152	Migrate CI to docker buildx and other improvements (#4765 ) * Migrate CI to docker buildx and other improvements ## Motivation - Improve build times in forks. Specially when rerunning builds because of some flaky test. - Start using `docker buildx` to pave the way for multiplatform builds. ## Performance improvements These timings were taken for the `kind_integration.yml` workflow when we merged and rerun the lodash bump PR (#4762) Before these improvements: - when merging: `24:18` - when rerunning after merge (docker cache warm): `19:00` - when running the same changes in a fork (no docker cache): `32:15` After these improvements: - when merging: `25:38` - when rerunning after merge (docker cache warm): `19:25` - when running the same changes in a fork (docker cache warm): `19:25` As explained below, non-forks and forks now use the same cache, so the important take is that forks will always start with a warm cache and we'll no longer see long build times like the `32:15` above. The downside is a slight increase in the build times for non-forks (up to a little more than a minute, depending on the case). ## Build containers in parallel The `docker_build` job in the `kind_integration.yml`, `cloud_integration.yml` and `release.yml` workflows relied on running `bin/docker-build` which builds all the containers in sequence. Now each container is built in parallel using a matrix strategy. ## New caching strategy CI now uses `docker buildx` for building the container images, which allows using an external cache source for builds, a location in the filesystem in this case. That location gets cached using actions/cache, using the key `{{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}` and the restore key `${{ runner.os }}-buildx-${{ matrix.target }}-`. For example when building the `web` container, its image and all the intermediary layers get cached under the key `Linux-buildx-web-git-abc0123`. When that has been cached in the `main` branch, that cache will be available to all the child branches, including forks. If a new branch in a fork asks for a key like `Linux-buildx-web-git-def456`, the key won't be found during the first CI run, but the system falls back to the key `Linux-buildx-web-git-abc0123` from `main` and so the build will start with a warm cache (more info about how keys are matched in the [actions/cache docs](https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)). ## Packet host no longer needed To benefit from the warm caches both in non-forks and forks like just explained, we're required to ditch doing the builds in Packet and now everything runs in the github runners VMs. As a result there's no longer separate logic for non-forks and forks in the workflow files; `kind_integration.yml` was greatly simplified but `cloud_integration.yml` and `release.yml` got a little bigger in order to use the actions artifacts as a repository for the images built. This bloat will be fixed when support for [composite actions](https://github.com/actions/runner/blob/users/ethanchewy/compositeADR/docs/adrs/0549-composite-run-steps.md) lands in github. ## Local builds You still are able to run `bin/docker-build` or any of the `docker-build.*` scripts. And to make use of buildx, run those same scripts after having set the env var `DOCKER_BUILDKIT=1`. Using buildx supposes you have installed it, as instructed [here](https://github.com/docker/buildx). ## Other - A new script `bin/docker-cache-prune` is used to remove unused images from the cache. Without that the cache grows constantly and we can rapidly hit the 5GB limit (when the limit is attained the oldest entries get evicted). - The `go-deps` dockerfile base image was changed from `golang:1.14.2` (ubuntu based) to `golang-1:14.2-alpine` also to conserve cache space. # Addressed separately in #4875: Got rid of the `go-deps` image and instead added something similar on top of all the Dockerfiles dealing with `go`, as a first stage for those Dockerfiles. That continues to serve as a way to pre-populate go's build cache, which speeds up the builds in the subsequent stages. That build should in theory be rebuilt automatically only when `go.mod` or `go.sum` change, and now we don't require running `bin/update-go-deps-shas`. That script was removed along with all the logic elsewhere that used it, including the `go_dependencies` job in the `static_checks.yml` github workflow. The list of modules preinstalled was moved from `Dockerfile-go-deps` to a new script `bin/install-deps`. I couldn't find a way to generate that list dynamically, so whenever a slow-to-compile dependency is found, we have to make sure it's included in that list. Although this simplifies the dev workflow, note that the real motivation behind this was a limitation in buildx's `docker-container` driver that forbids us from depending on images that haven't been pushed to a registry, so we have to resort to building the dependencies as a first stage in the Dockerfiles.	2020-07-22 14:27:45 -05:00
Wei Lun	85a042c151	add fish shell completion (#4751 ) fixes #4208 Signed-off-by: Wei Lun <weilun_95@hotmail.com>	2020-07-20 15:46:30 -07:00
Matei David	8b85716eb8	Introduce install flag for EndpointSlices (#4740 ) EndpointSlices have been made opt-in due to their experimental nature. This PR introduces a new install flag 'enableEndpointSlices' that will allow adopters to specify in their cli install or helm install step whether they would like to use endpointslices as a resource in the destination service, instead of the endpoints k8s resource. Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-07-15 09:53:04 -07:00
Tarun Pothulapati	2a099cb496	Move Prometheus as an Add-On (#4362 ) This moves Prometheus as a add-on, thus making it optional but enabled by default. The also make `linkerd-prometheus` more configurable, and allow it to have its own life-cycle for upgrades, configuration, etc. This work will be followed by documentation that help users configure existing Prometheus to work with Linkerd. Changes Include: - moving prometheus manifests into a separate chart at `charts/add-ons/prometheus`, and adding it as a dependency to `linkerd2` - implement the `addOn` interface to support the same with CLI. - include configuration in `linkerd-config-addons` User Facing Changes: The default install experience does not change much but for users who have already configured Prometheus differently, would need to apply the same using the new configuration fields present in chart README	2020-07-09 23:29:03 +05:30
cpretzer	d3553c59fd	Add volume and volumeMount for buster-based proxy-init (#4692 ) * Add volume and volumeMount for buster-based proxy-init Signed-off-by: Charles Pretzer <charles@buoyant.io>	2020-07-09 09:55:07 -07:00
Zahari Dichev	a2363f4051	Fix `splitStringListToPorts` port range object rendering (#4688 ) The splitStringListToPorts helm function is currently incorrectly formating a list of ports as an array of Port objects that look ike {"port" : 555}. The config map protobuf representation however expects that the ignoreOutboundPorts and ignoreInboundPorts fields are are list of PortRange objects ({"portRange" : 555}). This was causing the injector to return an empty string when trying to parse a PortRange object resulting in the ports not getting set correctly when injecting workloads. Note that this is happening only with helm installations as this is when we are actually using a helm template for outputting the config map. To fix that the splitStringListToPorts helm function is changed to format the objects as the json representation of PortRange and is renamed to splitStringListToPortRanges Fix: #4679 Signed-off-by: Zahari Dichev zaharidichev@gmail.com	2020-07-09 14:23:12 +03:00
Tarun Pothulapati	1dd8ae425b	update helm render tests to consider child charts values.yaml (#4725 ) * update helm render tests to read child charts values.yaml Helm installation by default, considers values.yaml for dependend charts and uses them in rendering. This function is being used for add-ons to keep the default template values, allowing further overriden from the parent chart's i.e linkerd2 values.yaml or --addon-config through CLI. This PR updates the Helm tests to reflect the same i.e consider values.yaml of chart dependencies if present. This does not have any UX changes but helps with the follow up add-on related work.	2020-07-08 20:28:56 +05:30
Desmond Ho	93bf079640	Added custom tolerations to helm chart (#4626 ) ... for the control plane resources Signed-off-by: Desmond Ho <desmond.ho@cloverhealth.com>	2020-07-07 17:37:02 -05:00
Suraj Deshmukh	d7dbe9cbff	Fix spelling mistakes using codespell (#4700 ) Using following command the wrong spelling were found and later on fixed: ``` codespell --skip CHANGES.md,.git,go.sum,\ controller/cmd/service-mirror/events_formatting.go,\ controller/cmd/service-mirror/cluster_watcher_test_util.go,\ SECURITY_AUDIT.pdf,.gcp.json.enc,web/app/img/favicon.png \ --ignore-words-list=aks,uint,ans,files\' --check-filenames \ --check-hidden ``` Signed-off-by: Suraj Deshmukh <surajd.service@gmail.com>	2020-07-07 17:07:22 -05:00
Matei David	a2bd230cd6	service topologies: add Kubernetes/API EndpointSlice support (#4696 ) Based on the [EndpointSlice PR](https://github.com/linkerd/linkerd2/pull/4663), this is just the k8s/api support for endpointslices to shorten the first PR. * Adds CRD * Adds functions that check whether the cluster has EndpointSlice access * Adds discovery & endpointslice informers to api. Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-07-06 15:28:48 -07:00
Tarun Pothulapati	7cd188dc65	Add `values.yaml` to chart tempaltes (#4682 ) This change adds add-on level values.yaml into the template to use for rendering. No changes as of rn, but will be used by add-ons later on!	2020-07-02 22:59:41 +05:30
Tarun Pothulapati	fcc3eb5411	cli: support url with addon-config flag (#4666 ) adds support for urls through addon-config flag	2020-07-02 22:57:22 +05:30
Tarun Pothulapati	c3131cde0e	Use cniPluginVersion with Helm for linkerd2-cni (#4693 ) use `cniPluginVersion` as the fall-back version with Helm	2020-07-02 20:45:13 +05:30
Naseem	361d35bb6a	feat: add log format annotation and helm value (#4620 ) * feat: add log format annotation and helm value Json log formatting has been added via https://github.com/linkerd/linkerd2-proxy/pull/500 but wiring the option through as an annotation/helm value is still necessary. This PR adds the annotation and helm value to configure log format. Closes #2491 Signed-off-by: Naseem <naseem@transit.app>	2020-07-02 10:08:52 -05:00
Arthur Silva Sens	021048d576	GoDocs for completion, dashboard and diagnostics cli commands (#4518 ) Signed-off-by: arthursens <arthursens2005@gmail.com>	2020-06-30 05:53:50 -05:00
Naseem	733d911677	feat: add option to persist prometheus data (#4578 ) Data disappears upon prometheus restarts due to it being all in-memory. Adding an option to enabled persistence by means of a PVC would be the right approach. It is commonly seen in a wide array of helm charts. Fixes #4576 Signed-off-by: Naseem <naseem@transit.app>	2020-06-29 14:26:26 -07:00
Alejandro Pedraza	aea541d6f9	Upgrade generated protobuf files to v1.4.2 (#4673 ) Regenerated protobuf files, using version 1.4.2 that was upgraded from 1.3.2 with the proxy-api update in #4614. As of v1.4 protobuf messages are disallowed to be copied (because they hold a mutex), so whenever a message is passed to or returned from a function we need to use a pointer. This affects _mostly_ test files. This is required to unblock #4620 which is adding a field to the config protobuf.	2020-06-26 09:36:48 -05:00
Oliver Gould	c4d649e25d	Update proxy-api version to v0.1.13 (#4614 ) This update includes no API changes, but updates grpc-go to the latest release.	2020-06-24 12:52:59 -07:00
Mayank Shah	2b0482c821	Update `inject` to throw an error while injecting non-compliant pods (#4346 ) * Update inject to error out on failure Update injection process to throw an error when the reason for failure is due to sidecar, udp, automountServiceAccountToken or hostNetwork Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-06-24 14:07:05 -05:00
Lutz Behnke	846d2f11d4	Add support for Helm configuration of per-component proxy resources requests and limits (#4226 ) Signed-off-by: Lutz Behnke <lutz.behnke@finleap.com>	2020-06-24 12:54:27 -05:00
Zahari Dichev	7f3d872930	Add destination-get-networks option (#4608 ) In #4585 we are observing an issue where a loop is encountered when using nginx ingress. The problem is that the outbound proxy does a dst lookup on the IP address which happens to be the very same address the ingress is listening on. In order to avoid situations like that this PR introduces a way to modify the set of networks for which the proxy shall do IP based discovery. The change introduces a helm flag `.Values.global.proxy.destinationGetNetworks` that can be used to modify this value. There are two ways a user can affect the this setting: - setting the `destinationGetNetworks` field in values during a Helm install, which changes the default on all injected pods - using an annotation ` config.linkerd.io/proxy-destination-get-networks` for injected workloads to override this value Note that this setting cannot be tweaked through the `install` or `inject` command Fix: #4585 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-06-18 20:07:47 +03:00
cpretzer	b176fbeb6d	Upgrade Grafana to 7.0.3 (#4600 ) * Upgrade Grafana to 7.0.3 * use go netdns to avoid DNS resolution errors on alpine Signed-off-by: Charles Pretzer <charles@buoyant.io>	2020-06-17 21:35:29 -07:00
Kevin Leimkuhler	fe71ef04b0	Remove `--prune` from multi-stage upgrade docs (#4613 ) Fixes #4606 This has not worked as far back as stable-2.6.0. ## Solution The recommended upgrade process is to include `--prune` as part of `kubectl apply ..`: ```bash $ linkerd upgrade \| kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - ``` This is an issue for multi-stage upgrade because `linkerd upgrade config` does not include the `linkerd-config` ConfigMap in it's output. `kubectl apply --prune ..` will then prune this resource because it matches the label selector and is not in the above output. The issue occurs when `linkerd upgrade control-plane` is run and expects to find the ConfigMap that was just pruned. This can be fixed by not suggesting to prune resources as part of the multi-stage upgrade. ## Considered Including `templates/config.yaml` in the install output regardless of the stage. Instead of it being a template only used in `control-plane` stage in [render](`4aa3ca7f87/cli/cmd/install.go (L873-L886)`), it could always be rendered. This just exposes other things that are pruned in the process: ```bash ❯ bin/linkerd upgrade control-plane \|kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - × Failed to build upgrade configuration: secrets "linkerd-identity-issuer" not found For troubleshooting help, visit: https://linkerd.io/upgrade/#troubleshooting error: no objects passed to apply ``` Ultimately, resources part of the `control-plane` stage need to remain and that will not happen if we prune all resources not in the `config` stage output Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2020-06-17 10:29:34 -04:00
Alexander Berger	b509742c7d	Fix ClusterRole for web-check (#4599 ) As reported in #4259 linkerd check run from linkerd's web cconsole is broken as the underlying RBAC Role cannot access the apiregistration.k8s.io API Group. With this commit the RBAC Role is fixed allowing read-only access to the API Group apiregistration.k8s.io. Fixes #4259 Signed-off-by: alex.berger@nexiot.ch <alex.berger@nexiot.ch>	2020-06-15 10:21:00 -07:00
Joakim Roubert	99a9f1c2c2	Fix missing proxy-init v1.3.2 -> v1.3.3 (#4596 ) Signed-off-by: Joakim Roubert <joakim.roubert@axis.com>	2020-06-15 10:53:01 -05:00
Alejandro Pedraza	7a9527bf00	Fix yaml in linkerd-config-addons when providing grafanaUrl (#4581 ) Put back space after `grafanaUrl` label in `linkerd-config-addons.yaml` to avoid breaking the yaml parsing. ``` $ linkerd check ... linkerd-addons -------------- ‼ 'linkerd-config-addons' config map exists could not unmarshal linkerd-config-addons config-map: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal string into Go struct field Values.global of type linkerd2.Global ``` This was added in #4544 to avoid having the configmap being badly formatted. So this PR fixes the yaml, but then if we don't set `grafanaUrl` the configmap format gets messed up, but apparently that's just a cosmetic problem: ``` apiVersion: v1 data: values: "global:\n grafanaUrl: \ngrafana:\n enabled: true\n image:\n name: gcr.io/linkerd-io/grafana\n name: linkerd-grafana\n resources:\n cpu:\n limit: 240m\n memory:\n limit: null\ntracing:\n enabled: false" kind: ConfigMap ```	2020-06-09 11:08:32 -07:00
Tarun Pothulapati	4219955bdb	multicluster: checks for misconfigured mirror services (#4552 ) Fixes #4541 This PR adds the following checks - if a mirrored service has endpoints. (This includes gateway mirrors too). - if an exported service is referencing a gateway that does not exist. Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> Signed-off-by: Alex Leong <alex@buoyant.io> Co-authored-by: Alex Leong <alex@buoyant.io>	2020-06-08 15:29:34 -07:00
Tarun Pothulapati	4aa3ca7f87	remove grafana.image.version field (#4571 ) Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-06-08 17:15:30 -05:00
Matei David	654839e639	Add namespace global flag to hold default namespace name (#4469 ) (#4512 ) * Add namespace global flag to hold default namespace name (#4469) Signed-off-by: Matei David <matei.david.35@gmail.com> * Change name of controlplane install namespace constant and init point for kubeNamespace Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-06-04 10:45:07 -07:00
Lutz Behnke	108b383ab8	add flag for dumping rejected results to folder for use with external diff (#4509 ) Problem When updating / writing tests with complex data, e.g the certificates, the build-in diff is not as powerful as dedicated external tool. Solution Dump all resource specifications created as part of failing tests to a supplied folder for external analysis. Signed-off-by: Lutz Behnke <lutz.behnke@finleap.com>	2020-06-04 10:49:41 -04:00
Alejandro Pedraza	7d9525e316	Removed trailing spaces from entries in configmaps (#4544 ) Fixes #4454 As explained [here](https://github.com/kubernetes/kubernetes/issues/36222#issuecomment-553966166), trailing spaces in configmap data makes it to look funky when retrieved later on. This is currently affecting `linkerd-config-addons` and `linkerd-gateway-config`: ``` $ k -n linkerd-multicluster get cm linkerd-gateway-config -oyaml apiVersion: v1 data: nginx.conf: "events {\n}\nstream { \n \ server { \n \ listen 4180; \n \ proxy_pass 127.0.0.1:4140; \n \ } \n} \nhttp {\n server {\n listen 4181;\n location /health {\n access_log off;\n return 200 \"healthy\\n\";\n }\n }\n server {\n listen \ 8888;\n location /health-local {\n access_log off;\n return 200 \"healthy\\n\";\n }\n } \n}" kind: ConfigMap ``` AFAIK this is only cosmetic and doesn't affect functionality.	2020-06-04 09:06:37 -05:00
Alejandro Pedraza	ed5482ac3b	Fixed prom route in linkerd service profile, and some extra cleanup (#4493 ) * Fixes #4305 Fixed SP route for `POST /api/v1/query`: ``` $ bin/linkerd routes -n linkerd deploy/linkerd-prometheus ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 GET /api/v1/query_range linkerd-prometheus 100.00% 3.9rps 1ms 2ms 2ms GET /api/v1/series linkerd-prometheus 100.00% 1.1rps 1ms 1ms 1ms POST /api/v1/query linkerd-prometheus 100.00% 3.1rps 1ms 17ms 19ms [DEFAULT] linkerd-prometheus - - - - - ``` Also added one missing route for `linkerd-grafana`, realizing afterwards there are many other ones missing, but not really worth adding them all. I also removed the routes in `linkerd-controller` for the tap routes given that's no longer handled in that service. And the tap service SP was also removed alltogether since nothing was getting reported.	2020-06-03 12:53:50 -05:00
Lutz Behnke	163107b8cb	Extend Helm chart to allow disabling secret resources for self-signed certs. (#4289 ) * allow disabling secret resources for self-signed certs. Split cert and ca bundle. Signed-off-by: Lutz Behnke <lutz.behnke@finleap.com>	2020-06-03 09:26:24 -05:00
Kevin Leimkuhler	8f6186f9ae	Change help text to use source/target terminology in multicluster CLI (#4523 ) Change terminology from local/remote to source/target in `multicluster` CLI help text. This does not change any variable, function, struct, or field names since testing is still improving. Relevant issue: #4480 Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2020-06-02 12:33:18 -04:00
Zahari Dichev	6c3922a7f1	Probe manager simplification (#4510 ) There are a few notable things happening in this PR: - the probe manager has been decoupled from the cluster_watcher. Now its only responsibility is to watch for mirrored gateways beeing created and to probe them. This means that probes are initiated for all gateways no matter whether there are mirrored services being paired - the number of paired services is derived from the existing services in the cluster rather than being published as a metric by the prober - there are no events being exchanged between the cluster watcher and the probe manager Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-06-01 14:41:29 -07:00
Alejandro Pedraza	571626d524	CI: properly report errors from commands (#4514 ) Failures in `bin/_test-run` from commands different than `go test` aren't currently properly reported, in part because CI's bash default is to have `set -e` which terminates the script and just outputs `##[error]Process completed with exit code 2.` like [here](https://github.com/linkerd/linkerd2/pull/4496/checks?check_run_id=720720352#step:14:116) ``` linkerd-existence ----------------- √ 'linkerd-config' config map exists √ heartbeat ServiceAccount exist √ control plane replica sets are ready × no unschedulable pods linkerd-controller-6c77c7ffb8-w8wh5: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-destination-6767d88f7f-rcnbq: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-grafana-76c76fcfb9-pdhfb: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-identity-5bcf97d6c8-q6rll: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-prometheus-6b95c56b44-hd9m6: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-proxy-injector-58d794ff9-jf7cj: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-sp-validator-6c5f999bfb-qg252: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-tap-6fdf84fc65-6txvr: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. linkerd-web-8484fbd867-nm8z2: 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate. see https://linkerd.io/checks/#l5d-existence-unschedulable-pods for hints Status check results are × [error]Process completed with exit code 2. ``` I've made the following changes to `bin/_test-run` to generate better messages and Github annotations when an error occurs: - Unset `set -e` so that errors don't immediately exit the script and don't allow us to properly format the errors. - Removed many of the `exit_on_err` calls after go test calls because those output enough information already (they were not being used anyways in CI because of `set -e`). And instead have `run_test` exit upon a `go test` error. - Added `exit_on_err` calls right after non-`go-test` commands to properly report their failure. - Refactored the `exit_on_err` function so that it generates a Github error annotation upon failure. - Removed `trap` in `install_stable`, since the OS should be able to handle GC for stuff under `/tmp`. Also, I've changed the exit 2 code from `linkerd check` when it fails, to exit code 1.	2020-06-01 15:57:33 -05:00
Alex Leong	015d352f34	Fix array handling in bin/fmt (#4489 ) Quoting the list of directories passed to `goimports` was causing the list to be interpreted as a single argument which was stopping `bin/fmt` from working. Instead, use `read` to split the list of directories into an array. Also fix up incorrect formatting that has crept in while `bin/fmt` has been broken. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-06-01 12:10:24 -07:00
Zahari Dichev	7b46682841	Add allow and link commands (#4466 ) This change adds a `allow` and `link` commands, effectivelly enabling a cluster to have more than one set of credentials that allow it to be mirrored. Fx #4461 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com> Co-authored-by: Alex Leong <alex@buoyant.io>	2020-05-27 14:30:55 -07:00
Alejandro Pedraza	1844fd573b	Unhide multicluster command (#4486 ) Unhide multicluster command	2020-05-27 14:22:23 -05:00
Kevin Leimkuhler	4879f07334	cli: rename cluster cli command to multicluster (#4484 ) This is @psinghal20's changes in #4462 which is currently failing CI. Fixes #4456 Description from the original PR: > This pr renames the `cluster` command in CLI to `multicluster` command. It > also adds a shorthand `mc` for easy use. > > Fixes #4456 > > Signed-off-by: psinghal20 <psinghal20@gmail.com> The CI failure doesn't seem to be related to this change, but has only been seen on forks. Opening this from a non-fork for now to continue investigating. Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com> Co-authored-by: psinghal20 <psinghal20@gmail.com>	2020-05-27 10:39:52 +03:00
Arthur Silva Sens	bfedcd5485	Added documentation for alpha cli command (#4412 ) Added comments to document several methods and strucs on cmd package. Based on GoDoc guidelines. Focus on alpha cli command Signed-off-by: arthursens <arthursens2005@gmail.com>	2020-05-26 13:59:56 -07:00
Zahari Dichev	8fb0ea608a	Skip services that are mirrors of remote ones (#4460 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-22 09:24:59 +03:00
Alex Leong	acacf2e023	Add --close-wait-timeout inject flag (#4409 ) Depends on https://github.com/linkerd/linkerd2-proxy-init/pull/10 Fixes #4276 We add a `--close-wait-timeout` inject flag which configures the proxy-init container to run with `privileged: true` and to set `nf_conntrack_tcp_timeout_close_wait`. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-05-21 14:14:14 -07:00
Tarun Pothulapati	0c53760094	update golden files with new grafana.image field format (#4455 )	2020-05-21 23:05:04 +05:30
Tarun Pothulapati	bd60c90e5d	Add addon-overwrite flag (#4377 ) provide a `addon-overwrite` flag for upgrades to skip `linkerd-config-addons` and use `--addon-overwrite` if passed or defaults	2020-05-21 21:01:41 +05:30
Tarun Pothulapati	3473db32f8	use "/" for as the FS is virtualised (#4443 ) replacing `filepath.join` in the install path in the CLI, as the fs is virtualized	2020-05-21 10:25:14 +05:30
Alex Leong	9cd4557644	Properly show the meshed count for non-selector services (#4446 ) When viewing the output of `linkerd stat` for services which do not have a selector (such as services created by the service-mirror, for example) the meshed count column shows the total number which exist, even though the service actually selects no pods at all. We update the StatSummary implementation to account for services which have no selector. Additionally, we update the logic of the `--unmeshed` flag. When the `--unmeshed` flag is not set, we typically skip rows for unmeshed resources because those resources would have no stats. This is not appropriate to do when the `--from` flag is also set because in this case, metrics are not collected on the target resource but are instead collected on the client-side. This means that stats can be present, even for unmeshed resources and these resources should still be displayed, even if the `--unmeshed` flag is not set. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-05-20 10:08:27 -07:00
Tarun Pothulapati	be664571c1	Separate grafana image tag in template (#4395 ) Separates grafana image field into image.name, image.version and also moves controllerImageVersion to global	2020-05-20 22:27:19 +05:30
Zahari Dichev	31e33d18d3	Enable service mirroring to work in private networks (#4440 ) This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway. Fix #4411 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-20 19:48:36 +03:00
Zahari Dichev	6574f124a7	Restrict Service mirror RBACs (#4426 ) This PR introduces a few changes that were requested after a bit of service mirror reviewing. - we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in - we unify the namespace namings so all multicluster resources are installedi n `linkerd-multicluster` on both clusters - fixed checks to account for changes Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-20 17:08:01 +03:00
Tarun Pothulapati	5f37a9f7fa	Add global.grafanaUrl for linking existing grafana use-case (#4381 ) adds global.grafanaUrl for Bring your own Grafana use-case, with configuration in `linkerd-config-addons`	2020-05-20 00:56:31 +05:30
Tarun Pothulapati	e91dbda287	Add health checks for grafana add-on (#4321 ) * Add health checks for grafana add-on Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update testCheck command and fixes Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * fix checkContainersRunnning function Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * linting fix Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update test golden files Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * use hc.ControlPlanePods instead of k8s API Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * use hc.controlPLanePods directly Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * remove unnecessary comments Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * proper comments Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update pod checks to use retries Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add values key check Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-05-14 23:18:43 +05:30
Zahari Dichev	01894c700f	Make export-service handle k8s lists (#4370 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-12 09:11:36 +03:00
Tarun Pothulapati	45ccc24a89	Move grafana templates into a separate sub-chart as a add-on (#4320 ) * adds grafana manifests as a sub-chart - moves grafana templates into its own chart - implement add-on interface Grafana struct - also add relevant conditions for grafana Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * remove redundant grafana fields in Values Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update golden files Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * fix values issue Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * remove extra grafanaImage value Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add add-on upgrade tests Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * fix golden file tests Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add grafana field to linkerd-config-addons Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * Don't apply nil configuration Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update golden files Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * make checks relaxed for grafana Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update test to not test on grafana Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update TestServiceAccountsMatch to contain extra members Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * replace map[string]interface{} with Grafana for better readability Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * update golden files Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-05-11 22:22:14 +05:30
Zahari Dichev	edd9b654a7	Make gateway require TLS for incoming requests (#4339 ) Make gateway require TLS for incoming requests Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-11 10:07:48 +03:00
Zahari Dichev	3008f1f87f	Add check for validating that remote clusters share the same trust an… (#4311 ) Add check for validating that remote clusters share the same trust anchors Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-11 09:59:15 +03:00
Tarun Pothulapati	2be43a5f9d	Add Jaeger links to the Linkerd dashboard (#4177 ) * Add Jaeger reverse proxy * add jaegerLink to the metrics table * update MetricsTable tests * Add optional jaeger link * rename grafana_proxy to reverse_proxy Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-05-07 16:35:56 -05:00
Alex Leong	a703b5b1dc	Hide development flags in release builds (#4310 ) Certain install flags are intended to help with Linkerd development and generally are not useful (and are potentially confusing) to users. We hide these flags in release (edge or stable) builds of the CLI but show them in all other builds. The list of affected flags is: * control-plane-version * proxy-image * proxy-version * image-pull-policy * init-image * init-image-version Signed-off-by: Alex Leong <alex@buoyant.io>	2020-05-05 09:33:10 -07:00
Tarun Pothulapati	fc7456ce2a	Refactor linkerd-config-addons configmap (#4318 ) * rename linkerd-values to linkerd-config-addons Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * refactor linkerd-config-addons to be more saner Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add comments for the linkerd-config-addons Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * move _addon.tpl to partials Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * remove unnecessary checks in _addons.tpl Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * add _addon.tpl partials to TestRenderHelm Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com> * remove on a copy Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-05-05 14:11:21 +05:30
Zahari Dichev	4e82ba8878	Multicluster checks (#4279 ) Multicluster checks Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-05-05 10:19:38 +03:00
Matei David	6b9aaac9d6	Add Kubeconfig contex namespace to cli commands' options (#4197 ) (#4291 ) When using cli commands that work on namespaced resources in the cluster, the default namespace used by the cli is hardcoded to the default Kubernetes namespace (i.e 'default'). This update will allow cli commands that operate on namespaced resources to automatically infer what the name of the default namespace is, by taking the relevant default from the currently used Kubeconfig context. In short, this allows the omission of the -n flag in commands such as linkerd metrics, when working with resources that belong to a namespace that is set as default in the currently active context. Validation was done manually by setting the default namespace of the currently used context, as well as through two integration tests that target the tap and get command respectively. Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-05-04 16:21:05 -05:00
Naseem	6aa1e76096	Allow config of prometheus alertmanagers, rules and extra args. (#4220 ) This allows end user flexibility for options such as log format. Rather than bubbling up such possible config options into helm values, extra arguments provides more flexibility. Add prometheusAlertmanagers value allows configuring a list of statically targetted alertmanager instances. Use rule configmaps for prometheus rules. They take a list of {name,subPath,configMap} values and mounts them accordingly. Provided that subpaths end with _rules.yml or _rules.yaml they should be loaded by prometheus as per prometheus.yml's rule_files content. Signed-off-by: Naseem <naseem@transit.app>	2020-05-04 14:06:10 -05:00
Alex Leong	40b921508f	Inject LINKERD2_PROXY_DESTINATION_GET_NETWORKS proxy variable (#4300 ) Fixes #3807 By setting the LINKERD2_PROXY_DESTINATION_GET_NETWORKS environment variable, we configure the Linkerd proxy to do destination lookups for authorities which are IP addresses in the private network range. This allows us to get destination metadata including identity for HTTP requests which target an IP address in the cluster, Prometheus metrics scrape requests, for example. This change allowed us to update the "direct edges" test which ensures that the edges command produces correct output for traffic which is addressed directly to a pod IP. We also re-enabled the "linkerd stat" integration tests which had been disabled while the destination service did not yet support these types of IP queries. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-30 11:22:24 -07:00
Zahari Dichev	00f17d2ed6	Make export-service non side-effecting (#4307 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-04-30 17:11:55 +03:00
Tarun Pothulapati	e75c6580ec	refacor TestRenderHelm to not need addOn list (#4297 ) - rather than passing the list of add-ons, they can instead be built from the values Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-04-29 23:44:30 +05:30
Zahari Dichev	5149152ef3	Multicluster gateway and remote setup command (#4265 ) Add multicluster gateway and setup command Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-04-29 20:33:23 +03:00
Zahari Dichev	17dacf5548	Add gateways command, allowing the retrieval of gateway stats (#4241 ) Add gateways command, allowing the retrieval of gateway stats Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-04-27 13:55:01 +03:00
Zahari Dichev	09262ebd72	Add liveliness checks and metrics for multicluster gateway (#4233 ) Add liveliness checks for gateway Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-04-27 13:06:58 +03:00
Alejandro Pedraza	dacf87e084	Added missing annotations to addon test fixtures (#4286 ) Followup to #4271 Add missing annotation `linkerd.io/workload-ns: linkerd` in in the addons test fixtures, introduced by the downward work from #4199	2020-04-23 16:15:16 -05:00
Tarun Pothulapati	60ffd1c2a2	Support Multi-stage install with Add-On's (#4271 ) * Support Multi-stage install with Add-Ons * add upgrade tests for add-ons * add multi stage upgrade unit tests Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-04-23 14:40:58 -05:00
Tarun Pothulapati	2b1cbc6fc1	charts: Using downwardAPI to mount labels to the proxy container (#4199 ) * use downward API to mount labels to the proxy container as a volume * add namespace as a label to the pod * add a trace inject test * add downwardAPi for controlplaneTracing * add controlPlaneTracing condition to volumeMounts * update add-ons to have workload-ns * add workload-ns label to control-plane components Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-04-22 10:33:51 -05:00
Alejandro Pedraza	b00a84126d	Some `linkerd stat` test failures were being hidden (#4272 ) * Some `linkerd stat` test failures were being hidden `linkerd stat` was doing an early `os.Exit(0)` when no traffic was found, which avoided `go test` to report any test failure that ended in that code path. This was hiding a mismatch in the golden files for HA after the introduction of the rolling update strategy (#4267), and the failure of `linkerd stat trafficsplit` not returning results unless `--unmeshed` is used. For the latter, I added the flag to the tests in order to temporarly pass them, but the underlying issue remains to be fixed in a separate PR.	2020-04-21 14:52:09 -05:00
Kevin Leimkuhler	2c38f228f7	Add MeshedPodCount field to TS resource rows (#4273 ) The addition of the `--unmeshed` flag changed the rendering behavior of the `stat` command so that resources with 0 meshed pods are not displayed by default. Rendering is based off the row's `MeshedPodCount` field which is currently not set by `func trafficSplitResourceQuery`. This change sets that field now so that in rendering, the trafficsplit resource is rendered in the output. The reason for this not showing up in testing is addressed by #4272 where the `stat` command behavior for no traffic is changed. The following now works without `--unmeshed` flag being passed: ``` ❯ bin/linkerd stat -A ts NAMESPACE NAME APEX LEAF WEIGHT SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 default backend-traffic-split backend-svc backend-svc 500m - - - - - default backend-traffic-split backend-svc failing-svc 0 - - - - - ```	2020-04-21 10:23:35 -07:00
Alex Leong	9bf54d36ed	Upgrade to go 1.14.2 (#4278 ) Upgrade Linkerd's base docker image to use go 1.14.2 in order to stay modern. The only code change required was to update a test which was checking the error message of a `crypto/x509.CertificateInvalidError`. The error message of this error changed between go versions. We update the test to not check for the specific error string so that this test passes regardless of go version. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-20 17:14:51 -07:00
Alex Leong	5d3862c120	Use /live for liveness probe (#4270 ) Fixes #3984 We use the new `/live` admin endpoint in the Linkerd proxy for liveness probes instead of the `/metrics` endpoint. This endpoint returns a much smaller payload. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-17 14:53:32 -07:00
Tarun Pothulapati	8e56166774	Refactor AddOn Installation (#4247 ) * refactor add-ons install code Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-04-16 15:24:55 -05:00
Alex Leong	e962bf1968	Improve proxy version diagnostics (#4244 ) It can be difficult to know which versions of the proxy are running in your cluster, especially when you have pods running at multiple different proxy versions. We add two pieces of CLI functionality to assist with this: The `linkerd check --proxy` command will now list all data plane pods which are not up-to-date rather than just printing the first one it encounters: ``` ‼ data plane is up-to-date Some data plane pods are not running the current version: * default/books-84958fff5-95j75 (git-ca760bdd) * default/authors-57c6dc9b47-djldq (git-ca760bdd) * default/traffic-85f58ccb66-vxr49 (git-ca760bdd) * default/release-name-smi-metrics-899c68958-5ctpz (git-ca760bdd) * default/webapp-6975dc796f-2ngh4 (git-ca760bdd) * default/webapp-6975dc796f-z4bc4 (git-ca760bdd) * emojivoto/voting-54ffc5787d-wj6cp (git-ca760bdd) * emojivoto/vote-bot-7b54d6999b-57srw (git-ca760bdd) * emojivoto/emoji-5cb99f85d8-5bhvm (git-ca760bdd) * emojivoto/web-7988674b8b-zfvvm (git-ca760bdd) * default/webapp-6975dc796f-d2fbc (git-ca760bdd) * default/curl (git-7f6bbc73) see https://linkerd.io/checks/#l5d-data-plane-version for hints ``` The `linkerd version` command now supports a `--proxy` flag which will list all proxy versions running in the cluster and the number of pods running each version: ``` linkerd version --proxy Client version: dev-7b9d475f-alex Server version: edge-20.4.1 Proxy versions: edge-20.4.1 (10 pods) git-ca760bdd (11 pods) git-7f6bbc73 (1 pods) ``` Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-16 11:28:19 -07:00
Alejandro Pedraza	7d07504b5b	Upgrade crashes proxy-init when skipping ports (#4258 ) Fixes #4257 This was introduced in 2.7.0. When performing an upgrade on an installation having used `--skip-outbound-ports` or `--skip-inbound-ports`, the upgrade picks those values from the ConfigMap, parses them wrongly, and then when proxy-init picks them the iptables commands fail. I've also improved one of the upgrade unit tests to include these flags, and confirmed it failed before this fix.	2020-04-15 07:11:15 -05:00
Kevin Leimkuhler	0d235694af	Add `unmeshed` flag to stat command (#4254 ) ## Motivation Introduces an `unmeshed` flag to the `stat` command so that users can opt-in to viewing unmeshed resources in the `stat` output. This changes the existing behavior of the `stat` command such that unmeshed resources no longer render by default in the output. Before: ``` ❯ bin/linkerd stat -A deploy NAMESPACE NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN kube-system coredns 0/1 - - - - - - kube-system local-path-provisioner 0/1 - - - - - - kube-system metrics-server 0/1 - - - - - - kube-system traefik 0/1 - - - - - - linkerd linkerd-controller 1/1 100.00% 0.3rps 1ms 2ms 2ms 2 linkerd linkerd-destination 1/1 100.00% 0.3rps 1ms 1ms 1ms 11 ... ``` After: ``` ❯ bin/linkerd stat -A deploy NAMESPACE NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN linkerd linkerd-controller 1/1 100.00% 0.3rps 1ms 1ms 1ms 2 linkerd linkerd-destination 1/1 100.00% 0.3rps 1ms 2ms 2ms 13 ... ``` Closes #3871 ## Solution Using the meshed pod count in the stat response, resources with a count of `0` are not rendered in the table. The `-l`/`--selector` flag do not work for all resource types, so applying a default label does not solve this problem. While it works for pods, it does not work for deployments as the `linkerd.io/inject` is an annotation that cannot be selected on. I did not think a shorthand flag was necessary for this. I do not think users will commonly pass this flag to the `stat` command, and I didn't think adding an additional short flag such as `u` was necessary. Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2020-04-14 20:24:29 -07:00
Alex Leong	7b9d475ffc	Gate SMI-Metrics behind an install flag (#4240 ) This change adds a `--smi-metrics` install flag which controls if the SMI-metrics controller and associated RBAC and APIService resources are installed. The flag defaults to false and is hidden. We plan to remove this flag or default it to true if and when the SMI-Metrics integration graduates from experimental. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-09 14:34:08 -07:00
Tarun Pothulapati	d35a98cb2b	Fix routes wide output formatting for empty values (#4239 ) * use wider template string for empty values when -o wide Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>	2020-04-07 14:35:55 -05:00
Alejandro Pedraza	322ba5fd2f	`linkerd uninstall` errors when attempting to delete PSP (#4234 ) * Bug in `linkerd uninstall` when attempting to delete PSP We were using a wrong apiVersion for PSP in `linkerd uninstall`'s output, which avoids removing that resource: ``` $ linkerd uninstall \| kubectl delete -f - clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-controller" deleted clusterrole.rbac.authorization.k8s.io "linkerd-linkerd-destination" deleted ... mutatingwebhookconfiguration.admissionregistration.k8s.io "linkerd-proxy-injector-webhook-config" deleted validatingwebhookconfiguration.admissionregistration.k8s.io "linkerd-sp-validator-webhook-config" deleted namespace "linkerd" deleted error: unable to recognize "uninstall.yml": no matches for kind "PodSecurityPolicy" in version "extensions/v1beta1" $ kubectl get psp -oname podsecuritypolicy.policy/linkerd-linkerd-control-plane ``` I've also replaced the uninstall integration test with a new separate suite that performs the installation, waits for it to be ready, uninstalls, and then confirms `linkerd check --pre` returns as expected.	2020-04-07 11:01:11 -05:00
Zahari Dichev	d6460cf0fb	Update upgrade test certs (#4236 ) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>	2020-04-06 20:15:06 +03:00
Matei David	fee70c064b	Add uninstall cmd functionality to cli (#3622 ) (#4200 ) Signed-off-by: Matei David <matei.david.35@gmail.com>	2020-04-02 12:35:39 -05:00
Alex Leong	d8eebee4f7	Upgrade to client-go 0.17.4 and smi-sdk-go 0.3.0 (#4221 ) Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0. Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously. This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4. This keeps all of our transitive dependencies synchronized on one version of client-go. This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code. I took this opportunity to update our code generation script to properly use the version of code-generater from `go.mod` rather than a hardcoded SHA. Signed-off-by: Alex Leong <alex@buoyant.io>	2020-04-01 10:07:23 -07:00
Mayank Shah	4429c1a5b1	Update inject to handle `automountServiceAccountToken: false` (#4145 ) * Handle automountServiceAccountToken Return error during inject if pod spec has `automountServiceAccountToken: false` Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2020-04-01 09:39:49 -05:00

1 2 3 4 5 ...

793 Commits