`cniEnabled` was hard-coded to `false` in the `_config.tpl` template, so the init container was always added during injection, regardless of whether the control plane had been installed with `--set noInitContainer=true`.
This affects injection after installing with Helm, not after installing with the CLI.
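Before the fix, the rendered global config always contained the literal:
```
"cniEnabled": false,
```
A sketch of the corrected template line (assuming the chart exposes the `noInitContainer` value used above; not the exact `_config.tpl` contents):
```
"cniEnabled": {{ .Values.noInitContainer }},
```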
Repro steps under the edge-19.12.3 tag:
```bash
$ helm install charts/linkerd2-cni
# wait for the linkerd-cni-xxx pod to come up
# refresh linkerd2's chart dependencies
$ bin/helm-build
# overrides.yaml should contain all the mandatory values for certs
$ helm install -f overrides.yaml --set noInitContainer=true --set installNamespace=false charts/linkerd2
# verify the global config `cniEnabled` is NOT being persisted appropriately
$ k -n linkerd get cm linkerd-config -oyaml | grep cni
"cniEnabled": false,
# install and inject emojivoto
$ curl https://run.linkerd.io/emojivoto.yml|bin/go-run cli inject -|k apply -f -
# verify that the init container is being (unexpectedly) added
$ k -n emojivoto get po emoji-xxxxx-xxx -oyaml | grep initContainer
initContainers:
initContainerStatuses:
```
In this branch:
```bash
$ helm install charts/linkerd2-cni
# wait for the linkerd-cni-xxx pod to come up
# refresh linkerd2's chart dependencies
$ bin/helm-build
# overrides.yaml should contain all the mandatory values for certs
$ helm install -f overrides.yaml --set noInitContainer=true --set installNamespace=false charts/linkerd2
# verify the global config `cniEnabled` is being persisted appropriately
$ k -n linkerd get cm linkerd-config -oyaml | grep cni
"cniEnabled": true,
# install and inject emojivoto
$ curl https://run.linkerd.io/emojivoto.yml|bin/go-run cli inject -|k apply -f -
# verify that the init container is NOT being added
$ k -n emojivoto get po emoji-xxxxx-xxx -oyaml | grep initContainer
# nothing returned
```
This replaces #3872
Fixes a problem where the identity service can issue a certificate with a lifetime larger than that of the issuer certificate. This was causing the proxies to end up using an invalid TLS certificate. This fix ensures that the lifetime of the issued certificate is no greater than the smallest lifetime among the certs in the issuer's trust chain.
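A minimal sketch of the clamping logic (hypothetical helper name; not the actual identity-service code):
```go
package identity

import (
	"crypto/x509"
	"time"
)

// clampNotAfter caps a proposed certificate expiry at the earliest
// NotAfter in the issuer's trust chain, so an issued leaf certificate
// can never outlive any certificate it chains up to.
func clampNotAfter(proposed time.Time, chain []*x509.Certificate) time.Time {
	notAfter := proposed
	for _, c := range chain {
		if c.NotAfter.Before(notAfter) {
			notAfter = c.NotAfter
		}
	}
	return notAfter
}
```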
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Fixes https://github.com/linkerd/linkerd2/issues/3878
If the `--registry` flag is provided to Linkerd without the `--proxy-image` or `--init-image` flags, the `--registry` flag is ignored and not applied to the existing values for the proxy or init images pulled from the configmap.
We now override the registry with the value from the `--registry` flag regardless of which other flags are provided.
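A minimal sketch of the override behavior (hypothetical helper name):
```go
package main

import "strings"

// registryOverride swaps the registry portion of an image name
// (everything before the final "/") for the given registry, keeping
// the image name and tag intact.
func registryOverride(image, registry string) string {
	name := image
	if i := strings.LastIndex(image, "/"); i >= 0 {
		name = image[i+1:]
	}
	return strings.TrimSuffix(registry, "/") + "/" + name
}
```
For example, `registryOverride("gcr.io/linkerd-io/proxy:edge-19.12.3", "my.registry.example/linkerd-io")` yields `my.registry.example/linkerd-io/proxy:edge-19.12.3`.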
Signed-off-by: Alex Leong <alex@buoyant.io>
The current set of JavaScript linting rules that we're using in this
project is outdated, and it has led to a variety of competing styles
in the JavaScript codebase.
Update the project's linting rules to match those provided by the latest
release of eslint-config-airbnb, but disable a bunch of rules that
aren't compatible with this project.
I've split this change into two commits. The first commit contains the
manual changes that I made to satisfy the new rules, and the second
commit contains all of the whitespace, quoting and commas changes that
were fixed automatically by eslint.
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Revert "Replace actions/checkout with actions/upload/download (#3602)"
This reverts commit 397970e917.
* Upgraded actions/checkout to @v2
Reverts #3602 and fixes #3881
## Motivation
Full background: #2074
#2074 was recently reopened because a user reported an error that occurs when
refreshing an already opened dashboard after the dashboard build has changed.
This can occur when upgrading or downgrading.
#2074 explores a larger issue about a redirection that occurs when loading the
dashboard JS. However, the actual issue that users are experiencing happens
because `index_bundle.js` is being cached when it should not be.
Even if the hash of the JS bundle changes, users can see (on the current edge)
that browsers do in fact cache `index_bundle.js`.
The easiest way I reproduced this was:
1. Install `edge-19.12.3`
2. `linkerd dashboard` (and keep the tab open)
3. Uninstall `edge-19.12.3`
4. Install `stable-2.5.0`
5. `linkerd dashboard`
6. Refresh in all browsers: users will observe that the `edge-19.12.3` dashboard
still renders (with all of its new additions) even though `stable-2.5.0` is
installed with its older theme.
Safari and Firefox both cached the file (screenshots omitted); Chrome was not
as easy to reproduce.
## Solution
This change modifies the response headers only for requests for `index_bundle.js`
from the server, to ensure caching does not take place: mainly, `no-cache` is
changed to `no-store`, and `must-revalidate` is now included.
`no-store` and `must-revalidate` are redundant on some browsers, but both are
required to cover all browsers (and versions).
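Concretely, the idea is a middleware along these lines (a sketch, not the dashboard server's exact code):
```go
package main

import "net/http"

// noStore wraps a handler so its responses are never cached: no-store
// and must-revalidate overlap, but together they cover all browsers.
func noStore(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "no-store, must-revalidate")
		h.ServeHTTP(w, r)
	})
}
```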
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Due to wrong snake casing, the issuance lifetime setting was not reflected when installing through Helm. This commit solves that problem.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Use `kind export kubeconfig` instead of `scp`
Followup to #3864
[comment](https://github.com/linkerd/linkerd2/pull/3864#discussion_r360976473)
Stop moving the kubeconfig file between the GitHub Actions env and the
build server with `scp`; instead use `kind export kubeconfig`.
* Replaced deprecated '--loglevel debug' flag with '--verbosity 3'
We were ignoring events like
```
MountVolume.SetUp failed for volume .* : couldn't propagate object cache: timed out waiting for the condition
```
but as of k8s 1.16 those got replaced by more precise messages, like
```
MountVolume.SetUp failed for volume "linkerd-identity-token-cm4fn" :failed to sync secret cache: timed out waiting for the condition
MountVolume.SetUp failed for volume "prometheus-config" : failed to sync configmap cache: timed out waiting for the condition
```
This was causing sporadic CI test failures like
[here](https://github.com/linkerd/linkerd2/runs/368424822#step:7:562)
So I'm including another regex for that.
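A minimal sketch of the kind of allowlist this implies (hypothetical helper names; the real test code may differ):
```go
package testutil

import "regexp"

// knownEventRegexes matches pod events that are expected noise in CI,
// including the more precise cache-sync messages emitted by k8s 1.16.
var knownEventRegexes = []*regexp.Regexp{
	regexp.MustCompile(`couldn't propagate object cache: timed out waiting for the condition`),
	regexp.MustCompile(`failed to sync (secret|configmap) cache: timed out waiting for the condition`),
}

// isKnownEvent reports whether an event message can safely be ignored.
func isKnownEvent(msg string) bool {
	for _, re := range knownEventRegexes {
		if re.MatchString(msg) {
			return true
		}
	}
	return false
}
```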
Re: 96c41f8a1e
* The `linkerd-cni` chart should set proper annotations/labels for the namespace
When installing through Helm, the `linkerd-cni` chart will (by default)
install itself under the same namespace ("linkerd") where the `linkerd` chart
will be installed afterwards, so it needs to set up the proper annotations and labels.
* Fix Helm install when disabling init containers
To install linkerd using Helm after having installed linkerd's CNI plugin, one needs to pass `--set noInitContainer=true`.
But to determine whether to use init containers or not, we weren't
evaluating that flag; instead we evaluated `Values.proxyInit`, which is indeed
null when installing through the CLI but not when installing with Helm. So
init containers were being added despite having passed `--set
noInitContainer=true`.
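As a sketch of the corrected gate (not the exact chart source; assuming the chart exposes the `noInitContainer` value shown above), the injection template should key off that value rather than `Values.proxyInit`:
```
{{- if not .Values.noInitContainer }}
initContainers:
- name: linkerd-init
  # ... iptables setup elided ...
{{- end }}
```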
* Upgrade `kind` to v0.6.1
Fixes #3852
Upgraded `/bin/kind` to pull v0.6.1.
Also have `workflow.yml` use `KUBECONFIG` explicitly for setting the
location of the config file, now that `kind get kubeconfig-path` has
been deprecated (check
https://github.com/kubernetes-sigs/kind/releases/tag/v0.6.0 for detailed
info).
Note that on the build server the kind binary for this version is
`kind-0.6.1`, leaving the `kind` binary still pointing to v0.5.1 until
this gets merged and all the PR branches pick it up.
* update flags to shorter names
* add tests for the same
* fix control plane trace flag
* add tests for control-plane tracing install
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This PR restructures how the array of apiRequests is constructed in the
`ResourceDetail` component to reduce unnecessary data requests. In the case of a
Pod detail page, we will no longer query the API for a list of pods in a
namespace, or request metrics for those pods, which we do for all other resource
detail pages.
Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
Fixes #3801
This will package and build the `linkerd2-cni` chart from the
`charts/linkerd2-cni` directory and update our Helm Hub's `index.yaml`
file to index it.
This will only be run in the `chart_deploy` job of our GitHub Actions
workflow when an edge/stable tag is pushed.
Once that happens, users will be able to install the chart with a
command like:
```
helm install linkerd-edge/linkerd2-cni
```
Docs update will follow.
* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)
This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer (see the sketch below).
* Bump versions for proxy-init to v1.3.0
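A minimal sketch of that parse-and-validate behavior (hypothetical helper, assuming inclusive ranges like `8000-8999`):
```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseAndValidate validates a comma-separated list of ports and
// inclusive ranges (e.g. "25,443,8000-8999") while keeping the
// configured value as a string for proxy-init to expand later.
func parseAndValidate(ports string) error {
	for _, tok := range strings.Split(ports, ",") {
		bounds := strings.Split(strings.TrimSpace(tok), "-")
		if len(bounds) > 2 {
			return fmt.Errorf("ranges expect a single hyphen: %q", tok)
		}
		vals := make([]int, 0, 2)
		for _, b := range bounds {
			p, err := strconv.Atoi(b)
			if err != nil || p < 1 || p > 65535 {
				return fmt.Errorf("invalid port %q in %q", b, ports)
			}
			vals = append(vals, p)
		}
		if len(vals) == 2 && vals[0] > vals[1] {
			return fmt.Errorf("range %q is inverted", tok)
		}
	}
	return nil
}
```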
Signed-off-by: Paul Balogh <javaducky@gmail.com>
In various integration tests we're not showing stderr when a failure
happens, thus hiding some possibly useful debugging info.
E.g. in the latest CI failures, commands like `linkerd update` were
failing with no indication of why.
* Changes for edge-19.12.3
Signed-off-by: Charles Pretzer <charles@buoyant.io>
* CHANGES.md updates based on feedback
Signed-off-by: Charles Pretzer <charles@buoyant.io>
* Fix flag name
Signed-off-by: Charles Pretzer <charles@buoyant.io>
Fixes #3444. Fixes #3443.
## Background and Behavior
This change adds support for the destination service to resolve `Get` requests whose `Path` parameter contains a service cluster IP or pod IP. It returns the stream of endpoints, just as if `Get` had been called with the service's authority. This lays the groundwork for allowing the proxy to secure TCP connections with TLS, by letting it do destination lookups for the SO_ORIG_DST of TCP connections. When that IP address corresponds to a service cluster IP or pod IP, the destination service will return the endpoints stream, including the pod metadata required to establish identity.
Prior to this change, attempting to look up an IP address in the destination service would result in an `InvalidArgument` error.
Updating the `GetProfile` method to support IP address lookups is out of scope, and attempts to look up an IP address with `GetProfile` will result in `InvalidArgument`.
## Implementation
We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by IP. `IPWatcher` maintains a mapping of cluster IPs to service IDs and translates a subscription to an IP address into a subscription to the corresponding service ID using the underlying `EndpointsWatcher`.
Since the service name is no longer always inferable directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
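A rough sketch of the mapping's shape (illustrative types only, not the actual implementation):
```go
package destination

import "sync"

// ServiceID identifies a Kubernetes Service by namespace and name.
type ServiceID struct{ Namespace, Name string }

// IPWatcher maintains the cluster-IP-to-service mapping so that an
// IP-based Get can be translated into an ordinary service subscription
// on the underlying EndpointsWatcher.
type IPWatcher struct {
	mu   sync.RWMutex
	byIP map[string]ServiceID
}

// Lookup resolves a cluster IP to the service that owns it, if any.
func (w *IPWatcher) Lookup(clusterIP string) (ServiceID, bool) {
	w.mu.RLock()
	defer w.mu.RUnlock()
	id, ok := w.byIP[clusterIP]
	return id, ok
}
```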
## Testing
This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:
```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```
Then lookups can be issued using the destination client:
```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```
Service cluster IPs and pod IPs can be used as the `path` argument.
Signed-off-by: Alex Leong <alex@buoyant.io>
This release adds a defense mechanism to ensure that resolutions are
released when the associated balancer becomes idle and should have
been dropped from the proxy.
Furthermore, the proxy is now more selective as to which gRPC status
codes are considered "failures" in metrics.
---
* Classify some gRPC status codes as non-errors (linkerd/linkerd2-proxy#395)
* discover: Timeout stalled resolutions (linkerd/linkerd2-proxy#401)
The Kubernetes docs recommend a common set of labels for resources:
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/#labels
Add the following 3 labels to all control-plane workloads:
```
app.kubernetes.io/name: controller # or destination, etc
app.kubernetes.io/part-of: Linkerd
app.kubernetes.io/version: edge-X.Y.Z
```
Fixes #3816
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Inject preStop hook into the proxy sidecar container to stop it last
This commit adds support for a graceful-shutdown technique that is used
by some Kubernetes administrators, while a more permanent upstream
solution is being discussed in
https://github.com/kubernetes/kubernetes/issues/65502
The problem is that the RollingUpdate strategy does not guarantee that all
traffic will be sent to a new pod _before_ the previous pod is removed.
Kubernetes is internally an event-driven system, and when a pod is being
terminated, several processes can receive the event simultaneously.
If an Ingress Controller gets the event too late, or processes it more
slowly than Kubernetes removes the pod from its Service, user requests
will continue flowing into a black hole.
According [to the documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods)
> 1. If one of the Pod’s containers has defined a `preStop` hook,
> it is invoked inside of the container. If the `preStop` hook is still
> running after the grace period expires, step 2 is then invoked with
> a small (2 second) extended grace period.
>
> 2. The container is sent the `TERM` signal. Note that not all
> containers in the Pod will receive the `TERM` signal at the same time
> and may each require a preStop hook if the order in which
> they shut down matters.
This commit adds support for the `preStop` hook that can be configured
in three forms:
1. As the command-line argument `--wait-before-exit-seconds` for the
`linkerd inject` command.
2. As the `linkerd2` Helm chart value `Proxy.WaitBeforeExitSeconds`.
3. As the `config.alpha.linkerd.io/wait-before-exit-seconds` annotation.
If configured, it will add the following `preStop` hook to the proxy container
definition:
```yaml
lifecycle:
  preStop:
    exec:
      command:
      - /bin/bash
      - -c
      - sleep {{.Values.Proxy.WaitBeforeExitSeconds}}
```
To get the maximum benefit from the option, the main container should have
its own `preStop` hook containing a `sleep` command with a smaller period
than the one set for the proxy sidecar, and neither must exceed the
`terminationGracePeriodSeconds` configured for the entire pod.
An example of a rendered Kubernetes resource where
`.Values.Proxy.WaitBeforeExitSeconds` is equal to `40`:
```yaml
# application container
lifecycle:
  preStop:
    exec:
      command:
      - /bin/bash
      - -c
      - sleep 20
# linkerd-proxy container
lifecycle:
  preStop:
    exec:
      command:
      - /bin/bash
      - -c
      - sleep 40
terminationGracePeriodSeconds: 160 # for the entire pod
```
Fixes #3747
Signed-off-by: Eugene Glotov <kivagant@gmail.com>
Replaces the deprecated theme.spacing.unit in the TapQueryForm component
with theme.spacing(1), as part of the upgrade to Material-UI v4.
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
This PR pauses the network activity when the dashboard is not visible, resuming
it as soon as the user goes back to it. To do that, we are using the
react-page-visibility library.
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
This PR updates Material-UI from v3.6.1 to v4.7.1. The Material-UI
icon library has also been updated from v3.0.1 to v4.5.1.
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
Closes #3483.
This PR refactors and simplifies breadcrumb text pluralization. The redesigned
dashboard added a view that shows the user a list of all pods, deployments, etc.
in a namespace. The breadcrumb navigation text needed to be tweaked to correctly
pluralize the resource type selected.
Closes #3764.
This PR fixes an issue where the dashboard would cut off the bottom of the
Community Updates posts (displayed in an iframe) if the browser height was
shorter than the height of the iframe. Related to [#605 in the linkerd website
repo](https://github.com/linkerd/website/pull/605).
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
* Added checks for cert correctness
* Add warning checks for approaching expiration
* Add unit tests
* Improve unit tests
* Address comments
* Address more comments
* Prevent upgrade from breaking proxies when issuer cert is overwritten (#3821)
* Address more comments
* Add gate to upgrade cmd that checks that all proxies' roots work with the identity issuer that we are updating to
* Address comments
* Enable use of upgrade to modify both roots and issuer at the same time
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Closes #3778.
Fixes a formatting issue in the dashboard Tap/Top form where if a longer
resource name was selected, the placement of the buttons was off.
Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
This PR addresses recent JS unit test failures on CI by:
* Upgrading yarn from 1.7.0 to 1.21.1 (the current stable version) in the Dockerfile
and GitHub Actions workflow
* Running the yarn installation with the --network-concurrency 1 flag, setting the
maximum number of concurrent network requests to 1, suggested as a fix here:
https://github.com/yarnpkg/yarn/issues/2629
v2.80.0 fixed a problem where the destination controller client's
connection receive window could become exhausted, preventing additional
updates from the controller. The connection window has been increased
from 64K to 1MB to prevent a single stalled stream from blocking others.
Furthermore, discovery for IP addresses has been disabled in the proxy,
as the control plane does not yet support these resolutions. This
additionally lessens the load on the destination controller client.
---
* profiles: Eagerly read profiles off the wire (linkerd/linkerd2-proxy#397)
* router: Ensure that the purge task completes (linkerd/linkerd2-proxy#396)
* app-core: Add `accept` context with peer addr (linkerd/linkerd2-proxy#398)
* Remove default for destination lookup subnets (linkerd/linkerd2-proxy#399)
* Configure the HTTP/2 connection window to 1MB (linkerd/linkerd2-proxy#400)