Commit Graph

1893 Commits

Author SHA1 Message Date
Alejandro Pedraza 2ad141d27a
Exclude changes on markup files to trigger CI runs in master (#4084)
Fixes #4082

This tested fine in a fork, under various scenarios combining:
- Modify markup file in root dir
- Modify markup file in subdir
- Modify non-markup file
- In master
- In a PR
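
The scenarios above amount to a path filter: a run can be skipped only when every changed file is a markup file. A minimal sketch of that decision in Python, assuming "markup" means `.md` files (the workflow's actual patterns are not shown in this message) and using a hypothetical helper named `should_trigger_ci`:

```python
from pathlib import PurePosixPath

# Assumption: "markup" is taken to mean Markdown files only.
MARKUP_SUFFIXES = {".md"}


def should_trigger_ci(changed_files):
    """Return True if at least one changed file is not a markup file."""
    return any(
        PurePosixPath(path).suffix.lower() not in MARKUP_SUFFIXES
        for path in changed_files
    )


# Scenarios from the list above; the file names are made up for illustration.
root_markup_only = should_trigger_ci(["README.md"])
subdir_markup_only = should_trigger_ci(["docs/guide.md"])
mixed_change = should_trigger_ci(["README.md", "cli/main.go"])
print(root_markup_only, subdir_markup_only, mixed_change)
```

The check is the same for `master` pushes and PRs in this sketch; only the list of changed files differs.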
2020-02-24 13:50:35 -05:00
Christy Jacob f9b940e89d
Support for custom prometheus registry (#4041)
* feat: added prometheus Registry Option for install command
* chore: draft commit
* Draft for custom prometheus image
* Support for custom prometheus image

This PR adds support to override the default prometheus image name and use custom image names in private repositories

* Added default Prometheus Image from values.yaml

The default can be overridden by the argument given in installOptions

* chore: fixed failing check
* Fixed failing check
* Updated the tests as per the new flag
* Air-gapped installation for prometheus-image
* Air Gapped installation for Prometheus Image
* Added regex for prometheus repository/image cli option

Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
2020-02-24 09:59:29 -08:00
Saurav Tiwary 1c19e314b7
Linkerd CLI command to get control plane diagnostics (#4050)
* CLI command to fetch control plane metrics
Fixes #3116
* Add GetResponse method to return http GET response
* Implemented timeouts using waitgroups
* Refactor metrics command by extracting common code to metrics_diagnostics_util
* Refactor diagnostics to remove code duplication
* Update portforward_test for NewContainerMetricsForward function
* Lint code
* Incorporate Alex's suggestions
* Lint code
* fix minor errors
* Add unit test for getAllContainersWithPort
* Update metrics and diagnostics to store results in a buffer and print once
* Incorporate Ivan's suggestions
* consistent error handling inside diagnostics
* add coloring for the output
* spawn goroutines for each pod instead of each container
* switch back to unbuffered channel
* remove coloring in the output
* Add a long description of the command

Signed-off-by: Saurav Tiwary <srv.twry@gmail.com>
2020-02-24 09:09:54 -08:00
Kevin Leimkuhler ab4a13ab52
Add minimal release workflow (#4090)
## Motivation

A release workflow will be the only triggered workflow on `push.tags` events.

As a first step in automating the release process, it should assert that
integration tests pass once the docker images have been tagged.

Both KinD and cloud integration tests should run since they have different
sets of integration tests that they are responsible for running.

It then needs to run the `chart_deploy` job.

## Testing

This has been fully tested with a release tag push on my fork. The run can be
found [here](https://github.com/kleimkuhler/linkerd2/actions/runs/42664128)

It properly failed on `chart_deploy` because I did not want to push a test tag
helm chart.

## Solution

This workflow will:

- Build the docker images on the Packet Host
- Tag the docker images with the release tag
- Run KinD integration tests
- Run cloud integration tests
- Run `chart_deploy`

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-02-22 17:13:48 -08:00
Kevin Leimkuhler 4aac6445c4
Add script to create release tag (#4091)
## Motivation

Creating a release tag is a manual, error-prone process for the member
responsible for the release.

Additionally, the automated release project will require that a message is
included that is a copy of the recent `CHANGES.md` changes.

These steps can be scripted so that the member will just need to run a release
script.

## Solution

A `bin/create-release-tag` script will:
- Take a `$TAG` argument (maybe can remove this in the future) to use as the
  tag name
- Pull out the top section of `CHANGES.md` to use as the commit message
- Create a tag with the `$TAG` name and the release changes as the message
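
The "pull out the top section" step can be sketched as follows. This is an illustration in Python, not the contents of `bin/create-release-tag`; it assumes each release in `CHANGES.md` starts with a `## ` heading, and `top_changelog_section` is a hypothetical name:

```python
def top_changelog_section(text, heading_prefix="## "):
    """Return the first section of a changelog, heading included."""
    section = []
    for line in text.splitlines():
        if line.startswith(heading_prefix):
            if section:
                break  # second heading reached: the top section is complete
            section.append(line)
        elif section:
            section.append(line)  # body line of the top section
    return "\n".join(section).strip()


# Made-up sample data, only to show the shape of the input.
sample = "# Changes\n\n## edge-20.2.3\n\n* Fix A\n\n## edge-20.2.2\n\n* Fix B\n"
tag_message = top_changelog_section(sample)
print(tag_message)
```

The returned text would then be passed as the tag message.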

## Example

```
$ TAG="edge-20.2.3"
$ bin/create-release-tag $TAG
$ git push origin $TAG
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-02-22 16:30:17 -08:00
Alejandro Pedraza ea523a46b0
Fixed shellcheck warnings on bin/helm-build (#4080)
Followup to #4058

```
$ shellcheck -x bin/helm-build; echo $?
0
```
2020-02-21 09:51:21 -05:00
Alejandro Pedraza 9b64f0dc94
Reuse bin/helm-build in Helm integration tests (#4088)
Have the preliminary setup for the Helm integration tests use
`bin/helm-build` instead of directly calling `helm dependency update`.
This allows testing `bin/helm-build` itself, and also lints the linkerd2
and linkerd2-cni charts (the latter lint call is being added as well in this
PR).
2020-02-21 09:26:10 -05:00
Alex Leong c891b22632
Remove extraneous return (#4089)
Remove extraneous `return` which was missed in https://github.com/linkerd/linkerd2/pull/4007

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-20 16:12:49 -08:00
Alejandro Pedraza 8c12f03af8
Update gcloud configs for the chart_deploy job - Part 2 (#4087) 2020-02-20 18:10:40 -05:00
Alejandro Pedraza 262fe578b6
Update gcloud configs for the chart_deploy job (#4086) 2020-02-20 17:00:52 -05:00
Alejandro Pedraza 395ca102d5
Edge-20.2.2 release notes (#4077)
* Edge-20.2.2 release notes
2020-02-20 14:21:30 -05:00
Supratik Das 42efc1da01
Improve kubectl apply format by removing misplaced message (#4053)
* Improve kubectl apply format by removing misplaced message

Fixes #2956

Also separate stderr messages with a new line

Signed-off-by: Supratik Das <rick.das08@gmail.com>
2020-02-20 10:36:36 -05:00
Mayank Shah 7cff974a79
cli: handle panic caused by `linkerd metrics` port-forward failure (#4007)
* cli: handle `linkerd metrics` port-forward gracefully

- add return for routine in func `Init()` in case of error
- add return from func `getMetrics()` if error from `portforward.Init()`

* Remove select block at pkg/k8s/portforward.go

- It is now the caller's responsibility to call pf.Stop()

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-19 21:44:37 -08:00
Alejandro Pedraza df2011dbb2
CI: Upgrades the gcloud action to fix GKE clusters teardown issue (#4074)
Ref linkerd/linkerd2-action-gcloud#1
2020-02-19 17:24:59 -05:00
Oliver Gould dc451208d4
proxy: v2.86.0 (#4075)
This release includes the results from continued profiling & performance
analysis. In addition to modifying internals to prevent unwarranted
memory growth, we've introduced new metrics to aid in debugging and
diagnostics: a new `request_errors_total` metric exposes the number of
requests that receive synthesized responses due to proxy errors; and a
suite of `stack_*` metrics expose proxy internals that can help us
identify unexpected behavior.

---

* trace: update `tracing-subscriber` dependency to 0.2.1 (linkerd/linkerd2-proxy#426)
* Reimplement the Lock middleware with tokio::sync (linkerd/linkerd2-proxy#427)
* Add the request_errors_total metric (linkerd/linkerd2-proxy#417)
* Expose the number of service instances in the proxy (linkerd/linkerd2-proxy#428)
* concurrency-limit: Share a limit across Services (linkerd/linkerd2-proxy#429)
* profiling: add benchmark and profiling scripts (linkerd/linkerd2-proxy#406)
* http-box: Box HTTP payloads via middleware (linkerd/linkerd2-proxy#430)
* lock: Generalize to protect a guarded value (linkerd/linkerd2-proxy#431)
2020-02-19 14:24:47 -08:00
Sanni Michael aa1f200dde
Allow custom host set from helm values (#4054)
* Allow custom host set from helm values #3961

Fixes #3961

Signed-off-by: Sanni Michael Tomiwa <sannimichaelse@gmail.com>
2020-02-19 09:50:11 -05:00
Kohsheen Tiku dea8b8c547
Support for configuring service profile retries(x-linkerd-retryable) via openapi spec (#4052)
* Retry policy is manually written in a YAML file and patched into the service profile

Added support for configuring service profile retries(x-linkerd-retryable) via openapi spec

Signed-off-by: Kohsheen Tiku <kohsheen.t@gmail.com>
2020-02-18 13:07:07 -08:00
Christian Hüning 67f83cf51a
Switched figo to finleap (#4056)
Due to company mergers we're now finleap connect and not figo anymore. Therefore I'd like to update the URL and company name here.

Signed-off-by: Christian Hüning <christian.huening@finleap.com>
2020-02-18 12:09:45 -08:00
Alejandro Pedraza 77af716ab2
bin/helm-build automatically updates version in values.yaml (#4058)
* bin/helm-build automatically updates version in values.yaml

Have the Helm charts building script (`bin/helm-build`) update the
linkerd version in the `values.yaml` files according to the tagged
version, thus removing the need of doing this manually on every release.

This is akin to the update we do in `version.go` at CLI build time.
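
A rough sketch of the kind of in-place update described above, written in Python rather than shell. The key name `linkerdVersion` and the helper `set_chart_version` are assumptions for illustration; the actual script may locate and rewrite the version differently:

```python
import re


def set_chart_version(values_yaml, version, key="linkerdVersion"):
    """Replace the value of `key` on its own line, leaving other lines untouched."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}:[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids treating `version` as a regex template.
    return pattern.sub(lambda m: m.group(1) + version, values_yaml)


# Made-up values.yaml fragment.
before = "enablePodAntiAffinity: false\nlinkerdVersion: edge-20.2.2\nnamespace: linkerd\n"
after = set_chart_version(before, "edge-20.2.3")
print(after)
```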

Note that `shellcheck` is issuing some warnings about this script, but
that's on code that was already there, so that will be handled in a
followup PR.
2020-02-18 11:19:58 -05:00
Mayank Shah 3c3a4a5f5d
cli: Add label selector flag for `stat` (#4040)
* Update `linkerd-namespace` shorthand to `L`
* Add --selector (-l) flag for `stat`

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-17 13:40:07 -05:00
Kohsheen Tiku 19806e3626
Scroll functionality for linkerd top deploy/linkerd-web (#4011)
* Table obtained from linkerd top is not scrollable.

Added scroll functionality for the table.

Fixes #2558

Signed-off-by: Kohsheen Tiku <kohsheen.t@gmail.com>
2020-02-17 11:17:43 -05:00
Zahari Dichev 6fa9407318
Ensure we get the correct type out of Informer Deletion events (#4034)
Ensure we get what we expect when receiving DELETE events from the k8s Informer api

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:15:24 +02:00
Zahari Dichev 3538944d03
Unify trust anchors terminology (#4047)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:12:46 +02:00
Kevin Leimkuhler c31284e6db
Separate single Actions workflow into multiple workflows (#4039)
Depends on #4033

## Motivation

If any job fails in the current GH Actions workflow, a re-run on the same
commit SHA requires re-running *all* jobs--regardless of whether the job
already passed in the previous run.

This can be problematic when dealing with flakiness in the integration tests.

If a test fails due to flakiness in `cloud_integration_tests`, all the unit
tests, static checks, and `kind_integration_tests` are re-run which leads to
lots of waiting and dealing with the possibility of flakiness in earlier jobs.

With this change, individual workflows can now be re-run without triggering
all other jobs to complete again first.

## Solution

`workflow.yml` is now split into:
- `static_checks.yml`
- `unit_tests.yml`
- `kind_integration.yml`
- `cloud_integration.yml`

### Workflows

`static_checks.yml` performs checks related to dependencies, linting, and
formatting.

`unit_tests.yml` performs the Go and JS unit tests.

`kind_integration.yml` builds the images (on Packet or the GH Action VM) and
runs the integration tests on a KinD cluster. This workflow continues to run
for **all** PRs and pushes to `master` and tags.

`cloud_integration.yml` builds the images only on Packet. This is because
forked repositories do not need to trigger this workflow. It then creates a
unique GKE cluster and runs the integration tests on the cluster.

### The actual flow of work..

A forked repository or non-forked repository opening a PR triggers:
- `static_checks`
- `unit_tests`
- `kind_integration_tests`

These workflows all run in parallel and are individually re-runnable.

A push to `master` or tags triggers:
- `static_checks`
- `unit_tests`
- `kind_integration_tests`
- `cloud_integration_tests`

These workflows also all run in parallel, including the `docker_build` step of
both integration test workflows. This has not conflicted in testing as it
takes place on the same Packet host and just utilizes docker layer caching
well.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-02-13 09:11:30 -08:00
Mayank Shah c1b683147a
Update identity to make certs more diagnosable (#3990)
Update identity controller to make issuer certificates diagnosable if
cert validity is causing error

    - Add expiry time in identity log message
    - Add current time in identity log message
    - Emit k8s event with appropriate message
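
To illustrate why logging both the expiry time and the current time helps, here is a small Python sketch of a validity check that reports both. It is not the identity controller's code; `describe_issuer_cert` and the message wording are hypothetical:

```python
from datetime import datetime, timezone


def describe_issuer_cert(not_before, not_after, now):
    """Return a diagnostic message if the certificate is not valid at `now`, else None."""
    if now < not_before:
        return (
            f"issuer certificate is not valid until {not_before.isoformat()}; "
            f"current time is {now.isoformat()}"
        )
    if now > not_after:
        return (
            f"issuer certificate expired at {not_after.isoformat()}; "
            f"current time is {now.isoformat()}"
        )
    return None


utc = timezone.utc
message = describe_issuer_cert(
    not_before=datetime(2019, 2, 13, tzinfo=utc),
    not_after=datetime(2020, 2, 12, tzinfo=utc),
    now=datetime(2020, 2, 13, tzinfo=utc),
)
print(message)
```

With both timestamps in one message, clock skew and genuine expiry can be told apart at a glance.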


Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-13 11:21:41 +02:00
Kevin Leimkuhler a460ada166
Run all PRs on GH Actions VMs (#4033)
Run all PRs on GH Actions VMs

## Motivation

Currently all pushes to master branch, tags, and Linkerd org member PRs run
the `kind_integration_host` job on the same Packet host.

This means that parallel jobs spin up KinD clusters with a unique name and
sandbox the tests so that they do not clash.

This is problematic for a few reasons:
* There is a limit on the number of jobs we can run in parallel due to
  resource constraints.
* Workflow cancellation and re-runs conflict when the cancelled run deletes
  its namespaces and the running one expects them to be present.
* Flakiness has been observed when running multiple KinD clusters,
  resulting in inconsistent timeouts and docker errors.

## Solution

This change moves all KinD integration testing to GH Actions VMs. This is
currently what forked repository workflows do.

There is no longer a `docker_pull` job as its responsibilities have been moved
into one of the `kind_integration_tests` steps.

The renamed `kind_integration_tests` job is responsible for **all** PR
workflows and has steps specific to forked and non-forked repositories.

### Non-forked repository PRs

The Packet host is still used for building docker images as leveraging docker
layer caching is still valuable--a build can be as fast as 30 seconds compared
to ~12 minutes.

Loading the docker images into the KinD cluster on the GH Action VM is done by
saving the Packet host docker images as image archives, and loading those
directly into the local KinD cluster.

### Forked repository PRs

`docker_build` has been sped up slightly by sending `docker save` processes to
the background.

Docker layer caching cannot be leveraged since there are no SSH secrets
available, so the `artifact-upload`/`artifact-download` actions introduced in
#TODO are still used.

### Cleanup

This PR also includes some general cleanup such as:
* Some job names have been renamed to better reflect their purpose or match
  the current naming pattern.
* Environment variables are set earlier in jobs as a separate step if they are
  currently exported multiple times.
* Indentation was really bothering me because it switches back and forth
  throughout the workflow file, so lists are now indented.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-02-12 14:38:05 -08:00
Alex Leong ec51434eb9
Show traffic split metrics from sources in all namespaces (#3967)
Fixes #3562 

When a pod in one namespace sends traffic to a service which is the apex of a traffic split in another namespace, that traffic is not displayed in the `linkerd stat trafficsplit` output.  This is because when we do a Prometheus query for traffic to the traffic split, we supply a Prometheus label selector to only select traffic sources in the namespace of the traffic split.

Since any pod in any namespace can send traffic to the apex service of a traffic split, we must look at all possible sources of traffic, not just the ones in the same namespace.
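
The difference between the old and new queries comes down to one label in the selector. A simplified Python sketch follows; the label names (`dst_service`, `dst_namespace`, `namespace`) and both helper functions are illustrative assumptions, not the exact query the stat command builds:

```python
def label_selector(labels):
    """Render a dict as a Prometheus label selector string."""
    body = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return "{" + body + "}"


def traffic_split_selector(apex, split_namespace, restrict_sources):
    """Select traffic sent to the apex service of a traffic split."""
    labels = {"dst_service": apex, "dst_namespace": split_namespace}
    if restrict_sources:
        # Old behavior: only count sources in the traffic split's own namespace.
        labels["namespace"] = split_namespace
    return label_selector(labels)


old_selector = traffic_split_selector("webapp", "default", restrict_sources=True)
new_selector = traffic_split_selector("webapp", "default", restrict_sources=False)
print(old_selector)
print(new_selector)
```

Dropping the source-namespace label is what lets traffic from pods in other namespaces show up in the output.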

Before:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS   RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m         -     -             -             -             -
webapp-split   webapp   webapp-2     100m         -     -             -             -             -
```

After:

```
$ bin/linkerd stat ts
NAME           APEX     LEAF       WEIGHT   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99
webapp-split   webapp   webapp       900m    80.00%   1.4rps          31ms          99ms        2530ms
webapp-split   webapp   webapp-2     100m    60.00%   0.2rps          35ms          93ms          99ms
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-12 09:21:59 -08:00
Alejandro Pedraza 7584f88b69
Integration test flakiness: endpoint version mismatch warning (#4020)
Updated regex for ignoring version mismatch warning events. It was only
applied for '-*upgrade' namespaces.

It is safe to ignore such warnings because the endpoint controller
retries when that happens, and if after many retries it still can't then
a different warning is thrown which is _not_ whitelisted and will make
the build fail.
https://github.com/kubernetes/kubernetes/blob/v1.16.6/pkg/controller/endpoint/endpoints_controller.go#L334-L348

This PR also removes logging matches on expected warnings, to avoid
cluttering the CI log.
2020-02-11 13:58:11 -05:00
Zahari Dichev 20f8da0e61
Remove experimental from CNI (#4038)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-11 20:10:55 +02:00
Christy Jacob f46f372e08
Refactoring to suppress eslint warnings jsx-a11y/click-events-have-key-events (#3995)
* Refactoring to suppress eslint warnings

Upon enabling jsx-a11y/click-events-have-key-events flag in .eslintrc , a couple of warnings are raised because it is recommended
to provide a onKeyPress, onKeyDown or onKeyUp event handler for every onClick event handler.

The code has been refactored to follow the eslint spec.

Fixes #3926

Signed-off-by: Christy Jacob <christyjacob4@gmail.com>

* Remove jsx-a11y/click-events-have-key-events flag from eslintrc

	* During the review it was requested to remove the flag
	* The requested change has been done
2020-02-11 09:28:02 -08:00
Kohsheen Tiku 21f1d85c80
Refactoring to suppress eslint warnings (#3996)
* Refactoring to suppress eslint warnings

Enabling the eslint/no-param-reassign throws some warnings with the existing code.

Made necessary changes to suppress the warnings

Fixes #3927

Signed-off-by: Kohsheen Tiku <kohsheen.t@gmail.com>
2020-02-11 10:12:00 -05:00
Zahari Dichev 9b29a915d3
Improve cni resources labels (#4032)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-11 12:10:08 +02:00
Alejandro Pedraza 3ba66f6f9d
Fix flakey TestGetProfiles (#3965)
Fixes #3332

Fixes the very rare test failure
```
--- FAIL: TestGetProfiles (0.33s)
    --- FAIL: TestGetProfiles/Returns_server_profile (0.11s)
            server_test.go:228: Expected 1 or 2 updates but got 3:
            [retry_budget:<retry_ratio:0.2 min_retries_per_second:10
            ttl:<seconds:10 > >  routes:<condition:<path:<regex:"/a/b/c"
            > > metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > >
            routes:<condition:<path:<regex:"/a/b/c" > >
            metrics_labels:<key:"route" value:"route1" >
            timeout:<seconds:10 > > retry_budget:<retry_ratio:0.2
            min_retries_per_second:10 ttl:<seconds:10 > > ]
            FAIL
            FAIL  github.com/linkerd/linkerd2/controller/api/destination
            0.624s
```
that occurs when a third unexpected stream update occurs, when the fake
API takes more time to notify its listeners about the resources created.

For all the nasty details check #3332
2020-02-07 19:43:29 -05:00
Alex Leong 41d58c0905
Remove dependency on httpbin in egress integration test (#3987)
* Use linkerd.io for egress test instead of httpbin

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-07 19:35:51 -05:00
Alejandro Pedraza f78bef4ffd
Address external_issuer_test.go flakiness (#4018)
From time to time we get this CI error when testing the external issuer
mechanism:
```
Test script: [external_issuer_test.go] Params:
[--linkerd-namespace=l5d-integration-external-issuer
--external-issuer=true]
--- FAIL: TestExternalIssuer (33.61s)
    external_issuer_test.go:89: Received error while ensuring test app
    works (before cert rotation): Error stripping header and trailing
    newline; full output:
    FAIL
```
https://github.com/alpeb/linkerd2/runs/428273855?check_suite_focus=true#step:6:526

This is caused by the "backend" pod not receiving traffic from
"slow-cooker" in a timely manner.
After those pods are deployed we're only checking that "backend" is
ready, but not "slow-cooker", so this change adds that check.

I'm also removing the `TestHelper.CheckDeployment` call because it's
redundant, since the preceding `TestHelper.CheckPods` is already checking
that the deployment has all the specified replicas ready.
2020-02-07 19:33:32 -05:00
Alejandro Pedraza 1e8223e143
Allow CI to run concurrent builds in master (#4001)
* Allow CI to run concurrent builds in master

Fixes #3911

Refactors the `cloud_integration` test to run in separate GKE clusters
that are created and torn down on the fly.
It leverages a new "gcloud" github action that is also used to set up
gcloud in other build steps (`docker_deploy` and `chart_deploy`).

The action also generates unique names for those clusters, based on the
git commit SHA and `run_id`, a recently introduced variable that is
unique per CI run and available to all the jobs.
This fixes part of #3635 in that CI runs on the same SHA don't interfere
with one another (in the `cloud_integration` test; still to do for
`kind_integration`).
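
As an illustration of why combining the commit SHA with `run_id` avoids collisions, here is a minimal Python sketch. The name format, the `testing` prefix, and the `cluster_name` helper are all assumptions; the workflow's real naming scheme is not given in this message:

```python
def cluster_name(commit_sha, run_id, prefix="testing"):
    """Build a per-run cluster name from a short SHA and the CI run id."""
    return f"{prefix}-{commit_sha[:7]}-{run_id}"


# Two CI runs on the same commit get different names. The SHA and run ids are made up.
first_run = cluster_name("1e8223e143abcdef", 101)
second_run = cluster_name("1e8223e143abcdef", 102)
print(first_run, second_run)
```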

The "gcloud" GH action is hosted under its own repo in https://github.com/linkerd/linkerd2-action-gcloud
2020-02-07 16:23:36 -05:00
Kevin Leimkuhler ae7d98b4fe
Run integration tests for forked repos (#4002)
* Allow CI to run concurrent builds in master

Fixes #3911

Refactors the `cloud_integration` test to run in separate GKE clusters
that are created and torn down on the fly.
It leverages a new "gcloud" github action that is also used to set up
gcloud in other build steps (`docker_deploy` and `chart_deploy`).

The action also generates unique names for those clusters, based on the
git commit SHA and `run_id`, a recently introduced variable that is
unique per CI run and available to all the jobs.
This fixes part of #3635 in that CI runs on the same SHA don't interfere
with one another (in the `cloud_integration` test; still to do for
`kind_integration`).

The "gcloud" GH action is supported by `.github/actions/gcloud/index.js`
that has a couple of dependencies. To avoid having to commit
`node_modules`, after every change to that file one must run
```bash
# only needed the first time
npm i -g @zeit/ncc

cd .github/actions/gcloud
ncc build index.js
```
which generates the self-contained file
`.github/actions/gcloud/dist/index.js`.
(This last part might get easier in the future after other refactorings
outside this PR).

* Run integration tests for forked repos

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Address reviews

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Address more reviews

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Move some conditionals to jobs

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Change job name

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Move more conditionals to job level

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Added more flags to 'gcloud container clusters create' and consolidated
'create' and 'destroy' into ' action'

* Run kind cleanup only for non-forked PRs

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Got rid of cloud_cleanup by using a post hook in the gcloud action

* Removed cluster naming responsibility from the gcloud action

* Consolidate .gitignore statements

* Removed bin/_gcp.sh

* Change name of Kind int. test job

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Ensure `kind_cleanup` still runs on cancelled host CI runs

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Add reviews

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Update workflow comment

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Split index.js into setup.js and destroy.js

* trigger build

* Moved the gcloud action into its own repo

* Full version for the gcloud GH action

* Rebase back to master

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Remove additional changes

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Remove additional changes

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

* Trigger CI

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>

Co-authored-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
2020-02-07 12:27:04 -08:00
Mayank Shah 6c6514f169
cli: Update 'check' command to validate HA configuration (#3942)
Add check for number of control plane replicas for HA

Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>
2020-02-07 19:07:11 +02:00
Dax McDonald 76d3285247
Use correct go module file syntax (#4021)
The correct syntax for the go module file is
go MAJOR.MINOR
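
A quick way to check a `go` directive against the `MAJOR.MINOR` form described here, sketched in Python. The helper `valid_go_directive` is hypothetical and only encodes the rule as stated in this commit message:

```python
import re

# Matches "go MAJOR.MINOR" with no patch component.
GO_DIRECTIVE = re.compile(r"^go (\d+)\.(\d+)$")


def valid_go_directive(line):
    """Return True if the line has the form `go MAJOR.MINOR`."""
    return GO_DIRECTIVE.match(line.strip()) is not None


print(valid_go_directive("go 1.13"))
print(valid_go_directive("go 1.13.4"))
```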

Signed-off-by: Dax McDonald <dax@rancher.com>
2020-02-07 07:58:54 -08:00
Zahari Dichev 5cd3655b1e
Update helm overrides to match stable ones (#4025)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-07 09:37:18 -05:00
Alex Leong b9caae0cd9
stable-2.7.0 (#4019)
## stable-2.7.0

This release adds support for integrating Linkerd's PKI with an external
certificate issuer such as [`cert-manager`] as well as streamlining the
certificate rotation process in general. For more details about cert-manager
and certificate rotation, see the
[docs](https://linkerd.io/2/tasks/use_external_certs/). This release also
includes performance improvements to the dashboard, reduced memory usage of the
proxy, various improvements to the Helm chart, and much much more.

To install this release, run: `curl https://run.linkerd.io/install | sh`

**Upgrade notes**: This release includes breaking changes to our Helm charts.
Please see the [upgrade instructions](https://linkerd.io/2/tasks/upgrade/#upgrade-notice-stable-270).

**Special thanks to**: @alenkacz, @bmcstdio, @daxmc99, @droidnoob, @ereslibre,
@javaducky, @joakimr-axis, @JohannesEH, @KIVagant, @mayankshah1607,
@Pothulapati, and @StupidScience!

**Full release notes**:

* CLI
  * Updated the mTLS trust anchor checks to eliminate false positives caused by
    extra trailing spaces
  * Reduced the severity level of the Linkerd version checks, so that they
    don't fail when the external version endpoint is unreachable
    (thanks @mayankshah1607!)
  * Added a new `tap` APIService check to aid with uncovering Kubernetes API
    aggregation layer issues (thanks @droidnoob!)
  * Introduced CNI checks to confirm the CNI plugin is installed and ready;
    this is done through `linkerd check --pre --linkerd-cni-enabled` before
    installation and `linkerd check` after installation if the CNI plugin is
    present
  * Added support for the `--as-group` flag so that users can impersonate
    groups for Kubernetes operations (thanks @mayankshah1607!)
  * Added HA specific checks to `linkerd check` to ensure that the `kube-system`
    namespace has the `config.linkerd.io/admission-webhooks:disabled`
    label set
  * Fixed a problem causing the presence of unnecessary empty fields in
    generated resource definitions (thanks @mayankshah1607)
  * Added the ability to pass both port numbers and port ranges to
    `--skip-inbound-ports` and `--skip-outbound-ports` (thanks to @javaducky!)
  * Increased the comprehensiveness of `linkerd check --pre`
  * Added TLS certificate validation to `check` and `upgrade` commands
  * Added support for injecting CronJobs and ReplicaSets, as well as the ability
    to use them as targets in the CLI subcommands
  * Introduced the new flags `--identity-issuer-certificate-file`,
    `--identity-issuer-key-file` and `--identity-trust-anchors-file` to `linkerd
    upgrade` to support trust anchor and issuer certificate rotation
  * Added a check that ensures using `--namespace` and `--all-namespaces`
    results in an error as they are mutually exclusive
  * Added a `Dashboard.Replicas` parameter to the Linkerd Helm chart to allow
    configuring the number of dashboard replicas (thanks @KIVagant!)
  * Removed redundant service profile check (thanks @alenkacz!)
  * Updated `uninject` command to work with namespace resources
    (thanks @mayankshah1607!)
  * Added a new `--identity-external-issuer` flag to `linkerd install` that
    configures Linkerd to use certificates issued by an external certificate
    issuer (such as `cert-manager`)
  * Added support for injecting a namespace to `linkerd inject` (thanks
    @mayankshah1607!)
  * Added checks to `linkerd check --preinstall` ensuring Kubernetes Secrets
    can be created and accessed
  * Fixed `linkerd tap` sometimes displaying incorrect pod names for unmeshed
    IPs that match multiple running pods
  * Made `linkerd install --ignore-cluster` and `--skip-checks` faster
  * Fixed a bug causing `linkerd upgrade` to fail when used with
    `--from-manifest`
  * Made `--cluster-domain` an install-only flag (thanks @bmcstdio!)
  * Updated `check` to ensure that proxy trust anchors match configuration
    (thanks @ereslibre!)
  * Added condition to the `linkerd stat` command that requires a window size
    of at least 15 seconds to work properly with Prometheus
* Controller
  * Fixed an issue where an override of the Docker registry was not being
    applied to debug containers (thanks @javaducky!)
  * Added check for the Subject Alternative Name attributes to the API server
    when access restrictions have been enabled (thanks @javaducky!)
  * Added support for arbitrary pod labels so that users can leverage the
    Linkerd provided Prometheus instance to scrape for their own labels
    (thanks @daxmc99!)
  * Fixed an issue with CNI config parsing
  * Fixed a race condition in the `linkerd-web` service
  * Updated Prometheus to 2.15.2 (thanks @Pothulapati)
  * Increased minimum kubernetes version to 1.13.0
  * Added support for pod ip and service cluster ip lookups in the destination 
    service
  * Added recommended kubernetes labels to control-plane
  * Added the `--wait-before-exit-seconds` flag to linkerd inject for the proxy 
    sidecar to delay the start of its shutdown process (a huge commit from 
    @KIVagant, thanks!)
  * Added a pre-sign check to the identity service 
  * Fixed inject failures for pods with security context capabilities
  * Added `conntrack` to the `debug` container to help with connection tracking
    debugging
  * Fixed a bug in `tap` where mismatched cluster and trust domains caused
    `tap` to hang
  * Fixed an issue in the `identity` RBAC resource which caused start up errors
    in k8s 1.6 (thanks @Pothulapati!)
  * Added support for using trust anchors from an external certificate issuer
    (such as `cert-manager`) to the `linkerd-identity` service
  * Added support for headless services (thanks @JohannesEH!)
* Helm
  * **Breaking change**: Renamed `noInitContainer` parameter to `cniEnabled`
  * **Breaking change**: Updated Helm charts to follow best practices (thanks
    @Pothulapati and @javaducky!)
  * Fixed an issue with `helm install` where the lists of ignored inbound and
    outbound ports would not be reflected
  * Fixed the `linkerd-cni` Helm chart not setting proper namespace annotations
    and labels
  * Fixed certificate issuance lifetime not being set when installing through
    Helm
  * Updated the Helm build to retain previous releases
  * Moved CNI template into its own Helm chart
* Proxy
  * Fixed an issue that could cause the OpenCensus exporter to stall
  * Improved error classification and error responses for gRPC services
  * Fixed a bug where the proxy could stop receiving service discovery updates,
    resulting in 503 errors
  * Improved debug/error logging to include detailed contextual information
  * Fixed a bug in the proxy's logging subsystem that could cause the proxy to
    consume memory until the process is OOM killed, especially when the proxy was
    configured to log diagnostic information
  * Updated proxy dependencies to address RUSTSEC-2019-0033, RUSTSEC-2019-0034,
    and RUSTSEC-2020-02
* Web UI
  * Fixed an error when refreshing an already open dashboard when the Linkerd
    version has changed
  * Increased the speed of the dashboard by pausing network activity when the 
    dashboard is not visible to the user
  * Added support for CronJobs and ReplicaSets, including new Grafana dashboards
    for them
  * Added `linkerd check` to the dashboard in the `/controlplane` view
  * Added request and response headers to the `tap` expanded view in the
    dashboard
  * Added filter to namespace select button
  * Improved how empty tables are displayed
  * Added `Host:` header validation to the `linkerd-web` service, to protect
    against DNS rebinding attacks
  * Made the dashboard sidebar component responsive
  * Changed the navigation bar color to the one used on the [Linkerd](https://linkerd.io/) website
* Internal
  * Added validation to incoming sidecar injection requests that ensures
    the value of `linkerd.io/inject` is either `enabled` or `disabled`
    (thanks @mayankshah1607)
  * Upgraded the Prometheus Go client library to v1.2.1 (thanks @daxmc99!)
  * Fixed an issue causing `tap`, `injector` and `sp-validator` to use 
    old certificates after `helm upgrade` due to not being restarted
  * Fixed an incomplete Swagger definition of the tap API, which caused benign
    error logging in the kube-apiserver
  * Removed the destination container from the linkerd-controller deployment as
    it now runs in the linkerd-destination deployment
  * Allowed the control plane to be injected with the `debug` container
  * Updated proxy image build script to support HTTP proxy options
    (thanks @joakimr-axis!)
  * Updated the CLI `doc` command to auto-generate documentation for the proxy
    configuration annotations (thanks @StupidScience!)
  * Added new `--trace-collector` and `--trace-collector-svc-account` flags to
    `linkerd inject` that configure the OpenCensus trace collector used by
    proxies in the injected workload (thanks @Pothulapati!)
  * Added a new `--control-plane-tracing` flag to `linkerd install` that enables
    distributed tracing in the control plane (thanks @Pothulapati!)
  * Added distributed tracing support to the control plane (thanks
    @Pothulapati!)
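
The injection-request validation listed under "Internal" above can be pictured with a minimal Go sketch. Only the annotation key `linkerd.io/inject` and the accepted values `enabled` and `disabled` come from the changelog entry. The function name, the error text, and the choice to accept a missing annotation are assumptions made for illustration, not Linkerd's actual implementation.

```go
package main

import "fmt"

// injectAnnotation is the annotation key named in the changelog entry.
const injectAnnotation = "linkerd.io/inject"

// validateInjectAnnotation is a hypothetical helper. It returns an error
// when the annotation is present with a value other than "enabled" or
// "disabled". Treating a missing annotation as valid is an assumption.
func validateInjectAnnotation(annotations map[string]string) error {
	value, ok := annotations[injectAnnotation]
	if !ok {
		return nil // annotation absent: nothing to validate
	}
	switch value {
	case "enabled", "disabled":
		return nil
	default:
		return fmt.Errorf("invalid value %q for annotation %s: expected \"enabled\" or \"disabled\"",
			value, injectAnnotation)
	}
}

func main() {
	// Try one accepted value, one rejected value, and a missing annotation.
	candidates := []map[string]string{
		{injectAnnotation: "enabled"},
		{injectAnnotation: "true"},
		{},
	}
	for _, annotations := range candidates {
		fmt.Println(validateInjectAnnotation(annotations))
	}
}
```

Rejecting unknown values up front means a typo such as `true` surfaces as an explicit error instead of a workload that is silently left uninjected.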

[`cert-manager`]: https://github.com/jetstack/cert-manager

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-06 10:58:59 -08:00
Alex Leong 770da05b1e
edge-20.2.1 (#4012)
This edge release is a release candidate for `stable-2.7` and fixes an issue
where the proxy could consume inappropriate amounts of memory.

* Proxy
  * Fixed a bug in the proxy's logging subsystem that could cause the proxy to
    consume memory until the process is OOMKilled, especially when the proxy was
    configured to log diagnostic information
  * Fixed the proxy to properly emit `grpc-status` headers when signaling proxy
    errors to gRPC clients
* Internal
  * Updated to Rust 1.40
  * Updated certain proxy dependencies to address RUSTSEC-2019-0033,
    RUSTSEC-2019-0034, and RUSTSEC-2020-02

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-04 12:26:34 -08:00
Tarun Pothulapati 1a188f1361
Move controlPlaneTracing helm field to globals (#4000)
This has already been moved into global, as in a54c5b6b65/charts/partials/templates/_trace.tpl (L2)

but the change was missed in `values.yaml`

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-02-04 11:10:04 -08:00
Oliver Gould afcbebd30a
proxy: v2.85.0 (#4010)
This release fixes a bug in the proxy's logging subsystem that could
cause the proxy to consume memory until the process is OOMKilled,
especially when the proxy was configured to log diagnostic information.

The proxy also now properly emits `grpc-status` headers when signaling
proxy errors to gRPC clients.

This release upgrades the proxy's Rust version, the `http` crate
dependency to address RUSTSEC-2019-0033 and RUSTSEC-2019-0034, and the
`prost` crate dependency has been patched to address RUSTSEC-2020-02.

---

* internal: Introduce a locking middleware (linkerd/linkerd2-proxy#408)
* Update to Rust 1.40 with new Cargo.lock format (linkerd/linkerd2-proxy#410)
* Update http to v0.1.21 (linkerd/linkerd2-proxy#412)
* internal: Split retry, http-classify, and http-metrics (linkerd/linkerd2-proxy#409)
* Actually update http to v0.1.21 (linkerd/linkerd2-proxy#413)
* patch `prost` 0.5 to pick up security fix (linkerd/linkerd2-proxy#414)
* metrics: Make Counter & Gauge atomic (linkerd/linkerd2-proxy#415)
* Set grpc-status headers on dispatch errors (linkerd/linkerd2-proxy#416)
* trace: update `tracing-subscriber` to 0.2.0-alpha.4 (linkerd/linkerd2-proxy#418)
* discover: Warn on discovery error (linkerd/linkerd2-proxy#422)
* router: Avoid large up-front allocations (linkerd/linkerd2-proxy#421)
* errors: Set correct HTTP version on responses (linkerd/linkerd2-proxy#424)
* app: initialize tracing prior to parsing env vars (linkerd/linkerd2-proxy#425)
* trace: update tracing-subscriber to 0.2.0-alpha.6 (linkerd/linkerd2-proxy#423)
2020-02-04 10:41:50 -08:00
Zahari Dichev 9f4aa27842
Refactor identity check tests (#3988)
This PR breaks up the tests of the identity-related checks to make the code more readable.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-04 18:08:26 +02:00
Zahari Dichev c609564dc8
Add helm upgrade integration test (#3976)
In light of the breaking changes we are introducing to the Helm chart and the convoluted upgrade process (see linkerd/website#647), an integration test can be quite helpful. This test simply installs the latest stable release through `helm install` and then upgrades to the current head of the branch.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-04 08:27:46 +02:00
Paul Balogh 9888418276
Added NISC to list of adopters (#4009)
Signed-off-by: Paul Balogh <javaducky@gmail.com>
2020-02-03 17:44:59 -08:00
Christy Jacob 15b1e46b4f
Refactoring to suppress eslint warnings react/no-did-update-set-state flag (#3974)
* Refactoring to suppress eslint warnings

Upon enabling the react/no-did-update-set-state flag in .eslintrc, a couple of warnings are raised because it is bad practice to call setState() within the componentDidUpdate() hook.

The code has been refactored to follow the eslint spec.

During code review, it was pointed out that the react/no-did-update-set-state rule is enabled by default, so the flag was removed from .eslintrc.

Fixes #3928

Signed-off-by: Christy Jacob <christyjacob4@gmail.com>
2020-01-30 17:34:07 -08:00
Mithun Arunan a54c5b6b65
add linkerd2 adopter vernacular.ai (#3989)
Signed-off-by: Mithun Arunan <mithun1848@gmail.com>
2020-01-30 08:04:47 -08:00
Ivan Sim 69ce7ab069
Added change log of edge-20.1.4 (#3986)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
2020-01-28 13:15:49 -08:00