When running linkerd in HA mode, a cluster can be broken by bringing down the proxy-injector.
Add a label to the MWC namespace selector so that any namespace carrying it is skipped.
Fixes #3346
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
## Summary
[kind](https://github.com/kubernetes-sigs/kind) has been a helpful tool for running local Kubernetes clusters and
testing linkerd builds. Once images are built with `bin/docker-build`, the
images must be loaded into the kind cluster.
This script should be run after `bin/docker-build` and will load the images into
the specified kind cluster.
Example:
```
$ bin/docker-build
$ kind get clusters # show available clusters to load images on to
kleimkuhler
$ bin/kind-load kleimkuhler
$ ./target/cli/linux/linkerd install | kubectl apply -f -
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* make identity use grpc server with prom metrics
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* linting fix
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Fix injector timeout under high load
Fixes #3358
When retrieving a pod owner, we were hitting the k8s API directly because
at injection time the informer might not have been informed about the
existence of the parent object.
Under a large number of injection requests, this led to the k8s API requests
being throttled, the proxy-injector getting blocked, and the webhook requests
timing out.
Now we'll hit the shared informer first, and hit the k8s API only when
the informer doesn't return anything. After a few injection requests for
the same owner, the informer should have been updated.
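The cache-first lookup described above can be sketched roughly as follows. This is a minimal illustration, not the injector's actual code: `Owner`, `OwnerGetter`, `mapGetter`, and `lookupOwner` are hypothetical names, and both the shared informer and the k8s API client are modeled as simple in-memory lookups.

```go
package main

import (
	"errors"
	"fmt"
)

// Owner is a hypothetical stand-in for a pod's parent object (e.g. a Deployment).
type Owner struct {
	Kind string
	Name string
}

// OwnerGetter is a hypothetical lookup interface; the informer cache and the
// direct API client are both modeled as implementations of it.
type OwnerGetter interface {
	GetOwner(namespace, name string) (*Owner, error)
}

var errNotFound = errors.New("owner not found")

// mapGetter is an in-memory OwnerGetter that counts how often it is called.
type mapGetter struct {
	owners map[string]*Owner
	calls  int
}

func (g *mapGetter) GetOwner(namespace, name string) (*Owner, error) {
	g.calls++
	if o, ok := g.owners[namespace+"/"+name]; ok {
		return o, nil
	}
	return nil, errNotFound
}

// lookupOwner consults the cache first and falls back to the API only on a miss.
func lookupOwner(cache, api OwnerGetter, namespace, name string) (*Owner, error) {
	if o, err := cache.GetOwner(namespace, name); err == nil {
		return o, nil
	}
	return api.GetOwner(namespace, name)
}

func main() {
	cache := &mapGetter{owners: map[string]*Owner{}}
	api := &mapGetter{owners: map[string]*Owner{
		"emojivoto/emoji": {Kind: "Deployment", Name: "emoji"},
	}}

	// Cache miss: the API is consulted.
	o, err := lookupOwner(cache, api, "emojivoto", "emoji")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("owner kind:", o.Kind, "API calls:", api.calls)

	// Simulate the informer catching up; the next lookup skips the API.
	cache.owners["emojivoto/emoji"] = o
	if _, err := lookupOwner(cache, api, "emojivoto", "emoji"); err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("API calls after cache update:", api.calls)
}
```

The call counter on the fake API client makes the intended effect visible: once the cache knows the owner, repeated injection requests no longer add load on the API.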
Testing:
Scaling an emoji deployment to 1000 replicas, and after waiting for a
couple of minutes:
Before:
```bash
# a portion of the pods doesn't get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
109
$ kubectl -n kube-system logs -f kube-apiserver-minikube | grep 'failing.*timeout'
.... (lots of errors)
```
After:
```bash
# all the pods get injected
$ kubectl -n emojivoto get po | grep ./1 | wc -l
0
$ kubectl -n kube-system logs -f kube-apiserver-minikube | grep 'failing.*timeout'
```
This change updates the internals of the proxy's client to the
Destination controller. Other than some minor fixes to the client's
backoff logic, no user-facing changes are expected.
* Split service discovery into composable components (linkerd/linkerd2-proxy#341)
* logging: update `tracing` and `tracing-subscriber` (linkerd/linkerd2-proxy#352)
* resolve: Do not send the 'k8s' scheme (linkerd/linkerd2-proxy#356)
This PR disables the `Start` button in the dashboard's top routes view if there
is no namespace or resource type selected.
Previously, clicking `Start` on the top routes tab with empty namespace and
resource fields would result in a bad request error.
Signed-off-by: pierdipi <pierangelodipilato@gmail.com>
If the namespace is controlled by an external tool or can't be installed
with Helm, disable its installation.
Fixes #3412
Signed-off-by: Eugene Glotov <kivagant@gmail.com>
* Update prometheus cadvisor config to only keep container resources metrics
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Drop unused large metric
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Fix unit test
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Siggy's feedback
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Fix unit test
Signed-off-by: Ivan Sim <ivan@buoyant.io>
Added a few comments in the Chart.yaml files to clarify that some
versions don't need to be updated.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Trim certs and keys in the Helm charts
Fixes #3419
When installing through the CLI the installation will fail if the certs
are malformed, so this only concerns the Helm templates.
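As a rough illustration of the trimming idea, here is a sketch using Go's `text/template` with a `trim` function backed by `strings.TrimSpace`. The template text, the `Cert` key, and the `render` helper are assumptions made for this example; they are not copied from the charts.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// certTemplate is a hypothetical one-line template, not taken from the charts.
const certTemplate = `cert: {{ trim .Cert }}`

// render executes the template with surrounding whitespace stripped from the value.
func render(cert string) (string, error) {
	// Register strings.TrimSpace under the name "trim" for use in the template.
	funcs := template.FuncMap{"trim": strings.TrimSpace}
	t, err := template.New("cert").Funcs(funcs).Parse(certTemplate)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, map[string]string{"Cert": cert}); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	// A value pasted with stray leading and trailing whitespace.
	out, err := render("\n  PLACEHOLDER-CERT-DATA  \n\n")
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```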
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Have CI push the Helm artifacts into GCS
- Added missing OWNERS and README files
- Added maintainers section to Chart.yaml
- Changed NOTES.txt so it points to the installation of the CLI
- Set the proxy-init version to v1.1.0 in values.yaml
Ref #3256
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The integration tests check for known k8s events using a regex. This
regex included an incorrect pattern that prepended a failure reason and
object, rather than simply the event message we were trying to match on.
This resulted in failures such as:
https://github.com/linkerd/linkerd2/runs/217872818#step:6:476
Fix the regex to only check for the event message. Also explicitly
differentiate reason, object, and message in the log output.
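A minimal sketch of the idea, assuming a simplified `Event` type and an illustrative message pattern (neither is taken from the test suite): match on the message alone, and label reason, object, and message separately when logging.

```go
package main

import (
	"fmt"
	"regexp"
)

// Event is a hypothetical, simplified view of a Kubernetes event.
type Event struct {
	Reason  string
	Object  string
	Message string
}

// knownMessage is an illustrative pattern, not the one used by the tests.
var knownMessage = regexp.MustCompile(`^Back-off restarting failed container$`)

// isKnown matches on the event message alone, ignoring reason and object.
func isKnown(e Event) bool {
	return knownMessage.MatchString(e.Message)
}

// describe renders reason, object, and message as separately labelled fields.
func describe(e Event) string {
	return fmt.Sprintf("reason: [%s] object: [%s] message: [%s]", e.Reason, e.Object, e.Message)
}

func main() {
	e := Event{
		Reason:  "BackOff",
		Object:  "pod/web-0",
		Message: "Back-off restarting failed container",
	}
	fmt.Println(isKnown(e))
	fmt.Println(describe(e))
}
```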
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Last changes before submitting to the Helm incubator
- Added missing OWNERS and README files
- Added maintainers section to Chart.yaml
- Changed NOTES.txt so it points to the installation of the CLI
- Set the proxy-init version to v1.1.0 in values.yaml
- Added missing ProfileValidator vars, and add 'do not edit' comment to the Identity.Issuer.CrtExpiryAnnotation value
- Added new self-hosted repo
- Added option to bin/helm-build
- Added DisableHeartBeat to README
Ref #3256
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The repo depended on an old version of client-go. It also depended on
stern, which itself depended on an old version of client-go, making
client-go upgrade non-trivial.
Update the repo to client-go v12.0.0, and also replace stern with a
fork.
This fork of stern includes the following changes:
- updated to use Go Modules
- updated to use client-go v12.0.0
- fixed log line interleaving:
  - https://github.com/wercker/stern/issues/96
- based on:
  - 8723308e46

Fixes #3382
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
We're getting flakey `KillPodSandbox` events in the integration tests:
https://github.com/linkerd/linkerd2/runs/216505657#step:6:427
This is despite adding a regex for these events in #3380.
Modify the KillPodSandbox event regex to match on a broader set of
strings.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The kind clusters booted by the integration tests each had to pull
Prometheus and proxy-init images from the internet during linkerd
install.
Preemptively pull the images from the internet once, then execute
`kind load` commands for each of the clusters prior to starting
integration tests.
Depends on #3397
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
GitHub Actions has been running unit and integration tests, in parallel
with Travis running those same tests, and also handling master merges
and tags.
This change completes the transition to GitHub Actions, removing all
references to Travis. Similar to Travis, GitHub Actions now acts on
master merges and tag pushes by pushing Docker images to gcr.io, and
running integration tests against a GKE cluster.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR updates dashboard dependencies and the babel config file to resolve a
Prototype Pollution vulnerability in an older version of `set-package` which is
used by babel, jest and webpack.
This edge release adds traffic splits into the Linkerd dashboard as well as a
variety of other improvements.
* CLI
  * Improved the error message when the CLI cannot connect to Kubernetes
    (thanks @alenkacz!)
  * Added `--address` flag to `linkerd dashboard` (thanks @bmcstdio!)
* Controller
  * Fixed an issue where the proxy-injector had insufficient RBAC permissions
  * Added support for disabling the heartbeat cronjob (thanks @kevtaylor!)
* Proxy
  * Decreased proxy Docker image size by removing bundled debug tools
  * Fixed an issue where the incorrect content-length could be set for GET
    requests with bodies
* Web UI
  * Added trafficsplits as a resource to the dashboard, including a
    trafficsplit detail page
* Internal
  * Added support for Kubernetes 1.16
Signed-off-by: Alex Leong <alex@buoyant.io>
The controller Docker image included 7 Go binaries (destination,
heartbeat, identity, proxy-injector, public-api, sp-validator, tap),
each roughly 35MB, with similar dependencies.
Change each controller binary into subcommands of a single `controller`
binary, decreasing the controller Docker image size from 315MB to 38MB.
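The single-binary layout can be sketched as a small dispatcher. The component names come from the list above; the `command` type, `newCommands`, `dispatch`, and the placeholder bodies are assumptions for illustration and do not reflect how the real `controller` binary is wired.

```go
package main

import (
	"errors"
	"fmt"
)

// command is a hypothetical entry point for one controller component.
type command func(args []string) string

// newCommands registers a placeholder entry point for each component name.
func newCommands() map[string]command {
	names := []string{
		"destination", "heartbeat", "identity", "proxy-injector",
		"public-api", "sp-validator", "tap",
	}
	cmds := make(map[string]command, len(names))
	for _, name := range names {
		name := name // capture a per-iteration copy for the closure
		cmds[name] = func(args []string) string {
			return fmt.Sprintf("running %s with %d args", name, len(args))
		}
	}
	return cmds
}

// dispatch selects the subcommand named by args[0] and passes it the rest.
func dispatch(cmds map[string]command, args []string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("expected a subcommand")
	}
	cmd, ok := cmds[args[0]]
	if !ok {
		return "", fmt.Errorf("unknown subcommand: %s", args[0])
	}
	return cmd(args[1:]), nil
}

func main() {
	// A real binary would pass os.Args[1:]; a fixed slice keeps the sketch deterministic.
	out, err := dispatch(newCommands(), []string{"tap", "example-arg"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Because all components are compiled into one binary, their shared dependencies are linked only once, which is where the size reduction comes from.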
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `proxy` and `web` Docker images were 161MB and 186MB, respectively.
Most of the space was tools installed into the `linkerd.io/base` image.
Decrease `proxy` and `web` Docker images to 73MB and 90MB, respectively.
Switch these images to be based off of `debian:stretch-20190812-slim`.
Also set `-ldflags "-s -w"` for `proxy-identity` and `web`. Modify
`linkerd.io/base` to also be based off of
`debian:stretch-20190812-slim`, update tag to `2019-09-04.01`.
Fixes #3383
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Disable heartbeat by default
Signed-off-by: Kevin Taylor <kevtaylor@expedia.com>
* Address review
Signed-off-by: Kevin Taylor <kevtaylor@expedia.com>
* Remove tabs in values
Signed-off-by: Kevin Taylor <kevtaylor@expedia.com>
GitHub Action secrets are intentionally not available to forked PRs.
This causes the integration tests that require those secrets to fail.
Modify GitHub Actions such that they only run for non-forked PRs.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Couple of injection events fixes
When generating events in quick succession against the same target, client-go issues a PATCH request instead of a POST, so we need the extra RBAC permission.
Also we have an informer on pods, so we also need the "watch" permission
for them, whose omission was causing an error entry in the logs.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Fixes #3356
Kubernetes 1.16 removes some API groups that were already deprecated. From the
k8s blog post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):
```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
  v1.16. Migrate to the policy/v1beta1 API, available since v1.10. Existing
  persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
  served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
  Migrate to the apps/v1 API, available since v1.9. Existing persisted
  data can be retrieved/updated via the apps/v1 API.
```
Previous PRs had already made this change at the Helm templates level,
but we still needed to do it at the API calls and tests.
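The migration rules quoted above can be summarized in a small lookup. This is only a sketch of the mapping; `preferredAPIVersion` is a hypothetical helper, not a function from the codebase or from client-go.

```go
package main

import "fmt"

// preferredAPIVersion returns the API version to use on Kubernetes 1.16 for
// the kinds named in the deprecation notice; other inputs are returned unchanged.
func preferredAPIVersion(kind, apiVersion string) string {
	switch kind {
	case "PodSecurityPolicy":
		if apiVersion == "extensions/v1beta1" {
			return "policy/v1beta1"
		}
	case "DaemonSet", "Deployment", "StatefulSet", "ReplicaSet":
		switch apiVersion {
		case "extensions/v1beta1", "apps/v1beta1", "apps/v1beta2":
			return "apps/v1"
		}
	}
	return apiVersion
}

func main() {
	fmt.Println(preferredAPIVersion("Deployment", "extensions/v1beta1"))
	fmt.Println(preferredAPIVersion("PodSecurityPolicy", "extensions/v1beta1"))
}
```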
The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The existing Travis CI setup requires additional integrations and
permissions with GitHub, and also lacks some flexibility around job
dependency management.
Introduce a new CI workflow based on GitHub Actions. This initial
workflow performs the same CI work that Travis does, and will initially
run in parallel:
- Go unit tests
- JS unit tests
- Go lint
- Validate Go deps
- Integration tests (deep, upgrade, helm)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>