Commit Graph

174 Commits

Author SHA1 Message Date
Zahari Dichev 3538944d03
Unify trust anchors terminology (#4047)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:12:46 +02:00
Alejandro Pedraza 7584f88b69
Integration test flakiness: endpoint version mismatch warning (#4020)
Updated regex for ignoring version mismatch warning events. It was only
applied for '-*upgrade' namespaces.

It is safe to ignore such warnings because the endpoint controller
retries when that happens, and if after many retries it still can't then
a different warning is thrown which is _not_ whitelisted and will make
the build fail.
https://github.com/kubernetes/kubernetes/blob/v1.16.6/pkg/controller/endpoint/endpoints_controller.go#L334-L348

This PR also removes logging matches on expected warnings, to avoid
cluttering the CI log.
2020-02-11 13:58:11 -05:00
Alex Leong 41d58c0905
Remove dependency on httpbin in egress integration test (#3987)
* Use linkerd.io for egress test instead of httpbin

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-02-07 19:35:51 -05:00
Alejandro Pedraza f78bef4ffd
Address external_issuer_test.go flakiness (#4018)
From time to time we get this CI error when testing the external issuer
mechanism:
```
Test script: [external_issuer_test.go] Params:
[--linkerd-namespace=l5d-integration-external-issuer
--external-issuer=true]
--- FAIL: TestExternalIssuer (33.61s)
    external_issuer_test.go:89: Received error while ensuring test app
    works (before cert rotation): Error stripping header and trailing
    newline; full output:
    FAIL
```
https://github.com/alpeb/linkerd2/runs/428273855?check_suite_focus=true#step:6:526

This is caused by the "backend" pod not receiving traffic from
"slow-cooker" in a timely manner.
After those pods are deployed we're only checking that "backend" is
ready, but not "slow-cooker", so this change adds that check.

I'm also removing the `TestHelper.CheckDeployment` call because it's
redundant, since the preceding `TestHelper.CheckPods` is already checking
that the deployment has all the specified replicas ready.
2020-02-07 19:33:32 -05:00
Zahari Dichev 5cd3655b1e
Update helm overrides to match stable ones (#4025)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-07 09:37:18 -05:00
Zahari Dichev c609564dc8
Add helm upgrade integration test (#3976)
In light of the breaking changes we are introducing to the Helm chart and the convoluted upgrade process (see linkerd/website#647) an integration test can be quite helpful. This simply installs latest stable through helm install and then upgrades to the current head of the branch.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-04 08:27:46 +02:00
Anantha Krishnan 7f026c96f6 Added check for TapAPI service (#3689)
Added check for TapAPI service

Fixes #3462
Added a checker using `kube-aggregator` client

Signed-off-by: Ananthakrishnan <kannan4mi3@gmail.com>
2020-01-27 20:07:07 +02:00
Kevin Leimkuhler 8c9498def2
Temporarily fix flaky integration test (#3968)
*From the comment disabling the test*:

#2316

The response from `http://httpbin.org/get` is non-deterministic--returning
either `http://..` or `https://..` for GET requests. As #2316 mentions, this
test should not have an external dependency on this endpoint. As a workaround
for edge-20.1.3, temporarily disable this test and re-enable with one that has
reliable behavior.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-01-24 10:32:10 -08:00
Kevin Leimkuhler 53baecb382
Changes for edge-20.1.3 (#3966)
## edge-20.1.3

* CLI
  * Introduced `linkerd check --pre --linkerd-cni-enabled`, used when the CNI
    plugin is used, to check it has been properly installed before proceeding
    with the control plane installation
  * Added support for the `--as-group` flag so that users can impersonate
    groups for Kubernetes operations (thanks @mayankshah160!)
* Controller
  * Fixed an issue where an override of the Docker registry was not being
    applied to debug containers (thanks @javaducky!)
  * Added check for the Subject Alternate Name attributes to the API server
    when access restrictions have been enabled (thanks @javaducky!)
  * Added support for arbitrary pod labels so that users can leverage the
    Linkerd provided Prometheus instance to scrape for their own labels
    (thanks @daxmc99!)
  * Fixed an issue with CNI config parsing

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-01-23 16:55:21 -08:00
Zahari Dichev e30b9a9c69
Add checks for CNI plugin (#3903)
As part of the effort to remove the "experimental" label from the CNI plugin, this PR introduces CNI checks to `linkerd check`.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-17 12:11:19 +02:00
Alex Leong cabe2a13e6
Add distributed tracing integration test (#3920)
This integration test roughly follows the [Linkerd guide to distributed tracing](https://linkerd.io/2019/10/07/a-guide-to-distributed-tracing-with-linkerd/).

We deploy the tracing components (oc-collector and jaeger), emojivoto, and nginx as an ingress to do span initiation.  We then watch the jaeger API and check that a trace is eventually created that includes traces from all of the data plane components: nginx, linkerd-proxy, web, voting, and emoji.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-16 10:57:15 -08:00
Zahari Dichev 0ee409eaa3 Fix inject integration tests failing due to wrong golden files (#3923)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-14 12:47:16 -05:00
Alex Leong 93a81dce97
Change default proxy log level to "warn,linkerd=info" (#3908)
Fixes #3901 

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-01-09 14:22:06 -08:00
Oliver Gould d3d8d855f0
proxy: v2.83.0-experimental (#3897)
This is an experimental release that includes large changes to the
proxy's request buffering and backpressure infrastructure.

Please exercise caution before deploying this proxy version into mission
critical environments.
2020-01-09 14:12:46 -08:00
Zahari Dichev b4266c93de
Ensure proxy cert does not exceed the lifetime of the certs in the trust chain (#3893)
Fixes a problem where the identity service can issue a certificate that has a longer lifetime than the issuer certificate. This was causing the proxies to end up using an invalid TLS certificate. This fix ensures that the lifetime of the issued certificate is not greater than the smallest lifetime of the certs in the issuer cert trust chain.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-01-09 09:52:29 +02:00
Tarun Pothulapati eac06b973c Move common values to global (#3839)
* move values to global in template

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update inject and cli

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update unit tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove controllerImageVersion from global

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* move identity out of global

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update var name and comments

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update bin and helm tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* update helm readme

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix proxy config

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* fix proxy config indentation

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* more linting issues

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

* remove unnecessary lines

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-01-06 14:31:41 -08:00
Alejandro Pedraza 6f8574a633
Add event regex to ignore in integration test (#3884)
We were ignoring events like
```
MountVolume.SetUp failed for volume .* : couldn't propagate object cache: timed out waiting for the condition
```

but as of k8s 1.16 those got replaced by more precise messages, like
```
MountVolume.SetUp failed for volume "linkerd-identity-token-cm4fn" :failed to sync secret cache: timed out waiting for the condition
MountVolume.SetUp failed for volume "prometheus-config" : failed to sync configmap cache: timed out waiting for the condition
```

This was causing sporadic CI test failures like
[here](https://github.com/linkerd/linkerd2/runs/368424822#step:7:562)

So I'm including another regex for that.
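
A rough Go sketch of how such an allow-list pattern could cover both the old and the new messages. The variable name `knownMountWarning`, the helper `isKnownWarning`, and the exact pattern are assumptions for illustration; the regex actually used in the test suite may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// knownMountWarning is an illustrative pattern (not the one in the repo).
// It accepts the pre-1.16 "couldn't propagate object cache" message and the
// newer "failed to sync secret|configmap cache" variants, with or without a
// space after the colon.
var knownMountWarning = regexp.MustCompile(
	`MountVolume\.SetUp failed for volume .* : ?` +
		`(couldn't propagate object cache|failed to sync (secret|configmap) cache)` +
		`: timed out waiting for the condition`)

// isKnownWarning reports whether an event message may be ignored.
func isKnownWarning(msg string) bool {
	return knownMountWarning.MatchString(msg)
}

func main() {
	msgs := []string{
		`MountVolume.SetUp failed for volume "linkerd-identity-token-cm4fn" :failed to sync secret cache: timed out waiting for the condition`,
		`MountVolume.SetUp failed for volume "prometheus-config" : failed to sync configmap cache: timed out waiting for the condition`,
		`Back-off restarting failed container`,
	}
	for _, m := range msgs {
		fmt.Printf("%v\t%s\n", isKnownWarning(m), m)
	}
}
```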

Re: 96c41f8a1e
2020-01-06 14:22:15 -05:00
Alejandro Pedraza 4abd778558
Don't hide stderr in integration tests (#3855)
In various integration tests we're not showing stderr when a failure
happens, thus hiding some possibly useful debugging info.
E.g. in the latest CI failures, commands like `linkerd update` were
failing with no visible reason why.
2019-12-20 09:27:18 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in an `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip.   `IPWatcher` maintains a mapping of clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
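
The wrapping described above might look roughly like the following Go sketch. All type and method names here (`ServiceID`, `endpointsSubscriber`, `ipWatcher`, `recorder`) are hypothetical stand-ins chosen for the example; they are not the signatures used in the destination service, and pod IP lookups are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// ServiceID is a hypothetical stand-in for a namespaced service identifier.
type ServiceID struct{ Namespace, Name string }

// endpointsSubscriber stands in for the underlying EndpointsWatcher.
type endpointsSubscriber interface {
	Subscribe(id ServiceID, port uint32) error
}

// ipWatcher keeps a clusterIP -> service id map and forwards IP-based
// subscriptions to the wrapped watcher using the resolved service id.
type ipWatcher struct {
	services  map[string]ServiceID
	endpoints endpointsSubscriber
}

var errUnknownIP = errors.New("no service found for IP")

func (w *ipWatcher) Subscribe(clusterIP string, port uint32) error {
	id, ok := w.services[clusterIP]
	if !ok {
		return errUnknownIP
	}
	return w.endpoints.Subscribe(id, port)
}

// recorder is a fake endpoints watcher that records what it was asked for.
type recorder struct{ got []ServiceID }

func (r *recorder) Subscribe(id ServiceID, _ uint32) error {
	r.got = append(r.got, id)
	return nil
}

func main() {
	rec := &recorder{}
	w := &ipWatcher{
		services:  map[string]ServiceID{"192.168.54.78": {Namespace: "emojivoto", Name: "web-svc"}},
		endpoints: rec,
	}
	fmt.Println(w.Subscribe("192.168.54.78", 80), rec.got)
	fmt.Println(w.Subscribe("10.0.0.1", 80))
}
```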

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Sergio C. Arteaga 56c8a1429f Increase the comprehensiveness of check --pre (#3701)
* Increase the comprehensiveness of check --pre

Closes #3224

Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com>
2019-12-18 13:27:32 -05:00
Zahari Dichev f88b55e36e Tls certs checks (#3813)
* Added checks for cert correctness
* Add warning checks for approaching expiration
* Add unit tests
* Improve unit tests
* Address comments
* Address more comments
* Prevent upgrade from breaking proxies when issuer cert is overwritten (#3821)
* Address more comments
* Add gate to upgrade cmd that checks that all proxies' roots work with the identity issuer that we are updating to
* Address comments
* Enable use of upgrade to modify both roots and issuer at the same time

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-16 14:49:32 -08:00
Tarun Pothulapati 2f492a77fb Switch to Smaller-Case in Linkerd2 and Partials Charts (#3823)
* update linkerd2, partials charts
* support install and inject workflow
* update helm docs
* update comments in values
* update helm tests
* update comments in test

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-12-13 14:48:07 -05:00
Alejandro Pedraza 2a4c71760d
Enable cert rotation test to work with dynamic namespaces, take two (#3795)
* Enable cert rotation test to work with dynamic namespaces

This PR adds support for dynamic cert generation when running the cert rotation integration tests. This allows us to avoid baking in the namespace in the certificate CN, thereby allowing us to run these tests on the clouds.

The tests in #3775 were failing because the second secret holding the issuer cert replacement was a leaf cert and not a root/intermediary cert capable of signing the CSRs. This is what the replacement cert looked like:

```bash
$ k -n l5d-integration-external-issuer get secrets linkerd-identity-issuer-new -ojson | jq '.data|.["tls.crt"]' | tr -d '"' | base64 -d | step certificate inspect -
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 2 (0x2)
    Signature Algorithm: ECDSA-SHA256
        Issuer: CN=identity.l5d-integration-external-issuer.cluster.local
        Validity
            Not Before: Dec 6 19:16:08 2019 UTC
            Not After : Dec 5 19:16:28 2020 UTC
        Subject: CN=identity.l5d-integration-external-issuer.cluster.local
        Subject Public Key Info:
            Public Key Algorithm: ECDSA
                Public-Key: (256 bit)
                X:
                    93:d5:fa:f8:d1:44:4f:9a:8c:aa:0c:9e:4f:98:a3:
                    8d:28:d9:cc:f2:74:4c:5f:76:14:52:47:b9:fb:c9:
                    a3:33
                Y:
                    d2:04:74:95:2e:b4:78:28:94:8a:90:b2:fb:66:1b:
                    e7:60:e5:02:48:d2:02:0e:4d:9e:4f:6f:e9:0a:d9:
                    22:78
                Curve: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Subject Alternative Name:
                DNS:identity.l5d-integration-external-issuer.cluster.local

    Signature Algorithm: ECDSA-SHA256
         30:46:02:21:00:f6:93:2f:10:ba:eb:be:bf:77:1a:2d:68:e6:
         04:17:a4:b4:2a:05:80:f7:c5:f7:37:82:7b:b7:9c:a1:66:6a:
         e1:02:21:00:b3:65:06:37:49:06:1e:13:98:7c:cf:f9:71:ce:
         5a:55:de:f6:1b:83:85:b0:a8:88:b7:cf:21:d1:16:f2:10:f9
```
For it to be a root/intermediate cert it should have had `CA:TRUE` under the `X509v3 extensions` section.
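
For illustration, the same property can be checked programmatically. In Go's `crypto/x509`, the basic constraints extension is exposed through the `BasicConstraintsValid` and `IsCA` fields of a parsed certificate. The sketch below is not part of the PR; the helpers `canSign` and `selfSigned` and the common name are assumptions made for the example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// canSign reports whether the certificate carries basic constraints with
// CA:TRUE, i.e. whether it is usable as a root or intermediate issuer.
func canSign(cert *x509.Certificate) bool {
	return cert.BasicConstraintsValid && cert.IsCA
}

// selfSigned builds a throwaway self-signed ECDSA P-256 cert for the demo.
func selfSigned(isCA bool) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(2),
		Subject:               pkix.Name{CommonName: "identity.example.cluster.local"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
		IsCA:                  isCA,
	}
	if isCA {
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	for _, isCA := range []bool{false, true} {
		cert, err := selfSigned(isCA)
		if err != nil {
			panic(err)
		}
		fmt.Printf("generated with IsCA=%v -> canSign=%v\n", isCA, canSign(cert))
	}
}
```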

Why did the test pass sometimes? When it did pass for me, I could see in the linkerd-identity proxy logs something like:
```
ERR! [   320.964592s] linkerd2_proxy_identity::certify Received invalid ceritficate: invalid certificate: UnknownIssuer
```
so the cert retrieved from identity still was invalid but for some reason the proxy, sometimes, keeps on going despite that. And when one would delete the linkerd-identity pod, its proxy wouldn't come up at all, also showing that error.

With the changes from this branch, we no longer see that error in the logs and after deleting the linkerd-identity pod it comes back gracefully.
2019-12-11 15:50:06 -05:00
Zahari Dichev 6faf64e49f Revert "Enable cert rotation test to work with dynamic namespaces (#3775)" (#3787)
This reverts commit 0e45b9c03d.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-05 15:33:22 -05:00
Zahari Dichev 0e45b9c03d
Enable cert rotation test to work with dynamic namespaces (#3775)
This PR adds support for dynamic cert generation when running the cert rotation integration tests. This allows us to avoid baking in the namespace in the certificate CN, thereby allowing us to run these tests on the clouds.

* Enable cert rotation test to work with dynamic namespaces

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

* Address further comments

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2019-12-05 10:08:01 +02:00
Alex Leong ecb55bb1a3
Disable TestCliStatForLinkerdNamespace integration test (#3727)
https://github.com/linkerd/linkerd2/pull/3693 caused the proxy to start resolving private IP addresses with the destination service.  However, the destination service does not support IP lookups and returns failures for these lookups.  This negatively affects the destination service success rate and can cause this test to fail.  We disable this test for now until the destination service supports IP lookups.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-11-14 13:13:10 -08:00
Alejandro Pedraza 29459a50e6 Removed 'no invalid service profiles' from linkerd check test fixtures (#3724)
Followup to #3718
2019-11-14 10:52:38 -08:00
Zahari Dichev 2d224302de
Add integration test for external issuer and cert rotation flows (#3709)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-14 06:58:32 +02:00
Zahari Dichev a6ff442789
Traffic split integration test (#3649)
* Traffic split integration test

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address comments

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Display placeholder when there is no basic stats data

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-13 21:14:34 +02:00
Alejandro Pedraza 4b6254b52e
Replaced `uuid` with `uid` from linkerd-config resource (#3694)
* Replaced `uuid` with `uid` from linkerd-config resource

Fixes #3621

Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.

I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.

Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` ConfigMap.

I also added an integration test to verify the stability of the uid.
2019-11-13 13:56:01 -05:00
Alex Leong 5da1a0723d
Disable edges integration test (#3707)
The edges integration test can fail when more edges are added to the Linkerd namespace due to https://github.com/linkerd/linkerd2/issues/3706.  We disable this test until that issue can be resolved.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-11-13 10:04:40 -08:00
Zahari Dichev 6e67f1a8ca
Modify knownEventWarningsRegex regex to catch ipv6 error events (#3699)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-13 09:38:01 +02:00
Zahari Dichev 038900c27e Remove destination container from controller (#3661)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-08 14:40:25 -08:00
Zahari Dichev 7dd5dfc2ba
Check health of meshed apps before and after linkerd upgrade (#3641)
* Check stats of deployed app before and after linkerd upgrade to ensure nothing broke

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Address naming remarks

Signed-off-by: zaharidichev <zaharidichev@gmail.com>

* Improve application health checking

Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-07 20:48:12 +02:00
Zahari Dichev 1bb9d66757 Integration test for custom cluster domain (#3660)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-11-04 14:49:52 -08:00
Zahari Dichev a8170bd634
Add preinstall checks for deletion and creation of secrets (#3639)
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
2019-10-31 18:01:03 +02:00
Alex Leong befea4aff6
Add direct edges integration test (#3603)
Add an integration test which exercises the behavior when one meshed pod connects to another meshed pod by pod ip address.

The current behavior is that the Linkerd proxy will not do any lookup against the destination service for this kind of connection and will proxy directly to the SO_ORIG_DST.  This means that it will not have the identity metadata necessary to TLS the connection, and the connection will not be present in the `linkerd edges` command output.  This test validates that behavior.

The purpose of this test is to set the stage for future work which will allow the Linkerd proxy to TLS this type of connection and display it in `linkerd edges`.  The assertions in this test will be updated as part of that work.

This test will be run as part of the integration test suite.  It can also be run directly:

```
go test --failfast --mod=readonly test/install_test.go   --linkerd=(pwd)"/bin/linkerd" --k8s-context="$CTX" --integration-tests
go test -v --mod=readonly test/edges/edges_test.go  --linkerd=(pwd)"/bin/linkerd" --k8s-context="$CTX" --integration-tests
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-30 10:48:03 -07:00
Ivan Sim ff69c29f5e
Add missing package to proxy Dockerfile (#3583)
* Add missing package to proxy Dockerfile
* Fix failing 'check' integration test
* Trim whitespaces in certs comparison.

Without this change, the integration test would fail because the trust anchor
stored in the linkerd-config config map generated by the Helm renderer is
stripped of the line breaks. See charts/linkerd2/templates/_config.tpl
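
A whitespace-insensitive comparison of this kind could be sketched in Go as below. The helper name `samePEM` is hypothetical and the sample strings are placeholders rather than real certificates; the check in the repository may trim differently.

```go
package main

import (
	"fmt"
	"strings"
)

// samePEM compares two PEM strings after removing every whitespace
// character, so that differences in line breaks do not count as a mismatch.
func samePEM(a, b string) bool {
	strip := func(s string) string { return strings.Join(strings.Fields(s), "") }
	return strip(a) == strip(b)
}

func main() {
	// Placeholder payloads, not real certificate data.
	withBreaks := "-----BEGIN CERTIFICATE-----\nAAAA\nBBBB\n-----END CERTIFICATE-----\n"
	flattened := "-----BEGIN CERTIFICATE-----AAAABBBB-----END CERTIFICATE-----"
	fmt.Println(samePEM(withBreaks, flattened))
	fmt.Println(samePEM(withBreaks, strings.Replace(flattened, "BBBB", "CCCC", 1)))
}
```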

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-15 15:51:26 -07:00
Ivan Sim cf69dedf9c
Re-add the destination container to the controller spec (#3540)
* Re-add the destination container to the controller spec

This fix is necessary to avoid data plane downtime during an upgrade to
stable-2.6. All existing older proxies will continue to send requests to
this destination container, until the data plane is restarted.

On restart, the new pods will start forwarding their requests to the new
linkerd-dst service.

* Use the 2.6 destination service fqdn
* Fixed unit tests
* Fix integration test failure

Signed-off-by: Ivan Sim <ivan@buoyant.io>
2019-10-08 10:49:40 -07:00
Alejandro Pedraza c21ceb5b4e
Add integration tests for linkerd endpoints and edges (#3491)
* Add integration tests for linkerd endpoints and edges

Fixes #3477 and #3478
2019-10-01 15:46:27 -05:00
Alejandro Pedraza 6568929028
Add --disable-heartbeat flag for linkerd install|upgrade (#3439)
Fixes #278

Add `linkerd install|upgrade --disable-heartbeat` flag, and have
`linkerd check` check for the heartbeat's SA only if it's enabled.

Also added those flags into the `linkerd upgrade -h` examples.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-25 15:53:36 -05:00
Alejandro Pedraza 1653f88651
Put the destination controller into its own deployment (#3407)
* Put the destination controller into its own deployment

Fixes #3268

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-18 13:41:06 -05:00
Andrew Seigner a5a6e8ff9f
Fix integration test event regex matching (#3416)
The integration tests check for known k8s events using a regex. This
regex included an incorrect pattern that prepended a failure reason and
object, rather than simply the event message we were trying to match on.
This resulted in failures such as:
https://github.com/linkerd/linkerd2/runs/217872818#step:6:476

Fix the regex to only check for the event message. Also explicitly
differentiate reason, object, and message in the log output.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-10 13:24:22 -07:00
Andrew Seigner 9bb7b6f119
Make KillPodSandbox regex match broader (#3409)
We're getting flakey `KillPodSandbox` events in the integration tests:
https://github.com/linkerd/linkerd2/runs/216505657#step:6:427
This is despite adding a regex for these events in #3380.

Modify the KillPodSandbox event regex to match on a broader set of
strings.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-09 11:54:14 -07:00
Bruno M. Custódio 8fec756395 Add '--address' flag to 'linkerd dashboard'. (#3274)
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
2019-09-05 10:56:10 -07:00
Andrew Seigner e51af8c8a9
Add known KillPod k8s event to integration test (#3380)
FailedKillPod events were causing integration tests to fail:
https://github.com/linkerd/linkerd2/runs/212313175#step:6:409

Add FailedKillPod as a known event. Example:
https://play.golang.org/p/WV52tyZgijW

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-04 14:00:16 -07:00
Andrew Seigner a8481b721a
GitHub Actions, kind, integration test logs fixes (#3372)
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to top of `README.md`.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-09-04 12:44:27 -07:00
Alejandro Pedraza acbab93ca8
Add support for k8s 1.16 (#3364)
Fixes #3356

1.16 removes some api groups that were already deprecated. From k8s blog
post (https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/):

```
- PodSecurityPolicy: will no longer be served from extensions/v1beta1 in
v1.16.
    Migrate to the policy/v1beta1 API, available since v1.10. Existing
    persisted data can be retrieved/updated via the policy/v1beta1 API.
- DaemonSet, Deployment, StatefulSet, and ReplicaSet: will no longer be
served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 in v1.16.
    Migrate to the apps/v1 API, available since v1.9. Existing persisted
    data can be retrieved/updated via the apps/v1 API.
```

Previous PRs had already made this change at the Helm templates level,
but we still needed to do it at the API calls and tests.

The integration tests ran fine for k8s 1.12 and 1.15. They fail on 1.16
because the upgrade integration test tries to install linkerd 2.5 which is not
compatible with 1.16.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-09-04 09:59:55 -05:00
Alejandro Pedraza 368d16f23c
Fix auto-injecting pods and integration tests reporting (#3335)
* Fix auto-injecting pods and integration tests reporting

When creating an Event when auto-injection occurs (#3316) we try to
fetch the parent object to associate the event to it. If the parent
doesn't exist (like in the case of stand-alone pods) the event isn't
created. I had missed dealing with one part where that parent was
expected.

This also adds a new integration test that I verified fails before this
fix.

Finally, I removed from `_test-run.sh` some `|| exit_code=$?` that was
preventing the whole suite from reporting failure whenever one of the tests
in `/tests` failed.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-28 15:04:20 -05:00
Andrew Seigner 956d1bff06
Update warning event regex for integration test (#3336)
Kubernetes was generating events for failed readiness probes that did
not quite match the expected events regex in the install integration
test:
https://travis-ci.org/linkerd/linkerd2/jobs/577642724#L647

Update the readiness probe regex to handle these variations in events:
https://play.golang.org/p/OVGJkFNN-XA

Relates to CI failure in #3333.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-28 10:19:40 -07:00