Commit Graph

485 Commits

Author SHA1 Message Date
Alejandro Pedraza 578d4a19e9
Have the tap APIServer refresh its cert automatically (#5388)
Followup to #5282, fixes #5272 in its totality.

This follows the same pattern as the injector/sp-validator webhooks, leveraging `FsCredsWatcher` to watch for changes in the cert files.

To reuse code from the webhooks, we moved `updateCert()` to `creds_watcher.go`, and `run()` as well (which now is called `ProcessEvents()`).

The `TestNewAPIServer` test in `apiserver_test.go` was removed as it really was just testing two things: (1) that `apiServerAuth` doesn't error which is already covered in the following test, and (2) that the golib call `net.Listen("tcp", addr)` doesn't error, which we're not interested in testing here.

## How to test

To test that the injector/sp-validator functionality is still correct, you can refer to #5282

The steps below are similar, but focused towards the tap component:

```bash
# Create some root cert
$ step certificate create linkerd-tap.linkerd.svc ca.crt ca.key   --profile root-ca --no-password --insecure

# configure tap's caBundle to be that root cert
$ cat > linkerd-overrides.yml << EOF
tap:
  externalSecret: true
  caBundle: |
    < ca.crt contents>
EOF

# Install linkerd
$ bin/linkerd install --config linkerd-overrides.yml | k apply -f -

# Generate an intermediatery cert with short lifespan
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc

# Create the secret using that intermediate cert
$ kubectl create secret tls \
  linkerd-tap-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Rollout the tap pod for it to pick the new secret
$ k -n linkerd rollout restart deploy/linkerd-tap

# Tap should work
$ bin/linkerd tap -n linkerd deploy/linkerd-web
req id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :method=GET :authority=10.42.0.11:9994 :path=/metrics
rsp id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :status=200 latency=1779µs
end id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true duration=65µs response-length=1709B

# Wait 5 minutes and rollout tap again
$ k -n linkerd rollout restart deploy/linkerd-tap

# You'll see in the logs that the cert expired:
$ k -n linkerd logs -f deploy/linkerd-tap tap
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45866: remote error: tls: bad certificate
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45870: remote error: tls: bad certificate

# Recreate the secret
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc
$ k -n linkerd delete secret linkerd-tap-k8s-tls
$ kubectl create secret tls \
  linkerd-tap-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Wait a few moments and you'll see the certs got reloaded and tap is working again
time="2020-12-15T16:03:42Z" level=info msg="Updated certificate" addr=":8089" component=apiserver
```
2020-12-16 17:46:14 -05:00
Tarun Pothulapati 589f36c4c2
jaeger: add check sub command (#5295)
* jaeger: add check sub command

This adds a new `linkerd jaeger check` command to have checks w.r.t
jaeger extension. This is similar to that of the `linkerd check` cmd.
As jaeger is a separate package, It was a bit complex for this to work
as not all types and fields from healthcheck pkg are public, Helper
funcs were used to mitigate this.

This has the following changes:

- Adds a new `check.go` file under the jaeger extension pkg
- Moves some commonly needed funcs and types from `cli/cmd/check.go`
  and `pkg/healthcheck/health.go` into
  `pkg/healthcheck/healthcheck_output.go`.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-17 00:26:34 +05:30
Alex Leong 74950e9407
Add jaeger uninstall command (#5353)
Add a `linkerd jaeger uninstall` command which prints the linkerd-jaeger extension resources so that they can be deleted.  This is similar to the `linkerd uninstall` command.

```
> bin/linkerd jaeger uninstall | k delete -f -
clusterrole.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
clusterrolebinding.rbac.authorization.k8s.io "linkerd-jaeger-linkerd-jaeger-proxy-mutator" deleted
mutatingwebhookconfiguration.admissionregistration.k8s.io "linkerd-proxy-mutator-webhook-config" deleted
namespace "linkerd-jaeger" deleted
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-14 15:48:44 -08:00
Tarun Pothulapati c19cfd71a1
upgrades: make webhooks restart if TLS creds are updated (#5349)
* upgrades: make webhooks restart if TLS creds are updated

Fixes #5231

Currently, we do not re-use the TLS certs during upgrades, which
means that the secrets are updated while the webhooks are still
paired with the older ones, causing the webhook requests to fail.

This can be solved by making webhooks be restarted whenever there
is a change in the certs. This can be performed by storing the hash
of the `*-rbac` file, which contains the secrets, thus making the
pod templates change whenever there is an update to the certs thus
making restarts required.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-10 11:56:53 -05:00
Alejandro Pedraza 35612ae268
`linkerd install --ha` was only partially applying HA config (#5358)
* `linkerd install --ha` was only partially applying HA config

Fixes #5342

`values-ha.yml` contains the specific config for HA, but only the proxy
resources controller replicas settings were applied. This PR adds
EnablePodAntiafinity, WebhookFailurePolicy and all the resource settings
for the other CP pods.

Also the `--controller-replicas` flag is moved after the HA flags so it
can override the HA settings.

Finally, some comments no longer relevant were removed.

## How to test

Perform `linkerd install --ha` and make sure the values in
`values-ha.yml` are propagated correctly in the produced yaml.

## 2.9.1

After merging to `main`, this should be cherry-picked into the
`release/stable-2.9` branch.

Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-09 15:23:37 -05:00
Alex Leong cdc57d1af0
Use linkerd-jaeger extension for control plane tracing (#5299)
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:

* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests

We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension.  To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.

Note that for now, only the control plane components send traces; the proxies in the control plane do not.  This is because the linkerd-jaeger injector is not yet available.  However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.

I tested this by doing the following:

1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-12-08 14:34:26 -08:00
Tarun Pothulapati 72a0ca974d
extension: Separate multicluster chart and binary (#5293)
Fixes #5257

This branch movies mc charts and cli level code to a new
top level directory. None of the logic is changed.

Also, moves some common types into `/pkg` so that they
are accessible both to the main cli and extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:36:10 -08:00
Tarun Pothulapati 47a49e5ac5
jaeger: Add support for override flags (#5304)
This change adds flags `set`, `set-string`, `values`, `set-files`,
etc flags which are used to override the default values. This is
similar to that of Helm.

This also updates the install workflow to directly use Helm v3
pkg for chart loading and generation, without having to use
our chart type, etc.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:35:39 -08:00
Alejandro Pedraza 4c634a3816
Have webhooks refresh their certs automatically (#5282)
* Have webhooks refresh their certs automatically

Fixes partially #5272

In 2.9 we introduced the ability for providing the certs for `proxy-injector` and `sp-validator` through some external means like cert-manager, through the new helm setting `externalSecret`.
We forgot however to have those services watch changes in their secrets, so whenever they were rotated they would fail with a cert error, with the only workaround being to restart those pods to pick the new secrets.

This addresses that by first abstracting out `FsCredsWatcher` from the identity controller, which now lives under `pkg/tls`.

The webhook's logic in `launcher.go` no longer reads the certs before starting the https server, moving that instead into `server.go` which in a similar way as identity will receive events from `FsCredsWatcher` and update `Server.cert`. We're leveraging `http.Server.TLSConfig.GetCertificate` which allows us to provide a function that will return the current cert for every incoming request.

### How to test

```bash
# Create some root cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca.crt ca.key \
  --profile root-ca --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# configure injector's caBundle to be that root cert
$ cat > linkerd-overrides.yaml << EOF
proxyInjector:
  externalSecret: true
    caBundle: |
      < ca.crt contents>
EOF

# Install linkerd. The injector won't start untill we create the secret below
$ bin/linkerd install --controller-log-level debug --config linkerd-overrides.yaml | k apply -f -

# Generate an intermediatery cert with short lifespan
step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# Create the secret using that intermediate cert
$ kubectl create secret tls \
  linkerd-proxy-injector-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# start following the injector log
$ k -n linkerd logs -f -l linkerd.io/control-plane-component=proxy-injector -c proxy-injector

# Inject emojivoto. The pods should be injected normally
$ bin/linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -

# Wait about 5 minutes and delete a pod
$ k -n emojivoto delete po -l app=emoji-svc

# You'll see it won't be injected, and something like "remote error: tls: bad certificate" will appear in the injector logs.

# Regenerate the intermediate cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# Delete the secret and recreate it
$ k -n linkerd delete secret linkerd-proxy-injector-k8s-tls
$ kubectl create secret tls \
  linkerd-proxy-injector-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Wait a couple of minutes and you'll see some filesystem events in the injector log along with a "Certificate has been updated" entry
# Then delete the pod again and you'll see it gets injected this time
$ k -n emojivoto delete po -l app=emoji-svc

```
2020-12-04 16:25:59 -05:00
Björn Wenzel 0ee18eb168
Allow Multicluster Service to be non LoadBalancer ServiceType (#5307)
Signed-off-by: Björn Wenzel <bjoern.wenzel@dbschenker.com>
2020-12-03 13:03:49 -05:00
Alejandro Pedraza 9cbfb08a38
Bump proxy-init to v1.3.8 (#5283) 2020-11-27 09:07:34 -05:00
Tarun Pothulapati e7f4c31257
extension: Add new jaeger binary (#5278)
* extension: Add new jaeger binary

This branch adds a new jaeger binary project in the jaeger directory.
This follows the same logic as that of `linkerd install`. But as
`linkerd install` VFS logic expects charts to be present in `/charts`
directory, This command gets its own static pkg to generate its own
VFS for its chart.

This covers only the install part of the command

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-25 20:10:35 +05:30
Tarun Pothulapati fafeee38d7
Upgrade to Helm v3 sdk (#4878)
Fixes #4874

This branch upgrades Helm sdk from v2 to v3 *without any functionaly
changes*, just replacing types with newer API's.

This should not effect our current support for Helm v2 as we did not
change any of the underlying tempaltes(which work with Helm v2). This
works becuase we did not use any of the API's that read the Chart
metadata (which are the only ones changed from v2 to v3) and currently
manually load files and pass ito the sdk.

This PR should provide a great point to start more of the newer Helm v3
API's including for the upgrade workflow thus allowing us to make
Linkerd CLI more simpler.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-23 12:46:41 -08:00
hodbn 92eb174e06
Add safe accessor for Global in linkerd-config (#5269)
CLI crashes if linkerd-config contains unexpected values.

Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.

Add test for linkerd-config data without Global.

Fixes #5215

Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>
2020-11-23 12:45:58 -08:00
Tarun Pothulapati b389054d53
cli: Don't check for SAN in root and intermediate certs (#5237)
As discussed in #5228, it is not correct for root and intermediate
certs to have SAN. This PR updates the check to not verify the
intermediate issuer cert with the identity dns name (which checks with
SAN and not CN as the the `verify` func is used to verify leaf certs and
not root and intermediate certs). This PR also avoids setting a SAN
field when generating certs in the `install` command.

Fixes #5228
2020-11-18 15:30:39 -08:00
Alejandro Pedraza 5a707323e6
Update proxy-init to v1.3.7 (#5221)
This upgrades both the proxy-init image itself, and the go dependency on
proxy-init as a library, which fixes CNI in k3s and any host using
binaries coming from BusyBox, where `nsenter` has an
issue parsing arguments (see rancher/k3s#1434).
2020-11-13 15:59:14 -05:00
Tarun Pothulapati d9a6e217f9
nit: return crtExpiry even for External Certs (#5173)
This change updates `FetchExternalIssuerData` to be more like
`FetchIssuerData` and return expiry correctly.

This field is currently not used anywhere and is just done for
consistentcy purposes.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-03 13:15:53 -05:00
Oliver Gould 4d85b6cd65
inject: Set LINKERD2_PROXY_CORES from the cpu limit (#5170)
Per #5165, Kubernetes does not necessarily limit the proxy's access to
cores via `cgroups` when a CPU limit is set. As of #5168, the proxy now
supports a `LINKERD2_PROXY_CORES` environment configuration that
augments CPU detection from the host operating system.

This change modifies the proxy injector to ensure that this environment
is configured from the `Values.proxy.cores` Helm value, the
`config.linkerd.io/proxy-cpu-limit` annotation, and the `--proxy-cpu-limit`
install flag.
2020-11-03 10:02:31 -08:00
Oliver Gould d6cb0c56cb
ha: Remove CPU limits for control plane components (#5171)
As discussed in #5167 & #5169, Kubernetes CPU limits are not necessarily
discoverable from within the pod. This means that the control plane
processes may allocate far more threads than can actually be used by the
process given its process limits.

This change removes the default CPU limits for all control plane
components. CPU limits may still be set via Helm configuration.
2020-11-03 09:18:36 -08:00
Oliver Gould 04e15c8544
ha: Do not set a default CPU limit (#5169)
Now that the proxy can use more than one core, this behavior should be
enabled by default, even in HA mode.

This change modifies the default HA helm values to unset the cpu limit
for proxy containers.
2020-11-03 07:53:36 -08:00
Tarun Pothulapati 14b8b8c792
upgrade: set identity.issuer.crtExpiry correctly with legacy upgrades (#5161)
With legacy upgrades, we can parse the cert and store the expiry
correctly instead of storing it as the default value which could be a
problem when we use that field. Currently, we do not use this field and
hence it did not cause any problems.

Install on the latest edges, This field is correctly set and works
as expected. Thus, upgrades also have the right value.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-03 00:19:18 +05:30
Tarun Pothulapati 262d5e041c
charts: Do not store .component in linkerd-config (#5144)
* charts: Do not store .component in linkerd-config

This removes the `.component` fields from `Values.go` and also prevents them from being emitted into `linkerd-config` by attaching them into a temporary variable during injection.

This also simplies inbound and outbound Skip ports helm logic and adds quotes to them.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-11-02 20:41:37 +05:30
Alex Leong da194f5dc3
Warn when webhook certificates near expiry (#5155)
Fixes #5149 

Before:

```
linkerd-webhooks-and-apisvc-tls
-------------------------------
× tap API server has valid cert
    certificate will expire on 2020-10-28T20:22:32Z
    see https://linkerd.io/checks/#l5d-tap-cert-valid for hints
```

After:

```
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ tap API server has valid cert
‼ tap API server cert is valid for at least 60 days
    certificate will expire on 2020-10-28T20:22:32Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ proxy-injector webhook has valid cert
‼ proxy-injector cert is valid for at least 60 days
    certificate will expire on 2020-10-29T18:17:03Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ sp-validator webhook has valid cert
‼ sp-validator cert is valid for at least 60 days
    certificate will expire on 2020-10-28T20:21:34Z
    see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-30 11:48:51 -07:00
Tarun Pothulapati 4c106e9c08
cli: make check return SkipError when there is no prometheus configured (#5150)
Fixes #5143

The availability of prometheus is useful for some calls in public-api
that the check uses. This change updates the ListPods in public-api
to still return the pods even when prometheus is not configured.

For a test that exclusively checks for prometheus metrics, we have a gate
which checks if a prometheus is configured and skips it othervise.

Signed-off-by: Tarun Pothulapati tarunpothulapati@outlook.com
2020-10-29 19:57:11 +05:30
Tarun Pothulapati 3a16baa141
Use errors.Is instead of checking underlying err messages (#5140)
* Use errors.Is instead of checking underlying err messages

Fixes #5132

This PR replaces the usage of `strings.hasSuffix` with `errors.Is`
wherever error messages are being checked. So, that the code is not
effected by changes in the underlying message. Also adds a string
const for http2 response body closed error

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-10-28 21:33:17 +05:30
Tarun Pothulapati 39e7f84773
cli: fix and update timeout warnings in profile cmd (#5122)
Fixes #5121

* cli: skip emitting warnings in Profile


Whenever the tapDuration gets completed, there is a warning occured
which we do not emit. This looks like it has been changed in the latest
versions of the dependency.

* Use context.withDeadline instead of client.timeout

The usage of `client.Timeout` is not working correctly causing `W1022
17:20:12.372780   19049 transport.go:260] Unable to cancel request for
   promhttp.RoundTripperFunc` to be emitted by the Kubernetes Client.

This is fixed by using context.WithDeadline and passing that into the
http Request.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-10-27 22:08:21 +05:30
Alex Leong b7c5bd07ae
Add 'linkerd.io/inject: ingress' mode (#5130)
Fixes #5118

This PR adds a new supported value for the `linkerd.io/inject` annotation.  In addition to `enabled` and `disabled`, this annotation may now be set to `ingress`.  This functions identically to `enabled` but it also causes the `LINKERD2_PROXY_INGRESS_MODE="true"` environment variable to be set on the proxy.  This causes the proxy to operate in ingress mode as described in #5118 

With this set, ingresses are able to properly load service profiles based on the l5d-dst-override header.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-26 14:32:19 -07:00
Alejandro Pedraza 177669b377
Remove code refs to controllerImageVersion (#5119)
Followup to #5100

We had both `controllerImageVersion` and `global.controllerImageVersion`
configs, but only the latter was taken into account in the chart
templates, so this change removes all of its references.
2020-10-21 13:40:25 -05:00
Oliver Gould 25e49433fd
Do not permit cluster networks to be overridden per-pod (#5111)
In #5110 the `global.proxy.destinationGetNetworks` configuration is
renamed to `global.clusterNetworks` to better reflect its purpose.

The `config.linkerd.io/proxy-destination-get-networks` annotation allows
this configuration to be overridden per-workload, but there's no real use
case for this. I don't think we want to support this value differing
between pods in a cluster. No good can come of it.

This change removes support for the `proxy-destination-get-networks`
annotation.
2020-10-21 09:34:13 -07:00
Oliver Gould 84b1a826bd
Replace global.proxy.destinationGetNetworks with global.clusterNetworks (#5110)
There is no longer a proxy config `DESTINATION_GET_NETWORKS`. Instead of
reflecting this implementation in our values.yaml, this changes this
variable to the more general `clusterNetworks` to emphasize its
similarity to `clusterDomain` for the purposes of discovery.
2020-10-20 19:05:31 -07:00
Oliver Gould c5d3b281be
Add 100.64.0.0/10 to the set of discoverable networks (#5099)
It appears that Amazon can use the `100.64.0.0/10` network, which is
technically private, for a cluster's Pod network.

Wikipedia describes the network as:

> Shared address space for communications between a service provider
> and its subscribers when using a carrier-grade NAT.

In order to avoid requiring additional configuration on EKS clusters, we
should permit discovery for this network by default.
2020-10-19 12:59:44 -07:00
Oliver Gould 4f16a234aa
Add a default set of ports to bypass the proxy (#5093)
The proxy has a default, hardcoded set of ports on which it doesn't do
protocol detection (25, 587, 3306 -- all of which are server-first
protocols). In a recent change, this default set was removed from
the outbound proxy, since there was no way to configure it to anything
other than the default set. I had thought that there was a default set
applied to proxy-init, but this appears to not be the case.

This change adds these ports to the default Helm values to restore the
prior behavior.

I have also elected to include 443 in this set, as it is generally our
recommendation to avoid proxying HTTPS traffic, since the proxy provides
very little value on these connections today.

Additionally, the memcached port 11211 is skipped by default, as clients
do not issue any sort of preamble that is immediately detectable.

These defaults may change in the future, but seem like good choices for
the 2.9 release.
2020-10-16 11:53:41 -07:00
Alex Leong 9701f1944e
Stop rendering addon config (#5078)
The linkerd-addon-config is no longer used and can be safely removed.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-16 11:07:51 -07:00
Oliver Gould 222c11400b
tests: Set proxy log to linkerd=debug (#5081)
The proxy log level `linkerd2_proxy=debug` only enables logging from a
few proxy modules. We should instead use the more general
`linkerd=debug`.
2020-10-14 15:31:03 -07:00
Alex Leong 500c1cc2d7
Expose namespaceSelector for admission webhooks in helm chart (#5074)
Closes (#5026)

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
2020-10-13 16:08:56 -07:00
Tarun Pothulapati 2a5e7dba62
Handle grafana add-on config repair (#5059)
* Handle grafana add-on config repair

Fixes #5014

In Grafana Add-On, Default fields i.e `grafana.image.name`, `grafana.name`
have been removed from `linkerd-config-addons` after `2.8.1`. Only
overriden values are stored in `linkerd-config-addons` as of now.
Hence, `grafana.image.name` has to be removed from
`linkerd-config-addons` unless they are overriden so that updates
to it can take place especially the move from `gcr` to `ghcr`.

This also removes `grafana.name` field if they are set to default, as
its removed.

This problem will not occur again even if we update default values, as
default values are not stored in `linekrd-config-addons` anymore for all
add-ons.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-10-13 13:12:49 -07:00
Alex Leong 41c1fc65b0
Upgrade using config overrides (#5005)
This is a major refactor of the install/upgrade code which removes the config protobuf and replaces it with a config overrides secret which stores overrides to the values struct.  Further background on this change can be found here: https://github.com/linkerd/linkerd2/discussions/4966

Note: as-is this PR breaks injection.  There is work to move injection onto a Values-based config which must land before this can be merged.

A summary of the high level changes:

* the install, global, and proxy fields of linkerd-config ConfigMap are no longer populated
* the CLI install flow now follows these simple steps:
  * load default Values from the chart
  * update the Values based on the provided CLI flags
  * render the chart with these values
  * also render a Secret/linkerd-config-overrides which describes the values which have been changed from their defaults
* the CLI upgrade flow now follows these simple stesp:
  * load the default Values from the chart
  * if Secret/linkerd-config-overrides exists, apply the overrides onto the values
  * otherwise load the legacy ConfigMap/linkerd-config and use it to updates the values
  * further update the values based on the provided CLI flags
  * render the chart and the Secret/linkerd-config-overrides as above
* Helm install and upgrade is unchanged

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-12 14:23:14 -07:00
Alex Leong 530d8beccc
Add podLabels and podAnnotations to Values struct (#5056)
PR https://github.com/linkerd/linkerd2/pull/5027 added `podLabels` and `podAnnotations` to `values.yaml` to allow setting labels and annotations on pods in the Helm template.  However, these fields were not added to the `Values` struct in `Values.go`.  This means that these fields were not serialized out to the `linkerd-config` or to the `linkerd-config-overrides`.  Furthermore, in PR #5005 which moves to using the `Values` struct more authoritatively, the `podLabels` and `podAnnotations` fields would not take effect at all.

Add these fields to the `Values` struct and update all test fixtures accordingly.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-10-09 09:27:28 -07:00
Tarun Pothulapati 1e7bb1217d
Update Injection to use new linkerd-config.values (#5036)
This PR Updates the Injection Logic (both CLI and proxy-injector)
to use `Values` struct instead of protobuf Config, part of our move
in removing the protobuf.

This does not touch any of the flags, install related code.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>

Co-authored-by: Alex Leong <alex@buoyant.io>
2020-10-07 09:54:34 -07:00
Tarun Pothulapati faf77798f0
Update check to use new linkerd-config.values (#5023)
This branch updates the check functionality to read
the new `linkerd-config.values` which contains the full
Values struct showing the current state of the Linkerd
installation. (being added in #5020 )

This is done by adding a new `FetchCurrentConfiguraiton`
which first tries to get the latest, if not falls back
to the older `linkerd-config` protobuf format.`

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-10-01 11:19:25 -07:00
Alex Leong 1784f0643e
Add linkerd-config-overrides secret (#4911)
This PR adds a new secret to the output of `linkerd install` called `linkerd-config-overrides`.  This is the first step towards simplifying the configuration of the linkerd install and upgrade flow through the CLI.  This secret contains the subset of the values.yaml which have been overridden.  In other words, the subset of values which differ from their default values.  The idea is that this will give us a simpler way to produce the `linkerd upgrade` output while still persisting options set during install.  This will eventually replace the `linkerd-config` configmap entirely.

This PR only adds and populates the new secret.  The secret is not yet read or used anywhere.  Subsequent PRs will update individual control plane components to accept their configuration through flags and will update the `linkerd upgrade` flow to use this secret instead of the `linkerd-config` configmap.

This secret is only generated by the CLI and is not present or required when installing or upgrading with Helm.

Here are sample contents of the secret, base64 decoded.  Note that identity tls context is saved as an override so that it can be persisted across updates.  Since these fields contain private key material, this object must be a secret.  This secret is only used for upgrades and thus only the CLI needs to be able to read it.  We will not create any RBAC bindings to grant service accounts access to this secret.

```
global:
  identityTrustAnchorsPEM: |
    -----BEGIN CERTIFICATE-----
    MIIBhDCCASmgAwIBAgIBATAKBggqhkjOPQQDAjApMScwJQYDVQQDEx5pZGVudGl0
    eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMjAwODI1MjMzMTU3WhcNMjEwODI1
    MjMzMjE3WjApMScwJQYDVQQDEx5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9j
    YWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQ0e7IPBlVZ03TL8UVlODllbh8b
    2pcM5mbtSGgpX9z0l3n5M70oHn715xu2szh63oBjPl2ZfOA5Bd43cJIksONQo0Iw
    QDAOBgNVHQ8BAf8EBAMCAQYwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMC
    MA8GA1UdEwEB/wQFMAMBAf8wCgYIKoZIzj0EAwIDSQAwRgIhAI7Sy8P+3TYCJBlK
    pIJSZD4lGTUyXPD4Chl/FwWdFfvyAiEA6AgCPbNCx1dOZ8RpjsN2icMRA8vwPtTx
    oSfEG/rBb68=
    -----END CERTIFICATE-----
heartbeatSchedule: '42 23 * * * '
identity:
  issuer:
    crtExpiry: "2021-08-25T23:32:17Z"
    tls:
      crtPEM: |
        -----BEGIN CERTIFICATE-----
        MIIBhDCCASmgAwIBAgIBATAKBggqhkjOPQQDAjApMScwJQYDVQQDEx5pZGVudGl0
        eS5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMjAwODI1MjMzMTU3WhcNMjEwODI1
        MjMzMjE3WjApMScwJQYDVQQDEx5pZGVudGl0eS5saW5rZXJkLmNsdXN0ZXIubG9j
        YWwwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQ0e7IPBlVZ03TL8UVlODllbh8b
        2pcM5mbtSGgpX9z0l3n5M70oHn715xu2szh63oBjPl2ZfOA5Bd43cJIksONQo0Iw
        QDAOBgNVHQ8BAf8EBAMCAQYwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMC
        MA8GA1UdEwEB/wQFMAMBAf8wCgYIKoZIzj0EAwIDSQAwRgIhAI7Sy8P+3TYCJBlK
        pIJSZD4lGTUyXPD4Chl/FwWdFfvyAiEA6AgCPbNCx1dOZ8RpjsN2icMRA8vwPtTx
        oSfEG/rBb68=
        -----END CERTIFICATE-----
      keyPEM: |
        -----BEGIN EC PRIVATE KEY-----
        MHcCAQEEIJaqjoDnqkKSsTqJMGeo3/1VMfJTBsMEuMWYzdJVxIhToAoGCCqGSM49
        AwEHoUQDQgAENHuyDwZVWdN0y/FFZTg5ZW4fG9qXDOZm7UhoKV/c9Jd5+TO9KB5+
        9ecbtrM4et6AYz5dmXzgOQXeN3CSJLDjUA==
        -----END EC PRIVATE KEY-----
```

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-09-29 08:01:36 -07:00
Lutz Behnke de098cd52d
make api service secrets compatible to cert manager (#4737)
Currently the secrets for the proxy-injector, sp-validator webhooks and tap API service are using the Opaque secret type and linkerd-specific field names. This makes it impossible to use cert-manager (https://github.com/jetstack/cert-manager) to provisions and rotate the secrets for these services. This change converts the secrets defined in the linkerd2 helm charts and the controller use the kubernetes.io/tls format instead. This format is used for secrets containing the generated secrets by cert-manager.

Signed-off-by: Lutz Behnke <lutz.behnke@finleap.com>
2020-09-29 09:17:09 -05:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Kevin Leimkuhler 2ec5245d67
Add configuration for opaque ports (#4972)
## Motivation

Closes #4950

## Solution

Add the `config.linkerd.io/opaque-ports` annotation to either a namespace or pod
spec to set the proxy `LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION`
environment variable.

Currently this environment variable is not used by the proxy, but will be
addressed by #4938.

## Valid values

Ports: `config.linkerd.io/opaque-ports: 4322,3306`

Port ranges: `config.linkerd.io/opaque-ports: 4320-4325`

Mixed ports and port ranges: `config.linkerd.io/opaque-ports: 4320-4325`

If the pod has named ports such as:

```
- name: nginx
  image: nginx:latest
  ports:
  - name: nginx-port
    containerPort: 80
    protocol: TCP
```

The name can also be used as a value: `config.linkerd.io/opaque-ports:
nginx-port`

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-09-25 15:36:12 -04:00
Tarun Pothulapati ecce5b91f6
tests: Add Calico CNI deep integration tests (#4952)
* tests: Add new CNI deep integration tests

Fixes #3944

This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`

This is different from the CNI integration tests that we run in
cloud integration which performs the CNI level integration tests.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-23 19:58:28 +05:30
Nil 69ca673682
Introduce support for authenticated docker registries using imagePullSecrets, Fixes #4413 (#4898)
* Introduce support for authenticated docker registries using imagePullSecrets

Problem: Private Docker Registries are not supported for the moment as detailed in issue #4413

Solution: Every Service Account of linkerd subcomponents are Attached with imagePullSecrets,
which in turn can then pulls the docker images from authenticated private registries using them.
The imagePullSecret is configured in global.imagePullSecret parameter of values.yaml like

imagePullSecret:
  - name: <name-of-private-registry-secret-resource>

Fixes #4413

Signed-off-by: Nilakhya <nilakhya@hotmail.com>
2020-09-23 08:49:35 -05:00
Tarun Pothulapati f75b9fe374
tracing: Move default values into addon-chart (#4951)
* tracing: Move default values into chart

This branch updates the tracing add-on's values into their own chart's values.yaml
(just like grafana and prometheus). This prevents them from being saved into
`linkerd-config-addons` where only the overridden values are stored. Thus allowing
us to change the defaults.

This also
-  Updates the check command to fall back to default values, if there are no
overridden name fields.
- Updates jaeger to `1.19.2`

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-15 15:19:25 -05:00
Alejandro Pedraza ccf027c051
Push docker images to ghcr.io instead of gcr.io (#4953)
* Push docker images to ghcr.io instead of gcr.io

The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.

Note that besides the changes to cloud_integration.yml and release.yml, there was a change to the upgrade-stable integration test so that we do linkerd upgrade --addon-overwrite to reset the addons settings because in stable-2.8.1 the Grafana image was pegged to gcr.io/linkerd-io/grafana in linkerd-config-addons. This will need to be mentioned in the 2.9 upgrade notes.

Also the egress integration test has a debug container that now is pegged to the edge-20.9.2 tag.

Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
2020-09-10 15:16:24 -05:00
Zahari Dichev 084bb678c7
Perform TLS checks on injector, sp validator and tap (#4924)
* Check sp-validator,proxy-injector and tap certs

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-09-10 11:21:23 -05:00
Tarun Pothulapati c4f8ba270d
Generate Identity certs with alternate domain names (#4920)
Updating only the go 1.15 version, makes the upgrades fail from older versions,
as the identity certs do not have that setting and go 1.15 expects them. 
This PR upgrades the cert generation code to have that field, 
allowing us to move to go 1.15 in later versions of Linkerd.
2020-09-03 22:33:10 +05:30