Another attempt at fixing #6511
Even after #6524, we continued experiencing discrepancies on the
linkerd-edges integration test. The problem ended up being the external
prometheus instance not being injected. The injector logs revealed this:
```console
2021-07-29T13:57:10.2497460Z time="2021-07-29T13:54:15Z" level=info msg="caches synced"
2021-07-29T13:57:10.2498191Z time="2021-07-29T13:54:15Z" level=info msg="starting admin server on :9995"
2021-07-29T13:57:10.2498935Z time="2021-07-29T13:54:15Z" level=info msg="listening at :8443"
2021-07-29T13:57:10.2499945Z time="2021-07-29T13:54:18Z" level=info msg="received admission review request 2b7b4970-db40-4bda-895b-bb2e95e98265"
2021-07-29T13:57:10.2511751Z time="2021-07-29T13:54:18Z" level=debug msg="admission request: &AdmissionRequest{UID:2b7b4970-db40-4bda-895b-bb2e95e98265,Kind:/v1, Kind=Service,Resource:{ v1 services},SubResource:,Name:metrics-api,Namespace:linkerd-viz...
```
Usually one expects the webhook server to start first ("listening at
:8443") and then the admin server, but in this case it happened the
other way around. The admin server serves the readiness probe, so k8s
was signaled that the injector was ready before it could listen to
webhook requests. Since the WebhookFailurePolicy is Ignore by default,
this sometimes caused the prometheus pod creation event to be missed,
and we see in the log above that the injector starts by processing the
pods created afterwards, which are the viz ones.
In this fix we start the webhook server first, then block on the syncing
of the k8s API (which should give the webhook enough time to come up),
and finally we start the admin server.
Fixes #6452
We add a `linkerd-identity-trust-roots` ConfigMap which contains the configured trust root bundle. The proxy template partial is modified so that core control plane components load this bundle from the configmap through the downward API.
The identity controller is updated to mount this new configmap as a volume and read the trust root bundle at startup.
Similarly, the proxy-injector also mounts this new configmap. For each pod it injects, it reads the trust root bundle file and sets it on the injected pod.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #5589
The core control plane has a dependency on the viz package in order to use the `BuildResource` function. This "backwards" dependency means that the viz source code needs to be included in core docker-builds and is bad for code hygiene.
We move the `BuildResource` function into the viz package. In `cli/cmd/metrics.go` we replace a call to `BuildResource` with a call directly to `CanonicalResourceNameFromFriendlyName`.
Signed-off-by: Alex Leong <alex@buoyant.io>
We were getting sporadic coverage differences on `controller/k8s/test_helper.go` and `pkg/healthcheck/healthcheck_test.go` on pushes unrelated to those files.
For the former, the problem was in tests in `controller/k8s/api_test.go` that compared slices of pods and services by sorting them. The `Sort` interface was implemented through the methods in `test_helper.go`. Go's sort is apparently nondeterministic about whether the `Swap` method gets called, which impacted the coverage report. The fix consists of comparing those slices item by item, without needing to sort beforehand.
As for `healthcheck_test.go`, `validateControlPlanePods()` in `healthcheck.go` short-circuits on the first pod having all its containers ready. The unit tests iterate over maps, an iteration we know is not deterministic, so sometimes the short-circuiting prevented the `!container.Ready` block from ever being covered, thus affecting the coverage report. This is fixed by adding a small test that makes sure that block is covered.
The problem is that for parent objects that are not supported by Linkerd, we
cannot get any metrics. For example, using a Rollout will not report any
metrics above the pod level.
To fix this, add validation for ReplicaSet owners: if the owner is a valid
parent, use the parent's Kind and Name; otherwise use the ReplicaSet itself.
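For illustration, a minimal sketch of that owner check (the allow-list of kinds and the helper names here are assumptions, not the controller's actual code):
```go
package main

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// validOwnerKinds is an illustrative allow-list; the controller's real list
// may differ.
var validOwnerKinds = map[string]bool{
	"Deployment": true, "DaemonSet": true, "StatefulSet": true, "Job": true,
}

// ownerLabels returns the Kind/Name to label metrics with: the ReplicaSet's
// parent when it is a supported kind (e.g. a Deployment), otherwise the
// ReplicaSet itself (e.g. when owned by an unsupported Rollout).
func ownerLabels(rs *appsv1.ReplicaSet) (kind, name string) {
	for _, ref := range rs.OwnerReferences {
		if validOwnerKinds[ref.Kind] {
			return ref.Kind, ref.Name
		}
	}
	return "ReplicaSet", rs.Name
}

func main() {
	rs := &appsv1.ReplicaSet{}
	rs.Name = "web-5f86686c4d"
	rs.OwnerReferences = []metav1.OwnerReference{{Kind: "Rollout", Name: "web"}}
	fmt.Println(ownerLabels(rs)) // ReplicaSet web-5f86686c4d
}
```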
Tested using CLI/UI
Interim solution for #6429
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #6354
We add Prometheus gauges which track the client-go cache size for each resource type. For example, the following metrics are added to the destination controller:
```
# HELP endpoint_cache_size Number of items in the client-go endpoint cache
# TYPE endpoint_cache_size gauge
endpoint_cache_size 21
# HELP job_cache_size Number of items in the client-go job cache
# TYPE job_cache_size gauge
job_cache_size 0
# HELP namespace_cache_size Number of items in the client-go namespace cache
# TYPE namespace_cache_size gauge
namespace_cache_size 8
# HELP node_cache_size Number of items in the client-go node cache
# TYPE node_cache_size gauge
node_cache_size 1
# HELP pod_cache_size Number of items in the client-go pod cache
# TYPE pod_cache_size gauge
pod_cache_size 23
# HELP replica_set_cache_size Number of items in the client-go replica_set cache
# TYPE replica_set_cache_size gauge
replica_set_cache_size 40
# HELP service_cache_size Number of items in the client-go service cache
# TYPE service_cache_size gauge
service_cache_size 18
# HELP service_profile_cache_size Number of items in the client-go service_profile cache
# TYPE service_profile_cache_size gauge
service_profile_cache_size 4
# HELP traffic_split_cache_size Number of items in the client-go traffic_split cache
# TYPE traffic_split_cache_size gauge
traffic_split_cache_size 0
```
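A minimal sketch of how such a gauge could be wired up with `client_golang` (illustrative only; not the exact controller code):
```go
package prommetrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"k8s.io/client-go/tools/cache"
)

// registerCacheSizeGauge exports the size of one informer cache; the gauge
// re-reads the informer's store length on every scrape.
func registerCacheSizeGauge(resource string, informer cache.SharedIndexInformer) {
	prometheus.MustRegister(prometheus.NewGaugeFunc(
		prometheus.GaugeOpts{
			Name: resource + "_cache_size",
			Help: "Number of items in the client-go " + resource + " cache",
		},
		func() float64 {
			// GetStore().List() returns the currently cached objects.
			return float64(len(informer.GetStore().List()))
		},
	))
}
```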
Signed-off-by: Alex Leong <alex@buoyant.io>
* Set `LINKERD2_PROXY_INBOUND_PORTS` during injection
Fixes #6267
The `LINKERD2_PROXY_INBOUND_PORTS` env var will be set during injection,
containing a comma-separated list of the ports in the non-proxy containers in
the pod. For the identity, destination and injector pods, the var is set
manually in their Helm templates.
Since the proxy-injector isn't reinvoked, containers injected by a mutating
webhook after the injector has run won't be detected. As an escape hatch, the
`config.linkerd.io/pod-inbound-ports` annotation has been added to explicit
overrides.
Other changes:
- Removed `controller/proxy-injector/fake/data/inject-sidecar-container-spec.yaml`, which is no longer used.
- Fixed bad indentation in some fixture files under `controller/proxy-injector/fake/data`.
Default Linkerd skip and opaque port configuration
Adds missing default ports based on the docs.
Addresses: Add Redis to default list of Opaque ports (#6132).
Once merged, the default install values will match the recommendations in Linkerd's TCP ports guide.
Fixes #6132
Signed-off-by: jasonmorgan <jmorgan@f9vs.com>
Co-authored-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
We emit a Kubernetes event from the identity controller when successfully issuing a leaf certificate. The events include the identity, expiry, and a hash of the certificate.
Signed-off-by: Alex Leong <alex@buoyant.io>
List of changes:
- Include more output in the `simulate` mode (thanks @liuerfire!)
- Log to `stdout` instead of `stderr` (thanks @mo4islona!)
Non user-facing changes:
- Added `dependabot.yml` to receive automated dependencies upgrades PRs (both for go and github actions). As a result, also upgraded a bunch of dependencies.
Fixes #6272
The opaque-ports check is prone to failing with `context deadline exceeded`,
as there are numerous k8s API requests being performed.
This PR updates the pre-fetching logic to instead use
`controller/k8s` which provides a wrapper around `pkg/k8s` with
caching by using shared informers underneath!
This commit includes the following changes:
- Update `checkMisconfiguredOpaquePortAnnotations` to use
`controllerk8s.KubeAPI` instead of `hc.kubeAPI`
- Update the `kubeAPI.Sync` fn, which failed to check whether the sp and
ts shared informers are nil; they can be nil in cases like this where
they are not needed.
We had to use `controllerK8s.NewAPI` for the initialization
instead of `controllerk8s.InitializeAPI` so that it can take in
`hc.kubeAPI` and support unit testing, since `hc.kubeAPI`
is how we pass the fake resources in unit tests.
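A hypothetical sketch of the kind of nil-guard added to `kubeAPI.Sync` (field names are illustrative, not the real struct):
```go
package k8s

import "k8s.io/client-go/tools/cache"

// API is a trimmed-down stand-in for controller/k8s.API; the real struct
// holds many more informers.
type API struct {
	pod, svc, sp, ts cache.SharedIndexInformer
}

// Sync waits only on the informers that were actually constructed: the sp
// and ts informers may be nil when the caller (e.g. the health checks)
// doesn't need them.
func (api *API) Sync(stopCh <-chan struct{}) {
	informers := []cache.SharedIndexInformer{api.pod, api.svc}
	if api.sp != nil {
		informers = append(informers, api.sp)
	}
	if api.ts != nil {
		informers = append(informers, api.ts)
	}
	for _, inf := range informers {
		go inf.Run(stopCh)
	}
	for _, inf := range informers {
		cache.WaitForCacheSync(stopCh, inf.HasSynced)
	}
}
```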
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Closes #6253
### What
---
When we send a profile request with a pod IP, we get back an endpoint as part of the response. This has two advantages: we avoid building a load balancer and we can treat endpoint failure differently (with more of a fail fast approach). At the moment, when we use a pod DNS as the target of the profile lookup, we don't have an endpoint returned in the response.
Through this change, the behaviour will be consistent. Whenever we look up a pod (either through IP or DNS name) we will get an endpoint back. The change also attempts to simplify some of the logic in GetProfile.
### How
---
We already have a way to build an endpoint and return it back to the client; I sought to re-use most of the code in an effort to also simplify `GetProfile()`. I extracted most of the code that would have been duplicated into a separate method that is responsible for building the address, looking at annotations for opaque ports and for sending the response back.
In addition, to support a pod DNS FQDN I've expanded on the `else` branch of the topmost if statement -- if our host is not an IP, we parse the host to get the k8s FQDN. If the parsing function returns an instance ID along with the ServiceID, then we know we are dealing directly with a pod; if so, we fetch the pod using the core informer and then return an endpoint for it.
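As a self-contained illustration of the instance-ID idea (this is not the destination controller's actual parser; the helper is hypothetical):
```go
package main

import (
	"fmt"
	"strings"
)

// parsePodDNS splits a pod DNS name like
// "web-0.web-svc.emojivoto.svc.cluster.local" into its instance (pod),
// service and namespace parts. ok is false for plain service names, which
// have no instance segment.
func parsePodDNS(host, clusterDomain string) (instance, service, namespace string, ok bool) {
	suffix := ".svc." + clusterDomain
	if !strings.HasSuffix(host, suffix) {
		return "", "", "", false
	}
	labels := strings.Split(strings.TrimSuffix(host, suffix), ".")
	if len(labels) != 3 {
		return "", "", "", false // "service.namespace" form: no pod instance
	}
	return labels[0], labels[1], labels[2], true
}

func main() {
	inst, svc, ns, ok := parsePodDNS("web-0.web-svc.emojivoto.svc.cluster.local", "cluster.local")
	fmt.Println(inst, svc, ns, ok) // web-0 web-svc emojivoto true
}
```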
### Tests
---
I've tested this mostly with the destination client script. For the tests, I used the following pods:
```
❯ kgp -n emojivoto -o wide
NAME                        READY   STATUS    RESTARTS   AGE     IP           NODE                NOMINATED NODE   READINESS GATES
voting-ff4c54b8d-zbqc4      2/2     Running   0          3m58s   10.42.0.53   k3d-west-server-0   <none>           <none>
web-0                       2/2     Running   0          3m58s   10.42.0.55   k3d-west-server-0   <none>           <none>
vote-bot-7d89964475-tfq7j   2/2     Running   0          3m58s   10.42.0.54   k3d-west-server-0   <none>           <none>
emoji-79cc56f589-57tsh      2/2     Running   0          3m58s   10.42.0.52   k3d-west-server-0   <none>           <none>
# emoji pod has an opaque port set to 8080.
# web-svc is a headless service and it backs a statefulset (which is why we have web-0).
# without a headless service we can't lookup based on pod DNS.
```
**`Responses before the change`**:
```
# request on IP, this is how things work at the moment. I included this because there shouldn't be
# any diff between the response given here and the response we get with the change.
# note: this corresponds to the emoji pod which has opaque ports set to 8080.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.52:8080
INFO[0000] opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524724} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"emoji"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"emoji-79cc56f589-57tsh"} metric_labels:{key:"pod_template_hash" value:"79cc56f589"} metric_labels:{key:"serviceaccount" value:"emoji"} tls_identity:{dns_like_identity:{name:"emoji.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{} opaque_transport:{inbound_port:4143}}}
INFO[0000]
# request web-0 by IP
# there shouldn't be any diff with the response we get after the change
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.55:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
# request web-0 by DNS name -- will not work.
❯ go run controller/script/destination-client/main.go -method getProfile -path web-0.web-svc.emojivoto.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"web-0.web-svc.emojivoto.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"web-svc.emojivoto.svc.cluster.local.:8080" weight:10000}
INFO[0000]
INFO[0000] fully_qualified_name:"web-0.web-svc.emojivoto.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"web-svc.emojivoto.svc.cluster.local.:8080" weight:10000}
INFO[0000]
# ^
# |
# --> no endpoint in the response
```
**`Responses after the change`**:
```
# request profile for emoji, we see opaque transport being set on the endpoint.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.52:8080
INFO[0000] opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524724} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"emoji"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"emoji-79cc56f589-57tsh"} metric_labels:{key:"pod_template_hash" value:"79cc56f589"} metric_labels:{key:"serviceaccount" value:"emoji"} tls_identity:{dns_like_identity:{name:"emoji.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{} opaque_transport:{inbound_port:4143}}}
INFO[0000]
# request profile for web-0 with IP.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.55:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
# request profile for web-0 with pod DNS, resp contains endpoint.
❯ go run controller/script/destination-client/main.go -method getProfile -path web-0.web-svc.emojivoto.svc.cluster.local:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
```
Signed-off-by: Matei David <matei@buoyant.io>
## Context
Currently, whenever a `SP` is created with the `Spec.Routes` field not set in the [golang types](https://github.com/linkerd/linkerd2/blob/main/controller/gen/apis/serviceprofile/v1alpha2/types.go#L13), the k8s API rejects it with the following error:
```bash
ServiceProfile.linkerd.io \"backend-svc.linkerd-smi-app.svc.cluster.local\" is invalid: spec.routes: Invalid value: \"null\": spec.routes in body must be of type array: \"null\"
```
This happens because Go automatically renders the field as `routes: null` when it is marshaled into JSON. This is rejected by the k8s API server, which expects that field to be an array.
[This is fixed in k8s >= 1.20](https://github.com/kubernetes/kubernetes/pull/95423) as non-nullable nulls are defaulted, and hence this error happens only in `<=1.19`.
## Problem
`1.19` is still a fairly recent version of k8s, and things like [smi-adaptor](https://github.com/linkerd/linkerd-smi/pull) should not have to manage and make sure `Spec.Routes` is never null.
## Fix
This can easily be fixed by marking `Spec.Routes` as `omitempty` in its json tags, which means the field is omitted from the marshaled output whenever it is not set.
This means that the k8s API won't error out, as the field is never set to anything invalid.
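A trimmed-down illustration of the fix (struct reduced for brevity):
```go
package main

import (
	"encoding/json"
	"fmt"
)

type RouteSpec struct {
	Name string `json:"name"`
}

// With omitempty, a nil Routes slice is dropped from the marshaled JSON
// instead of appearing as "routes": null, which pre-1.20 API servers reject.
type ServiceProfileSpec struct {
	Routes []*RouteSpec `json:"routes,omitempty"`
}

func main() {
	b, _ := json.Marshal(ServiceProfileSpec{})
	fmt.Println(string(b)) // prints {} (no "routes": null)
}
```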
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
The misspellings have been reported at 0d56327e6f (commitcomment-51603624)
The action reports that the changes in this PR would make it happy: 03a9c310aa
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
This updates the destination to prefer `serviceprofiles.dstOverrides`
over `trafficsplits`, so that a ServiceProfile takes precedence over a
TrafficSplit when both are present.
This also makes integration testing the `smi-adaptor` easier, and adds
unit tests in the `traffic_split_adaptor` to check for the same.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
While uncommon, if H2 upgrades are disabled it's possible for an opaque workload
to not have its hint.OpaqueTransport field set in its destination profile
response.
This changes the H2 upgrade enabled check to be specific for setting the
hint.Protocol while allowing hint.OpaqueTransport to be set independent of
that value.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Oliver Gould <ver@buoyant.io>
Go 1.16.4 includes a fix for a denial-of-service in net/http: golang/go#45710
Go's error file-line formatting changed in 1.16.3, so this change
updates tests to only do suffix matching on these error strings.
* Skip configuring firewall if rules exist
This change fixes an issue where the `proxy-init` will fail if
`PROXY_INIT_*` chains already exist in the pod's iptables. This then
causes the pod to never start, because proxy-init exits with a non-zero
exit code.
In this change, we capture the output of the `iptables-save` command and
then check to see if the output contains the `PROXY_INIT_*` chains. If
they do, we exit and log a warning stating that the chains already
exist.
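A simplified sketch of that check, assuming the `PROXY_INIT_` prefix is enough to detect the chains (the real proxy-init code may differ):
```go
package main

import (
	"log"
	"os/exec"
	"strings"
)

func main() {
	// Capture the current iptables state.
	out, err := exec.Command("iptables-save").CombinedOutput()
	if err != nil {
		log.Fatalf("failed to run iptables-save: %v", err)
	}
	// If the PROXY_INIT_* chains are already present, skip configuration.
	if strings.Contains(string(out), "PROXY_INIT_") {
		log.Println("skipping firewall configuration: PROXY_INIT chains already exist")
		return
	}
	// ...otherwise proceed to create the PROXY_INIT_* chains and rules.
}
```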
Fixes #5786
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Support Traffic Splitting through `ServiceProfile.dstOverrides`
This commit
- updates the destination logic to prevent the override of `ServiceProfile.dstOverrides` when a `TS` is absent. This means that any `serviceprofiles.dstOverrides` set by the user are correctly propagated to the proxy, thus making traffic splitting through service profiles work.
- This also updates the `profile_translator.toDstOverrides` to use the port from `GetProfiles` when there is no port in `dstOverrides.authority`
After these changes, the following `ServiceProfile`, which performs
traffic splitting, should work.
```yaml
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local
spec:
  dstOverrides:
  - authority: "backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
  - authority: "failing-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
```
This PR also adds an integration test, `TestTrafficSplitCliWithSP`, which checks
traffic splitting through Service Profiles. It follows the same pattern
as the other traffic split integration tests, but instead we perform `linkerd viz stat` on the
client and check that the expected server objects are there, since `linkerd viz stat ts`
does not _yet_ work with Service Profiles.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* spSharedInformers and tsSharedInformers may be nil; calling the method will cause a panic
Signed-off-by: Cookie Wang <wangchl01@inspur.com>
### What
This change adds the `config.linkerd.io/proxy-await` annotation which when set will delay application container start until the proxy is ready. This allows users to force application containers to wait for the proxy container to be ready without modifying the application's Docker image. This is different from the current use-case of [linkerd-await](https://github.com/olix0r/linkerd-await) which does require modifying the image.
---
To support this, Linkerd is using the fact that containers are started in the order that they appear in `spec.containers`. If `linkerd-proxy` is the first container, then it will be started first.
Kubernetes will start each container without waiting on the result of the previous container. However, if a container has a hook that is executed immediately after container creation, then Kubernetes will wait on the result of that hook before creating the next container. Using a `PostStart` hook in the `linkerd-proxy` container, the `linkerd-await` binary can be run and force Kubernetes to pause container creation until the proxy is ready. Once `linkerd-await` completes, the container hook completes and the application container is created.
Adding the `config.linkerd.io/proxy-await` annotation to a pod's metadata results in the `linkerd-proxy` container being the first container, as well as having the container hook:
```yaml
postStart:
exec:
command:
- /usr/lib/linkerd/linkerd-await
```
---
### Update after draft
There has been some additional discussion both off GitHub as well as on this PR (specifically with @electrical).
First, we decided that this feature should be enabled by default. The reason for this is that, more often than not, it will prevent start-up ordering issues from occurring without having any negative effect on the application. Additionally, this will be part of edge releases up until 2.11 (the next stable release), and having it enabled by default will allow us to check that it does not often conflict with applications. Once we are closer to 2.11, we'll be able to determine if this should be disabled by default because it causes more issues than it prevents.
Second, this feature will remain configurable; if disabled, then upon injection the proxy container will not be made the first container in the pod manifest. This is important for the reasons discussed with @electrical about tools that make assumptions about app containers being the first container. For example, Rancher defaults to showing overview pages for the `0` index container, and if the proxy container was always `0` then this would defeat the purpose of the overview page.
### Testing
To test this I used the `sleep.sh` script and changed `Dockerfile-proxy` to use it as its `ENTRYPOINT`. This forces the container to sleep for 20 seconds before starting the proxy.
---
`sleep.sh`:
```bash
#!/bin/bash
echo "sleeping..."
sleep 20
/usr/bin/linkerd2-proxy-run
```
`Dockerfile-proxy`:
```dockerfile
...
COPY sleep.sh /sleep.sh
RUN ["chmod", "+x", "/sleep.sh"]
ENTRYPOINT ["/sleep.sh"]
```
---
```bash
# Build and install with the above changes
$ bin/docker-build
...
$ bin/image-load --k3d
...
$ bin/linkerd install |kubectl apply -f -
```
Annotate the `emoji` deployment so that it's the only workload that waits for its proxy to be ready, and inject it:
```bash
cat emojivoto.yaml |bin/linkerd inject - |kubectl apply -f -
```
You can then see that the `emoji` deployment is not starting its application container until the proxy is ready:
```bash
$ kubectl get -n emojivoto pods
NAME                        READY   STATUS            RESTARTS   AGE
voting-ff4c54b8d-sjlnz      1/2     Running           0          9s
emoji-f985459b4-7mkzt       0/2     PodInitializing   0          9s
web-5f86686c4d-djzrz        1/2     Running           0          9s
vote-bot-6d7677bb68-mv452   1/2     Running           0          9s
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* destination: Remove support for IP Queries in `Get` API
Fixes #5246
This PR updates the destination to report an error when `Get`
is called for IP Queries. As the issue mentions, The proxies
are not using this API anymore and it helps to simplify and
remove unnecessary logic.
This removes the relevant `IPWatcher` logic, along with
unit tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Closes #5977
## What
This change adds support for namespace configuration annotation inheritance for pods. Any annotations (e.g. `config.linkerd.io/skip-outbound-ports` or `config.linkerd.io/proxy-await`) that are applied against a namespace will now also be applied to pods running in that namespace by the _proxy-injector_.
* Pods do not inherit annotations from their namespaces; the exception to this is `opaque-ports` introduced in #5941. This expands on the work by allowing all config annotations to be inherited.
* Main advantage here is that instead of applying annotations on a workload-by-workload basis we can just apply them against the namespace and it will be mirrored on all pods within the namespace.
* Through this change the controller can also check the proxy's configuration directly from the pod's meta rather than from env variables.
## How
Change is pretty straightforward. We want to make sure that before we apply a JSON patch we first copy all of the namespace annotations to the pod. The logic that was in place takes care of applying the patch.
* One obvious constraint is that we only want valid configuration annotations to be applied. To be a "valid" configuration it has to exist and it has to be prefixed with `config.linkerd.io` -- the easiest way to do this is to go through all of the available proxy configuration options and check whether any of the options are included in the namespace's annotations (done in `GetNsConfigKeys()` where we fetch all annotation keys from the namespace).
* A consideration I had with this change is whether to add `opaque-ports` as part of all of the config keys; opaque ports is a bit different though since it can be applied on a pod as well as a service -- through this change we only want to apply config annotations to pods. I chose to keep the two separate.
* Added a unit test that checks if a pod inherits config annotations from its namespace; this also includes an invalid annotation which doesn't show up in the "expected" patch to test we validate configuration correctly.
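A rough sketch of the inheritance step (the real webhook validates keys against the known proxy annotations via `GetNsConfigKeys()`; this simplified version just uses the annotation prefix):
```go
package inject

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// inheritConfigAnnotations copies config.linkerd.io/* annotations from the
// namespace onto the pod unless the pod already sets them; the JSON patch
// is then generated from the updated pod as before.
func inheritConfigAnnotations(pod *corev1.Pod, ns *corev1.Namespace) {
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	for key, value := range ns.Annotations {
		if !strings.HasPrefix(key, "config.linkerd.io/") {
			continue // only proxy configuration annotations are inherited
		}
		if _, ok := pod.Annotations[key]; !ok {
			pod.Annotations[key] = value
		}
	}
}
```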
### Tests
---
I injected emojivoto and added an annotation to its namespace:
```
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    config.linkerd.io/opaque-ports: "34567"
    config.linkerd.io/proxy-log-level: debug
    config.linkerd.io/skip-outbound-ports: "44556"
    linkerd.io/inject: enabled
```
The deployment specs do not have any additional annotations as part of the pod template metadata. I first tested if the above annotations would be inherited with the current edge release (I expected opaque ports to be).
**Before changes**:
```
apiVersion: v1
kind: Pod
metadata:
  annotations:
    config.linkerd.io/opaque-ports: "34567"
    linkerd.io/created-by: linkerd/proxy-injector edge-21.4.1
    linkerd.io/identity-mode: default
    linkerd.io/inject: enabled
    linkerd.io/proxy-version: edge-21.4.1
  creationTimestamp: "2021-04-08T14:33:10Z"
  generateName: emoji-696d9d8f95-
  labels:
    app: emoji-svc
    linkerd.io/control-plane-ns: linkerd
    linkerd.io/proxy-deployment: emoji
    linkerd.io/workload-ns: emojivoto
    pod-template-hash: 696d9d8f95
    version: v11
spec:
  initContainers:
  - args:
    - --incoming-proxy-port
    - "4143"
    - --outgoing-proxy-port
    - "4140"
    - --proxy-uid
    - "2102"
    - --inbound-ports-to-ignore
    - 4190,4191
    - --outbound-ports-to-ignore
    - "44556"
    image: cr.l5d.io/linkerd/proxy-init:v1.3.9
    imagePullPolicy: IfNotPresent
    name: linkerd-init
```
(opaque ports is in there, skip outbound isn't -- although the initContainer gets the right argument since this is already applied from the namespace by the proxy injector).
**After the changes**:
```
apiVersion: v1
kind: Pod
metadata:
  annotations:
    config.linkerd.io/opaque-ports: "34567"
    config.linkerd.io/proxy-log-level: debug
    config.linkerd.io/skip-outbound-ports: "44556"
    linkerd.io/created-by: linkerd/proxy-injector dev-a7bb62fd-matei
    linkerd.io/identity-mode: default
    linkerd.io/inject: enabled
    linkerd.io/proxy-version: dev-a7bb62fd-matei
  creationTimestamp: "2021-04-08T14:42:06Z"
  generateName: web-5f86686c4d-
  labels:
    app: web-svc
    linkerd.io/control-plane-ns: linkerd
    linkerd.io/proxy-deployment: web
    linkerd.io/workload-ns: emojivoto
    pod-template-hash: 5f86686c4d
    version: v11
spec:
  initContainers:
  - args:
    - --incoming-proxy-port
    - "4143"
    - --outgoing-proxy-port
    - "4140"
    - --proxy-uid
    - "2102"
    - --inbound-ports-to-ignore
    - 4190,4191
    - --outbound-ports-to-ignore
    - "44556"
    image: cr.l5d.io/linkerd/proxy-init:v1.3.9
    imagePullPolicy: IfNotPresent
    name: linkerd-init
```
(opaque ports is there and so is skip outbound and the proxy log level, correct options still passed to the initContainers).
*Edit*: made a small change; had a look at `GetNsConfigKeys()` and thought it'd be better to keep the slice of keys as a fixed-length array, since we know there will be at most `len(ProxyAnnotations)` keys at any point. Not sure such a big size is warranted, but we can avoid calling append for every element.
Signed-off-by: Matei David <matei@buoyant.io>
* Remove the `linkerd-controller` pod
Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.
## Removals
- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put them under `/pkg/prometheus/test_helper.go`, which is now used by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.
## Health checks
- Removed the `can initialize the client` check, given that client is no longer needed. The `linkerd-api` section was left with only the `control pods are ready` check, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.
## Other changes
- Fixed the Control Plane section of the dashboard, taking into account the disappearance of `linkerd-controller` and, previously, of `linkerd-sp-validator`.
When a pod is configured with the `skip-inbound-ports` annotation, a client
proxy trying to connect to that pod attempts to connect via H2 and
also tries to initiate a TLS connection. This issue is caused by the
destination controller when it sends protocol and TLS hints to the
client proxy for that skipped port.
This change fixes the destination controller so that it no longer
sends protocol and TLS identity hints to outbound proxies resolving a
`podIP:port` that is on a skipped inbound port.
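A simplified sketch of that check (the helper and surrounding types are illustrative; the real code also handles port ranges in the annotation):
```go
package main

import (
	"fmt"
	"strconv"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// isSkippedInboundPort reports whether the given port appears in the pod's
// config.linkerd.io/skip-inbound-ports annotation; if it does, the
// destination response should carry no protocol or TLS identity hints.
func isSkippedInboundPort(pod *corev1.Pod, port uint32) bool {
	ann, ok := pod.Annotations["config.linkerd.io/skip-inbound-ports"]
	if !ok {
		return false
	}
	for _, p := range strings.Split(ann, ",") {
		if n, err := strconv.ParseUint(strings.TrimSpace(p), 10, 32); err == nil && uint32(n) == port {
			return true
		}
	}
	return false
}

func main() {
	pod := &corev1.Pod{}
	pod.Annotations = map[string]string{"config.linkerd.io/skip-inbound-ports": "7777"}
	fmt.Println(isSkippedInboundPort(pod, 7777)) // true
}
```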
I've included a test that exhibits this error prior to this fix but you
can also test the prior behavior by:
```bash
curl https://run.linkerd.io/booksapp.yml > booksapp.yaml
# edit either the books or authors service to:
#   1: configure a failure rate of 0.0
#   2: add the `skip-inbound-ports` config annotation
bin/linkerd viz stat pods webapp
# there should be no successful requests on the webapp deployment
```
Fixes #5995
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Removed `Version` API from the public-api
This is a sibling PR to #5993, and it's the second step towards removing the `linkerd-controller` pod.
This one deals with a replacement for the `Version` API, fetching instead the `linkerd-config` CM and retrieving the `LinkerdVersion` value.
## Changes to the public-api
- Removal of the `publicPb.ApiClient` entry from the `Client` interface
- Removal of the `publicPb.ApiServer` entry from the `Server` interface
- Removal of the `Version` and related methods from `client.go`, `grpc_server.go` and `http_server.go`
## Changes to `linkerd version`
- Removal of all references to the public API.
- Call `healthcheck.GetServerVersion` to retrieve the version
## Changes to `linkerd check`
- Removal of the "can query the control API" check from the "linkerd-api" section
- Addition of a new "can retrieve the control plane version" check under the "control-plane-version" section
## Changes to `linkerd-web`
- The version is now retrieved from the `linkerd-config` CM instead of a public-API call.
- Removal of all references to the public API.
- Removal of the `data-go-version` global attribute on the dashboard, which wasn't being used.
## Other changes
- Added `ValuesFromConfigMap` function in `values.go` to convert the `linkerd-config` CM into a `*Values` struct instance
- Removal of the `public` protobuf
- Refactored `linkerd repair` to use the refactored `healthcheck.GetServerVersion()` function
* Removed Destination's `Get` API from the public-api
This is the first step towards removing the `linkerd-controller` pod. It deals with removing the Destination `Get` http and gRPC endpoint it exposes, which only the `linkerd diagnostics endpoints` command consumes.
Removed all references to Destination in the public-api, including all the gRPC-to-http-to-gRPC forwardings:
- Removed the `Get` method from the public-api gRPC server that forwarded the connection from the controller pod to the destination pod. Clients should now connect directly to the destination service via gRPC.
- Likewise, removed the destination boilerplate in the public-api http server (and its `main.go`) that served the `Get` destination endpoint and forwarded it into the gRPC server.
- Finally, removed the destination boilerplate from the public-api's `client.go` that created a client connecting to the http API.
* Schedule heartbeat 10 mins after install
... for the Helm installation method, thus aligning it with the CLI
installation method, to reduce the midnight peak on the receiving end.
The logic added into the chart is now reused by the CLI as well.
Also, set `concurrencyPolicy=Replace` so that when a job fails and it's
retried, the retries get canceled when the next scheduled job is triggered.
Finally, the Go client previously only failed when the connection itself failed;
connections that succeeded with a non-200 response status were considered
successful and thus the job wasn't retried. Fixed that as well.
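A minimal sketch of that status check (the URL below is a placeholder, not necessarily the real heartbeat endpoint):
```go
package main

import (
	"fmt"
	"net/http"
)

// checkin returns an error for connection failures and for non-200
// responses, so the heartbeat Job exits non-zero and gets retried.
func checkin(url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err // connection-level failure
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// Previously this case was silently treated as success.
		return fmt.Errorf("heartbeat returned status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	if err := checkin("https://example.com/heartbeat"); err != nil {
		fmt.Println("heartbeat failed:", err)
	}
}
```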
### What
When a namespace has the opaque ports annotation, pods and services should
inherit it if they do not have one themselves. Currently, services do this but
pods do not. This can lead to surprising behavior where services are correctly
marked as opaque, but pods are not.
This changes the proxy-injector so that it now passes down the opaque ports
annotation to pods from their namespace if they do not have their own annotation
set. Closes #5736.
### How
The proxy-injector webhook receives admission requests for pods and services.
Regardless of the resource kind, it now checks if the resource should inherit
the opaque ports annotation from its namespace. It should inherit it if the
namespace has the annotation but the resource does not.
If the resource should inherit the annotation, the webhook creates an annotation
patch which is only responsible for adding the opaque ports annotation.
After generating the annotation patch, it checks if the resource is injectable.
From here there are a few scenarios:
1. If no annotation patch was created and the resource is not injectable, then
admit the request with no changes. Examples of this are services with no OP
annotation and inject-disabled pods with no OP annotation.
2. If the resource is a pod and it is injectable, create a patch that includes
the proxy and proxy-init containers—as well as any other annotations and
labels.
3. The above two scenarios lead to a patch being generated at this point, so no
matter the resource the patch is returned.
### UI changes
Resources are now reported to either be "injected", "skipped", or "annotated".
The first pass at this PR worked around the fact that injection reports consider
services and namespaces injectable. This is not accurate because they don't have
pod templates that could be injected; they can however be annotated.
To fix this, an injection report now considers resources "annotatable" and uses
this to clean up some logic in the `inject` command, as well as avoid a more
complex proxy-injector webhook.
What's cool about this is it fixes some `inject` command output that would label
resources as "injected" when they were not even mutated. For example, namespaces
were always reported as being injected even if annotations were not added. Now,
it will properly report that a namespace has been "annotated" or "skipped".
### Tests
For testing, unit tests and integration tests have been added. Manual testing
can be done by installing linkerd with `debug` controller log levels, and
tailing the proxy-injector's app container when creating pods or services.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes #5939
Some CNIs reassign the IP of a terminating pod to a new pod, which
leads to duplicate IPs in the cluster.
It eventually triggers #5939.
This commit makes the IPWatcher, when given an IP, filter out terminating pods
(i.e. pods that have been given a deletionTimestamp).
The issue is hard to reproduce because we are not able to assign a
particular IP to a pod manually.
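A minimal sketch of that filter (simplified; not the actual IPWatcher code):
```go
package watcher

import corev1 "k8s.io/api/core/v1"

// runningPods drops terminating pods (those with a deletionTimestamp) so a
// reassigned IP never resolves to the old, dying pod.
func runningPods(pods []*corev1.Pod) []*corev1.Pod {
	out := make([]*corev1.Pod, 0, len(pods))
	for _, p := range pods {
		if p.DeletionTimestamp != nil {
			continue // terminating: its IP may already belong to a new pod
		}
		out = append(out, p)
	}
	return out
}
```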
Signed-off-by: Bruce <wenliang.chen@personio.de>
Co-authored-by: Bruce <wenliang.chen@personio.de>
This fixes an issue where pod lookups by host IP and host port fail even though
the cluster has a matching pod.
Usually these manifested as `FailedPrecondition` errors, but the messages were
too long and resulted in http/2 errors. This change depends on #5893 which fixes
that separate issue.
This changes how often those `FailedPrecondition` errors actually occur. The
destination service now considers pod host IPs and should reduce the frequency
of those errors.
Closes #5881
---
Lookups like this happen when a pod is created with a host IP and host port set
in its spec. It still has a pod IP when running, but requests to
`hostIP:hostPort` will also be redirected to the pod. Combinations of host IP
and host port are unique within the cluster and enforced by Kubernetes.
Currently, the destination service fails to find pods in this scenario because
we only keep an index of pods by their pod IPs, not by their host IPs.
To fix this, we now also keep an index of pods and their host IPs—if and only if
they have the host IP set.
Now when doing a pod lookup, we consider both the IP and the port. We perform
the following steps:
1. Do a lookup by IP in the pod podIP index
- If only one pod is found then return it
2. 0 or more than 1 pods have the same pod IP
3. Do a lookup by IP in the pod hostIP index
- If any number of pods were found, we know that IP maps to a node IP.
Therefore, we search for a pod with a matching host Port. If one exists then
return it; if not then there is no pod that matches `hostIP:port`
4. The IP does not map to a host IP
5. If multiple pods were found in `1`, then we know there are pods with
conflicting podIPs and an error is returned
6. If no pods were found in `1` then there is no pod that matches `IP:port`
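Below is a rough sketch of that lookup order with simplified types; `podIPIndex` and `hostIPIndex` are hypothetical stand-ins for the destination service's informer indices:
```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// lookupPod prefers a unique pod-IP match, then falls back to the host-IP
// index plus a host-port match, mirroring the steps above.
func lookupPod(podIPIndex, hostIPIndex map[string][]*corev1.Pod, ip string, port int32) (*corev1.Pod, error) {
	byPodIP := podIPIndex[ip]
	if len(byPodIP) == 1 {
		return byPodIP[0], nil // step 1: exactly one pod owns this pod IP
	}
	if byHostIP := hostIPIndex[ip]; len(byHostIP) > 0 {
		// step 3: the IP is a node IP, so match on the host port
		for _, p := range byHostIP {
			for _, c := range p.Spec.Containers {
				for _, cp := range c.Ports {
					if cp.HostPort == port {
						return p, nil
					}
				}
			}
		}
		return nil, fmt.Errorf("no pod exposes %s:%d", ip, port)
	}
	if len(byPodIP) > 1 {
		return nil, fmt.Errorf("conflicting pod IPs for %s", ip) // step 5
	}
	return nil, fmt.Errorf("no pod matches %s:%d", ip, port) // step 6
}

func main() {
	fmt.Println(lookupPod(nil, nil, "10.0.0.1", 4445))
}
```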
---
Aside from the additional IP watcher test being added, this can be tested with
the following steps:
1. Create a kind cluster. kind is required because its pods in `kube-system`
have the same pod IPs; this is not the case with k3d: `bin/kind create cluster`
2. Install Linkerd with `4445` marked as opaque: `linkerd install --set
proxy.opaquePorts="4445" |kubectl apply -f -`
3. Get the node IP: `kubectl get -o wide nodes`
4. Pull my fork of `tcp-echo`:
```
$ git clone https://github.com/kleimkuhler/tcp-echo
...
$ git checkout --track kleimkuhler/host-pod-repro
```
5. `helm package .`
6. Install `tcp-echo` with the server not injected and correct host IP: `helm
install tcp-echo tcp-echo-0.1.0.tgz --set server.linkerdInject="false" --set
hostIP="..."`
7. Looking at the client's proxy logs, you should not observe any errors or
protocol detection timeouts.
8. Looking at the server logs, you should see all the requests coming through
correctly.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
# Problem
While rolling out, often not all pods will be ready on the same set of
ports, leading the Kubernetes Endpoints API to return multiple subsets,
each covering a different set of ports, with the end result that the
same address gets repeated across subsets.
The old code for endpointsToAddresses would loop through all subsets, and the
later occurrences of an address would overwrite previous ones, with the
last one prevailing.
If the last subset happened to be for an irrelevant port, and the port to
be resolved is named, resolveTargetPort would resolve to port 0, which would
return port 0 to clients, ultimately leading linkerd-proxy to forward
connections to port 0.
This only happens if the pods selected by a service expose > 1 port, the
service maps to > 1 of these ports, and at least one of these ports is named.
# Solution
Never write an address to the set of addresses if the resolved port is 0,
which indicates that named port resolution failed.
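A tiny self-contained sketch of that guard (simplified types, not the real watcher code):
```go
package main

import "fmt"

type address struct {
	ip   string
	port uint32
}

// addAddress applies the rule above: a resolved port of 0 means named-port
// resolution failed for that subset, so the address must not overwrite a
// previously resolved entry.
func addAddress(addrs map[string]address, id, ip string, port uint32) {
	if port == 0 {
		return
	}
	addrs[id] = address{ip: ip, port: port}
}

func main() {
	addrs := map[string]address{}
	addAddress(addrs, "pod-1", "10.0.0.5", 8080) // relevant subset
	addAddress(addrs, "pod-1", "10.0.0.5", 0)    // irrelevant subset: ignored
	fmt.Println(addrs)                           // map[pod-1:{10.0.0.5 8080}]
}
```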
# Validation
Added a test case.
Signed-off-by: Riccardo Freixo <riccardofreixo@gmail.com>
This reduces the possible HTTP response size from the destination service when
it encounters an error during a profile lookup.
If multiple objects on a cluster share the same IP (such as pods in
`kube-system`), the destination service will return an error with the two
conflicting pod yamls.
In certain cases, these pod yamls can be too large for the HTTP response and the
destination pod's proxy will indicate that with the following error:
```
hyper::proto::h2::server: send response error: user error: header too big
```
From the app pod's proxy, this results in the following error:
```
poll_profile: linkerd_service_profiles::client: Could not fetch profile error=status: Unknown, message: "http2 error: protocol error: unexpected internal error encountered"
```
We now only return the names of the conflicting pods (or services). This reduces the
size of the returned error and prevents these warnings from occurring.
Example response error:
```
poll_profile: linkerd_service_profiles::client: Could not fetch profile error=status: FailedPrecondition, message: "Pod IP address conflict: kube-system/kindnet-wsflq, kube-system/kube-scheduler-kind-control-plane", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 12 Mar 2021 19:54:09 GMT"} }
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* update go.mod and docker images to go 1.16.1
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* update test error messages for ParseDuration
* update go version to 1.16.2
This change fixes an issue where the `linkerd-sp-validator` does not set
the `request.UID` for an `admissionResponse`. This causes an issue that
prevents service profiles from being added or updated.
Fixes #5862
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* destination: pass opaque-ports through cmd flag
Fixes#5817
Currently, default opaque ports are stored in two places, i.e.
`Values.yaml` and `opaqueports/defaults.go`. As these
ports are used only in destination, we can instead pass these
values as a cmd flag for the destination component from Values.yaml
and remove `defaultPorts` in `defaults.go`.
This means that if users override `Values.yaml`'s `opaquePorts`
field, that change is propagated both for injection and also for
discovery, as expected.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When introducing the `linkerd-await` helper, we provided a default value
for `TARGETARCH`. This appears to interfere with multi-arch image
builds, causing ARM builds to fetch amd64 binaries.
Unsetting this default appears to fix this issue.
When a container starts up, we generally want to wait for the proxy to
initialize before starting the controller (which may initiate outbound
connections, especially to the Kubernetes API). This is true for all
pods except the identity controller, which must start before its proxy.
This change adds the linkerd-await helper to all of our container
images. Its use is explicitly disabled in the identity controller, due
to startup ordering constraints, and the heartbeat controller, because
it does not run a proxy currently.
Fixes #5819
* Remove linkerd prefix from extension resources
This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>