Variable references are only expanded to previously defined
environment variables as per https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#envvar-v1-core,
which means that for `LINKERD2_PROXY_POLICY_WORKLOAD` to work correctly, the
`_pod_ns` and `_pod_name` variables must be defined before they are referenced.
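As an illustration, here's a minimal sketch of the proxy's env block with the
required ordering (the Downward API field paths are the standard ones; the
actual template may differ):
```yaml
env:
# these must be defined first so the reference below can be expanded
- name: _pod_ns
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
- name: _pod_name
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
# expands correctly only because _pod_ns and _pod_name appear above it
- name: LINKERD2_PROXY_POLICY_WORKLOAD
  value: $(_pod_ns):$(_pod_name)
```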
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes #6688
This PR adds the new `LINKERD2_PROXY_POLICY_SVC_ADDR` and
`LINKERD2_PROXY_POLICY_SVC_NAME` env variables which are used to specify
the address and the identity (which is `linkerd-destination`) of the
policy server respectively.
This also adds the new `LINKERD2_PROXY_POLICY_WORKLOAD` in the format
of `$ns:$pod` which is used to specify the identity of the workload itself.
A new `_pod_name` env variable has been added to get the name of the pod
through the Downward API.
These variables are only set if the `proxy.component` is not
`linkerd-identity`.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
`svc/linkerd-policy` selects pods on
`linkerd.io/control-plane-component: policy`, but this doesn't actually
select any controller pods.
This change ensures all `control-plane-component` labels use
`destination` instead of `policy`.
We've written a new controller--in Rust!--that implements discovery
APIs for inbound server policies. This change imports this code from
linkerd/polixy@25af9b5e.
This policy controller watches nodes, pods, and the recently-introduced
`policy.linkerd.io` CRD resources. It indexes these resources and serves
a gRPC API that will be used by proxies to configure the inbound proxy
for policy enforcement.
This change introduces a new policy-controller container image and adds a
container to the `linkerd-destination` pod along with a `linkerd-policy` service
to be used by proxies.
This change adds a `policyController` object to the Helm `values.yaml` that
supports configuring the policy controller at runtime.
Proxies are not currently configured to use the policy controller at runtime. This
will change in an upcoming proxy release.
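As a sketch, runtime configuration in `values.yaml` could look like the
following (the `policyController` key comes from this change; the nested
field names shown are assumptions):
```yaml
policyController:
  image:
    name: cr.l5d.io/linkerd/policy-controller  # image name is an assumption
  logLevel: info
  resources:
    cpu:
      limit: "250m"
    memory:
      limit: "128Mi"
```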
We've implemented new CRDs to support configuration and discovery of
inbound policies. This change imports these CRDs from
linkerd/polixy@25af9b5 and adds them to the default Linkerd installation:
- `servers.policy.linkerd.io`
- `serverauthorizations.policy.linkerd.io`
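As a sketch, a `Server` resource marking a workload port could look like
this (field names follow the polixy design and are assumptions here):
```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  namespace: emojivoto
  name: web-http
spec:
  podSelector:
    matchLabels:
      app: web-svc
  port: http            # a container port name or number on the selected pods
  proxyProtocol: HTTP/1
```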
* Do not install PSP resources by default
Fixes #6549
PodSecurityPolicy is deprecated as of k8s v1.21 and will be unavailable starting k8s v1.25. This was causing warnings to be displayed in `linkerd install/upgrade` and `linkerd check`.
By default, do not include the linkerd PSP resource along with its Role and RoleBinding. If the user wants to keep it, they can opt in by setting `enablePSP: true`, a new config introduced for this purpose.
This was done in the linkerd, linkerd-cni, linkerd-viz, multicluster and jaeger charts.
The associated checks were also removed, including the NET_ADMIN+NET_RAW capabilities check, which relied on the PSP API.
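For users who still want the PSP resources, opting back in is a one-line
values override:
```yaml
# values.yaml: re-enable the linkerd PSP resource with its Role and RoleBinding
enablePSP: true
```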
Fixes #6452
We add a `linkerd-identity-trust-roots` ConfigMap which contains the configured trust root bundle. The proxy template partial is modified so that core control plane components load this bundle from the configmap through the downward API.
The identity controller is updated to mount this new configmap as a volume and read the trust root bundle at startup.
Similarly, the proxy-injector also mounts this new configmap. For each pod it injects, it reads the trust root bundle file and sets it on the injected pod.
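A sketch of how an injected proxy could consume the bundle (the env var is
the proxy's standard trust-anchors variable; the ConfigMap key name is an
assumption):
```yaml
- name: LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS
  valueFrom:
    configMapKeyRef:
      name: linkerd-identity-trust-roots
      key: ca-bundle.crt   # key name is an assumption
```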
Signed-off-by: Alex Leong <alex@buoyant.io>
Increase container security by making the root file system of the cni
install plugin read-only.
Change the temporary directory used in the cni install script, add a
writable EmptyDir volume, and enable the `readOnlyRootFilesystem`
securityContext in the cni plugin helm chart.
Tested this by building the container image of the cni plugin and
installing the chart onto a cluster. Logs looked the same as before this
change.
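A sketch of the relevant daemonset pieces after this change (the volume
name is illustrative):
```yaml
containers:
- name: install-cni
  securityContext:
    readOnlyRootFilesystem: true
  volumeMounts:
  - name: linkerd-tmp-dir    # writable scratch space for the install script
    mountPath: /tmp
volumes:
- name: linkerd-tmp-dir
  emptyDir: {}
```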
Fixes #6468
Signed-off-by: Gerald Pape <gerald@giantswarm.io>
* Set `LINKERD2_PROXY_INBOUND_PORTS` during injection
Fixes #6267
The `LINKERD2_PROXY_INBOUND_PORTS` env var will be set during injection,
containing a comma-separated list of the ports in the non-proxy containers in
the pod. For the identity, destination and injector pods, the var is set
manually in their Helm templates.
Since the proxy-injector isn't reinvoked, containers injected by a mutating
webhook after the injector has run won't be detected. As an escape hatch, the
`config.linkerd.io/pod-inbound-ports` annotation has been added to allow
explicit overrides.
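For example, the override annotation on a pod template (the ports shown are
illustrative):
```yaml
metadata:
  annotations:
    # explicitly list the app containers' ports instead of relying on detection
    config.linkerd.io/pod-inbound-ports: "8080,9090"
```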
Other changes:
- Removed `controller/proxy-injector/fake/data/inject-sidecar-container-spec.yaml`, which is no longer used.
- Fixed bad indentation in some fixture files under `controller/proxy-injector/fake/data`.
Default Linkerd skip and opaque port configuration
Adds default ports that were missing relative to the docs.
Addresses: Add Redis to default list of Opaque ports (#6132).
Once merged, the default install values will match the recommendations in Linkerd's TCP ports guide.
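As an illustration, the shape of the resulting default in `values.yaml`
(the authoritative list is the one in the TCP ports guide; the ports shown
here, with 6379 for Redis, are an approximation):
```yaml
proxy:
  opaquePorts: "25,587,3306,4444,5432,6379,9300,11211"
```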
Fixes #6132
Signed-off-by: jasonmorgan <jmorgan@f9vs.com>
Co-authored-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
* Add missing `ignoreInboundPorts` to CNI config and fix `ignoreOutboundPorts` to support passing just one port
This change fixes an issue where tap does not work when running Linkerd
through Linkerd CNI installed via helm charts. This issue was caused by
the CNI chart's value not including tap control and admin ports in the
config. This caused tap request traffic to go to the inbound side of the
proxy as opposed to the respective tap control port.
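A sketch of the corrected CNI chart values (4190 and 4191 are the proxy's
standard tap-control and admin ports):
```yaml
# linkerd2-cni values: let tap control and admin traffic bypass the proxy's
# inbound redirect
ignoreInboundPorts: "4190,4191"
```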
Fixes #6224
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
List of changes:
- Include more output in the `simulate` mode (thanks @liuerfire!)
- Log to `stdout` instead of `stderr` (thanks @mo4islona!)
Non user-facing changes:
- Added `dependabot.yml` to receive automated dependencies upgrades PRs (both for go and github actions). As a result, also upgraded a bunch of dependencies.
Followup to #6181
After having uploaded the `artifacthub-repo.yml` file, we were able to
claim ownership of the linkerd2-edge repo. This generated a
`repositoryID` in artifacthub.io that we're adding into this same file,
so that we can be marked as "Verified Publisher".
Also we moved that file one level up into `charts/` so it doesn't get
included in the built chart, and created two versions of that file, one
for edge and another for stable (whose ID will be received when the next
stable helm release is pushed), that will be copied into the helm repo
by the `release.yml` workflow.
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
The misspellings have been reported at 0d56327e6f (commitcomment-51603624)
The action reports that the changes in this PR would make it happy: 03a9c310aa
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
The owners in that file match the entries in MAINTAINERS.md.
On the next release build this will upload that file into our Helm repo,
and we'll be able to claim ownership of the chart at artifacthub.io.
Once we do that we'll get a `repositoryID` we can add into that file, to
obtain the "Verified publisher" badge.
This change also removes a couple of no longer relevant OWNERS files.
* Skip configuring firewall if rules exist
This change fixes an issue where `proxy-init` will fail if
`PROXY_INIT_*` chains already exist in the pod's iptables. This then
causes the pod to never start, because proxy-init exits with a
non-zero exit code.
In this change, we capture the output of the `iptables-save` command and
then check whether the output contains the `PROXY_INIT_*` chains. If they
do, we skip the firewall configuration and log a warning stating that the
chains already exist.
Fixes #5786
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Support Traffic Splitting through `ServiceProfile.dstOverrides`
This commit:
- updates the destination logic to prevent the override of `ServiceProfile.dstOverrides` when a `TrafficSplit` is absent and no `dstOverrides` are set. This means that any `dstOverrides` set by the user are correctly propagated to the proxy, thus making traffic splitting through service profiles work.
- updates `profile_translator.toDstOverrides` to use the port from `GetProfiles` when there is no port in `dstOverrides.authority`.
After these changes, the following `ServiceProfile` for performing
traffic splitting should work:
```yaml
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local
spec:
  dstOverrides:
  - authority: "backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
  - authority: "failing-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
```
This PR also adds an integration test, `TestTrafficSplitCliWithSP`, which checks
traffic splitting through service profiles. This integration test follows the same pattern
as the other traffic split integration tests, but instead we perform `linkerd viz stat` on the
client and check that the expected server objects are there, as `linkerd viz stat ts`
does not _yet_ work with service profiles.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
### What
This change adds the `config.linkerd.io/proxy-await` annotation which when set will delay application container start until the proxy is ready. This allows users to force application containers to wait for the proxy container to be ready without modifying the application's Docker image. This is different from the current use-case of [linkerd-await](https://github.com/olix0r/linkerd-await) which does require modifying the image.
---
To support this, Linkerd is using the fact that containers are started in the order that they appear in `spec.containers`. If `linkerd-proxy` is the first container, then it will be started first.
Kubernetes will start each container without waiting on the result of the previous container. However, if a container has a hook that is executed immediately after container creation, then Kubernetes will wait on the result of that hook before creating the next container. Using a `PostStart` hook in the `linkerd-proxy` container, the `linkerd-await` binary can be run and force Kubernetes to pause container creation until the proxy is ready. Once `linkerd-await` completes, the container hook completes and the application container is created.
Adding the `config.linkerd.io/proxy-await` annotation to a pod's metadata results in the `linkerd-proxy` container being the first container, as well as having the container hook:
```yaml
postStart:
exec:
command:
- /usr/lib/linkerd/linkerd-await
```
---
### Update after draft
There has been some additional discussion both off GitHub as well as on this PR (specifically with @electrical).
First, we decided that this feature should be enabled by default. The reason is that, more often than not, this feature will prevent start-up ordering issues from occurring without having any negative effects on the application. Additionally, this will be part of edge releases up until 2.11 (the next stable release), and having it enabled by default will allow us to check that it does not often conflict with applications. Once we are closer to 2.11, we'll be able to determine if this should be disabled by default because it causes more issues than it prevents.
Second, this feature will remain configurable; if disabled, then upon injection the proxy container will not be made the first container in the pod manifest. This is important for the reasons discussed with @electrical about tools that make assumptions about app containers being the first container. For example, Rancher defaults to showing overview pages for the `0` index container, and if the proxy container was always `0` then this would defeat the purpose of the overview page.
### Testing
To test this I used the `sleep.sh` script and changed `Dockerfile-proxy` to use it as its `ENTRYPOINT`. This forces the container to sleep for 20 seconds before starting the proxy.
---
`sleep.sh`:
```bash
#!/bin/bash
echo "sleeping..."
sleep 20
/usr/bin/linkerd2-proxy-run
```
`Dockerfile-proxy`:
```dockerfile
...
COPY sleep.sh /sleep.sh
RUN ["chmod", "+x", "/sleep.sh"]
ENTRYPOINT ["/sleep.sh"]
```
---
```bash
# Build and install with the above changes
$ bin/docker-build
...
$ bin/image-load --k3d
...
$ bin/linkerd install |kubectl apply -f -
```
Annotate the `emoji` deployment so that it's the only workload that waits for its proxy to be ready, and inject it:
```bash
cat emojivoto.yaml |bin/linkerd inject - |kubectl apply -f -
```
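The annotation itself sits on the workload's pod template; a sketch
(assuming `enabled` as the value):
```yaml
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-await: "enabled"
```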
You can then see that the `emoji` deployment is not starting its application container until the proxy is ready:
```bash
$ kubectl get -n emojivoto pods
NAME READY STATUS RESTARTS AGE
voting-ff4c54b8d-sjlnz 1/2 Running 0 9s
emoji-f985459b4-7mkzt 0/2 PodInitializing 0 9s
web-5f86686c4d-djzrz 1/2 Running 0 9s
vote-bot-6d7677bb68-mv452 1/2 Running 0 9s
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Remove the `linkerd-controller` pod
Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.
## Removals
- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put them under `/pkg/prometheus/test_helper.go`, which is now used by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.
## Health checks
- Removed the `can initialize the client` check, given that client is no longer needed. The `linkerd-api` section was left with only the `control pods are ready` check, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.
## Other changes
- Fixed the Control Plane section of the dashboard, taking into account the disappearance of `linkerd-controller` and, previously, of `linkerd-sp-validator`.
* Removed Destination's `Get` API from the public-api
This is the first step towards removing the `linkerd-controller` pod. It deals with removing the Destination `Get` http and gRPC endpoints it exposes, which only `linkerd diagnostics endpoints` consumes.
Removed all references to Destination in the public-api, including all the gRPC-to-http-to-gRPC forwardings:
- Removed the `Get` method from the public-api gRPC server that forwarded the connection from the controller pod to the destination pod. Clients should now connect directly to the destination service via gRPC.
- Likewise, removed the destination boilerplate in the public-api http server (and its `main.go`) that served the `Get` destination endpoint and forwarded it into the gRPC server.
- Finally, removed the destination boilerplate from the public-api's `client.go` that created a client connecting to the http API.
* Move `sp-validator` into the `destination` pod
Fixes #5195
The webhook secrets remain the same, as do the `profileValidator` settings in `values.yaml`, so this doesn't pose any upgrading challenges.
Allow users to add extra initContainers to the linkerd2-cni daemonset through a new helm value, `extraInitContainers` (see the sketch below). This can be used, for example, to wait for the existence of a file before starting the installation of the linkerd2-cni.
Helps with race conditions like #2219 and #4789.
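A hypothetical values sketch that waits for another CNI's config file to
appear before proceeding (the names and paths are illustrative):
```yaml
extraInitContainers:
- name: wait-for-host-cni
  image: busybox:1.33
  command:
  - sh
  - -c
  - until [ -f /host/etc/cni/net.d/10-calico.conflist ]; do sleep 2; done
  volumeMounts:
  - name: cni-net-dir        # the chart's host CNI config volume; name assumed
    mountPath: /host/etc/cni/net.d
```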
* trafficsplit: Support v1alpha2 version
This PR updates the TrafficSplit CRD to add the `v1alpha2` version,
which changes the `weight` field from `resource.Quantity` to an
integer.
Our initial approach to fixing this was to switch all the types in
the code repo to `v1alpha2`. This was complex because switching
from `resource.Quantity` to integer means that a conversion webhook
would be needed to support existing users and not break stuff in
clusters.
Instead of changing all golang TrafficSplit types to `v1alpha2`, this
PR takes a different approach, i.e. keep the `storage` version and
all the types in destination as `v1alpha1`, and make k8s convert `v1alpha2`
resources into `v1alpha1`. This works because `v1alpha1`'s `weight` field
is a `resource.Quantity`, which also covers plain integers
(as used in `v1alpha2`). By keeping all the internal types on
`v1alpha1`, we can essentially support both versions without having to
write conversion webhooks. The obvious trade-off here is that we are
essentially not moving away from `v1alpha1` in the code repository.
Because with TrafficSplit the proportions matter more than the exact
digits, we are not losing any precision when converting from an integer
into a `resource.Quantity`. So, with this change, the following TrafficSplit
resource should work:
```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: reviews-rollout
spec:
  service: reviews
  backends:
  - service: reviews-v2
    weight: 50
  - service: reviews-v3
    weight: 50
```
This PR also updates the trafficsplit integration test to run the same
tests for both v1alpha1 and v1alpha2.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Currently, we do not support the `matches` field in the TrafficSplit
CRD. It seems to have been mistakenly added when OpenAPIV3 validation
support was added.
This PR updates the TrafficSplit CRD to remove that field, while
also updating the golden test files.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Schedule heartbeat 10 mins after install
... for the Helm installation method, thus aligning it with the CLI
installation method, to reduce the midnight peak on the receiving end.
The logic added into the chart is now reused by the CLI as well.
Also, set `concurrencyPolicy=Replace` so that when a job fails and it's
retried, the retries get canceled when the next scheduled job is triggered.
Finally, the go client only failed when the connection failed;
successful connections with a non-200 response status were considered
successful and thus the job wasn't retried. Fixed that as well.
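The relevant CronJob spec fields, as a sketch (the schedule value is
computed at install time):
```yaml
spec:
  schedule: "42 15 * * *"     # illustrative: 10 minutes after the install time
  concurrencyPolicy: Replace  # pending retries are canceled when the next run fires
```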
The `ignoreInboundPorts` field is not parsed correctly when passed through the
`--set` Helm flag. This was discovered in https://github.com/linkerd/linkerd2/pull/5874#pullrequestreview-606779599.
This is happening because the value is not parsed into a string before using it
in the templating.
Before:
```
linkerd install --set proxyInit.ignoreInboundPorts="12345" |grep 12345
...
- "4190,4191,%!s(int64=12345)"
...
```
After:
```
linkerd install --set proxyInit.ignoreInboundPorts="12345" |grep 12345
...
- "4190,4191,12345"
...
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Currently, there are no `Notes` printed out after installation
is performed through helm for extensions, like we do for the core
chart. This updates the viz and jaeger charts to include that,
along with instructions to view the dashboard.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
In #5694 we set many Helm values to empty (like version tags), and then the
templates became responsible for filling in the proper default values. We missed
doing that for the `linkerd.io/proxy-version` annotation built in the
`_metadata.tpl` template. This fixes that, and also stops setting
`proxy.image.version` in `helm install|upgrade` in the Helm integration tests,
which is what had kept this error from being caught.
* destination: pass opaque-ports through cmd flag
Fixes #5817
Currently, default opaque ports are stored in two places, i.e.
`values.yaml` and `opaqueports/defaults.go`. As these
ports are only used in destination, we can instead pass these
values from `values.yaml` as a cmd flag for the destination component
and remove the default ports in `defaults.go`.
This means that if users override the `opaquePorts` field in
`values.yaml`, that change is propagated both to injection and to
discovery, as expected.
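A sketch of the wiring in the destination template (the flag name is an
assumption):
```yaml
containers:
- name: destination
  args:
  - destination
  - -default-opaque-ports={{ .Values.proxy.opaquePorts }}
```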
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When a container starts up, we generally want to wait for the proxy to
initialize before starting the controller (which may initiate outbound
connections, especially to the Kubernetes API). This is true for all
pods except the identity controller, which must start before its proxy.
This change adds the linkerd-await helper to all of our container
images. Its use is explicitly disabled in the identity controller, due
to startup ordering constraints, and the heartbeat controller, because
it does not run a proxy currently.
Fixes #5819
* Remove linkerd prefix from extension resources
This change removes the `linkerd-` prefix on all non-cluster resources
in the jaeger and viz linkerd extensions. Removing the prefix makes all
linkerd extensions consistent in their naming.
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
This change removes the default ignored inbound and outbound ports from the
proxy init configuration.
These ports have been moved to the `proxy.opaquePorts` configuration so that
by default, installations will proxy all traffic on these ports opaquely.
Closes #5571. Closes #5595.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Currently the identity controller is the only component that receives the CA certificate / trust anchors as option `-identity-trust-anchors-pem` instead of an env var.
This prevents reading the trust anchors from a Secret that is managed by e.g. cert-manager.
This PR uses an env var instead of the option to provide the trust anchors. For most helm chart users this doesn't change anything. However, using kustomize, the helm output manifest can now be adjusted (again) so that the certificate is loaded from a ConfigMap or Secret like in [this example](https://github.com/mgoltzsche/khelm/tree/master/example/kpt/linkerd), which aims to produce a static manifest to make the installation/update more declarative and support GitOps workflows.
This PR does not provide chart options/values to specify Secrets upfront, as that would introduce dependencies on other operators.
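With this change, kustomize can patch the identity container's env to read
the bundle from a ConfigMap; a sketch (the env var name follows the chart,
while the ConfigMap name and key are illustrative):
```yaml
- name: LINKERD2_IDENTITY_TRUST_ANCHORS_PEM
  valueFrom:
    configMapKeyRef:
      name: linkerd-trust-bundle   # user-managed ConfigMap; name illustrative
      key: ca-bundle.crt           # key name illustrative
```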
Relates to #3843, see https://github.com/linkerd/linkerd2/issues/3843#issuecomment-775516217. Fixes #3321
Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
This change counts the number of service profiles installed in a cluster
and adds that info to the heartbeat HTTP request.
Fixes #5474
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Fixes #5574 and supersedes #5660
- Removed from all the `values.yaml` files all those "do not edit" entries for annotation/label names, hard-coding them in the templates instead.
- The `values.go` files got simplified as a result.
- The `created-by` annotation was also refactored into a reusable partial. This means we had to add a `partials` dependency to multicluster.
This adds namespace inheritance of the opaque ports annotation to services.
This means that the proxy injector now watches services creation in a cluster.
When a new service is created, the webhook receives an admission request for
that service and determines whether a patch needs to be created.
A patch is created if the service does not have the annotation, but the
namespace does. This means the service inherits the annotation from the
namespace.
A patch is not created if neither the service nor the namespace has the
annotation, or if the service already has the annotation. In the latter
case, we don't even need to check the namespace since the service would
not inherit the annotation anyway.
If a namespace has the annotation value changed, this will not be reflected on
the service. The service would need to be recreated so that it goes through
another admission request.
None of this applies to the `inject` command which still skips service
injection. We rely on being able to check the namespace annotations, and this is
only possible in the proxy injector webhook when we can query the k8s API.
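For example (using the standard opaque-ports annotation):
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  annotations:
    config.linkerd.io/opaque-ports: "3306"
---
# a Service created in this namespace without the annotation gets
# config.linkerd.io/opaque-ports: "3306" patched in by the webhook
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: demo
spec:
  ports:
  - port: 3306
```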
Closes #5737
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
We've created a custom domain, `cr.l5d.io`, that redirects to `ghcr.io`
(using `scarf.sh`). This custom domain allows us to swap the underlying
container registry without impacting users. It also provides us with
important metrics about container usage, without collecting PII like IP
addresses.
This change updates our Helm charts and CLIs to reference this custom
domain. The integration test workflow now refers to the new domain,
while the release workflow continues to use the `ghcr.io/linkerd` registry
for the purpose of publishing images.
Fixes #5685
Currently, YAML anchors are not supported in Helm values
fields. These were used for the `default*` flags that
override settings across components.
For this to work, the YAML anchors had to be removed, relying
instead on the `default` function directly in the template files.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes #5755; follow-up to #5750 and #5751
- Unifies the Go version across Docker and CI to be 1.14.15;
- Updates the GitHub Actions base image from ubuntu-18.04 to ubuntu-20.04; and
- Updates the runtime base image from debian:buster-20201117-slim to debian:buster-20210208-slim.
Fixes #5616
Changed the control-plane templates (i.e. controller, identity, destination, proxy-injector and sp-validator) to add 443 to the skip ports if not present. This prevents the Linkerd installation from failing when 443 is not present in `ignoreOutboundPorts`.
Signed-off-by: Jimil Desai <jimildesai42@gmail.com>
Fixes #5652
This PR adds a new annotation that is set when an
external Prometheus is used. Based on that
annotation, the CLI can tell whether an external instance
is being used; if the annotation is absent, the
default instance is present.
This updates the viz checks to skip some checkers if the default
Prometheus instance is absent.
This PR also removes the grafana checks as they are not useful
and add unnecessary complexity.
This also cleans up some `grafanaUrl` stuff from the core
control-plane chart.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* values: removal of .global field
Fixes #5425
With the new extension model, we no longer need the `Global` field,
as we don't rely on chart dependencies anymore. This helps us
further clean up Values and makes configuration simpler.
To make upgrades and the usage of the new CLI with older config work,
we add a new method, `config.RemoveGlobalFieldIfPresent`, that is used
in the upgrade and `FetchCurrentConfiguration` paths to remove the
global field and attach its child nodes if global is present. This is
verified by `TestFetchCurrentConfiguration`'s older test that has the
global field.
We also don't yet remove `.global` in some helm stable-upgrade tests,
so that the initial install works.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>