Fixes #10762
The Linkerd control plane chart contains a Lease resource which is used by the Policy controller to do leader election. ArgoCD considers Leases to be runtime resources and will not deploy them. This means that Linkerd will not work for users of ArgoCD.
We remove the policy-controller-write Lease resource from the Helm chart and instead have the policy controller create this resource at startup. We create it with an `Apply` patch with `resourceVersion="0"`. This ensures that the Lease resource will only be created if it does not already exist and that if there are multiple replicas of the policy controller starting up at once, only one of them will create the Lease resource.
We also set the `linkerd-destination` Deployment as the owner reference of the Lease resource. This means that when the `linkerd-destination` Deployment is deleted (for example, when Linkerd is uninstalled) then the Lease will be garbage collected by Kubernetes.
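For reference, a quick way to verify the Lease and its owner reference on a running cluster (a sketch; the resource name and namespace follow this change, and the output line is illustrative):
```bash
kubectl get lease policy-controller-write -n linkerd \
  -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}{"\n"}'
# Deployment/linkerd-destination
```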
Signed-off-by: Alex Leong <alex@buoyant.io>
* edge-23.4.2
This edge release contains a number of bug fixes.
* CLI
* Fixed `linkerd uninstall` issue for HTTPRoute
* The `linkerd diagnostics policy` command now displays outbound policy when
the target resource is a Service
* CNI
* Fixed an incompatibility issue with the AWS CNI addon in EKS that prevented
pods from acquiring networking after scaling up nodes (thanks @frimik!)
* Added `--set` flag to the `install-cni` plugin (thanks @amit-62!)
* Control Plane
* Fixed an issue where the policy controller always used the default
`cluster.local` domain
* Send Opaque protocol hint for opaque ports in destination controller
* Helm
* Fixed an issue in the viz Helm chart where the namespace metadata template
would throw `unexpected argument found` errors
* Fixed Jaeger chart installation failure
* Multicluster
* Remove namespace field from cluster scoped resources to fix pruning
* Proxy
* Updated `h2` dependency to include a patch for a theoretical
denial-of-service vulnerability discovered in CVE-2023-26964
* Handle Opaque protocol hints on endpoints
* Changed the proxy's default log level to silence warnings from
`trust_dns_proto` that are generally spurious.
* Added `outbound_http_balancer_endpoints` metric
* Fixed missing `route_` metrics for requests with ServiceProfiles
* Viz
* Bump prometheus image to v2.43.0
* Add the `kubelet` NetworkAuthentication back since it is used by the
`linkerd viz allow-scrapes` subcommand.
---------
Signed-off-by: David McLaughlin <david@dmclaughlin.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
Fixed an incompatibility issue with the AWS CNI addon in EKS that prevented
pods from acquiring networking after scaling up nodes.
Credits to @frimik for providing a diagnosis and fix, and to @JonKusz for the detailed repro
This PR adds support for the `--set` flag to the linkerd cni-plugin installation command.
It also updates the test file for cni-plugin install and fixes a bug in `pkg/chart/charts.go` affecting the resources template.
Fixes #9917
* Allow supporting all flags and values
This leverages `chartutil.CoalesceValues` in order to merge the values provided through regular flags with the ones provided via `--set` flags. As that function consumes maps, I introduced the `ToMap` method on the cni `Values` struct (a copy of the same method from the core linkerd `Values` struct) to convert the struct backing the regular flags into a map.
And for the `RenderCNI` method to be able to deal with value maps instead of yaml, the `charts.Chart` struct now distinguishes between `Values` (a map) and `RawValues` (YAML).
This allowed removing the `parseYAMLValue` function, avoids having to deal with individual entries in `buildValues()`, and means we no longer need the `valuesOverrides` field in the `cniPluginOptions` struct.
## Tests
```bash
# Testing regular flag
$ bin/go-run cli install-cni --use-wait-flag | grep use.wait.flag
"use-wait-flag": true
# Testing using --set
$ bin/go-run cli install-cni --set useWaitFlag=true | grep use.wait.flag
"use-wait-flag": true
# Testing using --set on a setting that has no regular flag
$ bin/go-run cli install-cni --set enablePSP=true | grep PodSecurityPolicy
kind: PodSecurityPolicy
```
---------
Signed-off-by: amit-62 <kramit6662@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
Co-authored-by: Matei David <matei.david.35@gmail.com>
* add `trust_dns=error` to default proxy log level
Since upstream has yet to release a version with PR
bluejekyll/trust-dns#1881, this commit changes the proxy's default log
level to silence warnings from `trust_dns_proto` that are generally
spurious.
Closes#10123.
The Helm docs action in CI (which checks for discrepancies in Helm chart readmes) only checks the core Linkerd Helm charts, while allowing discrepancies in extension chart readmes.
Update the action to enforce Helm doc consistency in extension charts as well.
Signed-off-by: Alex Leong <alex@buoyant.io>
We currently use a label to filter the resources to uninstall, but `httproutes.policy.linkerd.io` does not carry that label, so `linkerd uninstall` never removes it.
Signed-off-by: Loong <loong.dai@intel.com>
Fixes #10737
The cluster domain is not passed to the policy controller as a flag. This means that the policy controller always uses the default value of `cluster.local`. If the cluster's domain is different from this default, the policy controller will return incorrect authorities in its outbound policy API.
This change passes the `--cluster-domain` flag to the policy controller.
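A quick way to confirm the flag is threaded through is to render the chart with a non-default domain and grep the output (a sketch; the domain is illustrative):
```bash
linkerd install --cluster-domain=example.local | grep cluster-domain
# the policy controller container args should now carry the
# example.local domain rather than cluster.local
```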
Signed-off-by: Alex Leong <alex@buoyant.io>
## stable-2.13.0
This release introduces client-side policy to Linkerd, including dynamic routing
and circuit breaking. [Gateway API](https://gateway-api.sigs.k8s.io/) HTTPRoutes
can now be used to configure policy for outbound (client) proxies as well as
inbound (server) proxies, by creating HTTPRoutes with Service resources as their
`parentRef`. See the Linkerd documentation for tutorials on [dynamic request
routing] and [circuit breaking]. New functionality for debugging HTTPRoute-based
policy is also included in this release, including [new proxy metrics] and the
ability to display outbound policies in the `linkerd diagnostics policy` CLI
command.
In addition, this release adds `network-validator`, a new init container to be
used when CNI is enabled. `network-validator` ensures that local iptables rules
are working as expected. It will validate this before linkerd-proxy starts.
`network-validator` replaces the `noop` container, runs as `nobody`, and drops
all capabilities before starting.
Finally, this release includes a number of bugfixes, performance improvements,
and other smaller additions.
**Upgrade notes**: Please see the [upgrade instructions][upgrade-2130].
* CRDs
* HTTPRoutes may now have Service parents, to configure outbound policy
* Updated HTTPRoute version from `v1alpha1` to `v1beta2`
* CLI
* Added a new `linkerd prune` command to the CLI (including most extensions) to
remove resources which are no longer part of Linkerd's manifests
* Added additional shortnames for Linkerd policy resources (thanks @javaducky!)
* The `linkerd diagnostics policy` command now displays outbound policy when
the target resource is a Service
* Control Plane
* The policy controller now discovers outbound policy configurations from
HTTPRoutes that target Services.
* Added OutboundPolicies API, for use by `linkerd-proxy` to route
outbound traffic
* Added Prometheus `/metrics` endpoint to the admin server, with process
metrics
* Fixed QueryParamMatch parsing for HTTPRoutes
* Added the policy status controller which writes the `status` field to
HTTPRoutes when a parent reference Server accepts or rejects it
* Added KubeAPI server ports to `ignoreOutboundPorts` of `proxy-injector`
* No longer apply `waitBeforeExitSeconds` to control plane, viz and jaeger
extension pods
* Added support for the `internalTrafficPolicy` of a service (thanks @yc185050!)
* Added block chomping to strip trailing new lines in ConfigMap (thanks @avdicl!)
* Added protection against nil dereference in resources helm template
* Added support for Pod Security Admission (Pod Security Policy resources are
still supported but disabled by default)
* Lowered non-actionable error messages in the Destination log to debug-level
entries to avoid triggering false alarms (thanks @siddharthshubhampal!)
* Fixed an issue with EndpointSlice endpoint reconciliation on slice deletion;
when using more than one slice, a `NoEndpoints` event would be sent to the
proxy regardless of the amount of endpoints that were still available
(thanks @utay!)
* Improved diagnostic log messages
* Fixed sending of spurious profile updates
* Removed unnecessary Namespaces access from the destination controller RBAC
* Added the `server_port_subscribers` metric to track the number of subscribers
to Server changes associated with a pod's port
* Added the `service_subscribers` metric to track the number of subscribers to
Service changes
* Fixed a small memory leak in the opaque ports watcher
* Proxy
* Use the new OutboundPolicies API, supporting Gateway API-style routes
in the outbound proxy
* Added support for dynamic request routing based on HTTPRoutes
* Added HTTP circuit breaking
* Added `outbound_route_backend_http_requests_total`,
`outbound_route_backend_grpc_requests_total`, and
`outbound_http_balancer_endpoints` metrics
* Changed the proxy's behavior when traffic splitting so that only services
that are not in failfast are used. This will enable the proxy to manage
failover without external coordination
* Updated tokio (the async runtime) in the proxy, which should reduce CPU usage,
especially for the proxy's pod-local (i.e. in the same network namespace)
communication
* linkerd-proxy-init
* Changed `proxy-init` iptables rules to be idempotent upon init pod
restart (thanks @jim-minter!)
* Improved logging in `proxy-init` and `linkerd-cni`
* Added a `proxyInit.privileged` setting to control whether the `proxy-init`
initContainer runs as a privileged process
* CNI
* Added static and dynamic port overrides for CNI eBPF to work with socket-level
load balancing
* Added `network-validator` init container to ensure that iptables rules are
working as expected
* Added a `resources` field in the linkerd-cni chart (thanks @jcogilvie!)
* Viz
* Added `tap.ignoredHeaders` Helm value to the linkerd-viz chart. This value
allows users to specify a comma-separated list of header names which will be
ignored by Linkerd Tap (thanks @ryanhristovski!)
* Removed duplicate SecurityContext in Prometheus manifest
* Added new flag `--viz-namespace` which avoids requiring permissions for
listing all namespaces in `linkerd viz` subcommands (thanks @danibaeyens!)
* Removed the TrafficSplit page from the Linkerd viz dashboard (thanks
@h-dav!)
* Introduced new values in the `viz` chart to allow for arbitrary annotations
on the `Service` objects (thanks @sgrzemski!)
* Added an optional AuthorizationPolicy to authorize Grafana to Prometheus
in the Viz extension
* Multicluster
* Removed duplicate AuthorizationPolicy for probes from the multicluster
gateway Helm chart
* Updated wording for the linkerd-multicluster check when it fails to probe a
remote gateway mirror
* Added multicluster gateway `nodeSelector` and `tolerations` helm parameters
* Added new configuration options for the multicluster gateway:
* `gateway.deploymentAnnotations`
* `gateway.terminationGracePeriodSeconds` (thanks @bunnybilou!)
* `gateway.loadBalancerSourceRanges` (thanks @Tyrion85!)
* Extensions
* Removed dependency on the `curlimages/curl` 3rd-party image used to initialize
extension namespaces' metadata (so they are visible to `linkerd check`),
replaced by the new `extension-init` image
* Converted `ServerAuthorization` resources to `AuthorizationPolicy` resources
in Linkerd extensions
* Removed policy resources bound to admin servers in extensions (previously
these resources were used to authorize probes but now are authorized by
default)
* Fixed the link to the Jaeger dashboard in the viz dashboard (thanks
@eugenegoncharuk!)
* Updated linkerd-jaeger's collector to expose port 4318 in order to support HTTP
alongside gRPC (thanks @uralsemih!)
* Among other dependency updates, the no-longer maintained ghodss/yaml library
was replaced with sigs.k8s.io/yaml (thanks @Juneezee!)
This release includes changes from a massive list of contributors! A special
thank-you to everyone who helped make this release possible:
* Andrew Pinkham [@jambonrose](https://github.com/jambonrose)
* Arnaud Beun [@bunnybilou](https://github.com/bunnybilou)
* Carlos Tadeu Panato Junior [@cpanato](https://github.com/cpanato)
* Christian Segundo [@someone-stole-my-name](https://github.com/someone-stole-my-name)
* Dani Baeyens [@danibaeyens](https://github.com/danibaeyens)
* Duc Tran [@ductnn](https://github.com/ductnn)
* Eng Zer Jun [@Juneezee](https://github.com/Juneezee)
* Ivan Ivic [@Tyrion85](https://github.com/Tyrion85)
* Joe Bowbeer [@joebowbeer](https://github.com/joebowbeer)
* Jonathan Ogilvie [@jcogilvie](https://github.com/jcogilvie)
* Jun [@junnplus](https://github.com/junnplus)
* Loong Dai [@daixiang0](https://github.com/daixiang0)
* María Teresa Rojas [@mtrojas](https://github.com/mtrojas)
* Mo Sattler [@MoSattler](https://github.com/MoSattler)
* Oleg Vorobev [@olegy2008](https://github.com/olegy2008)
* Paul Balogh [@javaducky](https://github.com/javaducky)
* Peter Smit [@psmit](https://github.com/psmit)
* Ryan Hristovski [@ryanhristovski](https://github.com/ryanhristovski)
* Semih Ural [@uralsemih](https://github.com/uralsemih)
* Shubhodeep Mukherjee [@shubhodeep9](https://github.com/shubhodeep9)
* Siddharth S Pal [@siddharthshubhampal](https://github.com/siddharthshubhampal)
* Subhash Choudhary [@subhashchy](https://github.com/subhashchy)
* Szymon Grzemski [@sgrzemski](https://github.com/sgrzemski)
* Takumi Sue [@mikutas](https://github.com/mikutas)
* Yannick Utard [@utay](https://github.com/utay)
* Yu Cao [@yc185050](https://github.com/yc185050)
* anoxape [@anoxape](https://github.com/anoxape)
* bastienbosser [@bastienbosser](https://github.com/bastienbosser)
* bitfactory-sem-denbroeder [@bitfactory-sem-denbroeder](https://github.com/bitfactory-sem-denbroeder)
* cui fliter [@cuishuang](https://github.com/cuishuang)
* eugenegoncharuk [@eugenegoncharuk](https://github.com/eugenegoncharuk)
* h-dav [@h-dav](https://github.com/h-dav)
* martinkubrak [@martinkubra](https://github.com/martinkubra)
* verbotenj [@verbotenj](https://github.com/verbotenj)
* ziollek [@ziollek](https://github.com/ziollek)
[dynamic request routing]: https://linkerd.io/2.13/tasks/configuring-dynamic-request-routing
[circuit breaking]: https://linkerd.io/2.13/tasks/circuit-breaking
[new proxy metrics]: https://linkerd.io/2.13/reference/proxy-metrics/#outbound-xroute-metrics
[upgrade-2130]: https://linkerd.io/2/tasks/upgrade/#upgrade-notice-stable-2130
Fixes #10694
We update the `linkerd diagnostics policy` command to support outbound policy in addition to inbound policy.
* `linkerd diagnostics policy` now requires a typed resource instead of just a pod name (i.e. `po/foo` instead of `foo`)
* if a pod is specified, inbound policy for that pod is displayed
* if a service is specified, outbound policy for that service is displayed
* no other resource types are supported
* since the output in JSON format is extremely verbose, we add support for an `--output`/`-o` flag which can be `json` or `yaml`, with `yaml` as the default (see the usage sketch below)
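For example, a usage sketch (namespace, resource names, and port are illustrative):
```bash
# Inbound policy for a pod
linkerd diagnostics policy -n emojivoto po/web-6cd67dc578-xyz12 8080
# Outbound policy for a Service, rendered as JSON instead of the default YAML
linkerd diagnostics policy -n emojivoto svc/web-svc 80 -o json
```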
Signed-off-by: Alex Leong <alex@buoyant.io>
proxy-init v2.2.1:
* Sanitize `subnets-to-ignore` flag
* Dep bumps
cni-plugin v1.1.0:
* Add support for the `config.linkerd.io/skip-subnets` annotation
* Dep bumps
validator v0.1.2:
* Dep bumps
Also, `linkerd-network-validator` is now released wrapped in a tar file, so this PR also amends `Dockerfile-proxy` to account for that.
We enable kubert's metrics feature which allows us to create a prometheus metrics endpoint on the policy controller's admin server. By default, only process metrics are surfaced.
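A way to spot-check the new endpoint (a sketch; it assumes the policy controller's admin server listens on port 9990, which may differ by version):
```bash
kubectl port-forward -n linkerd deploy/linkerd-destination 9990:9990 &
sleep 2 && curl -s localhost:9990/metrics | head
```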
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
The existing `linkerd check` command runs extension checks based on extension namespaces already on-cluster. This approach does not permit running extension checks without cluster-side components.
Introduce "CLI Checks". These extensions run as part of `linkerd check`, if they satisfy the following criteria:
1) executable in PATH
2) prefixed by `linkerd-`
3) supports an `_extension-metadata` subcommand that outputs self-identifying
JSON, for example:
```
$ linkerd-foo _extension-metadata
{
  "name": "linkerd-foo",
  "checks": "always"
}
```
4) the `name` value from `_extension-metadata` matches the filename, and `checks` equals `always`.
If a CLI Check is found that also would have run as an on-cluster extension check, it is run as a CLI Check only.
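A minimal, hypothetical extension satisfying these criteria could be a shell script named `linkerd-foo` on the PATH:
```bash
#!/usr/bin/env bash
# Hypothetical `linkerd-foo` executable: prefixed by `linkerd-`, on PATH, and
# answering `_extension-metadata` with self-identifying JSON whose `name`
# matches the filename and whose `checks` equals `always`.
if [ "$1" = "_extension-metadata" ]; then
  cat <<EOF
{
  "name": "linkerd-foo",
  "checks": "always"
}
EOF
fi
```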
Fixes #10544
The destination and identity controllers skip ports that are used by the
kubernetes API server to avoid startup ordering issues and to ensure
that the controllers are able to communicate with the kubernetes API
without requiring the control plane for discovery.
To avoid startup issues in the proxy-injector, the same configuration
should be applied.
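One way to observe the result on a running cluster is to read the proxy-injector pod template's skip-outbound-ports annotation (a sketch; the exact port list depends on the cluster):
```bash
kubectl get deploy linkerd-proxy-injector -n linkerd \
  -o jsonpath='{.spec.template.metadata.annotations.config\.linkerd\.io/skip-outbound-ports}{"\n"}'
```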
This removes the separate naming of “status controller” from the policy
controller resources, code, and comments. There is a single controller in all of
this — the policy controller. Part of the policy controller is maintaining the
status of policy resources. We can therefore remove this separate naming that
has been used as well as reorganize some of the code to use single naming
consts.
The lease resource has been renamed from `status-controller` to
`policy-controller-write`.
The controller name value has been renamed from
`policy.linkerd.io/status-controller` to `linkerd.io/policy-controller`. This
field appears in the `status` of HTTPRoutes indicating which controller applied
the status.
Lastly, I’ve updated the comments to remove the use of “status controller” and
moved the use of the name const to the `linkerd-policy-controller-core` package
so that it can be shared.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Implement the outbound policy API as defined in the proxy api: https://github.com/linkerd/linkerd2-proxy-api/blob/main/proto/outbound.proto
This API is consumed by the proxy for the routing of outbound traffic. It is intended to replace the GetProfile API, which is currently served by the destination controller. It has not yet been released in a proxy-api release, so we take a git dependency on it in the meantime.
This PR adds a new index to the policy controller which indexes HTTPRoutes and Services and uses this information to serve the outbound API. We also add outbound API tests to validate the behavior of this implementation.
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
This adds lease claims to the policy status controller so that upon startup, a
status controller attempts to claim the `status-controller` lease in the
`linkerd` namespace. With this lease, we can enforce leader election and ensure
that only one status controller on a cluster is attempting to patch HTTPRoute’s
`status` field.
Upon startup, the status controller now attempts to create the
`status-controller` lease — it will handle failure if the lease is already
present on the cluster. It then spawns a task for attempting to claim this lease
and sends all claim updates to the index `Index`.
Currently, `Index.claims` is not used, but in follow-up changes we can check
against the current claim for determining if the status controller is the
current leader on the cluster. If it is, we can make decisions about sending
updates or not to the controller `Controller`.
### Testing
Currently I’ve only manually tested this, but integration tests will definitely
be helpful follow-ups. For manually testing, I’ve asserted that the
`status-controller` is claimed when one or more status controllers startup and
are running on a cluster. I’ve also asserted that when the current leader is
deleted, another status controller claims the lease. Below is a summary of how
I tested it:
```shell
$ linkerd install --ha |kubectl apply -f -
…
$ kubectl get -n linkerd leases status-controller
NAME                HOLDER                                 AGE
status-controller   linkerd-destination-747b456876-dcwlb   15h
$ kubectl delete -n linkerd pod linkerd-destination-747b456876-dcwlb
pod "linkerd-destination-747b456876-dcwlb" deleted
$ kubectl get -n linkerd leases status-controller
NAME                HOLDER                                 AGE
status-controller   linkerd-destination-747b456876-5zpwd   15h
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Fixes: #10262
When a resource is removed from the Linkerd manifests from one version to the next, we would like that resource to be removed from the user's cluster as part of the upgrade process. Our current recommendation is to use the `linkerd upgrade` command in conjunction with the `kubectl apply` command and the `--prune` flag to remove resources which are no longer part of the manifest. However, `--prune` has many shortcomings and does not detect resource kinds which are not part of the input manifest, nor does it detect cluster-scoped resources. See https://linkerd.io/2.12/tasks/upgrade/#with-the-linkerd-cli
We add a `linkerd prune` command which locates all Linkerd resources on the cluster which are not part of the Linkerd manifest and prints their metadata so that users can delete them. The recommended upgrade procedure would then be:
```
> linkerd upgrade | kubectl apply -f -
> linkerd prune | kubectl delete -f -
```
Users must take special care to use the desired version of the CLI to run the prune command, since running this command will print all resources on the cluster which are not included in that version.
We also add similar prune commands to each of the `viz`, `multicluster`, and `jaeger` extensions for deleting extension resources which are not in the extension manifest.
Signed-off-by: Alex Leong <alex@buoyant.io>
### Overview
This adds a policy status controller which is responsible for patching Linkerd’s
HTTPRoute resource with a `status` field. The `status` field has a list of
parent statuses — one status for each of its parent references. Each status
indicates whether or not this parent has “accepted” the HTTPRoute.
The status controller runs on its own task in the policy controller and watches
for updates to the resources that it cares about, similar to the policy
controller’s index. One of the main differences is that while the policy
controller’s index watches many resources, the status controller currently only
cares about HTTPRoutes and Servers; HTTPRoutes can still only have parent
references that are Servers so we don’t currently need to consider any other
parent reference resources.
The status controller maintains its own index of resources so that it is
completely separated from the policy controller's index. This allows the index
to be simpler in its structure, in how it handles `apply` and `delete`, and in
what information it needs to store.
### Follow-ups
There are several important follow-ups to this change. #10124 contains changes
for the policy controller index filtering out HTTPRoutes that are not accepted
by a Server. We don’t want those changes yet. Leaving those out, the status
controller does not actually have any effect on Linkerd policy in the cluster.
We can probably add additional logging in several places in the status controller;
that may even take place as part of the reviews on this. Additionally, we could
tune the queue size for updates to be processed.
Currently, if the status controller fails at any of its potential failure points,
we do not re-queue updates. We probably should do that so that it is more robust
against failure.
In an HA installation, there could be multiple status controllers trying to
patch the same resource. We should explore the k8s lease API so that only one
status controller can patch a resource at a time.
### Implementation
The status controller `Controller` has a k8s client for patching resources,
`index` for tracking resources, and an `updates` channel which handles
asynchronous updates to resources.
#### Index
`Index` synchronously observes changes to resources. It determines which Servers
accept each HTTPRoute and generates a status patch for that HTTPRoute. Again,
the status contains a list of parent statuses, one for each of the HTTPRoute's
parent references.
When a Server is added or deleted, the status controller needs to recalculate
the status for all HTTPRoutes. This is because an HTTPRoute can reference
Servers in other namespaces, so if a Server is added or deleted anywhere in the
cluster it could affect any of the HTTPRoutes on the cluster.
When an HTTPRoute is added, we need to determine the status only for that
HTTPRoute. When it’s deleted we just need to make sure it’s removed from the
index.
The patches that the `Index` creates are sent to the `Controller` which is
responsible only for applying those patches to HTTPRoutes.
#### Controller
`Controller` asynchronously processes updates and applies patches to HTTPRoutes.
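Once a patch has been applied, the result can be inspected directly on the HTTPRoute (a sketch; the route name and namespace are illustrative):
```bash
kubectl get httproutes.policy.linkerd.io my-route -n my-ns \
  -o jsonpath='{.status.parents[*].conditions[*].type}{"\n"}'
```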
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
(This came out during the k8s api calls review for #9650)
In 5dc662ae9, inheritance of the opaque-ports annotation from namespaces to
pods was removed, but we didn't remove the associated RBAC.
Wind the new linkerd-cni build through the build process, and refactor image, version, and pullPolicy into an Image object.
Signed-off-by: Steve Jenson <stevej@buoyant.io>
> When using ArgoCD and Azure Key Vault Plugin to manage Linkerd via Helm, the
> identityTrustAnchorsPEM value gets passed from Azure Key Vault with a trailing
> new line. This trailing new line makes its way into the config map
> linkerd-identity-trust-roots causing Linkerd control plane to crash upon
> deployment. There aren't any other alternatives when using Azure Key Vault due
> to how multi-line secrets are created. Azure forces this trailing new line.
>
> The solution is to add a block chomping indicator to strip trailing new lines in
> the config map.
>
> More on block chomping indicators: https://yaml-multiline.info/
>
> Fixes: #10012
The original PR #10059 has gone stale, but it's worth getting this change in.
Signed-off-by: Alexander Di Clemente <diclemea@gmail.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Fixes #10150
When we added PodSecurityAdmission in #9719 (and included in
edge-23.1.1), we added the entry `seccompProfile.type=RuntimeDefault` to
the containers' SecurityContext.
For PSP to accept that, we need to add the annotation
`seccomp.security.alpha.kubernetes.io/allowedProfileNames:
"runtime/default"` to the PSP resource, which also implies we need to
add the entry `seccompProfile.type=RuntimeDefault` to the pod's
SecurityContext as well, not just the container's.
It also turns out the `namespace-metadata` Jobs used by extensions for
the Helm installation method didn't have their ServiceAccount properly
bound to the PSP resource. This resulted in the `helm install` command
failing, and although the extension resources did get deployed, they
were not discoverable by `linkerd check`. This change fixes that as
well; it has been broken since 2.12.0!
Fixes #9965
Adds a `path` property to the RedirectRequestFilter in all versions. This property was absent from the CRD even though it appears in the gateway API documentation and is represented in the internal types. Adding this property to the CRD will also allow users to specify it.
Add a new version to the HTTPRoute CRD: v1beta2. This new version includes two changes from v1beta1:
* Added `port` property to `parentRef` for use when the parentRef is a Service
* Added `backendRefs` property to HTTPRoute rules
We switch the storage version of the HTTPRoute CRD from v1alpha1 to v1beta2 so that these new fields may be persisted.
We also update the policy admission controller to allow an HTTPRoute parentRef type to be Service (in addition to Server).
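As an illustration only (names, namespace, and ports are made up), an HTTPRoute using the new v1beta2 fields might look like:
```bash
kubectl apply -f - <<EOF
apiVersion: policy.linkerd.io/v1beta2
kind: HTTPRoute
metadata:
  name: web-route
  namespace: emojivoto
spec:
  parentRefs:
    - group: core
      kind: Service
      name: web-svc
      port: 80
  rules:
    - backendRefs:
        - name: web-svc-v2
          port: 80
EOF
```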
Signed-off-by: Alex Leong <alex@buoyant.io>
This change expands on existing shortnames while adding others for various policy resources. This improves the user experience when issuing commands via kubectl.
Fixes #9322
Signed-off-by: Paul Balogh <javaducky@gmail.com>
* Removed dupe imports
My IDE (vim-gopls) has been complaining for a while, so I decided to take
care of it. Found via
[staticcheck](https://github.com/dominikh/go-tools)
* Add stylecheck to go-lint checks
The Helm chart has an `identity.externalCA` value.
The CLI code sets `identity.issuer.externalCA` and fails to produce the desired configuration. This change aligns everything to `identity.externalCA`.
Signed-off-by: Dmitry Mikhaylov <anoxape@gmail.com>
Closes #9676
This adds the `pod-security.kubernetes.io/enforce` label as described in [Pod Security Admission labels for namespaces](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces).
PSA gives us three different possible values (policies or modes): [privileged, baseline and restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/).
For non-CNI mode, the proxy-init container relies on granting the NET_RAW and NET_ADMIN capabilities, which places those pods under the `privileged` policy. OTOH for CNI mode we can enforce the `restricted` policy by setting some defaults on the containers' `securityContext`, as done in this PR.
Also note this change adds the `cniEnabled` entry in the `values.yaml` file for all the extension charts, which determines what policy to use.
Final note: this includes the fix from #9717, otherwise an empty gateway UID prevents the pod from being created under the `restricted` policy.
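To see which policy a namespace enforces, the label can be read directly (a sketch):
```bash
kubectl get ns linkerd \
  -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}'
```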
## How to test
As this is only enforced as of k8s 1.25, here are the instructions to run 1.25 with k3d using Calico as CNI:
```bash
# launch k3d with k8s v1.25, with no flannel CNI
$ k3d cluster create --image='+v1.25' --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0'
# install Calico
$ k apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml
# load all the images
$ bin/image-load --k3d proxy controller policy-controller web metrics-api tap cni-plugin jaeger-webhook
# install linkerd-cni
$ bin/go-run cli install-cni|k apply -f -
# install linkerd-crds
$ bin/go-run cli install --crds|k apply -f -
# install linkerd-control-plane in CNI mode
$ bin/go-run cli install --linkerd-cni-enabled|k apply -f -
# Pods should come up without issues. You can also try the viz and jaeger extensions.
# Try removing one of the securityContext entries added in this PR, and the Pod
# won't come up. You should be able to see the PodSecurity error in the associated
# ReplicaSet.
```
To test the multicluster extension using CNI, check this [gist](https://gist.github.com/alpeb/4cbbd5ad87538b9e0d39a29b4e3f02eb) with a patch to run the multicluster integration test with CNI in k8s 1.25.
When CNI plugins run in ebpf mode, they may rewrite the packet
destination when doing socket-level load balancing (i.e in the
`connect()` call). In these cases, skipping `443` on the outbound side
for control plane components becomes redundant; the packet is re-written
to target the actual Kubernetes API Server backend (which typically
listens on port `6443`, but may be overridden when the cluster is
created).
This change adds port `6443` to the list of skipped ports for control
plane components. On the linkerd-cni plugin side, the ports are
non-configurable. Whenever a pod with the control plane component label
is handled by the plugin, we look up the `kubernetes` service in the
default namespace and append the port values (of both ClusterIP and
backend) to the list.
On the initContainer side, we make this value configurable in Helm and
provide a sensible default (`443,6443`). Users may override this value
if the ports do not correspond to what they have in their cluster. In
the CLI, if no override is given, we look up the service in the same way
that we do for linkerd-cni; if failures are encountered we fall back to
the default list of ports from the values file.
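Roughly, the information being looked up is what the following returns (a sketch; the exact ports depend on the cluster):
```bash
kubectl get svc kubernetes -n default \
  -o jsonpath='{range .spec.ports[*]}{.port} {.targetPort}{"\n"}{end}'
# e.g. 443 6443
```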
Closes #9817
Signed-off-by: Matei David <matei@buoyant.io>
This change aims to solve two distinct issues that have cropped up in
the proxy-init configuration.
First, it decouples `allowPrivilegeEscalation` from running proxy-init
as root. At the moment, whenever the container is run as root, privilege
escalation is also allowed. In more restrictive environments, this will
prevent the pod from coming up (e.g. security policies may complain about
`allowPrivilegeEscalation=true`). Worth noting that privilege escalation
is not necessary in many scenarios since the capabilities are passed to
the iptables child process at build time.
Second, it introduces a new `privileged` value that will allow users to
run the proxy-init container without any restrictions (meaning all
capabilities are inherited). This is essentially the same as mapping
root on host to root in the container. This value may solve issues in
distributions that run security-enhanced Linux, since iptables will be
able to load kernel modules that it may otherwise not be able to load
(privileged mode allows the container nearly the same privileges as
processes running outside of a container on a host, this further allows
the container to set configurations in AppArmor or SELinux).
Privileged mode is independent from running the container as root. This
gives users more control over the security context in proxy-init. The
value may still be used with `runAsRoot: false`.
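A sketch of rendering the chart with the new value, independently of `runAsRoot` (the grep is just to eyeball the resulting securityContext):
```bash
linkerd install \
  --set proxyInit.privileged=true \
  --set proxyInit.runAsRoot=false \
  | grep -B2 -A2 privileged
```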
Fixes #9718
Signed-off-by: Matei David <matei@buoyant.io>
* edge-22.11.3 change notes
Besides the notes, this corrects a small point in `RELEASE.md`, and
bumps the proxy-init image tag to `v2.1.0`. Note that the entry under
`go.mod` wasn't bumped because moving it past v2 requires changes on
`linkerd2-proxy-init`'s `go.mod` file, and we're gonna drop that
dependency soon anyways. Finally, all the charts got their patch version
bumped, except for `linkerd2-cni` that got its minor bumped because of
the tolerations default change.
## edge-22.11.3
This edge release fixes connection errors to pods using a `hostPort` different
than their `containerPort`. Also the `network-validator` init container improves
its logging, and the `linkerd-cni` DaemonSet now gets deployed in all nodes by
default.
* Fixed `destination` service to properly discover targets using a `hostPort`
different than their `containerPort`, which was causing 502 errors
* Upgraded the `network-validator` with better logging allowing users to
determine whether failures occur as a result of their environment or the tool
itself
* Added default `Exists` toleration to the `linkerd-cni` DaemonSet, allowing it
to be deployed in all nodes by default, regardless of taints
Co-authored-by: Oliver Gould <ver@buoyant.io>
When calling `linkerd upgrade`, if the `linkerd-config-overrides` Secret is not found we ask the user to run `linkerd repair`, but that command has long been removed from the CLI.
Also removed a code comment, as the error is explicit enough.
Fix upgrade when using --from-manifests
When the `--from-manifests` flag is used to upgrade through the CLI,
the kube client used to fetch existing configuration (from the
ConfigMap) is a "fake" client. The fake client returns values from a
local source. The two clients are used interchangeably to perform the
upgrade; which one is initialized depends on whether a value has been
passed to `--from-manifests`.
Unfortunately, this breaks CLI upgrades to any stable-2.12.x version
when the flag is used. Since a fake client is used, the upgrade will
fail when checking for the existence of CRDs, even if the CRDs have been
previously installed in the cluster.
This change fixes the issue by first initializing an actual Kubernetes
client (that will be used to check for CRDs). If the values should be
read from a local source, the client is replaced with a fake one. Since
this takes place after the CRD check, the upgrade will not fail on the
CRD precondition.
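For reference, a sketch of the flow this fixes (the manifest file name is illustrative):
```bash
# manifests previously rendered with `linkerd install` and saved locally
linkerd upgrade --from-manifests linkerd-manifests.yaml | kubectl apply -f -
```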
Fixes #9788
Signed-off-by: Matei David <matei@buoyant.io>
When users use CNI, we want to ensure that network rewriting inside the pod is setup before allowing linkerd to start. When rewriting isn't happening, we want to exit with a clear error message and enough information in the container log for the administrator to either file a bug report with us or fix their configuration.
This change adds a validator initContainer to all injected workloads when linkerd is installed with "cniEnabled=true". The validator replaces the noop init container, and will prevent pods from starting up if iptables is not configured.
Part of #8120
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Add a "noop" init container which uses the proxy image and runs `/bin/sleep 0` to injected pods. This init container is only added when the linkerd-cni-plugin is enabled. The idea here is that by running an init container, we trigger kubernetes to update the pod status. In particular, this ensures that the pod status IP is populated, which is necessary in certain cases where other CNIs such as Calico are involved.
This may fix https://github.com/linkerd/linkerd2/issues/9310, but I don't have a reproduction and so am not able to verify.
Signed-off-by: Alex Leong <alex@buoyant.io>
Add PodMonitor resources to the Helm chart
With an external Prometheus setup installed using prometheus-operator, scraping by the Prometheus instance can be configured using Service/PodMonitor resources.
By adding a PodMonitor resource to the Linkerd Helm chart we can mimic the configuration of the bundled Prometheus that comes with the linkerd-viz extension (see https://github.com/linkerd/linkerd2/blob/main/viz/charts/linkerd-viz/templates/prometheus.yaml#L47-L151). The PodMonitor resources are based on https://github.com/linkerd/website/issues/853#issuecomment-913234295, which are proven to be working. The only problem we face is that the bundled Grafana charts will need to look at different jobs when querying metrics.
When enabled via the `podMonitor.enabled` value in the Helm chart, PodMonitor resources for Linkerd are installed alongside Linkerd, and Linkerd metrics should be present in Prometheus.
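A sketch of enabling this with Helm (release and repo names may differ per setup):
```bash
helm upgrade --install linkerd-control-plane linkerd/linkerd-control-plane \
  --namespace linkerd \
  --set podMonitor.enabled=true \
  --reuse-values
```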
Fixes #6596
Signed-off-by: Martin Odstrcilik <martin.odstrcilik@gmail.com>
Fixes issue described in [this comment](https://github.com/linkerd/linkerd2/issues/9310#issuecomment-1247201646)
Rollback #7382
Should be cherry-picked back into 2.12.1
For 2.12.0, #7382 removed the env vars `_l5d_ns` and `_l5d_trustdomain` from the proxy manifest because they were no longer used anywhere. In particular, the jaeger injector used them when injecting the env var `LINKERD2_PROXY_TAP_SVC_NAME=tap.linkerd-viz.serviceaccount.identity.$(_l5d_ns).$(_l5d_trustdomain)` but then started using values.yaml entries instead of these env vars.
The problem is when upgrading the core control plane (or anything else) to 2.12.0, the 2.11 jaeger extension will still be running and will attempt to inject the old env var into the pods, making reference to `_l5d_ns` and `_l5d_trustdomain`, which the new proxy container won't offer anymore. This will put the pod in an error state.
This change restores those env vars. We will be able to remove them at last in 2.13.0, when presumably the jaeger injector will already have been upgraded to 2.12 by the user.
Replication steps:
```bash
$ curl -sL https://run.linkerd.io/install | LINKERD2_VERSION=stable-2.11.4 sh
$ linkerd install | k apply -f -
$ linkerd jaeger install | k apply -f -
$ linkerd check
$ curl -sL https://run.linkerd.io/install | LINKERD2_VERSION=stable-2.12.0 sh
$ linkerd upgrade --crds | k apply -f -
$ linkerd upgrade | k apply -f -
$ k get po -n linkerd
NAME                                      READY   STATUS               RESTARTS     AGE
linkerd-identity-58544dfd8-jbgkb          2/2     Running              0            2m19s
linkerd-destination-764bf6785b-v8cj6      4/4     Running              0            2m19s
linkerd-proxy-injector-6d4b8c9689-zvxv2   2/2     Running              0            2m19s
linkerd-identity-55bfbf9cd4-4xk9g         0/2     CrashLoopBackOff     1 (5s ago)   32s
linkerd-proxy-injector-5b67589678-mtklx   0/2     CrashLoopBackOff     1 (5s ago)   32s
linkerd-destination-ff9b5f67b-jw8w5       0/4     PostStartHookError   0 (8s ago)   32s
```
The identity controller is currently granted access to read all Deployments. This
isn't necessary.
When these permissions were added in #3600, we incorrectly assumed that
we must pass a whole Deployment resource as a _parent_ when recording
events. The [EventRecorder docs] say:
> 'object' is the object this event is about. Event will make a
> reference--or you may also pass a reference to the object directly.
We can confirm this by reviewing the source for [GetReference]: we can
simply construct an ObjectReference without fetching it from the API.
This change lets us drop unnecessary privileges in the identity
controller.
[EventRecorder docs]: https://pkg.go.dev/k8s.io/client-go/tools/record#EventRecorder
[GetReference]: ab826d2728/tools/reference/ref.go (L38-L45)
Signed-off-by: Oliver Gould <ver@buoyant.io>