When the destination controller logs about receiving or sending messages to a data plane proxy, there is no information in the log about which data plane pod it is communicating with. This can make it difficult to diagnose issues which span the data plane and control plane.
We add a `pod` field to the context token that proxies include in requests to the destination controller. We add this pod name to the logging context so that it shows up in log messages. To accomplish this, we had to plumb the logging context through a few places where it previously had not been passed. This gives us a more complete logging context and more information in each log message.
An example log message with this fuller logging context is:
```
time="2023-10-24T00:14:09Z" level=debug msg="Sending destination add: add:{addrs:{addr:{ip:{ipv4:183762990} port:8080} weight:10000 metric_labels:{key:\"control_plane_ns\" value:\"linkerd\"} metric_labels:{key:\"deployment\" value:\"voting\"} metric_labels:{key:\"pod\" value:\"voting-7475cb974c-2crt5\"} metric_labels:{key:\"pod_template_hash\" value:\"7475cb974c\"} metric_labels:{key:\"serviceaccount\" value:\"voting\"} tls_identity:{dns_like_identity:{name:\"voting.emojivoto.serviceaccount.identity.linkerd.cluster.local\"}} protocol_hint:{h2:{}}} metric_labels:{key:\"namespace\" value:\"emojivoto\"} metric_labels:{key:\"service\" value:\"voting-svc\"}}" addr=":8086" component=endpoint-translator context-ns=emojivoto context-pod=web-767f4484fd-wmpvf remote="10.244.0.65:52786" service="voting-svc.emojivoto.svc.cluster.local:8080"
```
Note the `context-pod` field.
Additionally, we have tested this when no pod field is included in the context token (e.g. when handling requests from a pod which does not yet add this field) and confirmed that the `context-pod` log field is empty, but no errors occur.
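As a rough illustration of where the pod name ends up, here is a minimal sketch (not the controller's actual code) of parsing a context token and attaching it to a logging context; the token shape and field names here are assumptions.
```go
package main

import (
	"encoding/json"

	log "github.com/sirupsen/logrus"
)

// contextToken mirrors the idea of a JSON context token sent by the proxy;
// the exact fields are assumptions for illustration.
type contextToken struct {
	Ns  string `json:"ns"`
	Pod string `json:"pod"`
}

// loggerForToken builds a log entry whose context-ns and context-pod fields
// come from the token. An older proxy that omits the pod field simply yields
// an empty context-pod, with no error.
func loggerForToken(raw string) *log.Entry {
	var tok contextToken
	_ = json.Unmarshal([]byte(raw), &tok)
	return log.WithFields(log.Fields{
		"context-ns":  tok.Ns,
		"context-pod": tok.Pod,
	})
}

func main() {
	logger := loggerForToken(`{"ns":"emojivoto","pod":"web-767f4484fd-wmpvf"}`)
	logger.Info("Sending destination add")
}
```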
Signed-off-by: Alex Leong <alex@buoyant.io>
The destination controller indexes Pods by their primary PodIP and their
primary HostIP (when applicable); and it indexes Services by their
primary ClusterIP.
In preparation for dual-stack (IPv6) support, this change updates the
destination controller indexers to consume all IPs for these resources.
Tests are updated to demonstrate a dual-stack setup (where these
resources have a primary IPv4 address and a secondary IPv6 address).
While exercising these tests, I had to replace the brittle host:port
parsing logic we use in the server with Go's standard
`net.SplitHostPort` utility.
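For reference, a small standalone example of the parsing this switches to; unlike naive string splitting, `net.SplitHostPort` also handles bracketed IPv6 literals, which matters for the dual-stack work.
```go
package main

import (
	"fmt"
	"net"
)

func main() {
	for _, addr := range []string{"10.244.0.1:8080", "[2001:db8::1]:8080"} {
		host, port, err := net.SplitHostPort(addr)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("host=%s port=%s\n", host, port)
	}
}
```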
When a gRPC client of the destination.Get API initiates a request but then doesn't read from that stream, the HTTP2 stream flow control window fills up and eventually exerts backpressure on the destination controller. This manifests as calls to `Send` on the stream blocking. Since `Send` is called synchronously from the client-go informer callback (by way of the endpoint translator), this blocks the informer callback and prevents all further informer callbacks from firing. This causes the destination controller to stop sending updates to any of its clients.
We add a queue in the endpoint translator so that when it gets an update from the informer callback, that update is queued and we avoid potentially blocking the informer callback. Each endpoint translator spawns a goroutine to process this queue and call `Send`. If there is not capacity in this queue (e.g. because a client has stopped reading and we are experiencing backpressure) then we terminate the stream.
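A minimal sketch of that queueing pattern (illustrative names, not the actual endpoint translator types): the informer callback enqueues onto a bounded channel and a dedicated goroutine drains it, so a slow client causes the stream to be terminated rather than the callback to block.
```go
package destination

import "sync"

// update stands in for an endpoint update produced by an informer callback.
type update struct{ msg string }

type translator struct {
	updates  chan update
	stop     chan struct{} // closed to terminate the stream
	stopOnce sync.Once
	send     func(update) // ultimately calls Send on the gRPC stream; may block
}

func newTranslator(capacity int, send func(update)) *translator {
	t := &translator{
		updates: make(chan update, capacity),
		stop:    make(chan struct{}),
		send:    send,
	}
	go t.run()
	return t
}

// Enqueue is called from the informer callback and must never block: if the
// queue is full (the client has stopped reading), the stream is terminated
// instead of stalling the informer.
func (t *translator) Enqueue(u update) {
	select {
	case t.updates <- u:
	default:
		t.stopOnce.Do(func() { close(t.stop) })
	}
}

func (t *translator) run() {
	for {
		select {
		case u := <-t.updates:
			t.send(u)
		case <-t.stop:
			return
		}
	}
}
```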
Signed-off-by: Alex Leong <alex@buoyant.io>
Followup to https://github.com/linkerd/linkerd2/pull/11334#issuecomment-1736093592
This extends the test introduced in #11334 to exercise upgrading a
Server associated with a pod's HostPort, and observing how the stream
updates the OpaqueProtocol field.
Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
Followup to #11328
Implements a new pod watcher, instantiated alongside the other ones in the Destination server. It also watches Servers and carries all the logic from ServerWatcher, which has now been decommissioned.
The `CreateAddress()` function has been moved into a method of the PodWatcher, because we now call it on every update, given that the pod associated with an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque protocol info from associated Servers, which is not new and had some logic that was duplicated in the now defunct ServerWatcher. `getAnnotatedOpaquePorts()` was also moved for similar reasons.
Other things to note about PodWatcher:
- It publishes a new pair of metrics `ip_port_subscribers` and `ip_port_updates` leveraging the framework in `prometheus.go`.
- The complexity in `updatePod()` comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event (see the sketch below).
Finally, since endpointProfileTranslator's `endpoint` (*pb.WeightedAddr) is no longer a static object, the `Update()` function now receives an Address that allows it to rebuild the endpoint on the fly (and so `createEndpoint()` was converted into a method of endpointProfileTranslator).
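A rough sketch of the readiness check mentioned above, assuming a helper that inspects the `PodReady` condition; this is illustrative, not the actual PodWatcher code.
```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isReady reports whether the pod's PodReady condition is true.
func isReady(pod *corev1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == corev1.PodReady {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}

// shouldPublish reports whether an update for oldPod -> newPod should be
// sent to subscribers: only when readiness flips.
func shouldPublish(oldPod, newPod *corev1.Pod) bool {
	return isReady(oldPod) != isReady(newPod)
}

func main() {
	ready := &corev1.Pod{Status: corev1.PodStatus{Conditions: []corev1.PodCondition{
		{Type: corev1.PodReady, Status: corev1.ConditionTrue},
	}}}
	notReady := &corev1.Pod{}
	fmt.Println(shouldPublish(notReady, ready)) // true: readiness changed
	fmt.Println(shouldPublish(ready, ready))    // false: no change, no update sent
}
```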
* Bump CNI plugin to v1.2.1
* Bump proxy-init to v2.2.2
Both dependencies include a fix for CVE-2023-2603. Since alpine is used
as the runtime image, there is a security vulnerability detected in the
produced images (due to an issue with libcap). The alpine images have
been bumped to address the CVE.
Signed-off-by: Matei David <matei@buoyant.io>
* stopgap fix for hostport staleness
## Problem
When there's a pod with a `hostPort` entry, `GetProfile` requests
targeting the host's IP and that `hostPort` return an endpoint profile
with that pod's IP and `containerPort`. If that pod vanishes and another
one in that same host with that same `hostPort` comes up, the existing
`GetProfile` streams won't get updated with the new pod information
(metadata, identity, protocol).
That breaks the connectivity of the client proxy relying on that stream.
## Partial Solution
It should be less surprising for those `GetProfile` requests to return
an endpoint profile with the same host IP and port requested, and leave
to the cluster's CNI to perform the translation to the corresponding pod
IP and `containerPort`.
This PR performs that change, while continuing to return the
corresponding pod's information alongside.
If the pod associated with that host IP and port changes, the client proxy
won't lose connectivity, but the pod's information won't get updated
(that'll be fixed in a separate PR).
A new unit test validating this has been added, which will be expanded
to validate the changed pod information when that gets implemented.
## Details of Change
- We no longer do the HostPort->ContainerPort conversion, so the
`getPortForPod` function was dropped.
- The `getPodByIp` function is now split into two: `getPodByPodIP`
and `getPodByHostIP`, the latter being called only if the former
doesn't return anything (see the sketch after this list).
- The `createAddress` function is now simplified in that it just uses
the passed IP to build the address. The passed IP will depend on which
of the two functions just mentioned returned the pod (host IP or pod
IP).
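A hedged sketch of the two-step lookup described in the list above; the indexer interface and method names are assumptions, not the real destination server code.
```go
package destination

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// podIndex stands in for the informer indexers used by the destination
// server; the method names are assumptions for illustration.
type podIndex interface {
	PodsByPodIP(ip string) ([]*corev1.Pod, error)
	PodsByHostIP(ip string) ([]*corev1.Pod, error)
}

// getPod resolves an IP to a pod, preferring a pod-IP match and falling back
// to a host-IP match only when the former finds nothing.
func getPod(idx podIndex, ip string) (*corev1.Pod, error) {
	pods, err := idx.PodsByPodIP(ip)
	if err != nil {
		return nil, err
	}
	if len(pods) == 1 {
		return pods[0], nil
	}
	if len(pods) > 1 {
		return nil, fmt.Errorf("multiple pods found for pod IP %s", ip)
	}
	pods, err = idx.PodsByHostIP(ip)
	if err != nil {
		return nil, err
	}
	if len(pods) > 0 {
		return pods[0], nil
	}
	return nil, nil // no pod found; treat as an unmeshed or cluster-external address
}
```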
When a service has its internal traffic policy set to "local", we will perform filtering to only return local endpoints, as per the ForZone hints in the endpoints. However, ForZone calculations do not take resources from remote clusters into account, so this type of filtering is not appropriate for remote discovery services.
We explicitly ignore any internal traffic policy when doing remote discovery.
Signed-off-by: Alex Leong <alex@buoyant.io>
Similar to #11246, the destination controller was complaining about
trying to register dupe metrics for the api cache gauges, when a given
target cluster got re-linked. This change unregisters the gauges for the
target cluster when said cluster is removed.
Supersedes #11252
We add a `cluster_store_size` gauge to the destination controller to track the number of entries in the remote discovery cluster store. If this is ever different from the number of cluster credentials secrets in the linkerd namespace, this indicates that there is a problem with a link that needs to be investigated further.
Signed-off-by: Alex Leong <alex@buoyant.io>
Whenever the service mirror's main loop was triggered again, the following warnings were generated:
```
time="2023-08-14T20:16:29Z" level=warning msg="failed to register Prometheus gauge Desc{fqName: \"service_cache_size\", help: \"Number of items in the client-go service cache\", constLabels: {cluster=\"remote\"}, variableLabels: []}: duplicate metrics collector registration attempted"
time="2023-08-14T20:16:29Z" level=warning msg="failed to register Prometheus gauge Desc{fqName: \"endpoints_cache_size\", help: \"Number of items in the client-go endpoints cache\", constLabels: {cluster=\"remote\"}, variableLabels: []}: duplicate metrics collector registration attempted"
```
To fix this, the cluster watcher's `Stop()` method now unregisters the Prometheus cache metrics associated with the cluster's client API.
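A minimal sketch (not the actual service-mirror code) of registering the per-cluster gauges and unregistering them in `Stop()`, so that re-linking a cluster does not trip the duplicate-registration warning shown above.
```go
package watcher

import "github.com/prometheus/client_golang/prometheus"

type clusterWatcher struct {
	serviceCacheSize   prometheus.Gauge
	endpointsCacheSize prometheus.Gauge
}

func newClusterWatcher(cluster string) *clusterWatcher {
	labels := prometheus.Labels{"cluster": cluster}
	cw := &clusterWatcher{
		serviceCacheSize: prometheus.NewGauge(prometheus.GaugeOpts{
			Name:        "service_cache_size",
			Help:        "Number of items in the client-go service cache",
			ConstLabels: labels,
		}),
		endpointsCacheSize: prometheus.NewGauge(prometheus.GaugeOpts{
			Name:        "endpoints_cache_size",
			Help:        "Number of items in the client-go endpoints cache",
			ConstLabels: labels,
		}),
	}
	prometheus.MustRegister(cw.serviceCacheSize, cw.endpointsCacheSize)
	return cw
}

// Stop unregisters the cluster-scoped collectors so a later re-link can
// register them again without a duplicate-registration error.
func (cw *clusterWatcher) Stop() {
	prometheus.Unregister(cw.serviceCacheSize)
	prometheus.Unregister(cw.endpointsCacheSize)
}
```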
Adds support for remote discovery to the destination controller.
When the destination controller gets a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` label, this is an indication that the destination controller should discover the endpoints for this service from a remote cluster. The destination controller will look for a remote cluster which has been linked to it (using the `linkerd multicluster link` command) with that name. It will look at the `multicluster.linkerd.io/remote-discovery` label for the service name to look up in that cluster. It then streams back the endpoint data for that remote service.
Since we now have multiple client-go informers for the same resource types (one for the local cluster and one for each linked remote cluster) we add a `cluster` label onto the prometheus metrics for the informers and EndpointWatchers to ensure that each of these components' metrics are correctly tracked and don't overwrite each other.
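A rough sketch of the dispatch described above, under assumed names: when the requested Service carries the remote-discovery label, the `Get` handler resolves a watcher for that linked cluster from the cluster store instead of using the local-cluster watcher.
```go
package destination

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

const remoteDiscoveryLabel = "multicluster.linkerd.io/remote-discovery"

// endpointsWatcher stands in for the per-cluster watcher type.
type endpointsWatcher interface{}

// clusterStore stands in for the store keyed by linked-cluster name.
type clusterStore interface {
	Get(cluster string) (endpointsWatcher, bool)
}

// watcherFor picks the watcher to serve a Get request for svc: the local one
// by default, or the remote cluster's watcher when the label is present.
func watcherFor(svc *corev1.Service, local endpointsWatcher, store clusterStore) (endpointsWatcher, error) {
	cluster, ok := svc.Labels[remoteDiscoveryLabel]
	if !ok {
		return local, nil
	}
	remote, ok := store.Get(cluster)
	if !ok {
		return nil, fmt.Errorf("cluster %q is not linked", cluster)
	}
	return remote, nil
}
```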
---------
Signed-off-by: Alex Leong <alex@buoyant.io>
There were a few improvements we could have made to a recent change that
added a ClusterStore concept to the destination service. This PR
introduces those small improvements:
* Sync dynamically created clients in separate goroutines
* Refactor metadata API creation
* Remove redundant check in cluster_store_test
* Create a new runtime schema every time a fake metadata API client is
created, to avoid racy behaviour.
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #10764
Removed the `server_port_subscribers` gauge, as it wasn't distinguishing
amongst different pods, and the numbers of subscribers for different pods
were conflicting with one another when updating the metric (see more details
[here](https://github.com/linkerd/linkerd2/issues/10764#issuecomment-1650835823)).
Besides carrying an invalid value, this was generating the warning
`unable to delete server_port_subscribers metric with labels`
The metric was replaced with the `server_port_subscribes` and
`server_port_unsubscribes` counters, which track the overall number of
subscribes and unsubscribes to the particular pod port.
🌮 to @adleong for the diagnosis and the fix!
Currently, the control plane does not support indexing and discovering resources across cluster boundaries. In a multicluster set-up, it is desirable to have access to endpoint data (and by extension, any configuration associated with that endpoint) to support pod-to-pod communication. Linkerd's destination service should be extended to support reading discovery information from multiple sources (i.e. API Servers).
This change introduces an `EndpointsWatcher` cache. On creation, the cache registers a pair of event handlers:
* One handler to read `multicluster` secrets that embed kubeconfig files. For each such secret, the cache creates a corresponding `EndpointsWatcher` to read remote discovery information.
* Another handler to evict entries from the cache when a `multicluster` secret is deleted (see the sketch below).
Additionally, a set of tests have been added that assert the behaviour of the cache when secrets are created and/or deleted. The cache will be used by the destination service to do look-ups for services that belong to another cluster, and that are running in a "remote discovery" mode.
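A condensed sketch of the handler pair described above (not the actual cache implementation); the keying and helper details are assumptions.
```go
package clusterstore

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

type endpointsWatcher struct{}

type ClusterStore struct {
	mu       sync.RWMutex
	watchers map[string]*endpointsWatcher
}

// registerHandlers wires the add/delete handlers onto the secrets informer.
func (cs *ClusterStore) registerHandlers(secrets cache.SharedIndexInformer) error {
	_, err := secrets.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			secret, ok := obj.(*corev1.Secret)
			if !ok {
				return
			}
			// Build a remote EndpointsWatcher from the kubeconfig embedded in
			// the secret (elided here) and index it by the linked cluster's
			// name (taken from the secret name for brevity).
			cs.mu.Lock()
			cs.watchers[secret.Name] = &endpointsWatcher{}
			cs.mu.Unlock()
		},
		DeleteFunc: func(obj interface{}) {
			secret, ok := obj.(*corev1.Secret)
			if !ok {
				return
			}
			// Evict the cached watcher when the credentials secret goes away.
			cs.mu.Lock()
			delete(cs.watchers, secret.Name)
			cs.mu.Unlock()
		},
	})
	return err
}
```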
---------
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #11163
The `servicePublisher.updateServer` function will iterate through all registered listeners and update them. However, a nil listener may temporarily be in the list of listeners if an unsubscribe is in progress. This results in a nil pointer dereference.
All functions which result in updating the listeners must therefore be protected by the mutex so that we don't try to act on the list of listeners while it is being modified.
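A minimal sketch of that locking rule, with illustrative names: subscribe, unsubscribe, and `updateServer` all take the same mutex, so an in-progress unsubscribe can never expose a nil or half-removed entry to the update path.
```go
package watcher

import "sync"

type listener interface {
	Update(msg string)
}

type servicePublisher struct {
	mu        sync.Mutex
	listeners []listener
}

func (sp *servicePublisher) subscribe(l listener) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	sp.listeners = append(sp.listeners, l)
}

func (sp *servicePublisher) unsubscribe(l listener) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	for i, existing := range sp.listeners {
		if existing == l {
			sp.listeners = append(sp.listeners[:i], sp.listeners[i+1:]...)
			return
		}
	}
}

// updateServer iterates the listeners under the same lock, so the slice can
// never be mutated out from under it.
func (sp *servicePublisher) updateServer(msg string) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	for _, l := range sp.listeners {
		l.Update(msg)
	}
}
```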
Signed-off-by: Alex Leong <alex@buoyant.io>
In #11008 I added Go support for the `timeouts` field in the
`HTTPRouteRule` struct. That used Go's built-in `time.Duration` type,
but based on my reading of kubernetes-sigs/gateway-api#2013, we should
instead be using apimachinery's `metav1.Duration` type.
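For illustration, a sketch of the type swap (not the actual HTTPRouteRule definition): `metav1.Duration` serializes to the duration strings Kubernetes APIs expect (e.g. "10s"), whereas plain `time.Duration` marshals to integer nanoseconds.
```go
package v1beta3

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// HTTPRouteTimeouts is an illustrative stand-in for the timeouts struct; the
// field names here are assumptions.
type HTTPRouteTimeouts struct {
	// Request bounds the total time for a request; serialized as a duration
	// string such as "10s".
	Request *metav1.Duration `json:"request,omitempty"`
	// BackendRequest bounds a single attempt to a backend.
	BackendRequest *metav1.Duration `json:"backendRequest,omitempty"`
}
```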
Signed-off-by: Kevin Ingelman <ki@buoyant.io>
When using `linkerd-await` as a preStart hook, we need to explicitly pass in the proxy's admin port if it is not the default (4191). While the admin server listener can be bound to any arbitrary port using `config.linkerd.io/admin-port` as a configuration annotation, `linkerd-await`'s template is not aware of the override resulting in start-up errors.
This change adds the override to `linkerd-await` by always using an explicit `--port` argument.
---------
Signed-off-by: jclegras <11457480+jclegras@users.noreply.github.com>
Add go client codegen for HttpRoute v1beta3. This will be necessary for any of the go controllers (i.e. metrics-api) or go CLI commands to interact with HttpRoute v1beta3 resources in kubernetes.
Signed-off-by: Kevin Ingelman <ki@buoyant.io>
Add support for enabling and disabling topology aware routing when hints are added/removed.
The testing setup is quite involved because there are so many moving parts:
1) Set up a service which is layered over several availability zones.
1a) The best way to do this is one Service object, with 3 ReplicaSets explicitly forced to use a specific AZ each.
2) Add the `service.kubernetes.io/topology-aware-hints: Auto` annotation to the Service object.
3) Use a load tester like k6 to send meaningful traffic to your service, but only in one AZ.
4) Scale up your ReplicaSets until k8s adds Hints to your EndpointSlices.
5) Observe that traffic shifts to only hit pods in one AZ.
6) Turn down the ReplicaSet count until K8s removes the hints from your EndpointSlices.
7) Observe that traffic shifts back to all pods across all AZs.
The proxy caches results in-memory, both for inbound and outbound
service (and policy) discovery. While the proxy's default values are
great in most cases, certain client configurations may require
overrides. The proxy supports overriding the default values; however, it
currently does not offer an easy way for users to configure them.
This PR introduces two new values in Linkerd's control plane chart. The
values control the inbound and outbound cache discovery idle timeout --
the amount of time a result will be kept in the cache if unused. Setting
this value will change the configuration for all injected proxies, but
not for the control plane.
---------
Signed-off-by: Matei David <matei@buoyant.io>
Currently, the proxy injector will expand lists of opaque port ranges
into lists of individual port numbers. This is because the proxy has
historically not accepted port ranges in the
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` environment
variable. However, when very large ranges are used, the size of the
injected manifest can be quite large, since each individual port number
in a range must be listed separately.
Proxy PR linkerd/linkerd2-proxy#2395 changed the proxy to accept ranges
as well as individual port numbers in the opaque ports environment
variable, and this change was included in the latest proxy release
(v2.200.0). This means that the proxy-injector no longer needs to expand
large port ranges into individual port numbers, and can now simply
forward the list of ranges to the proxy. This branch changes the proxy
injector to do this, resolving issues with manifest size due to large
port ranges.
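A small sketch contrasting the two behaviors: the old expansion of a range into individual ports versus passing the range through unchanged. The rendering below is illustrative of the manifest-size problem, not the injector's actual templating code.
```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandRange is the old behavior: every port in the range is listed.
func expandRange(start, end int) string {
	ports := make([]string, 0, end-start+1)
	for p := start; p <= end; p++ {
		ports = append(ports, strconv.Itoa(p))
	}
	return strings.Join(ports, ",")
}

func main() {
	// Old behavior: "5000,5001,...,5999", thousands of characters for a
	// single range of 1000 ports.
	fmt.Println(len(expandRange(5000, 5999)), "characters")
	// New behavior: the range is forwarded as-is, now that the proxy accepts
	// ranges in the environment variable.
	fmt.Println(`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION="5000-5999"`)
}
```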
Closes #9803
Fixes #10764
`GetProfile` streams create a `server_port_subscribers` gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered once a second listener was added; for just one listener the gauge was absent. But whenever the `GetProfile` stream ended, the gauge was deleted, which resulted in this error if it wasn't registered to begin with:
```
level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server
```
One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:
```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers
```
After this fix, one can see the metric getting populated:
```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers
# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1
```
And when scaling down the voting deployment, one can see how the metric with `name="voting"` is removed.
* add `trust_dns=error` to default proxy log level
Since upstream has yet to release a version with PR
bluejekyll/trust-dns#1881, this commit changes the proxy's default log
level to silence warnings from `trust_dns_proto` that are generally
spurious.
Closes #10123.
The outbound proxy handles endpoints with the `opaque_transport` flag by
opening a direct connection to the inbound proxy's inbound listener
port, and sending a ProtoBuf `TransportHeader` including the target port
of the originating outbound connection and an (optional)
`SessionProtocol` describing the protocol used on that connection.
Currently, outbound proxies initiating direct connections will *always*
send `SessionProtocol` values communicating the protocol as understood
by the outbound proxy. However, this is not always the desired behavior.
Direct connections with `TransportHeader`s are used in two cases: for
gateway connections, and for ports which are marked as opaque. When the
inbound port is marked as opaque, the presence of a `SessionProtocol`
tells the inbound proxy to handle that connection as the indicated
protocol, which results in incorrect behavior when the inbound proxy's
ServerPolicy configures the target port as opaque (see #9888).
Therefore, the `Destination` proxy API has been updated to add a new
`ProtocolHint`, `Opaque`, which indicates that an outbound proxy should
_not_ send a `SessionProtocol` when initiating a direct connection, even
if the outbound proxy handled the connection as HTTP. This hint was
added to the proxy API in linkerd/linkerd2-proxy-api#197, and released
in `linkerd2-proxy-api` v0.8.0.
This branch updates the Destination controller's dependency on
`linkerd2-proxy-api` to v0.8.0, and changes the controller to send an
`Opaque` protocol hint when the target port is marked as opaque on the
destination pod. This should override the `H2` protocol hint that is
added when the destination is meshed. I've also added a new test for
this behavior.
Fixes#9888 (along with linkerd/linkerd2-proxy#2209, which changes the
proxy to actually handle the `Opaque` protocol hint).
proxy-init v2.2.1:
* Sanitize `subnets-to-ignore` flag
* Dep bumps
cni-plugin v1.1.0:
* Add support for the `config.linkerd.io/skip-subnets` annotation
* Dep bumps
validator v0.1.2:
* Dep bumps
Also, `linkerd-network-validator` is now released wrapped in a tar file, so this PR also amends `Dockerfile-proxy` to account for that.
At the time of making this commit, the package `github.com/ghodss/yaml`
is no longer actively maintained.
`sigs.k8s.io/yaml` is a permanent fork of `ghodss/yaml` and is actively
maintained by Kubernetes SIG.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The GetProfile API endpoint does not behave as expected: when a profile
watch is established, the API server starts two separate profile
watches: a primary watch in the client's namespace and a fallback watch
ignoring the client's namespace. These watches race to send data back to
the client. If the backup watch updates first, it may be sent to clients
before being corrected by a subsequent update. If the primary watch
updates with an empty value, the default profile may be served before
being corrected by an update to the backup watch.
From the proxy's perspective, we'd much prefer that the API provide a
single authoritative response when possible. It avoids needless
corrective work being distributed across the system on every watch
initiation.
To fix this, we modify the fallbackProfileListener to behave
predictably: it only emits updates once both its primary and fallback
listeners have been updated. This avoids emitting updates based on a
partial understanding of the cluster state.
Furthermore, the opaquePortsAdaptor is updated to avoid synthesizing a
default serviceprofile (surprising behavior) and, instead, this
defaulting logic is moved into a dedicated defaultProfileListener
helper. A dedupProfileListener is added to squelch obviously redundant
updates.
Finally, this newfound predictability allows us to simplify the API's
tests. Many of the API tests are not clear in what they test and
sometimes make assertions about the "incorrect" profile updates.
Before changing any GetProfile behavior, this change splits the API
handler into some smaller scopes. This helps to clarify control flow and
reduce nested contexts. This change also adds relevant fields to log
contexts to improve diagnostics.
When processing a `delete` event for an EndpointSlice, regardless of the outcome (whether there are still addresses/endpoints alive or whether we have no endpoints left), we make a call to `noEndpoints` on the subscriber. This sends a `Destination.Get` update to the listeners to advertise a `NoEndpoints` event.
If there are still addresses left, `NoEndpoints { exists: true }` will be sent (to signal the service still exists). Until an update is registered (i.e. a new add), there will be no available endpoints; this is incorrect, since other EndpointSlices may exist. This change fixes the problem by handling `noEndpoints` in a more specialized way for EndpointSlices.
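A hedged sketch of that more specialized handling, with illustrative names: addresses are tracked across all of a service's EndpointSlices, and `NoEndpoints` is only advertised once none remain.
```go
package watcher

type Address struct{}

type listener interface {
	NoEndpoints(exists bool)
}

type portPublisher struct {
	// addresses currently known for the service, aggregated across all of
	// its EndpointSlices, keyed by a per-endpoint ID.
	addresses map[string]Address
	listeners []listener
}

// deleteEndpointSlice removes the slice's addresses and only advertises
// NoEndpoints when no addresses from any remaining slice are left.
func (pp *portPublisher) deleteEndpointSlice(sliceAddressIDs []string) {
	for _, id := range sliceAddressIDs {
		delete(pp.addresses, id)
	}
	if len(pp.addresses) == 0 {
		for _, l := range pp.listeners {
			l.NoEndpoints(true) // service still exists, but has no endpoints
		}
	}
}
```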
Signed-off-by: Yannick Utard <yannickutard@gmail.com>
Fixes #9986
After reviewing the k8s API calls in Destination, it was concluded we
could only swap out the calls to the Node and RS resources to use the
metadata API, as all the other resources (Endpoints, EndpointSlices,
Services, Pod, ServiceProfiles, Server) required fields other than those
found in their metadata section.
This also required completing the `NewFakeAPI` implementation by adding
the missing annotations and labels entries.
## Testing Memory Consumption
The gains here aren't as big as in #9650. In order to test this we need
to push hard and create 4000 RS:
``` bash
for i in {0..4000}; do kubectl create deployment test-pod-$i --image=nginx; done
```
In edge-23.2.1 the destination pod's memory consumption goes from 40Mi
to 160Mi after all the RS were created. With this change, it went from
37Mi to 140Mi.
Fixes #10205
The Server watcher uses `*corev1.Pod` in the key of its subscriber map, which is potentially brittle since it's not obvious what exactly should count as equality for full pod resources.
More robust is to use a unique per-pod identifier in the key: the pod's name and namespace. We therefore move the pod resource into the value type of the map, since we still need it when we receive a Server update so that we can iterate through all subscriptions and find pods which match the Server's pod selector.
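A minimal sketch of the keying change, with illustrative type names: the key is the pod's namespace/name plus port, and the pod object moves into the value so it is still available for matching Server pod selectors.
```go
package watcher

import corev1 "k8s.io/api/core/v1"

type listener interface{}

// podPort is the map key: a stable per-pod identifier plus the port.
type podPort struct {
	namespace string
	name      string
	port      uint32
}

// podSubscription is the map value: the pod object is kept here so Server
// updates can still be matched against the Server's pod selector.
type podSubscription struct {
	pod       *corev1.Pod
	listeners []listener
}

type serverWatcher struct {
	subscriptions map[podPort]*podSubscription
}

func (sw *serverWatcher) subscribe(pod *corev1.Pod, port uint32, l listener) {
	key := podPort{namespace: pod.Namespace, name: pod.Name, port: port}
	sub, ok := sw.subscriptions[key]
	if !ok {
		sub = &podSubscription{pod: pod}
		sw.subscriptions[key] = sub
	}
	sub.listeners = append(sub.listeners, l)
}
```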
Signed-off-by: Alex Leong <alex@buoyant.io>
(This came out during the k8s api calls review for #9650)
In 5dc662ae9 inheritance of opaque ports annotation from namespaces to
pods was removed, but we didn't remove the associated RBAC.
* Add metrics to server and service watchers
Closes #10202 and completes #2204
As a followup to #10201, I'm adding the following metric in `server_watcher.go`:
- `server_port_subscribers`: This tracks the number of subscribers to changes to Servers associated with a port in a pod. The metric's labels identify the namespace and name of the pod, and its targeted port.
Additionally, `opaque_ports.go` was missing metrics as well. I added `service_subscribers` which tracks the number of subscribers to a given Service, labeled by the Service's namespace and name.
`opaque_ports.go` was also leaking the subscriber's map key, so that got fixed as well.
Closes #10130
https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
1. Update the endpoints watcher to include an additional field `localTrafficPolicy`,
   set to true when `.spec.internalTrafficPolicy` is set to `Local`.
2. Update the endpoints translator to filter addresses by node when `localTrafficPolicy`
   is set to true (see the sketch below). Topology Aware Hints are not used when
   `service.spec.internalTrafficPolicy` is set to `Local`.
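A hedged sketch of the filtering in point 2, with illustrative names for the endpoints translator's types: when the local traffic policy is in effect, only addresses on the client's node are kept and topology hints are ignored.
```go
package destination

type address struct {
	IP       string
	NodeName string
}

type addressSet struct {
	Addresses          map[string]address
	LocalTrafficPolicy bool
}

// filterAddresses keeps only same-node addresses when the local traffic
// policy is in effect; otherwise it returns the set unchanged (topology-hint
// filtering, not shown, applies only in that branch).
func filterAddresses(set addressSet, clientNode string) map[string]address {
	if !set.LocalTrafficPolicy {
		return set.Addresses
	}
	filtered := make(map[string]address, len(set.Addresses))
	for id, addr := range set.Addresses {
		if addr.NodeName == clientNode {
			filtered[id] = addr
		}
	}
	return filtered
}
```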
Signed-off-by: Yu Cao <yc185050@ncr.com>
Fixes #10150
When we added PodSecurityAdmission in #9719 (and included in
edge-23.1.1), we added the entry `seccompProfile.type=RuntimeDefault` to
the containers SecurityContext.
For PSP to accept that, we need to add the annotation
`seccomp.security.alpha.kubernetes.io/allowedProfileNames:
"runtime/default"` to the PSP resource, which also implies we need to
add the entry `seccompProfile.type=RuntimeDefault` to the pod's
SecurityContext as well, not just the container's.
It also turns out the `namespace-metadata` Jobs used by extensions for
the helm installation method didn't have their ServiceAccount properly
bound to the PSP resource. This resulted in the `helm install` command
failing, and although the extension resources did get deployed, they
were not discoverable by `linkerd check`. This change fixes that as
well; it has been broken since 2.12.0!
Fixes #8270
When a listener unsubscribes from port updates in Servers, we were
removing the listener from the `ServerWatcher.subscriptions` map, leaving
the map's key (`podPort`, which holds the pod object and port) with an
empty value. In clusters where there's a lot of pod churn, those keys
with empty values were accumulating, so this change cleans that
up.
The repro (basically constantly rolling emojivoto) is described in
#9947.
A followup will be up shortly adding metrics to track these
subscriptions, along with similar missing metrics from other parts of Destination.
Fixes https://github.com/linkerd/linkerd2/issues/10036
The Linkerd control plane components written in Go serve liveness and readiness probe endpoints on their admin server. However, the admin server is not started until the k8s informer caches are synced, which can take a long time on large clusters. This means that liveness checks can time out, causing the controller to be restarted.
We start the admin server before attempting to sync caches so that we can respond to liveness checks immediately. We fail readiness probes until the caches are synced.
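A minimal sketch of this pattern, assuming illustrative handler paths and port: the admin server starts immediately so liveness succeeds, while readiness keeps failing until the caches have synced.
```go
package main

import (
	"net/http"
	"sync/atomic"
)

func main() {
	var ready atomic.Bool

	mux := http.NewServeMux()
	mux.HandleFunc("/live", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK) // liveness: the process is up
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		if ready.Load() {
			w.WriteHeader(http.StatusOK)
			return
		}
		http.Error(w, "caches not synced", http.StatusServiceUnavailable)
	})

	// Serve the admin endpoints before waiting for the caches.
	go func() { _ = http.ListenAndServe(":9996", mux) }()

	waitForCacheSync() // blocks, possibly for a long time on large clusters
	ready.Store(true)

	select {} // run the controller (elided)
}

func waitForCacheSync() {}
```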
Signed-off-by: Alex Leong <alex@buoyant.io>
* build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0
Bumps [sigs.k8s.io/gateway-api](https://github.com/kubernetes-sigs/gateway-api) from 0.5.1 to 0.6.0.
- [Release notes](https://github.com/kubernetes-sigs/gateway-api/releases)
- [Changelog](https://github.com/kubernetes-sigs/gateway-api/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kubernetes-sigs/gateway-api/compare/v0.5.1...v0.6.0)
---
updated-dependencies:
- dependency-name: sigs.k8s.io/gateway-api
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* Account for possible errors returned from `AddEventHandler`
In v0.26.0 client-go's `AddEventHandler` method for informers started
returning a registration handle (that we ignore) and an error that we
now surface.
* client-go v0.26.0 removed the openstack plugin
* Temporary changes to trigger tests in k8s 1.21
- Adds an innocuous change to integration.yml so that all tests get
triggered
- Hard-code k8s version in `k3d cluster create` invocation to v1.21
* Revert "Temporary changes to trigger tests in k8s 1.21"
This reverts commit 3e1fdd0e5e.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
* Removed dupe imports
My IDE (vim-gopls) has been complaining for a while, so I decided to take
care of it. Found via
[staticcheck](https://github.com/dominikh/go-tools)
* Add stylecheck to go-lint checks
Fixes #10003
When endpoints are removed from an EndpointSlice resource, the destination controller builds a list of addresses to remove. However, if any of the removed endpoints have a Pod as their targetRef, we will attempt to fetch that pod to build the address to remove. If that pod has already been removed from the informer cache, this will fail and the endpoint will be skipped in the list of endpoints to be removed. This results in stale endpoints being stuck in the address set and never being removed.
We update the endpoint watcher to construct only a list of endpoint IDs for endpoints to remove, rather than fetching the entire pod object. Since we no longer attempt to fetch the pod, this operation is now infallible and endpoints will no longer be skipped during removal.
We also add a `TestEndpointSliceScaleDown` test to exercise this.
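A rough sketch of the infallible removal path, with assumed types: the IDs to delete are derived purely from the EndpointSlice, so no pod lookup (which can fail once the pod is gone) is required.
```go
package watcher

import discovery "k8s.io/api/discovery/v1"

// endpointID identifies an endpoint without needing the pod object.
type endpointID struct {
	namespace string
	name      string // targetRef name, or the address when no targetRef exists
}

// removedIDs builds the IDs to delete from the address set; it cannot fail,
// unlike a lookup of the (possibly already deleted) pod objects.
func removedIDs(slice *discovery.EndpointSlice) []endpointID {
	ids := make([]endpointID, 0, len(slice.Endpoints))
	for _, ep := range slice.Endpoints {
		if ep.TargetRef != nil && ep.TargetRef.Kind == "Pod" {
			ids = append(ids, endpointID{namespace: ep.TargetRef.Namespace, name: ep.TargetRef.Name})
			continue
		}
		for _, addr := range ep.Addresses {
			ids = append(ids, endpointID{namespace: slice.Namespace, name: addr})
		}
	}
	return ids
}
```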
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes #9676
This adds the `pod-security.kubernetes.io/enforce` label as described in [Pod Security Admission labels for namespaces](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces).
PSA gives us three different possible values (policies or modes): [privileged, baseline and restricted](https://kubernetes.io/docs/concepts/security/pod-security-standards/).
For non-CNI mode, the proxy-init container relies on being granted the NET_RAW and NET_ADMIN capabilities, which places those pods under the `privileged` policy. OTOH for CNI mode we can enforce the `restricted` policy, by setting some defaults on the containers' `securityContext` as done in this PR.
Note this change also adds the `cniEnabled` entry in the `values.yaml` file for all the extension charts, which determines what policy to use.
Final note: this includes the fix from #9717, otherwise an empty gateway UID prevents the pod from being created under the `restricted` policy.
## How to test
As this is only enforced as of k8s 1.25, here are the instructions to run 1.25 with k3d using Calico as CNI:
```bash
# launch k3d with k8s v1.25, with no flannel CNI
$ k3d cluster create --image='+v1.25' --k3s-arg '--disable=local-storage,metrics-server@server:0' --no-lb --k3s-arg --write-kubeconfig-mode=644 --k3s-arg --flannel-backend=none --k3s-arg --cluster-cidr=192.168.0.0/16 --k3s-arg '--disable=servicelb,traefik@server:0'
# install Calico
$ k apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml
# load all the images
$ bin/image-load --k3d proxy controller policy-controller web metrics-api tap cni-plugin jaeger-webhook
# install linkerd-cni
$ bin/go-run cli install-cni|k apply -f -
# install linkerd-crds
$ bin/go-run cli install --crds|k apply -f -
# install linkerd-control-plane in CNI mode
$ bin/go-run cli install --linkerd-cni-enabled|k apply -f -
# Pods should come up without issues. You can also try the viz and jaeger extensions.
# Try removing one of the securityContext entries added in this PR, and the Pod
# won't come up. You should be able to see the PodSecurity error in the associated
# ReplicaSet.
```
To test the multicluster extension using CNI, check this [gist](https://gist.github.com/alpeb/4cbbd5ad87538b9e0d39a29b4e3f02eb) with a patch to run the multicluster integration test with CNI in k8s 1.25.
When performing the HostPort mapping introduced in #9819, the `containsIP` function iterates through the pod IPs searching for a match against `targetIP` using `ip.String()`, but that returns something like `&PodIP{IP: xxx}`. Fixed that to just use `ip.IP`, and also completed the test fixtures to include both `PodIP` and `PodIPs` in the pod manifests.
Note this wasn't affecting the end result, it was just producing an extra warning as shown below, that this change eliminates:
```bash
$ go test -v ./controller/api/destination/... -run TestGetProfiles
=== RUN TestGetProfiles
...
=== RUN TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
time="2022-11-29T09:38:48-05:00" level=info msg="waiting for caches to sync"
time="2022-11-29T09:38:49-05:00" level=info msg="caches synced"
time="2022-11-29T09:38:49-05:00" level=warning msg="unable to find container port as host (172.17.13.15) matches neither PodIP nor HostIP (&Pod{ObjectMeta:{pod-0 ns 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[linkerd.io/control-plane-ns:linkerd] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,},Status:PodStatus{Phase:Running,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:172.17.13.15,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},})" test=TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
```
Fixes #9896
The maps in `endpointTranslator` weren't being guarded against
concurrent access, so we're adding locks at the `Add` and `Remove`
methods. Also these functions ultimately call the `SendMsg` method on
the gRPC `stream`, which is not
["thread-safe"](https://github.com/grpc/grpc-go/blob/master/stream.go#L122-L126),
so we're guarding against other problems as well.
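A minimal sketch of the guard, with illustrative names: a mutex serializes `Add` and `Remove`, and therefore the `SendMsg` calls they ultimately make on the stream.
```go
package destination

import "sync"

type endpointTranslator struct {
	mu        sync.Mutex
	available map[string]struct{} // stands in for the translator's address maps
	send      func(msg string) error
}

func (et *endpointTranslator) Add(addrs ...string) {
	et.mu.Lock()
	defer et.mu.Unlock()
	for _, a := range addrs {
		et.available[a] = struct{}{}
	}
	_ = et.send("add") // ultimately calls stream.SendMsg; must not race
}

func (et *endpointTranslator) Remove(addrs ...string) {
	et.mu.Lock()
	defer et.mu.Unlock()
	for _, a := range addrs {
		delete(et.available, a)
	}
	_ = et.send("remove")
}
```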
A new unit test `TestConcurrency` was added that failed in the following
ways before this fix:
When running the test with the `-race` flag, we immediately get the data race warning:
```bash
$ go test ./controller/api/destination/... -run TestConcurrency -race
time="2022-11-25T16:48:52-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:48:52-05:00" level=info msg="caches synced"
==================
WARNING: DATA RACE
Read at 0x00c0000c0040 by goroutine 161:
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80 +0x29c
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92
Previous write at 0x00c0000c0040 by goroutine 162:
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).sendFilteredUpdate()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:95 +0x66
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:83 +0x330
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92
Goroutine 161 (running) created at:
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
testing.tRunner()
/usr/local/go/src/testing/testing.go:1439 +0x213
testing.(*T).Run.func1()
/usr/local/go/src/testing/testing.go:1486 +0x47
Goroutine 162 (running) created at:
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
testing.tRunner()
/usr/local/go/src/testing/testing.go:1439 +0x213
testing.(*T).Run.func1()
/usr/local/go/src/testing/testing.go:1486 +0x47
```
If run without the `-race` flag, we get the `concurrent map writes` panic reported in #9896:
```bash
$ go test ./controller/api/destination/... -run TestConcurrency -count=1
time="2022-11-25T16:53:25-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:53:25-05:00" level=info msg="caches synced"
fatal error: concurrent map writes
goroutine 187 [running]:
runtime.throw({0x1b57bc4?, 0x500000000000000?})
/usr/local/go/src/runtime/panic.go:992 +0x71 fp=0xc00013dc80 sp=0xc00013dc50 pc=0x43a5b1
runtime.mapassign(0xc00013dec8?, 0x2?, 0x0?)
/usr/local/go/src/runtime/map.go:595 +0x4d6 fp=0xc00013dd00 sp=0xc00013dc80 pc=0x4113b6
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add(...)
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x1a8 fp=0xc00013dfe0 sp=0xc00013dd00 pc=0x16d1da8
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00013dfe8 sp=0xc00013dfe0 pc=0x46d721
created by github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x3c
```
* edge-22.11.3 change notes
Besides the notes, this corrects a small point in `RELEASE.md`, and
bumps the proxy-init image tag to `v2.1.0`. Note that the entry under
`go.mod` wasn't bumped because moving it past v2 requires changes on
`linkerd2-proxy-init`'s `go.mod` file, and we're gonna drop that
dependency soon anyways. Finally, all the charts got their patch version
bumped, except for `linkerd2-cni` that got its minor bumped because of
the tolerations default change.
## edge-22.11.3
This edge release fixes connection errors to pods using a `hostPort` different
than their `containerPort`. Also the `network-validator` init container improves
its logging, and the `linkerd-cni` DaemonSet now gets deployed in all nodes by
default.
* Fixed `destination` service to properly discover targets using a `hostPort`
different than their `containerPort`, which was causing 502 errors
* Upgraded the `network-validator` with better logging allowing users to
determine whether failures occur as a result of their environment or the tool
itself
* Added default `Exists` toleration to the `linkerd-cni` DaemonSet, allowing it
to be deployed in all nodes by default, regardless of taints
Co-authored-by: Oliver Gould <ver@buoyant.io>