Commit Graph

147 Commits

Author SHA1 Message Date
Kevin Leimkuhler e7f2a3fba3
viz: add tap-injector (#5540)
## What this changes

This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.

If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.

This also removes the environment variable for explicitly disabling tap. Tap
status for a proxy is now determined only by the presence or absence of the tap
service name environment variable.

Closes #5326

## How it changes

### tap-injector

The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies the
following, it is added (see the sketch after this list):

- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace
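
A minimal sketch of this decision, assuming a proxy container name and a
disable-annotation key that are illustrative only (this is not the actual
tap-injector code):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
)

// shouldInjectTapSvcName sketches the decision above; it is not the real
// tap-injector. The container name and annotation key are assumptions.
func shouldInjectTapSvcName(pod *corev1.Pod, nsAnnotations map[string]string) bool {
	// 1. The pod must have a Linkerd proxy container.
	proxyIdx := -1
	for i, c := range pod.Spec.Containers {
		if c.Name == "linkerd-proxy" {
			proxyIdx = i
			break
		}
	}
	if proxyIdx < 0 {
		return false
	}
	// 2. The pod must not already have been mutated (here: the variable is absent).
	for _, e := range pod.Spec.Containers[proxyIdx].Env {
		if e.Name == "LINKERD2_PROXY_TAP_SVC_NAME" {
			return false
		}
	}
	// 3. Tap must not be disabled via annotation on the pod or its namespace.
	const disableAnnotation = "tap.linkerd.io/disabled" // assumed key, for illustration
	if pod.Annotations[disableAnnotation] == "true" || nsAnnotations[disableAnnotation] == "true" {
		return false
	}
	return true
}
```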

### LINKERD2_PROXY_TAP_DISABLED

Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only by whether the
tap-injector adds the `LINKERD2_PROXY_TAP_SVC_NAME` environment variable.

### controller image

The tap-injector has been added as one of the controller image's several
startup commands, which determine what the container does in the cluster.

As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-21 11:24:08 -05:00
Kevin Leimkuhler 7c0843a823
Add opaque ports to destination service updates (#5294)
## Summary

This changes the destination service to start indicating whether a profile is an
opaque protocol or not.

Currently, profiles returned by the destination service are built by chaining
together updates coming from the Profile and Traffic Split watches.

With this change, we now also watch updates to Opaque Port annotations on pods
and namespaces; if an update occurs this is now included in building a profile
update and is sent to the client.

## Details

Watching updates to Profiles and Traffic Splits is straightforward: we watch
those resources, and if an update occurs on one associated with a service we
care about, the update is passed through.

For Opaque Ports this is a little different because it is an annotation on pods
or namespaces. To account for this, we watch the endpoints of the services we
care about.

### When host is a Pod IP

When getting the profile for a Pod IP, we check for the opaque ports annotation
on the pod and the pod's namespace. If one is found, we mark the profile as an
opaque protocol when the requested port is in the annotation.

We do not subscribe for updates to this pod IP. The only update we really care
about is the pod being deleted, and that is already handled by the proxy.
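
As a rough sketch (not the destination service's actual code), the per-port
check amounts to parsing the annotation value and comparing it against the
requested port:

```go
package main

import (
	"strconv"
	"strings"
)

// isOpaquePort is a simplified sketch of the per-port check: it treats the
// annotation value as a comma-separated list of port numbers. The real code
// is more involved (it checks both the pod and the namespace annotation), so
// this is illustrative only.
func isOpaquePort(annotationValue string, requestedPort uint32) bool {
	for _, field := range strings.Split(annotationValue, ",") {
		n, err := strconv.ParseUint(strings.TrimSpace(field), 10, 32)
		if err != nil {
			continue // ignore malformed entries in this sketch
		}
		if uint32(n) == requestedPort {
			return true
		}
	}
	return false
}
```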

### When host is a Service

When getting the profile for a Service, we subscribe for updates to the
endpoints of that service. We check whether the requested port is set in the
opaque ports annotation on any of the service's pods.

Since the endpoints for a service can be added and removed, we do subscribe for
updates to the endpoints of the service.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-12-18 12:38:59 -05:00
Alejandro Pedraza 578d4a19e9
Have the tap APIServer refresh its cert automatically (#5388)
Follow-up to #5282; fixes #5272 in its totality.

This follows the same pattern as the injector/sp-validator webhooks, leveraging `FsCredsWatcher` to watch for changes in the cert files.

To reuse code from the webhooks, we moved `updateCert()` and `run()` (now called `ProcessEvents()`) into `creds_watcher.go`.

The `TestNewAPIServer` test in `apiserver_test.go` was removed as it really was just testing two things: (1) that `apiServerAuth` doesn't error, which is already covered in the following test, and (2) that the standard library call `net.Listen("tcp", addr)` doesn't error, which we're not interested in testing here.

## How to test

To test that the injector/sp-validator functionality is still correct, you can refer to #5282

The steps below are similar, but focused towards the tap component:

```bash
# Create some root cert
$ step certificate create linkerd-tap.linkerd.svc ca.crt ca.key   --profile root-ca --no-password --insecure

# configure tap's caBundle to be that root cert
$ cat > linkerd-overrides.yml << EOF
tap:
  externalSecret: true
  caBundle: |
    < ca.crt contents>
EOF

# Install linkerd
$ bin/linkerd install --config linkerd-overrides.yml | k apply -f -

# Generate an intermediary cert with a short lifespan
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc

# Create the secret using that intermediate cert
$ kubectl create secret tls \
  linkerd-tap-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Roll out the tap pod so it picks up the new secret
$ k -n linkerd rollout restart deploy/linkerd-tap

# Tap should work
$ bin/linkerd tap -n linkerd deploy/linkerd-web
req id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :method=GET :authority=10.42.0.11:9994 :path=/metrics
rsp id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :status=200 latency=1779µs
end id=0:0 proxy=in  src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true duration=65µs response-length=1709B

# Wait 5 minutes and rollout tap again
$ k -n linkerd rollout restart deploy/linkerd-tap

# You'll see in the logs that the cert expired:
$ k -n linkerd logs -f deploy/linkerd-tap tap
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45866: remote error: tls: bad certificate
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45870: remote error: tls: bad certificate

# Recreate the secret
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc
$ k -n linkerd delete secret linkerd-tap-k8s-tls
$ kubectl create secret tls \
  linkerd-tap-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Wait a few moments and you'll see the certs got reloaded and tap is working again
time="2020-12-15T16:03:42Z" level=info msg="Updated certificate" addr=":8089" component=apiserver
```
2020-12-16 17:46:14 -05:00
Tarun Pothulapati 72a0ca974d
extension: Separate multicluster chart and binary (#5293)
Fixes #5257

This branch moves the multicluster (mc) charts and CLI-level code to a new
top-level directory. None of the logic is changed.

Also, moves some common types into `/pkg` so that they
are accessible to both the main CLI and extensions.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-12-04 16:36:10 -08:00
Alejandro Pedraza 4c634a3816
Have webhooks refresh their certs automatically (#5282)
* Have webhooks refresh their certs automatically

Fixes partially #5272

In 2.9 we introduced the ability for providing the certs for `proxy-injector` and `sp-validator` through some external means like cert-manager, through the new helm setting `externalSecret`.
We forgot, however, to have those services watch changes in their secrets, so whenever they were rotated they would fail with a cert error, the only workaround being to restart those pods to pick up the new secrets.

This addresses that by first abstracting out `FsCredsWatcher` from the identity controller, which now lives under `pkg/tls`.

The webhook's logic in `launcher.go` no longer reads the certs before starting the https server, moving that instead into `server.go` which in a similar way as identity will receive events from `FsCredsWatcher` and update `Server.cert`. We're leveraging `http.Server.TLSConfig.GetCertificate` which allows us to provide a function that will return the current cert for every incoming request.
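
A minimal sketch of that pattern (not the actual `server.go`): the serving
certificate is swapped under a lock whenever the watcher reports a change, and
`GetCertificate` hands the current one to each handshake.

```go
package main

import (
	"crypto/tls"
	"net/http"
	"sync"
)

// certStore holds the current serving certificate; an FsCredsWatcher-style
// goroutine would call Set whenever the cert files on disk change.
type certStore struct {
	mu   sync.RWMutex
	cert *tls.Certificate
}

func (s *certStore) Set(c *tls.Certificate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cert = c
}

// Get satisfies tls.Config.GetCertificate and is consulted on every handshake.
func (s *certStore) Get(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cert, nil
}

func newServer(addr string, store *certStore) *http.Server {
	return &http.Server{
		Addr: addr,
		// New certs take effect without restarting the server.
		TLSConfig: &tls.Config{GetCertificate: store.Get},
	}
}
```

With `GetCertificate` set, the server can be started with
`ListenAndServeTLS("", "")`, since the certificate no longer comes from fixed
file paths read once at startup.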

### How to test

```bash
# Create some root cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca.crt ca.key \
  --profile root-ca --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# configure injector's caBundle to be that root cert
$ cat > linkerd-overrides.yaml << EOF
proxyInjector:
  externalSecret: true
  caBundle: |
    < ca.crt contents>
EOF

# Install linkerd. The injector won't start until we create the secret below
$ bin/linkerd install --controller-log-level debug --config linkerd-overrides.yaml | k apply -f -

# Generate an intermediary cert with a short lifespan
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# Create the secret using that intermediate cert
$ kubectl create secret tls \
  linkerd-proxy-injector-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# start following the injector log
$ k -n linkerd logs -f -l linkerd.io/control-plane-component=proxy-injector -c proxy-injector

# Inject emojivoto. The pods should be injected normally
$ bin/linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -

# Wait about 5 minutes and delete a pod
$ k -n emojivoto delete po -l app=emoji-svc

# You'll see it won't be injected, and something like "remote error: tls: bad certificate" will appear in the injector logs.

# Regenerate the intermediate cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc

# Delete the secret and recreate it
$ k -n linkerd delete secret linkerd-proxy-injector-k8s-tls
$ kubectl create secret tls \
  linkerd-proxy-injector-k8s-tls \
   --cert=ca-int.crt \
   --key=ca-int.key \
   --namespace=linkerd

# Wait a couple of minutes and you'll see some filesystem events in the injector log along with a "Certificate has been updated" entry
# Then delete the pod again and you'll see it gets injected this time
$ k -n emojivoto delete po -l app=emoji-svc

```
2020-12-04 16:25:59 -05:00
Alejandro Pedraza 4687dc52aa
Refactor webhook framework to allow webhooks to define their flags (#5256)
* Refactor webhook framework to allow webhooks to define their flags

Pulled the flag parsing logic out of `launcher.go` and moved it into the `Main` methods of the webhooks (under `controller/cmd/proxy-injector/main.go` and `controller/cmd/sp-validator/main.go`), so that individual webhooks can define the flags they want to use.

Also no longer require that webhooks have cluster-wide access.

Finally, renamed the type `webhook.handlerFunc` to `webhook.Handler` so it can be exported. This will be used in the upcoming jaeger webhook.
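
As a rough illustration of the resulting shape (flag names and structure are
assumptions, not the actual `Main` code):

```go
package main

import (
	"flag"
	"log"
)

// Each webhook's Main now parses only the flags it needs, instead of relying
// on shared parsing in launcher.go. Flag names here are illustrative.
func main() {
	addr := flag.String("addr", ":8443", "address to serve the webhook on")
	kubeconfig := flag.String("kubeconfig", "", "path to kubeconfig (out-of-cluster only)")
	logLevel := flag.String("log-level", "info", "log level")
	flag.Parse()

	log.Printf("starting webhook on %s (kubeconfig=%q, log-level=%s)", *addr, *kubeconfig, *logLevel)
	// ... construct the webhook.Handler and launch the HTTPS server here ...
}
```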
2020-11-23 10:40:30 -05:00
Tarun Pothulapati 5e774aaf05
Remove dependency of linkerd-config for control plane components (#4915)
* Remove dependency of linkerd-config for most control plane components

This PR removes the dependency on `linkerd-config` in control
plane components by passing all of that information through CLI
flags. As most of these components only need a couple of values, passing
them as flags is more helpful: updates to the flags trigger a
rollout, unlike a ConfigMap update.

This does not update the proxy-injector as it needs a lot more data
and mounting `linkerd-config` is better.
2020-10-06 22:19:18 +05:30
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument in all code paths that touch the k8s client-go interface (see the sketch below)
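
For illustration, a generic client-go v0.19-style call with the new signature
(not a specific Linkerd code path):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// listPods shows the v0.19-style client-go signature: a context.Context is
// now the first argument to Get/List/Watch and friends.
func listPods(ctx context.Context, kubeconfig, namespace string) error {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return err
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		return err
	}
	pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	fmt.Printf("found %d pods in %s\n", len(pods.Items), namespace)
	return nil
}
```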

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Alejandro Pedraza b30d35f46a
Reset service-mirror component when target's k8s API is unreachable (#4996)
When the service-mirror component can't reach the target's k8s API, the goroutine blocks and it can't be unblocked.

This was happening specifically in the case of the multicluster integration test (still to be pushed), where the source and target clusters are created in quick succession and the target's API service doesn't always have time to be exposed before being requested by the service mirror.

The fix consists of no longer having restartClusterWatcher be side-effecting; instead it returns an error. If that error is non-nil, the link watcher is stopped and reset after 10 seconds.
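
A minimal sketch of that shape (only `restartClusterWatcher` is a real name from this change; everything else is illustrative):

```go
package main

import (
	"log"
	"time"
)

// restartClusterWatcher stands in for the real function, which now returns an
// error instead of only side-effecting.
func restartClusterWatcher() error {
	// ... connect to the target cluster's API and start watching ...
	return nil
}

// runLinkWatcher sketches the retry loop: on failure the watcher is stopped
// and retried after 10 seconds, so an unreachable target API no longer blocks
// the goroutine forever.
func runLinkWatcher(stop <-chan struct{}) {
	for {
		if err := restartClusterWatcher(); err != nil {
			log.Printf("unable to restart cluster watcher: %s; retrying in 10s", err)
			select {
			case <-time.After(10 * time.Second):
				continue
			case <-stop:
				return
			}
		}
		return
	}
}
```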
2020-09-25 11:00:28 -05:00
Alex Leong 9d3cf6ee4d
Move most service-mirror code out of cmd package (#4901)
All of the code for the service mirror controller lives in the `linkerd/linkerd2/controller/cmd` package.  It is typical for control plane components to only have a `main.go` entrypoint in the cmd package.  This can sometimes make it hard to find the service mirror code since I wouldn't expect it to be in the cmd package.

We move the majority of the code to a dedicated controller package, leaving only main.go in the cmd package.  This is purely organizational; no behavior change is expected.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-08-27 14:17:18 -07:00
Matei David 7ed904f31d
Enable endpoint slices when upgrading through CLI (#4864)
## What/How
@adleong  pointed out in #4780 that when enabling slices during an upgrade, the new value does not persist in the `linkerd-config` ConfigMap. I took a closer look and it seems that we were never overwriting the values in case they were different.

* To fix this, I added an if block when validating and building the upgrade options -- if the current flag value differs from what we have in the ConfigMap, then change the ConfigMap value.
* When doing so, I made sure to check that if the cluster does not support `EndpointSlices` yet the flag is set to true, we will error out. This is done similarly (copy & paste, mostly) to what's in the install part.
* Additionally, I have noticed that the helm ConfigMap template stored the flag value under `enableEndpointSlices` field name. I assume this was not changed in the initial PR to reflect the changes made in the protocol buffer. The API (and thus the CLI) uses the field name `endpointSliceEnabled` instead. I have changed the config template so that helm installations will use the same field, which can then be used in the destination service or other components that may implement slice support in the future.

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-08-24 14:34:50 -07:00
Josh Soref 72aadb540f
Spelling (#4872)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at aaf440489e (commitcomment-41423663)

The action reports that the changes in this PR would make it happy: 5b82c6c5ca

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-08-12 21:59:50 -07:00
Alex Leong 024a35a3d3
Move multicluster API connectivity checks earlier (#4819)
Fixes #4774

When a service mirror controller is unable to connect to the target cluster's API, the service mirror controller crashes with the error that it has failed to sync caches.  This error lacks the necessary detail to debug the situation.  Unfortunately, client-go does not surface more useful information about why the caches failed to sync.

To make this more debuggable we do a couple things:

1. When creating the target cluster api client, we eagerly issue a server version check to test the connection.  If the connection fails, the service-mirror-controller logs now look like this:

```
time="2020-07-30T23:53:31Z" level=info msg="Got updated link broken: {Name:broken Namespace:linkerd-multicluster TargetClusterName:broken TargetClusterDomain:cluster.local TargetClusterLinkerdNamespace:linkerd ClusterCredentialsSecret:cluster-credentials-broken GatewayAddress:35.230.81.215 GatewayPort:4143 GatewayIdentity:linkerd-gateway.linkerd-multicluster.serviceaccount.identity.linkerd.cluster.local ProbeSpec:ProbeSpec: {path: /health, port: 4181, period: 3s} Selector:{MatchLabels:map[] MatchExpressions:[{Key:mirror.linkerd.io/exported Operator:Exists Values:[]}]}}"
time="2020-07-30T23:54:01Z" level=error msg="Unable to create cluster watcher: cannot connect to api for target cluster remote: Get \"https://36.199.152.138/version?timeout=32s\": dial tcp 36.199.152.138:443: i/o timeout"
```

This error also no longer causes the service mirror controller to crash.  Updating the Link resource will cause the service mirror controller to reload the credentials and try again.

2. We rearrange the checks in `linkerd check --multicluster` to perform the target API connectivity checks before the service mirror controller checks.  This means that we can validate the target cluster API connection even if the service mirror controller is not healthy.  We also add a server version check here to quickly determine if the connection is healthy.  Sample check output:

```
linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
	* broken
W0730 16:52:05.620806   36735 transport.go:243] Unable to cancel request for promhttp.RoundTripperFunc
× remote cluster access credentials are valid
            * failed to connect to API for cluster: [broken]: Get "https://36.199.152.138/version?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    see https://linkerd.io/checks/#l5d-smc-target-clusters-access for hints

W0730 16:52:35.645499   36735 transport.go:243] Unable to cancel request for promhttp.RoundTripperFunc
× clusters share trust anchors
    Problematic clusters:
    * broken: unable to fetch anchors: Get "https://36.199.152.138/api/v1/namespaces/linkerd/configmaps/linkerd-config?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    see https://linkerd.io/checks/#l5d-multicluster-clusters-share-anchors for hints
√ service mirror controller has required permissions
	* broken
√ service mirror controllers are running
	* broken
× all gateway mirrors are healthy
        wrong number of (0) gateway metrics entries for probe-gateway-broken.linkerd-multicluster
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints
√ all mirror services have endpoints
‼ all mirror services are part of a Link
        mirror service voting-svc-gke.emojivoto is not part of any Link
    see https://linkerd.io/checks/#l5d-multicluster-orphaned-services for hints
```

Some logs from the underlying go network libraries sneak into the output which is kinda gross but I don't think it interferes too much with being able to understand what's going on.
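
The eager connectivity test from (1) roughly corresponds to a discovery `ServerVersion` call; a minimal sketch, not the exact service-mirror code:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// checkTargetAPI sketches the eager connection test: a ServerVersion call
// fails fast with a descriptive error when the target cluster's API is
// unreachable, instead of a vague "failed to sync caches" later on.
func checkTargetAPI(config *rest.Config) error {
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		return err
	}
	if _, err := client.Discovery().ServerVersion(); err != nil {
		return fmt.Errorf("cannot connect to api for target cluster: %w", err)
	}
	return nil
}
```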

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-08-05 11:48:23 -07:00
Alex Leong 381f237f69
Add multicluster unlink command (#4802)
Fixes #4707 

In order to remove a multicluster link, we add a `linkerd multicluster unlink` command which produces the yaml necessary to delete all of the resources associated with a `linkerd multicluster link`.  These are:
* the link resource
* the service mirror controller deployment
* the service mirror controller's RBAC
* the probe gateway mirror for this link
* all mirror services for this link

This command follows the same pattern as the `linkerd uninstall` command in that its output is expected to be piped to `kubectl delete`.  The typical usage of this command is:

```
linkerd --context=source multicluster unlink --cluster-name=foo | kubectl --context=source delete -f -
```

This change also fixes the shutdown lifecycle of the service mirror controller by properly having it listen for the shutdown signal and exit its main loop.

A few alternative designs were considered:

I investigated using owner references as suggested [here](https://github.com/linkerd/linkerd2/issues/4707#issuecomment-653494591) but it turns out that owner references must refer to resources in the same namespace (or to cluster scoped resources).  This was not feasible here because a service mirror controller can create mirror services in many different namespaces.

I also considered having the service mirror controller delete the mirror services that it created during its own shutdown.  However, this could lead to scenarios where the controller is killed before it finishes deleting the services that it created.  It seemed more reliable to have all the deletions happen from `kubectl delete`.  Since this is the case, we avoid having the service mirror controller delete mirror services, even when the link is deleted, to avoid the race condition where the controller and CLI both attempt to delete the same mirror services and one of them fails with a potentially alarming error message.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-08-04 16:21:59 -07:00
Alex Leong a1543b33e3
Add support for service-mirror selectors (#4795)
* Add selector support

Signed-off-by: Alex Leong <alex@buoyant.io>

* Removed unused labels

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-07-30 10:07:14 -07:00
Alex Leong d540e16c8b
Make service mirror controller per target cluster (#4710)
This PR moves the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31.  For fuller context, please see that RFC.

Basic multicluster functionality works here including:
* `linkerd mc install` installs the Link CRD but not any service mirror controllers
* `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link
* The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints.
* The `linkerd mc gateways` command lists all linked target clusters, their liveness, and probe latencies.
* The `linkerd check` multicluster checks have been updated for the new architecture.  Several checks have been rendered obsolete by the new architecture and have been removed.

The following are known issues requiring further work:
* the service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror. It does not yet support configuring a label selector.
* an unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707
* an mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-07-23 14:32:50 -07:00
Tarun Pothulapati b7e9507174
Remove/Relax prometheus related checks (#4724)
* Removes/Relaxes prometheus related checks

Now that prometheus is an add-on, there can be cases where prometheus is
disabled, in which case the check should show a warning but not fail. This
decouples the tight dependency.

This changes the following checks:

- Removes serviceAccount and pod checks in the CLI.
- Relaxes `linkerd-api` checks to only check for prometheus access when
the URL is not empty. This should work seamlessly with external
prometheus as that URL will be passed and it performs the same
check.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-07-20 14:24:00 -07:00
Matei David 8b85716eb8
Introduce install flag for EndpointSlices (#4740)
EndpointSlices have been made opt-in due to their experimental nature. This PR
introduces a new install flag 'enableEndpointSlices' that will allow adopters to
specify in their cli install or helm install step whether they would like to
use endpointslices as a resource in the destination service, instead of the
endpoints k8s resource.

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-07-15 09:53:04 -07:00
Matei David 9d8d89cce8
Add EndpointSlice logic to EndpointsWatcher (#4501) (#4663)
Introduce support for the EndpointSlice k8s resource (k8s v1.16+) in the destination service.
With this PR, the EndpointsWatcher gets a dedicated informer for EndpointSlice;
this informer cannot run at the same time as the Endpoints resource informer. The main difference
is that EndpointSlices have a one-to-many relationship with a service and bring better performance,
dual-stack addresses, and more. EndpointSlice support also enables service topology and other related k8s features.
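
A hedged sketch of wiring up such a dedicated informer with client-go's shared
informer factory (handler bodies elided; this is not the actual
EndpointsWatcher code):

```go
package main

import (
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// newEndpointSliceInformer sketches the "dedicated informer" idea: the watcher
// starts either this informer or the Endpoints one, never both at once.
func newEndpointSliceInformer(client kubernetes.Interface) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	informer := factory.Discovery().V1beta1().EndpointSlices().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { /* map the slice back to its parent service */ },
		UpdateFunc: func(old, new interface{}) { /* slices have a one-to-many relation to a service */ },
		DeleteFunc: func(obj interface{}) { /* remove the slice's endpoints */ },
	})
	return informer
}
```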

Validated and tested manually, as well as with dedicated unit tests.

Closes #4501

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-07-07 13:20:40 -07:00
Arthur Silva Sens 021048d576
GoDocs for completion, dashboard and diagnostics cli commands (#4518)
Signed-off-by: arthursens <arthursens2005@gmail.com>
2020-06-30 05:53:50 -05:00
Alex Leong 755538b84a
Resolve gateway hostnames into IP addresses (#4588)
Fixes #4582 

When a target cluster gateway is exposed as a hostname rather than with a fixed IP address, the service mirror controller fails to create mirror services and gateway mirrors for that gateway.  This is because we only look at the IP field of the gateway service.

We make two changes to address this problem:
 
First, when extracting the gateway spec from a gateway that has a hostname instead of an IP address, we do a DNS lookup to resolve that hostname into an IP address to use in the mirror service endpoints and gateway mirror endpoints.

Second, we schedule a repair job at a regular (1 minute) interval to update these endpoint objects.  This has the effect of re-resolving the DNS names every minute to pick up any changes in DNS resolution.
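
A hedged sketch of those two pieces, using the standard library only (names are illustrative, not the actual controller code):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// resolveGatewayAddress picks the first resolved address for a gateway
// hostname; the result would be written into the mirror service and gateway
// mirror endpoints.
func resolveGatewayAddress(hostname string) (string, error) {
	ips, err := net.LookupIP(hostname)
	if err != nil || len(ips) == 0 {
		return "", fmt.Errorf("could not resolve gateway hostname %q: %v", hostname, err)
	}
	return ips[0].String(), nil
}

// repairLoop re-resolves the gateway every minute so DNS changes are picked
// up, matching the repair job described above.
func repairLoop(hostname string, stop <-chan struct{}) {
	ticker := time.NewTicker(1 * time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if ip, err := resolveGatewayAddress(hostname); err == nil {
				fmt.Printf("gateway %s currently resolves to %s\n", hostname, ip)
				// ... update the endpoints objects here ...
			}
		case <-stop:
			return
		}
	}
}
```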

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-06-15 10:33:49 -07:00
Zahari Dichev f01bcfe722
Tweak service-mirror log levels (#4562)
This PR just modifies the log levels on the probe and cluster watchers
to emit in INFO what they would emit in DEBUG. I think it makes sense
as we need that information to track problems. The only difference is
that when probing gateways we only log if the probe attempt was
unsuccessful.

Fix #4546
2020-06-05 13:12:36 -07:00
Zahari Dichev b6b95455aa
Fix load balancer missing ip race condition (#4554)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-05 19:35:47 +03:00
Alex Leong cffa07ddba
Update gateway identity on gateway mirror endpoints (#4559)
When the identity annotation on a gateway service is updated, this change is not propagated to the mirror gateway endpoints object.

This is because the annotations are updated on the wrong object and the changes are lost.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-06-05 09:21:35 -07:00
Alex Leong 0f84ff61db
Update gateway mirror ports (#4551)
* Update gateway mirror spec when remote gateway changes

Signed-off-by: Alex Leong <alex@buoyant.io>

* Only update ports

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-06-04 17:25:46 +03:00
Kevin Leimkuhler 8a932ac905
Change text to use source/target terminology in events and metrics (#4527)
Change terminology from local/remote to source/target in events and metrics.

This does not change any variable, function, struct, or field names since
testing is still improving

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-03 15:02:39 -04:00
Oliver Gould 7cc5e5c646
multicluster: Use the proxy as an HTTP gateway (#4528)
This change modifies the linkerd-gateway component to use the inbound
proxy, rather than nginx, as the gateway. This allows us to detect loops and
propagate identity through the gateway.

This change also cleans up port naming to `mc-gateway` and `mc-probe`
to resolve conflicts with Kubernetes validation.

---

* proxy: v2.99.0

The proxy can now operate as gateway, routing requests from its inbound
proxy to the outbound proxy, without passing the requests to a local
application. This supports Linkerd's multicluster feature by adding a
`Forwarded` header to propagate the original client identity and assist
in loop detection.

---

* Add loop detection to inbound & TCP forwarding (linkerd/linkerd2-proxy#527)
* Test loop detection (linkerd/linkerd2-proxy#532)
* fallback: Unwrap errors recursively (linkerd/linkerd2-proxy#534)
* app: Split inbound/outbound constructors into components (linkerd/linkerd2-proxy#533)
* Introduce a gateway between inbound and outbound (linkerd/linkerd2-proxy#540)
* gateway: Add a Forwarded header (linkerd/linkerd2-proxy#544)
* gateway: Return errors instead of responses (linkerd/linkerd2-proxy#547)
* Fail requests that loop through the gateway (linkerd/linkerd2-proxy#545)

* inject: Support config.linkerd.io/enable-gateway

This change introduces a new annotation,
config.linkerd.io/enable-gateway, that, when set, enables the proxy to
act as a gateway, routing all traffic targeting the inbound listener
through the outbound proxy.

This also removes the nginx default listener and gateway port of 4180,
instead using 4143 (the inbound port).

* proxy: v2.100.0

This change modifies the inbound gateway caching so that requests may be
routed to multiple leaves of a traffic split.

---

* inbound: Do not cache gateway services (linkerd/linkerd2-proxy#549)
2020-06-02 19:37:14 -07:00
Kevin Leimkuhler d7f84e6c7b
Change help text to use source/target terminology in service-mirror and healthchecks (#4524)
Change terminology from local/remote to source/target in service-mirror and
healthchecks help text.

This does not change any variable, function, struct, or field names since
testing is still improving

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-02 15:21:52 -04:00
Alex Leong 91a067c924
Rename gateway ports (#4526)
* Rename gateway ports

Signed-off-by: Alex Leong <alex@buoyant.io>

* fmt

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-06-02 09:08:23 +03:00
Kevin Leimkuhler b4804a0bb5
Format fix (#4525)
Fixes CI failures

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-06-01 18:51:00 -04:00
Zahari Dichev 6c3922a7f1
Probe manager simplification (#4510)
There are a few notable things happening in this PR: 

- the probe manager has been decoupled from the cluster_watcher. Now its only responsibility is to watch for mirrored gateways being created and to probe them. This means that probes are initiated for all gateways no matter whether there are mirrored services being paired
- the number of paired services is derived from the existing services in the cluster rather than being published as a metric by the prober
- there are no events being exchanged between the cluster watcher and the probe manager

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-01 14:41:29 -07:00
Zahari Dichev f7f70690fb
Fix resync bug + service selection annotations (#4453)
This PR addresses two problems:

- when a resync happens (or the mirror controller is restarted) we incorrectly classify the remote gateway as a mirrored service that is not mirrored anymore and we delete it
- when updating services due to a gateway update, we need to select only the services for the particular cluster

The latter fixes #4451
2020-05-21 14:15:13 -07:00
Zahari Dichev 31e33d18d3
Enable service mirroring to work in private networks (#4440)
This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway.

Fix #4411

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-20 19:48:36 +03:00
Zahari Dichev 6574f124a7
Restrict Service mirror RBACs (#4426)
This PR introduces a few changes that were requested after a bit of service mirror reviewing.

- we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in
- we unify the namespace naming so all multicluster resources are installed in `linkerd-multicluster` on both clusters
- fixed the checks to account for these changes

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-20 17:08:01 +03:00
Zahari Dichev 115bab9868
Fix gateway update problems (#4388)
* Fix gateway update problems

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 10:59:30 -05:00
Zahari Dichev fd59ce532d
Add better logging to service mirror controller (#4361)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-11 10:30:16 +03:00
Zahari Dichev edd9b654a7
Make gateway require TLS for incoming requests (#4339)
Make gateway require TLS for incoming requests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-11 10:07:48 +03:00
Zahari Dichev 4e82ba8878
Multicluster checks (#4279)
Multicluster checks

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-05 10:19:38 +03:00
Zahari Dichev cd04b94bb9
Probe manager events emission tests (#4312)
Probe manager events emission tests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-05 08:57:05 +03:00
Zahari Dichev 09262ebd72
Add liveness checks and metrics for multicluster gateway (#4233)
Add liveness checks for gateway

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-27 13:06:58 +03:00
Zahari Dichev 10ecd8889e
Set auth override (#4160)
Set AuthOverride when present on endpoints annotation

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-25 10:56:36 +02:00
Zahari Dichev 72fc94b03c
Service mirroring tests (#4115)
Unit tests that exercise most of the code in cluster_watcher.go. Essentially the whole cluster mirroring machinery can be thought of as a function that takes remote cluster state, local cluster state, and modification events and, as a result, either modifies local cluster state or issues new events onto the queue. This is what these tests are trying to model. I think this covers a lot of the logic there. Any suggestions for other edge cases are welcome.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-04 20:17:21 +02:00
Zahari Dichev edd7fd203d
Service Mirroring Component (#4028)
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-02 21:16:08 +02:00
Mayank Shah 60ac0d5527 Add `as-group` CLI flag (#3952)
Add CLI flag --as-group that can impersonate group for k8s operations

Signed-off-by: Mayank Shah mayankshah1614@gmail.com
2020-01-22 16:38:31 +02:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod IP as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections, by letting it do destination lookups for the SO_ORIG_DST of TCP connections.  When that IP address corresponds to a service cluster IP or pod IP, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by IP.   `IPWatcher` maintains a mapping of clusterIPs to service IDs and translates subscriptions to an IP address into a subscription to the service ID using the underlying `EndpointsWatcher`.

Since the service name is no longer always inferable directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.
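
A rough sketch of that translation layer (types and method names are illustrative, not the actual `IPWatcher`):

```go
package main

import "sync"

// serviceID identifies a service by namespace and name.
type serviceID struct{ Namespace, Name string }

// endpointsWatcher stands in for the real EndpointsWatcher; only the method
// used by this sketch is declared.
type endpointsWatcher interface {
	Subscribe(id serviceID, port uint32) error
}

// ipWatcher sketches the translation layer described above: it maintains a
// mapping of clusterIPs to service IDs and turns an IP subscription into a
// service subscription on the underlying watcher.
type ipWatcher struct {
	mu        sync.RWMutex
	byIP      map[string]serviceID
	endpoints endpointsWatcher
}

// SubscribeIP returns false when the IP is not a known service cluster IP, in
// which case the caller would fall back to pod IP handling.
func (w *ipWatcher) SubscribeIP(clusterIP string, port uint32) (bool, error) {
	w.mu.RLock()
	id, ok := w.byIP[clusterIP]
	w.mu.RUnlock()
	if !ok {
		return false, nil
	}
	return true, w.endpoints.Subscribe(id, port)
}
```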

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Sergio C. Arteaga cee8e3d0ae Add CronJobs and ReplicaSets to dashboard and CLI (#3687)
This PR adds support for CronJobs and ReplicaSets to `linkerd inject`, the web
dashboard and CLI. It adds a new Grafana dashboard for each kind of resource. 

Closes #3614 
Closes #3630 
Closes #3584 
Closes #3585

Signed-off-by: Sergio Castaño Arteaga tegioz@icloud.com
Signed-off-by: Cintia Sanchez Garcia cynthiasg@icloud.com
2019-12-11 10:02:37 -08:00
Tarun Pothulapati f18e27b115 use appsv1 api in identity (#3682)
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2019-11-06 15:06:09 -08:00
Alejandro Pedraza 0e8958cd07
Fixed bad identity string for target pod in tap (#3675)
* Fixed bad identity string for target pod in tap

Fixes #3506

Was using the cluster domain instead of the trust domain, which results
in an error when those domains differ.
2019-11-05 15:57:41 -05:00
Alejandro Pedraza 8cf4494e78
Add proxy-injector-injections count to heartbeat (#3655)
Fixes #3059
2019-10-31 11:09:00 -05:00
Alejandro Pedraza d3d8266c63
If tap source IP matches many running pods then only show the IP (#3513)
* If tap source IP matches many running pods then only show the IP

When an unmeshed source IP matched more than one running pod, tap was
showing the names of all those pods, even though they didn't necessarily
originate the connection. This could be reproduced when using a pod
network add-on such as Calico.

With this change, if a node matches, we return it; otherwise we proceed to look for a matching pod. If exactly one running pod matches, we return it; otherwise we return just the IP.

Fixes #3103
2019-10-25 12:38:11 -05:00