Fixes #5575
Now that only viz makes use of the `SelfCheck` API, `healthcheck.proto` has been merged into `viz.proto`.
Also removed the "checkRPC" functionality that handled multiple API responses and was only used by `SelfCheck`, because the extra complexity was not warranted. We revert to the plain vanilla "check" and simply concatenate error responses.
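For reference, a minimal sketch of that concatenation approach, using hypothetical check and handler names rather than the actual viz API:
```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// selfCheck is a hypothetical stand-in for the viz SelfCheck handler: each
// sub-check runs in turn and the error messages are concatenated into a
// single plain "check" result instead of streaming multiple API responses.
func selfCheck(checks []func() error) error {
	var msgs []string
	for _, check := range checks {
		if err := check(); err != nil {
			msgs = append(msgs, err.Error())
		}
	}
	if len(msgs) == 0 {
		return nil
	}
	return errors.New(strings.Join(msgs, "\n"))
}

func main() {
	err := selfCheck([]func() error{
		func() error { return errors.New("Error calling the Kubernetes API: someerror") },
		func() error { return errors.New("Error calling Prometheus from the control plane: someerror") },
	})
	fmt.Println(err)
}
```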
## Success Output
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
√ viz extension self-check
```
## Failure Examples
Failure when viz fails to connect to the k8s api:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
Error calling the Kubernetes API: someerror
see https://linkerd.io/checks/#l5d-api-control-api for hints
Status check results are ×
```
Failure when viz fails to connect to Prometheus:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
Error calling Prometheus from the control plane: someerror
see https://linkerd.io/checks/#l5d-api-control-api for hints
Status check results are ×
```
Failure when viz fails to connect to both the k8s api and Prometheus:
```bash
$ bin/linkerd viz check
...
linkerd-viz
-----------
...
× viz extension self-check
Error calling the Kubernetes API: someerror
Error calling Prometheus from the control plane: someerror
see https://linkerd.io/checks/#l5d-api-control-api for hints
Status check results are ×
```
This change adds the `jaeger.linkerd.io/tracing-enabled` annotation which is
automatically added by the Jaeger extension's `jaeger-injector`.
All pods that receive this annotation have also had the required environment
variables and volume/volume mounts added by the injector.
This annotation allows `jaeger check` to check for the presence of the
annotation instead of needing to look at the proxy containers directly. If the
annotation is not present on pods, `jaeger check` can warn users that tracing
is not configured for those pods. This is similar to `viz check` warning users
that tap is not configured (recently added in #5602).
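A minimal sketch of how such a check could look, assuming client-go access; the function name is illustrative and not the actual `jaeger check` implementation:
```go
package check

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

const tracingEnabledAnnotation = "jaeger.linkerd.io/tracing-enabled"

// podsWithoutTracing lists pods in a namespace that lack the tracing
// annotation so that `jaeger check` can warn about them.
func podsWithoutTracing(ctx context.Context, k8s kubernetes.Interface, ns string) ([]string, error) {
	pods, err := k8s.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, pod := range pods.Items {
		if _, ok := pod.Annotations[tracingEnabledAnnotation]; !ok {
			missing = append(missing, fmt.Sprintf("%s/%s", pod.Namespace, pod.Name))
		}
	}
	return missing, nil
}
```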
Closes #5632
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* viz: add data-plane and prometheus healthchecks
Fixes #5325
This branch adds the remaining healthchecks for the viz extension, i.e.:
- Data-plane metrics check in Prometheus
- `--proxy` mode, which also checks for tap injections based on annotations
For this, the following changes were needed:
- `Category.ID` is made public so that it can be toggled by `--proxy`
- The tap env key is made a field so that it can be re-used for the checks
This also simplifies `viz.NewHealthChecker` by removing the need to pass
category IDs; instead, `hc.appendCategories` is used directly at the caller to
add the required categories. This is made possible by splitting the
`vizCategories` into separate functions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## What this changes
This allows the tap controller to inform `tap` users when pods either have tap
disabled or tap is not enabled yet.
## Why
When a user taps a resource that has not been admitted by the Viz extension's
`tap-injector`, tap is not explicitly disabled but it is also not enabled.
Therefore, the `tap` command hangs and provides no feedback to the user.
Closes #5544
## How
A new `viz.linkerd.io/tap-enabled` annotation is introduced which is
automatically added by the Viz extension's `tap-injector`. This annotation is
added to a pod when it is able to be tapped; this means that the pod and the
pod's namespace do not have the `config.linkerd.io/disable-tap` annotation
added.
When a user attempts to tap a resource, the tap controller now looks for this
new annotation; if the annotation is present on the pod then that pod is
tappable.
If the annotation is not present or tap is explicitly disabled, an error is
returned.
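A rough sketch of the per-pod decision, with illustrative constant and function names (the real tap controller logic may differ):
```go
package tap

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

const (
	tapEnabledAnnotation  = "viz.linkerd.io/tap-enabled"
	tapDisabledAnnotation = "config.linkerd.io/disable-tap"
)

// checkTappable mirrors the behavior described above: explicitly disabled
// pods and not-yet-enabled pods both return an error instead of hanging.
func checkTappable(pod *corev1.Pod) error {
	if pod.Annotations[tapDisabledAnnotation] != "" {
		return fmt.Errorf("pods found with tap disabled via the %s annotation", tapDisabledAnnotation)
	}
	if _, ok := pod.Annotations[tapEnabledAnnotation]; !ok {
		return fmt.Errorf("pods found with tap not enabled; try restarting resource so that it can be injected")
	}
	return nil
}
```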
## UI changes
Multiple errors can now occur when trying to tap a resource:
1. There are no pods for the resource.
2. There are pods for the resource, but tap is disabled via pod or namespace
annotation.
3. There are pods for the resource, but tap is not yet enabled because the
`tap-injector` did not admit the resource.
Errors are now handled as shown below:
Tap is disabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
```
Tap is not enabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap not enabled; try restarting resource so that it can be injected
```
There are a mix of pods with tap disabled or tap not enabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
pods found with tap not enabled; try restarting resource so that it can be injected
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
I ran `bin/update-codegen.sh` to update the generated code to include the opaque ports in the generated deepcopy function for service profiles.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set up the logger in multicluster/cmd/root.go so that it displays messages when --verbose is used
## What this changes
This adds a tap-injector component to the `linkerd-viz` extension which is
responsible for adding the tap service name environment variable to the Linkerd
proxy container.
If a pod does not have a Linkerd proxy, no action is taken. If tap is disabled
via annotation on the pod or the namespace, no action is taken.
This also removes the environment variable that explicitly disabled tap. Tap
status for a proxy is now determined only by the presence or absence of the tap
service name environment variable.
Closes #5326
## How it changes
### tap-injector
The tap-injector component determines if `LINKERD2_PROXY_TAP_SVC_NAME` should be
added to a pod's Linkerd proxy container environment. If the pod satisfies all of
the following, it is added (see the sketch after this list):
- The pod has a Linkerd proxy container
- The pod has not already been mutated
- Tap is not disabled via annotation on the pod or the pod's namespace
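A minimal sketch of those three conditions; container and annotation names here are assumptions for illustration, not the injector's actual identifiers:
```go
package injector

import corev1 "k8s.io/api/core/v1"

const (
	proxyContainerName    = "linkerd-proxy"
	tapSvcEnv             = "LINKERD2_PROXY_TAP_SVC_NAME"
	tapDisabledAnnotation = "config.linkerd.io/disable-tap"
)

// shouldInjectTap checks the three conditions listed above.
func shouldInjectTap(pod *corev1.Pod, nsAnnotations map[string]string) bool {
	proxy := findContainer(pod, proxyContainerName)
	if proxy == nil {
		return false // no Linkerd proxy container
	}
	for _, env := range proxy.Env {
		if env.Name == tapSvcEnv {
			return false // already mutated
		}
	}
	if pod.Annotations[tapDisabledAnnotation] != "" || nsAnnotations[tapDisabledAnnotation] != "" {
		return false // tap explicitly disabled on the pod or its namespace
	}
	return true
}

func findContainer(pod *corev1.Pod, name string) *corev1.Container {
	for i := range pod.Spec.Containers {
		if pod.Spec.Containers[i].Name == name {
			return &pod.Spec.Containers[i]
		}
	}
	return nil
}
```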
### LINKERD2_PROXY_TAP_DISABLED
Now that tap is an extension of Linkerd and not a core component, it no longer
made sense to explicitly enable or disable tap through this Linkerd proxy
environment variable. The status of tap is now determined only by whether the
tap-injector adds the `LINKERD2_PROXY_TAP_SVC_NAME` environment variable.
### controller image
The tap-injector has been added as one of the controller image's startup
commands, which determine what the image does in the cluster.
As a follow-up, I think splitting out the `tap` and `tap-injector` commands from
the controller image into a linkerd-viz image (or something like that) makes
sense.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
The Destination controller can panic due to a nil-deref when
the EndpointSlices API is enabled.
This change updates the controller to properly initialize values
to avoid this segmentation fault.
Fixes #5521
Signed-off-by: Oleg Ozimok <oleg.ozimok@corp.kismia.com>
* Separate observability API
Closes #5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.proto`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
Ignore pods with status.phase=Succeeded when watching IP addresses
When a pod terminates successfully, some CNIs will assign its IP address
to newly created pods. This can lead to duplicate pod IPs in the same
Kubernetes cluster.
Filter out pods which are in a Succeeded phase since they are not
routable anymore.
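As a sketch of the fix, assuming an indexing helper like the one below (names are illustrative):
```go
package watcher

import corev1 "k8s.io/api/core/v1"

// podsByIP indexes routable pods by IP while ignoring pods in the Succeeded
// phase, whose addresses may already have been reassigned by the CNI.
func podsByIP(pods []*corev1.Pod) map[string][]*corev1.Pod {
	index := make(map[string][]*corev1.Pod)
	for _, pod := range pods {
		if pod.Status.Phase == corev1.PodSucceeded {
			continue // terminated successfully; its IP may be reused
		}
		if ip := pod.Status.PodIP; ip != "" {
			index[ip] = append(index[ip], pod)
		}
	}
	return index
}
```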
Fixes #5394
Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
Currently, public-api is part of the core control plane, where the Prometheus
check fails when run before the viz extension is installed.
This change comments out that check. Once the metrics API is moved into
viz, this check can perhaps become part of it, or be part of
`linkerd viz check` directly.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
The destination service now returns the `OpaqueTransport` hint when the annotation
matches the resolved target port. This is different from the current behavior,
which always sets the hint when a proxy is present.
Closes #5421
This happens by changing the endpoint watcher to set a pod's opaque port
annotation in certain cases. If the pod already has an annotation, then its
value is used. If the pod has no annotation, then it checks the namespace that
the endpoint belongs to; if it finds an annotation on the namespace then it
overrides the pod's annotation value with that.
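A small sketch of that fallback; the annotation key shown is the standard `config.linkerd.io/opaque-ports`, while the helper name is illustrative:
```go
package watcher

import corev1 "k8s.io/api/core/v1"

const opaquePortsAnnotation = "config.linkerd.io/opaque-ports"

// effectiveOpaquePorts returns the pod's own annotation if present, otherwise
// falls back to the annotation on the pod's namespace (if any).
func effectiveOpaquePorts(pod *corev1.Pod, ns *corev1.Namespace) string {
	if v, ok := pod.Annotations[opaquePortsAnnotation]; ok {
		return v
	}
	if ns != nil {
		if v, ok := ns.Annotations[opaquePortsAnnotation]; ok {
			return v
		}
	}
	return ""
}
```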
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
## What
When the destination service returns a destination profile for an endpoint,
indicate if the endpoint can receive opaque traffic.
## Why
Closes #5400
## How
When translating a pod address to a destination profile, the destination service
checks if the pod is controlled by any linkerd control plane. If it is, it can
set a protocol hint where we indicate that it supports H2 and opaque traffic.
If the pod supports opaque traffic, we need to get the port that it expects
inbound traffic on. We do this by getting the proxy container and reading its
`LINKERD2_PROXY_INBOUND_LISTEN_ADDR` environment variable. If we successfully
parse that into a port, we can set the opaque transport field in the destination
profile.
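A minimal sketch of that parsing step (the helper name is illustrative):
```go
package destination

import (
	"net"
	"strconv"

	corev1 "k8s.io/api/core/v1"
)

// inboundPortFromProxy reads LINKERD2_PROXY_INBOUND_LISTEN_ADDR
// (e.g. "0.0.0.0:4143") from the proxy container and parses out the port
// used for the OpaqueTransport hint.
func inboundPortFromProxy(proxy corev1.Container) (uint32, error) {
	var addr string
	for _, env := range proxy.Env {
		if env.Name == "LINKERD2_PROXY_INBOUND_LISTEN_ADDR" {
			addr = env.Value
		}
	}
	_, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return 0, err
	}
	port, err := strconv.ParseUint(portStr, 10, 32)
	if err != nil {
		return 0, err
	}
	return uint32(port), nil
}
```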
## Testing
A test has been added to the destination server where a pod has a
`linkerd-proxy` container. We can expect the `OpaqueTransport` field to be set
in the returned destination profile's protocol hint.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
## Summary
This changes the destination service to start indicating whether a profile is an
opaque protocol or not.
Currently, profiles returned by the destination service are built by chaining
together updates from watching Profile and Traffic Split resources.
With this change, we now also watch updates to Opaque Port annotations on pods
and namespaces; if an update occurs this is now included in building a profile
update and is sent to the client.
## Details
Watching updates to Profiles and Traffic Splits is straightforward: we watch
those resources and, if an update occurs on one associated with a service we
care about, the update is passed through.
For Opaque Ports this is a little different because it is an annotation on pods
or namespaces. To account for this, we watch the endpoints that we should care
about.
### When host is a Pod IP
When getting the profile for a Pod IP, we check for the opaque ports annotation
on the pod and the pod's namespace. If one is found and the requested port is
included in the annotation, we indicate that the profile is an opaque protocol.
We do not subscribe for updates to this pod IP. The only update we really care
about is if the pod is deleted and this is already handled by the proxy.
### When host is a Service
When getting the profile for a Service, we subscribe for updates to the
endpoints of that service. For any ports set in the opaque ports annotation on
any of the pods, we check if the requested port is present.
Since the endpoints for a service can be added and removed, we do subscribe for
updates to the endpoints of the service.
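A simplified sketch of the membership check, assuming the annotation value is a plain comma-separated list of ports (range syntax is ignored here for brevity):
```go
package destination

import (
	"strconv"
	"strings"
)

// isOpaquePort reports whether the requested port appears in the opaque
// ports annotation value, treated here as a comma-separated list of ports.
func isOpaquePort(annotation string, port uint32) bool {
	for _, field := range strings.Split(annotation, ",") {
		p, err := strconv.ParseUint(strings.TrimSpace(field), 10, 32)
		if err != nil {
			continue // skip anything that isn't a plain port number
		}
		if uint32(p) == port {
			return true
		}
	}
	return false
}
```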
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes #5385
## The problems
- `linkerd install --ha` isn't honoring flags
- `linkerd upgrade --ha` is overriding existing configs silently or failing with an error
- *Upgrading HA instances from before 2.9 to version 2.9.1 results in configs being overridden silently, or the upgrade fails with an error*
## The cause
The change in #5358 attempted to fix `linkerd install --ha`, which was only applying some of the `values-ha.yaml` defaults, by calling `charts.NewValues(true)` and merging that with the values built from `values.yaml` overridden by the flags. It turns out the `charts.NewValues()` implementation was by itself merging against `values.yaml`, and as a result any flag was getting overridden by its default.
This also happened when doing `linkerd upgrade --ha` on an existing instance, which could silently override settings, or fail loudly, for example when upgrading a setup that has an external issuer (in this case the issuer cert can't be read during upgrade and an error occurs, as described in #5385).
Finally, doing `linkerd upgrade` (no --ha flag) on an HA install from before 2.9 results in configs getting overridden as well (silently or with an error) because, in order to generate the `linkerd-config-overrides` secret, the original install flags are retrieved from `linkerd-config` via the `loadStoredValuesLegacy()` function, which then effectively ends up performing a `linkerd upgrade` with all the flags used for `linkerd install` and falls into the same trap as above.
## The fix
In `values.go` the faulting merging logic is not used anymore, so now `NewValues()` only returns the default values from `values.yaml` and doesn't require an argument anymore. It calls `readDefaults()` which now only returns the appropriate values depending on whether we're on HA or not.
There's a new function `MergeHAValues()` that merges `values-ha.yaml` into the current values (it doesn't look into `values.yaml` anymore), which is only used when processing the `--ha` flag in `options.go`.
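A conceptual sketch of that merge, assuming plain `map[string]interface{}` values decoded from the two YAML files (the real code may rely on a merge library):
```go
package charts

// deepMerge layers the HA overrides (values-ha.yaml) on top of the current
// values, recursing into nested maps, and returns the merged result.
func deepMerge(base, overrides map[string]interface{}) map[string]interface{} {
	for k, v := range overrides {
		if sub, ok := v.(map[string]interface{}); ok {
			if baseSub, ok := base[k].(map[string]interface{}); ok {
				base[k] = deepMerge(baseSub, sub)
				continue
			}
		}
		base[k] = v
	}
	return base
}
```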
## How to test
To replicate the issue, try setting a custom value and check that it's not applied:
```bash
linkerd install --ha --controller-log-level debug | grep log.level
- -log-level=info
```
## Followup
This wasn't caught because we don't have HA integration tests. Now that our test infra is based on k3d, it should be easy to make such a test using a cluster with multiple nodes. Either that, or issuing `linkerd install --ha` with additional configs and comparing against a golden file.
Follow-up to #5282; fixes #5272 in its totality.
This follows the same pattern as the injector/sp-validator webhooks, leveraging `FsCredsWatcher` to watch for changes in the cert files.
To reuse code from the webhooks, we moved `updateCert()` to `creds_watcher.go`, and `run()` as well (which now is called `ProcessEvents()`).
The `TestNewAPIServer` test in `apiserver_test.go` was removed as it really was just testing two things: (1) that `apiServerAuth` doesn't error which is already covered in the following test, and (2) that the golib call `net.Listen("tcp", addr)` doesn't error, which we're not interested in testing here.
## How to test
To test that the injector/sp-validator functionality is still correct, you can refer to #5282
The steps below are similar, but focused towards the tap component:
```bash
# Create some root cert
$ step certificate create linkerd-tap.linkerd.svc ca.crt ca.key --profile root-ca --no-password --insecure
# configure tap's caBundle to be that root cert
$ cat > linkerd-overrides.yml << EOF
tap:
  externalSecret: true
  caBundle: |
    < ca.crt contents>
EOF
# Install linkerd
$ bin/linkerd install --config linkerd-overrides.yml | k apply -f -
# Generate an intermediate cert with a short lifespan
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc
# Create the secret using that intermediate cert
$ kubectl create secret tls \
linkerd-tap-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# Rollout the tap pod for it to pick the new secret
$ k -n linkerd rollout restart deploy/linkerd-tap
# Tap should work
$ bin/linkerd tap -n linkerd deploy/linkerd-web
req id=0:0 proxy=in src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :method=GET :authority=10.42.0.11:9994 :path=/metrics
rsp id=0:0 proxy=in src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true :status=200 latency=1779µs
end id=0:0 proxy=in src=10.42.0.15:33040 dst=10.42.0.11:9994 tls=true duration=65µs response-length=1709B
# Wait 5 minutes and rollout tap again
$ k -n linkerd rollout restart deploy/linkerd-tap
# You'll see in the logs that the cert expired:
$ k -n linkerd logs -f deploy/linkerd-tap tap
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45866: remote error: tls: bad certificate
2020/12/15 16:03:41 http: TLS handshake error from 127.0.0.1:45870: remote error: tls: bad certificate
# Recreate the secret
$ step certificate create linkerd-tap.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-tap.linkerd.svc
$ k -n linkerd delete secret linkerd-tap-k8s-tls
$ kubectl create secret tls \
linkerd-tap-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# Wait a few moments and you'll see the certs got reloaded and tap is working again
time="2020-12-15T16:03:42Z" level=info msg="Updated certificate" addr=":8089" component=apiserver
```
Now that tracing has been split out of the main control plane and into the linkerd-jaeger extension, we remove references to tracing from the main control plane including:
* removing the tracing components from the main control plane chart
* removing the tracing injection logic from the main proxy injector and inject CLI (these will be added back into the new injector in the linkerd-jaeger extension)
* removing tracing related checks (these will be added back into `linkerd jaeger check`)
* removing related tests
We also update the `--control-plane-tracing` flag to configure the control plane components to send traces to the linkerd-jaeger extension. To make sure this works even when the linkerd-jaeger extension is installed in a non-default namespace, we also add a `--control-plane-tracing-namespace` flag which can be used to change the namespace that the control plane components send traces to.
Note that for now, only the control plane components send traces; the proxies in the control plane do not. This is because the linkerd-jaeger injector is not yet available. However, this change adds the appropriate namespace annotations to the control plane namespace to configure the proxies to send traces to the linkerd-jaeger extension once the linkerd-jaeger injector is available.
I tested this by doing the following:
1. bin/linkerd install | kubectl apply -f -
1. bin/helm install jaeger jaeger/charts/jaeger
1. bin/linkerd upgrade --control-plane-tracing=true | kubectl apply -f -
1. kubectl -n linkerd-jaeger port-forward svc/jaeger 16686
1. open http://localhost:16686
1. see traces from the linkerd control plane
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #5257
This branch moves the mc charts and CLI-level code to a new
top-level directory. None of the logic is changed.
It also moves some common types into `/pkg` so that they
are accessible to both the main CLI and extensions.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Have webhooks refresh their certs automatically
Partially fixes #5272
In 2.9 we introduced the ability to provide the certs for `proxy-injector` and `sp-validator` through some external means like cert-manager, via the new helm setting `externalSecret`.
We forgot however to have those services watch changes in their secrets, so whenever they were rotated they would fail with a cert error, with the only workaround being to restart those pods to pick the new secrets.
This addresses that by first abstracting out `FsCredsWatcher` from the identity controller, which now lives under `pkg/tls`.
The webhook's logic in `launcher.go` no longer reads the certs before starting the https server, moving that instead into `server.go` which in a similar way as identity will receive events from `FsCredsWatcher` and update `Server.cert`. We're leveraging `http.Server.TLSConfig.GetCertificate` which allows us to provide a function that will return the current cert for every incoming request.
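A minimal sketch of that pattern, independent of the webhook specifics:
```go
package webhook

import (
	"crypto/tls"
	"sync"
)

// certStore holds the latest certificate: the watcher goroutine calls
// update() when FsCredsWatcher reports new files, and every TLS handshake
// reads the current cert under the lock.
type certStore struct {
	mu   sync.RWMutex
	cert *tls.Certificate
}

func (s *certStore) update(cert *tls.Certificate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cert = cert
}

func (s *certStore) getCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cert, nil
}

// newTLSConfig wires the store into http.Server via TLSConfig.GetCertificate.
func newTLSConfig(s *certStore) *tls.Config {
	return &tls.Config{GetCertificate: s.getCertificate}
}
```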
### How to test
```bash
# Create some root cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca.crt ca.key \
--profile root-ca --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# configure injector's caBundle to be that root cert
$ cat > linkerd-overrides.yaml << EOF
proxyInjector:
  externalSecret: true
  caBundle: |
    < ca.crt contents>
EOF
# Install linkerd. The injector won't start until we create the secret below
$ bin/linkerd install --controller-log-level debug --config linkerd-overrides.yaml | k apply -f -
# Generate an intermediate cert with a short lifespan
step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# Create the secret using that intermediate cert
$ kubectl create secret tls \
linkerd-proxy-injector-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# start following the injector log
$ k -n linkerd logs -f -l linkerd.io/control-plane-component=proxy-injector -c proxy-injector
# Inject emojivoto. The pods should be injected normally
$ bin/linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
# Wait about 5 minutes and delete a pod
$ k -n emojivoto delete po -l app=emoji-svc
# You'll see it won't be injected, and something like "remote error: tls: bad certificate" will appear in the injector logs.
# Regenerate the intermediate cert
$ step certificate create linkerd-proxy-injector.linkerd.svc ca-int.crt ca-int.key --ca ca.crt --ca-key ca.key --profile intermediate-ca --not-after 4m --no-password --insecure --san linkerd-proxy-injector.linkerd.svc
# Delete the secret and recreate it
$ k -n linkerd delete secret linkerd-proxy-injector-k8s-tls
$ kubectl create secret tls \
linkerd-proxy-injector-k8s-tls \
--cert=ca-int.crt \
--key=ca-int.key \
--namespace=linkerd
# Wait a couple of minutes and you'll see some filesystem events in the injector log along with a "Certificate has been updated" entry
# Then delete the pod again and you'll see it gets injected this time
$ k -n emojivoto delete po -l app=emoji-svc
```
The CLI crashes if linkerd-config contains unexpected values.
Add a safe accessor that initializes an empty Global on the first
access. Refactor all accesses to use the newly introduced accessor using
gopls.
Add test for linkerd-config data without Global.
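A sketch of the accessor, with stand-in types in place of the generated linkerd-config protobufs:
```go
package config

// Global and All are stand-ins for the generated linkerd-config types; the
// point of the sketch is the nil-safe lazy initialization that prevents the
// CLI crash on unexpected config data.
type Global struct {
	LinkerdNamespace string
}

type All struct {
	Global *Global
}

// GetGlobal returns the Global section, initializing an empty one on first
// access so callers never dereference nil.
func (a *All) GetGlobal() *Global {
	if a.Global == nil {
		a.Global = &Global{}
	}
	return a.Global
}
```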
Fixes #5215
Co-authored-by: Itai Schwartz <yitai27@gmail.com>
Signed-off-by: Hod Bin Noon <bin.noon.hod@gmail.com>
This adds additional tests for the destination service that assert `GetProfile`
behavior when the path is an IP address.
1. Assert that when the path is a cluster IP, the configured service profile is
returned.
2. Assert that when the path is a pod IP, the endpoint field is populated in the
service profile returned.
3. Assert that when the path is not a cluster or pod IP, the default service
profile is returned.
4. Assert that when the path is a pod IP, the endpoint has a protocol hint if
the pod has the controller annotation, and does not have one otherwise.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Refactor webhook framework to allow webhooks to define their own flags
Pulled out of `launcher.go` the flag parsing logic and moved it into the `Main` methods of the webhooks (under `controller/cmd/proxy-injector/main.go` and `controller/cmd/sp-validator/main.go`), so that individual webhooks can define the flags they want to use.
Also no longer require that webhooks have cluster-wide access.
Finally, renamed the type `webhook.handlerFunc` to `webhook.Handler` so it can be exported. This will be used in the upcoming jaeger webhook.
This fixes an issue where the protocol hint is always set on endpoint responses.
We now check the right value, which determines whether the pod has the required label.
A test for this has been added to #5266.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This release changes error handling to teardown the server-side
connection when an unexpected error is encountered.
Additionally, the outbound TCP routing stack can now skip redundant
service discovery lookups when profile responses include endpoint
information.
Finally, the cache implementation has been updated to reduce latency by
removing unnecessary buffers.
---
* h2: enable HTTP/2 keepalive PING frames (linkerd/linkerd2-proxy#737)
* actions: Add timeouts to GitHub actions (linkerd/linkerd2-proxy#738)
* outbound: Skip endpoint resolution on profile hint (linkerd/linkerd2-proxy#736)
* Add a FromStr for dns::Name (linkerd/linkerd2-proxy#746)
* outbound: Avoid redundant TCP endpoint resolution (linkerd/linkerd2-proxy#742)
* cache: Make the cache cloneable with RwLock (linkerd/linkerd2-proxy#743)
* http: Teardown serverside connections on error (linkerd/linkerd2-proxy#747)
Context: #5209
This updates the destination service to set the `Endpoint` field in `GetProfile`
responses.
The `Endpoint` field is only set if the IP maps to a Pod--not a Service.
Additionally in this scenario, the default Service Profile is used as the base
profile so no other significant fields are set.
### Examples
```
# GetProfile for an IP that maps to a Service
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.43.222.0:9090
INFO[0000] fully_qualified_name:"linkerd-prometheus.linkerd.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"linkerd-prometheus.linkerd.svc.cluster.local.:9090" weight:10000}
```
Before:
```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}}
```
After:
```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524692}} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"fast-1"} metric_labels:{key:"pod" value:"fast-1-5cc87f64bc-9hx7h"} metric_labels:{key:"pod_template_hash" value:"5cc87f64bc"} metric_labels:{key:"serviceaccount" value:"default"} tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This upgrades both the proxy-init image itself, and the go dependency on
proxy-init as a library, which fixes CNI in k3s and any host using
binaries coming from BusyBox, where `nsenter` has an
issue parsing arguments (see rancher/k3s#1434).
Fixes #5143
The availability of Prometheus is useful for some public-api calls that the
check uses. This change updates ListPods in the public-api
to still return the pods even when Prometheus is not configured.
For a test that exclusively checks Prometheus metrics, we have a gate
which checks whether Prometheus is configured and skips the test otherwise.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The proxy no longer honors DESTINATION_GET variables, as profile lookups
inform when endpoint resolution is performed. Also, there is no longer
a router capacity limit.
It appears that Amazon can use the `100.64.0.0/10` network, which is
technically private, for a cluster's Pod network.
Wikipedia describes the network as:
> Shared address space for communications between a service provider
> and its subscribers when using a carrier-grade NAT.
In order to avoid requiring additional configuration on EKS clusters, we
should permit discovery for this network by default.
## Motivations
Closes #5080
## Solution
When the `--all-namespaces` (`-A`) flag is set for the `linkerd edges` command,
ignore the `namespace` value set by default or `-n`.
This is similar to the behavior of `kubectl`: `kubectl get -A -n linkerd pods`
shows pods in all namespaces.
### Behavior changes
With linkerd and emojivoto installed, this results in:
Before:
```
❯ linkerd edges -A pods
No edges found.
```
After:
```
❯ linkerd edges -A pods
SRC DST SRC_NS DST_NS SECURED
vote-bot-6cb9cb9569-wl6w5 web-5d69bcfdb7-mxf8f emojivoto emojivoto √
web-5d69bcfdb7-mxf8f emoji-7dc976587b-rb9c5 emojivoto emojivoto √
web-5d69bcfdb7-mxf8f voting-bdf4f778c-pjkjg emojivoto emojivoto √
linkerd-prometheus-68d6897d75-ghmgm emoji-7dc976587b-rb9c5 linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm vote-bot-6cb9cb9569-wl6w5 linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm voting-bdf4f778c-pjkjg linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm web-5d69bcfdb7-mxf8f linkerd emojivoto √
linkerd-controller-7d965cf78d-qw6xj linkerd-prometheus-68d6897d75-ghmgm linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-controller-7d965cf78d-qw6xj linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-destination-74dbb9c46b-nkxgh linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-grafana-5d9fb67dc6-sn2l8 linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-identity-c875b5d58-b756v linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-proxy-injector-767b55988d-n9r6f linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-sp-validator-6c8df84fb9-4w8kc linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-tap-777fbf7656-p87dm linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-web-546c9444b5-68xpx linkerd linkerd √
```
`linkerd edges -A -n linkerd pods` results in all edges as well (the result
above).
The behavior of `linkerd edges pods` does not change and shows edges in the
`default` namespace.
```
❯ linkerd edges pods
No edges found.
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
The proxy has a default, hardcoded set of ports on which it doesn't do
protocol detection (25, 587, 3306 -- all of which are server-first
protocols). In a recent change, this default set was removed from
the outbound proxy, since there was no way to configure it to anything
other than the default set. I had thought that there was a default set
applied to proxy-init, but this appears to not be the case.
This change adds these ports to the default Helm values to restore the
prior behavior.
I have also elected to include 443 in this set, as it is generally our
recommendation to avoid proxying HTTPS traffic, since the proxy provides
very little value on these connections today.
Additionally, the memcached port 11211 is skipped by default, as clients
do not issue any sort of preamble that is immediately detectable.
These defaults may change in the future, but seem like good choices for
the 2.9 release.
* Expand 'linkerd edges' to work with TCP connections
Fixes #4999
Before:
```
$ bin/linkerd edges po -owide
SRC DST SRC_NS DST_NS CLIENT_ID SERVER_ID SECURED
linkerd-prometheus-764ddd4f88-t6c2j rabbitmq-controller-5c6cf7cc6d-8lxp2 linkerd default √
linkerd-prometheus-764ddd4f88-t6c2j temp linkerd default √
```
After:
```
$ bin/linkerd edges po -owide
SRC DST SRC_NS DST_NS CLIENT_ID SERVER_ID SECURED
temp rabbitmq-controller-5c6cf7cc6d-5fpsc default default default.default default.default √
linkerd-prometheus-66fb97b7fc-vpnxf rabbitmq-controller-5c6cf7cc6d-5fpsc linkerd default √
linkerd-prometheus-66fb97b7fc-vpnxf temp linkerd default √
```
With the latest proxy upgrade to v2.113.0 (#5037), the `tcp_open_total` metric now contains the `client_id` label so that we can replace the http-only metric `response_total` with this one to determine edges for TCP-only connections.
This change basically performs the same query as before, but two times, one for `response_total` and another for `tcp_open_total`. For each resulting entry, the latter is kept if `client_id` is present, otherwise the former is used (if present at all). That way things keep on working for older proxies.
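A schematic of the merge preference, using illustrative types rather than the actual Prometheus query handling:
```go
package metrics

// edgeKey and edgeRow are illustrative types: the tcp_open_total result is
// preferred whenever it carries a client_id label, otherwise the
// response_total row (older proxies) is kept.
type edgeKey struct{ src, dst string }

type edgeRow struct {
	labels map[string]string
}

func mergeEdgeResults(responseTotal, tcpOpenTotal map[edgeKey]edgeRow) map[edgeKey]edgeRow {
	merged := make(map[edgeKey]edgeRow)
	for k, row := range responseTotal {
		merged[k] = row
	}
	for k, row := range tcpOpenTotal {
		if row.labels["client_id"] != "" {
			merged[k] = row
		}
	}
	return merged
}
```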
Disclaimers:
- This doesn't fix #3706: if two sources connect to the same destination there's no way to tell them apart from the metrics perspective and their edges can get mangled. To fix that, the proxy would have to expose `src_resource` labels in the inbound `tcp_open_total` metric.
- Note connections coming from prometheus are still unidentified. The reason is those hit the proxy's admin server (instead of the main container) which doesn't expose metrics.
Since k8s 1.16 cadvisor uses the `container` label instead of
`container_name` in the prometheus metrics it exposes.
The heartbeat queries were using the latter, so they were broken
for k8s version since 1.16.
Note that the `p99-handle-us` value is still missing because the
`request_handle_us` metric is always zero.
This PR updates the injection logic (both CLI and proxy-injector)
to use the `Values` struct instead of the protobuf Config, as part of our move
toward removing the protobuf.
This does not touch any of the flags or install-related code.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Co-authored-by: Alex Leong <alex@buoyant.io>
* Remove dependency of linkerd-config for most control plane components
This PR removes the control plane components' dependency on `linkerd-config`
by passing all of that information through CLI
flags. As most of these components require only a couple of settings, passing
them as flags is more helpful, as updates to the flags trigger a
rollout, unlike a ConfigMap update.
This does not update the proxy-injector, as it needs a lot more data
and mounting `linkerd-config` is better.
## Motivation
Closes #5016
Depends on linkerd/linkerd2-proxy-api#44
## Solution
A `profileTranslator` exists for each service and now has a new
`fullyQualifiedName` field.
This field is used to set the `FullyQualifiedName` field of
`DestinationProfile`s each time an update is sent.
In the case that no service profile exists for a service, a default
`DestinationProfile` is created and we can use the field to set the correct
name.
In the case that a service profile does exist for a service, we still use this
field to set the name to keep it consistent.
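A trimmed sketch of the idea (field and type names abbreviated; the real code builds a protobuf `DestinationProfile`):
```go
package destination

// destinationProfile is a stand-in for the profile sent to clients.
type destinationProfile struct {
	FullyQualifiedName string
	// retry budget, routes, etc. elided
}

// profileTranslator keeps the fully-qualified name captured at construction
// and stamps it onto every update, including the default profile built when
// no ServiceProfile exists.
type profileTranslator struct {
	fullyQualifiedName string
}

func (pt *profileTranslator) defaultProfile() *destinationProfile {
	return &destinationProfile{FullyQualifiedName: pt.fullyQualifiedName}
}
```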
### Example
Install linkerd on a cluster and run the destination server:
```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```
Get the IP of a service. Here, we'll get the IP for `linkerd-identity`:
```
> kubectl get -n linkerd svc/linkerd-identity
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
linkerd-identity ClusterIP 10.43.161.68 <none> 8080/TCP 4h25m
```
Get the profile of `linkerd-identity` from service name or IP and note the
`FullyQualifiedName` field:
```
> go run controller/script/destination-client/main.go -method getProfile -path 10.43.161.68:8080
INFO[0000] fully_qualified_name:"linkerd-identity.linkerd.svc.cluster.local" ..
```
```
> go run controller/script/destination-client/main.go -method getProfile -path linkerd-identity.linkerd.svc.cluster.local
INFO[0000] fully_qualified_name:"linkerd-identity.linkerd.svc.cluster.local" ..
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Fixes #4191 #4993
This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5
This consists of the following changes:
- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
When the service-mirror component can't reach the target's k8s API, the goroutine blocks and it can't be unblocked.
This was happening specifically in the case of the multicluster integration test (still to be pushed), where the source and target clusters are created in quick succession and the target's API service doesn't always have time to be exposed before being requested by the service mirror.
The fix consists of no longer having restartClusterWatcher be side-effecting; instead it returns an error. If that error is not nil, the link watcher is stopped and reset after 10 seconds.
* Push docker images to ghcr.io instead of gcr.io
The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.
Note that besides the changes to cloud_integration.yml and release.yml, there was a change to the upgrade-stable integration test so that we do linkerd upgrade --addon-overwrite to reset the addons settings because in stable-2.8.1 the Grafana image was pegged to gcr.io/linkerd-io/grafana in linkerd-config-addons. This will need to be mentioned in the 2.9 upgrade notes.
Also the egress integration test has a debug container that now is pegged to the edge-20.9.2 tag.
Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
The proxy performs endpoint discovery for unnamed services, but not
service profiles.
The destination controller and proxy have been updated to support
lookups for unnamed services in linkerd/linkerd2#4727 and
linkerd/linkerd2-proxy#626, respectively.
This change modifies the injection template so that the
`proxy.destinationGetNetworks` configuration enables profile
discovery for all networks on which endpoint discovery is permitted.
All of the code for the service mirror controller lives in the `linkerd/linkerd2/controller/cmd` package. It is typical for control plane components to only have a `main.go` entrypoint in the cmd package. This can sometimes make it hard to find the service mirror code since I wouldn't expect it to be in the cmd package.
We move the majority of the code to a dedicated controller package, leaving only main.go in the cmd package. This is purely organizational; no behavior change is expected.
Signed-off-by: Alex Leong <alex@buoyant.io>
## What/How
@adleong pointed out in #4780 that when enabling slices during an upgrade, the new value does not persist in the `linkerd-config` ConfigMap. I took a closer look and it seems that we were never overwriting the values in case they were different.
* To fix this, I added an if block when validating and building the upgrade options -- if the current flag value differs from what we have in the ConfigMap, then change the ConfigMap value.
* When doing so, I made sure to check that if the cluster does not support `EndpointSlices` yet the flag is set to true, we will error out. This is done similarly (copy&paste similarly) to what's in the install part.
* Additionally, I have noticed that the helm ConfigMap template stored the flag value under `enableEndpointSlices` field name. I assume this was not changed in the initial PR to reflect the changes made in the protocol buffer. The API (and thus the CLI) uses the field name `endpointSliceEnabled` instead. I have changed the config template so that helm installations will use the same field, which can then be used in the destination service or other components that may implement slice support in the future.
Signed-off-by: Matei David <matei.david.35@gmail.com>
## Motivation
#4879
## Solution
When no traffic split exists for services, return a single destination override
with a weight of 100%.
Using the destination client on a new linkerd installation, this results in the
following output for `linkerd-identity` service:
```
❯ go run controller/script/destination-client/main.go -method getProfile -path linkerd-identity.linkerd.svc.cluster.local:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"linkerd-identity.linkerd.svc.cluster.local.:8080" weight:100000}
INFO[0000]
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
[Link to RFC](https://github.com/linkerd/rfc/pull/23)
### What
---
* PR that puts together all past pieces of the puzzle to deliver topology-aware service routing, as specified in the [Kubernetes docs](https://kubernetes.io/docs/concepts/services-networking/service-topology/) but with a much better load balancing algorithm and all the coolness of linkerd :)
* The first piece of this PR is focused on adding topology metadata: topology preference for services and topology `<k,v>` pairs for endpoints.
* The second piece of this PR puts together the new context format and fetching the source node topology metadata in order to allow for endpoints filtering.
* The final part is doing the filtering -- passing all of the metadata to the listener and on every `Add` filtering endpoints based on the topology preference of the service, topology `<k,v>` pairs of endpoints and topology of the source (again `<k,v>` pairs).
### How
---
* **Collecting metadata**:
- Services do not have values for topology keys -- the topological keys defined in a service's spec are only there to dictate locality preference for routing; as such, I decided to store them in an array. They will be taken exactly as they are found in the service spec, which ensures we respect the preference order.
- For EndpointSlices, we are using a map -- an EndpointSlice has locality information in the form of `<k,v>` pair, where the key is a topological key (similar to what's listed in the service) and the value is the locality information -- e.g `hostname: minikube`. For each address we now have a map of topology values which gets populated when we translate the endpoints to an address set. Because normal Endpoints do not have any topology information, we create each address with an empty map which is subsequently populated ONLY for slices in the `endpointSliceToAddressSet` function.
* **Filtering endpoints**:
- This was a tricky part and filled me with doubts. I think there are a few ways to do this, but this is how I "envisioned" it. First, the `endpoint_translator.go` should be the one to do the filtering; this means that on subscription, we need to feed all of the relevant metadata to the listener. To do this, I created a new function `AddTopologyFilter` as part of the listener interface.
- To complement the `AddTopologyFilter` function, I created a new `TopologyFilter` struct in `endpoints_watcher.go`. I then embedded this structure in all listeners that implement the interface. The structure holds the source topology (source node), a boolean to tell if slices are activated in case we need to double check (or write tests for the function) and the service preference. We create the filter on Subscription -- we have access to the k8s client here as well as the service, so it's the best point to collect all of this data together. Addresses all have their own topology added to them so they do not have to be collected by the filter.
- When we add a new set of addresses, we check to see if slices are enabled -- chances are if slices are enabled, service topology might be too. This lets us skip this step if the latest version is not adopted. Prior to sending an `Add` we filter the endpoints -- if the preference is registered by the filter we strictly enforce it, otherwise nothing changes. A minimal sketch of that filtering is shown below.
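A minimal sketch of that preference filtering, with illustrative types (the real filter also carries the slices-enabled flag and lives on the listener):
```go
package watcher

// address carries the topology <k,v> pairs collected for each endpoint.
type address struct {
	topology map[string]string
}

// filterByTopology walks the service's topology keys in preference order and
// keeps only the addresses whose value for a key matches the source node's
// value; the first key yielding a non-empty set wins, and "*" matches all.
func filterByTopology(addrs []address, preference []string, source map[string]string) []address {
	if len(preference) == 0 {
		return addrs // no preference registered: nothing changes
	}
	for _, key := range preference {
		if key == "*" {
			return addrs
		}
		want, ok := source[key]
		if !ok {
			continue
		}
		var matched []address
		for _, a := range addrs {
			if a.topology[key] == want {
				matched = append(matched, a)
			}
		}
		if len(matched) > 0 {
			return matched
		}
	}
	return nil // preference set but nothing matched: strictly enforced
}
```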
And that's pretty much it.
Signed-off-by: Matei David <matei.david.35@gmail.com>