Fixes#4707
In order to remove a multicluster link, we add a `linkerd multicluster unlink` command which produces the yaml necessary to delete all of the resources associated with a `linkerd multicluster link`. These are:
* the link resource
* the service mirror controller deployment
* the service mirror controller's RBAC
* the probe gateway mirror for this link
* all mirror services for this link
This command follows the same pattern as the `linkerd uninstall` command in that its output is expected to be piped to `kubectl delete`. The typical usage of this command is:
```
linkerd --context=source multicluster unlink --cluster-name=foo | kubectl --context=source delete -f -
```
This change also fixes the shutdown lifecycle of the service mirror controller by properly having it listen for the shutdown signal and exit its main loop.
A few alternative designs were considered:
I investigated using owner references as suggested [here](https://github.com/linkerd/linkerd2/issues/4707#issuecomment-653494591) but it turns out that owner references must refer to resources in the same namespace (or to cluster scoped resources). This was not feasible here because a service mirror controller can create mirror services in many different namespaces.
I also considered having the service mirror controller delete the mirror services that it created during its own shutdown. However, this could lead to scenarios where the controller is killed before it finishes deleting the services that it created. It seemed more reliable to have all the deletions happen from `kubectl delete`. Since this is the case, we avoid having the service mirror controller delete mirror services, even when the link is deleted, to avoid the race condition where the controller and CLI both attempt to delete the same mirror services and one of them fails with a potentially alarming error message.
Signed-off-by: Alex Leong <alex@buoyant.io>
This PR removes the service mirror controller from `linkerd mc install` to `linkerd mc link`, as described in https://github.com/linkerd/rfc/pull/31. For fuller context, please see that RFC.
Basic multicluster functionality works here including:
* `linkerd mc install` installs the Link CRD but not any service mirror controllers
* `linkerd mc link` creates a Link resource and installs a service mirror controller which uses that Link
* The service mirror controller creates and manages mirror services, a gateway mirror, and their endpoints.
* The `linkerd mc gateways` command lists all linked target clusters, their liveliness, and probe latences.
* The `linkerd check` multicluster checks have been updated for the new architecture. Several checks have been rendered obsolete by the new architecture and have been removed.
The following are known issues requiring further work:
* the service mirror controller uses the existing `mirror.linkerd.io/gateway-name` and `mirror.linkerd.io/gateway-ns` annotations to select which services to mirror. it does not yet support configuring a label selector.
* an unlink command is needed for removing multicluster links: see https://github.com/linkerd/linkerd2/issues/4707
* an mc uninstall command is needed for uninstalling the multicluster addon: see https://github.com/linkerd/linkerd2/issues/4708
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#4582
When a target cluster gateway is exposed as a hostname rather than with a fixed IP address, the service mirror controller fails to create mirror services and gateway mirrors for that gateway. This is because we only look at the IP field of the gateway service.
We make two changes to address this problem:
First, when extracting the gateway spec from a gateway that has a hostname instead of an IP address, we do a DNS lookup to resolve that hostname into an IP address to use in the mirror service endpoints and gateway mirror endpoints.
Second, we schedule a repair job on a regular (1 minute) to update these endpoint objects. This has the effect of re-resolving the DNS names every minute to pick up any changes in DNS resolution.
Signed-off-by: Alex Leong <alex@buoyant.io>
There are a few notable things happening in this PR:
- the probe manager has been decoupled from the cluster_watcher. Now its only responsibility is to watch for mirrored gateways beeing created and to probe them. This means that probes are initiated for all gateways no matter whether there are mirrored services being paired
- the number of paired services is derived from the existing services in the cluster rather than being published as a metric by the prober
- there are no events being exchanged between the cluster watcher and the probe manager
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This change creates a gateway proxy for every gateway. This enables the probe worker to leverage the destination service functionality in order to discover the identity of the gateway.
Fix#4411
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR introduces a few changes that were requested after a bit of service mirror reviewing.
- we restrict the RBACs so the service mirror controller cannot read secrets in all namespaces but only in the one that it is installed in
- we unify the namespace namings so all multicluster resources are installedi n `linkerd-multicluster` on both clusters
- fixed checks to account for changes
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>