Commit Graph

29 Commits

Author SHA1 Message Date
Zahari Dichev 3365455e45
Fix mc labels (#4560)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-06-05 19:36:09 +03:00
Zahari Dichev 4176580a0f
Threadsafe buffering listener (#4359)
* Add thread safety to watcher tests

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 20:45:41 +03:00
Zahari Dichev ef1a2c2b10
Multicluster dashboard for traffic metrics (#4178)
This change adds labels to endpoints that target remote services. It also adds a Grafana dashboard that can be used to monitor multicluster traffic.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-05-14 17:48:27 +03:00
Alex Leong 8fbaa3ef9b
Don't send NoEndpoints during pod updates for ip watches (#4338)
When the proxy has an IP watch on a pod and the destination controller gets a pod update event, the destination controller sends a NoEndpoints message to all listeners followed by an Add with the new pod state.  This can result in the proxy's load balancer being briefly empty and could result in failing requests in the period.  

Since consecutive Add events with the same address will override each other, we can simply send the Adds without needing to clear the previous state with a NoEndpoints message.
2020-05-07 16:10:17 -07:00
Zahari Dichev 26c14d3c66
Detect changes in addresses when getting updates in endpoints watcher (#4104)
Detect changes in addresses when getting updates in endpoints watcher

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-04-10 11:42:39 +03:00
Alex Leong d8eebee4f7
Upgrade to client-go 0.17.4 and smi-sdk-go 0.3.0 (#4221)
Here we upgrade our dependencies on client-go to 0.17.4 and smi-sdk-go to 0.3.0.  Since smi-sdk-go uses client-go 0.17.4, these upgrades must be performed simultaneously.

This also requires simultaneously upgrading our dependency on linkerd/stern to a SHA which also uses client-go 0.17.4.  This keeps all of our transitive dependencies synchronized on one version of client-go.

This ALSO requires updating our codegen scripts to use the 0.17.4 version of code-generator and running it to generate 0.17.4 compatible generated code.  I took this opportunity to update our code generation script to properly use the version of code-generater from `go.mod` rather than a hardcoded SHA.

Signed-off-by: Alex Leong <alex@buoyant.io>
2020-04-01 10:07:23 -07:00
Zahari Dichev 10ecd8889e
Set auth override (#4160)
Set AuthOverride when present on endpoints annotation

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-25 10:56:36 +02:00
Zahari Dichev 2db307ee91
Remove target port requirement in port resolution (#4174)
This change removes the target port requirement when resolving ports in the dst service. Based on the comments, it seems that we need to have a target port defined in the port spec in order to resolve to the port in the Endpoints. In reality if target port is note defined when creating the service, k8s will set the port and the target port to the same value. Seems to me that checking for the targetPort to be different than 0, is a no-op.

Signed-off-by: Zahari Dichev zaharidichev@gmail.com
2020-03-16 23:04:08 +02:00
Zahari Dichev caf4e61daf
Enable identitiy on endpoints not associated with pods (#4134)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-09 20:55:57 +02:00
Zahari Dichev edd7fd203d
Service Mirroring Component (#4028)
This PR introduces a service mirroring component that is responsible for watching remote clusters and mirroring their services locally.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-02 21:16:08 +02:00
Zahari Dichev 6fa9407318
Ensure we get the correct type out of Informer Deletion events (#4034)
Ensure we get what we expect when receiving DELETE events from the k8s Informer api

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-02-15 10:15:24 +02:00
Alejandro Pedraza afb93cddc8
Use `t.Name()` instead of `t.Name` in tests (#3970)
Use `t.Name()` instead of `t.Name` when retrieving the name of tests.
This was causing an error to be added in the log:
```
output: logrus_error="can not add field \"test\"
```

Followup to
[comment](https://github.com/linkerd/linkerd2/pull/3965#discussion_r370387990)
2020-01-27 09:17:19 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating a `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip.   `IPWatcher` maintains a mapping up clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Alex Leong 3dcff52b9f
Switch from using golangci fmt to using goimports (#3555)
CI currently enforcing formatting rules by using the fmt linter of golang-ci-lint which is invoked from the bin/lint script.  However it doesn't seem possible to use golang-ci-lint as a formatter, only as a linter which checks formatting.  This means any formatter used by your IDE or invoked manually may or may not use the same formatting rules as golang-ci-lint depending on which formatter you use and which specific revision of that formatter you use.  

In this change we stop using golang-ci-lint for format checking.  We introduce `tools.go` and add goimports to the `go.mod` and `go.sum` files.  This allows everyone to easily get the same revision of goimports by running `go install -mod=readonly golang.org/x/tools/cmd/goimports` from inside of the project.  We add a step in the CI workflow that uses goimports via the `bin/fmt` script to check formatting.

Some shell gymnastics were required in the `bin/fmt` script to work around some limitations of `goimports`:
* goimports does not have a built-in mechanism for excluding directories, and we need to exclude the vendor director as well as the generated Go sources
* goimports returns a 0 exit code, even when formatting errors are detected

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-10-16 13:56:11 -07:00
Johannes Hansen f880e71fcd The linkerd proxy does not work with headless services (#3470)
* The linkerd proxy does not work with headless services (i.e. endpoints not referencing a pod).

Changed endpoints_watcher to also return endpoints with no targetref.

Fixes #3308

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>

* Fix panic in endpoint_translator

Signed-off-by: Johannes Hansen <johannesh1980@gmail.com>
2019-10-15 14:56:41 -07:00
陈谭军 e281fb3410 fix-up grammar (#3351)
Signed-off-by: chentanjun <2799194073@qq.com>
2019-08-30 08:09:36 -07:00
Alejandro Pedraza fd248d3755
Undo refactoring from #3316 (#3331)
Thus fixing `linkerd edges` and the dashboard topology graph

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 13:37:54 -05:00
Alejandro Pedraza 5d7499dc84
Avoid the dashboard requesting stats when not needed (#3338)
* Avoid the dashboard requesting stats when not needed

Create an alternative to `urlsForResource` called
`urlsForResourceNoStats` that makes use of the `skip_stats` parameter in
the stats API (created in #1871) that doesn't query Prometheus when not needed.

When testing using the dashboard looking at the linkerd namespace,
queries per second went down from 2874 to 2756, a 4% decrease.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 05:52:44 -05:00
Alejandro Pedraza 02efb46e45
Have the proxy-injector emit events upon injection/skipping injection (#3316)
* Have the proxy-injector emit events upon injection/skipping injection

Fixes #3253

Have the proxy-injector emit an event whenever a injection happens, or
when injection is skipped for some reason (also added that reason into
the proxy-injector logs). The level is associated to the parent workload
(it can't be associated to the pod because at this point the pod hasn't
been persisted).

The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.

Related changes:

- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-26 13:34:36 -05:00
Guangming Wang 70d85d2065 Cleanup: fix some typos in code comment (#3296)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-21 09:40:43 -07:00
Alex Leong ab7226cbcd
Return invalid argument for external name services (#3120)
Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498

When the Linkerd proxy sends a query for a Kubernetes external name service to the destination service, the destination service returns `NoEndpoints: exists=false` because an external name service has no endpoints resource.  Due to a change in the proxy's fallback logic, this no longer causes the proxy to fallback to either DNS or SO_ORIG_DST and instead fails the request.  The net effect is that Linkerd fails all requests to external name services.

We change the destination service to instead return `InvalidArgument` for external name services.  This causes the proxy to fallback to SO_ORIG_DST instead of failing the request.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-29 16:31:22 -07:00
Andrew Seigner 51b33ad53c
Fix nil pointer dereference in endpoints watcher (#3147)
The destination service's endpoints watcher assumed every `Endpoints`
object contained a `TargetRef`. This field is optional, and in cases
such as the default `ep/kubernetes` object, `TargetRef` is nil, causing
a nil pointer dereference.

Fix endpoints watcher to check for `TargetRef` prior to dereferencing.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-25 17:11:56 -07:00
Alex Leong e538a05ce2
Add support for stateful sets (#3113)
We add support for looking up individual pods in a stateful set with the destination service.  This allows Linkerd to correctly proxy requests which address individual pods.  The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.

Fixes #2266 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 14:09:46 -07:00
Alex Leong d6ef9ea460
Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078)
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller.  Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.

To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2.  This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource.  If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can.  This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.  

Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation).  In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-23 11:46:31 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong 92ddffa3c2
Add prometheus metrics for watchers (#3022)
To give better visibility into the inner workings of the kubernetes watchers in the destination service, we add some prometheus metrics.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-08 11:50:26 -07:00
Alex Leong 27373a8b78
Add traffic splitting to destination profiles (#2931)
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).

We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to.  A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-28 13:19:47 -07:00
Alejandro Pedraza 8988a5723f
Have `GetOwnerKindAndName` be able to skip the cache (#2972)
* Have `GetOwnerKindAndName` be able to skip the cache

Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset got just
created and might not be in ready in the cache yet.

Fixes #2738

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-20 12:58:15 -05:00
Alex Leong 06a69f69c5
Refactor destination service (#2786)
This is a major refactor of the destination service.  The goals of this refactor are to simplify the code for improved maintainability.  In particular:

* Remove the "resolver" interfaces.  These were a holdover from when our decision tree was more complex about how to handle different kinds of authorities.  The current implementation only accepts fully qualified kubernetes service names and thus this was an unnecessary level of indirection.
* Moved the endpoints and profile watchers into their own package for a more clear separation of concerns.  These watchers deal only in Kubernetes primitives and are agnostic to how they are used.  This allows a cleaner layering when we use them from our gRPC service.
* Renamed the "listener" types to "translator" to make it more clear that the function of these structs is to translate kubernetes updates from the watcher to gRPC messages.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-04 15:01:16 -07:00