Commit Graph

33 Commits

Author SHA1 Message Date
Alejandro Pedraza 5d7499dc84
Avoid the dashboard requesting stats when not needed (#3338)
* Avoid the dashboard requesting stats when not needed

Create an alternative to `urlsForResource` called
`urlsForResourceNoStats` that makes use of the `skip_stats` parameter in
the stats API (created in #1871) that doesn't query Prometheus when not needed.

When testing using the dashboard looking at the linkerd namespace,
queries per second went down from 2874 to 2756, a 4% decrease.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-29 05:52:44 -05:00
arminbuerkle 5c38f38a02 Allow custom cluster domains in remaining backends (#3278)
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetchting cluster domain errors separately
* Use custom cluster domain for traffic split adaptor

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-08-27 10:01:36 -07:00
Alejandro Pedraza 02efb46e45
Have the proxy-injector emit events upon injection/skipping injection (#3316)
* Have the proxy-injector emit events upon injection/skipping injection

Fixes #3253

Have the proxy-injector emit an event whenever a injection happens, or
when injection is skipped for some reason (also added that reason into
the proxy-injector logs). The level is associated to the parent workload
(it can't be associated to the pod because at this point the pod hasn't
been persisted).

The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.

Related changes:

- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-08-26 13:34:36 -05:00
Guangming Wang 70d85d2065 Cleanup: fix some typos in code comment (#3296)
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-21 09:40:43 -07:00
Oliver Gould ee79d5d324
destination: Reorganize authority-parsing (#3244)
In preparation for #3242, the destination controller will need to
support a broader set of valid authorities including IP addresses.

This change modifies the destination controller's authority-parsing code
so that the is-this-a-kubernete-service-name decision is decoupled from
parsing of authorities into their consituent parts.

The `Get` API now explicitly handles IP address names, though it
currently fails all such resolutions.
2019-08-21 07:19:42 -07:00
Andrew Seigner f98bc27a38
Fix invalid `l5d-require-id` for some tap requests (#3210)
PR #3154 introduced an `l5d-require-id` header to Tap requests. That
header string was constructed based on the TapByResourceRequest, which
includes 3 notable fields (type, name, namespace). For namespace-level
requests (via commands like `linkerd tap ns linkerd`), type ==
`namespace`, name == `linkerd`, and namespace == "". This special casing
for namespace-level requests yielded invalid `l5d-require-id` headers,
for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`.

Fix `l5d-require-id` string generation to account for namespace-level
requests. The bulk of this change is tap unit test updates to validate
the fix.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-08-08 09:42:11 -07:00
Alex Leong ab7226cbcd
Return invalid argument for external name services (#3120)
Fixes https://github.com/linkerd/linkerd2/issues/2800#issuecomment-513740498

When the Linkerd proxy sends a query for a Kubernetes external name service to the destination service, the destination service returns `NoEndpoints: exists=false` because an external name service has no endpoints resource.  Due to a change in the proxy's fallback logic, this no longer causes the proxy to fallback to either DNS or SO_ORIG_DST and instead fails the request.  The net effect is that Linkerd fails all requests to external name services.

We change the destination service to instead return `InvalidArgument` for external name services.  This causes the proxy to fallback to SO_ORIG_DST instead of failing the request.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-29 16:31:22 -07:00
Andrew Seigner 51b33ad53c
Fix nil pointer dereference in endpoints watcher (#3147)
The destination service's endpoints watcher assumed every `Endpoints`
object contained a `TargetRef`. This field is optional, and in cases
such as the default `ep/kubernetes` object, `TargetRef` is nil, causing
a nil pointer dereference.

Fix endpoints watcher to check for `TargetRef` prior to dereferencing.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-07-25 17:11:56 -07:00
Alex Leong 3c4a0e4381
Make authorities in destination overrides absolute (#3137)
Fixes #3136 

When the destination service sends a destination profile with a traffic split to the proxy, the override destination authorities are absolute but do no contain a trailing dot.  e.g. "bar.ns.svc.cluster.local:80".  However, NameAddrs which have undergone canonicalization in the proxy will include the trailing dot.  When a traffic split includes the apex service as one of the overrides, the original apex NameAddr will have the trailing dot and the override will not.  Since these two NameAddrs are not identical, they will go into two distinct slots in the proxy's concrete dst router.  This will cause two services to be created for the same destination which will cause the stats clobbering described in the linked issue.

We change the destination service to always return absolute dst overrides including the trailing dot.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 17:08:40 -07:00
Alex Leong e538a05ce2
Add support for stateful sets (#3113)
We add support for looking up individual pods in a stateful set with the destination service.  This allows Linkerd to correctly proxy requests which address individual pods.  The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.

Fixes #2266 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 14:09:46 -07:00
Alex Leong d6ef9ea460
Update ServiceProfile CRD to version v1alpha2 and remove validation (#3078)
The openAPIV3Schema validation in the ServiceProfiles CRD is very limited in what it can validate and is obviated by more sophisticated validation done by the validating admission controller.  Therefore, we would like to remove the openAPIV3Schema validation to reduce the size and complexity of the CRD object.

To do so, we must also bump the version of the ServiceProfile custom resource from v1alpha1 to v1alpha2.  This ensures that when the controller is upgraded, it will attempt to watch the v1alpha2 resource.  If it cannot (because, for example, the controller pod started before the ServiceProfile CRD was updated and therefore the v1alpha2 version does not exist) then it will go into a crash loop backoff until it can.  This essentially means that the controller will wait for the CRD to be upgraded to include v1alpha2 before it will start.  

Bumping the version is necessary because if we did not, it would be possible for the controller to start before the CRD is updated (removing the validation).  In this case, when the CRD is edited, the controller will lose its list watch on ServiceProfiles and will stop getting updates.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-23 11:46:31 -07:00
arminbuerkle 010efac24b Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong 92ddffa3c2
Add prometheus metrics for watchers (#3022)
To give better visibility into the inner workings of the kubernetes watchers in the destination service, we add some prometheus metrics.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-08 11:50:26 -07:00
Alejandro Pedraza 53e589890d
Have `linkerd endpoints` use `Destination.Get` (#2990)
* Have `linkerd endpoints` use `Destination.Get`

Fixes #2885

We're refactoring `linkerd endpoints` so it hits
directly the `Destination.Get` endpoint, instead of relying on the
Discovery service.

For that, I've created a new `client.go` for Destination and added it to
the `APIClient` interface.

I've also added a `destinationClient` struct that mimics `tapClient`,
and whose common logic has been moved into `stream_client.go`.

Analogously, I added a `destinationServer` struct that mimics
`tapServer`.

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-07-03 09:11:03 -05:00
Alex Leong 27373a8b78
Add traffic splitting to destination profiles (#2931)
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).

We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to.  A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-28 13:19:47 -07:00
Alejandro Pedraza 8988a5723f
Have `GetOwnerKindAndName` be able to skip the cache (#2972)
* Have `GetOwnerKindAndName` be able to skip the cache

Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset got just
created and might not be in ready in the cache yet.

Fixes #2738

Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
2019-06-20 12:58:15 -05:00
Alex Leong 06a69f69c5
Refactor destination service (#2786)
This is a major refactor of the destination service.  The goals of this refactor are to simplify the code for improved maintainability.  In particular:

* Remove the "resolver" interfaces.  These were a holdover from when our decision tree was more complex about how to handle different kinds of authorities.  The current implementation only accepts fully qualified kubernetes service names and thus this was an unnecessary level of indirection.
* Moved the endpoints and profile watchers into their own package for a more clear separation of concerns.  These watchers deal only in Kubernetes primitives and are agnostic to how they are used.  This allows a cleaner layering when we use them from our gRPC service.
* Renamed the "listener" types to "translator" to make it more clear that the function of these structs is to translate kubernetes updates from the watcher to gRPC messages.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-04 15:01:16 -07:00
Oliver Gould da0330743f
Provide peer Identities via the Destination API (#2537)
This change reintroduces identity hinting to the destination service.
The Get endpoint includes identities for pods that are injected with an
identity-mode of "default" and have the same linkerd control plane.

A `serviceaccount` label is now also added to destination response
metadata so that it's accessible in prometheus and tap.
2019-03-22 09:19:14 -07:00
Oliver Gould 91c5f07650
proxy: Upgrade to identity-capable proxy (#2524)
The new proxy has changed its configuration as follows:

- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
  just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.

Identity injection is **NOT** configured by this branch.
2019-03-19 14:20:39 -07:00
Oliver Gould 790c13b3b2
Introduce the Identity controller implementation (#2521)
This change introduces a new Identity service implementation for the
`io.linkerd.proxy.identity.Identity` gRPC service.

The `pkg/identity` contains a core, abstract implementation of the service
(generic over both the CA and (Kubernetes) Validator interfaces).

`controller/identity` includes a concrete implementation that uses the
Kubernetes TokenReview API to validate serviceaccount tokens when
issuing certificates.

This change does **NOT** alter installation or runtime to include the
identity service. This will be included in a follow-up.
2019-03-19 13:58:45 -07:00
Oliver Gould 81f645da66
Remove `--tls=optional` and `linkerd-ca` (#2515)
The proxy's TLS implementation has changed to use a new _Identity_ controller.

In preparation for this, the `--tls=optional` CLI flag has been removed
from install and inject; and the `ca` controller has been deleted. Metrics
and UI treatments for TLS have **not** been removed, as they will continue to
be valuable for the new Identity system.

With the removal of the old identity scheme, the Destination service's proxy
ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable
locality awareness.
2019-03-18 17:40:31 -07:00
Andrew Seigner e5d2460792
Remove single namespace functionality (#2474)
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.

This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.

A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
  combining some check categories, as we can always assume cluster-wide
  requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
  make a decision based on cluster-wide vs. namespace-wide access.
  Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.

Reverts #1721
Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-12 00:17:22 -07:00
Andrew Seigner 8da2cd3fd4
Require cluster-wide k8s API access (#2428)
linkerd/linkerd2#2349 removed the `--single-namespace` flag, in favor of
runtime detection of cluster vs. namespace access, and also
ServiceProfile availability. This maintained control-plane support for
running in these two states.

This change requires control-plane components have cluster-wide
Kubernetes API access and ServiceProfile availability, and will error
out if not. Once #2349 merges, stage 1 install will be a requirement for
a successful stage 2 install.

Part of #2337

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-03-07 10:23:18 -08:00
Oliver Gould ab90263461
destination: Only return TLS identities when appropriate (#2371)
As described in #2217, the controller returns TLS identities for results even
when the destination pod may not be able to participate in identity
requester: specifically, the other pod may not have the same controller
namespace or it may not be injected with identity.

This change introduces a new annotation, linkerd.io/identity-mode that is set
when injecting pods (via both CLI and webhook). This annotation is always
added.

The destination service now only returns TLS identities when this annotation
is set to optional on a pod and the destination pod uses the same controller.
These semantics are expected to change before the 2.3 release.

Fixes #2217
2019-02-27 12:18:39 -08:00
Andrew Seigner 9f748d2d2e
lint: Enable unparam (#2369)
unparam reports unused function parameters:
https://github.com/mvdan/unparam

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-27 10:34:02 -08:00
Oliver Gould c3f9ff8e32
Consolidate endpointListener.Update with logging (#2389)
Previously, the update-handling logic was spread across several very
small functions that were only called within this file. I've
consolidated this logic into endpointListener.Update so that all of the
debug logging can be instrumented in one place without having to iterate
over lists multiple times.

Also, I've fixed the formatting of IP addresses in some places.

Logs now look as follows:

    msg="Establishing watch on endpoint linkerd-prometheus.linkerd:9090" component=endpoints-watcher
    msg="Subscribing linkerd-prometheus.linkerd:9090 exists=true" component=service-port id=linkerd-prometheus.linkerd target-port=admin-http
    msg="Update: add=1; remove=0" component=endpoint-listener namespace=linkerd service=linkerd-prometheus
    msg="Update: add: addr=10.1.1.160; pod=linkerd-prometheus-7bbc899687-nd9zt; addr:<ip:<ipv4:167838112 > port:9090 > weight:1 metric_labels:<key:\"control_plane_ns\" value:\"linkerd\" > metric_labels:<key:\"deployment\" value:\"linkerd-prometheus\" > metric_labels:<key:\"pod\" value:\"linkerd-prometheus-7bbc899687-nd9zt\" > metric_labels:<key:\"pod_template_hash\" value:\"7bbc899687\" > protocol_hint:<h2:<> > " component=endpoint-listener namespace=linkerd service=linkerd-prometheus
2019-02-26 15:05:23 -08:00
Andrew Seigner ec5a0ca8d9
Authorization-aware control-plane components (#2349)
The control-plane components relied on a `--single-namespace` param,
passed from `linkerd install` into each individual component, to
determine which namespaces they were authorized to access, and whether
to support ServiceProfiles. This command-line flag was redundant given
the authorization rules encoded in the parent `linkerd install` output,
via [Cluster]Role[Binding]s.

Modify the control-plane components to query Kubernetes at startup to
determine which namespaces they are authorized to access, and whether
ServiceProfile support is available. This allows removal of the
`--single-namespace` flag on the components.

Also update `bin/test-cleanup` to cleanup the ServiceProfile CRD.

TODO:
- Remove `--single-namespace` flag on `linkerd install`, part of #2164

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-26 11:54:52 -08:00
Andrew Seigner 25e462352d
lint: Enable goimports (#2366)
goimports checks import lines, adding missing ones and removing
unreferenced ones:
https://godoc.org/golang.org/x/tools/cmd/goimports

It also requires named imports for packages whose
import paths don't match their package names:
- https://github.com/golang/go/issues/28428
- https://go-review.googlesource.com/c/tools/+/145699/

Also standardized named imports of common Kubernetes packaages.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 15:51:10 -08:00
Andrew Seigner 35a0b652f2
lint: Enable goconst (#2365)
goconst finds repeated strings that could be replaced by a constant:
https://github.com/jgautheron/goconst

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-25 12:00:03 -08:00
Oliver Gould f7435800da
lint: Enable scopelint (#2364)
[scopelint][scopelint] detects a nasty reference-scoping issue in loops.

[scopelint]: https://github.com/kyoh86/scopelint
2019-02-24 08:59:51 -08:00
Andrew Seigner cc3ff70f29
Enable `unused` linter (#2357)
`unused` checks Go code for unused constants, variables, functions, and
types.

Part of #217

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2019-02-23 11:05:39 -08:00
Kevin Lingerfelt 5384ca8c97
Add discovery package for managing discovery API (#2317)
* Add discovery package for managing discovery API
* Fix typo in destination server comment

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-18 16:38:04 -08:00
Oliver Gould 71ce786dd3
Rename linkerd-proxy-api to linkerd-destination (#2281)
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).

With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).

Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2019-02-15 15:11:04 -08:00