## What this changes
This change allows the tap controller to inform `tap` users when a pod either
has tap disabled or does not yet have tap enabled.
## Why
When a user taps a resource that has not been admitted by the Viz extension's
`tap-injector`, tap is not explicitly disabled but it is also not enabled.
Therefore, the `tap` command hangs and provides no feedback to the user.
Closes #5544
## How
A new `viz.linkerd.io/tap-enabled` annotation is introduced, added
automatically by the Viz extension's `tap-injector`. The annotation is added to
a pod when the pod can be tapped, meaning that neither the pod nor the pod's
namespace has the `config.linkerd.io/disable-tap` annotation set.
When a user attempts to tap a resource, the tap controller now looks for this
new annotation; if the annotation is present on the pod then that pod is
tappable.
If the annotation is not present or tap is explicitly disabled, an error is
returned.
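For illustration, a minimal sketch of the per-pod check the tap controller could perform; the helper name and error messages are hypothetical, not the actual implementation:
```go
package tap

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

const (
	tapEnabledAnnotation  = "viz.linkerd.io/tap-enabled"    // added by the tap-injector
	tapDisabledAnnotation = "config.linkerd.io/disable-tap" // set by users to opt out
)

// checkTappable is a hypothetical helper that explains why a pod cannot be tapped.
func checkTappable(pod *corev1.Pod) error {
	if pod.Annotations[tapDisabledAnnotation] == "true" {
		return fmt.Errorf("pod %s has tap disabled via the %s annotation", pod.Name, tapDisabledAnnotation)
	}
	if _, ok := pod.Annotations[tapEnabledAnnotation]; !ok {
		return fmt.Errorf("pod %s was not admitted by the tap-injector; try restarting the resource", pod.Name)
	}
	return nil
}
```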
## UI changes
Multiple errors can now occur when trying to tap a resource:
1. There are no pods for the resource.
2. There are pods for the resource, but tap is disabled via pod or namespace
annotation.
3. There are pods for the resource, but tap is not yet enabled because the
`tap-injector` did not admit the resource.
Errors are now handled as shown below:
Tap is disabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
```
Tap is not enabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap not enabled; try restarting resource so that it can be injected
```
There are a mix of pods with tap disabled or tap not enabled:
```
❯ bin/linkerd viz tap deploy/test
Error: no pods to tap for deployment/test
pods found with tap disabled via the config.linkerd.io/disable-tap annotation
pods found with tap not enabled; try restarting resource so that it can be injected
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it is still used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
* Added chart templates for new viz linkerd-metrics-api pod
* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.
* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.
* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.
* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
- Updated `endpoints.go` according to new API interface name.
- Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
- `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
- Added `metrics-api` to list of docker images to build in actions workflows.
- In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).
* Add retry to 'tap API service is running' check
* mc check shouldn't err when viz is not available. Also properly set the logger in `multicluster/cmd/root.go` so that it displays messages when `--verbose` is used
* Separate observability API
Closes #5312
This is a preliminary step towards moving all the observability API into `/viz`, by first moving its protobuf into `viz/metrics-api`. This should facilitate review as the go files are not moved yet, which will happen in a followup PR. There are no user-facing changes here.
- Moved `proto/common/healthcheck.proto` to `viz/metrics-api/proto/healthcheck.proto`
- Moved the contents of `proto/public.proto` to `viz/metrics-api/proto/viz.proto` except for the `Version` stuff.
- Merged `proto/controller/tap.proto` into `viz/metrics-api/proto/viz.proto`
- `grpc_server.go` now temporarily exposes `PublicAPIServer` and `VizAPIServer` interfaces to separate both APIs. This will get properly split in a followup.
- The web server provides handlers for both interfaces.
- `cli/cmd/public_api.go` and `pkg/healthcheck/healthcheck.go` temporarily now have methods to access both APIs.
- Most of the CLI commands will use the Viz API, except for `version`.
The other changes in the go files are just changes in the imports to point to the new protobufs.
Other minor changes:
- Removed `git add controller/gen` from `bin/protoc-go.sh`
Fixes #4191 #4993
This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5
This consists of the following changes:
- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types and client code using the latest k8s.io/code-generator
- Use context.Context as the first argument in all code paths that touch the k8s client-go interface
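For reference, a hedged illustration of the client-go v0.19 calling convention, where `context.Context` is now the first argument on every request (function and variable names here are ours):
```go
package example

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// getPod shows the updated signature; before v0.19 the call was
// client.CoreV1().Pods(ns).Get(name, metav1.GetOptions{}).
func getPod(ctx context.Context, client kubernetes.Interface, ns, name string) error {
	pod, err := client.CoreV1().Pods(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	fmt.Println(pod.Name)
	return nil
}
```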
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Regenerated protobuf files, using version 1.4.2 that was upgraded from
1.3.2 with the proxy-api update in #4614.
As of v1.4 protobuf messages are disallowed to be copied (because they
hold a mutex), so whenever a message is passed to or returned from a
function we need to use a pointer.
This affects _mostly_ test files.
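A minimal self-contained illustration of the constraint; the `Event` type below is a stand-in for a generated message, not the real protobuf type:
```go
package example

import "sync"

// Event stands in for a v1.4+ generated message, whose internal state
// includes a no-copy marker backed by sync.Mutex.
type Event struct {
	mu    sync.Mutex
	Value string
}

// label takes and returns a pointer, so the embedded mutex is never copied.
func label(e *Event) *Event {
	e.Value = "labeled"
	return e
}

// Passing the message by value would be flagged by `go vet` (copylocks):
// func labelByValue(e Event) Event { ... }
```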
This is required to unblock #4620 which is adding a field to the config
protobuf.
* Fixed bad identity string for target pod in tap
Fixes #3506
Tap was using the cluster domain instead of the trust domain, which results
in an error when those domains differ.
* If tap source IP matches many running pods then only show the IP
When an unmeshed source IP matched more than one running pod, tap was
showing the names of all those pods, even though they didn't necessarily
originate the connection. This could be reproduced when using a pod
network add-on such as Calico.
With this change, if a node matches the IP we return it; otherwise we look for a matching pod. If exactly one running pod matches, we return it; otherwise we return just the IP.
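A sketch of that resolution order, with illustrative names and the caveat that the real code works against the informer cache rather than slices:
```go
package example

import corev1 "k8s.io/api/core/v1"

// resolveIP returns a display name for a source IP: a matching node, a single
// matching running pod, or the bare IP when the match is ambiguous.
func resolveIP(ip string, nodes []corev1.Node, pods []corev1.Pod) string {
	for _, n := range nodes {
		for _, addr := range n.Status.Addresses {
			if addr.Address == ip {
				return n.Name
			}
		}
	}
	var matches []corev1.Pod
	for _, p := range pods {
		if p.Status.PodIP == ip && p.Status.Phase == corev1.PodRunning {
			matches = append(matches, p)
		}
	}
	if len(matches) == 1 {
		return matches[0].Name
	}
	return ip
}
```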
Fixes #3103
### Motivation
In order to expose arbitrary headers through tap, headers and trailers should be
read from the linkerd2-proxy-api `TapEvent`s and set in the public `TapEvent`s.
This change has no user-facing impact; it just prepares the events for JSON
output in linkerd/linkerd2#3390
### Solution
The public API has been updated with a headers field for
`TapEvent_Http_RequestInit_` and `TapEvent_Http_ResponseInit_`, and a trailers
field for `TapEvent_Http_ResponseEnd_`.
These values are set by reading the corresponding fields off of the proxy's tap
events.
The proto changes are equivalent to the proto changes proposed in
linkerd/linkerd2-proxy-api#33
Closes #3262
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Set custom cluster domain in GetServiceProfileFor
* Set custom cluster domain in tap server
Move fetching cluster domain for tap server to cmd main
* Handle fetching cluster domain errors separately
* Use custom cluster domain for traffic split adaptor
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
* Have the proxy-injector emit events upon injection/skipping injection
Fixes #3253
Have the proxy-injector emit an event whenever an injection happens, or
when injection is skipped for some reason (that reason is also added to
the proxy-injector logs). The event is associated with the parent workload
(it can't be associated with the pod because at this point the pod hasn't
been persisted).
The event recorder was setup at the `webhook/server.go` level and passed
to the proxy-injector's `Inject` function. The sp-validator thus also
has access to the event recorder, but for now it's not using it.
Related changes:
- Refactored `api.GetOwnerKindAndName()` to have it return a more
generic object.
- Refactored `report.Injectable()` to also have it return the reason why
a workload is not injectable.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
### Summary
As an initial attempt to secure the connection from clients to the gRPC tap
server on the tap Pod, the tap `addr` only listened on localhost.
As @adleong pointed out in #3257, this was not actually secure because the inbound
proxy would establish a connection to localhost anyway.
This change removes the gRPC tap server listener and changes `TapByResource`
requests to interface with the server object directly.
From this, we know that all `TapByResourceRequests` have gone through the tap
APIServer and have thus been authorized by RBAC.
### Details
[NewAPIServer](ef90e0184f/controller/tap/apiserver.go (L25-L26)) now takes a [GRPCTapServer](f6362dfa80/controller/tap/server.go (L33-L34)) instead of a `pb.TapClient` so that
`TapByResource` requests can interact directly with the [TapByResource](f6362dfa80/controller/tap/server.go (L49-L50)) method.
`GRPCTapServer.TapByResource` now makes a private [grpcTapServer](ef90e0184f/controller/tap/handlers.go (L373-L374)) that satisfies
the [tap.TapServer](https://godoc.org/github.com/linkerd/linkerd2/controller/gen/controller/tap#TapServer) interface. Because this interface is satisfied, we can interact
with the tap server methods without spawning an additional listener.
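A rough sketch of the in-process call path, using stand-in types rather than the actual linkerd2 ones:
```go
package example

// TapByResourceRequest and TapStream are placeholders for the generated
// protobuf request and the gRPC server stream.
type TapByResourceRequest struct{}

type TapStream interface {
	Send(event interface{}) error
}

// tapServer mirrors the shape of the GRPCTapServer method used by the APIServer.
type tapServer interface {
	TapByResource(req *TapByResourceRequest, stream TapStream) error
}

// handleTap calls the server object directly instead of dialing a localhost
// listener, so every request has already passed through the tap APIServer's RBAC.
func handleTap(srv tapServer, req *TapByResourceRequest, stream TapStream) error {
	return srv.TapByResource(req, stream)
}
```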
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
PR #3154 introduced an `l5d-require-id` header to Tap requests. That
header string was constructed based on the TapByResourceRequest, which
includes 3 notable fields (type, name, namespace). For namespace-level
requests (via commands like `linkerd tap ns linkerd`), type ==
`namespace`, name == `linkerd`, and namespace == "". This special casing
for namespace-level requests yielded invalid `l5d-require-id` headers,
for example: `pd-sa..serviceaccount.identity.linkerd.cluster.local`.
Fix `l5d-require-id` string generation to account for namespace-level
requests. The bulk of this change is tap unit test updates to validate
the fix.
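A sketch of the corrected name construction, assuming the identity format shown in the example above (function and parameter names are illustrative):
```go
package example

import "fmt"

// requireID builds the expected server identity for the tapped pod's proxy.
func requireID(serviceAccount, namespace, resourceType, resourceName, controlNS, trustDomain string) string {
	// Namespace-level requests carry the target namespace in the Name field
	// (e.g. `linkerd tap ns linkerd` => type "namespace", name "linkerd", namespace "").
	if resourceType == "namespace" {
		namespace = resourceName
	}
	return fmt.Sprintf("%s.%s.serviceaccount.identity.%s.%s", serviceAccount, namespace, controlNS, trustDomain)
}
```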
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
### Summary
In order for Pods' tap servers to start authorizing tap clients, the tap
controller must open TLS connections so that it can identify itself to the
server.
This change introduces the use of `l5d-require-id` header on outbound tap
requests.
### Details
When tap requests are made by the tap controller, the `Authority` header is an
IP address. The proxy does not attempt to do service discovery on such requests
and therefore the connection is over plaintext. By introducing the
`l5d-require-id` header the proxy can require a server name on the connection.
This allows the tap controller to identify itself as the client making tap
requests. The name value for the header can be made from the Pod Spec and tap
request, so the change is rather minimal.
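A minimal sketch of attaching the header to an outbound tap request; the variable names and identity value are placeholders:
```go
package example

import "net/http"

// newTapRequest sets l5d-require-id so the proxy requires the named server
// identity on the connection, upgrading it from plaintext to TLS.
func newTapRequest(url, serverIdentity string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("l5d-require-id", serverIdentity)
	return req, nil
}
```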
#### Proxy Changes
* Update h2 to v0.1.26
* Properly fall back in the dst_router (linkerd/linkerd2-proxy#291)
### Testing
Unit tests for the header have not been added mainly because [no test
infrastructure currently exists](065c221858/controller/tap/server_test.go (L241)) to mock proxy requests. After talking with
@siggy a little about this, it makes sense to do this in a separate change at some
point when behavior like this cannot be reliably tested through integration tests
either.
Integration tests do test this well, and will continue to do so once
linkerd/linkerd2-proxy#290 lands.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Have `GetOwnerKindAndName` be able to skip the cache
Refactored `GetOwnerKindAndName` so it can optionally skip the
shared informer cache and instead hit the k8s API directly.
Useful for the proxy injector, when the pod's replicaset was just
created and might not be in the cache yet.
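A sketch of the idea behind the refactor, with illustrative names; when the cache is skipped the lookup goes straight to the Kubernetes API:
```go
package example

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	appslisters "k8s.io/client-go/listers/apps/v1"
)

type API struct {
	client   kubernetes.Interface
	rsLister appslisters.ReplicaSetLister
}

// getReplicaSet reads from the shared informer cache by default, but can hit
// the API server directly when the object may not have reached the cache yet.
func (api *API) getReplicaSet(ctx context.Context, ns, name string, skipCache bool) (*appsv1.ReplicaSet, error) {
	if skipCache {
		return api.client.AppsV1().ReplicaSets(ns).Get(ctx, name, metav1.GetOptions{})
	}
	return api.rsLister.ReplicaSets(ns).Get(name)
}
```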
Fixes #2738
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Support for resources opting out of tap
Implements the `linkerd inject --disable-tap` flag (although hidden pending #2811) and the config override annotation `config.linkerd.io/disable-tap`.
Fixes #2778
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The Proxy API service lacked introspection of its internal state.
Introduce a new gRPC Discovery API, implemented by two servers:
1) Proxy API Server: returns a snapshot of discovery state
2) Public API Server: pass-through to the Proxy API Server
Also wire up a new `linkerd endpoints` command.
Fixes #2165
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Filtering by Kubernetes job was not supported. Also filtering by any unknown
type caused a panic.
Add filtering support by Kubernetes job, with special case mapping `job` to
`k8s_job`, to not conflict with Prometheus' job label.
Fix panic when unknown type specified as a `--from` or `--to` flag.
Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at
collection time. This caused all metrics collected by proxy sidecars in
Kubernetes jobs to be collected into an incorrect Prometheus job, rather than
the expected `linkerd-proxy` Prometheus job.
Fix `unsupported resource type` tap error message incorrectly printing the
target resource rather than the destination.
Set `--controller-log-level debug` in `install_test.go` for easier debugging.
Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to
validate proxy requests with a job as destination.
Fixes #1872
Part of #627
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This change alters the controller's Tap service to include route labels
when translating tap events, modifies the public API to include route
metadata in responses, and modifies the tap CLI command to include
`rt_` labels in tap output (when `-o wide` is used).
The tap server accesses protobuf fields directly instead of using the
`Get*()` accessors. The accessors are necessary to prevent dereferencing
a nil pointer and crashing the tap service.
Furthermore, these maps are explicitly initialized when `nil` to support
label hydration.
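A self-contained illustration of why the accessors matter; `Event` and `Meta` below mimic how protoc-gen-go generates nil-safe getters and are not the real types:
```go
package example

type Meta struct{ Labels map[string]string }

// GetLabels mirrors a generated accessor: it is safe to call on a nil receiver.
func (m *Meta) GetLabels() map[string]string {
	if m == nil {
		return nil
	}
	return m.Labels
}

type Event struct{ DestinationMeta *Meta }

func (e *Event) GetDestinationMeta() *Meta {
	if e == nil {
		return nil
	}
	return e.DestinationMeta
}

func hydrate(e *Event) map[string]string {
	labels := e.GetDestinationMeta().GetLabels() // nil-safe, never panics
	// e.DestinationMeta.Labels would panic if DestinationMeta were nil.
	if labels == nil {
		labels = map[string]string{} // explicit initialization to support label hydration
	}
	return labels
}
```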
Fix broken docker build by moving Service Profile conversion and validation into `/pkg`.
Fix broken integration test by adding service profile validation output to `check`'s expected output.
Testing done:
* `gotest -v ./...`
* `bin/docker-build`
* `bin/test-run (pwd)/bin/linkerd`
Signed-off-by: Alex Leong <alex@buoyant.io>
We implement the getProfiles method in the destination service. This method returns a stream of destination profiles for a given authority. It does this by looking up the ServiceProfile resource in the controller namespace named `<svc>.<ns>` where `<svc>` is the name of the service and `<ns>` is the namespace of the service.
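For example, the lookup name is built from the service and namespace; the helper below is hypothetical and shown only to make the naming scheme concrete:
```go
package example

import "fmt"

// profileName returns the ServiceProfile resource name for a service, e.g.
// service "books" in namespace "default" => "books.default".
func profileName(svc, ns string) string {
	return fmt.Sprintf("%s.%s", svc, ns)
}
```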
This PR includes:
* Adding a ServiceProfile Custom Resource Definition to linkerd install
* A watch based implementation of the getProfiles method in the destination service, similar to the implementation of get.
* An update to the destination client script that allows querying the getProfiles method.
Signed-off-by: Alex Leong <alex@buoyant.io>
Sometimes the tap server causes the controller pod to restart after it receives an error.
The error arises when the tap server does not close its gRPC tap streams to the proxies before terminating the streams to its upstream clients.
This PR uses the request context from the initial TapByResource request to help shut down tap streams to the data plane proxies gracefully.
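A rough sketch of the shutdown wiring, with a hypothetical `tapProxy` helper standing in for the per-proxy stream:
```go
package example

import "context"

// tapProxy stands in for opening and draining a tap stream to one data-plane proxy.
func tapProxy(ctx context.Context, addr string) {
	<-ctx.Done() // the real code would stream events until the context is cancelled
}

// tapByResource ties every proxy stream to the lifetime of the originating request,
// so when the upstream client disconnects all downstream streams are closed too.
func tapByResource(ctx context.Context, proxyAddrs []string) {
	for _, addr := range proxyAddrs {
		go tapProxy(ctx, addr)
	}
	<-ctx.Done()
}
```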
fixes #1504
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Increase the MaxRps on the tap server to 100 RPS.
The max RPS for tap/top was increased for the CLI in #1531, but we were
still manually setting this to 1 RPS in the Web UI and Web server.
Remove the pervasive setting of MaxRps to 1 in the web frontend and server
The default value for the max-rps argument to the tap and top commands is an overly conservative 1rps. This causes the data to come in very slowly and much data to be discarded. Furthermore, because tap requests are windowed to 10 seconds, this causes long pauses between updates.
We fix this in two ways. Firstly we reduce the window size to 1s so that updates will come in at least once per second, even when the actual RPS of the data path is extremely high. Secondly, we increase the default max-rps parameter from 1 to 100. This allows tap to paint an accurate picture of the data much more quickly and sidesteps some sampling bias that happens when the max-rps is low.
In general, tap events tend to happen in bursts. For example, one request in may trigger one or more requests out. Likewise, a single upstream event may trigger several requests to the tapped pod in quick succession. Sampling bias will occur when the max-rps is less than the actual rps and when the tap event limit subdivides these event bursts (biasing towards the first few events in the burst). The greater the max-rps, the less the effects of this bias.
Fixes#1525
Signed-off-by: Alex Leong <alex@buoyant.io>
Previously, we would tap any resource's pods, regardless of whether the pods
were meshed or not. We can't actually tap non-meshed pods, so I'm adding a check
that will filter out non-meshed pods from the pods that tap watches.
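A sketch of the filter, assuming meshed pods are identified by the injected proxy container (names are illustrative):
```go
package example

import corev1 "k8s.io/api/core/v1"

// meshedPods keeps only the pods that carry the injected linkerd-proxy container.
func meshedPods(pods []corev1.Pod) []corev1.Pod {
	var meshed []corev1.Pod
	for _, p := range pods {
		for _, c := range p.Spec.Containers {
			if c.Name == "linkerd-proxy" {
				meshed = append(meshed, p)
				break
			}
		}
	}
	return meshed
}
```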
Previous behaviour:
When attempting to tap a non-meshed pod, tap would establish
a watch on the pods, but then never return any results. In the CLI you could
just cancel it with Ctrl-C. In the web, clicking Stop would send a
WebSocket.close(1000) but wouldn't actually close the connection...
Behaviour after change:
If no pods under the specified resource are meshed, it returns
an error that no pods were found to tap
Fixes #1493.
When the tap server hydrates metadata for the source or destination peer
of a Tap event from the peer's IP address, it doesn't currently add a
namespace label. However, destinations labeled by the proxy do have such
a label.
This is because the tap server currently gets the hydrated labels from
the `GetPodLabels` function, which is also used by the Destination
service for labeling the individual endpoints in a `WeightedAddrSet`
response. The Destination service separately adds some labels to all
the endpoints in the set, including the namespace and service, so
`GetPodLabels` doesn't return these labels. When the tap server
uses that function, however, it does not add the service or namespace labels itself.
This branch fixes this issue by adding those labels to the Tap event
after calling `GetPodLabels`. In addition, it fixes a missing space
between the `src/dst_res` and `src/dst_ns` labels in Tap CLI output
with the `-o wide` flag set. This issue was introduced during the
review of #1437, but was missed at the time because the namespace label
wasn't being set correctly.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Based on @adleong's suggestion in
https://github.com/linkerd/linkerd2/pull/1434#pullrequestreview-145428857,
this branch adds label hydration from destination IPs to the Tap server.
This works the same as the label hydration for destination IPs added in
#1434. However, it is only applied to the destination fields of events
recorded by proxies in the inbound direction, since outbound
destinations are already labeled with metadata provided by the
Destination service.
This means that when a user taps inbound traffic, the CLI will show k8s
metadata labels for the destination peer (if it's available). This can
be useful especially when tapping several pods at once, as it makes it
easier to distinguish what pod received a request.
This branch also refactors how the label hydration is performed,
primarily to make adding it to the destination field less repetitive.
Also, the `hydrateIPLabels` function now mutates the label map in the
`TapEvent`, rather than returning a new map of labels, so that the case
where no pod was found doesn't require an additional allocation of an
empty map.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The `TapEvent` protobuf contains two maps, `DestinationMeta` and
`SourceMeta`. The `DestinationMeta` contains all the metadata provided
by the proxy that originated the event (ultimately originating from the
Destination service), while the `SourceMeta` currently only contains the
source connection's TLS status.
This branch modifies the Tap server to hydrate the same set of metadata
from the source IP address, when the source was within the cluster. It
does this by adding an indexer that maps pod IPs to pods to its k8s API client,
and looking up IPs against this index. If a pod was found, the extra
metadata is added to the tap event sent to the client.
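A sketch of such an index on a shared informer, close in spirit to what is described above (the index name and helpers are illustrative):
```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	coreinformers "k8s.io/client-go/informers/core/v1"
	"k8s.io/client-go/tools/cache"
)

const podIPIndex = "podIP"

// addPodIPIndex registers an index from pod IP to pod objects.
func addPodIPIndex(pods coreinformers.PodInformer) error {
	return pods.Informer().AddIndexers(cache.Indexers{
		podIPIndex: func(obj interface{}) ([]string, error) {
			pod, ok := obj.(*corev1.Pod)
			if !ok || pod.Status.PodIP == "" {
				return nil, nil
			}
			return []string{pod.Status.PodIP}, nil
		},
	})
}

// podsByIP looks up the pods (if any) that own the given IP.
func podsByIP(pods coreinformers.PodInformer, ip string) ([]interface{}, error) {
	return pods.Informer().GetIndexer().ByIndex(podIPIndex, ip)
}
```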
This branch also changes the client so that if a source pod name was
provided in the metadata, it prints the pod name rather than the IP
address for the `src` field in its output. This mimics what is currently
done for the `dst` field in tap output. Furthermore, the added source
metadata will be necessary for adding src resource types to tap output
(see issue #1170).
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This PR begins to migrate Conduit to Linkerd2:
* The proxy has been completely removed from this repo, and is now located at
github.com/linkerd/linkerd2-proxy.
* A `Dockerfile-proxy` has been added to fetch the most-recently published proxy
binary from build.l5d.io.
* Proxy-specific protobuf bindings have been moved to
github.com/linkerd/linkerd2-proxy-api.
* All docker images now use the gcr.io/linkerd-io registry.
* `inject` now uses `LINKERD2_PROXY_` environment variables
* Go paths have been updated to reflect the new (future) repo location.
* Modify the Stat endpoint to also return the count of failed pods
* Add comments explaining pod count stats
* Rename total pod count to running pod count
This is to support the service mesh overview page, as I'd like to include an indicator of
failed pods there.
The `conduit tap` command is now deprecated.
Replace `conduit tap` with `conduit tapByResource`. Rename tapByResource
to tap. The underlying protobuf for tap remains; the tap gRPC endpoint now
returns Unimplemented.
Fixes #804
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
public-api and tap were both using their own implementations of
the Kubernetes Informer/Lister APIs.
This change factors out all Informer/Lister usage into the Lister
module. This also introduces a new `Lister.GetObjects` method.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The TapByResource endpoint was previously a stub.
Implement end-to-end tapByResource functionality, with support for
specifying any kubernetes resource(s) as target and destination.
Fixes #803, #49
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This changes the public api to have a new rpc type, `TapByResource`.
This api supersedes the Tap api. `TapByResource` is richer, more closely
reflecting the proxy's capabilities.
The proxy's Tap api is extended to select over destination labels,
corresponding with those returned by the Destination api.
Now both `Tap` and `TapByResource`'s responses may include destination
labels.
This change avoids breaking backwards compatibility by:
* introducing the new `TapByResource` rpc type, opting not to change Tap
* extending the proxy's Match type with a new, optional, `destination_label` field.
* `TapEvent` is extended with a new, optional, `destination_meta`.
* Extracted logic from destination server
* Make tests follow style used elsewhere in the code
* Extract single interface for resolvers
* Add tests for k8s and ipv4 resolvers
* Fix small usability issues
* Update dep
* Act on feedback
* Add pod-based metric_labels to destinations response
* Add documentation on running control plane to BUILD.md
Signed-off-by: Phil Calcado <phil@buoyant.io>
* Fix mock controller in proxy tests (#656)
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Address review feedback
* Rename files in the destination package
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
When attempting to tap N pods when N is greater than the target rps, a rounding error occurs that requests 0 rps from each pod and no tap data is returned.
Ensure that tap requests at least 1 rps from each target pod.
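The guard, roughly (field and function names are illustrative):
```go
package example

// perPodLimit splits the requested max rps across N target pods, never asking
// a pod for less than 1 rps so truncation cannot round the per-pod limit to 0.
func perPodLimit(maxRPS float32, podCount int) float32 {
	limit := maxRPS / float32(podCount)
	if limit < 1 {
		return 1
	}
	return limit
}
```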
Tested in Kubernetes on docker-for-desktop with a 15 replica deployment and a maxRps of 10.
Signed-off-by: Alex Leong <alex@buoyant.io>
Previously, running `$conduit tap` would return an `Unexpected EOF` error when the server wasn't available. This was due to a few problems with the way we were handling errors all the way down the tap server. This change fixes that and cleans up some of the protobuf-over-HTTP code.
- first step towards #49
- closes #106
* Sort imports
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Upgrade k8s.io/client-go to v6.0.0
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Make k8s store initialization blocking with timeout
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
We’ve built Conduit from the ground up to be the fastest, lightest,
simplest, and most secure service mesh in the world. It features an
incredibly fast and safe data plane written in Rust, a simple yet
powerful control plane written in Go, and a design that’s focused on
performance, security, and usability. Most importantly, Conduit
incorporates the many lessons we’ve learned from over 18 months of
production service mesh experience with Linkerd.
This repository contains a few tightly-related components:
- `proxy` -- an HTTP/2 proxy written in Rust;
- `controller` -- a control plane written in Go with gRPC;
- `web` -- a UI written in React, served by Go.