linkerd2

Commit Graph

Author	SHA1	Message	Date
Alex Leong	cbb196066f	Support service profiles for external authorities (#1928 ) Add support for service profiles created on external (non-service) authorities. For example, this allows you to create a service profile named `linkerd.io` which will apply to calls made to `linkerd.io`. This is done by changing the `LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` to `.` so that the proxy will attempt to lookup a service profile for any authority. We provide the `--disable-external-profiles` proxy flag to revert this behavior in case it is a problem. We also refactor the proxy-api implementation of GetProfiles so that it does the profile lookup, regardless of if the authority looks like a Kubernetes service name or not. To simplify this, support for multiple resolves (which was unused) was removed. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-12-05 14:32:59 -08:00
Oliver Gould	8f9bb711dd	proxy-api: Expose a flag to control auto-h2-upgrade (#1925 ) When debugging issues, it's helpful to disable HTTP/2 upgrading to simplify diagnostics. This change adds an `enable-h2-upgrade` flag to _proxy-api_. When this flag is set to false, the proxy-api will not suggest that meshed endpoints are upgraded to use HTTP/2. As a follow-up, a flag should be added to `install` to control how the proxy-api is initialized.	2018-12-05 12:41:20 -08:00
Alex Leong	380ec52a39	Rework routes command to accept any resource (#1921 ) We rework the routes command so that it can accept any Kubernetes resource, making it act much more similarly to the stat command. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-12-05 11:11:34 -08:00
Alex Leong	4f3e55e937	Rename path to path_regex in ServiceProfile CRD (#1923 ) We rename path to path_regex in the ServiceProfile CRD to make it clear that this field accepts a regular expression. We also take this opportunity to remove unnecessary line anchors from regular expressions now that these anchors are added in the proxy. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-12-05 10:42:47 -08:00
Kevin Lingerfelt	37ae423bb3	Add linkerd- prefix to all objects in linkerd install (#1920 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-12-04 15:41:47 -08:00
Andrew Seigner	ad2366f208	Revert proxy readiness initialDelaySeconds change (#1912 ) Reverts part of #1899 to workaround readiness failures. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-12-04 14:27:55 -08:00
Andrew Seigner	37a5455445	Add filtering by job in stat, tap, top; fix panic (#1904 ) Filtering by Kubernetes job was not supported. Also filtering by any unknown type caused a panic. Add filtering support by Kubernetes job, with special case mapping `job` to `k8s_job`, to not conflict with Prometheus' job label. Fix panic when unknown type specified as a `--from` or `--to` flag. Fix `job` label from `linkerd-proxy` overwriting Prometheus `job` label at collection time. This caused all metrics collected by proxy sidecars in Kubernetes jobs to be collected into an incorrect Prometheus job, rather than the expected `linkerd-proxy` Prometheus job. Fix `unsupported resource type` tap error message incorrectly printing the target resource rather than the destination. Set `--controller-log-level debug` in `install_test.go` for easier debugging. Expose `slow-cooker`'s metrics via a k8s service in the tap integration test, to validate proxy requests with a job as destination. Fixes #1872 Part of #627 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-12-03 15:34:49 -08:00
Oliver Gould	926395f616	tap: Include route labels in tap events (#1902 ) This change alters the controller's Tap service to include route labels when translating tap events, modifies the public API to include route metadata in responses, and modifies the tap CLI command to include rt_ labels in tap output (when -o wide is used).	2018-12-03 13:52:47 -08:00
Andrew Seigner	d121071f87	Adjust proxy, Prometheus, and Grafana probes (#1899 ) * Adjust proxy, Prometheus, and Grafana probes High `readinessProbe.initialDelaySeconds` values delayed the controller's readiness by up to 30s, preventing cli commands from succeeding shortly after control plane deployment. Decrease `readinessProbe.initialDelaySeconds` in the proxy, Prometheus, and Grafana to the default 0s. Also change `linkerd check` controller pod ordering to: controller, prometheus, web, grafana. Detailed probe changes: - proxy - decrease `readinessProbe.initialDelaySeconds` from 10s to 0s - prometheus - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s - decrease `readinessProbe.timeoutSeconds` from 30s to 1s - decrease `livenessProbe.timeoutSeconds` from 30s to 1s - grafana - decrease `readinessProbe.initialDelaySeconds` from 30s to 0s - decrease `readinessProbe.timeoutSeconds` from 30s to 1s - decrease `readinessProbe.failureThreshold` from 10 to 3 - increase `livenessProbe.initialDelaySeconds` from 0s to 30s Fixes #1804 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-12-03 10:41:11 -08:00
Alex Leong	f9d66cf4de	Add --open-api option to linkerd profiles command (#1867 ) The `--open-api` flag is an alternative to the `--template` flag for the `linkerd profile` command. It reads an OpenAPI specification file (also called a swagger file) and uses it to generate a corresponding service profile. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-11-30 09:25:19 -08:00
Andrew Seigner	34d9eef03e	proxy injector: insert at end of arrays (#1881 ) When using `--proxy-auto-inject` with Kubernetes `v1.9.11`, observed auto injector incorrectly merging list elements rather than inserting new ones. This issue was not reproducible on `v1.10.3`. For example, this input: ``` spec: template: spec: containers: - name: vote-bot command: - emojivoto-vote-bot ``` Would yield: ``` spec: template: spec: containers: - name: linkerd-proxy command: - emojivoto-vote-bot - name: vote-bot command: - emojivoto-vote-bot ``` This change replaces json patch specs like `/spec/template/spec/containers/0` with `/spec/template/spec/containers/-`. The former is intended to insert at the beginning of a list, the latter at the end. This also simplifies the code a bit and more closely aligns with the intent of injecting at the end of lists. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-11-28 14:21:18 -08:00
Risha Mars	f8583df4db	Add ListServices to controller public api (#1876 ) Add a barebones ListServices endpoint, in support of autocomplete for services. As we develop service profiles, this endpoint could probably be used to describe more aspects of services (like, if there were some way to check whether a service profile was enabled or not). Accessible from the web UI via http://localhost:8084/api/services	2018-11-27 11:34:47 -08:00
Alex Leong	73836f05cf	Update proxy version and use canonicalized dst (#1866 ) The `linkerd` routes command only supports outbound metrics queries (i.e. ones with the `--from` flag). Inbound queries (i.e. ones without the `--from` flag) never return any metrics. We update the proxy version and use the new canonicalized form for dst labels to gain support for inbound metrics as well. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-11-26 17:20:07 -08:00
Oliver Gould	ba11698d4b	tap: Use nil-safe protobuf accessors (#1873 ) The tap server accesses protobuf fields directly instead of using the `Get*()` accessors. The accessors are necessary to prevent dereferencing a nil pointer and crashing the tap service. Furthermore, these maps are explicitly initialized when `nil` to support label hydration.	2018-11-26 14:14:28 -08:00
Alex Leong	7a7f6b6ecb	Add TopRoutes method to the public api and route CLI command to consume it (#1860 ) Add a routes command which displays per-route stats for services that have service profiles defined. This change has three parts: * A new public-api RPC called `TopRoutes` which serves per-route stat data about a service * An implementation of TopRoutes in the public-api service. This implementation reads per-route data from Prometheus. This is very similar to how the StatSummaries RPC works and much of the code was able to be refactored and shared. * A new CLI command called `routes` which displays the per-route data in a tabular or json format. This is very similar to the `stat` command and much of the code was able to be refactored and shared. Note that as of the currently targeted proxy version, only outbound route stats are supported so the `--from` flag must be included in order to see data. This restriction will be lifted in an upcoming change once we add support for inbound route stats as well. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-11-19 12:20:30 -08:00
Kevin Leimkuhler	c68693e820	Fix stat filtering for `--from` queries (#1856 ) # Problem When we add a `--from` query to `linkerd stat au` we get more rows than if we would have just run `linkerd stat au`. Adding a `--from` causes an extra row to be added, and the named authority to be ignored (this is the result we would have expected when running `linkerd stat au -n emojivoto --from deploy/web`). # Solution Destination query labels are now appended to `labels` so that those labels can be filtered on. # Validation Tests have been updated to reflect the expected destination labels now appended in `--from` queries. Fixes #1766 Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2018-11-14 10:52:27 -08:00
Alejandro Pedraza	bbcf5a8c9f	Allow stat summary to query for multiple resources (#1841 ) * Refactor util.BuildResource so it can deal with multiple resources First step to address #1487: Allow stat summary to query for multiple resources Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Update the stat cli help text to explain the new multi resource querying ability Proposal for #1487: Allow stat summary to query for multiple resources Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Allow stat summary to query for multiple resources Implement this ability by issuing parallel requests to requestStatsFromAPI() Proposal for #1487 Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Update tests as part of multi-resource support in `linkerd stat` (#1487) - Refactor stat_test.go to reuse the same logic in multiple tests, and add cases and files for json output. - Add a couple of cases to api_utils_test.go to test multiple resources validation. Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * `linkerd stat` called with multiple resources should keep an ordering (#1487) Add SortedRes holding the order of resources to be followed when querying `linkerd stat` with multiple resources Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Extra validations for `linkerd stat` with multiple resources (#1487) Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * `linkerd stat` resource grouping, ordering and name prefixing (#1487) - Group together stats per resource type. - When more than one resource, prepend name with type. - Make sure tables always appear in the same order. Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com> * Allow `linkerd stat` to be called with multiple resources A few final refactorings as per code review. Fixes #1487 Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>	2018-11-14 10:44:04 -08:00
Igor Zibarev	60bcdb15f9	controller: use GetConfig from pkg/k8s package (#1857 ) This commit removes duplicate logic that loads Kubernetes config and replaces it with GetConfig from pkg/k8s. This also allows to load config from default sources like $KUBECONFIG instead of explicitly passing -kubeconfig option to controller components. Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>	2018-11-13 14:41:31 -08:00
Alex Leong	32d556e732	Improve ergonomics of service profile spec (#1828 ) We make several changes to the service profile spec to make service profiles more ergonomic and to make them more consistent with the destination profile API. * Allow multiple fields to be simultaneously set on a RequestMatch or ResponseMatch condition. Doing so is equivalent to combining the fields with an "all" condition. * Rename "responses" to "response_classes" * Change "IsSuccess" to "is_failure" Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-31 12:00:22 -07:00
Alex Leong	d8b5ebaa6d	Remove the proxy-api container (#1813 ) A container called `proxy-api` runs in the Linkerd2 controller pod. This container listens on port 8086 and serves the proxy-api but does nothing other than forward gRPC requests to the destination container which listens on port 8089. We remove the proxy-api container altogether and change the destination container to listen on port 8086 instead of 8089. The result is that clients still use the proxy-api by connecting to `proxy-api.<ns>.svc.cluster.local:8086` but the controller has one fewer containers. This results in a simpler system that is easier to reason about. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-29 16:31:43 -07:00
Alex Leong	82ca821e62	Use fqdn for service profile name (#1808 ) Service profiles must be named in the form `"<service>.<namespace>"`. This is inconsistent with the fully normalized domain name that the proxy sends to the controller. It also does not permit creating service profiles for non-Kubernetes services. We switch to requiring that service profiles must be named with the FQDN of their service. For Kubernetes services, this is `"<service>.<namespace>.svc.cluster.local"`. This change alone is not sufficient for allowing service profile for non-Kubernetes services because the k8s resolver will ignore any DNS names which are not Kubernetes services. Further refactoring of the resolver will be required to allow looking up non-Kubernetes service profiles in Kubernetes. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-29 14:35:42 -07:00
Alex Leong	622185a4dd	Send metric labels in profile API (#1800 ) * Send metric labels in profile API Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-29 14:28:09 -07:00
Oliver Gould	0e91dbb18d	Implement GetProfile for the proxy-api service (#1801 ) The `proxy-api` service included a stub implementation of `GetProfile` instead of forwarding requests to the `destination` service. This change fills in the proxy-api service's `GetProfile` implementation to forward requests to the destination service.	2018-10-24 12:37:29 -07:00
Alex Leong	f549868033	Fix integration test and docker build (#1790 ) Fix broken docker build by moving Service Profile conversion and validation into `/pkg`. Fix broken integration test by adding service profile validation output to `check`'s expected output. Testing done: * `gotest -v ./...` * `bin/docker-build` * `bin/test-run (pwd)/bin/linkerd` Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-19 10:23:34 -07:00
Alex Leong	5210b7b44a	Add check for service profile validation (#1775 ) Add a check to `linkerd check` which validates all service profile resources. In particular it checks: * does the service profile refer to an existent service * is the service profile valid Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-18 16:37:39 -07:00
Alex Leong	43c22fe967	Implement getProfiles method in destination service (#1759 ) We implement the getProfiles method in the destination service. This method returns a stream of destination profiles for a given authority. It does this by looking up the ServiceProfile resource in the controller namespace named `<svc>.<ns>` where `<svc>` is the name of the service and `<ns>` is the namespace of the service. This PR includes: * Adding a ServiceProfile Custom Resource Definition to linkerd install * A watch based implementation of the getProfiles method in the destination service, similar to the implementation of get. * An update to the destination client script that allows querying the getProfiles method. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-16 15:39:12 -07:00
Ivan Sim	1100c4fa8c	Proxy injector must preserve the original pod template labels and annotations (#1765 ) * Ensure that the proxy injector mutating webhook preserves the original labels and annotations The deployment's selector must also match the pod template labels in newer version of Kubernetes. This resolves issue #1756. * Add the Linkerd labels to the deployment metadata during auto proxy injection * Remove selector match labels JSON patch from proxy injector This isn't needed to resolve the selector label mismatch errors. Signed-off-by: ihcsim <ihcsim@gmail.com>	2018-10-16 15:30:45 -07:00
Ivan Sim	2e1a984eb0	Change the proxy-init container ordering during auto proxy injection (#1763 ) Appending proxy-init to the end of the list ensures that it won't interfere with other init containers from accessing the network, before the proxy container is created. This resolves bug #1760 Signed-off-by: ihcsim <ihcsim@gmail.com>	2018-10-15 15:33:09 -07:00
Alejandro Pedraza	37bc8a69db	Added support for json output in `linkerd stat` (#1749 ) Added support for json output in `linkerd stat` through a new (-o\|--output)=json option. Fixes #1417 Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>	2018-10-15 14:10:48 -07:00
Risha Mars	31a396b631	Fix incorrect test wording (#1767 )	2018-10-15 12:07:06 -07:00
Alex Leong	1fe19bf3ce	Add ServiceProfile support to k8s utilities (#1758 ) Updates to the Kubernetes utility code in `/controller/k8s` to support interacting with ServiceProfiles. This makes use of the code generated client added in #1752 Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-12 09:35:11 -07:00
Alex Leong	f1f5b49f59	Add generated Kubernetes client for ServiceProfile custom resource (#1752 ) To support reading and writing of the ServiceProfile custom resource, we add a codegen'd Kubernetes client for this resource. * Adding the ServiceProfile type and related boilerplate to /controller/gen/apis/serviceprofile. This boilerplate also contains directives that control how codegen works. * A script in /hack which invokes codegen that generates Kubernetes client machinery for interacting with ServiceProfile resources. The majority of the generated code lives in /controller/gen/client. * The above-mentioned generated code. Signed-off-by: Alex Leong <alex@buoyant.io>	2018-10-11 11:43:35 -07:00
Kevin Lingerfelt	46c887ca00	Add --single-namespace install flag for restricted permissions (#1721 ) * Add --single-namespace install flag for restricted permissions * Better formatting in install template * Mark --single-namespace and --proxy-auto-inject as experimental * Fix wording of --single-namespace check flag * Small healthcheck refactor Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-10-11 10:55:57 -07:00
Andrew Seigner	8f4240125e	fix test failure, logrus api consistency (#1755 ) `go test` was failing with `Fatalf call has arguments but no formatting directives` Fix test failure, make all logrus api calls consistent. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-10-11 10:44:32 -07:00
Ivan Sim	4fba6aca0a	Proxy init and sidecar containers auto-injection (#1714 ) * Support auto sidecar-injection 1. Add proxy-injector deployment spec to cli/install/template.go 2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration during the webhook's start-up process. 3. Add a new handler to the CA controller to create a new secret for the webhook when a new MutatingWebhookConfiguration is created. 4. Declare a config map to store the proxy and proxy-init container specs used during the auto-inject process. 5. Ignore namespace and pods that are labeled with linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed 6. Add new flag to `linkerd install` to enable/disable proxy auto-injection Proposed implementation for #561. * Resolve missing packages errors * Move the auto-inject label to the pod level * PR review items * Move proxy-injector to its own deployment * Ignore pods that already have proxy injected This ensures the webhook doesn't error out due to proxy that are injected using the command * PR review items on creating/updating the MWC on-start * Replace API calls to ConfigMap with file reads * Fixed post-rebase broken tests * Don't mutate the auto-inject label Since we started using healthcheck.HasExistingSidecars() to ensure pods with existing proxies aren't mutated, we don't need to use the auto-inject label as an indicator. This resolves a bug which happens with the kubectl run command where the deployment is also assigned the auto-inject label. The mutation causes the pod auto-inject label to not match the deployment label, causing kubectl run to fail. * Tidy up unit tests * Include proxy resource requests in sidecar config map * Fixes to broken YAML in CLI install config The ignore inbound and outbound ports are changed to string type to avoid broken YAML caused by the string conversion in the uint slice. Also, parameterized the proxy bind timeout option in template.go. Renamed the sidecar config map to 'linkerd-proxy-injector-webhook-config'. Signed-off-by: ihcsim <ihcsim@gmail.com>	2018-10-10 12:09:22 -07:00
Ben Lambert	69cebae1a2	Added ability to configure sidecar CPU + Memory requests (#1731 ) Horizontal Pod Autoscaling does not work when container definitions in pods do not all have resource requests, so here's the ability to add CPU + Memory requests to install + inject commands by providing proxy options --proxy-cpu + --proxy-memory Fixes #1480 Signed-off-by: Ben Lambert <ben@blam.sh>	2018-10-08 10:51:29 -07:00
Andrew Seigner	dccccebd79	Add LICENSE files to all Docker images (#1727 ) To comply with certain environments, include our LICENSE file in all Docker images. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-10-02 16:25:52 -07:00
Alena Varkockova	5a853e8990	Use ListPods always for data plane HC (#1701 ) * Use ListPods always for data plane HC * Missing changes in grpc_server.go * Address review comments * Read proxy version from spec Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-10-02 11:45:01 -07:00
Alena Varkockova	11c9b7425b	Fix the debug message in endpoints watcher (#1658 ) * Fix the debug message in endpoints watcher * Use better method for converting Signed-off-by: Alena Varkockova <varkockova.a@gmail.com>	2018-09-20 13:03:45 -07:00
Alex Leong	e65a9617bd	Add can-i checks to linkerd check --pre (#1644 ) Add checks to `linkerd check --pre` to verify that the user has permission to create: * namespaces * serviceaccounts * clusterroles * clusterrolebindings * services * deployments * configmaps Signed-off-by: Alex Leong <alex@buoyant.io>	2018-09-17 11:31:10 -07:00
Dennis Adjei-Baah	00d0a26a9c	Cleanly shutdown tap stream to data plane proxies (#1624 ) Sometimes, the tap server causes the controller pod to restart after it receives this error. This error arises when the Tap server does not close gRPC tap streams to proxies before the tap server terminates its streams to its upstream clients and causes the controller pod to restart. This PR uses the request context from the initial TapByResource to help shutdown tap streams to the data plane proxies gracefully. fixes #1504 Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>	2018-09-12 15:00:19 -07:00
Andrew Seigner	c5a719da47	Modify inject to warn when file is un-injectable (#1603 ) If an input file is un-injectable, existing inject behavior is to simply output a copy of the input. Introduce a report, printed to stderr, that communicates the end state of the inject command. Currently this includes checking for hostNetwork and unsupported resources. Malformed YAML documents will continue to cause no YAML output, and return error code 1. This change also modifies integration tests to handle stdout and stderr separately. example outputs... some pods injected, none with host networking: ``` hostNetwork: pods do not use host networking...............................[ok] supported: at least one resource injected..................................[ok] Summary: 4 of 8 YAML document(s) injected deploy/emoji deploy/voting deploy/web deploy/vote-bot ``` some pods injected, one host networking: ``` hostNetwork: pods do not use host networking...............................[warn] -- deploy/vote-bot uses "hostNetwork: true" supported: at least one resource injected..................................[ok] Summary: 3 of 8 YAML document(s) injected deploy/emoji deploy/voting deploy/web ``` no pods injected: ``` hostNetwork: pods do not use host networking...............................[warn] -- deploy/emoji, deploy/voting, deploy/web, deploy/vote-bot use "hostNetwork: true" supported: at least one resource injected..................................[warn] -- no supported objects found Summary: 0 of 8 YAML document(s) injected ``` TODO: check for UDP and other init containers Part of #1516 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-09-10 10:34:25 -07:00
Kevin Lingerfelt	f884caf56d	Upgrade protobuf to v1.2.0 (#1591 ) * Upgrade protobuf to v1.2.0 * Fix Gopkg.lock * Switch linkerd2-proxy-api dep back to stable Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-06 11:36:29 -07:00
Kevin Lingerfelt	b5ff29c8aa	Add data plane check to validate proxy version (#1574 ) Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-09-04 15:22:38 -07:00
Risha Mars	249b51f950	Increase MaxRps in Tap server, remove default setting from Web (#1560 ) Increase the MaxRps on the tap server to 100 RPS. The max RPS for tap/top was increased for the CLI in #1531, but we were still manually setting this to 1 RPS in the Web UI and Web server. Remove the pervasive setting of MaxRps to 1 in the web frontend and server	2018-08-30 13:37:37 -07:00
Alex Leong	0f7d684ca9	Increase default max-rps for tap and top (#1531 ) The default value for the max-rps argument to the tap and top commands is an overly conservative 1rps. This causes the data to come in very slowly and much data to be discarded. Furthermore, because tap requests are windowed to 10 seconds, this causes long pauses between updates. We fix this in two ways. Firstly we reduce the window size to 1s so that updates will come in at least once per second, even when the actual RPS of the data path is extremely high. Secondly, we increase the default max-rps parameter from 1 to 100. This allows tap to paint an accurate picture of the data much more quickly and sidesteps some sampling bias that happens when the max-rps is low. In general, tap events tend to happen in bursts. For example, one request in may trigger one or more requests out. Likewise, a single upstream event may trigger several requests to the tapped pod in quick succession. Sampling bias will occur when the max-rps is less than the actual rps and when the tap event limit subdivides these event bursts (biasing towards the first few events in the burst). The greater the max-rps, the less the effects of this bias. Fixes #1525 Signed-off-by: Alex Leong <alex@buoyant.io>	2018-08-28 14:16:39 -07:00
Risha Mars	fff09c5d06	Only tap pods that are meshed (#1535 ) Previously, we would tap any resource's pods, regardless of whether the pods were meshed or not. We can't actually tap non-meshed pods, so I'm adding a check that will filter out non-meshed pods from the pods that tap watches. Previous behaviour: When attempting to hang a non meshed pod, it would establish a watch on the pods, but then never return any results. In the CLI you could just cancel it with Ctrl-C. In the web, clicking Stop would send a WebSocket.close(1000) but wouldn't actually close the connection... Behaviour after change : If no pods under the specified resource are meshed, it'll return an error of no pods being found to tap	2018-08-28 09:59:52 -07:00
Eliza Weisman	efabd90ff7	Fix missing ns/svc labels in metadata hydrated by Tap server (#1496 ) Fixes #1493. When the tap server hydrates metadata for the source or destination peer of a Tap event from the peer's IP address, it doesn't currently add a namespace label. However, destinations labeled by the proxy do have such a label. This is because the tap server currently gets the hydrated labels from the `GetPodLabels` function, which is also used by the Destination service for labeling the individual endpoints in a `WeightedAddrSet` response. However, the Destination service also adds some labels to all the endpoints in the set, including the namespace and service, so `GetPodLabels` doesn't return these labels. However, when the tap server uses that function, it does not add the service or namespace labels. This branch fixes this issue by adding those labels to the Tap event after calling `GetPodLabels`. In addition, it fixes a missing space between the `src/dst_res` and `src/dst_ns` labels in Tap CLI output with the `-o wide` flag set. This issue was introduced during the review of #1437, but was missed at the time because the namespace label wasn't being set correctly. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-08-20 18:09:34 -07:00
Kevin Lingerfelt	e97be1f5da	Move all healthcheck-related code to pkg/healthcheck (#1492 ) * Move all healthcheck-related code to pkg/healthcheck * Fix failed check formatting * Better version check wording Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-08-20 16:50:22 -07:00
Eliza Weisman	b8434d60d4	Add resource metadata to Tap CLI output (#1437 ) Closes #1170. This branch adds a `-o wide` (or `--output wide`) flag to the Tap CLI. Passing this flag adds `src_res` and `dst_res` elements to the Tap output, as described in #1170. These use the metadata labels in the tap event to describe what Kubernetes resource the source and destination peers belong to, based on what resource type is being tapped, and fall back to pods if either peer is not a member of the specified resource type. In addition, when the resource type is not `namespace`, `src_ns` and `dst_ns` elements are added, which show what namespaces the source and destination peers are in. For peers which are not in the Kubernetes cluster, none of these labels are displayed. The source metadata added in #1434 is used to populate the `src_res` and `src_ns` fields. Also, this branch includes some refactoring to how tap output is formatted. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-08-20 14:25:26 -07:00
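
Note on commit 34d9eef03e (proxy injector: insert at end of arrays): that message contrasts JSON patch paths ending in `/0` and `/-`. The sketch below illustrates only the difference as RFC 6902 defines it for the `add` operation. It is not the injector's code; the function `apply_add` and the bare `linkerd-proxy` container entry are hypothetical names made up for this illustration, and the sketch does not reproduce the merging behavior the commit observed on Kubernetes `v1.9.11`.

```python
import copy


def apply_add(doc, path, value):
    """Apply one RFC 6902 "add" operation and return a patched copy.

    Hypothetical helper for illustration; supports only non-empty
    JSON Pointer paths that start with "/".
    """
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    patched = copy.deepcopy(doc)
    # Split the pointer into reference tokens and undo ~1 and ~0 escaping.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]

    # Walk to the parent of the target location.
    parent = patched
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]

    last = tokens[-1]
    if isinstance(parent, list):
        if last == "-":
            # "-" refers to the position after the last element: append.
            parent.append(value)
        else:
            index = int(last)
            if index < 0 or index > len(parent):
                raise IndexError("array index out of range for add")
            # A numeric index inserts before the element currently there.
            parent.insert(index, value)
    else:
        parent[last] = value
    return patched


def container_names(deployment):
    return [c["name"] for c in deployment["spec"]["template"]["spec"]["containers"]]


deployment = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "vote-bot", "command": ["emojivoto-vote-bot"]},
    ]}}}
}
proxy = {"name": "linkerd-proxy"}  # placeholder, not the real container spec

at_front = apply_add(deployment, "/spec/template/spec/containers/0", proxy)
at_end = apply_add(deployment, "/spec/template/spec/containers/-", proxy)

print(container_names(at_front))  # ['linkerd-proxy', 'vote-bot']
print(container_names(at_end))    # ['vote-bot', 'linkerd-proxy']
```

With the `/-` form the existing `vote-bot` container keeps index 0 and the new entry lands after it, which matches the commit's stated intent of injecting at the end of lists.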

225 Commits