linkerd2

Commit Graph

Author	SHA1	Message	Date
Scott Fleener	3847f9cf13	Set minimum TLS version to 1.3 (#13500 ) This helps ensure a minimum level of security. The two places this affects is our controller webhook and linkerd-viz tap API. The controller requires that kube-api supports TLSv1.3, which it does as of 1.19 (our minimum is currently 1.22). The linkerd-viz tap API is mostly used internally, and is deprecated. It may be worth revisiting if we want to keep it around at all. Signed-off-by: Scott Fleener <scott@buoyant.io>	2024-12-19 09:19:09 -05:00
Oliver Gould	9bd16f3b3b	chore: update Go code for new lints (#13437 ) Before updating our dev image with a new version of golangci-lint, this change updates our Go code to satisfy new lints. No functional changes.	2024-12-06 07:14:17 -08:00
dependabot[bot]	d42432914d	build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0 (#12593 ) * build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0 Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.63.2 to 1.64.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](https://github.com/grpc/grpc-go/compare/v1.63.2...v1.64.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> I've replaced all the `grpc.Dial` calls with `grpc.NewClient`. There was one call `grpc.DialContext(ctx,...)` in `viz/tap/api/grpc_server.go` that also got replaced with `grpc.NewClient`, which loses the ability to pass `ctxt` but that doesn't matter; as we're not using `WithBlock(true)` that context wasn't being accounted for when we were using `DialContext()` anyways. https://github.com/grpc/grpc-go/blob/v1.64.0/clientconn.go#L216-L242 --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>	2024-05-22 14:40:04 -05:00
Matei David	407df01ec3	chore(controller): Remove stream concurrency limits (#12598 ) Our gRPC servers use the default gRPC server configuration, which limits the number of concurrent streams to 100. Since the controllers run with proxies, this provides a hard scaling limit for the number of watches an application can have. This change updates our gRPC server configuration to clear the default concurrency limit, allowing the server to handle as many streams as possible. Signed-off-by: Matei David <matei@buoyant.io> Co-authored-by: Oliver Gould <ver@buoyant.io>	2024-05-15 18:07:15 +01:00
Alejandro Pedraza	b697e285a0	Refactor IPv4-only functions to also work for IPv6 (#12303 ) The main change here is the refactoring of the address functions in `addr.go` that support the Destination controller and Viz's Tap controller. Some of those functions only worked for IPv4, so this change refactored them to make them IP family agnostic. This enabled adding (and fixing) IPv6 unit tests as detailed in the following sections. Other changes: - The `ProxyAddressesToString()` function was no longer used, so it got removed. - The `ProxyIPToString()` function was only used by the destination-client script, so that got stripped out. ## `addr_test.go` We added IPv6 cases to each test, that would have failed previously. ## `endpoint_translator_test.go` One of the test pods (pod3) was changed to have an IPv6. Without the other changes in this PR those tests would still have passed, but just because when comparing actual IPs with expected ones we weren't checking if they were both zero. So here we added checks against that. ## `server_test.go` As above, we added checks against empty IPs. And in the mocked resources in `test_util.go` we added an IPv6 EndpointSlice.	2024-03-22 07:20:52 -05:00
Andrew Seigner	ff25a71d90	Remove shortnames from Tap API resources (#11816 ) The Tap API resource shortnames were colliding with existing Kubernetes resources (e.g. `po`, `deploy`, etc), causing warnings from kubectl v1.29.0+. Remove the shortnames from the Tap APIService handlers. To validate: ```bash bin/k3d cluster create # install latest edge curl https://run.linkerd.io/install-edge \| sh linkerd install --crds \| kubectl apply -f - linkerd install \| kubectl apply -f - linkerd check linkerd viz install \| kubectl apply -f - linkerd check # observe shortnames kubectl api-resources --api-group=tap.linkerd.io # with kubectl v1.29.0+, observe "Warning: short name..." kubectl get po # replace tap image TAP_IMAGE=$(bin/docker-build-tap) bin/k3d image load $TAP_IMAGE kubectl -n linkerd-viz set image deploy/tap tap=$TAP_IMAGE # verify shortnames are no longer present kubectl api-resources --api-group=tap.linkerd.io # with kubectl v1.29.0+, observe no warning kubectl get po ``` Fixes linkerd/linkerd2#11784 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2024-01-04 06:37:36 -05:00
Alex Leong	368b63866d	Add support for remote discovery (#11224 ) Adds support for remote discovery to the destination controller. When the destination controller gets a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` label, this is an indication that the destination controller should discover the endpoints for this service from a remote cluster. The destination controller will look for a remote cluster which has been linked to it (using the `linkerd multicluster link` command) with that name. It will look at the `multicluster.linkerd.io/remote-discovery` label for the service name to look up in that cluster. It then streams back the endpoint data for that remote service. Since we now have multiple client-go informers for the same resource types (one for the local cluster and one for each linked remote cluster) we add a `cluster` label onto the prometheus metrics for the informers and EndpointWatchers to ensure that each of these components' metrics are correctly tracked and don't overwrite each other. --------- Signed-off-by: Alex Leong <alex@buoyant.io>	2023-08-11 09:31:45 -07:00
Ryan Hristovski	5d8e5e0959	Add support for customizable ignore headers list in Linkerd Tap (#10443 ) Currently, Linkerd Tap displays all headers by default, which can contain irrelevant or sensitive information that users may want to exclude from being displayed. Therefore, there is a need for a feature that allows users to set their own ignore headers list, so that they can choose which headers to exclude from the output of Linkerd Tap on the dashboard. This commit adds a new helm value, `tap.ignoredHeaders`, to the Linkerd-viz chart. This value allows users to specify a comma-separated list of headers to be ignored by Linkerd Tap. The default list includes commonly irrelevant and sensitive headers such as authorization tokens and cookies. Once merged, every comma-seperated value in `tap.ignoredHeaders` will be removed from the headers in the Tap Dashboard when expanding a request headers result. Fixes #10389 Signed-off-by: Ryan Hristovski <ryan.hristovski@docker.com>	2023-03-30 16:16:16 -07:00
cui fliter	8c6de42210	all: fix some comments (#10387 ) Signed-off-by: cui fliter <imcusg@gmail.com>	2023-03-01 11:47:02 +00:00
Oliver Gould	363e123d79	Update to dev:v39 with Go 1.19 (#10336 )	2023-02-16 08:25:42 -08:00
Alex Leong	b0778bb2ea	Readiness checks fail until caches are synced (#10166 ) Fixes https://github.com/linkerd/linkerd2/issues/10036 The Linkerd control plane components written in go serve liveness and readiness probes endpoint on their admin server. However, the admin server is not started until k8s informer caches are synced, which can take a long time on large clusters. This means that liveness checks can time out causing the controller to be restarted. We start the admin server before attempting to sync caches so that we can respond to liveness checks immediately. We fail readiness probes until the caches are synced. Signed-off-by: Alex Leong <alex@buoyant.io>	2023-01-25 11:43:09 -08:00
Alejandro Pedraza	7428d4aa51	Removed dupe imports (#10049 ) * Removed dupe imports My IDE (vim-gopls) has been complaining for a while, so I decided to take care of it. Found via [staticcheck](https://github.com/dominikh/go-tools) * Add stylecheck to go-lint checks	2023-01-10 14:34:56 -05:00
Oliver Gould	54d2bcb0ec	controller: Increase HTTP ReadHeaderTimeout to 15s (#9272 ) `04a66ba` added a `ReadHeaderTimeout` to our HTTP servers (at gosec's insistence). We chose a fairly arbitrary timeout of 10s. This configuration causes any connection that has been idle for 10s to be torn down by the server. Unfortunately, this timeout value matches the default Kubernetes probe interval and the default linkerd-viz scrape interval. This can cause probes to race the timeout so that the connection is healthy from the proxy's point of view and a request is sent on the connection exactly as the server drops the connection. These request failures cause controller success rate to appear degraded. To correct this, this change raises the timeout to 15s so that the timeout no longer matches the default probe interval. The proxy's HTTP client is supposed to [retry] requests that encounter this type of error. We should follow up by doing more research into why that is not occurring in this situation. [retry]: https://docs.rs/hyper/0.14.20/hyper/client/struct.Builder.html#method.retry_canceled_requests Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-08-26 13:33:38 -07:00
Oliver Gould	04a66bacea	Set a header read timeout on HTTP servers (#9181 ) Newer versions of golangci-lint flag `http.Server` instances that do not set a `ReadHeaderTimeout` as being vulnerable to "slowloris" attacks, wherein clients initiate requests that hold connections open indefinitely. This change sets a `ReadHeaderTimeout` of 10s. This timeout is fairly conservative so that clients can eagerly create connections, but is still constrained enough that these connections won't remain open indefinitely. This change also updates kubert to v0.9.1, which instruments a header read timeout on the policy admission server. Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-08-16 11:10:23 -07:00
Steve Zhang	f8413f805c	Generate ipv4 and ipv6 compat URL address for Linkerd2 (#8598 ) Introduce change to generate dual-stack compatible URL addresses. This change refactors some of the code to use Go's standard net library to generate IPv4 and IPv6 compatible addresses. The main components affected by this change: * multicluster probe worker * port-forwarding package * viz tap api Signed-off-by: zhlsunshine <huailong.zhang@intel.com>	2022-06-06 12:19:47 +01:00
Oliver Gould	fa8ddb4801	Use go-test/deep for comparisons in tests (#8427 ) We frequently compare data structures--sometimes very large data structures--that are difficult to compare visually. This change replaces uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal` returns a diff of values that are not equal. Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-05-05 09:31:07 -07:00
Kevin Leimkuhler	388f14f48f	allow pprof to be configurable via helm flags (#8090 ) Follow-up to #8087 that allows pprof to be enabled via the `--set enablePprof=true` flag. Each control plane components spawns its own admin server, so each of these received it's own `enable-pprof` flag. When `enablePprof=true`, it is passed through to each component so that when it launches its admin server, its pprof endpoints are enabled. A note on the templating: `-enable-pprof={{.Values.enablePprof \| default false}}`. `false` values are not rendered by Helm so without the `... \| default false}}`, it tries to pass the flag as `-enable-pprof=""` which results in an error. Inlining this felt better than conditionally passing the flag with ```yaml {{ if .Values.enablePprof -}} -enable-pprof={{.Values.enablePprof}} {{ end -}} ``` Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2022-03-22 14:31:04 -06:00
Oliver Gould	619cd2641e	tap: Quote logged errors for CWE-117 (#8044 ) When the Tap controller logs errors, it can potentially log error messages that include data that originates from the network. This makes these logs succeptible to log forgery ([CWE-117]). This change updates log formatting to use `%q` so that newlines, etc are escaped. [CWE-117]: https://cwe.mitre.org/data/definitions/117.html Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-03-14 07:25:21 -07:00
Kevin Leimkuhler	fc2032fb8e	enable `staticcheck` (#8037 ) Closes #7881 This makes the rest of the necessary fixes to satisfy the `staticcheck` lint. The only class of lints that are being skipped are those related to deprecated tap code. There was some discussion on the original change started by @adleong about if this _actually_ deprecated [here](https://github.com/linkerd/linkerd2/pull/3240#discussion_r313634584); it doesn't look like we every came back around to fully removing it but I don't think it should be a blocker for enabling the lint right now. Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2022-03-10 15:43:35 -08:00
Kevin Leimkuhler	67bcd8f642	Add `gosec` and `errcheck` lints (#7954 ) Closes #7826 This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed my individual changes, but this enables them by default so that all future changes are caught ahead of time. A significant amount of these lints are been exluced by the various `exclude-rules` rules added to `.golangci.yml`. These include operations are files that generally do not fail such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword. Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover. Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them. Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>	2022-03-03 10:09:51 -07:00
Oliver Gould	f5876c2a98	go: Enable `errorlint` checking (#7885 ) Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks that error formatting and inspection is wrapping-aware. This change enables `errorlint` in golangci-lint and updates all error handling code to pass the lint. Some comparisons in tests have been left unchanged (using `//nolint:errorlint` comments). [el]: https://github.com/polyfloyd/go-errorlint Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-02-16 18:32:19 -07:00
Oliver Gould	e03f6182f4	Require use of at least TLS v1.2 (#7837 ) In several places where we build TLS servers (usually in our webhooks), we use the default TLS configuration, which enables legacy versions of TLS. This change updates these servers to specify a minimum TLS version of v1.2. Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-02-07 19:13:02 -08:00
Oliver Gould	a24aa51639	tap: Fix potential log forgery in debug logging (#7819 ) Our `SubjectAccessReview` debug logging can include newlines in user-provided data. This change ensures these values are quoted/escaped. Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-02-07 15:06:52 -08:00
Oliver Gould	f93ed670a9	tap: Fix typo in filename (#7820 ) The file named `sever.go` should be named `server.go`. Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-02-07 13:45:18 -08:00
Oliver Gould	ab70db014c	Fix log forgery issues in production-facing code (#7664 ) CodeQL has caught several instances where we may be susceptible to [log forgery][cql]. This change ensures that we strip newlines from log messages that include potentially user-supplied strings. Several redundant error logs are removed--we should generally not log an error when returning an error. Errors should be logged where they are handled. This change also properly escapes URL paths when constructing them from protobuf messages. Note that CodeQL continued to mark some of these uses as issues, but we've marked them as false-positive. See github/codeql-go#635 and github/codeql-go#650. [cql]: https://codeql.github.com/codeql-query-help/go/go-log-injection/ Signed-off-by: Oliver Gould <ver@buoyant.io>	2022-01-24 10:18:39 -08:00
Stepan Rabotkin	5e6a1b5508	Graceful shutdown for admin server (#6817 ) * Graceful shutdown for admin server Signed-off-by: Stepan Rabotkin <epicstyt@gmail.com>	2021-09-07 10:50:31 -05:00
Alex Leong	24792cfd1c	Remove core dependency on viz (#6497 ) Fixes #5589 The core control plane has a dependency on the viz package in order to use the `BuildResource` function. This "backwards" dependency means that the viz source code needs to be included in core docker-builds and is bad for code hygiene. We move the `BuildResource` function into the viz package. In `cli/cmd/metrics.go` we replace a call to `BuildResource` with a call directly to `CanonicalResourceNameFromFriendlyName`. Signed-off-by: Alex Leong <alex@buoyant.io>	2021-07-19 14:28:45 -07:00
dependabot[bot]	789aeea561	Fix gRPC servers (#6510 ) Bump github.com/linkerd/linkerd2-proxy-api from 0.1.18 to 0.2.0 Bumps [github.com/linkerd/linkerd2-proxy-api](https://github.com/linkerd/linkerd2-proxy-api) from 0.1.18 to 0.2.0. - [Release notes](https://github.com/linkerd/linkerd2-proxy-api/releases) - [Changelog](https://github.com/linkerd/linkerd2-proxy-api/blob/main/CHANGES.md) - [Commits](https://github.com/linkerd/linkerd2-proxy-api/compare/v0.1.18...v0.2.0) --- updated-dependencies: - dependency-name: github.com/linkerd/linkerd2-proxy-api dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Oliver Gould <olix0r@gmail.com> Co-authored-by: Oliver Gould <ver@buoyant.io> Co-authored-by: Oliver Gould <olix0r@gmail.com>	2021-07-19 10:24:23 -05:00
dependabot[bot]	a3c21d7aad	Bump github.com/prometheus/common from 0.10.0 to 0.29.0 (#6327 ) * Bump github.com/prometheus/common from 0.10.0 to 0.29.0 Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.10.0 to 0.29.0. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](https://github.com/prometheus/common/compare/v0.10.0...v0.29.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Alex Leong <alex@buoyant.io> Co-authored-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2021-06-22 09:56:12 -06:00
Alex Leong	948f9a4ece	Update protoc (#6333 ) Update protoc from 3.6.0 to 3.15.7 Signed-off-by: Alex Leong <alex@buoyant.io>	2021-06-21 16:37:57 -07:00
Oliver Gould	ab32c4e32b	tap: Avoid debug logging of headers values (#6144 ) CodeQL flags our debug-logging of user and group header values. While this information isn't strictly sensitive, it isn't really necessary, either--we've haven't used any of this logged information for practical diagnostics in years, if then. And because these header names are read as configuration and not hardcoded, there's no way for us to be really _sure_ that these headers are safe to log. So, I propose that we just stop logging them and instead log the header names instead :). While we're here, we should use `Header.Values()` instead of direct slice access. `Values()` ensures that headers are encoded in the canonical MIME format ("Train-Case") to ensure case-insenstive comparison. [1]: https://codeql.github.com/codeql-query-help/go/go-clear-text-logging/	2021-05-24 09:47:36 -07:00
Shubhendra Singh Chauhan	ad3b9accd8	fix: issues affecting code quality (#5827 ) Fix various lint issues: - Remove unnecessary calls to fmt.Sprint - Fix check for empty string - Fix unnecessary calls to Printf - Combine multiple `append`s into a single call Signed-off-by: shubhendra <withshubh@gmail.com>	2021-03-15 17:35:40 -04:00
Alex Leong	57d851b434	Report better errors for pods with tap disabled (#5799 ) Fixes https://github.com/linkerd/linkerd2/discussions/5777 When a user runs `linkerd viz check --proxy`, it will print a warning if there are any proxies which cannot be tapped. This is a normal state of affairs after freshly installing the linkerd-viz extensions because any existing pods will need to be restarted before they can be tapped. The check warning may lead users to falsely believe that something has gone wrong with their installation. We remove this specific check from `linkerd viz check --proxy`. To replace it, we improve the error output when attempting to tap a resource which is not tappable. This gives the user actionable feedback when the tap command fails. Old: ```console > linkerd viz tap -n emojivoto deploy/vote-bot no pods to tap for deployment/vote-bot ``` New: ```console > linkerd viz tap -n emojivoto deploy/vote-bot no pods to tap for deployment/vote-bot 1 pods found with tap not enabled: * vote-bot-64dd87cb87-7mcv4 restart these pods to enable tap and make them valid tap targets ``` Signed-off-by: Alex Leong <alex@buoyant.io>	2021-02-24 12:32:46 -08:00
Mayank Shah	96e078421c	CLI: Remove the `--disable-tap` flag from inject (#5671 ) Fixes https://github.com/linkerd/linkerd2/issues/5664 - Remove `--disable-flag` from `inject` - Move `config.linkerd.io/disable-tap` to `viz.linkerd.io/disable-tap` Signed-off-by: Mayank Shah <mayankshah1614@gmail.com>	2021-02-11 10:01:53 -05:00
Kevin Leimkuhler	75fcc9d623	Move tap from core into Viz extension (#5651 ) Closes #5545. This change moves all tap and tap-injector code into the viz directory. The tap and tap-injector components now also use a new tap image—separating these components from the controller image that they are currently part of. This means the controller image has removed all its build dependencies related to tap. Finally, the tap Protobuf has been separated from the metrics-api and moved into it's own `.proto` file and gen directory. This introduces a clear split between metrics-api and tap Protobuf. There is no change in behavior for the `viz tap` command. ### Reviewing #### Docker images All the bin directory scripts should be updated to build and load the tap image. All the CI workflows should be updated to build and push the tap image. #### Controller and pkg directories This is primarily deletions. Most of the deleted code in this directory is now in the tap directory of the Viz extension. #### viz/tap This is the location that all the tap related code now lives in. New files are mostly moved from the controller and pkg directories. Imports have all been updated to point at the right locations and Protobuf. The Protobuf here is taken from metrics-api and contains all tap-related Protobuf. Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>	2021-02-09 12:43:21 -05:00

35 Commits