linkerd2

Commit Graph

Author	SHA1	Message	Date
Brian Smith	b114ef6819	Add initial infrastructure for optionally accepting TLS connections (#1047 ) * Add initial infrastructure for optionally accepting TLS connections. If the environment gives us the paths to the certificate chain and private key then use TLS for all accepted TCP connections. Otherwise, continue on using plaintext for all accepted TCP connections. The default behavior--no TLS--isn't changed. Later we'll make this smarter by adding protocol detection so that when the TLS configuration is available, we'll accept both TLS and non-TLS connections. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-05-31 12:20:57 -10:00
Oliver Gould	e91699bba2	proxy/router: Implement LRU cache eviction (#925 ) The router's cache has no means to evict unused entries when capacity is reached. This change does the following: - Wraps cache values in a smart pointer that tracks the last time of access for each entry. The smart pointer updates the access time when the reference to entry is dropped. - When capacity is not available, all nodes that have not been accessed within some minimal idle age are dropped. Accesses and updates to the map are O(1) when capacity is available. Reclaiming capacity is O(n), so it's expected that the router is configured with enough capacity such that capacity need not be reclaimed usually.	2018-05-10 19:06:31 -07:00
Oliver Gould	63fbbd6931	proxy: Parse units with duration configurations (#909 ) Configuration values that take durations are currently specified as time values with no units. So `600` may mean 600ms in some contexts and 10 minutes in others. In order to avoid this problem, this change now requires that configurations provide explicit units for time values such as '600ms' or '10 minutes'. Fixes #27.	2018-05-08 13:54:12 -07:00
Oliver Gould	68e203a2fc	proxy: Use Duration types for config defaults (#906 ) It's easy to misconfigure default durations, since they're recorded as integers and converted to Durations separately. Now, all default constants that represent durations use const `Duration` instances (enabled by a recent Rust release). This fixes #905 which was caused by using the wrong time unit for the metrics retain time.	2018-05-08 10:58:22 -07:00
Oliver Gould	02e6d018d0	proxy: Bound on router capacity (#898 ) Currently, the proxy may cache an unbounded number of routes. In order to prevent such leaks in production, new configurations are introduced to limit the number of inbound and outbound HTTP routes. By default, we support 100 inbound routes and 10K outbound routes. In a followup, we'll introduce an eviction strategy so that capacity can be reclaimed gracefully.	2018-05-04 16:32:30 -07:00
Oliver Gould	ada5cb267e	proxy: Expire metrics that have not been updated for 10 minutes (#880 ) The proxy is now configured with the CONDUIT_PROXY_METRICS_RETAIN_IDLE environment variable that dictates the amount of time that the proxy will retain metrics that have not been updated. A timestamp is maintained for each unique set of labels, indicating the last time that the scope was updated. Then, when metrics are read, all metrics older than CONDUIT_PROXY_METRICS_RETAIN_IDLE are dropped from the stats registry. A ctx::test_utils module has been added to aid testing. Fixes #819	2018-04-30 16:11:12 -07:00
Oliver Gould	cc44db054f	Remove NODE_NAME and POD_NAME env usage (#758 ) * proxy: Remove pod_name and node_name * cli: Do not inject POD_NAME and NODE_NAME env vars	2018-04-13 13:09:51 -07:00
Oliver Gould	efdfc93b50	Stop pushing telemetry reports from the proxy (#616 ) Now that the controller does not depend on pushed telemetry reports, the proxy need not depend on the telemetry API or maintain legacy sampling logic.	2018-04-12 17:39:29 -07:00
Brian Smith	7d3b715c4d	Proxy: Move DNS name normalization to service discovery (#722 ) Only the destination service needs normalized names (and even then, that's just temporary). The rest of the code needs the name as it was given, except case-normalized (lowercased). Because DNS fallback isn't implemented in service discovery yet, Outbound still uses a temporary workaround using FullyQualifiedName to keep things working; that will be removed once DNS fallback is implemented in service discovery. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-06 15:04:09 -10:00
Brian Smith	7bc4ffd0a4	Revert "Proxy: Refactor DNS name parsing and normalization (#673 )" (#700 ) This reverts commit `311ef410a8`. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 16:49:32 -10:00
Brian Smith	311ef410a8	Proxy: Refactor DNS name parsing and normalization (#673 ) Proxy: Refactor DNS name parsing and normalization Only the destination service needs normalized names (and even then, that's just temporary). The rest of the code needs the name as it was given, except case-normalized (lowercased). Because DNS fallback isn't implemented in service discovery yet, Outbound still uses a temporary workaround using FullyQualifiedName to keep things working; that will be removed once DNS fallback is implemented in service discovery. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 12:32:12 -10:00
Sean McArthur	47f9665b8e	proxy: allow disabling protocol detection on specific ports (#648 ) - Adds environment variables to configure a set of ports that, when an incoming connection has an SO_ORIGINAL_DST with a port matching, will disable protocol detection for that connection and immediately start a TCP proxy. - Adds a default list of well known ports: SMTP and MySQL. Closes #339	2018-04-02 14:24:36 -07:00
Eliza Weisman	1c9ce4d118	Add Prometheus /metrics endpoint to proxy (#569 ) This PR adds an endpoint to the proxy that serves metrics in Prometheus' text exposition format. The endpoint currently serves the `request_total`, `response_total`, `response_latency_ms`, and `response_duration_ms` metrics, as described in #536. The endpoint's port and address are configurable with the `CONDUIT_PROXY_METRICS_LISTENER` environment variable. Tests have been added in `tests/telemetry.rs`	2018-03-21 16:19:32 -07:00
Brian Smith	649e784d9c	Simplify cluster zone suffix handling in the proxy (#528 ) * Temporarily stop trying to support configurable zones in the proxy. None of the zone configuration is tested and lots of things assume the cluster zone is `cluster.local`. Further, how exactly the proxy will actually learn the cluster zone hasn't been decided yet. Just hard-code the zone as "cluster.local" in the proxy until configurable zones are fully implemented and tested to be working correctly. Signed-off-by: Brian Smith <brian@briansmith.org> * Remove the CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN setting The way that Kubernetes configures DNS search suffixes has some negative consequences as some names like "example.com" are ambiguous: depending on whether there is a service "example" in the "com" namespace, "example.com" may refer to an external service or an internal service, and this can fluctuate over time. In recognition of that we added the CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN setting, thinking this would be part of a solution for users to opt out of the unfortunate behavior if their applications didn't depend on the DNS search suffix feature. It turns out similar effects can be achieved using a custom dnsConfig, starting in Kubernetes 1.10 when dnsConfig reaches the beta stability level. Now any CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN-based solution seems duplicative. Further, attempting to support it optionally made the code complex and hard to read. Therefore, let's just remove it. If/when somebody actually requests this functionality then we can add it back, if dnsConfig isn't a valid alternative for them. Signed-off-by: Brian Smith <brian@briansmith.org> * Further hard-code "cluster.local" as the zone, temporarily. Addresses review feedback. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-03-07 14:30:13 -10:00
Brian Smith	72c6a9cab2	Proxy: Make CONDUIT_PROXY_POD_NAMESPACE a required parameter. (#527 ) We will be able to simplify service discovery in the near future if we can rely on the namespace being available. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-03-07 11:12:05 -10:00
Eliza Weisman	ad073c79b9	Remove connect timeouts from Bind (#487 ) Currently, the `Reconnect` middleware does not reconnect on connection errors (see #491) and treats them as request errors. This means that when a connection timeout is wrapped in a `Reconnect`, timeout errors are treated as request errors, and the request returns HTTP 500. Since this is not the desired behavior, the connection timeouts should be removed, at least until their errors can be handled differently. This PR removes the connect timeouts from `Bind`, as described in https://github.com/runconduit/conduit/pull/483#issuecomment-369380003. It removes the `CONDUIT_PROXY_PUBLIC_CONNECT_TIMEOUT_MS` environment variable, but _not_ the `CONDUIT_PROXY_PRIVATE_CONNECT_TIMEOUT_MS` variable, since this is also used for the TCP connect timeouts. If we also want to remove the TCP connection timeouts, I can do that as well. Closes #483. Fixes #491.	2018-03-05 15:38:20 -08:00
Brian Smith	8607875267	Stop using the url crate in the proxy. (#450 ) Version 1.7.0 of the url crate seems to be broken which means we cannot `cargo update` the proxy without locking url to version 1.6. Since we only use it in a very limited way anyway, and since we use http::uri for parsing much more, just switch all uses of the url crate to use http::uri for parsing instead. This eliminates some build dependencies. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-02-26 08:55:48 -10:00
Eliza Weisman	694f691b71	Add timeout to Outbound::bind_service (#436 ) Closes #403. When the Destination service does not return a result for a service, the proxy connection for that service will hang indefinitely waiting for a result from Destination. If, for example, the requested name doesn't exist, this means that the proxy will wait forever, rather than responding with an error. I've added a timeout wrapping the service returned from `<Outbound as Recognize>::bind_service`. The timeout can be configured by setting the `CONDUIT_PROXY_BIND_TIMEOUT` environment variable, and defaults to 10 seconds (because that's the default value for [a similar configuration in Linkerd](https://linkerd.io/config/1.3.5/linkerd/index.html#router-parameters)). Testing with @klingerf's reproduction from #403: ``` curl -sIH 'Host: httpbin.org' $(minikube service proxy-http --url)/get | head -n1 HTTP/1.1 500 Internal Server Error ``` proxy logs: ```rust proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy using controller at HostAndPort { host: Domain("proxy-api.conduit.svc.cluster.local"), port: 8086 } proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy routing on V4(127.0.0.1:4140) proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy proxying on V4(0.0.0.0:4143) to None proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy::transport::connect "controller-client", DNS resolved proxy-api.conduit.svc.cluster.local to 10.0.0.240 proxy-5698f79b66-8rczl conduit-proxy ERR! conduit_proxy::map_err turning service error into 500: Inner(Timeout(Duration { secs: 10, nanos: 0 })) ```	2018-02-26 10:18:35 -08:00
Sean McArthur	236f71fbe0	proxy: use original dst if authority doesn't look like local service (#397 ) The proxy will check that the requested authority looks like a local service, and if it doesn't, it will no longer ask the Destination service about the request, instead just using the SO_ORIGINAL_DST, enabling egress naturally. The rules used to determine if it looks like a local service come from this comment: > If default_zone.is_none() and the name is in the form $a.$b.svc, or if !default_zone.is_none() and the name is in the form $a.$b.svc.$default_zone, for some a and some b, then use the Destination service. Otherwise, use the IP given.	2018-02-20 18:09:21 -08:00
Brian Smith	02176d8d16	Remove default controller URL from proxy. (#48 ) Previously there was a default controller URL in the proxy. This default was never used for any proxy injected by `conduit inject` and it was the wrong default when using the proxy outside of Kubernetes. Also more generally this is such an important setting in terms of correctness and security that it was dangerous to let it be implied in any context. Remove the default, requiring that it be set in order for the proxy to start.	2018-01-02 08:44:27 -10:00
Sky Ao	238c54414b	correct typo: Enviroment -> Environment (#100 ) Signed-off-by: Sky Ao <aoxiaojian@gmail.com>	2017-12-29 10:14:48 -08:00
Brian Smith	8385a7a8c1	Proxy: Map unqualified/partially-qualified names to FQDN (#59 ) * Proxy: Map unqualified/partially-qualified names to FQDN Previously we required the service to fully qualify all service names for outbound traffic. Many services are written assuming that Kubernetes will complete names using its DNS search path, and those services weren't working with Conduit. Now add an option, used by default, to fully-qualify the domain names. Currently only Kubernetes-like name completion for services is supported, but the configuration syntax is open-ended to allow for alternatives in the future. Also, the auto-completion can be disabled for applications that prefer to ensure they're always using unambiguous names. Once routing is implemented then it is likely that (default) routing rules will replace these hard-coded rules. Unit tests for the name completion logic are included. Part of the solution for #9. The changes to `conduit inject` to actually use this facility will be in another PR.	2017-12-19 11:59:26 -10:00
Brian Smith	81fb0fea5f	Move default private connect timeout to `Config` (#42 ) Previously the default value of this setting was in lib.rs instead of being automatically set in `Config` like all the other defaults, which was inconsistent and confusing. Fix this by moving the defaulting logic to `Config`. Validated by running the test suite.	2017-12-13 21:15:21 -06:00
Brian Smith	0185522821	Proxy: Parse environment variables in one place (#26 ) Previously `Process` did its own environment variable parsing and did not benefit from the improved error handling that `config` now has. Additionally, future changes will need access to these same environment variables in other parts of the proxy. Move `Process`'s environment variable parsing to `config` to address both of these issues. Now there are no uses of `env::var` outside of `config` except for logging, which is the final desired state. I validated this manually.	2017-12-13 19:33:37 -06:00
Brian Smith	559f4a76fb	Proxy: Use production config parsing in tests (#25 ) * Proxy: Use production config parsing in tests Previously the testing code for the proxy was sensitive to the values of environment variables unintentionally, because `Config` looked at the environment variables. Also, the tests were largely avoiding testing the production configuration parsing code since they were doing their own parsing. Now the tests avoid looking at environment variables other than `ENV_LOG`, which makes them more resilient. Also the tests now parse the settings using the same code as production uses. I validated this manually.	2017-12-13 19:27:50 -06:00
Brian Smith	0ebc20c013	Proxy: Parse all environment variables before aborting (#24 ) Previously, as soon as we would encounter one environment variable with an invalid value we would exit. This is frustrating behavior when deploying to Kubernetes and there are multiple problems because the edit-compile-test cycle is so slow. Fix this by parsing all the environment variables and logging error messages before exiting. I validated this manually.	2017-12-13 18:56:14 -06:00
Eliza Weisman	2fdb859dff	Add timeout to in-flight telemetry reports (#12 ) This PR adds a configurable timeout duration after which in-flight telemetry reports are dropped, cancelling the corresponding RPC request to the control plane. I've also made the `Timeout` implementation used in `TimeoutConnect` generic, and reused it in multiple places, including the timeout for in-flight reports. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2017-12-13 15:07:36 -08:00
Brian Smith	e29a02d63b	Proxy: Improve error reporting for invalid environment variables (#23 ) * Proxy: Improve error reporting for invalid environment variables Previously when an environment variable had an invalid value the process would exit with an error that did not mention which environment variable is invalid. Start fixing this by routing environment variable parsing through functions that always know the name of the environment variable when they report errors. I validated this change manually. * Proxy: Improve configuration URL parsing Previously there was a bit of duplicated logic between parsing `Addr` and `HostAndPort` values. Factor out the common logic. In the process, improve the error reporting in the cases where parsing fails.	2017-12-08 12:32:43 -06:00
Oliver Gould	980f85963d	apply rustfmt on proxy, remove rustfmt.toml for now	2017-12-05 00:44:16 +00:00
Oliver Gould	b104bd0676	Introducing Conduit, the ultralight service mesh We’ve built Conduit from the ground up to be the fastest, lightest, simplest, and most secure service mesh in the world. It features an incredibly fast and safe data plane written in Rust, a simple yet powerful control plane written in Go, and a design that’s focused on performance, security, and usability. Most importantly, Conduit incorporates the many lessons we’ve learned from over 18 months of production service mesh experience with Linkerd. This repository contains a few tightly-related components: - `proxy` -- an HTTP/2 proxy written in Rust; - `controller` -- a control plane written in Go with gRPC; - `web` -- a UI written in React, served by Go.	2017-12-05 00:24:55 +00:00
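
Note on 63fbbd6931 (#909): the commit message says durations must now carry explicit units, so a bare `600` is no longer accepted. The following is a minimal sketch of that idea, not the proxy's actual parser. The function name `parse_duration` and the accepted unit suffixes (`ms`, `s`, `m`, `h`) are assumptions made for illustration; the commit message only gives '600ms' and '10 minutes' as examples.

```rust
use std::time::Duration;

/// Hypothetical helper (not taken from the proxy source): parses a value
/// such as "600ms", "10s", "10m", or "1h" into a `Duration`.
/// A bare number with no unit is rejected, which is the point of the change.
fn parse_duration(s: &str) -> Result<Duration, String> {
    let s = s.trim();
    // Split at the first character that is not an ASCII digit.
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, unit) = s.split_at(split);
    let n: u64 = digits
        .parse()
        .map_err(|_| format!("invalid duration magnitude in {:?}", s))?;
    // The unit table below is an assumption for this sketch.
    match unit.trim() {
        "ms" => Ok(Duration::from_millis(n)),
        "s" => Ok(Duration::from_secs(n)),
        "m" => Ok(Duration::from_secs(n.saturating_mul(60))),
        "h" => Ok(Duration::from_secs(n.saturating_mul(3600))),
        "" => Err(format!("duration {:?} is missing a unit", s)),
        other => Err(format!("unknown duration unit {:?}", other)),
    }
}
```

Under these assumptions, `parse_duration("600ms")` yields 600 milliseconds, `parse_duration("10m")` yields 600 seconds, and `parse_duration("600")` returns an error instead of guessing a unit.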

30 Commits
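
Note on 236f71fbe0 (#397): the rule quoted in that commit message (use the Destination service only for names of the form `$a.$b.svc`, or `$a.$b.svc.$default_zone` when a default zone is set) can be sketched as a small predicate. This is an illustration under stated assumptions, not the proxy's code: the function name `looks_like_local_service` is hypothetical, and the sketch assumes the name has already been lowercased and carries no port or trailing dot.

```rust
/// Hypothetical predicate (not taken from the proxy source) implementing the
/// rule quoted in the commit message:
/// - no default zone: the name must be `$a.$b.svc`
/// - default zone set: the name must be `$a.$b.svc.$default_zone`
/// Assumes `name` is already lowercased, with no port and no trailing dot.
fn looks_like_local_service(name: &str, default_zone: Option<&str>) -> bool {
    // Strip the zone suffix first when a default zone is configured.
    let prefix = match default_zone {
        None => name,
        Some(zone) => {
            let suffix = format!(".{}", zone);
            match name.strip_suffix(&suffix) {
                Some(p) => p,
                None => return false,
            }
        }
    };
    // What remains must be exactly three labels: $a, $b, and "svc".
    let labels: Vec<&str> = prefix.split('.').collect();
    labels.len() == 3
        && !labels[0].is_empty()
        && !labels[1].is_empty()
        && labels[2] == "svc"
}
```

With a default zone of `cluster.local`, `web.default.svc.cluster.local` matches and would be resolved through the Destination service, while `httpbin.org` does not match and would fall back to the original destination address.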