This new annotation is used by the proxy injector to determine if the
debug container needs to be injected.
When using 'linkerd install', the 'pkg/inject' library will only inject
annotations into the workload YAML. Even though 'conf.debugSidecar'
is set in the CLI, the 'injectPodSpec()' function is never invoked on
the proxy injector side. By the time the workload YAML is picked up by the
proxy injector, 'conf.debugSidecar' is already nil, since it's a different,
new 'conf' object. The new annotation ensures that the proxy injector
injects the debug container.
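
A minimal sketch of why the annotation is needed, assuming a hypothetical
annotation key and helper name (neither is taken from this change): the CLI
can only leave a mark on the workload, and the proxy injector, which starts
from a fresh config, has to read that mark back.

```go
package main

import "fmt"

// debugSidecarAnnotation is a hypothetical key used only for illustration;
// the actual annotation name is not stated in the commit message.
const debugSidecarAnnotation = "config.linkerd.io/enable-debug-sidecar"

// needsDebugSidecar reports whether the workload annotations ask for the
// debug container. Reading from a nil map is safe in Go and yields "".
func needsDebugSidecar(annotations map[string]string) bool {
	return annotations[debugSidecarAnnotation] == "true"
}

func main() {
	// CLI side: the in-memory flag is lost, so only the annotation is written.
	annotations := map[string]string{debugSidecarAnnotation: "true"}

	// Injector side: a new config object, so the annotation is the only signal.
	fmt.Println(needsDebugSidecar(annotations))
}
```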
Signed-off-by: Ivan Sim <ivan@buoyant.io>
In #2679 we introduced an upgrade integration test. At the time we only
supported upgrading from a recent edge. Since that PR, a stable build
was released supporting upgrade.
Modify the upgrade integration test to upgrade from the latest stable
rather than latest edge. This fulfills the original intent of #2669.
Also add some known k8s event warnings.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Update helm charts to include webhooks config and TLS secret
* Update the webhooks to read the secret cert and key
* Update webhooks to not recreate config on restart
* Ensure upgrade preserves existing secrets
* Revert the change to rename the webhook configs
The renaming change breaks upgrade, where the new webhook configs conflict with
the existing ones. The older resources aren't deleted during upgrade because
they are dynamically created.
* Make the secret volume read-only
* Remove unnecessary exported getter functions
* Remove obsolete mwc and vwc templates
Signed-off-by: Ivan Sim <ivan@buoyant.io>
commit b27dfb2d21aa8ca5466ea0edce17d27094ace7c1
Author: Takanori Ishibashi <takanori.1112@gmail.com>
Date: Wed May 15 05:58:42 2019 +0900
updaes->updates (#250)
Signed-off-by: Takanori Ishibashi <takanori.1112@gmail.com>
commit 16441c25a9d423a6ab12b689b830d9ae3798fa00
Author: Eliza Weisman <eliza@buoyant.io>
Date: Tue May 14 14:40:03 2019 -0700
Pass router::Config directly to router::Layer (#253)
Currently, router `layer`s are constructed with a single argument, a
type implementing `Recognize`. Then, the entire router stack is built
with a `router::Config`. However, in #248, it became necessary to
provide the config up front when constructing the `router::layer`, as
the layer is used in a fallback layer. Rather than providing a separate
type for a preconfigured layer, @olix0r suggested we simply change all
router layers to accept the `Config` when they're constructed (see
https://github.com/linkerd/linkerd2-proxy/pull/248#discussion_r283575008).
This branch changes `router::Layer` to accept the config up front. The
`router::Stack` type's `make` function now requires no arguments, and the
implementation of `Service` for `Stack` can be called with any `T` (as
the target is now ignored).
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
commit b70c68d4504a362eac6a7828039a2e5c7fcd308a
Author: Eliza Weisman <eliza@buoyant.io>
Date: Wed May 15 13:14:04 2019 -0700
Load balancers fall back to ORIG_DST when no endpoints exist (#248)
Currently, when no endpoints exist in the load balancer for a
destination, we fail the request. This is because we expect endpoints to
be discovered by both destination service queries _and_ DNS lookups, so
if there are no endpoints for a destination, it is assumed to not exist.
In linkerd/linkerd2#2661, we intend to remove the DNS lookup from the
proxy and instead fall back to routing requests for which no endpoints
exist in the destination service to their SO_ORIGINAL_DST IP address.
This means that the current approach of failing requests when the load
balancer has no endpoints will no longer work.
This branch introduces a generic `fallback` layer, which composes a
primary and secondary service builder into a new layer. The primary
service can fail requests with an error type that propagates the original
request, allowing the fallback middleware to call the fallback service
with the same request. Other errors returned by the primary service are
still propagated upstream.
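
The shape of such a fallback layer can be sketched as follows. This is an
illustration in Go rather than the proxy's Rust code, and the names
`Request`, `Service`, `FallbackError`, and `WithFallback` are assumptions
made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Request and Service are hypothetical stand-ins for the proxy's types.
type Request struct{ Path string }

type Service func(Request) (string, error)

// FallbackError hands the original request back to the caller so that it
// can be retried against the secondary service.
type FallbackError struct{ Req Request }

func (e *FallbackError) Error() string { return "fallback requested" }

// errBoom is an ordinary error that must not trigger the fallback.
var errBoom = errors.New("boom")

// WithFallback composes a primary and a secondary service. Only a
// FallbackError from the primary is rerouted; other errors propagate.
func WithFallback(primary, secondary Service) Service {
	return func(req Request) (string, error) {
		rsp, err := primary(req)
		var fb *FallbackError
		if errors.As(err, &fb) {
			return secondary(fb.Req)
		}
		return rsp, err
	}
}

func main() {
	// A balancer with no endpoints fails with the fallback error.
	balancer := func(req Request) (string, error) {
		return "", &FallbackError{Req: req}
	}
	origDst := func(req Request) (string, error) {
		return "forwarded " + req.Path + " to the original destination", nil
	}
	rsp, err := WithFallback(balancer, origDst)(Request{Path: "/ping"})
	fmt.Println(rsp, err)
}
```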
In contrast to the approach used in #240, this fallback middleware is
generic and not tied directly to a load balancer or a router, and can
be used for other purposes in the future. It relies on the router cache
eviction added in #247 to drain the router when it is not being used,
rather than proactively destroying the router when endpoints are
available for the lb, and re-creating it when they exist again.
A new trait, `HasEndpointStatus`, is added in order to allow the
discovery lookup to communicate the "no endpoints" state to the
balancer. In addition, we add a new `Update::NoEndpoints` variant to
`proxy::resolve::Update`, so that when the control plane sends a no
endpoints update, we switch from the balancer to the no endpoints state
_immediately_, rather than waiting for all the endpoints to be
individually removed. When the balancer has no endpoints, it fails all
requests with a fallback error, so that the fallback middleware can call
the fallback service.
A subsequent PR (#248) will remove the DNS lookups from the discovery
module.
Closes #240.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
commit 6525b0638ad18e74510f3156269e0613f237e2f5
Author: Zahari Dichev <zaharidichev@gmail.com>
Date: Wed May 15 23:35:09 2019 +0300
Allow disabling tap by setting an env var (#252)
This PR fixes linkerd/linkerd2#2811. Now if
`LINKERD2_PROXY_TAP_DISABLED` is set, the tap is not served at all. The
approach taken is that the `ProxyParts` is changed so the
`control_listener` is now an `Option` that will be None if tap is
disabled as this control_listener seems to be exclusively used to serve
the tap. Feel free to suggest a better approach.
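
The idea of an optional listener can be sketched like this. The example is
in Go rather than the proxy's Rust, and the `Listener` type, the
`controlListener` helper, and the address are placeholders invented for
illustration; only the environment variable name comes from the description
above.

```go
package main

import (
	"fmt"
	"os"
)

// Name of the env var described above; its mere presence disables tap.
const tapDisabledEnv = "LINKERD2_PROXY_TAP_DISABLED"

// Listener is a hypothetical stand-in for the proxy's control listener.
type Listener struct{ Addr string }

// controlListener returns nil (the analogue of Option::None) when tap is
// disabled, so no tap server is bound at all. The lookup function is
// injected so the behavior can be exercised without touching the process
// environment.
func controlListener(lookupEnv func(string) (string, bool), addr string) *Listener {
	if _, disabled := lookupEnv(tapDisabledEnv); disabled {
		return nil
	}
	return &Listener{Addr: addr}
}

func main() {
	// The address is a placeholder, not necessarily the proxy's default.
	if l := controlListener(os.LookupEnv, "127.0.0.1:4190"); l != nil {
		fmt.Println("serving tap on", l.Addr)
	} else {
		fmt.Println("tap disabled")
	}
}
```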
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
commit 91f32db2ea6d74470fd689c713ff87dc7586222d
Author: Zahari Dichev <zaharidichev@gmail.com>
Date: Thu May 16 00:45:23 2019 +0300
Assert that outbound TLS works before identity is certified (#251)
This commit introduces TLS capabilities to the support server as well as
tests to ensure that outbound TLS works even when there is no verified
certificate for the proxy yet.
Fixes linkerd/linkerd2#2599
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
commit 45aadc6b1b28e6daea0c40e694a86ae518887d85
Author: Sean McArthur <sean@buoyant.io>
Date: Wed May 15 14:25:39 2019 -0700
Update h2 to v0.1.19
Includes a couple HPACK fixes
Signed-off-by: Sean McArthur <sean@buoyant.io>
commit 3e0e00c6dfbf5a9155b887cfd594f611edfc135f
Author: Oliver Gould <ver@buoyant.io>
Date: Thu May 16 08:11:06 2019 -0700
Update mio to 0.6.17 (#257)
To pick up https://github.com/tokio-rs/mio/pull/939
Adds a check to Prometheus `edges` queries to verify that data for the requested
resource type exists. Previously, if Prometheus could not find request data for the
requested resource type, it would skip that label and still return data for
other labels in the `by` clause, leading to an incorrect response.
Adds an edges command to the CLI. `linkerd edges` displays connections between resources, and Linkerd proxy identities. Currently this feature will only display edges where both the client identity and server identity are known. The next step will be to display edges for which identity is not known and/or one-sided traffic such as Prometheus and tap requests.
* CLI
* Fixed `linkerd check` and `linkerd dashboard` failing when any control plane
pod is not ready, even when multiple replicas exist (as in HA mode)
* Controller
* Fixed control plane components failing on startup when the Kubernetes API
returns an `ErrGroupDiscoveryFailed`
* Proxy
* Added a dispatch timeout that limits the amount of time a request can be
buffered in the proxy
* Removed the limit on the number of concurrently active service discovery
queries to the Destination service
Special thanks to @zaharidichev for adding end to end tests for proxies with
TLS!
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Support for resources opting out of tap
Implements the `linkerd inject --disable-tap` flag (although hidden pending #2811) and the config override annotation `config.linkerd.io/disable-tap`.
Fixes #2778
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
commit 5f89351081eff47a4ab8cd88e2e1a69a04f86541
Author: Oliver Gould <ver@buoyant.io>
Date: Thu May 9 16:39:24 2019 -0700
Upgrade tower dependencies (#249)
Tower must be updated in order to pick up tower-rs/tower#281
to address linkerd/linkerd2#2804.
This adopts released crates where possible.
commit 5d5eed6f8180b8db4090d995e71fdf7b0890c647
Author: Zahari Dichev <zaharidichev@gmail.com>
Date: Thu May 9 01:08:34 2019 +0300
Assert that TLS connection is refused if identity is not certified yet (#243)
This branch adds TLS capability to the support client used in tests. In addition, it adds two tests verifying that a TLS connection is refused when the identity is not certified yet. This attempts to fix https://github.com/linkerd/linkerd2/issues/2598 and provides a facility for writing tests for https://github.com/linkerd/linkerd2/issues/2676.
As these are still some of my first lines of Rust code, it is advised to approach everything with a healthy dose of doubt :)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
commit 1b9bb3745e44c959d1d41d14fed2b2822c82b5ba
Author: Oliver Gould <ver@buoyant.io>
Date: Wed May 8 14:28:37 2019 -0700
Introduce dispatch timeouts around buffers (#246)
The proxy has several buffers, especially where it routes requests over
shared stacks. If any of these routes is unavailable, then a request may
remain buffered indefinitely. Previously, before service profiles were
introduced, there was a default _response_ timeout that would cause
these requests to fail; but since this response timeout is now optional
(and is only applied once the request has been routed within a proxy),
then we need a new mechanism to prevent requests from getting "stuck".
This change does the following:
- all proxied requests are annotated with a dispatch deadline;
- each time a request is buffered, a timeout is registered;
- if the timeout fires, the request fails, a 503 is returned,
and the request is dropped;
- if the request is processed into the inner stack, the timeout is
ignored.
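
The steps above can be sketched roughly as follows. This is a Go
illustration, not the proxy's Rust implementation; the `Dispatch` helper and
its `ready` channel are assumptions standing in for "the inner stack became
able to accept the request".

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Dispatch holds a buffered request until the inner service signals
// readiness or the dispatch timeout fires, whichever comes first.
// It returns the HTTP status the caller would see.
func Dispatch(ready <-chan struct{}, timeout time.Duration, call func() int) int {
	timer := time.NewTimer(timeout)
	defer timer.Stop() // once the request is dispatched, the timeout is ignored

	select {
	case <-ready:
		// The request left the buffer in time; response latency from here
		// on is not governed by the dispatch timeout.
		return call()
	case <-timer.C:
		// The request was stuck in the buffer: fail it and drop it.
		return http.StatusServiceUnavailable
	}
}

func main() {
	stuck := make(chan struct{}) // never becomes ready
	status := Dispatch(stuck, 50*time.Millisecond, func() int { return http.StatusOK })
	fmt.Println(status)
}
```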
The dispatch timeout limits the _time a request is buffered in a proxy_.
This is distinct from the response timeout, as the server's response may
naturally be delayed for any number of (non-proxy-related) reasons.
The `insert_target` module has been generalized to `insert` to support
setting the DispatchDeadline extension.
The `buffer` module has been augmented with generic deadline-extraction
logic.
The `svc` module now exposes its own builder type that notably adds
a `buffer_pending` helper. It's helpful to pull a builder type into the
proxy to assist debugging type errors when modifying stacks.
Fixes linkerd/linkerd2#2779 linkerd/linkerd2#2795
commit caf899557c3b041190f63544da865396231b3e30
Author: Oliver Gould <ver@buoyant.io>
Date: Fri May 3 15:55:32 2019 -0700
router: Fail requests when the route is not ready (#241)
In linkerd/linkerd2#2779, we plan to expire requests while they are
buffered. However, the router _implicitly_ buffers requests in the
executor when the inner service is not ready.
This change alters the route to wrap all inner layers in a `LoadShed`
so it can expect all services to `poll_ready()` immediately.
commit 587bad101d9e5daeacb24b6733097c350a798356
Author: Eliza Weisman <eliza@buoyant.io>
Date: Fri May 3 14:18:08 2019 -0700
Remove Destination service query concurrency limit (#244)
Currently, the proxy enforces a limit on the number of concurrent
queries (i.e., the number of gRPC streams) to the Destination service.
This limit was added based on information about the behaviour of the
Destination service that is now known to be incorrect.
This branch removes the limit on concurrent queries from the proxy's
`control::destination` module. Although it should now be possible to
simplify this code as a result of this change, I've refrained from doing
any major refactoring in this branch --- my intention is to do this
after the DNS fallback behaviour has also been removed, as together with
this change, that will result in a _significant_ simplification of the
module. Additionally, I've removed the tests for the concurrency limit,
as they are no longer relevant.
The `LINKERD2_PROXY_DESTINATION_CLIENT_CONCURRENCY_LIMIT`
environment variable was also removed; this is not a breaking change as
neither the CLI nor the proxy injector will currently set this env var.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
commit cbdf45b44f7e4d852dc0497716062167ab9539fb
Author: Sean McArthur <sean@buoyant.io>
Date: Thu May 2 11:47:48 2019 -0700
Remove h2::Error requirement from metrics
Signed-off-by: Sean McArthur <sean@buoyant.io>
commit 3276949d4608dc4344b7bed3de2fc4b3080c2c6e
Author: Sean McArthur <sean@buoyant.io>
Date: Thu May 2 09:44:00 2019 -0700
delete unused proxy::http::metrics::class module
Signed-off-by: Sean McArthur <sean@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Integration test for k8s events generated during install
Fixes #2713
I did make sure a scenario like the one described in #2964 is caught.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
This change adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies. This PR only includes changes to the controller and protobuf.
The `docker-build-proxy` script builds `Dockerfile-proxy`. That
Dockerfile depends on a go-deps image, and takes a `LINKERD_VERSION`
arg. The `docker-build-proxy` script was neither ensuring go-deps had
been built, nor setting `LINKERD_VERSION`. The former resulted in the
build failing if go-deps did not exist. The latter resulted in
`dev-undefined` log messages in the `linkerd-proxy` container.
Fix `docker-build-proxy` to ensure go-deps are built, and also set the
`LINKERD_VERSION`. This brings this script more in-line with the other
`docker-build-*` scripts.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`ServiceProfilesAccess()`, called by control plane components at
startup, would fail if it encountered an `ErrGroupDiscoveryFailed` from
a GroupVersion request. This error is mostly innocuous, as it returns an
error if any GroupVersion fails. `ServiceProfilesAccess()` only needs to
validate ServiceProfiles are available.
Modify `ServiceProfilesAccess()` to specifically request the
ServiceProfile GroupVersion. Also add Discovery object
(`APIResourceList`) support to `NewFakeClientSets`.
Fixes #2780
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd check` and `linkerd dashboard` commands validate control
plane pods are up via the `LinkerdAPIChecks` category of checks. These
checks will fail if a single pod is not ready, even in HA mode.
Modify the underlying `validateControlPlanePods` check to return
success if at least one pod per control plane component is ready.
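
A rough sketch of the relaxed rule, assuming a simplified `Pod` type and a
hypothetical `checkComponentsReady` helper (the real check works on
Kubernetes pod objects and has a different signature):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Pod is a simplified stand-in for a control plane pod.
type Pod struct {
	Component string
	Ready     bool
}

// checkComponentsReady succeeds when every component has at least one ready
// pod, so a single unready replica in HA mode no longer fails the check.
func checkComponentsReady(components []string, pods []Pod) error {
	ready := map[string]bool{}
	for _, p := range pods {
		if p.Ready {
			ready[p.Component] = true
		}
	}
	var missing []string
	for _, c := range components {
		if !ready[c] {
			missing = append(missing, c)
		}
	}
	if len(missing) > 0 {
		sort.Strings(missing)
		return fmt.Errorf("no ready pods for: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	pods := []Pod{
		{Component: "controller", Ready: true},
		{Component: "controller", Ready: false}, // one HA replica is down
		{Component: "web", Ready: true},
	}
	fmt.Println(checkComponentsReady([]string{"controller", "web"}, pods))
}
```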
Fixes #2554
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
With the server configured to respond with a 50% failure rate, the test first
checks to ensure the actual success rate is less than 100%. Then the
service profile is edited to perform retries. The test then checks to
ensure the effective success rate is at least 95%.
This is (hopefully) more reliable than changing the test to wait and
retry until the effective success rate and the actual success rate differ,
and then comparing them.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* CLI
* Added a `linkerd check config` command for verifying that
`linkerd install config` was successful
* Improved the help documentation of `linkerd install` to clarify flag usage
* Added support for private Kubernetes clusters by changing the CLI to connect
to the control plane using a port-forward (thanks, @jackprice!)
* Controller
* Fixed pod creation failure when a `ResourceQuota` exists by adding a default
resource spec for the proxy-init init container
* Proxy
* Replaced the fixed reconnect backoff with an exponential one (thanks,
@zaharidichev!)
* Fixed an issue where load balancers can become stuck
* Internal
* Fixed integration tests by adding known proxy-injector log warning to tests
Signed-off-by: Alex Leong <alex@buoyant.io>
Private k8s clusters, such as the private GKE clusters offered by Google
Cloud, cannot be reached through the current API proxy method.
This commit uses the port forwarding feature already developed.
Also modify dashboard command to not fall back to ephemeral port.
Signed-off-by: Jack Price <jackprice@outlook.com>
The multi-stage args used by install, upgrade, and check were
implemented as positional arguments to their respective parent commands.
This made the help documentation unclear, and the code ambiguous as to
which flags corresponded to which stage.
Define `config` and `control-plane` stages as subcommands. The help
menus now explicitly state flags supported.
Fixes #2729
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
commit 073a1beb4a7cd709c6b1eaa56a319c1829a94d11
Author: Sean McArthur <sean@buoyant.io>
Date: Mon Apr 29 17:54:01 2019 -0700
tap: remove need to clone Services (#238)
This refactors the tap system to not require intermediary channels to
register matches and taps when a request comes through. The Dispatcher
that used to exist in order to prevent tapping more requests than the
limit asked for has been removed. In its place is a shared atomic
counter to keep the count under the limit.
The resulting behavior should be the same. There should be improved
performance as tap registration doesn't need to go through a second
channel, and requests don't need to be delayed waiting for the
dispatcher to be able to process its queue.
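
The counter-based limit can be illustrated with a short sketch. It is
written in Go rather than Rust, and the `Tap` type and helper names are
assumptions for the example, not the proxy's API.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Tap is a hypothetical stand-in for one tap subscription with a limit on
// how many requests it may observe.
type Tap struct {
	count int64 // requests that tried to register; updated atomically
	limit int64
}

func NewTap(limit int64) *Tap { return &Tap{limit: limit} }

// TryTap reports whether the calling request may be tapped. Each caller
// gets a unique ticket from the atomic add, so at most `limit` callers
// succeed, with no channel or dispatcher in between.
func (t *Tap) TryTap() bool {
	return atomic.AddInt64(&t.count, 1) <= t.limit
}

// countTapped sends n concurrent requests through the tap and returns how
// many of them were tapped.
func countTapped(t *Tap, n int) int {
	var wg sync.WaitGroup
	var tapped int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if t.TryTap() {
				atomic.AddInt64(&tapped, 1)
			}
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt64(&tapped))
}

func main() {
	fmt.Println(countTapped(NewTap(3), 100)) // prints 3
}
```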
Signed-off-by: Sean McArthur <sean@buoyant.io>
commit 7a3be8c8737188e5debbc465f9a33da0d79b8b80
Author: Zahari Dichev <zaharidichev@gmail.com>
Date: Wed May 1 01:57:01 2019 +0300
Replace fixed reconnect backoff with exponential one (#237)
When reconnecting to a destination, use an exponential, jittered backoff strategy.
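
A minimal sketch of an exponential, jittered backoff, assuming hypothetical
parameter names (`Min`, `Max`, `Jitter`) and a doubling factor; the proxy's
actual parameters and jitter formula are not given in this message.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Backoff holds illustrative settings: the delay starts at Min, doubles per
// attempt up to Max, and gains up to Jitter*delay of random extra wait.
type Backoff struct {
	Min, Max time.Duration
	Jitter   float64
}

// Delay returns the wait before reconnect attempt n (0-based). rnd must
// return a value in [0, 1); it is injected so the result can be checked
// deterministically.
func (b Backoff) Delay(attempt int, rnd func() float64) time.Duration {
	d := b.Min
	for i := 0; i < attempt && d < b.Max; i++ {
		d *= 2
	}
	if d > b.Max {
		d = b.Max
	}
	return d + time.Duration(b.Jitter*rnd()*float64(d))
}

func main() {
	b := Backoff{Min: 100 * time.Millisecond, Max: 10 * time.Second, Jitter: 0.5}
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, b.Delay(attempt, rand.Float64))
	}
}
```

Jitter spreads reconnect attempts from many clients over time, so they do
not all retry at the same instant after a shared failure.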
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
commit 32b813aad4fe2fcf0252e8c2215d6835101d2337
Author: Oliver Gould <ver@buoyant.io>
Date: Tue Apr 30 15:58:20 2019 -0700
Support endpoint weights (#230)
This change modifies the proxy to honor weights provided by the
destination service. When the destination service replies with a
weight, this value is divided by 10,000 to produce a weight on
[0.0, ~400000.0]. This weight is used by the load balancer
to modify load interpretation and therefore request distribution.
A weight of 0.0 will cause the endpoint's load to be effectively infinite
so that requests will only be sent to the endpoint when no other endpoints
exist or when the other endpoints that were considered had 0-weights.
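
The arithmetic can be sketched as below. The `Endpoint` type, the
load-divided-by-weight formula, and the full scan in `pick` are assumptions
for illustration; the balancer's real load metric and selection strategy may
differ.

```go
package main

import (
	"fmt"
	"math"
)

// weightUnit is the divisor described above: a raw weight of 10,000 maps
// to 1.0.
const weightUnit = 10000.0

// Endpoint is a hypothetical stand-in for a balancer endpoint.
type Endpoint struct {
	Name      string
	RawWeight uint32  // as sent by the destination service
	Load      float64 // some measure of pending work
}

// Weight converts the raw weight to a float.
func (e Endpoint) Weight() float64 { return float64(e.RawWeight) / weightUnit }

// EffectiveLoad scales load by the inverse of the weight; a zero weight
// makes the endpoint look infinitely loaded.
func (e Endpoint) EffectiveLoad() float64 {
	w := e.Weight()
	if w == 0 {
		return math.Inf(1)
	}
	return e.Load / w
}

// pick returns the index of the endpoint with the lowest effective load,
// or -1 for an empty slice. Ties go to the earlier endpoint, so a
// zero-weight endpoint is chosen only when every candidate has weight 0.
func pick(eps []Endpoint) int {
	best := -1
	for i, e := range eps {
		if best < 0 || e.EffectiveLoad() < eps[best].EffectiveLoad() {
			best = i
		}
	}
	return best
}

func main() {
	eps := []Endpoint{
		{Name: "a", RawWeight: 10000, Load: 4}, // weight 1.0, effective load 4
		{Name: "b", RawWeight: 20000, Load: 4}, // weight 2.0, effective load 2
		{Name: "c", RawWeight: 0, Load: 1},     // weight 0.0, effectively infinite
	}
	fmt.Println(eps[pick(eps)].Name)
}
```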
commit 501802671a346250b6dbaae73f29d9be7a4c2086
Author: Sean McArthur <sean@buoyant.io>
Date: Wed May 1 13:42:38 2019 -0700
Remove buffers from endpoint stacks (#239)
Due to the `http::settings::router`, a `buffer` was needed in each
endpoint stack. This meant that the service was always ready, even if
the client were falling over (and reconnecting). In turn, this meant
that the balancer would pick one of these endpoint stacks, because it
was always ready!
This change includes a test verifying that the balancer no longer assumes
a failing endpoint is ready, and has the following functional changes:
- Removed `http::settings::router`, instead the client HTTP settings are
detected as part of the `DstAddr`. This means that each balancer only
has endpoints with the same HTTP settings.
- Removed `buffer` layer from inside the endpoint stacks.
Signed-off-by: Sean McArthur <sean@buoyant.io>
Add support for `linkerd check config`. Validates the existence of the
Linkerd Namespace, ClusterRoles, ClusterRoleBindings, ServiceAccounts,
and CustomResourceDefinitions.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #2737 introduced a warning in the proxy-injector when owner ref
lookups failed due to not having up-to-date ReplicaSet information. That
warning may occur during integration tests, causing a failure.
Add the warning as a known controller log message. The warning will be
printed as a skipped test, allowing the integration tests to pass.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
When developing on the proxy, it's convenient to build the proxy while
the linkerd2 image is building at a given tag; but because the proxy is
built last, it's difficult to build the proxy at the same tag
simultaneously.
This is made easier by building the proxy first so that the parallel
build can be initiated after this. This shouldn't impact other
development workflows.
CustomResourceDefinition parsing and retrieval is not available via
client-go's `kubernetes.Interface`, but rather via a separate
`k8s.io/apiextensions-apiserver` package.
Introduce support for CustomResourceDefinition object parsing and
retrieval. This change facilitates retrieval of CRDs from the k8s API
server, and also provides CRD resources as mock objects.
Also introduce a `NewFakeAPI` constructor, deprecating
`NewFakeClientSets`. Callers need no longer be concerned with discrete
clientsets (for k8s resources vs. CRDs vs. (eventually)
ServiceProfiles), and can instead use the unified `KubernetesAPI`.
Part of #2337, in service to multi-stage check.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
commit 61db2e77a247f7b0235b67581f60e8a92f8543cb
Author: Sean McArthur <sean@seanmonstar.com>
Date: Tue Apr 23 17:20:43 2019 -0700
Replace linkerd2-stack with tower-layer (#236)
Signed-off-by: Sean McArthur <sean@buoyant.io>
commit 2d6c7145cadf709832f3507bcefdaee509ebde81
Author: Sean McArthur <sean@seanmonstar.com>
Date: Thu Apr 18 12:40:48 2019 -0700
Add load shedding when over max-in-flight requests. (#225)
Also adds configuration for inbound and outbound max-in-flight requests.
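
Load shedding on a max-in-flight limit can be sketched as follows. This is a
Go illustration with invented names (`LoadShed`, `Call`, `ErrOverloaded`),
not the proxy's Rust middleware.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrOverloaded is returned when the in-flight limit has been reached.
var ErrOverloaded = errors.New("max in-flight requests exceeded")

// LoadShed rejects work immediately instead of queueing it once the
// configured number of requests is already in flight.
type LoadShed struct {
	slots chan struct{} // buffered channel used as a counting semaphore
}

func NewLoadShed(maxInFlight int) *LoadShed {
	return &LoadShed{slots: make(chan struct{}, maxInFlight)}
}

// Call runs f if a slot is free and sheds the request otherwise.
func (l *LoadShed) Call(f func() error) error {
	select {
	case l.slots <- struct{}{}:
		defer func() { <-l.slots }() // free the slot when f returns
		return f()
	default:
		return ErrOverloaded
	}
}

func main() {
	shed := NewLoadShed(1)
	err := shed.Call(func() error {
		// A second request arrives while the first is still in flight.
		return shed.Call(func() error { return nil })
	})
	fmt.Println(err)
}
```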
Signed-off-by: Sean McArthur <sean@buoyant.io>
commit f4b5cd0b4a25d7d942e018b42af1157ae2e7dbb0
Author: Oliver Gould <ver@buoyant.io>
Date: Wed Apr 17 13:53:49 2019 -0700
Upgrade tower (#232)
This avails the proxy of newer load balancer features, an updated buffer
implementation, etc.
The new buffer implementation requires that we implement TypedExecutor
for our logging executor; and more error types have been made dynamic.
All ServiceAccounts are intended to be grouped together with other RBAC
resources, particularly for `linkerd install config` output. Grafana and
Web ServiceAccounts were still included with their respective
Deployments.
Group Grafana and Web ServiceAccounts with other RBAC resources.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The `linkerd check --proxy` command checks for proxies in all
namespaces, if the `--namespace` flag is not set. PR #2747 modified the
behavior of `KubernetesAPI.NamespaceExists`. Previously it would succeed
if given an empty string for a namespace. Now it fails with a
`resource name may not be empty` error (for k8s server `v1.10.11`), or a
not found error (for our fake test client).
Modify the data plane proxy namespace check to return success if the
namespace is not set.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`linkerd install` supports a 2-stage install process, but `linkerd upgrade`
did not.
Add 2-stage support for `linkerd upgrade`. Also exercise multi-stage
functionality during upgrade integration tests.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This reverts commit 3de16d47be.
#2740 modified the ServiceProfiles CRD which will cause issues for users upgrading from the old CRD version to the new version. #2748 was an attempt to fix this by bumping the service profile CRD version, however, our testing infrastructure is not well set up to accommodate changes to CRDs because they are resources which are global to the cluster.
We revert this change for now and will revisit it in the future when we can give more thought to CRD versioning, upgrade, and testing.
Signed-off-by: Alex Leong <alex@buoyant.io>
Numerous codepaths have emerged that create k8s configs, k8s clients,
and make k8s api requests.
This branch consolidates k8s client creation and APIs. The primary
change migrates most codepaths to call `k8s.NewAPI` to instantiate a
`KubernetesAPI` struct from `pkg`. `KubernetesAPI` implements the
`kubernetes.Interface` (clientset) interface, and also persists a
`client-go` `rest.Config`.
Specific list of changes:
- removes manual GET requests from `k8s.KubernetesAPI`, in favor of
clientsets
- replaces most calls to `k8s.GetConfig`+`kubernetes.NewForConfig` with
a single `k8s.NewAPI`
- introduces a `timeout` param to `k8s.NewAPI`, currently only used by
healthchecks
- removes `NewClientSet` in `controller/k8s/clientset.go` in favor of
`k8s.NewAPI`
- removes `httpClient` and `clientset` from `HealthChecker`, use
`KubernetesAPI` instead
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Makes the Font Awesome CSS stylesheet available offline. Users loading the dashboard with no/limited internet will see both the Font Awesome and Material-UI sidebar icons consistently. Before, only the Material-UI icons were available offline.
Signed-off-by: Gaurav Kumar <gaurav.kumar9825@gmail.com>
Fixes #2720 and #2711
This changes the default behavior of `linkerd inject` to not inject the
proxy but just the `linkerd.io/inject: enabled` annotation for the
auto-injector to pick it up (regardless of any namespace annotation).
A new `--manual` mode was added, which behaves as before, injecting
the proxy in the command output.
The unit tests are running with `--manual` to avoid any changes in the
fixtures.
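
The two modes can be sketched as below, assuming a simplified `PodTemplate`
type and a hypothetical `inject` helper; the real command operates on full
Kubernetes manifests and injects more than a container name.

```go
package main

import "fmt"

// injectAnnotation is the annotation named above.
const injectAnnotation = "linkerd.io/inject"

// PodTemplate is a simplified stand-in for a workload's pod template.
type PodTemplate struct {
	Annotations map[string]string
	Containers  []string
}

// inject returns a modified copy of the template. By default it only adds
// the annotation and leaves the proxy to the auto-injector; in manual mode
// it adds the proxy container to the output directly.
func inject(p PodTemplate, manual bool) PodTemplate {
	out := PodTemplate{
		Annotations: map[string]string{},
		Containers:  append([]string(nil), p.Containers...),
	}
	for k, v := range p.Annotations {
		out.Annotations[k] = v
	}
	if manual {
		out.Containers = append(out.Containers, "linkerd-proxy")
		return out
	}
	out.Annotations[injectAnnotation] = "enabled"
	return out
}

func main() {
	app := PodTemplate{Containers: []string{"app"}}
	fmt.Println(inject(app, false))
	fmt.Println(inject(app, true))
}
```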
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>