This edge supersedes edge-20.10.6 as a release candidate for stable-2.9.0.
* Fixed issue where the `check` command would error when there is no Prometheus
configured
* Fixed recent regression that caused multicluster on EKS to not work properly
* Changed the `check` command to warn instead of error when webhook certificates
are near expiry
* Added the `--ingress` flag to the `inject` command which adds the recently
introduced `linkerd.io/inject: ingress` annotation
* Fixed issue with upgrades where external certs would be fetched and stored
even though this does not happen on fresh installs with externally created
certs
* Fixed issue with upgrades where the issuer cert expiration was being reset
* Removed the `--registry` flag from the `multicluster install` command
* Removed default CPU limits for the proxy and control plane components in HA
mode
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
This change updates `FetchExternalIssuerData` to be more like
`FetchIssuerData` and return expiry correctly.
This field is currently not used anywhere and is just done for
consistentcy purposes.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Per #5165, Kubernetes does not necessarily limit the proxy's access to
cores via `cgroups` when a CPU limit is set. As of #5168, the proxy now
supports a `LINKERD2_PROXY_CORES` environment configuration that
augments CPU detection from the host operating system.
This change modifies the proxy injector to ensure that this environment
is configured from the `Values.proxy.cores` Helm value, the
`config.linkerd.io/proxy-cpu-limit` annotation, and the `--proxy-cpu-limit`
install flag.
As discussed in #5167 & #5169, Kubernetes CPU limits are not necessarily
discoverable from within the pod. This means that the control plane
processes may allocate far more threads than can actually be used by the
process given its process limits.
This change removes the default CPU limits for all control plane
components. CPU limits may still be set via Helm configuration.
Now that the proxy can use more than one core, this behavior should be
enabled by default, even in HA mode.
This change modifies the default HA helm values to unset the cpu limit
for proxy containers.
This release adds support for the LINKERD2_PROXY_CORES environment
variable. When set, the value may limit the proxy's runtime resources
so that it does not allocate a thread per core available from the host
operating system.
---
* inbound: use MakeSwitch for loopback (linkerd/linkerd2-proxy#729)
* buffer: Remove readiness watch (linkerd/linkerd2-proxy#731)
* Allow specifying the number of available cores via the env (linkerd/linkerd2-proxy#733)
After the 2.9 multicluster refactoring, `linkerd mc install`'s only
workload installed is the nginx gateway, whose docker image is
configured through the flags `--gateway-nginx-image` and
`--gateway-nginx-image-version`. Thus there's no longer need of the
`--registry` flag, which is used OTOH by `linkerd mc link` which deploys the service mirror.
Currently, For legacy upgrades we are fetching even external certs and
using it for upgrades which contradicts the condition at
https://github.com/linkerd/linkerd2/blob/master/cli/cmd/options.go#L550
used with install and thus causing errors.
Instead we don't retrieve them with upgrades and hence they don't get
stored into the config and secrets which seems correct as we do not want
to store certs in the config and use them with upgrades when they are
created externally.
This touches only the upgrade path i.e `fetchIssuers` and would not
effect the retrievel of external certs for checks, etc.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
With legacy upgrades, we can parse the cert and store the expiry
correctly instead of storing it as the default value which could be a
problem when we use that field. Currently, we do not use this field and
hence it did not cause any problems.
Install on the latest edges, This field is correctly set and works
as expected. Thus, upgrades also have the right value.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* charts: Do not store .component in linkerd-config
This removes the `.component` fields from `Values.go` and also prevents them from being emitted into `linkerd-config` by attaching them into a temporary variable during injection.
This also simplies inbound and outbound Skip ports helm logic and adds quotes to them.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* cli: add `--ingress` flag to inject cmd
This PR adds a new inject flag called `--ingress` which when enabled
adds a new annotation i.e `linkerd.io/inject: ingress`.
This annotation is not applied in the `--manual` case and the env
variable is directly set.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#5149
Before:
```
linkerd-webhooks-and-apisvc-tls
-------------------------------
× tap API server has valid cert
certificate will expire on 2020-10-28T20:22:32Z
see https://linkerd.io/checks/#l5d-tap-cert-valid for hints
```
After:
```
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ tap API server has valid cert
‼ tap API server cert is valid for at least 60 days
certificate will expire on 2020-10-28T20:22:32Z
see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ proxy-injector webhook has valid cert
‼ proxy-injector cert is valid for at least 60 days
certificate will expire on 2020-10-29T18:17:03Z
see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
√ sp-validator webhook has valid cert
‼ sp-validator cert is valid for at least 60 days
certificate will expire on 2020-10-28T20:21:34Z
see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints
```
Signed-off-by: Alex Leong <alex@buoyant.io>
`linkerd mc link` wasn't properly setting the `gatewayAddresses` field
when such address had a `Hostname` field instead of `Ip`, like is the
case in EKS services of type LoadBalancer.
Fixes#5143
The availability of prometheus is useful for some calls in public-api
that the check uses. This change updates the ListPods in public-api
to still return the pods even when prometheus is not configured.
For a test that exclusively checks for prometheus metrics, we have a gate
which checks if a prometheus is configured and skips it othervise.
Signed-off-by: Tarun Pothulapati tarunpothulapati@outlook.com
* Use errors.Is instead of checking underlying err messages
Fixes#5132
This PR replaces the usage of `strings.hasSuffix` with `errors.Is`
wherever error messages are being checked. So, that the code is not
effected by changes in the underlying message. Also adds a string
const for http2 response body closed error
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* docs: Update external prom and grafana readme
Update `Values.yaml` to make it more clear about reverse proxy
configuration with external grafana instances.
Also, adds `global.prometheusUrl` and `global.grafanaUrl` into charts
`README`
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Restrict controlPlaneTracing field only to control plane components
Previously, `global.controlPlaneTracing` was not available during
injection and thus not affecting it.
This commit creates a new method which checks if controlPlaneTracing is
enabled and sets to the defaults if it is. This is done on the
duplicates thus preventing it from not being propagated into
`linkerd-config`
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This edge supersedes edge-20.10.5 as a release candidate for
stable-2.9.0. It adds a new `linkerd.io/inject: ingress` annotation to
support service profiles and enable per-route metrics and traffic splits
for HTTP ingress controllers
* Added a new `linkerd.io/inject: ingress` annotation to configure the
proxy to support service profiles and enable per-route metrics and
traffic splits for HTTP ingress controllers
* Reduced performance impact of logging in the proxy, especially when
the `debug` or `trace` log levels are disabled
* Fixed spurious warnings logged by the `linkerd profile` CLI command
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Fixes#5121
* cli: skip emitting warnings in Profile
Whenever the tapDuration gets completed, there is a warning occured
which we do not emit. This looks like it has been changed in the latest
versions of the dependency.
* Use context.withDeadline instead of client.timeout
The usage of `client.Timeout` is not working correctly causing `W1022
17:20:12.372780 19049 transport.go:260] Unable to cancel request for
promhttp.RoundTripperFunc` to be emitted by the Kubernetes Client.
This is fixed by using context.WithDeadline and passing that into the
http Request.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Fixes#5118
This PR adds a new supported value for the `linkerd.io/inject` annotation. In addition to `enabled` and `disabled`, this annotation may now be set to `ingress`. This functions identically to `enabled` but it also causes the `LINKERD2_PROXY_INGRESS_MODE="true"` environment variable to be set on the proxy. This causes the proxy to operate in ingress mode as described in #5118
With this set, ingresses are able to properly load service profiles based on the l5d-dst-override header.
Signed-off-by: Alex Leong <alex@buoyant.io>
This release adds an 'ingress mode' to support per-request routing for
HTTP ingresses.
Additionally, the performance impact of logging should be reduced,
especially when the proxy log level is not set to `debug` or `trace`.
---
* router: Use NewService instead of MakeService (linkerd/linkerd2-proxy#724)
* outbound: Split TCP stack into dedicated modules (linkerd/linkerd2-proxy#725)
* trace: update `tracing-subscriber` to 0.2.14 (linkerd/linkerd2-proxy#726)
* outbound: Extract HTTP and server modules (linkerd/linkerd2-proxy#727)
* outbound: Introduce 'ingress mode' (linkerd/linkerd2-proxy#728)
* Reduce tracing spans to the debug level (linkerd/linkerd2-proxy#730)
gcr.io has an issue that it's not possible to update multi-arch images
(see eclipse/che#16983 and open-policy-agent/gatekeeper#665).
We're now relying on ghcr.io instead, which I verified doesn't have this
bug, so we can stop skipping these pushes.
Used to be triggered only for stable releases, but now that 2.9 stable
approaches let's turn it on for the upcoming RCs.
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
Fixes#5098
When setting up multicluster, a target cluster may wish to create multiple service accounts to be used by source clusters' service mirrors. This allows the target cluster to individually revoke access to each of the source clusters. When using the Linkerd CLI, this can be accomplished by running the `linkerd multicluster allow` command multiple times to create multiple service accounts. However, there is no analogous workflow when installing with Helm.
We update the Helm templates to support interpreting the `remoteMirrorServiceAccountName` value as either a single string or a list of strings. In the case where it is a list, we create a service account and associated RBAC for each entry in the list.
Signed-off-by: Alex Leong <alex@buoyant.io>
Followup to #5100
We had both `controllerImageVersion` and `global.controllerImageVersion`
configs, but only the latter was taken into account in the chart
templates, so this change removes all of its references.
In #5110 the `global.proxy.destinationGetNetworks` configuration is
renamed to `global.clusterNetworks` to better reflect its purpose.
The `config.linkerd.io/proxy-destination-get-networks` annotation allows
this configuration to be overridden per-workload, but there's no real use
case for this. I don't think we want to support this value differing
between pods in a cluster. No good can come of it.
This change removes support for the `proxy-destination-get-networks`
annotation.
In order for the integration tests to run successfully on a dedicated ARM cluster, two small changes are necessary:
* We need to skip the multicluster test since this test uses two separate clusters (source and target)
* We need to properly uninstall the multicluster helm chart during cleanup.
With these changes, I was able to successfully run the integration tests on a dedicated ARM cluster.
Signed-off-by: Alex Leong <alex@buoyant.io>
There is no longer a proxy config `DESTINATION_GET_NETWORKS`. Instead of
reflecting this implementation in our values.yaml, this changes this
variable to the more general `clusterNetworks` to emphasize its
similarity to `clusterDomain` for the purposes of discovery.
The proxy no longer honors DESTINATION_GET variables, as profile lookups
inform when endpoint resolution is performed. Also, there is no longer
a router capacity limit.
As described in #5105, it's not currently possible to set the proxy log
level to `off`. The proxy injector's template does not quote the log
level value, and so the `off` value is handled as `false`. Thanks, YAML.
This change updates the proxy template to use helm's `quote` function
throughout, replacing manually quoted values and fixing the quoting for
the log level value.
We also remove the default logFormat value, as the default is specified
in values.yaml.
Currently the tracing deployments do not start on clusters where
restricted PodSecurityPolicies are enforced.
This PR adds the subchart's ServiceAccounts to the `linkerd-psp`
RoleBinding, thereby allowing the deployments to be satisfied.
Signed-off-by: Simon Weald <glitchcrab-github@simonweald.com>
This release fixes a minor regression in outbound tap data, where the
source TCP address was omitted.
This release also improves logging:
- uptime formatting is fixed to only display microsecond granularity,
which fixes formatting/alignment inconsistencies.
- The `off` log level is now special-cased to entirely disable the
logging subsystem. This can substantially reduce memory usage.
---
* telemetry: Include git SHA in build_info (linkerd/linkerd2-proxy#716)
* outbound: Set source address in Tap metadata (linkerd/linkerd2-proxy#718)
* outbound: test profile search nets filtering (linkerd/linkerd2-proxy#714)
* app: Consolidate metrics types in `core::metrics` (linkerd/linkerd2-proxy#709)
* outbound: test load balancer adding/removing TCP endpoints (linkerd/linkerd2-proxy#717)
* Remove hardcoded list of ports to skip (linkerd/linkerd2-proxy#719)
* admin: Simplify metrics server (linkerd/linkerd2-proxy#720)
* Split tracing init & admin handlers into crate (linkerd/linkerd2-proxy#721)
* tracing: Fix time formatting to ensure alignment (linkerd/linkerd2-proxy#722)
* tracing: Support disabling tracing entirely (linkerd/linkerd2-proxy#723)
It appears that Amazon can use the `100.64.0.0/10` network, which is
technically private, for a cluster's Pod network.
Wikipedia describes the network as:
> Shared address space for communications between a service provider
> and its subscribers when using a carrier-grade NAT.
In order to avoid requiring additional configuration on EKS clusters, we
should permit discovery for this network by default.
## Motivations
Closes#5080
## Solution
When the `--all-namespaces` (`-A`) flag is set for the `linkerd edges` command,
ignore the `namespace` value set by default or `-n`.
This is similar to the behavior for `kubectl`. `kubectl get -A -n linkerd pods`
showing pods in all namespaces.
### Behavior changes
With linkerd and emojivoto installed, this results in:
Before:
```
❯ linkerd edges -A pods
No edges found.
```
After:
```
❯ linkerd edges -A pods
SRC DST SRC_NS DST_NS SECURED
vote-bot-6cb9cb9569-wl6w5 web-5d69bcfdb7-mxf8f emojivoto emojivoto √
web-5d69bcfdb7-mxf8f emoji-7dc976587b-rb9c5 emojivoto emojivoto √
web-5d69bcfdb7-mxf8f voting-bdf4f778c-pjkjg emojivoto emojivoto √
linkerd-prometheus-68d6897d75-ghmgm emoji-7dc976587b-rb9c5 linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm vote-bot-6cb9cb9569-wl6w5 linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm voting-bdf4f778c-pjkjg linkerd emojivoto √
linkerd-prometheus-68d6897d75-ghmgm web-5d69bcfdb7-mxf8f linkerd emojivoto √
linkerd-controller-7d965cf78d-qw6xj linkerd-prometheus-68d6897d75-ghmgm linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-controller-7d965cf78d-qw6xj linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-destination-74dbb9c46b-nkxgh linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-grafana-5d9fb67dc6-sn2l8 linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-identity-c875b5d58-b756v linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-proxy-injector-767b55988d-n9r6f linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-sp-validator-6c8df84fb9-4w8kc linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-tap-777fbf7656-p87dm linkerd linkerd √
linkerd-prometheus-68d6897d75-ghmgm linkerd-web-546c9444b5-68xpx linkerd linkerd √
```
`linkerd edges -A -n linkerd pods` results in all edges as well (the result
above).
The behavior of `linkerd edges pods` does not change and shows edges in the
`default` namespace.
```
❯ linkerd edges pods
No edges found.
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
The proxy has a default, hardcoded set of ports on which it doesn't do
protocol detection (25, 587, 3306 -- all of which are server-first
protocols). In a recent change, this default set was removed from
the outbound proxy, since there was no way to configure it to anything
other than the default set. I had thought that there was a default set
applied to proxy-init, but this appears to not be the case.
This change adds these ports to the default Helm values to restore the
prior behavior.
I have also elected to include 443 in this set, as it is generally our
recommendation to avoid proxying HTTPS traffic, since the proxy provides
very little value on these connections today.
Additionally, the memcached port 11211 is skipped by default, as clients
do not issue any sort of preamble that is immediately detectable.
These defaults may change in the future, but seem like good choices for
the 2.9 release.
Most invocations of `TestHelper.LinkerdRun` don't actually need the stderr
output except to encode it in the error message. This changes this helper
to return an error that includes the full invoked command and error message.
Invocations that need direct access to stderr must call `TestHelper.PipeToLinkerdRun`
This reverts commit 85cbcb4a85.
We disable the ARM integration tests for now until we have more confidence in them.
Signed-off-by: Alex Leong <alex@buoyant.io>
The SMI metrics image does not yet support arm. Thus we must skip the SMI metrics integration test when using arm.
Signed-off-by: Alex Leong <alex@buoyant.io>
The release workflow uses the `-skip-kind-create` flag when the flag is actually called `-skip-cluster-create`. This causes the workflow to fail.
We correct the flag name.
Signed-off-by: Alex Leong <alex@buoyant.io>