## Motivation
Closes#4950
## Solution
Add the `config.linkerd.io/opaque-ports` annotation to either a namespace or pod
spec to set the proxy `LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION`
environment variable.
Currently this environment variable is not used by the proxy, but will be
addressed by #4938.
## Valid values
Ports: `config.linkerd.io/opaque-ports: 4322,3306`
Port ranges: `config.linkerd.io/opaque-ports: 4320-4325`
Mixed ports and port ranges: `config.linkerd.io/opaque-ports: 4320-4325`
If the pod has named ports such as:
```
- name: nginx
image: nginx:latest
ports:
- name: nginx-port
containerPort: 80
protocol: TCP
```
The name can also be used as a value: `config.linkerd.io/opaque-ports:
nginx-port`
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Adds bin/certs-openssl, which creates self-signed root cert/key and issuer cert/key using openssl. This will be used in the two clusters set up in the multicluster integration test (followup PR), given CI already has openssl and to avoid having to install step.
Adds a new flag `--certs-path` to the integration tests, pointing to the path where those certs (ca.crt, ca.key, issuer.key and issuer.crt) will be located to be fed into linkerd install's `--identity-*` flags.
When the service-mirror component can't reach the target's k8s API, the goroutine blocks and it can't be unblocked.
This was happenining specifically in the case of the multicluster integration test (still to be pushed), where the source and target clusters are created in quick succession and the target's API service doesn't always have time to be exposed before being requested by the service mirror.
The fix consists on no longer have restartClusterWatcher be side-effecting, and instead return an error. If such error is not nil then the link watcher is stopped and reset after 10 seconds.
## edge-20.9.4
This edge release introduces support for authenticated docker registries and
fixes a recent multicluster regression.
* Fixed a regression in multicluster gateway configurations that would forbid
inbound gateway traffic
* Upgraded bundled Grafana to v7.1.5
* Enabled Jaeger receiver in collector configuration in Helm chart (thanks
@olivierboudet!)
* Fixed skip port configuration being skipped in CNI plugin
* Introduced support for authenticated docker registries (thanks @c-n-c!)
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
* Integration test for smi-metrics
This PR adds an integration test which installs SMI-Metrics and performs
queries and matches the reply with a regex query.
Currently, We store the SMI Helm pkg locally and run the test on top, so
That our CI does not break and we will periodically update the package
based on the newer releases of SMI-Metrics
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* tests: Add new CNI deep integration tests
Fixes#3944
This PR adds a new test, called cni-calico-deep which installs the Linkerd CNI
plugin on top of a cluster with Calico and performs the current integration tests on top, thus
validating various Linkerd features when CNI is enabled. For Calico
to work, special config is required for kind which is at `cni-calico.yaml`
This is different from the CNI integration tests that we run in
cloud integration which performs the CNI level integration tests.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Introduce support for authenticated docker registries using imagePullSecrets
Problem: Private Docker Registries are not supported for the moment as detailed in issue #4413
Solution: Every Service Account of linkerd subcomponents are Attached with imagePullSecrets,
which in turn can then pulls the docker images from authenticated private registries using them.
The imagePullSecret is configured in global.imagePullSecret parameter of values.yaml like
imagePullSecret:
- name: <name-of-private-registry-secret-resource>
Fixes#4413
Signed-off-by: Nilakhya <nilakhya@hotmail.com>
* CNI: Use skip ports configuration in CNI
This PR updates the install and `cmdAdd` workflow (which is called
for each new Pod creation) to retrieve and set the configured Skip
Ports. This also updates the `cmdAdd` workflow to check if the new
pod is a control plane Pod, and adds `443` to OutBoundSkipPort so
that 443 (used with k8s API) is skipped as it was causing errors because
a resolve lookup was happening for them which is not intended.
When some test failed in the middle of the
`./tests/integration/install_test.go` suite, multicluster resources can
be left-over, which `./bin/test-cleanup` wasn't removing.
This was affecting the ARM integration tests, that require good cleanup
since they use a non-transient cluster.
This edge release includes fixes and updates for the control plane and
CLI.
* Added `--dest-cni-bin-dir` flag to the `linkerd install-cni` command,
to configure the directory on the host where the CNI binary will be
placed
* Removed `collector.name` and `jaeger.name` config fields from the
tracing addon
* Updated Jaeger to 1.19.2
* Fixed a warning about deprecated Go packages in controller container
logs
* Do not run cloud integration tests in CI
Closes#4963
Removed the `./.github/workflows/cloud_integration.yml` workflow, and
removed the `cloud_integration_tests` job from the ``./.github/workflows/release.yml` workflow.
This adds a `replace` statement to `go.mod` to force the newer version `1.14.x` of `github.com/grpc-ecosystem/grpc-gateway` to avoid the following warning in all the controller container logs:
```
WARNING: Package "github.com/golang/protobuf/protoc-gen-go/generator" is deprecated.
A future release of golang/protobuf will delete this package,
which has long been excluded from the compatibility promise.
```
More info [here](https://github.com/golang/protobuf/issues/1104)
Currently, This field has to be configured to make CNI work in
GKE clusters as thats where the binaries have to be stored. This
was configurable through Helm, but the same can be allowed through
the CLI too
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* tracing: Move default values into chart
This branch updates the tracing add-on's values into their own chart's values.yaml
(just like grafana and prometheus). This prevents them from being saved into
`linkerd-config-addons` where only the overridden values are stored. Thus allowing
us to change the defaults.
This also
- Updates the check command to fall back to default values, if there are no
overridden name fields.
- Updates jaeger to `1.19.2`
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Push docker images to ghcr.io instead of gcr.io
The `cloud_integration.yml` and `release.yml` workflows were modified to
log into ghcr.io, and remove the `Configure gcloud` step which is no
longer necessary.
Note that besides the changes to cloud_integration.yml and release.yml, there was a change to the upgrade-stable integration test so that we do linkerd upgrade --addon-overwrite to reset the addons settings because in stable-2.8.1 the Grafana image was pegged to gcr.io/linkerd-io/grafana in linkerd-config-addons. This will need to be mentioned in the 2.9 upgrade notes.
Also the egress integration test has a debug container that now is pegged to the edge-20.9.2 tag.
Besides that, the other changes are just a global search and replace (s/gcr.io\/linkerd-io/ghcr.io\/linkerd/).
This release includes several major changes to the proxy's behavior:
- Service profile lookups are now necessary and fundamental to outbound
discovery for HTTP traffic. That is, if a service profile lookup is
rejected, endpoint discovery will not be performed; and endpoint
discovery must succeed for all destinations that are permitted by
service profiles. This simplifies caching and buffering to reduce
latency (especially under concurrency).
- Service discovery is now performed for all TCP traffic, and
connections are balanced over endpoints according to connection
latency.
- This enables mTLS for **all** meshed connections; not just HTTP.
- Outbound TCP metrics are now hydrated with endpoint-specific labels.
---
* outbound: Cache balancers within profile stack (linkerd/linkerd2-proxy#641)
* outbound: Remove unused error type (linkerd/linkerd2-proxy#648)
* Eliminate the ConnectAddr trait (linkerd/linkerd2-proxy#649)
* profiles: Do not rely on tuples as stack targets (linkerd/linkerd2-proxy#650)
* proxy-http: Remove unneeded boilerplate (linkerd/linkerd2-proxy#651)
* outbound: Clarify Http target types (linkerd/linkerd2-proxy#653)
* outbound: TCP discovery and load balancing (linkerd/linkerd2-proxy#652)
* metrics: Add endpoint labels to outbound TCP metrics (linkerd/linkerd2-proxy#654)
The proxy performs endpoint discovery for unnamed services, but not
service profiles.
The destination controller and proxy have been updated to support
lookups for unnamed services in linkerd/linkerd2#4727 and
linkerd/linkerd2-proxy#626, respectively.
This change modifies the injection template so that the
`proxy.destinationGetNetworks` configuration enables profile
discovery for all networks on which endpoint discovery is permitted.
`./bin/test-cleanup` was trying to remove the
resources with the label `linkerd.io/is-test-helm` which we're not
using. Instead, we simply call `helm delete` on the appropriate helm
releases.
This is required for a clean cleanup after the ARM integration test, whose
cluster is just cleaned by this script at the end and is not torn down.
## edge-20.9.1
This edge release contains an important proxy update that allows linkerd to
continue to operate normally in HA during node outages. We're also adding full
Kubernetes 1.19 support!
* Improved the proxy's error handling for DNS errors encountered when
discovering control plane addresses, which can be common during installation,
before all components have been started
* The destination and identity services had to be made headless in order to
support that new controller discovery (which now can leverage SRV records)
* Use SAN fields when generating the linkerd webhook configs; this completes the
Kubernetes 1.19 support which enforces them
* Fixed `linkerd check` for multicluster that was spuriously claiming the
absence of some resources
* Improved the injection test cleanup (thanks @zhouhao3!)
* Added ability to run the integration test suite using a cluster in an ARM
architecture (thanks @aliariff!)
Updating only the go 1.15 version, makes the upgrades fail from older versions,
as the identity certs do not have that setting and go 1.15 expects them.
This PR upgrades the cert generation code to have that field,
allowing us to move to go 1.15 in later versions of Linkerd.
* Create webhook certs with SANs, along with legacy Common Name
Fixes#4918
In Kubernetes 1.18, Go version has been updated to 1.15 which updated
how certificates are verified. They moved away from legacy Common Name
field to SANs.
This PR replaces the internal Helm cert generation functions to
`genSelfSignedCert` as they allow alternate DNS names to be specified.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The multicluster checks make sure that the correct resources exist for each service mirror controller. When looking up these resources, it uses the `linkerd.io/control-plane-component=linkerd-service-mirror` label selector. However, these resources have the label `linkerd.io/control-plane-component=service-mirror`. This causes the resource lookup to fail to find the resource and the check spuriously fails.
```
× service mirror controller has required permissions
missing ServiceAccounts: linkerd-service-mirror-self
missing ClusterRoles: linkerd-service-mirror-access-local-resources-self
missing ClusterRoleBindings: linkerd-service-mirror-access-local-resources-self
missing Roles: linkerd-service-mirror-read-remote-creds-self
missing RoleBindings: linkerd-service-mirror-read-remote-creds-self
see https://linkerd.io/checks/#l5d-multicluster-source-rbac-correct for hints
| * no service mirror controller deployment for Link self
```
Instead, use the correct label selector when looking up these resources.
Signed-off-by: Alex Leong <alex@buoyant.io>
All of the code for the service mirror controller lives in the `linkerd/linkerd2/controller/cmd` package. It is typical for control plane components to only have a `main.go` entrypoint in the cmd package. This can sometimes make it hard to find the service mirror code since I wouldn't expect it to be in the cmd package.
We move the majority of the code to a dedicated controller package, leaving only main.go in the cmd package. This is purely organizational; no behavior change is expected.
Signed-off-by: Alex Leong <alex@buoyant.io>
## edge-20.8.4
* Fixed a problem causing the `enable-endpoint-slices` flag to not be persisted
when set via `linkerd upgrade` (thanks @Matei207!)
* Removed SMI-Metrics templates and experimental sub-commands
* Use `--frozen-lockfile` to avoid accidental update of dashboard JS
dependencies in CI (thanks @tharun208!)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Fixes#4790
This PR removes both the SMI-Metrics templates along with the
experimental sub-commands. This also removes pkg `smi-metrics`
as there is no direct use of it without the commands.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
## What/How
@adleong pointed out in #4780 that when enabling slices during an upgrade, the new value does not persist in the `linkerd-config` ConfigMap. I took a closer look and it seems that we were never overwriting the values in case they were different.
* To fix this, I added an if block when validating and building the upgrade options -- if the current flag value differs from what we have in the ConfigMap, then change the ConfigMap value.
* When doing so, I made sure to check that if the cluster does not support `EndpointSlices` yet the flag is set to true, we will error out. This is done similarly (copy&paste similarily) to what's in the install part.
* Additionally, I have noticed that the helm ConfigMap template stored the flag value under `enableEndpointSlices` field name. I assume this was not changed in the initial PR to reflect the changes made in the protocol buffer. The API (and thus the CLI) uses the field name `endpointSliceEnabled` instead. I have changed the config template so that helm installations will use the same field, which can then be used in the destination service or other components that may implement slice support in the future.
Signed-off-by: Matei David <matei.david.35@gmail.com>
unit tests in ci are runned using yarn install.
So, there will be some update to the dependencies.
This is fixed by passing --frozen-lockfile in ci workflow
Fixes#3838
Signed-off-by: Tharun <rajendrantharun@live.com>
This edge release adds support for [topology-aware service routing][1]
to the Destination controller. When providing service discovery updates
to proxies, the Destination controller will now filter endpoints based
on the service's topology preferences. Additionally, this release
includes bug fixes for the `linkerd check` CLI command and web
dashboard.
* CLI
* `linkerd check` will no longer warn about a looser webhook failure
policy in HA mode
* Controller
* Added support for [topology-aware service routing][1] to the
Destination controller (thanks @Matei207)
* Changed the Destination controller to always return destination
overrides for service profiles when no traffic split is present
* Web UI
* Fixed Tap `Authority` dropdown not being populated (thanks to
@tharun208!)
[1]: https://kubernetes.io/docs/concepts/services-networking/service-topology/
Tap component is calling fetch metrics with skip_stats and authority
service type is not sent.So, authority dropdown is not getting populated.
Added a seperate call to get metrics for authority
Fixes#4697
Signed-off-by: Tharun <rajendrantharun@live.com>
## Motivation
#4879
## Solution
When no traffic split exists for services, return a single destination override
with a weight of 100%.
Using the destination client on a new linkerd installation, this results in the
following output for `linkerd-identity` service:
```
❯ go run controller/script/destination-client/main.go -method getProfile -path linkerd-identity.linkerd.svc.cluster.local:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"linkerd-identity.linkerd.svc.cluster.local.:8080" weight:100000}
INFO[0000]
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>