* Enable mixed configuration of skip-[inbound|outbound]-ports using port numbers and ranges (#3752)
* included tests for generated output given proxy-ignore configuration options
* renamed "validate" method to "parseAndValidate" given mutation
* updated documentation to denote inclusiveness of ranges
* Updates for expansion of ignored inbound and outbound port ranges to be handled by the proxy-init rather than CLI (#3766)
This change maintains the configured ports and ranges as strings rather than unsigned integers, while still providing validation at the command layer.
* Bump versions for proxy-init to v1.3.0
Signed-off-by: Paul Balogh <javaducky@gmail.com>
* Inject preStop hook into the proxy sidecar container to stop it last
This commit adds support for a Graceful Shutdown technique that is used
by some Kubernetes administrators while the more perspective
configuration is being discussed in
https://github.com/kubernetes/kubernetes/issues/65502
The problem is that RollingUpdate strategy does not guarantee that all
traffic will be sent to a new pod _before_ the previous pod is removed.
Kubernetes inside is an event-driven system and when a pod is being
terminating, several processes can receive the event simultaneously.
And if an Ingress Controller gets the event too late or processes it
slower than Kubernetes removes the pod from its Service, users requests
will continue flowing into the black whole.
According [to the documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods)
> 1. If one of the Pod’s containers has defined a `preStop` hook,
> it is invoked inside of the container. If the `preStop` hook is still
> running after the grace period expires, step 2 is then invoked with
> a small (2 second) extended grace period.
>
> 2. The container is sent the `TERM` signal. Note that not all
> containers in the Pod will receive the `TERM` signal at the same time
> and may each require a preStop hook if the order in which
> they shut down matters.
This commit adds support for the `preStop` hook that can be configured
in three forms:
1. As command line argument `--wait-before-exit-seconds` for
`linkerd inject` command.
2. As `linkerd2` Helm chart value `Proxy.WaitBeforeExitSeconds`.
2. As `config.alpha.linkerd.io/wait-before-exit-seconds` annotation.
If configured, it will add the following preHook to the proxy container
definition:
```yaml
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep {{.Values.Proxy.WaitBeforeExitSeconds}}
```
To achieve max benefit from the option, the main container should have
its own `preStop` hook with the `sleep` command inside which has
a smaller period than is set for the proxy sidecar. And none of them
must be bigger than `terminationGracePeriodSeconds` configured for the
entire pod.
An example of a rendered Kubernetes resource where
`.Values.Proxy.WaitBeforeExitSeconds` is equal to `40`:
```yaml
# application container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 20
# linkerd-proxy container
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- sleep 40
terminationGracePeriodSeconds: 160 # for entire pod
```
Fixes#3747
Signed-off-by: Eugene Glotov <kivagant@gmail.com>
* Add identity-issuer-certificate-file and identity-issuer-key-file to upgrade command
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
* Implement logic to use identity-trust-anchors-file flag to update the anchors
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* Address remarks
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
* No need for `processYAML()` in `install`
Since `install` uses helm to do its proxy injection, there's no need to
call `processYAML`. This also fixes an issue discovered in #3687 where
we started supporting injection of cronjobs, and even though `linkerd`'s
namespace is flagged to skip automatic injection it was being injected.
This replaces #3773 as it's a much more simpler approach.
* Replaced `uuid` with `uid` from linkerd-config resource
Fixes#3621
Removed the old `uuid` for identifying linkerd installations, and
replaced it with the `uid` property from the `linkerd-config` ConfigMap.
I tested that this `uid` remains the same by updating the config and
also upgrading linkerd, using both the CLI and Helm.
Note that this required granting `linkerd-web` RBAC access to the
`linkerd-config` Config.
I also added an integration test to verify the stability of the uid.
`linkerd check` can now be run from the dashboard in the `/controlplane` view.
Once the check results are received, they are displayed in a modal in a similar
style to the CLI output.
Closes#3613
* Add support for --identity-issuer-mode flag to install cmd
* Change flag to be a bool
* Read correct data form identity when external issuer is used
* Add ability for identity service to dynamically reload certs
* Fix failing tests
* Minor refactor
* Load trust anchors from identity issuer secret
* Make identity service actually watch for issuer certs updates
* Add some testing around cmd line identity options validation
* Add tests ensuring that identity service loads issuer
* Take into account external-issuer flag during upgrade + tests
* Fix failing upgrade test
* Address initial review feedback
* Address further review feedback on cli and helm
* Do not persist --identity-external-issuer
* Some improvements to identitiy service
* Bring back persistane of external issuer flag
* Address more feedback
* Update dockerfiles shas
* Publishing k8s events on issuer certs rotation
* Ensure --ignore-cluster+external issuer is not supported
* Update go-deps shas
* Transition to identity issuer scheme based configuration
* Use k8s consts for secret file names
Signed-off-by: zaharidichev <zaharidichev@gmail.com>
The `linkerd install` `--ignore-cluster` and `--skip-checks` flags
enable generating install manifests without a connection to a k8s
cluster. Unfortunately these flags were only checked after attempted
connections to a k8s cluster were made. This satisfied the use case of
`linkerd install` "ignoring" the state of the cluster, but for
environments not connected to a cluster, the user would have to wait for
30s timeouts before getting the manifests.
Modify `linkerd install` and its subcommands to pre-emptively check for
`--ignore-cluster` and `--skip-checks`. This decreases `linkerd install
--ignore-cluster` from ~30s to ~1s, and `linkerd install control-plane
--ignore-cluster --skip-checks` from ~60s to ~1s.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This PR aims at preventing `--cluster-domain` from being changed during `linkerd upgrade`. I am not sure this is all that is necessary, but it can probably be at least a good start. 🙂Closes#3454.
Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com>
This reverts commit edd3b1f6d4.
This is a temporary revert of #3461 while we sort out some details of how this should configured and how it should interact with configuring a trace collector on the Linkerd proxy. We will reintroduce this change once the config plan is straightened out.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes#278
Add `linkerd install|upgrade --disable-heartbeat` flag, and have
`linkerd check` check for the heartbeat's SA only if it's enabled.
Also added those flags into the `linkerd upgrade -h` examples.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Rename template-values.go
* Define new constructor of charts.Values type
* Move all Helm values related code to the pkg/charts package
* Bump dependency
* Use '/' in filepath to remain compatible with VFS requirement
* Add unit test to verify Helm YAML output
* Alejandro's feedback
* Add unit test for Helm YAML validation (HA)
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* Always use forward-slash when interacting with the VFS
Fixes#3283
Our VFS implementation relies on `net.http.FileSystem` which always
expects `/` regardless of the OS.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
PR #3217 re-introduced container metrics collection to
linkerd-prometheus. This enabled linkerd-heartbeat to collect mem and
cpu metrics at the container-level.
Add container cpu and mem metrics to heartbeat requests. For each of
(destination, prometheus, linkerd-proxy), collect maximum memory and p95
cpu.
Concretely, this introduces 7 new query params to heartbeat requests:
- p99-handle-us
- max-mem-linkerd-proxy
- max-mem-destination
- max-mem-prometheus
- p95-cpu-linkerd-proxy
- p95-cpu-destination
- p95-cpu-prometheus
Part of #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Delete symlink to old Helm chart
* Update 'install' code to use common Helm template structs
* Remove obsolete TLS assets functions.
These are now handle by Helm functions inside the templates
* Read defaults from values.yaml and values-ha.yaml
* Ensure that webhooks TLS assets are retained during upgrade
* Fix a few bugs in the Helm templates (see bullet points):
* Merge the way the 'install' ha and non-ha options are handled into one function
* Honor the 'NoInitContainer' option in the components templates
* Control plane mTLS will not be disabled if identity context in the
config map is empty. The data plane mTLS will still be automatically disabled
if the context is nil.
* Resolve test failures from rebase with master
* Fix linter issues
* Set service account mount path read-only field
* Add TLS variables of the webhooks and tap to values.yaml
During upgrade, these secrets are preserved to ensure they remain synced
wih the CA bundle in the webhook configurations. These Helm variables are used
to override the defaults in the templates.
* Remove obsolete 'chart' folder
* Fix bugs in templates
* Handle missing webhooks and tap TLS assets during upgrade
When upgrading from an older version that don't have these secrets, fallback to let Helm
create them by creating an empty charts.TLS struct.
* Revert the selector labels of webhooks to be compatible with that in 2.4
In 2.4, the proxy injector and profile validator webhooks already have their selector labels defined.
Since these attributes are immutable, the recent change to these selectors introduced by the Helm chart work will cause upgrade to fail.
* Alejandro's feedback
* Siggy's feedback
* Removed redundant unexported custom types
Signed-off-by: Ivan Sim <ivan@buoyant.io>
The web dashboard will be migrating to the new Tap APIService, which
requires RBAC privileges to access.
Introduce a new ClusterRole, `linkerd-linkerd-tap-admin`, which gives
cluster-wide tap privileges. Also introduce a new ClusterRoleBinding,
`linkerd-linkerd-web-admin` which binds the `linkerd-web` service
account to the new tap ClusterRole. This ClusterRoleBinding is enabled
by default, but may be disabled via a new `linkerd install` flag
`--restrict-dashboard-privileges`.
Fixes#3177
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Refactor proxy injection to use Helm charts
Fixes#3128
A new chart `/charts/patch` was created, that generates the JSON patch
payload that is to be returned to the k8s API when doing the injection
through the proxy injector, and it's also leveraged by the `linkerd
inject --manual` CLI.
The VFS was used by `linkerd install` to access the old chart under
`/chart`. Now the proxy injection also uses the Helm charts to generate
the JSON patch (see above) so we've moved the VFS from `cli/static` to a
new common place under `/pkg/charts/static`, and the new root for the VFS is
now `/charts`.
`linkerd install` hasn't yet migrated to use the new charts (that'll
happen in #3127), so the only change in that regard was the creation of
`/charts/chart` which is a symlink pointing to `/chart` that
`install.go` now uses, so that the VFS contains both the old and new
charts, as a temporary measure.
You can see that `/bin/Dockerfile-bin`, `/controller/Dockerfile` and
`/bin/build-cli-bin` do now `go generate` pointing to the new location
(and the `go generate` annotation was moved from `/cli/main.go` to
`pkg/charts/static/templates.go`).
The symlink trick doesn't work when building the binaries through
Docker, so `/bin/Dockerfile-bin` replaces the symlink with an actual
copy of `/chart`.
Also note that in `/controller/Dockerfile` we now need to include the
`prod` tag in `go install` like we do in `/bin/Dockerfile-bin` so that
the proxy injector does use the VFS instead of the local file system.
- The common logic to parse a chart has been moved from `install.go` to
`/pkg/charts/util.go`.
- The special ENV var in the proxy for "outbound router capacity" that
only applies to the Prometheus pod is now handled directly in the proxy
partial and all the associated go code could be removed.
- The `patch.go` lib for generating the JSON patch in go along
with its tests `patch_test.go` are no longer needed.
- Lots of functions in `/pkg/inject/inject.go` got removed/simplified
with their logic being moved into the charts themselves. As a
consequence lots of things in `inject_test.go` became irrelevant.
- Moved `template-values.go` from `/pkg/inject` to `pkg/charts` as that
contains the go structs representation of the chart variables that
will be leveraged in #3127.
Don't forget to run `/bin/helm.sh` whenever you make changes to charts
;-)
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
The Tap Service enabled tapping of any meshed pod, regardless of user
privilege.
This change introduces a new Tap APIService. Kubernetes provides
authentication and authorization of Tap requests, and then forwards
requests to a new Tap APIServer, which implements a Kubernetes
aggregated APIServer. The Tap APIServer authenticates the client TLS
from Kubernetes, and authorizes the user via a SubjectAccessReview.
This change also modifies the `linkerd tap` command to make requests
against the new APIService.
The Tap APIService implements these Kubernetes-style endpoints:
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/tap
POST /apis/tap.linkerd.io/v1alpha1/watch/namespaces/:ns/:res/:name/tap
GET /apis
GET /apis/tap.linkerd.io
GET /apis/tap.linkerd.io/v1alpha1
GET /healthz
GET /healthz/log
GET /healthz/ping
GET /metrics
GET /openapi/v2
GET /version
Users authorize to the new `tap.linkerd.io/v1alpha1` via RBAC. Only the
`watch` verb is supported. Access is also available via subresources
such as `deployments/tap` and `pods/tap`.
This change introduces the following resources into the default Linkerd
install:
- Global
- APIService/v1alpha1.tap.linkerd.io
- ClusterRoleBinding/linkerd-linkerd-tap-auth-delegator
- `linkerd` namespace:
- Secret/linkerd-tap-tls
- `kube-system` namespace:
- RoleBinding/linkerd-linkerd-tap-auth-reader
Tasks not covered by this PR:
- `linkerd top`
- `linkerd dashboard`
- `linkerd profile --tap`
- removal of the unauthenticated tap controller
Fixes#2725, #3162, #3172
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Similar to `kubectl --as`, global flag across all linkerd subcommands
which sets a `ImpersonationConfig` in the Kubernetes API config.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* increased ha resource limits
* added resource limits to proxy when HA
* update golden files in cmd/main
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
`linkerd check`, the web dashboard, and Grafana all perform version
checks to validate Linkerd is up to date. It's common for users to
seldom execute these codepaths. This makes it difficult to identify what
versions of Linkerd are currently in use and what environments it is
being run in, which helps prioritize testing and backports.
Introduce a `heartbeat` CronJob to the default Linkerd install. The
cronjob executes every 24 hours, starting from 5 minutes after
`linkerd install` is run.
Example check URL:
https://versioncheck.linkerd.io/version.json?
install-time=1562761177&
k8s-version=v1.15.0&
meshed-pods=8&
rps=3&
source=heartbeat&
uuid=cc4bb700-3314-426a-9f0f-ec588b9df020&
version=git-b97ee9f7
Fixes#2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Allow custom cluster domain in destination watcher
The change relaxes the constrains of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.
A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.
* Update proto to allow setting custom cluster domain
Update golden templates
* Allow setting custom domain in grpc, web server
* Remove cluster domain flags from web srv and public api
* Set defaultClusterDomain in validateAndBuild if none is set
Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
* Added Anti Affinity when HA is configured
* Move check to validate()
* Test output with anti-affinity when ha upgrade
* Add anti-affinity to identity deployment
* made host anti-affinity default when ha
* Define affinity template in a separate file
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
The existing `linkerd install` error message for existing resources was
shared with `linkerd check`. Given the different contexts, the messaging
made more sense for `linkerd check` than for `linkerd install`.
Modify the error messaging for `linkerd install` to print a bare list
of existing resources, and provide instructions for proceeding.
For example:
```bash
$ linkerd install
Unable to install the Linkerd control plane. It appears that there is an existing installation:
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity
If you are sure you'd like to have a fresh install, remove these resources with:
linkerd install --ignore-cluster | kubectl delete -f -
Otherwise, you can use the --ignore-cluster flag to overwrite the existing global resources.
```
Fixes#3045
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
PR #2603 modified the web process to read the UUID from the
`linkerd-config` ConfigMap rather than from a command line flag. The
`linkerd check` command relied on that command line flag to retrieve the
UUID as part of its version check.
Modify `linkerd check` to correctly retrieve the UUID from
`linkerd-config`. Also refactor `linkerd-config` retrieval and parsing
code to be shared between healthcheck, install, and upgrade.
Relates to #2961
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This change implements the DstOverrides feature of the destination profile API (aka traffic splitting).
We add a TrafficSplitWatcher to the destination service which watches for TrafficSplit resources and notifies subscribers about TrafficSplits for services that they are subscribed to. A new TrafficSplitAdaptor then merges the TrafficSplit logic into the DstOverrides field of the destination profile.
Signed-off-by: Alex Leong <alex@buoyant.io>
* Introduce new checks to determine existence of global resources and the
'linkerd-config' config map.
* Update pre-check to check for existence of global resources
This ensures that multiple control planes can't be installed into
different namespaces.
* Update integration test clean-up script to delete psp and crd
Signed-off-by: Ivan Sim <ivan@buoyant.io>
Fixes#2927
Also moved `TestInstallSP` after `TestCheckPostInstall` so we're sure
the validating webhook is ready before installing a service profile.
Signed-off-by: Alejandro Pedraza Borrero <alejandro@buoyant.io>
* Add control plane and CNI PSP and RBAC resources
* Add the '--linkerd-cni-enabled' flag to the multi-stage install subcommands
This flag ensures that the NET_ADMIN capability is omitted from the control
plane's PSP during 'install config' and the proxy-init containers aren't
injected during 'install control-plane'.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
* If HA, set the webhooks failure policy to 'Fail'
I'm adding to the linkerd namespace a new label
`linkerd.io/is-control-plane: true` that is used in the webhook configs'
selector to skip the proxy injector for this namespace. This avoids
running into the timing issues described in #2852.
Fixes#2852
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Fix HA during upgrade
If we have a Linkerd installation with HA, and then we do `linkerd
upgrade` without specifying `--ha`, the replicas will get set back to 1,
yet the resource requests will keep their HA values.
Desired behavior: `linkerd install --ha` adds the `ha` value into the
linkerd-config, so it should be used during upgrade even if `--ha` is
not passed to `linkerd upgrade`.
Note we still can do `linkerd upgrade --ha=false` to disable HA.
This is a prerequesite to address #2852
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
Split proxy-init into separate repo
Fixes#2563
The new repo is https://github.com/linkerd/linkerd2-proxy-init, and I
tagged the latest there `v1.0.0`.
Here, I've removed the `/proxy-init` dir and pinned the injected
proxy-init version to `v1.0.0` in the injector code and tests.
`/cni-plugin` depends on proxy-init, so I updated the import paths
there, and could verify CNI is still working (there is some flakiness
but unrelated to this PR).
For consistency, I added a `--init-image-version` flag to `linkerd
inject` along with its corresponding override config annotation.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* Update helm charts to include webhooks config and TLS secret
* Update the webhooks to read the secret cert and key
* Update webhooks to not recreate config on restart
* Ensure upgrade preserve existing secrets
* Revert the change to rename the webhook configs
The renaming change breaks upgrade, where the new webhook configs conflict with
the existing ones. The older resources aren't deleted during upgrade because
they are dynamically created.
* Make the secret volume read-only
* Remove unnecessary exported getter functions
* Remove obsolete mwc and vwc templates
Signed-off-by: Ivan Sim <ivan@buoyant.io>
The multi-stage args used by install, upgrade, and check were
implemented as positional arguments to their respective parent commands.
This made the help documentation unclear, and the code ambiguous as to
which flags corresponded to which stage.
Define `config` and `control-plane` stages as subcommands. The help
menus now explicitly state flags supported.
Fixes#2729
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
All ServiceAccounts are intended to be grouped together with other RBAC
resources, particularly for `linkerd install config` output. Grafana and
Web ServiceAccounts were still included with their respective
Deployments.
Group Grafana and Web ServiceAccounts with other RBAC resources.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
`linkerd install` supports a 2-stage install process, `linkerd upgrade`
did not.
Add 2-stage support for `linkerd upgrade`. Also exercise multi-stage
functionality during upgrade integration tests.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Numerous codepaths have emerged that create k8s configs, k8s clients,
and make k8s api requests.
This branch consolidates k8s client creation and APIs. The primary
change migrates most codepaths to call `k8s.NewAPI` to instantiate a
`KubernetesAPI` struct from `pkg`. `KubernetesAPI` implements the
`kubernetes.Interface` (clientset) interface, and also persists a
`client-go` `rest.Config`.
Specific list of changes:
- removes manual GET requests from `k8s.KubernetesAPI`, in favor of
clientsets
- replaces most calls to `k8s.GetConfig`+`kubernetes.NewForConfig` with
a single `k8s.NewAPI`
- introduces a `timeout` param to `k8s.NewAPI`, currently only used by
healthchecks
- removes `NewClientSet` in `controller/k8s/clientset.go` in favor of
`k8s.NewAPI`
- removes `httpClient` and `clientset` from `HealthChecker`, use
`KubernetesAPI` instead
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Fixes#2720 and 2711
This changes the default behavior of `linkerd inject` to not inject the
proxy but just the `linkerd.io/inject: enabled` annotation for the
auto-injector to pick it up (regardless of any namespace annotation).
A new `--manual` mode was added, which behaves as before, injecting
the proxy in the command output.
The unit tests are running with `--manual` to avoid any changes in the
fixtures.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
* The 'linkerd-version' CLI flag is renamed to 'control-plane-version'
* Add version field to proxy config
* Add the control plane version to the global config
* Unit test for init image version
* Use more specific control plane and proxy versions in unit tests
Signed-off-by: Ivan Sim <ivan@buoyant.io>
This is an initial change to separate out config-specific k8s objects
from the control-plane components. The eventual goal will be rendering
these configs as the first stage of a multi-stage install.
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
This change introduces some unit tests on individual methods in the
upgrade code path, along with some minor cleanup.
Part of #2637
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
When upgrading from an older cluster that has a Linkerd config but no
identity, we need to generate an identity context so that the cluster is
configured properly.
Fixes#2650
The UUID implementation we use to generate install IDs is technically
not random enough for secure uses, which ours is not. To prevent
security scanners like SNYK from flagging this false-positive, let's
just switch to the other UUID implementation (Already in our
dependencies).
The instalOnlyFlagSet incorrectly extends the recordableFlagSet.
I'm not sure if this has any potential for unexpected user interactions,
but it's at least confusing when reading the code.
This change makes the flag sets distinct.
Add validation webhook for service profiles
Fixes#2075
Todo in a follow-up PRs: remove the SP check from the CLI check.
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>
When the --ha flag is set, we currently set a 10m CPU request, which
corresponds to 1% of a core, which isn't actually enough to keep the
proxy responding to health checks if you have 100 processes on the box.
Let's give ourselves a little more breathing room.
Fixes#2643
* Disable external profiles by default
* Rename the --disable-external-profiles flag to --enable-external-profiles
Signed-off-by: Ivan Sim <ivan@buoyant.io>
The `install` command errors when the deploy target contains an existing
Linkerd deployment. The `upgrade` command is introduced to reinstall or
reconfigure the Linkerd control plane.
Upgrade works as follows:
1. The controller config is fetched from the Kubernetes API. The Public
API is not used, because we need to be able to reinstall the control
plane when the Public API is not available; and we are not concerned
about RBAC restrictions preventing the installer from reading the
config (as we are for inject).
2. The install configuration is read, particularly the flags used during
the last install/upgrade. If these flags were not set again during the
upgrade, the previous values are used as if they were passed this time.
The configuration is updated from the combination of these values,
including the install configuration itself.
Note that some flags, including the linkerd-version, are omitted
since they are stored elsewhere in the configurations and don't make
sense to track as overrides..
3. The issuer secrets are read from the Kubernetes API so that they can
be re-used. There is currently no way to reconfigure issuer
certificates. We will need to create _another_ workflow for
updating these credentials.
4. The install rendering is invoked with values and config fetched from
the cluster, synthesized with the new configuration.
When installing Linkerd, a user may override default settings, or may
explicitly configure defaults. Consider install options like `--ha
--controller-replicas=4` -- the `--ha` flag sets a new default value for
the controller-replicas, and then we override it.
When we later upgrade this cluster, how can we know how to configure the
cluster?
We could store EnableHA and ControllerReplicas configurations in the
config, but what if, in a later upgrade, the default value changes? How
can we know whether the user specified an override or just used the
default?
To solve this, we add an `Install` message into a new config.
This message includes (at least) the CLI flags used to invoke
install.
upgrade does not specify defaults for install/proxy-options fields and,
instead, uses the persisted install flags to populate default values,
before applying overrides from the upgrade invocation.
This change breaks the protobuf compatibility by altering the
`installation_uuid` field introduced in 9c442f6885.
Because this change was not yet released (even in an edge release), we
feel that it is safe to break.
Fixes https://github.com/linkerd/linkerd2/issues/2574
This change moves resource-templating logic into a dedicated template,
creates new values types to model kubernetes resource constraints, and
changes the `--ha` flag's behavior to create these resource templates
instead of hardcoding the resource constraints in the various templates.
When reading a Linkerd configuration, we cannot determine whether
auto-inject should be configured.
This change adds auto-inject configuration to the global config
structure. Currently, this configuration is effectively boolean,
determined by the presence of an empty value (versus a null).
Currently, the install UUID is regenerated each time `install` is run.
When implementing cluster upgrades, it seems most appropriate to reuse
the prior UUID, rather than generate a new one.
To this end, this change stores an "Installation UUID" in the global
linkerd config.
Because the linkerd-config resource is created after pods that require
it, they can be started before the files are mounted, causing the pods
to restart integration tests to fail.
If we extract the config into its own template file, it can be inserted
before pods are created.
The introduction of identity in 0626fa37 created new state in the
control plane's configuration that must be considered when re-installing
the control plane or when injecting pods.
This change alters `install` to fail if it would seem to conflict with
an existing installation. This behavior may be disabled with the
`--ignore-cluster` flag.
Furthermore, `inject` now _requires_ that it can fetch a configuration
from the control plane in order to operate. Otherwise the
`--ignore-cluster` and `--disable-identity` flags must be specified.
This change does not actually instrument pods to use identity yet---it
lays the framework for proxy identity without changing the test fixture
output (besides a change to how identity HA is configured).
Fixes#2531
https://github.com/linkerd/linkerd2/pull/2521 introduces an "Identity"
controller, but there is no way to include it in linkerd installation.
This change alters the `install` flow as follows:
- An Identity service is _always_ installed;
- Issuer credentials may be specified via the CLI;
- If no Issuer credentials are provided, they are generated each time `install` is called.
- Proxies are NOT configured to use the identity service.
- It's possible to override the credential generation logic---especially
for tests---via install options that can be configured via the CLI.
The new proxy has changed its configuration as follows:
- `LISTENER` urls are now `LISTEN_ADDR` addresses;
- `CONTROL_URL` is now `DESTINATION_SVC_ADDR`;
- `*_NAMESPACE` vars are no longer needed;
- The `PROXY_ID` is now the `DESTINATION_CONTEXT`;
- The "metrics" port is now the "admin" port, since it serves more than
just metrics;
- A readiness probe now checks a dedicated /ready endpoint eagerly.
Identity injection is **NOT** configured by this branch.
The proxy's TLS implementation has changed to use a new _Identity_ controller.
In preparation for this, the `--tls=optional` CLI flag has been removed
from install and inject; and the `ca` controller has been deleted. Metrics
and UI treatments for TLS have **not** been removed, as they will continue to
be valuable for the new Identity system.
With the removal of the old identity scheme, the Destination service's proxy
ID field is now set with an opaque string (e.g. `ns:emojivoto`) to enable
locality awareness.
linkerd/linkerd2#1721 introduced a `--single-namespace` install flag,
enabling the control-plane to function within a single namespace. With
the introduction of ServiceProfiles, and upcoming identity changes, this
single namespace mode of operation is becoming less viable.
This change removes the `--single-namespace` install flag, and all
underlying support. The control-plane must have cluster-wide access to
operate.
A few related changes:
- Remove `--single-namespace` from `linkerd check`, this motivates
combining some check categories, as we can always assume cluster-wide
requirements.
- Simplify the `k8s.ResourceAuthz` API, as callers no longer need to
make a decision based on cluster-wide vs. namespace-wide access.
Components either have access, or they error out.
- Modify the web dashboard to always assume ServiceProfiles are enabled.
Reverts #1721
Part of #2337
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Changed the protobuf definition to take out destinationApiPort entirely
* Store destinationAPIPort as a constant in pkg/inject.go
Fixes#2351
Signed-off-by: Aditya Sharma <hello@adi.run>
- Created the pkg/inject package to hold the new injection shared lib.
- Extracted from `/cli/cmd/inject.go` and `/cli/cmd/inject_util.go`
the core methods doing the workload parsing and injection, and moved them into
`/pkg/inject/inject.go`. The CLI files should now deal only with
strictly CLI concerns, and applying the json patch returned by the new
lib.
- Proceeded analogously with `/cli/cmd/uninject.go` and
`/pkg/inject/uninject.go`.
- The `InjectReport` struct and helping methods were moved into
`/pkg/inject/report.go`
- Refactored webhook to use the new injection lib
- Removed linkerd-proxy-injector-sidecar-config ConfigMap
- Added the ability to add pod labels and annotations without having to
specify the already existing ones
Fixes#1748, #2289
Signed-off-by: Alejandro Pedraza <alejandro.pedraza@gmail.com>
linkerd/linkerd2#2349 introduced a `SelfSubjectAccessReview` check at
startup, to determine whether each control-plane component should
establish Kubernetes watches cluster-wide or namespace-wide. If this
check occurs before the linkerd-proxy sidecar is ready, it fails, and
the control-plane component restarts.
This change configures each control-plane pod to skip outbound port 443
when injecting the proxy, allowing the control-plane to connect to
Kubernetes regardless of the `linkerd-proxy` state.
A longer-term fix should involve a more robust control-plane startup,
that is resilient to failed Kubernetes API requests. An even longer-term
fix could involve injecting `linkerd-proxy` as a Kubernetes "sidecar"
container, when that becomes available.
Workaround for #2407
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Also, some protobuf updates:
* Rename `api_port` to match recent changes in CLI code.
* Remove the `cni` message because it won't be used.
* Remove `registry` field from proto types. This helps to avoid having to workaround edge cases like fully-qualified image name in different format, and overriding user-specified Linkerd version etc.
Signed-off-by: Ivan Sim <ivan@buoyant.io>
Add options in CLI for setting proxy CPU and memory limits
- Deprecated `proxy-cpu` and `proxy-memory` in favor of `proxy-cpu-limit` and `proxy-memory-limit`
- Updated validations and tests to reflect new options
Signed-off-by: TwinProduction <twin@twinnation.org>
chart/templates/base.yaml is nearly 800 lines and contains the
kubernetes configurations for the marjority of the control plane.
Furthermore, its contents are not particularly organized (for example,
the prometheus RBAC bindings are in the middle of the controller's
configuration).
The size and complexity of this file makes it especially daunting to
introduce new functionality.
In order to make the situation easier to understand and change, this
splits base.yaml into several new template files: namespace, controller,
serviceprofile, and prometheus, and grafana. The `tls.yaml` template has
been renamed `ca.yaml`, since it installs the `linkerd-ca` resources.
This change also makes the comments uniform, adding a "header" to each
logical component.
Fixes#2154
Up until now, the proxy-api controller service has been the sole service
that the proxy communicates with, implementing the majoriry of the API
defined in the `linkerd2-proxy-api` repo. But this is about to change:
linkerd/linkerd2-proxy-api#25 introduces a new Identity service; and
this service must be served outside of the existing proxy-api service
in the linkerd-controller deployment (so that it may run under a
distinct service account).
With this change, the "proxy-api" name becomes less descriptive. It's no
longer "the service that serves the API for the proxy," it's "the
service that serves the Destination API to the proxy." Therefore, it
seems best to bite the bullet and rename this to be the "destination"
service (i.e. because it only serves the
`io.linkerd.proxy.destination.Destination` service).
Co-authored-by: Kevin Lingerfelt <kl@buoyant.io>
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The `linkerd check` command was doing limited validation on
ServiceProfiles.
Make ServiceProfile validation more complete, specifically validate:
- types of all fields
- presence of required fields
- presence of unknown fields
- recursive fields
Also move all validation code into a new `Validate` function in the
profiles package.
Validation of field types and required fields is handled via
`yaml.UnmarshalStrict` in the `Validate` function. This motivated
migrating from github.com/ghodss/yaml to a fork, sigs.k8s.io/yaml.
Fixes#2190
# Problem
In order to switch Linkerd template rendering to use `.yaml` files, static
assets must be bundled in the Go binary for use by `linkerd install`.
# Solution
The solution should not affect the local development process of building and
testing.
[vfsgen](https://github.com/shurcooL/vfsgen) generates Go code that statically
implements the provided `http.FileSystem`. Paired with `go generate` and Go
[build tags](https://golang.org/pkg/go/build/), we can continue to use the
template files on disk when developing with no change required.
In `!prod` Go builds, the `cli/static/templates.go` file provides a
`http.FileSystem` to the local templates. In `prod` Go builds, `go generate
./cli` generates `cli/static/generated_templates.gogen.go` that statically
provides the template files.
When built with `-tags prod`, the executable will be built with the staticlly
generated file instead of the local files.
# Validation
The binaries were compiled locally with `bin/docker-build`. The binaries were
then tested with `bin/test-run (pwd)/target/cli/darwin/linkerd`. All tests
passed.
No change was required to successfully run `bin/go-run cli install`. No change
was required to run `bin/linkerd install`.
Fixes#2153
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
In linkerd/linkerd2-proxy#186, the proxy supports configuration of TCP
keepalive values.
This change sets `LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE` and
`LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE` to 10s when injecting the
proxy, so that remote connections are configured with a keepalive.
This configuration is NOT yet exposed through the CLI. This may be done
in a followup, if necessary.
Fixes#1949
* Add pod spec annotation to disable injection in CLI and auto-injector
* Remove support for linkerd.io/auto-inject label entirely
* Update based on review feedback
* Fix issue with finding the namespace of deployments applied to the default ns
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Use `ca.NewCA()` for generating certs and keys for the proxy injector
- Remove from CA controller everything that dealt with the
webhook/proxy-injector
- Remove no longer needed proxy-injector volumes for 'trust-anchors' and
'webhook-secrets'
- Remove from the proxy-injector the retrieval of the trust anchor and
secrets
- tls flag during install is no longer needed for auto-inject to work
Fixes#2095 and fixes#2166
Signed-off-by: Alejandro Pedraza <alejandro@buoyant.io>