47 Commits

Author SHA1 Message Date
Sunny 74e33a70c4 Delete stale metrics on object delete
Use the metrics helper to record all the metrics. The metrics helper
ensures that the metrics for deleted objects are deleted as well.

Move all the metrics recording to be performed at the very end of the
reconciliation. Realtime metrics for readiness are no longer recorded, as
they will be removed in a future version in favour of CRD metrics
collected using kube-state-metrics. Updating the object status with
realtime readiness should provide the readiness to CRD metrics watchers.

`HelmReleaseReconciler.reconcileDelete()` is modified to receive a
pointer to the HelmRelease object so that any modifications to the object
are reflected on the object instance that's passed to the metrics recorder.
This is not needed for `HelmReleaseReconciler.reconcile()` as it returns
a new copy of the object that's saved in the same object variable,
overwriting the object instance with the updates.
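
The difference between the two call styles can be illustrated with a
minimal Go sketch. The `HelmRelease` struct and the function bodies below
are simplified, hypothetical stand-ins and not the controller's actual
code; only the value-versus-pointer behaviour is the point.

```go
package main

import "fmt"

// HelmRelease is a hypothetical stand-in for the real API type.
// Only a single status-like field is modelled.
type HelmRelease struct {
	Ready bool
}

// reconcileDeleteByValue mutates a copy; the caller's instance is unchanged.
func reconcileDeleteByValue(hr HelmRelease) {
	hr.Ready = true
}

// reconcileDeleteByPointer mutates the caller's instance, so a metrics
// recorder holding the same instance observes the update.
func reconcileDeleteByPointer(hr *HelmRelease) {
	hr.Ready = true
}

// reconcile returns an updated copy, which the caller assigns back to
// the same variable, so no pointer is needed.
func reconcile(hr HelmRelease) HelmRelease {
	hr.Ready = true
	return hr
}

func main() {
	a := HelmRelease{}
	reconcileDeleteByValue(a)
	fmt.Println(a.Ready) // false: only the copy was changed

	reconcileDeleteByPointer(&a)
	fmt.Println(a.Ready) // true: the original instance was changed

	b := HelmRelease{}
	b = reconcile(b)
	fmt.Println(b.Ready) // true: the returned copy overwrote b
}
```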

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2023-08-15 02:42:09 +05:30
Hidde Beydals d76f3a355b
controller: jitter requeue interval
This adds a `--interval-jitter-percentage` flag to the controller to
add a +/- percentage jitter to the interval defined in a HelmRelease
(defaults to 5%).

Effectively, this results in a reconciliation every 9.5 - 10.5 minutes
for a resource with an interval of 10 minutes.

The main reason for this change is to mitigate spikes in memory and
CPU usage caused by many resources being configured with the same
interval.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-08-09 17:50:43 +02:00
Hidde Beydals d345af0e73
Rename controllers to controller
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-05-24 11:05:53 +02:00
Hidde Beydals 2ba28c6d9e
Update Kubernetes, controller-runtime and Helm
This commit updates Kubernetes to v1.27, controller-runtime to
v0.15, and Helm to v3.12.

It deals with various breaking changes in controller-runtime, as
documented in the release notes:
https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.15.0

In short:

- `Watches` now use a `client.Object` instead of a `source.Kind`.
- `handler.MapFunc` signature accepts a Go context, which is used to
  log any errors, instead of silently ignoring them and/or panicking.
- Max concurrent reconciles is configured on the manager, instead of
  configuring them per reconciler instance.
- Various manager configuration options have been moved to new
  structures and/or fields.

In addition to this, all other dependencies which had updates
available are updated to their latest versions as well.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-05-24 10:39:51 +02:00
Hidde Beydals 08925bc282
Add reconciler sharding capability based on label
With this enhancement, the controller can be configured with
`--watch-label-selector`, after which only objects with this label will
be reconciled by the controller.

This allows for horizontal scaling of the helm-controller, where the
controller can be deployed multiple times, each with a unique label
selector which is used as the sharding key.

Note that if you want to ensure a `HelmChart` gets created for a
specific source-controller instance, you have to provide the labels for
this controller in `.spec.chart.metadata.labels` of the `HelmRelease`.
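
The selection logic can be sketched as a plain equality match between a
selector and an object's labels. This is a simplified stand-in written
with the standard library only; the label key used here is a
hypothetical example, and a real controller would rely on the Kubernetes
label selector machinery rather than this helper.

```go
package main

import "fmt"

// matchesSelector reports whether every key/value pair in selector is
// present in labels. Only equality-based matching is modelled.
func matchesSelector(labels, selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical sharding key for one controller instance.
	selector := map[string]string{"example.com/shard": "shard1"}

	objects := map[string]map[string]string{
		"podinfo": {"example.com/shard": "shard1"},
		"redis":   {"example.com/shard": "shard2"},
		"nginx":   {},
	}
	for name, labels := range objects {
		fmt.Printf("%s handled by this shard: %t\n", name, matchesSelector(labels, selector))
	}
}
```

Only objects carrying the matching label are picked up by this
instance; unlabelled objects are ignored by it and would need another
instance whose selector covers them.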

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-29 15:37:53 +02:00
Hidde Beydals 3615feef2a
Move `controllers` to `internal/controllers`
There is no good reason for it to be exposed and available through a
public API, and this follows the new kubebuilder defaults.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-27 17:25:31 +02:00
Hidde Beydals b732420f26
oomwatch: auto detect well known cgroup paths
This commit adds support for recognizing cgroup v1 paths, and allows for
the configuration of alternative absolute path locations using
`--oom-watch-max-memory-path` and `--oom-watch-current-memory-path`.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-10 15:08:17 +01:00
Hidde Beydals 3cb5b5c934
Use `logger.SetLogger` to also configure `klog`
This uses the newly introduced helper from runtime, which also
configures the logger for `klog`.

This results in all logs being properly formatted, even when logged by
internal Kubernetes elements like the leader election or a dynamic
client.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-08 00:26:27 +01:00
Hidde Beydals c4566a5459
oomwatch: small tweaks
- Change memory usage percent threshold to `uint8` to no longer allow
  fractions.
- Validate interval to prevent configurations `<50ms`.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-07 10:39:31 +01:00
Hidde Beydals 62456c94ff
Add OOM watcher to allow graceful shutdown
This commit introduces an OOM watcher, which can be enabled using
`--feature-gates=OOMWatch=true`. The OOM watcher watches the current
memory usage as reported by cgroups via `memory.current` and cancels
the context when it reaches a certain threshold compared to
`memory.max` (default `95`%, configurable using
`--oom-watch-memory-threshold`).

This allows ongoing Helm processes to gracefully exit with a failure
before the controller is forcefully OOM killed, preventing a deadlock
of releases in a pending state.

The OOM watcher polls the `memory.current` file on an interval (default
`500ms`, configurable using `--oom-watch-interval`), as subscribing to
file updates using inotify is not possible for cgroups (v2) except for
`*.events` files. `memory.events` does provide signals, but these will
generally be too late for our use case: for example, `high` equals `max`
in most containers, buying us little time to gracefully stop our
processes.

In addition, because we simply watch current usage compared to max
usage in bytes, this approach should work for cgroups v1 as well, given
this has (most of the time) files for these values available, albeit
at times at different locations. This commit does not introduce a flag
for those locations yet, but the library takes into account that they
could be configured at some point.
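
The polling approach described above can be sketched roughly as follows.
All function names are hypothetical, and the example simulates
`memory.current` with a temporary file so that it runs outside a
container; it is an illustration of the technique under those
assumptions, not the watcher shipped in this commit.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"time"
)

// readBytes reads a single unsigned integer from a cgroup-style file.
func readBytes(path string) (uint64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
}

// overThreshold reports whether current is at or above percent of limit.
func overThreshold(current, limit uint64, percent uint8) bool {
	if limit == 0 {
		return false
	}
	return float64(current) >= float64(limit)*float64(percent)/100
}

// watch polls currentPath on every tick and calls cancel once the
// threshold is reached, or returns when ctx is done.
func watch(ctx context.Context, currentPath string, limit uint64, percent uint8,
	interval time.Duration, cancel context.CancelFunc) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			current, err := readBytes(currentPath)
			if err != nil {
				continue // a real watcher would log the error
			}
			if overThreshold(current, limit, percent) {
				cancel()
				return
			}
		}
	}
}

func main() {
	// Simulated memory.current file reporting 96 bytes against a limit of 100.
	dir, err := os.MkdirTemp("", "oomwatch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	current := filepath.Join(dir, "memory.current")
	if err := os.WriteFile(current, []byte("96\n"), 0o600); err != nil {
		panic(err)
	}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go watch(ctx, current, 100, 95, 50*time.Millisecond, cancel)

	select {
	case <-ctx.Done():
		fmt.Println("threshold reached, context cancelled")
	case <-time.After(2 * time.Second):
		fmt.Println("threshold not reached")
	}
}
```

Cancelling a shared context, rather than exiting directly, is what gives
in-flight operations the chance to stop with a failure before the
process is killed.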

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-07 10:39:19 +01:00
Hidde Beydals 1240f20183
Enable experimental drift detection
This enables experimental drift detection of cluster state compared to
the current manifest data from the Helm storage's manifest blob.

Drift detection works based on the already proven approach of the
kustomize-controller's SSA package, and utilizes the managed field
configured by the controller since `v0.12.2`.

This feature is planned to go out of experimental once the further
controller rewrite has been finished, and the state of the Helm storage
itself is more fault tolerant.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-01 09:36:43 +01:00
Aurel Canciu d2b52dece8
Align graceful-shutdown-timeout with terminationGracePeriodSeconds
Set the default value of the graceful-shutdown-timeout flag to
match the default terminationGracePeriodSeconds value we set for the
controller pod container.
It seems the controller-runtime does not support passing -1 as a value
to skip the timeout as documented here:
https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/manager/manager.go#L286

Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
2022-12-21 19:11:27 +01:00
Hidde Beydals 2e96c92918 Set `--graceful-shutdown-timeout` default to `-1`
This is the correct default value as intended in #570.

xref: 92234b3c49/pkg/manager/manager.go (L292-L293)

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-12-20 15:07:42 +00:00
Mac Chaffee 9bcf125e2c
Disable caching of secrets and configmaps by default.
You can re-enable caching of secrets by starting the
controller with the argument '--feature-gates=CacheSecretsAndConfigMaps=true'

Signed-off-by: Mac Chaffee <machaffe@renci.org>
2022-12-19 09:53:01 -05:00
Aurel Canciu e242bb0e8e
Allow overriding ctrl manager graceful shutdown timeout
Override the default GracefulShutdownTimeout option given to the
controller manager with a default of 0 (no timeout), since Helm
operations are sensitive to interruption, which can leave the
HelmRelease in a bad state.

This also allows users to override, via the CLI flag
`-graceful-shutdown-timeout`, how much time to wait before forcibly
exiting.

Related to #569

Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
2022-11-25 12:29:53 +01:00
Hidde Beydals 1bed542fe4 internal/kube: get REST config from runtime
Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-05-12 12:55:36 +02:00
Hidde Beydals 4371610e4b Cherry-pick kube changes from dev
This is a partial cherry-pick of commit ae4f499e87, including
changes around `kube`. It includes some of the changes around the
construction of the ConfigFlags RESTClientGetter, as an attempt to
solve token refresh issues.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-05-12 12:18:41 +02:00
Alex Marston 42aaf61852 Add flags for exponential back-off retry
Signed-off-by: Alex Marston <alexander.marston@gmail.com>
2022-04-19 14:25:51 +02:00
Paulo Gomes 6f4ca28c9a
Add flags to control kubeconfig support
Two new flags were added to allow users to enable the
use of user.Exec and InsecureTLS in the kubeconfigs
provided for remote apply reconciliations.

Breaking change: both functionalities are no longer
enabled by default.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-31 14:51:38 +01:00
Sunny 6bd29a729d Use new standardized runtime and meta package
This includes an update of the source-controller to v0.22.0, to pull in
the v1beta2 API which makes use of the same packages.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-18 13:10:32 +01:00
Stefan Prodan 0173eaa0df
Allow setting a default service account for impersonation
Introduce the `--default-service-account` flag to allow cluster admins to enforce impersonation for resource reconciliation.

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-01-31 12:09:39 +02:00
Somtochi Onyekwere acf164c46e Add flag to disable cross namespace references
Signed-off-by: Somtochi Onyekwere <somtochionyekwere@gmail.com>
2022-01-29 13:51:06 +01:00
Stefan Prodan 049aca937a
Set the managed fields owner to helm-controller
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-10-25 16:18:31 +03:00
Hidde Beydals 43ee4dceee Panic on non-nil AddToScheme errors in main init
Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-06-18 13:20:12 +02:00
Stefan Prodan 50eba699be
Use controller name in LeaderElectionID
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 14:09:34 +02:00
Stefan Prodan 3e558b3fc4
Remove deprecated log-json flag
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 13:31:13 +02:00
Stefan Prodan b74081dbf7
Update fluxcd/pkg/runtime to v0.10.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 12:41:48 +02:00
Stefan Prodan 7622dd9683
Add leader election deadline to cmd args
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-23 19:10:55 +02:00
Stefan Prodan a8dcafaf2e
Retry with exponential backoff when fetching artifacts
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-02-26 13:37:45 +02:00
Hidde Beydals 1459b11116 Enable pprof endpoints on metrics server
Using the helper from `pkg/runtime/pprof`, which follows the suggestion
from controller-runtime to use `AddMetricsExtraHandler`.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-02-07 12:41:17 +01:00
Stefan Prodan d072da6298
Update fluxcd/pkg/runtime to v0.8.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-01-21 19:09:04 +02:00
Hidde Beydals e511cb8af4 Upgrade controller-runtime to v0.7.0
This commit upgrades the `controller-runtime` dependency to `v0.7.0`,
including all changes required to make all wiring work again.

- Upgrade `runtime` to v0.6.0 to include `controller-runtime` changes.
- Loggers have been removed from the reconcilers and are now retrieved
  from the `context.Context` passed to the `Reconcile` method and
  downwards functions.
- Logger configuration flags are now bound to the flag set using
  `BindFlags` from `runtime/logger`, ensuring the same contract across
  GitOps Toolkit controllers, and the `--log-json` flag has been
  deprecated in favour of the `--log-encoding=json` default.
- The `ChangePredicate` from `runtime` has changed to a
  `ReconcilateAtChangedPredicate`, and is now chained with the
  `GenerationChangedPredicate` from `controller-runtime` using
  `predicate.Or`.
- Signatures that made use of `runtime.Object` have changed to
  `client.Object`, removing the requirement to e.g. call
  `runtime.Object#Object`.
- The `leader-election-role` was changed, as leader election now works
  via the `coordination/v1` API.

Other notable changes:

- `util.ObjectKey` was added to easily construct a `client.ObjectKey` /
  `types.NamespacedName` from a `metav1.Object`.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-01-11 17:41:49 +01:00
Stefan Prodan 62c2a375cb
Add readiness/liveness probes
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-11-26 10:09:44 +02:00
Hidde Beydals b3baf39e11 Move dedicated watcher to in-controller watcher
This prevents the resources from getting annotated, and instead uses
the `handler.EnqueueRequestsFromMapFunc` to queue requests based on
changes to the source objects.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2020-10-28 13:28:11 +01:00
Hidde Beydals 7ac2a41e1a Change copyright to Flux authors
Signed-off-by: Hidde Beydals <hello@hidde.co>
2020-10-27 17:55:18 +01:00
Stefan Prodan 1819f143a9
Implement Prometheus instrumentation
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-10-13 16:24:56 +03:00
Stefan Prodan 6a04f769b2
Update fluxcd/pkg/runtime to v0.1.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-10-13 15:33:46 +03:00
Hidde Beydals accd4762fe Promote API to v2beta1 2020-09-30 19:37:23 +02:00
stefanprodan b5844b2c77 Add watch all namespaces flag 2020-09-11 16:02:05 +03:00
stefanprodan 2a8d08091f Configure manager logging and set level to info 2020-09-09 15:13:02 +03:00
Hidde Beydals e66a93698b Setup production logging
For production the log format is JSON, the timestamps format is ISO8601
and stack traces are logged when the level is set to debug.
2020-07-13 10:33:59 +02:00
stefanprodan 948986c27e Update maintainers 2020-07-09 20:31:28 +03:00
stefanprodan b94dccac13 Implement event recording
- emit Kubernetes events when a release status changes
- forward events to notification controller
2020-07-09 20:19:08 +03:00
Hidde Beydals 6bded34a1b Various small changes
* newline fixes
* source-controller v0.0.2 -> v0.0.3
* pkg v0.0.2 -> v0.0.3
* const for HelmRelease kind
* wording change in README
* lockfile error message fix
2020-07-09 18:04:45 +02:00
Hidde Beydals dc606ea797 Support dependencies on other releases 2020-07-08 19:41:37 +02:00
Hidde Beydals 51da5f0fe2 Support Helm install action 2020-07-08 16:23:38 +02:00
Hidde Beydals 2a5c905145 Init 2020-05-05 22:35:49 +02:00