Use the metrics helper to record all the metrics. Metrics helpers
ensures that the metrics for deleted objects are deleted as well.
Move all the metrics recording to be performed at the very end of the
reconciliation. Realtime metrics for readiness is no longer recorded as
it will be removed in a future version for CRD metrics collected using
kube-state-metrics. Updating the object status with realtime readiness
should provide the readiness to CRD metrics watchers.
`HelmReleaseReconciler.reconcileDelete()` is modified to receive a
pointer HelmRelease object so that any modifications on the object is
reflected on the object instance that's passed to the metrics recorder.
This is not needed for `HelmReleaseReconciler.reconcile()` as it returns
a new copy of the object that's saved in the same object variable,
overwriting the object instance with the updates.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
This adds a `--interval-jitter-percentage` flag to the controller to
add a +/- percentage jitter to the interval defined in a HelmRelease
(defaults to 5%).
Effectively, this results in a reconciliation every 9.5 - 10.5 minutes
for a resource with an interval of 10 minutes.
Main reason to add this change is to mitigate spikes in memory and
CPU usage caused by many resources being configured with the same
interval.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit updates Kubernetes to v1.27, controller-runtime to
v0.15, and Helm to v3.12.
It deals with various breaking changes in controller-runtime, as
documented in the release notes:
https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.15.0
In short:
- `Watches` now use a `client.Object` instead of a `source.Kind`.
- `handler.MapFunc` signature accepts a Go context, which is used to
log any errors, instead of silently ignoring them and/or panicking.
- Max concurrent reconciles is configured on the manager, instead of
configuring them per reconciler instance.
- Various manager configuration options have been moved to new
structures and/or fields.
In addition to this, all other dependencies which had updates
available are updated to their latest versions as well.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
With this enhancement, the controller can be configured with
`--watch-label-selector`, after which only objects with this label will
be reconciled by the controller.
This allows for horizontal scaling of the helm-controller, where each
controller can be deployed multiple times with a unique label selector
which is used as the sharding key.
Note that if you want to ensure a `HelmChart` gets created for a
specific source-controller instance, you have to provide the labels for
this controller in `.spec.chart.metadata.labels` of the `HelmRelease`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
There is no good reason for it to be exposed and available through a
public API, and this follows the new kubebuilder defaults.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit adds support for recognizing cgroup v1 paths, and allows for
the configuration of alternative absolute path locations using
`--oom-watch-max-memory-path` and `--oom-watch-current-memory-path`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This uses the newly introduced helper from runtime, which also
configures the logger for `klog`.
Resulting in all logs now being properly formatted, even when logged by
internal Kubernetes elements like the leader election or a dynamic
client.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit introduces an OOM watcher, which can be enabled using
`--feature-gates=OOMWatch=true`. The OOM watcher watches the current
memory usage as reported by cgroups via `memory.current` and cancels
the context when it reaches a certain threshold compared to
`memory.max` (default `95`%, configurable using
`--oom-watch-memory-threshold`).
This allows ongoing Helm processes to gracefully exit with a failure
before the controller is forcefully OOM killed, preventing a deadlock
of releases in a pending state.
The OOM watcher polls the `memory.current` file on an interval (default
`500ms`, configurable using `--oom-watch-interval`), as subscribing to
file updates using inotify is not possible for cgroups (v2) except for
`*.events` files. Which does provide signals using `memory.events`, but
these will generally be too late for our use case. As for example `high`
equals `max` in most containers, buying us little time to gracefully
stop our processes.
In addition, because we simply watch current usage compared to max
usage in bytes. This approach should work for cgroups v1 as well, given
this has (most of the time) files for these values available, albeit
at times at different locations. For which this commit does not
introduce a flag yet, but the library takes into account that it could
be configured at some point.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This enables experimental drift detection of cluster state compared to
the current manifest data from the Helm storage's manifest blob.
Drift detection works based on the already proven approach of the
kustomize-controller's SSA package, and utilizes the managed field
configured by the controller since `v0.12.2`.
This feature is planned to go out of experimental once the further
controller rewrite has been finished, and the state of the Helm storage
itself is more fault tolerant.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Setting the default value for the graceful-shutdown-timeout flag to
match the default terminationGracePeriodSeconds value we set for the
controller pod container.
It seems the controller-runtime does not support passing -1 as a value
to skip the timeout as documented here:
https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/manager/manager.go#L286
Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
You can re-enabled caching of secrets by starting the
controller with the argument '--feature-gates=CacheSecretsAndConfigMaps=true'
Signed-off-by: Mac Chaffee <machaffe@renci.org>
Overriding the default GracefulShutdownTimeout option given to the
controller manager with a default of 0 (no timeout) since the helm
operations are sensitive to interruption and can lead to leaving the
HelmRelease in a bad state.
This will also allow users to override the option via a cli flag
`-graceful-shutdown-timeout` how much time to wait before forcibly
exiting.
Related to #569
Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
This is a partial cherry-pick of commit ae4f499e87, including
changes around `kube`. This to include some of the changes around the
construction of the ConfigFlags RESTClientGetter, as an attempt to
solve token refresh issues.
Signed-off-by: Hidde Beydals <hello@hidde.co>
Two new flags were added to allow users to enable the
use of user.Exec and InsecureTLS in the kubeconfigs
provided remote apply reconciliations.
Breaking change: both functionalities are no longer
enabled by default.
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
This includes an update of the source-controller to v0.22.0, to pull in
the v1beta2 API which makes use of the same packages.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Introduce the flag `--default-service-account` for allowing cluster admins to enforce impersonation for resources reconciliation.
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Using the helper from `pkg/runtime/pprof`, which follows the suggestion
from controller-runtime to use `AddMetricsExtraHandler`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit upgrades the `controller-runtime` dependency to `v0.7.0`,
including all changes required to make all wiring work again.
- Upgrade `runtime` to v0.6.0 to include `controller-runtime` changes.
- Loggers have been removed from the reconcilers and are now retrieved
from the `context.Context` passed to the `Reconcile` method and
downwards functions.
- Logger configuration flags are now bound to the flag set using
`BindFlags` from `runtime/logger`, ensuring the same contract across
GitOps Toolkit controllers, and the `--log-json` flag has been
deprecated in favour of the `--log-encoding=json` default.
- The `ChangePredicate` from `runtime` has changed to a
`ReconcilateAtChangedPredicate`, and is now chained with the
`GenerationChangedPredicate` from `controller-runtime` using
`predicate.Or`.
- Signatures that made use of `runtime.Object` have changed to
`client.Object`, removing the requirement to e.g. call
`runtime.Object#Object`.
- The `leader-election-role` was changed, as leader election now works
via the `coordination/v1` API.
Other notable changes:
- `util.ObjectKey` was added to easily construct a `client.ObjectKey` /
`types.NamespacedName` from a `metav1.Object`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This prevents the resources from getting annotated, and instead uses
the `handler.EnqueueRequestsFromMapFunc` to queue requests based on
changes to the source objects.
Signed-off-by: Hidde Beydals <hello@hidde.co>