This accidentally did not get `if`-wrapped in
eaa2a8c2fe, breaking the configuration
option to watch a single namespace, and thereby as by-effect the
breakage of sharding.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This introduces a `--snapshot-digest-algo` flag to allow configuring a
different algorithm than SHA256.
This allows the user to for example configure `blake3`, which is
potentially faster (and less resource intensive) on modern hardware.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
- github.com/fluxcd/cli-utils to v0.36.0-flux.1
- github.com/fluxcd/pkg/apis/event to v0.6.0
- github.com/fluxcd/pkg/apis/kustomize to v1.2.0
- github.com/fluxcd/pkg/apis/meta to v1.2.0
- github.com/fluxcd/pkg/runtime to v0.43.0
- github.com/fluxcd/pkg/ssa to v0.34.0
- github.com/fluxcd/pkg/testserver to v0.5.0
- github.com/go-logr/logr to v1.3.0
- github.com/google/go-cmp to v0.6.0
- github.com/hashicorp/go-retryablehttp to v0.7.5
- github.com/onsi/gomega to v1.30.0
- github.com/opencontainers/go-digest to v1.0.1-0.20231025023718-d50d2fec9c98
- github.com/opencontainers/go-digest/blake3 to v0.0.0-20231025023718-d50d2fec9c98
- golang.org/x/text to v0.14.0
- helm.sh/helm/v3 to v3.13.2
- k8s.io/api to v0.28.4
- k8s.io/apiextensions-apiserver to v0.28.4
- k8s.io/apimachinery to v0.28.4
- k8s.io/cli-runtime to v0.28.4
- k8s.io/client-go to v0.28.4
- k8s.io/kubectl to v0.28.4
- k8s.io/utils to v0.0.0-20231121161247-cf03d44ff3cf
- sigs.k8s.io/controller-runtime to v0.16.3
- sigs.k8s.io/kustomize/api to v0.15.0
- sigs.k8s.io/kustomize/kyaml to v0.15.0
- sigs.k8s.io/yaml to v1.4.0
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
"With hope comes the potential for both triumph and tribulation."
Due to difficulties beyond the time I have at hands at present[1], the
separate reconciler which took care of ensuring the HelmChart of the
HelmRelease was kept up-to-date has been transformed into a
sub-reconciler.
The behavior of the sub-reconciler remains largely unchanged, except the
required changes to deal with the lack of possibilities to requeue.
Effectively, this means that instead of e.g. deleting the HelmChart
object, requeue, and create it again. This is now handled in a single
operation, unless the deletion fails.
[1]: The core of the issue is that deregistration of finalizers becomes
difficult due to the behavior of the patch helper, and unavailability of
list merges for patch operations on Custom Resources within Kubernetes.
This means that when two reconcilers simultaneously work on the
deregistration of the finalizers, and one succeeds before the other. The
last finishing reconciler will attempt to add the finalizer of the other
reconciler back, as it did exist at the start of their reconciliation
run.
Attempts to work around this (for example, by using an optimistic lock
on the patch operation of the finalizers field) would cause new issues.
As Kubernetes will then delete the object as soon as the patch has
succeeded, and before the reconciliation process actually ends.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This adds the base wiring to get the controller to work with the
v2beta2 API and the newly introduced packages in `internal/`.
In essence, this means that from now on the controller will utilize all
new code for the reconciliation of the HelmRelease resource.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This moves the HelmChart template handling to a separate reconciler,
with predicates detecing relevant changes. The idea is that this would
both facilitate working _without_ chart templates but with references
in the future, and to reduce cognitive load while working with
reconciler logic.
The predicate uses `DeepEqual` from `k8s.io/apimachinery/pkg/api/equality`
to inspect the Chart template objects of the old and new HelmRelease
object in the update event.
The reconciler uses server-side apply to create or update the HelmChart
on the cluster, and emits an event based on the change set of the
action. It does not produce any diff yet, as the server-side apply
library at present does not provide a way to gain access to an "old"
versus "new" objects after performing an apply. The `diff` package
has however been prepared to allow diffing Unstructured objects.
As this reconciler has a separate life-cycle, a new
`chart.finalizers.fluxcd.io` finalizer has been introduced to ensure
a HelmChart is properly garbage collected before the HelmRelease is
allowed to be deleted.
The implementation on the release reconciler's end is a rough sketch,
but in working shape. The foresight is that much of the reconciler will
change when the release logic will be adjusted to work with the earlier
introduced storage observer.
Signed-off-by: Hidde Beydals <hello@hidde.co>
Use the metrics helper to record all the metrics. Metrics helpers
ensures that the metrics for deleted objects are deleted as well.
Move all the metrics recording to be performed at the very end of the
reconciliation. Realtime metrics for readiness is no longer recorded as
it will be removed in a future version for CRD metrics collected using
kube-state-metrics. Updating the object status with realtime readiness
should provide the readiness to CRD metrics watchers.
`HelmReleaseReconciler.reconcileDelete()` is modified to receive a
pointer HelmRelease object so that any modifications on the object is
reflected on the object instance that's passed to the metrics recorder.
This is not needed for `HelmReleaseReconciler.reconcile()` as it returns
a new copy of the object that's saved in the same object variable,
overwriting the object instance with the updates.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
This adds a `--interval-jitter-percentage` flag to the controller to
add a +/- percentage jitter to the interval defined in a HelmRelease
(defaults to 5%).
Effectively, this results in a reconciliation every 9.5 - 10.5 minutes
for a resource with an interval of 10 minutes.
Main reason to add this change is to mitigate spikes in memory and
CPU usage caused by many resources being configured with the same
interval.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit updates Kubernetes to v1.27, controller-runtime to
v0.15, and Helm to v3.12.
It deals with various breaking changes in controller-runtime, as
documented in the release notes:
https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.15.0
In short:
- `Watches` now use a `client.Object` instead of a `source.Kind`.
- `handler.MapFunc` signature accepts a Go context, which is used to
log any errors, instead of silently ignoring them and/or panicking.
- Max concurrent reconciles is configured on the manager, instead of
configuring them per reconciler instance.
- Various manager configuration options have been moved to new
structures and/or fields.
In addition to this, all other dependencies which had updates
available are updated to their latest versions as well.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
With this enhancement, the controller can be configured with
`--watch-label-selector`, after which only objects with this label will
be reconciled by the controller.
This allows for horizontal scaling of the helm-controller, where each
controller can be deployed multiple times with a unique label selector
which is used as the sharding key.
Note that if you want to ensure a `HelmChart` gets created for a
specific source-controller instance, you have to provide the labels for
this controller in `.spec.chart.metadata.labels` of the `HelmRelease`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
There is no good reason for it to be exposed and available through a
public API, and this follows the new kubebuilder defaults.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit adds support for recognizing cgroup v1 paths, and allows for
the configuration of alternative absolute path locations using
`--oom-watch-max-memory-path` and `--oom-watch-current-memory-path`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This uses the newly introduced helper from runtime, which also
configures the logger for `klog`.
Resulting in all logs now being properly formatted, even when logged by
internal Kubernetes elements like the leader election or a dynamic
client.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit introduces an OOM watcher, which can be enabled using
`--feature-gates=OOMWatch=true`. The OOM watcher watches the current
memory usage as reported by cgroups via `memory.current` and cancels
the context when it reaches a certain threshold compared to
`memory.max` (default `95`%, configurable using
`--oom-watch-memory-threshold`).
This allows ongoing Helm processes to gracefully exit with a failure
before the controller is forcefully OOM killed, preventing a deadlock
of releases in a pending state.
The OOM watcher polls the `memory.current` file on an interval (default
`500ms`, configurable using `--oom-watch-interval`), as subscribing to
file updates using inotify is not possible for cgroups (v2) except for
`*.events` files. Which does provide signals using `memory.events`, but
these will generally be too late for our use case. As for example `high`
equals `max` in most containers, buying us little time to gracefully
stop our processes.
In addition, because we simply watch current usage compared to max
usage in bytes. This approach should work for cgroups v1 as well, given
this has (most of the time) files for these values available, albeit
at times at different locations. For which this commit does not
introduce a flag yet, but the library takes into account that it could
be configured at some point.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This enables experimental drift detection of cluster state compared to
the current manifest data from the Helm storage's manifest blob.
Drift detection works based on the already proven approach of the
kustomize-controller's SSA package, and utilizes the managed field
configured by the controller since `v0.12.2`.
This feature is planned to go out of experimental once the further
controller rewrite has been finished, and the state of the Helm storage
itself is more fault tolerant.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Setting the default value for the graceful-shutdown-timeout flag to
match the default terminationGracePeriodSeconds value we set for the
controller pod container.
It seems the controller-runtime does not support passing -1 as a value
to skip the timeout as documented here:
https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/manager/manager.go#L286
Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
You can re-enabled caching of secrets by starting the
controller with the argument '--feature-gates=CacheSecretsAndConfigMaps=true'
Signed-off-by: Mac Chaffee <machaffe@renci.org>
Overriding the default GracefulShutdownTimeout option given to the
controller manager with a default of 0 (no timeout) since the helm
operations are sensitive to interruption and can lead to leaving the
HelmRelease in a bad state.
This will also allow users to override the option via a cli flag
`-graceful-shutdown-timeout` how much time to wait before forcibly
exiting.
Related to #569
Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
This is a partial cherry-pick of commit ae4f499e87, including
changes around `kube`. This to include some of the changes around the
construction of the ConfigFlags RESTClientGetter, as an attempt to
solve token refresh issues.
Signed-off-by: Hidde Beydals <hello@hidde.co>
Two new flags were added to allow users to enable the
use of user.Exec and InsecureTLS in the kubeconfigs
provided remote apply reconciliations.
Breaking change: both functionalities are no longer
enabled by default.
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
This includes an update of the source-controller to v0.22.0, to pull in
the v1beta2 API which makes use of the same packages.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
Introduce the flag `--default-service-account` for allowing cluster admins to enforce impersonation for resources reconciliation.
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Using the helper from `pkg/runtime/pprof`, which follows the suggestion
from controller-runtime to use `AddMetricsExtraHandler`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This commit upgrades the `controller-runtime` dependency to `v0.7.0`,
including all changes required to make all wiring work again.
- Upgrade `runtime` to v0.6.0 to include `controller-runtime` changes.
- Loggers have been removed from the reconcilers and are now retrieved
from the `context.Context` passed to the `Reconcile` method and
downwards functions.
- Logger configuration flags are now bound to the flag set using
`BindFlags` from `runtime/logger`, ensuring the same contract across
GitOps Toolkit controllers, and the `--log-json` flag has been
deprecated in favour of the `--log-encoding=json` default.
- The `ChangePredicate` from `runtime` has changed to a
`ReconcilateAtChangedPredicate`, and is now chained with the
`GenerationChangedPredicate` from `controller-runtime` using
`predicate.Or`.
- Signatures that made use of `runtime.Object` have changed to
`client.Object`, removing the requirement to e.g. call
`runtime.Object#Object`.
- The `leader-election-role` was changed, as leader election now works
via the `coordination/v1` API.
Other notable changes:
- `util.ObjectKey` was added to easily construct a `client.ObjectKey` /
`types.NamespacedName` from a `metav1.Object`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This prevents the resources from getting annotated, and instead uses
the `handler.EnqueueRequestsFromMapFunc` to queue requests based on
changes to the source objects.
Signed-off-by: Hidde Beydals <hello@hidde.co>