This propagates the reason a HelmChart is (likely) not ready to the
message of the Ready condition.
The goal of this is to make it easier for people to reason about a
potential failure that may be happening while retrieving the chart,
without having to inspect the HelmChart itself.
As at times, they may not have access (due to e.g. not being able to
access the namespace, while the controller is allowed to create the
object there), or are simply not aware of the fact that this object
is created by the controller for them.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This ensures that if there is no target to roll back to due to all of
them being in a failed state, the controller stalls instead of ending up
in a loop of upgrade attempts.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This ensures that any unfulfilled dependencies for which we requeue do
not prematurely bump the observed generation by introducing typed
errors.
These typed errors ensure that the logic to bump the observed generation
can continue to be the same, while ignoring them just in time before
returning the final error.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This makes the controller actually take the
`reconcile.fluxcd.io/forceAt` and `reconcile.fluxcd.io/resetAt` into
account.
For `reconcile.fluxcd.io/resetAt`, this means that the failure counts on
the `HelmRelease` object are reset when the token value of the
annotation equals `reconcile.fluxcd.io/requestedAt`. Allowing the
controller to start over with attempting to install or upgrade the
release until the retries count has been reached again.
For `reconcile.fluxcd.io/forceAt`, this means that a one-off Helm
install or upgrade is allowed to take place even if the object is out of
retries, in a failed state where it should be remediated, or in-sync.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
- github.com/fluxcd/cli-utils to v0.36.0-flux.1
- github.com/fluxcd/pkg/apis/event to v0.6.0
- github.com/fluxcd/pkg/apis/kustomize to v1.2.0
- github.com/fluxcd/pkg/apis/meta to v1.2.0
- github.com/fluxcd/pkg/runtime to v0.43.0
- github.com/fluxcd/pkg/ssa to v0.34.0
- github.com/fluxcd/pkg/testserver to v0.5.0
- github.com/go-logr/logr to v1.3.0
- github.com/google/go-cmp to v0.6.0
- github.com/hashicorp/go-retryablehttp to v0.7.5
- github.com/onsi/gomega to v1.30.0
- github.com/opencontainers/go-digest to v1.0.1-0.20231025023718-d50d2fec9c98
- github.com/opencontainers/go-digest/blake3 to v0.0.0-20231025023718-d50d2fec9c98
- golang.org/x/text to v0.14.0
- helm.sh/helm/v3 to v3.13.2
- k8s.io/api to v0.28.4
- k8s.io/apiextensions-apiserver to v0.28.4
- k8s.io/apimachinery to v0.28.4
- k8s.io/cli-runtime to v0.28.4
- k8s.io/client-go to v0.28.4
- k8s.io/kubectl to v0.28.4
- k8s.io/utils to v0.0.0-20231121161247-cf03d44ff3cf
- sigs.k8s.io/controller-runtime to v0.16.3
- sigs.k8s.io/kustomize/api to v0.15.0
- sigs.k8s.io/kustomize/kyaml to v0.15.0
- sigs.k8s.io/yaml to v1.4.0
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This allows the controller to be updated from `v2beta1` to `v2beta2`
without triggering a release to settle state.
It does this by looking at the previous successful release as recorded
for the `v2beta1` object, and if found, recording a snapshot for it in
the new `History` field of the status.
This feature can be disabled by setting the `AdoptLegacyReleases`
feature flag to `false`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This decouples the state determination from deciding which action to
take, making it easier to reason about the different types of state
and what action should be taken to drive it forward.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This ensures the logs of the Kubernetes client used by Helm are
persisted to the log buffer, as they can contain important information
when an action times out.
In addition, move the logs from the Helm actions themselves to the
"debug" log level (while still including them in Kubernetes Events in
case of a failure), in favor of the logs produced by the `reconcile`
package itself. While moving the logs from the Helm storage to the
"trace" log level, as they only contain information about e.g. writes
to a Secret.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
The primary reason for this is the alphabetical ordering of `kubectl
describe`, which caused the fields to be listed in separate places
instead of a bundle.
From a programmatic perspective, it is also great because it is now much
easier to reset any previous state when e.g. uninstalling a release. As
we can simply write an empty struct to erase any memory of a previous
release, instead of having to deal with multiple fields.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This ensures certain edge-cases around the availability of the service
account and/or KubeConfig are handled.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
"With hope comes the potential for both triumph and tribulation."
Due to difficulties beyond the time I have at hands at present[1], the
separate reconciler which took care of ensuring the HelmChart of the
HelmRelease was kept up-to-date has been transformed into a
sub-reconciler.
The behavior of the sub-reconciler remains largely unchanged, except the
required changes to deal with the lack of possibilities to requeue.
Effectively, this means that instead of e.g. deleting the HelmChart
object, requeue, and create it again. This is now handled in a single
operation, unless the deletion fails.
[1]: The core of the issue is that deregistration of finalizers becomes
difficult due to the behavior of the patch helper, and unavailability of
list merges for patch operations on Custom Resources within Kubernetes.
This means that when two reconcilers simultaneously work on the
deregistration of the finalizers, and one succeeds before the other. The
last finishing reconciler will attempt to add the finalizer of the other
reconciler back, as it did exist at the start of their reconciliation
run.
Attempts to work around this (for example, by using an optimistic lock
on the patch operation of the finalizers field) would cause new issues.
As Kubernetes will then delete the object as soon as the patch has
succeeded, and before the reconciliation process actually ends.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
When an object is marked as under deletion, the API server will reject
any attempt to register new finalizers. Given this, handling the
deletion timestamp always has to come before an attempt to register
the finalizer.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This ensures they both have the same observation on the last
modifications made to the object. Preventing possible scenarios where
a condition would not be removed because it wasn't set at the start of
the reconcile run, then added, and then removed. Causing it to go
unnoticed during the diff calculation.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This adds the base wiring to get the controller to work with the
v2beta2 API and the newly introduced packages in `internal/`.
In essence, this means that from now on the controller will utilize all
new code for the reconciliation of the HelmRelease resource.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This commit moves various generic bits out of the reconciler into
separate modules, while adding more test coverage.
Some of the logic around merging chart values from references has been
improved to work with `client.Object`, instead of two separate maps.
In addition, the option to override the hostname of an Artifact has
been removed. It was undocumented and for testing purposes only, which
these days can be better achieved by e.g. configuring the
`--storage-adv-addr`.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This moves the HelmChart template handling to a separate reconciler,
with predicates detecing relevant changes. The idea is that this would
both facilitate working _without_ chart templates but with references
in the future, and to reduce cognitive load while working with
reconciler logic.
The predicate uses `DeepEqual` from `k8s.io/apimachinery/pkg/api/equality`
to inspect the Chart template objects of the old and new HelmRelease
object in the update event.
The reconciler uses server-side apply to create or update the HelmChart
on the cluster, and emits an event based on the change set of the
action. It does not produce any diff yet, as the server-side apply
library at present does not provide a way to gain access to an "old"
versus "new" objects after performing an apply. The `diff` package
has however been prepared to allow diffing Unstructured objects.
As this reconciler has a separate life-cycle, a new
`chart.finalizers.fluxcd.io` finalizer has been introduced to ensure
a HelmChart is properly garbage collected before the HelmRelease is
allowed to be deleted.
The implementation on the release reconciler's end is a rough sketch,
but in working shape. The foresight is that much of the reconciler will
change when the release logic will be adjusted to work with the earlier
introduced storage observer.
Signed-off-by: Hidde Beydals <hello@hidde.co>
This addresses an issue in which the defunct `DefaultServiceAccount`
from the `HelmReleaseReconciler` was being used to construct the
impersonator used by the differ.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
To prevent spurious newlines between the error message and the captured
logs, as at times Helm ends error with one or multiple newlines.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Use the metrics helper to record all the metrics. Metrics helpers
ensures that the metrics for deleted objects are deleted as well.
Move all the metrics recording to be performed at the very end of the
reconciliation. Realtime metrics for readiness is no longer recorded as
it will be removed in a future version for CRD metrics collected using
kube-state-metrics. Updating the object status with realtime readiness
should provide the readiness to CRD metrics watchers.
`HelmReleaseReconciler.reconcileDelete()` is modified to receive a
pointer HelmRelease object so that any modifications on the object is
reflected on the object instance that's passed to the metrics recorder.
This is not needed for `HelmReleaseReconciler.reconcile()` as it returns
a new copy of the object that's saved in the same object variable,
overwriting the object instance with the updates.
Signed-off-by: Sunny <darkowlzz@protonmail.com>
This adds a `--interval-jitter-percentage` flag to the controller to
add a +/- percentage jitter to the interval defined in a HelmRelease
(defaults to 5%).
Effectively, this results in a reconciliation every 9.5 - 10.5 minutes
for a resource with an interval of 10 minutes.
Main reason to add this change is to mitigate spikes in memory and
CPU usage caused by many resources being configured with the same
interval.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Likely after the upgrade to controller-runtime v0.15.0 a regression
surfaced for long-running reconciliations of HelmRelease resources (e.g.
for charts having pre-upgrade hooks taking a few minutes to complete).
This regression would cause the controller to immediately re-run the
upgrade after a successful upgrade, thus entering an almost-endless
loop.
Apparently, the only fix to this issue is to ensure
`.Status.LastReleaseRevision` is updated as soon as possible in the
reconiliation cycle rather than wait for the update at the end of the
cycle.
Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>