58 Commits

Author SHA1 Message Date
Stefan Prodan 72ec296d18
Allow cross-shard dependency check
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2024-09-26 21:15:31 +03:00
Stefan Prodan f76d6fe026
Update samples to GA APIs
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2024-05-03 17:12:22 +03:00
Stefan Prodan d0900635cf
Update `HelmChart` API to v1 (GA)
Bump source-controller to v1.3.0

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2024-05-03 13:43:51 +03:00
Hidde Beydals 54eed52a6b
Properly configure namespace selector
This accidentally did not get `if`-wrapped in
eaa2a8c2fe, which broke the configuration option to watch a single
namespace and, as a side effect, broke sharding.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-12-19 16:52:50 +01:00
Hidde Beydals 7a70bd599f
Allow configuration of digest algorithm
This introduces a `--snapshot-digest-algo` flag to allow configuring a
different algorithm than SHA256.

This allows the user to for example configure `blake3`, which is
potentially faster (and less resource intensive) on modern hardware.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-24 18:19:53 +01:00
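
As a rough illustration of what a configurable digest algorithm (commit 7a70bd599f) could look like, the sketch below selects a hash constructor by name. It uses only the Go standard library, so `blake3` is left out because it needs a third-party package. The `algorithms` map and the `digest` function are hypothetical names and are not taken from the controller.

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
	"hash"
)

// algorithms maps a configured algorithm name to a hash constructor.
// Hypothetical registry; a real setup could register more algorithms.
var algorithms = map[string]func() hash.Hash{
	"sha256": sha256.New,
	"sha512": sha512.New,
}

// digest returns "<algo>:<hex>" for data, or an error when the
// algorithm name is not registered.
func digest(algo string, data []byte) (string, error) {
	newHash, ok := algorithms[algo]
	if !ok {
		return "", fmt.Errorf("unsupported digest algorithm %q", algo)
	}
	h := newHash()
	h.Write(data) // hash.Hash.Write never returns an error
	return fmt.Sprintf("%s:%x", algo, h.Sum(nil)), nil
}

func main() {
	d, err := digest("sha256", []byte("snapshot"))
	if err != nil {
		panic(err)
	}
	fmt.Println(d)
}
```

Keeping the lookup in a map means an unknown flag value fails early with a clear error instead of silently falling back to a default.
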
Hidde Beydals 2d927b9b9e
Miscellaneous tidying of minor things
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-24 17:59:45 +01:00
Hidde Beydals eaa2a8c2fe
Update dependencies
- github.com/fluxcd/cli-utils to v0.36.0-flux.1
- github.com/fluxcd/pkg/apis/event to v0.6.0
- github.com/fluxcd/pkg/apis/kustomize to v1.2.0
- github.com/fluxcd/pkg/apis/meta to v1.2.0
- github.com/fluxcd/pkg/runtime to v0.43.0
- github.com/fluxcd/pkg/ssa to v0.34.0
- github.com/fluxcd/pkg/testserver to v0.5.0
- github.com/go-logr/logr to v1.3.0
- github.com/google/go-cmp to v0.6.0
- github.com/hashicorp/go-retryablehttp to v0.7.5
- github.com/onsi/gomega to v1.30.0
- github.com/opencontainers/go-digest to v1.0.1-0.20231025023718-d50d2fec9c98
- github.com/opencontainers/go-digest/blake3 to v0.0.0-20231025023718-d50d2fec9c98
- golang.org/x/text to v0.14.0
- helm.sh/helm/v3 to v3.13.2
- k8s.io/api to v0.28.4
- k8s.io/apiextensions-apiserver to v0.28.4
- k8s.io/apimachinery to v0.28.4
- k8s.io/cli-runtime to v0.28.4
- k8s.io/client-go to v0.28.4
- k8s.io/kubectl to v0.28.4
- k8s.io/utils to v0.0.0-20231121161247-cf03d44ff3cf
- sigs.k8s.io/controller-runtime to v0.16.3
- sigs.k8s.io/kustomize/api to v0.15.0
- sigs.k8s.io/kustomize/kyaml to v0.15.0
- sigs.k8s.io/yaml to v1.4.0

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-24 12:43:33 +01:00
Hidde Beydals fbd73ac399
controller: start w/ adding tests for HelmRelease
This adds base coverage for some of the simpler methods which do not
require extensive mocking.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-20 12:06:39 +01:00
Hidde Beydals 5e3ad5d21a
reconcile: add `HelmChartTemplate` sub-reconciler
"With hope comes the potential for both triumph and tribulation."

Due to difficulties beyond the time I have at hand at present[1], the
separate reconciler which took care of ensuring the HelmChart of the
HelmRelease was kept up-to-date has been transformed into a
sub-reconciler.

The behavior of the sub-reconciler remains largely unchanged, except for
the changes required to deal with the inability to requeue.
Effectively, this means that instead of e.g. deleting the HelmChart
object, requeueing, and creating it again, this is now handled in a
single operation, unless the deletion fails.

[1]: The core of the issue is that deregistration of finalizers becomes
difficult due to the behavior of the patch helper, and unavailability of
list merges for patch operations on Custom Resources within Kubernetes.

This means that when two reconcilers simultaneously work on the
deregistration of the finalizers, and one succeeds before the other, the
last finishing reconciler will attempt to add the finalizer of the other
reconciler back, as it did exist at the start of their reconciliation
run.

Attempts to work around this (for example, by using an optimistic lock
on the patch operation of the finalizers field) would cause new issues,
as Kubernetes will then delete the object as soon as the patch has
succeeded, and before the reconciliation process actually ends.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-20 12:06:38 +01:00
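
The finalizer race described in footnote [1] of commit 5e3ad5d21a can be sketched as a lost update on a list that is replaced wholesale. This is a simplified model, not the patch helper itself: `simulateRace` and `remove` are hypothetical names, and the two finalizer strings are used for illustration only.

```go
package main

import "fmt"

// remove returns a copy of list without item.
func remove(list []string, item string) []string {
	out := make([]string, 0, len(list))
	for _, v := range list {
		if v != item {
			out = append(out, v)
		}
	}
	return out
}

// simulateRace models two reconcilers that each snapshot the finalizers
// at the start of their run, remove only their own entry, and then write
// the whole list back because no list merge is available.
// It returns the stored list after the first and after the second write.
func simulateRace(initial []string, first, second string) (afterFirst, final []string) {
	snapFirst := append([]string(nil), initial...)
	snapSecond := append([]string(nil), initial...)

	// The first reconciler finishes and replaces the stored list.
	afterFirst = remove(snapFirst, first)
	// The second reconciler finishes later and replaces the list with a
	// result derived from its stale snapshot.
	final = remove(snapSecond, second)
	return afterFirst, final
}

func main() {
	initial := []string{"finalizers.fluxcd.io", "chart.finalizers.fluxcd.io"}
	afterFirst, final := simulateRace(initial, "chart.finalizers.fluxcd.io", "finalizers.fluxcd.io")
	fmt.Println(afterFirst) // [finalizers.fluxcd.io]
	fmt.Println(final)      // [chart.finalizers.fluxcd.io]
}
```

The final list contains the finalizer that the first reconciler had already removed, which is the re-adding behavior the commit message describes.
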
Hidde Beydals d802ba6cc1
controllers: roughly rewire HelmRelease reconciler
This adds the base wiring to get the controller to work with the
v2beta2 API and the newly introduced packages in `internal/`.

In essence, this means that from now on the controller will utilize all
new code for the reconciliation of the HelmRelease resource.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-11-20 12:06:35 +01:00
Hidde Beydals fe661df9d7
Move HelmChart handling to separate reconciler
This moves the HelmChart template handling to a separate reconciler,
with predicates detecting relevant changes. The idea is that this would
both facilitate working _without_ chart templates but with references
in the future, and reduce cognitive load while working with
reconciler logic.

The predicate uses `DeepEqual` from `k8s.io/apimachinery/pkg/api/equality`
to inspect the Chart template objects of the old and new HelmRelease
object in the update event.

The reconciler uses server-side apply to create or update the HelmChart
on the cluster, and emits an event based on the change set of the
action. It does not produce any diff yet, as the server-side apply
library at present does not provide a way to gain access to the "old"
versus "new" objects after performing an apply. The `diff` package
has however been prepared to allow diffing Unstructured objects.

As this reconciler has a separate life-cycle, a new
`chart.finalizers.fluxcd.io` finalizer has been introduced to ensure
a HelmChart is properly garbage collected before the HelmRelease is
allowed to be deleted.

The implementation on the release reconciler's end is a rough sketch,
but in working shape. The expectation is that much of the reconciler
will change once the release logic is adjusted to work with the earlier
introduced storage observer.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-11-20 12:02:40 +01:00
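
The change-detecting predicate from commit fe661df9d7 can be approximated with the standard library. The sketch below uses `reflect.DeepEqual` as a stand-in for `DeepEqual` from `k8s.io/apimachinery/pkg/api/equality`, whose semantics differ in some details. The `chartTemplate` and `updateEvent` types are hypothetical and much smaller than the real API types.

```go
package main

import (
	"fmt"
	"reflect"
)

// chartTemplate is a hypothetical, reduced stand-in for the chart
// template embedded in a HelmRelease.
type chartTemplate struct {
	Chart   string
	Version string
	Labels  map[string]string
}

// updateEvent carries the old and new template of an update.
type updateEvent struct {
	Old, New *chartTemplate
}

// templateChanged reports whether the update should trigger the
// reconciler, i.e. whether the two templates differ.
func templateChanged(e updateEvent) bool {
	return !reflect.DeepEqual(e.Old, e.New)
}

func main() {
	oldT := &chartTemplate{Chart: "podinfo", Version: "6.x"}
	sameT := &chartTemplate{Chart: "podinfo", Version: "6.x"}
	newT := &chartTemplate{Chart: "podinfo", Version: "7.x"}

	fmt.Println(templateChanged(updateEvent{Old: oldT, New: sameT})) // false
	fmt.Println(templateChanged(updateEvent{Old: oldT, New: newT}))  // true
}
```
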
Sunny 74e33a70c4
Delete stale metrics on object delete
Use the metrics helper to record all the metrics. The metrics helper
ensures that the metrics for deleted objects are deleted as well.

Move all the metrics recording to be performed at the very end of the
reconciliation. Realtime metrics for readiness are no longer recorded, as
they will be removed in a future version in favor of CRD metrics
collected using kube-state-metrics. Updating the object status with
realtime readiness should provide the readiness to CRD metrics watchers.

`HelmReleaseReconciler.reconcileDelete()` is modified to receive a
pointer to the HelmRelease object so that any modifications to the object
are reflected on the object instance that's passed to the metrics recorder.
This is not needed for `HelmReleaseReconciler.reconcile()` as it returns
a new copy of the object that's saved in the same object variable,
overwriting the object instance with the updates.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2023-08-15 02:42:09 +05:30
Hidde Beydals d76f3a355b
controller: jitter requeue interval
This adds a `--interval-jitter-percentage` flag to the controller to
add a +/- percentage jitter to the interval defined in a HelmRelease
(defaults to 5%).

Effectively, this results in a reconciliation every 9.5 - 10.5 minutes
for a resource with an interval of 10 minutes.

The main reason for this change is to mitigate spikes in memory and
CPU usage caused by many resources being configured with the same
interval.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-08-09 17:50:43 +02:00
Hidde Beydals d345af0e73
Rename controllers to controller
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-05-24 11:05:53 +02:00
Hidde Beydals 2ba28c6d9e
Update Kubernetes, controller-runtime and Helm
This commit updates Kubernetes to v1.27, controller-runtime to
v0.15, and Helm to v3.12.

It deals with various breaking changes in controller-runtime, as
documented in the release notes:
https://github.com/kubernetes-sigs/controller-runtime/releases/tag/v0.15.0

In short:

- `Watches` now use a `client.Object` instead of a `source.Kind`.
- `handler.MapFunc` signature accepts a Go context, which is used to
  log any errors, instead of silently ignoring them and/or panicking.
- Max concurrent reconciles is configured on the manager, instead of
  configuring them per reconciler instance.
- Various manager configuration options have been moved to new
  structures and/or fields.

In addition to this, all other dependencies which had updates
available are updated to their latest versions as well.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-05-24 10:39:51 +02:00
Hidde Beydals 08925bc282
Add reconciler sharding capability based on label
With this enhancement, the controller can be configured with
`--watch-label-selector`, after which only objects with this label will
be reconciled by the controller.

This allows for horizontal scaling of the helm-controller, where the
controller can be deployed multiple times, each with a unique label
selector which is used as the sharding key.

Note that if you want to ensure a `HelmChart` gets created for a
specific source-controller instance, you have to provide the labels for
this controller in `.spec.chart.metadata.labels` of the `HelmRelease`.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-29 15:37:53 +02:00
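
The label-based sharding from commit 08925bc282 boils down to each controller instance only accepting objects whose labels match its selector. The sketch below is deliberately simplified: it supports only comma-separated `key=value` terms, whereas real Kubernetes label selectors are richer. `parseSelector`, `matches`, and the label key in `main` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSelector parses "k1=v1,k2=v2" into a map. Only equality terms
// are supported in this sketch.
func parseSelector(s string) (map[string]string, error) {
	sel := map[string]string{}
	if strings.TrimSpace(s) == "" {
		return sel, nil
	}
	for _, part := range strings.Split(s, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("invalid selector term %q", part)
		}
		sel[k] = v
	}
	return sel, nil
}

// matches reports whether labels satisfy every term of the selector.
// An empty selector matches everything.
func matches(sel, labels map[string]string) bool {
	for k, v := range sel {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical sharding key, as it might be passed to
	// --watch-label-selector.
	sel, err := parseSelector("sharding.fluxcd.io/key=shard1")
	if err != nil {
		panic(err)
	}
	fmt.Println(matches(sel, map[string]string{"sharding.fluxcd.io/key": "shard1"}))
	fmt.Println(matches(sel, map[string]string{"sharding.fluxcd.io/key": "shard2"}))
}
```
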
Hidde Beydals 3615feef2a
Move `controllers` to `internal/controllers`
There is no good reason for it to be exposed and available through a
public API, and this follows the new kubebuilder defaults.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-27 17:25:31 +02:00
Hidde Beydals b732420f26
oomwatch: auto detect well known cgroup paths
This commit adds support for recognizing cgroup v1 paths, and allows for
the configuration of alternative absolute path locations using
`--oom-watch-max-memory-path` and `--oom-watch-current-memory-path`.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-10 15:08:17 +01:00
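
Auto-detection of well-known cgroup paths (commit b732420f26) could look roughly like the sketch below, which tries the cgroup v2 files first and then the cgroup v1 ones. The type and function names are hypothetical, and the candidate list is an assumption based on the common mount point `/sys/fs/cgroup`; explicitly configured paths would bypass this detection.

```go
package main

import (
	"fmt"
	"os"
)

// memoryPaths holds the files reporting current and maximum memory usage.
type memoryPaths struct {
	current, max string
}

// wellKnown lists candidate locations in order of preference.
var wellKnown = []memoryPaths{
	// cgroup v2
	{"/sys/fs/cgroup/memory.current", "/sys/fs/cgroup/memory.max"},
	// cgroup v1
	{"/sys/fs/cgroup/memory/memory.usage_in_bytes", "/sys/fs/cgroup/memory/memory.limit_in_bytes"},
}

// detect returns the first candidate for which both files exist.
func detect(candidates []memoryPaths, exists func(string) bool) (memoryPaths, bool) {
	for _, c := range candidates {
		if exists(c.current) && exists(c.max) {
			return c, true
		}
	}
	return memoryPaths{}, false
}

// fileExists reports whether path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	if p, ok := detect(wellKnown, fileExists); ok {
		fmt.Println("current:", p.current, "max:", p.max)
	} else {
		fmt.Println("no known cgroup memory files found")
	}
}
```
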
Hidde Beydals 3cb5b5c934
Use `logger.SetLogger` to also configure `klog`
This uses the newly introduced helper from runtime, which also
configures the logger for `klog`.

As a result, all logs are now properly formatted, even when logged by
internal Kubernetes elements like the leader election or a dynamic
client.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-08 00:26:27 +01:00
Hidde Beydals c4566a5459
oomwatch: small tweaks
- Change memory usage percent threshold to `uint8` to no longer allow
  fractions.
- Validate interval to prevent configurations `<50ms`.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-07 10:39:31 +01:00
Hidde Beydals 62456c94ff
Add OOM watcher to allow graceful shutdown
This commit introduces an OOM watcher, which can be enabled using
`--feature-gates=OOMWatch=true`. The OOM watcher watches the current
memory usage as reported by cgroups via `memory.current` and cancels
the context when it reaches a certain threshold compared to
`memory.max` (default `95`%, configurable using
`--oom-watch-memory-threshold`).

This allows ongoing Helm processes to gracefully exit with a failure
before the controller is forcefully OOM killed, preventing a deadlock
of releases in a pending state.

The OOM watcher polls the `memory.current` file on an interval (default
`500ms`, configurable using `--oom-watch-interval`), as subscribing to
file updates using inotify is not possible for cgroups (v2) except for
`*.events` files. `memory.events` does provide signals, but these will
generally be too late for our use case, as for example `high` equals
`max` in most containers, buying us little time to gracefully stop our
processes.

In addition, because we simply watch current usage compared to max
usage in bytes, this approach should work for cgroups v1 as well, given
this has (most of the time) files for these values available, albeit
at times at different locations. This commit does not introduce a flag
for these locations yet, but the library takes into account that they
could be configured at some point.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-07 10:39:19 +01:00
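
The polling approach described in commit 62456c94ff can be sketched as below: read the current usage on an interval and cancel a context once it reaches a percentage of the limit. This is an illustration under stated assumptions rather than the watcher's real code; `readBytes`, `overThreshold`, and `watch` are hypothetical names, and read errors are simply skipped.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// readBytes parses a cgroup memory file containing a single integer.
// cgroup v2 writes "max" to memory.max when no limit is set.
func readBytes(path string) (uint64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	s := strings.TrimSpace(string(b))
	if s == "max" {
		return 0, fmt.Errorf("no memory limit set in %s", path)
	}
	return strconv.ParseUint(s, 10, 64)
}

// overThreshold reports whether current is at or above percent of limit.
// Integer math avoids floating point rounding at the boundary.
func overThreshold(current, limit uint64, percent uint8) bool {
	if limit == 0 {
		return false
	}
	return current*100 >= limit*uint64(percent)
}

// watch polls currentPath every interval and calls cancel once usage
// crosses the threshold, giving in-flight work a chance to stop.
func watch(ctx context.Context, cancel context.CancelFunc, currentPath string, limit uint64, percent uint8, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			cur, err := readBytes(currentPath)
			if err != nil {
				continue // transient read errors are skipped in this sketch
			}
			if overThreshold(cur, limit, percent) {
				cancel()
				return
			}
		}
	}
}

func main() {
	fmt.Println(overThreshold(96, 100, 95)) // true
	fmt.Println(overThreshold(50, 100, 95)) // false
}
```

In a real process, `watch` would run in its own goroutine with the cancel function of the context that is handed to the long-running Helm actions.
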
Hidde Beydals 1240f20183
Enable experimental drift detection
This enables experimental drift detection of cluster state compared to
the current manifest data from the Helm storage's manifest blob.

Drift detection works based on the already proven approach of the
kustomize-controller's SSA package, and utilizes the managed field
configured by the controller since `v0.12.2`.

This feature is planned to go out of experimental once the further
controller rewrite has been finished, and the state of the Helm storage
itself is more fault tolerant.

Signed-off-by: Hidde Beydals <hidde@hhh.computer>
2023-03-01 09:36:43 +01:00
Aurel Canciu d2b52dece8
Align graceful-shutdown-timeout with terminationGracePeriodSeconds
This sets the default value for the graceful-shutdown-timeout flag to
match the default terminationGracePeriodSeconds value we set for the
controller pod container.
It seems the controller-runtime does not support passing -1 as a value
to skip the timeout as documented here:
https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/manager/manager.go#L286

Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
2022-12-21 19:11:27 +01:00
Hidde Beydals 2e96c92918
Set `--graceful-shutdown-timeout` default to `-1`
This is the correct default value as intended in #570.

xref: 92234b3c49/pkg/manager/manager.go (L292-L293)

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-12-20 15:07:42 +00:00
Mac Chaffee 9bcf125e2c
Disable caching of secrets and configmaps by default.
You can re-enable caching of secrets and configmaps by starting the
controller with the argument '--feature-gates=CacheSecretsAndConfigMaps=true'

Signed-off-by: Mac Chaffee <machaffe@renci.org>
2022-12-19 09:53:01 -05:00
Aurel Canciu e242bb0e8e
Allow overriding ctrl manager graceful shutdown timeout
This overrides the default GracefulShutdownTimeout option given to the
controller manager with a default of 0 (no timeout), since Helm
operations are sensitive to interruption, which can leave the
HelmRelease in a bad state.

This also allows users to configure how much time to wait before
forcibly exiting via the CLI flag `-graceful-shutdown-timeout`.

Related to #569

Signed-off-by: Aurel Canciu <aurelcanciu@gmail.com>
2022-11-25 12:29:53 +01:00
Hidde Beydals 1bed542fe4
internal/kube: get REST config from runtime
Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-05-12 12:55:36 +02:00
Hidde Beydals 4371610e4b
Cherry-pick kube changes from dev
This is a partial cherry-pick of commit ae4f499e87, including
changes around `kube`. This is to include some of the changes around the
construction of the ConfigFlags RESTClientGetter, as an attempt to
solve token refresh issues.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-05-12 12:18:41 +02:00
Alex Marston 42aaf61852
Add flags for exponential back-off retry
Signed-off-by: Alex Marston <alexander.marston@gmail.com>
2022-04-19 14:25:51 +02:00
Paulo Gomes 6f4ca28c9a
Add flags to control kubeconfig support
Two new flags were added to allow users to enable the
use of user.Exec and InsecureTLS in the kubeconfigs
provided for remote apply reconciliations.

Breaking change: both functionalities are no longer
enabled by default.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-31 14:51:38 +01:00
Sunny 6bd29a729d
Use new standardized runtime and meta package
This includes an update of the source-controller to v0.22.0, to pull in
the v1beta2 API which makes use of the same packages.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-18 13:10:32 +01:00
Stefan Prodan 0173eaa0df
Allow setting a default service account for impersonation
Introduce the flag `--default-service-account` to allow cluster admins to enforce impersonation for resource reconciliation.

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-01-31 12:09:39 +02:00
Somtochi Onyekwere acf164c46e
Add flag to disable cross namespace references
Signed-off-by: Somtochi Onyekwere <somtochionyekwere@gmail.com>
2022-01-29 13:51:06 +01:00
Stefan Prodan 049aca937a
Set the managed fields owner to helm-controller
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-10-25 16:18:31 +03:00
Hidde Beydals 43ee4dceee
Panic on non-nil AddToScheme errors in main init
Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-06-18 13:20:12 +02:00
Stefan Prodan 50eba699be
Use controller name in LeaderElectionID
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 14:09:34 +02:00
Stefan Prodan 3e558b3fc4
Remove deprecated log-json flag
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 13:31:13 +02:00
Stefan Prodan b74081dbf7
Update fluxcd/pkg/runtime to v0.10.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-25 12:41:48 +02:00
Stefan Prodan 7622dd9683
Add leader election deadline to cmd args
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-03-23 19:10:55 +02:00
Stefan Prodan a8dcafaf2e
Retry with exponential backoff when fetching artifacts
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-02-26 13:37:45 +02:00
Hidde Beydals 1459b11116
Enable pprof endpoints on metrics server
Using the helper from `pkg/runtime/pprof`, which follows the suggestion
from controller-runtime to use `AddMetricsExtraHandler`.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-02-07 12:41:17 +01:00
Stefan Prodan d072da6298
Update fluxcd/pkg/runtime to v0.8.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2021-01-21 19:09:04 +02:00
Hidde Beydals e511cb8af4
Upgrade controller-runtime to v0.7.0
This commit upgrades the `controller-runtime` dependency to `v0.7.0`,
including all changes required to make all wiring work again.

- Upgrade `runtime` to v0.6.0 to include `controller-runtime` changes.
- Loggers have been removed from the reconcilers and are now retrieved
  from the `context.Context` passed to the `Reconcile` method and
  downstream functions.
- Logger configuration flags are now bound to the flag set using
  `BindFlags` from `runtime/logger`, ensuring the same contract across
  GitOps Toolkit controllers, and the `--log-json` flag has been
  deprecated in favour of the `--log-encoding=json` default.
- The `ChangePredicate` from `runtime` has changed to a
  `ReconcilateAtChangedPredicate`, and is now chained with the
  `GenerationChangedPredicate` from `controller-runtime` using
  `predicate.Or`.
- Signatures that made use of `runtime.Object` have changed to
  `client.Object`, removing the requirement to e.g. call
  `runtime.Object#Object`.
- The `leader-election-role` was changed, as leader election now works
  via the `coordination/v1` API.

Other notable changes:

- `util.ObjectKey` was added to easily construct a `client.ObjectKey` /
  `types.NamespacedName` from a `metav1.Object`.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2021-01-11 17:41:49 +01:00
Stefan Prodan 62c2a375cb
Add readiness/liveness probes
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-11-26 10:09:44 +02:00
Hidde Beydals b3baf39e11
Move dedicated watcher to in-controller watcher
This prevents the resources from getting annotated, and instead uses
the `handler.EnqueueRequestsFromMapFunc` to queue requests based on
changes to the source objects.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2020-10-28 13:28:11 +01:00
Hidde Beydals 7ac2a41e1a
Change copyright to Flux authors
Signed-off-by: Hidde Beydals <hello@hidde.co>
2020-10-27 17:55:18 +01:00
Stefan Prodan 1819f143a9
Implement Prometheus instrumentation
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-10-13 16:24:56 +03:00
Stefan Prodan 6a04f769b2
Update fluxcd/pkg/runtime to v0.1.0
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2020-10-13 15:33:46 +03:00
Hidde Beydals accd4762fe
Promote API to v2beta1
2020-09-30 19:37:23 +02:00
stefanprodan b5844b2c77
Add watch all namespaces flag
2020-09-11 16:02:05 +03:00