This commit introduces an OOM watcher, which can be enabled using
`--feature-gates=OOMWatch=true`. The OOM watcher watches the current
memory usage as reported by cgroups via `memory.current` and cancels
the context when it reaches a certain threshold compared to
`memory.max` (default `95`%, configurable using
`--oom-watch-memory-threshold`).
This allows ongoing Helm processes to gracefully exit with a failure
before the controller is forcefully OOM killed, preventing a deadlock
of releases in a pending state.
The OOM watcher polls the `memory.current` file on an interval (default
`500ms`, configurable using `--oom-watch-interval`), as subscribing to
file updates using inotify is not possible for cgroups (v2) except for
`*.events` files. `memory.events` does provide signals, but these will
generally arrive too late for our use case: for example, `high` equals
`max` in most containers, buying us little time to gracefully stop our
processes.
In addition, because we simply compare current usage to max usage in
bytes, this approach should work for cgroups v1 as well, given it has
(most of the time) files for these values available, albeit at times
at different locations. This commit does not introduce a flag for this
yet, but the library takes into account that it could be configured at
some point.
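
As an illustration of the polling approach described above, here is a
minimal sketch. It is not the code from this commit: the function names
(`parseBytes`, `exceeds`, `watch`) and the `/sys/fs/cgroup` mount point
are assumptions made for the example, and only the `95`% threshold and
`500ms` interval come from the defaults mentioned above.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// parseBytes parses the content of a cgroup memory file. The literal
// "max" (no limit configured) is reported as limited == false.
func parseBytes(raw string) (uint64, bool, error) {
	s := strings.TrimSpace(raw)
	if s == "max" {
		return 0, false, nil
	}
	n, err := strconv.ParseUint(s, 10, 64)
	return n, err == nil, err
}

// exceeds reports whether current is at or above percent of limit.
// Integer arithmetic avoids floating point rounding at the boundary.
func exceeds(current, limit uint64, percent uint8) bool {
	if limit == 0 {
		return false
	}
	return current*100 >= limit*uint64(percent)
}

// watch polls currentPath on every tick and calls cancel once usage
// reaches the threshold. Read and parse errors are skipped.
func watch(ctx context.Context, cancel context.CancelFunc, currentPath string,
	limit uint64, percent uint8, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			raw, err := os.ReadFile(currentPath)
			if err != nil {
				continue
			}
			cur, ok, err := parseBytes(string(raw))
			if err != nil || !ok {
				continue
			}
			if exceeds(cur, limit, percent) {
				cancel()
				return
			}
		}
	}
}

func main() {
	const dir = "/sys/fs/cgroup" // assumed cgroup v2 mount point
	raw, err := os.ReadFile(dir + "/memory.max")
	if err != nil {
		fmt.Println("cgroup memory files not available:", err)
		return
	}
	limit, limited, err := parseBytes(string(raw))
	if err != nil || !limited {
		fmt.Println("no memory limit set, nothing to watch")
		return
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go watch(ctx, cancel, dir+"/memory.current", limit, 95, 500*time.Millisecond)

	// The demo only waits briefly; a controller would hand ctx to its
	// long-running work instead.
	select {
	case <-ctx.Done():
		fmt.Println("memory threshold reached, stopping gracefully")
	case <-time.After(2 * time.Second):
		fmt.Println("memory usage stayed below the threshold")
	}
}
```

Passing the cancelled context down to running Helm actions is what
would let them fail gracefully before the kernel kills the process.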
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
As otherwise with a persistent discovery client and/or REST mapper
configuration, newly installed CRDs will not be recognized and cause a
`resource mapping not found for name` error.
In addition, remove the `ServerGroups` and `Invalidate` calls, as this
is later done (again) by Helm when gathering server capabilities.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This drops the twofold implementation in favor of a single
`MemoryRESTClientGetter` which can work with an arbitrary `rest.Config`.
The new `MemoryRESTClientGetter` lazy-loads and caches the objects it
initializes, thereby creating at most one instance of each object for
the duration of the reconcile of a single `HelmRelease` object.
Based on some initial tests, this seems to reduce the overall memory
footprint of the controller.
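
The lazy-load-and-cache behaviour described above can be sketched with
`sync.Once`. This is an illustration only, not the implementation of
`MemoryRESTClientGetter`: the `lazy` and `clientGetter` types are
hypothetical, and strings stand in for the real client objects.

```go
package main

import (
	"fmt"
	"sync"
)

// lazy builds a value on first use and returns the cached result (or
// the cached error) on every later call.
type lazy[T any] struct {
	once  sync.Once
	build func() (T, error)
	val   T
	err   error
}

func (l *lazy[T]) get() (T, error) {
	l.once.Do(func() { l.val, l.err = l.build() })
	return l.val, l.err
}

// clientGetter is a hypothetical stand-in for a getter that is created
// once per reconcile and dropped afterwards.
type clientGetter struct {
	discovery lazy[string]
	mapper    lazy[string]
}

func newClientGetter(host string) *clientGetter {
	g := &clientGetter{}
	g.discovery.build = func() (string, error) {
		fmt.Println("building discovery client")
		return "discovery for " + host, nil
	}
	g.mapper.build = func() (string, error) {
		// The mapper reuses the cached discovery client.
		d, err := g.discovery.get()
		if err != nil {
			return "", err
		}
		fmt.Println("building REST mapper")
		return "mapper using " + d, nil
	}
	return g
}

func main() {
	g := newClientGetter("https://example.invalid")
	for i := 0; i < 2; i++ {
		m, _ := g.mapper.get()
		fmt.Println(m)
	}
}
```

Because a fresh getter is created for each reconcile in this sketch,
nothing cached outlives it, which is in line with not keeping a
persistent discovery client or REST mapper around.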
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
- github.com/fluxcd/pkg/apis/event to v0.4.1
- github.com/fluxcd/pkg/apis/kustomize to v0.8.1
- github.com/fluxcd/pkg/apis/meta to v0.19.1
- github.com/fluxcd/pkg/runtime to v0.30.0
- sigs.k8s.io/controller-runtime to v0.14.5
- github.com/containerd/containerd to v1.6.18
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Check if the service account to be impersonated actually exists
and proceed with uninstalling the Helm release only if it does.
Otherwise, skip uninstalling the release and carry on with finalization.
Add an e2e test to check if deleting a namespace with the RBAC and
HelmRelease succeeds with the namespace being fully deleted.
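
The decision flow described above can be sketched as follows. This is
a simplified illustration with hypothetical names (`saLookup`,
`finalize`, `errNotFound`); the controller itself would query the
Kubernetes API rather than call a plain function.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for a missing service account.
var errNotFound = errors.New("service account not found")

// saLookup is a hypothetical lookup; it returns errNotFound when the
// service account does not exist.
type saLookup func(namespace, name string) error

// finalize runs uninstall only if the service account to impersonate
// exists. It reports whether the uninstall was performed.
func finalize(lookup saLookup, namespace, serviceAccount string,
	uninstall func() error) (bool, error) {
	if serviceAccount != "" {
		if err := lookup(namespace, serviceAccount); err != nil {
			if errors.Is(err, errNotFound) {
				// Nothing to impersonate: skip the uninstall and
				// let finalization continue.
				return false, nil
			}
			return false, err
		}
	}
	if err := uninstall(); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	missing := func(namespace, name string) error { return errNotFound }
	uninstall := func() error {
		fmt.Println("uninstalling release")
		return nil
	}
	done, err := finalize(missing, "apps", "deployer", uninstall)
	fmt.Println("uninstalled:", done, "error:", err)
}
```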
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
This reduces the amount of log lines pushed to `debug` by configuring the kube
client and storage loggers to only log to `trace`.
In addition, the log buffer used in events will now just contain the
most relevant information about a failure as reported by the Helm action
itself, and not the in-depth information from the underlying client
and/or storage.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This allows install and upgrade actions to use DNS lookups while
rendering Helm templates, after this was disabled in Helm due to
possible security risks.
It is enabled (globally) on the controller by configuring
`--feature-gates=AllowDNSLookups=true`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
- github.com/fluxcd/source-controller/api to v0.35.2
- github.com/onsi/gomega to v1.27.2
- k8s.io/api to v0.26.2
- k8s.io/apiextensions-apiserver to v0.26.2
- k8s.io/apimachinery to v0.26.2
- k8s.io/cli-runtime to v0.26.2
- k8s.io/client-go to v0.26.2
- k8s.io/utils to v0.0.0-20230220204549-a5ecb0141aa5
- Unpin github.com/emicklei/go-restful as it is no longer an (indirect)
dependency.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
As there are currently no other utilities to properly see which changes
the controller detected, this allows people to gain insight into the
observed changes by configuring the controller with
`--log-level=debug`.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This allows a specific object from a release manifest to be excluded
from drift detection by labeling or annotating it with:
`helm.toolkit.fluxcd.io/diff: disabled`.
Using a Kustomize post renderer definition in a HelmRelease, this can
be used to ignore any object from an arbitrary chart.
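
A minimal sketch of the check described above: an object is skipped
when either its labels or its annotations carry the key with the value
`disabled`. The `skipDiff` helper is hypothetical, and the exact value
matching done by the controller (for example case sensitivity) is an
assumption here.

```go
package main

import "fmt"

const (
	diffKey       = "helm.toolkit.fluxcd.io/diff"
	disabledValue = "disabled"
)

// skipDiff reports whether drift detection should ignore an object
// with the given labels and annotations. Lookups on nil maps are safe
// in Go and return the empty string.
func skipDiff(labels, annotations map[string]string) bool {
	return labels[diffKey] == disabledValue ||
		annotations[diffKey] == disabledValue
}

func main() {
	annotated := map[string]string{diffKey: disabledValue}
	fmt.Println(skipDiff(nil, annotated))
	fmt.Println(skipDiff(map[string]string{"app": "demo"}, nil))
}
```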
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This enables experimental drift detection of cluster state compared to
the current manifest data from the Helm storage's manifest blob.
Drift detection works based on the already proven approach of the
kustomize-controller's SSA package, and utilizes the managed field
configured by the controller since `v0.12.2`.
This feature is planned to graduate from experimental once the further
controller rewrite has been finished, and the state of the Helm storage
itself is more fault tolerant.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This is backwards compatible, as it only changes the type without the
further requirements around the YAML declaration.
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
- github.com/fluxcd/pkg/apis/event to v0.4.0
- github.com/fluxcd/pkg/runtime to v0.29.0
- helm.sh/helm/v3 to v3.11.1
- k8s.io/utils to v0.0.0-20230209194617-a36077c30491
- github.com/containerd/containerd to v1.6.18
Signed-off-by: Hidde Beydals <hello@hidde.co>
- sigs.k8s.io/controller-runtime to v0.14.4
- Unpin golang.org/x/text from v0.4.0 to allow update to v0.5.0
Signed-off-by: Hidde Beydals <hello@hidde.co>
This updates all the comparisons to make use of `HasRevision` which
supports the RFC-0005 and legacy revision formats.
Signed-off-by: Hidde Beydals <hello@hidde.co>