Commit Graph

131 Commits

Author SHA1 Message Date
Hidde Beydals 76c4bb78bd helmrepo: only log recovery msg on actual recovery
Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-22 23:35:21 +01:00
Hidde Beydals c0a1099719 helm: only use Digest to calculcate index revision
In #1001 bits around the Helm repository reconciliation logic were
rewritten, mostly based on the documented behavior instead of the
actual code. This resulted in the reintroduction of a YAML marshal of
the (sorted) index YAML instead of reliance of just the checksum of the
file.

This to take situations into account in which a repository would e.g.
provide a new random order on every generation. However, this approach
is (extremely) expensive as the marshal goes through a JSON -> YAML
loop, eating lots of RAM in the process.

As the further (silently) introduced behavior has not resulted in any
reported issues, I deem this approach safe and better than e.g.
encoding to just JSON which would still require a substantial amount of
memory.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-22 23:33:59 +01:00
Hidde Beydals 9283894bbe Use MetaDigestKey from event API
Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-14 12:48:36 +01:00
Hidde Beydals f53bfd1dc1 Use Artifact.Path for HelmRepository index cache
Resolving it to a local path does not make it more unique, while
resulting in longer keys and a lot of safejoin calls.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-14 12:48:36 +01:00
Hidde Beydals d62f4dc0c6 misc: order imports and align digest aliases
Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-14 12:48:36 +01:00
Hidde Beydals 0aaeeee5e9 controllers: RFC-0005 fmt for HelmRepository rev
This includes changes to the `ChartRepository`, to allow calculating
the revision and digest and tidy things.

In addition, the responsibility of caching the `IndexFile` has been
moved to the reconcilers. As this allowed to remove a lot of
complexities within the `ChartRepository`, and prevented passing on
the cache in general.

Change `HelmRepository`'s Revision to digest

Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-14 12:48:36 +01:00
Hidde Beydals a72badf16b reconcilers: include artifact digest in event meta
Signed-off-by: Hidde Beydals <hello@hidde.co>
2023-02-14 12:48:36 +01:00
Stefan Prodan f89d07579f
Update dependencies
- k8s.io/* v0.26.1
- helm.sh/helm/v3 v3.11.0
- github.com/sigstore/sigstore v1.5.1
- github.com/google/go-containerregistry v0.13.0
- github.com/fluxcd/pkg/oci v0.18.0
- github.com/fluxcd/pkg/runtime v0.27.0
- cloud.google.com/go/storage v1.29.0
- github.com/Azure/azure-sdk-for-go/sdk/azcore v1.3.0
- sigs.k8s.io/controller-runtime v0.14.1

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2023-01-27 14:03:04 +02:00
Sunny 3d6a5e1203 Add progressive status in helmrepo reconciler
Signed-off-by: Sunny <darkowlzz@protonmail.com>
2023-01-10 00:30:40 +05:30
Sunny d551e59a06 Use Event v1 API metadata keys in notifications
Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-12-19 20:21:49 +05:30
Stefan Prodan 65e1041492
Use Flux Event API v1beta1
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-11-09 11:06:23 +02:00
Sunny 15cdd85805 controllers: Allow deletion of suspended objects
Reorders the object suspended check in all the reconcilers to allow
deletion of objects when they are suspended. Objects used to get stuck
on delete because the finalizers were not getting removed due to the
suspended state.

Adds a generic test for all the reconcilers to check if a suspended
source object can be delete.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-10-20 17:22:11 +05:30
Sunny 59294bf582
controllers: Remove ctx overwrite
Context in the reconcilers were overwritten earlier after adding new
log field `reconcileID` in the logger. Since the `reconcileID` is now
set by controller-runtime, this is no longer needed. The logger in the
context already has the field set and when the context is passed to
other functions, they too have the logger with the reconcileID set.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-09-04 18:56:02 +05:30
Somtochi Onyekwere c38fafe128 Align controller logs to Kubernetes structured logging
Signed-off-by: Somtochi Onyekwere <somtochionyekwere@gmail.com>
2022-08-31 14:24:40 +01:00
Santosh Kaluskar 1ad2f004ac Implementing RecoverPanic on reconcilers to ensure it recovers from panic instead of crashing the controller and Squashed commits.
Signed-off-by: Santosh Kaluskar <dtshbl@gmail.com>
2022-08-11 18:35:25 +05:30
York Chen d5a75f6b2f feat: cache helmrepo early after reconcile
1. moved chartRepo.Unload() from reconcileSource() to the defer func in reconcileArtifact to allow caching index in memory
2. added step to init memory cache in reconcileArtifact()
3. added step to save helmrepo index into memory cache in reconcileArtifact()

Signed-off-by: York Chen <ychen@d2iq.com>
2022-07-21 18:17:26 +01:00
Sunny e345e71eca
Minor comment updates
- Update the comments around artifact retention fields in Storage.
- Update the comments around reconcileStorage regarding artifact
  retention and garbage collection.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-07-04 19:58:26 +05:30
Paulo Gomes 42dcb87345
Add reconcileID to all reconcilers
GitRepository introduced correlation ID to improve
transport level logging. This change aligns the other
reconcilers to the same approach.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-06-14 08:59:44 +01:00
Stefan Prodan 2441f1f0e9
Log on new artifact and failure recovery
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-06-03 15:58:19 +03:00
Sunny 9fe287d912
helmrepo: rm stale condition when type switching
Remove stale condition from HelmRepo during garbage collection when a
type switch to OCI HelmRepo occurs. This ensures the OCI HelmRepo does
not have any conditions from the previous type.

Co-authored-by: Soule BA <soule@weave.works>
Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-05-31 03:53:33 +05:30
Max Jonas Werner 841ed7ae66
[RFC 0002] Flux OCI support for Helm (#690)
* Add OCI Helm support

* users will be able to declare OCI HelmRepository by using the `.spec.type` field of the HelmRepository API. Contrary to the HTTP/S HelmRepository no index.yaml is reconciled from source, instead a simple url and credentials validation is performed.
* For backwards-compatibility, an empty `.spec.type` field leads to the HelmRepository being treated as a plain old HTTP Helm repository.
* users will be able to declare the new OCI HelmRepository type as source using the .Spec.SourceRef field of the HelmChart API. This will result in reconciling a chart from an OCI repository.
* Add registryTestServer in the test suite and OCI HelmRepository test case
* Add a new OCI chart repository type that manage tags and charts from an OCI registry.
* Adapat RemoteBuilder to accept both repository types
* discard output from OCI registry client; The client has no way to set a verbosity level and spamming the controller logs with "Login succeeded" every time the object is reconciled doesn't help much.

Signed-off-by: Soule BA <soule@weave.works>
Signed-off-by: Max Jonas Werner <mail@makk.es>
Co-authored-by: Soule BA <soule@weave.works>
2022-05-19 14:50:16 +02:00
Sunny eeaa958866
helmrepo: same revision different checksum condn
This change prevents Reconciling and ArtifactOutdated conditions to be
set on HelmRepo when the checksum of a cached repo index changes.

Adds some tests to ensure that when the repo index is cached, the
revision and checksum of the returned artifact are the same as on the
existing object status.
Also adds checks for the returned artifact and chartRepo from
reconcileSource, to ensure that chartRepo is populated and the checksum
of a new potential artifact is always empty, as it's populated when the
artifact is written in the storage.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-04-27 13:03:51 +05:30
Paulo Gomes 009504b294 helm: optimise repository index loading
Avoid validating (and thus loading) indexes if the checksum already exists in storage.
In other words, if the YAML is identical to the Artifact in storage, the reconciliation should
be a no-op, and therefore can short-circuit long/heavy operations.

Co-authored-by: Hidde Beydals <hello@hidde.co>
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-25 17:00:27 +02:00
Max Jonas Werner c3d5dac4a8
fix panic when HelmRepository's artifact size is nil
This fixes the immediate issue of the nil pointer dereference but we
still haven't isolated the actual cause of the size being nil to begin
with. This is ongoing work and as soon as we have boiled that down to
the simplest case we will provide a regression test for that case.

closes #680

Signed-off-by: Max Jonas Werner <mail@makk.es>
Co-authored-by: Hidde Beydals <hiddeco@users.noreply.github.com>
2022-04-22 10:29:56 +02:00
Paulo Gomes 4198191759
Add flags to configure exponential back-off retry
Add two new flags to enable users to configure exponential
back-off for Flux objects. The default values are now
set to 750ms for minimum retry time, and 15min for max.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-04-12 10:44:46 +01:00
Sanskar Jaiswal 72a4982541 remove leftover timeout in reconcilers
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-04-07 22:33:07 +05:30
Sunny 5da74ca5a9
Add notify() in all the reconcilers
notify() is used to emit events for new artifact and failure recovery
scenarios. It's implemented in all the reconcilers.
Previously, when there used to be a failure due to any reason, on a
subsequent successful reconciliation, no notification was sent to
indicate that the failure has been resolved.
With notify(), the old version of the object is compared with the new
version of the object to determine if all, if any, of the failures have
been resolved and a notification is sent. The notification message is
the same that's sent in usual successful source reconciliation message
about stored artifact.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-04-07 21:39:16 +05:30
Sunny 44207f46d5
Avoid event logging GC failure
We try to avoid affecting the source reconciliation when there's a
garbage collection related failure.

The event logging was resulting in events and notifications related to
GC failure when the artifact directory isn't created in the first
reconciliation of an object.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-04-07 21:14:12 +05:30
Sanskar Jaiswal f8c27a85dd Garbage collect with provided retention options.
Introduce two new flags to configure the ttl of an artifact and the max
no. of files to retain for an artifact. Modify the gc process to
consider the options and use timeouts to prevent the controller from
hanging.
This helps in situations when the SC has already garbage collected the
current artifact but the advertised artifact url is still the same,
which leads to the server returning a 404.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-04-07 18:43:55 +05:30
Sunny b41c717e16
controllers: emit event and log source up-to-date
Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-30 17:06:16 +05:30
Sunny 86860ec913
Update all reconcilers with ArtifactInStorage cond
Update alll the other reconcilers similar to the GitRepository
reconcilers to introduce positive condition ArtifactInStorage and
reorder the status conditions.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-30 17:06:12 +05:30
Hidde Beydals 7ca393498c Prefix event annotations with API Group FQDN
This to facilitate improvements on the notification-controller side,
where annotations prefixed with the FQDN of the Group of the Involved
Object will be transformed into "fields".

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-03-23 19:28:05 +01:00
Sunny a102d95cf9
Prioritize StorageOperationFailedCondition
Prioritize StorageOperationFailedCondition over other artifact outdated
and unavailable conditions so that when artifact is failing due to
storage operation, it's visble in the ready status condition, making the
reason for not ready more accurate.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-16 21:41:39 +05:30
Sunny a2d6af126d
Add new condition StorageOperationFailedCondition
Introduce new condition StorageOperationFailedCondition for all the
failures related to the storage. It is a negative polarity condition and
is considered in computing summary of reconciliation.

Also, introduce more granular event reasons related to
StorageOperationFailedCondition for precise reasoning behind failures.
These replace the vague StorageOperationFailedReason.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-03-14 21:20:37 +05:30
Hidde Beydals 971caf92d5 controllers: finetune `eventLogf` (variant) docs
Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-03-11 10:04:14 +01:00
Hidde Beydals 86d1d80bf2 Document HelmRepository API v1beta2 spec
Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-03-11 10:04:14 +01:00
Paulo Gomes 3b4cc52419
Use uppercase TLS in error messages
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-02 13:02:11 +00:00
Paulo Gomes d9d789fdb1
Reuse transport for helm chart download
Reuses the same transport across different helm chart downloads,
whilst resetting the tlsconfig to avoid cross-contamination.

Crypto material is now only processed in-memory and does not
touch the disk.

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-03-02 13:02:11 +00:00
Sunny 234b7f4c9d
Remove redundant reconciling in reconcileArtifact
reconcileSource() adds reconciling condition with accurate information.
Remove setting reconciling condition in reconcileArtifact().

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-25 20:39:24 +05:30
Sunny 9c7661dcbd helmrepo: Make NewArtifact event human friendly
Inform index size and repo instead of a revision.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:37:48 +01:00
Sunny f72a28a193 Use field owner in the patch helper
- Update summarize helper to have patch field owner.
- Updated the controllers to set the patch field owner.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Sunny 45df2d76c8 Update API descriptions and messages to be consistent
- Update v1beta2 API descriptions and reconciling messages to be
  consistent.
- Replace 'download' with 'fetch'. Since the status condition for
  download failure is called FetchFailed, using the term 'fetch' makes
  the messaging more consistent.
- Replace `BucketOperationSucceed` with `BucketOperationSucceeded` and
  generate api docs.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Sunny d997876b07 Make generic SummarizeAndPatch()
summarizeAndPatch() was used by all the reconcilers with their own
object type. This creates a generic SummarizeAndPatch helper that takes
a conditions.Setter object and performs the same operations. All the
reconcilers are updated to use SummarizeAndPatch(). The process of
summarize and patch can be configured using the HelperOptions.

Introduce ResultProcessor to allow injecting middlewares in the
SummarizeAndPatch process.

Introduce RuntimeResultBuilder to allow defining how the reconciliation
result is computed for specific reconciler. This enabled different
reconcilers to have different meanings of the reconciliation results.

Introduce Conditions in summary package to store all the status
conditions related information of a reconciler. This is passed to
SummarizeAndPatch() to be used for summary and patch calculation.

Remove all the redundant summarizeAndPatch() tests per reconciler.

Add package internal/object containing helpers for interacting with
runtime.Object needed by the generic SummarizeAndPatch().

Add tests for ComputeReconcileResult().

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Sunny 9b5613732f storage: Return details about the deleted items
Update Storage.RemoveAll() and Storage.RemoveAllButCurrent() to return
the details about the deleted items. This helps emit useful information
about garbage collection in the controllers and ignore no-op garbage
collections.

RemoveAll() returns the path of the deleted directory if any.
RemoveAllButCurrent() returns a slice of path of all the deleted items
from a resource's artifact dir.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Hidde Beydals ad0993e93e controllers: truncate temporary cached Helm index
Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-02-23 12:35:30 +01:00
Sunny 78882b3b36 Consolidate result conversion and computation
Consolidate BuildRuntimeResult() into summarizeAndPatch() to simplify
where the results are computed, summarized and patched.

Move the event recording and logging of context specific errors into
RecordContextualError() and call it in summarizeAndPatch().

Introduce Waiting error for wait and requeue scenarios. Update
ComputeReconcileResult() and RecordContextualError() to consider Waiting
error.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Hidde Beydals 032ffb4d27 controllers: tweak events and logging
- Ensure all logged messages start with a lowercase.
- Make some pushed (and logged) events of type `EventTypeTrace` to
  prevent them from being sinked to the external event recorder, to
  prevent spam.
- Only log if artifact is up-to-date with upstream (instead of pushing
  an event).

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-02-23 12:35:30 +01:00
Sunny f14a053f0a helmrepo: Add more reconciler design improvements
- Remove ArtifactUnavailable condition and use Reconciling condition to
  convey the same.
- Make Reconciling condition affect the ready condition.
- Introduce summarizeAndPatch() to calculate the final status conditions
  and patch them.
- Introduce reconcile() to iterate through the sub-reconcilers and
  execute them.
- Simplify logging and event recording
- Introduce controller-check/status checks to assert that the status
  conditions are valid at the end of the tests.
- Create variables for various condition groups: owned conditions, ready
  dependencies and ready dependencies negative.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00
Hidde Beydals dd68cd57b7 Rewrite `HelmRepositoryReconciler` to new standards
This commit rewrites the `HelmRepositoryReconciler` to new standards,
while implementing the newly introduced Condition types, and trying to
adhere better to Kubernetes API conventions.

More specifically it introduces:

- Implementation of more explicit Condition types to highlight
  abnormalities.
- Extensive usage of the `conditions` subpackage from `runtime`.
- Better and more conflict-resilient (status)patching of reconciled
  objects using the `patch` subpackage from runtime.
- Proper implementation of kstatus' `Reconciling` and `Stalled`
  conditions.
- Refactoring of some Helm elements to make them easier to use within
  the new reconciler logic.
- Integration tests that solely rely on `testenv` and do not
  use Ginkgo.

There are a couple of TODOs marked in-code, these are suggestions for
the future and should be non-blocking.
In addition to the TODOs, more complex and/or edge-case test scenarios
may be added as well.

Signed-off-by: Hidde Beydals <hello@hidde.co>
2022-02-23 12:35:30 +01:00
Sunny 5f125ebfcd helmrepo: Replace GetInterval() with GetRequeueAfter()
Signed-off-by: Sunny <darkowlzz@protonmail.com>
2022-02-23 12:35:30 +01:00