* Generator: allow reconcilers to listen to leader promotion events
Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
* Run hack/update-codegen.sh
Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
---------
Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
The controller package now promotes those buckets
prior to starting reconciliation and leader election
The unopposed elector now seeds the univeral bucket
This helps avoid races in unit tests were the univeral
bucket isn't promoted in time when reconciling an resource
* Avoid double-resyncs without leader election.
tl;dr Without leader election enabled, we've been suffering from double resyncs, this fixes that.
Essentially when the informers are started, prior to starting the controllers, they fill the controller's workqueue with ~the world. Once we start the controllers, the leader election code calls `Promote` on a `Bucket` so that it starts processing them as their leader, and then is requeues every key that falls into that `Bucket`.
When leader election is disabled, we use the `UniversalBucket`, which always owns everything. This means that the first pass through the workqueue was already hitting every key, so re-enqueuing every key results in superfluous processing of those keys.
This change makes the `LeaderAwareFuncs` elide the call to `PromoteFunc` when the `enq` function passed is `nil`, and makes the `unopposedElector` have a `nil` `enq` field when we explicitly pass it the `UniversalBucket` because leader election is disabled.
* Add more unit test coverage
* Introduce `NewContext`, deprecate `NewImplFull`.
Our generated `NewImpl` methods have long taken `context.Context`, but despite many iterations the forms we expose from our `controller` package never have. This change contains several elements:
1. Expose a new `NewContext` method that takes `context.Context` in addition to the current `NewImplFull` signature.
2. Call `NewContext` instead of the deprecated `NewImpl` from our generated controller code.
3. Call `NewContext` from all our webhook reconcilers.
* Add a Tracker to controller.Impl to cut down on downstream boilerplate.
* Use consistent case for "Deprecated" comments
Not the most important thing ever, but the canonical string to use for
Deprecated warnings is case sensitive, and also it's nice to be
consistent.
* Add nolint comment
This is modelled after some of the semantics available in controller-runtime's `Result` type, which allows reconcilers to indicate a desire to reprocess a key either immediately or after a delay.
I worked around our lack of this when I reworked Tekton's timeout handling logic by exposing a "snooze" function to the Reconciler that wrapped `EnqueueAfter`, but this feels like a cleaner API for folks to use in general, and is consistent with our `NewPermanentError` and `NewSkipKey` functions in terms of influencing the queuing behaviors in our reconcilers via wrapped errors.
You can see some discussion of this here: https://knative.slack.com/archives/CA4DNJ9A4/p1627524921161100
Most resources stamped out by knative controllers have OwnerReferences synthesized via `kmeta.NewOwnerRef`, which requires the parent resource to implement `kmeta.OwnerRefable` for accessing the `GroupVersionKind`.
However, where we setup informer watches on those same children resources we have essentially relied on direct synthesis of the `Group[Version]Kind` where we could instead provide an empty instance of the controller resource and leverage `GetGroupVersionKind` to provide the GVK used for filtration.
So where before folks would write:
```golang
FilterFunc: controller.FilterControllerGK(v1alpha1.WithKind("MyType"))
```
They may now write:
```golang
FilterFunc: controller.FilterController(&v1alpha1.MyType{})
```
The latter is preferable in part because it is more strongly typed, but also shorter.
This change introduces a new `controller.NewSkipKey` method to designate certain reconciliations as "skipped".
The primary motivation for this is to squelch useless logging on non-leader replicas, which currently report success with trivial latency.
I have plumbed this through existing reconcilers and the code-gen so most things downstream should get this for free. In places where a key is observed, I do not mark the reconcile as skipped as the reconciler did some processing for which the awareness of side-effects and reported latency may be interesting.
* Enable golint and exclude some other generated or additional dirs
Also remove `test` ignore, since it's covered by path ignore rule.
* meh
* fixes
* more
* progressing
* further
* like a boss
* TestEnqueue: remove unnecessary calls to Sleep
The rate limiter applies only when multiple items are put onto the
workqueue, which is not the case in those tests.
Execution: ~7.6s -> ~2.1s
* TestEnqueueAfter: remove assumptions on execution times
Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.
Execution: ~1s -> ~0.05s
* TestEnqueueKeyAfter: remove assumptions on execution times
Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.
Execution: ~1s -> ~0.05s
* TestStartAndShutdownWithErroringWork: remove sleep
Instead of sleeping for a conservative amount of time, keep watching the
state number of requeues in a goroutine, and notify the test logic as
soon as the expected threshold is reached.
Logs, for an idea of timings
----------------------------
Started workers
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 104µs {"knative.dev/key": "bar"}
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 48.2µs {"knative.dev/key": "bar"}
Execution: ~1s -> ~0.01s
* TestStart*/TestRun*: reduce sleep time
There is no need to sleep for that long. If an error was returned, it
would activate the second select case immediately.
Execution: ~1s -> ~0.05s
* TestImplGlobalResync: reduce sleep time
We know the fast lane is empty in this test, so we can safely assume
immediate enqueuing of all items on the slow lane.
Logs, for an idea of timings
----------------------------
Started workers
Processing from queue foo/bar (depth: 0)
Reconcile succeeded. Time taken: 11.5µs {"knative.dev/key": "foo/bar"}
Processing from queue bar/foo (depth: 1)
Processing from queue fizz/buzz (depth: 0)
Reconcile succeeded. Time taken: 9.7µs {"knative.dev/key": "fizz/buzz"}
Reconcile succeeded. Time taken: 115µs {"knative.dev/key": "bar/foo"}
Shutting down workers
Execution: ~4s -> ~0.05s
* review: Replace for/select with PollUntil
* review: Remove redundant duration multiplier
* review: Replace defer with t.Cleanup