116 Commits

Author SHA1 Message Date
Matt Moore 3826bb2436
Add a new mechanism for requeuing a key. (#2201)
This is modelled after some of the semantics available in controller-runtime's `Result` type, which allows reconcilers to indicate a desire to reprocess a key either immediately or after a delay.

I worked around our lack of this when I reworked Tekton's timeout handling logic by exposing a "snooze" function to the Reconciler that wrapped `EnqueueAfter`, but this feels like a cleaner API for folks to use in general, and is consistent with our `NewPermanentError` and `NewSkipKey` functions in terms of influencing the queuing behaviors in our reconcilers via wrapped errors.

You can see some discussion of this here: https://knative.slack.com/archives/CA4DNJ9A4/p1627524921161100
2021-07-30 08:31:33 -07:00
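The wrapped-error approach described in #2201 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the package's actual code: `requeueError`, `errPermanent`, `errSkip`, and `classify` are hypothetical names invented here, standing in for the idea that a reconciler's returned error tells the work loop whether to forget a key, retry it with backoff, or retry it after a delay.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// requeueError is a hypothetical error type asking the queue to retry a key.
// A zero delay means "retry immediately".
type requeueError struct {
	after time.Duration
}

func (e requeueError) Error() string {
	return fmt.Sprintf("requeue after %v", e.after)
}

// Hypothetical sentinels standing in for "permanent" and "skip" errors.
var (
	errPermanent = errors.New("permanent failure")
	errSkip      = errors.New("key skipped")
)

// action describes what the work loop does with a key after reconciling it.
type action string

const (
	forget       action = "forget"
	retryBackoff action = "retry with backoff"
	retryAfter   action = "retry after delay"
)

// classify inspects the (possibly wrapped) error returned by a reconciler
// and decides how the key should be queued next.
func classify(err error) (action, time.Duration) {
	var rq requeueError
	switch {
	case err == nil, errors.Is(err, errSkip), errors.Is(err, errPermanent):
		return forget, 0
	case errors.As(err, &rq):
		return retryAfter, rq.after
	default:
		return retryBackoff, 0
	}
}

func main() {
	// The reconciler wraps the requeue request with extra context.
	err := fmt.Errorf("waiting for timeout: %w", requeueError{after: 30 * time.Second})
	act, delay := classify(err)
	fmt.Println(act, delay)
}
```

Because the decision travels inside the error chain, a reconciler can add context with `%w` without hiding the queuing hint from the caller.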
Ben Moss 7d1b0f19ef
Add ability to filter objects on injection controller promotion (#2180)
Currently we enqueue every object with no way to filter, which causes
problems for eventing's source controller which reconciles duck CRDs.
2021-07-13 10:55:50 -07:00
zhaojizhuang 4cdacd0473
add concurrency for each controller (#2160) 2021-06-25 12:41:44 -07:00
Markus Thömmes 980a33719a
Fix revive related linting issues (#2131) 2021-05-26 01:10:28 -07:00
Evan Anderson 728bc4ad4e
Update OWNERS_ALIASES to match autogen in community (#2078) 2021-04-08 07:42:51 -07:00
Markus Thömmes 04fdbd775b
Add WaitForCacheSyncQuick and use it in tests (#2045) 2021-03-05 07:07:20 -08:00
Victor Agababov 8fbab7ebb7
Redo the comment a bit further (#2042)
I think `thus` is quite wrong here (I commented on the pr), but
also took it a bit further.
2021-03-03 11:22:15 -08:00
Markus Thömmes 08fc6268bf
Fix comment for skipKeyError (#2041)
* Fix comment for skipKeyError

* Review
2021-03-03 03:19:15 -08:00
Lionel Villard 07b5ddfaea
add demoteFunc controller option (#2033)
* add demoteFunc controller option

* use tab instead of space for indent

* run codegen
2021-02-25 14:10:47 -08:00
Matt Moore 8a9bf766d3
Add symmetric filter helper based on OwnerRefable. (#2032)
Most resources stamped out by knative controllers have OwnerReferences synthesized via `kmeta.NewOwnerRef`, which requires the parent resource to implement `kmeta.OwnerRefable` for accessing the `GroupVersionKind`.

However, where we set up informer watches on those same child resources, we have essentially relied on direct synthesis of the `Group[Version]Kind`, where we could instead provide an empty instance of the controller resource and leverage `GetGroupVersionKind` to provide the GVK used for filtration.

So where before folks would write:

```golang
FilterFunc: controller.FilterControllerGK(v1alpha1.WithKind("MyType"))
```

They may now write:

```golang
FilterFunc: controller.FilterController(&v1alpha1.MyType{})
```

The latter is preferable in part because it is more strongly typed, but also shorter.
2021-02-19 08:47:03 -08:00
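To illustrate the idea behind #2032, here is a small self-contained sketch of a filter derived from an empty instance of the parent type. The types `gvk`, `ownerRef`, `ownerRefable`, and `myType` are simplified stand-ins invented for this example (they are not the Kubernetes or knative types), and matching on group, version, and kind together is an assumption of the sketch.

```go
package main

import "fmt"

// gvk and ownerRef are simplified stand-ins for the Kubernetes
// GroupVersionKind and OwnerReference types.
type gvk struct{ Group, Version, Kind string }

type ownerRef struct {
	APIVersion string // "group/version"
	Kind       string
	Controller bool
}

// ownerRefable mirrors the idea of kmeta.OwnerRefable: a parent type can
// report its own GroupVersionKind.
type ownerRefable interface {
	GetGroupVersionKind() gvk
}

// myType is a hypothetical parent resource.
type myType struct{}

func (*myType) GetGroupVersionKind() gvk {
	return gvk{Group: "example.dev", Version: "v1alpha1", Kind: "MyType"}
}

// filterController returns a predicate that reports whether a child's owner
// references include a controlling owner of the same kind as r.
func filterController(r ownerRefable) func(owners []ownerRef) bool {
	want := r.GetGroupVersionKind()
	apiVersion := want.Group + "/" + want.Version
	return func(owners []ownerRef) bool {
		for _, o := range owners {
			if o.Controller && o.APIVersion == apiVersion && o.Kind == want.Kind {
				return true
			}
		}
		return false
	}
}

func main() {
	filter := filterController(&myType{})
	child := []ownerRef{{APIVersion: "example.dev/v1alpha1", Kind: "MyType", Controller: true}}
	fmt.Println(filter(child))
}
```

The caller never spells out the kind as a string, so a renamed or mistyped kind becomes a compile error rather than a silently empty watch.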
Stavros Kontopoulos 84c98f3c3e
remove reflector metrics (#2020) 2021-02-15 08:55:23 -08:00
Stavros Kontopoulos 448ae657fb
Metric unit fixes (#2018)
* metric unit fixes

* additional fixes
2021-02-12 12:38:35 -08:00
Markus Thömmes a02dcff9ee
Bump a few assorted dependencies to their latest versions (#2013)
* Bump a few assorted dependencies to their latest versions

* Use new uuid helper

* Some more slight adjustments
2021-02-08 09:52:52 -08:00
Dave Protasowski 5bb97df49b
fix duration logging (#1992) 2021-01-15 12:20:20 -08:00
Dave Protasowski a74906c7fb
Use structured logging to augment our logger vs. naming (#1991)
* Use structured logging to augment our logger vs. naming

* remove unused line
2021-01-15 12:08:21 -08:00
Matt Moore f0ea5e6b9c
Use special error type to designate skips. (#1988)
This change introduces a new `controller.NewSkipKey` method to designate certain reconciliations as "skipped".

The primary motivation for this is to squelch useless logging on non-leader replicas, which currently report success with trivial latency.

I have plumbed this through existing reconcilers and the code-gen so most things downstream should get this for free.  In places where a key is observed, I do not mark the reconcile as skipped as the reconciler did some processing for which the awareness of side-effects and reported latency may be interesting.
2021-01-14 14:30:20 -08:00
Markus Thömmes 261c9b4624
Remove unnecessary intermediate loggers (#1969)
* Remove unnecessary intermediate loggers

* Make linter happy

* Collapse logging variable into if contexts
2020-12-16 08:27:58 -08:00
rusde 3154117dcf
Add tags to knative provided reconcile metrics (#1916)
* Add tags to knative provided reconcile metrics

* Remove resource tag to reduce metric cardinality

* Removing unknown tags

* Update controller/testing/fake_stats_reporter.go

Co-authored-by: Victor Agababov <vagababov@gmail.com>
2020-11-24 12:42:35 -08:00
Markus Thömmes d9c4e5c439
Fix a few more occurrences of divisive language (#1902) 2020-11-12 06:41:59 -08:00
Markus Thömmes 565516e224
Add errorlint and fix all existing issues (#1855) 2020-10-29 01:14:35 -07:00
Markus Thömmes 385c8b9c0e
Fix nolint warnings and adhere to best practices (#1823) 2020-10-20 09:33:59 -07:00
Markus Thömmes 3d42810561
Make sure all controllers finish before ending test (#1818) 2020-10-19 10:02:58 -07:00
Josh Soref b39d5da935
Spelling (#1797)
* spelling: adopted

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: aliased

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: apierrs

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: assignment

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: available

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: coexistence

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: commit

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: conversions

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: creates

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: custom

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: determine

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: different

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: distribution

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: duplicate

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: editing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: endpoint

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: environment

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: generate

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: implementation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: identified

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: ignore

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: indicates

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: interface

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: interleaved

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: labels

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: label

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: mimic

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: namespaced

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: necessary

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: organization

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: populatable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: prometheus

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: refer

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: reference

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: repetitive

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: response

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: something

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: specable

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: spoofing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: synchronized

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: this

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: trailing

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unsupported

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: validation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* chore: reviewdog go header boilerplate

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-18 14:22:57 -07:00
Victor Agababov 7bad843466
Enable golint and exclude some other generated or additional dirs (#1783)
* Enable golint and exclude some other generated or additional dirs

Also remove `test` ignore, since it's covered by path ignore rule.

* meh

* fixes

* more

* progressing

* further

* like a boss
2020-10-07 14:58:20 -07:00
Victor Agababov a371418524
v2 (#1754) 2020-09-29 13:18:29 -07:00
Markus Thömmes 5fbbde31b3
Align linters with serving (enables stylecheck and asciicheck) (#1738) 2020-09-23 07:37:40 -07:00
Julian Friedman 6e0430fd94
Fix flakes in EnqueueAfter tests (#1710)
* Fix flakes in EnqueueAfter tests

* only call q.Len() once
2020-09-16 10:15:41 -07:00
Zbynek Roubalik 2d4efecc6b
bump to k8s 1.18 (#1428)
* bump to k8s 1.18.8

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* plumbing ctx through

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* add more ctx plumbing

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* ctx WithCancel()

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>
2020-09-11 07:54:00 -07:00
Victor Agababov 9c75061487
Make controller tests reentrant (#1671)
The stats tests were completely wonky, checking some random things,
and despite an attempt to make them reentrant, they were not.

So fix that \o/!
             |
             /
2020-08-31 14:06:15 -07:00
Victor Agababov 55cef32af0
Remove code duplication by reusing funcs (#1669)
and consts
2020-08-31 12:11:15 -07:00
Victor Agababov dba58d1d78
Modernize the controller tests (#1661)
- remove mutexes where we can get by with an atomic
- make things private that are not public
- etc
2020-08-28 21:54:07 -07:00
Victor Agababov 34838c4559
Remove all ClearAll calls from pkg UTs (#1654)
We need this in order to deprecate the function.
Serving is already free of those.
2020-08-27 13:13:07 -07:00
Antoine Cotten c56f5e203b
Reduce delays in controller tests (#1603)
* TestEnqueue: remove unnecessary calls to Sleep

The rate limiter applies only when multiple items are put onto the
workqueue, which is not the case in those tests.

Execution: ~7.6s -> ~2.1s

* TestEnqueueAfter: remove assumptions on execution times

Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.

Execution: ~1s -> ~0.05s

* TestEnqueueKeyAfter: remove assumptions on execution times

Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.

Execution: ~1s -> ~0.05s

* TestStartAndShutdownWithErroringWork: remove sleep

Instead of sleeping for a conservative amount of time, keep watching the
number of requeues in a goroutine, and notify the test logic as
soon as the expected threshold is reached.

Logs, for an idea of timings
----------------------------
Started workers
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 104µs     {"knative.dev/key": "bar"}
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 48.2µs    {"knative.dev/key": "bar"}

Execution: ~1s -> ~0.01s

* TestStart*/TestRun*: reduce sleep time

There is no need to sleep for that long. If an error was returned, it
would activate the second select case immediately.

Execution: ~1s -> ~0.05s

* TestImplGlobalResync: reduce sleep time

We know the fast lane is empty in this test, so we can safely assume
immediate enqueuing of all items on the slow lane.

Logs, for an idea of timings
----------------------------
Started workers
Processing from queue foo/bar (depth: 0)
Reconcile succeeded. Time taken: 11.5µs {"knative.dev/key": "foo/bar"}
Processing from queue bar/foo (depth: 1)
Processing from queue fizz/buzz (depth: 0)
Reconcile succeeded. Time taken: 9.7µs  {"knative.dev/key": "fizz/buzz"}
Reconcile succeeded. Time taken: 115µs  {"knative.dev/key": "bar/foo"}
Shutting down workers

Execution: ~4s -> ~0.05s

* review: Replace for/select with PollUntil

* review: Remove redundant duration multiplier

* review: Replace defer with t.Cleanup
2020-08-26 10:09:06 -07:00
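The pattern that #1603 describes, polling for a condition instead of sleeping for a conservative fixed time, can be sketched with the standard library alone. The helper `pollUntil` and the `depth` counter below are hypothetical and written for this illustration; they are not the helpers used in the actual tests.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var errTimeout = errors.New("timed out waiting for condition")

// pollUntil calls cond every interval until it returns true or the timeout
// elapses. It returns as soon as the condition holds, so a passing test pays
// only for the time the system under test actually needs.
func pollUntil(interval, timeout time.Duration, cond func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	var depth int32 // stand-in for a workqueue length observed by the test

	// Simulate an item showing up on the queue after a short delay.
	go func() {
		time.Sleep(20 * time.Millisecond)
		atomic.StoreInt32(&depth, 1)
	}()

	err := pollUntil(time.Millisecond, time.Second, func() bool {
		return atomic.LoadInt32(&depth) == 1
	})
	fmt.Println("poll result:", err)
}
```

The timeout only bounds the failure case, which is why it can stay generous without slowing down the passing case.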
Victor Agababov d5c09d2aef
Optimize and clean the controller code (#1639)
- fix outdated comments
- reorder the code to reduce the number of defer calls
- various other nits
2020-08-21 22:20:46 -07:00
Victor Agababov c30ec2ffd4
Fix the flaky test. (#1632)
The test assumes the threads would schedule in a particular way, but they don't.
What we really care to check is that we thread through the proper rate limiter (RL) and that it works.
We don't need to check that underlying queue impl works, that's done in its own tests.
So just verify these two things.
2020-08-18 17:06:13 -07:00
Matt Moore 3ac62a93ca
Fix the logkey.Key tagging in Enqueue. (#1627)
Debugging https://github.com/knative-sandbox/net-contour/issues/214 I noticed the logging was showing up as:

```
"knative.dev/key":{"knative.dev/key":"serving-tests/service-create-and-update-teaoccuu"}
```
2020-08-16 11:49:06 -07:00
Victor Agababov 0ecf6f86c1
Fix the debug->debugf (#1609)
this is a shame cube situation
2020-08-11 14:58:05 -07:00
Victor Agababov 62f2560aa7
Also log the key we're enqueueing when printing debug info (#1594) 2020-08-06 16:07:28 -07:00
Jon Donovan 7be5c0a87b
Allow creating controller with custom RateLimiter. (#1546)
* Allow creating controller with custom RateLimiter.

Which was possible before via field modification.
Not switching to a builder pattern mostly for speed of resolution.
Happy to consider alternatives.

* Add tests for new functionality.

Specifically, these test that the Wait() function is notified about
the item, and that the RateLimiter is passed through to the queue.

* Add Options. Gophers love Options.

* Even moar controller GenericOptions.

* Attempt to appease lint, don't create struct for typecheck.

* GenericOptions -> ControllerOptions

* Public struct fields.
2020-08-03 14:31:28 -07:00
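As a rough picture of what #1546 enables, the sketch below shows an options struct with public fields where an unset rate limiter falls back to a default. Everything here is hypothetical and simplified for illustration: `rateLimiter`, `fixedDelay`, `controllerOptions`, `impl`, and `newImpl` are invented names, and the 10ms default is an arbitrary value chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// rateLimiter is a simplified stand-in for a workqueue rate limiter: it
// reports how long a key should wait before being processed again.
type rateLimiter interface {
	When(key string) time.Duration
}

// fixedDelay is a trivial rate limiter that always returns the same delay.
type fixedDelay time.Duration

func (d fixedDelay) When(string) time.Duration { return time.Duration(d) }

// controllerOptions is a hypothetical options struct with public fields.
// Zero-value fields fall back to defaults.
type controllerOptions struct {
	WorkQueueName string
	RateLimiter   rateLimiter
}

// impl is a hypothetical controller holding the configured rate limiter.
type impl struct {
	name    string
	limiter rateLimiter
}

// newImpl builds a controller, defaulting the rate limiter when unset.
func newImpl(opts controllerOptions) *impl {
	if opts.RateLimiter == nil {
		opts.RateLimiter = fixedDelay(10 * time.Millisecond)
	}
	return &impl{name: opts.WorkQueueName, limiter: opts.RateLimiter}
}

func main() {
	def := newImpl(controllerOptions{WorkQueueName: "default"})
	custom := newImpl(controllerOptions{
		WorkQueueName: "custom",
		RateLimiter:   fixedDelay(time.Second),
	})
	fmt.Println(def.limiter.When("foo/bar"), custom.limiter.When("foo/bar"))
}
```

Passing the limiter at construction time avoids mutating a field on an already-built controller, which is the approach the commit message says was previously required.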
Markus Thömmes 58be631c12
Avoid copying object if TypeMeta is already correct. (#1561)
* Avoid copying object if TypeMeta is already correct.

* Fix typos.

* Add test for the new path.
2020-07-28 00:09:58 -07:00
Victor Agababov 557b6826ef
Reorder statements for more precise behavior and harden the test (#1534) 2020-07-21 14:14:53 -07:00
Victor Agababov 08156c67f6
Use slow lane to do global resync (#1528)
* Use slow lane to do global resync

* cmt

* yolo

* yolo v2

* fix log str

* fixes

* publicize things

* renamemove
2020-07-21 13:11:54 -07:00
Victor Agababov 1cea86c85f
Use two lane queue instead of the regular workqueue (#1514)
* Use two lane queue instead of the regular workqueue

- we need to poll for len in the webhook tests because we have async propagation now, and a check at the wrong time will give an incorrect result.
- otherwise just a drop in replacement.

* update test

* cmt

* tests hardened
2020-07-19 14:01:34 -07:00
Victor Agababov 25be382806
Add more time for the checks (#1516)
* Add more time for the checks
On the prow clusters, which are usually strapped for CPU, this fails because things happen in the wrong order

* harden
2020-07-17 16:03:33 -07:00
Victor Agababov 97e2175a17
Implement Slowlane/FastLane queues (#1512)
* Fast/slow queue implementation

* more3
2020-07-17 09:51:34 -07:00
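The commit messages do not spell out how the two lanes interact, so the following is only a toy sketch under one assumption: keys on the fast lane are served before keys on the slow lane whenever both have work. The type `twoLane` and its methods are hypothetical and are not the package's implementation.

```go
package main

import "fmt"

// twoLane is a toy two-lane queue: regular events go on the fast lane,
// bulk work such as a global resync goes on the slow lane.
type twoLane struct {
	fast, slow chan string
}

func newTwoLane(capacity int) *twoLane {
	return &twoLane{
		fast: make(chan string, capacity),
		slow: make(chan string, capacity),
	}
}

func (q *twoLane) addFast(key string) { q.fast <- key }
func (q *twoLane) addSlow(key string) { q.slow <- key }

// get returns the next key, preferring the fast lane whenever it has work.
// It blocks if both lanes are empty.
func (q *twoLane) get() string {
	// Non-blocking check of the fast lane first.
	select {
	case k := <-q.fast:
		return k
	default:
	}
	// Fast lane was empty: wait for whichever lane gets work first.
	select {
	case k := <-q.fast:
		return k
	case k := <-q.slow:
		return k
	}
}

func main() {
	q := newTwoLane(8)
	q.addSlow("resync/a")
	q.addSlow("resync/b")
	q.addFast("event/c") // arrives last but is served first

	for i := 0; i < 3; i++ {
		fmt.Println(q.get())
	}
}
```

With this ordering, a large resync queued on the slow lane does not delay keys that were enqueued in response to fresh events.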
Lance Liu e863db0344
Remove key in tags to reduce metrics count (#1494)
* Remove key in tags to reduce metrics count
Issue: https://github.com/knative/serving/issues/8609

Signed-off-by: Lance Liu <xuliuxl@cn.ibm.com>

* remove tag key for OpenCensus

Signed-off-by: Lance Liu <xuliuxl@cn.ibm.com>
2020-07-15 01:29:32 -07:00
Victor Agababov 16eea5bd5b
remove the ResourceLock field from the pkg/LE (#1482)
we only use a single possible way, so there is no need to have, parse and validate
a field that is in effect a constant.
2020-07-14 09:15:18 -07:00
Matt Moore a81727701f
Enable leader election by default. (#1476)
* Enable HA by default.

This consolidates the core of sharedmain around the new leaderelection logic, which will now be **enabled by default**.

This can now be disabled with `--disable-ha` or by passing `sharedmain.WithHADisabled(ctx)` to `sharedmain.MainWithConfig`.

* vagababov comments, build failure

* Open an issue for enabledComponents removal.

* Move the configmap watcher startup.

This race was uncovered by the chaos duck on knative/serving!  When we have enabled a feature flag, e.g. multi-container, and the webhook pods are restarted, there is a brief window where the webhook is up and healthy before the configmaps have synchronized and the new webhook pod realizes the feature is enabled.

* Drop the import alias
2020-07-13 12:43:18 -07:00
Markus Thömmes a92c682188
Add an option to skip automated status updates in a reconciler. (#1456)
* Add an option to skip automated status updates in a reconciler.

This option is necessary to be able to create reconcilers like Serving's labeler, which purely adds labels to resources. If that fails, the new automated observed generation handling changes the status, and that currently gets written to the API, which is not desired.

* Flip the bool.
2020-06-30 08:02:29 -07:00
Markus Thömmes 09d5e09da8
Assorted linting fixes. (#1443) 2020-06-24 12:11:27 -07:00