92 Commits

Author SHA1 Message Date
Victor Agababov a371418524
v2 (#1754) 2020-09-29 13:18:29 -07:00
Markus Thömmes 5fbbde31b3
Align linters with serving (enables stylecheck and asciicheck) (#1738) 2020-09-23 07:37:40 -07:00
Julian Friedman 6e0430fd94
Fix flakes in EnqueueAfter tests (#1710)
* Fix flakes in EnqueueAfter tests

* only call q.Len() once
2020-09-16 10:15:41 -07:00
Zbynek Roubalik 2d4efecc6b
bump to k8s 1.18 (#1428)
* bump to k8s 1.18.8

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* plumbing ctx through

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* add more ctx plumbing

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>

* ctx WithCancel()

Signed-off-by: Zbynek Roubalik <zroubali@redhat.com>
2020-09-11 07:54:00 -07:00
Victor Agababov 9c75061487
Make controller tests reentrant (#1671)
The stats tests were completely wonky, checking some random things,
and despite an attempt to make them reentrant, they weren't.

So fix that \o/!
             |
             /
2020-08-31 14:06:15 -07:00
Victor Agababov 55cef32af0
Remove code duplication by reusing funcs (#1669)
and consts
2020-08-31 12:11:15 -07:00
Victor Agababov dba58d1d78
Modernize the controller tests (#1661)
- remove mutexes where we can get by with an atomic
- make things private that are not public
- etc
2020-08-28 21:54:07 -07:00
Victor Agababov 34838c4559
Remove all ClearAll calls from pkg UTs (#1654)
We need this in order to deprecate the function.
Serving is already free of those.
2020-08-27 13:13:07 -07:00
Antoine Cotten c56f5e203b
Reduce delays in controller tests (#1603)
* TestEnqueue: remove unnecessary calls to Sleep

The rate limiter applies only when multiple items are put onto the
workqueue, which is not the case in those tests.

Execution: ~7.6s -> ~2.1s

* TestEnqueueAfter: remove assumptions on execution times

Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.

Execution: ~1s -> ~0.05s

* TestEnqueueKeyAfter: remove assumptions on execution times

Instead of sleeping for a conservative amount of time, keep watching the
state of the workqueue in a goroutine, and notify the test logic as soon
as the item is observed.

Execution: ~1s -> ~0.05s

* TestStartAndShutdownWithErroringWork: remove sleep

Instead of sleeping for a conservative amount of time, keep watching the
number of requeues in a goroutine, and notify the test logic as
soon as the expected threshold is reached.

Logs, for an idea of timings
----------------------------
Started workers
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 104µs     {"knative.dev/key": "bar"}
Processing from queue bar (depth: 0)
Reconcile error {"error": "I always error"}
Requeuing key bar due to non-permanent error (depth: 0)
Reconcile failed. Time taken: 48.2µs    {"knative.dev/key": "bar"}

Execution: ~1s -> ~0.01s

* TestStart*/TestRun*: reduce sleep time

There is no need to sleep for that long. If an error was returned, it
would activate the second select case immediately.

Execution: ~1s -> ~0.05s

* TestImplGlobalResync: reduce sleep time

We know the fast lane is empty in this test, so we can safely assume
immediate enqueuing of all items on the slow lane.

Logs, for an idea of timings
----------------------------
Started workers
Processing from queue foo/bar (depth: 0)
Reconcile succeeded. Time taken: 11.5µs {"knative.dev/key": "foo/bar"}
Processing from queue bar/foo (depth: 1)
Processing from queue fizz/buzz (depth: 0)
Reconcile succeeded. Time taken: 9.7µs  {"knative.dev/key": "fizz/buzz"}
Reconcile succeeded. Time taken: 115µs  {"knative.dev/key": "bar/foo"}
Shutting down workers

Execution: ~4s -> ~0.05s

* review: Replace for/select with PollUntil

* review: Remove redundant duration multiplier

* review: Replace defer with t.Cleanup
2020-08-26 10:09:06 -07:00
Victor Agababov d5c09d2aef
Optimize and clean the controller code (#1639)
- fix outdated comments
- reorder the code to reduce the number of defer calls
- various other nits
2020-08-21 22:20:46 -07:00
Victor Agababov c30ec2ffd4
Fix the flaky test. (#1632)
The test assumes the threads would schedule in a particular way, but they don't.
But what we really care to check is that we thread in the proper RL and that it works.
We don't need to check that underlying queue impl works, that's done in its own tests.
So just verify these two things.
2020-08-18 17:06:13 -07:00
Matt Moore 3ac62a93ca
Fix the logkey.Key tagging in Enqueue. (#1627)
Debugging https://github.com/knative-sandbox/net-contour/issues/214 I noticed the logging was showing up as:

```
"knative.dev/key":{"knative.dev/key":"serving-tests/service-create-and-update-teaoccuu"}
```
2020-08-16 11:49:06 -07:00
Victor Agababov 0ecf6f86c1
Fix the debug->debugf (#1609)
this is a shame cube situation
2020-08-11 14:58:05 -07:00
Victor Agababov 62f2560aa7
Also log the key we're enqueueing when printing debug info (#1594) 2020-08-06 16:07:28 -07:00
Jon Donovan 7be5c0a87b
Allow creating controller with custom RateLimiter. (#1546)
* Allow creating controller with custom RateLimiter.

Which was possible before via field modification.
Not switching to a builder pattern mostly for speed of resolution.
Happy to consider alternatives.

* Add tests for new functionality.

Specifically, these test that the Wait() function is notified about
the item, and that the RateLimiter is passed through to the queue.

* Add Options. Gophers love Options.

* Even moar controller GenericOptions.

* Attempt to appease lint, don't create struct for typecheck.

* GenericOptions -> ControllerOptions

* Public struct fields.
2020-08-03 14:31:28 -07:00
Markus Thömmes 58be631c12
Avoid copying object if TypeMeta is already correct. (#1561)
* Avoid copying object if TypeMeta is already correct.

* Fix typos.

* Add test for the new path.
2020-07-28 00:09:58 -07:00
Victor Agababov 557b6826ef
Reorder statements for more precise behavior and harden the test (#1534) 2020-07-21 14:14:53 -07:00
Victor Agababov 08156c67f6
Use slow lane to do global resync (#1528)
* Use slow lane to do global resync

* cmt

* yolo

* yolo v2

* fix log str

* fixes

* publicize things

* renamemove
2020-07-21 13:11:54 -07:00
Victor Agababov 1cea86c85f
Use two lane queue instead of the regular workqueue (#1514)
* Use two lane queue instead of the regular workqueue

- we need to poll for len in the webhook tests because we have async propagation now, and a check at the wrong time would give an incorrect result.
- otherwise just a drop in replacement.

* update test

* cmt

* tests hardened
2020-07-19 14:01:34 -07:00
Victor Agababov 25be382806
Add more time for the checks (#1516)
* Add more time for the checks
On the prow clusters which are usually strapped for CPU this fails due to wrong order of things

* harden
2020-07-17 16:03:33 -07:00
Victor Agababov 97e2175a17
Implement Slowlane/FastLane queues (#1512)
* Fast/slow queue implementation

* more3
2020-07-17 09:51:34 -07:00
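
As a rough picture of the fast lane / slow lane idea named above, the sketch below drains a fast lane before touching a slow lane. It is an assumption-laden illustration only: the `twoLane` type, its buffered channels, and the non-blocking `get` method are hypothetical and are not how the commit implements its queues.

```go
package main

import "fmt"

// twoLane is a hypothetical queue with a prioritized fast lane and a
// slow lane that is only served when the fast lane is empty.
type twoLane struct {
	fast chan string
	slow chan string
}

func newTwoLane(size int) *twoLane {
	return &twoLane{
		fast: make(chan string, size),
		slow: make(chan string, size),
	}
}

// get returns the next key without blocking, preferring the fast lane.
// The second result is false when both lanes are empty.
func (q *twoLane) get() (string, bool) {
	select {
	case k := <-q.fast:
		return k, true
	default:
	}
	select {
	case k := <-q.slow:
		return k, true
	default:
		return "", false
	}
}

func main() {
	q := newTwoLane(8)
	// Bulk work (for example a global resync) goes to the slow lane.
	q.slow <- "global-resync/a"
	q.slow <- "global-resync/b"
	// A regular event goes to the fast lane and is served first.
	q.fast <- "foo/bar"

	for {
		k, ok := q.get()
		if !ok {
			break
		}
		fmt.Println(k) // prints foo/bar, then global-resync/a, then global-resync/b
	}
}
```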
Lance Liu e863db0344
Remove key in tags to reduce metrics count (#1494)
* Remove key in tags to reduce metrics count
Issue: https://github.com/knative/serving/issues/8609

Signed-off-by: Lance Liu <xuliuxl@cn.ibm.com>

* remove tag key for OpenCensus

Signed-off-by: Lance Liu <xuliuxl@cn.ibm.com>
2020-07-15 01:29:32 -07:00
Victor Agababov 16eea5bd5b
remove the ResourceLock field from the pkg/LE (#1482)
we only use a single possible way, so no need to have, parse and validate
a field that is in effect a constant.
2020-07-14 09:15:18 -07:00
Matt Moore a81727701f
Enable leader election by default. (#1476)
* Enable HA by default.

This consolidates the core of sharedmain around the new leaderelection logic, which will now be **enabled by default**.

This can now be disabled with `--disable-ha` or by passing `sharedmain.WithHADisabled(ctx)` to `sharedmain.MainWithConfig`.

* vagababov comments, build failure

* Open an issue for enabledComponents removal.

* Move the configmap watcher startup.

This race was uncovered by the chaos duck on knative/serving!  When we have enabled a feature flag, e.g. multi-container, and the webhook pods are restarted, there is a brief window where the webhook is up and healthy before the configmaps have synchronized and the new webhook pod realizes the feature is enabled.

* Drop the import alias
2020-07-13 12:43:18 -07:00
Markus Thömmes a92c682188
Add an option to skip automated status updates in a reconciler. (#1456)
* Add an option to skip automated status updates in a reconciler.

This option is necessary to be able to create reconcilers like Serving's labeler, that is purely adding labels to resources. If that fails, the new automated observed generation handling changes the status and that gets written to the API currently, which is not desired.

* Flip the bool.
2020-06-30 08:02:29 -07:00
Markus Thömmes 09d5e09da8
Assorted linting fixes. (#1443) 2020-06-24 12:11:27 -07:00
Weston Haught 602857dcc5
add self to aliases and add reviewers to OWNERS (#1409)
* add self to aliases and add reviewers to OWNERS

* fix typo
2020-06-22 12:30:27 -07:00
Matt Moore 7df8fc5d77
Implement the first wave of per-reconciler leaderelection. (#1301)
* Implement the first wave of per-reconciler leaderelection.

Detailed design: https://docs.google.com/document/d/1i_QHjQO2T3SNv49xjZLWlivcc0UvZN1Tbw2NKxThkyM/edit#
Issue: https://github.com/knative/pkg/issues/1181

* Feedback from vagababov

* Feedback from yanweiguo

* Drop IsLeaderFor from the LeaderAware interface.

* Moar vagababov nits

* dprotaso feedback

* Add issue comment, error return

* Incorporate dprotaso test feedback
2020-06-18 19:07:25 -07:00
Antoine Cotten 82fe339a5e
Implement Unwrap() and Is() for permanentErrors (#1363)
* Implement Unwrap() for permanentErrors

* Implement Is() for permanentErrors
2020-06-02 13:27:18 -07:00
Yanwei Guo 19b1d7b64d
Add a helper func to set a default metric config for unit tests (#1263)
* do not record for empty metric config

* Revert "do not record for empty metric config"

This reverts commit 539a5e4dbb.

* add a comment

* fix typo

* fix tests

* revert

* revert tests

* revert

* fix conflicts

* one more test file
2020-05-07 21:11:45 -07:00
Victor Agababov 66f1d63f10
Fix logging around pkg (#1310)
- use more performant functions (mostly remove formatters where possible)
- move zap.Error() to the call site, rather than creating a new sugared logger with a key attached (not cheap)
- fix *w usages where the key was not provided.
2020-05-07 15:00:45 -07:00
Yanwei Guo 66c1551b08
Use helper func from metricstest package for the unit test (#1268)
* do not record for empty metric config

* Revert "do not record for empty metric config"

This reverts commit 539a5e4dbb.

* use metricstest package for test
2020-05-07 12:04:45 -07:00
Markus Thömmes 0e0f650dfa
Fix deprecation comment. (#1279) 2020-05-04 00:13:43 -07:00
Shashwathi 93be3f499d
Add ability to override controller agent name (#1240)
* Add the ability to override controller agent name

- controller can set agent name via context. If nothing is set then
controller falls back to default controller agent name

* Check if context name is not nil and can be cast to string before returning.

* Create EventRecorder after we determine the controller agent name

* Address comments

Signed-off-by: Andrew Su <asu@pivotal.io>

* Remove check for if recorder is nil

Co-authored-by: Andrew Su <asu@pivotal.io>
2020-05-02 10:00:43 -07:00
Markus Thömmes 47137cdc30
Explicitly name controller filters. (#1257)
* Explicitly name controller filters.

Every time I read the generic "Filter" or "FilterGroupVersionKind" my brain needs to do an extra roundtrip to realize that this actually filters on the **controller** having that GVK/GK. This adds new functions that explicitly state that to avoid that roundtrip.

Old functions are just deprecated so this can be rolled out without a downstream break.

* Rename after review.
2020-04-29 10:18:42 -07:00
Markus Thömmes d29cf98a77
Assorted linting fixes. (#1249)
* Remove unused code.

* Use raw strings to avoid escaping.

* Remove unneeded type conversions.

* Preallocate slices where possible.

* Use semantic equality in psbinding reconciler.
2020-04-28 08:20:51 -07:00
Matt Moore 7b6e21a57a
Change StartAll to take context. (#1247)
* Change StartAll to take context.

This has bugged me since we started using `ctx`, which contains a `stopCh` of sorts as `Done()`.  This is somewhat for consistency, but by using `ctx` explicitly we enable ourselves to take advantage of more contextual information.

I did a quick scan of call sites and the good news is that the `sharedmain` change should be the place through which the vast majority of calls occur, however, the one outlier here is the KPA which calls this manually.  I will stage a PR to manually import pkg into serving to fix this once this lands.

* Add a Run shim for back-compat
2020-04-25 16:21:49 -07:00
Markus Thömmes 6103dd9b71
Add a controller option to specify a custom finalizer name. (#1230) 2020-04-22 14:55:40 -07:00
Ville Aikas 4e57475bc8
add EnqueueNamespaceOf (#1217)
* add EnqueueNamespaceOf

* correct log error msg
2020-04-10 18:11:07 -07:00
Scott Nichols d93ce78496
[Reconciler Generators] Adding support for configStore.ToContext (#1085)
* Support optional config maps.

* document configmap stores

* whitespace.

* optionsFns

* review

* check for nil.

* zero trust imports.
2020-02-12 16:10:35 -08:00
Dave Protasowski e3d924ba00
allow filtering on schema.GroupKind (#1080)
* allow filtering on schema.GroupKind

In addition deprecated usage of Filter with the introduction of
FilterGroupVersionKind

* reduce nesting & simplify boolean logic
2020-02-12 12:12:18 -08:00
Victor Agababov 41aec11a3c
Use new RecordBatch method to join metric reporting (#1029)
* Use new RecordBatch method to join metric reporting

* review
2020-02-03 16:27:30 -08:00
Maisem Ali 64ed9fcf84 add EnqueueSentinel to pkg/controller (#841)
* add EnqueueSentinel to pkg/controller

* address comments
2019-11-04 16:22:20 -08:00
Markus Thömmes 56c2594e4f Assorted linting fixes. (#840)
* Remove unused code.

* Remove unneeded loops.

* Remove unneeded Printf calls.

* Use time.Since instead of time.Now().Sub.

* Remove unused values.

* Rename error variable according to conventions.

* Return error last.

* Simplify array allocations.

* Remove leaky ticker.

* Remove Yoda conditions.

* Remove deprecated function to talk to GKE.

* Remove dot import.

* Remove empty critical section and replace with a channel operation.

* Add linter directives to explicitly state wanted weirdness.

* Update deps.

* Fix broken line.
2019-11-01 12:49:12 -07:00
Matt Moore 809ce573e4 Add FilterByName for cluster-scoped resources. (#816)
This is a precursor to reconciling named webhook configurations, and largely a copy of `FilterByNameAndNamespace`.
2019-10-28 10:41:42 -07:00
Slavomir Kaslev 29642b017b Add RunInformers function (#758)
Add RunInformers, which is similar to the StartInformers function but allows
users to wait for informers to finish running.

This function will be mainly used in tests to fix the race described in
knative/serving#5351
2019-10-14 10:02:32 -07:00
Markus Thömmes de32ec136d Fix a subtle bug with cluster scoped entities. (#708) 2019-09-20 12:26:07 -07:00
Markus Thömmes 4a790dd36c Plumb through a structured key, keep current behavior. (#703)
* Plumb through a structured key, keep current behavior.

* Rename variable.
2019-09-20 06:52:05 -07:00
Matt Moore e4ac97c252 Update our dependency on K8s libs to 1.15.3 (#686)
With a minimum K8s version of 1.14 (starting in 0.10), 1.15.3 puts us in the center of the +/-1 version window of support.
2019-09-18 13:36:48 -07:00
Matt Moore 3c828cf99f Hook into two other Kubernetes metric subsystems. (#682)
This adds logic to hook into two other metric systems:
1. `cache.SetReflectorMetricsProvider`, which doesn't seem hooked up in Kubernetes yet, but would theoretically give us metrics about the mechanisms underpinning informers.
2. `metrics.Register`, which hooks us into the rest client infrastructure to give us metrics about low-level API server calls.

Fixes: https://github.com/knative/pkg/issues/679
Fixes: https://github.com/knative/pkg/issues/680
2019-09-16 10:46:43 -07:00