Also make some design changes exposed in testing and review.
Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
though it is too early. This creates a problem, that metric can not
keep both of its old meanings. I chose the configured concurrency
limit.
Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking. The current design in the KEP is
as follows.
> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.
But this does not work out well at server startup. As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state. As always,
the two mandatory priority levels are implicitly added whenever they
are not read. So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design. So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels. Its Target is higher than
all the others, once they start to show up. It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.
I have considered the following fix. The idea is that the designed
initialization is not appropriate before all the default objects are
read. So the fix is to have a mode bit in the controller. In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit. In the later mode,
the seat demand tracking variables are initialized as originally
designed.
However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.
So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero. Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.
Also: revise logging logic, to log at numerically lower V level when
there is a change.
Also: bug fix in float64close.
Also, separate imports in some file
Co-authored-by: Han Kang <hankang@google.com>
Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac
Fix the one path where boundNextDispatchLocked was not being called
after modifying a queue.
Also check for negative work in a request.
These are motivated by
https://github.com/kubernetes/kubernetes/issues/112169 but I do not
have a way to reproduce it and so can not check that these changes
actually remove that symptom. But these changes are good anyway.
Kubernetes-commit: 6ee93e2cee695203a6ce4935da1b9a807b624260
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Kubernetes-commit: a9593d634c6a053848413e600dadbf974627515f
Members are not used in (waiting,executing) pairs, so stopped
using the wrapper that adds such pairing.
Kubernetes-commit: cd33c7cf2260b351dd345497223a944e80bc7b61
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations where found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
Kubernetes-commit: edffc700a43e610f641907290a5152ca593bad79
This reverts commit 83ca74541216405323ddfb67f5f80ad5717da826, reversing
changes made to 1c216c6ec86e700170620fe4c75fa3a2a2817530.
Kubernetes-commit: b0b460921b81b260473d5c393d85beeb5a03e834
This reverts commit 6faa4f001008a5a29476f5722f66430c35f48229, reversing
changes made to 33a2c50bce334467640e016f68cf19e9382ba1a7.
Kubernetes-commit: 8fb33338635565f2f755a4557b94c26039c175d9
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
if klog.V(5).Enabled() {
klog.Info("hello world")
}
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
Kubernetes-commit: 9eaa2dc554e0c3d4485d4c916dfdbc2f517db2e0
Track the introduction of FinalSeats.
Give up on calculating expected results for tests with added latency,
because I did not find an easy and obvious way to do it.
Kubernetes-commit: 0fc595e03360ba7fc4c3e251d4b41f39172aca72
New anti-windup technique: use the request arrival time as the floor
on the virtual dispatch time. Prevent bound violations where they
might arise rather than fixing up just one queue at dispatch time,
so that the fixed up dispatch times figure into the dispatching choice.
Two tweaks to the shuffle sharding. Take seats of executing requests
into account as well as seats of waiting requests. Do not always
consider the generated hand in the same order.
Rename the queueset methods that do shuffle sharding and finding the
queue to dispatch from, because the old names were confusingly
similar.
Tighten up some request margins.
Name the test cases in TestNoRestraint and TestWindup.
Kubernetes-commit: 4b9cba85874158b25b5c994773a4ec04343820c2
Canonicalize listing of test cases.
Make TestNoRestraint try both cases: competition and none.
Kubernetes-commit: 0ee1a7b4ff9012b050bd447055ad5e1e8c57c30e
Make TestNoRestraint verify that fairness is NOT achieved
when there is real competition.
Make TestWindup run two cases, to show that 0.1 is too narrow
a margin and 0.26 is wide enough.
Kubernetes-commit: c4945fdf0c14ba2032a5c8edf192678d9fe00374
These behavioral unit tests of queueset were failing because the
evaluation criteria were too strict.
Kubernetes-commit: 59d319ec06bb33289a87036418b4a61ed3bb215f
So that the width estimate has some effect but not a grossly excessive
one.
Added the fifo::Peek method to simplify the fifo client code.
Also renamed the queueSet::estimatedServiceTime field to
estimatedServiceSeconds to make the units clear.
Kubernetes-commit: a0c161f2f6908ee424ea888ff40f75ff071bd20a
Instead of a plain `Mutex`, use an `RWMutex` so that the common
operations can proceed in parallel.
Kubernetes-commit: 58927c1abede11ce7a8a74104328cf823df1b39e
The cmp comparison is relatively expensive (#104821). If we're not
going to log it, we shouldn't make the comparison.
Kubernetes-commit: f9f556dc7061df1dfc8c1628db983eeb97149317
Now tries a little harder to get to meaningful stack frames.
This revision seems to strike a balance that is useful in the queueset tests.
Kubernetes-commit: b56dd725032cb7a14aa27e4c50c1c9d7c6d23eb1
Added missing dispatching after delayed release of seats.
Updated logging for all six situations of execution completion and
seat release.
Added behavioral tests for non-zero extra latency and non-unit width.
Also added two tests for baseline functionality.
Also improved some comments and other logging in `queueset.go`.
Kubernetes-commit: d2a27a58f0af20c6185fa1c21890d666e9d3746b
Add comment outlining TestContextCancel.
Stop calling `t.Errorf` from wrong goroutine.
Package up queueNoteFn expectation checking.
Add counting of goroutine in req1 exec fn.
Remove unnecessary assignment to `_`.
Make TestContextCancel wait on fake clock, to insulate timing check
from scheduler noise.
Factor goroutine counting out of queueset.go, into queueset_test.go,
where it matters.
Refactor promise: Use a simple channel-based implementation for normal
code, a mutex-based one for testing code.
Took all the panics out of queueset.go
Shrink the timeouts in promise tests to 1 second.
Kubernetes-commit: 1db36ae3b30e30d70972998a22987a7db470479b
Rename from `clock` to `eventclock`.
Simplify by removing the prohibition on an EventFunc suspending and
resuming activity.
Remove "EventClock" from names to avoid stuttering.
Start to consolidate test code under fairqueuing/testing/.
Kubernetes-commit: 80ca6a4ae6ff571c32962a7155efd55edefff9e6