Commit Graph

90 Commits

Author SHA1 Message Date
Mike Spreitzer 770f2e1fa4 apiserver: finish implementation of borrowing in APF
Also make some design changes exposed in testing and review.

Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
though it is too early.  This creates a problem, that metric can not
keep both of its old meanings.  I chose the configured concurrency
limit.

Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking.  The current design in the KEP is
as follows.

> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.

But this does not work out well at server startup.  As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state.  As always,
the two mandatory priority levels are implicitly added whenever they
are not read.  So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design.  So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels.  Its Target is higher than
all the others, once they start to show up.  It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.

I have considered the following fix.  The idea is that the designed
initialization is not appropriate before all the default objects are
read.  So the fix is to have a mode bit in the controller.  In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit.  In the later mode,
the seat demand tracking variables are initialized as originally
designed.

However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.

So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero.  Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.

Also: revise logging logic, to log at numerically lower V level when
there is a change.

Also: bug fix in float64close.

Also, separate imports in some file

Co-authored-by: Han Kang <hankang@google.com>

Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac
2022-10-31 16:13:25 -07:00
Mike Spreitzer 084f1abd96 apiserver: define metrics for API Priority and Fairness borrowing
Kubernetes-commit: ba5ec78916ae5fe9e400a298da6879515029a12f
2022-10-31 15:09:39 -07:00
Mike Spreitzer 413be63b46 Add instrumentation for seat borrowing
Kubernetes-commit: 9b684579e230f105bcaa743f06bc07c39af703df
2022-10-20 15:21:09 -04:00
Mike Spreitzer 3419387b18 Call queueSet::boundNextDispatchLocked enough
Fix the one path where boundNextDispatchLocked was not being called
after modifying a queue.

Also check for negative work in a request.

These are motivated by
https://github.com/kubernetes/kubernetes/issues/112169 but I do not
have a way to reproduce it and so can not check that these changes
actually remove that symptom.  But these changes are good anyway.

Kubernetes-commit: 6ee93e2cee695203a6ce4935da1b9a807b624260
2022-09-01 22:54:53 -04:00
jupblb 16f776a534 Switch initial/final seats type to uint64
Kubernetes-commit: 3c46482eb09d7343e0f98a930a9aaa158237e278
2022-07-28 10:48:40 +02:00
Davanum Srinivas 7e94033a61 Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>

Kubernetes-commit: a9593d634c6a053848413e600dadbf974627515f
2022-07-19 20:54:13 -04:00
Mike Spreitzer eb15930b31 Fix APF metric denominator problems
Co-authored-by: JUN YANG <yang.jun22@zte.com.cn>

Kubernetes-commit: fdd921cad0cd9308ec62c1b86c9c1cc5d12e5d21
2022-05-22 23:39:49 -04:00
Wojciech Tyczyński 8f7c120935 Eliminate MaintainObservations function in P&F
Kubernetes-commit: badf436ac4451590e5e84e537f2234e3632ea3b4
2021-11-25 12:44:50 +01:00
Mike Spreitzer 0796534fe5 Remove the PairVec types
Kubernetes-commit: 1f1cfba2a3fb35a8542bbf64a46746214355674c
2022-06-11 00:57:19 -04:00
Mike Spreitzer cae328fb1c Give apf metrics abstractions more familiar names
The logic is similar to Prometheus gauges and vectors,
adopt that terminology.

Kubernetes-commit: 7d64a93a1407f91b5e13bf540a0fa834a41622eb
2022-05-17 23:27:47 -04:00
Mike Spreitzer 8628966894 Fix more initial numerators
Kubernetes-commit: ba690c2257af76bd971d0dfb6bef13ff9099e549
2022-05-18 00:22:30 -04:00
Patrick Ohly ba3b8e9322 enhance and fix log calls
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:

- Logging a message only above a certain verbosity threshold without
  recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
  this matters when using a logging backend which records the verbosity
  level.

- Passing a format string with parameters to a logging function that
  doesn't do string formatting.

All of these locations where found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.

In some cases it reports false positives, but those can be suppressed with
source code comments.

Kubernetes-commit: edffc700a43e610f641907290a5152ca593bad79
2022-02-16 12:17:47 +01:00
Abu Kashem 44e5395e0e apf: add metric to track dispatch with no accommodation
Kubernetes-commit: 30c0485e0cba3ec6b19e092e7e78059b3fd4f18c
2021-11-23 10:55:31 -05:00
Patrick Ohly ec795ae204 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

   if klog.V(5).Enabled() {
       klog.Info("hello world")
   }

Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.

Kubernetes-commit: 9eaa2dc554e0c3d4485d4c916dfdbc2f517db2e0
2021-12-11 12:10:21 +01:00
Mike Spreitzer 6adfddf535 Clarify APF metric wrt all three stages of execution
Kubernetes-commit: 88f8e8448bf873cf41035cb858422a10a1d03018
2021-11-30 11:45:53 -05:00
Mike Spreitzer 4098be7694 Factored TimedObserver into less surprising pieces
Kubernetes-commit: ab64e852023965fd8873abcd50ff09cf79814d11
2021-11-15 14:59:30 -05:00
Mike Spreitzer 6a2631848c Add sample-and-watermark for seats occupied during all of execution
Kubernetes-commit: 945f960cfb8fc018b093c1a08e5d4cdd362b1fc6
2021-10-25 01:13:52 -04:00
Wojciech Tyczyński 55b43e446f P&F: move seat-seconds to a better location
Kubernetes-commit: e262db7a4daf5218520e49b423789ea55a94af75
2021-10-27 10:30:25 +02:00
Abu Kashem 3f529d0551 apf: call metrics.AddReject for decisionCancel
Kubernetes-commit: f6dcf17a10dfd2cb5ce0ea7923723547c92c2e55
2021-10-18 13:08:56 -04:00
Mike Spreitzer 5283383fb5 Clarify metrics help wrt APF execution phases
Kubernetes-commit: d7a3bf0d260a0c291941cda68492f10e5010ac91
2021-10-24 22:32:13 -04:00
Mike Spreitzer 5ab6f3fe6b Remove presumptions about what decision has been made
Kubernetes-commit: e70999becd98aa00fd00bc622ffdca0476ec7340
2021-10-18 02:10:21 -06:00
Abu Kashem b40e786ba3 apf: return nil for a request that has been removed from queue
Kubernetes-commit: cd06ba502cf85603242008cc9f61234790ede6de
2021-10-18 12:46:54 -04:00
Mike Spreitzer c5a0365136 Fix nits noticed in recent code review
Kubernetes-commit: 1844a052776bce33322ce20c11b2902403655ef8
2021-10-18 23:51:48 -05:00
Abu Kashem fb57181f8d apf: include seat information in request dump
Kubernetes-commit: 8e33a3b2cc97813724b64be36fe977290db3053a
2021-10-14 09:27:54 -04:00
Abu Kashem 88992ba87a apf: change QueueDump to use SeatSeconds
Kubernetes-commit: 3ef5752edf5b3f7076041368b1c1b5396ee534d7
2021-10-13 10:22:41 -04:00
Abu Kashem e4ae91b4a2 apf: include queue sum stats in debug
Kubernetes-commit: 87c7401eb853678da804ff503e266daeef5ad126
2021-10-13 10:14:30 -04:00
Mike Spreitzer d69d77c659 Update queueset_test.go for FinalSeats
Track the introduction of FinalSeats.

Give up on calculating expected results for tests with added latency,
because I did not find an easy and obvious way to do it.

Kubernetes-commit: 0fc595e03360ba7fc4c3e251d4b41f39172aca72
2021-10-08 22:27:39 -07:00
wojtekt c3ef02ad27 Adjust final seats if they don't fit the limit
Kubernetes-commit: c5a77d8a761b0651b53864f9e396d6f23efd01d2
2021-10-08 11:14:11 +02:00
Mike Spreitzer 56c6fd034b Unconfused logging wrt additional latency
Fixed confusion in queueSet logging wrt seats and additional latency.

Kubernetes-commit: 42f698daad73d9c010027bb60fff9728168c71da
2021-10-11 13:21:46 -07:00
Mike Spreitzer f7bfb170d7 Keep the progress meter R from overflowing
Also add test for that situation.

Kubernetes-commit: a797fbd96de8c67aaed58aef54fbe9f0eb94a2c2
2021-10-01 22:04:05 -07:00
Mike Spreitzer 487fea8e61 Update log messages in finishRequestLocked
Make them clearer and consistent.

Kubernetes-commit: 3906e187a685cebddef85226950fc36f65e8ddb4
2021-10-09 23:48:41 -07:00
Mike Spreitzer 1b1389676f Relax TestDifferentWidths
Make the margin a little wider because flakiness was reported.

Kubernetes-commit: 10326282f9d1abcd4a45b737628286d40971efea
2021-10-07 16:09:53 -07:00
Mike Spreitzer a5192405d9 Calculate the work in each request just once
Kubernetes-commit: f2c46c8f9d0b360cf913e22c222d9954b4ff9a76
2021-10-07 17:20:56 -07:00
Abu Kashem 9560ec6e92 introduce final seats for work estimate
Kubernetes-commit: 3d6cc118fee15313419bf7aa0082a2a608ec62f6
2021-09-24 15:18:27 -04:00
Mike Spreitzer dc449969cc Use SeatSeconds
Kubernetes-commit: 4b5e1398199282f471d0f332eefeb5c2415bdb01
2021-10-01 15:33:37 -07:00
Mike Spreitzer 0cb46ec2f7 Draft datatype for seat-seconds
Kubernetes-commit: d866f9445831687ab3254d754b13a4acf271882f
2021-10-01 13:29:35 -07:00
Abu Kashem 863c48fbc2 apf: rename WorkEstimate.Seats to InitialSeats
Kubernetes-commit: 5d67896adedbce27f01b59eb5f2054919a047f2b
2021-09-24 09:41:38 -04:00
Mike Spreitzer 72ff8a6261 Improve queueset sharding and dispatching
New anti-windup technique: use the request arrival time as the floor
on the virtual dispatch time.  Prevent bound violations where they
might arise rather than fixing up just one queue at dispatch time,
so that the fixed up dispatch times figure into the dispatching choice.

Two tweaks to the shuffle sharding.  Take seats of executing requests
into account as well as seats of waiting requests.  Do not always
consider the generated hand in the same order.

Rename the queueset methods that do shuffle sharding and finding the
queue to dispatch from, because the old names were confusingly
similar.

Tighten up some request margins.

Name the test cases in TestNoRestraint and TestWindup.

Kubernetes-commit: 4b9cba85874158b25b5c994773a4ec04343820c2
2021-09-20 15:45:24 -04:00
Mike Spreitzer 8d3036922c More test tweaks
Canonicalize listing of test cases.

Make TestNoRestraint try both cases: competition and none.

Kubernetes-commit: 0ee1a7b4ff9012b050bd447055ad5e1e8c57c30e
2021-09-20 15:45:24 -04:00
Mike Spreitzer c505aa64af Update TestNoRestraint and TestWindup
Make TestNoRestraint verify that fairness is NOT achieved
when there is real competition.

Make TestWindup run two cases, to show that 0.1 is too narrow
a margin and 0.26 is wide enough.

Kubernetes-commit: c4945fdf0c14ba2032a5c8edf192678d9fe00374
2021-09-17 01:40:16 -04:00
Mike Spreitzer de042674ed Widen margins of TestDifferentWidths and TestTooWide
These behavioral unit tests of queueset were failing because the
evaluation criteria were too strict.

Kubernetes-commit: 59d319ec06bb33289a87036418b4a61ed3bb215f
2021-09-09 17:07:58 -04:00
Mike Spreitzer de227d1d37 Change execution duration guess from 1 minute to 3 milliseconds
So that the width estimate has some effect but not a grossly excessive
one.

Added the fifo::Peek method to simplify the fifo client code.

Also renamed the queueSet::estimatedServiceTime field to
estimatedServiceSeconds to make the units clear.

Kubernetes-commit: a0c161f2f6908ee424ea888ff40f75ff071bd20a
2021-09-07 00:46:50 -04:00
Mike Spreitzer 7d5430cfba Fix extra latency and add tests for that and width
Added missing dispatching after delayed release of seats.

Updated logging for all six situations of execution completion and
seat release.

Added behavioral tests for non-zero extra latency and non-unit width.

Also added two tests for baseline functionality.

Also improved some comments and other logging in `queueset.go`.

Kubernetes-commit: d2a27a58f0af20c6185fa1c21890d666e9d3746b
2021-08-12 16:48:02 -04:00
Abu Kashem da50ca4c6e apf: free seats in use after additional latency
Kubernetes-commit: d68186452d9150b113489e6a722caf82f898857f
2021-06-27 13:04:20 -04:00
Mike Spreitzer 8c2108bc80 Refactor goroutine counting
Add comment outlining TestContextCancel.

Stop calling `t.Errorf` from wrong goroutine.

Package up queueNoteFn expectation checking.

Add counting of goroutine in req1 exec fn.

Remove unnecessary assignment to `_`.

Make TestContextCancel wait on fake clock, to insulate timing check
from scheduler noise.

Factor goroutine counting out of queueset.go, into queueset_test.go,
where it matters.

Refactor promise: Use a simple channel-based implementation for normal
code, a mutex-based one for testing code.

Took all the panics out of queueset.go

Shrink the timeouts in promise tests to 1 second.

Kubernetes-commit: 1db36ae3b30e30d70972998a22987a7db470479b
2021-07-29 00:35:25 -04:00
Mike Spreitzer 904cd74454 Some cleanup of the package for event clocks
Rename from `clock` to `eventclock`.

Simplify by removing the prohibition on an EventFunc suspending and
resuming activity.

Remove "EventClock" from names to avoid stuttering.

Start to consolidate test code under fairqueuing/testing/.

Kubernetes-commit: 80ca6a4ae6ff571c32962a7155efd55edefff9e6
2021-08-06 02:06:43 -04:00
Abu Kashem f013b63777 apf: use EventClock rather than a PassiveClock for queueset
Kubernetes-commit: c2a3b793d3ec62e781dd20704370d09f7e1be706
2021-07-21 17:06:48 -04:00
Mike Spreitzer 0c550377cf Introduce event clocks based on k8s.io/utils/clock
So we can move off of the apimachinery clock package.

Switch queueset to new clocks.

Removed event clocks based on apimachinery clocks,
because this PR introduces ones based on k8s.io/utils/clock .

Removed interface that is implemented by only one interesting type.

Simplify RealEventClock::EventAfterTime.

Kubernetes-commit: dcb298c9552de44e27ed52f5e2b58a0dd7cd8d54
2021-07-21 16:56:11 -04:00
wojtekt 719fda2a8b Simplify APF promise to what is really used in the code
Kubernetes-commit: 9f735e71bbb7d0dde67a718891641d8afd20a8bc
2021-07-23 13:30:34 +02:00
wojtekt b4c306e1e8 Rename width to workEstimate in P&F code
Kubernetes-commit: 73211256e8f15cf84ee69d6fe8258c3a912e0f94
2021-07-13 15:10:58 +02:00