apiserver

Commit Graph

Author	SHA1	Message	Date
Abu Kashem	d64c9b18da	apf: remove RequestWaitLimit from queueset config Kubernetes-commit: 11ef9514dad6f46a4315198978fee14132c4bbca	2023-08-29 12:11:08 -04:00
Abu Kashem	290096a4d0	apf: remove timeoutOldRequestsAndRejectOrEnqueueLocked function Kubernetes-commit: da8a472206623d0727ba486489d34780c4b6c1d9	2023-08-28 17:26:11 -04:00
Abu Kashem	27772523df	apf: refactor promise to use a context Kubernetes-commit: 0039f24d74d0f57c8ba868ae361821d37fd908d6	2023-08-21 15:19:31 -04:00
Andrew Sy Kim	066c7cb8cc	apiserver: add flow control metric current_inqueue_seats Signed-off-by: Andrew Sy Kim <andrewsy@google.com> Kubernetes-commit: fb9646fd60d4b8e79223b729c1cb54fc6818fdd1	2023-07-24 19:40:05 +00:00
Mike Spreitzer	b8bc556baa	Add tracking and reporting of executing requests Signed-off-by: Mike Spreitzer <mspreitz@us.ibm.com> Kubernetes-commit: a8a2fb317c8bc9c64ced023988802b2517d34f81	2023-06-30 22:55:35 -04:00
Andrew Sy Kim	73f18d34af	promote the following APF metrics to beta: apiserver_flowcontrol_request_wait_duration_seconds apiserver_flowcontrol_request_concurrency_in_use apiserver_flowcontrol_request_concurrency_limit apiserver_flowcontrol_rejected_requests_total apiserver_flowcontrol_dispatched_requests_total apiserver_flowcontrol_current_inqueue_requests apiserver_flowcontrol_current_executing_requests Signed-off-by: Andrew Sy Kim <andrewsy@google.com> Kubernetes-commit: 0bb419b1498a664d1dda3b487e9f15fd220ea363	2023-07-05 18:19:36 +00:00
Mike Spreitzer	078694d35d	Make QueueSet support exempt behavior; use it Signed-off-by: Mike Spreitzer <mspreitz@us.ibm.com> Kubernetes-commit: f269acd12b225f6a2dbbfae64a475f73f448b918	2023-06-28 22:55:30 -04:00
RuquanZhao	bc5f595633	fix undefined conversion Signed-off-by: Ruquan Zhao ruquan.zhao@arm.com Kubernetes-commit: 65f3454c1d926a1f119710684794bb54350ef4b1	2023-04-20 17:16:46 +08:00
Andrew Sy Kim	f86340dad2	increase expected fairness margin in TestDifferentWidths Signed-off-by: Andrew Sy Kim <andrewsy@google.com> Kubernetes-commit: 736720128824264b4246f247b9ec0d09f5383cf0	2022-10-21 11:39:11 -04:00
Mike Spreitzer	770f2e1fa4	apiserver: finish implementation of borrowing in APF Also make some design changes exposed in testing and review. Do not remove the ambiguous old metric `apiserver_flowcontrol_request_concurrency_limit` because reviewers thought it is too early. This creates a problem, that metric can not keep both of its old meanings. I chose the configured concurrency limit. Testing has revealed a design flaw, which concerns the initialization of the seat demand state tracking. The current design in the KEP is as follows: "Adjustment is also done on configuration change … For a newly introduced priority level, we set HighSeatDemand, AvgSeatDemand, and SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to zero." But this does not work out well at server startup. As part of its construction, the APF controller does a configuration change with zero objects read, to initialize its request-handling state. As always, the two mandatory priority levels are implicitly added whenever they are not read. So this initial reconfig has one non-exempt priority level, the mandatory one called catch-all --- and it gets its SmoothSeatDemand initialized to the whole server concurrency limit. From there it decays slowly, as per the regular design. So for a fairly long time, it appears to have a high demand and competes strongly with the other priority levels. Its Target is higher than all the others, once they start to show up. It properly gets a low NominalCL once other levels show up, which actually makes it compete harder for borrowing: it has an exceptionally high Target and a rather low NominalCL. I have considered the following fix. The idea is that the designed initialization is not appropriate before all the default objects are read. So the fix is to have a mode bit in the controller. In the initial state, those seat demand tracking variables are set to zero. Once the config-producing controller detects that all the default objects are pre-existing, it flips the mode bit. In the later mode, the seat demand tracking variables are initialized as originally designed. However, that still gives preferential treatment to the default PriorityLevelConfiguration objects, over any that may be added later. So I have made a universal and simpler fix: always initialize those seat demand tracking variables to zero. Even if a lot of load shows up quickly, remember that adjustments are frequent (every 10 sec) and the very next one will fully respond to that load. Also: revise logging logic, to log at numerically lower V level when there is a change. Also: bug fix in float64close. Also, separate imports in some file Co-authored-by: Han Kang <hankang@google.com> Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac	2022-10-31 16:13:25 -07:00
Mike Spreitzer	413be63b46	Add instrumentation for seat borrowing Kubernetes-commit: 9b684579e230f105bcaa743f06bc07c39af703df	2022-10-20 15:21:09 -04:00
Mike Spreitzer	3419387b18	Call queueSet::boundNextDispatchLocked enough Fix the one path where boundNextDispatchLocked was not being called after modifying a queue. Also check for negative work in a request. These are motivated by https://github.com/kubernetes/kubernetes/issues/112169 but I do not have a way to reproduce it and so can not check that these changes actually remove that symptom. But these changes are good anyway. Kubernetes-commit: 6ee93e2cee695203a6ce4935da1b9a807b624260	2022-09-01 22:54:53 -04:00
jupblb	16f776a534	Switch initial/final seats type to uint64 Kubernetes-commit: 3c46482eb09d7343e0f98a930a9aaa158237e278	2022-07-28 10:48:40 +02:00
Davanum Srinivas	7e94033a61	Generate and format files - Run hack/update-codegen.sh - Run hack/update-generated-device-plugin.sh - Run hack/update-generated-protobuf.sh - Run hack/update-generated-runtime.sh - Run hack/update-generated-swagger-docs.sh - Run hack/update-openapi-spec.sh - Run hack/update-gofmt.sh Signed-off-by: Davanum Srinivas <davanum@gmail.com> Kubernetes-commit: a9593d634c6a053848413e600dadbf974627515f	2022-07-19 20:54:13 -04:00
Mike Spreitzer	0796534fe5	Remove the PairVec types Kubernetes-commit: 1f1cfba2a3fb35a8542bbf64a46746214355674c	2022-06-11 00:57:19 -04:00
Mike Spreitzer	cae328fb1c	Give apf metrics abstractions more familiar names The logic is similar to Prometheus gauges and vectors, adopt that terminology. Kubernetes-commit: 7d64a93a1407f91b5e13bf540a0fa834a41622eb	2022-05-17 23:27:47 -04:00
Mike Spreitzer	8628966894	Fix more initial numerators Kubernetes-commit: ba690c2257af76bd971d0dfb6bef13ff9099e549	2022-05-18 00:22:30 -04:00
Mike Spreitzer	6adfddf535	Clarify APF metric wrt all three stages of execution Kubernetes-commit: 88f8e8448bf873cf41035cb858422a10a1d03018	2021-11-30 11:45:53 -05:00
Mike Spreitzer	4098be7694	Factored TimedObserver into less surprising pieces Kubernetes-commit: ab64e852023965fd8873abcd50ff09cf79814d11	2021-11-15 14:59:30 -05:00
Mike Spreitzer	6a2631848c	Add sample-and-watermark for seats occupied during all of execution Kubernetes-commit: 945f960cfb8fc018b093c1a08e5d4cdd362b1fc6	2021-10-25 01:13:52 -04:00
Wojciech Tyczyński	55b43e446f	P&F: move seat-seconds to a better location Kubernetes-commit: e262db7a4daf5218520e49b423789ea55a94af75	2021-10-27 10:30:25 +02:00
Mike Spreitzer	5283383fb5	Clarify metrics help wrt APF execution phases Kubernetes-commit: d7a3bf0d260a0c291941cda68492f10e5010ac91	2021-10-24 22:32:13 -04:00
Mike Spreitzer	c5a0365136	Fix nits noticed in recent code review Kubernetes-commit: 1844a052776bce33322ce20c11b2902403655ef8	2021-10-18 23:51:48 -05:00
Mike Spreitzer	d69d77c659	Update queueset_test.go for FinalSeats Track the introduction of FinalSeats. Give up on calculating expected results for tests with added latency, because I did not find an easy and obvious way to do it. Kubernetes-commit: 0fc595e03360ba7fc4c3e251d4b41f39172aca72	2021-10-08 22:27:39 -07:00
Mike Spreitzer	f7bfb170d7	Keep the progress meter R from overflowing Also add test for that situation. Kubernetes-commit: a797fbd96de8c67aaed58aef54fbe9f0eb94a2c2	2021-10-01 22:04:05 -07:00
Mike Spreitzer	1b1389676f	Relax TestDifferentWidths Make the margin a little wider because flakiness was reported. Kubernetes-commit: 10326282f9d1abcd4a45b737628286d40971efea	2021-10-07 16:09:53 -07:00
Mike Spreitzer	a5192405d9	Calculate the work in each request just once Kubernetes-commit: f2c46c8f9d0b360cf913e22c222d9954b4ff9a76	2021-10-07 17:20:56 -07:00
Abu Kashem	9560ec6e92	introduce final seats for work estimate Kubernetes-commit: 3d6cc118fee15313419bf7aa0082a2a608ec62f6	2021-09-24 15:18:27 -04:00
Mike Spreitzer	dc449969cc	Use SeatSeconds Kubernetes-commit: 4b5e1398199282f471d0f332eefeb5c2415bdb01	2021-10-01 15:33:37 -07:00
Abu Kashem	863c48fbc2	apf: rename WorkEstimate.Seats to InitialSeats Kubernetes-commit: 5d67896adedbce27f01b59eb5f2054919a047f2b	2021-09-24 09:41:38 -04:00
Mike Spreitzer	72ff8a6261	Improve queueset sharding and dispatching New anti-windup technique: use the request arrival time as the floor on the virtual dispatch time. Prevent bound violations where they might arise rather than fixing up just one queue at dispatch time, so that the fixed up dispatch times figure into the dispatching choice. Two tweaks to the shuffle sharding. Take seats of executing requests into account as well as seats of waiting requests. Do not always consider the generated hand in the same order. Rename the queueset methods that do shuffle sharding and finding the queue to dispatch from, because the old names were confusingly similar. Tighten up some request margins. Name the test cases in TestNoRestraint and TestWindup. Kubernetes-commit: 4b9cba85874158b25b5c994773a4ec04343820c2	2021-09-20 15:45:24 -04:00
Mike Spreitzer	8d3036922c	More test tweaks Canonicalize listing of test cases. Make TestNoRestraint try both cases: competition and none. Kubernetes-commit: 0ee1a7b4ff9012b050bd447055ad5e1e8c57c30e	2021-09-20 15:45:24 -04:00
Mike Spreitzer	c505aa64af	Update TestNoRestraint and TestWindup Make TestNoRestraint verify that fairness is NOT achieved when there is real competition. Make TestWindup run two cases, to show that 0.1 is too narrow a margin and 0.26 is wide enough. Kubernetes-commit: c4945fdf0c14ba2032a5c8edf192678d9fe00374	2021-09-17 01:40:16 -04:00
Mike Spreitzer	de042674ed	Widen margins of TestDifferentWidths and TestTooWide These behavioral unit tests of queueset were failing because the evaluation criteria were too strict. Kubernetes-commit: 59d319ec06bb33289a87036418b4a61ed3bb215f	2021-09-09 17:07:58 -04:00
Mike Spreitzer	de227d1d37	Change execution duration guess from 1 minute to 3 milliseconds So that the width estimate has some effect but not a grossly excessive one. Added the fifo::Peek method to simplify the fifo client code. Also renamed the queueSet::estimatedServiceTime field to estimatedServiceSeconds to make the units clear. Kubernetes-commit: a0c161f2f6908ee424ea888ff40f75ff071bd20a	2021-09-07 00:46:50 -04:00
Mike Spreitzer	7d5430cfba	Fix extra latency and add tests for that and width Added missing dispatching after delayed release of seats. Updated logging for all six situations of execution completion and seat release. Added behavioral tests for non-zero extra latency and non-unit width. Also added two tests for baseline functionality. Also improved some comments and other logging in `queueset.go`. Kubernetes-commit: d2a27a58f0af20c6185fa1c21890d666e9d3746b	2021-08-12 16:48:02 -04:00
Abu Kashem	da50ca4c6e	apf: free seats in use after additional latency Kubernetes-commit: d68186452d9150b113489e6a722caf82f898857f	2021-06-27 13:04:20 -04:00
Mike Spreitzer	8c2108bc80	Refactor goroutine counting Add comment outlining TestContextCancel. Stop calling `t.Errorf` from wrong goroutine. Package up queueNoteFn expectation checking. Add counting of goroutine in req1 exec fn. Remove unnecessary assignment to `_`. Make TestContextCancel wait on fake clock, to insulate timing check from scheduler noise. Factor goroutine counting out of queueset.go, into queueset_test.go, where it matters. Refactor promise: Use a simple channel-based implementation for normal code, a mutex-based one for testing code. Took all the panics out of queueset.go Shrink the timeouts in promise tests to 1 second. Kubernetes-commit: 1db36ae3b30e30d70972998a22987a7db470479b	2021-07-29 00:35:25 -04:00
Mike Spreitzer	904cd74454	Some cleanup of the package for event clocks Rename from `clock` to `eventclock`. Simplify by removing the prohibition on an EventFunc suspending and resuming activity. Remove "EventClock" from names to avoid stuttering. Start to consolidate test code under fairqueuing/testing/. Kubernetes-commit: 80ca6a4ae6ff571c32962a7155efd55edefff9e6	2021-08-06 02:06:43 -04:00
Mike Spreitzer	0c550377cf	Introduce event clocks based on k8s.io/utils/clock So we can move off of the apimachinery clock package. Switch queueset to new clocks. Removed event clocks based on apimachinery clocks, because this PR introduces ones based on k8s.io/utils/clock . Removed interface that is implemented by only one interesting type. Simplify RealEventClock::EventAfterTime. Kubernetes-commit: dcb298c9552de44e27ed52f5e2b58a0dd7cd8d54	2021-07-21 16:56:11 -04:00
wojtekt	b4c306e1e8	Rename width to workEstimate in P&F code Kubernetes-commit: 73211256e8f15cf84ee69d6fe8258c3a912e0f94	2021-07-13 15:10:58 +02:00
Abu Kashem	cf5c77fde9	apf: add additional latency into width Kubernetes-commit: 24e19229101d242d924ce98a562be3864dde9eae	2021-06-27 12:45:24 -04:00
Abu Kashem	e1aec4ecae	apf: take seats into account when dispatching request Kubernetes-commit: ff716cef508f948b50e1026e980e6df5ee475538	2021-06-14 12:19:06 -04:00
Abu Kashem	345d1c6ff9	apf: add a gauge for the number of seats currently in use Kubernetes-commit: c710f99ef730a791a6911e63cc3b9d26cced6bd3	2021-06-10 17:34:50 -04:00
Abu Kashem	3c7f54740f	apf: add plumbing to estimate "width" of a request - add plumbing that allows us to estimate "width" of a request - the default implementation returns 1 as the "width" of all incoming requests, this is in keeping with the current behavior. Kubernetes-commit: 9b72eb1929a64b9d5a5234090a631ba312fb4d41	2021-05-11 07:03:05 -04:00
Abu Kashem	eea0d66fcd	clean up executing request on panic Kubernetes-commit: 13cedca0eb5337b13e5176983ea5e784ec38df22	2020-12-10 12:57:21 -05:00
Adhityaa Chandrasekar	ebe254b2e6	APF: use snake_case in metric labels Signed-off-by: Adhityaa Chandrasekar <adtac@google.com> Kubernetes-commit: f9d57a8d5db3e58f79a1b1958d80c049c63d6cde	2020-11-04 22:19:52 +00:00
yue9944882	5474822749	fixes max-min fairness Kubernetes-commit: fd889ec8ae37437a9e75386542291bd0e2cc605e	2020-10-29 18:57:38 +08:00
Ken Sipe	32533315c9	fix S1000 simplify ch switch cases Signed-off-by: Ken Sipe <kensipe@gmail.com> Kubernetes-commit: 268c2f81c7ab94cbab68a8d6c00725144b81fa09	2020-06-26 10:45:30 -05:00
Mike Spreitzer	e28ab56bd4	Introduce more metrics on concurrency Introduce min, average, and standard deviation for the number of executing mutating and readOnly requests. Introduce min, max, average, and standard deviation for the number waiting and number waiting per priority level. Later: Revised to use a series of windows Use three individuals instead of array of powers Later: Add coarse queue count metrics, removed windowed avg and stddev Add metrics for number of queued mutating and readOnly requests, to complement metrics for number executing. Later: Removed windowed average and standard deviation because consumers can derive such from integrals of consumer's chosen window. Also replaced "requestKind" Prometheus label with "request_kind". Later: Revised to focus on sampling Make the clock intrinsic to a TimedObserver ... so that the clock can be read while holding the observer's lock; otherwise, forward progress is not guaranteed (and violations were observed in testing). Bug fixes and histogram buckets revision SetX1 to 1 when queue length limit is zero, because dividing by zero is nasty. Remove obsolete argument in gen_test.go. Add a bucket boundary at 0 for sample-and-water-mark histograms, to distinguish zeroes from non-zeros. This includes adding Integrator test. Simplified test code. More pervasively used "ctlr" instead of "ctl" as abbreviation for "controller". Kubernetes-commit: 57ecea22296797a93b0157169db0ff2e477f58d0	2020-05-17 01:02:25 -04:00


64 Commits