Commit Graph

27 Commits

Author SHA1 Message Date
Mike Spreitzer 770f2e1fa4 apiserver: finish implementation of borrowing in APF
Also make some design changes exposed in testing and review.

Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers
though it is too early.  This creates a problem, that metric can not
keep both of its old meanings.  I chose the configured concurrency
limit.

Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking.  The current design in the KEP is
as follows.

> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.

But this does not work out well at server startup.  As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state.  As always,
the two mandatory priority levels are implicitly added whenever they
are not read.  So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design.  So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels.  Its Target is higher than
all the others, once they start to show up.  It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.

I have considered the following fix.  The idea is that the designed
initialization is not appropriate before all the default objects are
read.  So the fix is to have a mode bit in the controller.  In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit.  In the later mode,
the seat demand tracking variables are initialized as originally
designed.

However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.

So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero.  Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.

Also: revise logging logic, to log at numerically lower V level when
there is a change.

Also: bug fix in float64close.

Also, separate imports in some file

Co-authored-by: Han Kang <hankang@google.com>

Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac
2022-10-31 16:13:25 -07:00
Mike Spreitzer 084f1abd96 apiserver: define metrics for API Priority and Fairness borrowing
Kubernetes-commit: ba5ec78916ae5fe9e400a298da6879515029a12f
2022-10-31 15:09:39 -07:00
Mike Spreitzer eb15930b31 Fix APF metric denominator problems
Co-authored-by: JUN YANG <yang.jun22@zte.com.cn>

Kubernetes-commit: fdd921cad0cd9308ec62c1b86c9c1cc5d12e5d21
2022-05-22 23:39:49 -04:00
Mike Spreitzer 959fbf9f84 Use timing ratio histograms instead of sample-and-watermark histograms
Kubernetes-commit: 0c0b7ca49f9ade72b990bf3a6f568485586af8b4
2022-05-18 02:56:48 -04:00
Mike Spreitzer 0796534fe5 Remove the PairVec types
Kubernetes-commit: 1f1cfba2a3fb35a8542bbf64a46746214355674c
2022-06-11 00:57:19 -04:00
Mike Spreitzer 0f5737dda8 Remove unhelpful pairing of members of read_vs_write_request_count_samples
Members are not used in (waiting,executing) pairs, so stopped
using the wrapper that adds such pairing.

Kubernetes-commit: cd33c7cf2260b351dd345497223a944e80bc7b61
2022-05-22 22:39:06 -04:00
Mike Spreitzer cae328fb1c Give apf metrics abstractions more familiar names
The logic is similar to Prometheus gauges and vectors,
adopt that terminology.

Kubernetes-commit: 7d64a93a1407f91b5e13bf540a0fa834a41622eb
2022-05-17 23:27:47 -04:00
Abu Kashem 44e5395e0e apf: add metric to track dispatch with no accommodation
Kubernetes-commit: 30c0485e0cba3ec6b19e092e7e78059b3fd4f18c
2021-11-23 10:55:31 -05:00
Mike Spreitzer 6adfddf535 Clarify APF metric wrt all three stages of execution
Kubernetes-commit: 88f8e8448bf873cf41035cb858422a10a1d03018
2021-11-30 11:45:53 -05:00
Abu Kashem 6bd59a523a apf: add a metric to count seat samples
Kubernetes-commit: bb15bdf15c1cc4d5a4380f3f6ed46d4adc9662a1
2021-11-23 11:36:09 -05:00
Abu Kashem 40993f6319 apf: add new label for request_execution_seconds metric
Kubernetes-commit: 54439e934371a3018f49e629cdc68f0944e08af0
2021-10-06 11:55:12 -04:00
Mike Spreitzer 56b220f8cd Add metrics about watch counts seen by APF
Kubernetes-commit: 154bf6aab33c2486a9066f66ab3a056c1095cb9a
2021-10-25 03:31:47 -04:00
Mike Spreitzer 6a2631848c Add sample-and-watermark for seats occupied during all of execution
Kubernetes-commit: 945f960cfb8fc018b093c1a08e5d4cdd362b1fc6
2021-10-25 01:13:52 -04:00
Mike Spreitzer 5283383fb5 Clarify metrics help wrt APF execution phases
Kubernetes-commit: d7a3bf0d260a0c291941cda68492f10e5010ac91
2021-10-24 22:32:13 -04:00
Mike Spreitzer f7bfb170d7 Keep the progress meter R from overflowing
Also add test for that situation.

Kubernetes-commit: a797fbd96de8c67aaed58aef54fbe9f0eb94a2c2
2021-10-01 22:04:05 -07:00
Mike Spreitzer e417abf592 Migrate apiserver/pkg/util/flowcontrol to use k8s.io/utils/clock
.. instead of apimachinery/pkt/util/clock

Kubernetes-commit: 9f45c0f8c07cd0adfe38c887aa618d33b8a4ee1c
2021-09-17 15:14:42 -04:00
Mike Spreitzer d28ccb4224 Add APF metrics about R(t)
Kubernetes-commit: 676f0450ed37eeec92b67246719cc46e7567e512
2021-06-14 16:48:27 -04:00
Abu Kashem 345d1c6ff9 apf: add a gauge for the number of seats currently in use
Kubernetes-commit: c710f99ef730a791a6911e63cc3b9d26cced6bd3
2021-06-10 17:34:50 -04:00
yoyinzyc 1a8abfc56f add context to metrics in util/flowcontrol.
Kubernetes-commit: 57d0bc301a017c41d890baee0a3a287f448c664d
2020-12-16 17:08:43 -08:00
Adhityaa Chandrasekar 8b21b5725d APF metrics: set StabilityLevel to ALPHA
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>

Kubernetes-commit: b16f36b251ddbfef5f12fed58640de53512631f0
2020-11-05 15:35:39 +00:00
Adhityaa Chandrasekar ebe254b2e6 APF: use snake_case in metric labels
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>

Kubernetes-commit: f9d57a8d5db3e58f79a1b1958d80c049c63d6cde
2020-11-04 22:19:52 +00:00
Mike Spreitzer e28ab56bd4 Introduce more metrics on concurrency
Introduce min, average, and standard deviation for the number of
executing mutating and readOnly requests.

Introduce min, max, average, and standard deviation for the number
waiting and number waiting per priority level.

Later:

Revised to use a series of windows

Use three individuals instead of array of powers

Later:

Add coarse queue count metrics, removed windowed avg and stddev

Add metrics for number of queued mutating and readOnly requests,
to complement metrics for number executing.

Later:

Removed windowed average and standard deviation because consumers can
derive such from integrals of consumer's chosen window.

Also replaced "requestKind" Prometheus label with "request_kind".

Later:

Revised to focus on sampling

Make the clock intrinsic to a TimedObserver

... so that the clock can be read while holding the observer's lock;
otherwise, forward progress is not guaranteed (and violations were
observed in testing).

Bug fixes and histogram buckets revision

SetX1 to 1 when queue length limit is zero, beause dividing by zero is nasty.

Remove obsolete argument in gen_test.go.

Add a bucket boundary at 0 for sample-and-water-mark histograms, to
distinguish zeroes from non-zeros.

This includes adding Integrator test.

Simplified test code.

More pervasively used "ctlr" instead of "ctl" as abbreviation for
"controller".

Kubernetes-commit: 57ecea22296797a93b0157169db0ff2e477f58d0
2020-05-17 01:02:25 -04:00
Mike Spreitzer 9df60c9fe6 Renaming: "Change" -> "Add" for consistency with underlying method
Kubernetes-commit: c7b098ac6c276d65a79db6cfeb04f5f0f86eb315
2020-03-05 15:17:33 -05:00
Mike Spreitzer 6ae3e470a2 Make some metrics finer-grained, add dispatch counts, note immediate reject
Also add testing of metrics for queuesets.

Kubernetes-commit: f535a9c9ed4b6a0def47c354acad0ac2a8f961b0
2020-03-01 20:22:58 -05:00
yue9944882 f452a698b0 register metrics from comp-base
Kubernetes-commit: 11656478be93d4a9e54129ec35cd2b9558e901ac
2020-02-27 17:04:17 +08:00
yue9944882 f93a7a8312 homogenize metrics naming
Kubernetes-commit: a1523a049ff9fc47d7dc2c4354b16b69d2eb4be2
2020-02-19 16:34:49 +08:00
Aaron Prindle a222f282e1 fairqueuing implementation with unit tests
Kubernetes-commit: 24065cf5be6bed995da7b7abb37ee78ff95230f0
2019-10-29 21:54:16 -07:00