In contrast to the original HandleError and HandleCrash, the new
HandleErrorWithContext and HandleCrashWithContext functions do contextual
logging properly: if a problem occurs while e.g. handling a certain request
and WithValues was used for that request, then the error log entry also
contains that information.
The output changes from unstructured to structured, which might be a breaking
change for users who grep for panics. Care was taken to format panics as
similarly as possible to the original output.
For errors, a message string gets added. There was none before, which made it
impossible to find all error output coming from HandleError.
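As a minimal sketch of the intended call pattern (the request handling,
key names, and messages below are made up for illustration):

```go
package example

import (
	"context"
	"fmt"

	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	"k8s.io/klog/v2"
)

func handleRequest(ctx context.Context, name string) {
	// Request-scoped values attached via WithValues travel in the context;
	// HandleErrorWithContext and HandleCrashWithContext retrieve the logger
	// from there, so error and panic entries carry them as well.
	logger := klog.FromContext(ctx).WithValues("request", name)
	ctx = klog.NewContext(ctx, logger)

	// Panics get logged through the contextual logger.
	defer utilruntime.HandleCrashWithContext(ctx)

	if err := doWork(name); err != nil {
		// Unlike HandleError, an explicit message is required, which makes
		// this output easy to find.
		utilruntime.HandleErrorWithContext(ctx, err, "request processing failed")
	}
}

func doWork(name string) error { return fmt.Errorf("not implemented: %q", name) }
```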
Keeping HandleError and HandleCrash around without deprecating them, while
changing the signature of the callbacks, is a compromise between not breaking
existing code and not adding too many special cases that need to be
supported. Some code uses PanicHandlers or ErrorHandlers, but less than the
code that uses the Handle* calls.
In Kubernetes, we want to replace these calls. logcheck warns about them in
code that is supposed to be contextual. The steps towards that are:
- add TODO remarks as reminder (this commit)
- locally remove " TODO(pohly): " to enable the check with `//logcheck:context`,
merge fixes for linter warnings
- once there are none, remove the TODO to enable the check permanently
Kubernetes-commit: 5a130d2b71e5d70cfff15087f4d521c6b68fb01e
This handler allows the work done prior to actual serving to run in a
separate goroutine. Doing so benefits long-running requests because the
memory used by that separate goroutine can be freed once it finishes, which
keeps the serving goroutines slim.
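A rough sketch of the pattern; the wrapper below is hypothetical, not the
actual apiserver API, and only illustrates running the preparatory work on
its own goroutine so that its memory can be reclaimed:

```go
package example

import "net/http"

// withPreServeRoutine is a hypothetical illustration: prepare runs in a
// separate goroutine, so once it returns its stack and allocations can be
// garbage collected, even if the actual serving (e.g. a watch) then runs
// for a long time on a slim goroutine.
func withPreServeRoutine(prepare func(*http.Request), handler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		done := make(chan struct{})
		go func() {
			defer close(done)
			prepare(req) // memory-heavy preparation work
		}()
		<-done // wait for the preparatory goroutine to finish
		handler.ServeHTTP(w, req)
	})
}
```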
Signed-off-by: Eric Lin <exlin@google.com>
Kubernetes-commit: 7b2698a5e5c61b303481c2006847409fc8704746
Max seats from the priority & fairness work estimator is now min(0.15 x
nominalCL, nominalCL/handSize)
'Max seats' calculated by the work estimator is currently hard coded to 10.
When using lower values for --max-requests-inflight, a single LIST request
taking up 10 seats could end up using most if not all of the seats in the
priority level. This change updates the default work estimator config such
that 'max seats' is at most 10% of the maximum concurrency limit for a
priority level, with an upper limit of 10. This ensures that the seats taken
by a LIST request are proportional to the total available seats.
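A rough sketch of the formula from the title (illustrative only; the shipped
estimator may apply additional bounds such as the limit of 10 mentioned
above):

```go
package example

// maxSeats is an illustrative version of the dynamic "max seats" from the
// commit title: min(0.15 * nominalCL, nominalCL / handSize).
func maxSeats(nominalCL, handSize int) int {
	byFraction := int(0.15 * float64(nominalCL)) // fraction of the priority level's nominal limit
	byHandSize := nominalCL / handSize
	if byFraction < byHandSize {
		return byFraction
	}
	return byHandSize
}
```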
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Kubernetes-commit: d3ef2d4fe95c3ef7b1c606ad01be1183659da391
TestTimeoutRequestHeaders and TestTimeoutWithLogging are designed to catch
data races on request headers and include an HTTP handler that triggers a
timeout and then repeatedly mutates the request headers. Sometimes the
header mutation loop could complete before the timeout filter observed the
timeout, resulting in a test failure. The mutation loop now runs until the
test ends.
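An illustrative sketch of the new loop shape (not the actual test code):

```go
package example

import (
	"net/http"
	"net/http/httptest"
	"testing"
	"time"
)

// TestHeaderMutationLoop sketches the fix: the handler mutates request
// headers until testDone is closed, instead of for a fixed number of
// iterations that might end before the timeout path reads the headers.
func TestHeaderMutationLoop(t *testing.T) {
	testDone := make(chan struct{})
	defer close(testDone) // closed when the test ends

	handler := http.HandlerFunc(func(_ http.ResponseWriter, req *http.Request) {
		for {
			select {
			case <-testDone:
				return
			default:
				// In the real tests this write races with a concurrent
				// header read in the timeout filter, which is what the
				// race detector is meant to catch.
				req.Header.Set("X-Test", "mutated")
			}
		}
	})

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	go handler.ServeHTTP(httptest.NewRecorder(), req)
	time.Sleep(10 * time.Millisecond) // let the loop spin briefly
}
```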
Kubernetes-commit: e5a15c87e9d83ee19ba93aa356dfbb7b33a013c8
Also make some design changes in response to issues exposed in testing and
review.
Do not remove the ambiguous old metric
`apiserver_flowcontrol_request_concurrency_limit` because reviewers thought
it was too early. This creates a problem: that metric cannot keep both of
its old meanings. I chose the configured concurrency limit.
Testing has revealed a design flaw, which concerns the initialization
of the seat demand state tracking. The current design in the KEP is
as follows.
> Adjustment is also done on configuration change … For a newly
> introduced priority level, we set HighSeatDemand, AvgSeatDemand, and
> SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
> zero.
But this does not work out well at server startup. As part of its
construction, the APF controller does a configuration change with zero
objects read, to initialize its request-handling state. As always,
the two mandatory priority levels are implicitly added whenever they
are not read. So this initial reconfig has one non-exempt priority
level, the mandatory one called catch-all --- and it gets its
SmoothSeatDemand initialized to the whole server concurrency limit.
From there it decays slowly, as per the regular design. So for a
fairly long time, it appears to have a high demand and competes
strongly with the other priority levels. Its Target is higher than
all the others, once they start to show up. It properly gets a low
NominalCL once other levels show up, which actually makes it compete
harder for borrowing: it has an exceptionally high Target and a rather
low NominalCL.
I have considered the following fix. The idea is that the designed
initialization is not appropriate before all the default objects are
read. So the fix is to have a mode bit in the controller. In the
initial state, those seat demand tracking variables are set to zero.
Once the config-producing controller detects that all the default
objects are pre-existing, it flips the mode bit. In the later mode,
the seat demand tracking variables are initialized as originally
designed.
However, that still gives preferential treatment to the default
PriorityLevelConfiguration objects, over any that may be added later.
So I have made a universal and simpler fix: always initialize those
seat demand tracking variables to zero. Even if a lot of load shows
up quickly, remember that adjustments are frequent (every 10 sec) and
the very next one will fully respond to that load.
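A hypothetical sketch of this initialization (field names invented for
illustration):

```go
package example

// seatDemandStats holds the tracked statistics for one priority level.
// The field names are illustrative, not the actual APF names.
type seatDemandStats struct {
	avg, stdDev, high, smoothed float64
}

// newSeatDemandStats returns zero-valued statistics for every newly seen
// priority level, whether it is a default object at startup or one added
// later; the next periodic adjustment responds to the actual load.
func newSeatDemandStats() seatDemandStats {
	return seatDemandStats{}
}
```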
Also: revise the logging logic to log at a numerically lower V level when
there is a change.
Also: fix a bug in float64close.
Also: separate imports in some files.
Co-authored-by: Han Kang <hankang@google.com>
Kubernetes-commit: feb42277884bc7cfbd6f0bb1d875cc63b1b6caac
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Kubernetes-commit: a9593d634c6a053848413e600dadbf974627515f
Make the test accept all the legitimate outcomes.
Expand the explanation of how TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter/priority_level_concurrency_is_set_to_1,_queue_length_is_1,_first_request_should_time_out_and_second_(enqueued)_request_should_time_out_as_well is supposed to work.
Expand the debug information that is available when the test fails.
Kubernetes-commit: 1f450695ffd5b2d028c87328b8b32630a8052129
Members are not used in (waiting, executing) pairs, so stop using the
wrapper that adds such pairing.
Kubernetes-commit: cd33c7cf2260b351dd345497223a944e80bc7b61
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations were found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
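For illustration, the two problem patterns and their fixes look roughly like
this (the messages are made up, the klog calls are real):

```go
package example

import "k8s.io/klog/v2"

func examples(count int) {
	// Problem: the verbosity check gates the output, but the plain
	// klog.Info call does not record that this was a V(4) message, which
	// matters for backends that store the verbosity level.
	if klog.V(4).Enabled() {
		klog.Info("syncing objects")
	}
	// Better: emit through the V(4) logger itself.
	klog.V(4).Info("syncing objects")

	// Problem: a format string passed to a non-formatting call; the "%d"
	// ends up verbatim in the output.
	klog.Info("synced %d objects", count)
	// Better: use the formatting variant.
	klog.Infof("synced %d objects", count)
}
```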
Kubernetes-commit: edffc700a43e610f641907290a5152ca593bad79
Since flowDistinguisher may hold data identifying a user accessing the
cluster, it can be a source of a PII leak.
Kubernetes-commit: 94c92f78e5b02c27502f3b9d59b4e194e476a6f4