NonBlockingRun should also return a channel that gets closed when the
underlying http Server has stopped listening (during the graceful
shutdown period)
Kubernetes-commit: a84c1b71005930e8253c1348515020132c5c175b
- refactor graceful termination logic so we can write unit tests
to assert on the expected behavior.
Kubernetes-commit: d85619030e3a5fec5960ad00136e8d9bd030b5f8
When API Priority and Fairness is enabled, the inflight limits must
add up to something positive.
This rejects the configuration that prompted
https://github.com/kubernetes/kubernetes/issues/102885
Update help for max inflight flags
Kubernetes-commit: 0762f492c5b850471723a305cfa7390e44851145
The test was flaking because the test was creating more connections
than expected.
Disabling connection pooling removes the flakes, and no more connections
are created than necessary.
Kubernetes-commit: 4d11c3cd8cb18c1e246a7a6b8e9a791177c49d31
- We use the new v22 module released on May 10
- We drop the unmaintained `github.com/coreos/pkg`
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Kubernetes-commit: 7fcdbbef06d0bc8c4416db1d2cbba9f30d30e8c4
- add plumbing that allows us to estimate the "width" of a request
- the default implementation returns 1 as the "width" of all
incoming requests, which is in keeping with the current behavior.
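A rough sketch of what such plumbing could look like. The names WidthEstimatorFunc and DefaultWidthEstimator are assumptions made for illustration and may not match the real code.

```go
package main

import (
	"fmt"
	"net/http"
)

// WidthEstimatorFunc is a hypothetical name for the plumbing: it maps an
// incoming request to its estimated "width".
type WidthEstimatorFunc func(r *http.Request) uint

// DefaultWidthEstimator treats every request as having width 1, which keeps
// the behavior identical to a server that has no notion of width.
func DefaultWidthEstimator(_ *http.Request) uint {
	return 1
}

func main() {
	req, err := http.NewRequest(http.MethodGet, "/api/v1/pods", nil)
	if err != nil {
		panic(err)
	}
	var estimate WidthEstimatorFunc = DefaultWidthEstimator
	fmt.Println("width:", estimate(req))
}
```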
Kubernetes-commit: 9b72eb1929a64b9d5a5234090a631ba312fb4d41
/kind bug
This PR adds depth to logging which was removed when migrating to structured logging in the file
Ref #102353
Kubernetes-commit: 5d4c1162b944ff34374313103d0555ac0b334a1d
While the apiserver audit options merely use the lumberjack logger in
order to write the appropriate log files, this library has very loose
permissions by default for these files [1]. However, this library will
respect the permissions that the file has, if it exists already. This is
also the most tested scenario in the library [2].
So, let's follow the pattern marked in the library's tests and
pre-create the audit log file with an appropriate mode.
[1] https://github.com/natefinch/lumberjack/blob/v2.0/lumberjack.go#L280
[2] https://github.com/natefinch/lumberjack/blob/v2.0/linux_test.go
Signed-off-by: Juan Antonio Osorio Robles <jaosorior@redhat.com>
Kubernetes-commit: 42df7bc5b3aa26bf545b6392b557833c7162c472
Right now, `_, ok := provider.(Notifier); !ok` can mean one of two
things:
1. The provider does not support notification because the provided
content is static.
2. The implementor of the provider hasn't gotten around to implementing
Notifier yet.
These have very different implications. We should not force consumers of
these interfaces to figure out the status of Notifier across
sometimes numerous different implementations. Instead, we should force
implementors to implement Notifier, even if it's a noop.
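A small sketch of the resulting shape. Only Notifier is named in this message; Listener, ContentProvider, staticContent, and the method names are assumptions made for illustration.

```go
package main

import "fmt"

// Listener is an assumed callback interface for change notifications.
type Listener interface {
	Enqueue()
}

// Notifier lets consumers register for change notifications.
type Notifier interface {
	AddListener(l Listener)
}

// ContentProvider embeds Notifier, so every implementation has to state
// explicitly how it handles notification.
type ContentProvider interface {
	Notifier
	CurrentContent() []byte
}

// staticContent never changes after construction.
type staticContent struct {
	data []byte
}

func (s staticContent) CurrentContent() []byte { return s.data }

// AddListener is a deliberate no-op: static content has nothing to notify.
func (s staticContent) AddListener(Listener) {}

// Compile-time check that staticContent satisfies the full interface.
var _ ContentProvider = staticContent{}

func main() {
	var p ContentProvider = staticContent{data: []byte("ca-bundle")}
	p.AddListener(nil) // safe: the no-op ignores its argument
	fmt.Println(string(p.CurrentContent()))
}
```

With the embedding in place, a missing Notifier implementation is a compile error rather than an ambiguous failed type assertion.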
Change-Id: Ie7a26697a9a17790bfaa58d67045663bcc71e3cb
Kubernetes-commit: 9b7d654a08d694d20226609f7075b112fb18639b
it turns out that setting a timeout on the HTTP client affects watch requests made by the delegated authentication component.
with a 10 second timeout, watch requests are re-established exactly after 10 seconds even though the default request timeout for them is ~5 minutes.
this is because if multiple timeouts are set, the stdlib applies the smallest one, leaving the others useless.
for more details see a937729c2c/src/net/http/client.go (L364)
instead of setting a timeout on the HTTP client we should use context for cancellation.
Kubernetes-commit: d690d71d27c78f2f7981b286f5b584455ff30246
- when we forward the request to the aggregated server, set the audit
ID in the new request header. This allows audit logs from aggregated
apiservers to be correlated with the kube-apiserver.
- use the audit ID in the current tracer
- use the audit ID in httplog
- when a request panics, log an error with the audit ID.
Kubernetes-commit: b607ca1bf3e1cf6152c446ea61ac7fdd9014e1f1
Manage the audit ID early in the request handling logic so that it can
be used by different layers to improve correlation.
- If the caller does not specify a value for Audit-ID in the request
header, we generate a new audit ID
- If a user specified Audit-ID is too large, we truncate it
- We echo the Audit-ID value to the caller via the response
Header 'Audit-ID'
Kubernetes-commit: 31653bacb9b979ee2f878ebece7e25f79d3f9aa6
We've dropped the content-type field since it is effectively unbounded
(we had a sec-vuln about this before actually). We retain all other
fields, despite their unboundedness due to the fact that we can now
explicitly set bounds on label values.
Change-Id: Icc483fc6a17ea6382928f4448643cda6f3e21adb
Kubernetes-commit: cfd00de6866e636332bdcd3f46d6d2ffd8d2bc88
SARs
healthz, readyz, and livez are canonical names for checks that the kubelet does. By default, allow access to them in the options. Callers can adjust the defaults if they have a reason to require checks.
system:masters has full power, so the authorization check is unnecessary and just uses an extra call for in-cluster access. Callers can adjust the defaults if they have a reason to require checks.
Kubernetes-commit: cebce291ddcb8490a705c79623c0b4f13faef6e7
- as soon as a request is received by the apiserver, determine the
timeout of the request and set a new request context with the deadline.
- the timeout filter that times out non-long-running requests should
use the request context as opposed to a fixed 60s wait today.
- the admission and storage layers use the same request context with the
deadline specified.
we use the default timeout enforced by the apiserver in either of these cases:
- if the user has specified a timeout of 0s, this implies no timeout on the user's part.
- if the user has specified a timeout that exceeds the maximum deadline allowed by the apiserver.
Kubernetes-commit: e416c9e574c49fd0190c8cdac58322aa33a935cf
- as soon as a request is received by the apiserver, determine the
timeout of the request and set a new request context with the deadline.
- the timeout filter that times out non-long-running requests should
use the request context as opposed to a fixed 60s wait today.
- the admission and storage layers use the same request context with the
deadline specified.
Kubernetes-commit: 83f869ee1350da1b65d508725749fb70d0f535f2
Aborted requests are the ones that were disrupted with http.ErrAbortHandler.
For example, the timeout handler will panic with http.ErrAbortHandler when a response to the client has already been sent
and the timeout has elapsed.
Additionally, a new metric requestAbortsTotal was defined to count aborted requests. The new metric allows for aggregation for each group, version, verb, resource, subresource and scope.
Kubernetes-commit: 057986e32c1bb7284b0edbc161f0380f1548492f
without APIServerIdentity enabled, stale apiserver leases won't be GC'ed
and the same for stale storage version entries. In that case the storage
migrator won't operate correctly without manual intervention.
Kubernetes-commit: 1c2d446648662529282a3bb1528a6dbb50700fdb
StorageVersions are updated during apiserver bootstrap.
Also add a poststarthook to the aggregator which updates the
StorageVersions via the storageversion.Manager
Kubernetes-commit: 721897871697db007c2439ac298c579c0f201388
Previously no timeout was set. Requests without an explicit timeout might hang forever and lead to starvation of the application.
When no timeout is specified, a default one is applied.
Kubernetes-commit: 7340c3498ac23f46fc8b6bff4d5ac664a9c64a3b
The MaxInFlight and PriorityAndFairness apiserver filters maintain
watermarks with histogram metrics that are observed when requests
are handled. When a request is received, the watermark observer
needs to fill out observations for the entire time period since the
last request was received. If it has been a long time since a
request has been received, then it can take an inordinate amount of
time to fill out the observations, to the extent that the request
may time out. To combat this, these changes will have the filters
fill out the observations on a 10-second interval, so that the
observations never fall too far behind.
This follows a similar approach taken in
9e89b92a92c02cdd2c70c0f52a30936e9c3309c7.
https://github.com/kubernetes/kubernetes/issues/95300
The Priority-and-Fairness and Max-in-Flight filters start goroutines to
handle some maintenance tasks on the watermarks for those filters. Once
started, these goroutines run forever. Instead, the goroutines should
have a lifetime tied to the lifetime of the apiserver.
These changes move the functionality for starting the goroutines to
a PostStartHook. The goroutines have been changed to accept a stop channel
and only run until the stop channel is closed.
Kubernetes-commit: 6c9b86646871f13a4431361310ba6a0785372053
Currently webhook retry backoff parameters are hard-coded. We want
to have the ability to configure the backoff parameters for webhook
retry logic.
Kubernetes-commit: 53a1307f68ccf6c9ffd252eeea2b333e818c1103
previously no timeout was set. Requests without explicit timeout might potentially hang forever and lead to starvation of the application.
Kubernetes-commit: 2160cbc53fdd27a3cbc1b361e523abda4c39ac42
apiserver_request_duration_seconds does not take into account the
time a request spends in the server filters. If a filter takes a long
time, the latency incurred will not be reflected in the apiserver
latency metrics.
For example, the amount of time a request spends in priority and
fairness machineries or in shuffle queues will not be accounted for.
- Add a server filter that attaches request received timestamp to the
request context very early in the handler chain (as soon as
net/http hands over control to us).
- Use the above received timestamp in the apiserver latency metrics
apiserver_request_duration_seconds.
- Use the above received timestamp in the audit layer to set
RequestReceivedTimestamp.
Kubernetes-commit: d74ab9e1a4929be208d4529fd12b76d3fcd5d546
Introduce min, average, and standard deviation for the number of
executing mutating and readOnly requests.
Introduce min, max, average, and standard deviation for the number
waiting and number waiting per priority level.
Later:
Revised to use a series of windows
Use three individuals instead of array of powers
Later:
Add coarse queue count metrics, removed windowed avg and stddev
Add metrics for number of queued mutating and readOnly requests,
to complement metrics for number executing.
Later:
Removed windowed average and standard deviation because consumers can
derive such from integrals of consumer's chosen window.
Also replaced "requestKind" Prometheus label with "request_kind".
Later:
Revised to focus on sampling
Make the clock intrinsic to a TimedObserver
... so that the clock can be read while holding the observer's lock;
otherwise, forward progress is not guaranteed (and violations were
observed in testing).
Bug fixes and histogram buckets revision
SetX1 to 1 when queue length limit is zero, because dividing by zero is nasty.
Remove obsolete argument in gen_test.go.
Add a bucket boundary at 0 for sample-and-water-mark histograms, to
distinguish zeroes from non-zeros.
This includes adding Integrator test.
Simplified test code.
More pervasively used "ctlr" instead of "ctl" as abbreviation for
"controller".
Kubernetes-commit: 57ecea22296797a93b0157169db0ff2e477f58d0
Fixes:
* Don't call LogArgs if the log will not be written due to low verbosity
* Create a separate slice for hijacked to avoid append on the main path
* Shorten the log message, as this log is too common to be verbose
name           old time/op    new time/op    delta
WithLogging-4  4.95µs ± 3%    3.52µs ± 1%    -28.80%  (p=0.000 n=10+8)
name           old alloc/op   new alloc/op   delta
WithLogging-4  2.93kB ± 0%    1.22kB ± 0%    -58.45%  (p=0.000 n=10+9)
name           old allocs/op  new allocs/op  delta
WithLogging-4  32.0 ± 0%      20.0 ± 0%      -37.50%  (p=0.000 n=10+10)
Kubernetes-commit: 303e1c19225149868d735b5c876d8ca9d3e1b5c9