This adds an Authentication/Authorization filter through Kubernetes'
TokenReview / SubjectAccessReview resources.
The client config for kube-state-metrics needs a ClusterRole granting:
* apiGroups: authentication.k8s.io, resources: tokenreviews, verbs: create
* apiGroups: authorization.k8s.io, resources: subjectaccessreviews, verbs: create
The Prometheus client needs a ClusterRole granting:
* nonResourceURLs: "/metrics", verbs: get
This change allows user-controlled limits on how many objects KSM will
list from the API. This is helpful to prevent resource exhaustion on
KSM, in case the API creates too many resources.
The object limit is set globally and applied per resource watched.
Add automatic detection of container and system memory limits to control
the Go `GOMEMLIMIT` garbage collector feature. This helps reduce OOMs
by triggering GC when the process approaches system limits.
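As a rough illustration of the idea (not necessarily how this change implements it), the sketch below reads a cgroup v2 memory limit and passes a fraction of it to `debug.SetMemoryLimit`, the programmatic equivalent of `GOMEMLIMIT`. The file path, the 0.9 ratio, and the helper names `parseCgroupMax` and `softLimit` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseCgroupMax parses the contents of a cgroup v2 memory.max file.
// It returns false when no limit is set ("max") or the value is invalid.
// Hypothetical helper, named for this sketch only.
func parseCgroupMax(contents string) (int64, bool) {
	s := strings.TrimSpace(contents)
	if s == "" || s == "max" {
		return 0, false
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

// softLimit scales the hard limit by ratio so that the garbage collector
// works harder before the process reaches the hard limit.
// Hypothetical helper, named for this sketch only.
func softLimit(hard int64, ratio float64) int64 {
	return int64(float64(hard) * ratio)
}

func main() {
	// Assumed path: the cgroup v2 memory limit as seen from inside a container.
	data, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		fmt.Println("no cgroup v2 memory limit found; leaving the Go memory limit unchanged")
		return
	}
	hard, ok := parseCgroupMax(string(data))
	if !ok {
		fmt.Println("memory limit is unset; leaving the Go memory limit unchanged")
		return
	}
	// Assumed ratio: 90% of the container limit.
	limit := softLimit(hard, 0.9)
	debug.SetMemoryLimit(limit)
	fmt.Printf("Go memory limit set to %d bytes\n", limit)
}
```

Keeping the soft limit below the container limit leaves headroom for memory the Go runtime does not account for, which is why the sketch applies a ratio rather than using the hard limit directly.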
Signed-off-by: SuperQ <superq@gmail.com>
There are a few documented scenarios where `kube-state-metrics` will
lock up (#995, #1028). I believe a much simpler way to ensure that
`kube-state-metrics` doesn't lock up and require a restart to serve
`/metrics` requests is to add default read and write timeouts and to
make them configurable. At Grafana, we've experienced a few
scenarios where `kube-state-metrics` running in larger clusters falls
behind and starts getting scraped multiple times. When this occurs,
`kube-state-metrics` becomes completely unresponsive and requires a
restart. This is fairly easy to reproduce (I'll provide a script in
an issue) and causes other critical workloads (KEDA, VPA) to fail in
weird ways.
Adds two flags:
- `server-read-timeout`
- `server-write-timeout`
Updates the metrics HTTP server to set `ReadTimeout` and
`WriteTimeout` to the configured values.
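A minimal sketch of how the two flags could feed the `ReadTimeout` and `WriteTimeout` fields of `http.Server` follows. It uses only the standard library; the `buildServer` helper, the 60-second defaults, the listen address, and the placeholder handler are assumptions for illustration and are not taken from this change.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"os"
	"time"
)

// buildServer parses the timeout flags from args and returns an http.Server
// configured with them. Hypothetical helper, named for this sketch only.
func buildServer(args []string) (*http.Server, error) {
	fs := flag.NewFlagSet("kube-state-metrics", flag.ContinueOnError)
	// Assumed defaults: 60 seconds each.
	readTimeout := fs.Duration("server-read-timeout", 60*time.Second,
		"maximum duration for reading an entire request")
	writeTimeout := fs.Duration("server-write-timeout", 60*time.Second,
		"maximum duration before timing out writes of the response")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		// Placeholder body; a real server would write the metrics here.
		fmt.Fprintln(w, "# placeholder metrics output")
	})

	return &http.Server{
		Addr:         ":8080", // assumed listen address
		Handler:      mux,
		ReadTimeout:  *readTimeout,
		WriteTimeout: *writeTimeout,
	}, nil
}

func main() {
	srv, err := buildServer(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	// With both timeouts set, a stalled client cannot hold a connection open
	// indefinitely.
	if err := srv.ListenAndServe(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Running the sketch with `--server-read-timeout=5s --server-write-timeout=10s` would override both defaults. Without explicit values, `http.Server` applies no timeouts at all, which is what allows slow or stuck scrapes to pile up.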