Add a `livez` endpoint to identify network outages. This helps in
restarting the binary if such a case is observed.
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
There are a few documented scenarios where `kube-state-metrics` will
lock up (#995, #1028). I believe a much simpler solution to ensure
`kube-state-metrics` doesn't lock up and require a restart to serve
`/metrics` requests is to add default read and write timeouts and to
allow them to be configurable. At Grafana, we've experienced a few
scenarios where `kube-state-metrics` running in larger clusters falls
behind and starts getting scraped multiple times. When this occurs,
`kube-state-metrics` becomes completely unresponsive and requires a
reboot. This is somewhat easily reproducible (I'll provide a script in
an issue) and causes other critical workloads (KEDA, VPA) to fail in
weird ways.
Adds two flags:
- `server-read-timeout`
- `server-write-timeout`
Updates the metrics http server to set the `ReadTimeout` and
`WriteTimeout` to the configured values.
Add support for variable VKs in CRS config, while maintaining a cache
of discovered GVKs in the cluster, and updating it every 30s.
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
This change adds hot reloading support for the customresourcestate
config file.
It also fixes a bug where the customresourcestate config file was not
detected when it was included in the ksm config file.
It also fixes a bug where customresourcestatemetrics were not added
when the resources set were non-default resources.
Fixes: https://github.com/kubernetes/kube-state-metrics/issues/1892
This uses code pieces from prometheus/alertmanager in https://github.com/prometheus/alertmanager/blob/main/config/coordinator.go#LL56C26-L56C26
licensed under Apache-2.0.
kube_state_metrics_config_hash{type="config", filename="config.yml"} 4.0061079457904e+13
kube_state_metrics_config_last_reload_success_timestamp_seconds{type="config", filename="config.yml"} 1.6697483049487052e+09
kube_state_metrics_config_last_reload_successful{type="config", filename="config.yml"} 1
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
... to only monitor all known custom-resource configurations instead of
listing each of them explicitly
Signed-off-by: Mario Constanti <mario@constanti.de>
Remediate:
G104: Errors unhandled.
G109: Potential Integer overflow made by strconv.Atoi result conversion to int16/32
G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server
G304: Potential file inclusion via variable
G601: Implicit memory aliasing in for loop.
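
For illustration, two of these findings are commonly addressed along the following lines. This is a generic sketch, not the code from this change; `parsePort`, `newServer`, and the 5s header timeout are assumed names and values.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// parsePort addresses G109: ParseInt with bitSize 32 rejects out-of-range
// input instead of silently truncating an Atoi result cast to int32.
func parsePort(s string) (int32, error) {
	n, err := strconv.ParseInt(s, 10, 32)
	if err != nil {
		return 0, err
	}
	return int32(n), nil
}

// newServer addresses G112: setting ReadHeaderTimeout bounds how long a
// client may take to send request headers.
func newServer(addr string) *http.Server {
	return &http.Server{
		Addr:              addr,
		ReadHeaderTimeout: 5 * time.Second, // assumed value
	}
}

func main() {
	port, err := parsePort("8080")
	if err != nil { // G104: the error is handled rather than dropped
		fmt.Println("invalid port:", err)
		return
	}
	srv := newServer(fmt.Sprintf(":%d", port))
	fmt.Println(srv.Addr, srv.ReadHeaderTimeout)
}
```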
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
Fix the allowlist failing when a single label is supplied, in order to
keep the behaviour in sync with --resources.
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>