Node Problem Detector aims to make various node problems visible to
the upstream layers in the cluster management stack. It is a daemon
that runs on each node, detects node problems and reports them to apiserver
so to avoid scheduling new pods on bad nodes and also easily identify
which are the problems on underlying nodes.
Project Home: https://github.com/kubernetes/node-problem-detector
Signed-off-by: dntosas <ntosas@gmail.com>
Node termination handler as all daemonSets may play a critical role in
capacity planning, define resource policy for chosing instanceType etc.
In this commit, we enable users to define resources themselves to meet
their needs and also removed limits to convey with the chosen strategy
to avoid limits on such components.
Signed-off-by: dntosas <ntosas@gmail.com>
After upgrading Cilium to 1.8 via kops one of our clusters had a total
outage due to cilium reporting errors as below:
```
level=error msg="endpoint regeneration failed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=592 error="Failed to load tc filter: exit status 1" identity=40147 ipv4= ipv6= k8sPodName=/ subsys=endpoint
```
upon searching Cilium slack we found the below thread:
https://cilium.slack.com/archives/C1MATJ5U5/p1616400216167600
which recommended setting `enable-host-reachable-services` to true will
address the problems. We set the field and it fixed our issues too,
however we observed that kops does not have a means to configure this
hence this PR.
We will like to have this backported after it has been merged.
- Update manifests
- Bump components version
- Add API capability of setting Version + VolumeLimit
- Remove snapshot-controller resources as it should be independent from
any CSI driver
Signed-off-by: dntosas <ntosas@gmail.com>
Apply suggestions from code review
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
Apply suggestions from code review
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
Cilium as a CNI is a critical component for the cluster so it would be safe
to have some guaranteed resources as well as allowing the users to
define them based on their needs.
In this commit, we init default requested resources and add the
capability of user-defined values.
Signed-off-by: dntosas <ntosas@gmail.com>
Generate and upload keys.json + discovery.json to public store
Don't enable anonymous auth on publicjwks
Remove tests that won't work using FS VFS anymore