community/sig-node/archive/meeting-notes-2023.md

SIG Node Meeting Notes

Dec [all dates] - meetings canceled, next meeting Jan 2nd 2024

Dec 5, 2023

Recording: https://youtu.be/k7kzjKnmwSI

  • [tzneal] Feedback on OOM cgroup killing change https://github.com/kubernetes/kubernetes/pull/117793#issuecomment-1837400726

    • Let's have a kubelet option to restore the old behavior; the new behavior becomes the default
    • Need to cherry-pick to 1.27
  • [haircommander] https://github.com/kubernetes/kubernetes/pull/114847 follow ups

    • Summary of proposed policy changes (a sketch of the reuse check follows this meeting's notes):
      • Pods with the never pull policy must present the same credential to reuse an image that was pulled with a credential.
      • The kubelet needs a switch to disable validation of preloaded images in the cache (for disconnected mode at node/kubelet restart); otherwise images recorded as credential-pulled are subject to revalidation, and never-pull pods will fail if they never authenticated.
      • A pod that successfully pulls an image anonymously from registry A (or the default) is considered unique to that registry; we will not reuse that anonymous success for pods pulling from another registry. This requires an algorithm change in the current feature implementation.
    • Derek: let's consider other patterns for time-slicing the auth and pull windows, and policies for specifying when (and possibly for what reason) we need to authenticate, to align better with disconnected-mode needs, not just performance/multi-tenancy
    • Future items: possible integration with registries for discovering (via headers/artifacts) what the expiration is for an image
  • [SergeyKanzhelev] Planning of 1.30
    Eliminating perma betas. List of “old” feature gates:

    • AppArmor
      • Mostly need to clean up tests
      • Sergey to follow up
    • CustomCPUCFSQuotaPeriod - Peter will take a look
    • GracefulNodeShutdown
      • Issues with some controllers - Ryan to add a comment on the KEP indicating what the issue is.
    • GracefulNodeShutdownBasedOnPodPriority
    • LocalStorageCapacityIsolationFSQuotaMonitoring
    • MemoryManager
      • Got as far as PRR review; lack of observability was a concern there - need to work on this.
      • Francesco will follow up on this.
      • Swati: many issues opened for MemoryManager before GA
        • Totally true, we need to address them - the silver lining is this can be done in parallel with the observability improvements
    • MemoryQoS
    • PodAndContainerStatsFromCRI
      • Stalled on CRI implementation of those metrics
      • Working on it in CRI-O
      • Need some help from Containerd side
      • Exit criteria: performance must be tested to show it is not regressing
    • RotateKubeletServerCertificate
      • No tests or docs
      • Need volunteer to clean it up
      • Harche will take a look
    • SizeMemoryBackedVolumes
      • Need volunteers

    Deprecations:

    • cgroup v1
    • Mrunal, Dawn:
      • Let's announce the deprecation in 1.30.
      • Default to cgroupv2 in tests and have cgroupv1 as an “additional”
      • [Alexander Kanevsky] Collect a list of distros that people use, their default cgroup versions, and the EOL of those distros, e.g. CentOS 7 or some Ubuntu LTS releases.
  • [SergeyKanzhelev] Should we cancel all the rest of the meetings for the month of Dec?

    • Let's cancel meetings till the end of the year and meet on Jan 2nd
  • [hakman] node-problem-detector maintainers are needed to keep the project alive. I tried to follow the guidelines to step up as a reviewer and later approver, but it seems there is a lack of approvers. If possible, I would like someone from #sig-node to sponsor me. Thanks in advance! https://github.com/kubernetes/node-problem-detector/pull/830

    • AI: SergeyKanzhelev to follow up
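
The credential-scoped reuse rules from the #114847 follow-up above boil down to a cache-lookup decision. A minimal Go sketch under stated assumptions: the record type, field names, and hash-based credential matching are all illustrative, not the kubelet's actual bookkeeping.

```go
package imagecache

// ImageRecord is a hypothetical per-image record the kubelet could keep.
type ImageRecord struct {
	Registry   string          // registry the image was pulled from
	CredHashes map[string]bool // hashes of credentials that successfully pulled it
	Anonymous  bool            // true if an anonymous pull from Registry succeeded
}

// canReuse decides whether a pod may reuse a cached image without re-pulling.
// credHash is empty for anonymous access. Per the notes above: a credentialed
// image requires the same credential, and an anonymous success is scoped to
// the registry it came from.
func canReuse(rec ImageRecord, registry, credHash string) bool {
	if registry != rec.Registry {
		return false // successes never transfer across registries
	}
	if credHash == "" {
		return rec.Anonymous
	}
	return rec.CredHashes[credHash]
}
```

Note the disconnected-mode caveat from the notes: with validation enabled, a never-pull pod that never authenticated fails this check after a kubelet restart, which is why a switch to skip validation of preloaded images was proposed.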

Nov 28, 2023 [Canceled]

  • Canceled due to lack of agenda.
  • Please bring 1.30 planning topics for the next meeting.

Nov 21, 2023 [Cancelled]

  • Cancelled due to Thanksgiving week in US and leads availability

Nov 14, 2023 [Cancelled]

Nov 7, 2023 [Cancelled]

  • Canceled due to kubecon

October 31, 2023

Recording: https://youtu.be/RYBb81l1IGw

October 24, 2023

Canceled due to an empty agenda. Review PRs for freeze next week.

October 17, 2023

Recording: https://youtu.be/740kJACH3i8

Total active pull requests: 311

(weekly changes)

Incoming: Created 41, Updated 113
Completed: Closed 10, Merged 38
New stats needed:
  • PRs needed other SIG approvals
  • Waiting for approvers
  • Waiting for reviewers
  • Separate cherry-picks and regressions

October 10, 2023

Recording: https://www.youtube.com/watch?v=akrWtsCbJZo

October 3, 2023

Recording: https://www.youtube.com/watch?v=HdIURTQSm7Q

September 26, 2023

Recording: https://www.youtube.com/watch?v=yEOUKJCJXa8

September 19, 2023

Recording: https://www.youtube.com/watch?v=ngboQ3GvX5o

September 12, 2023

Recording: https://www.youtube.com/watch?v=hVuZg2mqNsw

September 5th, 2023

Recording: https://www.youtube.com/watch?v=5iiD9OIeJv8

August 29th, 2023

Recording: https://www.youtube.com/watch?v=tNangR9QLkg

August 22nd, 2023

Recording: https://www.youtube.com/watch?v=Y9btZGnyDK0

August 15th, 2023

Recording: https://www.youtube.com/watch?v=wgF8UDgp1sQ

  • [ruiwen] 1.28 KEPs retro (at least 30 minutes; we may not have much time for other topics)
  • [haircommander] Kubelet image GC conversation
    • [Ruiwen] Pin images
    • [Derek] I am curious if secret-pulled images have any unique GC requirements that have surfaced…
      • Tie lifecycle of image to lifecycle of pod?
    • [Sergey] Mirror config into kubelet?
    • Peter to begin a WG in between now and KEP freeze to come up with next steps before bringing to larger group.
  • [SergeyKanzhelev] Sidecar WG: join for the next push in 1.29

August 8th, 2023

Recording: https://www.youtube.com/watch?v=9BBSMdw8dMA

August 1st, 2023

Recording: https://www.youtube.com/watch?v=V9F8jHgs6R4

July 25th, 2023 [Cancelled due to lack of agenda]

July 18th, 2023

Recording: https://www.youtube.com/watch?v=0Uqq8jNSSDk

  • [ndixita] memory QoS Beta K8s 1.28 might be infeasible https://docs.google.com/document/d/1mY0MTT34P-Eyv5G1t_Pqs4OWyIH-cg9caRKWmqYlSbI/edit#bookmark=id.qaybju6wvb05
    • Requesting kernel experts here for discussion around memory.high memcg controller usage and signals for memory reclaim (pgscan, pgsteal from memory.stat?).
  • [jiaxin] new CPU Manager static policy and in-place VPA improvements (performance, make it work with CPU Manager together), KEP or PR?
    • Problem 1: noisy neighbor issue. We want to spread hyperthreads across physical cores to get better performance.
    • Problem 2: In-place VPA currently doesn't work with CPU Manager.
    • Problem 3: In-place VPA sometimes takes up to a minute to finish scaling, etc. We will finish a doc with the problems and solutions for further discussion.
    • [fromani] most likely a KEP+1, perhaps share a (preliminary) design doc in the community to outline the proposed scope and changes
    • [Dawn] Please start with a doc on the issue / problem statement and the suggested solution.
    • [Alex] Please separate in-place VPA improvements from CPU static policy.

July 11th, 2023

Recording: https://www.youtube.com/watch?v=0ggcapGYwtc

July 4th, 2023 [Canceled due to US holiday]

June 27th, 2023

Recording: https://www.youtube.com/watch?v=KMD17c5EbFU

  • [Wedson] Discuss setting a default runtime handler for CRI image operations if no runtime class is specified. Containerd supports using different snapshotters if pods have the runtime handler annotation specified, but this can cause issues: if a pod without an annotation is scheduled after a pod with a runtime handler specified, the kubelet will think the image is already present even though it was fetched with a different snapshotter.

  • [mahamed/upodroid] Overhauling sig-node node e2e tests. I have been working with dims on introducing EC2 node e2e tests, and I want to use this opportunity to complete KEP-2464 and adopt kops' prowjob generator to generate jobs at scale, as we need to test various permutations of OSes, architectures, and CRI implementations.

    Implementation: https://github.com/kubernetes/test-infra/pull/29944

    PTAL at the e2e test guidance in the works:

  • [fromani][discussion if time allows, otherwise PTAL and comment on github!] handling devices assignment on node reboot and kubelet restart: issue https://github.com/kubernetes/kubernetes/issues/118559 and its proposed fix https://github.com/kubernetes/kubernetes/pull/118635

  • [haircommander] cgroup driver implementation discussion https://github.com/kubernetes/kubernetes/pull/118770

June 20th, 2023 [Cancelled]

June 13th, 2023

Recording: https://www.youtube.com/watch?v=nF_3dnZJVnA

Enhancements tracking board: https://github.com/orgs/kubernetes/projects/140/views/1?filterQuery=sig%3A%22sig-node%22&sortedBy%5Bdirection%5D=desc&sortedBy%5BcolumnId%5D=Status

  • [moved from May 23rd, 2023] Need formal approval from SIG Node Tech Leads on the issue

June 6th, 2023

Recording: https://www.youtube.com/watch?v=rR3zOunp6FE

May 30th, 2023

Recording: https://www.youtube.com/watch?v=H9vnLgvTLvo

Agenda

KEPs: https://github.com/kubernetes/enhancements/issues?page=1&q=is%3Aissue+is%3Aopen+label%3Asig%2Fnode+milestone%3Av1.28

  • [harche/mrunalp] Cautiously enabling swap only for Burstable Pods - https://github.com/kubernetes/enhancements/pull/3957
  • [marquiz/haircommander]: KEP 4033: discover kubelet cgroup driver from CRI
    • There are other cases where the CRI runtime may want to tell the kubelet what the state of the world is
    • Focus this KEP on the cgroup driver, but make the API extensible so those other use cases (runtime class, QoS class, user namespace support) can be easily covered in the future
    • Keep it a separate CRI message from RuntimeStatus so the kubelet can request it separately (a rough sketch follows this meeting's notes)
  • [mimowo] Changed pod phase when containers exit with 0, related issue: https://github.com/kubernetes/kubernetes/issues/118310. Summary:
    • eviction_manager, preemption: 1.26: Failed, 1.27: Succeeded
    • node shutdown 1.26: Failed, 1.27: Succeeded
    • active deadline exceeded 1.26: Failed, 1.27: Failed
  • [astoycos] bpfd Presentation!
    • Slides
    • [SergeyKanzhelev] SIG Node may help in terms of attributing events to pod metadata. When kernel events are received, it would be nice to know which Pod is running the process that sent the event. Please let us know if anything can be improved on the SIG Node side to help with this.
  • [byako] KEP-3542 CRI PullImageWithProgress https://github.com/kubernetes/enhancements/pull/3547/files
  • [adilGhaffarDev] What is the status of this fix: https://github.com/kubernetes/kubernetes/pull/117030? What can we do to escalate it, if possible?
  • [haircommander] KEP 3983: Add support for a drop-in kubelet configuration directory
    • Mostly a review request
  • [SergeyKanzhelev] https://github.com/kubernetes/kubernetes/pull/116429 sidecar PR.
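
For the KEP 4033 item above, a rough sketch of the shape such a standalone CRI message could take, kept separate from RuntimeStatus and extensible for later use cases; the Go type and field names here are illustrative, not the actual CRI API.

```go
package criconfig

// CgroupDriver is the setting the kubelet and the runtime must agree on.
type CgroupDriver int32

const (
	CgroupDriverSystemd  CgroupDriver = iota // runtime delegates cgroups to systemd
	CgroupDriverCgroupfs                     // runtime writes to cgroupfs directly
)

// RuntimeConfigResponse sketches a message the kubelet could request on its
// own schedule; keeping it separate from RuntimeStatus leaves room to add
// other state-of-the-world fields (runtime class, QoS class, user namespace
// support) later without overloading the status call.
type RuntimeConfigResponse struct {
	CgroupDriver CgroupDriver
}
```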

May 23rd, 2023

Recording: https://www.youtube.com/watch?v=shmDtrq55V8

Agenda

  • [intunderflow] Following up from the meeting on April 25th about lowering the frequency of startup/readiness probe failure events, my preferred approach after digesting feedback:
    • Always emit an event when the result of a probe changes (between Success and Failure, or Failure and Success)
    • When a startup probe fails or a readiness probe fails:
      • We emit the first failure
      • We then emit a failure every 1 hour if still failing
        • Should this event be the same as the first failure, or should it perhaps be something like "Probe still failing since [first failure time]"?
    • No changes to liveness probes failing for now:
      • This will still cause mass event emission that hits the rate limit, but I want to tackle this incrementally and follow up on liveness probes
      • Lots of users watch for liveness probe failed events, so it's something to be particularly careful about in my opinion (people of course watch readiness/startup probes too, but I'd assume not as many, and that liveness probes are the most common probe type)
    • Thoughts from the group about this approach? If happy I can put together a KEP. (A sketch of the proposed emission policy follows these notes.)
  • [intunderflow] https://github.com/kubernetes/kubernetes/pull/115963 needs approver - I'd like to target this for 1.28 if no objections
  • [ffromani] REQUEST: looking for approvers for (all items already part of 1.28 tracking document)
  • [swsehgal] Proposing NodeResourceTopology API under kubernetes-sigs: https://github.com/kubernetes/org/issues/4224. Previously the API was proposed under staging but that proposal was rejected during API review.
    • +1: Alexander +1: Francesco
  • [astoycos] Super Short Introduction of https://github.com/bpfd-dev/bpfd (propose an actual 15-20 minute presentation for next week?) Also reach out in K8s slack #bpfd and #ebpf
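
A minimal sketch of the emission policy proposed in the first item above: always emit on a Success/Failure transition, and re-emit a still-failing startup/readiness probe at most once per hour. Names and structure are illustrative, not the kubelet's prober code.

```go
package probeevents

import "time"

const reEmitInterval = time.Hour // "every 1 hour if still failing"

// emitter tracks one probe's last result so events fire on transitions and
// at most hourly while the probe keeps failing.
type emitter struct {
	haveResult  bool
	lastHealthy bool
	lastFailure time.Time // when the last failure event was emitted
}

// shouldEmit reports whether the probe result observed at now warrants an event.
func (e *emitter) shouldEmit(healthy bool, now time.Time) bool {
	transition := !e.haveResult || healthy != e.lastHealthy
	e.haveResult, e.lastHealthy = true, healthy

	switch {
	case transition:
		if !healthy {
			e.lastFailure = now
		}
		return true // always emit on Success<->Failure transitions
	case !healthy && now.Sub(e.lastFailure) >= reEmitInterval:
		e.lastFailure = now
		return true // "probe still failing since [first failure time]"
	default:
		return false
	}
}
```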

May 16th, 2023

Recording: https://www.youtube.com/watch?v=gnbV1nrXVZc

Agenda:

  • [everpeace] I opened a PR for KEP-3169.
  • [tzneal] Discuss using the cgroup aware OOM killer https://github.com/kubernetes/kubernetes/pull/117793
    • KEP needed for the API change?
    • Potential Options
      • No config, just a new default
      • Add API to Container to allow workload specific configuration
      • Add flag to kubelet
    • [Dawn] Let's just change the default
    • [mrunal] OK with this.
    • Dawn will comment on the PR.
  • [mimowo] Can this refactoring be led by sig-node?
    • Alternatively, can we go with the simple approach of adding the condition whenever the timeout is exceeded, as suggested in the POC PR: https://github.com/kubernetes/kubernetes/pull/117973? Then we could document that the behavior when the timeout is exceeded but the containers aren't killed (and they terminate on their own) is subject to change. Proposed KEP updated for review: https://github.com/kubernetes/enhancements/pull/3999
  • [SergeyKanzhelev] Sidecar KEP: https://github.com/kubernetes/enhancements/pull/3968/files and https://github.com/kubernetes/kubernetes/pull/116429
  • [mo] looking for a way to provide dynamic environment variables at runtime without persisting them in the Kube API (because the contents are sensitive)
    • would like to avoid any approach that uses admission to mutate pods
    • [Anish Ramasekar to Everyone (10:43 AM)] This is the subproject: https://github.com/kubernetes-sigs/secrets-store-csi-driver
    • [Sergey] will this help: https://github.com/kubernetes/enhancements/issues/3721?
    • An init container can download them, and then the regular container will use them.
    • [mo] this ^^^ can work. Is this the right way?
    • [kevin] are you familiar with DRA? CDI is the lowest level that makes an abstract notion of a device available to a container. CDI can inject environment variables into the container. There may be a "device" that performs all the vault work and then injects those variables into the container
    • [mo] what is the security model?
    • [kevin] this information will end up statically stored in a CDI file on the host system
    • [mo] is there a way to observe this from the Kubernetes API?
    • [kevin] DRA is a generalization of the persistent volumes API. So it will provide some isolation.
    • [Sasha] this will not protect from exec into the container; but no env variable approach would.
    • [mo] can other containers see it? Non-privileged ones, for example.
    • [mo] what is the interface for DRA? Can it be a DaemonSet at runtime?
    • [kevin] there is a talk about it at KubeCon. It has all the pieces to build this.
      • [Kevin] Here is my talk on how DRA drivers are structured:
  • [klueska] New Feature for 1.28: Add CDI devices to device plugin API (a minimal CDI spec sketch follows)
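
To ground the CDI thread above (env injection via a "device", and the device plugin feature): a CDI spec is a JSON/YAML file on the host whose containerEdits get applied to any container that requests the named device. A minimal sketch; the kind, device name, and injected variable are hypothetical, and only a small subset of the spec format is shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of a CDI spec: one named device whose containerEdits inject
// environment variables into requesting containers.
type containerEdits struct {
	Env []string `json:"env,omitempty"`
}

type device struct {
	Name           string         `json:"name"`
	ContainerEdits containerEdits `json:"containerEdits"`
}

type cdiSpec struct {
	CDIVersion string   `json:"cdiVersion"`
	Kind       string   `json:"kind"`
	Devices    []device `json:"devices"`
}

func main() {
	spec := cdiSpec{
		CDIVersion: "0.6.0",
		Kind:       "example.com/secrets", // hypothetical vendor/class
		Devices: []device{{
			Name:           "vault-env",
			ContainerEdits: containerEdits{Env: []string{"SECRET_TOKEN=from-host"}},
		}},
	}
	out, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(out)) // such specs live under /etc/cdi/ or /var/run/cdi/
}
```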

---- MOVED from 5/2/2023. Move above this line if you plan to show up at the meeting ----

May 9th, 2023

Recording: https://www.youtube.com/watch?v=18cRhXTf0Cc

Total active pull requests: 242

Incoming: Created 19, Updated 118
Completed: Closed 9, Merged 17
  • [swsehgal] Community discussion on device Manager recovery bugfix backport
  • [karthik-k-n] Community thoughts on Dynamic Node resize proposal
  • [clayton] Discussion of kubelet state improvements for 1.28 - trying to identify which areas to focus on
  • [zmerlynn] Discuss
    • Dawn: Maybe make the first restart free, don't punish
    • Clayton: DaemonSet that runs effectively a for loop to anneal policy
      • There are things we don't account for, like system resources in a crash-looping pod - what does it actually cost to restart a container?
    • Derek (on chat): I wonder if we need a way to measure a QPS generally for the behavior that crashloopbackoff is trying to protect
      • systemd gives StartLimitBurst, and when that is exhausted you go to StartLimitInterval... feels like we could give a burst (a sketch of this idea follows these notes)
    • Sergey: Maybe we also need "it's a bad failure, reschedule me"
    • David: Is it up to the admin to define this?
    • Kevin: KEP in question that Sergey mentioned: https://github.com/kubernetes/enhancements/pull/3816
    • Clayton: Full backoff doesn't make sense for a static pod anyway
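
One way to model Derek's systemd analogy above: a restart budget that permits a burst of quick restarts inside a sliding window and then refuses until the window clears. Entirely hypothetical, not kubelet code.

```go
package restartbudget

import "time"

// Budget allows up to Burst restarts per Interval, echoing systemd's
// StartLimitBurst / StartLimitInterval pairing from the discussion.
type Budget struct {
	Burst    int
	Interval time.Duration
	starts   []time.Time // timestamps of recent allowed restarts
}

// Allow reports whether a restart at now fits the budget.
func (b *Budget) Allow(now time.Time) bool {
	// Drop restarts that have aged out of the sliding window.
	kept := b.starts[:0]
	for _, t := range b.starts {
		if now.Sub(t) < b.Interval {
			kept = append(kept, t)
		}
	}
	b.starts = kept

	if len(b.starts) >= b.Burst {
		return false // burst exhausted; caller backs off until the window clears
	}
	b.starts = append(b.starts, now)
	return true
}
```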

—-- End of the meeting. MOVED TO THE NEXT WEEK —--

May 2nd, 2023

Recording: https://www.youtube.com/watch?v=whN6nPOp62g

Total active pull requests: 241

Incoming: Created 27, Updated 88
Completed: Closed 11, Merged 25

Apr 25th, 2023

Recording: https://www.youtube.com/watch?v=oQi3gPsODV0

  • [intunderflow] https://github.com/kubernetes/kubernetes/pull/115963 needs approver
  • [intunderflow] Thoughts on startup probe / readiness probe event emission behavior?
    • Currently the readiness and startup probes emit ContainerUnhealthy events each time they probe the container and it is unhealthy.
    • For liveness probes, a container going from healthy to suddenly unhealthy is important and notable, but for readiness and startup probes it's pretty typical for a container to be unhealthy, since the point of these probes is to wait until the container is healthy.
    • Emitting these events eats into the rate limit of 25 events per object sent to the API server.
    • Readiness and startup probes failing multiple times is pretty typical of their operation, since their point is to gate the container until it succeeds.
    • It would be nice if readiness and startup probes didn't eat events as fast as they do.
    • My thoughts and opinions:
      • We could consider changing the startup and readiness probes to only emit when they probe the container and it is healthy (since that leads to a change in state and action being taken)
      • My PR above (if approved) would still report if a startup or readiness probe fails conclusively against a container
    • [Action Item] Count incrementing on Events - why is it not working for failing probes?
    • [Ryan] The event recorder has a max retries of 12
    • https://github.com/kubernetes/client-go/blob/master/tools/record/event.go#L38
    • [Todd] we need events to be re-emitted periodically, not discarded universally: lower frequency, but definitely keep the events we want to know about, like readiness probe flakes.
  • [SergeyKanzhelev] Probes functionality cleanup: https://docs.google.com/document/d/1G5nGH97s3UTANbA5IyQ7nVIHnrLKfgVZssSYnvp_qX4/edit
  • [haircommander/Peter] Kubelet drop-in config support
    • After the conversation about dropping CLI flag support, it came to light that our users (downstream in OpenShift) rely on this feature. Could be a good time to introduce drop-in file support, e.g. /etc/kubernetes/kubelet.conf.d (a merge-order sketch follows these notes)
    • Peter to make a proposal to SIG-Arch to see if other components would like to adopt a similar pattern, as well as open an issue to have an asynchronous conversation.
  • [SergeyKanzhelev] https://github.com/kubernetes/kubernetes/pull/116429 , uber issue: https://github.com/kubernetes/kubernetes/issues/115934
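
A sketch of the drop-in semantics floated in the kubelet config item above: read fragments from a conf.d-style directory in lexical order, later files overriding earlier keys. The directory name and the shallow top-level merge are simplifying assumptions, not the eventual KEP behavior.

```go
package dropin

import (
	"os"
	"path/filepath"

	"sigs.k8s.io/yaml"
)

// LoadConfigDir overlays YAML fragments from dir onto base. os.ReadDir
// returns entries sorted by filename, which gives the lexical ordering.
func LoadConfigDir(base map[string]interface{}, dir string) (map[string]interface{}, error) {
	entries, err := os.ReadDir(dir) // e.g. /etc/kubernetes/kubelet.conf.d
	if err != nil {
		return nil, err
	}
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		frag := map[string]interface{}{}
		if err := yaml.Unmarshal(data, &frag); err != nil {
			return nil, err
		}
		for k, v := range frag {
			base[k] = v // shallow merge: the last file wins per top-level key
		}
	}
	return base, nil
}
```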

Apr 18th, 2023

No call (kubecon)

Apr 11th, 2023

Recording: https://www.youtube.com/watch?v=R9bml9YmP3k

  • [klueska] Need approval on PR to update DRA KEP with changes merged into v1.27
  • [liggitt/derek] proposal to support node/control-plane skew of n-3 (KEP-3935, draft proposal)
    • What in-progress node feature / cleanup rollouts rely on n-2 skew?
      • might delay default-on of in-place-resize for one release (AI: jordan / vinay sync up); notes from jordan/vinay 2023-05-03:
        • a 1.27+ node with the feature disabled will not modify resources as requested, will mark pods requesting resize as "infeasible"
        • a pre-1.27 node will not modify resources as requested, with no user feedback
        • after 1.27 work, we realized that kubelet perpetually reports pod resize as InProgress when running against a containerd that supports UpdateContainerResources CRI API (containerd ~1.4/~1.5 era) but does not support ContainerStatus CRI API (added to CRI API in k8s 1.25, supported in containerd 1.6.9+), so there's already user feedback improvements to make and possibly delay beta for
        • if we were ready to promote in-place-resize to beta in 1.29, n-3 skew would mean 1.26 kubelets would not give any user feedback about lack of support for the feature, but would otherwise fail safe
    • derek:
      • include alternative considered of supporting in-place minor upgrades, rationale why that approach wasn't chosen
        • OS upgrades, immutable nodes can't use in-place for minor upgrades
        • cost of supporting/testing in-place minor upgrades is significantly higher, impacts development of new features and evolution of existing features
      • make sure it is clear what guidance should be given to people working on new features for what to do for features older kubelets don't support yet
  • [mweston & atanas] Still working on the https://github.com/obiTrinobiIntel/enhancements/tree/atanas/cci-updated/keps/sig-node/3675-resource-plugin-manager KEP. Need help with scheduling time with Dawn or another member to get feedback.
  • [mrunal] Canceling next week's meeting for kubecon.

Apr 4th, 2023

Recording: https://www.youtube.com/watch?v=Y_TWnklb0vI

  • [pacoxu] undeprecate kubelet --provider-id flag: what are your plans around graduating kubelet config file/actually deprecating these flags in the future?
  • [iancoolidge] Follow-up on issue https://github.com/kubernetes/kubernetes/issues/115994
  • [rata] Userns KEP 127: add support for stateful pods
    • We don't need code changes in the kubelet for this (just a change to validation)
    • Therefore, we want to just change the scope of the KEP to support stateful pods too
    • We want to deprecate the feature gate "UserNamespacesStatelessPodsSupport" and add "UserNamespacesSupport"
    • This new feature gate will activate userns for all pods (stateful and stateless); see the pod sketch after these notes
    • If this sounds good, we will do a PoC and propose the KEP changes widening the scope and explaining how the stateful case works too.
      • [mrunal] This may be okay, but let's open a KEP change and get the opinions of other reviewers involved.
      • [mrunal] We need to start thinking about how user namespaces will work with pod security policies.
      • [rata]: Mrunal and I will join sig-auth to start the PSS conversation
      • [rata] Maybe they need fields to be GA? But happy to start discussing.
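
For context on the scope change above: the stateless feature is switched per pod via the hostUsers field, and the notes say only validation needs to change to admit stateful pods. A minimal sketch of a pod opting into a user namespace using the Go API types; the names and image are illustrative.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	hostUsers := false // false = give the pod its own user namespace
	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "userns-demo"},
		Spec: corev1.PodSpec{
			HostUsers: &hostUsers,
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.k8s.io/pause:3.9",
			}},
		},
	}
	fmt.Printf("pod %s requests hostUsers=%v\n", pod.Name, *pod.Spec.HostUsers)
}
```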

Mar 28th, 2023

Recording: https://www.youtube.com/watch?v=yb_LtE0hGDc

  • [SergeyKanzhelev] Annual Report: https://github.com/kubernetes/community/pull/7220

    Let's edit together: https://docs.google.com/document/d/17Z3LO3pSdv9R-v9yLIMO5a46nwXRQTsaEDg0iN74rhs/edit?usp=sharing

  • [jlpedrosa]

    • memory.oom.group setting to OOM-kill the whole cgroup in the container.
      Slack convo.
      • [Mrunal] container level makes sense
      • [Sergey] for sidecars we will adjust the OOM score so it's almost the "whole Pod" being killed
      • [Mrunal] we can start with the issue, may not need a KEP for this
      • [Todd Neal] I think there is potential for API surface, as the new behavior may not be desired in all cases. haproxy was the example brought up in Slack where it may handle OOM correctly as a single process. Most everything else probably doesn't, so you might want a default of turning oom.group on and allowing containers to opt out (a sketch of the knob follows these notes).
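
A minimal sketch of the knob under discussion, assuming direct access to a cgroup v2 hierarchy; the path is illustrative, and this is not how the kubelet change actually plumbs it.

```go
package oomgroup

import (
	"fmt"
	"os"
	"path/filepath"
)

// EnableGroupKill writes "1" to memory.oom.group for a container's cgroup,
// asking the kernel (cgroup v2) to kill every process in the cgroup as a
// unit whenever the OOM killer targets any one of them.
func EnableGroupKill(containerCgroup string) error {
	p := filepath.Join("/sys/fs/cgroup", containerCgroup, "memory.oom.group")
	if err := os.WriteFile(p, []byte("1"), 0o644); err != nil {
		return fmt.Errorf("enable memory.oom.group: %w", err)
	}
	return nil
}
```

An opt-out per container, as Todd suggests, would just leave the file at its default of "0" for workloads like haproxy that handle single-process OOM kills correctly.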

Mar 21st, 2023

Recording: https://www.youtube.com/watch?v=IjxUleYcKgk

Mar 14th, 2023

Recording: https://www.youtube.com/watch?v=e0DA7x4zTs0

Total PRs: 200

Incoming: Created 86, Updated 203
Completed: Closed 35, Merged 103

Needs approval (label:lgtm -label:approved): 41

https://github.com/kubernetes/kubernetes/issues?q=is%3Aissue+is%3Aopen+label%3Apriority%2Fcritical-urgent++label%3Asig%2Fnode+

https://github.com/kubernetes/kubernetes/issues?q=is%3Aissue+is%3Aopen+label%3Apriority%2Fimportant-soon++label%3Asig%2Fnode+

Mar 7th, 2023

Recording: https://www.youtube.com/watch?v=KgAR613c1Bs

Total PRs: 241

Incoming: Created 35, Updated 136
Completed: Closed 14, Merged 31
https://github.com/kubernetes/kubernetes/issues?q=is%3Aissue+is%3Aopen+label%3Apriority%2Fimportant-soon+label%3Asig%2Fnode

Feb 28th, 2023

Recording: https://www.youtube.com/watch?v=IHcI6Jwo5PQ

Total PRs: 248

Incoming: Created 36, Updated 83
Completed: Closed 9, Merged 11

Feb 21st, 2023

Recording: https://www.youtube.com/watch?v=Hod1MGk99lc

Total PRs: 230

From Jan 24th:

Incoming: Created 129, Updated 177
Completed: Closed 48, Merged 76

Feb 14th, 2023

Recording: https://www.youtube.com/watch?v=NsV9TVcJw54

Feb 7th, 2023

Recording: https://www.youtube.com/watch?v=cam97qjy8qE

27 KEPs: https://github.com/kubernetes/enhancements/issues?q=is%3Aissue+is%3Aopen+label%3Asig%2Fnode+milestone%3Av1.27+label%3Alead-opted-in+

Jan 31st, 2023

Recording: https://youtu.be/96DTU9ncSLA

[KEPS REVIEW]: 15 minutes

Jan 24th, 2023

Recording: https://youtu.be/NQaTeTfI9UY

Incoming: Created 29, Updated 90
Completed: Closed 12, Merged 14

Jan 17th, 2023

Recording: https://youtu.be/wirWRKSqY10

Total PRs: 217

Incoming: Created 30, Updated 103
Completed: Closed 16, Merged 16

Jan 10th, 2023

Recording: https://youtu.be/5V0uRxH4O4k

  • ~~[pacoxu] KEP-3610: namespace-wide global env injection #3612, not sure if this can be an admission controller~~ (removed because mutating CEL admission should be the final solution)
  • [ruiwen/pacoxu] KEP-3673: Kubelet limit of Parallel Image Pulls #3713
  • [klueska] Update CRI to include CDI devices (needed by DRA before moving to beta)
  • [QuentinN42] Add FileEnvSource and FileKeySelector to add environment generated on the fly #114674
    • Sourcing from any file from any source may be too big a scope. Would limiting this to emptyDir files be enough?
    • Security: is there a risk that sourcing some secret as an environment variable would expose a file that wasn't available otherwise?
    • Action: Need to move this to kubernetes/enhancements as a KEP and follow the process. => https://github.com/kubernetes/enhancements/issues/3721
    • [Mike Brown] fyi.. not sure if this is the right pattern but NRI plugins support modifying environment variables for the containers. might be useful at least for prototyping
    • [QuentinN42] another question is error conditions depending on the file format
    • [Alexander Kanevsky] my first impression: the env variables are populated in the OCI spec before the container starts. Sourcing from some file inside the container might not be feasible...
    • [Mike Brown] right, it would require a set for any env change happening in prestart (which could be done by setting a runc hook via NRI or the hook schema, or just doing the set on the update response)
  • [vinaykul] InPlace Pod Vertical Scaling PR - status update
    • I won't be in the Node meeting today due to another 10 am meeting.
    • Please review and merge KEP update PR
      • Updated beta target to v1.29
      • Added details on handling version skew.
    • Tim prefers that we merge PR 102884 in its entirety as opposed to merging API PR 111946 followed by the rest of it a week later.
  • [derek] sig updates

Jan 3rd, 2023

Recording: https://www.youtube.com/watch?v=AG3U91-5keo

Total active pull requests: 205

Incoming: Created 45, Updated 144
Completed: Closed 22, Merged 18