	SIG Node Meeting Notes
Future
- Monitoring pipeline proposal
 - Virtual Kubelet (@rbitia)
 - PR Review Request - #63170 - @micahhausler
 - Image name inconsistency in node status - #64413 - @resouer
 - [Jess Frazelle / Kent Rancourt]: Proposal for kubelet feature to freeze pods placed in a hypothetical "frozen" state by the replication controller in response to a scale-down event. Enable pods to be thawed by socket activation.
 
Dec 18
- Graduate SupportPodPidsLimit to beta (Derek Carr)
- UserNS remapping: updated proposal, implementation PR (vikasc):
 - Questions raised in the Zoom chat:
  - From Mike Danese to Everyone: (10:42 AM)
    I would like to reconcile the "Motivation" section with how we want people to use users and groups in general even without remapping. We want tenants to run workloads as different users and groups to segment their disk access to improve security, and we want userns to support compatibility with images that expect to be running as a given uid/gid. The current proposal uses a file in /etc/... which makes getting both hard. Any path towards both these goals? yup exactly
  - From Patrick Lang to Everyone: (10:44 AM)
    bf48175c42/contributors/design-proposals/node/node-usernamespace-remapping.md (sandbox-type-runtimes) - this proposes using node selectors to select nodes that can support this. Should this be designed to work with RuntimeClass instead, to avoid using node selectors?
  - From Mike Danese to Everyone: (10:45 AM)
    Single mapping on a node seems problematic IMO. Namespace would be better. Per pod would be best. I'd like to see that explored, thanks
  - From 刘澜涛 to Everyone: (10:46 AM)
    Just some background: it is possible to support per-pod user namespace mapping in containerd. It is already supported; we just need to add the corresponding CRI implementation and kubelet support.
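As background for the remapping discussion above, a minimal sketch of the ID-shifting arithmetic behind user-namespace remapping: a UID inside the container is offset into an unprivileged host UID range, so container root (UID 0) is not host root. The range size and offset below are illustrative assumptions, not values from the proposal.

```python
# Hypothetical sketch of user-namespace UID remapping: each UID in the
# container's range is shifted by a per-namespace host offset.
CONTAINER_ID_RANGE = 65536   # assumed size of the mapped UID range
HOST_ID_OFFSET = 100000      # assumed first host UID backing the namespace

def container_to_host_uid(container_uid: int,
                          host_offset: int = HOST_ID_OFFSET,
                          range_size: int = CONTAINER_ID_RANGE) -> int:
    """Map a UID inside the user namespace to the backing host UID."""
    if not 0 <= container_uid < range_size:
        raise ValueError("UID outside the mapped range")
    return host_offset + container_uid

# Root inside the container is an unprivileged UID on the host:
print(container_to_host_uid(0))    # 100000
print(container_to_host_uid(33))   # 100033
```

A single node-wide offset is the "single mapping on a node" case Mike raises; per-namespace or per-pod mappings would use a different offset per tenant or pod.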
 
Dec 11
Cancelled due to KubeCon Seattle
Dec 4
- Proposal: node shut down handling (Jing Xu)
 - Mechanism to apply/monitor quotas for ephemeral storage: https://docs.google.com/presentation/d/1I9yYACVSBOO0SGB0ohvpTImJSCc2AaOTuO6Kgsg4IIU/edit?usp=sharing
 
Nov 27
- Adding Termination Reason (OOMKilled/etc) Event or Counter (Brian)
 - Following discussion in sig-instrumentation, discuss either:
  - A. generating a logline / event for the Pod termination reason, or
  - B. adding a termination-reason counter in cadvisor/kubelet that can be exported to Prometheus
 
 - Want better metrics when a container is killed.
 - Want a count of all the termination reasons.
 - Pod has a last state, which has a termination reason.
 - kube-state-metrics has a termination reason, but as soon as the pod restarts it's gone.
 - Would be nice to have a counter of how many containers have been killed for each possible reason (OOM, etc.).
 - At minimum, it would be nice to have an event with the termination reason.
 - On top of this, it would be nice if cadvisor or the kubelet could export a counter (e.g. for these events).
 
 - Relevant Issue is: "Log something about OOMKilled containers" https://github.com/kubernetes/kubernetes/issues/69676
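Option B above amounts to accumulating a per-reason counter and exposing it in Prometheus text format. A minimal sketch (metric and function names are hypothetical, not an existing kubelet/cadvisor API):

```python
# Sketch of a termination-reason counter exported in Prometheus text
# exposition format. Names are illustrative only.
from collections import Counter

termination_reasons = Counter()

def record_termination(reason: str) -> None:
    """Increment the counter for one observed container termination."""
    termination_reasons[reason] += 1

def render_prometheus(counter: Counter) -> str:
    """Render the counts in Prometheus text exposition format."""
    lines = ["# TYPE container_terminations_total counter"]
    for reason, count in sorted(counter.items()):
        lines.append(f'container_terminations_total{{reason="{reason}"}} {count}')
    return "\n".join(lines)

record_termination("OOMKilled")
record_termination("OOMKilled")
record_termination("Error")
print(render_prometheus(termination_reasons))
```

Unlike the pod's last-state field or kube-state-metrics, a counter like this survives restarts, which is the gap noted above.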
 
- Node-Problem-Detector updates (wangzhen127)
- adding a few new plugins:
- kubelet, docker, containerd crashlooping issues
 - more checks on filesystem and docker overlay2 issues
 
 - plans on making NPD easier to configure within k8s clusters
 
- wg-lts: a new working group [Dhawal Yogesh Bhanushali, dbhanushali@vmware.com]
 - Mailing list: https://groups.google.com/forum/#!forum/kubernetes-wg-lts
 - PR: https://github.com/kubernetes/community/pull/2911
 - Survey: https://docs.google.com/document/d/13lOGSSY7rz3yHMjrPq6f1kkRnzONxQeqUEbLOOOG-f4/edit
 - Slack: #wg-lts
 
Nov 20
Meeting canceled due to USA holiday (Thanksgiving)
Nov 13
no agenda
Nov 6
- Windows progress update [@patricklang]
 - Tracking remaining v1.13 [stable] work at https://github.com/PatrickLang/k8s-project-management/projects/1
 - As of 11/6, down to test & docs
 - Release criteria doc, includes test case list: https://docs.google.com/document/d/1YkLZIYYLMQhxdI2esN5PuTkhQHhO0joNvnbHpW68yg8/edit#
 
 - Change of node label name(space) of NFD labels (@marquiz) 
https://github.com/kubernetes-incubator/node-feature-discovery/issues/176#issuecomment-436166692
(related to NFD repo migration: https://github.com/kubernetes-incubator/node-feature-discovery/issues/175) 
Oct 30
- Node GPU Monitoring Demo (dashpole@google.com) https://github.com/dashpole/example-gpu-monitor#example-gpu-monitor
 - Feedback from Swati from Nvidia: the initial proposal with gRPC is very difficult to integrate. The new version with a socket is doable.
 - Next steps: dashpole to open a PR to add and test the new socket endpoint in 1.13; Swati from Nvidia to work on integrating the DCGM exporter with the endpoint.
- Topology Manager (formerly known as NUMA Manager) proposal [cdoyle]
 - https://github.com/kubernetes/community/pull/1680
 - TL;DR: Align topology-dependent resource binding within the Kubelet.
 - Agreed to move forward with the proposal.
 
 
Oct 9
- Move Node Feature Discovery to kubernetes-sigs (Markus Lehtonen)
 - https://github.com/kubernetes-incubator/node-feature-discovery/issues/175
- RuntimeClass scheduling (tallclair)
 
Oct 02
- 1.13 release
- Q4 planning discussion: https://docs.google.com/document/d/1HU6Ytm378IIw_ES3_6oQBoKxzYlt02DvOoOcQFkBcy4/edit?usp=sharing
Sept 18
- Discuss release notes for 1.12 (Derek Carr)
 - NUMA Manager Proposal Demo & Update (Louise Daly, Connor Doyle, Balaji Subramaniam)
 
Sept 11
- Kubelet Devices Endpoint (device monitoring v2)
 - Fix port-forward for non-namespaced Pods (@Xu)
  - https://docs.google.com/presentation/d/1x1B2_DFZ9VI2E_-pB2pzeYKtUxrK4XLXd7I0JWD1Z40/edit#slide=id.g4189217af3_0_58
  - Related to: containerd shim v2 + Kata etc.
  - Some comments from the meeting:
   - The kubelet doesn't want to see any containers/images that are visible to but not managed by it. So for a solution like method 3, those containers should not be visible to the kubelet at all; method 1 looks fine from the kubelet's perspective.
   - There is a debug-container proposal under discussion, which works quite similarly to method 3.
   - SIG Node would like to revisit the port-forward mechanism itself at the architecture level; however, the feature is required by many use cases (such as OpenShift), so it is essential.
 
 
 - RFC Improve node health tracking - https://github.com/kubernetes/community/pull/2640
- No discussion necessary at the moment
 
 
Sept 4
- Ephemeral storage (part 2)
 - Discussion of moving ephemeral storage & volume management to the CRI:
   https://groups.google.com/d/topic/kubernetes-sig-storage/v2DKu8kNIgo/discussion
 - Interested mentors/mentees post 1.12?
 
Aug 28
- Discuss ephemeral storage quota enhancement (Robert Krawitz @Red Hat) https://docs.google.com/document/d/1ETuraEnA4UcMezNxSaEvxc_ZNF3ow-WyWlosAsqGsW0/edit?usp=sharing_eil&ts=5b7effb9
 
Aug 21 proposed agenda
- Windows GA updates (Patrick Lang @Microsoft)
 - Just finished the sig-windows meeting; notes are at https://docs.google.com/document/d/1Tjxzjjuy4SQsFSUVXZbvqVb64hjNAG5CQX8bK7Yda9w/edit#
 - Discussed with sig-network and sig-storage, as suggested at sig-node several weeks ago
 - Based on the current quality and testing, the plan is to go GA in 1.13.
 
 - Sidecar-container proposal: https://github.com/kubernetes/community/pull/2148 (Joseph)
 - Kata & Container shim v2 Integration updates (Xu@hyper.sh, https://docs.google.com/presentation/d/1icEJ77idnXrRSj-mSpAkmy9cCpD1rIefPFrbKQnyB3Q/edit?usp=sharing)
- Shim V2 Proposal: https://github.com/containerd/containerd/issues/2426
 
 - Device Assignment Proposal: https://github.com/kubernetes/community/pull/2454
 
Aug 14
- Summary of the New Resource API follow-up offline discussions (@vikaschoudhary16, @jiayingz, @connor, @renaud):
 - We will start with the multiple-matching model without a priority field.
 - We will start with allowing ResourceClass mutation.
 
 - Summary of the New Resource API user feedback (@vikaschoudhary16, @jiayingz):
  - tl;dr: we received feedback from multiple HW vendors, admins who manage large enterprise clusters, and k8s providers representing large customer sets that the new Resource API is a useful feature that will help unblock some of their important use cases.
 
 
Aug 07
- Follow-up New Resource API KEP proposal (@vikaschoudhary16 and @jiayingz):
 - Using a Priority field for handling overlapping resource classes - should resource classes with the same priority be allowed?
 - Non-mutable or mutable?
 - Device plugin 1.12 enhancement plan: (@jiayingz)
 - Add list of CSI drivers to Node.Status (@jsafrane)
- https://github.com/kubernetes/community/pull/2487
 - sig-node will review
 - Major concern right away: node object is already too big. Jan will get API approval + sig-architecture approval to remove the old annotation (to save space).
 - Will continue next week.
 
 
July 31
- [tstclair & jsb] CRI versions and validation
- Docker 18.03-ce is the only installable version for Ubuntu 18.04
 - Validate against a docker API version: https://github.com/kubernetes/kubernetes/issues/53221
 - Sig-cluster-lifecycle requirements:
- Test dashboard to show the container runtime status:
- Docker: https://k8s-testgrid.appspot.com/sig-node-kubelet, https://k8s-testgrid.appspot.com/sig-node-cri
 - Containerd: https://k8s-testgrid.appspot.com/sig-node-containerd
 - CRI-O: https://k8s-testgrid.appspot.com/sig-node-cri-o
 - Pending work to move to the newly defined test jobs for CRI
 
 - A central place of document to tell users how to configure each container runtime.
- Follow up here - https://github.com/kubernetes/website/issues/9692
 
 
 
 - 1.12 Feature Freeze Review (Dawn/Derek)
- Planning doc for discussion
 - List of feature issues opened
 
 
July 24
 - [tstclair & jsb] CRI versions and validation
- Docker 18.03-ce is the only installable version for Ubuntu 18.04
 
 
July 17
- Device "Scheduling" proposal (@dashpole) slides
 - New Resource API KEP proposal (@vikaschoudhary16 and @jiayingz)
- Discussed the user stories we would like to enable through this proposal
 - Discussed goals and non-goals. In particular, discussed why we proposed to use ResourceClass to express resource property matching constraints instead of directly expressing such constraints in Pod/Container spec. These reasons are documented in the non-goal section
 - Question on whether it is common for people to create a cluster with multiple types of GPU, or whether people usually create multiple clusters, each mapped to one GPU type.
  - Answer: Yes, it is common. We have seen such customers from OpenShift, GKE, and on-prem.
 
 - Bobby: When we start to support group resources, and the ComputeResource-to-ResourceClass matching behavior becomes many-to-many, that will bring quite a lot of complexity and scaling concerns to the scheduler. People have already seen scheduler scaling problems when they use a very large number of extended resources.
  - Answer: Scaling is definitely a concern. We don't plan to support group resources in the initial phase. We need to collect scaling and performance numbers and carefully evaluate them after the initial phase before moving forward. We should publish performance numbers and best-practice guidelines to discourage people from misusing the building blocks we put in. We should also think about the mechanisms necessary to prevent people from getting into a "bad" situation.
 
 
 
July 10
- Follow-up from June 26
- Move cri-tools to kubernetes-sigs. https://github.com/kubernetes-incubator/cri-tools/issues/331
 - Move cri-o to kubernetes-sigs (Mrunal/Antonio).
   https://github.com/kubernetes-incubator/cri-o/issues/1639
 - Follow-up:
  - No disagreement in the SIG on moving out of the incubator org into the sigs org.
  - dawn/derek to ack on the issues and initiate the transfer.
 
 
 - Future of Node feature discovery (Markus):
   https://docs.google.com/document/d/1TXvveLiA_ByQoHTlFWtWCxz_kwlXD0EO9TqbRLTiOSQ/edit#
 - RuntimeClass follow-up: s/Parameters/RuntimeHandler/ (tallclair)
 - Float command refactor idea (tallclair)
 - New Resource API KEP proposal (@vikaschoudhary16 and @jiayingz)
 - Device "Scheduling" proposal (@dashpole)
 
July 3
Cancelled
June 26
- Move cri-tools to kubernetes-sigs. https://github.com/kubernetes-incubator/cri-tools/issues/331
 - Move cri-o to kubernetes-sigs (Mrunal/Antonio).
   https://github.com/kubernetes-incubator/cri-o/issues/1639
 - RuntimeClass KEP: https://github.com/kubernetes/community/pull/2290
  - Derek: How will RuntimeClass be used? What is the scope?
  - Derek: Can we have a concrete example to prove the API works?
  - Derek: Should the unqualified-image-name normalization problem be solved by RuntimeClass? If yes, it is not only the container runtime that needs to handle it; both the kubelet and the scheduler need to normalize image names based on the RuntimeClass.
  - How are storage and networking going to work in this sandbox model?
 
 - Future of Node feature discovery (Markus): 
https://docs.google.com/document/d/1TXvveLiA_ByQoHTlFWtWCxz_kwlXD0EO9TqbRLTiOSQ/edit# 
June 19
- User Capabilities (using ambient capabilities for non-root in container) @filbranden 
PR kubernetes/community#2285 
June 12
- Sandboxes API Decision (@tallclair)
 - Proposed to move forward with RuntimeClass, instead of a sandbox boolean
 - Some rationale behind the proposal:
  - Unblocks runtime extensions beyond sandboxes (e.g. Kata vs. gVisor, or Windows)
  - Provides a clean way of specifying pod overhead and sandbox configuration
 
 
 - APIsnoop e2e test to API mapping (@hh and @rohfle)
 
June 5
Cancelled
May 29
May 22
- Sandboxes API Proposal (@tallclair)
 - Q: Do we expect cluster admins to label nodes which support sandboxes?
  - It's not clear whether we should require all container runtimes to support sandboxes, or whether this is an optional feature.
  - This doesn't block alpha.
  - Ideally each node will only support one sandbox technology, but if users want to use different sandbox technologies, they may need different node setups and to label the nodes accordingly.
 
 - Q: Will this be included in conformance test?
 - Q: Can the sandbox be a runc implementation which meets the conformance?
- It is hard to define a conformance test suite to validate whether the sandbox meets the requirement.
 - We can only provide guideline for now.
 - We are still working on the definition, but one probable definition: independent kernel, and 2 layers of security?
 
 - If the sandbox definition is not clear, Derek is worried that:
  - Users may never, or always, set Sandbox=true.
  - If sandbox is only a boolean, then with sandbox=true a workload may work with one sandbox technology but not with another.
 - Q: Why not just expose an OCI-compatible runtime? That way we get something similar to the sandbox boolean, with more flexibility.
  - For example, we can have an admission controller to do the validation and set defaults for an OCI-compatible runtime, which does not necessarily need to be in core Kubernetes.
 
 - Q: What are the criteria to graduate this to beta/GA?
  - The alpha version is mainly to get user feedback; it is not a finalized API.
 
 - Q: Is this a Kubernetes 1.13 evolution, or a longer-term project?
  - A: 1.13 would be good if there is a strong signal that this works, but expect it to take longer. Would like to see this get into alpha in 1.12.
 
 - Q: Is SELinux blocked when sandbox=true?
  - AppArmor and SELinux are not blocked; the guest OS can support them to provide better isolation between containers inside the sandbox.
 
 - Q: Instead of changing all "Sandbox" in the CRI, can we rename it to a more specific name like "Kernel-Isolated", which also makes the API less confusing? "Sandbox" is too vague.
 
 
May 15
- Monitoring proposal update (@dashpole)
 - CFS quota change discussion : https://github.com/kubernetes/kubernetes/pull/63437
 - Dawn to sync with Vish; on the call, we seemed OK with a node-level knob to tune the period.
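For context on the period knob above: with the Linux CFS bandwidth controller, a container's CPU limit is enforced as quota = limit_in_cores × period, so tuning the period changes throttling granularity without changing the limit. A small sketch (default period is the kernel's 100ms):

```python
# CFS bandwidth: a CPU limit of N cores over a period P microseconds is
# enforced as a quota of N * P microseconds of CPU time per period.
def cfs_quota_us(cpu_limit_cores: float, cfs_period_us: int = 100_000) -> int:
    """Compute the CFS quota (microseconds) for a given CPU limit."""
    return int(cpu_limit_cores * cfs_period_us)

print(cfs_quota_us(0.5))                        # 50000 with the default 100ms period
print(cfs_quota_us(0.5, cfs_period_us=10_000))  # 5000 with a 10ms period
```

A shorter period enforces the same average limit in finer slices, trading scheduler overhead for lower tail latency on throttled containers.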
 
 - Cheaper heartbeats: https://github.com/kubernetes/community/pull/2090
 - Update on 1.11 development feature progress (reviews, etc.)
- https://docs.google.com/document/d/1rtdcp4n3dTTxjplkNvPDgAW4bbB5Ve4ZsMXBGOYKiP0/edit
 - sysctls to Beta fields need reviews (Jan)
 
 - CRI container log path:
  - Only add the pod name and namespace to the log path, to avoid regression: https://github.com/kubernetes/kubernetes/issues/58638
  - Support for more metadata (e.g. namespace UID) needs more thought; let's do it in the future.
   - e.g. we could potentially use a fluentd output filter plugin (e.g. https://github.com/bwalex/fluent-plugin-docker-format) + the node metadata endpoint proposed by @dashpole.
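To illustrate the idea above, a hypothetical sketch of a log path that embeds pod namespace and name, so a log collector can recover that metadata from the path alone. The exact directory layout and separators here are assumptions for illustration, not the scheme decided in the issue.

```python
# Hypothetical container log path embedding pod namespace, name, and UID.
import os

def container_log_path(namespace: str, pod_name: str, pod_uid: str,
                       container: str, restart_count: int) -> str:
    """Build a log path from which a collector can parse pod metadata."""
    pod_dir = f"{namespace}_{pod_name}_{pod_uid}"
    return os.path.join("/var/log/pods", pod_dir,
                        f"{container}_{restart_count}.log")

path = container_log_path("default", "nginx-1", "1234-abcd", "nginx", 0)
print(path)  # /var/log/pods/default_nginx-1_1234-abcd/nginx_0.log
```

The trade-off discussed above is exactly which fields to encode: name and namespace are human-meaningful but reusable over time, while a UID is stable but opaque.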
 
 
 
May 8
- Cancelled due to KubeCon and OpenShift
 
May 1
- Sysctl move to beta (@ingvagabund) (Seth)
- Discuss KEP for user namespaces (proposal, @mrunal, @vikas, @adelton)
- Add probe-based mechanism for kubelet plugin discovery (@vikasc)
 
 
April 24
- Discuss KEP for user namespaces (proposal, @mrunal, @vikas, @adelton)
- Online volume resizing (proposal, @gnufied)
- Maximum volume limit (proposal, @gnufied)
- SIG Node initial charter (https://github.com/kubernetes/community/pull/2065, @derekwaynecarr)
- Add probe-based mechanism for kubelet plugin discovery (@vikasc)
 
April 17
- Re-categorize/tag Node E2E tests (proposal, @yujuhong)
- We need to consider whether to include tests like the eviction test in the conformance suite. They cover essential features, but the tests have special requirements: 1) they need to run in serial; 2) they make some assumptions, such as about node size.
 
 - Virtual Kubelet (@rbitia) Updates
 - Add probe based mechanism for kubelet plugin discovery: https://github.com/kubernetes/kubernetes/pull/59963
 
April 10
- Windows Service Accounts in Kubernetes (@patricklang)
- https://github.com/kubernetes/kubernetes/issues/62038
 - How to support Windows Group Managed Service Accounts (gMSA)
  - WindowsCredentialSpec is a Windows-specific identity
  - Option 1: WindowsCredentialSpec references a secret
  - Option 2: WindowsCredentialSpec contains the JSON passed as-is in the OCI spec
  - No password information inside, only some information about the host.
 
 
 - Sandbox resource management - discussion questions
- Pod resources design proposal: https://docs.google.com/document/d/1EJKT4gyl58-kzt2bnwkv08MIUZ6lkDpXcxkHqCvvAp4/edit
 - If the user sets the overhead, how should the runtime controller deal with it?
  - Reject the pod - may cause problems on upgrade.
  - Overwrite it.

 - Derek: Can the node advertise/register the overhead to the control plane? Can the node report per-pod overhead in the node status, and let the scheduler make decisions based on that?
 - Derek: Can we make Kata a node resource, so that a pod that wants to run on Kata needs to request that resource?
 - Derek: A pod-level limit is easier for users to set.
  - Dawn: That may not be true in the future, given current Kubernetes trends, e.g. sidecar containers injected by Istio.
  - Derek: We can resolve the Istio issue by dynamically resizing the limit.
 - Dawn: Do we expect mixed sandbox and non-sandbox containers on a node?
  - Tim: Yes. The reason is that we have system containers that can't run in a sandbox, e.g. fluentd, kube-proxy etc.
 - Derek: Do we expect multiple runtimes on a single node, e.g. containerd, cri-o etc.?
  - Dawn: No. Each node should have one container runtime, but it can support running containers with different OCI runtimes, e.g. runc, kata etc.
 - Derek: What happens to best-effort pods?
  - Dawn: This is an issue for frakti today, so they still set a fixed limit.
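The sandbox-overhead accounting question above can be sketched simply: if a sandboxed pod carries a fixed runtime overhead, the scheduler must reserve the sum of the containers' requests plus that overhead. Numbers below are illustrative assumptions only.

```python
# Sketch of pod-overhead accounting: the schedulable footprint of a
# sandboxed pod is the container requests plus the sandbox's own overhead.
def effective_cpu_request(container_requests_millicores,
                          sandbox_overhead_millicores=0):
    """CPU (millicores) the scheduler must reserve for the pod."""
    return sum(container_requests_millicores) + sandbox_overhead_millicores

# Two containers requesting 250m and 100m, in a sandbox with 120m overhead:
print(effective_cpu_request([250, 100], sandbox_overhead_millicores=120))  # 470
```

This is the crux of Derek's questions: whoever knows the overhead (user, runtime controller, or node) has to surface it so the scheduler's sum is right.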
 
 - Container runtime stream server authentication https://github.com/kubernetes/kubernetes/issues/36666#issuecomment-378440458 *
 - Add probe based mechanism for kubelet plugin discovery: https://github.com/kubernetes/kubernetes/pull/59963
 - Node-level checkpointing: https://github.com/kubernetes/kubernetes/pull/56040
 - sig-node charter discussion
 
April 3
- Resource Management F2F update
 - Q2 planning draft (@dawnchen)
 - Node Controller question - https://github.com/kubernetes/kubernetes/issues/61921
- Node deduplicates multiple InternalIP addresses on a node
 - ask Walter Fender (@cheftako)
 
 
March 27
March 20
- Secure Container Use Cases updates (@tallclair)
 - Strongly suggesting the pod as the security boundary; @tallclair to write up the decision (welcoming feedback and/or disagreement)
 - Extra overhead of the sandbox: whom do we charge it to? Pod-level resource requests/limits? See updates in the solution-space doc
 - Need to think more about the volume model in sandboxes to enforce 2 security boundaries (and prevent issues like CVE-2017-100210[12])
 - sysctl: need to review the support list.
 - device plugins: need more consideration, contributions welcome
 
 - Proposal: Pod Ready++ (@freehan)
 - containerd f2f meeting update (@lantaol)
 - Virtual Kubelet (@rbitia)
- Doc posted during the meeting. Please read for next week.
 - https://docs.google.com/document/d/1vhrbB6vytFJxU6CrlMqlH6wC3coQOwtsIj-aFzBbPXI/edit#heading=h.dshaptx6acc from Ben VMWare.
 - Review the design next week
 
 - RawProc options for CRI (@jessfraz)
- Add use cases and how this will be incorporated in the kubernetes API
 
 
March 13
- Containerd F2F meeting notes: https://docs.google.com/document/d/1MrgDYOSTjysMPcc6D7OnaeIDEc48lMjQAB10EIKQ5Go/edit?usp=sharing
 
Mar 6
- CRI-O status updates (@mrunal)
- slide update: https://docs.google.com/presentation/d/1TQ6sBo63AXt6QF3LxA3jtnNzT330MpPDODORhLVcDH0/edit?ts=5a9ed52a#slide=id.g3134b94d16_0_0
 - pr help needed for dashboard: https://github.com/kubernetes/test-infra/pull/5943
 
 - Official Contributor Experience Reach-Out (@jberkus)
- Process for label and PR workflow changes
 - Mentoring!
 - Contributor Guide
 - Governance and charters
 - What can Contribex do for you?
 
 
Feb 27
- Virtual Kubelet implementation doc: (@rbitia) https://docs.google.com/document/d/1tu27_BquhUAmYLaJznjbbdLJKYrH5-stm1i0OcxRgE8/edit?usp=sharing
 
Feb 20
- Secure isolation discussion continue (@tallclair)
 
Feb 13
- Secure isolation update
- Owner: @tallclair
 - https://docs.google.com/document/d/1QQ5u1RBDLXWvC8K3pscTtTRThsOeBSts_imYEoRyw8A/edit?usp=sharing
 - Introduce solution space doc, goals & high-level overview
 - Discuss in a future meeting, after folks have time to digest the document
 
 - Collection of options to integrate kata with CRI
- Owner: @resouer, ~20 mins
 - Doc: https://docs.google.com/document/d/1PgXJpzSfhR_1idkNtcZNuncfUYV-U-syO-HSGxwQrVw/edit?usp=sharing
 - Introduce the existing & proposed ways of integrating Kata with CRI, a brief evaluation will also be included.
 
 
Feb 6
- CRI: testing policy
- Owner: yujuhong@
 - https://github.com/kubernetes/community/pull/1743
 
 - CRI: container runtime cgroup detection (or not)
- Owner: yujuhong@, lantaol@
 - https://github.com/kubernetes/kubernetes/issues/30702
 - The runtime stats are only used for monitoring today.
 - Do not change the CRI for this. Instead, the runtime cgroup can be passed to the kubelet through the existing flag runtime-cgroups.

 - CRI: Image Filesystem Storage Identifier
  - Owner: yujuhong@, lantaol@
  - https://github.com/kubernetes/kubernetes/issues/57356
  - Slides: https://docs.google.com/presentation/d/1BbXgmEVhH0p2cgoojN36Q6SMHZ8tl00kfjITmYEf9vM/edit?usp=sharing
  - The kubelet can use statfs to get filesystem stats; it is simple enough.
  - Should the kubelet get the image filesystem path through the CRI, or should the path be configured through a kubelet flag? The latter seems to be preferred.
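The statfs approach mentioned above, sketched with Python's stdlib `os.statvfs` (a wrapper over the same syscall family): given any path on the image filesystem, capacity and available bytes fall out of the block counts.

```python
# statfs-style filesystem stats: capacity and available bytes for the
# filesystem backing a given path, via the POSIX statvfs call.
import os

def fs_stats(path: str) -> dict:
    st = os.statvfs(path)
    return {
        "capacity_bytes": st.f_frsize * st.f_blocks,   # total size
        "available_bytes": st.f_frsize * st.f_bavail,  # free for unprivileged use
    }

stats = fs_stats("/")
print(stats["capacity_bytes"] >= stats["available_bytes"])  # True
```

This is why only the image filesystem *path* needs to reach the kubelet (via CRI or a flag): once it has the path, the stats themselves are one syscall away.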
 
 - Release reminder: sig-node's v1.10-marked issues and features need to be scrubbed. We're officially past feature freeze and code freeze is coming soon. More details in the email on kubernetes-dev today; the new issue list this week is largely SIG Node's.
 - Reminder: We are close (hopefully this week) to graduating the Kubelet's componentconfig API to beta status, as our existing TODO list is complete. Please take a look at https://github.com/kubernetes/kubernetes/pull/53833.
 - Reminder: NamespaceOption is changing in the CRI with #58973, runtimes will break. *
 
Jan 30
- Revised Windows container resource management (CPU & memory):
 - Owner: Jiangtian Li (jiangtli@microsoft.com), Patrick Lang (plang@microsoft.com)
 - https://github.com/kubernetes/community/pull/1510
 - What about containers without a resource request/limit? - A zero value is passed to Docker, and Docker will set a value based on current node usage.
 - CPU assignment? - Not supported on the Windows platform now.
 - Is in-place resource update supported on Windows? - Immutable today.

- Virtual Kubelet introduction
 - Owner: Ria Bhatia, ria.bhatia@microsoft.com
 - Context: Virtual Kubelet is an interface for plugging anything into a Kubernetes cluster to replicate the lifecycle of a Kubernetes pod. I'd like to kick off discussion of the concepts of VK, the design of the project itself, networking scenarios, and also come up with a formal provider definition.
 
 - Node e2e conformance tests: https://github.com/kubernetes/kubernetes/issues/59001
- Spreadsheet to re-categorize tests: please comment if you think a test should be added to or removed from the conformance suite. Please also suggest feature tags for tests.
 
 
Jan 23
- Invoke Allocate RPC call at container creation time (#58282)
 - Owner: @RenaudWasTaken rgaubert@nvidia.com
 - Context: Last week the resource-management workgroup tried to tackle a design issue related to the device plugin. After much discussion we agreed that we wanted more opinions on the different approaches that we currently have. I've created a document that captures the different approaches as well as their pros and cons.
 - We should define a separate per-container operation in the device plugin.
 - https://github.com/kubernetes/kubernetes/pull/58172 -- yuju to review
 
 - Logging improvements
- Derek Carr, Peter Portante
 - https://github.com/kubernetes/kubernetes/issues/58638
 - Namespace names can be reused over time; we may need the namespace UUID in the container log path.
 
 - CRI container log stats
 - Container log rotation
 - Node auto repair repository discussion
- Derek Carr
 - The desire is to have an open-source node remedy system that watches node conditions and/or taints to coordinate a remedy response across multiple clouds (for example, rebooting the node)
 
 
Jan 16
- Review Pending v1.Pod API Changes [@verb]
- Debug Containers: Command in PodStatus
- Requires keeping track of v1.Container in kubelet
 
 - Shared PID: ShareProcessNamespace in PodSpec
 
 - Windows container roadmap and Windows configuration in CRI (Patrick.Lang, jiangtli @microsoft.com)
- https://trello.com/b/rjTqrwjl/windows-k8s-roadmap
 - Windows container configuration in CRI:
- https://github.com/kubernetes/community/pull/1510
 - Questions:
- Question: Is there a pod-level sandbox isolation mechanism on Windows?
 - Answer: Currently only per-container kernel-level isolation; no hierarchical isolation like cgroups. They will work with containerd and cri-containerd to add hypervisor-level isolation at the pod level.
 - Dawn: Can we add e2e tests for the changes we make?
 - Dawn & Yuju: Can we have an overview of how resource management will work on Windows?
 - Dawn: Let's start with CPU and memory; in the current proposal there is no storage resource defined now.
 
 
 
 - Review request: Node-level Checkpointing Manager (Vikas, vichoudh@redhat.com)
- https://github.com/kubernetes/kubernetes/pull/56040
 - Just requires approval
 - It's good to have a common library instead of separate implementations in different parts. However, we should make it clear that we prefer not to add checkpoints.
 - Derek: Maybe we can have a document tracking all the checkpoints done by the kubelet, and whenever people add a new checkpoint they need to update the document.
 
 - Short discussion about the node-fencing progress (bronhaim@, redhat)
 
Jan 9
- crictl v1.0.0-alpha.0 demo. @lantaol
 - kube-spawn local multi-node cluster tool
- Using kubeadm and systemd-nspawn
 - Useful for: testing CRI or other Kubernetes patches
 - Demo: https://asciinema.org/a/152314
 
 - deprecating rktnetes in k8s v1.10 (tracking issue: https://github.com/kubernetes/kubernetes/issues/53601)
 - sig-node Q1, 2018 plan
- https://docs.google.com/document/d/15F3nWPPG3keP0pzxgucPjA7UBj3C31VsFElO7KkDU04/edit
 - much of the work carried from last quarter
 - draft includes priority and possible owners
 - please review and suggest changes before sharing with SIG PM
 
 
Jan 2
Cancelled.