
Kubernetes Storage SIG Meeting Notes (2016)

The Kubernetes Storage Special Interest Group (SIG) is a working group within the Kubernetes contributor community interested in storage and volume plugins. This document contains historical meeting notes from past meetings.

December 22, 2016

Recording: https://youtu.be/pMMAM6SNeeY

Agenda/Notes:

December 8, 2016

Recording: https://youtu.be/1GYAulB4mdM

Agenda/Notes:

  • Status Update:
    • Jing's containerized mounter changes went into 1.5 and were backported to 1.4.7
  • Storage Wish List (Brainstorming) for 2017 and 1.6 (Starter)
    • (AI) Create a separate doc to let people think it over and track this
    • [Matt De Lio] Some ideas:
      • Only schedule pods on nodes that support a given volume type
      • Containerized Mounts/Out-of-tree Volume Driver
      • Stabilizing AWS support for EBS
      • Snapshot
        • Need a consensus on what this really needs
      • Data Replication?
      • Local Storage Support
      • Mount Option Passthrough
  • Quarterly face to face meeting for Q1 2017
    • Dell EMC is willing to host in our office in Santa Clara
      • Potential for this to be at KubeCon EU too? [CML]
    • Timing: Late Q1 early Q2?
    • Should have an agenda set before the meeting to make discussion more focused.
  • [from last meeting] [rootfs] Metrics in volume controller
  • [jsafrane] Safer mount manager that does not accidentally delete user data
  • [jsafrane] Configurable ReclaimPolicy in StorageClass
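For reference, the configurable reclaim policy idea eventually landed as a field on the StorageClass object itself. A minimal sketch, with an illustrative in-tree provisioner and parameters:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow-retained
provisioner: kubernetes.io/gce-pd   # illustrative in-tree provisioner
parameters:
  type: pd-standard
reclaimPolicy: Retain               # keep the PV (and its data) after the claim is deleted
```

Without the field, dynamically provisioned PVs defaulted to Delete, which is the data-loss concern this agenda item was about.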

November 23, 2016

Since Thursday, November 24, 2016 is a holiday in the United States, we are moving this occurrence of the Storage SIG meeting one day early to Wednesday, November 23, 2016.

Recording: https://youtu.be/fhtaR0rsLBA

Agenda/Notes:

November 10, 2016

Recording: https://youtu.be/k1JRpMd051M

Agenda/Notes:

November 7, 2016

F2F meeting in Seattle, WA

October 27, 2016

Recording: https://youtu.be/gjb2X1QiKG0

Agenda/Notes:

  • Reminder: the meeting is now held on Zoom
  • [Saad Ali] Update on NFS/Gluster in GCI (GCE/GKE)
  • [Michael Rubin] On Containerization and the future of volume plugins and deployment (Flex, etc.)
  • [Hayley Swimelar (Linbit)] Adding support for DRBD Volumes. issue #32739
  • [Saad Ali] We have less than two weeks left until code freeze. Please list the outstanding PRs that really need attention:
    • PR #30285 - Proposal for external dynamic provisioner
    • PR #30091 - support Azure disk dynamic provisioning
    • PR #33660 - rbd attach/detach refactoring
    • PR #35284 - Add test for provisioning with storage class
    • PR #35675 - Require PV provisioner secrets to match type
  • [Erin Boyd] Status update on Testing
    • [bchilds] - update on AWS E2E test
  • [bchilds] External NFS Provisioner incubator proposal
  • [Simon Croome] StorageOS introduction / plans for Kubernetes integration
  • [bchilds] External Storage Provisioner - https://github.com/kubernetes/kubernetes/pull/30285
  • [Steve Wong] Storage SIG face to face Nov 7 logistics

October 13, 2016

Agenda/Notes:

  • Switching to Zoom
  • Updates on librbd and rbd-nbd integration with Kubernetes.
    • Radoslaw Zarzynski - Looking at TCMU in addition to rbd-nbd. Interested in building a bridge between the two. The plugin needs to address both, and doing so is possible since they are so similar. Taking requirements on the Ceph side; have people from Mirantis to do the development. Looking at in-tree instead of Flex, and plans to start development next week. Could support V2 vs V1 https://github.com/kubernetes/kubernetes/issues/32266
    • Love to see PoC and implementations coming. Please ensure krbd is not broken when introducing user space rbd.
  • Reminder: Please review the external storage provisioning PR https://github.com/kubernetes/kubernetes/pull/30285
  • Flex in 1.5 - Will not make 1.5, but will discuss at the November F2F. Working with Docker to come to consensus about the API such that it's the same between Flex and Docker. Should see updates on the proposal next week. Trending towards a socket model instead of exec. Proposals will be updated to reflect it.
  • November F2F Agenda: https://docs.google.com/document/d/1drdxPkZEiGA06-jnsbSqywzQ6z0bT8uMpJdnIzkRoPo/edit?ts=57ed5df2#
  • provisioning/deleting secrets on storage class
  • AWS Storage E2E Tests Update - Need an XFS filesystem on the host images.
  • DAS (local) storage - may be designed at the F2F. No target release.
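For context on the Flex discussion above: in the exec-based model of the time, a FlexVolume driver is a binary the kubelet invokes from its plugin directory, and pods consume it through the in-tree `flexVolume` source. A hedged sketch; the driver name and options below are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: flex-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    flexVolume:
      driver: "example.com/demo"   # hypothetical vendor/driver pair
      fsType: ext4
      options:
        volumeID: "vol-1"          # driver-specific, hypothetical option
```

The driver binary itself would live under the kubelet's exec plugin directory (e.g. /usr/libexec/kubernetes/kubelet-plugins/volume/exec/) and respond to calls such as init, mount, and unmount with JSON status output.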

September 29, 2016

Agenda/Notes:

September 15, 2016

Agenda/Notes:

September 1, 2016

Agenda/Notes:

  • v1.4 Status
  • v1.5 Priorities
    • Testing
    • Debuggability
    • Code hardening
    • Dynamic Provisioning:
      • Direction for “secrets” and endpoints on PV
        • Jan: we need to close on this.
        • Concern about backwards compat
        • Tim: 1.4 too late, hard to justify.
        • MRubin/PMorie: might be something we want to bring up at the community SIG, since other SIGs are touching on and interested in this area
        • Erin, Eric Paris, and Jan will help with this.
      • External provisioner
        • Tim may be able to look at it for 1.5 (about 3 weeks)
    • Flex Volume
      • Resolve open issues.
    • Local Storage
      • Beep and Clayton are driving, SIG should keep an eye on it
    • Snapshotting
      • Jing is driving
    • There is a kubernetes features repo, where issues are created and tracked (in v1.4 we did this last minute), for 1.5 we should do this early.
  • Switch from Hangouts to something else
    • Zoom?
      • Others use this
      • Requires an external binary
      • Seems to work fine
      • You need a Mac or Linux, no ChromeBook
      • Paid service; need an admin account for the SIG
    • Current limit is 30 on Hangouts.
    • Saad: We'll keep Hangouts until we hit the limit. If we start to have issues, I'll look into switching us to Zoom.
  • Discuss dates for next F2F
    • Nov 7, 2016 before KubeCon (Nov 8 to Nov 9) work for everyone?
      • Steve Wong has tentatively arranged for a conference room for 25 people at the EMC Seattle office near Pioneer Square. This includes lunch and hosting of a dinner at some nearby location TBD
        • Patrick from Samsung can help with organizing.

August 18, 2016

Agenda/Notes:

  • Update from F2F for Community
  • v1.4 Code Freeze this week
    • Status of in-flight PRs
      • Flex Volume
        • In review
        • Very large but well factored
        • Should be able to get it in before code freeze
        • Some open questions around attacher
      • Azure
        • Verifying functionality
        • Questions about performance
      • Quobyte
        • In review, no major issues.
      • Events
        • LGTM'd
  • Storage community test strawman - bchilds https://docs.google.com/document/d/17j5ofzOOhWUVBOJ3Uop-MUay2k1iJomBJuVCGxMZcMA/edit?usp=sharing
    • Can't test all plugins
      • Non-GCE cloud storage plugins can't run in GCE; need federated testing
  • Jan: StorageClass/PV security - jsafrane
    • Is it safe to store credentials to Gluster server on non-namespaced objects?
      • 1.4 storage classes with .
      • GlusterFS needs to store username/password etc. in a blob.
      • Problem with secrets is we don't have cross-namespace use of a secret from a non-namespaced context. Should get Brian Grant's opinion.
      • Parameters are a map of strings.
      • Need to handle this for more than Gluster. File on FS.
      • Will go forward with password in blob, and
  • Mark: 1.5 features question
    • Snapshots: would like to participate in design
      • Sync up with Jing
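The credentials-in-StorageClass concern discussed above was later addressed in the GlusterFS provisioner by referencing a namespaced Secret from the class parameters instead of inlining the password. A rough sketch of that eventual shape; the endpoint and secret names are illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gluster
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.example.com:8081"  # illustrative Heketi REST endpoint
  restuser: "admin"
  secretNamespace: "default"                 # credentials live in a namespaced Secret...
  secretName: "heketi-secret"                # ...rather than as a plain parameter blob
```

This keeps the non-namespaced StorageClass free of raw credentials while still letting the provisioner authenticate.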

August 11, 2016

F2F meeting in San Jose

August 4, 2016

Agenda/Notes:

  • Status Updates:
    • v1.3.4 went out on Monday
      • Lots of storage fixes
  • Face to face meeting - Everyone welcome - Logistics info -> https://docs.google.com/document/d/1qVL7UE7TtZ_D3P4F7BeRK4mDOvYskUjlULXmRJ4z-oE/edit
  • Tim: Internal/external plugins discussion
    • Probably want to save it for the face-to-face meeting. It's big.
  • Tim: With read-write-once storage volumes, multiple pods can use the same volume at once. This is inconsistent: if the pods happen to land on the same machine it will work; if not, it won't.
    • Two separate conversations if that is correct or not.
    • Share if on same machine AND bias scheduler to put on same machine.
    • Use case: people want to do rolling updates without down time
    • Concern: is this something that is unsafe? We can't guarantee it will always work for applications.
    • AI: would love to define the correct semantics and document them
    • Eric Paris: this is a bad idea.
    • Eric Paris: biasing scheduler could be a good idea. but two mounting at same time bad idea.
    • AI: make attacher/detacher smarter: don't detach if volume is already attached
      • Saad: may already do that.
    • Tim: What about a new Read-Write-Node access mode?
    • Steve Watt: maybe k8s should fail Read-Write-Once if it is used by multiple pods
    • Agreement: using Read-Write-Once in multiple pods is an anti-pattern
    • AI: Tim will follow up on the various conversations.
    • AI: Saad will check if Attach/Detach is already optimal for waiting pods. If not, file a low-pri bug.

July 21, 2016

Agenda/Notes:

  • Diamanti Demo
    • Action item: would like snapshotting support in flex
  • Discuss support for Cinder volumes without cloud-provider (cf hypernetes)
    • Enable Cinder volumes to attach to any VM or bare metal
    • Action Item: WIP will be sent out by Quentin
  • Dynamic Provisioning V2 https://github.com/kubernetes/kubernetes/pull/29006
    • Action item: need reviewers for PR
  • Face to face meeting
    • Decided on dates: Aug 10 and 11 with optional break out sessions on Aug 12
  • Post-mortems:
    • Google has a format.
    • Brad work on it with Michael?
  • Lots of storage improvements in v1.3.4
    • If anyone has spare cycles to help debug issues, please volunteer
  • Flex volume/pluggable volume discussion
    • Blocked on Tim Hockin
    • Would be great to discuss at F2F
  • Portworx welcome!

July 7, 2016

Agenda/Notes:

June 23, 2016

Agenda/Notes:

  • v1.3 status update
  • v1.4 priorities
    • Want early in 1.4 (1.3.x release):
    • Other priorities
        1. Improve Testing
          • Huge item
          • Performance testing
          • Testing core components
          • Plugins
        2. Continue to improve stability and robustness of existing code
        3. New Volume plugins
          • Encourage people who submit volume plugins to stick around and make sure that they continue to work.
          • Core team does not have the ability to test them.
        4. Local storage
          • Start on the design, maybe have some alpha-level code
    • Saad's feature list:
      • ---Wanted Feature/requests for volumes---
      • Improve Test Coverage
      • Continue to improve stability and robustness of existing code
        • Refactor some existing code
      • New Features
        • Finish Dynamic Provisioning
          • volume selectors (done)
          • volume classes
        • Container volume for pre-populated emptydir container #831 (probably read-only)
          • Empty dir with a claim?
        • LogDir
        • Data gravity/local storage
          • When scheduling a workload, express affinity to the nodes that provide its storage.
          • For ephemeral volumes (empty dir): use local storage instead of empty dir
          • AI (saadali): Open a PR for requirements so others can contribute
        • Enable mount namespace propagation to enable storage providers to run on k8s and to containerize kubelet https://github.com/kubernetes/kubernetes/pull/20698
        • Snapshotting support in volume API
        • Automatic resize increase/decrease of Attached/mounted disks https://groups.google.com/forum/#!topic/kubernetes-sig-storage/YLWEbTDKTHE
          • Steve Watt: Scaling up and snapshotting might be something we can not ignore. We are lagging behind the competition in comparisons.
        • FUSE FS support: would enable mounting Google Cloud Storage into a container #7890 /FUSE FS http://stackoverflow.com/questions/35966832/mount-google-storage-bucket-in-google-container?noredirect=1#comment59614874_35966832
        • Improve Deployment Story
          • User experience vs extensibility
          • Establish a concrete API and move plugins out of tree?
          • Containerize plugin binaries so no dependency on deployment of binaries to node.
          • Containerize volume plugins to enable dynamic loading of plugins on demand?
          • Magic: run a kubectl command, and X storage system is deployed with all appropriate binaries and APIs?
      • New Plugins
        • Nexenta
        • EMC
        • Azure VHD
        • Others pending?
      • Bug level work:
        • Rapid delete/recreation should schedule to same node and not detach (bgrant says p2 low user impact)
        • GCE PD attach/detach sometimes takes several minutes to complete--follow up with GCE team to figure out if this is normal/expected.
        • UX issues brought up by Erin ("at least 10GB" is confusing) (bgrant recommends adding limit to api)
        • Support dynamic max pod per node limit (scheduler)
      • Misc
        • Quotas: we need a design (esp. for creating 1000s of new volumes) (limit claims PR by derekcarr?)
        • Explore when does dynamic volume deletion happen?
        • consider using taints/toleration for resource reservation
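The mount namespace propagation item above (PR #20698) eventually surfaced in the pod API as a per-volumeMount setting. A hedged sketch of how a containerized storage provider might use it; the image name is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: storage-provider
spec:
  containers:
  - name: driver
    image: example.com/storage-driver:latest  # hypothetical provider image
    securityContext:
      privileged: true                        # typically required for mounting
    volumeMounts:
    - name: kubelet-dir
      mountPath: /var/lib/kubelet
      mountPropagation: Bidirectional         # mounts made in the container propagate to the host
  volumes:
  - name: kubelet-dir
    hostPath:
      path: /var/lib/kubelet
```

Bidirectional propagation is what lets a storage daemon running in a pod perform mounts that the kubelet (and other pods) can actually see.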

June 9, 2016

Agenda/Notes:

May 26, 2016

Agenda/Notes:

  • Status updates on pending work

May 19, 2016

Agenda/Notes:

May 12, 2016

Agenda/Notes:

April 28, 2016

Agenda/Notes:

April 14, 2016

Agenda/Notes:

March 31, 2016

Agenda/Notes:

  • Discuss binder/recycler/provisioner controller consolidation
  • Steve Watt: Current dynamic provisioning proposal covers two concepts: storage classes and dynamic provisioning
    • Propose: Break into two separate designs
    • Storage classes with manually provisioned PVs may be able to be split off
    • But the two features should be designed together.
    • How bad would it be to only have classes and not provisioning?
      • RH: will circle back.
    • Classes and profiles must be implemented before “next version” of dynamic provisioning.
    • Decided we're OK with shipping 1.3 with either, both, or neither feature.
  • Let's get more participation from non-RH/Google folks:
    • Steve Wong from EMC:
      • silent because things appear to be going just fine
    • Chakri from Datawise:
      • Would like to give a demo in 4 weeks for flex volume.
    • Tom from ConvergeIO
      • Notion of profiles that incorporate a lot of the auto-provisioning, dir mounting, etc, in the profile
      • Standard profile for work profiles.
      • Documenting REST API, should be in a position to share with the rest of the storage-SIG

March 17, 2016

Agenda/Notes:

  • Proposal to create a new way to match PVCs to PVs (“strict”) - Erin Boyd
    • Two Issues:
      • Need a way to control pod defined storage
        • Suggestions: Pod sec policy, admission controller
      • Need a way to specify exactly which PV a PVC should bind to
    • Action Item: Erin/Brad will create a new page under storage-sig on k8s wiki to track “User Stories”
  • Containerized mount Design Proposal - Huamin Chen
    • Proposal: ConfigMap + DaemonSet
    • Issues:
      • Docker dependency
      • Daemon is a single point of failure for FUSE FS
      • Single container vs multiple containers
      • Mounter pods should be subject to CPU/mem quota limits
      • Consider DaemonSet vs just scheduling a single pod
        • Only FUSE requires a long-running process
      • How does mount namespace propagation get triggered for k8s containers?
      • Auth mechanisms (so arbitrary user containers can't trigger mount/unmount)
  • Bind/Provision/Recycler Controller Consolidation PRs - Jan Safranek
    • Existing PR needs review
    • Suggest diagramming the existing flow and coming up with a detailed design proposal
  • attach/detach controller and mount/unmount redesign status - Sami Wagiaalla
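The “bind a PVC to exactly this PV” requirement discussed above maps to the claim's volumeName field, which pre-binds the claim and skips matching. A minimal sketch; the PV name is hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: strict-claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  volumeName: pv-exact-0001   # hypothetical PV name; binds to exactly this PV, no matching
```

The PV's own capacity and access modes still have to satisfy the claim, but no other PV will be considered.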

March 1, 2016

Agenda:

  • Containerized Volume Driver

  • EMEA SME concerns around PVs (Erin / Scott)

    • Customer concerns
      • Data replication
        • Moving data with PVs that are dynamically provisioned
      • Local storage
        • Data locality issues that shared storage brings
      • Selector label by security
      • Non-enforcement of labels
      • Manageability of storage when there is a 1:1 for PV->PVC (too many)
  • Dynamic Provisioning - Level of backwards Compat? Maintain old way (marked as Experimental), do we need to keep old behavior?

  • Jan Safranek - “Any chance to move the SIG to another day?” Pmorie - “This SIG overlaps with Node SIG.”

February 16, 2016

Agenda:

  • Nearly all Dynamic Provisioning talk
    • Discussed through the various means of parameterizing options to help the “many configMap” issue. Decided none of the approaches here are blocked by the current proposal.
    • Discussed using “storage-class” as a special field in claim.Spec.PVSelector. Decided that allowing arbitrary labels on PVs and Provisioners was better than magic keys.
  • Brought up API PRs that are required for provisioning
  • Brought up status of attach/detach controller. Work is ongoing.

February 9, 2016

Agenda:

  • Dynamic provisioner
  • 1.2 PRs

February 2, 2016

Agenda:

  • Dynamic provisioner PR selector vs resources (eparis)
  • PRs for 1.2 (thockin)
  • Azure Plugin (rootfs)
  • Refactoring catchup / attach-detach (saad-ali)
  • Fake Multi-write - yes or no.
  • XFS quota & emptydir

Dynamic provisioning:

  • eparis: selector vs resources, #17056 usability
    • What goes into a selector? Why are resources not selector?
    • Discovering the meaning of a selector is opaque
    • thockin: could be fixed in kubectl tooling “show me storage classes”
    • Steve: admin wants to catalog storage without users really understanding
    • eparis: as a new user what class do I use?
    • Sami: You know to use class as the key. Could maybe upgrade to a full-field.
    • Erin: How does the user know what to request?
    • thockin: need a default “” class, but admin has to document what classes are available
    • cargo culting will happen
    • thockin: could promote class to a field and configmap to an object, eventually
    • eparis: can I look at a config map and understand the meaning
    • saad: configmap holds params for provisioners, up to admin
    • could expose more knobs to users (iops, etc). Gets crazy quickly.
    • Steve: can we get tooling of “show me what classes *I* can use”
    • thockin: no concept of “I”. Punting on context-dependent classes for now
    • could add a description string to config map
    • eparis: config map blowout?
    • don't need combinatoric maps for zones/features
    • class says which provisioner, regions, encryption, etc could be features.
    • not great for discoverability.
      • could document at the configmap
      • can we validate this? UX is not great
      • events are easy, api validation is much harder - async
    • provisioners MUST satisfy all fields of the selector
    • e.g. class=gold-east/gold-west vs class=gold,region=east
    • if zone is a parameter label, does that affect matching against PV
    • mapping to PV is looser? examples: object name == class? label zone is noise.
    • what if user does not ask for a zone, but provisioner needs one? provisioner has to pick a default or choose to fail
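On the claim side, the class-plus-selector matching debated above ended up looking roughly like this in the beta era: the storage class was expressed as an annotation, and a label selector matched against admin-labeled PVs. The class name and labels below are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gold-claim
  annotations:
    volume.beta.kubernetes.io/storage-class: "gold"  # beta-era annotation, later the storageClassName field
spec:
  selector:
    matchLabels:
      zone: east             # illustrative label the admin put on matching PVs
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 8Gi
```

This reflects the decision in the February 16 notes to prefer arbitrary labels on PVs over magic selector keys.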

1.2 PRs:

December 2, 2015

Moderator: Mark Turansky

Agenda: Dynamic provisioning technical design discussion, other outstanding PRs