Skip attach for non-attachable CSI volumes
Author: @jsafrane
Goal
- Non-attachable CSI volumes should not require an external attacher or the creation of VolumeAttachment instances. This will speed up pod startup.
Motivation
Currently, CSI requires the admin to run an external CSI attacher for all CSI drivers, including those that don't implement attach/detach operations (such as NFS or all ephemeral Secrets-like volumes). The Kubernetes Attach/Detach (A/D) controller always creates VolumeAttachment objects for them and always waits until the external CSI attacher reports them as "attached".
We want to skip creation of VolumeAttachment objects in the A/D controller for CSI volumes that don't require 3rd party attach/detach.
Dependencies
In order to skip both the A/D controller attaching a volume and the kubelet waiting for the attachment, both of them need to know whether a particular CSI driver is attachable. This document assumes that proposal #2514 is implemented and that both the A/D controller and the kubelet have an informer on CSIDriver, so they can easily check whether a volume is attachable.
Design
CSI volume plugin
- Rework Init to get or create an informer to cache CSIDriver instances.
  - Depending on where the API for CSIDriver ends up, we may:
    - Rework VolumeHost to provide the informer. This leaks CSI implementation details to the A/D controller and the kubelet.
    - Or let the CSI volume plugin create and run the CSIDriver informer by itself. No other component in controller-manager or kubelet needs the informer right now, so a non-shared informer is a viable option. Depending on where the API for CSIDriver ends up, VolumeHost may need to be extended to provide a client interface to the API, and the kubelet and the A/D controller may need to be updated to create the interface (somewhere in cmd/, where RESTConfig is still available to create new clients) and pass it to their VolumeHost implementations.
- Rework Attach, Detach, VolumesAreAttached and WaitForAttach to check for a CSIDriver instance using the informer.
  - If CSIDriver for the driver exists and it's attachable, perform the usual logic.
  - If CSIDriver for the driver exists and it's not attachable, return success immediately (basically a NOOP). The A/D controller will still mark the volume as attached in Node.Status.VolumesAttached.
  - If CSIDriver for the driver does not exist, perform the usual logic (i.e. treat the volume as attachable).
    - This keeps the behavior the same as in old Kubernetes versions without the CSIDriver object.
    - This also happens when the CSIDriver informer has not been quick enough. It is suggested that the CSIDriver instance is created before any pod that uses the corresponding CSI driver can run.
      - In case the CSIDriver informer (or user) is too slow, the CSI volume plugin Attach() will create a VolumeAttachment instance and wait for the (non-existent) external attacher to fulfill it. The CSI plugin shall recover when the CSIDriver instance is created and skip attach. Any VolumeAttachment instance created here will be deleted on Detach(), see the next bullet.
- In addition to the above, Detach() removes the VolumeAttachment instance even if the volume is not attachable. This deletes VolumeAttachment instances created by an old A/D controller or before the CSIDriver instance was created.
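The three cases above can be sketched as a single decision function. This is a minimal illustration, not the plugin's actual code: CSIDriverInfo, driverLister and skipAttach are hypothetical names, the map stands in for the informer cache, and the driver names are made up.

```go
package main

import "fmt"

// CSIDriverInfo is a hypothetical, simplified stand-in for the CSIDriver
// object from proposal #2514; only the attachability flag is modeled.
type CSIDriverInfo struct {
	Attachable bool
}

// driverLister is a hypothetical stand-in for the informer cache,
// keyed by CSI driver name.
type driverLister map[string]CSIDriverInfo

// Get returns the cached CSIDriver and whether it was found.
func (l driverLister) Get(name string) (CSIDriverInfo, bool) {
	d, ok := l[name]
	return d, ok
}

// skipAttach reports whether Attach, Detach, VolumesAreAttached and
// WaitForAttach may return success immediately for the given driver.
func skipAttach(l driverLister, driverName string) bool {
	d, found := l.Get(driverName)
	if !found {
		// No CSIDriver instance (or the informer has not caught up yet):
		// treat the volume as attachable, as older Kubernetes did.
		return false
	}
	// Skip only when the driver is explicitly marked non-attachable.
	return !d.Attachable
}

func main() {
	cache := driverLister{
		"nfs.example.com":   {Attachable: false},
		"block.example.com": {Attachable: true},
	}
	names := []string{"nfs.example.com", "block.example.com", "unknown.example.com"}
	for _, name := range names {
		fmt.Printf("%s: skip attach = %v\n", name, skipAttach(cache, name))
	}
}
```

Defaulting to "attachable" when the lookup misses is what keeps the behavior compatible with clusters that have no CSIDriver objects at all.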
Authorization
- A/D controller and kubelet must be allowed to list+watch CSIDriver instances. Updating RBAC rules should be enough.
API
No API changes.
Upgrade
This chapter covers:
- Upgrade from old Kubernetes that has CSISkipAttach disabled to new Kubernetes with CSISkipAttach enabled.
- Update from Kubernetes that has CSISkipAttach disabled to the same Kubernetes with CSISkipAttach enabled.
- Creation of CSIDriver instance with non-attachable CSI driver.
In all cases listed above, an "attachable" CSI driver becomes non-attachable. Upgrade does not affect attachable CSI drivers; both "old" and "new" Kubernetes process them in the same way.
For non-attachable volumes, if the volume was attached by "old" Kubernetes (or by "new" Kubernetes before the CSIDriver instance was created), it has a VolumeAttachment instance. It will be deleted by Detach(), which deletes the VolumeAttachment instance for non-attachable volumes as well.
Downgrade
This chapter covers:
- Downgrade from new Kubernetes that has CSISkipAttach enabled to old Kubernetes with CSISkipAttach disabled.
- Update from Kubernetes that has CSISkipAttach enabled to the same Kubernetes with CSISkipAttach disabled.
- Deletion of CSIDriver instance with non-attachable CSI driver.
In all cases listed above, a non-attachable CSI driver becomes "attachable" (i.e. requires an external attacher). Downgrade does not affect attachable CSI drivers; both "old" and "new" Kubernetes process them in the same way.
For non-attachable volumes, if the volume was mounted by "new" Kubernetes, it has no VolumeAttachment instance. The "old" A/D controller does not know about it. However, it will periodically call the plugin's VolumesAreAttached(), which checks for VolumeAttachment presence. Volumes without a VolumeAttachment will be reported as not attached and the A/D controller will call Attach() on these. Since "old" Kubernetes required an external attacher even for non-attachable CSI drivers, the external attacher will pick up the VolumeAttachment instances and fulfil them in the usual way.
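The downgrade recovery described above can be sketched as a presence check followed by re-attachment. This is a simplified model under stated assumptions: attachmentSet, volumesAreAttached and reattachMissing are hypothetical names, presence of an entry stands for an existing VolumeAttachment, and the external attacher's work of fulfilling the new instances is not modeled.

```go
package main

import "fmt"

// attachmentSet is a hypothetical stand-in for the VolumeAttachment objects
// in the API server; presence of a key means the instance exists.
type attachmentSet map[string]struct{}

// volumesAreAttached mimics the presence check described in the text:
// a volume is reported as attached only if its VolumeAttachment exists.
func volumesAreAttached(set attachmentSet, volumes []string) map[string]bool {
	result := make(map[string]bool, len(volumes))
	for _, v := range volumes {
		_, ok := set[v]
		result[v] = ok
	}
	return result
}

// reattachMissing models the "old" A/D controller reacting to the check:
// for every volume reported as not attached it calls Attach(), represented
// here by creating the missing VolumeAttachment. It returns the volumes for
// which an instance was created.
func reattachMissing(set attachmentSet, volumes []string) []string {
	var created []string
	status := volumesAreAttached(set, volumes)
	for _, v := range volumes {
		if !status[v] {
			set[v] = struct{}{}
			created = append(created, v)
		}
	}
	return created
}

func main() {
	// vol-a was attached before; vol-b was mounted by "new" Kubernetes
	// without a VolumeAttachment.
	set := attachmentSet{"vol-a": {}}
	volumes := []string{"vol-a", "vol-b"}

	fmt.Println("before:", volumesAreAttached(set, volumes))
	fmt.Println("created for:", reattachMissing(set, volumes))
	fmt.Println("after:", volumesAreAttached(set, volumes))
}
```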
Performance considerations
- The flow suggested in this proposal adds a new CSIDriver informer to both the A/D controller and the kubelet. We don't expect a large number of CSIDriver instances or a high frequency of updates, so CSIDriver should have negligible impact on performance.
- The A/D controller will not create VolumeAttachment instances for non-attachable volumes. Etcd load will be reduced.
- On the other hand, all CSI volumes still must go through the A/D controller. The A/D controller must process every CSI volume and the kubelet must wait until the A/D controller marks a volume as attached, even if the A/D controller basically does nothing. All CSI volumes must be added to Node.Status.VolumesInUse and Node.Status.VolumesAttached. This does not introduce any new API calls, since all of this is already implemented; however, this proposal won't reduce Node.Status update frequency in any way.
  - If all volumes move to CSI eventually, pod startup will be slower than when using in-tree volume plugins that don't go through the A/D controller, and Node.Status will grow in size.
Implementation
Expected timeline:
- Alpha: 1.12 (behind feature gate CSISkipAttach)
- Beta: 1.13 (enabled by default)
- GA: 1.14
Alternatives considered
A/D controller and kubelet can be easily extended to check if a given volume is attachable. This would make mounting of non-attachable volumes easier, as kubelet would not need to wait for A/D controller to mark the volume as attached. However, there would be issues when upgrading or downgrading Kubernetes (or marking CSIDriver as attachable or non-attachable, which has basically the same handling).
- On upgrade (i.e. a previously attachable CSI volume becomes non-attachable, e.g. when the user creates a CSIDriver instance while the corresponding CSI driver is already running), the A/D controller could discover that an attached volume is not attachable any longer. The A/D controller could clean up Node.Status.VolumesAttached, but since the A/D controller does not know anything about VolumeAttachment, we would either need to introduce a new volume plugin call to clean it up in the CSI volume plugin, or something else would need to clean it up.
- On downgrade (i.e. a previously non-attachable CSI volume becomes attachable, e.g. when the user deletes the CSIDriver instance or downgrades to old Kubernetes without this feature), the kubelet must discover that an already mounted volume has changed from non-attachable to attachable and put it into Node.Status.VolumesInUse. This would race with the A/D controller detaching the volume when a pod was deleted at the same time a CSIDriver instance was made attachable.
Passing all volumes through A/D controller saves us from these difficulties and even races.