Topology updates to Kubernetes CSI implementation
parent
771e825d94
commit
d7709a3d26
|
@ -89,14 +89,12 @@ CSI volume drivers should create a socket at the following path on the node mach
|
|||
|
||||
`Sanitized CSIDriverName` is a CSI driver name that does not contain dangerous characters and can be used as an annotation name. It can follow the same pattern that we use for [volume plugins](https://git.k8s.io/kubernetes/pkg/util/strings/escape.go#L27). Driver names that are too long or too ugly can be rejected, i.e. all components described in this document will report an error and won't talk to this CSI driver. The exact sanitization method is an implementation detail (SHA in the worst case).
|
||||
|
||||
Upon initialization of the external “CSI volume driver”, some external component must call the CSI method `GetNodeId` to get the mapping from Kubernetes Node names to CSI driver NodeID. It must then add the CSI driver NodeID to the `csi.volume.kubernetes.io/nodeid` annotation on the Kubernetes Node API object. The key of the annotation must be `csi.volume.kubernetes.io/nodeid`. The value of the annotation is a JSON blob, containing key/value pairs for each CSI driver.
|
||||
|
||||
For example:
|
||||
Upon initialization of the external “CSI volume driver”, some external component must call the CSI method `GetNodeId` to get the mapping from Kubernetes Node names to CSI driver NodeID and the associated AccessibleTopology. It must then update the Kubernetes Node API object with the CSI driver NodeID as the `csi.volume.kubernetes.io/nodeid` annotation and AccessibleTopology as labels. The key of the NodeID annotation must be `csi.volume.kubernetes.io/nodeid`. The value of the annotation is a JSON blob, containing key/value pairs for each CSI driver. For example:
|
||||
```
|
||||
csi.volume.kubernetes.io/nodeid: "{ \"driver1\": \"name1\", \"driver2\": \"name2\" }"
|
||||
```
|
||||
|
||||
This will enable the component that will issue `ControllerPublishVolume` calls to use the annotation as a mapping from cluster node ID to storage node ID.
|
||||
This will enable the component that will issue `ControllerPublishVolume` calls to use the annotation as a mapping from cluster node ID to storage node ID. There are no hard restrictions on the label format, but for the format to be used by the recommended setup, please refer to [Topology Representation in Node Objects](#topology-representation-in-node-objects).
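
As a rough illustration of the annotation handling described above, the following minimal sketch merges one driver's NodeID into the JSON blob and reads it back. The helper names `set_node_id` and `get_node_id` are hypothetical, and the node's annotations are modeled as a plain dictionary rather than a real Kubernetes API object.

```python
import json

# Annotation key required by this document.
NODEID_ANNOTATION = "csi.volume.kubernetes.io/nodeid"


def set_node_id(annotations, driver_name, node_id):
    """Return a copy of `annotations` with driver_name -> node_id recorded.

    Hypothetical helper: entries for other drivers already present in the
    JSON blob are preserved.
    """
    mapping = json.loads(annotations.get(NODEID_ANNOTATION, "{}"))
    mapping[driver_name] = node_id
    updated = dict(annotations)
    updated[NODEID_ANNOTATION] = json.dumps(mapping, sort_keys=True)
    return updated


def get_node_id(annotations, driver_name):
    """Look up the storage node ID for a driver, or None if it is absent.

    Hypothetical helper: this is the lookup a component issuing
    ControllerPublishVolume calls would perform.
    """
    mapping = json.loads(annotations.get(NODEID_ANNOTATION, "{}"))
    return mapping.get(driver_name)
```

Keeping all drivers in a single JSON value means an update for one driver must read, modify, and write back the whole blob, which is why the sketch parses the existing value before adding its entry.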
|
||||
|
||||
To enable easy deployment of an external containerized CSI volume driver, the Kubernetes team will provide a sidecar "Kubernetes CSI Helper" container that can manage the unix domain socket registration and NodeId initialization. This is detailed in the “Suggested Mechanism for Deploying CSI Drivers on Kubernetes” section below.
|
||||
|
||||
|
@ -116,7 +114,7 @@ Provisioning and deletion operations are handled using the existing [external pr
|
|||
|
||||
In short, to dynamically provision a new CSI volume, a cluster admin would create a `StorageClass` with the provisioner corresponding to the name of the external provisioner handling provisioning requests on behalf of the CSI volume driver.
|
||||
|
||||
To provision a new CSI volume, an end user would create a `PersistentVolumeClaim` object referencing this `StorageClass`. The external provisioner will react to the creation of the PVC and issue the `CreateVolume` call against the CSI volume driver to provision the volume. The `CreateVolume` name will be auto-generated as it is for other dynamically provisioned volumes. The `CreateVolume` capacity will be taken from the `PersistentVolumeClaim` object. The `CreateVolume` parameters will be passed through from the `StorageClass` parameters (opaque to Kubernetes). Once the operation completes successfully, the external provisioner creates a `PersistentVolume` object to represent the volume using the information returned in the `CreateVolume` response. The `PersistentVolume` object is bound to the `PersistentVolumeClaim` and available for use.
|
||||
To provision a new CSI volume, an end user would create a `PersistentVolumeClaim` object referencing this `StorageClass`. The external provisioner will react to the creation of the PVC and issue the `CreateVolume` call against the CSI volume driver to provision the volume. The `CreateVolume` name will be auto-generated as it is for other dynamically provisioned volumes. The `CreateVolume` capacity will be taken from the `PersistentVolumeClaim` object. The `CreateVolume` parameters will be passed through from the `StorageClass` parameters (opaque to Kubernetes). If the `PersistentVolumeClaim` has the `selectedNode` annotation set (TODO verult update to actual annotation name), which is only added if delayed volume binding is enabled in the `StorageClass`, the provisioner will get the relevant topology labels from the corresponding `Node` and pass them to the `CreateVolume` call as preferred topology. `AllowedTopologies` from the `StorageClass` is passed through as permitted topology. Once the operation completes successfully, the external provisioner creates a `PersistentVolume` object to represent the volume using the information returned in the `CreateVolume` response. The topology of the returned volume is translated to the `PersistentVolume` `NodeAffinity` field. The `PersistentVolume` object is then bound to the `PersistentVolumeClaim` and available for use.
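
The step of collecting a node's topology labels for one driver could look roughly like the sketch below. It assumes the label key format proposed in "Topology Representation in Node Objects" (`csi.kubernetes.io.<driver name>/<topology key>`), which is still a work in progress. The function name `preferred_topology` is hypothetical, and stripping the prefix from the returned keys is a choice made for this illustration only.

```python
# Assumption: prefix taken from the label format proposed in this document.
CSI_LABEL_PREFIX = "csi.kubernetes.io."


def preferred_topology(node_labels, driver_name):
    """Collect the topology labels on a node that belong to one CSI driver.

    Hypothetical helper: returns topology key -> value pairs with the
    driver-specific label prefix removed.
    """
    prefix = CSI_LABEL_PREFIX + driver_name + "/"
    return {
        key[len(prefix):]: value
        for key, value in node_labels.items()
        if key.startswith(prefix)
    }
```

For a node labeled `csi.kubernetes.io.csi-driver.example.com/rack: rack1`, calling `preferred_topology(labels, "csi-driver.example.com")` yields `{"rack": "rack1"}`; labels from other sources or other drivers are ignored.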
|
||||
|
||||
To delete a CSI volume, an end user would delete the corresponding `PersistentVolumeClaim` object. The external provisioner will react to the deletion of the PVC and, based on its reclamation policy, issue the `DeleteVolume` call against the CSI volume driver to delete the volume. It will then delete the `PersistentVolume` object.
|
||||
|
||||
|
@ -387,6 +385,25 @@ To deploy a containerized third-party CSI volume driver, it is recommended that
|
|||
|
||||
Alternatively, deployment could be simplified by having all components (including external-provisioner and external-attacher) in the same pod (DaemonSet). Doing so, however, would consume more resources, and require a leader election protocol (likely https://git.k8s.io/contrib/election) in the `external-provisioner` and `external-attacher` components.
|
||||
|
||||
#### Topology Representation in Node Objects
|
||||
**Work In Progress**
|
||||
|
||||
Topology information will be represented as labels.
|
||||
|
||||
Requirements:
|
||||
* Must adhere to the [label format](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set).
|
||||
* Must support different drivers on the same node.
|
||||
* The format of each key/value pair must match those in `PersistentVolume` and `StorageClass` objects. When a `StorageClass` has delayed volume binding enabled, the scheduler uses the topology information of a `Node` in the following ways:
|
||||
1. During dynamic provisioning, the scheduler selects a candidate node for the provisioner by comparing each `Node`'s topology with the `AllowedTopologies` in the `StorageClass`. (TODO verult Link to volume scheduling design doc)
|
||||
1. During volume binding and pod scheduling, the scheduler selects a candidate node for the pod by comparing `Node` topology with `VolumeNodeAffinity` in `PersistentVolume`s. (TODO verult Link to volume scheduling design doc)
|
||||
* Must avoid collision with topology specified from sources other than CSI.
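
The comparison between `Node` topology and a `StorageClass`'s allowed topology mentioned in the requirements can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's implementation. Each allowed topology term is modeled as a dictionary from label key to a list of permitted values, a node matches if it satisfies every key of at least one term, and an empty list of terms is treated as "no restriction". The function names are hypothetical.

```python
def term_matches(node_labels, term):
    """True if the node carries a permitted value for every key in the term."""
    return all(node_labels.get(key) in values for key, values in term.items())


def node_allowed(node_labels, allowed_topologies):
    """True if the node satisfies at least one allowed topology term.

    Assumption: an empty or missing list places no restriction on the node.
    """
    if not allowed_topologies:
        return True
    return any(term_matches(node_labels, term) for term in allowed_topologies)
```

Because the match is a direct comparison of label keys and values, it only works if `Node` labels use exactly the same key/value format as the `StorageClass` and `PersistentVolume` objects, which is the reason for the formatting requirement above.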
|
||||
|
||||
Proposal: `"csi.kubernetes.io.csi-driver.example.com/rack": "rack1"`
|
||||
|
||||
Alternative: `"csi.kubernetes.io/com.example.csi-driver.rack": "rack1"`
|
||||
|
||||
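The alternative is invalid because the driver name alone could be up to 63 characters long, and the label name (the part after `/`) must be no longer than 63 characters, so appending a topology key such as `.rack` to a long driver name could exceed the limit.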
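
To make the length argument concrete, here is a small sketch that checks a label key against the Kubernetes label syntax limits: an optional prefix of at most 253 characters, followed by `/` and a name of at most 63 characters that starts and ends with an alphanumeric character. The function `is_valid_label_key` is hypothetical and deliberately simplified; it does not validate that the prefix is a well-formed DNS subdomain.

```python
import re

# Name segment: alphanumeric at both ends, with '-', '_' and '.' allowed inside.
_NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?")
MAX_NAME_LEN = 63
MAX_PREFIX_LEN = 253


def is_valid_label_key(key):
    """Simplified check of a label key against the documented length limits."""
    prefix, sep, name = key.rpartition("/")
    if sep:
        # A prefix, when present, must be non-empty, short enough, and
        # must not itself contain another '/'.
        if not prefix or len(prefix) > MAX_PREFIX_LEN or "/" in prefix:
            return False
    return len(name) <= MAX_NAME_LEN and _NAME_RE.fullmatch(name) is not None
```

With a 63-character driver name, the proposed format places the driver name in the prefix and keeps the name segment short (`rack`), so the key passes. The alternative format puts the driver name plus `.rack` into the name segment, which is then 68 characters long and fails the check.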
|
||||
|
||||
### Example Walkthrough
|
||||
|
||||
#### Provisioning Volumes
|
||||
|