CSI Migration Dependencies, Explanation of CSINodeInfo ADC Kubelet sync
Bringing the design doc up to speed with the most current design. Added CSINodeInfo API design and improved testing section.
parent
e660fe6fb5
commit
7ef087c91b
|
@ -6,6 +6,14 @@ This document presents a detailed design for migrating in-tree storage plugins
|
|||
to CSI. This will be an opt-in feature turned on at cluster creation time that
|
||||
will redirect in-tree plugin operations to a corresponding CSI Driver.
|
||||
|
||||
## Glossary
|
||||
|
||||
* ADC (Attach Detach Controller): Controller binary that handles the Attach and Detach portion of a volume lifecycle
* Kubelet: Kubernetes component that runs on each node; it handles the Mounting and Unmounting portion of a volume lifecycle
* CSI (Container Storage Interface): An RPC interface that Kubernetes uses to interface with arbitrary 3rd party storage drivers
* In-tree: Code that is compiled into native Kubernetes binaries
* Out-of-tree: Code that is not compiled into Kubernetes binaries, but can be run as Deployments on Kubernetes
|
||||
|
||||
## Background and Motivations
|
||||
|
||||
The Kubernetes volume plugins are currently in-tree meaning all logic and
|
||||
|
@ -46,6 +54,7 @@ internal APIs.
|
|||
and off separately
|
||||
|
||||
## Non-Goals
|
||||
|
||||
* Design a mechanism for deploying CSI drivers on all systems so that users can
|
||||
use the current storage system the same way they do today without having to do
|
||||
extra set up.
|
||||
|
@ -70,7 +79,22 @@ GA [TBD]
|
|||
* CSI Drivers for migrated plugins available on related cloud provider cluster
|
||||
by default
|
||||
|
||||
## Milestones
|
||||
|
||||
* Translation Library implemented in Kubernetes staging
|
||||
* Migration Shim for Provision, Attach, Detach, Mount, Unmount (including Inline Volumes)
|
||||
* Migration Shim for Resize, Block
|
||||
* CSI Driver lifecycle manager
|
||||
* GCE PD feature parity in CSI with in-tree implementation
|
||||
* AWS EBS feature parity in CSI with in-tree implementation
|
||||
* Cloud Driver feature parity in CSI with in-tree implementation
|
||||
|
||||
## Dependency Graph
|
||||
|
||||

|
||||
|
||||
## Feature Gating
|
||||
|
||||
We will have an alpha feature gate for the whole feature that can turn the CSI
migration on or off. When it is off, all code paths should revert to or stay with the in-tree
plugins. We will also have individual flags for each driver so that admins can
|
||||
|
@ -95,6 +119,7 @@ CSIMigrationAWS utilfeature.Feature = "CSIMigrationAWS"
|
|||
```
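
As a rough illustration of how the two levels of gating could combine, the sketch below treats migration for a plugin as active only when both the global gate and that plugin's gate are on. This "both must be on" rule is an assumption made for the example. The `Feature` type is redeclared locally, and the `gates` map and `migrationEnabled` helper are hypothetical stand-ins rather than the real `utilfeature` lookup.

```go
package main

import "fmt"

// Feature is a local stand-in for utilfeature.Feature so that the sketch
// compiles on its own.
type Feature string

const (
	CSIMigration    Feature = "CSIMigration"
	CSIMigrationAWS Feature = "CSIMigrationAWS"
)

// migrationEnabled reports whether operations for a plugin should be
// redirected to CSI. Assumption: the per-plugin gate only takes effect when
// the global gate is also on. The gates map is a hypothetical replacement for
// the real feature gate lookup.
func migrationEnabled(gates map[Feature]bool, plugin Feature) bool {
	return gates[CSIMigration] && gates[plugin]
}

func main() {
	gates := map[Feature]bool{CSIMigration: true, CSIMigrationAWS: false}
	fmt.Println(migrationEnabled(gates, CSIMigrationAWS)) // per-plugin gate off

	gates[CSIMigrationAWS] = true
	fmt.Println(migrationEnabled(gates, CSIMigrationAWS)) // both gates on
}
```

With this shape, an admin can leave the global gate on and toggle individual drivers without touching the others.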
|
||||
|
||||
## Translation Layer
|
||||
|
||||
The main mechanism we will use to migrate plugins is redirecting in-tree
operation calls to the CSI Driver instead of the in-tree driver. The external
components will pick up these in-tree PVs and use a translation library to
|
||||
|
@ -111,6 +136,7 @@ Cons:
|
|||
### Dynamically Provisioned Volumes
|
||||
|
||||
#### Kubernetes Changes
|
||||
|
||||
Dynamically Provisioned volumes will continue to be provisioned with the in-tree
`PersistentVolumeSource`. The CSI external-provisioner will pick up the
in-tree PVCs when migration is turned on and provision using the CSI Drivers;
|
||||
|
@ -119,6 +145,7 @@ in-tree PV. The PV will then go through all the same steps outlined below in the
|
|||
"Non-Dynamic Provisioned Volumes" for the rest of the volume lifecycle.
|
||||
|
||||
#### Leader Election
|
||||
|
||||
There will have to be some mechanism to switch between in-tree and external
|
||||
provisioner when the migration feature is turned on/off. The two should be
|
||||
compatible as they both will create the same volume and PV based on the same
|
||||
|
@ -127,7 +154,6 @@ will have logic added so that it will stand down and mark the PV as "migrated"
|
|||
with an annotation when the migration is turned on and the external provisioner
|
||||
will take care of the PV when it sees the annotation.
|
||||
|
||||
|
||||
### Translation Library
|
||||
|
||||
In order to make this on-the-fly translation work we will develop a separate
|
||||
|
@ -139,25 +165,40 @@ imported library and part of whatever binary needs the translation (no extra
|
|||
API or RPC calls).
|
||||
|
||||
#### Library Interface
|
||||
|
||||
```
type CSITranslator interface {
	// TranslateToCSI takes a volume.Spec and will translate it to a
	// CSIPersistentVolumeSource if the translation logic for that
	// specific in-tree volume spec has been implemented
	TranslateToCSI(spec volume.Spec) (CSIPersistentVolumeSource, error)

	// TranslateInTreePVToCSI takes a persistent volume and will translate
	// the in-tree source to a CSI Source if the translation logic
	// has been implemented. The input persistent volume will not
	// be modified
	TranslateInTreePVToCSI(pv *v1.PersistentVolume) (*v1.PersistentVolume, error)

	// TranslateToInTree takes a CSIPersistentVolumeSource and will translate
	// it to a volume.Spec for the specific in-tree volume specified by
	// `inTreePlugin`, if that translation logic has been implemented
	TranslateToInTree(source CSIPersistentVolumeSource, inTreePlugin string) (volume.Spec, error)

	// TranslateCSIPVToInTree takes a PV with a CSI PersistentVolume Source and will translate
	// it to an in-tree Persistent Volume Source for the specific in-tree volume specified
	// by the `Driver` field in the CSI Source. The input PV object will not be modified.
	TranslateCSIPVToInTree(pv *v1.PersistentVolume) (*v1.PersistentVolume, error)

	// IsMigrated returns true if the plugin has migration logic
	// false if it does not
	IsMigrated(inTreePlugin string) bool

	// IsMigratableByName tests whether there is Migration logic for the in-tree plugin
	// for the given `pluginName`
	IsMigratableByName(pluginName string) bool

	// GetCSINameFromIntreeName maps the name of an in-tree plugin to the name of its CSI driver
	GetCSINameFromIntreeName(pluginName string) (string, error)

	// IsPVMigratable tests whether there is Migration logic for the given Persistent Volume
	IsPVMigratable(pv *v1.PersistentVolume) bool

	// IsInlineMigratable tests whether there is Migration logic for the given Inline Volume
	IsInlineMigratable(vol *v1.Volume) bool
}
```
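
To make the name-based part of the interface more concrete, here is a minimal sketch of how `IsMigratableByName` and `GetCSINameFromIntreeName` could be backed by a simple lookup table. The `inTreeToCSI` map and its entries are illustrative assumptions, not the library's actual table, and the functions are written as free functions rather than methods to keep the example short.

```go
package main

import "fmt"

// inTreeToCSI maps in-tree plugin names to CSI driver names. The entries are
// illustrative assumptions for this sketch only.
var inTreeToCSI = map[string]string{
	"kubernetes.io/gce-pd":  "pd.csi.storage.gke.io",
	"kubernetes.io/aws-ebs": "ebs.csi.aws.com",
}

// IsMigratableByName reports whether translation logic is registered for the
// given in-tree plugin name.
func IsMigratableByName(pluginName string) bool {
	_, ok := inTreeToCSI[pluginName]
	return ok
}

// GetCSINameFromIntreeName returns the CSI driver name for an in-tree plugin,
// or an error when no translation logic exists for it.
func GetCSINameFromIntreeName(pluginName string) (string, error) {
	name, ok := inTreeToCSI[pluginName]
	if !ok {
		return "", fmt.Errorf("no CSI translation registered for in-tree plugin %q", pluginName)
	}
	return name, nil
}

func main() {
	for _, p := range []string{"kubernetes.io/gce-pd", "kubernetes.io/nfs"} {
		name, err := GetCSINameFromIntreeName(p)
		fmt.Println(p, IsMigratableByName(p), name, err)
	}
}
```

Returning an error for unknown plugins, rather than an empty string, keeps a component that imported an older copy of the library from silently treating a plugin as migrated.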
|
||||
|
||||
#### Library Versioning
|
||||
|
||||
Since the library will be imported by various components it is imperative that
|
||||
all components import a version of the library that supports in-tree driver x
|
||||
before the migration feature flag for x is turned on. If not, the TranslateToCSI
|
||||
|
@ -165,6 +206,7 @@ function will return an error when the translation is attempted.
|
|||
|
||||
|
||||
### Pre-Provisioned Volumes (and volumes provisioned before migration)
|
||||
|
||||
In the OperationGenerator at the start of each volume operation call we will
|
||||
check to see whether the plugin has been migrated.
|
||||
|
||||
|
@ -192,7 +234,8 @@ creates a VolumeAttachment object, and if for some reason we are doing a detach
|
|||
with the in-tree plugin, the VolumeAttachment object becomes orphaned.
|
||||
|
||||
|
||||
### In-Line Volumes
|
||||
### In-line Volumes
|
||||
|
||||
In-line controller calls are a special case because there is no PV. In this case
|
||||
we will add the CSI Source JSON to the VolumeToAttach object and in Attach we
|
||||
will put the Source in a new field in the VolumeAttachment object
|
||||
|
@ -201,7 +244,7 @@ be modified to also check this location for a source before checking the PV
|
|||
itself.
|
||||
|
||||
We need to be careful with naming VolumeAttachments for in-line volumes. The
|
||||
name needs to be unique and A/D controller must be able to find the right
|
||||
name needs to be unique and ADC must be able to find the right
|
||||
VolumeAttachment when a pod is deleted (i.e. using only info in Node.Status).
|
||||
CSI driver in kubelet must be able to find the VolumeAttachment too to get
|
||||
AttachmentMetadata for NodeStage/NodePublish.
|
||||
|
@ -215,7 +258,24 @@ https://github.com/kubernetes/community/pull/2273. Basically we will just transl
|
|||
the in-tree inline volumes into the format specified/implemented in the
|
||||
container-storage-interface-inline-volumes proposal.
|
||||
|
||||
### Volume Resize
|
||||
|
||||
TODO: Design
|
||||
|
||||
### Raw Block
|
||||
|
||||
TODO: Design
|
||||
|
||||
### Volume Reconstruction
|
||||
|
||||
TODO: Design
|
||||
|
||||
### Volume Limit
|
||||
|
||||
TODO: Design
|
||||
|
||||
## Interactions with PV-PVC Protection Finalizers
|
||||
|
||||
PV-PVC Protection finalizers prevent deletion of a PV when it is bound to a PVC,
|
||||
and prevent deletion of a PVC when it is in use by a pod.
|
||||
|
||||
|
@ -224,6 +284,7 @@ the same ways as we are not removing/adding PV’s or PVC’s in out of the ordi
|
|||
ways.
|
||||
|
||||
## Dealing with CSI Driver Failures
|
||||
|
||||
Plugin should fail if the CSI Driver is down and migration is turned on. When
|
||||
the driver recovers we should be able to resume gracefully.
|
||||
|
||||
|
@ -231,71 +292,252 @@ We will also create a playbook entry for how to turn off the CSI Driver
|
|||
migration gracefully, how to tell when the CSI Driver is broken or non-existent,
|
||||
and how to redeploy a CSI Driver in a cluster.
|
||||
|
||||
## API Changes
|
||||
|
||||
### CSINodeInfo API
|
||||
|
||||
Changes in: https://github.com/kubernetes/kubernetes/pull/70515
|
||||
|
||||
#### Old CSINodeInfo API
|
||||
|
||||
```
// CSINodeInfo holds information about all CSI drivers installed on a node.
type CSINodeInfo struct {
	metav1.TypeMeta `json:",inline"`

	// metadata.name must be the Kubernetes node name.
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// List of CSI drivers running on the node and their properties.
	// +patchMergeKey=driver
	// +patchStrategy=merge
	CSIDrivers []CSIDriverInfo `json:"csiDrivers" patchStrategy:"merge" patchMergeKey:"driver"`
}

// CSIDriverInfo contains information about one CSI driver installed on a node.
type CSIDriverInfo struct {
	// driver is the name of the CSI driver that this object refers to.
	// This MUST be the same name returned by the CSI GetPluginName() call for
	// that driver.
	Driver string `json:"driver"`

	// nodeID of the node from the driver point of view.
	// This field enables Kubernetes to communicate with storage systems that do
	// not share the same nomenclature for nodes. For example, Kubernetes may
	// refer to a given node as "node1", but the storage system may refer to
	// the same node as "nodeA". When Kubernetes issues a command to the storage
	// system to attach a volume to a specific node, it can use this field to
	// refer to the node name using the ID that the storage system will
	// understand, e.g. "nodeA" instead of "node1".
	NodeID string `json:"nodeID"`

	// topologyKeys is the list of keys supported by the driver.
	// When a driver is initialized on a cluster, it provides a set of topology
	// keys that it understands (e.g. "company.com/zone", "company.com/region").
	// When a driver is initialized on a node it provides the same topology keys
	// along with values that kubelet applies to the corresponding node API
	// object as labels.
	// When Kubernetes does topology aware provisioning, it can use this list to
	// determine which labels it should retrieve from the node object and pass
	// back to the driver.
	TopologyKeys []string `json:"topologyKeys"`
}
```
|
||||
|
||||
#### New CSINodeInfo API
|
||||
|
||||
```
// +genclient
// +genclient:nonNamespaced
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// CSINodeInfo holds information about all CSI drivers installed on a node.
// CSI drivers do not need to create the CSINodeInfo object directly. As long as
// they use the node-driver-registrar sidecar container, the kubelet will
// automatically populate the CSINodeInfo object for the CSI driver as part of
// kubelet plugin registration.
// CSINodeInfo has the same name as a node. If it is missing, it means either
// there are no CSI Drivers available on the node, or the Kubelet version is low
// enough that it doesn't create this object.
// CSINodeInfo has an OwnerReference that points to the corresponding node object.
type CSINodeInfo struct {
	metav1.TypeMeta

	// metadata.name must be the Kubernetes node name.
	metav1.ObjectMeta

	// spec is the specification of CSINodeInfo
	Spec CSINodeInfoSpec
}

// CSINodeInfoSpec holds information about the specification of all CSI drivers installed on a node
type CSINodeInfoSpec struct {
	// drivers is a list of information of all CSI Drivers existing on a node.
	// It can be empty on initialization.
	// +patchMergeKey=name
	// +patchStrategy=merge
	Drivers []CSIDriverInfoSpec
}

// CSIDriverInfoSpec holds information about the specification of one CSI driver installed on a node
type CSIDriverInfoSpec struct {
	// This is the name of the CSI driver that this object refers to.
	// This MUST be the same name returned by the CSI GetPluginName() call for
	// that driver.
	Name string

	// nodeID of the node from the driver point of view.
	// This field enables Kubernetes to communicate with storage systems that do
	// not share the same nomenclature for nodes. For example, Kubernetes may
	// refer to a given node as "node1", but the storage system may refer to
	// the same node as "nodeA". When Kubernetes issues a command to the storage
	// system to attach a volume to a specific node, it can use this field to
	// refer to the node name using the ID that the storage system will
	// understand, e.g. "nodeA" instead of "node1".
	// This field must be populated. An empty string means NodeID is not initialized
	// by the driver and it is invalid.
	NodeID string

	// topologyKeys is the list of keys supported by the driver.
	// When a driver is initialized on a cluster, it provides a set of topology
	// keys that it understands (e.g. "company.com/zone", "company.com/region").
	// When a driver is initialized on a node, it provides the same topology keys
	// along with values. Kubelet will expose these topology keys as labels
	// on its own node object.
	// When Kubernetes does topology aware provisioning, it can use this list to
	// determine which labels it should retrieve from the node object and pass
	// back to the driver.
	// It is possible for different nodes to use different topology keys.
	// This can be empty if driver does not support topology.
	// +optional
	TopologyKeys []string
}
```
|
||||
|
||||
#### API Lifecycle
|
||||
|
||||
A new `CSINodeInfo` API object is created for each node by the Kubelet on
Kubelet initialization before pods are able to be scheduled. A driver will be
added with all of its information populated when a driver is registered through
the plugin registration mechanism. When the driver is unregistered through the
plugin registration mechanism, its entry will be removed from the `Drivers` list
in the `CSINodeInfoSpec`.
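
The registration and unregistration steps can be sketched as two small operations on the `Drivers` list. The types below are local, trimmed copies of the ones in the "New CSINodeInfo API" section. The `addDriver` and `removeDriver` helpers and the driver name `example.csi.driver` are hypothetical, and treating a repeated registration as a replacement of the existing entry is an assumption based on the merge key being the driver name.

```go
package main

import "fmt"

// CSIDriverInfoSpec is a local copy of the API type, trimmed to the fields
// this sketch needs.
type CSIDriverInfoSpec struct {
	Name         string
	NodeID       string
	TopologyKeys []string
}

// CSINodeInfoSpec is a local copy of the API type.
type CSINodeInfoSpec struct {
	Drivers []CSIDriverInfoSpec
}

// addDriver inserts the entry for d, or replaces the existing entry with the
// same Name (assumed behavior for a repeated registration).
func addDriver(spec *CSINodeInfoSpec, d CSIDriverInfoSpec) {
	for i := range spec.Drivers {
		if spec.Drivers[i].Name == d.Name {
			spec.Drivers[i] = d
			return
		}
	}
	spec.Drivers = append(spec.Drivers, d)
}

// removeDriver drops the entry for the named driver, if present.
func removeDriver(spec *CSINodeInfoSpec, name string) {
	kept := spec.Drivers[:0]
	for _, d := range spec.Drivers {
		if d.Name != name {
			kept = append(kept, d)
		}
	}
	spec.Drivers = kept
}

func main() {
	spec := &CSINodeInfoSpec{} // created empty at Kubelet initialization
	addDriver(spec, CSIDriverInfoSpec{Name: "example.csi.driver", NodeID: "nodeA"})
	fmt.Println(len(spec.Drivers))
	removeDriver(spec, "example.csi.driver")
	fmt.Println(len(spec.Drivers))
}
```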
|
||||
|
||||
#### Kubelet Initialization & Migration Annotation
|
||||
|
||||
On Kubelet initialization we will also pre-populate an annotation for that
node's `CSINodeInfo`. The key will be
`storage.alpha.kubernetes.io/migrated-plugins` and the value will be a list of
in-tree plugin names that the Kubelet has the migration shim turned on for
(through feature flags). This must be populated before the Kubelet becomes
schedulable in order to achieve the synchronization described in the "ADC and
Kubelet CSI/In-tree Sync" section below.
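
A minimal sketch of how the Kubelet could derive the annotation value from its feature flags follows. The design only says the value is "a list", so the comma-separated, sorted encoding used here is an assumption, as are the `migratedPluginsValue` helper and the plugin names in the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const migratedPluginsAnnotation = "storage.alpha.kubernetes.io/migrated-plugins"

// migratedPluginsValue builds the annotation value from the Kubelet's view of
// which in-tree plugins have their migration flag on. The comma-separated,
// sorted encoding is an assumption made for this sketch.
func migratedPluginsValue(flags map[string]bool) string {
	var plugins []string
	for name, on := range flags {
		if on {
			plugins = append(plugins, name)
		}
	}
	sort.Strings(plugins) // deterministic output regardless of map order
	return strings.Join(plugins, ",")
}

func main() {
	annotations := map[string]string{
		migratedPluginsAnnotation: migratedPluginsValue(map[string]bool{
			"kubernetes.io/gce-pd":  true,
			"kubernetes.io/aws-ebs": false,
		}),
	}
	fmt.Println(annotations)
}
```

Sorting the names keeps the value stable across Kubelet restarts, which avoids spurious updates to the object when the set of migrated plugins has not changed.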
|
||||
|
||||
## Upgrade/Downgrade, Migrate/Un-migrate
|
||||
### Kubelet Node Annotation
|
||||
When the Kubelet starts, it will check whether the feature gate is
|
||||
enabled and if so will annotate its node with `csi.attach.kubernetes.io/gce-pd`
|
||||
for example to communicate to the A/D Controller that it supports migration of
|
||||
the gce-pd to CSI. The A/D Controller will have to choose on a per-node basis
|
||||
whether to use the CSI or the in-tree plugin for attach based on 3 criteria:
|
||||
1. Feature gate
|
||||
2. Plugin Migratable (Implements MigratablePlugin interface)
|
||||
3. Node to Attach to has requisite Annotation
|
||||
|
||||
Note: All 3 criteria must be satisfied for A/D controller to Attach/Detach with
|
||||
CSI instead of in-tree plugin. For example if a Kubelet has feature on and marks
|
||||
the annotation, but the A/D Controller does not have the feature gate flipped,
|
||||
we consider this user error and will throw some errors.
|
||||
### Feature Flags
|
||||
|
||||
This can cause a race between the A/D Controller and the Kubelet annotating, if
|
||||
a volume is attached before the Kubelet completes annotation the A/D controller
|
||||
could attach using in-tree plugin instead of CSI while the Kubelet is expecting
|
||||
a CSI Attach. The same issue exists on downgrade if the Annotation is not
|
||||
removed before a volume is attached. An additional consideration is that we
|
||||
cannot have the Kubelet downgraded to a version that does not have the
|
||||
Annotation removal code.
|
||||
ADC and Kubelet use the "same" feature flags, but in reality they are passed in
|
||||
to each binary separately. There will be a feature flag per driver as well as
|
||||
one for CSIMigration in general.
|
||||
|
||||
Kubelet will use its own feature flags to determine whether to use the in-tree
|
||||
or csi backend for Kubelet storage lifecycle operations, as well as to add the
|
||||
plugins that have the feature flag on to the
|
||||
`storage.alpha.kubernetes.io/migrated-plugins` annotation of `CSINodeInfo` for
|
||||
the node that Kubelet is running on.
|
||||
|
||||
The ADC will also use its own feature flags to help make the determination
|
||||
whether to use in-tree or CSI backend for ADC storage lifecycle operations. The
|
||||
other component to help determine which backend to use will be outlined below in
|
||||
the "ADC and Kubelet CSI/In-tree Sync" section.
|
||||
|
||||
### ADC and Kubelet CSI/In-tree Sync
|
||||
|
||||
Some plugins have subtly different behavior on both the ADC and Kubelet side between
in-tree and CSI implementations. Therefore it is important that if the ADC is to
use the in-tree implementation, the Kubelet must as well, and if the ADC is to
use the CSI Migrated implementation, the Kubelet must as well. Therefore we will
implement a mechanism to keep the ADC and the Kubelet in sync about the Kubelet's
abilities as well as the feature gates active in each.
|
||||
|
||||
In order for the ADC to have the requisite information from the
Kubelet to make informed decisions, the Kubelet must propagate the
`storage.alpha.kubernetes.io/migrated-plugins` annotation information for each
potentially migrated driver on Kubelet startup and be considered `NotReady`
until that information is synced to the API server. This gives us the following
guarantees:
* If `CSINodeInfo` for the node does not exist, then the ADC can infer the Kubelet
  is not at a version with migration logic and should therefore fall back to the
  in-tree implementation
* If `CSINodeInfo` exists, and `storage.alpha.kubernetes.io/migrated-plugins`
  doesn't include the plugin name, then the ADC can infer the Kubelet has migration
  logic but the Feature Flag for that particular plugin is `off`, and the ADC
  should therefore fall back to the in-tree storage implementation
* If `CSINodeInfo` exists, and `storage.alpha.kubernetes.io/migrated-plugins`
  does include the plugin name, then the ADC can infer the Kubelet has migration logic
  and the Feature Flag for that particular plugin is `on`, and the ADC should
  therefore use the csi-plugin migration implementation
* If `CSINodeInfo` exists, and `storage.alpha.kubernetes.io/migrated-plugins`
  does include the plugin name but the ADC feature flags for that driver are off
  (`in-tree`), then an error should be thrown notifying users that the Kubelet
  requested the `csi-plugin` volume plugin mechanism but it was not specified on the
  ADC
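
The four cases above can be sketched as a single decision function on the ADC side. This is an illustration under stated assumptions, not the actual controller code: `NodeState`, `chooseBackend`, and `ErrFlagMismatch` are hypothetical names, and the annotation is assumed to have already been parsed into a set of plugin names.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend names the implementation the ADC will use for a volume operation.
type Backend string

const (
	InTree Backend = "in-tree"
	CSI    Backend = "csi-plugin"
)

// NodeState is what the ADC can observe about a node (hypothetical type).
type NodeState struct {
	HasCSINodeInfo  bool
	MigratedPlugins map[string]bool // parsed migrated-plugins annotation
}

// ErrFlagMismatch signals that the Kubelet asked for CSI but the ADC's own
// feature flags for that plugin are off.
var ErrFlagMismatch = errors.New("kubelet requested csi-plugin but migration is off on the ADC")

// chooseBackend mirrors the guarantees listed above, in the same order.
func chooseBackend(node NodeState, plugin string, adcFlagOn bool) (Backend, error) {
	switch {
	case !node.HasCSINodeInfo:
		return InTree, nil // Kubelet is too old to have migration logic
	case !node.MigratedPlugins[plugin]:
		return InTree, nil // Kubelet has the logic, but this plugin's flag is off
	case !adcFlagOn:
		return "", ErrFlagMismatch // Kubelet wants CSI, ADC is not configured for it
	default:
		return CSI, nil
	}
}

func main() {
	node := NodeState{
		HasCSINodeInfo:  true,
		MigratedPlugins: map[string]bool{"kubernetes.io/gce-pd": true},
	}
	fmt.Println(chooseBackend(node, "kubernetes.io/gce-pd", true))
	fmt.Println(chooseBackend(node, "kubernetes.io/gce-pd", false))
	fmt.Println(chooseBackend(NodeState{}, "kubernetes.io/gce-pd", true))
}
```

Because the ADC checks the Kubelet's published state before its own flags, turning migration on for the ADC first is safe: nodes that have not yet published the plugin name keep using the in-tree path.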
|
||||
|
||||
In each of the above cases, the decision the ADC makes to use the in-tree or CSI
migration implementation will mirror the Kubelet's logic, therefore guaranteeing
that the entire lifecycle of a volume from controller to Kubelet will be done with
the same implementation.
|
||||
|
||||
### Node Drain Requirement
|
||||
|
||||
We require nodes to be drained whenever the Kubelet is Upgraded/Downgraded or
Migrated/Unmigrated to ensure that the entire volume lifecycle is maintained
inside one code branch (CSI or In-tree). This simplifies upgrade/downgrade
significantly and reduces the chance of errors and races.
|
||||
|
||||
### Upgrade/Downgrade Migrate/Unmigrate Scenarios
|
||||
|
||||
For upgrade, starting from a non-migrated cluster you must turn on migration for
|
||||
A/D Controller first, then drain your node before turning on migration for the
|
||||
ADC first, then drain your node before turning on migration for the
|
||||
Kubelet. The workflow is as follows:
|
||||
1. A/D Controller and Kubelet are both not migrated
|
||||
2. A/D Controller restarted and migrated (flags flipped)
|
||||
3. A/D Controller continues to use in-tree code for this node b/c node
|
||||
annotation doesn't exist
|
||||
4. Node drained and made unschedulable. All volumes unmounted/detached with in-tree code
|
||||
5. Kubelet restarted and migrated (flags flipped)
|
||||
6. Kubelet annotates node to tell A/D controller this node has been migrated
|
||||
7. Kubelet is made schedulable
|
||||
8. Both A/D Controller & Kubelet Migrated, node is in "fresh" state so all new
|
||||
1. ADC and Kubelet are both not migrated
2. ADC restarted and migrated (flags flipped)
3. ADC continues to use in-tree code for this node b/c
   `storage.alpha.kubernetes.io/migrated-plugins` does NOT include the plugin
   name
4. Node drained and made unschedulable. All volumes unmounted/detached with
   in-tree code
5. Kubelet restarted and migrated (flags flipped)
6. Kubelet updates the node's `CSINodeInfo` to tell ADC (without informer) whether each
   node/driver has been migrated by adding the plugin to the
   `storage.alpha.kubernetes.io/migrated-plugins` annotation
7. Kubelet is made schedulable
8. Both ADC & Kubelet Migrated, node is in "fresh" state so all new
   volumes lifecycle is CSI
|
||||
|
||||
For downgrade, starting from a fully migrated cluster you must drain your node
|
||||
first, then turn off migration for your Kubelet, then turn off migration for the
|
||||
A/D Controller. The workflow is as follows:
|
||||
1. A/D Controller and Kubelet are both migrated
|
||||
2. Kubelet drained and made unschedulable, all volumes unmounted/detached with CSI code
|
||||
ADC. The workflow is as follows:
|
||||
1. ADC and Kubelet are both migrated
|
||||
2. Kubelet drained and made unschedulable, all volumes unmounted/detached with
|
||||
CSI code
|
||||
3. Kubelet restarted and un-migrated (flags flipped)
|
||||
4. Kubelet removes node annotation to tell A/D Controller this node is not
|
||||
migrated. In case kubelet does not have annotation removal code, admin must
|
||||
remove the annotation manually.
|
||||
4. Kubelet removes the plugin in question from
|
||||
`storage.alpha.kubernetes.io/migrated-plugins`. In case kubelet does not have
|
||||
`storage.alpha.kubernetes.io/migrated-plugins` update code, admin must update
|
||||
the field manually.
|
||||
5. Kubelet is made schedulable.
|
||||
5. At this point all volumes going onto the node would be using in-tree code for
|
||||
both A/D Controller (b/c of annotation) and Kubelet
|
||||
6. Restart and un-migrate A/D Controller
|
||||
both ADC (b/c of annotation) and Kubelet
|
||||
6. Restart and un-migrate ADC
|
||||
|
||||
With these workflows a volume attached with CSI will be handled by CSI code for
|
||||
its entire lifecycle, and a volume attached with in-tree code will be handled by
|
||||
in-tree code for its entire lifecycle.
|
||||
|
||||
## Cloud Provider Requirements
|
||||
|
||||
There is a push to remove CloudProvider code from kubernetes.
|
||||
|
||||
There will not be any general auto-deployment mechanism for ALL CSI drivers
|
||||
|
@ -322,19 +564,40 @@ And at this point users doing their own deployment and not installing the GCE PD
|
|||
CSI driver encounter an error.
|
||||
|
||||
## Testing
|
||||
### Standard
|
||||
Good news is that all “normal functionality” can be tested by simply bringing up
|
||||
a cluster with “migrated” drivers and running the existing e2e tests for that
|
||||
driver. We will create CI jobs that run in this configuration for each new
|
||||
volume plugin
|
||||
|
||||
### Migration/Non-migration (Upgrade/Downgrade)
|
||||
Write tests where, in a normal workflow of attach/mount/unmount/detach, we have
any one of these operations actually happen with the old volume plugin, not the
CSI one. This makes sure that the workflow is resilient to rollback at any point
in time.
|
||||
### Migration Shim Testing
|
||||
Run all existing in-tree plugin driver tests
|
||||
* If migration is on for that plugin, add infrastructure piece that inspects CSI
|
||||
Drivers logs to make sure that the driver is servicing the operations
|
||||
* Also observe that none of the in-tree code is being called
|
||||
|
||||
### Version Skew
|
||||
Master/Node can have up to 2 versions of skew. Master must always be equal or higher
|
||||
version than the node. It should be covered by the tests in
|
||||
Migration/Non-migration section.
|
||||
Additionally, we must test that a PV created from migrated dynamic provisioning
|
||||
is identical to the PV created from the in-tree plugin
|
||||
|
||||
This should cover all use cases of volume operations, including volume
|
||||
reconstruction.
|
||||
|
||||
### Upgrade/Downgrade/Skew Testing
|
||||
We need to have test clusters brought up that have different feature flags
|
||||
enabled on different components (ADC and Kubelet). Once these feature flag skew
|
||||
configurations are brought up the test itself would have to know what
|
||||
configuration it’s running in and validate the expected result.
|
||||
|
||||
Configurations to test:
|
||||
|
||||
| ADC               | Kubelet                                            | Expected Result                                                          |
|-------------------|----------------------------------------------------|--------------------------------------------------------------------------|
| ADC Migration On  | Kubelet Migration On                               | Fully migrated - result should be same as “Migration Shim Testing” above |
| ADC Migration On  | Kubelet Migration Off (or Kubelet version too low) | No calls made to driver. All operations serviced by in-tree plugin       |
| ADC Migration Off | Kubelet Migration On                               | Not supported config - Undefined behavior                                |
| ADC Migration Off | Kubelet Migration Off                              | No calls made to driver. All operations serviced by in-tree plugin       |
|
||||
|
||||
### CSI Driver Feature Parity Testing
|
||||
|
||||
We will need some way to automatically qualify that drivers have feature parity
before promoting their migration features to Beta (on by default).

This is as simple as turning on the feature flags and running through our “Migration Shim
Testing” tests. If the driver passes all of them then it has parity. If not,
we need to revisit the in-tree plugin tests and make sure they exercise the entire
range of functionality.
|
||||
|
|
Binary file not shown.
After Width: | Height: | Size: 50 KiB |