feat: implement kwok cloudprovider

feat: wip implement `CloudProvider` interface boilerplate for `kwok` provider
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add builder for `kwok`
- add logic to scale up and scale down nodes in `kwok` provider
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: wip parse node templates from file
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add short README
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: implement remaining things
- to get the provider in a somewhat working state
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add in-cluster `kwok` as pre-requisite in the README
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: templates file not correctly marshalling into node list
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: `invalid leading UTF-8 octet` error during template parsing
- remove encoding using `gob`
- not required
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: use lister to get and list
- instead of uncached kube client
- add lister as a field on the provider and nodegroup struct
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: `did not find nodegroup annotation` error
- CA thought the annotation was not present even though it was
- fix a bug with parsing annotation
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: CA node recognizing fake nodegroups
- add provider ID to nodes in the format `kwok:<node-name>`
- fix invalid `KwokManagedAnnotation`
- sanitize template nodes (remove `resourceVersion` etc.,)
- not sanitizing the node leads to error during creation of new nodes
- abstract code to get NG name into a separate function `getNGNameFromAnnotation`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: node not getting deleted
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add empty test file
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: add OWNERS file
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: wip kwok provider config
- add samples for static and dynamic template nodes
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: wip implement pulling node templates from cluster
- add status field to kwok provider config
- this is to capture how the nodes would be grouped by (can be annotation or label)
- use kwok provider config status to get ng name from the node template
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: syntax error in calling `loadNodeTemplatesFromCluster`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: first draft of dynamic node templates
- this allows node templates to be pulled from the cluster
- instead of having to specify static templates manually
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: syntax error
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: abstract out related code into separate files
- use named constants instead of hardcoded values
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: cleanup kwok nodes when CA is exiting
- so that the user doesn't have to cleanup the fake nodes themselves
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: return `nil` instead of err for `HasInstance`
- because there is no underlying cloud provider (hence no reason to return `cloudprovider.ErrNotImplemented`)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: start working on tests for kwok provider config
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add `gpuLabelKey` under `nodes` field in kwok provider config
- fix validation for kwok provider config
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add motivation doc
- update README with more details
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: update kwok provider config example to support pulling gpu labels and types from existing providers
- still needs to be implemented in the code
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: wip update kwok provider config to get gpu label and available types
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: wip read gpu label and available types from specified provider
- add available gpu types in kwok provider config status
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add validation for gpu fields in kwok provider config
- load gpu related fields in kwok provider config status
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: implement `GetAvailableGPUTypes`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add support to install and uninstall kwok
- add option to disable installation
- add option to manually specify kwok release tag
- add future scope in readme
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add future scope 'evaluate adding support to check if kwok controller already exists'
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: vendor conflict and cyclic import
- remove support to get gpu config from the specified provider (can't be used because leads to cyclic import)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add a TODO 'get gpu config from other providers'
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename `file` -> `configmap`
- load config and templates from configmap instead of file
- move `nodes` and `nodegroups` config to top level
- add helper to encode configmap data into `[]byte`
- add helper to get current pod namespace
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add new options to the kwok provider config
- auto install kwok only if the version is >= v0.4.0
- add test for `GPULabel()`
- use `kubectl apply` way of installing kwok instead of kustomize
- add test for kwok helpers
- add test for kwok config
- inject service account name in CA deployment
- add example configmap for node templates and kwok provider config in CA helm chart
- add permission to create `clusterrolebinding` (so that kwok provider can create a clusterrolebinding with `cluster-admin` role and create/delete upstream manifests)
- update kwok provider sample configs
- update `README`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: update go.mod to use v1.28 packages
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: `go mod tidy` and `go mod vendor` (again)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: kwok installation code
- add functions to create and delete clusterrolebinding to create kwok resources
- refactor kwok install and uninstall fns
- delete manifests in the opposite order of install
- add cleaning up left-over kwok installation to future scope
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: nil ptr error
- add `TODO` in README for adding docs around kwok config fields
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: remove code to automatically install and uninstall `kwok`
- installing/uninstalling requires strong permissions to be granted to `kwok`
- granting strong permissions to `kwok` means granting strong permissions to the entire CA codebase
- this can pose a security risk
- I have removed the code related to install and uninstall for now
- will proceed after discussion with the community
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: run `go mod tidy` and `go mod vendor`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: add permission to create nodes
- to fix permissions error for kwok provider
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add more unit tests
- add tests for kwok helpers
- fix and update kwok config tests
- fix a bug where gpu label was getting assigned to `kwokConfig.status.key`
- expose `loadConfigFile` -> `LoadConfigFile`
- throw error if templates configmap does not have `templates` key (value of which is node templates)
- finish test for `GPULabel()`
- add tests for `NodeGroupForNode()`
- expose `loadNodeTemplatesFromConfigMap` -> `LoadNodeTemplatesFromConfigMap`
- fix `KwokCloudProvider`'s kwok config was empty (this caused `GPULabel()` to return empty)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: abstract provider ID code into `getProviderID` fn
- fix provider name in test `kwok` -> `kwok:kind-worker-xxx`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: run `go mod vendor` and `go mod tidy`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs(cloudprovider/kwok): update info on creating nodegroups based on `hostname/label`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor(charts): replace fromLabelKey value `"kubernetes.io/hostname"` -> `"kwok-nodegroup"`
- `"kubernetes.io/hostname"` leads to infinite scale-up
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: support running CA with kwok provider locally
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: use global informer factory
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: use `fromNodeLabelKey: "kwok-nodegroup"` in test templates
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: `Cleanup()` logic
- clean up only nodes managed by the kwok provider
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix/refactor: nodegroup creation logic
- fix issue where fake node was getting created which caused fatal error
- use ng annotation to keep track of nodegroups
- (when creating nodegroups) don't process nodes which don't have the right ng label
- suffix ng name with unix timestamp
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor/test(cloudprovider/kwok): write tests for `BuildKwokProvider` and `Cleanup`
- pass only the required node lister to cloud provider instead of the entire informer factory
- pass the required configmap name to `LoadNodeTemplatesFromConfigMap` instead of passing the entire kwok provider config
- implement fake node lister for testing
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add test case for dynamic templates in `TestNodeGroupForNode`
- remove non-required fields from template node
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add tests for `NodeGroups()`
- add extra node template without ng selector label to add more variability in the test
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: write tests for `GetNodeGpuConfig()`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add test for `GetAvailableGPUTypes`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add test for `GetResourceLimiter()`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add tests for nodegroup's `IncreaseSize()`
- abstract error msgs into variables to use them in tests
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add test for ng `DeleteNodes()` fn
- add check for deleting too many nodes
- rename err msg var names to make them consistent
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add tests for ng `DecreaseTargetSize()`
- abstract error msgs into variables (for easy use in tests)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add test for ng `Nodes()`
- add extra test case for `DecreaseTargetSize()` to check lister error
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add test for ng `TemplateNodeInfo`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): improve tests for `BuildKwokProvider()`
- add more test cases
- refactor lister for `TestBuildKwokProvider()` and `TestCleanUp()`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): add test for ng `GetOptions`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(cloudprovider/kwok): unset `KWOK_CONFIG_MAP_NAME` at the end of the test
- not doing so leads to failure in other tests
- remove `kwokRelease` field from kwok config (not used anymore) - this was causing the tests to fail
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: bump CA chart version
- this is because of changes made related to kwok
- fix typo `everwhere` -> `everywhere`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: fix linting checks
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: address CI lint errors
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: generate helm docs for `kwokConfigMapName`
- remove `KWOK_CONFIG_MAP_KEY` (not being used in the code)
- bump helm chart version
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: revise the outline for README
- add AEP link to the motivation doc
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: wip create an outline for the README
- remove `kwok` field from examples (not needed right now)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add outline for ascii gifs
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename env variable `KWOK_CONFIG_MAP_NAME` -> `KWOK_PROVIDER_CONFIGMAP`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: update README with info around installation and benefits of using kwok provider
- add `Kwok` as a provider in main CA README
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: run `go mod vendor`
- remove TODOs that are not needed anymore
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: finish first draft of README
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: env variable in chart `KWOK_CONFIG_MAP_NAME` -> `KWOK_PROVIDER_CONFIGMAP`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: remove redundant/deprecated code
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: bump chart version `9.30.1` -> `9.30.2`
- because of kwok provider related changes
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: fix typo `offical` -> `official`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: remove debug log msg
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add links for getting help
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: fix typo in log `external cluster` -> `cluster`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

chore: add newline in chart.yaml to fix CI lint
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: fix mistake `sig-kwok` -> `sig-scheduling`
- kwok is a part of sig-scheduling (there is no sig-kwok)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: fix typo `release"` -> `"release"`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: pass informer instead of lister to cloud provider builder fn
Signed-off-by: vadasambar <surajrbanakar@gmail.com>
vadasambar 2023-05-30 11:52:11 +05:30
parent 8de60c98a5
commit cfbee9a4d6
43 changed files with 4757 additions and 14 deletions


@ -11,4 +11,4 @@ name: cluster-autoscaler
sources:
- https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
type: application
version: 9.32.0
version: 9.32.1


@ -419,6 +419,7 @@ vpa:
| image.repository | string | `"registry.k8s.io/autoscaling/cluster-autoscaler"` | Image repository |
| image.tag | string | `"v1.27.2"` | Image tag |
| kubeTargetVersionOverride | string | `""` | Allow overriding the `.Capabilities.KubeVersion.GitVersion` check. Useful for `helm template` commands. |
| kwokConfigMapName | string | `"kwok-provider-config"` | configmap for configuring kwok provider |
| magnumCABundlePath | string | `"/etc/kubernetes/ca-bundle.crt"` | Path to the host's CA bundle, from `ca-file` in the cloud-config file. |
| magnumClusterName | string | `""` | Cluster name or ID in Magnum. Required if `cloudProvider=magnum` and not setting `autoDiscovery.clusterName`. |
| nameOverride | string | `""` | String to partially override `cluster-autoscaler.fullname` template (will maintain the release name) |


@ -42,6 +42,8 @@ rules:
verbs:
- watch
- list
- create
- delete
- get
- update
- apiGroups:
@ -120,6 +122,7 @@ rules:
verbs:
- list
- watch
- get
- apiGroups:
- coordination.k8s.io
resources:


@ -0,0 +1,416 @@
{{- if or (eq .Values.cloudProvider "kwok") }}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ .Values.kwokConfigMapName | default "kwok-provider-config" }}
namespace: {{ .Release.Namespace }}
data:
config: |-
# if you see '\n' everywhere, remove all the trailing spaces
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "kwok-nodegroup"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
# gpuConfig:
# # to tell kwok provider what label should be considered as GPU label
# gpuLabelKey: "k8s.amazonaws.com/accelerator"
# availableGPUTypes:
# "nvidia-tesla-k80": {}
# "nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok: {} # default: fetch latest release of kwok from github and install it
# # you can also manually specify which kwok release you want to install
# # for example:
# kwok:
# release: v0.3.0
# # you can also disable installing kwok in CA code (and install your own kwok release)
# kwok:
# install: false (true if not specified)
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kwok-provider-templates
namespace: {{ .Release.Namespace }}
data:
templates: |-
# if you see '\n' everywhere, remove all the trailing spaces
apiVersion: v1
items:
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:16Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-control-plane
kwok-nodegroup: control-plane
kubernetes.io/os: linux
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
name: kind-control-plane
resourceVersion: "506"
uid: 86716ec7-3071-4091-b055-77b4361d1dca
spec:
podCIDR: 10.244.0.0/24
podCIDRs:
- 10.244.0.0/24
providerID: kind://docker/kind/kind-control-plane
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
status:
addresses:
- address: 172.18.0.2
type: InternalIP
- address: kind-control-plane
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:46Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: 96f8c8b8c8ae4600a3654341f207586e
operatingSystem: linux
osImage: Ubuntu 22.04.2 LTS
systemUUID: 111aa932-7f99-4bef-aaf7-36aa7fb9b012
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
name: kind-worker
resourceVersion: "577"
uid: 2ac0eb71-e5cf-4708-bbbf-476e8f19842b
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:05Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: a98a13ff474d476294935341f1ba9816
operatingSystem: linux
osImage: Ubuntu 22.04.2 LTS
systemUUID: 5f3c1af8-a385-4776-85e4-73d7f4252b44
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker2
kwok-nodegroup: kind-worker2
kubernetes.io/os: linux
name: kind-worker2
resourceVersion: "578"
uid: edc7df38-feb2-4089-9955-780562bdd21e
spec:
podCIDR: 10.244.1.0/24
podCIDRs:
- 10.244.1.0/24
providerID: kind://docker/kind/kind-worker2
status:
addresses:
- address: 172.18.0.4
type: InternalIP
- address: kind-worker2
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:08Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: fa9f4cd3b3a743bc867b04e44941dcb2
operatingSystem: linux
osImage: Ubuntu 22.04.2 LTS
systemUUID: f36c0f00-8ba5-4c8c-88bc-2981c8d377b9
kind: List
metadata:
resourceVersion: ""
{{- end }}


@ -125,6 +125,14 @@ spec:
{{- end }}
{{- end }}
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
{{- if and (eq .Values.cloudProvider "aws") (ne .Values.awsRegion "") }}
- name: AWS_REGION
value: "{{ .Values.awsRegion }}"
@ -207,6 +215,9 @@ spec:
secretKeyRef:
key: api-zone
name: {{ default (include "cluster-autoscaler.fullname" .) .Values.secretKeyRefNameOverride }}
{{- else if eq .Values.cloudProvider "kwok" }}
- name: KWOK_PROVIDER_CONFIGMAP
value: "{{.Values.kwokConfigMapName | default "kwok-provider-config"}}"
{{- end }}
{{- range $key, $value := .Values.extraEnv }}
- name: {{ $key }}


@ -244,6 +244,9 @@ image:
# kubeTargetVersionOverride -- Allow overriding the `.Capabilities.KubeVersion.GitVersion` check. Useful for `helm template` commands.
kubeTargetVersionOverride: ""
# kwokConfigMapName -- configmap for configuring kwok provider
kwokConfigMapName: "kwok-provider-config"
# magnumCABundlePath -- Path to the host's CA bundle, from `ca-file` in the cloud-config file.
magnumCABundlePath: "/etc/kubernetes/ca-bundle.crt"


@ -31,6 +31,7 @@ You should also take a look at the notes and "gotchas" for your specific cloud p
* [HuaweiCloud](./cloudprovider/huaweicloud/README.md)
* [IonosCloud](./cloudprovider/ionoscloud/README.md)
* [Kamatera](./cloudprovider/kamatera/README.md)
* [Kwok](./cloudprovider/kwok/README.md)
* [Linode](./cloudprovider/linode/README.md)
* [Magnum](./cloudprovider/magnum/README.md)
* [OracleCloud](./cloudprovider/oci/README.md)


@ -1,5 +1,5 @@
//go:build !gce && !aws && !azure && !kubemark && !alicloud && !magnum && !digitalocean && !clusterapi && !huaweicloud && !ionoscloud && !linode && !hetzner && !bizflycloud && !brightbox && !packet && !oci && !vultr && !tencentcloud && !scaleway && !externalgrpc && !civo && !rancher && !volcengine && !baiducloud && !cherry && !cloudstack && !exoscale && !kamatera && !ovhcloud
// +build !gce,!aws,!azure,!kubemark,!alicloud,!magnum,!digitalocean,!clusterapi,!huaweicloud,!ionoscloud,!linode,!hetzner,!bizflycloud,!brightbox,!packet,!oci,!vultr,!tencentcloud,!scaleway,!externalgrpc,!civo,!rancher,!volcengine,!baiducloud,!cherry,!cloudstack,!exoscale,!kamatera,!ovhcloud
//go:build !gce && !aws && !azure && !kubemark && !alicloud && !magnum && !digitalocean && !clusterapi && !huaweicloud && !ionoscloud && !linode && !hetzner && !bizflycloud && !brightbox && !packet && !oci && !vultr && !tencentcloud && !scaleway && !externalgrpc && !civo && !rancher && !volcengine && !baiducloud && !cherry && !cloudstack && !exoscale && !kamatera && !ovhcloud && !kwok
// +build !gce,!aws,!azure,!kubemark,!alicloud,!magnum,!digitalocean,!clusterapi,!huaweicloud,!ionoscloud,!linode,!hetzner,!bizflycloud,!brightbox,!packet,!oci,!vultr,!tencentcloud,!scaleway,!externalgrpc,!civo,!rancher,!volcengine,!baiducloud,!cherry,!cloudstack,!exoscale,!kamatera,!ovhcloud,!kwok
/*
Copyright 2018 The Kubernetes Authors.
@ -39,6 +39,7 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/huaweicloud"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/ionoscloud"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/kamatera"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/kwok"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/linode"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/magnum"
oci "k8s.io/autoscaler/cluster-autoscaler/cloudprovider/oci/instancepools"
@ -50,6 +51,7 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/volcengine"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/vultr"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/client-go/informers"
)
// AvailableCloudProviders supported by the cloud provider builder.
@ -72,6 +74,7 @@ var AvailableCloudProviders = []string{
cloudprovider.ClusterAPIProviderName,
cloudprovider.IonoscloudProviderName,
cloudprovider.KamateraProviderName,
cloudprovider.KwokProviderName,
cloudprovider.LinodeProviderName,
cloudprovider.BizflyCloudProviderName,
cloudprovider.BrightboxProviderName,
@ -87,7 +90,10 @@ var AvailableCloudProviders = []string{
// DefaultCloudProvider is GCE.
const DefaultCloudProvider = cloudprovider.GceProviderName
func buildCloudProvider(opts config.AutoscalingOptions, do cloudprovider.NodeGroupDiscoveryOptions, rl *cloudprovider.ResourceLimiter) cloudprovider.CloudProvider {
func buildCloudProvider(opts config.AutoscalingOptions,
do cloudprovider.NodeGroupDiscoveryOptions,
rl *cloudprovider.ResourceLimiter,
informerFactory informers.SharedInformerFactory) cloudprovider.CloudProvider {
switch opts.CloudProviderName {
case cloudprovider.BizflyCloudProviderName:
return bizflycloud.BuildBizflyCloud(opts, do, rl)
@ -129,6 +135,8 @@ func buildCloudProvider(opts config.AutoscalingOptions, do cloudprovider.NodeGro
return ionoscloud.BuildIonosCloud(opts, do, rl)
case cloudprovider.KamateraProviderName:
return kamatera.BuildKamatera(opts, do, rl)
case cloudprovider.KwokProviderName:
return kwok.BuildKwok(opts, do, rl, informerFactory)
case cloudprovider.LinodeProviderName:
return linode.BuildLinode(opts, do, rl)
case cloudprovider.OracleCloudProviderName:


@ -0,0 +1,43 @@
//go:build kwok
// +build kwok
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package builder
import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/kwok"
"k8s.io/autoscaler/cluster-autoscaler/config"
)
// AvailableCloudProviders supported by the cloud provider builder.
var AvailableCloudProviders = []string{
cloudprovider.KwokProviderName,
}
// DefaultCloudProvider for Kwok-only build is Kwok.
const DefaultCloudProvider = cloudprovider.KwokProviderName
func buildCloudProvider(opts config.AutoscalingOptions,
do cloudprovider.NodeGroupDiscoveryOptions,
rl *cloudprovider.ResourceLimiter,
informerFactory informers.SharedInformerFactory) cloudprovider.CloudProvider {
switch opts.CloudProviderName {
case cloudprovider.KwokProviderName:
return kwok.BuildKwok(opts, do, rl, informerFactory)
}
return nil
}


@ -20,12 +20,13 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/autoscaler/cluster-autoscaler/context"
"k8s.io/client-go/informers"
klog "k8s.io/klog/v2"
)
// NewCloudProvider builds a cloud provider from provided parameters.
func NewCloudProvider(opts config.AutoscalingOptions) cloudprovider.CloudProvider {
func NewCloudProvider(opts config.AutoscalingOptions, informerFactory informers.SharedInformerFactory) cloudprovider.CloudProvider {
klog.V(1).Infof("Building %s cloud provider.", opts.CloudProviderName)
do := cloudprovider.NodeGroupDiscoveryOptions{
@ -42,7 +43,7 @@ func NewCloudProvider(opts config.AutoscalingOptions) cloudprovider.CloudProvide
return nil
}
provider := buildCloudProvider(opts, do, rl)
provider := buildCloudProvider(opts, do, rl, informerFactory)
if provider != nil {
return provider
}


@ -60,6 +60,8 @@ const (
KamateraProviderName = "kamatera"
// KubemarkProviderName gets the provider name of kubemark
KubemarkProviderName = "kubemark"
// KwokProviderName gets the provider name of kwok
KwokProviderName = "kwok"
// HuaweicloudProviderName gets the provider name of huaweicloud
HuaweicloudProviderName = "huaweicloud"
// IonoscloudProviderName gets the provider name of ionoscloud


@ -125,7 +125,7 @@ func main() {
},
UserAgent: "user-agent",
}
cloudProvider := cloudBuilder.NewCloudProvider(autoscalingOptions)
cloudProvider := cloudBuilder.NewCloudProvider(autoscalingOptions, nil)
srv := wrapper.NewCloudProviderGrpcWrapper(cloudProvider)
// listen


@ -0,0 +1,7 @@
approvers:
- vadasambar
reviewers:
- vadasambar
labels:
- area/provider/kwok


@ -0,0 +1,266 @@
With `kwok` provider you can:
* Run **CA** (cluster-autoscaler) in your terminal and connect it to a cluster (like a kubebuilder controller). You don't have to run CA in an actual cluster to test things out.
![](./docs/images/run-kwok-locally-1.png)
![](./docs/images/run-kwok-locally-2.png)
* Perform a "dry-run" to test autoscaling behavior of CA without creating actual VMs in your cloud provider.
* Run CA in your local kind cluster with nodes and workloads from a remote cluster (you can also use nodes from the same cluster).
![](./docs/images/kwok-as-dry-run-1.png)
![](./docs/images/kwok-as-dry-run-2.png)
* Test behavior of CA against a large number of fake nodes (of your choice) with metrics.
![](./docs/images/large-number-of-nodes-1.png)
![](./docs/images/large-number-of-nodes-2.png)
* etc.,
## What is `kwok` provider? Why `kwok` provider?
Check the doc around [motivation](./docs/motivation.md).
## How to use `kwok` provider
### In a Kubernetes cluster:
#### 1. Install `kwok` controller
Follow [the official docs to install `kwok`](https://kwok.sigs.k8s.io/docs/user/kwok-in-cluster/) in a cluster.
#### 2. Configure cluster-autoscaler to use `kwok` cloud provider
*Using helm chart*:
```shell
helm upgrade --install <release-name> charts/cluster-autoscaler \
--set "serviceMonitor.enabled"=true --set "serviceMonitor.namespace"=default \
--set "cloudprovider"=kwok --set "image.tag"="<image-tag>" \
--set "image.repository"="<image-repo>" \
--set "autoDiscovery.clusterName"="kind-kind" \
--set "serviceMonitor.selector.release"="prom"
```
Replace `<release-name>` with the release name you want.
Replace `<image-tag>` with the image tag you want. Replace `<image-repo>` with the image repo you want
(check [releases](https://github.com/kubernetes/autoscaler/releases) for the official image repos and tags)
Note that `kwok` provider doesn't use `autoDiscovery.clusterName`. You can use a fake value for `autoDiscovery.clusterName`.
Replace `"release"="prom"` with the label selector for `ServiceMonitor` in your grafana/prometheus installation.
For example, if you are using prometheus operator, you can find the service monitor label selector using
```shell
kubectl get prometheus -ojsonpath='{.items[*].spec.serviceMonitorSelector}' | jq # using jq is optional
```
Here's what it looks like
![](./docs/images/prom-match-labels.png)
The `helm upgrade ...` command above installs cluster-autoscaler with `kwok` cloud provider settings. The helm chart installs a default kwok provider configuration (`kwok-provider-config` ConfigMap) and sample template nodes (`kwok-provider-templates` ConfigMap) to get you started. Replace the content of these ConfigMaps according to your needs.
If you already have cluster-autoscaler running and don't want to use `helm ...`, you can make the following changes to get kwok provider working:
1. Create `kwok-provider-config` ConfigMap for kwok provider config
2. Create `kwok-provider-templates` ConfigMap for node templates
3. Set `POD_NAMESPACE` env variable in the CA Deployment (if it is not there already)
4. Set `--cloud-provider=kwok` in the CA Deployment
5. That's all.
For 1 and 2, you can refer to helm chart for the ConfigMaps. You can render them from the helm chart using:
```
helm template charts/cluster-autoscaler/ --set "cloudProvider"="kwok" -s templates/configmap.yaml --namespace=default
```
Replace `--namespace` with namespace where your CA pod is running.
If you want to temporarily revert to your previous cloud provider, just change `--cloud-provider=kwok` back to its previous value.
No other provider uses the `kwok-provider-config` and `kwok-provider-templates` ConfigMaps (you can keep them in the cluster, or delete them if you want to revert completely). `POD_NAMESPACE` is used only by the kwok provider (at the time of writing this).
#### 3. Configure `kwok` cloud provider
Decide if you want to use static template nodes or dynamic template nodes ([check the FAQ](#3-what-is-the-difference-between-static-template-nodes-and-dynamic-template-nodes) to understand the difference).
If you want to use static template nodes:
The `kwok-provider-config` ConfigMap in the helm chart is set to use static template nodes by default (`readNodesFrom` is set to `configmap`). The CA helm chart also installs a `kwok-provider-templates` ConfigMap with sample node yamls by default. If you want to use your own node yamls:
```shell
# delete the existing configmap
kubectl delete configmap kwok-provider-templates
# create a new configmap with your own node yamls
kubectl create configmap kwok-provider-templates --from-file=templates=template-nodes.yaml
```
Replace `template-nodes.yaml` with path to your template nodes file.
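For reference, a minimal `template-nodes.yaml` could look like the sketch below. It is a trimmed-down version of the sample templates shipped in the chart's `kwok-provider-templates` ConfigMap; the node name, label value and resource sizes here are only illustrative:
```yaml
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: Node
  metadata:
    name: template-node-m5-xlarge
    labels:
      kubernetes.io/arch: amd64
      kubernetes.io/os: linux
      # must match nodegroups.fromNodeLabelKey in kwok-provider-config
      kwok-nodegroup: m5-xlarge
  status:
    allocatable:
      cpu: "4"
      memory: 16Gi
      pods: "110"
    capacity:
      cpu: "4"
      memory: 16Gi
      pods: "110"
```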
If you are using your own template nodes in the `kwok-provider-templates` ConfigMap, make sure you have set the correct value for `nodegroups.fromNodeLabelKey`/`nodegroups.fromNodeAnnotation`. Not doing so will prevent CA from scaling up nodes (it won't throw any error either).
If you want to use dynamic template nodes:
Set `readNodesFrom` in the `kwok-provider-config` ConfigMap to `cluster`. This tells the kwok provider to use live nodes from the cluster as template nodes.
If you are using live nodes from the cluster as template nodes, make sure you have set the correct value for `nodegroups.fromNodeLabelKey`/`nodegroups.fromNodeAnnotation`. Not doing so will prevent CA from scaling up nodes (it won't throw any error either).
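A minimal `kwok-provider-config` for dynamic template nodes could look like the sketch below (the grouping label shown is just an example; pick whatever label or annotation groups your live nodes the way you want):
```yaml
apiVersion: v1alpha1
readNodesFrom: cluster # use live nodes from the cluster as template nodes
nodegroups:
  # group live nodes into nodegroups by the value of this label
  fromNodeLabelKey: "node.kubernetes.io/instance-type"
```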
### For local development
1. Point your kubeconfig to the cluster where you want to test your changes
Using [`kubectx`](https://github.com/ahmetb/kubectx):
```
kubectx <cluster-name>
```
Using `kubectl`:
```
kubectl config get-contexts
kubectl config use-context <context-name>
```
2. Create the `kwok-provider-config` and `kwok-provider-templates` ConfigMaps in the cluster where you want to test your changes.
This is important because even if you run CA locally with the kwok provider, the kwok provider still looks for the `kwok-provider-config` and `kwok-provider-templates` ConfigMaps (because by default `kwok-provider-config` has `readNodesFrom` set to `configmap`) in the cluster it connects to.
You can create both the ConfigMap resources from the helm chart like this:
```shell
helm template charts/cluster-autoscaler/ --set "cloudProvider"="kwok" -s templates/configmap.yaml --namespace=default | kubectl apply -f -
```
`--namespace` has to match `POD_NAMESPACE` env variable you set below.
3. Run CA locally
```shell
# replace `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT`
# with your kubernetes api server url
# you can find it with `kubectl cluster-info`
# example:
# $ kubectl cluster-info
# Kubernetes control plane is running at https://127.0.0.1:36357
# ...
export KUBERNETES_SERVICE_HOST=https://127.0.0.1
export KUBERNETES_SERVICE_PORT=36357
# POD_NAMESPACE is the namespace where you want to look for
# your `kwok-provider-config` and `kwok-provider-templates` ConfigMap
export POD_NAMESPACE=default
# KWOK_PROVIDER_MODE tells kwok provider that we are running CA locally
export KWOK_PROVIDER_MODE=local
# `2>&1` merges stderr into stdout so both get piped to VS Code (remove `| code -` if you don't use VS Code)
go run main.go --kubeconfig=/home/suraj/.kube/config --cloud-provider=kwok --namespace=default --logtostderr=true --stderrthreshold=info --v=5 2>&1 | code -
```
This is what it looks like in action:
![](./docs/images/run-kwok-locally-3.png)
## Tweaking the `kwok` provider
You can change the behavior of `kwok` provider by tweaking the kwok provider configuration in `kwok-provider-config` ConfigMap:
```yaml
# only v1alpha1 is supported right now
apiVersion: v1alpha1
# possible values: [cluster,configmap]
# cluster: use nodes from cluster as template nodes
# configmap: use node yamls from a configmap as template nodes
readNodesFrom: configmap
# nodegroups specifies nodegroup level config
nodegroups:
  # fromNodeLabelKey's value is used to group nodes together into nodegroups
  # For example, say you want to group nodes with same value for `node.kubernetes.io/instance-type`
  # label as a nodegroup. Here are the nodes you have:
  # node1: m5.xlarge
  # node2: c5.xlarge
  # node3: m5.xlarge
  # Your nodegroups will look like this:
  # nodegroup1: [node1,node3]
  # nodegroup2: [node2]
  fromNodeLabelKey: "node.kubernetes.io/instance-type"
  # fromNodeAnnotation's value is used to group nodes together into nodegroups
  # (basically same as `fromNodeLabelKey` except based on annotation)
  # you can specify either of `fromNodeLabelKey` OR `fromNodeAnnotation`
  # (both are not allowed)
  fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
# nodes specifies node level config
nodes:
  # skipTaint is used to enable/disable adding kwok provider taint on the template nodes
  # default is false so that even if you run the provider in a production cluster
  # you don't have to worry about production workload
  # getting accidentally scheduled on the fake nodes
  skipTaint: true # default: false
  # gpuConfig is used to specify gpu config for the node
  gpuConfig:
    # to tell kwok provider what label should be considered as GPU label
    gpuLabelKey: "k8s.amazonaws.com/accelerator"
    # availableGPUTypes is used to specify available GPU types
    availableGPUTypes:
      "nvidia-tesla-k80": {}
      "nvidia-tesla-p100": {}
# configmap specifies config map name and key which stores the kwok provider templates in the cluster
# Only applicable when `readNodesFrom: configmap`
configmap:
  name: kwok-provider-templates
  key: kwok-config # default: config
```
By default, kwok provider looks for `kwok-provider-config` ConfigMap. If you want to use a different ConfigMap name, set the env variable `KWOK_PROVIDER_CONFIGMAP` (e.g., `KWOK_PROVIDER_CONFIGMAP=kpconfig`). You can set this env variable in the helm chart using `kwokConfigMapName` OR you can set it directly in the cluster-autoscaler Deployment with `kubectl edit deployment ...`.
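For example, setting the env variable directly on the cluster-autoscaler Deployment could look like this (a sketch; `kpconfig` is the example ConfigMap name from above):
```yaml
env:
- name: KWOK_PROVIDER_CONFIGMAP
  value: "kpconfig"
```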
### FAQ
#### 1. What is the difference between `kwok` and `kwok` provider?
`kwok` is an open source project under `sig-scheduling`.
> KWOK is a toolkit that enables setting up a cluster of thousands of Nodes in seconds. Under the scene, all Nodes are simulated to behave like real ones, so the overall approach employs a pretty low resource footprint that you can easily play around on your laptop.
https://kwok.sigs.k8s.io/
`kwok` provider refers to the cloud provider extension/plugin in cluster-autoscaler which uses `kwok` to create fake nodes.
#### 2. What does a template node exactly mean?
A template node is the base node yaml the `kwok` provider uses to create a new node in the cluster.
#### 3. What is the difference between static template nodes and dynamic template nodes?
Static template nodes are created from the node yaml the user specifies in the `kwok-provider-templates` ConfigMap, while dynamic template nodes are based on the node yaml of nodes currently running in the cluster.
#### 4. Can I use both static and dynamic template nodes together?
As of now, no, you can't (but it's an interesting idea). If you have a specific use case, please create an issue and we can talk more there!
#### 5. What is the difference between kwok provider config and template nodes config?
The kwok provider config is configuration that changes the behavior of the kwok provider (not the underlying `kwok` toolkit), while the template nodes config is the ConfigMap you can use to specify static node templates.
### Gotchas
1. kwok provider by default taints the template nodes with a `kwok-provider: true` taint so that production workloads don't get scheduled on these nodes accidentally. You have to tolerate this taint to schedule your workload on the nodes created by the kwok provider (see the example toleration after this list). You can turn this off by setting `nodes.skipTaint: true` in the kwok provider config.
2. Make sure the label/annotation for `fromNodeLabelKey`/`fromNodeAnnotation` in kwok provider config is actually present on the template nodes. If it isn't present on the template nodes, kwok provider will not be able to create new nodes.
3. Note that kwok provider makes the following changes to all the template nodes:
(roughly, in Go)
```go
node.Status.NodeInfo.KubeletVersion = "fake"
node.Annotations["kwok.x-k8s.io/node"] = "fake"
node.Annotations["cluster-autoscaler.kwok.nodegroup/name"] = "<name-of-the-nodegroup>"
node.Spec.ProviderID = "kwok:<name-of-the-node>"
node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
	Key:    "kwok-provider",
	Value:  "true",
	Effect: corev1.TaintEffectNoSchedule,
})
```
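For gotcha 1 above, a workload that should be schedulable on kwok-managed nodes needs a toleration matching the default taint, for example (a sketch of the relevant pod spec snippet):
```yaml
tolerations:
- key: "kwok-provider"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"
```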
## I have a problem/suggestion/question/idea/feature request. What should I do?
Awesome! Please:
* [Create a new issue](https://github.com/kubernetes/autoscaler/issues/new/choose) around it. Mention `@vadasambar` (I try to respond within a working day).
* Start a slack thread around it in the Kubernetes `#sig-autoscaling` channel (for an invitation, check [this](https://slack.k8s.io/)). Mention `@vadasambar` (I try to respond within a working day).
* Add it to the [weekly sig-autoscaling meeting agenda](https://docs.google.com/document/d/1RvhQAEIrVLHbyNnuaT99-6u9ZUMp7BfkPupT2LAZK7w/edit) (happens [on Mondays](https://github.com/kubernetes/community/tree/master/sig-autoscaling#meetings))
Please don't think too much about creating an issue. We can always close it if it doesn't make sense.
## What is not supported?
* Creating kwok nodegroups based on the `kubernetes.io/hostname` node label. Why? Imagine you have a `Deployment` (replicas: 2) with pod anti-affinity on the `kubernetes.io/hostname` label like this (a sketch of such a Deployment is shown at the end of this section):
![](./docs/images/kwok-provider-hostname-label.png)
Imagine you have only 2 unique values for the `kubernetes.io/hostname` node label in your cluster:
* `hostname1`
* `hostname2`
If you increase the number of replicas in the `Deployment` to 3, CA creates a fake node internally and runs simulations on it to decide if it should scale up. This fake node has `kubernetes.io/hostname` set to the name of the fake node, which looks like `template-node-xxxx-xxxx` (the second `xxxx` is random). Since the value of `kubernetes.io/hostname` on the fake node is not `hostname1` or `hostname2`, CA thinks it can schedule the `Pending` pod on the fake node and hence keeps scaling up to infinity (or until it can't).
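The kind of pod anti-affinity referred to in this example could look like the following Deployment snippet (a sketch of what the image above illustrates; all names are illustrative):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anti-affinity-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: anti-affinity-example
  template:
    metadata:
      labels:
        app: anti-affinity-example
    spec:
      affinity:
        podAntiAffinity:
          # at most one replica per unique kubernetes.io/hostname value
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: anti-affinity-example
            topologyKey: kubernetes.io/hostname
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.7
```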
## Troubleshooting
1. Pods are still stuck in `Running` even after CA has cleaned up all the kwok nodes
* `kwok` provider doesn't drain the nodes when it deletes them. It just deletes the nodes. You should see pods running on these nodes change from `Running` state to `Pending` state in a minute or two. But if you don't, try scaling down your workload and scaling it up again. If the issue persists, please create an issue :pray:.
## I want to contribute
Thank you ❤️
It is expected that you know how to build and run CA locally. If you don't, I recommend starting from the [`Makefile`](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/Makefile). Check the CA [FAQ](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md) to know more about CA in general ([including info around building CA and submitting a PR](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#developer)). CA is a big and complex project. If you have any questions or if you get stuck anywhere, [reach out for help](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/kwok/README.md#reach-out-for-help-if-you-get-stuck).
### Get yourself familiar with the `kwok` project
Check https://kwok.sigs.k8s.io/
### Try out the `kwok` provider
Go through [the README](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/kwok/README.md).
### Look for a good first issue
Check [this](https://github.com/kubernetes/autoscaler/issues?q=is%3Aopen+is%3Aissue+label%3Aarea%2Fprovider%2Fkwok+label%3A%22good+first+issue%22) filter for good first issues around `kwok` provider.
### Reach out for help if you get stuck
You can get help in the following ways:
* Mention `@vadasambar` in the issue/PR you are working on.
* Start a slack thread in `#sig-autoscaling` mentioning `@vadasambar` (to join Kubernetes slack click [here](https://slack.k8s.io/)).
* Add it to the weekly [sig-autoscaling meeting](https://github.com/kubernetes/community/tree/master/sig-autoscaling#meetings) agenda (happens on Mondays)

(Binary image files — the screenshots referenced in the README above, added under `docs/images/` — are not shown here.)

@ -0,0 +1,107 @@
# KWOK (Kubernetes without Kubelet) cloud provider
*This doc was originally a part of https://github.com/kubernetes/autoscaler/pull/5869*
## Introduction
> [KWOK](https://sigs.k8s.io/kwok) is a toolkit that enables setting up a cluster of thousands of Nodes in seconds. Under the scene, all Nodes are simulated to behave like real ones, so the overall approach employs a pretty low resource footprint that you can easily play around on your laptop.
https://kwok.sigs.k8s.io/
## Problem
### 1. It is hard to reproduce an issue happening at scale on a local machine
e.g., https://github.com/kubernetes/autoscaler/issues/5769
To reproduce such issues, we have the following options today:
### (a) set up [Kubemark](https://github.com/kubernetes/design-proposals-archive/blob/main/scalability/kubemark.md) on a public cloud provider and try reproducing the issue
You can [setup Kubemark](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-scalability/kubemark-guide.md) ([related](https://github.com/kubernetes/kubernetes/blob/master/test/kubemark/pre-existing/README.md)) and use the [`kubemark` cloudprovider](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/kubemark) (kubemark [proposal](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/kubemark_integration.md)) directly or [`cluster-api` cloudprovider with kubemark](https://github.com/kubernetes-sigs/cluster-api-provider-kubemark)
In either case,
> Every running Kubemark setup looks like the following:
> 1) A running Kubernetes cluster pointed to by the local kubeconfig
> 2) A separate VM where the kubemark master is running
> 3) Some hollow-nodes that run on the Kubernetes Cluster from #1
> 4) The hollow-nodes are configured to talk with the kubemark master at #2
https://github.com/kubernetes/kubernetes/blob/master/test/kubemark/pre-existing/README.md#introduction
You need to set up a separate VM (Virtual Machine) with master components to get Kubemark running.
> Currently we're running HollowNode with a limit of 0.09 CPU core/pod and 220MB of memory. However, if we also take into account the resources absorbed by default cluster addons and fluentD running on the 'external' cluster, this limit becomes ~0.1 CPU core/pod, thus allowing ~10 HollowNodes to run per core (on an "n1-standard-8" VM node).
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-scalability/kubemark-guide.md#starting-a-kubemark-cluster
Kubemark can mimic 10 nodes with 1 CPU core.
In reality it might be fewer than 10 nodes:
> Using Kubernetes and [kubemark](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scalability/kubemark.md) on GCP we have created a following 1000 node cluster setup:
>* 1 master - 1-core VM
>* 17 nodes - 8-core VMs, each core running up to 8 Kubemark nodes.
>* 1 Kubemark master - 32-core VM
>* 1 dedicated VM for Cluster Autoscaler
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/scalability_tests.md#test-setup
This is a cheaper option than (c), but if you want to set up Kubemark on your local machine you will need a master node plus 1 core per 10 fake nodes, i.e., if you want to mimic 100 nodes, that's 10 CPU cores plus extra CPU for the master node. Unless you have 10-12 free cores on your local machine, it is hard to run scale tests with Kubemark for more than 100 nodes.
### (b) try to get as much information from the issue reporter as possible and try to reproduce the issue by tweaking our tests
This works well if the issue is easy to reproduce by tweaking tests, e.g., when you want to check why scale down is getting blocked on a particular pod. You can do so by mimicking the pod in the tests, adding an entry [here](https://github.com/kubernetes/autoscaler/blob/1009797f5585d7bf778072ba59fd12eb2b8ab83c/cluster-autoscaler/utils/drain/drain_test.go#L878-L887) and running
```
cluster-autoscaler/utils/drain$ go test -run TestDrain
```
But when you want to test an issue related to scale, e.g., CA being slow to scale up, this approach is hard to apply.
### (c) try reproducing the issue using the same CA setup as the user, with actual nodes in a public cloud provider
e.g., if the issue reporter has a 200 node cluster in AWS, try creating a 200 node cluster in AWS and use the same CA flags as the issue reporter.
This is a viable option if you already have a running cluster of a similar size, but otherwise creating a big cluster just to reproduce the issue is costly.
### 2. It is hard to confirm the behavior of CA at scale
For example, a user with a big Kubernetes cluster (> 100-200 nodes) wants to check if adding scheduling properties to their workloads (node affinity, pod affinity, node selectors etc.,) leads to better utilization of the nodes (which saves cost). To give a more concrete example, imagine a situation like this:
1. There is a cluster with > 100 nodes. cpu to memory ratio for the nodes is 1:1, 1:2, 1:8 and 1:16
2. It is observed that 1:16 nodes are underutilized on memory
3. It is observed that workloads with cpu to memory ratio of 1:7 are getting scheduled on 1:16 nodes thereby leaving some memory unused
e.g.,
A 1:16 node looks like this:
  CPUs: 8 cores
  Memory: 128Gi
A workload (1:7 cpu:memory ratio) looks like this:
  CPUs: 1 core
  Memory: 7Gi
Resources wasted on the node (leftover after packing whole workloads): 8 % 1 CPU(s) + 128 % 7 Gi
= 0 CPUs + 2Gi memory = 2Gi of wasted memory
A 1:8 node looks like this:
  CPUs: 8 cores
  Memory: 64Gi
A workload (1:7 cpu:memory ratio) looks like this:
  CPUs: 1 core
  Memory: 7Gi
Resources wasted on the node: 8 % 1 CPU(s) + 64 % 7 Gi
= 0 CPUs + 1Gi memory = 1Gi of wasted memory
If the 1:7 workloads could somehow be scheduled on the 1:8 nodes using a node selector or required node affinity, the wastage would go down. The user wants to add required node affinity to the 1:7 workloads and see how CA would behave, without creating actual nodes in a public cloud provider. The goal is to check whether the theory holds and whether there are any side effects.
This can be done with Kubemark today, but a public cloud provider would be needed to mimic a cluster of this size; it can't be done on a local cluster (kind/minikube etc.,).
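To make the arithmetic above easy to replay, here is a tiny Go sketch (not part of the provider); it mirrors the same simplified leftover calculation, i.e., it deliberately ignores CPU as a packing constraint, just like the example:

```go
package main

import "fmt"

// leftoverMemoryGi mirrors the simplified arithmetic above: the memory left on
// a node after carving it into as many whole workload-sized memory chunks as fit.
func leftoverMemoryGi(nodeMemGi, workloadMemGi int) int {
	return nodeMemGi % workloadMemGi
}

func main() {
	// 1:16 node (8 cores, 128Gi) vs 1:8 node (8 cores, 64Gi),
	// both packed with 1-core / 7Gi workloads.
	fmt.Printf("1:16 node: %dGi of wasted memory\n", leftoverMemoryGi(128, 7)) // 2Gi
	fmt.Printf("1:8 node:  %dGi of wasted memory\n", leftoverMemoryGi(64, 7))  // 1Gi
}
```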
### How does it look in action?
You can check it [here](https://github.com/kubernetes/autoscaler/issues/5769#issuecomment-1590541506).
### FAQ
1. **Will this be patched back to older releases of Kubernetes?**
As of writing this, the plan is to release it as a part of Kubernetes 1.28 and backport it to 1.27 and 1.26.
2. **Why did we not use gRPC or the cluster-api provider to implement this?**
The idea was to enable users/contributors to scale-test issues around different cloud providers (e.g., https://github.com/kubernetes/autoscaler/issues/5769). Implementing the `kwok` provider in-tree means we are closer to the actual implementation of our most-used cloud providers (adding gRPC communication in between would add extra latency that is not there in our in-tree cloud providers). Although only the in-tree provider is a part of this proposal, the overall plan is to:
* Implement in-tree provider to cover most of the common use-cases
* Implement `kwok` provider for `clusterapi` provider so that we can provision `kwok` nodes using `clusterapi` provider ([someone is already working on this](https://kubernetes.slack.com/archives/C8TSNPY4T/p1685648610609449))
* Implement gRPC provider if there is user demand
3. **How performant is the `kwok` provider really compared to the `kubemark` provider?**
The `kubemark` provider seems to need 1 core per 8-10 nodes (based on our [last scale tests](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/scalability_tests.md#test-setup)). This means we need roughly 10 cores to simulate 100 nodes with `kubemark`.
The `kwok` provider can simulate 385 nodes using 122m of CPU and 521Mi of memory. CPU-wise, `kwok` can simulate 385 / 0.122 =~ 3155 nodes per 1 core of CPU.
![](images/kwok-provider-grafana.png)
![](images/kwok-provider-in-action.png)
4. **Can I think of `kwok` as a dry-run for my actual `cloudprovider`?**
That is the goal but note that the definition of what exactly `dry-run` means is not very clear and can mean different things for different users. You can think of it as something similar to a `dry-run`.

View File

@ -0,0 +1,153 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"context"
"errors"
"fmt"
"os"
"strings"
v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/yaml"
kubeclient "k8s.io/client-go/kubernetes"
klog "k8s.io/klog/v2"
)
const (
defaultConfigName = "kwok-provider-config"
configKey = "config"
)
// based on https://github.com/kubernetes/kubernetes/pull/63707/files
func getCurrentNamespace() string {
currentNamespace := os.Getenv("POD_NAMESPACE")
if strings.TrimSpace(currentNamespace) == "" {
klog.Info("env variable 'POD_NAMESPACE' is empty")
klog.Info("trying to read current namespace from serviceaccount")
// Fall back to the namespace associated with the service account token, if available
if data, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/namespace"); err == nil {
if ns := strings.TrimSpace(string(data)); len(ns) > 0 {
currentNamespace = ns
} else {
klog.Fatal("couldn't get current namespace from serviceaccount")
}
} else {
klog.Fatal("couldn't read serviceaccount to get current namespace")
}
}
klog.Infof("got current pod namespace '%s'", currentNamespace)
return currentNamespace
}
func getConfigMapName() string {
configMapName := os.Getenv("KWOK_PROVIDER_CONFIGMAP")
if strings.TrimSpace(configMapName) == "" {
klog.Infof("env variable 'KWOK_PROVIDER_CONFIGMAP' is empty (defaulting to '%s')", defaultConfigName)
configMapName = defaultConfigName
}
return configMapName
}
// LoadConfigFile loads kwok provider config from k8s configmap
func LoadConfigFile(kubeClient kubeclient.Interface) (*KwokProviderConfig, error) {
configMapName := getConfigMapName()
currentNamespace := getCurrentNamespace()
c, err := kubeClient.CoreV1().ConfigMaps(currentNamespace).Get(context.Background(), configMapName, v1.GetOptions{})
if err != nil {
return nil, fmt.Errorf("failed to get configmap '%s': %v", configMapName, err)
}
decoder := yaml.NewYAMLOrJSONDecoder(strings.NewReader(c.Data[configKey]), 4096)
kwokConfig := KwokProviderConfig{}
if err := decoder.Decode(&kwokConfig); err != nil {
return nil, fmt.Errorf("failed to decode kwok config: %v", err)
}
if kwokConfig.status == nil {
kwokConfig.status = &GroupingConfig{}
}
switch kwokConfig.ReadNodesFrom {
case nodeTemplatesFromConfigMap:
if kwokConfig.ConfigMap == nil {
return nil, fmt.Errorf("please specify a value for 'configmap' in kwok config (currently empty or undefined)")
}
if strings.TrimSpace(kwokConfig.ConfigMap.Name) == "" {
return nil, fmt.Errorf("please specify 'configmap.name' in kwok config (currently empty or undefined)")
}
case nodeTemplatesFromCluster:
default:
return nil, fmt.Errorf("'readNodesFrom' in kwok config is invalid (expected: '%s' or '%s'): %s",
nodeTemplatesFromConfigMap, nodeTemplatesFromCluster,
kwokConfig.ReadNodesFrom)
}
if kwokConfig.Nodegroups == nil {
return nil, fmt.Errorf("please specify a value for 'nodegroups' in kwok config (currently empty or undefined)")
}
if strings.TrimSpace(kwokConfig.Nodegroups.FromNodeLabelKey) == "" &&
strings.TrimSpace(kwokConfig.Nodegroups.FromNodeLabelAnnotation) == "" {
return nil, fmt.Errorf("please specify either 'nodegroups.fromNodeLabelKey' or 'nodegroups.fromNodeAnnotation' in kwok provider config (currently empty or undefined)")
}
if strings.TrimSpace(kwokConfig.Nodegroups.FromNodeLabelKey) != "" &&
strings.TrimSpace(kwokConfig.Nodegroups.FromNodeLabelAnnotation) != "" {
return nil, fmt.Errorf("please specify either 'nodegroups.fromNodeLabelKey' or 'nodegroups.fromNodeAnnotation' in kwok provider config (you can't use both)")
}
if strings.TrimSpace(kwokConfig.Nodegroups.FromNodeLabelKey) != "" {
kwokConfig.status.groupNodesBy = groupNodesByLabel
kwokConfig.status.key = kwokConfig.Nodegroups.FromNodeLabelKey
} else {
kwokConfig.status.groupNodesBy = groupNodesByAnnotation
kwokConfig.status.key = kwokConfig.Nodegroups.FromNodeLabelAnnotation
}
if kwokConfig.Nodes == nil {
kwokConfig.Nodes = &NodeConfig{}
} else {
if kwokConfig.Nodes.GPUConfig == nil {
klog.Warningf("nodes.gpuConfig is empty or undefined")
} else {
if kwokConfig.Nodes.GPUConfig.GPULabelKey != "" &&
kwokConfig.Nodes.GPUConfig.AvailableGPUTypes != nil {
kwokConfig.status.availableGPUTypes = kwokConfig.Nodes.GPUConfig.AvailableGPUTypes
kwokConfig.status.gpuLabel = kwokConfig.Nodes.GPUConfig.GPULabelKey
} else {
return nil, errors.New("nodes.gpuConfig.gpuLabelKey or nodes.gpuConfig.availableGPUTypes is empty")
}
}
}
if kwokConfig.Kwok == nil {
kwokConfig.Kwok = &KwokConfig{}
}
return &kwokConfig, nil
}
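For orientation, below is a minimal sketch (not part of this PR) of how `LoadConfigFile` could be wired up with an in-cluster client. It assumes `POD_NAMESPACE` is set on the pod (e.g., via the downward API) and that the provider is importable at the path shown; both are assumptions for illustration only:

```go
package main

import (
	"fmt"

	"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/kwok"
	kubeclient "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Build an in-cluster client; LoadConfigFile only needs a kube client.
	restCfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubeclient.NewForConfigOrDie(restCfg)

	// Reads the configmap named by KWOK_PROVIDER_CONFIGMAP
	// (default "kwok-provider-config") from the pod's namespace.
	cfg, err := kwok.LoadConfigFile(client)
	if err != nil {
		panic(fmt.Errorf("loading kwok provider config: %w", err))
	}
	fmt.Println("readNodesFrom:", cfg.ReadNodesFrom)
}
```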

View File

@ -0,0 +1,285 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"testing"
"os"
"github.com/stretchr/testify/assert"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/kubernetes/fake"
core "k8s.io/client-go/testing"
)
var testConfigs = map[string]string{
defaultConfigName: testConfig,
"without-kwok": withoutKwok,
"with-static-kwok-release": withStaticKwokRelease,
"skip-kwok-install": skipKwokInstall,
}
// with node templates from configmap
const testConfig = `
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "kwok-nodegroup"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok: {}
`
// with node templates from configmap
const testConfigSkipTaint = `
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "kwok-nodegroup"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
skipTaint: true
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok: {}
`
const testConfigDynamicTemplates = `
apiVersion: v1alpha1
readNodesFrom: cluster # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "kwok-nodegroup"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok: {}
`
const testConfigDynamicTemplatesSkipTaint = `
apiVersion: v1alpha1
readNodesFrom: cluster # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "kwok-nodegroup"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
skipTaint: true
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok: {}
`
const withoutKwok = `
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "node.kubernetes.io/instance-type"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
`
const withStaticKwokRelease = `
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "node.kubernetes.io/instance-type"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
kwok:
release: "v0.2.1"
configmap:
name: kwok-provider-templates
`
const skipKwokInstall = `
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "node.kubernetes.io/instance-type"
# you can either specify fromNodeLabelKey OR fromNodeAnnotation
# (both are not allowed)
# fromNodeAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
kwok:
skipInstall: true
`
func TestLoadConfigFile(t *testing.T) {
defer func() {
os.Unsetenv("KWOK_PROVIDER_CONFIGMAP")
}()
fakeClient := &fake.Clientset{}
fakeClient.Fake.AddReactor("get", "configmaps", func(action core.Action) (bool, runtime.Object, error) {
getAction := action.(core.GetAction)
if getAction == nil {
return false, nil, nil
}
cmName := getConfigMapName()
if getAction.GetName() == cmName {
return true, &v1.ConfigMap{
Data: map[string]string{
configKey: testConfigs[cmName],
},
}, nil
}
return true, nil, errors.NewNotFound(v1.Resource("configmaps"), "whatever")
})
os.Setenv("POD_NAMESPACE", "kube-system")
kwokConfig, err := LoadConfigFile(fakeClient)
assert.Nil(t, err)
assert.NotNil(t, kwokConfig)
assert.NotNil(t, kwokConfig.status)
assert.NotEmpty(t, kwokConfig.status.gpuLabel)
os.Setenv("KWOK_PROVIDER_CONFIGMAP", "without-kwok")
kwokConfig, err = LoadConfigFile(fakeClient)
assert.Nil(t, err)
assert.NotNil(t, kwokConfig)
assert.NotNil(t, kwokConfig.status)
assert.NotEmpty(t, kwokConfig.status.gpuLabel)
os.Setenv("KWOK_PROVIDER_CONFIGMAP", "with-static-kwok-release")
kwokConfig, err = LoadConfigFile(fakeClient)
assert.Nil(t, err)
assert.NotNil(t, kwokConfig)
assert.NotNil(t, kwokConfig.status)
assert.NotEmpty(t, kwokConfig.status.gpuLabel)
os.Setenv("KWOK_PROVIDER_CONFIGMAP", "skip-kwok-install")
kwokConfig, err = LoadConfigFile(fakeClient)
assert.Nil(t, err)
assert.NotNil(t, kwokConfig)
assert.NotNil(t, kwokConfig.status)
assert.NotEmpty(t, kwokConfig.status.gpuLabel)
}
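For reference, the test above can be run on its own with the standard Go tooling (directory assumed from where the provider sits in the cluster-autoscaler tree):

```
cluster-autoscaler/cloudprovider/kwok$ go test -run TestLoadConfigFile
```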

View File

@ -0,0 +1,163 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
const (
// ProviderName is the cloud provider name for kwok
ProviderName = "kwok"
// NGNameAnnotation is the annotation the kwok provider uses to track the nodegroups
NGNameAnnotation = "cluster-autoscaler.kwok.nodegroup/name"
// NGMinSizeAnnotation is the annotation on template nodes which specifies the min size of the nodegroup
NGMinSizeAnnotation = "cluster-autoscaler.kwok.nodegroup/min-count"
// NGMaxSizeAnnotation is the annotation on template nodes which specifies the max size of the nodegroup
NGMaxSizeAnnotation = "cluster-autoscaler.kwok.nodegroup/max-count"
// NGDesiredSizeAnnotation is the annotation on template nodes which specifies the desired size of the nodegroup
NGDesiredSizeAnnotation = "cluster-autoscaler.kwok.nodegroup/desired-count"
// KwokManagedAnnotation is the default annotation
// that kwok manages to decide if it should manage
// a node it sees in the cluster
KwokManagedAnnotation = "kwok.x-k8s.io/node"
groupNodesByAnnotation = "annotation"
groupNodesByLabel = "label"
// // GPULabel is the label added to nodes with GPU resource.
// GPULabel = "cloud.google.com/gke-accelerator"
// for kwok provider config
nodeTemplatesFromConfigMap = "configmap"
nodeTemplatesFromCluster = "cluster"
)
const testTemplates = `
apiVersion: v1
items:
- apiVersion: v1
kind: Node
metadata:
annotations: {}
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
k8s.amazonaws.com/accelerator: "nvidia-tesla-k80"
name: kind-worker
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
- apiVersion: v1
kind: Node
metadata:
annotations: {}
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker-2
kubernetes.io/os: linux
k8s.amazonaws.com/accelerator: "nvidia-tesla-k80"
name: kind-worker-2
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker-2
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker-2
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
kind: List
metadata:
resourceVersion: ""
`
// yaml version of fakeNode1, fakeNode2 and fakeNode3
const testTemplatesMinimal = `
apiVersion: v1
items:
- apiVersion: v1
kind: Node
metadata:
annotations:
cluster-autoscaler.kwok.nodegroup/name: ng1
labels:
kwok-nodegroup: ng1
name: node1
spec: {}
- apiVersion: v1
kind: Node
metadata:
annotations:
cluster-autoscaler.kwok.nodegroup/name: ng2
labels:
kwok-nodegroup: ng2
name: node2
spec: {}
- apiVersion: v1
kind: Node
metadata:
annotations: {}
labels: {}
name: node3
spec: {}
kind: List
metadata:
resourceVersion: ""
`

View File

@ -0,0 +1,278 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"bufio"
"context"
"errors"
"fmt"
"io"
"log"
"strconv"
"strings"
"time"
apiv1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/serializer"
"k8s.io/apimachinery/pkg/util/yaml"
kube_util "k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes"
"k8s.io/client-go/kubernetes"
clientscheme "k8s.io/client-go/kubernetes/scheme"
v1lister "k8s.io/client-go/listers/core/v1"
klog "k8s.io/klog/v2"
)
const (
templatesKey = "templates"
defaultTemplatesConfigName = "kwok-provider-templates"
)
type listerFn func(lister v1lister.NodeLister, filter func(*apiv1.Node) bool) kube_util.NodeLister
func loadNodeTemplatesFromCluster(kc *KwokProviderConfig,
kubeClient kubernetes.Interface,
lister kube_util.NodeLister) ([]*apiv1.Node, error) {
if lister != nil {
return lister.List()
}
nodeList, err := kubeClient.CoreV1().Nodes().List(context.Background(), metav1.ListOptions{})
if err != nil {
return nil, err
}
nos := []*apiv1.Node{}
// note: not using _, node := range nodeList.Items here because it leads to unexpected behavior
// more info: https://stackoverflow.com/a/38693163/6874596
for i := range nodeList.Items {
nos = append(nos, &(nodeList.Items[i]))
}
return nos, nil
}
// LoadNodeTemplatesFromConfigMap loads template nodes from a k8s configmap
// check https://github.com/vadafoss/node-templates for more info on the parsing logic
func LoadNodeTemplatesFromConfigMap(configMapName string,
kubeClient kubernetes.Interface) ([]*apiv1.Node, error) {
currentNamespace := getCurrentNamespace()
nodeTemplates := []*apiv1.Node{}
c, err := kubeClient.CoreV1().ConfigMaps(currentNamespace).Get(context.Background(), configMapName, v1.GetOptions{})
if err != nil {
return nil, fmt.Errorf("failed to get configmap '%s': %v", configMapName, err)
}
if c.Data[templatesKey] == "" {
return nil, fmt.Errorf("configmap '%s' doesn't have 'templates' key", configMapName)
}
scheme := runtime.NewScheme()
clientscheme.AddToScheme(scheme)
decoder := serializer.NewCodecFactory(scheme).UniversalDeserializer()
multiDocReader := yaml.NewYAMLReader(bufio.NewReader(strings.NewReader(c.Data[templatesKey])))
objs := []runtime.Object{}
for {
buf, err := multiDocReader.Read()
if err != nil {
if err == io.EOF {
break
}
return nil, err
}
obj, _, err := decoder.Decode(buf, nil, nil)
if err != nil {
return nil, err
}
objs = append(objs, obj)
}
if len(objs) > 1 {
for _, obj := range objs {
if node, ok := obj.(*apiv1.Node); ok {
nodeTemplates = append(nodeTemplates, node)
}
}
} else if nodelist, ok := objs[0].(*apiv1.List); ok {
for _, item := range nodelist.Items {
o, _, err := decoder.Decode(item.Raw, nil, nil)
if err != nil {
return nil, err
}
if node, ok := o.(*apiv1.Node); ok {
nodeTemplates = append(nodeTemplates, node)
}
}
} else {
return nil, errors.New("invalid templates file (found something other than nodes in the file)")
}
return nodeTemplates, nil
}
func createNodegroups(nodes []*apiv1.Node, kubeClient kubernetes.Interface, kc *KwokProviderConfig, initCustomLister listerFn,
allNodeLister v1lister.NodeLister) []*NodeGroup {
ngs := map[string]*NodeGroup{}
// note: not using _, node := range nodes here because it leads to unexpected behavior
// more info: https://stackoverflow.com/a/38693163/6874596
for i := range nodes {
belongsToNg := ((kc.status.groupNodesBy == groupNodesByAnnotation &&
nodes[i].GetAnnotations()[kc.status.key] != "") ||
(kc.status.groupNodesBy == groupNodesByLabel &&
nodes[i].GetLabels()[kc.status.key] != ""))
if !belongsToNg {
continue
}
ngName := getNGName(nodes[i], kc)
if ngs[ngName] != nil {
ngs[ngName].targetSize += 1
continue
}
ng := parseAnnotations(nodes[i], kc)
ng.name = getNGName(nodes[i], kc)
sanitizeNode(nodes[i])
prepareNode(nodes[i], ng.name)
ng.nodeTemplate = nodes[i]
filterFn := func(no *apiv1.Node) bool {
return no.GetAnnotations()[NGNameAnnotation] == ng.name
}
ng.kubeClient = kubeClient
ng.lister = initCustomLister(allNodeLister, filterFn)
ngs[ngName] = ng
}
result := []*NodeGroup{}
for i := range ngs {
result = append(result, ngs[i])
}
return result
}
// sanitizeNode cleans the node
func sanitizeNode(no *apiv1.Node) {
no.ResourceVersion = ""
no.Generation = 0
no.UID = ""
no.CreationTimestamp = v1.Time{}
no.Status.NodeInfo.KubeletVersion = "fake"
}
// prepareNode prepares node as a kwok template node
func prepareNode(no *apiv1.Node, ngName string) {
// add prefix in the name to make it clear that this node is different
// from the ones already existing in the cluster (in case there is a name clash)
no.Name = fmt.Sprintf("kwok-fake-%s", no.GetName())
no.Annotations[KwokManagedAnnotation] = "fake"
no.Annotations[NGNameAnnotation] = ngName
no.Spec.ProviderID = getProviderID(no.GetName())
}
func getProviderID(nodeName string) string {
return fmt.Sprintf("kwok:%s", nodeName)
}
func parseAnnotations(no *apiv1.Node, kc *KwokProviderConfig) *NodeGroup {
min := 0
max := 200
target := min
if no.GetAnnotations()[NGMinSizeAnnotation] != "" {
if mi, err := strconv.Atoi(no.GetAnnotations()[NGMinSizeAnnotation]); err == nil {
min = mi
} else {
klog.Fatalf("invalid value for annotation key '%s' for node '%s'", NGMinSizeAnnotation, no.GetName())
}
}
if no.GetAnnotations()[NGMaxSizeAnnotation] != "" {
if ma, err := strconv.Atoi(no.GetAnnotations()[NGMaxSizeAnnotation]); err == nil {
max = ma
} else {
klog.Fatalf("invalid value for annotation key '%s' for node '%s'", NGMaxSizeAnnotation, no.GetName())
}
}
if no.GetAnnotations()[NGDesiredSizeAnnotation] != "" {
if ta, err := strconv.Atoi(no.GetAnnotations()[NGDesiredSizeAnnotation]); err == nil {
target = ta
} else {
klog.Fatalf("invalid value for annotation key '%s' for node '%s'", NGDesiredSizeAnnotation, no.GetName())
}
}
if max < min {
log.Fatalf("max-count '%d' cannot be less than min-count '%d' for the node '%s'", max, min, no.GetName())
}
if target > max || target < min {
log.Fatalf("desired-count '%d' cannot be less than min-count '%d' or greater than max-count '%d' for the node '%s'", target, min, max, no.GetName())
}
return &NodeGroup{
minSize: min,
maxSize: max,
targetSize: target,
}
}
func getNGName(no *apiv1.Node, kc *KwokProviderConfig) string {
if no.GetAnnotations()[NGNameAnnotation] != "" {
return no.GetAnnotations()[NGNameAnnotation]
}
var ngName string
switch kc.status.groupNodesBy {
case groupNodesByAnnotation:
ngName = no.GetAnnotations()[kc.status.key]
case groupNodesByLabel:
ngName = no.GetLabels()[kc.status.key]
default:
klog.Fatal("grouping criteria for nodes is not set (expected: 'annotation' or 'label')")
}
if ngName == "" {
klog.Fatalf("%s '%s' for node '%s' not present in the manifest",
kc.status.groupNodesBy, kc.status.key,
no.GetName())
}
ngName = fmt.Sprintf("%s-%v", ngName, time.Now().Unix())
return ngName
}
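To make the grouping and sizing logic above concrete, here is a small illustrative sketch written as if it were an example test inside this package (it is not part of the PR); the node name, label value and sizes are made up, while the constants and helpers come from the code above:

```go
package kwok

import (
	"fmt"

	apiv1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func Example_parseAnnotations() {
	// Grouping configuration equivalent to `fromNodeLabelKey: "kwok-nodegroup"`.
	kc := &KwokProviderConfig{
		status: &GroupingConfig{
			groupNodesBy: groupNodesByLabel,
			key:          "kwok-nodegroup",
		},
	}

	// A template node carrying the nodegroup sizing annotations.
	node := &apiv1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "template-node",
			Labels: map[string]string{"kwok-nodegroup": "ng1"},
			Annotations: map[string]string{
				NGMinSizeAnnotation:     "1",
				NGMaxSizeAnnotation:     "5",
				NGDesiredSizeAnnotation: "2",
			},
		},
	}

	ng := parseAnnotations(node, kc)
	ng.name = getNGName(node, kc) // e.g. "ng1-<unix timestamp>"

	fmt.Println(ng.minSize, ng.maxSize, ng.targetSize)
	// Output: 1 5 2
}
```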

View File

@ -0,0 +1,890 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"os"
"testing"
"github.com/stretchr/testify/assert"
apiv1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/kubernetes/fake"
core "k8s.io/client-go/testing"
)
const multipleNodes = `
apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:16Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-control-plane
kwok-nodegroup: control-plane
kubernetes.io/os: linux
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
name: kind-control-plane
resourceVersion: "603"
uid: 86716ec7-3071-4091-b055-77b4361d1dca
spec:
podCIDR: 10.244.0.0/24
podCIDRs:
- 10.244.0.0/24
providerID: kind://docker/kind/kind-control-plane
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
status:
addresses:
- address: 172.18.0.2
type: InternalIP
- address: kind-control-plane
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:29Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:29Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:29Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:29Z"
lastTransitionTime: "2023-05-31T04:39:46Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
- docker.io/kindest/local-path-provisioner@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: 96f8c8b8c8ae4600a3654341f207586e
operatingSystem: linux
osImage: Ubuntu
systemUUID: 111aa932-7f99-4bef-aaf7-36aa7fb9b012
---
apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
name: kind-worker
resourceVersion: "577"
uid: 2ac0eb71-e5cf-4708-bbbf-476e8f19842b
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:05Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: a98a13ff474d476294935341f1ba9816
operatingSystem: linux
osImage: Ubuntu
systemUUID: 5f3c1af8-a385-4776-85e4-73d7f4252b44
`
const nodeList = `
apiVersion: v1
items:
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:16Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-control-plane
kwok-nodegroup: control-plane
kubernetes.io/os: linux
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
name: kind-control-plane
resourceVersion: "506"
uid: 86716ec7-3071-4091-b055-77b4361d1dca
spec:
podCIDR: 10.244.0.0/24
podCIDRs:
- 10.244.0.0/24
providerID: kind://docker/kind/kind-control-plane
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
status:
addresses:
- address: 172.18.0.2
type: InternalIP
- address: kind-control-plane
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:13Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:39:58Z"
lastTransitionTime: "2023-05-31T04:39:46Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: 96f8c8b8c8ae4600a3654341f207586e
operatingSystem: linux
osImage: Ubuntu
systemUUID: 111aa932-7f99-4bef-aaf7-36aa7fb9b012
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
name: kind-worker
resourceVersion: "577"
uid: 2ac0eb71-e5cf-4708-bbbf-476e8f19842b
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:05Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: a98a13ff474d476294935341f1ba9816
operatingSystem: linux
osImage: Ubuntu
systemUUID: 5f3c1af8-a385-4776-85e4-73d7f4252b44
kind: List
metadata:
resourceVersion: ""
`
const wrongIndentation = `
apiVersion: v1
items:
- apiVersion: v1
# everything below should be in-line with apiVersion above
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
name: kind-worker
resourceVersion: "577"
uid: 2ac0eb71-e5cf-4708-bbbf-476e8f19842b
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:05Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: a98a13ff474d476294935341f1ba9816
operatingSystem: linux
osImage: Ubuntu 22.04.2 LTS
systemUUID: 5f3c1af8-a385-4776-85e4-73d7f4252b44
kind: List
metadata:
resourceVersion: ""
`
const noGPULabel = `
apiVersion: v1
items:
- apiVersion: v1
kind: Node
metadata:
annotations:
kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: "0"
volumes.kubernetes.io/controller-managed-attach-detach: "true"
creationTimestamp: "2023-05-31T04:39:57Z"
labels:
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: kind-worker
kwok-nodegroup: kind-worker
kubernetes.io/os: linux
name: kind-worker
resourceVersion: "577"
uid: 2ac0eb71-e5cf-4708-bbbf-476e8f19842b
spec:
podCIDR: 10.244.2.0/24
podCIDRs:
- 10.244.2.0/24
providerID: kind://docker/kind/kind-worker
status:
addresses:
- address: 172.18.0.3
type: InternalIP
- address: kind-worker
type: Hostname
allocatable:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
capacity:
cpu: "12"
ephemeral-storage: 959786032Ki
hugepages-1Gi: "0"
hugepages-2Mi: "0"
memory: 32781516Ki
pods: "110"
conditions:
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient memory available
reason: KubeletHasSufficientMemory
status: "False"
type: MemoryPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has no disk pressure
reason: KubeletHasNoDiskPressure
status: "False"
type: DiskPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:39:57Z"
message: kubelet has sufficient PID available
reason: KubeletHasSufficientPID
status: "False"
type: PIDPressure
- lastHeartbeatTime: "2023-05-31T04:40:17Z"
lastTransitionTime: "2023-05-31T04:40:05Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
daemonEndpoints:
kubeletEndpoint:
Port: 10250
images:
- names:
- registry.k8s.io/etcd:3.5.6-0
sizeBytes: 102542580
- names:
- docker.io/library/import-2023-03-30@sha256:ba097b515c8c40689733c0f19de377e9bf8995964b7d7150c2045f3dfd166657
- registry.k8s.io/kube-apiserver:v1.26.3
sizeBytes: 80392681
- names:
- docker.io/library/import-2023-03-30@sha256:8dbb345de79d1c44f59a7895da702a5f71997ae72aea056609445c397b0c10dc
- registry.k8s.io/kube-controller-manager:v1.26.3
sizeBytes: 68538487
- names:
- docker.io/library/import-2023-03-30@sha256:44db4d50a5f9c8efbac0d37ea974d1c0419a5928f90748d3d491a041a00c20b5
- registry.k8s.io/kube-proxy:v1.26.3
sizeBytes: 67217404
- names:
- docker.io/library/import-2023-03-30@sha256:3dd2337f70af979c7362b5e52bbdfcb3a5fd39c78d94d02145150cd2db86ba39
- registry.k8s.io/kube-scheduler:v1.26.3
sizeBytes: 57761399
- names:
- docker.io/kindest/kindnetd:v20230330-48f316cd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
- docker.io/kindest/kindnetd@sha256:c19d6362a6a928139820761475a38c24c0cf84d507b9ddf414a078cf627497af
sizeBytes: 27726335
- names:
- docker.io/kindest/local-path-provisioner:v0.0.23-kind.0@sha256:f2d0a02831ff3a03cf51343226670d5060623b43a4cfc4808bd0875b2c4b9501
sizeBytes: 18664669
- names:
- registry.k8s.io/coredns/coredns:v1.9.3
sizeBytes: 14837849
- names:
- docker.io/kindest/local-path-helper:v20230330-48f316cd@sha256:135203f2441f916fb13dad1561d27f60a6f11f50ec288b01a7d2ee9947c36270
sizeBytes: 3052037
- names:
- registry.k8s.io/pause:3.7
sizeBytes: 311278
nodeInfo:
architecture: amd64
bootID: 2d71b318-5d07-4de2-9e61-2da28cf5bbf0
containerRuntimeVersion: containerd://1.6.19-46-g941215f49
kernelVersion: 5.15.0-72-generic
kubeProxyVersion: v1.26.3
kubeletVersion: v1.26.3
machineID: a98a13ff474d476294935341f1ba9816
operatingSystem: linux
osImage: Ubuntu 22.04.2 LTS
systemUUID: 5f3c1af8-a385-4776-85e4-73d7f4252b44
kind: List
metadata:
resourceVersion: ""
`
func TestLoadNodeTemplatesFromConfigMap(t *testing.T) {
var testTemplatesMap = map[string]string{
"wrongIndentation": wrongIndentation,
defaultTemplatesConfigName: testTemplates,
"multipleNodes": multipleNodes,
"nodeList": nodeList,
}
testTemplateName := defaultTemplatesConfigName
fakeClient := &fake.Clientset{}
fakeClient.Fake.AddReactor("get", "configmaps", func(action core.Action) (bool, runtime.Object, error) {
getAction := action.(core.GetAction)
if getAction == nil {
return false, nil, nil
}
if getAction.GetName() == defaultConfigName {
return true, &apiv1.ConfigMap{
Data: map[string]string{
configKey: testConfig,
},
}, nil
}
if testTemplatesMap[testTemplateName] != "" {
return true, &apiv1.ConfigMap{
Data: map[string]string{
templatesKey: testTemplatesMap[testTemplateName],
},
}, nil
}
return true, nil, errors.NewNotFound(apiv1.Resource("configmaps"), "whatever")
})
fakeClient.Fake.AddReactor("list", "nodes", func(action core.Action) (bool, runtime.Object, error) {
getAction := action.(core.GetAction)
if getAction == nil {
return false, nil, nil
}
return true, &apiv1.NodeList{Items: []apiv1.Node{}}, errors.NewNotFound(apiv1.Resource("nodes"), "whatever")
})
os.Setenv("POD_NAMESPACE", "kube-system")
kwokConfig, err := LoadConfigFile(fakeClient)
assert.Nil(t, err)
// happy path
testTemplateName = defaultTemplatesConfigName
nos, err := LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, fakeClient)
assert.Nil(t, err)
assert.NotEmpty(t, nos)
assert.Greater(t, len(nos), 0)
testTemplateName = "wrongIndentation"
nos, err = LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, fakeClient)
assert.Error(t, err)
assert.Empty(t, nos)
assert.Equal(t, len(nos), 0)
// multiple nodes is something like []*Node{node1, node2, node3, ...}
testTemplateName = "multipleNodes"
nos, err = LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, fakeClient)
assert.Nil(t, err)
assert.NotEmpty(t, nos)
assert.Greater(t, len(nos), 0)
// node list is something like []*List{Items:[]*Node{node1, node2, node3, ...}}
testTemplateName = "nodeList"
nos, err = LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, fakeClient)
assert.Nil(t, err)
assert.NotEmpty(t, nos)
assert.Greater(t, len(nos), 0)
// fake client which returns configmap with wrong key
fakeClient = &fake.Clientset{}
fakeClient.Fake.AddReactor("get", "configmaps", func(action core.Action) (bool, runtime.Object, error) {
getAction := action.(core.GetAction)
if getAction == nil {
return false, nil, nil
}
return true, &apiv1.ConfigMap{
Data: map[string]string{
"foo": testTemplatesMap[testTemplateName],
},
}, nil
})
fakeClient.Fake.AddReactor("list", "nodes", func(action core.Action) (bool, runtime.Object, error) {
getAction := action.(core.GetAction)
if getAction == nil {
return false, nil, nil
}
return true, &apiv1.NodeList{Items: []apiv1.Node{}}, errors.NewNotFound(apiv1.Resource("nodes"), "whatever")
})
// throw error if configmap data key is not `templates`
nos, err = LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, fakeClient)
assert.Error(t, err)
assert.Empty(t, nos)
assert.Equal(t, len(nos), 0)
}

View File

@ -0,0 +1,221 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"context"
"fmt"
apiv1 "k8s.io/api/core/v1"
v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/rand"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
klog "k8s.io/klog/v2"
schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework"
)
var (
sizeIncreaseMustBePositiveErr = "size increase must be positive"
maxSizeReachedErr = "size increase too large"
minSizeReachedErr = "min size reached, nodes will not be deleted"
belowMinSizeErr = "can't delete nodes because nodegroup size would go below min size"
notManagedByKwokErr = "can't delete node '%v' because it is not managed by kwok"
sizeDecreaseMustBeNegativeErr = "size decrease must be negative"
attemptToDeleteExistingNodesErr = "attempt to delete existing nodes"
)
// MaxSize returns maximum size of the node group.
func (nodeGroup *NodeGroup) MaxSize() int {
return nodeGroup.maxSize
}
// MinSize returns minimum size of the node group.
func (nodeGroup *NodeGroup) MinSize() int {
return nodeGroup.minSize
}
// TargetSize returns the current TARGET size of the node group. It is possible that the
// number is different from the number of nodes registered in Kubernetes.
func (nodeGroup *NodeGroup) TargetSize() (int, error) {
return nodeGroup.targetSize, nil
}
// IncreaseSize increases NodeGroup size.
func (nodeGroup *NodeGroup) IncreaseSize(delta int) error {
if delta <= 0 {
return fmt.Errorf(sizeIncreaseMustBePositiveErr)
}
size := nodeGroup.targetSize
newSize := int(size) + delta
if newSize > nodeGroup.MaxSize() {
return fmt.Errorf("%s, desired: %d max: %d", maxSizeReachedErr, newSize, nodeGroup.MaxSize())
}
klog.V(5).Infof("increasing size of nodegroup '%s' to %v (old size: %v, delta: %v)", nodeGroup.name, newSize, size, delta)
schedNode, err := nodeGroup.TemplateNodeInfo()
if err != nil {
return fmt.Errorf("couldn't create a template node for nodegroup %s", nodeGroup.name)
}
for i := 0; i < delta; i++ {
node := schedNode.Node()
node.Name = fmt.Sprintf("%s-%s", nodeGroup.name, rand.String(5))
node.Spec.ProviderID = getProviderID(node.Name)
_, err := nodeGroup.kubeClient.CoreV1().Nodes().Create(context.Background(), node, v1.CreateOptions{})
if err != nil {
return fmt.Errorf("couldn't create new node '%s': %v", node.Name, err)
}
}
nodeGroup.targetSize = newSize
return nil
}
// DeleteNodes deletes the specified nodes from the node group.
func (nodeGroup *NodeGroup) DeleteNodes(nodes []*apiv1.Node) error {
size := nodeGroup.targetSize
if size <= nodeGroup.MinSize() {
return fmt.Errorf(minSizeReachedErr)
}
if size-len(nodes) < nodeGroup.MinSize() {
return fmt.Errorf(belowMinSizeErr)
}
for _, node := range nodes {
// TODO(vadasambar): check if there's a better way than returning an error here
if node.GetAnnotations()[KwokManagedAnnotation] != "fake" {
return fmt.Errorf(notManagedByKwokErr, node.GetName())
}
// TODO(vadasambar): proceed to delete the next node if the current node deletion errors
// TODO(vadasambar): collect all the errors and return them after attempting to delete all the nodes to be deleted
err := nodeGroup.kubeClient.CoreV1().Nodes().Delete(context.Background(), node.GetName(), v1.DeleteOptions{})
if err != nil {
return err
}
}
return nil
}
// DecreaseTargetSize decreases the target size of the node group. This function
// doesn't permit to delete any existing node and can be used only to reduce the
// request for new nodes that have not been yet fulfilled. Delta should be negative.
func (nodeGroup *NodeGroup) DecreaseTargetSize(delta int) error {
if delta >= 0 {
return fmt.Errorf(sizeDecreaseMustBeNegativeErr)
}
size := nodeGroup.targetSize
nodes, err := nodeGroup.getNodeNamesForNodeGroup()
if err != nil {
return err
}
newSize := int(size) + delta
if newSize < len(nodes) {
return fmt.Errorf("%s, targetSize: %d delta: %d existingNodes: %d",
attemptToDeleteExistingNodesErr, size, delta, len(nodes))
}
nodeGroup.targetSize = newSize
return nil
}
// getNodeNamesForNodeGroup returns the names of the nodes belonging to the nodegroup
func (nodeGroup *NodeGroup) getNodeNamesForNodeGroup() ([]string, error) {
names := []string{}
nodeList, err := nodeGroup.lister.List()
if err != nil {
return names, err
}
for _, no := range nodeList {
names = append(names, no.GetName())
}
return names, nil
}
// Id returns nodegroup name.
func (nodeGroup *NodeGroup) Id() string {
return nodeGroup.name
}
// Debug returns a debug string for the nodegroup.
func (nodeGroup *NodeGroup) Debug() string {
return fmt.Sprintf("%s (%d:%d)", nodeGroup.Id(), nodeGroup.MinSize(), nodeGroup.MaxSize())
}
// Nodes returns a list of all nodes that belong to this node group.
func (nodeGroup *NodeGroup) Nodes() ([]cloudprovider.Instance, error) {
instances := make([]cloudprovider.Instance, 0)
nodeNames, err := nodeGroup.getNodeNamesForNodeGroup()
if err != nil {
return instances, err
}
for _, nodeName := range nodeNames {
instances = append(instances, cloudprovider.Instance{Id: getProviderID(nodeName), Status: &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceRunning,
ErrorInfo: nil,
}})
}
return instances, nil
}
// TemplateNodeInfo returns a node template for this node group.
func (nodeGroup *NodeGroup) TemplateNodeInfo() (*schedulerframework.NodeInfo, error) {
nodeInfo := schedulerframework.NewNodeInfo(cloudprovider.BuildKubeProxy(nodeGroup.Id()))
nodeInfo.SetNode(nodeGroup.nodeTemplate)
return nodeInfo, nil
}
// Exist checks if the node group really exists on the cloud provider side.
// Since a kwok nodegroup is not backed by anything on the cloud provider side,
// we can safely return `true` here
func (nodeGroup *NodeGroup) Exist() bool {
return true
}
// Create creates the node group on the cloud provider side.
// Left unimplemented because Create is not used anywhere
// in the core autoscaler as of this writing
func (nodeGroup *NodeGroup) Create() (cloudprovider.NodeGroup, error) {
return nil, cloudprovider.ErrNotImplemented
}
// Delete deletes the node group on the cloud provider side.
// Left unimplemented because Delete is not used anywhere
// in the core autoscaler as of this writing
func (nodeGroup *NodeGroup) Delete() error {
return cloudprovider.ErrNotImplemented
}
// Autoprovisioned returns true if the node group is autoprovisioned.
func (nodeGroup *NodeGroup) Autoprovisioned() bool {
return false
}
// GetOptions returns NodeGroupAutoscalingOptions that should be used for this particular
// NodeGroup. Returning a nil will result in using default options.
func (nodeGroup *NodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
return &defaults, nil
}
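For illustration, `IncreaseSize` above creates Node objects directly through the kube client (name = `<nodegroup name>-<random suffix>`, provider ID from `getProviderID`, annotations copied from the template). A rough sketch of such a node; the annotation key and the provider ID format are assumptions here, since only the `fake` value and the `kwok-provider` taint are visible elsewhere in this change:

```yaml
apiVersion: v1
kind: Node
metadata:
  # <nodegroup name>-<5 character random suffix>
  name: ng-x7k2c
  annotations:
    # marks the node as managed by kwok; key assumed, only the "fake" value appears in this diff
    kwok.x-k8s.io/node: fake
  labels:
    node.kubernetes.io/instance-type: m5.xlarge
spec:
  # provider ID format assumed to be kwok:<node name>
  providerID: kwok:ng-x7k2c
  taints:
  - key: kwok-provider
    value: "true"
    effect: NoSchedule
```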


@ -0,0 +1,360 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"fmt"
"testing"
"time"
"github.com/stretchr/testify/assert"
apiv1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
kube_util "k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes"
"k8s.io/client-go/kubernetes/fake"
core "k8s.io/client-go/testing"
)
func TestIncreaseSize(t *testing.T) {
fakeClient := &fake.Clientset{}
nodes := []*apiv1.Node{}
fakeClient.Fake.AddReactor("create", "nodes",
func(action core.Action) (bool, runtime.Object, error) {
createAction := action.(core.CreateAction)
if createAction == nil {
return false, nil, nil
}
nodes = append(nodes, createAction.GetObject().(*apiv1.Node))
return true, nil, nil
})
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(nil),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 2,
maxSize: 3,
}
// usual case
err := ng.IncreaseSize(1)
assert.Nil(t, err)
assert.Len(t, nodes, 1)
assert.Equal(t, 3, ng.targetSize)
for _, n := range nodes {
assert.Contains(t, n.Spec.ProviderID, "kwok")
assert.Contains(t, n.GetName(), ng.name)
}
// delta is negative
nodes = []*apiv1.Node{}
err = ng.IncreaseSize(-1)
assert.NotNil(t, err)
assert.Contains(t, err.Error(), sizeIncreaseMustBePositiveErr)
assert.Len(t, nodes, 0)
// delta is greater than max size
nodes = []*apiv1.Node{}
err = ng.IncreaseSize(ng.maxSize + 1)
assert.NotNil(t, err)
assert.Contains(t, err.Error(), maxSizeReachedErr)
assert.Len(t, nodes, 0)
}
func TestDeleteNodes(t *testing.T) {
fakeClient := &fake.Clientset{}
deletedNodes := make(map[string]bool)
fakeClient.Fake.AddReactor("delete", "nodes", func(action core.Action) (bool, runtime.Object, error) {
deleteAction := action.(core.DeleteAction)
if deleteAction == nil {
return false, nil, nil
}
deletedNodes[deleteAction.GetName()] = true
return true, nil, nil
})
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(nil),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 1,
maxSize: 3,
}
nodeToDelete1 := &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "node-to-delete-1",
Annotations: map[string]string{
KwokManagedAnnotation: "fake",
},
},
}
nodeToDelete2 := &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "node-to-delete-2",
Annotations: map[string]string{
KwokManagedAnnotation: "fake",
},
},
}
nodeWithoutKwokAnnotation := &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "node-to-delete-3",
Annotations: map[string]string{},
},
}
// usual case
err := ng.DeleteNodes([]*apiv1.Node{nodeToDelete1})
assert.Nil(t, err)
assert.True(t, deletedNodes[nodeToDelete1.GetName()])
// min size reached
deletedNodes = make(map[string]bool)
ng.targetSize = 0
err = ng.DeleteNodes([]*apiv1.Node{nodeToDelete1})
assert.NotNil(t, err)
assert.Contains(t, err.Error(), minSizeReachedErr)
assert.False(t, deletedNodes[nodeToDelete1.GetName()])
ng.targetSize = 1
// too many nodes to delete - goes below ng's minSize
deletedNodes = make(map[string]bool)
err = ng.DeleteNodes([]*apiv1.Node{nodeToDelete1, nodeToDelete2})
assert.NotNil(t, err)
assert.Contains(t, err.Error(), belowMinSizeErr)
assert.False(t, deletedNodes[nodeToDelete1.GetName()])
assert.False(t, deletedNodes[nodeToDelete2.GetName()])
// kwok annotation is not present on the node to delete
deletedNodes = make(map[string]bool)
err = ng.DeleteNodes([]*apiv1.Node{nodeWithoutKwokAnnotation})
assert.NotNil(t, err)
assert.Contains(t, err.Error(), "not managed by kwok")
assert.False(t, deletedNodes[nodeWithoutKwokAnnotation.GetName()])
}
func TestDecreaseTargetSize(t *testing.T) {
fakeClient := &fake.Clientset{}
fakeNodes := []*apiv1.Node{
{
ObjectMeta: metav1.ObjectMeta{
Name: "node-1",
},
},
{
ObjectMeta: metav1.ObjectMeta{
Name: "node-2",
},
},
}
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(fakeNodes),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 3,
maxSize: 4,
}
// usual case
err := ng.DecreaseTargetSize(-1)
assert.Nil(t, err)
assert.Equal(t, 2, ng.targetSize)
// delta is positive
ng.targetSize = 3
err = ng.DecreaseTargetSize(1)
assert.NotNil(t, err)
assert.Contains(t, err.Error(), sizeDecreaseMustBeNegativeErr)
assert.Equal(t, 3, ng.targetSize)
// attempt to delete existing nodes
err = ng.DecreaseTargetSize(-2)
assert.NotNil(t, err)
assert.Contains(t, err.Error(), attemptToDeleteExistingNodesErr)
assert.Equal(t, 3, ng.targetSize)
// error from lister
ng.lister = &ErroneousNodeLister{}
err = ng.DecreaseTargetSize(-1)
assert.NotNil(t, err)
assert.Equal(t, cloudprovider.ErrNotImplemented.Error(), err.Error())
assert.Equal(t, 3, ng.targetSize)
ng.lister = kube_util.NewTestNodeLister(fakeNodes)
}
func TestNodes(t *testing.T) {
fakeClient := &fake.Clientset{}
fakeNodes := []*apiv1.Node{
{
ObjectMeta: metav1.ObjectMeta{
Name: "node-1",
},
},
{
ObjectMeta: metav1.ObjectMeta{
Name: "node-2",
},
},
}
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(fakeNodes),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 2,
maxSize: 3,
}
// usual case
cpInstances, err := ng.Nodes()
assert.Nil(t, err)
assert.Len(t, cpInstances, 2)
for i := range cpInstances {
assert.Contains(t, cpInstances[i].Id, fakeNodes[i].GetName())
assert.Equal(t, &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceRunning,
ErrorInfo: nil,
}, cpInstances[i].Status)
}
// error from lister
ng.lister = &ErroneousNodeLister{}
cpInstances, err = ng.Nodes()
assert.NotNil(t, err)
assert.Len(t, cpInstances, 0)
assert.Equal(t, cloudprovider.ErrNotImplemented.Error(), err.Error())
}
func TestTemplateNodeInfo(t *testing.T) {
fakeClient := &fake.Clientset{}
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(nil),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 2,
maxSize: 3,
}
// usual case
ti, err := ng.TemplateNodeInfo()
assert.Nil(t, err)
assert.NotNil(t, ti)
assert.Len(t, ti.Pods, 1)
assert.Contains(t, ti.Pods[0].Pod.Name, fmt.Sprintf("kube-proxy-%s", ng.name))
assert.Equal(t, ng.nodeTemplate, ti.Node())
}
func TestGetOptions(t *testing.T) {
fakeClient := &fake.Clientset{}
ng := NodeGroup{
name: "ng",
kubeClient: fakeClient,
lister: kube_util.NewTestNodeLister(nil),
nodeTemplate: &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: "template-node-ng",
},
},
minSize: 0,
targetSize: 2,
maxSize: 3,
}
// dummy values
autoscalingOptions := config.NodeGroupAutoscalingOptions{
ScaleDownUtilizationThreshold: 50.0,
ScaleDownGpuUtilizationThreshold: 50.0,
ScaleDownUnneededTime: time.Minute * 5,
ScaleDownUnreadyTime: time.Minute * 5,
MaxNodeProvisionTime: time.Minute * 5,
ZeroOrMaxNodeScaling: true,
IgnoreDaemonSetsUtilization: true,
}
// usual case
opts, err := ng.GetOptions(autoscalingOptions)
assert.Nil(t, err)
assert.Equal(t, autoscalingOptions, *opts)
}
// ErroneousNodeLister is used to check that the calling function returns an error
// when the lister returns an error
type ErroneousNodeLister struct {
}
func (e *ErroneousNodeLister) List() ([]*apiv1.Node, error) {
return nil, cloudprovider.ErrNotImplemented
}
func (e *ErroneousNodeLister) Get(name string) (*apiv1.Node, error) {
return nil, cloudprovider.ErrNotImplemented
}


@ -0,0 +1,257 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
"context"
"fmt"
"os"
"strings"
apiv1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/autoscaler/cluster-autoscaler/utils/errors"
"k8s.io/autoscaler/cluster-autoscaler/utils/gpu"
kube_util "k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes"
"k8s.io/client-go/informers"
kubeclient "k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
"k8s.io/client-go/tools/clientcmd"
klog "k8s.io/klog/v2"
)
// Name returns name of the cloud provider.
func (kwok *KwokCloudProvider) Name() string {
return ProviderName
}
// NodeGroups returns all node groups configured for this cloud provider.
func (kwok *KwokCloudProvider) NodeGroups() []cloudprovider.NodeGroup {
result := make([]cloudprovider.NodeGroup, 0, len(kwok.nodeGroups))
for _, nodegroup := range kwok.nodeGroups {
result = append(result, nodegroup)
}
return result
}
// NodeGroupForNode returns the node group for the given node.
func (kwok *KwokCloudProvider) NodeGroupForNode(node *apiv1.Node) (cloudprovider.NodeGroup, error) {
// Skip nodes that are not managed by kwok cloud provider.
if !strings.HasPrefix(node.Spec.ProviderID, ProviderName) {
klog.V(2).Infof("ignoring node '%s' because it is not managed by kwok", node.GetName())
return nil, nil
}
for _, nodeGroup := range kwok.nodeGroups {
if nodeGroup.name == getNGName(node, kwok.config) {
klog.V(5).Infof("found nodegroup '%s' for node '%s'", nodeGroup.name, node.GetName())
return nodeGroup, nil
}
}
return nil, nil
}
// HasInstance returns whether a given node has a corresponding instance in this cloud provider
// Since there is no underlying cloud provider instance, return true
func (kwok *KwokCloudProvider) HasInstance(node *apiv1.Node) (bool, error) {
return true, nil
}
// Pricing returns pricing model for this cloud provider or error if not available.
func (kwok *KwokCloudProvider) Pricing() (cloudprovider.PricingModel, errors.AutoscalerError) {
return nil, cloudprovider.ErrNotImplemented
}
// GetAvailableMachineTypes get all machine types that can be requested from the cloud provider.
// Implementation optional.
func (kwok *KwokCloudProvider) GetAvailableMachineTypes() ([]string, error) {
return []string{}, cloudprovider.ErrNotImplemented
}
// NewNodeGroup builds a theoretical node group based on the node definition provided.
func (kwok *KwokCloudProvider) NewNodeGroup(machineType string, labels map[string]string, systemLabels map[string]string,
taints []apiv1.Taint,
extraResources map[string]resource.Quantity) (cloudprovider.NodeGroup, error) {
return nil, cloudprovider.ErrNotImplemented
}
// GetResourceLimiter returns struct containing limits (max, min) for resources (cores, memory etc.).
func (kwok *KwokCloudProvider) GetResourceLimiter() (*cloudprovider.ResourceLimiter, error) {
return kwok.resourceLimiter, nil
}
// GPULabel returns the label added to nodes with GPU resource.
func (kwok *KwokCloudProvider) GPULabel() string {
// GPULabel() might get called before the config is loaded
if kwok.config == nil || kwok.config.status == nil {
return ""
}
return kwok.config.status.gpuLabel
}
// GetAvailableGPUTypes returns all available GPU types the cloud provider supports
func (kwok *KwokCloudProvider) GetAvailableGPUTypes() map[string]struct{} {
// GetAvailableGPUTypes() might get called before the config is loaded
if kwok.config == nil || kwok.config.status == nil {
return map[string]struct{}{}
}
return kwok.config.status.availableGPUTypes
}
// GetNodeGpuConfig returns the label, type and resource name for the GPU added to node. If node doesn't have
// any GPUs, it returns nil.
func (kwok *KwokCloudProvider) GetNodeGpuConfig(node *apiv1.Node) *cloudprovider.GpuConfig {
return gpu.GetNodeGPUFromCloudProvider(kwok, node)
}
// Refresh is called before every main loop and can be used to dynamically update cloud provider state.
// In particular the list of node groups returned by NodeGroups can change as a result of CloudProvider.Refresh().
// TODO(vadasambar): implement this
func (kwok *KwokCloudProvider) Refresh() error {
// TODO(vadasambar): causes CA to not recognize kwok nodegroups
// needs better implementation
// nodeList, err := kwok.lister.List()
// if err != nil {
// return err
// }
// ngs := []*NodeGroup{}
// for _, no := range nodeList {
// ng := parseAnnotationsToNodegroup(no)
// ng.kubeClient = kwok.kubeClient
// ngs = append(ngs, ng)
// }
// kwok.nodeGroups = ngs
return nil
}
// Cleanup cleans up all resources before the cloud provider is removed
func (kwok *KwokCloudProvider) Cleanup() error {
for _, ng := range kwok.nodeGroups {
nodeNames, err := ng.getNodeNamesForNodeGroup()
if err != nil {
return fmt.Errorf("error cleaning up: %v", err)
}
for _, node := range nodeNames {
err := kwok.kubeClient.CoreV1().Nodes().Delete(context.Background(), node, v1.DeleteOptions{})
if err != nil {
klog.Errorf("error cleaning up kwok provider nodes '%v'", node)
}
}
}
return nil
}
// BuildKwok builds kwok cloud provider.
func BuildKwok(opts config.AutoscalingOptions,
do cloudprovider.NodeGroupDiscoveryOptions,
rl *cloudprovider.ResourceLimiter,
informerFactory informers.SharedInformerFactory) cloudprovider.CloudProvider {
var restConfig *rest.Config
var err error
if os.Getenv("KWOK_PROVIDER_MODE") == "local" {
// Check and load kubeconfig from the path set
// in KUBECONFIG env variable (if not use default path of ~/.kube/config)
apiConfig, err := clientcmd.NewDefaultClientConfigLoadingRules().Load()
if err != nil {
klog.Fatal(err)
}
// Create rest config from kubeconfig
restConfig, err = clientcmd.NewDefaultClientConfig(*apiConfig, &clientcmd.ConfigOverrides{}).ClientConfig()
if err != nil {
klog.Fatal(err)
}
} else {
restConfig, err = rest.InClusterConfig()
if err != nil {
klog.Fatalf("failed to get kubeclient config for cluster: %v", err)
}
}
// TODO: switch to using the same kube/rest config as the core CA after
// https://github.com/kubernetes/autoscaler/pull/6180/files is merged
kubeClient := kubeclient.NewForConfigOrDie(restConfig)
p, err := BuildKwokProvider(&kwokOptions{
kubeClient: kubeClient,
autoscalingOpts: &opts,
discoveryOpts: &do,
resourceLimiter: rl,
ngNodeListerFn: kube_util.NewNodeLister,
allNodesLister: informerFactory.Core().V1().Nodes().Lister()})
if err != nil {
klog.Fatal(err)
}
return p
}
// BuildKwokProvider builds the kwok provider
func BuildKwokProvider(ko *kwokOptions) (*KwokCloudProvider, error) {
kwokConfig, err := LoadConfigFile(ko.kubeClient)
if err != nil {
return nil, fmt.Errorf("failed to load kwok provider config: %v", err)
}
var nodegroups []*NodeGroup
var nodeTemplates []*apiv1.Node
switch kwokConfig.ReadNodesFrom {
case nodeTemplatesFromConfigMap:
if nodeTemplates, err = LoadNodeTemplatesFromConfigMap(kwokConfig.ConfigMap.Name, ko.kubeClient); err != nil {
return nil, err
}
case nodeTemplatesFromCluster:
if nodeTemplates, err = loadNodeTemplatesFromCluster(kwokConfig, ko.kubeClient, nil); err != nil {
return nil, err
}
}
if !kwokConfig.Nodes.SkipTaint {
for _, no := range nodeTemplates {
no.Spec.Taints = append(no.Spec.Taints, kwokProviderTaint())
}
}
nodegroups = createNodegroups(nodeTemplates, ko.kubeClient, kwokConfig, ko.ngNodeListerFn, ko.allNodesLister)
return &KwokCloudProvider{
nodeGroups: nodegroups,
kubeClient: ko.kubeClient,
resourceLimiter: ko.resourceLimiter,
config: kwokConfig,
}, nil
}
func kwokProviderTaint() apiv1.Taint {
return apiv1.Taint{
Key: "kwok-provider",
Value: "true",
Effect: apiv1.TaintEffectNoSchedule,
}
}
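Since `kwokProviderTaint()` above taints every template node with `kwok-provider=true:NoSchedule` (unless `skipTaint` is set in the provider config), test workloads that should land on the fake nodes need a matching toleration. A minimal sketch of such a pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scale-test
spec:
  tolerations:
  - key: kwok-provider
    operator: Equal
    value: "true"
    effect: NoSchedule
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: "1"
        memory: 1Gi
```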

File diff suppressed because it is too large


@ -0,0 +1,107 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kwok
import (
apiv1 "k8s.io/api/core/v1"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
kube_util "k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes"
"k8s.io/client-go/kubernetes"
listersv1 "k8s.io/client-go/listers/core/v1"
)
// KwokCloudProvider implements CloudProvider interface for kwok
type KwokCloudProvider struct {
nodeGroups []*NodeGroup
config *KwokProviderConfig
resourceLimiter *cloudprovider.ResourceLimiter
// kubeClient is to be used only for create, delete and update
kubeClient kubernetes.Interface
}
type kwokOptions struct {
kubeClient kubernetes.Interface
autoscalingOpts *config.AutoscalingOptions
discoveryOpts *cloudprovider.NodeGroupDiscoveryOptions
resourceLimiter *cloudprovider.ResourceLimiter
// TODO(vadasambar): look into abstracting kubeClient
// and lister into a single client
// allNodesLister lists all the nodes in the cluster
allNodesLister listersv1.NodeLister
// ngNodeListerFn returns a lister for the nodes managed by kwok for a specific nodegroup
ngNodeListerFn listerFn
}
// NodeGroup implements NodeGroup interface.
type NodeGroup struct {
name string
kubeClient kubernetes.Interface
lister kube_util.NodeLister
nodeTemplate *apiv1.Node
minSize int
targetSize int
maxSize int
}
// NodegroupsConfig defines options for creating nodegroups
type NodegroupsConfig struct {
FromNodeLabelKey string `json:"fromNodeLabelKey" yaml:"fromNodeLabelKey"`
FromNodeLabelAnnotation string `json:"fromNodeLabelAnnotation" yaml:"fromNodeLabelAnnotation"`
}
// NodeConfig defines config options for the nodes
type NodeConfig struct {
GPUConfig *GPUConfig `json:"gpuConfig" yaml:"gpuConfig"`
SkipTaint bool `json:"skipTaint" yaml:"skipTaint"`
}
// ConfigMapConfig allows setting the kwok provider configmap name
type ConfigMapConfig struct {
Name string `json:"name" yaml:"name"`
Key string `json:"key" yaml:"key"`
}
// GPUConfig defines GPU related config for the node
type GPUConfig struct {
GPULabelKey string `json:"gpuLabelKey" yaml:"gpuLabelKey"`
AvailableGPUTypes map[string]struct{} `json:"availableGPUTypes" yaml:"availableGPUTypes"`
}
// KwokConfig is the struct to define kwok specific config
// (needs to be implemented; currently empty)
type KwokConfig struct {
}
// KwokProviderConfig is the struct to hold kwok provider config
type KwokProviderConfig struct {
APIVersion string `json:"apiVersion" yaml:"apiVersion"`
ReadNodesFrom string `json:"readNodesFrom" yaml:"readNodesFrom"`
Nodegroups *NodegroupsConfig `json:"nodegroups" yaml:"nodegroups"`
Nodes *NodeConfig `json:"nodes" yaml:"nodes"`
ConfigMap *ConfigMapConfig `json:"configmap" yaml:"configmap"`
Kwok *KwokConfig `json:"kwok" yaml:"kwok"`
status *GroupingConfig
}
// GroupingConfig defines how nodes are grouped into nodegroups (by annotation or label) and the GPU label config
type GroupingConfig struct {
groupNodesBy string // [annotation, label]
key string // annotation or label key
gpuLabel string // gpu label key
availableGPUTypes map[string]struct{} // available gpu types
}


@ -0,0 +1,28 @@
apiVersion: v1alpha1
readNodesFrom: cluster # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "node.kubernetes.io/instance-type"
# you can either specify fromNodeLabelKey OR fromNodeLabelAnnotation
# (specifying both is not allowed)
# fromNodeLabelAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
# kwok provider adds a taint on the template nodes
# so that even if you run the provider in a production cluster
# you don't have to worry about production workload
# getting accidentally scheduled on the fake nodes
# use skipTaint: true to disable this behavior (false by default)
# skipTaint: false (default)
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}


@ -0,0 +1,30 @@
apiVersion: v1alpha1
readNodesFrom: configmap # possible values: [cluster,configmap]
nodegroups:
# to specify how to group nodes into a nodegroup
# e.g., you want to treat nodes with same instance type as a nodegroup
# node1: m5.xlarge
# node2: c5.xlarge
# node3: m5.xlarge
# nodegroup1: [node1,node3]
# nodegroup2: [node2]
fromNodeLabelKey: "node.kubernetes.io/instance-type"
# you can either specify fromNodeLabelKey OR fromNodeLabelAnnotation
# (specifying both is not allowed)
# fromNodeLabelAnnotation: "eks.amazonaws.com/nodegroup"
nodes:
# kwok provider adds a taint on the template nodes
# so that even if you run the provider in a production cluster
# you don't have to worry about production workload
# getting accidentally scheduled on the fake nodes
# use skipTaint: true to disable this behavior (false by default)
# skipTaint: false (default)
gpuConfig:
# to tell kwok provider what label should be considered as GPU label
gpuLabelKey: "k8s.amazonaws.com/accelerator"
availableGPUTypes:
"nvidia-tesla-k80": {}
"nvidia-tesla-p100": {}
configmap:
name: kwok-provider-templates
key: kwok-config # default: config
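The `configmap` section above points the provider at a separate ConfigMap that holds the node templates. A minimal sketch of that ConfigMap, using the `kwok-provider-templates` name and `kwok-config` key from the sample above (the template node itself is illustrative; per the tests earlier in this change, either a multi-document node YAML or a `kind: List` works):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kwok-provider-templates
  # assumed: same namespace the cluster-autoscaler runs in (POD_NAMESPACE)
  namespace: kube-system
data:
  kwok-config: |
    apiVersion: v1
    kind: List
    items:
    - apiVersion: v1
      kind: Node
      metadata:
        name: template-node-1
        labels:
          node.kubernetes.io/instance-type: m5.xlarge
```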


@ -73,8 +73,8 @@ type Autoscaler interface {
}
// NewAutoscaler creates an autoscaler of an appropriate type according to the parameters
-func NewAutoscaler(opts AutoscalerOptions) (Autoscaler, errors.AutoscalerError) {
-	err := initializeDefaultOptions(&opts)
+func NewAutoscaler(opts AutoscalerOptions, informerFactory informers.SharedInformerFactory) (Autoscaler, errors.AutoscalerError) {
+	err := initializeDefaultOptions(&opts, informerFactory)
if err != nil {
return nil, errors.ToAutoscalerError(errors.InternalError, err)
}
@ -97,7 +97,7 @@ func NewAutoscaler(opts AutoscalerOptions) (Autoscaler, errors.AutoscalerError)
}
// Initialize default options if not provided.
-func initializeDefaultOptions(opts *AutoscalerOptions) error {
+func initializeDefaultOptions(opts *AutoscalerOptions, informerFactory informers.SharedInformerFactory) error {
if opts.Processors == nil {
opts.Processors = ca_processors.DefaultProcessors(opts.AutoscalingOptions)
}
@ -111,7 +111,7 @@ func initializeDefaultOptions(opts *AutoscalerOptions) error {
opts.RemainingPdbTracker = pdb.NewBasicRemainingPdbTracker()
}
if opts.CloudProvider == nil {
-	opts.CloudProvider = cloudBuilder.NewCloudProvider(opts.AutoscalingOptions)
+	opts.CloudProvider = cloudBuilder.NewCloudProvider(opts.AutoscalingOptions, informerFactory)
}
if opts.ExpanderStrategy == nil {
expanderFactory := factory.NewFactory()


@ -941,6 +941,8 @@ func (a *StaticAutoscaler) ExitCleanUp() {
}
utils.DeleteStatusConfigMap(a.AutoscalingContext.ClientSet, a.AutoscalingContext.ConfigNamespace, a.AutoscalingContext.StatusConfigMapName)
a.CloudProvider.Cleanup()
a.clusterStateRegistry.Stop()
}


@ -192,7 +192,7 @@ require (
k8s.io/kms v0.29.0-alpha.3 // indirect
k8s.io/kube-openapi v0.0.0-20231010175941-2dd684a91f00 // indirect
k8s.io/kube-scheduler v0.0.0 // indirect
-	k8s.io/kubectl v0.0.0 // indirect
+	k8s.io/kubectl v0.28.0 // indirect
k8s.io/mount-utils v0.26.0-alpha.0 // indirect
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.28.0 // indirect
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect


@ -510,7 +510,7 @@ func buildAutoscaler(debuggingSnapshotter debuggingsnapshot.DebuggingSnapshotter
metrics.UpdateMemoryLimitsBytes(autoscalingOptions.MinMemoryTotal, autoscalingOptions.MaxMemoryTotal)
// Create autoscaler.
-	autoscaler, err := core.NewAutoscaler(opts)
+	autoscaler, err := core.NewAutoscaler(opts, informerFactory)
if err != nil {
return nil, err
}


@ -1852,7 +1852,7 @@ k8s.io/kube-openapi/pkg/validation/strfmt/bson
## explicit; go 1.21.3
k8s.io/kube-scheduler/config/v1
k8s.io/kube-scheduler/extender/v1
-# k8s.io/kubectl v0.0.0 => k8s.io/kubectl v0.29.0-alpha.3
+# k8s.io/kubectl v0.28.0 => k8s.io/kubectl v0.29.0-alpha.3
## explicit; go 1.21.3
k8s.io/kubectl/pkg/scale
# k8s.io/kubelet v0.29.0-alpha.3 => k8s.io/kubelet v0.29.0-alpha.3