2022-09-28 - 250 pods per node test (with cluster operations)

Results

  • results from the 20220923 test were reproduced, then:
  • 4 worker nodes were added and the number of replicas was scaled up to 2000
    • the workload stayed healthy throughout
  • 1 worker node was cordoned, drained and removed via the Rancher UI
    • the workload stayed healthy throughout
  • all RKE2 nodes were upgraded to the next RKE2 minor version
    • the workload stayed healthy throughout

The test takes ~1 hour and costs ~2 USD in AWS resources.

AWS Hardware configuration

  • bastion host (for SSH tunnelling only): t2.micro, 50 GiB EBS gp3 root volume
  • Rancher cluster: 3-node t3.large, 50 GiB EBS gp3 root volume
  • downstream cluster: from 7 up to 11 t3.xlarge nodes, 50 GiB EBS gp3 root volume
  • networking: one /16 AWS VPC with two /24 subnets
    • public subnet: contains the single bastion host, which exposes port 22 to the Internet via security groups
    • private subnet: contains all other nodes. Traffic allowed only internally and to/from the bastion via SSH

See https://aws.amazon.com/ec2/instance-types/, https://aws.amazon.com/ebs/, https://aws.amazon.com/vpc/

Software configuration

  • bastion host: SLES 15 SP4
  • Rancher cluster: Rancher 2.6.5 on a 3-node RKE2 v1.23.10+rke2r1 cluster
    • all nodes based on Rocky Linux 8.6
  • downstream cluster: RKE2 v1.22.13+rke2r1, 3 server nodes and 4 agent nodes
    • all nodes based on Rocky Linux 8.6

The limit of 250 pods per node is set by:

  • adding a kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config" line to /etc/rancher/rke2/config.yaml
  • creating a /etc/rancher/rke2/kubelet-custom.config file with the following content:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 250
  • restarting the rke2-server service (see the sketch below)
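
Taken together, and assuming direct SSH access to each downstream node, the steps above amount to the following sketch; note that on agent nodes the service to restart is rke2-agent rather than rke2-server:

# point the kubelet at a custom configuration file
echo 'kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config"' \
  | sudo tee -a /etc/rancher/rke2/config.yaml

# write the kubelet configuration raising the pod limit to 250
sudo tee /etc/rancher/rke2/kubelet-custom.config <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 250
EOF

# restart RKE2 to pick up the change (use rke2-agent on agent nodes)
sudo systemctl restart rke2-server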

See the rke2 installation script in this repo for details.

Full configuration details

All infrastructure is defined via Terraform files in the 20220928_250_pods_per_node_with_cluster_operations branch.

Note in particular inputs.tf for the main parameters.
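
If the values in inputs.tf are declared as Terraform input variables (an assumption worth checking in the file), they can also be overridden at apply time without editing it, e.g. for downstream_agent_count, which is used later in this document:

terraform apply -auto-approve -var 'downstream_agent_count=8'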

Reproduction Instructions

Deployment and cluster configuration

  • get API access to EC2 configured for your terminal
    • for SUSE Engineering:
  • get Terraform
  • check out this project
git clone https://github.com/rancher/dartboard.git
cd dartboard
git checkout 20220928_250_pods_per_node_with_cluster_operations
  • run:
terraform init
terraform apply -auto-approve
  • deployment takes ~10 minutes
  • output will contain instructions to access the newly created clusters, e.g.:
UPSTREAM CLUSTER ACCESS:
  export KUBECONFIG=./config/upstream.yaml

RANCHER UI:
  https://upstream.local.gd:3000

DOWNSTREAM CLUSTER ACCESS:
  export KUBECONFIG=./config/downstream.yaml
  • use the URL to access the Rancher UI and log in with admin as the temporary password
  • "Clusters" -> "Import existing" -> "Generic" -> "Cluster Name": test -> "Create"
  • click on the curl command to copy it
  • SSH into the first node by running ./config/ssh-to-*-downstream-server-node-0.sh
  • paste the curl command to register the cluster to Rancher
  • wait until the cluster is fully registered, then check from the home page that all 7 nodes are detected and a maximum of 1750 pods is reported (see the command-line check below)
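
A minimal sketch of that check from a terminal, using the kubeconfig path printed by Terraform above:

export KUBECONFIG=./config/downstream.yaml

# all 7 nodes should be listed as Ready
kubectl get nodes

# the per-node pod capacities should add up to 7 × 250 = 1750
kubectl get nodes -o jsonpath='{range .items[*]}{.status.capacity.pods}{"\n"}{end}' \
  | awk '{sum += $1} END {print sum}'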

Workload deployment

  • Homepage -> test -> "Apps" -> "Repositories" -> "Create"
    • "Name": bitnami
    • "Index URL": https://charts.bitnami.com/bitnami
    • click on "Create"
  • "Apps" -> "Charts" -> "memcached"
    • "Chart Versions": 6.2.5
    • click on "Install"
    • "Name": memcached
    • click on "Next" -> "Install"
    • "Workload" -> "Deployments" -> memcached -> "⋮" -> "Edit YAML"
    • "Edit YAML":
      • spec.replicas: change from 1 to 1500
      • remove the resources dictionary (resources: line and the three following lines)
    • click on "Save"
  • wait for the deployment to stabilize (it can take several minutes) and check events on the "Cluster Dashboard" page; there should be no errors once the cluster settles at ~1550 / 1750 active pods (a kubectl alternative to the UI steps above is sketched after this list)
    • note that the UI might temporarily display outdated information. If pods are shown as errored when they should not be, try refreshing the page
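
A sketch of that alternative, assuming the chart was installed into the default namespace (add -n <namespace> otherwise):

export KUBECONFIG=./config/downstream.yaml

# scale the deployment from 1 to 1500 replicas
kubectl scale deployment memcached --replicas=1500

# drop the chart's default resource requests/limits
kubectl patch deployment memcached --type=json \
  -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/resources"}]'

# wait for the rollout to settle
kubectl rollout status deployment memcached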

Adding cluster nodes

  • edit inputs.tf, changing downstream_agent_count from 4 to 8
  • run terraform apply -auto-approve again
    • monitor the "Cluster" -> "Nodes" page. New nodes should appear during deployment
  • cluster pod capacity on the "Cluster" page should increase from 1750 to 2750
  • scale the number of replicas up from 1500 to 2000 per the instructions in the previous section
    • monitor the "Cluster" -> "Nodes" page. The pod count on the newly added nodes should rise to ~51% (see the sketch after this list for a command-line check)
  • check the health indicator from the "Workload" -> "Deployments" -> memcached page and events from the "More Resources" -> "Core" -> "Events" page
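
A sketch of that check, counting running pods per node:

export KUBECONFIG=./config/downstream.yaml

# print "node-name pod-count" for every node
kubectl get pods -A -o wide --field-selector=status.phase=Running --no-headers \
  | awk '{count[$8]++} END {for (n in count) print n, count[n]}' | sort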

Removing cluster nodes

  • Drain a worker node: "Cluster" -> "Nodes" -> worker node -> "⋮" -> "Drain" -> "Drain" (a kubectl equivalent of the drain and delete is sketched after this list)
    • "Delete Empty Dir Data": Yes
    • "Force": Yes
    • "Ignore the defaults and give each pod": 5 seconds
  • check the health indicator from the "Workload" -> "Deployments" -> memcached page. 2000 pods should remain available
  • Delete that worker node: "Cluster" -> "Nodes" -> worker node -> "⋮" -> "Delete" -> "Delete"
  • reload the page. The number of nodes should now be 10
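
A sketch of the equivalent kubectl commands, mirroring the UI options above (replace <node-name> with the chosen worker node):

export KUBECONFIG=./config/downstream.yaml

# mirror the UI options: delete emptyDir data, force, 5-second grace period
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data \
  --force --grace-period=5

# remove the node object once it is drained
kubectl delete node <node-name>

Note that this only removes the node from the cluster; the underlying EC2 instance is still managed by Terraform.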

Upgrading RKE2

  • scale the number of replicas down from 2000 to 1000 per the instructions above
    • monitor the "Cluster" -> "Nodes" page. Pod pressure should diminish
  • run ./util/upgrade_downstream_rke.sh
    • monitor the "Cluster" -> "Nodes" page. Nodes should be cordoned and uncordoned, starting from server nodes, one by one
    • monitor the "Workload" -> "Deployments" -> memcached page. Upon node draining, the number of available pods should diminish and then be replenished to 1000
  • at the end of the process, the "Cluster" -> "Nodes" page should show the new RKE2 version on all nodes (v1.23.10+rke2r1) and an even pod load distribution (see the sketch below for a command-line check)
  • finally, check the health indicator from the "Workload" -> "Deployments" -> memcached page and events from the "More Resources" -> "Core" -> "Events" page
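
A minimal sketch of that version check:

export KUBECONFIG=./config/downstream.yaml

# the VERSION column should read v1.23.10+rke2r1 on every node once the upgrade completes
kubectl get nodes -o wide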

Cleanup

All created infrastructure can be destroyed via:

terraform destroy -auto-approve

Screenshots

[Four Rancher UI screenshots]

Troubleshooting

  • to re-create SSH tunnels to the cluster, run terraform apply again

  • if the error below is produced:

Error: creating EC2 Instance: VcpuLimitExceeded: You have requested more vCPU capacity than your current vCPU limit of 32 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.

Then you need to request a higher limit for your account from AWS. This can be done by visiting the Service Quotas page and filling in the details to request an increase of the "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances" limit.
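
The same request can also be filed via the AWS CLI; a sketch, where L-1216C47A is assumed to be the quota code for that limit:

aws service-quotas request-service-quota-increase \
  --service-code ec2 \
  --quota-code L-1216C47A \
  --desired-value 64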