add docs for 20221003 test

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Silvio Moioli 2022-10-05 23:05:10 +02:00
parent 57f5a86f90
commit b45fd81a6a
1 changed file with 163 additions and 0 deletions


@ -0,0 +1,163 @@
# 2022-10-03 - 300 pods per node test
## Results
- a 6-node RKE2 cluster is set up with a limit of 300 pods per node (1800 pods maximum cluster capacity)
- an `nginx` workload at 90% pod capacity (270 pods per node / 1620 pods total) is scheduled and health checks stay green
- some day-2 operations are performed on the cluster, with the workload returning to a healthy state (after a brief transition period) at every step:
- addition (uncordoning) of one node
- full workload redeployment
- removal (draining) of one node
- upgrade of RKE2 on all cluster nodes
- CPU overhead at rest (kubelet, container runtime, etc.): ~15% (of a 4-vCPU 3.1 GHz Intel Xeon)
The test is completely automated, takes ~1 hour, and costs ~2 USD in AWS resources.
[Screen captures and logs](https://mysuse-my.sharepoint.com/:f:/g/personal/moio_suse_com/Esqw92qeMS5AkAwn6PaV-O0B8tADbJ9rK9zRjqL-yKIGMQ?e=OUbMaz) are available to SUSE employees.
## Methodology notes
- 300 pods per node was chosen because it is higher than 254 pods per node, the maximum possible with the default /24 node CIDR range
- a /22 node CIDR range was assigned (1024 addresses per node), as [Google recommends](https://cloud.google.com/kubernetes-engine/docs/how-to/flexible-pod-cidr#cidr_ranges_for_clusters) having at least twice as many available IP addresses as pods
- workload pods were scheduled up to 90% of the maximum (270 pods per node / 1620 pods total). That still leaves 30 pods per node for system components and is still higher than 254 (see the verification sketch at the end of this list)
- the downstream cluster is imported because AWS provisioning [is not supported with temporary session tokens at the time of writing](https://github.com/rancher/rancher/issues/15962)
- the downstream cluster RKE2 upgrade is performed via the [installation script method](https://github.com/rancher/rke2/blob/v1.25.2%2Brke2r1/docs/upgrade/basic_upgrade.md#upgrade-rke2-using-the-installation-script) because it's the only option with imported clusters at the time of writing
- upgrading via the installation script implies draining each node being upgraded; one extra node is added to the cluster for that purpose
- bugs not related to scalability have been worked around
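
As a sanity check, the resulting capacity and CIDR assignment can be verified with kubectl (a minimal sketch, assuming `kubectl` points at the downstream cluster):

```shell
# each node should advertise 300 allocatable pods
kubectl get nodes -o custom-columns='NAME:.metadata.name,PODS:.status.allocatable.pods'

# each node should have been assigned a /22 pod CIDR
kubectl get nodes -o custom-columns='NAME:.metadata.name,CIDR:.spec.podCIDR'
```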
## Test outline
- infrastructure is set up:
- AWS hardware (VMs, network devices...) is deployed
- RKE2 is installed on upstream cluster nodes, and Rancher is installed on top of it
- RKE2 is installed on the downstream cluster
- test is conducted:
- initial admin user is set up ([detail](../cypress/cypress/e2e/users.cy.js))
- downstream cluster is imported into Rancher ([detail](../cypress/cypress/e2e/imported-clusters.cy.js))
- an nginx deployment ([detail](../cypress/cypress/e2e/deployment.yaml)) is applied ([detail](../cypress/cypress/e2e/workloads.cy.js)); the workload steps that follow are sketched as kubectl commands after this list
- deployment replicas are scaled up to 1620
- one worker node is drained
- deployment is recycled
- worker node is uncordoned
- deployment is recycled again
- RKE2 is updated, node by node ([detail](../cypress/cypress/e2e/rke2-update.cy.js))
- logs are collected
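
For reference, the workload steps above map roughly onto the following kubectl commands. This is an illustrative sketch only: the deployment name `nginx` and `<node-name>` are placeholders, and the test itself drives these operations through the Rancher UI via Cypress:

```shell
kubectl scale deployment nginx --replicas=1620                        # scale the workload up
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data  # remove one worker
kubectl rollout restart deployment nginx                              # recycle the deployment
kubectl uncordon <node-name>                                          # re-add the worker
kubectl rollout status deployment nginx                               # wait for pods to become healthy again
```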
## AWS Hardware configuration outline
- bastion host (for SSH tunnelling only): `t4g.small`, 50 GiB EBS `gp3` root volume
- Rancher cluster: 3-node `t3.large`, 50 GiB EBS `gp3` root volume
- downstream cluster: 6 to 7 nodes, `t3.xlarge`, 50 GiB EBS `gp3` root volume
- networking: one /16 AWS VPC with two /24 subnets (a hypothetical AWS CLI sketch follows this list)
- public subnet: contains only the bastion host, which exposes port 22 to the Internet via security groups
- private subnet: contains all other nodes; traffic is allowed only internally and to/from the bastion via SSH
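
A hypothetical AWS CLI equivalent of this network layout (CIDR values are illustrative; the actual definitions live in the Terraform files):

```shell
aws ec2 create-vpc --cidr-block 10.0.0.0/16                        # one /16 VPC
aws ec2 create-subnet --vpc-id <vpc-id> --cidr-block 10.0.0.0/24   # public subnet (bastion host)
aws ec2 create-subnet --vpc-id <vpc-id> --cidr-block 10.0.1.0/24   # private subnet (all other nodes)
```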
References:
- [instance types](https://aws.amazon.com/ec2/instance-types/)
- [EBS](https://aws.amazon.com/ebs/)
- [VPC](https://aws.amazon.com/vpc/)
## Software configuration outline
- bastion host: SLES 15 SP4
- Rancher cluster: Rancher 2.6.5 on a 3-node RKE2 v1.23.10+rke2r1 cluster
- all nodes based on Rocky Linux 8.6
- downstream cluster: RKE2 v1.22.13+rke2r1, 3 server nodes and 4 agent nodes
- all nodes based on Rocky Linux 8.6
The limit of 300 pods per node is set via the following steps (consolidated into a shell sketch after the list):
- adding a `kube-controller-manager-arg: "node-cidr-mask-size=22"` line to `/etc/rancher/rke2/config.yaml`
- adding a `kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config"` line to `/etc/rancher/rke2/config.yaml`
- creating a `/etc/rancher/rke2/kubelet-custom.config` file with the following content:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 300
```
- restarting the `rke2-server` service
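
Taken together, the steps above might look like this sketch when run as root on a server node (the authoritative version is the installation script linked below):

```shell
# append the custom arguments to the RKE2 configuration
cat >> /etc/rancher/rke2/config.yaml <<EOF
kube-controller-manager-arg: "node-cidr-mask-size=22"
kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config"
EOF

# create the custom kubelet configuration raising the pod limit
cat > /etc/rancher/rke2/kubelet-custom.config <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 300
EOF

# apply the new configuration
systemctl restart rke2-server
```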
See [the rke2 installation script in this repo](../rke2/install_rke2.sh) for details.
## Full configuration details
All infrastructure is defined via [Terraform](https://www.terraform.io/) files in the [20221003_300_pods_per_node](https://github.com/moio/scalability-tests/tree/20221003_300_pods_per_node) branch. Note in particular [inputs.tf](../inputs.tf) for the main parameters.
All tests are driven by [Cypress](https://www.cypress.io/) files in the [cypress](https://github.com/moio/scalability-tests/tree/20221003_300_pods_per_node/cypress) directory.
## Reproduction Instructions
### Requirements
- API access to EC2 configured for your terminal
- for SUSE Engineering:
- [have "AWS Landing Zone" added to your Okta account](https://confluence.suse.com/display/CCOE/Requesting+AWS+Access)
- open [Okta](https://suse.okta.com/) -> "AWS Landing Zone"
- click on "AWS Account" -> your account -> "Command line or programmatic access" -> click to copy the commands under "Option 1: Set AWS environment variables"
- paste the contents into your terminal
- [Terraform](https://www.terraform.io/downloads)
- `git`
- `node`
- `kubectl`
- `nc` (netcat)
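
A quick check that all command-line prerequisites are available:

```shell
# report any missing tool on the PATH
for cmd in terraform git node kubectl nc; do
  command -v "$cmd" >/dev/null || echo "missing: $cmd"
done
```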
### Setup
- clone this project:
```shell
git clone https://github.com/moio/scalability-tests.git
cd scalability-tests
git checkout 20221003_300_pods_per_node
```
- initialize Terraform and Cypress:
```shell
terraform init
pushd cypress
npm install cypress --save-dev
popd
```
### Run
```shell
# deploys all infrastructure
terraform apply -auto-approve
# runs tests
cd cypress; ./node_modules/cypress/bin/cypress run
```
### Inspect results
- `cypress/cypress/logs` contains log tarballs collected from all downstream nodes
- `cypress/cypress/videos` contains screen captures of test runs
- Terraform output will contain instructions to access the newly created clusters, e.g.:
```
UPSTREAM CLUSTER ACCESS:
export KUBECONFIG=./config/upstream.yaml
RANCHER UI:
https://upstream.local.gd:3000
DOWNSTREAM CLUSTER ACCESS:
export KUBECONFIG=./config/downstream.yaml
```
- `./config` contains scripts to SSH into any VM and reopen SSH tunnels to ports 3000 (Web UI), 6443 (upstream cluster API) and 6444 (downstream cluster API); a hypothetical tunnel command is sketched after this list
- Cypress tests can be run interactively via:
```shell
cd cypress
npx cypress open
```
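
Reopening a tunnel by hand might look roughly like the following (all values are hypothetical placeholders; the generated scripts in `./config` contain the exact commands):

```shell
# forward local port 3000 through the bastion to an upstream node
ssh -i <private-key> -N -L 3000:<upstream-node-ip>:443 <user>@<bastion-ip>
```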
### Cleanup
All created infrastructure can be destroyed via:
```shell
terraform destroy -auto-approve
```
## Troubleshooting
- if the error below is produced:
```
Error: creating EC2 Instance: VcpuLimitExceeded: You have requested more vCPU capacity than your current vCPU limit of 32 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.
```
Then you need to request a higher limit for your account from AWS. This can be done by visiting the [Service Quotas](https://console.aws.amazon.com/servicequotas/home) page and filling in the details to request an increase of the "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances" limit.
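
Alternatively, the increase can be requested from the command line; a sketch, assuming the quota code `L-1216C47A` corresponds to that limit:

```shell
# request a higher vCPU quota for On-Demand Standard instances
aws service-quotas request-service-quota-increase \
  --service-code ec2 \
  --quota-code L-1216C47A \
  --desired-value 64   # target number of vCPUs
```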