add docs for 20221003 test

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Silvio Moioli 2022-10-05 23:05:10 +02:00
parent 57f5a86f90
commit b45fd81a6a
1 changed file with 163 additions and 0 deletions


@ -0,0 +1,163 @@
# 2022-10-03 - 300 pods per node test
## Results
- a 6-node RKE2 cluster is set up with a limit of 300 pods per node (1800 pods maximum cluster capacity)
- an `nginx` workload at 90% pod capacity (270 pods per node / 1620 pods total) is scheduled and health checks stay green
- some day-2 operations are performed on the cluster, with the workload returning to a healthy state (after a brief transition period) at every step:
- addition (uncordoning) of one node
- full workload redeployment
- removal (draining) of one node
- upgrade of RKE2 on all cluster nodes
- CPU overhead at rest (kubelet, container runtime, etc.): ~15% (of a 4-vCPU 3.1 GHz Intel Xeon)
The test is completely automated, takes ~1 hour, and costs ~2 USD in AWS resources.
[Screen captures and logs](https://mysuse-my.sharepoint.com/:f:/g/personal/moio_suse_com/Esqw92qeMS5AkAwn6PaV-O0B8tADbJ9rK9zRjqL-yKIGMQ?e=OUbMaz) are available to SUSE employees.
## Methodology notes
- 300 pods per node was chosen because it is higher than 254 pods per node, the maximum possible with the default /24 node CIDR range
- a /22 node CIDR range was assigned (1024 addresses per node), as [Google recommends](https://cloud.google.com/kubernetes-engine/docs/how-to/flexible-pod-cidr#cidr_ranges_for_clusters) having at least twice as many available IP addresses as pods
- workload pods were scheduled up to 90% of the maximum (270 pods per node / 1620 pods total). That still leaves 30 pods per node for system components and is still higher than 254 (see the verification sketch at the end of this list)
- the downstream cluster is imported because AWS provisioning [is not supported with temporary session tokens at the time of writing](https://github.com/rancher/rancher/issues/15962)
- the downstream cluster RKE2 upgrade is performed via the [installation script method](https://github.com/rancher/rke2/blob/v1.25.2%2Brke2r1/docs/upgrade/basic_upgrade.md#upgrade-rke2-using-the-installation-script) because it's the only option with imported clusters at the time of writing
- upgrading via the installation script implies draining each node being upgraded; one extra node is added to the cluster for that purpose
- bugs not related to scalability have been worked around
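
As a sanity check, the resulting capacity and CIDR assignment can be verified with kubectl (a minimal sketch, assuming `kubectl` points at the downstream cluster):

```shell
# each node should advertise 300 allocatable pods
kubectl get nodes -o custom-columns='NAME:.metadata.name,PODS:.status.allocatable.pods'

# each node should have been assigned a /22 pod CIDR
kubectl get nodes -o custom-columns='NAME:.metadata.name,CIDR:.spec.podCIDR'
```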
## Test outline
- infrastructure is set up:
- AWS hardware (VMs, network devices...) is deployed
- RKE2 is installed on upstream cluster nodes, and Rancher is installed on top of it
- RKE2 is installed on the downstream cluster
- test is conducted:
- initial admin user is set up ([detail](../cypress/cypress/e2e/users.cy.js))
- downstream cluster is imported into Rancher ([detail](../cypress/cypress/e2e/imported-clusters.cy.js))
- an nginx deployment ([detail](../cypress/cypress/e2e/deployment.yaml)) is applied ([detail](../cypress/cypress/e2e/workloads.cy.js)); the workload steps that follow are sketched as kubectl commands after this list
- deployment replicas are scaled up to 1620
- one worker node is drained
- deployment is recycled
- worker node is uncordoned
- deployment is recycled again
- RKE2 is updated, node by node ([detail](../cypress/cypress/e2e/rke2-update.cy.js))
- logs are collected
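
For reference, the workload steps above map roughly onto the following kubectl commands. This is an illustrative sketch only: the deployment name `nginx` and `<node-name>` are placeholders, and the test itself drives these operations through the Rancher UI via Cypress:

```shell
kubectl scale deployment nginx --replicas=1620                        # scale the workload up
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data  # remove one worker
kubectl rollout restart deployment nginx                              # recycle the deployment
kubectl uncordon <node-name>                                          # re-add the worker
kubectl rollout status deployment nginx                               # wait for pods to become healthy again
```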
## AWS Hardware configuration outline
- bastion host (for SSH tunnelling only): `t4g.small`, 50 GiB EBS `gp3` root volume
- Rancher cluster: 3-node `t3.large`, 50 GiB EBS `gp3` root volume
- downstream cluster: 6 to 7 nodes, `t3.xlarge`, 50 GiB EBS `gp3` root volume
- networking: one /16 AWS VPC with two /24 subnets (a hypothetical AWS CLI sketch follows this list)
- public subnet: contains only the bastion host, which exposes port 22 to the Internet via security groups
- private subnet: contains all other nodes; traffic is allowed only internally and to/from the bastion via SSH
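
A hypothetical AWS CLI equivalent of this network layout (CIDR values are illustrative; the actual definitions live in the Terraform files):

```shell
aws ec2 create-vpc --cidr-block 10.0.0.0/16                        # one /16 VPC
aws ec2 create-subnet --vpc-id <vpc-id> --cidr-block 10.0.0.0/24   # public subnet (bastion host)
aws ec2 create-subnet --vpc-id <vpc-id> --cidr-block 10.0.1.0/24   # private subnet (all other nodes)
```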
References:
- [instance types](https://aws.amazon.com/ec2/instance-types/)
- [EBS](https://aws.amazon.com/ebs/)
- [VPC](https://aws.amazon.com/vpc/)
## Software configuration outline
- bastion host: SLES 15 SP4
- Rancher cluster: Rancher 2.6.5 on a 3-node RKE2 v1.23.10+rke2r1 cluster
- all nodes based on Rocky Linux 8.6
- downstream cluster: RKE2 v1.22.13+rke2r1, 3 server nodes and 4 agent nodes
- all nodes based on Rocky Linux 8.6
The limit of 300 pods per node is set via the following steps (consolidated into a shell sketch after the list):
- adding a `kube-controller-manager-arg: "node-cidr-mask-size=22"` line to `/etc/rancher/rke2/config.yaml`
- adding a `kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config"` line to `/etc/rancher/rke2/config.yaml`
- creating a `/etc/rancher/rke2/kubelet-custom.config` file with the following content:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 300
```
- restarting the `rke2-server` service
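
Taken together, the steps above might look like this sketch when run as root on a server node (the authoritative version is the installation script linked below):

```shell
# append the custom arguments to the RKE2 configuration
cat >> /etc/rancher/rke2/config.yaml <<EOF
kube-controller-manager-arg: "node-cidr-mask-size=22"
kubelet-arg: "config=/etc/rancher/rke2/kubelet-custom.config"
EOF

# create the custom kubelet configuration raising the pod limit
cat > /etc/rancher/rke2/kubelet-custom.config <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 300
EOF

# apply the new configuration
systemctl restart rke2-server
```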
See [the rke2 installation script in this repo](../rke2/install_rke2.sh) for details.
## Full configuration details
All infrastructure is defined via [Terraform](https://www.terraform.io/) files in the [20221003_300_pods_per_node](https://github.com/moio/scalability-tests/tree/20221003_300_pods_per_node) branch. Note in particular [inputs.tf](../inputs.tf) for the main parameters.
All tests are driven by [Cypress](https://www.cypress.io/) files in the [cypress](https://github.com/moio/scalability-tests/tree/20221003_300_pods_per_node/cypress) directory.
## Reproduction Instructions
### Requirements
- API access to EC2 configured for your terminal
- for SUSE Engineering:
- [have "AWS Landing Zone" added to your Okta account](https://confluence.suse.com/display/CCOE/Requesting+AWS+Access)
- open [Okta](https://suse.okta.com/) -> "AWS Landing Zone"
- click on "AWS Account" -> your account -> "Command line or programmatic access" -> click to copy the commands under "Option 1: Set AWS environment variables"
- paste the contents into your terminal
- [Terraform](https://www.terraform.io/downloads)
- `git`
- `node`
- `kubectl`
- `nc` (netcat)
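
A quick check that all command-line prerequisites are available:

```shell
# report any missing tool on the PATH
for cmd in terraform git node kubectl nc; do
  command -v "$cmd" >/dev/null || echo "missing: $cmd"
done
```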
### Setup
- clone this project:
```shell
git clone https://github.com/moio/scalability-tests.git
cd scalability-tests
git checkout 20221003_300_pods_per_node
```
- initialize Terraform and Cypress:
```shell
terraform init
pushd cypress
npm install cypress --save-dev
popd
```
### Run
```shell
# deploys all infrastructure
terraform apply -auto-approve
# runs tests
cd cypress; ./node_modules/cypress/bin/cypress run
```
### Inspect results
- `cypress/cypress/logs` contains log tarballs collected from all downstream nodes
- `cypress/cypress/videos` contains screen captures of test runs
- Terraform output will contain instructions to access the newly created clusters, e.g.:
```
UPSTREAM CLUSTER ACCESS:
export KUBECONFIG=./config/upstream.yaml
RANCHER UI:
https://upstream.local.gd:3000
DOWNSTREAM CLUSTER ACCESS:
export KUBECONFIG=./config/downstream.yaml
```
- `./config` contains scripts to SSH into any VM and reopen SSH tunnels to ports 3000 (Web UI), 6443 (upstream cluster API) and 6444 (downstream cluster API); a hypothetical tunnel command is sketched after this list
- Cypress tests can be run interactively via:
```shell
cd cypress
npx cypress open
```
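
Reopening a tunnel by hand might look roughly like the following (all values are hypothetical placeholders; the generated scripts in `./config` contain the exact commands):

```shell
# forward local port 3000 through the bastion to an upstream node
ssh -i <private-key> -N -L 3000:<upstream-node-ip>:443 <user>@<bastion-ip>
```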
### Cleanup
All created infrastructure can be destroyed via:
```shell
terraform destroy -auto-approve
```
## Troubleshooting
- if the error below is produced:
```
Error: creating EC2 Instance: VcpuLimitExceeded: You have requested more vCPU capacity than your current vCPU limit of 32 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.
```
Then you need to request a higher limit for your account from AWS. This can be done by visiting the [Service Quotas](https://console.aws.amazon.com/servicequotas/home) page and filling in the details to request an increase of the "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances" limit.
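
Alternatively, the increase can be requested from the command line; a sketch, assuming the quota code `L-1216C47A` corresponds to that limit:

```shell
# request a higher vCPU quota for On-Demand Standard instances
aws service-quotas request-service-quota-increase \
  --service-code ec2 \
  --quota-code L-1216C47A \
  --desired-value 64   # target number of vCPUs
```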