diff --git a/docs/20221215 - kine locality test.md b/docs/20221215 - kine locality test.md
new file mode 100644
index 0000000..5a22022
--- /dev/null
+++ b/docs/20221215 - kine locality test.md
@@ -0,0 +1,92 @@
+# 2022-12-15 - Kine locality test
+
+## Results
+
+No significant difference in list performance of small ConfigMaps, up to 256K of them, running Kine:
+ - on the same machine running Postgres
+ - or inside of the k3s cluster, with Postgres running in RDS
+
+(the machine running Postgres is set up on Amazon Linux, using the same Postgres version as RDS and the same virtual hardware, to make the comparison as fair as possible)
+
+[Details in Excel format](https://mysuse-my.sharepoint.com/:x:/g/personal/moio_suse_com/EW4-iiB6Ib5PoMHO_Am6m_4BAISQCb7ozW5BwDx_ACMpfQ) are available to SUSE employees.
+
+## Methodology notes
+
+- no workload is running on the local cluster, no downstream clusters are registered
+- every measurement is repeated 10 times and the average and standard deviation are calculated. All observed standard deviations appeared low enough to support the conclusions above
+- API calls under benchmark are accessed via pure REST, setup calls use a Kubernetes client
+- all responses are checked for errors (non-200 return codes). All results with errors were discarded; the conclusions above use non-error cases only
+- pauses are used before running each benchmark to ensure previous operations do not interfere (e.g. handlers triggered asynchronously after resource creation)
+- entire environment is destroyed and recreated via Terraform for each test
+
+## Test outline
+- infrastructure is set up:
+  - AWS hardware (VMs, network devices, databases...) is deployed
+  - k3s is installed on cluster nodes
+- test is conducted:
+  - the benchmark script is run (see below)
+  - results are collected in CSV form and processed in Excel
+
+## Full configuration details
+
+All infrastructure is defined via [Terraform](https://www.terraform.io/) files in the [20221215_kine_locality_test](https://github.com/moio/scalability-tests/tree/20221215_kine_locality_test/terraform) branch.
+The benchmark Python script is available in the [util](https://github.com/moio/scalability-tests/tree/20221215_kine_locality_test/util) directory.
+
+## Reproduction Instructions
+
+### Requirements
+
+- API access to EC2 configured for your terminal
+  - for SUSE Engineering:
+    - [have "AWS Landing Zone" added to your Okta account](https://confluence.suse.com/display/CCOE/Requesting+AWS+Access)
+    - open [Okta](https://suse.okta.com/) -> "AWS Landing Zone"
+    - click on "AWS Account" -> your account -> "Command line or programmatic access" -> click to copy commands under "Option 1: Set AWS environment variables"
+    - paste contents in terminal
+- [Terraform](https://www.terraform.io/downloads)
+- `git`
+- `nc` (netcat)
+
+### Setup
+
+- clone this project:
+```shell
+git clone https://github.com/moio/scalability-tests.git
+cd scalability-tests
+git checkout 20221215_kine_locality_test
+```
+- initialize Terraform:
+```shell
+cd terraform
+terraform init
+```
+
+### Run
+
+Configure and deploy the AWS infrastructure:
+ - edit `terraform/inputs.tf` (specifically: `ssh_private_key_path`, `ssh_public_key_path`)
+ - deploy and configure infrastructure:
+```shell
+terraform apply
+```
+
+Execute the benchmark:
+```shell
+./config/ssh-to-upstream-server-node-0-*.sh KUBECONFIG=/etc/rancher/k3s/k3s.yaml python3 - 0 1000 256000 10 4 <./util/benchmark_k8s_config_maps.py | tee -a results.csv
+```
+
+Elements of the line above have the following meaning:
+ - `./config/ssh-to-upstream-server-node-0-*.sh` opens an SSH shell to the first server node of the cluster
+ - `KUBECONFIG` points to the configuration file on the first cluster node
+ - `0 1000 256000` indicate the number of resources to create before each benchmark measurement: starting from 0, first step at 1000, then doubling the number up to 256000
+ - `10 4` indicate the number of repetitions for each benchmark and the number of bytes in the ConfigMap data field respectively
+ - `./util/benchmark_k8s_config_maps.py` is the benchmark script
+ - `| tee -a results.csv` saves results into a file that can be opened in a spreadsheet editor
+
+The benchmark can be repeated after reverting the "Switch back to RDS backed kine" commit to observe the difference between the two environments.
+
+### Cleanup
+
+All created infrastructure can be destroyed via:
+```shell
+terraform destroy -auto-approve
+```