# FAQ

### How many QPS can kube-apiserver support?

There is NO single answer to this question. The reason is that every Kubernetes request is potentially very different. As a result, kube-apiserver may be able to process N requests per second in a given usage pattern, but may not be able to process even N/5 in a different usage pattern.

To be more specific, the difference is fairly obvious if we compare GET and LIST requests (e.g. "GET my-pod" vs "LIST all pods"; see the sketch below). It's also not counter-intuitive that there is a difference between read-only and mutating requests (e.g. "GET my-pod" vs "POST my-pod"). However, even if we focus only on a single type of request (let's say just POST or just PUT), the differences can be orders of magnitude. The two main factors to think about are:

- size of the object - there is a huge difference between a small Lease object (say ~300B) and a large custom resource of e.g. 1MB
- fan-out - if a given object is being watched by N watchers, you suddenly get a multiplication factor for the cost of your request that, as the sender of a single API call, you don't fully control

As a result, we are consciously not providing any values, because one can imagine a cluster that handles thousands of QPS, while a cluster set up the same way may have trouble handling even small tens of QPS if the requests are expensive enough.
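To make the GET vs LIST contrast concrete, here is a minimal client-go sketch (the pod name `my-pod` and the namespace `default` are placeholders, and it assumes a reachable cluster configured in the default kubeconfig location). The GET touches a single, typically small object, while the cluster-wide LIST forces kube-apiserver to read and serialize every pod in the cluster:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a reachable cluster configured in ~/.kube/config.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Cheap: fetches a single, typically small object ("GET my-pod").
	pod, err := client.CoreV1().Pods("default").Get(context.TODO(), "my-pod", metav1.GetOptions{})
	if err == nil {
		fmt.Println("got pod:", pod.Name)
	}

	// Potentially very expensive: the apiserver must read and serialize
	// every pod in the cluster ("LIST all pods"); on a 5000-node cluster
	// this is a multi-megabyte response.
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err == nil {
		fmt.Println("listed pods:", len(pods.Items))
	}
}
```

Combining the two bullets above also gives a useful back-of-envelope model: a single write of a 1MB object watched by 100 clients implies on the order of 100MB of apiserver egress for that one request, while a 300B Lease update with a single watcher costs well under a kilobyte.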
### How do you set up clusters for scalability testing?

We test Kubernetes at two levels of scale: 100 nodes and 5000 nodes. Given that Kubernetes scalability is a multidimensional problem, we obviously try to exercise many other dimensions in our tests. However, all our scalability tests are performed on clusters with just a single large control-plane machine (VM), with all control-plane components running on that machine. The single large machine provides sufficient scalability to let us run our load tests on a cluster with 5000 nodes.

The control-plane VM used for testing 5000-node clusters has 64 cores, 256 GB of RAM and is backed by a 200GB SSD persistent disk. Public cloud providers typically support even larger machines, so as a user you still have some slack here.

We also run `simulated` clusters ([Kubemark]) where the control-plane VM is set up the same way as in a regular cluster, but many simulated nodes run on a single VM (to simulate 5000-node clusters, we run ~80 machines with 60-ish hollow-nodes per machine). However, those are currently somewhat auxiliary, and release validation is done using regular clusters set up on the GCE cloud provider.

[Kubemark]: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-scalability/kubemark-guide.md

### Why are you not testing HA clusters (with multiple control-plane instances)?

Honestly, we would like to, but we lack the capacity to do that. The main reason is the tooling for setting up clusters - we still rely on kube-up and the bunch of knobs we have there, and kube-up doesn't really support HA clusters. Migrating to anything else is a lot of work, and at the same time we would like to do it consistently for all other jobs running on GCE (which all still rely on kube-up).

We are aware that this is a gap in our testing, for several reasons:

* large production clusters generally run at least 3 control-plane instances to ensure tolerance to outages of individual instances, zero-downtime upgrades, etc. - we should be testing what users are doing in real life
* etcd is Raft-based, so a cluster of etcd instances has slightly different performance characteristics than a single etcd server
* distributed systems often show different performance characteristics when the components are separated by a network
* distributed systems may be affected by inconsistency of caches

Noting the differences between various configurations is also important, as this can give an indication of which optimizations would have the best return on investment.
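None of the following is part of the kube-up based tooling described above; it is purely a hypothetical sketch of how such differences could be quantified: point a kubeconfig at each configuration in turn and time identical requests (the helper `timeGets` is illustrative, not an existing tool):

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// timeGets issues n identical GET requests against the apiserver and
// returns the average per-request latency.
func timeGets(client *kubernetes.Clientset, n int) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := client.CoreV1().Namespaces().Get(context.TODO(), "default", metav1.GetOptions{}); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// Point the default kubeconfig at each configuration under comparison in turn.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	avg, err := timeGets(client, 100)
	if err != nil {
		panic(err)
	}
	fmt.Println("average GET latency:", avg)
}
```

Comparing such numbers between a single-instance and a 3-instance control plane would make the etcd-clustering and network effects listed above directly visible.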