Make the references to the figure a bit more clear.

Signed-off-by: Milan Plzik <milan.plzik@grafana.com>
Milan Plzik 2023-10-31 14:52:37 +01:00
parent 24d2b57f68
commit 8a724f966e
1 changed file with 2 additions and 2 deletions


@@ -37,11 +37,11 @@ Note that in both cases discussed above, we're effectively preventing the work
### Bin-packing and cluster resource allocation
Firstly, let's discuss bin-packing and cluster resource allocation. There's some inherent cluster inefficiency that comes into play: it's hard to achieve 100% resource allocation in a Kubernetes cluster. Thus, some percentage will be left unallocated.
-When configuring fixed-fraction headroom limits, a proportional amount of this will be available to the pods. If the percentage of unallocated resources in the cluster is lower than the constant we use for setting fixed-fraction headroom limits (2), all the pods together are able to theoretically use up all the nodes' resources; otherwise there are some resources that will inevitably be wasted (1). In order to eliminate the inevitable resource waste, the percentage for fixed-fraction headroom limits should be configured so that it's at least equal to the expected percentage of unallocated resources.
+When configuring fixed-fraction headroom limits, a proportional amount of this will be available to the pods. If the percentage of unallocated resources in the cluster is lower than the constant we use for setting fixed-fraction headroom limits (see the figure, line 2), all the pods together are able to theoretically use up all the nodes' resources; otherwise there are some resources that will inevitably be wasted (see the figure, line 1). In order to eliminate the inevitable resource waste, the percentage for fixed-fraction headroom limits should be configured so that it's at least equal to the expected percentage of unallocated resources.
{{<figure alt="Chart displaying various requests/limits configurations" width="40%" src="requests-limits-configurations.svg">}}
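As a minimal sketch of the fixed-fraction headroom approach (illustrative values only, not part of the original post), the limits below are set to a constant 2x of the requests, so the container can burst into otherwise unallocated node capacity:

```yaml
# Sketch: fixed-fraction headroom, with limits a constant 2x multiple of requests.
apiVersion: v1
kind: Pod
metadata:
  name: headroom-example      # hypothetical name, for illustration only
spec:
  containers:
    - name: app
      image: nginx:1.25       # placeholder image
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
        limits:
          cpu: "500m"         # 2x the CPU request
          memory: "512Mi"     # 2x the memory request
```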
-For requests = limits (3), this does not hold: Unless we're able to allocate all nodes' resources, there's going to be some inevitably wasted resources. Without any knobs to turn on the requests/limits side, the only suitable approach here is to ensure efficient bin-packing on the nodes by configuring correct machine profiles. This can be done either manually or by using a variety of cloud service provider tooling, for example [Karpenter](https://karpenter.sh/) for EKS or [GKE Node auto provisioning](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning).
+For requests = limits (see the figure, line 3), this does not hold: Unless we're able to allocate all nodes' resources, there's going to be some inevitably wasted resources. Without any knobs to turn on the requests/limits side, the only suitable approach here is to ensure efficient bin-packing on the nodes by configuring correct machine profiles. This can be done either manually or by using a variety of cloud service provider tooling, for example [Karpenter](https://karpenter.sh/) for EKS or [GKE Node auto provisioning](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning).
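For contrast, a similarly hedged sketch of the requests = limits setup (again with made-up values): with requests equal to limits for every resource, the pod lands in the Guaranteed QoS class and has no burst headroom at all:

```yaml
# Sketch: requests equal to limits (Guaranteed QoS), so the pod can never use
# more than it has reserved, regardless of free capacity on the node.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-example    # hypothetical name, for illustration only
spec:
  containers:
    - name: app
      image: nginx:1.25       # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "500m"         # identical to the request
          memory: "512Mi"
```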
### Optimizing actual resource utilization
Free resources also come in the form of unused resources of other pods (reserved vs. actual CPU utilization, etc.), and their availability can't be predicted in any reasonable way. Configuring limits makes it next to impossible to utilize these. Looking at this from a different perspective, if a workload wastes a significant amount of the resources it has requested, revisiting its own resource requests might be a fair thing to do. Looking at past data and picking more fitting resource requests might help make the packing tighter (although at the price of worsening its performance, for example by increasing long-tail latencies).
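One possible way to act on past data (an illustration on my part, not something the post prescribes) is to run the Vertical Pod Autoscaler in recommendation-only mode and read its suggested requests from the object's status, rather than letting it resize pods:

```yaml
# Sketch: VPA in recommendation-only mode; it observes historical usage and
# publishes suggested requests in its status, but never evicts or resizes pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa               # hypothetical name, for illustration only
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                 # hypothetical target workload
  updatePolicy:
    updateMode: "Off"         # recommendations only, no automatic updates
```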