
---
title: divide replicas by static weight evenly
authors:
- "@chaosi-zju"
reviewers:
- "@RainbowMango"
- "@XiShanYongYe-Chang"
- "@zhzhuang-zju"
approvers:
- "@RainbowMango"
creation-date: 2023-11-11
---

# Divide replicas by static weight evenly

## Summary

Karmada supports specifying the number of replicas to be allocated to each member cluster by a static weight ratio. For example, a user specifies that the total replicas of a Deployment is 5 and requires them to be distributed into two clusters, member1 and member2, with a static weight of 1:1.

According to the current implementation, the result is always 3 replicas for member1 and 2 replicas for member2 (5 is an odd number, so it cannot be divided equally, and member1 always gets the extra replica).

This is because, for clusters with equal weights, the current implementation always gives the remainder to the cluster whose name comes first in dictionary order (the remainder is the number of replicas left over when the total replicas cannot be exactly divided by the static weights).

If there are many such Deployments, the total number of replicas on member1 will be significantly higher than on member2, i.e., there is a potential problem of uneven replica distribution.

Therefore, to solve the uneven replica allocation problem without affecting scheduling inertia, I propose to optimize the ordering of clusters when allocating the remainder as follows:

- Sort the clusters first by weight.
- If the weights are equal, sort the clusters by their current number of replicas (a larger current number of replicas implies that the remainder of the last scheduling was randomly assigned to those clusters; to keep inertia in this scheduling, such clusters should be prioritized again).
- If both the weights and the current numbers of replicas are equal, order the clusters randomly.

## Motivation

### Problem Summary

The essence of the above case is that the total replicas are not divisible by the sum of the weights, so a remainder is generated.

When clusters have equal weights, the remainder is preferentially allocated to the cluster whose name comes first in dictionary order, which causes the uneven distribution of replicas.

Therefore, Karmada needs to:

- Assign the remainder to clusters with equal weights with equal probability.
- Ensure that rescheduling keeps the inertia of the previous scheduling result.

### Current Implementation

#### Example

```
sum replicas = 7
clusters = [member1, member2, member3]
weight = 2 : 1 : 1
```

#### Calculation Formula

```
each cluster allocations = (total replicas * each cluster weight) / sum weights

remainders = total replicas - sum(each cluster allocations)
```
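
A minimal Go sketch of this formula, assuming plain integer arithmetic (the function and variable names are illustrative, not the actual Karmada identifiers): integer division floors each per-cluster allocation, and whatever is left over becomes the remainder to distribute.

```go
package scheduler

// allocateByWeight is an illustrative sketch of the formula above (not the
// actual Karmada code): each cluster gets floor(totalReplicas * weight /
// sumWeights) replicas, and the leftover becomes the remainder.
func allocateByWeight(totalReplicas int64, weights []int64) (allocations []int64, remainder int64) {
	var sumWeights int64
	for _, w := range weights {
		sumWeights += w
	}

	allocations = make([]int64, len(weights))
	var allocated int64
	for i, w := range weights {
		allocations[i] = totalReplicas * w / sumWeights // floored by integer division
		allocated += allocations[i]
	}
	return allocations, totalReplicas - allocated
}
```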

#### Detail Steps

- Calculate cluster allocations and remainders:

  ```
  member1 = 7 * 2 / 4 = 3
  member2 = 7 * 1 / 4 = 1
  member3 = 7 * 1 / 4 = 1

  so, remainders = 7 - (3 + 1 + 1) = 2
  ```

- Sort the clusters: first by weight, then by dictionary order of cluster name (member1 > member2 > member3).
- Assign the remainders to the sorted clusters one by one (one for member1 and another for member2).
- Eventually: member1 = 4, member2 = 2, member3 = 1.
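
The following Go sketch reproduces the current behaviour described in the steps above (the type and function names are illustrative assumptions, not the actual Karmada identifiers): clusters are ordered by weight descending and, on equal weight, by ascending dictionary order of the cluster name, and the remainder is then handed out one replica at a time from the front of that order.

```go
package scheduler

import "sort"

// clusterAssignment pairs a member cluster with its static weight and the
// replicas produced by the floor-division step above.
type clusterAssignment struct {
	Name     string
	Weight   int64
	Replicas int64
}

// assignRemainderCurrent mirrors the current behaviour: sort by weight
// descending, then by ascending dictionary order of the cluster name, and
// give each of the first `remainder` clusters one extra replica.
func assignRemainderCurrent(clusters []clusterAssignment, remainder int64) {
	sort.Slice(clusters, func(i, j int) bool {
		if clusters[i].Weight != clusters[j].Weight {
			return clusters[i].Weight > clusters[j].Weight
		}
		return clusters[i].Name < clusters[j].Name
	})
	for i := int64(0); i < remainder; i++ {
		clusters[i].Replicas++
	}
}
```

With the example above (allocations 3, 1, 1 and remainders 2), this ordering always produces member1 = 4, member2 = 2, member3 = 1.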

### Expected goals

As in the above example, since member2 and member3 have equal weights, the remainder is always assigned to member2 first, which is not expected.

We expect it to be assigned to member2 and member3 with equal probability.

## Proposal

Since the unevenness of the current implementation is caused by sorting the clusters in dictionary order, the simplest fix is to randomize the order directly when the weights are equal.

However, scheduling inertia must be taken into account, so that rescheduling results are not skewed by the introduced randomization.

Therefore, to solve the problem of uneven replica allocation, I propose to optimize the ordering of clusters when allocating the remainder as follows:

1. Sort the clusters first by weight.
2. If the weights are equal, sort the clusters by their current number of replicas (a larger current number of replicas implies that the remainder of the last scheduling was randomly assigned to those clusters; to keep inertia in this scheduling, such clusters should be prioritized again).
3. If both the weights and the current numbers of replicas are equal, order the clusters randomly.
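
A minimal Go sketch of this ordering follows (the type and function names are illustrative assumptions, not the actual Karmada identifiers). Shuffling first and then applying a stable sort keyed on weight and current replicas gives fully tied clusters a random relative order while keeping the two deterministic keys intact.

```go
package scheduler

import (
	"math/rand"
	"sort"
)

// weightedCluster carries what the proposed ordering needs: the static weight
// and the number of replicas currently scheduled to the cluster.
type weightedCluster struct {
	Name            string
	Weight          int64
	CurrentReplicas int64
}

// sortForRemainder orders clusters for remainder assignment as proposed:
// higher weight first; on equal weight, more current replicas first (to keep
// the inertia of the previous scheduling); on a full tie, a random order so
// the remainder lands on equal clusters with equal probability.
func sortForRemainder(clusters []weightedCluster) {
	// Shuffle first so that clusters tied on both keys end up in a random
	// relative order; the stable sort below preserves that order for ties.
	rand.Shuffle(len(clusters), func(i, j int) {
		clusters[i], clusters[j] = clusters[j], clusters[i]
	})
	sort.SliceStable(clusters, func(i, j int) bool {
		if clusters[i].Weight != clusters[j].Weight {
			return clusters[i].Weight > clusters[j].Weight
		}
		return clusters[i].CurrentReplicas > clusters[j].CurrentReplicas
	})
}
```

The remainder is then still handed out one replica at a time from the front of the resulting order; only the ordering rule changes compared with the current implementation.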

### User Stories

#### Story 1: Expanded replicas scenario

In the expanding replicas scenario, we expect replicas to be directly increased in some clusters; rescheduling with randomness should not produce a different distribution that increases replicas in some clusters while reducing them in others.

This proposal avoids that problem:

```
sum replicas = 7
clusters = [member1, member2, member3, member4]
weight = 2 : 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 2, 1, 1, 1
   remainders = 2                          // 7*2/5=2, 7*1/5=1
2. Sorting:
   member1 > member2 = member3 = member4
3. One possible result:
   3, 2, 1, 1
```

Supposing the sum replicas change from 7 to 8:

```
1. Calculation:
   each cluster allocations = 3, 1, 1, 1
   remainders = 2                          // 8*2/5=3, 8*1/5=1
2. Sorting:
   member1 > member2 > member3 = member4   // member2 has more current replicas than member3 / member4
3. Result:
   4, 2, 1, 1   (the result keeps inertia)
```
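
As a hypothetical, self-contained walk-through of this story (the names are illustrative only), the following Go program applies the proposed ordering to the rescheduling from 7 to 8 replicas, starting from the previously scheduled replicas 3, 2, 1, 1; it always ends with member1 = 4 and member2 = 2, no matter how the tied clusters member3 and member4 are shuffled.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// memberCluster holds the inputs of the example: static weight and the
// replicas scheduled in the previous round.
type memberCluster struct {
	Name            string
	Weight          int64
	CurrentReplicas int64
	Assigned        int64
}

func main() {
	clusters := []memberCluster{
		{Name: "member1", Weight: 2, CurrentReplicas: 3},
		{Name: "member2", Weight: 1, CurrentReplicas: 2},
		{Name: "member3", Weight: 1, CurrentReplicas: 1},
		{Name: "member4", Weight: 1, CurrentReplicas: 1},
	}
	const totalReplicas int64 = 8 // sum replicas changed from 7 to 8

	// Step 1: floor division and remainder.
	var sumWeights, allocated int64
	for _, c := range clusters {
		sumWeights += c.Weight
	}
	for i := range clusters {
		clusters[i].Assigned = totalReplicas * clusters[i].Weight / sumWeights
		allocated += clusters[i].Assigned
	}
	remainder := totalReplicas - allocated // 8 - (3+1+1+1) = 2

	// Step 2: proposed ordering: weight desc, then current replicas desc,
	// then random (shuffle + stable sort randomizes only full ties).
	rand.Shuffle(len(clusters), func(i, j int) {
		clusters[i], clusters[j] = clusters[j], clusters[i]
	})
	sort.SliceStable(clusters, func(i, j int) bool {
		if clusters[i].Weight != clusters[j].Weight {
			return clusters[i].Weight > clusters[j].Weight
		}
		return clusters[i].CurrentReplicas > clusters[j].CurrentReplicas
	})

	// Step 3: hand out the remainder one replica at a time.
	for i := int64(0); i < remainder; i++ {
		clusters[i].Assigned++
	}

	for _, c := range clusters {
		fmt.Printf("%s=%d ", c.Name, c.Assigned)
	}
	fmt.Println() // always: member1=4 member2=2, plus member3=1 member4=1
}
```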

#### Story 2: Reduced replicas scenario

In the reducing replicas scenario, we expect replicas to be directly reduced in some clusters; rescheduling with randomness should not produce a different distribution that increases replicas in some clusters while reducing them in others.

This proposal avoids that problem:

```
sum replicas = 9
clusters = [member1, member2, member3, member4]
weight = 2 : 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 3, 1, 1, 1
   remainders = 3                          // 9*2/5=3, 9*1/5=1
2. Sorting:
   member1 > member2 = member3 = member4
3. One possible result:
   4, 2, 2, 1
```

Supposing the sum replicas change from 9 to 8:

```
1. Calculation:
   each cluster allocations = 3, 1, 1, 1
   remainders = 2                          // 8*2/5=3, 8*1/5=1
2. Sorting:
   member1 > member2 = member3 > member4   // member2 and member3 have more current replicas than member4
3. Result:
   4, 2, 1, 1 or 4, 1, 2, 1   (the result keeps inertia)
```

#### Story 3: Modifying weight scenario

Modifying the weights generally means increasing or decreasing the replicas of each cluster, so rescheduling can simply be recalculated.

However, if the weights are adjusted and the sum replicas is adjusted at the same time, it is important to preserve inertia as much as possible.

```
sum replicas = 6
clusters = [member1, member2, member3, member4]
weight = 1 : 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 1, 1, 1, 1
   remainders = 2                          // 6*1/4=1
2. Sorting:
   member1 = member2 = member3 = member4
3. One possible result:
   2, 1, 2, 1
```

(1) Supposing the weight changes to 2 : 1 : 1 : 1:

```
1. Calculation:
   each cluster allocations = 2, 1, 1, 1
   remainders = 1                          // 6*2/5=2, 6*1/5=1
2. Sorting:
   member1 > member3 > member2 = member4   // member3 has more current replicas than member2 / member4
3. Result:
   3, 1, 1, 1   (minimal adjustment based on weights)
```

(2) Supposing the weight changes to 2 : 1 : 1 : 1 and the sum replicas changes to 7:

```
1. Calculation:
   each cluster allocations = 2, 1, 1, 1
   remainders = 2                          // 7*2/5=2, 7*1/5=1
2. Sorting:
   member1 > member3 > member2 = member4   // member3 has more current replicas than member2 / member4
3. Result:
   3, 1, 2, 1   (the result keeps inertia)
```

#### Story 4: Expanded clusters scenario

```
sum replicas = 5
clusters = [member1, member2, member3]
weight = 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 1, 1, 1
   remainders = 2                          // 5*1/3=1
2. Sorting:
   member1 = member2 = member3
3. One possible result:
   2, 1, 2
```

Supposing the clusters are expanded:

```
clusters = [member1, member2, member3, member4]
weight = 1 : 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 1, 1, 1, 1
   remainders = 1                          // 5*1/4=1
2. Sorting:
   member1 = member3 > member2 > member4
3. Result:
   1, 1, 2, 1 or 2, 1, 1, 1   (equivalent to moving one replica from member1 / member3 to member4; the others remain unchanged)
```

#### Story 5: Reduced clusters scenario

```
sum replicas = 5
clusters = [member1, member2, member3, member4]
weight = 1 : 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 1, 1, 1, 1
   remainders = 1                          // 5*1/4=1
2. Sorting:
   member1 = member2 = member3 = member4
3. One possible result:
   2, 1, 1, 1
```

Supposing the clusters are reduced:

```
clusters = [member1, member2, member3]
weight = 1 : 1 : 1
```

```
1. Calculation:
   each cluster allocations = 1, 1, 1
   remainders = 2                          // 5*1/3=1
2. Sorting:
   member1 > member2 = member3
3. Result:
   2, 2, 1 or 2, 1, 2   (equivalent to moving member4's replica to member2 / member3; the others remain unchanged)
```