Merge pull request #3168 from eduartua/issue-3064-grouping-by-sig-scheduling
Grouping /devel files by SIGs - SIG Scheduling
This commit is contained in:
commit cb56a80017

@@ -2,7 +2,7 @@
 There are three ways to add new scheduling rules (predicates and priority
 functions) to Kubernetes: (1) by adding these rules to the scheduler and
-recompiling, [described here](/contributors/devel/scheduler.md),
+recompiling, [described here](/contributors/devel/sig-scheduling/scheduler.md),
 (2) implementing your own scheduler process that runs instead of, or alongside
 of, the standard Kubernetes scheduler, (3) implementing a "scheduler extender"
 process that the standard Kubernetes scheduler calls out to as a final pass when

@@ -1,90 +1,3 @@
# The Kubernetes Scheduler

The Kubernetes scheduler runs as a process alongside the other master components, such as the API server.
Its interface to the API server is to watch for Pods with an empty PodSpec.NodeName,
and for each such Pod, it posts a binding indicating where the Pod should be scheduled.

## Exploring the code

At a high level, the scheduler code is divided into three layers:
- [cmd/kube-scheduler/scheduler.go](http://releases.k8s.io/HEAD/cmd/kube-scheduler/scheduler.go):
  This is the `main()` entry point that performs initialization before calling the scheduler framework.
- [pkg/scheduler/scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/scheduler.go):
  This is the scheduler framework that handles everything beyond the scheduling algorithm itself (e.g., binding).
- [pkg/scheduler/core/generic_scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/core/generic_scheduler.go):
  The scheduling algorithm that assigns nodes to Pods.

## The scheduling algorithm
```
For given pod:

    +---------------------------------------------+
    |             Schedulable nodes:              |
    |                                             |
    | +--------+    +--------+      +--------+    |
    | | node 1 |    | node 2 |      | node 3 |    |
    | +--------+    +--------+      +--------+    |
    |                                             |
    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+

    Pred. filters: node 3 doesn't have enough resource

    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+
    |             remaining nodes:                |
    |   +--------+                 +--------+     |
    |   | node 1 |                 | node 2 |     |
    |   +--------+                 +--------+     |
    |                                             |
    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+

    Priority function:    node 1: p=2
                          node 2: p=5

    +-------------------+-------------------------+
                        |
                        |
                        v
            select max{node priority} = node 2
```

The scheduler tries to find a node for each Pod, one at a time.

- First, it applies a set of "predicates" to filter out inappropriate nodes. For example, if the PodSpec specifies resource requests, the scheduler filters out nodes that don't have enough available resources (computed as the capacity of the node minus the sum of the resource requests of the containers already running on the node).
- Second, it applies a set of "priority functions" that rank the nodes not filtered out by the predicate check. For example, it tries to spread Pods across nodes and zones while favoring the least (theoretically) loaded nodes (where "load" is measured, in theory, as the sum of the resource requests of the containers running on the node, divided by the node's capacity).
- Finally, the node with the highest priority is chosen (or, if there are multiple such nodes, one of them is chosen at random). The code for this main scheduling loop is in the function `Schedule()` in [pkg/scheduler/core/generic_scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/core/generic_scheduler.go).
### Predicate and priority policies

Predicates are a set of policies applied one by one to filter out inappropriate nodes.
Priorities are a set of policies applied one by one to rank the nodes that made it through the predicate filter.
By default, Kubernetes provides built-in predicate and priority policies documented in [scheduler_algorithm.md](scheduler_algorithm.md).
The predicate and priority code is defined in [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) and [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), respectively.

## Scheduler extensibility

The scheduler is extensible: the cluster administrator can choose which of the pre-defined
scheduling policies to apply, and can add new ones.

### Modifying policies

The policies that are applied when scheduling can be chosen in one of two ways.
The default policies are selected by the functions `defaultPredicates()` and `defaultPriorities()` in
[pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](https://git.k8s.io/examples/staging/scheduler-policy-config.json) for an example
config file. (Note that the config file format is versioned; the API is defined in [pkg/scheduler/api](http://releases.k8s.io/HEAD/pkg/scheduler/api/).)
Thus, to add a new scheduling policy, you should modify [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) or add to the directory [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), and either register the policy in `defaultPredicates()` or `defaultPriorities()`, or use a policy config file.

This file has moved to https://git.k8s.io/community/contributors/devel/sig-scheduling/scheduler.md.

This file is a placeholder to preserve links. Please remove by April 29, 2019, or the release of Kubernetes 1.13, whichever comes first.

@@ -1,40 +1,3 @@
# Scheduler Algorithm in Kubernetes

For each unscheduled Pod, the Kubernetes scheduler tries to find a node across the cluster according to a set of rules. A general introduction to the Kubernetes scheduler can be found at [scheduler.md](scheduler.md). This document explains the algorithm used to select a node for a Pod. There are two steps before a destination node is chosen: the first is filtering all the nodes, and the second is ranking the remaining nodes to find the best fit for the Pod.

## Filtering the nodes

The purpose of filtering is to remove nodes that do not meet certain requirements of the Pod. For example, if the free resource on a node (measured by the capacity minus the sum of the resource requests of all the Pods that already run on the node) is less than the Pod's required resource, the node should not be considered in the ranking phase, so it is filtered out. Currently, there are several "predicates" implementing different filtering policies, including:

- `NoDiskConflict`: Evaluate if a Pod can fit given the volumes it requests and those that are already mounted. Currently supported volume types are AWS EBS, GCE PD, iSCSI, and Ceph RBD. Only Persistent Volume Claims for those supported types are checked; Persistent Volumes added directly to Pods are not evaluated and are not constrained by this policy.
- `NoVolumeZoneConflict`: Evaluate if the volumes a Pod requests are available on the node, given the zone restrictions.
- `PodFitsResources`: Check if the free resources (CPU and memory) meet the requirements of the Pod. The free resource is measured by the capacity minus the sum of requests of all Pods on the node. To learn more about resource QoS in Kubernetes, please check the [QoS proposal](../design-proposals/node/resource-qos.md).
- `PodFitsHostPorts`: Check if any HostPort required by the Pod is already occupied on the node.
- `HostName`: Filter out all nodes except the one specified in the PodSpec's NodeName field.
- `MatchNodeSelector`: Check if the labels of the node match the labels specified in the Pod's `nodeSelector` field and, as of Kubernetes v1.2, also match the `nodeAffinity` if present. See [here](https://kubernetes.io/docs/user-guide/node-selection/) for more details on both.
- `MaxEBSVolumeCount`: Ensure that the number of attached ElasticBlockStore volumes does not exceed a maximum value (by default 39, since Amazon recommends a maximum of 40, with one of those 40 reserved for the root volume -- see [Amazon's documentation](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html#linux-specific-volume-limits)). The maximum value can be controlled by setting the `KUBE_MAX_PD_VOLS` environment variable.
- `MaxGCEPDVolumeCount`: Ensure that the number of attached GCE PersistentDisk volumes does not exceed a maximum value (by default 16, which is the maximum GCE allows -- see [GCE's documentation](https://cloud.google.com/compute/docs/disks/persistent-disks#limits_for_predefined_machine_types)). The maximum value can be controlled by setting the `KUBE_MAX_PD_VOLS` environment variable.
- `CheckNodeMemoryPressure`: Check if a Pod can be scheduled on a node reporting the memory pressure condition. Currently, no `BestEffort` Pods should be placed on a node under memory pressure, as they would be automatically evicted by the kubelet.
- `CheckNodeDiskPressure`: Check if a Pod can be scheduled on a node reporting the disk pressure condition. Currently, no Pods should be placed on a node under disk pressure, as they would be automatically evicted by the kubelet.

The details of the above predicates can be found in [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go). All predicates mentioned above can be used in combination to perform a sophisticated filtering policy. Kubernetes uses some, but not all, of these predicates by default. You can see which ones are used by default in [pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).

## Ranking the nodes

The filtered nodes are considered suitable to host the Pod, and often more than one node remains. Kubernetes prioritizes the remaining nodes to find the "best" one for the Pod. The prioritization is performed by a set of priority functions. For each remaining node, a priority function gives a score on a scale of 0-10, with 10 representing "most preferred" and 0 "least preferred". Each priority function is weighted by a positive number, and the final score of each node is calculated by adding up all the weighted scores. For example, suppose there are two priority functions, `priorityFunc1` and `priorityFunc2`, with weighting factors `weight1` and `weight2` respectively; then the final score of some NodeA is:

    finalScoreNodeA = (weight1 * priorityFunc1) + (weight2 * priorityFunc2)

After the scores of all nodes are calculated, the node with the highest score is chosen as the host of the Pod. If more than one node has the equal highest score, one of them is chosen at random.

Currently, the Kubernetes scheduler provides some practical priority functions, including:

- `LeastRequestedPriority`: The node is prioritized based on the fraction of the node that would be free if the new Pod were scheduled onto it: (capacity - sum of requests of all Pods already on the node - request of the Pod being scheduled) / capacity. CPU and memory are equally weighted. The node with the highest free fraction is the most preferred. Note that this priority function has the effect of spreading Pods across nodes with respect to resource consumption.
- `BalancedResourceAllocation`: This priority function tries to put the Pod on a node such that the CPU and memory utilization rates are balanced after the Pod is deployed.
- `SelectorSpreadPriority`: Spread Pods by minimizing the number of Pods belonging to the same service, replication controller, or replica set on the same node. If zone information is present on the nodes, the priority is adjusted so that Pods are spread across zones and nodes.
- `CalculateAntiAffinityPriority`: Spread Pods by minimizing the number of Pods belonging to the same service on nodes with the same value for a particular label.
- `ImageLocalityPriority`: Nodes are prioritized based on the locality of the images requested by the Pod. Nodes that already have a larger total size of the images required by the Pod are preferred over nodes that have few or none of those images installed.
- `NodeAffinityPriority`: (Kubernetes v1.2) Implements `preferredDuringSchedulingIgnoredDuringExecution` node affinity; see [here](https://kubernetes.io/docs/user-guide/node-selection/) for more details.

The details of the above priority functions can be found in [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/). Kubernetes uses some, but not all, of these priority functions by default. You can see which ones are used by default in [pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go). As with predicates, you can combine the above priority functions and assign them weight factors (positive numbers) as you want (check [scheduler.md](scheduler.md) for how to customize).

This file has moved to https://git.k8s.io/community/contributors/devel/sig-scheduling/scheduler_algorithm.md.

This file is a placeholder to preserve links. Please remove by April 29, 2019, or the release of Kubernetes 1.13, whichever comes first.

@@ -0,0 +1,90 @@
# The Kubernetes Scheduler

The Kubernetes scheduler runs as a process alongside the other master components, such as the API server.
Its interface to the API server is to watch for Pods with an empty PodSpec.NodeName,
and for each such Pod, it posts a binding indicating where the Pod should be scheduled. A hedged sketch of this watch-then-bind interaction follows below.
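
As a sketch of that interaction using client-go: the kubeconfig path, the hard-coded node name, and the trimmed error handling below are illustrative placeholders, not the real kube-scheduler code (which uses informers and its own binding machinery).

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Illustrative kubeconfig path; in-cluster config would also work.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Watch for Pods whose spec.nodeName is empty, i.e. not yet scheduled.
	w, err := client.CoreV1().Pods("").Watch(context.TODO(), metav1.ListOptions{
		FieldSelector: "spec.nodeName=",
	})
	if err != nil {
		panic(err)
	}
	for event := range w.ResultChan() {
		pod, ok := event.Object.(*corev1.Pod)
		if !ok {
			continue
		}
		node := "node1" // a real scheduler would run its algorithm here

		// Post a binding naming the chosen node for this Pod.
		binding := &corev1.Binding{
			ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
			Target:     corev1.ObjectReference{Kind: "Node", Name: node},
		}
		if err := client.CoreV1().Pods(pod.Namespace).Bind(context.TODO(), binding, metav1.CreateOptions{}); err != nil {
			fmt.Println("bind failed:", err)
		}
	}
}
```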

## Exploring the code

At a high level, the scheduler code is divided into three layers:
- [cmd/kube-scheduler/scheduler.go](http://releases.k8s.io/HEAD/cmd/kube-scheduler/scheduler.go):
  This is the `main()` entry point that performs initialization before calling the scheduler framework.
- [pkg/scheduler/scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/scheduler.go):
  This is the scheduler framework that handles everything beyond the scheduling algorithm itself (e.g., binding).
- [pkg/scheduler/core/generic_scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/core/generic_scheduler.go):
  The scheduling algorithm that assigns nodes to Pods.

## The scheduling algorithm
```
For given pod:

    +---------------------------------------------+
    |             Schedulable nodes:              |
    |                                             |
    | +--------+    +--------+      +--------+    |
    | | node 1 |    | node 2 |      | node 3 |    |
    | +--------+    +--------+      +--------+    |
    |                                             |
    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+

    Pred. filters: node 3 doesn't have enough resource

    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+
    |             remaining nodes:                |
    |   +--------+                 +--------+     |
    |   | node 1 |                 | node 2 |     |
    |   +--------+                 +--------+     |
    |                                             |
    +-------------------+-------------------------+
                        |
                        |
                        v
    +-------------------+-------------------------+

    Priority function:    node 1: p=2
                          node 2: p=5

    +-------------------+-------------------------+
                        |
                        |
                        v
            select max{node priority} = node 2
```

The scheduler tries to find a node for each Pod, one at a time.

- First, it applies a set of "predicates" to filter out inappropriate nodes. For example, if the PodSpec specifies resource requests, the scheduler filters out nodes that don't have enough available resources (computed as the capacity of the node minus the sum of the resource requests of the containers already running on the node).
- Second, it applies a set of "priority functions" that rank the nodes not filtered out by the predicate check. For example, it tries to spread Pods across nodes and zones while favoring the least (theoretically) loaded nodes (where "load" is measured, in theory, as the sum of the resource requests of the containers running on the node, divided by the node's capacity).
- Finally, the node with the highest priority is chosen (or, if there are multiple such nodes, one of them is chosen at random). The code for this main scheduling loop is in the function `Schedule()` in [pkg/scheduler/core/generic_scheduler.go](http://releases.k8s.io/HEAD/pkg/scheduler/core/generic_scheduler.go). A minimal sketch of this filter-then-rank loop follows below.
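
To make the three steps concrete, here is a minimal, self-contained sketch of the filter-then-rank loop. Every type, helper, and number in it is an illustrative assumption for this document, not the real kube-scheduler API:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// Node and Pod are illustrative stand-ins for the real API types.
type Node struct {
	Name    string
	FreeCPU int // millicores not yet requested
	FreeMem int // MiB not yet requested
}

type Pod struct {
	Name   string
	CPUReq int // millicores
	MemReq int // MiB
}

// A predicate filters out nodes that cannot run the pod.
type Predicate func(Pod, Node) bool

// A priority scores a node; higher is better.
type Priority func(Pod, Node) int

// schedule mirrors the three steps above: filter, rank, break ties at random.
func schedule(pod Pod, nodes []Node, preds []Predicate, prios []Priority) (string, error) {
	// Step 1: keep only nodes that pass every predicate.
	var feasible []Node
	for _, n := range nodes {
		fits := true
		for _, p := range preds {
			if !p(pod, n) {
				fits = false
				break
			}
		}
		if fits {
			feasible = append(feasible, n)
		}
	}
	if len(feasible) == 0 {
		return "", errors.New("no feasible node for pod " + pod.Name)
	}

	// Step 2: sum the priority scores and track the best-scoring nodes.
	var best []string
	bestScore := -1
	for _, n := range feasible {
		score := 0
		for _, prio := range prios {
			score += prio(pod, n)
		}
		switch {
		case score > bestScore:
			best, bestScore = []string{n.Name}, score
		case score == bestScore:
			best = append(best, n.Name)
		}
	}

	// Step 3: if several nodes tie for the highest score, pick one at random.
	return best[rand.Intn(len(best))], nil
}

func main() {
	fits := func(p Pod, n Node) bool { return n.FreeCPU >= p.CPUReq && n.FreeMem >= p.MemReq }
	leastLoaded := func(p Pod, n Node) int { return n.FreeCPU / 100 } // toy score
	nodes := []Node{{"node1", 500, 1024}, {"node2", 900, 2048}, {"node3", 100, 256}}

	chosen, err := schedule(Pod{"web", 200, 512}, nodes, []Predicate{fits}, []Priority{leastLoaded})
	fmt.Println(chosen, err) // node2 <nil>: node3 is filtered out, node2 outscores node1
}
```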

### Predicate and priority policies

Predicates are a set of policies applied one by one to filter out inappropriate nodes.
Priorities are a set of policies applied one by one to rank the nodes that made it through the predicate filter.
By default, Kubernetes provides built-in predicate and priority policies documented in [scheduler_algorithm.md](scheduler_algorithm.md).
The predicate and priority code is defined in [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) and [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), respectively.

## Scheduler extensibility

The scheduler is extensible: the cluster administrator can choose which of the pre-defined
scheduling policies to apply, and can add new ones.

### Modifying policies

The policies that are applied when scheduling can be chosen in one of two ways.
The default policies are selected by the functions `defaultPredicates()` and `defaultPriorities()` in
[pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).
However, the choice of policies can be overridden by passing the command-line flag `--policy-config-file` to the scheduler, pointing to a JSON file specifying which scheduling policies to use. See [examples/scheduler-policy-config.json](https://git.k8s.io/examples/staging/scheduler-policy-config.json) for an example
config file. (Note that the config file format is versioned; the API is defined in [pkg/scheduler/api](http://releases.k8s.io/HEAD/pkg/scheduler/api/).)
Thus, to add a new scheduling policy, you should modify [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go) or add to the directory [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/), and either register the policy in `defaultPredicates()` or `defaultPriorities()`, or use a policy config file.
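
For orientation, a trimmed policy file compatible with that flag might look like the following; the exact set of recognized policy names depends on the Kubernetes version, so treat this as an illustration of the versioned format rather than a canonical config:

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsHostPorts"},
    {"name": "PodFitsResources"},
    {"name": "NoDiskConflict"},
    {"name": "MatchNodeSelector"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "BalancedResourceAllocation", "weight": 1},
    {"name": "SelectorSpreadPriority", "weight": 2}
  ]
}
```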

@@ -0,0 +1,40 @@
# Scheduler Algorithm in Kubernetes

For each unscheduled Pod, the Kubernetes scheduler tries to find a node across the cluster according to a set of rules. A general introduction to the Kubernetes scheduler can be found at [scheduler.md](scheduler.md). This document explains the algorithm used to select a node for a Pod. There are two steps before a destination node is chosen: the first is filtering all the nodes, and the second is ranking the remaining nodes to find the best fit for the Pod.

## Filtering the nodes

The purpose of filtering is to remove nodes that do not meet certain requirements of the Pod. For example, if the free resource on a node (measured by the capacity minus the sum of the resource requests of all the Pods that already run on the node) is less than the Pod's required resource, the node should not be considered in the ranking phase, so it is filtered out. Currently, there are several "predicates" implementing different filtering policies, including the following (a toy predicate in the spirit of `PodFitsHostPorts` is sketched after the list):

- `NoDiskConflict`: Evaluate if a Pod can fit given the volumes it requests and those that are already mounted. Currently supported volume types are AWS EBS, GCE PD, iSCSI, and Ceph RBD. Only Persistent Volume Claims for those supported types are checked; Persistent Volumes added directly to Pods are not evaluated and are not constrained by this policy.
- `NoVolumeZoneConflict`: Evaluate if the volumes a Pod requests are available on the node, given the zone restrictions.
- `PodFitsResources`: Check if the free resources (CPU and memory) meet the requirements of the Pod. The free resource is measured by the capacity minus the sum of requests of all Pods on the node. To learn more about resource QoS in Kubernetes, please check the [QoS proposal](../design-proposals/node/resource-qos.md).
- `PodFitsHostPorts`: Check if any HostPort required by the Pod is already occupied on the node.
- `HostName`: Filter out all nodes except the one specified in the PodSpec's NodeName field.
- `MatchNodeSelector`: Check if the labels of the node match the labels specified in the Pod's `nodeSelector` field and, as of Kubernetes v1.2, also match the `nodeAffinity` if present. See [here](https://kubernetes.io/docs/user-guide/node-selection/) for more details on both.
- `MaxEBSVolumeCount`: Ensure that the number of attached ElasticBlockStore volumes does not exceed a maximum value (by default 39, since Amazon recommends a maximum of 40, with one of those 40 reserved for the root volume -- see [Amazon's documentation](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html#linux-specific-volume-limits)). The maximum value can be controlled by setting the `KUBE_MAX_PD_VOLS` environment variable.
- `MaxGCEPDVolumeCount`: Ensure that the number of attached GCE PersistentDisk volumes does not exceed a maximum value (by default 16, which is the maximum GCE allows -- see [GCE's documentation](https://cloud.google.com/compute/docs/disks/persistent-disks#limits_for_predefined_machine_types)). The maximum value can be controlled by setting the `KUBE_MAX_PD_VOLS` environment variable.
- `CheckNodeMemoryPressure`: Check if a Pod can be scheduled on a node reporting the memory pressure condition. Currently, no `BestEffort` Pods should be placed on a node under memory pressure, as they would be automatically evicted by the kubelet.
- `CheckNodeDiskPressure`: Check if a Pod can be scheduled on a node reporting the disk pressure condition. Currently, no Pods should be placed on a node under disk pressure, as they would be automatically evicted by the kubelet.
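
To show the shape of a predicate, here is a toy, self-contained version in the spirit of `PodFitsHostPorts`; the inputs are illustrative stand-ins for the real Pod spec and node state:

```go
package main

import "fmt"

// podFitsHostPorts rejects a node if any host port the pod needs is
// already taken on that node.
func podFitsHostPorts(wantedPorts []int, usedOnNode map[int]bool) bool {
	for _, port := range wantedPorts {
		if usedOnNode[port] {
			return false // conflict: this node is filtered out
		}
	}
	return true
}

func main() {
	used := map[int]bool{80: true, 443: true}
	fmt.Println(podFitsHostPorts([]int{8080}, used)) // true: node survives the filter
	fmt.Println(podFitsHostPorts([]int{443}, used))  // false: node is filtered out
}
```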

The details of the above predicates can be found in [pkg/scheduler/algorithm/predicates/predicates.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/predicates/predicates.go). All predicates mentioned above can be used in combination to perform a sophisticated filtering policy. Kubernetes uses some, but not all, of these predicates by default. You can see which ones are used by default in [pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go).

## Ranking the nodes

The filtered nodes are considered suitable to host the Pod, and often more than one node remains. Kubernetes prioritizes the remaining nodes to find the "best" one for the Pod. The prioritization is performed by a set of priority functions. For each remaining node, a priority function gives a score on a scale of 0-10, with 10 representing "most preferred" and 0 "least preferred". Each priority function is weighted by a positive number, and the final score of each node is calculated by adding up all the weighted scores. For example, suppose there are two priority functions, `priorityFunc1` and `priorityFunc2`, with weighting factors `weight1` and `weight2` respectively; then the final score of some NodeA is:

    finalScoreNodeA = (weight1 * priorityFunc1) + (weight2 * priorityFunc2)
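
For instance, if `weight1 = 1`, `weight2 = 2`, and NodeA receives a score of 5 from `priorityFunc1` and 8 from `priorityFunc2` (numbers chosen purely for illustration), then:

    finalScoreNodeA = (1 * 5) + (2 * 8) = 21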

After the scores of all nodes are calculated, the node with the highest score is chosen as the host of the Pod. If more than one node has the equal highest score, one of them is chosen at random.

Currently, the Kubernetes scheduler provides some practical priority functions, including the following (a back-of-the-envelope version of `LeastRequestedPriority` is sketched after the list):

- `LeastRequestedPriority`: The node is prioritized based on the fraction of the node that would be free if the new Pod were scheduled onto it: (capacity - sum of requests of all Pods already on the node - request of the Pod being scheduled) / capacity. CPU and memory are equally weighted. The node with the highest free fraction is the most preferred. Note that this priority function has the effect of spreading Pods across nodes with respect to resource consumption.
- `BalancedResourceAllocation`: This priority function tries to put the Pod on a node such that the CPU and memory utilization rates are balanced after the Pod is deployed.
- `SelectorSpreadPriority`: Spread Pods by minimizing the number of Pods belonging to the same service, replication controller, or replica set on the same node. If zone information is present on the nodes, the priority is adjusted so that Pods are spread across zones and nodes.
- `CalculateAntiAffinityPriority`: Spread Pods by minimizing the number of Pods belonging to the same service on nodes with the same value for a particular label.
- `ImageLocalityPriority`: Nodes are prioritized based on the locality of the images requested by the Pod. Nodes that already have a larger total size of the images required by the Pod are preferred over nodes that have few or none of those images installed.
- `NodeAffinityPriority`: (Kubernetes v1.2) Implements `preferredDuringSchedulingIgnoredDuringExecution` node affinity; see [here](https://kubernetes.io/docs/user-guide/node-selection/) for more details.
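
As a back-of-the-envelope illustration of the `LeastRequestedPriority` idea, the sketch below scores each resource as 10 * free-after-placement / capacity and averages CPU and memory; the integer rounding is an assumption made for illustration, not the scheduler's exact arithmetic:

```go
package main

import "fmt"

// leastRequestedScore approximates LeastRequestedPriority: score each
// resource as 10 * free_after_placement / capacity, then average the
// equally weighted CPU and memory scores.
func leastRequestedScore(capCPU, reqCPU, podCPU, capMem, reqMem, podMem int) int {
	freeCPU := capCPU - reqCPU - podCPU
	freeMem := capMem - reqMem - podMem
	if freeCPU < 0 || freeMem < 0 {
		return 0 // the pod would not fit at all
	}
	cpuScore := 10 * freeCPU / capCPU
	memScore := 10 * freeMem / capMem
	return (cpuScore + memScore) / 2
}

func main() {
	// A node with 4000m CPU and 8192Mi memory, half of each already requested:
	// placing a 500m/1024Mi pod leaves 1500m and 3072Mi free, scoring (3+3)/2 = 3.
	fmt.Println(leastRequestedScore(4000, 2000, 500, 8192, 4096, 1024))
}
```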

The details of the above priority functions can be found in [pkg/scheduler/algorithm/priorities](http://releases.k8s.io/HEAD/pkg/scheduler/algorithm/priorities/). Kubernetes uses some, but not all, of these priority functions by default. You can see which ones are used by default in [pkg/scheduler/algorithmprovider/defaults/defaults.go](http://releases.k8s.io/HEAD/pkg/scheduler/algorithmprovider/defaults/defaults.go). As with predicates, you can combine the above priority functions and assign them weight factors (positive numbers) as you want (check [scheduler.md](scheduler.md) for how to customize).