Commit Graph

11 Commits

Author SHA1 Message Date
Kuba Tużnik 4e283e34ee CA: don't error out in HintingSimulator if a hinted Node is gone
If a hinted Node is no longer in the cluster snapshot (e.g. it was
a fake upcoming Node and the real one appeared).

This was introduced in the recent PredicateChecker->PredicateSnapshot
refactor. Previously, PredicateChecker.CheckPredicates() would return
an error if the hinted Node was gone, and HintingSimulator treated
this error the same as failing predicates - it would move on to the
non-hinting logic. After the refactor, HintingSimulator explicitly
errors out if it can't retrieve the hinted Node from the snapshot,
so the behavior changed.

I checked other CheckPredicates()/SchedulePod() callsites, and this is
the only one when ignoring the missing Node makes sense. For the others,
the Node is added to the snapshot just before the call, so it being
missing should cause an error.
2024-12-12 14:20:51 +01:00
Kuba Tużnik 6876289228 CA: remove PredicateChecker, use the new ClusterSnapshot methods instead 2024-12-04 14:33:51 +01:00
Kuba Tużnik 540725286f CA: migrate the codebase to use PredicateSnapshot 2024-12-04 14:33:51 +01:00
Kuba Tużnik a35f830f1d CA: extract a Handle to scheduleframework.Framework out of PredicateChecker
This decouples PredicateChecker from the Framework initialization logic,
and allows creating multiple PredicateChecker instances while only
initializing the framework once.

This commit also fixes how CA integrates with Framework metrics. Instead
of Registering them they're only Initialized so that CA doesn't expose
scheduler metrics. And the initialization is moved from multiple
different places to the Handle constructor.
2024-12-03 16:47:54 +01:00
Kuba Tużnik f67db627e2 CA: rename ClusterSnapshot AddPod, RemovePod, RemoveNode
RemoveNode is renamed to RemoveNodeInfo for consistency with other
NodeInfo methods.

For DRA, the snapshot will have to potentially allocate ResourceClaims
when adding a Pod to a Node, and deallocate them when removing a Pod
from a Node. This will happen in new methods added to ClusterSnapshot
in later commits - SchedulePod and UnschedulePod. These new methods
should be the "default" way of moving pods around the snapshot going
forward.

However, we'll still need to be able to add and remove pods from the
snapshot "forcefully" to handle some corner cases (e.g. expendable pods).
AddPod is renamed to ForceAddPod, and RemovePod to ForceRemovePod to
highlight that these are no longer the "default" methods of moving pods
around the snapshot, and are bypassing something important.
2024-11-19 15:28:21 +01:00
Kuba Tużnik 879c6a84a4 DRA: migrate all of CA to use the new internal NodeInfo/PodInfo
The new wrapper types should behave like the direct schedulerframework
types for most purposes, so most of the migration is just changing
the imported package.

Constructors look a bit different, so they have to be adapted -
mostly in test code. Accesses to the Pods field have to be changed
to a method call.

After this, the schedulerframework types are only used in the new
wrappers, and in the parts of simulator/ that directly interact with
the scheduler framework. The rest of CA codebase operates on the new
wrapper types.
2024-11-05 16:43:43 +01:00
Bartłomiej Wróblewski 068ce78272 Register scheduler metrics 2024-10-23 16:47:34 +00:00
Aleksandra Malinowska dfc0ef363f Refactor out unnecessary return value; it's always nil 2024-04-29 11:16:58 +02:00
Daniel Gutowski b6fa1f5bc8 Cap logs logged by HintingSimulator.
To avoid excesive logging of the HintingSimulator we should use klogx
to cap the number of logged lines depending on the verbosity level.
2023-01-05 06:24:56 -08:00
Bartłomiej Wróblewski 10d3f25996 Use scheduling package in filterOutSchedulable processor 2022-11-23 12:32:59 +00:00
Daniel Kłobuszewski 92f5b8673e Extract scheduling hints to a dedicated object
This removes the need for passing maps back and forth when doing
scheduling simulations.
2022-10-20 11:44:15 +02:00