The logic is very basic and will likely need to be revised, but it's
enough for initial testing. Utilization of a given Pool is calculated
as the number of allocated devices in the Pool divided by the total
number of devices in the Pool. For scale-down purposes, the maximum
utilization across all Node-local Pools is used.
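A rough sketch of that calculation, with made-up names (poolStats,
poolUtilization, maxNodeLocalUtilization are illustrative, not the
actual cluster-autoscaler symbols):

    package utilization // hypothetical package, for illustration only

    import "fmt"

    // poolStats is a hypothetical aggregate of a single DRA resource pool.
    type poolStats struct {
        name      string
        allocated int // devices currently allocated out of the pool
        total     int // all devices advertised by the pool
    }

    // poolUtilization returns allocated/total, erroring out on empty pools.
    func poolUtilization(p poolStats) (float64, error) {
        if p.total == 0 {
            return 0, fmt.Errorf("pool %q has no devices", p.name)
        }
        return float64(p.allocated) / float64(p.total), nil
    }

    // maxNodeLocalUtilization returns the highest utilization among the
    // given Node-local pools, which is the value scale-down looks at.
    func maxNodeLocalUtilization(pools []poolStats) (float64, error) {
        var maxUtil float64
        for _, p := range pools {
            util, err := poolUtilization(p)
            if err != nil {
                return 0, err
            }
            if util > maxUtil {
                maxUtil = util
            }
        }
        return maxUtil, nil
    }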
The new logic is mostly behind the DRA flag guard, so this should be a no-op
if the flag is disabled. The only difference is that FilterOutUnremovable
now marks a Node as unremovable if calculating its utilization fails. It's
not clear why this wasn't the case before, but it seems necessary for DRA:
if CA sees an incomplete picture of a resource pool, we probably don't want
to scale the Node down.
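For illustration, a simplified version of that error path, reusing the
helpers sketched above (filterOutUnremovable and nodeCandidate are made-up
names; the real FilterOutUnremovable operates on CA's internal types):

    // nodeCandidate is a hypothetical pairing of a Node with its pools.
    type nodeCandidate struct {
        name  string
        pools []poolStats
    }

    // filterOutUnremovable keeps Nodes whose utilization can't be
    // computed out of the scale-down candidate set.
    func filterOutUnremovable(candidates []nodeCandidate, threshold float64) (removable, unremovable []nodeCandidate) {
        for _, node := range candidates {
            util, err := maxNodeLocalUtilization(node.pools)
            if err != nil {
                // CA has an incomplete picture of a resource pool, so
                // scaling this Node down would be risky.
                unremovable = append(unremovable, node)
                continue
            }
            if util <= threshold {
                // Under-utilized enough to be a scale-down candidate.
                removable = append(removable, node)
            } else {
                unremovable = append(unremovable, node)
            }
        }
        return removable, unremovable
    }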
The Snapshot can hold all DRA objects in the cluster and expose them
to the scheduler framework via the SharedDRAManager interface.
The state of these objects can be modified during autoscaling simulations
using the provided methods.
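For reference, the shape of that interface is roughly the following
(paraphrased, not a verbatim copy; see k8s.io/kubernetes/pkg/scheduler/framework
for the authoritative definition, including the ResourceClaimTracker,
ResourceSliceLister and DeviceClassLister types it refers to):

    // Paraphrased from the scheduler framework.
    type SharedDRAManager interface {
        ResourceClaims() ResourceClaimTracker
        ResourceSlices() ResourceSliceLister
        DeviceClasses() DeviceClassLister
    }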
A new DRA Snapshot type is introduced; for now it has only dummy methods,
which will be implemented in later commits. The new type is intended to
hold all DRA objects in the cluster.
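A minimal sketch of what the type could look like (the field names and the
v1beta1 API version are assumptions; the dummy methods just satisfy
SharedDRAManager until the real logic lands):

    import (
        resourceapi "k8s.io/api/resource/v1beta1" // assumed API version
        "k8s.io/apimachinery/pkg/types"
        "k8s.io/kubernetes/pkg/scheduler/framework"
    )

    // Snapshot holds all DRA objects in the cluster; illustrative fields.
    type Snapshot struct {
        resourceClaims map[types.NamespacedName]*resourceapi.ResourceClaim
        resourceSlices []*resourceapi.ResourceSlice
        deviceClasses  map[string]*resourceapi.DeviceClass
    }

    // Dummy methods to be implemented in later commits.
    func (s Snapshot) ResourceClaims() framework.ResourceClaimTracker { return nil }
    func (s Snapshot) ResourceSlices() framework.ResourceSliceLister  { return nil }
    func (s Snapshot) DeviceClasses() framework.DeviceClassLister     { return nil }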
ClusterSnapshotStore.SetClusterState() is extended to take the new DRA Snapshot in
addition to the existing parameters.
ClusterSnapshotStore.DraSnapshot() is added to retrieve the DRA snapshot
previously set by SetClusterState(). This will be used by PredicateSnapshot
to implement DRA logic later.
This should be a no-op, as DraSnapshot() is never called, and no DRA
snapshot is passed to SetClusterState() yet.