Merge pull request #3914 from wangchen615/customizable_recommender_kep
Add enhancement proposal for feature request #3913
commit 3bfbc18c03

# Support Customized Recommenders for Vertical Pod Autoscalers

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Implementation Details](#implementation-details)
  - [Deployment Details](#deployment-details)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Alternatives](#alternatives)
- [Out of Scope](#out-of-scope)
- [Implementation History](#implementation-history)
<!-- /toc -->

## Summary

The VPA currently computes CPU and memory request recommendations with a single recommender,
which predicts future requests from the historical usage observed in a
rolling time window. Because no single recommendation policy suits all
types of workloads, this KEP proposes supporting multiple customized recommenders in VPA.
Users can then run different recommenders for different workloads, which may exhibit
very distinct resource usage behaviors.

## Motivation

A VPA recommends resource requests for the containers in a pod when the actual CPU/memory usage of a container
differs significantly from the resources requested. Usage-based recommendation,
the basic approach, resizes containers according to the observed usage and is
implemented in the default VPA recommender. Users can configure the time window and the
percentile of past observed usage that serves as the prediction of future CPU/memory requests and limits.

However, containers running different types of workloads may have different resource usage patterns,
so no universal policy applies to all of them. The existing VPA recommender may not accurately
predict future resource usage when containers exhibit certain usage behaviors,
such as trends, periodic changes, or occasional spikes, which results in significant
over-provisioning and OOM kills for microservices. Learning the different types of resource usage
behavior and applying different algorithms to improve the predictions of resource usage
(CPU and memory) can significantly reduce over-provisioning and OOM kills in VPA.

### Goals

- Allow the VPA object to specify a customized recommender to use.
- Allow the VPA object to use the default recommender when no recommender is specified.

### Non-Goals

- We assume no pod uses two recommenders at the same time.
- We do not resolve conflicts between recommenders.

## Proposal

### User Stories

#### Story 1

- Containers with Cyclic Patterns in Resource Usage

Containers used in monitoring may receive load to process only periodically, but must be long-running
to listen for incoming traffic. These containers therefore usually exhibit cyclic patterns, alternating
between usage spikes and idling. Resizing containers according to the usage observed in the previous
time window can repeatedly leave them under-provisioned for a short period just after a load spike arrives.
For memory, the problem appears if the cycle length is >8 days. For CPU, the problem may
be visible with, for example, lower usage on the weekend. It can even lead to frequent pod evictions
when the pod was resized during the idling period and the host's resources have since been taken by other pods.

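The lag described in this story can be illustrated with a minimal sketch. It is not the default recommender's actual algorithm: it assumes a plain nearest-rank percentile over a fixed sample window, and the usage trace, window size, and function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 1) of samples using the
// nearest-rank method, or 0 when there are no samples.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// recommend predicts the usage at index t as the p-th percentile of the
// `window` samples that precede t.
func recommend(usage []float64, t, window int, p float64) float64 {
	start := t - window
	if start < 0 {
		start = 0
	}
	return percentile(usage[start:t], p)
}

func main() {
	// Hypothetical CPU trace in cores: 10 idle samples, then a 2-sample
	// spike, repeated three times.
	var usage []float64
	for cycle := 0; cycle < 3; cycle++ {
		for i := 0; i < 10; i++ {
			usage = append(usage, 0.1)
		}
		for i := 0; i < 2; i++ {
			usage = append(usage, 2.0)
		}
	}

	// Index 22 is the first sample of the second spike. The 8 samples
	// before it are all idle, so the window-based prediction is far too low.
	t := 22
	fmt.Printf("recommended=%.1f actual=%.1f\n", recommend(usage, t, 8, 0.9), usage[t])
}
```

Because the window that precedes each spike contains only idle samples, the prediction stays at the idle level exactly when the load arrives. A recommender that knows the cycle length could avoid this.
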
#### Story 2

- Containers with Different but Recurrent Behaviors in Resource Usage

Containers running Spark or deep learning training workloads are known to show recurring
patterns in resource usage. Prior research has shown that different but recurrent behaviors occur
for different containerized tasks of this kind. These common patterns can
be represented by phases, each of which displays similar usage of computational resources over time.
Different executions of a workload share common sequences of phases, which can be used
to proactively predict future resource usage more accurately. The default recommender in the current
VPA takes a reactive approach, so a more proactive recommender is needed for these types of workloads.

### Implementation Details

The following describes a first-class approach to supporting customized
recommenders. Namely, a dedicated field `recommenderName` is added to the VPA CRD definition in
`deploy/vpa-v1.crd.yaml`.

```yaml
validation:
  # openAPIV3Schema is the schema for validating custom objects.
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required: []
        properties:
          # An array for future extension; at most one element is accepted.
          recommenderName:
            type: array
            items:
              type: string
          targetRef:
            type: object
          updatePolicy:
            type: object
```

Correspondingly, the `VerticalPodAutoscalerSpec` in `pkg/apis/autoscaling.k8s.io/v1/types.go`
should be updated to include the `recommenderName` field.

```golang
// VerticalPodAutoscalerSpec is the specification of the behavior of the autoscaler.
type VerticalPodAutoscalerSpec struct {
	// TargetRef points to the controller managing the set of pods for the
	// autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler
	// can be targeted at controller implementing scale subresource (the pod set is
	// retrieved from the controller's ScaleStatus) or some well known controllers
	// (e.g. for DaemonSet the pod set is read from the controller's spec).
	// If VerticalPodAutoscaler cannot use specified target it will report
	// ConfigUnsupported condition.
	// Note that VerticalPodAutoscaler does not require full implementation
	// of scale subresource - it will not use it to modify the replica count.
	// The only thing retrieved is a label selector matching pods grouped by
	// the target resource.
	TargetRef *autoscaling.CrossVersionObjectReference `json:"targetRef" protobuf:"bytes,1,name=targetRef"`

	// Describes the rules on how changes are applied to the pods.
	// If not specified, all fields in the `PodUpdatePolicy` are set to their
	// default values.
	// +optional
	UpdatePolicy *PodUpdatePolicy `json:"updatePolicy,omitempty" protobuf:"bytes,2,opt,name=updatePolicy"`

	// Controls how the autoscaler computes recommended resources.
	// The resource policy may be used to set constraints on the recommendations
	// for individual containers. If not specified, the autoscaler computes recommended
	// resources for all containers in the pod, without additional constraints.
	// +optional
	ResourcePolicy *PodResourcePolicy `json:"resourcePolicy,omitempty" protobuf:"bytes,3,opt,name=resourcePolicy"`

	// Name of the recommender responsible for generating recommendation for this object.
	// At most one element is supported.
	// +optional
	RecommenderName []string `json:"recommenderName,omitempty" protobuf:"bytes,4,rep,name=recommenderName"`
}
```

When creating a recommender object, the recommender's main routine should initialize itself
with a predefined recommender name, which can be defined as a constant in `pkg/recommender/main.go`:

```golang
const RecommenderName = "default"

recommender := routines.NewRecommender(config, *checkpointsGCInterval, useCheckpoints, RecommenderName, *vpaObjectNamespace)
```

where `routines.NewRecommender` passes the `RecommenderName` to the `clusterState` object.

```golang
// NewRecommender creates a new recommender instance.
// Dependencies are created automatically.
// Deprecated; use RecommenderFactory instead.
func NewRecommender(config *rest.Config, checkpointsGCInterval time.Duration, useCheckpoints bool, recommenderName string, namespace string) Recommender {
	clusterState := model.NewClusterState(recommenderName)
	return RecommenderFactory{
		ClusterState:           clusterState,
		ClusterStateFeeder:     input.NewClusterStateFeeder(config, clusterState, *memorySaver, namespace),
		CheckpointWriter:       checkpoint.NewCheckpointWriter(clusterState, vpa_clientset.NewForConfigOrDie(config).AutoscalingV1()),
		VpaClient:              vpa_clientset.NewForConfigOrDie(config).AutoscalingV1(),
		PodResourceRecommender: logic.CreatePodResourceRecommender(),
		CheckpointsGCInterval:  checkpointsGCInterval,
		UseCheckpoints:         useCheckpoints,
	}.Make()
}


// NewClusterState returns a new ClusterState with no pods.
func NewClusterState(recommenderName string) *ClusterState {
	return &ClusterState{
		RecommenderName:   recommenderName,
		Pods:              make(map[PodID]*PodState),
		Vpas:              make(map[VpaID]*Vpa),
		EmptyVPAs:         make(map[VpaID]time.Time),
		aggregateStateMap: make(aggregateContainerStatesMap),
		labelSetMap:       make(labelSetMap),
	}
}
```

Therefore, when loading VPA objects, the `clusterStateFeeder` can filter on this field and keep only the VPA CRDs
whose `recommenderName` is unset or equal to the current clusterState's `RecommenderName`.

```golang
// Fetch VPA objects and load them into the cluster state.
func (feeder *clusterStateFeeder) LoadVPAs() {
	// List VPA API objects.
	allVpaCRDs, err := feeder.vpaLister.List(labels.Everything())
	if err != nil {
		klog.Errorf("Cannot list VPAs. Reason: %+v", err)
		return
	}

	// Keep only the VPAs that name this recommender or name no recommender at all.
	currentRecommenderName := feeder.clusterState.RecommenderName
	var vpaCRDs []*vpa_types.VerticalPodAutoscaler
	for _, vpaCRD := range allVpaCRDs {
		// RecommenderName is a slice that holds at most one element.
		names := vpaCRD.Spec.RecommenderName
		if len(names) > 0 && names[0] != currentRecommenderName {
			klog.V(6).Infof("Ignoring the vpaCRD as its recommenderName %v is not equal to the current recommender's name %v", names[0], currentRecommenderName)
			continue
		}
		vpaCRDs = append(vpaCRDs, vpaCRD)
	}

	klog.V(3).Infof("Fetched %d VPAs.", len(vpaCRDs))
	// Add or update existing VPAs in the model.
	vpaKeys := make(map[model.VpaID]bool)

	…

	feeder.clusterState.ObservedVpas = vpaCRDs
}
```

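The selection rule in `LoadVPAs` can be exercised in isolation. The following is a minimal, self-contained sketch: the `vpa` struct, the helper names, and the `cyclic` recommender name are hypothetical stand-ins, not part of the VPA code base. It also makes one assumption that is stricter than the loop above: a VPA that names no recommender is loaded only by the recommender called `default`, rather than by every recommender.

```go
package main

import "fmt"

// defaultRecommenderName mirrors the "default" constant proposed in this KEP.
const defaultRecommenderName = "default"

// vpa is a hypothetical, stripped-down stand-in for a VerticalPodAutoscaler
// object; only the fields needed for the selection logic are kept.
type vpa struct {
	Name            string
	RecommenderName []string // at most one element, as validated at admission
}

// selects reports whether the recommender named current should handle v.
// Assumption: a VPA that names no recommender belongs to the default one.
func selects(v vpa, current string) bool {
	if len(v.RecommenderName) == 0 {
		return current == defaultRecommenderName
	}
	return v.RecommenderName[0] == current
}

// filterVPAs returns the names of the VPAs that the recommender named
// current should load.
func filterVPAs(all []vpa, current string) []string {
	var kept []string
	for _, v := range all {
		if selects(v, current) {
			kept = append(kept, v.Name)
		}
	}
	return kept
}

func main() {
	all := []vpa{
		{Name: "hamster-vpa"},
		{Name: "spark-vpa", RecommenderName: []string{"cyclic"}},
		{Name: "web-vpa", RecommenderName: []string{"default"}},
	}
	fmt.Println(filterVPAs(all, "default")) // [hamster-vpa web-vpa]
	fmt.Println(filterVPAs(all, "cyclic"))  // [spark-vpa]
}
```

With this stricter rule, each VPA is loaded by exactly one recommender, which matches the assumption that no pod uses two recommenders at the same time.
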
Accordingly, a VPA object YAML that selects the default recommender sets `recommenderName` to the default `RecommenderName`.

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenderName:
    - default
  targetRef:
    apiVersion: "apps/v1"
    ... ...
```

### Deployment Details
A customized recommender is deployed as a separate deployment that is chosen
by a distinct set of VPA objects. Each VPA object chooses only one recommender at a time.
The following drawing shows how the default recommender and the customized recommender run and interact
with VPA objects.

<img src="images/deployment.png" alt="deployment" width="720" height="360"/>

Although this proposal does not let a VPA object use multiple recommenders, it leaves room for the
changes needed to support multiple recommenders in the future. Namely, we define `recommenderName` to be an array instead of a string, but we support only one element in this proposal. We modify the admission controller to validate that the array has <= 1 elements.

We will add the following check to the `func validateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool)` function.

```golang
if len(vpa.Spec.RecommenderName) > 1 {
	return fmt.Errorf("VPA object shouldn't specify more than one recommenderName")
}
```

## Design Details

### Test Plan
- Add an e2e test demonstrating that the default recommender ignores a VPA that specifies an alternate recommender.

### Upgrade / Downgrade Strategy
For cluster upgrades, the VPA from the previous version will continue working as before.
Behavior does not change, and no flags have to be enabled or disabled.

## Alternatives

### Develop a plugin framework for customizable recommenders.
Add a webhook system for customized recommendations. The default VPA recommender would
call any available recommendation webhooks, and if any of them makes a recommendation,
the recommender would use that recommendation instead of making its own. If none makes
a recommendation, it would make its own recommendation as it currently does. The plugin alternative
is rejected because it requires many more design and code changes. It might be considered in the future if
more use cases arise that need multiple recommenders for the same VPA object.

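The fallback behavior of this rejected alternative can be sketched as follows. This is only an illustration under stated assumptions: plugins are modeled as in-process functions rather than real webhooks, the first plugin that answers wins, and the `Recommendation` and `Plugin` types and container names are hypothetical.

```go
package main

import "fmt"

// Recommendation is a hypothetical, simplified recommendation for one container.
type Recommendation struct {
	CPUMillicores int64
	MemoryBytes   int64
}

// Plugin is a hypothetical stand-in for a recommendation webhook. It returns
// ok=false when it has no recommendation for the container.
type Plugin func(container string) (rec Recommendation, ok bool)

// recommendWithFallback returns the recommendation of the first plugin that
// provides one, and otherwise the result of the built-in fallback logic.
func recommendWithFallback(container string, plugins []Plugin, fallback func(string) Recommendation) Recommendation {
	for _, p := range plugins {
		if rec, ok := p(container); ok {
			return rec
		}
	}
	return fallback(container)
}

func main() {
	// Stand-in for the default recommender's own logic.
	fallback := func(string) Recommendation {
		return Recommendation{CPUMillicores: 100, MemoryBytes: 128 << 20}
	}
	// A plugin that only has an opinion about one container name.
	sparkOnly := func(container string) (Recommendation, bool) {
		if container != "spark-executor" {
			return Recommendation{}, false
		}
		return Recommendation{CPUMillicores: 2000, MemoryBytes: 4 << 30}, true
	}

	plugins := []Plugin{sparkOnly}
	fmt.Println(recommendWithFallback("spark-executor", plugins, fallback))
	fmt.Println(recommendWithFallback("nginx", plugins, fallback))
}
```

Even in this reduced form, the default recommender has to know about every plugin and define an ordering between them, which hints at the extra design work cited as the reason for rejection.
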
### Develop a label selector approach.
Add a label to the VPA CRD object to denote the recommender's name. When making
recommendations, the existing recommender would load and update only the VPA CRDs with the label
`recommender=default`.
The label selector approach is rejected because it is too powerful, and users can easily
overlook those labels and misconfigure the VPA objects.

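The misconfiguration risk can be made concrete with a small sketch. It assumes the `recommender` label key from the example above and plain map matching; the helper name is hypothetical. Because labels are free-form key/value pairs, a misspelled key is accepted by the API and the VPA silently ends up unmanaged.

```go
package main

import "fmt"

// recommenderLabelKey is the label key used by the label-based alternative.
const recommenderLabelKey = "recommender"

// managedBy reports whether a VPA carrying the given labels would be loaded
// by the recommender called name under the label-based alternative.
func managedBy(labels map[string]string, name string) bool {
	return labels[recommenderLabelKey] == name
}

func main() {
	correct := map[string]string{"recommender": "default"}
	typo := map[string]string{"recomender": "default"} // misspelled key

	fmt.Println(managedBy(correct, "default")) // true
	fmt.Println(managedBy(typo, "default"))    // false: no recommender picks it up
}
```

A dedicated, schema-validated field avoids this class of error, since an unknown field name can be rejected when the object is created.
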
## Out of Scope

- Although this proposal will enable alternate recommenders, no alternate recommenders
  will be created as part of this proposal.
- This proposal will not support running multiple recommenders for the same VPA object. Each VPA object
  is supposed to use only one recommender.

## Implementation History

<!--
Major milestones in the lifecycle of a KEP should be tracked in this section.
Major milestones might include:
- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
- the `Proposal` section being merged, signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
-->