8583 Commits

Author SHA1 Message Date
Kuba Tużnik 5df68a49ca CA: update-deps.sh: move tidy after all gets
There doesn't seem to be any benefit to tidying after
getting every dependency pkg. And trying to tidy after
every get seems bugged:

* Some of the k8s.io dependencies depend on each other.
  For example k8s.io/client-go depends on k8s.io/api.
* We bump k8s.io/api to a new version, and that version removes
  some pkg (e.g. replaces v1alpha1 with v1alpha2), that
  some other dependency (e.g. k8s.io/client-go) requires.
* If we try to tidy immediately after getting k8s.io/api, this can
  fail because the required new k8s.io/api version doesn't have the
  removed pkg that k8s.io/client-go (still at a lower version
  since we haven't processed it yet) requires.

Getting all of the dependencies at the new versions first, and then
tidying once afterwards solves this issue.
2024-11-27 18:30:38 +01:00
Ismail Alidzhikov 4e96bbc73a vpa-recommender: Group prometheus and external-metrics flags to dedicated var blocks 2024-11-27 17:57:54 +02:00
Omer Aplatony 716239572f Removed already implemented TODO
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-11-27 13:54:36 +02:00
Kuba Tużnik eb26816ce9 CA: refactor utils related to NodeInfos
simulator.BuildNodeInfoForNode, core_utils.GetNodeInfoFromTemplate,
and scheduler_utils.DeepCopyTemplateNode all had very similar logic
for sanitizing and copying NodeInfos. They're all consolidated to
one file in simulator, sharing common logic.

DeepCopyNodeInfo is changed to be a framework.NodeInfo method.

MixedTemplateNodeInfoProvider now correctly uses ClusterSnapshot to
correlate Nodes to scheduled pods, instead of using a live Pod lister.
This means that the snapshot now has to be properly initialized in a
bunch of tests.
2024-11-27 12:51:30 +01:00
Kuba Tużnik bc16a6f55b CA: wrap the provided errors in ToAutoscalerError() and AddPrefix(), implement Unwrap()
This allows using errors.Is() to check if an AutoscalerError wraps
a sentinel error (e.g. cloudprovider.ErrNotImplemented) when a prefix is
added to it.
2024-11-27 12:44:59 +01:00
Kubernetes Prow Robot 29ce5d42b7
Merge pull request #7534 from voelzmo/fix/invalid-log-keys
Fix invalid log keys
2024-11-27 10:58:57 +00:00
Marco Voelz 0d3e262f4b apply review suggestions 2024-11-27 11:42:14 +01:00
Marco Voelz 3b054a0f53
Update vertical-pod-autoscaler/pkg/target/controller_fetcher/controller_fetcher.go
Co-authored-by: Ismail Alidzhikov <9372594+ialidzhikov@users.noreply.github.com>
2024-11-27 11:41:27 +01:00
Marco Voelz d791391e22 fix all InfoS without keys 2024-11-27 10:46:35 +01:00
Marco Voelz d886fe414e fix all ErrorS without keys 2024-11-27 10:46:35 +01:00
Adrian Moisey 13d6ffa28f
Guard against panic
And add tests
2024-11-27 08:57:44 +02:00
Adrian Moisey e9c1c1cc26
Remove DeepCopy
It's unneeded
2024-11-27 08:48:09 +02:00
Kubernetes Prow Robot 930f148f01
Merge pull request #7531 from willie-yao/fast-delete-toggle
Add flag to enable fast delete of failed VMSS
2024-11-26 01:38:56 +00:00
willie-yao 064d48f36c
Add toggle for fast delete 2024-11-26 00:25:04 +00:00
Kubernetes Prow Robot 86a80c6823
Merge pull request #7526 from willie-yao/cse-fast-delete
Set node state to InstanceCreating to delete on CSE error
2024-11-26 00:20:57 +00:00
Kubernetes Prow Robot 60a35bbee4
Merge pull request #7496 from PBundyra/noretry-flag
Introduce noRetry Parameter for checkcapacity ProvisioningRequest
2024-11-25 09:58:56 +00:00
Adrian Moisey e277df0063
Fix logic 2024-11-24 20:45:01 +02:00
Adrian Moisey 66eabb3e30
Pass the whole VPA into cappingRecommendationProcessor.Apply()
The current log message for when no container is found is very
misleading and can cause confusion.

This passes the entire VPA object into that function, in order for it to
create a log message with the relevant VPA name in it.

It kinda feels like surgery with a scalpel, any alternative approaches
would be appreciated.
2024-11-23 20:55:18 +02:00
Kubernetes Prow Robot 3080b95129
Merge pull request #7490 from raywainman/dockerfile
Remove maintainer labels in Dockerfiles since they just get stale
2024-11-23 17:04:54 +00:00
willie-yao 49a1ad4ad2
Set node state to InstanceCreating to delete on CSE error 2024-11-23 00:25:12 +00:00
Kubernetes Prow Robot 5ae08a927e
Merge pull request #7482 from omerap12/issue-7434
chore(VPA): add comprehensive test coverage for types.go
2024-11-22 19:04:54 +00:00
Kubernetes Prow Robot f8b2129609
Merge pull request #7505 from omerap12/e2e-wait-deprecated
VPA: Replace Poll/PollImmediate with PollUntilContextTimeout in v1 e2e tests
2024-11-22 19:02:54 +00:00
Kubernetes Prow Robot 3312125f42
Merge pull request #7483 from adrianmoisey/remove-long-sleeps
VPA: Try to make recommender e2e a little faster
2024-11-22 19:00:55 +00:00
Kubernetes Prow Robot 60f82d65c2
Merge pull request #7488 from omerap12/bump-metrics-server
VPA: e2e Update metrics-server version to 0.7.2
2024-11-22 18:58:54 +00:00
Kubernetes Prow Robot 855b940112
Merge pull request #7494 from adrianmoisey/readme_updates
VPA - Update docs a little
2024-11-22 18:54:55 +00:00
Kubernetes Prow Robot 5458e1c208
Merge pull request #7436 from maximrub/fr-7435-alibaba-cloud-rrsa-new-env-vars
7435 Support New Alibaba Cloud ENV Variables names for RRSA Authorization
2024-11-22 10:30:54 +00:00
Kubernetes Prow Robot 30e57c90b9
Merge pull request #7466 from towca/jtuznik/dra-snapshot-cleanup
CA: refactor ClusterSnapshot methods
2024-11-21 17:18:54 +00:00
Patryk Bundyra 14067b766c Address nits 2024-11-21 13:37:13 +00:00
Patryk Bundyra ddcb59dd23 Add logging, add comment, move test util function 2024-11-20 15:29:27 +00:00
Kubernetes Prow Robot 4c37ff38ce
Merge pull request #6999 from dominic-p/iss-5919-placement-groups
Add support for node pool placement group config
2024-11-20 13:04:53 +00:00
Kuba Tużnik 473a1a8ffc CA: remove Clear from ClusterSnapshot
It's now redundant - SetClusterState with empty arguments does the same
thing.
2024-11-19 15:28:27 +01:00
Kuba Tużnik f67db627e2 CA: rename ClusterSnapshot AddPod, RemovePod, RemoveNode
RemoveNode is renamed to RemoveNodeInfo for consistency with other
NodeInfo methods.

For DRA, the snapshot will have to potentially allocate ResourceClaims
when adding a Pod to a Node, and deallocate them when removing a Pod
from a Node. This will happen in new methods added to ClusterSnapshot
in later commits - SchedulePod and UnschedulePod. These new methods
should be the "default" way of moving pods around the snapshot going
forward.

However, we'll still need to be able to add and remove pods from the
snapshot "forcefully" to handle some corner cases (e.g. expendable pods).
AddPod is renamed to ForceAddPod, and RemovePod to ForceRemovePod to
highlight that these are no longer the "default" methods of moving pods
around the snapshot, and are bypassing something important.
2024-11-19 15:28:21 +01:00
Kuba Tużnik a81aa5c616 CA: remove AddNode from ClusterSnapshot
AddNodeInfo already provides the same functionality, and has to be used
in production code in order to propagate DRA objects correctly.

Uses in production are replaced with SetClusterState(), which will later
take DRA objects into account. Uses in the test code are replaced with
AddNodeInfo().
2024-11-19 15:28:16 +01:00
Kuba Tużnik 38603883db CA: remove redundant IsPVCUsedByPods from ClusterSnapshot
The method is already accessible via StorageInfos(), it's
redundant.
2024-11-19 15:28:11 +01:00
Kuba Tużnik 517ecb992f CA: add SetClusterState to ClusterSnapshot, remove AddNodes
AddNodes() is redundant - it was intended for batch adding nodes,
with batch-specific optimizations in mind probably. However, it
has always been implemented as just iterating over AddNode(), and
is only used in test code.

Most of the uses in the test code were initializing the cluster state.
They are replaced with SetClusterState(), which will later be needed for
handling DRA anyway (we'll have to start tracking things that aren't
node- or pod-scoped). The other uses are replaced with inline loops over
AddNode().
2024-11-19 15:28:06 +01:00
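
A rough sketch of the shape described above, using a toy snapshot keyed by node name. The types and signatures are assumptions made for the example; the real ClusterSnapshot works with Kubernetes Node and Pod objects.

```go
package main

import "fmt"

// snapshot is a toy stand-in for ClusterSnapshot: node name -> pod names.
type snapshot struct {
	podsByNode map[string][]string
}

// addNode adds a single empty node.
func (s *snapshot) addNode(node string) {
	if s.podsByNode == nil {
		s.podsByNode = map[string][]string{}
	}
	s.podsByNode[node] = nil
}

// addNodes mirrors what the commit message says about AddNodes():
// no batch-specific optimization, only a loop over addNode.
func (s *snapshot) addNodes(nodes []string) {
	for _, n := range nodes {
		s.addNode(n)
	}
}

// setClusterState discards the previous state and rebuilds it from the
// given nodes and pod-to-node assignments (a hypothetical signature).
func (s *snapshot) setClusterState(nodes []string, podToNode map[string]string) {
	s.podsByNode = make(map[string][]string, len(nodes))
	for _, n := range nodes {
		s.addNode(n)
	}
	for p, n := range podToNode {
		// Pods assigned to unknown nodes are ignored in this sketch.
		if _, ok := s.podsByNode[n]; ok {
			s.podsByNode[n] = append(s.podsByNode[n], p)
		}
	}
}

func main() {
	s := &snapshot{}
	s.setClusterState([]string{"node-a", "node-b"}, map[string]string{"web": "node-a"})
	fmt.Println(len(s.podsByNode), s.podsByNode["node-a"]) // 2 [web]
}
```

A single entry point for initializing state also leaves room to pass additional object kinds later, which is the motivation the commit message gives for DRA.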
Kuba Tużnik 269c7a339e CA: remove AddNodeWithPods from ClusterSnapshot, replace uses with AddNodeInfo
We need AddNodeInfo in order to propagate DRA objects through the
snapshot, which makes AddNodeWithPods redundant.
2024-11-19 15:27:59 +01:00
Kubernetes Prow Robot a01276ef14
Merge pull request #7493 from BigDarkClown/remove-unneeded
Add flag to force remove long unregistered nodes
2024-11-19 10:00:55 +00:00
Kubernetes Prow Robot 2d37aeefe8
Merge pull request #7385 from jlamillan/jlamillan/oci_sdk_65.75.2-2
Upgrade OCI providers SDK to v65.75.2.
2024-11-18 23:54:54 +00:00
Bartłomiej Wróblewski 15803158ed Split removeOldUnregisteredNodes method 2024-11-18 16:37:03 +00:00
Bartłomiej Wróblewski a0bf1082b5 Add flag to force remove long unregistered nodes 2024-11-18 13:55:15 +00:00
Bartłomiej Wróblewski c5f13bb02d Add ForceDeleteNodes implementation for GCE cloud provider 2024-11-18 13:55:09 +00:00
Bartłomiej Wróblewski 3b47908e51 Add ForceDeleteNodes method to NodeGroup interface 2024-11-18 13:55:07 +00:00
Patryk Bundyra 3b371f7179 Add unit test 2024-11-18 10:37:09 +00:00
Omer Aplatony f796c7d703 VPA: Replace Poll/PollImmediate with PollUntilContextTimeout in v1 e2e tests
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-11-16 13:56:35 +02:00
Maxim Rubchinsky dcd6d6ab36
7435 Support New Alibaba Cloud ENV Variables names for RRSA Authorization in Cluster Autoscaler
Signed-off-by: Maxim Rubchinsky <maxim@rubchinsky.com>
2024-11-16 11:58:54 +02:00
Kubernetes Prow Robot b01bff1640
Merge pull request #7453 from gvnc/oci-self-managed-nodes-fix
exclude self-managed nodes from being processed
2024-11-15 23:32:53 +00:00
Kubernetes Prow Robot 97f4089a6b
Merge pull request #7503 from adrianmoisey/typo
Fix typo in error message
2024-11-15 20:42:53 +00:00
Adrian Moisey e749f20623
Fix typo in error message 2024-11-15 21:25:03 +02:00
Kubernetes Prow Robot 009f2b8b16
Merge pull request #7438 from maximrub/bug-7437-alibaba-cloud-endpoint-reloving-logging
7437 Add logging for endpoint resolving errors
2024-11-15 10:10:52 +00:00
Kubernetes Prow Robot 267a0d8a98
Merge pull request #7459 from damikag/update-bootdisk-logs
Change log level of boot disk type and size defaulting in gce_price
2024-11-15 09:54:53 +00:00