83 Commits

Author SHA1 Message Date
Bartłomiej Wróblewski 0fb897b839 Update imports after scheduler scheduler/framework/v1alpha1 removal 2020-11-30 10:48:52 +00:00
Michael McCune d8d064f6bc refactor CAPI controller unit test to use PollImmediate
This change removes the `PollImmediateInfinite` calls in the cluster-api
controller unit tests in favor of `PollImmediate`. It is being proposed
to prevent an edge case where the polling calls would become blocked
indefinitely. As we are using fake clients within the unit tests there
should be no delay getting a return value, but just in case there is a
miss on the poll function, the new `PollImmediate` will time out after 15
seconds.
2020-11-04 16:24:17 -05:00
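Note on the commit above: the difference between an unbounded poll and a bounded one can be sketched with the standard library alone. This is a minimal illustration, not the test code from the commit. `pollImmediate`, `errPollTimeout` and `runDemo` are hypothetical names, and the helper only mimics the general shape of `wait.PollImmediate` from k8s.io/apimachinery (check once immediately, then at each interval, and stop at the timeout).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errPollTimeout is a hypothetical sentinel returned when the condition
// never succeeds before the timeout.
var errPollTimeout = errors.New("timed out waiting for the condition")

// pollImmediate is a hypothetical, stdlib-only stand-in for a bounded poll:
// it checks cond right away, then once per interval, and gives up after timeout.
func pollImmediate(interval, timeout time.Duration, cond func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := cond()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		// Stop if the next check would land past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return errPollTimeout
		}
		time.Sleep(interval)
	}
}

// runDemo polls one condition that succeeds on its third call and one that
// never succeeds, and reports both results plus the call count of the first.
func runDemo() (okErr, timeoutErr error, calls int) {
	okErr = pollImmediate(time.Millisecond, time.Second, func() (bool, error) {
		calls++
		return calls >= 3, nil
	})
	timeoutErr = pollImmediate(time.Millisecond, 20*time.Millisecond, func() (bool, error) {
		return false, nil
	})
	return okErr, timeoutErr, calls
}

func main() {
	okErr, timeoutErr, calls := runDemo()
	fmt.Println(okErr, timeoutErr, calls)
}
```

With an infinite poll the second call in `runDemo` would never return; with a timeout the test fails with an error instead of hanging.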
Jason DeTiberus 06e5f6a0ed
Update group identifier to use for Cluster API annotations
- Also add backwards compatibility for the previously used deprecated annotations
2020-09-21 10:42:46 -04:00
Jason DeTiberus 150dbdeb64
[cluster-autoscaler] Support using --cloud-config for clusterapi provider
- Leverage --cloud-config to allow for providing a separate kubeconfig for Cluster API management and workload cluster resources
- Allow for fallback to previous behavior when --cloud-config is not specified for backward compatibility
- Provides a --clusterapi-cloud-config-authoritative flag to disable the above fallback behavior and allow for both the management and workload cluster clients to use the in-cluster config
2020-09-21 10:38:06 -04:00
Jason DeTiberus 75b850718f
Add node autodiscovery to cluster-autoscaler clusterapi provider 2020-08-20 16:08:49 -04:00
Jason DeTiberus 63f9e40d82
Improve Cluster API tests to work better with constrained resources 2020-08-19 13:31:32 -04:00
Jason DeTiberus 18d44fc532
Convert clusterapi provider to use unstructured
Remove internal types for Cluster API and replace with unstructured access
2020-07-21 15:49:03 -04:00
Ben Moss d97e3dc221
Add sample deployment/service account manifest
Based on https://notes.elmiko.dev/2020/05/22/kubernetes-autoscaler-capd.html
2020-07-15 19:31:13 +00:00
Maciek Pytel a548be8d91 Fix typo in documentation
It was blocking presubmits
2020-06-08 14:03:26 +02:00
Maciek Pytel 655b4081f4 Migrate to klog v2 2020-06-05 17:22:26 +02:00
Michael McCune aab6973f86 add CAPI prerequisites section to cluster-autoscaler README
This change adds a section to the cluster-autoscaler CAPI provider
README which details the required prerequisites for using the
autoscaler. It is being added to help inform users about the
restrictions that are currently in place with regard to using this
provider.
2020-06-04 12:00:10 -04:00
Michael McCune 1a62952003 remove redundant error checks in mark/unmark deletion functions
This change removes a few nil checks against resources returned in the
Mark and Unmark deletion functions of the cluster-autoscaler CAPI
provider. These checks look to see if the returned value for a resource
is nil, but the function will not return nil unless it also returns an
error[0]. We only need to check the error return as discussed here[1].

[0]
https://github.com/kubernetes/client-go/blob/master/dynamic/simple.go#L234
[1]
https://github.com/openshift/kubernetes-autoscaler/pull/141/files#r414480960
2020-06-03 15:07:16 -04:00
Michael McCune abbb26a93c Improve delete node mechanisms in cluster-autoscaler CAPI provider
This change adds a function to remove the annotations associated with
marking a node for deletion. It also adds logic to unmark a node in the
event that an error is returned after the node has been annotated but
before it has been removed. In the case where a node cannot be removed
(e.g. due to minimum size), the node is unmarked before we return from the
error condition.
2020-06-03 15:05:58 -04:00
Joel Speed be6edb4a3e Rewrite DeleteNodesTwice test to check API not TargetSize for
cluster-autoscaler CAPI provider
2020-06-02 14:51:11 -04:00
Enxebre dac1f7d47e Compare against minSize in deleteNodes() in cluster-autoscaler CAPI
provider

When calling deleteNodes() we should fail early if the operation could delete nodes below the nodeGroup minSize().

This is one in a series of PRs to mitigate kubernetes#3104
2020-06-02 14:48:48 -04:00
Enxebre 9c8b78aa79 Get replicas always from API server for cluster-autoscaler CAPI provider
When getting Replicas(), the local struct in the scalable resource might be stale. To mitigate possible side effects, we always want to fetch a fresh replica count.

This is one in a series of PRs to mitigate kubernetes#3104
2020-06-02 14:45:58 -04:00
Michael McCune f1407a1b50 Add mutex to DeleteNodes in cluster-autoscaler CAPI provider
This change adds a mutex to the MachineController structure which is
used to gate access to the DeleteNodes function.

This is one in a series of PRs to mitigate kubernetes#3104
2020-06-02 13:58:47 -04:00
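Note on the commit above: the effect of gating deletions with a mutex can be sketched as follows. This is an illustration under stated assumptions, not the provider's code. The `machineController` struct here is a hypothetical stand-in that holds only a lock and two counters, and the minimum-size check is borrowed from a neighbouring commit in this series purely to give the critical section something to protect.

```go
package main

import (
	"fmt"
	"sync"
)

// machineController is a hypothetical stand-in for the provider's controller;
// only the mutex that gates deletions is modelled here.
type machineController struct {
	deleteLock sync.Mutex
	replicas   int
	minSize    int
}

// deleteNode removes one replica while holding the lock, so the size check
// and the decrement cannot interleave with another caller.
func (c *machineController) deleteNode() error {
	c.deleteLock.Lock()
	defer c.deleteLock.Unlock()
	if c.replicas-1 < c.minSize {
		return fmt.Errorf("cannot go below minimum size %d", c.minSize)
	}
	c.replicas--
	return nil
}

// concurrentDeletes starts n goroutines that each try to delete one node and
// returns how many succeeded and the final replica count.
func concurrentDeletes(replicas, minSize, n int) (succeeded, remaining int) {
	c := &machineController{replicas: replicas, minSize: minSize}
	results := make(chan error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- c.deleteNode()
		}()
	}
	wg.Wait()
	close(results)
	for err := range results {
		if err == nil {
			succeeded++
		}
	}
	return succeeded, c.replicas
}

func main() {
	succeeded, remaining := concurrentDeletes(5, 3, 10)
	fmt.Println(succeeded, remaining)
}
```

Because the check and the decrement happen under one lock, ten concurrent callers against a group of five replicas with a minimum of three can remove only two nodes in total. Without the lock, several callers could pass the check at once.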
Kubernetes Prow Robot 0f504d38c5
Merge pull request #3057 from JoelSpeed/external-node-ids
CAPI: Do not normalize Node IDs outside of CAPI provider
2020-05-27 07:28:40 -07:00
Michael McCune cc1dbb9c3d add readme file for cluster-api provider
This change adds a readme which describes the basic operation of the
cluster-api provider.
2020-04-28 10:38:47 -04:00
Jakub Tużnik 73a5cdf928 Address recent breaking changes in scheduler
The following things changed in scheduler and needed to be fixed:
* NodeInfo was moved to schedulerframework
* Some fields on NodeInfo are now exposed directly instead of via getters
* NodeInfo.Pods is now a list of *schedulerframework.PodInfo, not *apiv1.Pod
* SharedLister and NodeInfoLister were moved to schedulerframework
* PodLister was removed
2020-04-24 17:54:47 +02:00
Joel Speed 5e0126ada5
Do not normalize Node IDs outside of CAPI provider 2020-04-16 10:32:27 +01:00
Joel Speed d23d3a1dd5
Add testing for fake provider IDs 2020-04-02 15:24:57 +01:00
Joel Speed 8283e80da7
Provide fake provider IDs for failed machines 2020-04-02 15:24:15 +01:00
Enxebre 1a16bbf4a9 Let the controller move on if machineDeployments are not available
There might be ad hoc environments where machineDeployments are not available. This lets the controller remain functional in such scenarios.
2020-03-20 15:44:54 +01:00
Enxebre dfbb0491df CAPI: Stop panicking in newMachineController 2020-03-18 14:12:49 +01:00
Michael McCune 7082cfee81 Add the ability to override CAPI group via env variable and discover API version.
This change adds detection for an environment variable to specify the group for the clusterapi resources. If the environment
variable `CAPI_GROUP` is specified, then it will
be used instead of the default.
This also decouples the API group from the version and lets the latter be discovered dynamically.
2020-03-16 14:58:54 +01:00
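Note on the commit above: an environment-variable override with a fallback can be sketched like this. Only the variable name `CAPI_GROUP` comes from the commit message. The function name `capiGroup`, the injected lookup parameter and the default value shown are assumptions for illustration, and the dynamic discovery of the API version is not modelled.

```go
package main

import (
	"fmt"
	"os"
)

const (
	// capiGroupEnvVar is the variable named in the commit message.
	capiGroupEnvVar = "CAPI_GROUP"
	// defaultCAPIGroup is an assumed default, used here only for illustration.
	defaultCAPIGroup = "cluster.x-k8s.io"
)

// capiGroup returns the API group to use. The lookup function is injected so
// the logic can be exercised without touching the real environment;
// os.LookupEnv has the matching signature.
func capiGroup(lookup func(string) (string, bool)) string {
	if group, ok := lookup(capiGroupEnvVar); ok && group != "" {
		return group
	}
	return defaultCAPIGroup
}

func main() {
	// Prints the override when CAPI_GROUP is set, otherwise the assumed default.
	fmt.Println(capiGroup(os.LookupEnv))
}
```

Treating an empty value the same as an unset variable is a design choice of this sketch, not something the commit message states.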
Andrew McDermott d9e3197daa Normalize providerID values
We index on providerID but it turns out that those values on node and
machine are not always consistent. Some encode region, some do not,
for example.

This commit normalizes all values through the normalizedProviderString().

To ensure that we catch all places I've introduced a new type and made
the find() functions take this new type in lieu of a string. Unit
tests have also been adjusted to introduce a 'test:///' prefix on the
providerID value to further validate the change.

This change allows CAPI to work out-of-the-box, assuming v1alpha2.

It's also reasonable to assert that this consistency should be
enforced elsewhere and to make this behaviour easily revertable I'm
leaving this as a separate commit in this patch series.
2020-03-10 10:59:05 +00:00
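Note on the commit above: the idea of a dedicated type that forces every lookup through a normalizing function can be sketched as follows. `normalizedProviderString()` and `find()` are named in the commit message. The type name `normalizedProviderID`, the `machineIndex` map and the rule of keeping only the last path segment are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizedProviderID is a hypothetical name for the new type: a provider ID
// that has already been passed through normalizedProviderString.
type normalizedProviderID string

// normalizedProviderString keeps only the last "/"-separated segment, so IDs
// that differ only by an encoded region compare equal. The exact rule is an
// assumption of this sketch.
func normalizedProviderString(s string) normalizedProviderID {
	parts := strings.Split(s, "/")
	return normalizedProviderID(parts[len(parts)-1])
}

// machineIndex is a hypothetical index from normalized provider ID to machine name.
type machineIndex map[normalizedProviderID]string

// find accepts only the normalized type, so passing a raw string variable is
// rejected by the compiler.
func (idx machineIndex) find(id normalizedProviderID) (string, bool) {
	name, ok := idx[id]
	return name, ok
}

func main() {
	idx := machineIndex{
		normalizedProviderString("test:///us-east-1a/i-0abc"): "machine-a",
	}
	// The node reports the same instance without a region segment.
	name, ok := idx.find(normalizedProviderString("test:///i-0abc"))
	fmt.Println(name, ok)
}
```

Making `find` take the new type instead of `string` turns a missed normalization into a compile error at the call site, which matches the commit's stated goal of catching every place that needs it.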
Andrew McDermott c5fa2b4cba Update OWNERS 2020-03-10 10:59:05 +00:00
Enxebre 7ba979866a Make machine API swappable as an env variable 2020-03-10 10:59:05 +00:00
Joel Speed eae1579100 Ensure DeleteNodes doesn't delete a node twice 2020-03-10 10:59:05 +00:00
Enxebre 699c0b83b4 Let Nodes() return the list of all machines
The autoscaler expects each provider's nodeGroup implementation of Nodes() to return the instances belonging to the group, regardless of whether they have become Kubernetes nodes.
This information is then used, for instance, to detect unregistered nodes: bf3a9fb52e/cluster-autoscaler/clusterstate/clusterstate.go (L307-L311)
2020-03-10 10:59:05 +00:00
Andrew McDermott f83d0dd810 cloudprovider/clusterapi: copy cluster-api v1alpha types
These are copied to facilitate testing. They are not meant to reflect
upstream clusterapi/v1alpha1 - in fact, fields have been removed. They
are here to support the switch to unstructured types in the tests
without having to rewrite all of the unit tests.
2020-03-10 10:59:04 +00:00
Andrew McDermott 46bb9b4f29 cloudprovider/clusterapi: new provider
This adds a new cloudprovider based on the cluster-api project:

  https://github.com/kubernetes-sigs/cluster-api
2020-03-10 10:59:04 +00:00