I hit an odd IAM failure yesterday, and this information would have been
helpful. It only proved a negative - it turned out to be an AWS problem
that was solved by deleting and recreating the IAM roles - but it still
makes diagnosis much easier.
The update in #1444 didn't add the security groups to the terraform
output, meaning that if you ran `kops --target terraform` you only got
the standard security group.
This:
- reworks how retries are handled in fi/executor.go to a time-based scheme
- changes the single-task limit to 10m (from about 30s of no-progress)
- eliminates the inner IAM propagation retry for LaunchConfigurations,
because the task itself will just be redriven for a while. This also
eliminates any long-pole delay caused by this error (since task Run()
should be 'fast').
Beginnings of a mock for the AWSCloud, so that hopefully we aren't
calling out to AWS at all in the tests. We will likely start mocking
the actual EC2 APIs in future, but this seems a good starting point.
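
A minimal sketch of the idea, assuming a much smaller interface than the
real AWSCloud. `Cloud`, `MockCloud`, `FindVPCID`, and `describeVPC` are
hypothetical names chosen for illustration. Code under test depends on the
interface, and the test supplies an in-memory implementation instead of
calling AWS.

```go
package main

import "fmt"

// Cloud is a hypothetical, much-reduced interface; the real AWSCloud
// exposes far more than this.
type Cloud interface {
	FindVPCID(name string) (string, error)
}

// MockCloud answers lookups from an in-memory map and counts calls, so
// tests never reach AWS.
type MockCloud struct {
	VPCs  map[string]string
	Calls int
}

func (m *MockCloud) FindVPCID(name string) (string, error) {
	m.Calls++
	id, ok := m.VPCs[name]
	if !ok {
		return "", fmt.Errorf("vpc %q not found", name)
	}
	return id, nil
}

// describeVPC is example code under test: it only sees the interface.
func describeVPC(c Cloud, name string) string {
	id, err := c.FindVPCID(name)
	if err != nil {
		return name + " -> error: " + err.Error()
	}
	return name + " -> " + id
}

func main() {
	mock := &MockCloud{VPCs: map[string]string{"prod": "vpc-123"}}
	fmt.Println(describeVPC(mock, "prod"))
	fmt.Println(describeVPC(mock, "dev"))
}
```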
Fix #425
This allows for a larger EBS root volume (and we now default to 20GB,
just like kube-up did).
We remove the BlockDeviceMappings support because it wasn't used and
made things a lot more complicated. We always map the ephemeral
devices.
Issue #24
IAM instance profile creation is very async, and this causes dependent
resources to fail. That's fine - we have good retry logic - but we
should output a less frightening error message.
Issue #35
When we retry a task, we run the Run method again. But in this case,
the Run method had already populated some default values. Only warn if
the values we are populating are different, to avoid spurious warnings.
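
The check can be sketched roughly as follows. `populate` is a hypothetical
helper, not the function used in the task code, and overwriting on a
mismatch is an assumption made for the example. A retry that writes the
same value again stays silent, and only a genuinely different value asks
for a warning.

```go
package main

import "fmt"

// populate stores value in *field and reports whether the caller should
// warn. It warns only when the field already held a different value; an
// empty field or an identical value (the retry case) is silent.
func populate(field *string, value string) bool {
	previous := *field
	*field = value
	return previous != "" && previous != value
}

func main() {
	var imageID string
	fmt.Println(populate(&imageID, "ami-1")) // first Run: field was empty
	fmt.Println(populate(&imageID, "ami-1")) // retried Run: same value
	fmt.Println(populate(&imageID, "ami-2")) // value changed
}
```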
We probably need a stronger lifecycle - for example having a Validate
method would probably be helpful.
Fix #48
This way we can output a LaunchConfiguration prefix into terraform that
we can then read later, so that we can create with terraform and then
transfer to another mode of operation if desired.