Merge pull request #8902 from olemarkus/doc-fixes-2

Clean up the spec docs
This commit is contained in:
Kubernetes Prow Robot 2020-04-15 10:16:04 -07:00 committed by GitHub
commit b42bd45b2f
4 changed files with 569 additions and 655 deletions


@ -1,10 +1,12 @@
# Description of Keys in `config` and `cluster.spec`
# The `Cluster` resource
This list is not complete but aims to document any keys that are less than self-explanatory. Our [go.dev](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops) reference provides a more detailed list of API values. [ClusterSpec](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#ClusterSpec), defined as `kind: Cluster` in YAML, and [InstanceGroupSpec](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#InstanceGroupSpec), defined as `kind: InstanceGroup` in YAML, are the two top-level API values used to describe a cluster.
The `Cluster` resource contains the specification of the cluster itself.
## spec
The complete list of keys can be found at the [Cluster](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#ClusterSpec) reference page.
### api
On this page, we will expand on the more important configuration keys.
## api
This object configures how we expose the API:
@ -76,9 +78,11 @@ spec:
crossZoneLoadBalancing: true
```
### etcdClusters v3 & tls
## etcdClusters
Although kops doesn't presently default to etcd3, it is possible to turn on both v3 and TLS authentication for communication amongst cluster members. These options may be enabled via the cluster spec (manifests only i.e. no command line options as yet). An upfront warning; at present no upgrade path exists for migrating from v2 to v3 so **DO NOT** try to enable this on a v2 running cluster as it must be done on cluster creation. The below example snippet assumes a HA cluster of three masters.
### The default etcd configuration
Kops defaults to etcd v3 with TLS enabled. etcd provisioning and upgrades are handled by etcd-manager. By default, the spec looks like this:
```yaml
etcdClusters:
@ -89,9 +93,7 @@ etcdClusters:
name: a-2
- instanceGroup: master0-az1
name: b-1
enableEtcdTLS: true
name: main
version: 3.0.17
- etcdMembers:
- instanceGroup: master0-az0
name: a-1
@ -99,16 +101,14 @@ etcdClusters:
name: a-2
- instanceGroup: master0-az1
name: b-1
enableEtcdTLS: true
name: events
version: 3.0.17
```
> __Note:__ The images for etcd that kops uses are from the Google Cloud Repository. Google doesn't release every version of etcd to the gcr. Check that the version of etcd you want to use is available [at the gcr](https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/etcd?gcrImageListsize=50) before using it in your cluster spec.
The etcd version used by kops follows the recommended etcd version for the given kubernetes version. It is possible to override this by adding the `version` key to each of the etcd clusters.
By default, the volumes created for the etcd clusters are `gp2` and 20GB each. The volume size, type, and IOPS (for `io1`) can be configured via their parameters. Conversion between `gp2` and `io1` is not supported, nor are size changes.
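For example, a minimal sketch of overriding the etcd version and the member volume settings (the version number, volume values, and instance group name below are placeholders, not recommendations):

```yaml
etcdClusters:
- etcdMembers:
  - instanceGroup: master-us-east-1a
    name: a
    volumeType: gp2   # one of the supported EBS volume types
    volumeSize: 30    # in GB; the default is 20
  name: main
  version: 3.3.10     # overrides the version recommended by kops
```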
As of Kops 1.12.0 it is also possible to specify the requests for your etcd cluster members using the `cpuRequest` and `memoryRequest` parameters.
As of Kops 1.12.0 it is also possible to modify the requests for your etcd cluster members using the `cpuRequest` and `memoryRequest` parameters.
```yaml
etcdClusters:
@ -130,9 +130,9 @@ etcdClusters:
memoryRequest: 512Mi
```
### etcd v3 and metrics
### etcd metrics
You can expose the /metrics endpoint for the `main` and `event` etcd instances and control their type (`basic` or `extensive`) by defining env vars:
You can expose the /metrics endpoint for the etcd instances and control their type (`basic` or `extensive`) by defining env vars:
```yaml
etcdClusters:
@ -148,7 +148,7 @@ etcdClusters:
value: basic
```
### sshAccess
## sshAccess
This array configures the CIDRs that are able to ssh into nodes. On AWS this is manifested as inbound security group rules on the `nodes` and `master` security groups.
@ -160,7 +160,7 @@ spec:
- 12.34.56.78/32
```
### kubernetesApiAccess
## kubernetesApiAccess
This array configures the CIDRs that are able to access the kubernetes API. On AWS this is manifested as inbound security group rules on the ELB or master security groups.
@ -172,12 +172,12 @@ spec:
- 12.34.56.78/32
```
### cluster.spec Subnet Keys
## cluster.spec Subnet Keys
#### id
### id
ID of a subnet to share in an existing VPC.
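A minimal sketch of sharing an existing subnet (the subnet ID, name, and zone below are placeholders):

```yaml
spec:
  subnets:
  - id: subnet-12345678
    name: us-east-1a
    type: Private
    zone: us-east-1a
```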
#### egress
### egress
The resource identifier (ID) of something in your existing VPC that you would like to use as "egress" to the outside world.
This feature was originally envisioned to allow re-use of NAT gateways. In this case, the usage is as follows. Although NAT gateways are "public"-facing resources, in the Cluster spec, you must specify them in the private subnet section. One way to think about this is that you are specifying "egress", which is the default route out from this private subnet.
@ -209,7 +209,7 @@ spec:
zone: us-east-1a
```
#### publicIP
### publicIP
The IP of an existing EIP that you would like to attach to the NAT gateway.
```
@ -222,11 +222,11 @@ spec:
zone: us-east-1a
```
### kubeAPIServer
## kubeAPIServer
This block contains configuration for the `kube-apiserver`.
#### oidc flags for Open ID Connect Tokens
### oidc flags for Open ID Connect Tokens
Read more about this here: https://kubernetes.io/docs/admin/authentication/#openid-connect-tokens
@ -244,7 +244,7 @@ spec:
- "key=value"
```
#### audit logging
### audit logging
Read more about this here: https://kubernetes.io/docs/admin/audit
@ -264,7 +264,7 @@ You could use the [fileAssets](https://github.com/kubernetes/kops/blob/master/do
Example policy file can be found [here](https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/audit/audit-policy.yaml)
#### dynamic audit configuration
### dynamic audit configuration
Read more about this here: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#dynamic-backend
@ -282,7 +282,7 @@ You could use the [fileAssets](https://github.com/kubernetes/kops/blob/master/do
Example policy file can be found [here](https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/audit/audit-policy.yaml)
#### bootstrap tokens
### bootstrap tokens
Read more about this here: https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/
@ -298,7 +298,7 @@ By enabling this feature you instructing two things;
**Note** enabling bootstrap tokens does not provision bootstrap tokens for the worker nodes. Under this configuration it is assumed a third-party process is provisioning the tokens on behalf of the worker nodes. For the full setup please read [Node Authorizer Service](node_authorization.md)
#### Max Requests Inflight
### Max Requests Inflight
The maximum number of non-mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero for no limit. (default 400)
@ -316,7 +316,7 @@ spec:
maxMutatingRequestsInflight: 450
```
#### runtimeConfig
### runtimeConfig
Keys and values here are translated into `--runtime-config` values for `kube-apiserver`, separated by commas.
@ -332,7 +332,7 @@ spec:
Will result in the flag `--runtime-config=batch/v2alpha1=true,apps/v1alpha1=true`. Note that `kube-apiserver` accepts `true` as a value for switch-like flags.
#### serviceNodePortRange
### serviceNodePortRange
This value is passed as `--service-node-port-range` for `kube-apiserver`.
@ -342,7 +342,7 @@ spec:
serviceNodePortRange: 30000-33000
```
#### Disable Basic Auth
### Disable Basic Auth
Support for basic authentication was removed in Kubernetes 1.19. For previous versions
of Kubernetes this will disable the passing of the `--basic-auth-file` flag when:
@ -353,7 +353,7 @@ spec:
disableBasicAuth: true
```
#### targetRamMb
### targetRamMb
Memory limit for apiserver in MB (used to configure sizes of caches, etc.)
@ -363,7 +363,7 @@ spec:
targetRamMb: 4096
```
#### eventTTL
### eventTTL
How long API server retains events. Note that you must fill empty units of time with zeros.
@ -373,7 +373,7 @@ spec:
eventTTL: 03h0m0s
```
### externalDns
## externalDns
This block contains configuration options for your `external-DNS` provider.
The current external-DNS provider is the kops `dns-controller`, which can set up DNS records for Kubernetes resources.
@ -387,7 +387,7 @@ spec:
The default _kops_ behavior is `watchIngress: false`. `watchIngress: true` uses the default _dns-controller_ behavior, which is to watch the ingress controller for changes. Setting this option risks interrupting Service updates in some cases.
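A minimal sketch of enabling this option:

```yaml
spec:
  externalDns:
    watchIngress: true
```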
### kubelet
## kubelet
This block contains configurations for `kubelet`. See https://kubernetes.io/docs/admin/kubelet/
@ -400,7 +400,7 @@ NOTE: Where the corresponding configuration value can be empty, fields can be se
Will result in the flag `--resolv-conf=` being built.
#### Disable CPU CFS Quota
### Disable CPU CFS Quota
To disable CPU CFS quota enforcement for containers that specify CPU limits (default true) we have to set the flag `--cpu-cfs-quota` to `false`
on all the kubelets. We can specify that in the `kubelet` spec in our cluster.yml.
@ -410,7 +410,7 @@ spec:
cpuCFSQuota: false
```
#### Configure CPU CFS Period
### Configure CPU CFS Period
Configure CPU CFS quota period value (cpu.cfs_period_us). Example:
```
@ -419,7 +419,7 @@ spec:
cpuCFSQuotaPeriod: "100ms"
```
#### Enable Custom metrics support
### Enable Custom metrics support
To use custom metrics in Kubernetes as per the [custom metrics doc](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-custom-metrics),
we have to set the flag `--enable-custom-metrics` to `true` on all the kubelets. We can specify that in the `kubelet` spec in our cluster.yml.
@ -429,7 +429,7 @@ spec:
enableCustomMetrics: true
```
#### Setting kubelet CPU management policies
### Setting kubelet CPU management policies
Kops 1.12.0 added support for enabling CPU management policies in Kubernetes as per the [cpu management doc](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#cpu-management-policies).
To use them, set the flag `--cpu-manager-policy` to the appropriate value on all the kubelets. This must be specified in the `kubelet` spec in our cluster.yml.
@ -439,7 +439,7 @@ spec:
cpuManagerPolicy: static
```
#### Setting kubelet configurations together with the Amazon VPC backend
### Setting kubelet configurations together with the Amazon VPC backend
Setting kubelet configurations together with the Amazon VPC networking backend also requires setting `cloudProvider: aws` in this block. Example:
```yaml
@ -456,7 +456,7 @@ spec:
amazonvpc: {}
```
#### Configure a Flex Volume plugin directory
### Configure a Flex Volume plugin directory
An optional flag can be provided within the KubeletSpec to set a volume plugin directory (which must be accessible for read/write operations). It is additionally provided to the Controller Manager and mounted accordingly.
Kops will set this for you based on the operating system in use:
@ -471,7 +471,7 @@ spec:
volumePluginDirectory: /provide/a/writable/path/here
```
### kubeScheduler
## kubeScheduler
This block contains configurations for `kube-scheduler`. See https://kubernetes.io/docs/admin/kube-scheduler/
@ -485,7 +485,7 @@ Will make kube-scheduler use the scheduler policy from configmap "scheduler-poli
Note that as of Kubernetes 1.8.0 kube-scheduler does not reload its configuration from configmap automatically. You will need to ssh into the master instance and restart the Docker container manually.
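For reference, a minimal sketch of enabling the policy ConfigMap, assuming the `usePolicyConfigMap` field from the API reference:

```yaml
spec:
  kubeScheduler:
    usePolicyConfigMap: true
```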
### kubeDNS
## kubeDNS
This block contains configurations for `kube-dns`.
@ -543,6 +543,8 @@ spec:
**Note:** If you are upgrading to CoreDNS, kube-dns will be left in place and must be removed manually (you can scale the kube-dns and kube-dns-autoscaler deployments in the `kube-system` namespace to 0 as a starting point). The `kube-dns` Service itself should be left in place, as this retains the ClusterIP and eliminates the possibility of DNS outages in your cluster. If you would like to continue autoscaling, update the `kube-dns-autoscaler` Deployment container command for `--target=Deployment/kube-dns` to be `--target=Deployment/coredns`.
## Node local DNS cache
If you are using CoreDNS, you can enable NodeLocal DNSCache. It improves the cluster DNS performance by running a DNS caching agent on cluster nodes as a DaemonSet.
```yaml
@ -563,7 +565,7 @@ spec:
clusterDNS: 169.254.20.10
```
### kubeControllerManager
## kubeControllerManager
This block contains configurations for the `controller-manager`.
```yaml
@ -579,7 +581,9 @@ spec:
For more details on `horizontalPodAutoscaler` flags see the [official HPA docs](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) and the [Kops guides on how to set it up](horizontal_pod_autoscaling.md).
#### Feature Gates
### Feature Gates
Feature gates can be configured on the kubelet.
```yaml
spec:
@ -589,12 +593,10 @@ spec:
AllowExtTrafficLocalEndpoints: "false"
```
Will result in the flag `--feature-gates=Accelerators=true,AllowExtTrafficLocalEndpoints=false`
The above will result in the flag `--feature-gates=Accelerators=true,AllowExtTrafficLocalEndpoints=false` being added to the kubelet.
NOTE: Feature gate `ExperimentalCriticalPodAnnotation` is enabled by default because some critical components like `kube-proxy` depend on its presence.
Some feature gates also require the `featureGates` setting to be used on other components - e.g. `PodShareProcessNamespace` requires
the feature gate to be enabled on the api server:
Some feature gates also require the `featureGates` setting on other components. For example, `PodShareProcessNamespace` requires
the feature gate to also be enabled on the API server:
```yaml
spec:
@ -608,7 +610,7 @@ spec:
For more information, see the [feature gate documentation](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/)
#### Compute Resources Reservation
### Compute Resources Reservation
```yaml
spec:
@ -630,7 +632,7 @@ Will result in the flag `--kube-reserved=cpu=100m,memory=100Mi,ephemeral-storage
Learn [more about reserving compute resources](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/).
### networkID
## networkID
On AWS, this is the id of the VPC the cluster is created in. If creating a cluster from scratch, this field does not need to be specified at create time; `kops` will create a `VPC` for you.
@ -641,9 +643,9 @@ spec:
More information about running in an existing VPC is [here](run_in_existing_vpc.md).
### hooks
## hooks
Hooks allow for the execution of an action before the installation of Kubernetes on every node in a cluster. For instance, you can install Nvidia drivers for using GPUs. These hooks can be in the form of Docker images or manifest files (systemd units). Hooks can be placed either in the cluster spec, meaning they will be globally deployed, or in the instanceGroup specification. Note: service names on the instanceGroup which overlap with the cluster spec take precedence and ignore the cluster spec definition, i.e. if you have a unit file 'myunit.service' in the cluster spec and one in the instanceGroup, only the instanceGroup definition is applied.
When creating a systemd unit hook using the `manifest` field, the hook system will construct a systemd unit file for you. It creates the `[Unit]` section, adding an automated description and setting `Before` and `Requires` values based on the `before` and `requires` fields. The value of the `manifest` field is used as the `[Service]` section of the unit file. To override this behavior, and instead specify the entire unit file yourself, you may specify `useRawManifest: true`. In this case, the contents of the `manifest` field will be used as a systemd unit, unmodified. The `before` and `requires` fields may not be used together with `useRawManifest`.
@ -742,9 +744,9 @@ spec:
image: busybox
```
### fileAssets
## fileAssets
FileAssets is an alpha feature which permits you to place inline file content into the cluster and instanceGroup specification. It's designated as alpha as you can probably do this via kubernetes daemonsets as an alternative.
FileAssets permits you to place inline file content into the cluster and instanceGroup specifications. This is useful for deploying additional configuration files that Kubernetes components require, such as audit logging or admission controller configurations.
```yaml
spec:
@ -758,9 +760,9 @@ spec:
```
### cloudConfig
## cloudConfig
#### disableSecurityGroupIngress
### disableSecurityGroupIngress
If you are using AWS as the `cloudProvider`, you can disable the authorization of the ELB security group to the Kubernetes nodes security group. In other words, kops will not add the security group rule.
This can be useful to avoid hitting the AWS limit of 50 rules per security group.
```yaml
@ -769,8 +771,7 @@ spec:
disableSecurityGroupIngress: true
```
#### elbSecurityGroup
*WARNING: this works only for Kubernetes version above 1.7.0.*
### elbSecurityGroup
To avoid creating a security group per ELB, you can specify a security group ID that will be assigned to your LoadBalancer. It must be a security group ID, not a name.
`api.loadBalancer.additionalSecurityGroups` must be empty, because Kubernetes will add rules for the ports that are specified in the Service definition.
@ -782,8 +783,7 @@ spec:
elbSecurityGroup: sg-123445678
```
### containerRuntime
*WARNING: this works only for Kubernetes version above 1.11.0.*
## containerRuntime
Alternative [container runtimes](https://kubernetes.io/docs/setup/production-environment/container-runtimes/) can be used to run Kubernetes. Docker is still the default container runtime, but [containerd](https://kubernetes.io/blog/2018/05/24/kubernetes-containerd-integration-goes-ga/) can also be selected.
@ -792,7 +792,7 @@ spec:
containerRuntime: containerd
```
### containerd
## containerd
It is possible to override the [containerd](https://github.com/containerd/containerd/blob/master/README.md) daemon options for all the nodes in the cluster. See the [API docs](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#ContainerdConfig) for the full list of options.
@ -804,11 +804,11 @@ spec:
configOverride: ""
```
### docker
## docker
It is possible to override Docker daemon options for all masters and nodes in the cluster. See the [API docs](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#DockerConfig) for the full list of options.
#### registryMirrors
### registryMirrors
If you have a number of Docker hosts (physical or VMs) running, each time one of them pulls an image that is not present on the host, it will fetch it from the internet (DockerHub). By caching these images, you can keep the traffic within your local network and avoid egress bandwidth usage.
This setting benefits not only cluster provisioning but also image pulling.
@ -823,7 +823,7 @@ spec:
- https://registry.example.com
```
#### Skip Install
### Skip Install
If you want nodeup to skip the Docker installation tasks, you can do so with:
@ -835,7 +835,7 @@ spec:
**NOTE:** When this field is set to `true`, it is entirely up to the user to install and configure Docker.
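A minimal sketch of this setting, assuming the `skipInstall` field under `docker`:

```yaml
spec:
  docker:
    skipInstall: true
```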
#### storage
### storage
The Docker [Storage Driver](https://docs.docker.com/engine/reference/commandline/dockerd/#daemon-storage-driver) can be specified in order to override the default. Be sure the driver you choose is supported by your operating system and docker version.
@ -848,7 +848,7 @@ docker:
- "dm.use_deferred_removal=true"
```
### sshKeyName
## sshKeyName
In some cases, it may be desirable to use an existing AWS SSH key instead of allowing kops to create a new one.
Providing the name of a key already in AWS is an alternative to `--ssh-public-key`.
@ -864,8 +864,7 @@ spec:
sshKeyName: ""
```
### useHostCertificates
## useHostCertificates
In some cases, cloud APIs present self-signed certificates. Set `useHostCertificates: true` to use the host's certificates when communicating with such APIs.
@ -874,7 +873,7 @@ spec:
useHostCertificates: true
```
#### Optional step: add root certificates to instancegroups root ca bundle
### Optional step: add root certificates to the instance groups' root CA bundle
```yaml
additionalUserData:
@ -892,8 +891,7 @@ snip
**NOTE**: `update-ca-certificates` is the command for Debian/Ubuntu. The command differs depending on your OS.
### target
## target
In some use cases you may wish to augment the target output with extra options. `target` supports a minimal set of options for this. Currently only the terraform target supports this, but if other use cases present themselves, kops may eventually support more.
@ -905,11 +903,11 @@ spec:
alias: foo
```
### assets
## assets
Assets define alternative locations from which to retrieve static files and containers.
#### containerRegistry
### containerRegistry
The container registry enables kops / kubernetes to pull containers from a managed registry.
This is useful when pulling containers from the internet is not an option, e.g. because the
@ -925,7 +923,7 @@ spec:
```
#### containerProxy
### containerProxy
The container proxy is designed to act as a [pull through cache](https://docs.docker.com/registry/recipes/mirror/) for docker container assets.
It remaps the Kubernetes image URL to point to your cache so that the docker daemon will pull the image from that location.
@ -939,7 +937,7 @@ spec:
containerProxy: proxy.example.com
```
### Setting Custom Kernel Runtime Parameters
## sysctlParameters
To add custom kernel runtime parameters to all instance groups in the
cluster, specify the `sysctlParameters` field as an array of strings. Each
@ -962,4 +960,3 @@ spec:
```
which would end up in a drop-in file on all masters and nodes of the cluster.


@ -1,400 +1,92 @@
# Instance Groups
# The `InstanceGroup` resource
kops has the concept of "instance groups", which are a group of similar machines. On AWS, they map to
an AutoScalingGroup.
The `InstanceGroup` resource represents a group of similar machines, typically provisioned in the same availability zone. On AWS, an instance group maps directly to an AutoScaling Group.
By default, a cluster has:
The complete list of keys can be found at the [InstanceGroup](https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#InstanceGroupSpec) reference page.
* An instance group called `nodes` spanning all the zones; these instances are your workers.
* One instance group for each master zone, called `master-<zone>` (e.g. `master-us-east-1c`). These normally have
minimum size and maximum size = 1, so they will run a single instance. We do this so that the cloud will
always relaunch masters, even if everything is terminated at once. We have an instance group per zone
because we need to force the cloud to run an instance in every zone, so we can mount the master volumes - we
cannot do that across zones.
You can also find concrete use cases for the configurations on the [Instance Group operations page](instance_groups.md).
## Instance Groups Disclaimer
On this page, we will expand on the more important configuration keys.
* When there is only one availability zone in a region (eu-central-1) and you would like to run multiple masters,
you have to define multiple instance groups for each of those masters. (e.g. `master-eu-central-1-a` and
`master-eu-central-1-b` and so on...)
* If instance groups are not defined correctly (particularly when there is an even number of masters or multiple
groups of masters in one availability zone in a single region), etcd servers will not start and master nodes will not check in. This is because etcd servers are configured per availability zone. DNS and Route53 would be the first places to check when these problems are happening.
## cloudLabels
## Listing instance groups
`kops get instancegroups`
```
NAME ROLE MACHINETYPE MIN MAX ZONES
master-us-east-1c Master 1 1 us-east-1c
nodes Node t2.medium 2 2
```
You can also use the `kops get ig` alias.
## Change the instance type in an instance group
First you edit the instance group spec, using `kops edit ig nodes`. Change the machine type to `t2.large`,
for example. Now if you `kops get ig`, you will see the large instance size. Note though that these changes
have not yet been applied (this may change soon though!).
To preview the change:
`kops update cluster <clustername>`
```
...
Will modify resources:
*awstasks.LaunchConfiguration launchConfiguration/mycluster.mydomain.com
InstanceType t2.medium -> t2.large
```
Presuming you're happy with the change, go ahead and apply it: `kops update cluster <clustername> --yes`
This change will apply to new instances only; if you'd like to roll it out immediately to all the instances
you have to perform a rolling update.
See a preview with: `kops rolling-update cluster`
Then restart the machines with: `kops rolling-update cluster --yes`
This will drain nodes, restart them with the new instance type, and validate them after startup.
## Resize an instance group
The procedure to resize an instance group works the same way:
* Edit the instance group, set minSize and maxSize to the desired size: `kops edit ig nodes`
* Preview changes: `kops update cluster <clustername>`
* Apply changes: `kops update cluster <clustername> --yes`
* (you do not need a `rolling-update` when changing instancegroup sizes)
## Changing the root volume size or type
The default volume size for Masters is 64 GB, while the default volume size for a node is 128 GB.
The procedure to resize the root volume works the same way:
* Edit the instance group, set `rootVolumeSize` and/or `rootVolumeType` to the desired values: `kops edit ig nodes`
* `rootVolumeType` must be one of [supported volume types](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html), e.g. `gp2` (default), `io1` (high performance) or `standard` (for testing).
* If `rootVolumeType` is set to `io1` then you can define the number of Iops by specifying `rootVolumeIops` (defaults to 100 if not defined)
* Preview changes: `kops update cluster <clustername>`
* Apply changes: `kops update cluster <clustername> --yes`
* Rolling update to update existing instances: `kops rolling-update cluster --yes`
For example, to set up a 200GB gp2 root volume, your InstanceGroup spec might look like:
If you need to add tags on auto scaling groups or instances (propagate ASG tags), you can add them in the instance group spec with `cloudLabels`. Cloud labels defined at the cluster spec level will also be inherited.
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
cloudLabels:
billing: infra
environment: dev
```
## suspendProcess
Autoscaling groups automatically include multiple [scaling processes](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-suspend-resume-processes.html#process-types)
that keep our ASGs healthy. In some cases, you may want to disable certain scaling activities.
An example of this is if you are running multiple AZs in an ASG while using a Kubernetes Autoscaler.
The autoscaler will remove specific instances that are not being used. In some cases, the `AZRebalance` process
will rescale the ASG without warning.
```YAML
spec:
suspendProcesses:
- AZRebalance
```
## instanceProtection
Autoscaling groups may scale up or down automatically to balance types of instances, regions, etc.
[Instance protection](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-instance-termination.html#instance-protection) prevents the ASG from being scaled in.
```YAML
spec:
instanceProtection: true
```
## externalLoadBalancers
Instance groups can be linked to up to 10 load balancers. When attached, any instance launched will
automatically register itself to the load balancer. For example, if you create an instance group
dedicated to running an ingress controller exposed on a
[NodePort](https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport), you can
manually create a load balancer and link it to the instance group. Traffic to the load balancer will now
automatically go to one of the nodes.
You can specify either `loadBalancerName` to link the instance group to an AWS Classic ELB or you can
specify `targetGroupArn` to link the instance group to a target group, which are used by Application
load balancers and Network load balancers.
```YAML
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: ingress
spec:
machineType: m4.large
maxSize: 2
minSize: 2
role: Node
rootVolumeSize: 200
rootVolumeType: gp2
externalLoadBalancers:
- targetGroupArn: arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/my-ingress-target-group/0123456789abcdef
- loadBalancerName: my-elb-classic-load-balancer
```
For example, to set up a 200GB io1 root volume with 200 provisioned Iops, your InstanceGroup spec might look like:
## detailedInstanceMonitoring
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
rootVolumeSize: 200
rootVolumeType: io1
rootVolumeIops: 200
```
Detailed monitoring will cause the monitoring data to be available every 1 minute instead of every 5 minutes; see [Enabling Detailed Monitoring](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html). In production environments you may want to consider enabling detailed monitoring for quicker troubleshooting.
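A minimal sketch of enabling it on an instance group:

```YAML
spec:
  detailedInstanceMonitoring: true
```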
## Adding additional storage to the instance groups
As of Kops 1.12.0 you can add additional storage _(note, presently confined to AWS)_ via the instancegroup specification.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: m4.large
...
volumes:
- device: /dev/xvdd
encrypted: true
size: 20
type: gp2
```
In AWS the above example shows how to add an additional 20gb EBS volume, which applies to each node within the instancegroup.
## Automatically formatting and mounting the additional storage
You can add additional storage via the above `volumes` collection though this only provisions the storage itself. Assuming you don't wish to handle the mechanics of formatting and mounting the device yourself _(perhaps via a hook)_ you can utilize the `volumeMounts` section of the instancegroup to handle this for you.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: m4.large
...
volumeMounts:
- device: /dev/xvdd
filesystem: ext4
path: /var/lib/docker
volumes:
- device: /dev/xvdd
encrypted: true
size: 20
type: gp2
```
The above will provision the additional storage, format and mount the device into the node. Note this feature is purposely distinct from `volumes` so that it may be reused in areas such as ephemeral storage. Using a `c5d.large` instance as an example, which comes with a 50gb SSD drive; we can use the `volumeMounts` to mount this into `/var/lib/docker` for us.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: c5d.large
...
volumeMounts:
- device: /dev/nvme1n1
filesystem: ext4
path: /data
# -- mount the instance storage --
- device: /dev/nvme2n1
filesystem: ext4
path: /var/lib/docker
volumes:
- device: /dev/nvme1n1
encrypted: true
size: 20
type: gp2
```
For AWS you can find more information on device naming conventions [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html)
```shell
$ df -h | grep nvme[12]
/dev/nvme1n1 20G 45M 20G 1% /data
/dev/nvme2n1 46G 633M 45G 2% /var/lib/docker
```
> Note: at present its up to the user ensure the correct device names.
## Creating a new instance group
Suppose you want to add a new group of nodes, perhaps with a different instance type. You do this using `kops create ig <InstanceGroupName> --subnet <zone(s)>`. Currently the
`--subnet` flag is required, and it receives the zone(s) of the subnet(s) in which the instance group will be. The command opens an editor with a skeleton configuration, allowing you to edit it before creation.
So the procedure is:
* `kops create ig morenodes --subnet us-east-1a`
or, in case you need it to be in more than one subnet, use a comma-separated list:
* `kops create ig morenodes --subnet us-east-1a,us-east-1b,us-east-1c`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* (no instances need to be relaunched, so no rolling-update is needed)
## Creating a instance group of mixed instances types (AWS Only)
AWS permits the creation of mixed instance EC2 Autoscaling Groups using a [mixed instance policy](https://aws.amazon.com/blogs/aws/new-ec2-auto-scaling-groups-with-multiple-instance-types-purchase-options/), allowing the users to build a target capacity and make up of on-demand and spot instances while offloading the allocation strategy to AWS.
Support for mixed instance groups was added in Kops 1.12.0
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: your.cluster.name
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1911.4.0-hvm
machineType: m4.large
maxSize: 50
minSize: 10
# You can manually set the maxPrice you're willing to pay - it will default to the onDemand price.
maxPrice: "1.0"
# add the mixed instance policy here
mixedInstancesPolicy:
instances:
- m4.xlarge
- m5.large
- m5.xlarge
- t2.medium
onDemandAboveBase: 5
spotInstancePools: 3
```
The mixed instance policy permits setting the following configurable below, but for more details please check against the AWS documentation.
```Go
// MixedInstancesPolicySpec defines the specification for an autoscaling backed by a ec2 fleet
type MixedInstancesPolicySpec struct {
// Instances is a list of instance types which we are willing to run in the EC2 fleet
Instances []string `json:"instances,omitempty"`
// OnDemandAllocationStrategy indicates how to allocate instance types to fulfill On-Demand capacity
OnDemandAllocationStrategy *string `json:"onDemandAllocationStrategy,omitempty"`
// OnDemandBase is the minimum amount of the Auto Scaling group's capacity that must be
// fulfilled by On-Demand Instances. This base portion is provisioned first as your group scales.
OnDemandBase *int64 `json:"onDemandBase,omitempty"`
// OnDemandAboveBase controls the percentages of On-Demand Instances and Spot Instances for your
// additional capacity beyond OnDemandBase. The range is 0100. The default value is 100. If you
// leave this parameter set to 100, the percentages are 100% for On-Demand Instances and 0% for
// Spot Instances.
OnDemandAboveBase *int64 `json:"onDemandAboveBase,omitempty"`
// SpotAllocationStrategy diversifies your Spot capacity across multiple instance types to
// find the best pricing. Higher Spot availability may result from a larger number of
// instance types to choose from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html#spot-fleet-allocation-strategy
SpotAllocationStrategy *string `json:"spotAllocationStrategy,omitempty"`
// SpotInstancePools is the number of Spot pools to use to allocate your Spot capacity (defaults to 2)
// pools are determined from the different instance types in the Overrides array of LaunchTemplate
SpotInstancePools *int64 `json:"spotInstancePools,omitempty"`
}
```
Note: as of this writing, the kube cluster autoscaler does not fully support mixed instance groups: it will still scale groups up and down based on capacity, but some of the simulations it does might be wrong as it is not aware of the instance types coming into the group.
Note: when upgrading from a launchconfiguration to a launchtemplate with a mixed instance policy, the launchconfiguration is left undeleted and has to be removed manually.
## Moving from one instance group spanning multiple AZs to one instance group per AZ
It may be beneficial to have one IG per AZ rather than one IG spanning multiple AZs. One common example: when you have a persistent volume claim bound to an AWS EBS volume, that volume is bound to the AZ it was created in, so any resource (e.g. a StatefulSet) depending on that volume is bound to the same AZ. In this case you have to ensure that there is at least one node running in that AZ, which is not guaranteed by a single multi-AZ IG but is guaranteed by one IG per AZ.
So the procedure is:
* `kops edit ig nodes`
* Remove two of the subnets, e.g. `eu-central-1b` and `eu-central-1c`
* Alternatively you can also delete the existing IG and create a new one with a more suitable name
* `kops create ig nodes-eu-central-1b --subnet eu-central-1b`
* `kops create ig nodes-eu-central-1c --subnet eu-central-1c`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* Rolling update to update existing instances: `kops rolling-update cluster --yes`
## Converting an instance group to use spot instances
Follow the normal procedure for reconfiguring an InstanceGroup, but set the maxPrice property to your bid.
For example, "0.10" represents a spot-price bid of $0.10 (10 cents) per hour.
An example spec looks like this:
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
maxPrice: "0.01"
maxSize: 3
minSize: 3
role: Node
```
So the procedure is:
* Edit: `kops edit ig nodes`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* Rolling-update, only if you want to apply changes immediately: `kops rolling-update cluster`
## Adding Taints or Labels to an Instance Group
If you're running Kubernetes 1.6.0 or later, you can also control taints in the InstanceGroup.
The taints property takes a list of strings. The following example would add two taints to an IG,
using the same `edit` -> `update` -> `rolling-update` process as above.
Additionally, `nodeLabels` can be added to an IG in order to take advantage of Pod Affinity. Every node in the IG will be assigned the desired labels. For more information see the [labels](./labels.md) documentation.
```YAML
metadata:
name: nodes
spec:
machineType: m3.medium
maxSize: 3
minSize: 3
role: Node
taints:
- dedicated=gpu:NoSchedule
- team=search:PreferNoSchedule
nodeLabels:
spot: "false"
```
## Resizing the master
(This procedure should be pretty familiar by now!)
Your master instance group will probably be called `master-us-west-1c` or something similar.
`kops edit ig master-us-west-1c`
Add or set the machineType:
**Note: enabling detailed monitoring is subject to an additional [charge](https://aws.amazon.com/cloudwatch).**
```YAML
spec:
machineType: m3.large
detailedInstanceMonitoring: true
```
* Preview changes: `kops update cluster <clustername>`
* Apply changes: `kops update cluster <clustername> --yes`
* Rolling-update, only if you want to apply changes immediately: `kops rolling-update cluster`
If you want to minimize downtime, scale the master ASG up to size 2, then wait for that new master to
be Ready in `kubectl get nodes`, then delete the old master instance, and scale the ASG back down to size 1. (A
future version of rolling-update will probably do this automatically)
## Deleting an instance group
If you decide you don't need an InstanceGroup any more, you delete it using: `kops delete ig <name>`
Example: `kops delete ig morenodes`
No `kops update cluster` nor `kops rolling-update` is needed, so **be careful** when deleting an instance group: your nodes will be deleted automatically (and note this is not currently graceful, so there may be interruptions to workloads whose pods are running on those nodes).
## EBS Volume Optimization
EBS-Optimized instances can be created by setting the following field:
```YAML
spec:
rootVolumeOptimization: true
```
## Additional user-data for cloud-init
## additionalUserData
Kops utilizes cloud-init to initialize and set up a host at boot time. However, in certain cases you may already be leveraging certain features of cloud-init in your infrastructure and would like to continue doing so. More information on cloud-init can be found [here](http://cloudinit.readthedocs.io/en/latest/).
@ -423,155 +115,7 @@ spec:
- http://archive.ubuntu.com
```
## Add Tags on AWS autoscalling groups and instances
If you need to add tags on auto scaling groups or instances (propagate ASG tags), you can add it in the instance group specs with *cloudLabels*. Cloud Labels defined at the cluster spec level will also be inherited.
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
spec:
cloudLabels:
billing: infra
environment: dev
associatePublicIp: false
machineType: m4.xlarge
maxSize: 20
minSize: 2
role: Node
```
## Suspending Scaling Processes on AWS Autoscaling groups
Autoscaling groups automatically include multiple [scaling processes](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-suspend-resume-processes.html#process-types)
that keep our ASGs healthy. In some cases, you may want to disable certain scaling activities.
An example of this is if you are running multiple AZs in an ASG while using a Kubernetes Autoscaler.
The autoscaler will remove specific instances that are not being used. In some cases, the `AZRebalance` process
will rescale the ASG without warning.
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
spec:
machineType: m4.xlarge
maxSize: 20
minSize: 2
role: Node
suspendProcesses:
- AZRebalance
```
## Protect new instances from scale in
Autoscaling groups may scale up or down automatically to balance types of instances, regions, etc.
[Instance protection](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-instance-termination.html#instance-protection) prevents the ASG from being scaled in.
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
spec:
machineType: m4.xlarge
maxSize: 20
minSize: 2
role: Node
instanceProtection: true
```
## Attaching existing Load Balancers to Instance Groups
Instance groups can be linked to up to 10 load balancers. When attached, any instance launched will
automatically register itself to the load balancer. For example, if you create an instance group
dedicated to running an ingress controller exposed on a
[NodePort](https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport), you can
manually create a load balancer and link it to the instance group. Traffic to the load balancer will now
automatically go to one of the nodes.
You can specify either `loadBalancerName` to link the instance group to an AWS Classic ELB or you can
specify `targetGroupArn` to link the instance group to a target group, which are used by Application
load balancers and Network load balancers.
```YAML
# Example ingress nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: ingress
spec:
machineType: m4.large
maxSize: 2
minSize: 2
role: Node
externalLoadBalancers:
- targetGroupArn: arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/my-ingress-target-group/0123456789abcdef
- loadBalancerName: my-elb-classic-load-balancer
```
## Enabling Detailed-Monitoring on AWS instances
Detailed-Monitoring will cause the monitoring data to be available every 1 minute instead of every 5 minutes. [Enabling Detailed Monitoring](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-new.html). In production environments you may want to consider to enable detailed monitoring for quicker troubleshooting.
**Note: enabling detailed monitoring is subject to an additional [charge](https://aws.amazon.com/cloudwatch).**
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
spec:
detailedInstanceMonitoring: true
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
```
## Booting from a volume in OpenStack
If you want to boot from a volume when you are running in OpenStack, you can set annotations on the instance groups.
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
annotations:
openstack.kops.io/osVolumeBoot: enabled
openstack.kops.io/osVolumeSize: "15" # In gigabytes
spec:
detailedInstanceMonitoring: true
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
```
If `openstack.kops.io/osVolumeSize` is not set it will default to the minimum disk specified by the image.
## Setting Custom Kernel Runtime Parameters
## sysctlParameters
To add custom kernel runtime parameters to your instance group, specify the
`sysctlParameters` field as an array of strings. Each string must take the form
@ -596,3 +140,36 @@ spec:
```
which would end up in a drop-in file on nodes of the instance group in question.
## mixedInstancesPolicy (AWS Only)
### Instances
Instances is a list of instance types which we are willing to run in the EC2 fleet
### onDemandAllocationStrategy
Indicates how to allocate instance types to fulfill On-Demand capacity
### onDemandBase
OnDemandBase is the minimum amount of the Auto Scaling group's capacity that must be
fulfilled by On-Demand Instances. This base portion is provisioned first as your group scales.
### onDemandAboveBase
OnDemandAboveBase controls the percentages of On-Demand Instances and Spot Instances for your
additional capacity beyond OnDemandBase. The range is 0-100. The default value is 100. If you
leave this parameter set to 100, the percentages are 100% for On-Demand Instances and 0% for
Spot Instances.
### spotAllocationStrategy
SpotAllocationStrategy diversifies your Spot capacity across multiple instance types to
find the best pricing. Higher Spot availability may result from a larger number of
instance types to choose from. See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html#spot-fleet-allocation-strategy
### spotInstancePools
SpotInstancePools is the number of Spot pools to use to allocate your Spot capacity (defaults to 2).
Pools are determined from the different instance types in the Overrides array of the LaunchTemplate.
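Putting these together, a hedged sketch of an instance group using a mixed instances policy (the instance types, counts, and strategy values below are illustrative only):

```YAML
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: your.cluster.name
  name: compute
spec:
  machineType: m4.large
  maxSize: 50
  minSize: 10
  role: Node
  mixedInstancesPolicy:
    instances:
    - m4.xlarge
    - m5.large
    - m5.xlarge
    onDemandBase: 2            # capacity always fulfilled by On-Demand Instances
    onDemandAboveBase: 5       # percentage of On-Demand capacity above the base
    spotAllocationStrategy: lowest-price
    spotInstancePools: 3
```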


@ -1,13 +1,72 @@
# Working with InstanceGroups
# Managing Instance Groups
The kops InstanceGroup is a declarative model of a group of nodes. By modifying the object, you
can change the instance type you're using, the number of nodes you have, the OS image you're running - essentially
all the per-node configuration is in the InstanceGroup.
kops has the concept of "instance groups", which are a group of similar machines. On AWS, they map to
an AutoScalingGroup.
We'll assume you have a working cluster - if not, you probably want to read [how to get started on GCE](../getting_started/gce.md).
By default, a cluster has:
* An instance group called `nodes` spanning all the zones; these instances are your workers.
* One instance group for each master zone, called `master-<zone>` (e.g. `master-us-east-1c`). These normally have
minimum size and maximum size = 1, so they will run a single instance. We do this so that the cloud will
always relaunch masters, even if everything is terminated at once. We have an instance group per zone
because we need to force the cloud to run an instance in every zone, so we can mount the master volumes - we
cannot do that across zones.
This page explains some common instance group operations. For more detailed documentation of the various configuration keys, see the [InstanceGroup Resource](instancegroups_spec.md).
## Instance Groups Disclaimer
* When there is only one availability zone in a region (eu-central-1) and you would like to run multiple masters,
you have to define multiple instance groups for each of those masters. (e.g. `master-eu-central-1-a` and
`master-eu-central-1-b` and so on...)
* If instance groups are not defined correctly (particularly when there is an even number of masters or multiple
groups of masters in one availability zone in a single region), etcd servers will not start and master nodes will not check in. This is because etcd servers are configured per availability zone. DNS and Route53 would be the first places to check when these problems are happening.
## Listing instance groups
`kops get instancegroups`
```
NAME ROLE MACHINETYPE MIN MAX ZONES
master-us-east-1c Master 1 1 us-east-1c
nodes Node t2.medium 2 2
```
You can also use the `kops get ig` alias.
## Change the instance type in an instance group
First you edit the instance group spec, using `kops edit ig nodes`. Change the machine type to `t2.large`,
for example. Now if you `kops get ig`, you will see the larger instance size. Note, though, that these changes
have not yet been applied (this may change soon!).
To preview the change:
`kops update cluster <clustername>`
```
...
Will modify resources:
*awstasks.LaunchConfiguration launchConfiguration/mycluster.mydomain.com
InstanceType t2.medium -> t2.large
```
Presuming you're happy with the change, go ahead and apply it: `kops update cluster <clustername> --yes`
This change will apply to new instances only; if you'd like to roll it out immediately to all the instances
you have to perform a rolling update.
See a preview with: `kops rolling-update cluster`
Then restart the machines with: `kops rolling-update cluster --yes`
This will drain nodes, restart them with the new instance type, and validate them after startup.
## Changing the number of nodes
Note: This uses GCE as an example. It will look different when AWS is the cloud provider, but the concept and the configuration are the same.
If you `kops get ig` you should see that you have InstanceGroups for your nodes and for your master:
```
@ -21,7 +80,7 @@ Let's change the number of nodes to 3. We'll edit the InstanceGroup configurati
should be very familiar to you if you've used `kubectl edit`). `kops edit ig nodes` will open
the InstanceGroup in your editor, looking a bit like this:
```
```YAML
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
@ -41,19 +100,13 @@ spec:
- us-central1-a
```
<!-- TODO enable cluster autoscaler or GCE autoscaler -->
Edit `minSize` and `maxSize`, changing both from 2 to 3, save and exit your editor. If you wanted to change
the image or the machineType, you could do that here as well. There are actually a lot more fields,
but most of them have their default values, so won't show up unless they are set. The general approach is the same though.
<!-- TODO link to API reference docs -->
On saving you'll note that nothing happens. Although you've changed the model, you need to tell kops to
apply your changes to the cloud.
<!-- TODO can we have a dirty flag somehow -->
We use the same `kops update cluster` command that we used when initially creating the cluster; when
run without `--yes` it should show you a preview of the changes, and now there should be only one change:
@ -69,8 +122,6 @@ This is saying that we will alter the `TargetSize` property of the `InstanceGrou
That's what we want, so we `kops update cluster --yes`.
<!-- TODO: Make Changes may require instances to restart: kops rolling-update cluster appear selectively -->
kops will resize the GCE managed instance group from 2 to 3, which will create a new GCE instance,
which will then boot and join the cluster. Within a minute or so you should see the new node join:
@ -103,8 +154,6 @@ debian-9-stretch-v20170918 debian-cloud debia
...
```
<!-- TODO: Auto select debian-cloud/debian-9 => debian-cloud/debian-9-stretch-v20170918 -->
So now we'll do the same `kops edit ig nodes`, except this time change the image to `debian-cloud/debian-9-stretch-v20170918`:
Now `kops update cluster` will show that you're going to create a new [GCE Instance Template](https://cloud.google.com/compute/docs/reference/latest/instanceTemplates),
@ -144,43 +193,336 @@ you might want to make more changes or you might want to wait for off-peak hours
the instances to terminate naturally - new instances will come up with the new configuration - though if you're not
using preemptible/spot instances you might be waiting for a long time.
## Performing a rolling-update of your cluster
## Changing the root volume size or type
When you're ready to force your instances to restart, use `kops rolling-update cluster`:
The default volume size for Masters is 64 GB, while the default volume size for a node is 128 GB.
```
> kops rolling-update cluster
Using cluster from kubectl context: simple.k8s.local
The procedure to resize the root volume works the same way:
NAME STATUS NEEDUPDATE READY MIN MAX NODES
master-us-central1-a Ready 0 1 1 1 1
nodes NeedsUpdate 3 0 3 3 3
* Edit the instance group, set `rootVolumeSize` and/or `rootVolumeType` to the desired values: `kops edit ig nodes`
* `rootVolumeType` must be one of [supported volume types](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html), e.g. `gp2` (default), `io1` (high performance) or `standard` (for testing).
* If `rootVolumeType` is set to `io1` then you can define the number of Iops by specifying `rootVolumeIops` (defaults to 100 if not defined)
* Preview changes: `kops update cluster <clustername>`
* Apply changes: `kops update cluster <clustername> --yes`
* Rolling update to update existing instances: `kops rolling-update cluster --yes`
Must specify --yes to rolling-update.
For example, to set up a 200GB gp2 root volume, your InstanceGroup spec might look like:
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
rootVolumeSize: 200
rootVolumeType: gp2
```
You can see that your nodes need to be restarted, and your masters do not. A `kops rolling-update cluster --yes` will perform the update.
It will only restart instances that need restarting (unless you `--force` a rolling-update).
For example, to set up a 200GB io1 root volume with 200 provisioned Iops, your InstanceGroup spec might look like:
When you're ready, do `kops rolling-update cluster --yes`. It'll take a few minutes per node, because for each node
we cordon the node, drain the pods, shut it down and wait for the new node to join the cluster and for the cluster
to be healthy again. But this procedure minimizes disruption to your cluster - a rolling-update cluster is never
going to be something you do during your superbowl commercial, but ideally it should be minimally disruptive.
<!-- TODO: Clean up rolling-update cluster stdout -->
<!-- TODO: Pause after showing preview, to give a change to Ctrl-C -->
After the rolling-update is complete, you can see that the nodes are now running a new image:
```
> kubectl get nodes -owide
NAME STATUS AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION
master-us-central1-a-8fcc Ready 48m v1.7.2 35.188.177.16 Container-Optimized OS from Google 4.4.35+
nodes-9cml Ready 17m v1.7.2 35.194.25.144 Container-Optimized OS from Google 4.4.35+
nodes-km98 Ready 11m v1.7.2 35.202.95.161 Container-Optimized OS from Google 4.4.35+
nodes-wbb2 Ready 5m v1.7.2 35.194.56.129 Container-Optimized OS from Google 4.4.35+
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
rootVolumeSize: 200
rootVolumeType: io1
rootVolumeIops: 200
```
## Adding additional storage to the instance groups
As of Kops 1.12.0 you can add additional storage _(note, presently confined to AWS)_ via the instancegroup specification.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: m4.large
...
volumes:
- device: /dev/xvdd
encrypted: true
size: 20
type: gp2
```
In AWS, the above example shows how to add an additional 20GB EBS volume, which applies to each node within the instancegroup.
## Automatically formatting and mounting the additional storage
You can add additional storage via the above `volumes` collection, though this only provisions the storage itself. Assuming you don't wish to handle the mechanics of formatting and mounting the device yourself _(perhaps via a hook)_, you can use the `volumeMounts` section of the instance group to handle this for you.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: m4.large
...
volumeMounts:
- device: /dev/xvdd
filesystem: ext4
path: /var/lib/docker
volumes:
- device: /dev/xvdd
encrypted: true
size: 20
type: gp2
```
The above will provision the additional storage, then format and mount the device on the node. Note this feature is purposely distinct from `volumes` so that it may be reused in areas such as ephemeral storage. Take a `c5d.large` instance as an example, which comes with a 50GB NVMe SSD: we can use `volumeMounts` to mount it into `/var/lib/docker` for us.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: my-beloved-cluster
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1855.4.0-hvm
machineType: c5d.large
...
volumeMounts:
- device: /dev/nvme1n1
filesystem: ext4
path: /data
# -- mount the instance storage --
- device: /dev/nvme2n1
filesystem: ext4
path: /var/lib/docker
volumes:
- device: /dev/nvme1n1
encrypted: true
size: 20
type: gp2
```
On the node, the resulting mounts then look something like this:

```shell
$ df -h | grep nvme[12]
/dev/nvme1n1      20G   45M   20G   1% /data
/dev/nvme2n1      46G  633M   45G   2% /var/lib/docker
```

For AWS, you can find more information on device naming conventions [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html).

> Note: at present it's up to the user to ensure the correct device names.
## Creating a new instance group
Suppose you want to add a new group of nodes, perhaps with a different instance type. You do this using `kops create ig <InstanceGroupName> --subnet <zone(s)>`. Currently the
`--subnet` flag is required, and it takes the zone(s) of the subnet(s) in which the instance group will be created. The command opens an editor with a skeleton configuration (a sketch of which is shown after the steps below), allowing you to edit it before creation.
So the procedure is:
* `kops create ig morenodes --subnet us-east-1a`
or, in case you need it to be in more than one subnet, use a comma-separated list:
* `kops create ig morenodes --subnet us-east-1a,us-east-1b,us-east-1c`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* (no instances need to be relaunched, so no rolling-update is needed)
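For illustration only - the exact skeleton depends on your cluster name, cloud provider, and kops version - the configuration that opens in the editor might look roughly like this (the cluster label, image, and machine type below are placeholders):

```YAML
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: <clustername>
  name: morenodes
spec:
  # Placeholder image and machine type - kops pre-fills sensible defaults for your cloud
  image: coreos.com/CoreOS-stable-1855.4.0-hvm
  machineType: t2.medium
  maxSize: 2
  minSize: 2
  role: Node
  subnets:
  - us-east-1a
```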
## Creating an instance group of mixed instance types (AWS Only)

AWS permits the creation of mixed-instance EC2 Auto Scaling groups using a [mixed instances policy](https://aws.amazon.com/blogs/aws/new-ec2-auto-scaling-groups-with-multiple-instance-types-purchase-options/), allowing users to define a target capacity made up of on-demand and spot instances while offloading the allocation strategy to AWS.

Support for mixed instance groups was added in Kops 1.12.0.
```YAML
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: your.cluster.name
name: compute
spec:
cloudLabels:
role: compute
image: coreos.com/CoreOS-stable-1911.4.0-hvm
machineType: m4.large
maxSize: 50
minSize: 10
# You can manually set the maxPrice you're willing to pay - it will default to the onDemand price.
maxPrice: "1.0"
# add the mixed instance policy here
mixedInstancesPolicy:
instances:
- m4.xlarge
- m5.large
- m5.xlarge
- t2.medium
onDemandAboveBase: 5
spotInstancePools: 3
```
The mixed instances policy permits configuring the eligible instance types (`instances`), how many on-demand instances to run above the base capacity (`onDemandAboveBase`), and how many spot instance pools to allocate across (`spotInstancePools`), as shown above; for more details, please check the AWS documentation.

Note: as of this writing, the Kubernetes cluster autoscaler does not fully support mixed instance groups: it will still scale groups up and down based on capacity, but some of its simulations may be wrong because it is not aware of which instance type will join the group.

Note: when upgrading from a launch configuration to a launch template with a mixed instances policy, the old launch configuration is not deleted and has to be removed manually.
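If you want to clean up the leftover launch configuration by hand, a rough sketch using the AWS CLI is shown below; the launch configuration name is a placeholder, so list your own first and pick the one belonging to this instance group.

```shell
# List launch configuration names and find the one left over for this instance group
aws autoscaling describe-launch-configurations \
  --query 'LaunchConfigurations[].LaunchConfigurationName'

# Delete the leftover launch configuration (placeholder name)
aws autoscaling delete-launch-configuration \
  --launch-configuration-name compute.your.cluster.name-20190101000000
```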
## Moving from one instance group spanning multiple AZs to one instance group per AZ
It may be beneficial to have one IG per AZ rather than one IG spanning multiple AZs. A common example: a persistent volume claim bound to an AWS EBS volume is tied to the AZ the volume was created in, so any resource depending on that volume (e.g. a StatefulSet) is bound to the same AZ. In that case you have to ensure that there is at least one node running in that AZ, which a single multi-AZ IG does not guarantee, but one IG per AZ does.

So the procedure is (a sketch of one of the new per-AZ instance groups follows the steps):
* `kops edit ig nodes`
* Remove two of the subnets, e.g. `eu-central-1b` and `eu-central-1c`
* Alternatively you can also delete the existing IG and create a new one with a more suitable name
* `kops create ig nodes-eu-central-1b --subnet eu-central-1b`
* `kops create ig nodes-eu-central-1c --subnet eu-central-1c`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* Rolling update to update existing instances: `kops rolling-update cluster --yes`
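For illustration, after the split one of the new per-AZ instance groups might look roughly like this (machine type and sizes are placeholders - copy the values from your existing `nodes` group):

```YAML
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: <clustername>
  name: nodes-eu-central-1b
spec:
  machineType: t2.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - eu-central-1b
```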
## Converting an instance group to use spot instances
Follow the normal procedure for reconfiguring an InstanceGroup, but set the maxPrice property to your bid.
For example, "0.10" represents a spot-price bid of $0.10 (10 cents) per hour.
An example spec looks like this:
```YAML
metadata:
name: nodes
spec:
machineType: t2.medium
maxPrice: "0.01"
maxSize: 3
minSize: 3
role: Node
```
So the procedure is:
* Edit: `kops edit ig nodes`
* Preview: `kops update cluster <clustername>`
* Apply: `kops update cluster <clustername> --yes`
* Rolling-update, only if you want to apply changes immediately: `kops rolling-update cluster`
## Adding Taints or Labels to an Instance Group
If you're running Kubernetes 1.6.0 or later, you can also control taints in the InstanceGroup.
The taints property takes a list of strings. The following example would add two taints to an IG,
using the same `edit` -> `update` -> `rolling-update` process as above.
Additionally, `nodeLabels` can be added to an IG in order to take advantage of Pod Affinity. Every node in the IG will be assigned the desired labels. For more information see the [labels](./labels.md) documentation.
```YAML
metadata:
name: nodes
spec:
machineType: m3.medium
maxSize: 3
minSize: 3
role: Node
taints:
- dedicated=gpu:NoSchedule
- team=search:PreferNoSchedule
nodeLabels:
spot: "false"
```
## Resizing the master
(This procedure should be pretty familiar by now!)
Your master instance group will probably be called `master-us-west-1c` or something similar.
`kops edit ig master-us-west-1c`
Add or set the machineType:
```YAML
spec:
machineType: m3.large
```
* Preview changes: `kops update cluster <clustername>`
* Apply changes: `kops update cluster <clustername> --yes`
* Rolling-update, only if you want to apply changes immediately: `kops rolling-update cluster`
If you want to minimize downtime, scale the master ASG up to size 2, then wait for that new master to
be Ready in `kubectl get nodes`, then delete the old master instance, and scale the ASG back down to size 1. (A
future version of rolling-update will probably do this automatically)
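A rough sketch of that manual sequence using the AWS CLI is shown below. The Auto Scaling group name and instance id are placeholders (kops-managed master groups are typically named after the instance group and cluster), so look up the real values first.

```shell
# Find the master's Auto Scaling group name
aws autoscaling describe-auto-scaling-groups \
  --query 'AutoScalingGroups[].AutoScalingGroupName'

# Temporarily allow two masters and scale up (placeholder group name)
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name master-us-west-1c.masters.<clustername> \
  --max-size 2 --desired-capacity 2

# Wait for the new master to become Ready
kubectl get nodes --watch

# Terminate the old master and decrement the desired capacity back to 1
aws autoscaling terminate-instance-in-auto-scaling-group \
  --instance-id <old-master-instance-id> \
  --should-decrement-desired-capacity

# Restore the original maximum size
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name master-us-west-1c.masters.<clustername> \
  --max-size 1
```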
## Deleting an instance group
If you decide you don't need an InstanceGroup any more, you can delete it using `kops delete ig <name>`.

Example: `kops delete ig morenodes`

Neither `kops update cluster` nor `kops rolling-update` is needed afterwards, so **be careful**: deleting an instance group deletes its nodes immediately (and note this is not currently graceful, so there may be interruptions to workloads whose pods are running on those nodes).
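If you want to reduce the disruption, one option is to drain the group's nodes yourself before deleting it. A minimal sketch, assuming your nodes carry the `kops.k8s.io/instancegroup` label (recent kops versions apply it automatically):

```shell
# Cordon and drain every node in the instance group before deleting it
for node in $(kubectl get nodes -l kops.k8s.io/instancegroup=morenodes \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl drain "$node" --ignore-daemonsets --delete-local-data
done

# Then delete the instance group (and its instances) without further prompting
kops delete ig morenodes --yes
```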
## EBS Volume Optimization
EBS-Optimized instances can be created by setting the following field:
```YAML
spec:
rootVolumeOptimization: true
```
## Booting from a volume in OpenStack
If you want to boot from a volume when you are running in OpenStack, you can set annotations on the instance groups.
```YAML
# Example for nodes
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
labels:
kops.k8s.io/cluster: k8s.dev.local
name: nodes
annotations:
openstack.kops.io/osVolumeBoot: enabled
openstack.kops.io/osVolumeSize: "15" # In gigabytes
spec:
detailedInstanceMonitoring: true
machineType: t2.medium
maxSize: 2
minSize: 2
role: Node
```
If `openstack.kops.io/osVolumeSize` is not set, it will default to the minimum disk size specified by the image.
# Working with InstanceGroups
The kops InstanceGroup is a declarative model of a group of nodes. By modifying the object, you
can change the instance type you're using, the number of nodes you have, the OS image you're running - essentially
all the per-node configuration is in the InstanceGroup.
We'll assume you have a working cluster - if not, you probably want to read [how to get started on GCE](../getting_started/gce.md).
Next steps: learn how to perform cluster-wide operations, like [upgrading kubernetes](upgrading-kubernetes.md).
@ -39,8 +39,8 @@ nav:
- Deploying to GCE: "getting_started/gce.md"
- Deploying to OpenStack - Beta: "getting_started/openstack.md"
- Deploying to Digital Ocean - Alpha: "getting_started/digitalocean.md"
- Kops Commands: "getting_started/commands.md"
- Kops Arguments: "getting_started/arguments.md"
- kops Commands: "getting_started/commands.md"
- kops Arguments: "getting_started/arguments.md"
- kubectl usage: "getting_started/kubectl.md"
- CLI:
- kops: "cli/kops.md"
@ -61,14 +61,13 @@ nav:
- kops validate: "cli/kops_validate.md"
- kops version: "cli/kops_version.md"
- API:
- Cluster Spec: "cluster_spec.md"
- Instance Group API: "instance_groups.md"
- Using Manifests and Customizing: "manifests_and_customizing_via_api.md"
- Godocs for Cluster - ClusterSpec: "https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#ClusterSpec"
- Godocs for Instance Group - InstanceGroupSpec: "https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#InstanceGroupSpec"
- Cluster Resource: "cluster_spec.md"
- InstanceGroup Resource: "instance_groups.md"
- Operations:
- Updates & Upgrades: "operations/updates_and_upgrades.md"
- Working with Instance Groups: "tutorial/working-with-instancegroups.md"
- Using Manifests and Customizing: "manifests_and_customizing_via_api.md"
- High Availability: "operations/high_availability.md"
- etcd backup, restore and encryption: "operations/etcd_backup_restore_encryption.md"
- Instancegroup images: "operations/images.md"
@ -82,7 +81,6 @@ nav:
- Secret management: "secrets.md"
- Service Account Token Volume: "operations/service_account_token_volumes.md"
- Moving from a Single Master to Multiple HA Masters: "single-to-multi-master.md"
- Working with Instance Groups: "tutorial/working-with-instancegroups.md"
- Running kops in a CI environment: "continuous_integration.md"
- etcd administration: "operations/etcd_administration.md"
- Networking: