community/roadmap-cluster-deployment.md at 6d81acf86b5d7adb20c282e73d17c8bc260dbaad

OBSOLETE

Cluster lifecycle includes deployment (infrastructure provisioning and bootstrapping Kubernetes), scaling, upgrades, and turndown.

Owner: @kubernetes/sig-cluster-lifecycle (kubernetes-sig-cluster-lifecycle at googlegroups.com, sig-cluster-lifecycle on Slack)

There is no one-size-fits-all solution for cluster deployment and management (e.g., upgrades). There's a spectrum of possible solutions, each with different tradeoffs:

opinionated solution (easier to use for a narrower solution space) vs. toolkit (easier to adapt and extend)
understandability (easier to modify) vs. configurability (addresses a broader solution space without coding)

Some useful points in the spectrum are described below.

There are a number of tasks/features/changes that would be useful to multiple points in the spectrum. We should prioritize them, since they would enable multiple solutions.

Single-node laptop/development cluster

Should be sufficient to kick the tires for most examples and for local development. Should be dead simple to use and highly opinionated rather than configurable.

Owner: @dlorenc

Portable multi-node cluster understandable reference implementation

For people who want to get Kubernetes running painlessly on an arbitrary set of machines -- any cloud provider (or bare metal), any OS distro, any networking infrastructure. Porting work should be minimized via separation of concerns (composition) and ease of modification rather than automated configuration transformation. Not intended to be highly optimized by default, but the cluster should be reliable.

Also a reference implementation for people who want to understand how to build Kubernetes clusters from scratch.

Ideally cluster scaling and upgrades would be supported by this implementation.

This should replace the Docker multi-node guide.

To facilitate this, we aim to provide an understandable, declarative, decoupled infrastructure provisioning implementation and a portable cluster bootstrapping implementation. Networking setup needs to be decoupled, so it can be swapped out with alternative implementations.

For portability, all components need to be containerized (though Kubelet may use an alternative to Docker, so long as it is portable and meets other requirements) and we need a default network overlay solution.

Eventually, we'd like to entirely eliminate the need for Chef/Puppet/Ansible/Salt. We shouldn't need to copy files around to host filesystems.

For simplicity, users shouldn't need to install/launch more than one component or execute more than one command per node. This could be achieved in a variety of ways: monolithic binaries, monolithic containers, a launcher/controller container that spawns other containers, etc.
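
As a rough illustration of the launcher option, the sketch below shows a single entry point that decides which components to start from one role argument. The component names are real Kubernetes components, but the role names, the grouping of components per role, and the function name components_for are assumptions made for illustration; they do not describe any existing tool.

```python
from typing import List

# Grouping by role is an illustrative assumption, not a prescribed layout.
NODE_COMPONENTS = ["kubelet", "kube-proxy"]
MASTER_COMPONENTS = [
    "etcd",
    "kube-apiserver",
    "kube-controller-manager",
    "kube-scheduler",
]


def components_for(role: str) -> List[str]:
    """Return the ordered list of components one launcher would start.

    Hypothetical helper: a real launcher would spawn containers or
    processes; this sketch only computes the plan.
    """
    if role == "master":
        # Assumes the master also runs the node components.
        return MASTER_COMPONENTS + NODE_COMPONENTS
    if role == "worker":
        return list(NODE_COMPONENTS)
    raise ValueError(f"unknown role: {role!r}")
```

With this shape, the only per-node input is the role, so the user-facing command stays a single invocation regardless of how many components sit behind it.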

Once we have this, we should delete out-of-date, untested "getting-started guides" (example broken cluster debugging thread).

Building a cluster from scratch

For people starting from scratch:

We should simplify this as much as possible, and clearly document it.

This is probably the only viable way to support people who want to do significant customization:

cloud provider (including bare metal)
OS distro
cluster size
master and worker node configurations
networking solution and parameters (e.g., CIDR)
container runtime (Docker or rkt) and its configuration
monitoring solutions
logging solutions
ingress controller
image registry
IAM
HA
multi-zone
K8s component configuration
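
A few of the dimensions above could be captured in one declarative description that is validated before anything is provisioned. The following is a minimal sketch under that assumption; the ClusterSpec type, its field names, and the validation rules are hypothetical and cover only a subset of the list.

```python
import ipaddress
from dataclasses import dataclass
from typing import List

# The two container runtimes named in the list above.
SUPPORTED_RUNTIMES = {"docker", "rkt"}


@dataclass
class ClusterSpec:
    """Hypothetical declarative description of a cluster."""

    cloud_provider: str           # e.g. "bare-metal"
    os_distro: str
    worker_count: int             # cluster size
    pod_cidr: str                 # networking parameter, e.g. "10.244.0.0/16"
    container_runtime: str = "docker"

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if self.worker_count < 0:
            errors.append("worker_count must not be negative")
        if self.container_runtime not in SUPPORTED_RUNTIMES:
            errors.append(f"unsupported runtime: {self.container_runtime}")
        try:
            # Rejects malformed CIDRs and CIDRs with host bits set.
            ipaddress.ip_network(self.pod_cidr)
        except ValueError as exc:
            errors.append(f"invalid pod_cidr: {exc}")
        return errors
```

Collecting every problem in one pass, rather than failing on the first, lets a user correct the whole description at once.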

To do:

Simplify release packaging and installation
- Finding and installing the right version of Docker itself can be hard (apt-get install docker/docker.io/docker-engine isn't the right thing)
- Build rpms, debs?
Verify that system requirements have been satisfied (docker version, kernel configuration, etc.)
- And ideally degrade gracefully and warn if they are not
Documentation
- What is the latest release, how can I find it, how do I install it, what version of Docker/rkt/etc. is required?
- An architectural diagram (like the one we use in our presentations) would help, too.
- Explain the architecture
  - Link to instructions about how to manage etcd
  - Link to Chubby paper
- Document system requirements ("Node Spec")
  - OS distro versions
  - kernel configuration
  - resources
  - IP forwarding
- Document how to set up a cluster
- Adequately document how to configure our components.
  - Improve/simplify/organize command help
  - Hide/remove/deemphasize test-only options
- Document how to integrate IAM
- Create guides to help with key decisions for production clusters
  - Selecting a networking model
  - Managing a CA
  - Managing user authentication and authorization
  - Initial deployment requirements (memory, cpu, networking, storage)
  - Upgrading best practices
Code changes
- Finish converting components to use configuration files rather than command-line flags
- Facilitate managing that configuration using ConfigMap
- Cluster config
- Reduce external dependencies
  - APIs for reusable building blocks, such as TLS bootstrap, certificate signing for addons, teardown
- Need key/cert rotation (master, service accounts)
- Finish generalization of component registration
- Improve addon management
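
The requirement check described in the to-do list above (verify system requirements, degrade gracefully, and warn) could look roughly like the sketch below. It covers only two checks, IP forwarding and the presence of a docker binary, and it returns warnings instead of aborting. The function names are hypothetical, and the sysctl path assumes a Linux host.

```python
import shutil
from pathlib import Path
from typing import List, Optional

# Linux-specific location of the net.ipv4.ip_forward sysctl.
IP_FORWARD_PATH = Path("/proc/sys/net/ipv4/ip_forward")


def read_ip_forward(path: Path = IP_FORWARD_PATH) -> Optional[str]:
    """Return the raw sysctl value, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def preflight_warnings(
    ip_forward: Optional[str], docker_path: Optional[str]
) -> List[str]:
    """Turn gathered facts into warnings; an empty list means no issues found."""
    warnings = []
    if ip_forward is None:
        warnings.append("could not determine whether IP forwarding is enabled")
    elif ip_forward != "1":
        warnings.append("IP forwarding is disabled (net.ipv4.ip_forward != 1)")
    if docker_path is None:
        warnings.append("no 'docker' binary found on PATH")
    return warnings


def run_preflight() -> List[str]:
    """Gather facts from the local host and report warnings without failing."""
    return preflight_warnings(read_ip_forward(), shutil.which("docker"))
```

Keeping fact gathering separate from the decision logic means the checks can be tested without a real node, and further checks (kernel configuration, Docker version) can be added as extra inputs.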

Production-grade, easy-to-use cluster management tools/services

Easy to use and opinionated. Potentially highly optimized. Acceptable for production use. Not necessarily easy to port, extend, adapt, or change.

Examples:

Kube-AWS
kops
Kargo
- (is https://git.k8s.io/contrib/ansible still needed?)
kompose8
Tectonic
Kraken
NavOps Launch
Photon Cluster Manager
Platform9
GKE
Stackpoint.io
Juju
