Kubernetes scope

Purpose of this doc: Clarify factors affecting decisions regarding what is and is not in scope for the Kubernetes project.

Related documents:

Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. Workload portability is an especially high priority. Kubernetes provides a flexible, easy-to-run, secure foundation for running containerized applications on any cloud provider or your own systems.

While Kubernetes is not a full distribution in the Linux sense, its adoption has been facilitated by the fact that the upstream releases are usable on their own, with minimal dependencies (e.g., etcd, a container runtime, and a networking implementation).

The high-level scope and goals are often insufficient for making decisions about where to draw the line, so this document records where the line is, the rationale for some past decisions, and some general criteria that have been applied, including non-technical considerations. For instance, user adoption and continued operation of the project itself are also important factors.

Significant areas

More details can be found below, but a concise list of areas in scope follows:

  • Containerized workload execution and management
  • Service discovery, load balancing, and routing
  • Workload identity propagation and authentication
  • Declarative resource management platform
  • Command-line tool
  • Web dashboard (UI)
  • Cluster lifecycle tools
  • Extensibility to support execution and management in diverse environments
  • Multi-cluster management tools and systems
  • Project GitHub automation and other process automation
  • Project continuous build and test infrastructure
  • Release tooling
  • Documentation
  • Usage data collection mechanisms

Scope domains

Most decisions concern whether any part of the project should undertake efforts in a particular area. However, decisions are sometimes needed for smaller scopes. The term "core" is sometimes used, but is not well defined. The following are scopes that may be relevant:

  • Kubernetes project GitHub orgs
    • All GitHub orgs
    • The kubernetes GitHub org
    • The kubernetes-sigs and kubernetes-incubator GitHub orgs
    • The kubernetes-client GitHub org
    • Other GitHub orgs
  • Release artifacts
    • The Kubernetes release bundle
    • Binaries built in kubernetes/kubernetes
      • “core” server components: apiserver, controller manager, scheduler, kube-proxy, kubelet
      • kubectl
      • kubeadm
    • Other images, packages, etc.
  • The kubernetes/kubernetes repository (aka k/k)
    • master branch
    • kubernetes/kubernetes/master/pkg
    • kubernetes/kubernetes/master/staging
  • Functionality layers
    • required
    • pluggable
    • optional
    • usable independently of the rest of Kubernetes

Other inclusion considerations

The Kubernetes project is a large, complex effort, so the following questions also factor into inclusion decisions:

  • Is the functionality consistent with the existing implementation conventions, design principles, architecture, and direction?

  • Do the subproject owners, approvers, reviewers, and regular contributors agree to maintain the functionality?

  • Do the contributors to the functionality agree to follow the project's development conventions and requirements, including the CLA, code of conduct, GitHub and build tooling, testing, documentation, and release criteria?

  • Does the functionality improve existing use cases, or mostly enable new ones? The project isn't completely blocking new functionality (it is more a matter of reducing the rate of expansion), but it is trying to limit additions to kubernetes/kubernetes/master, and aims to improve the quality of the functionality that already exists.

  • Is it needed by project contributors? Example: We need cluster creation and upgrade functionality in order to run end-to-end tests.

  • Is it necessary in order to enable workload portability?

  • Is it needed in order for upstream releases to be usable? For example, functionality without which users had to reverse-engineer Kubernetes, or copy code out of Kubernetes itself, to make things work.

  • Is it functionality that users expect, for example because other container platforms and/or service discovery and routing mechanisms provide it? If a capability that relates to Kubernetes's fundamental purpose were to become table stakes in the industry, Kubernetes would need to support it in order to stay relevant. (Whether it would need to be addressed by the core project would depend on the other criteria.)

  • Is there sufficiently broad user demand and/or sufficient expected user benefit for the functionality?

  • Is there an adequate mechanism to discover, deploy, express a dependency on, and upgrade the functionality if implemented using an extension mechanism? Are there consistent notions of releases, maturity, quality, version skew, conformance, etc. for extensions?

  • Is it needed as a reference implementation exercising extension points or other APIs?

  • Is the functionality sufficiently general-purpose?

  • Is it an area where we want to provide an opinionated solution and/or where fragmentation would be problematic for users, or are there many reasonable alternative approaches and solutions to the problem?

  • Is it an area where we want to foster exploration and innovation in the ecosystem?

  • Has the ecosystem produced adequate solutions on its own? For instance, have ecosystem projects taken on requirements of the Kubernetes project, if needed? Example: etcd3 added a number of features and other improvements to benefit Kubernetes, so the project didn't need to launch a separate storage effort.

  • Is there an acceptable home for the recommended ecosystem solution(s)? Example: the CNCF Sandbox is one possible home.

  • Has the functionality been provided by the project/release/component historically?

Technical scope details and rationale

Containerized workload execution and management

Including:

  • common general categories of workloads, such as stateless, stateful, batch, and cluster services
  • provisioning, allocation, accessing, and managing compute, storage, and network resources on behalf of the workloads, and enforcement of security policies on those resources
  • workload prioritization, capacity assessment, placement, and relocation (aka scheduling)
  • graceful workload eviction
  • local container image caching
  • configuration and secret distribution
  • manual and automatic horizontal and vertical scaling
  • deployment, progressive (aka rolling) upgrades, and downgrades
  • self-healing
  • exposing container logs, status, health, and resource usage metrics for collection

Service discovery, load balancing, and routing

Including:

  • endpoint tracking and discovery, including pod and non-pod endpoints
  • the most common L4 and L7 Internet protocols (TCP, UDP, SCTP, HTTP, HTTPS)
  • intra-cluster DNS configuration and serving
  • external DNS configuration
  • accessing external services (e.g., imported services, Open Service Broker)
  • exposing traffic latency, throughput, and status metrics for collection
  • access authorization

Workload identity propagation and authentication

Including:

  • internal identity (e.g., SPIFFE support)
  • external identity (e.g., TLS certificate management)

Declarative resource management platform

Including:

  • CRUD API operations and behaviors, diff, patch, dry run, watch
  • declarative updates (apply)
  • resource type definition, registration, discovery, documentation, and validation mechanisms
  • pluggable authentication, authorization, admission (API-level policy enforcement), and audit-logging mechanisms
  • Namespace (resource scoping primitive) lifecycle
  • resource instance persistence and garbage collection
  • asynchronous event reporting
  • API producer SDK
  • API client SDK / libraries in widely used languages
  • dynamic, resource-oriented CLI, as a reference implementation for interacting with the API and basic tool for declarative and imperative management
    • simplifies getting started and avoids complexities of documenting the system with just, for instance, curl
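
As a rough illustration of how diff, dry run, declarative apply, and watch relate to one another, the following is a minimal sketch over a toy in-memory store. It is not the Kubernetes API: the `ResourceStore` class, its method names, and the returned status strings are all hypothetical, and unlike a real apply (which merges fields) this sketch replaces the stored object wholesale.

```python
"""Toy in-memory resource store (hypothetical; not the Kubernetes API)."""
import copy


class ResourceStore:
    """Holds resources keyed by (kind, namespace, name). Names are illustrative."""

    def __init__(self):
        self._objects = {}
        self._watchers = []

    def watch(self, callback):
        """Register callback(event_type, obj), invoked on each persisted change."""
        self._watchers.append(callback)

    def get(self, key):
        """Return a copy of the stored object, or None if it does not exist."""
        return copy.deepcopy(self._objects.get(key))

    def diff(self, key, desired):
        """Return {field: (current, desired)} for top-level fields that differ."""
        current = self._objects.get(key) or {}
        fields = sorted(set(current) | set(desired))
        return {f: (current.get(f), desired.get(f))
                for f in fields if current.get(f) != desired.get(f)}

    def apply(self, key, desired, dry_run=False):
        """Make the stored object match `desired` (whole-object replacement).

        With dry_run=True, report the outcome without persisting or notifying.
        """
        exists = key in self._objects
        if exists and not self.diff(key, desired):
            return "unchanged"
        if not dry_run:
            self._objects[key] = copy.deepcopy(desired)
            event = "MODIFIED" if exists else "ADDED"
            for notify in self._watchers:
                notify(event, copy.deepcopy(desired))
        return "configured" if exists else "created"
```

The caller states the desired end state and lets the store work out whether that means a create, an update, or nothing at all; dry run reuses the same code path and only skips persistence and notification.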

Command-line tool

Since some Kubernetes primitives are fairly low-level, in addition to general-purpose resource-oriented operations, the CLI also supports “porcelain” for common, simple, domain-specific operations (both status/progress extraction and mutations) that don't have discrete API implementations, such as run, expose, rollout, cp, top, cordon, and drain. It should also support non-resource-oriented APIs, such as exec, logs, attach, port-forward, and proxy.
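
The idea of porcelain layered over generic resource operations can be sketched with a toy model. Everything here is an assumption made for illustration: the `Cluster` class is a hypothetical in-memory stand-in rather than a real client, and the `drain` shown simply deletes pods, whereas the real command has more involved eviction behavior.

```python
"""Toy sketch: porcelain commands composed from generic resource operations."""


class Cluster:
    """Hypothetical in-memory stand-in for an API server; not a real client."""

    def __init__(self, nodes, pods):
        self.nodes = nodes  # node name -> {"unschedulable": bool}
        self.pods = pods    # pod name -> {"node": str}

    # Generic, resource-oriented operations.
    def patch_node(self, name, fields):
        self.nodes[name].update(fields)

    def list_pods(self, node):
        return sorted(p for p, spec in self.pods.items() if spec["node"] == node)

    def delete_pod(self, name):
        del self.pods[name]


# Porcelain: domain-specific commands with no dedicated API of their own.
def cordon(cluster, node):
    """Mark the node unschedulable so no new pods are placed on it."""
    cluster.patch_node(node, {"unschedulable": True})


def drain(cluster, node):
    """Cordon the node, then remove the pods on it; return their names."""
    cordon(cluster, node)
    removed = cluster.list_pods(node)
    for pod in removed:
        cluster.delete_pod(pod)
    return removed
```

In this sketch neither `cordon` nor `drain` needs its own server-side endpoint: each is a client-side sequence of patch, list, and delete calls, which is the sense in which such commands lack discrete API implementations.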

Web dashboard (UI)

The project supported a dashboard, initially built into the apiserver, almost from the beginning. Other projects in the space had UIs and users expected one. There wasn't a vendor-neutral one in the ecosystem, however, and a solution was needed for the project's local cluster environment, minikube. The dashboard has also served as a UI reference implementation and a vehicle to drive conventions (e.g., around resource category terminology), and it has been useful as a tool to demonstrate and to learn about Kubernetes concepts, features, and behaviors.

Cluster lifecycle tools

Cluster lifecycle includes provisioning, bootstrapping, upgrade/downgrade, and teardown. The project develops several such tools. Tools are needed for the following scenarios/purposes:

  • usability of upstream releases: at least one solution that can be used to bootstrap the upstream release (e.g., kubeadm)
  • testing: solutions that can be used to run multi-node end-to-end tests (e.g., kind), integration tests, upgrade/downgrade tests, version-skew tests, scalability tests, and other types of tests the project deems necessary to ensure adequate release quality
  • portable, low-dependency local environment: at least one local environment (e.g., minikube), in order to simplify documentation tutorials that require a cluster to exist

Extensibility to support execution and management in diverse environments

Including:

  • CRI
  • CNI
  • CSI
  • external cloud providers
  • KMS providers
  • OSB brokers
  • Cluster APIs

Multi-cluster management tools and systems

Many users desire to operate in and deploy applications to multiple geographic locations and environments, even across multiple providers. This generally requires managing multiple Kubernetes clusters. While general deployment pipeline tools and continuous deployment systems are not in scope, the project has explored multiple mechanisms to simplify management of resources across multiple clusters, including Federation v1, Federation v2, and the Cluster Registry API.

Project GitHub automation and other process automation

As one of the largest, most active projects on GitHub, Kubernetes has some extreme needs.

Including:

  • prow
  • gubernator
  • velodrome and kettle
  • website infrastructure
  • k8s.io

Project continuous build and test infrastructure

Including:

  • prow
  • tide
  • triage dashboard

Release tooling

Including:

  • anago

Documentation

Documentation of project-provided functionality and components, for multiple audiences, including:

  • application developers
  • application operators
  • cluster operators
  • ecosystem developers
  • distribution providers, and others who want to port Kubernetes to new environments
  • project contributors

Usage data collection mechanisms

Including:

  • Spartakus

Examples of projects and areas not in scope

Some of these are obvious, but many have been seriously deliberated in the past.

  • The resource instance store (etcd)
  • Container runtimes, other than current grandfathered ones
  • Network and storage plugins, other than current grandfathered ones
  • CoreDNS
    • Since intra-cluster DNS is in scope, the project needs to ensure that some solution exists. That solution has been kubedns, but now that there is an adequate alternative outside the project, we are adopting it.
  • Service load balancers (e.g., Envoy, Linkerd), other than kube-proxy
  • Cloud provider implementations, other than current grandfathered ones
  • Container image build tools
  • Image registries and distribution mechanisms
  • Identity (user/group) sources of truth (e.g., LDAP)
  • Key management systems (e.g., Vault)
  • CI, CD, and GitOps (push to deploy) systems, other than infrastructure used to build and test the Kubernetes project itself
  • Application-level services as built-in services, such as middleware (e.g., message buses), data-processing frameworks (e.g., Spark), machine-learning frameworks (e.g., Kubeflow), databases (e.g., MySQL), caches, and cluster storage systems (e.g., Ceph). Such components can run on Kubernetes, and/or can be accessed by applications running on Kubernetes through portable mechanisms, such as the Open Service Broker. Application-specific Operators (e.g., Cassandra Operator) are also not in scope.
  • Application and cluster log aggregation and searching, application and cluster monitoring aggregation and dashboarding (other than heapster, which is grandfathered), alerting, application performance management, tracing, and debugging tools
  • General-purpose machine configuration (e.g., Chef, Puppet, Ansible, Salt), maintenance, automation (e.g., Rundeck), and management systems
  • Templating and configuration languages (e.g., jinja, jsonnet, starlark, hcl, dhall, hocon)
  • File packaging tools (e.g., helm, kpm, kubepack, duffle)
  • Managing non-containerized applications in VMs, and other general IaaS functionality
  • Full Platform as a Service functionality
  • Full Functions as a Service functionality
  • Workflow orchestration: "Workflow" is a very broad, diverse area, with solutions typically tailored to specific use cases (e.g., data-flow graphs, data-driven processing, deployment pipelines, event-driven automation, business-process execution, iPaaS) and specific input and event sources, and often requires arbitrary code to evaluate conditions, actions, and/or failure handling.
  • Human-oriented and programmatic interfaces over the Kubernetes API other than “basic” CLIs (e.g., kubectl) and UI (dashboard), such as mobile dashboards, IDEs, chat bots, SQL, interactive shells, etc.