Kubernetes scope

Purpose of this doc: Clarify factors affecting decisions regarding what is and is not in scope for the Kubernetes project.

Significant areas

More details can be found below, but a concise list of areas in scope follows:

Containerized workload execution and management
Service discovery, load balancing, and routing
Workload identity propagation and authentication
Declarative resource management platform
Command-line tool
Web dashboard (UI)
Cluster lifecycle tools
Extensibility to support execution and management in diverse environments
Multi-cluster management tools and systems
Project GitHub automation and other process automation
Project continuous build and test infrastructure
Release tooling
Documentation
Usage data collection mechanisms

Scope domains

Most scope decisions concern whether any part of the project should undertake work in a particular area. However, some decisions apply to narrower scopes. The term "core" is sometimes used, but it is not well defined. The following scopes may be relevant:

Kubernetes project GitHub orgs
- All GitHub orgs
- The kubernetes GitHub org
- The kubernetes-sigs and kubernetes-incubator GitHub orgs
- The kubernetes-client GitHub org
- Other GitHub orgs
Release artifacts
- The Kubernetes release bundle
- Binaries built in kubernetes/kubernetes
  - “core” server components: apiserver, controller manager, scheduler, kube-proxy, kubelet
  - kubectl
  - kubeadm
- Other images, packages, etc.
The kubernetes/kubernetes repository (aka k/k)
- master branch
- kubernetes/kubernetes/master/pkg
- kubernetes/kubernetes/master/staging
Functionality layers
- required
- pluggable
- optional
- usable independently of the rest of Kubernetes

Other inclusion considerations

The Kubernetes project is a large, complex effort.

Is the functionality consistent with the existing implementation conventions, design principles, architecture, and direction?
Do the subproject owners, approvers, reviewers, and regular contributors agree to maintain the functionality?
Do the contributors to the functionality agree to follow the project’s development conventions and requirements, including the CLA, code of conduct, GitHub and build tooling, testing, documentation, and release criteria?
Does the functionality improve existing use cases, or mostly enable new ones? The project isn't completely blocking new functionality (the goal is to reduce the rate of expansion, not to stop it), but it is trying to limit additions to kubernetes/kubernetes/master and aims to improve the quality of the functionality that already exists.
Is it needed by project contributors? Example: We need cluster creation and upgrade functionality in order to run end-to-end tests.
Is it necessary in order to enable workload portability?
Is it needed in order for upstream releases to be usable? For example, functionality without which users would otherwise have to reverse-engineer Kubernetes, and/or copy code out of Kubernetes itself, to make things work.
Is it functionality that users expect, for example because other container platforms and/or service discovery and routing mechanisms provide it? If a capability that relates to Kubernetes's fundamental purpose were to become table stakes in the industry, Kubernetes would need to support it in order to stay relevant. (Whether it would need to be addressed by the core project would depend on the other criteria.)
Is there sufficiently broad user demand and/or sufficient expected user benefit for the functionality?
Is there an adequate mechanism to discover, deploy, express a dependency on, and upgrade the functionality if implemented using an extension mechanism? Are there consistent notions of releases, maturity, quality, version skew, conformance, etc. for extensions?
Is it needed as a reference implementation exercising extension points or other APIs?
Is the functionality sufficiently general-purpose?
Is it an area where we want to provide an opinionated solution and/or where fragmentation would be problematic for users, or are there many reasonable alternative approaches and solutions to the problem?
Is it an area where we want to foster exploration and innovation in the ecosystem?
Has the ecosystem produced adequate solutions on its own? For instance, have ecosystem projects taken on requirements of the Kubernetes project, if needed? Example: etcd3 added a number of features and other improvements to benefit Kubernetes, so the project didn't need to launch a separate storage effort.
Is there an acceptable home for the recommended ecosystem solution(s)? Example: the CNCF Sandbox is one possible home.
Has the functionality been provided by the project/release/component historically?

Technical scope details and rationale

Containerized workload execution and management

Including:

common general categories of workloads, such as stateless, stateful, batch, and cluster services
provisioning, allocation, accessing, and managing compute, storage, and network resources on behalf of the workloads, and enforcement of security policies on those resources
workload prioritization, capacity assessment, placement, and relocation (aka scheduling)
graceful workload eviction
local container image caching
configuration and secret distribution
manual and automatic horizontal and vertical scaling
deployment, progressive (aka rolling) upgrades, and downgrades
self-healing
exposing container logs, status, health, and resource usage metrics for collection
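
The self-healing and horizontal-scaling items above both amount to comparing a desired state with an observed state and acting on the difference. The following is a minimal sketch of that idea only, not the project's controller code: the `Pod` type, the `reconcile` function, and the action strings are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    """Hypothetical, minimal stand-in for one running workload instance."""
    name: str
    healthy: bool


def reconcile(desired_replicas: int, observed: list[Pod]) -> list[str]:
    """Return the actions that move the observed state toward the desired state.

    Unhealthy instances are removed (self-healing), and any remaining gap is
    closed by creating or deleting instances (horizontal scaling).
    """
    actions = [f"delete {p.name}" for p in observed if not p.healthy]
    healthy = [p for p in observed if p.healthy]
    gap = desired_replicas - len(healthy)
    if gap > 0:
        # Too few healthy instances: create replacements.
        actions += ["create"] * gap
    elif gap < 0:
        # Too many healthy instances: remove the surplus, last ones first.
        actions += [f"delete {p.name}" for p in healthy[gap:]]
    return actions


if __name__ == "__main__":
    print(reconcile(3, [Pod("a", True), Pod("b", False)]))
```

Running the function repeatedly against fresh observations is what makes the behavior self-correcting: once the observed state matches the desired state, it returns no actions.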

Service discovery, load balancing, and routing

Including:

endpoint tracking and discovery, including pod and non-pod endpoints
the most common L4 and L7 Internet protocols (TCP, UDP, SCTP, HTTP, HTTPS)
intra-cluster DNS configuration and serving
external DNS configuration
accessing external services (e.g., imported services, Open Service Broker)
exposing traffic latency, throughput, and status metrics for collection
access authorization
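
Endpoint tracking and load balancing, as listed above, can be pictured as a mapping from a service name to its current endpoints plus a policy for choosing among them. The sketch below is a toy model under that assumption; `EndpointRegistry` and its methods are hypothetical, and round-robin is used only as one simple selection policy, not as a statement about how kube-proxy or any other component behaves.

```python
from __future__ import annotations

import itertools


class EndpointRegistry:
    """Hypothetical in-memory tracker mapping a service name to its endpoints."""

    def __init__(self) -> None:
        self._endpoints: dict[str, list[str]] = {}
        self._counters: dict[str, itertools.count] = {}

    def set_endpoints(self, service: str, addresses: list[str]) -> None:
        """Replace the tracked endpoints (pod or non-pod) for a service."""
        self._endpoints[service] = sorted(addresses)

    def resolve(self, service: str) -> list[str]:
        """Discovery: return a copy of every endpoint known for the service."""
        return list(self._endpoints.get(service, []))

    def pick(self, service: str) -> str:
        """Load balancing: choose the next endpoint in round-robin order."""
        endpoints = self._endpoints.get(service)
        if not endpoints:
            raise LookupError(f"no endpoints for service {service!r}")
        index = next(self._counters.setdefault(service, itertools.count()))
        return endpoints[index % len(endpoints)]


if __name__ == "__main__":
    registry = EndpointRegistry()
    registry.set_endpoints("web", ["10.0.0.2:80", "10.0.0.1:80"])
    print([registry.pick("web") for _ in range(3)])
```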

Workload identity propagation and authentication

Including:

internal identity (e.g., SPIFFE support)
external identity (e.g., TLS certificate management)

Declarative resource management platform

Including:

CRUD API operations and behaviors, diff, patch, dry run, watch
declarative updates (apply)
resource type definition, registration, discovery, documentation, and validation mechanisms
pluggable authentication, authorization, admission (API-level policy enforcement), and audit-logging mechanisms
Namespace (resource scoping primitive) lifecycle
resource instance persistence and garbage collection
asynchronous event reporting
API producer SDK
API client SDK / libraries in widely used languages
dynamic, resource-oriented CLI, as a reference implementation for interacting with the API and basic tool for declarative and imperative management
- simplifies getting started and avoids complexities of documenting the system with just, for instance, curl
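
The diff, dry run, and declarative update (apply) operations listed above fit together in a simple way: compute what would change, optionally stop there, otherwise persist the change. The following sketch illustrates that relationship with a plain dictionary as the store. The `diff` and `apply` functions here are hypothetical and deliberately simplified: they compare top-level fields only and ignore nested fields, field removal, and conflict handling, all of which a real implementation has to address.

```python
from __future__ import annotations


def diff(live: dict, desired: dict) -> dict:
    """Return the top-level fields that must change for live to match desired."""
    return {
        key: value
        for key, value in desired.items()
        if key not in live or live[key] != value
    }


def apply(store: dict, name: str, desired: dict, dry_run: bool = False) -> dict:
    """Declaratively update the named resource and return the changes.

    With dry_run=True the changes are computed and returned, but the store
    is left untouched.
    """
    live = store.get(name, {})
    changes = diff(live, desired)
    if changes and not dry_run:
        # Merge the changed fields over the live object and persist it.
        store[name] = {**live, **changes}
    return changes


if __name__ == "__main__":
    store: dict = {}
    print(apply(store, "web", {"replicas": 2}, dry_run=True), store)
    print(apply(store, "web", {"replicas": 2}), store)
```

Because the caller states the desired end state rather than a sequence of steps, applying the same input twice is a no-op the second time, which is the property that makes declarative updates safe to repeat.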

Command-line tool

Since some Kubernetes primitives are fairly low-level, in addition to general-purpose resource-oriented operations, the CLI also supports “porcelain” for common, simple, domain-specific operations (both status/progress extraction and mutations) that don’t have discrete API implementations, such as run, expose, rollout, cp, top, cordon, and drain. The CLI should also support non-resource-oriented APIs, such as exec, logs, attach, port-forward, and proxy.

Web dashboard (UI)

The project supported a dashboard, initially built into the apiserver, almost from the beginning. Other projects in the space had UIs and users expected one. There wasn’t a vendor-neutral one in the ecosystem, however, and a solution was needed for the project's local cluster environment, minikube. The dashboard has also served as a UI reference implementation and a vehicle to drive conventions (e.g., around resource category terminology). The dashboard has also been useful as a tool to demonstrate and to learn about Kubernetes concepts, features, and behaviors.

Cluster lifecycle tools

Cluster lifecycle includes provisioning, bootstrapping, upgrade/downgrade, and teardown. The project develops several such tools. Tools are needed for the following scenarios/purposes:

usability of upstream releases: at least one solution that can be used to bootstrap the upstream release (e.g., kubeadm)
testing: solutions that can be used to run multi-node end-to-end tests (e.g., kind), integration tests, upgrade/downgrade tests, version-skew tests, scalability tests, and other types of tests the project deems necessary to ensure adequate release quality
portable, low-dependency local environment: at least one local environment (e.g., minikube), in order to simplify documentation tutorials that require a cluster to exist

Extensibility to support execution and management in diverse environments

Including:

CRI
CNI
CSI
external cloud providers
KMS providers
OSB brokers
Cluster APIs

Multi-cluster management tools and systems

Many users desire to operate in and deploy applications to multiple geographic locations and environments, even across multiple providers. This generally requires managing multiple Kubernetes clusters. While general deployment pipeline tools and continuous deployment systems are not in scope, the project has explored multiple mechanisms to simplify management of resources across multiple clusters, including Federation v1, Federation v2, and the Cluster Registry API.

Project GitHub automation and other process automation

As one of the largest, most active projects on GitHub, Kubernetes has some extreme needs.

Including:

prow
gubernator
velodrome and kettle
website infrastructure
k8s.io

Project continuous build and test infrastructure

Including:

prow
tide
triage dashboard

Release tooling

Including:

anago

Documentation

Documentation of project-provided functionality and components, for multiple audiences, including:

application developers
application operators
cluster operators
ecosystem developers
distribution providers, and others who want to port Kubernetes to new environments
project contributors

Usage data collection mechanisms

Including:

Spartakus

Examples of projects and areas not in scope

Some of these are obvious, but many have been seriously deliberated in the past.

The resource instance store (etcd)
Container runtimes, other than current grandfathered ones
Network and storage plugins, other than current grandfathered ones
CoreDNS
- Since intra-cluster DNS is in scope, we need to ensure we have some solution. That solution has been kubedns, but now that there is an adequate alternative outside the project, we are adopting it.
Service load balancers (e.g., Envoy, Linkerd), other than kube-proxy
Cloud provider implementations, other than current grandfathered ones
Container image build tools
Image registries and distribution mechanisms
Identity (user/group) sources of truth (e.g., LDAP)
Key management systems (e.g., Vault)
CI, CD, and GitOps (push to deploy) systems, other than infrastructure used to build and test the Kubernetes project itself
Application-level services, such as middleware (e.g., message buses), data-processing frameworks (e.g., Spark), machine-learning frameworks (e.g., Kubeflow), databases (e.g., MySQL), caches, or cluster storage systems (e.g., Ceph), as built-in services. Such components can run on Kubernetes, and/or can be accessed by applications running on Kubernetes through portable mechanisms, such as the Open Service Broker. Application-specific Operators (e.g., Cassandra Operator) are also not in scope.
Application and cluster log aggregation and searching, application and cluster monitoring aggregation and dashboarding (other than heapster, which is grandfathered), alerting, application performance management, tracing, and debugging tools
General-purpose machine configuration (e.g., Chef, Puppet, Ansible, Salt), maintenance, automation (e.g., Rundeck), and management systems
Templating and configuration languages (e.g., jinja, jsonnet, starlark, hcl, dhall, hocon)
File packaging tools (e.g., helm, kpm, kubepack, duffle)
Managing non-containerized applications in VMs, and other general IaaS functionality
Full Platform as a Service functionality
Full Functions as a Service functionality
Workflow orchestration: "Workflow" is a very broad, diverse area, with solutions typically tailored to specific use cases (e.g., data-flow graphs, data-driven processing, deployment pipelines, event-driven automation, business-process execution, iPaaS) and specific input and event sources, and often requires arbitrary code to evaluate conditions, actions, and/or failure handling.
Human-oriented and programmatic interfaces over the Kubernetes API other than “basic” CLIs (e.g., kubectl) and UI (dashboard), such as mobile dashboards, IDEs, chat bots, SQL, and interactive shells
