# Kubernetes scope

Purpose of this doc: Clarify factors affecting decisions regarding
what is and is not in scope for the Kubernetes project.

Related documents:
* [What is Kubernetes?](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/)
* [Kubernetes design and architecture](architecture.md)
* [Kubernetes architectural roadmap (2017)](architectural-roadmap.md)
* [Design principles](principles.md)
* [Kubernetes resource management](resource-management.md)

Kubernetes is a portable, extensible open-source platform for managing
containerized workloads and services; it facilitates both declarative
configuration and automation. Workload portability is an especially
high priority. Kubernetes provides a flexible, easy-to-run, secure
foundation for running containerized applications on any cloud
provider or your own systems.

While not a full distribution in the Linux sense, adoption of
Kubernetes has been facilitated by the fact that the upstream releases
are usable on their own, with minimal dependencies (e.g., etcd, a
container runtime, and a networking implementation).

The high-level scope and goals are often insufficient for making
decisions about where to draw the line, so this document records where
the line is, the rationale for some past decisions, and some general
criteria that have been applied, including non-technical
considerations. For instance, user adoption and continued operation of
the project itself are also important factors.
## Significant areas

More details can be found below, but a concise list of areas in scope follows:
* Containerized workload execution and management
* Service discovery, load balancing, and routing
* Workload identity propagation and authentication
* Declarative resource management platform
* Command-line tool
* Web dashboard (UI)
* Cluster lifecycle tools
* Extensibility to support execution and management in diverse environments
* Multi-cluster management tools and systems
* Project GitHub automation and other process automation
* Project continuous build and test infrastructure
* Release tooling
* Documentation
* Usage data collection mechanisms
## Scope domains

Most decisions are regarding whether any part of the project should
undertake efforts in a particular area. However, decisions are
sometimes necessary for smaller scopes. The term "core" is sometimes
used, but is not well defined. The following are scopes that may be relevant:
* Kubernetes project GitHub orgs
  * All GitHub orgs
  * The kubernetes GitHub org
  * The kubernetes-sigs and kubernetes-incubator GitHub orgs
  * The kubernetes-client GitHub org
* Other GitHub orgs
* Release artifacts
  * The Kubernetes release bundle
  * Binaries built in kubernetes/kubernetes
    * “core” server components: apiserver, controller manager, scheduler, kube-proxy, kubelet
    * kubectl
    * kubeadm
  * Other images, packages, etc.
* The kubernetes/kubernetes repository (aka k/k)
  * master branch
  * kubernetes/kubernetes/master/pkg
  * kubernetes/kubernetes/master/staging
* [Functionality layers](architectural-roadmap.md)
  * required
  * pluggable
  * optional
  * usable independently of the rest of Kubernetes
## Other inclusion considerations

The Kubernetes project is a large, complex effort. When deciding
whether to take on new functionality, the following questions are
considered:

* Is the functionality consistent with the existing implementation
conventions, design principles, architecture, and direction?

* Do the subproject owners, approvers, reviewers, and regular contributors
agree to maintain the functionality?

* Do the contributors to the functionality agree to follow the
project’s development conventions and requirements, including CLA,
code of conduct, GitHub and build tooling, testing, documentation,
and release criteria, etc.?

* Does the functionality improve existing use cases, or mostly enable
new ones? The project isn't blocking new functionality outright
(rather, it is reducing the rate of expansion), but it is trying to
limit additions to kubernetes/kubernetes/master, and aims to improve the
quality of the functionality that already exists.

* Is it needed by project contributors? Example: We need cluster
creation and upgrade functionality in order to run end-to-end tests.

* Is it necessary in order to enable workload portability?

* Is it needed in order for upstream releases to be usable? For
example, functionality that users would otherwise have to
reverse-engineer Kubernetes to figure out, or copy code out of
Kubernetes itself to make work.

* Is it functionality that users expect, such as because other
container platforms and/or service discovery and routing mechanisms
provide it? If a capability that relates to Kubernetes's fundamental
purpose were to become table stakes in the industry, Kubernetes
would need to support it in order to stay relevant. (Whether it
would need to be addressed by the core project would depend on the
other criteria.)

* Is there sufficiently broad user demand and/or sufficient expected
user benefit for the functionality?

* Is there an adequate mechanism to discover, deploy, express a
dependency on, and upgrade the functionality if implemented using an
extension mechanism? Are there consistent notions of releases, maturity,
quality, version skew, conformance, etc. for extensions?

* Is it needed as a reference implementation exercising extension
points or other APIs?

* Is the functionality sufficiently general-purpose?

* Is it an area where we want to provide an opinionated solution
and/or where fragmentation would be problematic for users, or are
there many reasonable alternative approaches and solutions to the
problem?

* Is it an area where we want to foster exploration and innovation in
the ecosystem?

* Has the ecosystem produced adequate solutions on its own? For
instance, have ecosystem projects taken on requirements of the
Kubernetes project, if needed? Example: etcd3 added a number of features
and other improvements to benefit Kubernetes, so the project didn't
need to launch a separate storage effort.

* Is there an acceptable home for the recommended ecosystem solution(s)?
Example: the [CNCF Sandbox](https://github.com/cncf/toc/blob/master/process/sandbox.md) is one possible home.

* Has the functionality been provided by the project/release/component
historically?
## Technical scope details and rationale

### Containerized workload execution and management

Including:
* common general categories of workloads, such as stateless, stateful, batch, and cluster services
* provisioning, allocation, accessing, and managing compute, storage, and network resources on behalf of the workloads, and enforcement of security policies on those resources
* workload prioritization, capacity assessment, placement, and relocation (aka scheduling)
* graceful workload eviction
* local container image caching
* configuration and secret distribution
* manual and automatic horizontal and vertical scaling
* deployment, progressive (aka rolling) upgrades, and downgrades
* self-healing
* exposing container logs, status, health, and resource usage metrics for collection
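
Many of these capabilities surface together in a single Deployment
manifest. The following is a minimal sketch (the names and image are
illustrative) of declaring horizontal scale, rolling-upgrade behavior,
resource requests for scheduling, and a self-healing liveness probe:

```sh
# Minimal sketch: one Deployment exercising scaling, rolling upgrades,
# resource management, and self-healing. Names and image are illustrative.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                # horizontal scaling target
  strategy:
    type: RollingUpdate      # progressive (rolling) upgrades
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: web
        image: example.com/my-app:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:          # compute resources considered at placement
            cpu: 100m
            memory: 128Mi
        livenessProbe:       # failing probes trigger self-healing restarts
          httpGet:
            path: /healthz
            port: 8080
EOF
```
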
### Service discovery, load balancing, and routing

Including:
* endpoint tracking and discovery, including pod and non-pod endpoints
* the most common L4 and L7 Internet protocols (TCP, UDP, SCTP, HTTP, HTTPS)
* intra-cluster DNS configuration and serving
* external DNS configuration
* accessing external services (e.g., imported services, Open Service Broker)
* exposing traffic latency, throughput, and status metrics for collection
* access authorization
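
As a concrete illustration, a Service tracks ready pod endpoints by
label selector, load-balances L4 traffic across them, and is published
in intra-cluster DNS. A minimal sketch (names are illustrative):

```sh
# Minimal sketch: a ClusterIP Service fronting the pods labeled
# app=my-app. Inside the cluster it is discoverable via DNS as
# my-app.default.svc.cluster.local.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
spec:
  selector:
    app: my-app        # endpoints tracked from matching, ready pods
  ports:
  - protocol: TCP      # L4 protocol; UDP and SCTP are also supported
    port: 80           # port clients connect to
    targetPort: 8080   # port the container listens on
EOF
```
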
### Workload identity propagation and authentication

Including:
* internal identity (e.g., SPIFFE support)
* external identity (e.g., TLS certificate management)
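
One concrete mechanism in this space is the projected service account
token: a short-lived, audience-scoped identity token that the kubelet
mounts into the pod and rotates before expiry. A minimal sketch (the
audience, image, and paths are illustrative):

```sh
# Minimal sketch: mounting a short-lived, audience-scoped service
# account token into a pod. Audience, image, and paths are illustrative.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: identity-demo
spec:
  serviceAccountName: default
  containers:
  - name: app
    image: example.com/my-app:1.0
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
      readOnly: true
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          audience: my-service      # token is valid only for this audience
          expirationSeconds: 3600   # rotated by the kubelet before expiry
          path: token
EOF
```
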
### Declarative resource management platform

Including:
* CRUD API operations and behaviors, diff, patch, dry run, watch
* declarative updates (apply)
* resource type definition, registration, discovery, documentation, and validation mechanisms
* pluggable authentication, authorization, admission (API-level policy enforcement), and audit-logging mechanisms
* Namespace (resource scoping primitive) lifecycle
* resource instance persistence and garbage collection
* asynchronous event reporting
* API producer SDK
* API client SDK / libraries in widely used languages
* dynamic, resource-oriented CLI, as a reference implementation for interacting with the API and basic tool for declarative and imperative management
  * simplifies getting started and avoids the complexity of documenting the system with just, for instance, curl
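
The apply, diff, dry-run, and watch behaviors above compose into a
declarative workflow. A sketch using kubectl, assuming a manifest file
named app.yaml and an existing Deployment named my-app (both
illustrative):

```sh
# Reconcile the cluster toward the state declared in the file.
kubectl apply -f app.yaml

# Show what differs between the live state and the local manifest.
kubectl diff -f app.yaml

# Validate a change server-side without persisting it.
kubectl apply -f app.yaml --dry-run=server

# Stream changes to the resource as controllers act on it.
kubectl get deployment my-app --watch
```
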
### Command-line tool

Since some Kubernetes primitives are fairly low-level, in addition to
general-purpose resource-oriented operations, the CLI also supports
“porcelain” for common, simple, domain-specific operations (both
status/progress extraction and mutations) that don’t have discrete API
implementations, such as run, expose, rollout, cp, top, cordon, and
drain. There should also be support for non-resource-oriented APIs,
such as exec, logs, attach, port-forward, and proxy.
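
A few of these commands in use, as a sketch (resource names are
illustrative):

```sh
# Porcelain: domain-specific helpers layered over the resource APIs.
kubectl expose deployment my-app --port=80 --target-port=8080
kubectl rollout status deployment/my-app
kubectl cordon my-node

# Non-resource-oriented APIs: subresources streamed via the apiserver.
kubectl logs my-pod --follow
kubectl exec -it my-pod -- /bin/sh
kubectl port-forward pod/my-pod 8080:80
```
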
### Web dashboard (UI)

The project supported a dashboard, initially built into the apiserver,
almost from the beginning. Other projects in the space had UIs and
users expected one. There wasn’t a vendor-neutral one in the
ecosystem, however, and a solution was needed for the project's local
cluster environment, minikube. The dashboard has also served as a UI
reference implementation and a vehicle to drive conventions (e.g.,
around resource category terminology). It has also been useful as a
tool to demonstrate and learn about Kubernetes concepts, features,
and behaviors.
### Cluster lifecycle tools

Cluster lifecycle includes provisioning, bootstrapping,
upgrade/downgrade, and teardown. The project develops several such tools.
Tools are needed for the following scenarios/purposes:
* usability of upstream releases: at least one solution that can be used to bootstrap the upstream release (e.g., kubeadm)
* testing: solutions that can be used to run multi-node end-to-end tests (e.g., kind), integration tests, upgrade/downgrade tests, version-skew tests, scalability tests, and other types of tests the project deems necessary to ensure adequate release quality
* portable, low-dependency local environment: at least one local environment (e.g., minikube), in order to simplify documentation tutorials that require a cluster to exist
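
Each scenario maps to a different entry point. A sketch of typical
invocations (the flags and names are illustrative):

```sh
# Bootstrapping an upstream release on an existing machine (kubeadm).
kubeadm init --pod-network-cidr=10.244.0.0/16

# Creating a throwaway multi-node cluster for testing (kind).
kind create cluster --name e2e-test

# Starting a portable local environment for tutorials (minikube).
minikube start
```
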
### Extensibility to support execution and management in diverse environments

Including:
* CRI (Container Runtime Interface)
* CNI (Container Network Interface)
* CSI (Container Storage Interface)
* external cloud providers
* KMS providers
* Open Service Broker (OSB) brokers
* Cluster APIs
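
These interfaces decouple the core components from any particular
environment. For instance, CRI lets the kubelet (and debugging tools
such as crictl) drive any conforming container runtime over a local
socket. A sketch; the socket path is runtime-specific and illustrative:

```sh
# List running containers via the CRI socket of a conforming runtime
# (here containerd; the socket path is runtime-specific).
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps
```
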
### Multi-cluster management tools and systems

Many users desire to operate in and deploy applications to multiple
geographic locations and environments, even across multiple providers.
This generally requires managing multiple Kubernetes clusters. While
general deployment pipeline tools and continuous deployment systems
are not in scope, the project has explored multiple mechanisms to
simplify management of resources across multiple clusters, including
Federation v1, Federation v2, and the Cluster Registry API.
### Project GitHub automation and other process automation

As one of the largest, most active projects on GitHub, Kubernetes has
some extreme needs.

Including:
* prow
* gubernator
* velodrome and kettle
* website infrastructure
* k8s.io
### Project continuous build and test infrastructure

Including:
* prow
* tide
* triage dashboard
### Release tooling

Including:
* anago
### Documentation

Documentation of project-provided functionality and components, for
multiple audiences, including:
* application developers
* application operators
* cluster operators
* ecosystem developers
* distribution providers, and others who want to port Kubernetes to new environments
* project contributors
### Usage data collection mechanisms

Including:
* Spartakus
## Examples of projects and areas not in scope

Some of these are obvious, but many have been seriously deliberated in the
past.
* The resource instance store (etcd)
* Container runtimes, other than current grandfathered ones
* Network and storage plugins, other than current grandfathered ones
* CoreDNS
  * Since intra-cluster DNS is in scope, we need to ensure we have
some solution, which has been kubedns, but now that there is an
adequate alternative outside the project, we are adopting it.
* Service load balancers (e.g., Envoy, Linkerd), other than kube-proxy
* Cloud provider implementations, other than current grandfathered ones
* Container image build tools
* Image registries and distribution mechanisms
* Identity (user/group) sources of truth (e.g., LDAP)
* Key management systems (e.g., Vault)
* CI, CD, and GitOps (push to deploy) systems, other than
infrastructure used to build and test the Kubernetes project itself
* Application-level services, such as middleware (e.g., message
buses), data-processing frameworks (e.g., Spark), machine-learning
frameworks (e.g., Kubeflow), databases (e.g., MySQL), caches, or
cluster storage systems (e.g., Ceph) as built-in services. Such
components can run on Kubernetes, and/or can be accessed by
applications running on Kubernetes through portable mechanisms, such
as the Open Service Broker. Application-specific Operators (e.g.,
Cassandra Operator) are also not in scope.
* Application and cluster log aggregation and searching, application
and cluster monitoring aggregation and dashboarding (other than
heapster, which is grandfathered), alerting, application performance
management, tracing, and debugging tools
* General-purpose machine configuration (e.g., Chef, Puppet, Ansible,
Salt), maintenance, automation (e.g., Rundeck), and management systems
* Templating and configuration languages (e.g., jinja, jsonnet,
starlark, hcl, dhall, hocon)
* File packaging tools (e.g., helm, kpm, kubepack, duffle)
* Managing non-containerized applications in VMs, and other general
IaaS functionality
* Full Platform as a Service functionality
* Full Functions as a Service functionality
* [Workflow
orchestration](https://github.com/kubernetes/kubernetes/pull/24781#issuecomment-215914822):
"Workflow" is a very broad, diverse area, with solutions typically
tailored to specific use cases (e.g., data-flow graphs, data-driven
processing, deployment pipelines, event-driven automation,
business-process execution, iPaaS) and specific input and event
sources, and often requires arbitrary code to evaluate conditions,
actions, and/or failure handling.
* Other forms of human-oriented and programmatic interfaces over the
Kubernetes API other than “basic” CLIs (e.g., kubectl) and UI
(dashboard), such as mobile dashboards, IDEs, chat bots, SQL,
interactive shells, etc.