Merge pull request #480 from deads2k/ext-02-position

move sig-api-machinery api extensions position to community

Commit: d635ca855a
# API Extensions (SIG API Machinery position statement)

Authors: Daniel Smith, David Eads (SIG API Machinery co-leads)

Last edit: Feb 23

Status: RELEASED
## Background

We have observed a lot of confusion in the community around the general topic
of ThirdPartyResources (TPRs) and apiserver aggregation (AA). We want to
document the current position of the API Machinery SIG.

Extremely briefly, TPR is a mechanism for lightweight, easy extension of the
Kubernetes API, which has collected a [significant userbase](https://gist.github.com/philips/a97a143546c87b86b870a82a753db14c).
AA is a heavier-weight mechanism for accomplishing a similar task; it is
targeted at allowing the Kubernetes project to move away from a monolithic
apiserver and, as a consequence, it will support PaaSes and other users that
need the complete set of server-side Kubernetes API semantics.
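For concreteness, registering a TPR today takes a single manifest; the group
and kind names below are illustrative (the `cron-tab` example from the current
Kubernetes documentation), not prescribed by this document:

```yaml
# Illustrative TPR registration; the group/kind names are examples only.
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  # <kind>.<group>: declares kind "CronTab" in API group "stable.example.com"
  name: cron-tab.stable.example.com
description: "A specification of a cron-style scheduled job"
versions:
  - name: v1
```

Once this object exists, kube-apiserver serves (and stores in etcd) arbitrary
`CronTab` objects under `/apis/stable.example.com/v1/`, with no further code
from the extension author; that is the low barrier to entry discussed below.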
## Positions

### Q: Do we need two extension mechanisms, or should we provide a single extension mechanism with multiple opt-in features for users to grow into? (Binary vs gradient)
We think there is room in this space, and a need, for both approaches.
TPR is clearly useful to users. In its current state, TPR lacks some features
and has some bugs which limit it. We believe TPR’s bugs should be fixed and
some features should be added (as long as it maintains its ease of use, which
we think is its primary feature). We think TPR’s competitive advantage is its
low barrier to entry and ease of use.

However, even in the limit where we have added all the features to TPR that
make sense, there is still a need for apiserver aggregation. Here are two use
cases that TPR cannot address while maintaining its ease of use:
* Heapster / metrics API. The metrics API is going to be data assembled at read
  time, which is extremely high churn and should not be stored in an etcd
  instance. Heapster needs to use custom storage.
* Full-featured extension APIs (pieces of Kubernetes itself; PaaSes).
  * OpenShift is an example of a full-featured API server that makes use of the
    apimachinery and apiserver features (API versioning, conversion, defaulting,
    serialization (including protocol buffer encoding), storage, security,
    custom subresource handlers, and admission).
  * Integrators who wish to provide this level of features and expect this
    level of API traffic volume are unlikely to be satisfied by webhooks, but
    should still be able to integrate.
  * If Kubernetes developers could create new APIs in new apiservers instead
    of modifying the core apiserver, it would make life better for everyone:
    * Easier to shard reviews
    * Easier to experiment with APIs
    * No more accidentally enabling a half-baked API
    * Code freeze/release train less disruptive
  * It would be great if it were possible to run these extensions (including
    OpenShift, other PaaSes, and various optional extensions such as the
    service catalog) directly on an existing Kubernetes cluster; in fact, we
    think that the alternative to this is a multiplication of forks, which
    would be really bad for the ecosystem as a whole. With ecosystem
    unification in mind, it would be infeasible to ask any consumer with both
    many users and an extensive codebase (such as OpenShift) to rewrite their
    stack in terms of TPRs and webhooks. We have to give such users a path to
    straight consumption as opposed to the current fork-and-modify approach,
    which has been the only feasible one for far too long.
This is not to say that TPR should stay in its current form. The API Machinery
SIG is committed to finishing TPR, making it usable, and maintaining it (but we
need volunteers to step up, or it’s going to take a long time).

The big table in [Eric’s comparison doc](https://docs.google.com/document/d/1y16jKL2hMjQO0trYBJJSczPAWj8vAgNFrdTZeCincmI/edit#heading=h.xugwibxye5f0)
is a good place to learn the current and possible future feature sets of TPRs
and AA. The fact that TPR has been languishing is due to lack of an owner and
lack of people willing to work on it, not lack of belief that it ought to be
fixed and perfected. Eric and Anirudh have agreed to take on this role.
### Q: Should there be a single API object that programs either TPR or AA as appropriate, or should each of these have their own registration object?

We think that the configuration of these two mechanisms is distinct enough that
two API resources are appropriate.

We do need to take care to provide a good user experience, since the API groups
users enter in both AA and TPR come out of the same global namespace. For
example, a user should not have to make both a TPR registration and an AA
registration to start up a TPR; that would break current users of TPRs.
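To make the contrast concrete: where a TPR registration names only a group,
kind, and versions, an aggregation registration must additionally point the
cluster at a separately running backend server. A sketch of such an
`APIService` object follows, using the shape this API later stabilized on (the
exact schema postdates this document, and the group and Service names here are
hypothetical):

```yaml
# Illustrative aggregation registration; the API group and backing Service
# are hypothetical, and the field set postdates this document.
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  # <version>.<group> of the API being delegated to the extension server
  name: v1alpha1.metrics.example.com
spec:
  group: metrics.example.com
  version: v1alpha1
  service:
    # In-cluster Service fronting the extension apiserver that actually
    # implements /apis/metrics.example.com/v1alpha1/...
    name: metrics-server
    namespace: kube-system
  groupPriorityMinimum: 100
  versionPriority: 15
```

The aggregator proxies requests under `/apis/metrics.example.com/` to that
Service; nothing about the extension’s storage or semantics is declared here,
which is why the two registration objects carry genuinely different
configuration.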
### Q: Should TPRs be fixed up and extended in place, or should a replacement be built in a separate TPR apiserver?

TPR is currently implemented as a variety of special cases sprinkled
throughout the kube-apiserver code. It would greatly simplify both the
kube-apiserver code and the TPR implementation if the two were separated, with
TPR constructed as its own HTTP server (but still run from kube-apiserver; see
the final question below). However, we will not block safe, targeted TPR fixes
on completion of this split.
### Q: Should TPR maintain compatibility, or should we break compatibility to fix and extend it?

There are two dozen open-source projects that use TPR, we know of private
users of TPR as well, and at least some people consider it to be beta.
However, we may have to implement fixes in a way that requires breaking
backward compatibility. If we do that, we will at a minimum provide migration
instructions and go through a one-release deprecation cycle to give users time
to switch over to the new version. We think this decision is probably best made
by the people actually working on this (currently: @deads2k, @erictune,
@foxish). [Some thoughts here](https://docs.google.com/document/d/1Gg158jO1cRBq-8RrWRAWA2IRF9avscuRaWFmY2Wb6qw/edit).
### Q: Should kube-aggregator be a separate binary/process from kube-apiserver?

For code health reasons, it is very convenient to separate the aggregation
layer entirely from the apiserver. Operationally, however, it is extremely
inconvenient to set up and run an additional binary. Additionally, it is
crucial that all extensibility functionality be present in every cluster,
because users need to be able to depend on it; this argues that
kube-aggregator can’t be optional.

Our current plan is to host several logical apiservers (the existing
kube-apiserver, kube-aggregator, and perhaps a hypothetical kube-tprserver,
see above) in a single binary, and launch them in a single process (a drop-in
replacement for the existing kube-apiserver). There are several candidate
mechanisms for accomplishing this, and we won’t design them in this document. :)