Moving files to the sig-api-machinery folder - URLs fixed
Fix links pointing to the new location of controllers.md; fix URLs pointing to api-conventions.md; add tombstone files; remove caps from the folder name (/devel/api-machinery, lowercase); update URLs; change URLs to relative paths.
parent 1c3a44adae
commit f0dd87ad47
@@ -16,7 +16,7 @@ In Kubernetes, declarative abstractions are primary, rather than layered on top
Kubernetes supports declarative control by recording user intent as the desired state in its API resources. This enables a single API schema for each resource to serve as a declarative data model, as both a source and a target for automated components (e.g., autoscalers), and even as an intermediate representation for resource transformations prior to instantiation.
The intent is carried out by asynchronous [controllers](https://github.com/kubernetes/community/blob/master/contributors/devel/controllers.md), which interact through the Kubernetes API. Controllers don’t access the state store, etcd, directly, and don’t communicate via private direct APIs. Kubernetes itself does expose some features similar to key-value stores such as etcd and [Zookeeper](https://zookeeper.apache.org/), however, in order to facilitate centralized [state and configuration management and distribution](https://sysgears.com/articles/managing-configuration-of-distributed-system-with-apache-zookeeper/) to decentralized components.
The intent is carried out by asynchronous [controllers](/contributors/devel/sig-api-machinery/controllers.md), which interact through the Kubernetes API. Controllers don’t access the state store, etcd, directly, and don’t communicate via private direct APIs. Kubernetes itself does expose some features similar to key-value stores such as etcd and [Zookeeper](https://zookeeper.apache.org/), however, in order to facilitate centralized [state and configuration management and distribution](https://sysgears.com/articles/managing-configuration-of-distributed-system-with-apache-zookeeper/) to decentralized components.
Controllers continuously strive to make the observed state match the desired state, and report back their status to the apiserver asynchronously. All of the state, desired and observed, is made visible through the API to users and to other controllers. The API resources serve as coordination points, common intermediate representation, and shared state.
@@ -125,4 +125,4 @@ And get:
Kubernetes API resource specifications are designed for humans to directly author and read as declarative configuration data, as well as to enable composable configuration tools and automated systems to manipulate them programmatically. We chose this simple approach of using literal API resource specifications for configuration, rather than other representations, because it was natural, given that we designed the API to support CRUD on declarative primitives. The API schema must already be well defined, documented, and supported. With this approach, there’s no other representation to keep up to date with new resources and versions, or to require users to learn. [Declarative configuration](https://goo.gl/T66ZcD) is only one client use case; there are also CLIs (e.g., kubectl), UIs, deployment pipelines, etc. The user will need to interact with the system in terms of the API in these other scenarios, and knowledge of the API transfers to other clients and tools. Additionally, configuration, macro/substitution, and templating languages are generally more difficult to manipulate programmatically than pure data, and involve complexity/expressiveness tradeoffs that prevent one solution being ideal for all use cases. Such languages/tools could be layered over the native API schemas, if desired, but they should not assume exclusive control over all API fields, because doing so obstructs automation and creates undesirable coupling with the configuration ecosystem.
The Kubernetes Resource Model encourages separation of concerns by supporting multiple distinct configuration sources and preserving declarative intent while allowing automatically set attributes. Properties not explicitly declaratively managed by the user are free to be changed by other clients, enabling the desired state to be cooperatively determined by both users and systems. This is achieved by an operation, called [**Apply**](https://docs.google.com/document/d/1q1UGAIfmOkLSxKhVg7mKknplq3OTDWAIQGWMJandHzg/edit#heading=h.xgjl2srtytjt) ("make it so"), that performs a 3-way merge of the previous configuration, the new configuration, and the live state. A 2-way merge operation, called [strategic merge patch](https://github.com/kubernetes/community/blob/master/contributors/devel/strategic-merge-patch.md), enables patches to be expressed using the same schemas as the resources themselves. Such patches can be used to perform automated updates without custom mutation operations, common updates (e.g., container image updates), combinations of configurations of orthogonal concerns, and configuration customization, such as for overriding properties of variants.
The Kubernetes Resource Model encourages separation of concerns by supporting multiple distinct configuration sources and preserving declarative intent while allowing automatically set attributes. Properties not explicitly declaratively managed by the user are free to be changed by other clients, enabling the desired state to be cooperatively determined by both users and systems. This is achieved by an operation, called [**Apply**](https://docs.google.com/document/d/1q1UGAIfmOkLSxKhVg7mKknplq3OTDWAIQGWMJandHzg/edit#heading=h.xgjl2srtytjt) ("make it so"), that performs a 3-way merge of the previous configuration, the new configuration, and the live state. A 2-way merge operation, called [strategic merge patch](https://git.k8s.io/community/contributors/devel/sig-api-machinery/strategic-merge-patch.md), enables patches to be expressed using the same schemas as the resources themselves. Such patches can be used to perform automated updates without custom mutation operations, common updates (e.g., container image updates), combinations of configurations of orthogonal concerns, and configuration customization, such as for overriding properties of variants.
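
To make the 3-way merge concrete, here is a minimal sketch using the `strategicpatch` package from k8s.io/apimachinery. This illustrates the mechanism only, not the actual `kubectl apply` implementation, and the JSON literals are invented for the example:

```go
package main

import (
	"fmt"

	"k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

func main() {
	// original: the configuration previously applied by the user.
	// modified: the new configuration the user wants to apply.
	// current:  the live state, which another client has extended with a sidecar.
	original := []byte(`{"spec":{"containers":[{"name":"nginx","image":"nginx-1.0"}]}}`)
	modified := []byte(`{"spec":{"containers":[{"name":"nginx","image":"nginx-1.1"}]}}`)
	current := []byte(`{"spec":{"containers":[{"name":"nginx","image":"nginx-1.0"},{"name":"sidecar","image":"sidecar-1.0"}]}}`)

	// v1.Pod supplies the patchStrategy/patchMergeKey struct tags, so the merge
	// is keyed on container name and the sidecar added by the other client is
	// left untouched.
	patch, err := strategicpatch.CreateThreeWayMergePatch(original, modified, current, v1.Pod{}, false)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
	// Expected (roughly): {"spec":{"containers":[{"image":"nginx-1.1","name":"nginx"}]}}
}
```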
@@ -6,7 +6,7 @@ Support multi-fields merge key in Strategic Merge Patch.
## Background
Strategic Merge Patch is covered in this [doc](/contributors/devel/strategic-merge-patch.md).
Strategic Merge Patch is covered in this [doc](/contributors/devel/sig-api-machinery/strategic-merge-patch.md).
In Strategic Merge Patch, we use a merge key to identify the entries in a list of non-primitive types. It must always be present and unique in order to perform the merge on such a list, and it will be preserved.
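
For concreteness, the merge key is declared on the API types themselves; for example, this is the tag on `PodSpec.Containers` in k8s.io/api/core/v1 (protobuf tag omitted here):

```go
// Containers is merged as a list of maps keyed by each container's "name".
Containers []Container `json:"containers" patchStrategy:"merge" patchMergeKey:"name"`
```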
@@ -4,7 +4,7 @@ Author: @mengqiy
## Motivation
Background of the Strategic Merge Patch is covered [here](../devel/strategic-merge-patch.md).
Background of the Strategic Merge Patch is covered [here](/contributors/devel/sig-api-machinery/strategic-merge-patch.md).
The Kubernetes API may apply semantic meaning to the ordering of items within a list; however, the strategic merge patch does not keep the ordering of elements.
@@ -292,7 +292,7 @@ As the figure below shows, the CSI snapshot controller architecture consists of
* External snapshotter uses ControllerGetCapabilities to find out if CSI driver supports CREATE_DELETE_SNAPSHOT calls. It degrades to trivial mode if not.
* External snapshotter is responsible for creating/deleting snapshots and binding snapshot and SnapshotContent objects. It follows the [controller](https://github.com/kubernetes/community/blob/master/contributors/devel/controllers.md) pattern and uses informers to watch for `VolumeSnapshot` and `VolumeSnapshotContent` create/update/delete events. It filters out `VolumeSnapshot` instances with `Snapshotter==<CSI driver name>` and processes these events in workqueues with exponential backoff.
* External snapshotter is responsible for creating/deleting snapshots and binding snapshot and SnapshotContent objects. It follows the [controller](/contributors/devel/sig-api-machinery/controllers.md) pattern and uses informers to watch for `VolumeSnapshot` and `VolumeSnapshotContent` create/update/delete events. It filters out `VolumeSnapshot` instances with `Snapshotter==<CSI driver name>` and processes these events in workqueues with exponential backoff.
* For a dynamically created snapshot, it should have a VolumeSnapshotClass associated with it. The user can explicitly specify a VolumeSnapshotClass in the VolumeSnapshot API object. If the user does not specify a VolumeSnapshotClass, a default VolumeSnapshotClass created by the admin will be used. This is similar to how a default StorageClass created by the admin will be used for the provisioning of a PersistentVolumeClaim.
File diff suppressed because it is too large
@@ -1,191 +1,3 @@
# Writing Controllers
This file has moved to https://git.k8s.io/community/contributors/devel/sig-api-machinery/controllers.md.
This file is a placeholder to preserve links. Please remove by April 24, 2019 or the release of kubernetes 1.13, whichever comes first.
@@ -1,50 +1,3 @@
# Generation and release cycle of clientset
This file has moved to https://git.k8s.io/community/contributors/devel/sig-api-machinery/generating-clientset.md.
This file is a placeholder to preserve links. Please remove by April 24, 2019 or the release of kubernetes 1.13, whichever comes first.
@@ -0,0 +1,191 @@
# Writing Controllers

A Kubernetes controller is an active reconciliation process. That is, it watches some object for the world's desired state, and it watches the world's actual state, too. Then, it sends instructions to try and make the world's current state be more like the desired state.

The simplest implementation of this is a loop:

```go
for {
	desired := getDesiredState()
	current := getCurrentState()
	makeChanges(desired, current)
}
```

Watches, etc., are all merely optimizations of this logic.

## Guidelines

When you're writing controllers, there are a few guidelines that will help make sure you get the results and performance you're looking for.

1. Operate on one item at a time. If you use a `workqueue.Interface`, you'll be able to queue changes for a particular resource and later pop them in multiple “worker” gofuncs with a guarantee that no two gofuncs will work on the same item at the same time.

   Many controllers must trigger off multiple resources (I need to "check X if Y changes"), but nearly all controllers can collapse those into a queue of “check this X” based on relationships. For instance, a ReplicaSet controller needs to react to a pod being deleted, but it does that by finding the related ReplicaSets and queuing those (see the event-handler sketch after this list).

1. Random ordering between resources. When controllers queue off multiple types of resources, there is no guarantee of ordering amongst those resources.

   Distinct watches are updated independently. Even with an objective ordering of “created resourceA/X” and “created resourceB/Y”, your controller could observe “created resourceB/Y” and “created resourceA/X”.

1. Level driven, not edge driven. Just like having a shell script that isn't running all the time, your controller may be off for an indeterminate amount of time before running again.

   If an API object appears with a marker value of `true`, you can't count on having seen it turn from `false` to `true`, only that you now observe it being `true`. Even an API watch suffers from this problem, so be sure that you're not counting on seeing a change unless your controller is also marking the information it last made the decision on in the object's status.

1. Use `SharedInformers`. `SharedInformers` provide hooks to receive notifications of adds, updates, and deletes for a particular resource. They also provide convenience functions for accessing shared caches and determining when a cache is primed.

   Use the factory methods down in https://git.k8s.io/kubernetes/staging/src/k8s.io/client-go/informers/factory.go to ensure that you are sharing the same instance of the cache as everyone else.

   This saves us connections against the API server, duplicate serialization costs server-side, duplicate deserialization costs controller-side, and duplicate caching costs controller-side.

   You may see other mechanisms like reflectors and deltafifos driving controllers. Those were older mechanisms that we later used to build the `SharedInformers`. You should avoid using them in new controllers.

1. Never mutate original objects! Caches are shared across controllers; this means that if you mutate your "copy" (actually a reference or shallow copy) of an object, you'll mess up other controllers (not just your own).

   The most common point of failure is making a shallow copy and then mutating a map, like `Annotations`. Use `api.Scheme.Copy` to make a deep copy.

1. Wait for your secondary caches. Many controllers have primary and secondary resources. Primary resources are the resources that you'll be updating `Status` for. Secondary resources are resources that you'll be managing (creating/deleting) or using for lookups.

   Use the `framework.WaitForCacheSync` function to wait for your secondary caches before starting your primary sync functions. This will make sure that things like the Pod count for a ReplicaSet aren't working off of known out-of-date information that results in thrashing.

1. There are other actors in the system. Just because you haven't changed an object doesn't mean that somebody else hasn't.

   Don't forget that the current state may change at any moment--it's not sufficient to just watch the desired state. If you use the absence of objects in the desired state to indicate that things in the current state should be deleted, make sure you don't have a bug in your observation code (e.g., act before your cache has filled).

1. Percolate errors to the top level for consistent re-queuing. We have a `workqueue.RateLimitingInterface` to allow simple requeuing with reasonable backoffs.

   Your main controller func should return an error when requeuing is necessary. When it isn't, it should use `utilruntime.HandleError` and return nil instead. This makes it very easy for reviewers to inspect error handling cases and to be confident that your controller doesn't accidentally lose things it should retry for.

1. Watches and Informers will “sync”. Periodically, they will deliver every matching object in the cluster to your `Update` method. This is good for cases where you may need to take additional action on the object, but sometimes you know there won't be more work to do.

   In cases where you are *certain* that you don't need to requeue items when there are no new changes, you can compare the resource version of the old and new objects. If they are the same, you skip requeuing the work (the event-handler sketch after this list shows this comparison). Be careful when you do this. If you ever skip requeuing your item on failures, you could fail, not requeue, and then never retry that item again.

1. If the primary resource your controller is reconciling supports ObservedGeneration in its status, make sure you correctly set it to metadata.Generation whenever the values of the two fields mismatch.

   This lets clients know that the controller has processed a resource. Make sure that your controller is the main controller that is responsible for that resource; otherwise, if you need to communicate observation via your own controller, you will need to create a different kind of ObservedGeneration in the Status of the resource.

1. Consider using owner references for resources that result in the creation of other resources (e.g., a ReplicaSet results in creating Pods). Thus you ensure that child resources are going to be garbage-collected once a resource managed by your controller is deleted. For more information on owner references, read more [here](/contributors/design-proposals/api-machinery/controller-ref.md).

   Pay special attention to the way you are doing adoption. You shouldn't adopt children for a resource when either the parent or the children are marked for deletion. If you are using a cache for your resources, you will likely need to bypass it with a direct API read in case you observe that an owner reference has been updated for one of the children. Thus, you ensure your controller is not racing with the garbage collector.

   See [k8s.io/kubernetes/pull/42938](https://github.com/kubernetes/kubernetes/pull/42938) for more information.
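
Below is a minimal sketch of the event-handler patterns referenced in guidelines 1 and 9 above: collapsing a watch on Pods into a queue of owning ReplicaSets, and skipping periodic resyncs by comparing resource versions. The `enqueueOwningReplicaSet` helper and the `c.queue` field are illustrative (the latter mirrors the Rough Structure below); `cache`, `metav1`, and `v1` are the usual client-go, apimachinery, and core API packages.

```go
// Illustrative only: how a controller watching Pods can queue the related
// ReplicaSets instead of processing pods directly (guideline 1), and how an
// update handler can skip periodic resyncs (guideline 9).
pods.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
	UpdateFunc: func(old, new interface{}) {
		oldPod := old.(*v1.Pod)
		newPod := new.(*v1.Pod)
		if oldPod.ResourceVersion == newPod.ResourceVersion {
			// Periodic resyncs redeliver unchanged objects; identical
			// resource versions mean there is no new work to do.
			return
		}
		c.enqueueOwningReplicaSet(newPod)
	},
	DeleteFunc: func(obj interface{}) {
		pod, ok := obj.(*v1.Pod)
		if !ok {
			// On deletes, the object may arrive wrapped in a tombstone.
			tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
			if !ok {
				return
			}
			if pod, ok = tombstone.Obj.(*v1.Pod); !ok {
				return
			}
		}
		c.enqueueOwningReplicaSet(pod)
	},
})

// enqueueOwningReplicaSet is a hypothetical helper: it resolves the pod's
// controlling owner reference and queues "check this ReplicaSet".
func (c *Controller) enqueueOwningReplicaSet(pod *v1.Pod) {
	if ref := metav1.GetControllerOf(pod); ref != nil && ref.Kind == "ReplicaSet" {
		c.queue.Add(pod.Namespace + "/" + ref.Name)
	}
}
```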

## Rough Structure

Overall, your controller should look something like this:

```go
type Controller struct {
	// pods gives cached access to pods.
	pods       informers.PodLister
	podsSynced cache.InformerSynced

	// queue is where incoming work is placed to de-dup and to allow "easy"
	// rate limited requeues on errors
	queue workqueue.RateLimitingInterface
}

func NewController(pods informers.PodInformer) *Controller {
	c := &Controller{
		pods:       pods.Lister(),
		podsSynced: pods.Informer().HasSynced,
		queue:      workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "controller-name"),
	}

	// register event handlers to fill the queue with pod creations, updates and deletions
	pods.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			key, err := cache.MetaNamespaceKeyFunc(obj)
			if err == nil {
				c.queue.Add(key)
			}
		},
		UpdateFunc: func(old interface{}, new interface{}) {
			key, err := cache.MetaNamespaceKeyFunc(new)
			if err == nil {
				c.queue.Add(key)
			}
		},
		DeleteFunc: func(obj interface{}) {
			// IndexerInformer uses a delta nodeQueue, therefore for deletes we have to use this
			// key function.
			key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj)
			if err == nil {
				c.queue.Add(key)
			}
		},
	})

	return c
}

func (c *Controller) Run(threadiness int, stopCh chan struct{}) {
	// don't let panics crash the process
	defer utilruntime.HandleCrash()
	// make sure the work queue is shutdown which will trigger workers to end
	defer c.queue.ShutDown()

	glog.Infof("Starting <NAME> controller")

	// wait for your secondary caches to fill before starting your work
	if !cache.WaitForCacheSync(stopCh, c.podsSynced) {
		return
	}

	// start up your worker threads based on threadiness. Some controllers
	// have multiple kinds of workers
	for i := 0; i < threadiness; i++ {
		// runWorker will loop until "something bad" happens. The .Until will
		// then rekick the worker after one second
		go wait.Until(c.runWorker, time.Second, stopCh)
	}

	// wait until we're told to stop
	<-stopCh
	glog.Infof("Shutting down <NAME> controller")
}

func (c *Controller) runWorker() {
	// hot loop until we're told to stop. processNextWorkItem will
	// automatically wait until there's work available, so we don't worry
	// about secondary waits
	for c.processNextWorkItem() {
	}
}

// processNextWorkItem deals with one key off the queue. It returns false
// when it's time to quit.
func (c *Controller) processNextWorkItem() bool {
	// pull the next work item from queue. It should be a key we use to lookup
	// something in a cache
	key, quit := c.queue.Get()
	if quit {
		return false
	}
	// you always have to indicate to the queue that you've completed a piece of
	// work
	defer c.queue.Done(key)

	// do your work on the key. This method will contain your "do stuff" logic
	err := c.syncHandler(key.(string))
	if err == nil {
		// if you had no error, tell the queue to stop tracking history for your
		// key. This will reset things like failure counts for per-item rate
		// limiting
		c.queue.Forget(key)
		return true
	}

	// there was a failure so be sure to report it. This method allows for
	// pluggable error handling which can be used for things like
	// cluster-monitoring
	utilruntime.HandleError(fmt.Errorf("%v failed with : %v", key, err))

	// since we failed, we should requeue the item to work on later. This
	// method will add a backoff to avoid hotlooping on particular items
	// (they're probably still not going to work right away) and overall
	// controller protection (everything I've done is broken, this controller
	// needs to calm down or it can starve other useful work) cases.
	c.queue.AddRateLimited(key)

	return true
}
```
@@ -0,0 +1,50 @@
# Generation and release cycle of clientset

Client-gen is an automatic tool that generates a [clientset](../design-proposals/api-machinery/client-package-structure.md#high-level-client-sets) based on API types. This doc introduces the use of client-gen and the release cycle of the generated clientsets.

## Using client-gen

The workflow includes three steps:

**1.** Marking API types with tags: in `pkg/apis/${GROUP}/${VERSION}/types.go`, mark the types (e.g., Pods) that you want to generate clients for with the `// +genclient` tag. If the resource associated with the type is not namespace scoped (e.g., PersistentVolume), you need to append the `// +genclient:nonNamespaced` tag as well.

The following `// +genclient` tags are supported:

- `// +genclient` - generate default client verb functions (*create*, *update*, *delete*, *get*, *list*, *patch*, *watch* and, depending on the existence of the `.Status` field in the type the client is generated for, also *updateStatus*).
- `// +genclient:nonNamespaced` - all verb functions are generated without namespace.
- `// +genclient:onlyVerbs=create,get` - only listed verb functions will be generated.
- `// +genclient:skipVerbs=watch` - all default client verb functions will be generated **except** the *watch* verb.
- `// +genclient:noStatus` - skip generation of the *updateStatus* verb even though the `.Status` field exists.

In some cases you want to generate non-standard verbs (e.g., for sub-resources). To do that you can use the following generator tag:

- `// +genclient:method=Scale,verb=update,subresource=scale,input=k8s.io/api/extensions/v1beta1.Scale,result=k8s.io/api/extensions/v1beta1.Scale` - in this case a new function `Scale(string, *v1beta.Scale) *v1beta.Scale` will be added to the default client and the body of the function will be based on the *update* verb. The optional *subresource* argument will make the generated client function use the subresource `scale`. Using the optional *input* and *result* arguments you can override the default type with a custom type. If the import path is not given, the generator will assume the type exists in the same package.

In addition, the following optional tags influence the client generation:

- `// +groupName=policy.authorization.k8s.io` – used in the fake client as the full group name (defaults to the package name),
- `// +groupGoName=AuthorizationPolicy` – a CamelCase Golang identifier to de-conflict groups with non-unique prefixes like `policy.authorization.k8s.io` and `policy.k8s.io`. These would lead to two `Policy()` methods in the clientset otherwise (defaults to the upper-case first segment of the group name).
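
For illustration, a hypothetical types.go fragment combining some of the tags above might look like the following; the `Example` type and its spec/status stubs are invented for this sketch and do not belong to any real API group:

```go
import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// +genclient
// +genclient:nonNamespaced
// +genclient:skipVerbs=watch

// Example is a hypothetical cluster-scoped API type, shown only to
// illustrate where the client-gen tags go in pkg/apis/${GROUP}/${VERSION}/types.go.
type Example struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ExampleSpec   `json:"spec,omitempty"`
	Status ExampleStatus `json:"status,omitempty"`
}

// ExampleSpec and ExampleStatus are empty stubs for the sketch.
type ExampleSpec struct{}
type ExampleStatus struct{}
```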

**2a.** If you are developing in the k8s.io/kubernetes repository, you just need to run hack/update-codegen.sh.

**2b.** If you are running client-gen outside of k8s.io/kubernetes, you need to use the command line argument `--input` to specify the groups and versions of the APIs you want to generate clients for; client-gen will then look into `pkg/apis/${GROUP}/${VERSION}/types.go` and generate clients for the types you have marked with the `genclient` tags. For example, to generate a clientset named "my_release" including clients for api/v1 objects and extensions/v1beta1 objects, you need to run:

```
$ client-gen --input="api/v1,extensions/v1beta1" --clientset-name="my_release"
```

**3.** ***Adding expansion methods***: client-gen only generates the common methods, such as CRUD. You can manually add additional methods through the expansion interface. For example, this [file](https://git.k8s.io/kubernetes/pkg/client/clientset_generated/internalclientset/typed/core/internalversion/pod_expansion.go) adds additional methods to Pod's client. As a convention, we put the expansion interface and its methods in the file ${TYPE}_expansion.go. In most cases, you don't want to remove existing expansion files. So to make life easier, instead of creating a new clientset from scratch, ***you can copy and rename an existing clientset (so that all the expansion files are copied)***, and then run client-gen.

## Output of client-gen

- clientset: the clientset will be generated at `pkg/client/clientset_generated/` by default, and you can change the path via the `--clientset-path` command line argument.

- Individual typed clients and the client for a group: they will be generated at `pkg/client/clientset_generated/${clientset_name}/typed/generated/${GROUP}/${VERSION}/`

## Released clientsets

If you are contributing code to k8s.io/kubernetes, try to use the generated clientset [here](https://git.k8s.io/kubernetes/pkg/client/clientset_generated/internalclientset).

If you need a stable Go client to build your own project, please refer to the [client-go repository](https://github.com/kubernetes/client-go).

We are migrating k8s.io/kubernetes to use client-go as well; see issue [#35159](https://github.com/kubernetes/kubernetes/issues/35159).
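
As a minimal sketch of consuming a released clientset via client-go (the kubeconfig path is illustrative, and the `List` signature matches the client-go releases contemporary with this doc; newer releases also take a context.Context):

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build client configuration from a kubeconfig file (path is illustrative).
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	// NewForConfig returns the generated clientset, with one typed client per group/version.
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// Use the generated typed client to list pods in the default namespace.
	pods, err := clientset.CoreV1().Pods("default").List(metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("There are %d pods in the namespace\n", len(pods.Items))
}
```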
@@ -0,0 +1,449 @@
Strategic Merge Patch
=====================

# Background

Kubernetes supports a customized version of JSON merge patch called strategic merge patch. This patch format is used by `kubectl apply`, `kubectl edit` and `kubectl patch`, and contains specialized directives to control how specific fields are merged.

In the standard JSON merge patch, JSON objects are always merged but lists are always replaced. Often that isn't what we want. Let's say we start with the following Pod:

```yaml
spec:
  containers:
  - name: nginx
    image: nginx-1.0
```

and we POST that to the server (as JSON). Then let's say we want to *add* a container to this Pod.

```yaml
PATCH /api/v1/namespaces/default/pods/pod-name
spec:
  containers:
  - name: log-tailer
    image: log-tailer-1.0
```

If we were to use standard Merge Patch, the entire container list would be replaced with the single log-tailer container. However, our intent is for the container lists to merge together based on the `name` field.

To solve this problem, Strategic Merge Patch uses the go struct tag of the API objects to determine what lists should be merged and which ones should not. The metadata is available as struct tags on the API objects themselves and also available to clients as [OpenAPI annotations](https://github.com/kubernetes/kubernetes/blob/master/api/openapi-spec/README.md#x-kubernetes-patch-strategy-and-x-kubernetes-patch-merge-key). In the above example, the `patchStrategy` metadata for the `containers` field would be `merge` and the `patchMergeKey` would be `name`.
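
A client can exercise this merge locally via the `strategicpatch` package in k8s.io/apimachinery. The following is a minimal sketch applying the example patch above; the package location is the one vendored by Kubernetes at the time of writing:

```go
package main

import (
	"fmt"

	"k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

func main() {
	// The original Pod spec and the patch from the example above, as JSON.
	original := []byte(`{"spec":{"containers":[{"name":"nginx","image":"nginx-1.0"}]}}`)
	patch := []byte(`{"spec":{"containers":[{"name":"log-tailer","image":"log-tailer-1.0"}]}}`)

	// v1.Pod carries the patchStrategy/patchMergeKey struct tags, so the two
	// container lists are merged by "name" rather than replaced.
	merged, err := strategicpatch.StrategicMergePatch(original, patch, v1.Pod{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(merged))
	// The result contains both the nginx and log-tailer containers.
}
```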

# Basic Patch Format

Strategic Merge Patch supports special operations through directives.

There are multiple directives:

- replace
- merge
- delete
- delete from primitive list

`replace`, `merge` and `delete` are mutually exclusive.

## `replace` Directive

### Purpose

The `replace` directive indicates that the element that contains it should be replaced instead of being merged.

### Syntax

The `replace` directive is used both in the patch with a directive marker and in go struct tags.

Example usage in the patch:

```
$patch: replace
```

### Example

The `replace` directive can be used on both maps and lists.

#### Map

To indicate that a map should not be merged and instead should be taken literally:

```yaml
$patch: replace # recursive and applies to all fields of the map it's in
containers:
- name: nginx
  image: nginx-1.0
```

#### List of Maps

To override the container list to be strictly replaced, regardless of the default:

```yaml
containers:
- name: nginx
  image: nginx-1.0
- $patch: replace # any further $patch operations nested in this list will be ignored
```

## `delete` Directive

### Purpose

The `delete` directive indicates that the element that contains it should be deleted.

### Syntax

The `delete` directive is used only in the patch with a directive marker. It can be used on both maps and lists of maps.

```
$patch: delete
```

### Example

#### List of Maps

To delete an element of a list that should be merged:

```yaml
containers:
- name: nginx
  image: nginx-1.0
- $patch: delete
  name: log-tailer # merge key and value goes here
```

Note: The delete operation will delete all entries in the list that match the merge key.

#### Maps

One way to delete a map is using the `delete` directive. Applying this patch will delete the rollingUpdate map:

```yaml
rollingUpdate:
  $patch: delete
```

An equivalent way to delete this map is:

```yaml
rollingUpdate: null
```
## `merge` Directive

### Purpose

The `merge` directive indicates that the element that contains it should be merged instead of being replaced.

### Syntax

The `merge` directive is used only in the go struct tags.

## `deleteFromPrimitiveList` Directive

### Purpose

We have two patch strategies for lists of primitives: replace and merge. Replace is the default patch strategy for lists; it will replace the whole list on update and it will preserve the order, while the merge strategy treats the list as an unordered set. We call a primitive list with the merge strategy an unordered set. The patch strategy is defined in the go struct tag of the API objects.

The `deleteFromPrimitiveList` directive indicates that the elements in this list should be deleted from the original primitive list.

### Syntax

It is used only as the prefix of the key in the patch.

```
$deleteFromPrimitiveList/<keyOfPrimitiveList>: [a primitive list]
```

### Example

##### List of Primitives (Unordered Set)

`finalizers` uses `merge` as its patch strategy:

```go
Finalizers []string `json:"finalizers,omitempty" patchStrategy:"merge" protobuf:"bytes,14,rep,name=finalizers"`
```

Suppose we have defined a `finalizers` list and we call it the original finalizers:

```yaml
finalizers:
- a
- b
- c
```

To delete items "b" and "c" from the original finalizers, the patch will be:

```yaml
# The directive includes the prefix $deleteFromPrimitiveList
# followed by a '/' and the name of the list.
# The values in this list will be deleted after applying the patch.
$deleteFromPrimitiveList/finalizers:
- b
- c
```

After applying the patch on the original finalizers, it will become:

```yaml
finalizers:
- a
```

Note: When merging two sets, the primitives are first deduplicated and then merged. In an erroneous case, the set may be created with duplicates. Deleting an item that has duplicates will delete all matching items.
## `setElementOrder` Directive

### Purpose

The `setElementOrder` directive provides a way to specify the order of a list. The relative order specified in this directive will be retained. Please refer to the [proposal](/contributors/design-proposals/cli/preserve-order-in-strategic-merge-patch.md) for more information.

### Syntax

It is used only as the prefix of the key in the patch.

```
$setElementOrder/<keyOfList>: [a list]
```

### Example

#### List of Primitives

Suppose we have a list of `finalizers`:

```yaml
finalizers:
- a
- b
- c
```

To reorder the elements in the list, we can send a patch:

```yaml
# The directive includes the prefix $setElementOrder
# followed by a '/' and the name of the list.
$setElementOrder/finalizers:
- b
- c
- a
```

After applying the patch, it will be:

```yaml
finalizers:
- b
- c
- a
```

#### List of Maps

Suppose we have a list of `containers` whose `mergeKey` is `name`:

```yaml
containers:
- name: a
  ...
- name: b
  ...
- name: c
  ...
```

To reorder the elements in the list, we can send a patch:

```yaml
# each map in the list should only include the mergeKey
$setElementOrder/containers:
- name: b
- name: c
- name: a
```

After applying the patch, it will be:

```yaml
containers:
- name: b
  ...
- name: c
  ...
- name: a
  ...
```
## `retainKeys` Directive

### Purpose

The `retainKeys` directive provides a mechanism for union types to clear mutually exclusive fields. When this directive is present in the patch, all the fields not in this directive will be cleared. Please refer to the [proposal](/contributors/design-proposals/api-machinery/add-new-patchStrategy-to-clear-fields-not-present-in-patch.md) for more information.

### Syntax

```
$retainKeys: [a list of field keys]
```

### Example

#### Map

Suppose we have a union type:

```
union:
  foo: a
  other: b
```

And we have a patch:

```
union:
  $retainKeys:
  - another
  - bar
  another: d
  bar: c
```

After applying this patch, we get:

```
union:
  # Fields foo and other have been cleared without explicitly setting them to null.
  another: d
  bar: c
```
# Changing patch format
|
||||
|
||||
As issues and limitations have been discovered with the strategic merge
|
||||
patch implementation, it has been necessary to change the patch format
|
||||
to support additional semantics - such as merging lists of
|
||||
primitives and defining order when merging lists.
|
||||
|
||||
## Requirements for any changes to the patch format
|
||||
|
||||
**Note:** Changes to the strategic merge patch must be backwards compatible such
|
||||
that patch requests valid in previous versions continue to be valid.
|
||||
That is, old patch formats sent by old clients to new servers with
|
||||
must continue to function correctly.
|
||||
|
||||
Previously valid patch requests do not need to keep the exact same
|
||||
behavior, but do need to behave correctly.
|
||||
|
||||
**Example:** if a patch request previously randomized the order of elements
|
||||
in a list and we want to provide a deterministic order, we must continue
|
||||
to support old patch format but we can make the ordering deterministic
|
||||
for the old format.
|
||||
|
||||
### Client version skew
|
||||
|
||||
Because the server does not publish which patch versions it supports,
|
||||
and it silently ignores patch directives that it does not recognize,
|
||||
new patches should behave correctly when sent to old servers that
|
||||
may not support all of the patch directives.
|
||||
|
||||
While the patch API must be backwards compatible, it must also
|
||||
be forward compatible for 1 version. This is needed because `kubectl` must
|
||||
support talking to older and newer server versions without knowing what
|
||||
parts of patch are supported on each, and generate patches that work correctly on both.
|
||||
|
||||
## Strategies for introducing new patch behavior
|
||||
|
||||
#### 1. Add optional semantic meaning to the existing patch format.
|
||||
|
||||
**Note:** Must not require new data or elements to be present that was not required before. Meaning must not break old interpretation of old patches.
|
||||
|
||||
**Good Example:**
|
||||
|
||||
Old format
|
||||
- ordering of elements in patch had no meaning and the final ordering was arbitrary
|
||||
|
||||
New format
|
||||
- ordering of elements in patch has meaning and the final ordering is deterministic based on the ordering in the patch
|
||||
|
||||
**Bad Example:**
|
||||
|
||||
Old format
|
||||
- fields not present in a patch for Kind foo are ignored
|
||||
- unmodified fields for Kind foo are optional in patch request
|
||||
|
||||
New format
|
||||
- fields not present in a patch for Kind foo are cleared
|
||||
- unmodified fields for Kind foo are required in patch request
|
||||
|
||||
This example won't work, because old patch formats will contain data that is now
|
||||
considered required. To support this, introduce a new directive to guard the
|
||||
new patch format.

#### 2. Add support for new directives in the patch format

- Optional directives may be introduced to change how the patch is applied by the server - **backwards compatible** (old patch against newer server).
  - May control how the patch is applied
  - May contain patch information - such as elements to delete from a list
  - Must NOT impose new requirements on the old patch format

- New patch requests should be a superset of old patch requests - **forwards compatible** (newer patch against older server).
  - *Old servers will ignore directives they do not recognize*
  - Must include the full patch that would have been sent before the new directives were added
  - Must NOT rely on the directive being supported by the server

**Good Example:**

Old format
- fields not present in a patch for Kind foo are ignored
- unmodified fields for Kind foo are optional in patch request

New format *without* directive
- Same as old

New format *with* directive
- fields not present in a patch for Kind foo are cleared
- unmodified fields for Kind foo are required in patch request

In this example, the behavior was unchanged when the directive was missing,
retaining the old behavior for old patch requests.
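
The `retainKeys` directive shown earlier is a concrete instance of this pattern: the clear-fields-not-present semantics apply only when the patch itself opts in. A sketch against the earlier `union` example:

```yaml
# Patch without the directive: old behavior, fields foo and other are untouched.
union:
  bar: c
---
# Patch with the directive: new behavior, keys of union not listed
# under $retainKeys (here foo and other) are cleared.
union:
  $retainKeys:
  - bar
  bar: c
```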

**Bad Example:**

Old format
- fields not present in a patch for Kind foo are ignored
- unmodified fields for Kind foo are optional in patch request

New format *with* directive
- Same as old

New format *without* directive
- fields not present in a patch for Kind foo are cleared
- unmodified fields for Kind foo are required in patch request

In this example, the behavior was changed when the directive was missing,
breaking compatibility.

## Alternatives

The previous strategy is necessary because there is no notion of
patch versions. Having the client negotiate the patch version
with the server would allow changing the patch format, but at
the cost of supporting multiple patch formats in the server and client.
Using client-provided directives to evolve how a patch is merged
provides some limited support for multiple versions.

@@ -1,449 +1,3 @@
This file has moved to https://git.k8s.io/community/contributors/devel/sig-api-machinery/strategic-merge-patch.md.

This file is a placeholder to preserve links. Please remove by April 24, 2019 or the release of kubernetes 1.13, whichever comes first.