Modified based on the first round of comments. The userspace-SCTP-related part was reworked based on further investigation.
parent 106b8e7570
commit f368332fe0
@@ -5,14 +5,14 @@ authors:
- "@janosi"
owning-sig: sig-network
participating-sigs:
- sig-cloud-provider
- sig-network
reviewers:
- "@thockin"
approvers:
- "@thockin"
editor: TBD
creation-date: 2018-06-14
last-updated: 2018-06-22
status: provisional
see-also:
- PR64973
@@ -21,47 +21,9 @@ superseded-by:
---

# SCTP support

<!---
This is the title of the KEP.
Keep it simple and descriptive.
A good title can help communicate what the KEP is and should be considered as part of any review.

The *filename* for the KEP should include the KEP number along with the title.
The title should be lowercased and spaces/punctuation should be replaced with `-`.
As the KEP is approved and an official KEP number is allocated, the file should be renamed.

To get started with this template:

1. **Pick a hosting SIG.**
   Make sure that the problem space is something the SIG is interested in taking up.
   KEPs should not be checked in without a sponsoring SIG.
1. **Allocate a KEP number.**
   Do this by (a) taking the next number in the `NEXT_KEP_NUMBER` file and (b) incrementing that number.
   Include the updated `NEXT_KEP_NUMBER` file in your PR.
1. **Make a copy of this template.**
   Name it `NNNN-YYYYMMDD-my-title.md` where `NNNN` is the KEP number that was allocated.
1. **Fill out the "overview" sections.**
   This includes the Summary and Motivation sections.
   These should be easy if you've preflighted the idea of the KEP with the appropriate SIG.
1. **Create a PR.**
   Assign it to folks in the SIG that are sponsoring this process.
1. **Merge early.**
   Avoid getting hung up on specific details and instead aim to get the goal of the KEP merged quickly.
   The best way to do this is to just start with the "Overview" sections and fill out details incrementally in follow-on PRs.
   View anything marked as `provisional` as a working document and subject to change.
   Aim for single-topic PRs to keep discussions focused.
   If you disagree with what is already in a document, open a new PR with suggested changes.

The canonical place for the latest set of instructions (and the likely source of this file) is [here](/keps/0000-kep-template.md).

The `Metadata` section above is intended to support the creation of tooling around the KEP process.
This will be a YAML section that is fenced as a code block.
See the KEP process for details on each of these items.
--->

## Table of Contents

<!---
A table of contents is helpful for quickly jumping to sections of a KEP and for highlighting any additional information provided beyond the standard KEP template.
[Tools for generating][] a table of contents from markdown are available.
--->

* [Table of Contents](#table-of-contents)
* [Summary](#summary)
* [Motivation](#motivation)
@@ -78,22 +40,12 @@ A table of contents is helpful for quickly jumping to sections of a KEP and for
* [Drawbacks [optional]](#drawbacks-optional)
* [Alternatives [optional]](#alternatives-optional)

[Tools for generating]: https://github.com/ekalinin/github-markdown-toc

## Summary

The goal of the SCTP support feature is to enable the usage of the SCTP protocol in Kubernetes [Service][] and [NetworkPolicy][] as an additional protocol option besides the current TCP and UDP options.
SCTP is an IETF protocol specified in [RFC4960][], and it is widely used in telecommunications network stacks.
Once SCTP support is added as a new protocol option for Service and NetworkPolicy, applications that require SCTP as the L4 protocol on their interfaces can be deployed on Kubernetes clusters in a more straightforward way. For example, they can use the native kube-dns based service discovery, and their communication can be controlled with native NetworkPolicies.

<!---
The `Summary` section is incredibly important for producing high quality user focused documentation such as release notes or a development road map.
It should be possible to collect this information before implementation begins in order to avoid requiring implementors to split their attention between writing release notes and implementing the feature itself.
KEP editors, SIG Docs, and SIG PM should help to ensure that the tone and content of the `Summary` section is useful for a wide audience.

A good summary is probably at least a paragraph in length.
--->

[Service]: https://kubernetes.io/docs/concepts/services-networking/service/
[NetworkPolicy]: https://kubernetes.io/docs/concepts/services-networking/network-policies/
[RFC4960]: https://tools.ietf.org/html/rfc4960
@@ -102,42 +54,19 @@ A good summary is probably at least a paragraph in length.

SCTP is a widely used protocol in telecommunications. It would ease the management and execution of telecommunication applications on Kubernetes if SCTP were added as a protocol option to Kubernetes Service and NetworkPolicy.

<!---
This section is for explicitly listing the motivation, goals and non-goals of this KEP.
Describe why the change is important and the benefits to users.
The motivation section can optionally provide links to [experience reports][] to demonstrate the interest in a KEP within the wider Kubernetes community.

[experience reports]: https://github.com/golang/go/wiki/ExperienceReports
--->

### Goals

Add SCTP support to Kubernetes Service and NetworkPolicy, so that applications running in pods can use the native kube-dns based service discovery for SCTP based services, and so that their communication can be controlled via native NetworkPolicies.

<!---
List the specific goals of the KEP.
How will we know that this has succeeded?
--->

### Non-Goals

It is not a goal here to add SCTP support to the load balancers that are provided by cloud providers. That is, the Kubernetes user can define Services with type=LoadBalancer and protocol=SCTP, but if the actual load balancer implementation does not support SCTP, the creation of the Service/load balancer fails.
It is not a goal to support multi-homed SCTP associations.

<!---
What is out of scope for this KEP?
Listing non-goals helps to focus discussion and make progress.
--->

## Proposal

<!---
This is where we get down to the nitty gritty of what the proposal actually is.
--->

### User Stories [optional]

<!---
Detail the things that people will be able to do if this KEP is implemented.
Include as much detail as possible so that people can understand the "how" of the system.
The goal here is to make this feel real for users without getting bogged down.
--->

#### Service with SCTP and Virtual IP

As a user of Kubernetes I want to define Services with Virtual IPs for my applications that use SCTP as the L4 protocol on their interfaces, so that client applications can reach the services of my applications over SCTP via that Virtual IP.
Example:
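
The manifest of this example lies outside the hunks shown below; the following is a minimal sketch of what it presumably contains, relying only on `SCTP` becoming a valid `protocol` value (the name and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-sctp-service   # hypothetical name
spec:
  selector:
    app: MyApp
  ports:
  - protocol: SCTP
    port: 80
    targetPort: 9376
```

Since no `clusterIP` is set, a Virtual IP is allocated automatically and clients reach the backing pods over SCTP via that IP.
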
@@ -168,7 +97,7 @@ spec:
    app: MyApp
  clusterIP: None
  ports:
  - protocol: SCTP
    port: 80
    targetPort: 9376
```

@@ -183,7 +112,7 @@ metadata:
spec:
  clusterIP: None
  ports:
  - protocol: SCTP
    port: 80
    targetPort: 9376
```

@@ -224,68 +153,40 @@ As a user of Kubernetes I want to deploy and run my applications that use a user

### Implementation Details/Notes/Constraints [optional]

<!---
What are the caveats to the implementation?
What are some important details that didn't come across above?
Go into as much detail as necessary here.
This might be a good place to talk about core concepts and how they relate.
--->

#### SCTP in Services

The Kubernetes API modification for Services is straightforward: `SCTP` becomes a valid value of the `protocol` field in the Service port definition, beside `TCP` and `UDP`.
The selected port shall be reserved on the node, just like it is for TCP and UDP today. Unfortunately, Go does not have native SCTP support in the `net` package, so in order to access the kernel's SCTP API we have to introduce a third-party package as a new vendor package.
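
As an illustration only, a port-reservation sketch written against the github.com/ishidawataru/sctp package (one candidate for the vendored package; the library choice and the node port below are assumptions of this sketch, not decisions of this KEP):

```go
// Illustrative sketch: hold an SCTP port on the node, analogous to how
// kube-proxy holds TCP/UDP node ports today.
package main

import (
	"log"
	"net"

	"github.com/ishidawataru/sctp" // assumed vendor package
)

func main() {
	// Bind the wildcard address so no other process can take the port
	// while the Service exists.
	laddr := &sctp.SCTPAddr{
		IPAddrs: []net.IPAddr{{IP: net.IPv4zero}},
		Port:    30080, // hypothetical node port
	}
	ln, err := sctp.ListenSCTP("sctp", laddr)
	if err != nil {
		log.Fatalf("failed to reserve SCTP port %d: %v", laddr.Port, err)
	}
	defer ln.Close()
	log.Printf("SCTP port %d reserved", laddr.Port)
}
```
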
For Services with type=LoadBalancer we have to check how the cloud provider implementations handle new protocols, and we have to make sure that if SCTP is not supported, the request for a new load balancer, firewall rule, etc. with protocol=SCTP is rejected gracefully.
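
For illustration, such a request would take the shape below (names and ports are hypothetical); if the provider's load balancer cannot handle SCTP, creating this Service must fail cleanly rather than silently misbehave:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-sctp-lb        # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: MyApp
  ports:
  - protocol: SCTP        # rejected gracefully if the provider lacks SCTP support
    port: 5060
    targetPort: 5060
```
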
DNS shall support SRV records with "_sctp" as the "proto" value.
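For example, a named port of an SCTP Service would then be discoverable under a name of the form `_my-port._sctp.my-service.my-namespace.svc.cluster.local` (the port, Service, and namespace names here are illustrative), mirroring the existing `_tcp` and `_udp` SRV conventions.
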
#### SCTP in NetworkPolicy

The Kubernetes API modification for NetworkPolicy is likewise straightforward: `SCTP` becomes a valid value of the `protocol` field in the NetworkPolicy port definition.
In order to utilize the new protocol value, the network controller that enforces NetworkPolicy must support it as well.

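A sketch of the intended usage, assuming the SCTP option simply mirrors today's TCP/UDP port entries (all names and ports below are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-sctp-ingress   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: MyApp
  policyTypes:
  - Ingress
  ingress:
  - ports:
    - protocol: SCTP
      port: 9376
```
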
#### Interworking with applications that use a userspace SCTP stack

A userspace SCTP stack implementation cannot work together with the SCTP kernel module (lksctp) on the same node. That is, the loading of the SCTP kernel module must be avoided on nodes where applications that use a userspace SCTP stack are planned to run. The problem comes with the introduction of the SCTP protocol option for Services with Virtual IP: once such a Service is created, the relevant iptables/ipvs management logic kicks in on every node, and as a consequence it loads the SCTP kernel module.
NOTE: this is not a new interworking problem between userspace SCTP stack implementations and the SCTP kernel module; it is a known phenomenon. The solution has been to dedicate nodes to userspace SCTP applications and to ensure that the SCTP kernel module is not loaded on those nodes.

For this reason the main task here is to provide the same isolation option: i.e. to make it possible to dedicate some nodes to userspace SCTP applications and to ensure that Kubernetes does not load the SCTP kernel module on those dedicated nodes.
It is easy to separate application pods that use a userspace SCTP stack from application pods that use the kernel SCTP stack: the usual nodeSelector/label based mechanism, or taints, exist for this very purpose.
The real challenge is to ensure that when an SCTP Service is created in a Kubernetes cluster, Kubernetes does not create iptables or ipvs rules on the nodes that are dedicated to applications with a userspace SCTP stack, because such an action would trigger the loading of the kernel module, while at the same time those applications can still access the newly created SCTP Service via the ClusterIP of that Service (assuming the new Service has a ClusterIP allocated). There is no such challenge with regard to headless SCTP Services.

Our thinking goes as follows:
The first task is to provide a way to dedicate nodes to userspace SCTP applications such that Kubernetes itself is aware of that role of those nodes. This may be achieved with a node-level parameter, e.g. in kube-proxy. Based on that parameter, kube-proxy would be aware of the role of the node and would not apply iptables or ipvs rules for SCTP Services on that node.
If a node is dedicated to userspace SCTP applications, then whatever proxy solution runs on that node shall use a userspace SCTP stack as well. That is, on those nodes we need a userspace proxy for SCTP Services. Whether this userspace proxy shall be an extension of the current kube-proxy or a new, independent proxy is to be discussed. We are aware of the plans whose goal is to remove the userspace part of kube-proxy; however, we think this situation is different from the cases where the userspace kube-proxy is used for TCP or UDP traffic. That is, even if the current TCP/UDP related userspace logic is removed from kube-proxy, its foundations could be reused for this case.
The userspace proxy would then follow the current high-level logic of kube-proxy: it would listen on an IP address of the local node, and it would establish connections to the application pods that provide the service.
The next task is to ensure that the packets that applications send to the ClusterIP end up in the userspace proxy. This requires a careful setup of the iptables or ipvs rules on the node, so that they do not trigger the loading of the SCTP kernel module. It means that those rules cannot filter on the actual protocol value (SCTP); i.e. we end up with rules that simply forward traffic for the ClusterIP to the local host IP on which the userspace proxy listens. The consequence is that such a Service definition can contain only SCTP ports; TCP or UDP ports must not be used in that Service definition.
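
A sketch of such a protocol-agnostic rule (the ClusterIP and the proxy address are illustrative; the point is the absence of `-p sctp`, so netfilter never pulls in the SCTP kernel module):

```sh
# Forward everything addressed to the ClusterIP (10.96.12.34 here) to the
# node-local address the userspace proxy listens on (10.0.0.5 here),
# without matching on the protocol.
iptables -t nat -A PREROUTING -d 10.96.12.34/32 -j DNAT --to-destination 10.0.0.5
```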

NOTE: the handling of TCP and UDP Services does not change on those dedicated nodes; i.e. the current iptables/ipvs/etc. mechanisms can still be used for them.

### Risks and Mitigations

<!---
What are the risks of this proposal and how do we mitigate them?
Think broadly.
For example, consider both security and how this will impact the larger Kubernetes ecosystem.
--->

## Graduation Criteria

<!---
How will we know that this has succeeded?
Gathering user feedback is crucial for building high quality experiences and SIGs have the important responsibility of setting milestones for stability and completeness.
Hopefully the content previously contained in [umbrella issues][] will be tracked in the `Graduation Criteria` section.

[umbrella issues]: https://github.com/kubernetes/kubernetes/issues/42752
--->

## Implementation History

<!---
Major milestones in the life cycle of a KEP should be tracked in `Implementation History`.
Major milestones might include:

- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
- the `Proposal` section being merged, signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
--->

## Drawbacks [optional]

<!---
Why should this KEP _not_ be implemented?
--->

## Alternatives [optional]

<!---
Similar to the `Drawbacks` section, the `Alternatives` section is used to highlight and record other possible approaches to delivering the value proposed by a KEP.
--->