Handling of Services with type=LoadBalancer changed. Support of HostPort with SCTP is clarified (not supported).

This commit is contained in:
Janosi Laszlo 2018-07-12 08:09:09 +02:00 committed by janosi
parent 3bb64044f5
commit 683f9a0192
1 changed file with 38 additions and 27 deletions


@@ -42,34 +42,41 @@ superseded-by:
## Summary
The goal of the SCTP support feature is to enable the usage of the SCTP protocol in Kubernetes [Service][], [NetworkPolicy][], and [ContainerPort][] as an additional protocol value beside the current TCP and UDP options.
SCTP is an IETF protocol specified in [RFC4960][], and it is used widely in telecommunications network stacks.
Once SCTP support is added as a new protocol option, applications that require SCTP as the L4 protocol on their interfaces can be deployed on Kubernetes clusters in a more straightforward way. For example, they can use the native kube-dns based service discovery, and their communication can be controlled with native NetworkPolicies.
[Service]: https://kubernetes.io/docs/concepts/services-networking/service/
[NetworkPolicy]: https://kubernetes.io/docs/concepts/services-networking/network-policies/
[ContainerPort]: https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#exposing-pods-to-the-cluster
[RFC4960]: https://tools.ietf.org/html/rfc4960
## Motivation
SCTP is a widely used protocol in telecommunications. It would ease the management and execution of telecommunication applications on Kubernetes if SCTP were supported as a protocol option.
### Goals
Add SCTP support to Kubernetes ContainerPort, Service and NetworkPolicy, so applications running in pods can use the native kube-dns based service discovery for SCTP based services, and their communication can be controlled via native NetworkPolicies.
### Non-Goals
It is not a goal here to add SCTP support to load balancers that are provided by cloud providers.
It is not a goal to support multi-homed SCTP associations. Such support also depends on the ability to manage multiple IP addresses for a pod, and in the case of Services with ClusterIP or NodePort the support of multi-homed associations would also require support for NAT of multi-homed associations in iptables/ipvs.
It is not a goal to support SCTP as protocol value for the container's HostPort. The reason: [the usage of HostPort is not recommended by Kubernetes][], and ensuring proper interworking of HostPort with userspace SCTP stacks (see below) would require an additional kubelet/kubenet configuration option. In order to keep the complexity and impact of the introduction of SCTP at a low level we do not plan to support SCTP as a new protocol value for HostPort.
[the usage of HostPort is not recommended by Kubernetes]: https://kubernetes.io/docs/concepts/configuration/overview/#services
## Proposal
### User Stories [optional]
#### Service with SCTP and Virtual IP
As a user of Kubernetes I want to define Services with Virtual IPs for my applications that use SCTP as L4 protocol on their interfaces, so client applications can use the services of my applications on top of SCTP via that Virtual IP.
Example:
```
kind: Service
@@ -87,6 +94,7 @@ spec:
#### Headless Service with SCTP
As a user of Kubernetes I want to define headless Services for my applications that use SCTP as L4 protocol on their interfaces, so client applications can discover my applications in kube-dns, or via any other service discovery method that gets information about endpoints via the Kubernetes API.
Example:
```
kind: Service
@@ -103,7 +111,8 @@ spec:
targetPort: 9376
```
#### Service with SCTP without selector
As a user of Kubernetes I want to define Services without selector for my applications that use SCTP as L4 protocol on their interfaces, so I can implement my own service controllers if I want to extend the basic functionality of Kubernetes.
Example:
```
kind: Service
@@ -119,7 +128,7 @@ spec:
```
#### SCTP as container port protocol in Pod definition
As a user of Kubernetes I want to define containerPorts for the SCTP based interfaces of my applications.
Example:
```
apiVersion: v1
@@ -133,10 +142,12 @@ spec:
ports:
- name: diameter
protocol: SCTP
containerPort: 80
```
#### NetworkPolicy with SCTP
As a user of Kubernetes I want to define NetworkPolicies for my applications that use SCTP as L4 protocol on their interfaces, so the network controllers that support SCTP can control the accessibility of my applications on the SCTP based interfaces, too.
Example:
```
apiVersion: networking.k8s.io/v1
@@ -168,26 +179,28 @@ spec:
port: 7777
```
#### Userspace SCTP stack
As a user of Kubernetes I want to deploy and run my applications that use a userspace SCTP stack, and at the same time I want to define SCTP Services in the same cluster.
### Implementation Details/Notes/Constraints [optional]
#### SCTP in Services
The Kubernetes API modification for Services is obvious.
In the case of Services with ClusterIP, NodePort, or externalIP the selected port shall be reserved on the respective nodes, just like for TCP and UDP currently. Unfortunately, golang does not have native SCTP support in the "net" package, so in order to reserve those ports via the kernel's SCTP API we have to introduce a new 3rd party package as a new vendor package. We plan to use the go sctp library from github.com/ishidawataru/sctp.
For Services with type=LoadBalancer we reject the Service creation request for SCTP services at API validation time.
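For illustration, a Service manifest like the following sketch (the names and port numbers are hypothetical) would be rejected at API validation time under this proposal, because it combines type=LoadBalancer with protocol=SCTP:
```
kind: Service
apiVersion: v1
metadata:
  name: my-sctp-lb        # hypothetical name
spec:
  type: LoadBalancer      # not allowed together with SCTP
  selector:
    app: my-sctp-app
  ports:
  - protocol: SCTP
    port: 80
    targetPort: 9376
```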
Kube DNS shall support SRV records with "_sctp" as "proto" value. According to our investigations, the DNS controller is very flexible from this perspective, and it can create SRV records with any protocol name, i.e. there is no need for additional implementation to achieve this goal.
Example:
```
_diameter._sctp.my-service.default.svc.cluster.local. 30 IN SRV 10 100 1234 my-service.default.svc.cluster.local.
```
#### SCTP in the Pod's ContainerPort
The Kubernetes API modification for the Pod is obvious.
We reject the pod creation request for pods that have containers with the combination of a hostPort and SCTP as protocol at API validation time.
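As a sketch (names and port numbers are hypothetical), a pod definition like the following would be rejected at validation time, because one of its container ports combines hostPort with protocol=SCTP:
```
apiVersion: v1
kind: Pod
metadata:
  name: sctp-hostport-example   # hypothetical name
spec:
  containers:
  - name: app
    image: example/sctp-app     # hypothetical image
    ports:
    - protocol: SCTP
      containerPort: 5060
      hostPort: 5060            # hostPort + SCTP: rejected at validation
```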
#### SCTP in NetworkPolicy
The Kubernetes API modification for the NetworkPolicy is obvious.
@@ -195,37 +208,35 @@ The Kubernetes API modification for the NetworkPolicy is obvious.
In order to utilize the new protocol value the network controller must support it.
#### Interworking with applications that use a user space SCTP stack
A userspace SCTP stack implementation cannot work together with the SCTP kernel module (lksctp) on the same node. That is, the loading of the SCTP kernel module must be avoided on nodes where applications that use a userspace SCTP stack are planned to run. The problem comes with the introduction of the SCTP protocol option for Services with Virtual IP (the "type" of the Service is ClusterIP or NodePort): once such a service is created the relevant port reservation logic kicks in on every node in the cluster, it starts listening on the port, and as a consequence it loads the SCTP kernel module on every node. This immediately ruins the connectivity of the userspace SCTP applications on those nodes.
The same interworking problem stands for Services with "externalIP" defined.
NOTE! This is not a new interworking problem between userspace SCTP stack implementations and the SCTP kernel module; it is a known phenomenon. The userspace SCTP stack creates raw sockets with IPPROTO_SCTP. As it is clearly highlighted in the [documentation of raw sockets][]:
>Raw sockets may tap all IP protocols in Linux, even protocols like ICMP or TCP which have a protocol module in the kernel. In this case, the packets are passed to both the kernel module and the raw socket(s).
I.e. it is the normal function of the [kernel][] that it sends the incoming packet to both sides: the raw socket and the relevant kernel module. In this case the kernel module will handle those packets that are destined to the raw socket as Out of the blue (OOTB) packets according to the rules defined in [RFC4960][].
In order to resolve this problem the solution has been to dedicate nodes to userspace SCTP applications, and to ensure that on those nodes the SCTP kernel module is not loaded.
For this reason the main task here is to provide the same isolation possibility: i.e. to provide the option to dedicate some nodes to userspace SCTP applications and ensure that the actions performed by Kubernetes do not load the SCTP kernel modules on those dedicated nodes.
It is easy to separate application pods that use a userspace SCTP stack from those that use the kernel space SCTP stack: the usual nodeSelector label based mechanism, or taints, are there for this very purpose.
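As a sketch, assuming the dedicated nodes carry a hypothetical label such as sctp-stack=userspace, a pod using a userspace SCTP stack could be pinned to them with a plain nodeSelector:
```
apiVersion: v1
kind: Pod
metadata:
  name: userspace-sctp-pod          # hypothetical name
spec:
  nodeSelector:
    sctp-stack: userspace           # hypothetical label on the dedicated nodes
  containers:
  - name: app
    image: example/userspace-sctp-app   # hypothetical image
```
Taints on the dedicated nodes, combined with matching tolerations in such pods, can additionally keep kernel-SCTP pods off those nodes.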
The real challenge here is to ensure that when an SCTP Service is created in a Kubernetes cluster the Kubernetes logic does not create listening SCTP sockets on those nodes that are dedicated for the applications that use userspace SCTP stack - because such an action would trigger the loading of the kernel module.
There is no such challenge with regard to headless SCTP Services.
This is how our way of thinking goes:
The first task is to provide a way to dedicate nodes to userspace SCTP applications so that Kubernetes itself is aware of that role of those nodes. It may be achieved with a node level parameter. Based on that parameter kube-proxy would be aware of the role of the node and it would not create listening SCTP sockets for SCTP Services on the node.
NOTE! The handling of TCP and UDP Services does not change on those dedicated nodes.
NOTE! When the user defines SCTP ports to a container in a Pod definition that triggers the creation of a listening SCTP socket (and thus the loading of the SCTP kernel module) only on those nodes to which the pod is scheduled - i.e. the regular node selectors and taints can be used to avoid the collision of userspace SCTP stacks with the SCTP kernel module.
We propose the following alternatives for consideration in the community:
##### Documentation only
In this alternative we would describe in the Kubernetes documentation the mutually exclusive nature of userspace and kernel space SCTP stacks, and we would highlight that the new SCTP Service feature must not be used in clusters where userspace SCTP stack based applications are deployed, and in turn, userspace SCTP stack based applications cannot be deployed in clusters where kernel space SCTP stack based applications have already been deployed. We would also highlight that the usage of headless SCTP Services is possible because such services do not trigger the creation of listening SCTP sockets, thus they do not trigger the loading of the SCTP kernel module on every node.
We would also describe that SCTP must not be used as protocol value in the Pod/container definition for those applications that use a userspace SCTP stack.
##### A node level parameter to dedicate nodes for userspace SCTP applications