add dockershim checkpoint proposal (#255)
This commit is contained in:
parent
36af529096
commit
2cb35f20ab
|
@ -0,0 +1,127 @@
|
|||
# CRI: Dockershim PodSandbox Checkpoint
|
||||
|
||||
## Umbrella Issue
|
||||
[#34672](https://github.com/kubernetes/kubernetes/issues/34672)
|
||||
|
||||
## Background
|
||||
[Container Runtime Interface (CRI)](../devel/container-runtime-interface.md)
|
||||
is an ongoing project to allow container runtimes to integrate with
|
||||
kubernetes via a newly-defined API.
|
||||
[Dockershim](https://github.com/kubernetes/kubernetes/blob/release-1.5/pkg/kubelet/dockershim)
|
||||
is the Docker CRI implementation. This proposal aims to introduce
|
||||
checkpoint mechanism in dockershim.
|
||||
|
||||
## Motivation
|
||||
### Why do we need checkpoint?
|
||||
|
||||
|
||||
With CRI, Kubelet only passes configurations (SandboxConfig,
|
||||
ContainerConfig and ImageSpec) when creating sandbox, container and
|
||||
image, and only use the reference id to manage them after creation.
|
||||
However, information in configuration is not only needed during creation.
|
||||
|
||||
In the case of dockershim with CNI network plugin, CNI plugins needs
|
||||
the same information from PodSandboxConfig at creation and deletion.
|
||||
|
||||
```
|
||||
Kubelet ---------------------------------
|
||||
| RunPodSandbox(PodSandboxConfig)
|
||||
| StopPodSandbox(PodSandboxID)
|
||||
V
|
||||
Dockershim-------------------------------
|
||||
| SetUpPod
|
||||
| TearDownPod
|
||||
V
|
||||
Network Plugin---------------------------
|
||||
| ADD
|
||||
| DEL
|
||||
V
|
||||
CNI plugin-------------------------------
|
||||
```
|
||||
|
||||
|
||||
In addition, checkpoint helps to improve the reliability of dockershim.
|
||||
With checkpoints, critical information for disaster recovery could be
|
||||
preserved. Kubelet makes decisions based on the reported pod states
|
||||
from runtime shims. Dockershim currently gathers states from docker
|
||||
engine. However, in case of disaster, docker engine may lose all
|
||||
container information, including the reference ids. Without necessary
|
||||
information, kubelet and dockershim could not conduct proper clean up.
|
||||
For example, if docker containers are removed underneath kubelet, reference
|
||||
to the allocated IPs and iptables setup for the pods are also lost.
|
||||
This leads to resource leak and potential iptables rule conflict.
|
||||
|
||||
### Why checkpoint in dockershim?
|
||||
- CNI specification does not require CNI plugins to be stateful. And CNI
|
||||
specification does not provide interface to retrieve states from CNI plugins.
|
||||
- Currently there is no uniform checkpoint requirements across existing runtime shims.
|
||||
- Need to preserve backward compatibility for kubelet.
|
||||
- Easier to maintain backward compatibility by checkpointing at a lower level.
|
||||
|
||||
## PodSandbox Checkpoint
|
||||
Checkpoint file will be created for each PodSandbox. Files will be
|
||||
placed under `/var/lib/dockershim/sandbox/`. File name will be the
|
||||
corresponding `PodSandboxID`. File content will be json encoded.
|
||||
Data structure is as follows:
|
||||
|
||||
```go
|
||||
const schemaVersion = "v1"
|
||||
|
||||
type Protocol string
|
||||
|
||||
// PortMapping is the port mapping configurations of a sandbox.
|
||||
type PortMapping struct {
|
||||
// Protocol of the port mapping.
|
||||
Protocol *Protocol `json:"protocol,omitempty"`
|
||||
// Port number within the container.
|
||||
ContainerPort *int32 `json:"container_port,omitempty"`
|
||||
// Port number on the host.
|
||||
HostPort *int32 `json:"host_port,omitempty"`
|
||||
}
|
||||
|
||||
// CheckpointData contains all types of data that can be stored in the checkpoint.
|
||||
type CheckpointData struct {
|
||||
PortMappings []*PortMapping `json:"port_mappings,omitempty"`
|
||||
}
|
||||
|
||||
// PodSandboxCheckpoint is the checkpoint structure for a sandbox
|
||||
type PodSandboxCheckpoint struct {
|
||||
// Version of the pod sandbox checkpoint schema.
|
||||
Version string `json:"version"`
|
||||
// Pod name of the sandbox. Same as the pod name in the PodSpec.
|
||||
Name string `json:"name"`
|
||||
// Pod namespace of the sandbox. Same as the pod namespace in the PodSpec.
|
||||
Namespace string `json:"namespace"`
|
||||
// Data to checkpoint for pod sandbox.
|
||||
Data *CheckpointData `json:"data,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
## Workflow Changes
|
||||
|
||||
|
||||
`RunPodSandbox` creates checkpoint:
|
||||
```
|
||||
() --> Pull Image --> Create Sandbox Container --> (Create Sandbox Checkpoint) --> Start Sandbox Container --> Set Up Network --> ()
|
||||
```
|
||||
|
||||
`RemovePodSandbox` removes checkpoint:
|
||||
```
|
||||
() --> Remove Sandbox --> (Remove Sandbox Checkpoint) --> ()
|
||||
```
|
||||
|
||||
`ListPodSandbox` need to include all PodSandboxes as long as their
|
||||
checkpoint files exist. If sandbox checkpoint exists but sandbox
|
||||
container could not be found, the PodSandbox object will include
|
||||
PodSandboxID, namespace and name. PodSandbox state will be `PodSandboxState_SANDBOX_NOTREADY`.
|
||||
|
||||
`StopPodSandbox` and `RemovePodSandbox` need to conduct proper error handling to ensure idempotency.
|
||||
|
||||
|
||||
|
||||
## Future extensions
|
||||
This proposal is mainly driven by networking use cases. More could be added into checkpoint.
|
||||
|
||||
|
||||
|
Loading…
Reference in New Issue