kruise game 0.3.0 (#112)
Signed-off-by: “skkkkkkk” <sk01199367@alibaba-inc.com>
Co-authored-by: “skkkkkkk” <sk01199367@alibaba-inc.com>
|
@@ -26,7 +26,7 @@
|
|||
|
||||
## 为什么OpenKruiseGame(OKG)是一个工作负载
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/workload.png').default} width="90%" />
|
||||
|
||||
游戏服云原生化核心要解决两个问题,游戏服的生命周期管理与游戏服的运维管理。Kubernetes内置了一些通用的工作负载模型,例如:无状态(Deployment)、有状态(StatefulSet)、任务(Job)等。但是,游戏服的状态管理不论从粒度还是确定性上面都有更高的要求。例如:游戏服需要热更新的机制来确保更短的游戏中断;游戏服需要原地更新确保元数据信息(网络为主)不变;游戏服需要确保在自动伸缩过程中只有0玩家的游戏服可以下线;需要具备手动运维/诊断/隔离任意一个游戏服的能力等。这些都是Kubernetes内置负载不能够解决的问题。
|
||||
|
||||
|
@@ -54,7 +54,7 @@ OpenKruiseGame(OKG)只包含两个CRD对象:GameServerSet与GameServer。O
|
|||
|
||||
## OpenKruiseGame(OKG)的部署架构
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/arch.png').default} width="90%" />
|
||||
|
||||
OpenKruiseGame(OKG)的部署模型分为三个部分:
|
||||
|
||||
|
|
|
@@ -16,39 +16,44 @@ $ helm repo update
|
|||
# Install the latest version.
|
||||
$ helm install kruise openkruise/kruise --version 1.4.0
|
||||
```
|
||||
|
||||
## 安装Kruise-Game
|
||||
|
||||
#### 安装Kruise-Game
|
||||
|
||||
```shell
|
||||
$ helm install kruise-game openkruise/kruise-game --version 0.2.1
|
||||
$ helm install kruise-game openkruise/kruise-game --version 0.3.0
|
||||
```
|
||||
|
||||
## 可选:使用自定义配置安装/升级
|
||||
#### 可选:使用自定义配置安装/升级
|
||||
|
||||
下表列出了 kruise-game 的可配置参数及其默认值。
|
||||
|
||||
| Parameter | Description | Default |
|
||||
|----------------------------------|-------------------------------------------------------------------|-------------------------------------|
|
||||
| `installation.namespace` | kruise-game 安装到的 namespace,一般不建议修改 | `kruise-game-system` |
|
||||
| `installation.createNamespace` | 是否需要创建上述 namespace,一般不建议修改,除非指定安装到已有的 ns 中 | `true` |
|
||||
| `kruiseGame.fullname` | kruise-game 部署和其他配置的名称 | `kruise-game-controller-manager` |
|
||||
| `kruiseGame.healthBindPort` | 用于检查 kruise-game 容器健康检查的端口 | `8082` |
|
||||
| `kruiseGame.webhook.port` | kruise-game 容器服务的 webhook 端口 | `443` |
|
||||
| `kruiseGame.webhook.targetPort` | 用于 MutatingWebhookConfigurations 中工作负载的 ObjectSelector | `9876` |
|
||||
| `replicaCount` | kruise-game 的期望副本数 | `1` |
|
||||
| `image.repository` | kruise-game 的镜像仓库 | `openkruise/kruise-game-manager` |
|
||||
| `image.tag` | kruise-game 的镜像版本 | `v0.2.1` |
|
||||
| `image.pullPolicy` | kruise-game 的镜像拉取策略 | `Always` |
|
||||
| `serviceAccount.annotations` | kruise-game的serviceAccount注解 | ` ` |
|
||||
| `resources.limits.cpu` | kruise-game容器的CPU资源限制 | `500m` |
|
||||
| `resources.limits.memory` | kruise-game容器的内存资源限制 | `1Gi` |
|
||||
| `resources.requests.cpu` | kruise-game容器的CPU资源请求 | `10m` |
|
||||
| `resources.requests.memory` | kruise-game容器的内存资源请求 | `64Mi` |
|
||||
|
||||
| Parameter | Description | Default |
|
||||
|----------------------------------------|-------------------------------------------------------------------|-------------------------------------|
|
||||
| `installation.namespace` | kruise-game 安装到的 namespace,一般不建议修改 | `kruise-game-system` |
|
||||
| `installation.createNamespace` | 是否需要创建上述 namespace,一般不建议修改,除非指定安装到已有的 ns 中 | `true` |
|
||||
| `kruiseGame.fullname` | kruise-game 部署和其他配置的名称 | `kruise-game-controller-manager` |
|
||||
| `kruiseGame.healthBindPort` | 用于检查 kruise-game 容器健康检查的端口 | `8082` |
|
||||
| `kruiseGame.webhook.port` | kruise-game 容器服务的 webhook 端口 | `443` |
|
||||
| `kruiseGame.webhook.targetPort` | 用于 MutatingWebhookConfigurations 中工作负载的 ObjectSelector | `9876` |
|
||||
| `replicaCount` | kruise-game 的期望副本数 | `1` |
|
||||
| `image.repository` | kruise-game 的镜像仓库 | `openkruise/kruise-game-manager` |
|
||||
| `image.tag` | kruise-game 的镜像版本 | `v0.2.1` |
|
||||
| `image.pullPolicy` | kruise-game 的镜像拉取策略 | `Always` |
|
||||
| `serviceAccount.annotations` | kruise-game的serviceAccount注解 | ` ` |
|
||||
| `resources.limits.cpu` | kruise-game容器的CPU资源限制 | `500m` |
|
||||
| `resources.limits.memory` | kruise-game容器的内存资源限制 | `1Gi` |
|
||||
| `resources.requests.cpu` | kruise-game容器的CPU资源请求 | `10m` |
|
||||
| `resources.requests.memory` | kruise-game容器的内存资源请求 | `64Mi` |
|
||||
| `prometheus.enabled` | 是否创建指标监控服务 | `true` |
|
||||
| `prometheus.monitorService.port` | monitorService的监听端口 | `8080` |
|
||||
| `scale.service.port` | 伸缩服务监听端口 | `6000` |
|
||||
| `scale.service.targetPort` | 伸缩服务目标端口 | `6000` |
|
||||
| `network.totalWaitTime` | 等待网络Ready的最长时间,单位是秒 | `60` |
|
||||
| `network.probeIntervalTime` | 探测网络状态的时间间隔,单位是秒 | `5` |
|
||||
|
||||
使用 `--set key=value[,key=value]` 参数指定每个参数到 `helm install`,例如,
|
||||
|
||||
### 可选:中国地区的镜像
|
||||
#### 可选:中国地区的镜像
|
||||
|
||||
如果你在中国并且无法从官方 DockerHub 拉取镜像,你可以使用托管在阿里云上的镜像:
|
||||
|
||||
|
|
|
@@ -4,7 +4,7 @@
|
|||
|
||||
OpenKruiseGame(OKG)是一个面向多云的开源游戏服Kubernetes工作负载,是CNCF工作负载开源项目OpenKruise在游戏领域的子项目,让游戏服的云原生化变得更加简单、快速、稳定。
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/intro.png').default} width="90%" />
|
||||
|
||||
## 什么是OpenKruiseGame(OKG)
|
||||
OpenKruiseGame(OKG)是简化游戏服云原生化的自定义Kubernetes工作负载,相比Kubernetes内置的无状态(Deployment)、有状态(StatefulSet)等工作负载而言,OpenKruiseGame(OKG)提供了热更新、原地升级、定向管理等常用的游戏服管理功能,是完全面向游戏服场景而设计的Kubernetes工作负载。
|
||||
|
@@ -47,6 +47,18 @@ OpenKruiseGame(OKG)具有如下核心能力:
|
|||
* 云服务厂商无关
|
||||
* 复杂的游戏服务编排
|
||||
|
||||
## 谁在使用OpenKruiseGame(OKG)
|
||||
|
||||
<table>
|
||||
<tr style={{"border":0}}>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/bilibili-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/hypergryph-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/shangyou-logo.jpeg').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/guanying-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/booming-logo.png').default} width="120" /></center></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## What's Next
|
||||
接下来,我们推荐你:
|
||||
* 安装 OpenKruiseGame。有关详细信息,请参阅 [安装](./installation.md)。
|
||||
|
|
|
@@ -0,0 +1,95 @@
|
|||
# 自动伸缩
|
||||
## 功能概览
|
||||
|
||||
游戏服与无状态业务类型不同,对于自动伸缩特性有着更高的要求,其要求主要体现在缩容方面。
|
||||
|
||||
由于游戏为强有状态业务,随着时间的推移,游戏服之间的差异性愈加明显,缩容的精确度要求极高,粗糙的缩容机制容易造成玩家断线等负面影响,给业务造成巨大损失。
|
||||
|
||||
原生Kubernetes中的水平伸缩机制如下图所示
|
||||
|
||||

|
||||
|
||||
在游戏场景下,它的主要问题在于:
|
||||
|
||||
- 在pod层面,无法感知游戏服业务状态,进而无法通过业务状态设置删除优先级
|
||||
- 在workload层面,无法根据业务状态选择缩容对象
|
||||
- 在autoscaler层面,无法定向感知游戏服业务状态计算合适的副本数目
|
||||
|
||||
这样一来,基于原生Kubernetes的自动伸缩机制将在游戏场景下造成两大问题:
|
||||
|
||||
- 缩容数目不精确。容易删除过多或过少的游戏服。
|
||||
- 缩容对象不精确。容易删除业务负载水平高的游戏服。
|
||||
|
||||
OKG 的自动伸缩机制如下所示
|
||||
|
||||

|
||||
|
||||
- 在游戏服层面,每个游戏服可以上报自身状态,通过自定义服务质量或外部组件来暴露自身是否为WaitToBeDeleted状态。
|
||||
- 在workload层面,GameServerSet可根据游戏服上报的业务状态来决定缩容的对象,如[游戏服水平伸缩](gameservers-scale.md)中所述,WaitToBeDeleted的游戏服是删除优先级最高的游戏服,缩容时最优先删除。
|
||||
- 在autoscaler层面,精准计算WaitToBeDeleted的游戏服个数,将其作为缩容数量,不会造成误删的情况。
|
||||
|
||||
如此一来,OKG的自动伸缩器在缩容窗口期内只会删除处于WaitToBeDeleted状态的游戏服,真正做到定向缩容、精准缩容。
|
||||
|
||||
## 使用示例
|
||||
|
||||
_**前置条件:在集群中安装 [KEDA](https://keda.sh/docs/2.10/deploy/)**_
|
||||
|
||||
部署ScaledObject对象来设置自动伸缩策略,具体字段含义可参考 [ScaledObject API](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledobject_types.go)
|
||||
|
||||
```yaml
|
||||
apiVersion: keda.sh/v1alpha1
|
||||
kind: ScaledObject
|
||||
metadata:
|
||||
name: minecraft #填写对应GameServerSet的名称
|
||||
spec:
|
||||
scaleTargetRef:
|
||||
name: minecraft #填写对应GameServerSet的名称
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
pollingInterval: 30
|
||||
minReplicaCount: 0
|
||||
advanced:
|
||||
horizontalPodAutoscalerConfig:
|
||||
behavior: #继承HPA策略,可参考文档 https://kubernetes.io/zh-cn/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior
|
||||
scaleDown:
|
||||
stabilizationWindowSeconds: 45 #设置缩容稳定窗口时间为45秒
|
||||
policies:
|
||||
- type: Percent
|
||||
value: 100
|
||||
periodSeconds: 15
|
||||
triggers:
|
||||
- type: external
|
||||
metricType: Value
|
||||
metadata:
|
||||
scalerAddress: kruise-game-external-scaler.kruise-game-system:6000
|
||||
```
|
||||
|
||||
部署完成后,更改gs minecraft-0 的 opsState 为 WaitToBeDeleted(可参考[自定义服务质量](service-qualities.md)实现自动化设置游戏服状态)
|
||||
|
||||
```bash
|
||||
kubectl edit gs minecraft-0
|
||||
|
||||
...
|
||||
spec:
|
||||
deletionPriority: 0
|
||||
opsState: WaitToBeDeleted #初始为None, 将其改为WaitToBeDeleted
|
||||
updatePriority: 0
|
||||
...
|
||||
```
|
||||
|
||||
经过缩容窗口期后,游戏服minecraft-0自动被删除
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Deleting WaitToBeDeleted 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
# After a while
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
```
|
|
@@ -0,0 +1,237 @@
|
|||
# CRD字段说明
|
||||
## GameServerSet
|
||||
|
||||
### GameServerSetSpec
|
||||
|
||||
```
|
||||
type GameServerSetSpec struct {
|
||||
// 游戏服数目,必须指定,最小值为0
|
||||
Replicas *int32 `json:"replicas"`
|
||||
|
||||
// 游戏服模版,新生成的游戏服将以模版定义的参数创建
|
||||
GameServerTemplate GameServerTemplate `json:"gameServerTemplate,omitempty"`
|
||||
|
||||
// 保留的游戏服序号,可选项。若指定了该序号,已经存在的游戏服将被删除;而未存在的游戏服,新建时将跳过、不创建该序号
|
||||
ReserveGameServerIds []int `json:"reserveGameServerIds,omitempty"`
|
||||
|
||||
// 游戏服自定义服务质量。用户通过该字段实现游戏服自动化状态感知。
|
||||
ServiceQualities []ServiceQuality `json:"serviceQualities,omitempty"`
|
||||
|
||||
// 游戏服批量更新策略
|
||||
UpdateStrategy UpdateStrategy `json:"updateStrategy,omitempty"`
|
||||
|
||||
// 游戏服水平伸缩策略
|
||||
ScaleStrategy ScaleStrategy `json:"scaleStrategy,omitempty"`
|
||||
|
||||
// 游戏服接入层网络设置
|
||||
Network *Network `json:"network,omitempty"`
|
||||
}
|
||||
```
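为便于对照,下面给出一个仅作示意的 GameServerSet 清单片段,展示上述字段在 YAML 中的写法(镜像与序号取值均为假设,并非固定用法):

```yaml
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: minecraft
  namespace: default
spec:
  replicas: 3                          # 游戏服数目
  reserveGameServerIds: [1]            # 保留(跳过)的游戏服序号,取值仅为示例
  updateStrategy:                      # 游戏服批量更新策略
    type: RollingUpdate
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible
  scaleStrategy:                       # 游戏服水平伸缩策略
    scaleDownStrategyType: General
  gameServerTemplate:                  # 游戏服模版,新游戏服按此创建
    spec:
      containers:
        - name: minecraft
          image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
```

serviceQualities 与 network 字段的完整示例可参考对应功能文档。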
|
||||
|
||||
#### GameServerSetStatus
|
||||
|
||||
```
|
||||
type GameServerSetStatus struct {
|
||||
// 控制器观察到GameServerSet的迭代版本
|
||||
ObservedGeneration int64 `json:"observedGeneration,omitempty"`
|
||||
|
||||
// 游戏服数目
|
||||
Replicas int32 `json:"replicas"`
|
||||
|
||||
// 处于Ready的游戏服数目
|
||||
ReadyReplicas int32 `json:"readyReplicas"`
|
||||
|
||||
// 可用的游戏服数目
|
||||
AvailableReplicas int32 `json:"availableReplicas"`
|
||||
|
||||
// 当前的游戏服数目
|
||||
CurrentReplicas int32 `json:"currentReplicas"`
|
||||
|
||||
// 已更新的游戏服数目
|
||||
UpdatedReplicas int32 `json:"updatedReplicas"`
|
||||
|
||||
// 已更新并Ready的游戏服数目
|
||||
UpdatedReadyReplicas int32 `json:"updatedReadyReplicas,omitempty"`
|
||||
|
||||
// 处于Maintaining状态的游戏服数目
|
||||
MaintainingReplicas *int32 `json:"maintainingReplicas,omitempty"`
|
||||
|
||||
// 处于WaitToBeDeleted状态的游戏服数目
|
||||
WaitToBeDeletedReplicas *int32 `json:"waitToBeDeletedReplicas,omitempty"`
|
||||
|
||||
// LabelSelector 是标签选择器,用于查询应与 HPA 使用的副本数相匹配的游戏服。
|
||||
LabelSelector string `json:"labelSelector,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
#### GameServerTemplate
|
||||
|
||||
```
|
||||
type GameServerTemplate struct {
|
||||
// 继承至PodTemplateSpec的所有字段
|
||||
corev1.PodTemplateSpec `json:",inline"`
|
||||
|
||||
// 对持久卷的请求和声明
|
||||
VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
#### UpdateStrategy
|
||||
|
||||
```
|
||||
type UpdateStrategy struct {
|
||||
// 更新策略类型,可选择 OnDelete 或 RollingUpdate
|
||||
Type apps.StatefulSetUpdateStrategyType `json:"type,omitempty"`
|
||||
|
||||
// 当策略类型为RollingUpdate时可用,指定RollingUpdate具体策略
|
||||
RollingUpdate *RollingUpdateStatefulSetStrategy `json:"rollingUpdate,omitempty"`
|
||||
}
|
||||
|
||||
|
||||
type RollingUpdateStatefulSetStrategy struct {
|
||||
// 保留旧版本游戏服的数量或百分比,默认为 0。
|
||||
Partition *int32 `json:"partition,omitempty"`
|
||||
|
||||
|
||||
// 会保证发布过程中最多有多少个游戏服处于不可用状态,默认值为 1。
|
||||
// 支持设置百分比,比如:20%,意味着最多有20%个游戏服处于不可用状态。
|
||||
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
|
||||
|
||||
// 表明游戏服更新的方式。可选择ReCreate / InPlaceIfPossible / InPlaceOnly。默认为ReCreate。
|
||||
PodUpdatePolicy kruiseV1beta1.PodUpdateStrategyType `json:"podUpdatePolicy,omitempty"`
|
||||
|
||||
// 是否暂停发布,默认为false。
|
||||
Paused bool `json:"paused,omitempty"`
|
||||
|
||||
// 原地升级的策略
|
||||
InPlaceUpdateStrategy *appspub.InPlaceUpdateStrategy `json:"inPlaceUpdateStrategy,omitempty"`
|
||||
|
||||
// 游戏服在更新后多久被视为准备就绪,默认为0,最大值为300。
|
||||
MinReadySeconds *int32 `json:"minReadySeconds,omitempty"`
|
||||
}
|
||||
|
||||
type InPlaceUpdateStrategy struct {
|
||||
// 将游戏服状态设置为NotReady和更新游戏服Spec中的镜像之间的时间跨度。
|
||||
GracePeriodSeconds int32 `json:"gracePeriodSeconds,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
#### ScaleStrategy
|
||||
|
||||
```
|
||||
type ScaleStrategy struct {
|
||||
// 扩缩期间游戏服最大不可用的数量,可为绝对值或百分比
|
||||
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
|
||||
|
||||
// 缩容策略类型,目前支持两种:General 与 ReserveIds。
|
||||
// 默认为General,缩容时优先考虑reserveGameServerIds字段,
|
||||
// 当预留的GameServer数量不满足缩减数量时,继续从当前游戏服务器列表中选择并删除GameServer。
|
||||
// 当该字段设置为ReserveIds时,无论是保留的游戏服还是控制器按照优先级删除的游戏服,
|
||||
// 被删除的游戏服的序号都会回填至ReserveGameServerIds字段。
|
||||
ScaleDownStrategyType ScaleDownStrategyType `json:"scaleDownStrategyType,omitempty"`
|
||||
}
|
||||
|
||||
```
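下面给出一个仅作示意的片段,展示 ReserveIds 策略与 reserveGameServerIds 配合使用时的写法(序号取值为假设):

```yaml
spec:
  replicas: 3
  reserveGameServerIds: [2]            # 创建/缩容时跳过 2 号游戏服
  scaleStrategy:
    scaleDownStrategyType: ReserveIds  # 被删除的游戏服序号会回填至 reserveGameServerIds
```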
|
||||
|
||||
#### ServiceQualities
|
||||
|
||||
```
|
||||
type ServiceQuality struct {
|
||||
// 继承至corev1.Probe所有字段,此处指定探测方式
|
||||
corev1.Probe `json:",inline"`
|
||||
|
||||
// 自定义服务质量的名称,区别定义不同的服务质量
|
||||
Name string `json:"name"`
|
||||
|
||||
// 探测的容器名称
|
||||
ContainerName string `json:"containerName,omitempty"`
|
||||
|
||||
// 是否让GameServerSpec在ServiceQualityAction执行后不发生变化。
|
||||
// 当Permanent为true时,无论检测结果如何,ServiceQualityAction只会执行一次。
|
||||
// 当Permanent为false时,即使ServiceQualityAction已经执行过,也可以再次执行ServiceQualityAction。
|
||||
Permanent bool `json:"permanent"`
|
||||
|
||||
// 服务质量对应执行动作
|
||||
ServiceQualityAction []ServiceQualityAction `json:"serviceQualityAction,omitempty"`
|
||||
}
|
||||
|
||||
type ServiceQualityAction struct {
|
||||
// 用户设定当探测结果为true/false时执行动作
|
||||
State bool `json:"state"`
|
||||
|
||||
// 动作为更改GameServerSpec中的字段
|
||||
GameServerSpec `json:",inline"`
|
||||
}
|
||||
```
|
||||
|
||||
#### Network
|
||||
|
||||
```
|
||||
type Network struct {
|
||||
// 网络类型
|
||||
NetworkType string `json:"networkType,omitempty"`
|
||||
|
||||
// 网络参数,不同网络类型需要填写不同的网络参数
|
||||
NetworkConf []NetworkConfParams `json:"networkConf,omitempty"`
|
||||
}
|
||||
|
||||
type NetworkConfParams KVParams
|
||||
|
||||
type KVParams struct {
|
||||
// 参数名,名称由网络插件决定
|
||||
Name string `json:"name,omitempty"`
|
||||
|
||||
// 参数值,格式由网络插件决定
|
||||
Value string `json:"value,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
## GameServer
|
||||
|
||||
### GameServerSpec
|
||||
|
||||
```
|
||||
type GameServerSpec struct {
|
||||
// 游戏服运维状态,表示业务相关的游戏服状态,目前可指定的状态有:None / WaitToBeDeleted / Maintaining。默认为None
|
||||
OpsState OpsState `json:"opsState,omitempty"`
|
||||
|
||||
// 更新优先级,优先级高则优先被更新
|
||||
UpdatePriority *intstr.IntOrString `json:"updatePriority,omitempty"`
|
||||
|
||||
// 删除优先级,优先级高则优先被删除
|
||||
DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
|
||||
|
||||
// 是否进行网络隔离、切断接入层网络,默认为false
|
||||
NetworkDisabled bool `json:"networkDisabled,omitempty"`
|
||||
}
|
||||
```
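作为示意,对某个游戏服做定向运维时,可以直接修改其 GameServer 对象的 spec 字段(以下取值仅为示例;网络隔离是否生效取决于所用网络插件是否支持):

```yaml
# kubectl edit gs minecraft-0 之后,按需修改 spec(示意)
spec:
  opsState: Maintaining        # 标记为维护中
  networkDisabled: true        # 切断接入层网络,排障完成后改回 false
  updatePriority: 0
  deletionPriority: 0
```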
|
||||
|
||||
### GameServerStatus
|
||||
|
||||
```
|
||||
type GameServerStatus struct {
|
||||
// 期望游戏服状态,Ready
|
||||
DesiredState GameServerState `json:"desiredState,omitempty"`
|
||||
|
||||
// 当前游戏服实际状态
|
||||
CurrentState GameServerState `json:"currentState,omitempty"`
|
||||
|
||||
// 网络状态信息
|
||||
NetworkStatus NetworkStatus `json:"networkStatus,omitempty"`
|
||||
|
||||
// 游戏服对应pod状态
|
||||
PodStatus corev1.PodStatus `json:"podStatus,omitempty"`
|
||||
|
||||
// 游戏服服务质量状况
|
||||
ServiceQualitiesCondition []ServiceQualityCondition `json:"serviceQualitiesConditions,omitempty"`
|
||||
|
||||
// 当前更新优先级
|
||||
UpdatePriority *intstr.IntOrString `json:"updatePriority,omitempty"`
|
||||
|
||||
// 当前删除优先级
|
||||
DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
|
||||
|
||||
// 上次变更时间
|
||||
LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty"`
|
||||
}
|
||||
```
|
|
@@ -0,0 +1,35 @@
|
|||
# 游戏服监控
|
||||
## 可用指标
|
||||
|
||||
OKG 默认透出游戏服相关 prometheus metrics,其中指标包括:
|
||||
|
||||
| 名称 | 描述 | 类型 |
|
||||
| --- |----------------------|---------|
|
||||
| GameServersStateCount | 不同state状态下的游戏服数量 | gauge |
|
||||
| GameServersOpsStateCount | 不同opsState状态下的游戏服数量 | gauge |
|
||||
| GameServersTotal | 存在过的游戏服总数 | counter |
|
||||
| GameServerSetsReplicasCount | 每个GameServerSet的副本数量 | gauge |
|
||||
| GameServerDeletionPriority | 游戏服删除优先级 | gauge |
|
||||
| GameServerUpdatePriority | 游戏服更新优先级 | gauge |
|
||||
|
||||
## 监控仪表盘
|
||||
|
||||
### 仪表盘导入
|
||||
|
||||
1. 将 [grafana.json](https://github.com/openkruise/kruise-game/blob/master/config/prometheus/grafana.json) 导入至Grafana中
|
||||
2. 选择数据源
|
||||
3. 替换UID并完成导入
|
||||
|
||||
### 仪表盘说明
|
||||
|
||||
完成导入后的仪表盘如下所示:
|
||||
|
||||
<img src={require('/static/img/kruisegame/user-manuals/gra-dash.png').default} width="90%" />
|
||||
|
||||
从上至下,依次包含
|
||||
|
||||
- 第一行:当前游戏服各个状态的数量、当前游戏服各个状态的比例饼图
|
||||
- 第二行:游戏服各个状态数量变化折线图
|
||||
- 第三行:游戏服删除优先级、更新优先级变化折线图(可根据左上角namespace与gsName筛选游戏服)
|
||||
- 第四、五行:游戏服集合中不同状态的游戏服数量变化折线图(可根据左上角namespace与gssName筛选游戏服集合)
|
||||
|
|
@@ -6,9 +6,52 @@
|
|||
在不同场景下,往往需要不同的网络产品,而有时网络产品由云厂商提供。OKG 的 Cloud Provider & Network Plugin 源于此而诞生。
|
||||
OKG 会集成不同云提供商的不同网络插件,用户可通过GameServerSet设置游戏服的网络参数,并在生成的GameServer中查看网络状态信息,极大降低了游戏服接入网络的复杂度。
|
||||
|
||||
## 使用示例
|
||||
## 网络插件附录
|
||||
|
||||
当前支持的网络插件:
|
||||
- Kubernetes-HostPort
|
||||
- Kubernetes-Ingress
|
||||
- AlibabaCloud-NATGW
|
||||
- AlibabaCloud-SLB
|
||||
- AlibabaCloud-SLB-SharedPort
|
||||
|
||||
---
|
||||
### Kubernetes-HostPort
|
||||
#### 插件名称
|
||||
|
||||
`Kubernetes-HostPort`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
Kubernetes
|
||||
|
||||
#### 插件说明
|
||||
- Kubernetes-HostPort利用宿主机网络,通过主机上的端口转发实现游戏服对外暴露服务。宿主机需要配置公网IP,有被公网访问的能力。
|
||||
|
||||
- 用户在配置文件中可自定义宿主机开放的端口段(默认为8000-9000),该网络插件可以帮助用户分配管理宿主机端口,尽量避免端口冲突。
|
||||
|
||||
- 该插件不支持网络隔离。
|
||||
|
||||
#### 网络参数
|
||||
|
||||
ContainerPorts
|
||||
|
||||
- 含义:填写提供服务的容器名以及对应暴露的端口和协议
|
||||
- 填写格式:containerName:port1/protocol1,port2/protocol2,...(协议需大写) 比如:`game-server:25565/TCP`
|
||||
- 是否支持变更:不支持,在创建时即永久生效,随pod生命周期结束而结束
|
||||
|
||||
#### 插件配置
|
||||
|
||||
```
|
||||
[kubernetes]
|
||||
enable = true
|
||||
[kubernetes.hostPort]
|
||||
#填写宿主机可使用的空闲端口段,用于为pod分配宿主机转发端口
|
||||
max_port = 9000
|
||||
min_port = 8000
|
||||
```
|
||||
|
||||
#### 示例说明
|
||||
|
||||
OKG支持在原生Kubernetes集群使用HostPort游戏服网络,使用游戏服所在宿主机暴露外部IP及端口,转发至游戏服内部端口中。使用方式如下。
|
||||
|
||||
|
@@ -66,109 +109,172 @@ EOF
|
|||
|
||||
访问 48.98.98.8:8211 即可
|
||||
|
||||
### AlibabaCloud-NATGW
|
||||
|
||||
OKG支持阿里云下NAT网关模型,使用NATGW的外部IP与端口暴露服务,流量最终将转发至Pod之中。使用方式如下:
|
||||
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-natgw
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: AlibabaCloud-NATGW
|
||||
networkConf:
|
||||
- name: Ports
|
||||
#暴露的端口,格式如下 {port1},{port2}...
|
||||
value: "80"
|
||||
- name: Protocol
|
||||
#使用的协议,默认为TCP
|
||||
value: "TCP"
|
||||
# - name: Fixed
|
||||
# 是否固定映射关系,默认不固定,pod删除后会生成新的外部IP及端口
|
||||
# value: true
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
生成的GameServer中通过networkStatus字段查看游戏服网络信息:
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T11:21:34Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.97.227.137
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "512"
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.189
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "80"
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T11:21:34Z"
|
||||
networkType: AlibabaCloud-NATGW
|
||||
```
|
||||
|
||||
访问 47.97.227.137:512 即可
|
||||
|
||||
## 网络插件附录
|
||||
|
||||
当前支持的网络插件:
|
||||
- Kubernetes-HostPort
|
||||
- AlibabaCloud-NATGW
|
||||
- AlibabaCloud-SLB
|
||||
- AlibabaCloud-SLB-SharedPort
|
||||
|
||||
---
|
||||
### Kubernetes-HostPort
|
||||
|
||||
### Kubernetes-Ingress
|
||||
|
||||
#### 插件名称
|
||||
|
||||
`Kubernetes-HostPort`
|
||||
`Kubernetes-Ingress`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
Kubernetes
|
||||
|
||||
#### 插件说明
|
||||
- Kubernetes-HostPort利用宿主机网络,通过主机上的端口转发实现游戏服对外暴露服务。宿主机需要配置公网IP,有被公网访问的能力。
|
||||
|
||||
- 用户在配置文件中可自定义宿主机开放的端口段(默认为8000-9000),该网络插件可以帮助用户分配管理宿主机端口,尽量避免端口冲突。
|
||||
|
||||
- 该插件不支持网络隔离。
|
||||
- 针对页游等需要七层网络模型的游戏场景,OKG提供了Ingress网络模型。该插件将会自动地为每个游戏服设置对应的访问路径,该路径与游戏服ID相关,每个游戏服各不相同。
|
||||
- 是否支持网络隔离:否
|
||||
|
||||
#### 网络参数
|
||||
|
||||
ContainerPorts
|
||||
Path
|
||||
|
||||
- 含义:填写提供服务的容器名以及对应暴露的端口和协议
|
||||
- 填写格式:containerName:port1/protocol1,port2/protocol2,...(协议需大写) 比如:`game-server:25565/TCP`
|
||||
- 是否支持变更:不支持,在创建时即永久生效,随pod生命周期结束而结束
|
||||
- 含义:访问路径。每个游戏服依据ID拥有各自的访问路径。
|
||||
- 填写格式:将<id\>添加到原始路径(与HTTPIngressPath中Path一致)的任意位置,该插件将会生成游戏服ID对应的路径。例如,当设置路径为 /game<id\>,游戏服0对应路径为/game0,游戏服1对应路径为/game1,以此类推。
|
||||
- 是否支持变更:支持
|
||||
|
||||
PathType
|
||||
|
||||
- 含义:路径类型。与HTTPIngressPath的PathType字段一致。
|
||||
- 填写格式:与HTTPIngressPath的PathType字段一致。
|
||||
- 是否支持变更:支持
|
||||
|
||||
Port
|
||||
|
||||
- 含义:游戏服暴露的端口值。
|
||||
- 填写格式:端口数字
|
||||
- 是否支持变更:支持
|
||||
|
||||
IngressClassName
|
||||
|
||||
- 含义:指定IngressClass的名称。与IngressSpec的IngressClassName字段一致。
|
||||
- 填写格式:与IngressSpec的IngressClassName字段一致。
|
||||
- 是否支持变更:支持
|
||||
|
||||
Host
|
||||
|
||||
- 含义:域名。每个游戏服依据ID拥有各自的访问域名。
|
||||
- 填写格式:将<id\>添加域名的任意位置,该插件将会生成游戏服ID对应的域名。例如,当设置域名为 test.game<id\>.cn-hangzhou.ali.com,游戏服0对应域名为test.game0.cn-hangzhou.ali.com,游戏服1对应域名为test.game1.cn-hangzhou.ali.com,以此类推。
|
||||
- 是否支持变更:支持
|
||||
|
||||
TlsHosts
|
||||
|
||||
- 含义:包含TLS证书的host列表。含义与IngressTLS的Hosts字段类似。
|
||||
- 填写格式:host1,host2,... 例如,xxx.xx1.com,xxx.xx2.com
|
||||
- 是否支持变更:支持
|
||||
|
||||
TlsSecretName
|
||||
|
||||
- 含义:与IngressTLS的SecretName字段一致。
|
||||
- 填写格式:与IngressTLS的SecretName字段一致。
|
||||
- 是否支持变更:支持
|
||||
|
||||
Annotation
|
||||
|
||||
- 含义:作为ingress对象的annotation
|
||||
- 格式:key: value(注意:后有空格),例如:nginx.ingress.kubernetes.io/rewrite-target: /$2
|
||||
- 是否支持变更:支持
|
||||
|
||||
_补充说明_
|
||||
|
||||
- 支持填写多个annotation,在networkConf中填写多个Annotation以及对应值即可,不区分填写顺序。
|
||||
- 支持填写多个路径。路径、路径类型、端口按照填写顺序一一对应。当路径数目大于路径类型数目(或端口数目)时,无法找到对应关系的路径按照率先填写的路径类型(或端口)匹配。
|
||||
|
||||
#### 插件配置
|
||||
|
||||
无
|
||||
|
||||
#### 示例说明
|
||||
|
||||
GameServerSet中network字段声明如下:
|
||||
|
||||
```yaml
|
||||
network:
|
||||
networkConf:
|
||||
- name: IngressClassName
|
||||
value: nginx
|
||||
- name: Port
|
||||
value: "80"
|
||||
- name: Path
|
||||
value: /game<id>(/|$)(.*)
|
||||
- name: Path
|
||||
value: /test-<id>
|
||||
- name: Host
|
||||
value: test.xxx.cn-hangzhou.ali.com
|
||||
- name: PathType
|
||||
value: ImplementationSpecific
|
||||
- name: TlsHosts
|
||||
value: xxx.xx1.com,xxx.xx2.com
|
||||
- name: Annotation
|
||||
value: 'nginx.ingress.kubernetes.io/rewrite-target: /$2'
|
||||
- name: Annotation
|
||||
value: 'nginx.ingress.kubernetes.io/random: xxx'
|
||||
networkType: Kubernetes-Ingress
|
||||
```
|
||||
|
||||
|
||||
则会生成gss replicas对应数目的service与ingress对象。0号游戏服生成的ingress字段如下所示:
|
||||
|
||||
```yaml
|
||||
spec:
|
||||
ingressClassName: nginx
|
||||
rules:
|
||||
- host: test.xxx.cn-hangzhou.ali.com
|
||||
http:
|
||||
paths:
|
||||
- backend:
|
||||
service:
|
||||
name: ing-nginx-0
|
||||
port:
|
||||
number: 80
|
||||
path: /game0(/|$)(.*)
|
||||
pathType: ImplementationSpecific
|
||||
- backend:
|
||||
service:
|
||||
name: ing-nginx-0
|
||||
port:
|
||||
number: 80
|
||||
path: /test-0
|
||||
pathType: ImplementationSpecific
|
||||
tls:
|
||||
- hosts:
|
||||
- xxx.xx1.com
|
||||
- xxx.xx2.com
|
||||
status:
|
||||
loadBalancer:
|
||||
ingress:
|
||||
- ip: 47.xx.xxx.xxx
|
||||
```
|
||||
|
||||
其他序号的游戏服只有path字段与service name不同,生成的其他参数都相同。
|
||||
|
||||
对应的0号GameServer的networkStatus如下:
|
||||
|
||||
```yaml
|
||||
networkStatus:
|
||||
createTime: "2023-04-28T14:00:30Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.xx.xxx.xxx
|
||||
ports:
|
||||
- name: /game0(/|$)(.*)
|
||||
port: 80
|
||||
protocol: TCP
|
||||
- name: /test-0
|
||||
port: 80
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 10.xxx.x.xxx
|
||||
ports:
|
||||
- name: /game0(/|$)(.*)
|
||||
port: 80
|
||||
protocol: TCP
|
||||
- name: /test-0
|
||||
port: 80
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2023-04-28T14:00:30Z"
|
||||
networkType: Kubernetes-Ingress
|
||||
```
|
||||
|
||||
---
|
||||
|
@@ -211,6 +317,67 @@ Fixed
|
|||
|
||||
无
|
||||
|
||||
#### 示例说明
|
||||
|
||||
OKG支持阿里云下NAT网关模型,使用NATGW的外部IP与端口暴露服务,流量最终将转发至Pod之中。使用方式如下:
|
||||
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-natgw
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: AlibabaCloud-NATGW
|
||||
networkConf:
|
||||
- name: Ports
|
||||
#暴露的端口,格式如下 {port1},{port2}...
|
||||
value: "80"
|
||||
- name: Protocol
|
||||
#使用的协议,默认为TCP
|
||||
value: "tcp"
|
||||
# - name: Fixed
|
||||
# 是否固定映射关系,默认不固定,pod删除后会生成新的外部IP及端口
|
||||
# value: true
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
生成的GameServer中通过networkStatus字段查看游戏服网络信息:
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T11:21:34Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.97.227.137
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "512"
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.189
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "80"
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T11:21:34Z"
|
||||
networkType: AlibabaCloud-NATGW
|
||||
```
|
||||
|
||||
访问 47.97.227.137:512 即可
|
||||
|
||||
---
|
||||
### AlibabaCloud-SLB
|
||||
#### 插件名称
|
||||
|
@@ -263,6 +430,7 @@ min_port = 500
|
|||
|
||||
#### 插件名称
|
||||
### AlibabaCloud-SLB-SharedPort
|
||||
|
||||
`AlibabaCloud-SLB-SharedPort`
|
||||
|
||||
#### Cloud Provider
|
||||
|
|
|
@@ -15,15 +15,15 @@ OKG 提供了原地升级([热更新](./hot-update.md))、批量更新、按
|
|||
此时GameServerSet下有3个游戏服副本:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 0
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
```
|
||||
|
||||
设置更新优先级,将1号游戏服优先级调大:
|
||||
```shell
|
||||
kubectl edit gs gs-demo-1
|
||||
kubectl edit gs minecraft-1
|
||||
|
||||
...
|
||||
spec:
|
||||
|
@@ -35,10 +35,10 @@ spec:
|
|||
|
||||
接下来设置 GameServerSet partition、以及即将更新的新镜像:
|
||||
```shell
|
||||
kubectl edit gss gs-demo
|
||||
kubectl edit gss minecraft
|
||||
|
||||
...
|
||||
image: gameserver:latest # 更新镜像
|
||||
image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2-new # 更新镜像
|
||||
name: gameserver
|
||||
...
|
||||
updateStrategy:
|
||||
|
@@ -50,28 +50,28 @@ kubectl edit gss gs-demo
|
|||
|
||||
```
|
||||
|
||||
此时只有gs-demo-1将会更新:
|
||||
此时只有minecraft-1将会更新:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Updating None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Updating None 0 10
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
|
||||
# 一段时间过后
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 10
|
||||
minecraft-2 Ready None 0 0
|
||||
```
|
||||
|
||||
待gs-demo-1验证通过后,更新其余游戏服:
|
||||
待minecraft-1验证通过后,更新其余游戏服:
|
||||
```shell
|
||||
kubectl edit gss gs-demo
|
||||
kubectl edit gss minecraft
|
||||
...
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
|
|
|
@@ -0,0 +1,68 @@
|
|||
# 设计理念
|
||||
## 开源OpenKruiseGame(OKG)的初衷
|
||||
|
||||
>我是从2015年开始做云原生产品的,从最开始的Swarm到后来的Kubernetes,在容器集群之上运行的负载类型从早期的网站、API服务到后来的转码、AI训练再到元宇宙、Web3、图形化应用。我们见证了云原生技术在改变一个又一个行业。但是,游戏是一个非常特殊的行业,一个大型的游戏,包含网关、平台服、游戏服、匹配服等不同种角色。很多游戏公司早已将平台服、网关等业务进行了云原生化改造,但是,游戏服的容器化进展都比较缓慢。通过和大量的游戏开发者/运维人员进行交流,大致可以归纳为如下三个重要的原因。
|
||||
>
|
||||
>1. 运行中的游戏服更换部署架构的风险收益比过高。
|
||||
>2. 游戏服云原生化过程中存在缺失的核心功能,例如:游戏热更新,定向合服/停服等。
|
||||
>3. 缺少游戏服云原生化的最佳实践与成功案例。
|
||||
>
|
||||
>为了解决上述问题,我们联合了灵犀互娱等多家游戏公司,将游戏服云原生化场景下的通用能力进行了抽象,开源了OpenKruiseGame项目。希望能够通过一个云厂商无关的开源项目,将游戏服的云原生化最佳实践交付给更多的游戏开发者。同时,我们也希望越来越多的游戏公司/工作室/开发者可以参与到社区,将遇到的难题、场景和大家一起讨论,分享游戏服云原生化的经验。
|
||||
|
||||
<p align="right">来自 刘中巍,阿里云容器服务,OpenKruiseGame项目发起人</p>
|
||||
|
||||
>灵犀互娱已全面拥抱云原生架构,在云原生化过程中我们清楚地认识到,游戏服不同于其他Web类型应用,在k8s集群之中对其的管理是非常复杂的。原生k8s workload 提供的管理功能很难满足游戏服日常运维需求,Deployment 无法固定ID不适配有状态的特性、而StatefulSet又缺乏定向管理的灵活性,为此我们自研了Paas平台,提供对游戏服的编排管理的能力,以实现高效开服/更新等游戏服运维操作。
|
||||
|
||||
<p align="right"> 来自 冯谋杰 阿里灵犀互娱容器云负责人</p>
|
||||
|
||||
>作为一个大型的游戏分发平台,B站有着海量且异构架构的内外部游戏项目需要管理维护,在当前降本增效的大环境下,游戏项目从传统虚拟机迁移至k8s势在必行。但是原生的k8s面对游戏热更、多环境管理、滚服游戏的区服抽象、业务接流等场景是比较疲软的。需要一个成本低廉、高效的跨云解决方案为上述问题提供支持,基于OpenKruise衍生的OpenKruiseGame所提供的固定id、原地升级等功能对游戏场景有着很大的吸引力,给游戏的容器化增加了一种选择。
|
||||
|
||||
<p align="right"> 来自 李宁 bilibili游戏运维负责人</p>
|
||||
|
||||
|
||||
>在尝试对游戏服进行云原生化改造的过程中,网络是首要考虑的问题。由于游戏服从虚拟机迁移至容器,基于机器IP的运维方式在k8s中难以保障,衍生出固定IP的需求;对外服务的方式也不像直接在虚拟机暴露端口那么简单,增加了许多复杂性。除了网络问题之外,一个游戏服的各个进程在pod中的状态难以感知,原生k8s重建的策略太过“粗暴”,不利于游戏稳定运行,亟需一种针对性的感知策略,针对不同的探测结果执行不同的动作。
|
||||
|
||||
<p align="right"> 来自 盛浩 冠赢互娱游戏云平台负责人</p>
|
||||
|
||||
## 为什么OpenKruiseGame(OKG)是一个工作负载
|
||||
|
||||

|
||||
|
||||
游戏服云原生化核心要解决两个问题,游戏服的生命周期管理与游戏服的运维管理。Kubernetes内置了一些通用的工作负载模型,例如:无状态(Deployment)、有状态(StatefulSet)、任务(Job)等。但是,游戏服的状态管理不论从粒度还是确定性上面都有更高的要求。例如:游戏服需要热更新的机制来确保更短的游戏中断;游戏服需要原地更新确保元数据信息(网络为主)不变;游戏服需要确保在自动伸缩过程中只有0玩家的游戏服可以下线;需要具备手动运维/诊断/隔离任意一个游戏服的能力等。这些都是Kubernetes内置负载不能够解决的问题。
|
||||
|
||||
此外,Kubernetes中的工作负载还承担了与基础设施无缝整合的重要枢纽角色。例如:通过Annotations中的字段,自动实现监控系统、日志系统与应用的对接;通过nodeSelector字段,实现应用与底层资源的调度绑定关系;通过labels中的字段,记录分组等元数据信息,替代传统的CMDB系统。这些都让自定义工作负载成为了Kubernetes中适配不同类型应用的最佳方式,OpenKruiseGame(OKG)是一个完全面向游戏场景的Kubernetes工作负载,通过OpenKruiseGame(OKG),开发者不止可以获得更好的游戏服的生命周期管理和游戏服的运维管理,还可以以OpenKruiseGame(OKG)为纽带,无需开发额外的代码,充分发挥云产品带来的强大能力。
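作为示意,由于 GameServerTemplate 继承自 PodTemplateSpec,这些元数据与调度字段可以直接写在游戏服模板中(以下标签、注解取值均为假设,仅用于说明对接方式):

```yaml
gameServerTemplate:
  metadata:
    labels:
      game.example.com/group: hangzhou-1     # 假设的分组元数据,替代传统 CMDB 记录
    annotations:
      prometheus.io/scrape: "true"           # 假设监控系统按该注解自动对接
  spec:
    nodeSelector:
      node.example.com/pool: game            # 假设的节点池标签,实现与底层资源的调度绑定
    containers:
      - name: gameserver
        image: your-registry/gameserver:v1   # 镜像地址为占位
```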
|
||||
|
||||
## OpenKruiseGame(OKG)的设计理念
|
||||
|
||||
OpenKruiseGame(OKG)只包含两个CRD对象:GameServerSet与GameServer。OpenKruiseGame(OKG)的设计理念是基于状态控制的,将不同的职责划分在不同的工作负载维度来控制。
|
||||
|
||||
* GameServerSet(生命周期管理)
|
||||
对一组GameServer的生命周期管理的抽象,主要用于副本数目管理、游戏服发布等生命周期控制。
|
||||
|
||||
* GameServer(定向管理运维动作)
|
||||
对一个GameServer的运维/管理动作的抽象,主要用于更新顺序控制、游戏服状态控制、游戏服网络变更等定向运维管理动作。
|
||||
|
||||
当我们理解了OpenKruiseGame(OKG)的设计理念后,一些非常有趣的推论就可以快速的得出,例如:
|
||||
|
||||
* 当不小心删除GameServer的时候会触发游戏服的删除吗?
|
||||
|
||||
不会,GameServer只是游戏服的差异性运维动作的状态记录,如果删除GameServer之后,会重新创建一个使用默认配置的GameServer对象。此时,你的GameServer也会重置为默认定义在GameServerSet中的游戏服模板配置。
|
||||
|
||||
* 如何让匹配服务与自动伸缩更好的配合防止出现玩家被强制下线?
|
||||
|
||||
可以通过服务质量能力,将游戏的玩家任务转换为GameServer的状态,匹配框架感知GameServer的状态并控制伸缩的副本数目,GameServerSet也会根据GameServer的状态来判断删除的顺序,从而实现优雅下线。
|
||||
|
||||
## OpenKruiseGame(OKG)的部署架构
|
||||
|
||||

|
||||
|
||||
OpenKruiseGame(OKG)的部署模型分为三个部分:
|
||||
|
||||
1. OpenKruiseGame(OKG)控制器
|
||||
负责管理GameServerSet与GameServer的生命周期管理,在OpenKruiseGame控制器中,内置一个Cloud Provider模块,用来适配不同云服务厂商在网络插件等场景下的差异,让OpenKruiseGame可以真正做到一套代码无差异部署。
|
||||
|
||||
2. OpenKruise控制器
|
||||
负责管理Pod的生命周期管理,是OpenKruiseGame(OKG)的依赖组件,对OpenKruiseGame(OKG)使用者/开发者是无感的。
|
||||
|
||||
3. OpenKruiseGame(OKG)运维后台【待完成】
|
||||
针对希望白屏化使用OpenKruiseGame(OKG)的开发者提供的运维后台与API,主要提供游戏服的生命周期管理和编排能力。
|
|
@@ -0,0 +1,29 @@
|
|||
# 项目贡献
|
||||
欢迎来到 OpenKruiseGame 社区。随时提供帮助、报告问题、提高文档质量、修复错误或引入新功能。有关如何向 OpenKruiseGame 提交内容的详细信息,请参见下文。
|
||||
|
||||
## 提交问题并参与基于场景的讨论
|
||||
OpenKruiseGame 是一个非常开放的社区,随时提交各种类型的问题,以下列表显示了问题类型:
|
||||
* 错误报告
|
||||
* 功能要求
|
||||
* 性能问题
|
||||
* 功能提案
|
||||
* 特征设计
|
||||
* 征求帮助
|
||||
* 文档不完整
|
||||
* 测试改进
|
||||
* 关于项目的任何问题
|
||||
|
||||
|
||||
当您提交问题时,请确保您已经进行了数据屏蔽,以确保您的信息的机密性,例如 AccessKey。
|
||||
## 贡献代码和文档
|
||||
能够为 OpenKruiseGame 提供帮助的行动值得鼓励,您可以提交您希望在拉取请求中修复的内容。
|
||||
* 如果您发现拼写错误,请更正它。
|
||||
* 如果您发现代码错误,请修复它。
|
||||
* 如果您发现缺少的单元测试,请解决问题。
|
||||
* 如果您发现文档不完整或有错误,请更新它。
|
||||
|
||||
## 需要额外帮助
|
||||
如果您在游戏服务器云原生改造过程中遇到其他类型的问题需要帮助,请发邮件给我们寻求进一步的帮助,邮箱:zhongwei.lzw@alibaba-inc.com
|
||||
|
||||
## 成为 OpenKruiseGame 的核心贡献者
|
||||
也非常欢迎大家参与OpenKruiseGame社区会议,共同决定OpenKruiseGame的未来发展方向。作为OpenKruise的一个子项目,OpenKruiseGame的相关议题也会在OpenKruise双周会上一并讨论。有关详细信息,请参阅 <a target="_blank" href="https://github.com/openkruise/kruise#community">时间表</a>。
|
|
@@ -0,0 +1,25 @@
|
|||
# FAQ
|
||||
|
||||
## 如何调试你的代码
|
||||
|
||||
|
||||
0) 编辑Makefile,将IMG字段的值修改为你自己的镜像仓库地址。
|
||||
|
||||
1) 编译打包kruise-game-manager镜像。
|
||||
|
||||
```bash
|
||||
make docker-build
|
||||
```
|
||||
|
||||
2) 将打包后的镜像上传到镜像仓库。
|
||||
|
||||
```bash
|
||||
make docker-push
|
||||
```
|
||||
|
||||
3) 在 Kubernetes 集群(~/.kube/config)中部署 kruise-game-manager 组件。
|
||||
|
||||
```bash
|
||||
make deploy
|
||||
```
|
||||
|
|
@@ -0,0 +1,73 @@
|
|||
# 安装
|
||||
|
||||
安装OpenKruiseGame需安装Kruise与Kruise-Game,且要求 Kubernetes版本 >= 1.16
|
||||
|
||||
## 安装Kruise
|
||||
|
||||
建议采用 helm v3.5+ 来安装 Kruise
|
||||
|
||||
```shell
|
||||
# Firstly add openkruise charts repository if you haven't done this.
|
||||
$ helm repo add openkruise https://openkruise.github.io/charts/
|
||||
|
||||
# [Optional]
|
||||
$ helm repo update
|
||||
|
||||
# Install the latest version.
|
||||
$ helm install kruise openkruise/kruise --version 1.4.0
|
||||
```
|
||||
|
||||
## 安装Kruise-Game
|
||||
|
||||
```shell
|
||||
$ helm install kruise-game openkruise/kruise-game --version 0.2.1
|
||||
```
|
||||
|
||||
## 可选:使用自定义配置安装/升级
|
||||
|
||||
下表列出了 kruise-game 的可配置参数及其默认值。
|
||||
|
||||
| Parameter | Description | Default |
|
||||
|----------------------------------|-------------------------------------------------------------------|-------------------------------------|
|
||||
| `installation.namespace` | kruise-game 安装到的 namespace,一般不建议修改 | `kruise-game-system` |
|
||||
| `installation.createNamespace` | 是否需要创建上述 namespace,一般不建议修改,除非指定安装到已有的 ns 中 | `true` |
|
||||
| `kruiseGame.fullname` | kruise-game 部署和其他配置的名称 | `kruise-game-controller-manager` |
|
||||
| `kruiseGame.healthBindPort` | 用于检查 kruise-game 容器健康检查的端口 | `8082` |
|
||||
| `kruiseGame.webhook.port` | kruise-game 容器服务的 webhook 端口 | `443` |
|
||||
| `kruiseGame.webhook.targetPort` | 用于 MutatingWebhookConfigurations 中工作负载的 ObjectSelector | `9876` |
|
||||
| `replicaCount` | kruise-game 的期望副本数 | `1` |
|
||||
| `image.repository` | kruise-game 的镜像仓库 | `openkruise/kruise-game-manager` |
|
||||
| `image.tag` | kruise-game 的镜像版本 | `v0.2.1` |
|
||||
| `image.pullPolicy` | kruise-game 的镜像拉取策略 | `Always` |
|
||||
| `serviceAccount.annotations` | kruise-game的serviceAccount注解 | ` ` |
|
||||
| `resources.limits.cpu` | kruise-game容器的CPU资源限制 | `500m` |
|
||||
| `resources.limits.memory` | kruise-game容器的内存资源限制 | `1Gi` |
|
||||
| `resources.requests.cpu` | kruise-game容器的CPU资源请求 | `10m` |
|
||||
| `resources.requests.memory` | kruise-game容器的内存资源请求 | `64Mi` |
|
||||
|
||||
|
||||
使用 `--set key=value[,key=value]` 参数指定每个参数到 `helm install`,例如,
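(以下参数取值仅作示意,可按需替换为上表中的任意参数)

```bash
$ helm install kruise-game openkruise/kruise-game \
    --set replicaCount=2,image.pullPolicy=IfNotPresent
```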
|
||||
|
||||
### 可选:中国地区的镜像
|
||||
|
||||
如果你在中国并且无法从官方 DockerHub 拉取镜像,你可以使用托管在阿里云上的镜像:
|
||||
|
||||
```bash
|
||||
$ helm install kruise-game https://... --set image.repository=registry.cn-hangzhou.aliyuncs.com/acs/kruise-game-manager:v0.2.1
|
||||
...
|
||||
```
|
||||
|
||||
## 卸载
|
||||
|
||||
请注意,卸载将删除 kruise-game 创建的所有资源,包括 webhook 配置、service、namespace、CRD、CR 实例以及 kruise-game 控制器!
|
||||
请仅在您完全了解后果后才这样做。
|
||||
如果安装了 helm charts,则卸载 kruise-game:
|
||||
|
||||
```bash
|
||||
$ helm uninstall kruise-game
|
||||
release "kruise-game" uninstalled
|
||||
```
|
||||
|
||||
## What's Next
|
||||
接下来,我们推荐你:
|
||||
- 了解 kruise-game 的 [部署游戏服](user-manuals/deploy-gameservers.md).
|
|
@@ -0,0 +1,57 @@
|
|||
# OpenKruiseGame简介
|
||||
⭐ ***If you like OpenKruiseGame, give it a star on <a target="_blank" rel="noopener noreferrer" href="https://github.com/openkruise/kruise-game">GitHub</a>!***
|
||||
## 概览
|
||||
|
||||
OpenKruiseGame(OKG)是一个面向多云的开源游戏服Kubernetes工作负载,是CNCF工作负载开源项目OpenKruise在游戏领域的子项目,让游戏服的云原生化变得更加简单、快速、稳定。
|
||||
|
||||

|
||||
|
||||
## 什么是OpenKruiseGame(OKG)
|
||||
OpenKruiseGame(OKG)是简化游戏服云原生化的自定义Kubernetes工作负载,相比Kubernetes内置的无状态(Deployment)、有状态(StatefulSet)等工作负载而言,OpenKruiseGame(OKG)提供了热更新、原地升级、定向管理等常用的游戏服管理功能,是完全面向游戏服场景而设计的Kubernetes工作负载。
|
||||
|
||||
除此之外,OpenKruiseGame(OKG)还承担了游戏服与云服务、匹配服务、运维平台对接的角色,通过低代码或者0代码的方式实现游戏服云原生化时日志、监控、网络、存储、弹性、匹配等功能的自动化集成,通过Kubernetes的一致性交付标准,实现多云/混合云/多集群的统一管理。
|
||||
|
||||
OpenKruiseGame(OKG)是一个完全开源的项目,开发者可以通过二次开发的方式定制属于自己的游戏服工作负载,构建游戏服的发布运维后台等。除了通过Kubernetes的模板/API的方式进行调用和扩展,OpenKruiseGame(OKG)还支持与KubeVela等交付系统进行对接,通过白屏化的方式实现游戏服的编排与全生命周期管理。
|
||||
|
||||
## 为什么需要OpenKruiseGame(OKG)
|
||||
|
||||
Kubernetes作为云原生时代的应用交付/运维标准,其具备的声明式资源管理、自动弹性伸缩、多云环境一致性交付等能力与游戏服的场景是非常匹配的,能够在开服效率、成本控制、版本管理、全球同服等场景提供支持。但是,游戏服的一些特性导致了它与Kubernetes进行适配的时候存在一些障碍,例如:
|
||||
|
||||
* 热更新/热重载
|
||||
|
||||
为了让玩家能够得到更好的游戏体验,很多游戏服都是通过热更新或者配置热重载的方式进行更新,而在Kubernetes的各种不同负载中,Pod的生命周期和镜像的生命周期是一致的,当业务的镜像需要发布的时候,Pod会进行重建,而重建Pod的代价往往意味着玩家对局的中断,玩家服网络元数据的变更等。
|
||||
|
||||
* 定向运维管理
|
||||
|
||||
玩家服在大部分的场景下是有状态的,例如PVP游戏在更新或者下线的时候,应该优先且只能变更没有活跃玩家在线的游戏服;PVE游戏在停服或者合服的时候,应该能够定向管理特定ID的玩家服。
|
||||
|
||||
* 适合游戏的网络模型
|
||||
|
||||
Kubernetes中的网络模型是通过Service进行抽象的,更多的是面向无状态场景的适配。对于网络敏感的游戏服而言,高性能网关或者IP端口固定的无损直连的方案更符合真实的业务场景。
|
||||
|
||||
* 游戏服编排
|
||||
|
||||
当下的游戏服架构越来越复杂,很多MMORPG的玩家服已经抽象成了多种不同功能和用途的游戏服的组合,例如:负责网络接入的网关服、负责游戏引擎的中心服,负责游戏脚本和玩法的策略服等。每个游戏服的容量和管理策略有所不同,通过单一的负载类型很难描述和快速交付。
|
||||
|
||||
这些能力的缺失,让游戏服进行云原生化变得非常困难。OpenKruiseGame(OKG)设计的初衷就是将这些游戏行业通用的需求进行抽象,通过语义化的方式,将不同类型游戏服的云原生化过程变得简单、高效、安全。
|
||||
|
||||
## 核心功能列表
|
||||
|
||||
OpenKruiseGame(OKG)具有如下核心能力:
|
||||
|
||||
* 镜像热更新/配置热重载
|
||||
* 定向更新/删除/隔离
|
||||
* 内置多种网络模型(IP端口不变/无损直连/全球加速)
|
||||
* 自动弹性伸缩
|
||||
* 自动化运维管理(服务质量)
|
||||
* 云服务厂商无关
|
||||
* 复杂的游戏服务编排
|
||||
|
||||
## What's Next
|
||||
接下来,我们推荐你:
|
||||
* 安装 OpenKruiseGame。有关详细信息,请参阅 [安装](./installation.md)。
|
||||
* 为 OpenKruiseGame 提交代码。更多信息,请参见 [开发者指南](./developer-manuals/contribution.md)。
|
||||
* 加入钉钉群(ID:44862615)与OpenKruiseGame核心贡献者一起讨论。
|
||||
* 通过电子邮件 zhongwei.lzw@alibaba-inc.com 联系我们。
|
||||
|
||||
|
|
@@ -0,0 +1,35 @@
|
|||
# 容器启动顺序控制
|
||||
## 功能概述
|
||||
|
||||
单个游戏服Pod存在多个容器的情况下,有时候会需要对容器的启动顺序有所要求。OKG提供了自定义顺序启动的功能
|
||||
|
||||
## 使用示例
|
||||
|
||||
在GameServerSet.Spec.GameServerTemplate.spec.containers 中添加 KRUISE_CONTAINER_PRIORITY 环境变量:
|
||||
|
||||
```
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
|
||||
# ...
|
||||
|
||||
spec:
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- name: main
|
||||
# ...
|
||||
- name: sidecar
|
||||
env:
|
||||
- name: KRUISE_CONTAINER_PRIORITY
|
||||
value: "1"
|
||||
|
||||
# ...
|
||||
|
||||
```
|
||||
|
||||
- 值的范围在 [-2147483647, 2147483647],不写默认是 0。
|
||||
- 权重高的容器,会保证在权重低的容器之前启动。
|
||||
- 相同权重的容器不保证启动顺序。
|
||||
|
||||
上述例子中游戏服启动时由于sidecar权重更高,所以先启动sidecar容器,再启动main容器
|
|
@@ -0,0 +1,43 @@
|
|||
# 部署游戏服
|
||||
## 功能概述
|
||||
您可以使用GameServerSet进行游戏服的部署,一个简单的部署案例如下:
|
||||
|
||||
```bash
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: minecraft
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 3
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
|
||||
name: minecraft
|
||||
EOF
|
||||
```
|
||||
|
||||
当前GameServerSet创建完成后,由于指定的副本数为3,故在集群中将会出现3个GameServer,以及对应的3个Pod:
|
||||
|
||||
```bash
|
||||
kubectl get gss
|
||||
NAME AGE
|
||||
minecraft 9s
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
kubectl get pod
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
minecraft-0 1/1 Running 0 10s
|
||||
minecraft-1 1/1 Running 0 10s
|
||||
minecraft-2 1/1 Running 0 10s
|
||||
```
|
|
@@ -0,0 +1,174 @@
|
|||
# 游戏服水平伸缩
|
||||
## OpenKruiseGame的伸缩特性
|
||||
|
||||
OKG提供游戏服状态设置的能力,您可以手动/自动(服务质量功能)地设置游戏服的运维状态或删除优先级。当缩容时,GameServerSet负载会根据游戏服的状态进行缩容选择,缩容规则如下:
|
||||
|
||||
1)根据游戏服的opsState缩容。按顺序依次缩容opsState为`WaitToBeDeleted`、`None`、`Maintaining`的游戏服
|
||||
|
||||
2)当opsState相同时,按照DeletionPriority(删除优先级)缩容,优先删除DeletionPriority大的游戏服
|
||||
|
||||
3)当opsState与DeletionPriority都相同时,优先删除名称尾部序号较大的游戏服
|
||||
|
||||
### 示例
|
||||
|
||||
部署一个副本为5的游戏服:
|
||||
|
||||
```bash
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: minecraft
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 5
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
|
||||
name: minecraft
|
||||
EOF
|
||||
```
|
||||
|
||||
生成5个GameServer:
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
minecraft-3 Ready None 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
```
|
||||
|
||||
对minecraft-2设置删除优先级为10:
|
||||
|
||||
```bash
|
||||
kubectl edit gs minecraft-2
|
||||
|
||||
...
|
||||
spec:
|
||||
deletionPriority: 10 #初始为0,调大到10
|
||||
opsState: None
|
||||
updatePriority: 0
|
||||
...
|
||||
```
|
||||
|
||||
手动缩容到4个副本:
|
||||
|
||||
```bash
|
||||
kubectl scale gss minecraft --replicas=4
|
||||
gameserverset.game.kruise.io/minecraft scaled
|
||||
```
|
||||
|
||||
游戏服的数目最终变为4,可以看到2号游戏服因为删除优先级最大所以被删除:
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Deleting None 10 0
|
||||
minecraft-3 Ready None 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
|
||||
# After a while
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-3 Ready None 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
```
|
||||
|
||||
设置minecraft-3的opsState为WaitToBeDeleted:
|
||||
|
||||
```bash
|
||||
kubectl edit gs minecraft-3
|
||||
|
||||
...
|
||||
spec:
|
||||
deletionPriority: 0
|
||||
opsState: WaitToBeDeleted #初始为None, 将其改为WaitToBeDeleted
|
||||
updatePriority: 0
|
||||
...
|
||||
```
|
||||
|
||||
手动缩容到3个副本:
|
||||
|
||||
```bash
|
||||
kubectl scale gss minecraft --replicas=3
|
||||
gameserverset.game.kruise.io/minecraft scaled
|
||||
```
|
||||
|
||||
游戏服的数目最终变为3,可以看到3号游戏服因为处于WaitToBeDeleted状态所以被删除:
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-3 Deleting WaitToBeDeleted 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
|
||||
# After a while
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
```
|
||||
|
||||
手动扩容回5个副本:
|
||||
|
||||
```bash
|
||||
kubectl scale gss minecraft --replicas=5
|
||||
gameserverset.game.kruise.io/minecraft scaled
|
||||
```
|
||||
|
||||
游戏服的数目最终变为5,此时扩容出的游戏服序号为2与3:
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
minecraft-3 Ready None 0 0
|
||||
minecraft-4 Ready None 0 0
|
||||
```
|
||||
|
||||
## 配置游戏服的自动伸缩
|
||||
|
||||
GameServerSet支持HPA,您可以通过默认/自定义指标配置
|
||||
|
||||
### HPA示例
|
||||
|
||||
```yaml
|
||||
apiVersion: autoscaling/v2
|
||||
kind: HorizontalPodAutoscaler
|
||||
metadata:
|
||||
name: minecraft-hpa
|
||||
spec:
|
||||
scaleTargetRef:
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
name: minecraft # GameServerSet对应名称
|
||||
minReplicas: 1
|
||||
maxReplicas: 10
|
||||
metrics:
|
||||
- type: Resource
|
||||
resource:
|
||||
name: cpu
|
||||
target:
|
||||
type: Utilization
|
||||
averageUtilization: 50 # 示例以cpu利用率50%为计算标准
|
||||
```
|
|
@@ -0,0 +1,72 @@
|
|||
# 游戏服热更新
|
||||
## 功能概述
|
||||
在游戏场景下,游戏服脚本、场景资源等属于热更文件,时常以sidecar的形式部署在pod中。
|
||||
在更新这些文件时,我们往往希望不影响主程序(游戏服引擎侧)的正常运行。
|
||||
然而,在原生Kubernetes集群,更新pod中任意容器都会导致pod重建,无法满足游戏热更场景。
|
||||
|
||||
OKG 提供的原地升级能力,可以针对性定向更新pod中某一个容器,不影响整个pod的生命周期。
|
||||
如下图所示,蓝色部分为热更部分,橘色部分为非热更部分。我们将Game Script容器从版本V1更新至版本V2后,整个pod不会重建,橘色部分不受到任何影响,Game Engine正常平稳运行
|
||||
|
||||

|
||||
|
||||
## 使用示例
|
||||
|
||||
部署带有sidecar容器的游戏服,使用GameServerSet作为游戏服负载,pod更新策略选择原地升级:
|
||||
|
||||
```bash
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: minecraft
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 3
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
|
||||
name: minecraft
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/sidecar:v0.1
|
||||
name: sidecar
|
||||
EOF
|
||||
```
|
||||
|
||||
生成3个GameServer以及对应的3个Pod:
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
kubectl get pod
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
minecraft-0 2/2 Running 0 13s
|
||||
minecraft-1 2/2 Running 0 13s
|
||||
minecraft-2 2/2 Running 0 13s
|
||||
```
|
||||
|
||||
当产生热更需求,我们希望只更新sidecar容器而不影响整个pod的生命周期,此时只需更新GameServerSet对应的容器镜像版本即可:
|
||||
|
||||
```bash
|
||||
kubectl edit gss minecraft
|
||||
...
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/sidecar:v0.2
|
||||
name: sidecar
|
||||
...
|
||||
```
|
||||
|
||||
一段时间过后,发现Pod已经更新完毕,restarts次数变为1,但Age并没有减少。游戏服完成了热更新:
|
||||
|
||||
```bash
|
||||
kubectl get pod
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
minecraft-0 2/2 Running 1 (33s ago) 8m55s
|
||||
minecraft-1 2/2 Running 1 (37s ago) 8m54s
|
||||
minecraft-2 2/2 Running 1 (49s ago) 8m54s
|
||||
```
|
|
@@ -0,0 +1,295 @@
|
|||
# 网络模型
|
||||
## 功能概述
|
||||
|
||||
如[OKG设计理念](../design-concept.md)中提到的,游戏服接入层网络是游戏开发者非常关注的问题。
|
||||
非网关架构下,游戏开发者需要考虑如何暴露游戏服的外部IP端口,供玩家连接访问。
|
||||
在不同场景下,往往需要不同的网络产品,而有时网络产品由云厂商提供。OKG 的 Cloud Provider & Network Plugin 源于此而诞生。
|
||||
OKG 会集成不同云提供商的不同网络插件,用户可通过GameServerSet设置游戏服的网络参数,并在生成的GameServer中查看网络状态信息,极大降低了游戏服接入网络的复杂度。
|
||||
|
||||
## 使用示例
|
||||
|
||||
### Kubernetes-HostPort
|
||||
|
||||
OKG支持在原生Kubernetes集群使用HostPort游戏服网络,使用游戏服所在宿主机暴露外部IP及端口,转发至游戏服内部端口中。使用方式如下。
|
||||
|
||||
部署一个带有network的GameServerSet:
|
||||
|
||||
```
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-hostport
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: Kubernetes-HostPort
|
||||
networkConf:
|
||||
#网络配置以k-v键值对的形式传入,由网络插件指定。不同网络插件有着不同的网络配置
|
||||
- name: ContainerPorts
|
||||
#ContainerPorts对应的值格式如下{containerName}:{port1}/{protocol1},{port2}/{protocol2},...
|
||||
value: "gameserver:80"
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
生成的GameServer中通过networkStatus字段查看游戏服网络信息:
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T10:57:01Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 48.98.98.8
|
||||
ports:
|
||||
- name: gameserver-80
|
||||
port: 8211
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.8
|
||||
ports:
|
||||
- name: gameserver-80
|
||||
port: 80
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T10:57:01Z"
|
||||
networkType: Kubernetes-HostPort
|
||||
```
|
||||
|
||||
访问 48.98.98.8:8211 即可
|
||||
|
||||
### AlibabaCloud-NATGW
|
||||
|
||||
OKG支持阿里云下NAT网关模型,使用NATGW的外部IP与端口暴露服务,流量最终将转发至Pod之中。使用方式如下:
|
||||
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-natgw
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: AlibabaCloud-NATGW
|
||||
networkConf:
|
||||
- name: Ports
|
||||
#暴露的端口,格式如下 {port1},{port2}...
|
||||
value: "80"
|
||||
- name: Protocol
|
||||
#使用的协议,默认为TCP
|
||||
value: "TCP"
|
||||
# - name: Fixed
|
||||
# 是否固定映射关系,默认不固定,pod删除后会生成新的外部IP及端口
|
||||
# value: true
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
生成的GameServer中通过networkStatus字段查看游戏服网络信息:
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T11:21:34Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.97.227.137
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "512"
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.189
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "80"
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T11:21:34Z"
|
||||
networkType: AlibabaCloud-NATGW
|
||||
```
|
||||
|
||||
访问 47.97.227.137:512 即可
|
||||
|
||||
## 网络插件附录
|
||||
|
||||
当前支持的网络插件:
|
||||
- Kubernetes-HostPort
|
||||
- AlibabaCloud-NATGW
|
||||
- AlibabaCloud-SLB
|
||||
- AlibabaCloud-SLB-SharedPort
|
||||
|
||||
---
|
||||
### Kubernetes-HostPort
|
||||
#### 插件名称
|
||||
|
||||
`Kubernetes-HostPort`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
Kubernetes
|
||||
|
||||
#### 插件说明
|
||||
- Kubernetes-HostPort利用宿主机网络,通过主机上的端口转发实现游戏服对外暴露服务。宿主机需要配置公网IP,有被公网访问的能力。
|
||||
|
||||
- 用户在配置文件中可自定义宿主机开放的端口段(默认为8000-9000),该网络插件可以帮助用户分配管理宿主机端口,尽量避免端口冲突。
|
||||
|
||||
- 该插件不支持网络隔离。
|
||||
|
||||
#### 网络参数
|
||||
|
||||
ContainerPorts
|
||||
|
||||
- 含义:填写提供服务的容器名以及对应暴露的端口和协议
|
||||
- 填写格式:containerName:port1/protocol1,port2/protocol2,...(协议需大写) 比如:`game-server:25565/TCP`
|
||||
- 是否支持变更:不支持,在创建时即永久生效,随pod生命周期结束而结束
|
||||
|
||||
#### 插件配置
|
||||
|
||||
```
|
||||
[kubernetes]
|
||||
enable = true
|
||||
[kubernetes.hostPort]
|
||||
#填写宿主机可使用的空闲端口段,用于为pod分配宿主机转发端口
|
||||
max_port = 9000
|
||||
min_port = 8000
|
||||
```
|
||||
|
||||
---
|
||||
### AlibabaCloud-NATGW
|
||||
#### 插件名称
|
||||
|
||||
`AlibabaCloud-NATGW`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
AlibabaCloud
|
||||
|
||||
#### 插件说明
|
||||
|
||||
- AlibabaCloud-NATGW 使用阿里云公网网关作为游戏服对外服务的承载实体,外网流量通过DNAT规则转发至对应的游戏服中。
|
||||
|
||||
- 是否支持网络隔离:否
|
||||
|
||||
#### 网络参数
|
||||
|
||||
Ports
|
||||
|
||||
- 含义:填写pod需要暴露的端口
|
||||
- 填写格式:port1,port2,port3… 例如:80,8080,8888
|
||||
- 是否支持变更:不支持
|
||||
|
||||
Protocol
|
||||
|
||||
- 含义:填写服务的网络协议
|
||||
- 填写格式:例如:tcp,默认为tcp
|
||||
- 是否支持变更:不支持
|
||||
|
||||
Fixed
|
||||
|
||||
- 含义:是否固定访问IP/端口。若是,即使pod删除重建,网络内外映射关系不会改变
|
||||
- 填写格式:false / true
|
||||
- 是否支持变更:不支持
|
||||
|
||||
#### 插件配置
|
||||
|
||||
无
|
||||
|
||||
---
|
||||
### AlibabaCloud-SLB
|
||||
#### 插件名称
|
||||
|
||||
`AlibabaCloud-SLB`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
AlibabaCloud
|
||||
|
||||
#### 插件说明
|
||||
|
||||
- AlibabaCloud-SLB 使用阿里云经典四层负载均衡(SLB,又称CLB)作为对外服务的承载实体,在此模式下,不同游戏服将使用同一SLB的不同端口,此时SLB只做转发,并未均衡流量。
|
||||
|
||||
- 是否支持网络隔离:是
|
||||
|
||||
相关设计:https://github.com/openkruise/kruise-game/issues/20
|
||||
|
||||
#### 网络参数
|
||||
|
||||
SlbIds
|
||||
|
||||
- 含义:填写slb的id。暂只支持填写一例,未来将支持填写多例
|
||||
- 填写格式:例如:lb-9zeo7prq1m25ctpfrw1m7
|
||||
- 是否支持变更:暂不支持。未来将支持
|
||||
|
||||
PortProtocols
|
||||
|
||||
- 含义:pod暴露的端口及协议,支持填写多个端口/协议
|
||||
- 格式:port1/protocol1,port2/protocol2,...(协议需大写)
|
||||
- 是否支持变更:暂不支持。未来将支持
|
||||
|
||||
Fixed
|
||||
|
||||
- 含义:是否固定访问IP/端口。若是,即使pod删除重建,网络内外映射关系不会改变
|
||||
- 填写格式:false / true
|
||||
- 是否支持变更:不支持
|
||||
|
||||
#### 插件配置
|
||||
```
|
||||
[alibabacloud]
|
||||
enable = true
|
||||
[alibabacloud.slb]
|
||||
#填写slb可使用的空闲端口段,用于为pod分配外部接入端口,范围为200
|
||||
max_port = 700
|
||||
min_port = 500
|
||||
```
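下面给出一个仅作示意的 GameServerSet network 字段写法(SLB 实例 ID 与端口均为假设值):

```yaml
network:
  networkType: AlibabaCloud-SLB
  networkConf:
    - name: SlbIds
      value: lb-9zeo7prq1m25ctpfrw1m7   # 假设的 SLB 实例 ID,暂只支持填写一例
    - name: PortProtocols
      value: 80/TCP                     # pod 暴露的端口及协议,协议需大写
    - name: Fixed
      value: "false"                    # 是否固定访问 IP/端口
```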
|
||||
|
||||
---
|
||||
|
||||
#### 插件名称
|
||||
### AlibabaCloud-SLB-SharedPort
|
||||
`AlibabaCloud-SLB-SharedPort`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
AlibabaCloud
|
||||
|
||||
#### 插件说明
|
||||
|
||||
- AlibabaCloud-SLB-SharedPort 使用阿里云经典四层负载均衡(SLB,又称CLB)作为对外服务的承载实体。但与AlibabaCloud-SLB不同,`AlibabaCloud-SLB-SharedPort` 使用SLB同一端口转发流量,具有负载均衡的特点。
|
||||
适用于游戏场景下代理(proxy)或网关等无状态网络服务。
|
||||
|
||||
- 是否支持网络隔离:是
|
||||
|
||||
#### 网络参数
|
||||
|
||||
SlbIds
|
||||
|
||||
- 含义:填写slb的id,支持填写多例
|
||||
- 填写格式:例如:lb-9zeo7prq1m25ctpfrw1m7
|
||||
- 是否支持变更:支持。
|
||||
|
||||
PortProtocols
|
||||
|
||||
- 含义:pod暴露的端口及协议,支持填写多个端口/协议
|
||||
- 格式:port1/protocol1,port2/protocol2,...(协议需大写)
|
||||
- 是否支持变更:暂不支持。未来将支持
|
||||
|
||||
#### 插件配置
|
||||
|
||||
无
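同样给出一个仅作示意的 network 字段写法(SLB 实例 ID 为假设值):

```yaml
network:
  networkType: AlibabaCloud-SLB-SharedPort
  networkConf:
    - name: SlbIds
      value: lb-9zeo7prq1m25ctpfrw1m7   # 假设的 SLB 实例 ID
    - name: PortProtocols
      value: 8080/TCP                   # pod 暴露的端口及协议,协议需大写
```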
|
|
@@ -0,0 +1,128 @@
|
|||
# 自定义服务质量
|
||||
## 功能概述
|
||||
|
||||
由于游戏是有状态服务,很多时候游戏服是以一种 "富容器" 的形态存在于Pod之中,多个进程在Pod中统一管理。
|
||||
然而,每个进程重要性却有所不同,对于"轻量级进程"错误的情况,用户并不希望将整个pod删除重建,像k8s原生的liveness probe并不能很好地满足这种需求,过于僵化的模式与游戏场景并不适配。
|
||||
OKG 认为游戏服的服务质量水平应该交由游戏开发者定义,开发者可以根据不同游戏服状态去设置对应的处理动作。自定义服务质量功能是探测+动作的组合,通过这种方式帮助用户自动化地处理各类游戏服状态问题。
|
||||
|
||||
## 使用示例
|
||||
|
||||
### 游戏服空闲设置即将下线
|
||||
|
||||
部署一个带有自定义服务质量的GameServerSet:
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: minecraft
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 3
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:idle
|
||||
name: minecraft
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
maxUnavailable: 100%
|
||||
serviceQualities: # 设置了一个idle的服务质量
|
||||
- name: idle
|
||||
containerName: minecraft
|
||||
permanent: false
|
||||
#与原生probe类似,本例使用执行脚本的方式探测游戏服是否空闲,不存在玩家
|
||||
exec:
|
||||
command: ["bash", "./idle.sh"]
|
||||
serviceQualityAction:
|
||||
#不存在玩家,标记该游戏服运维状态为WaitToBeDeleted
|
||||
- state: true
|
||||
opsState: WaitToBeDeleted
|
||||
#存在玩家,标记该游戏服运维状态为None
|
||||
- state: false
|
||||
opsState: None
|
||||
EOF
|
||||
```
|
||||
|
||||
部署完成后,由于还未导入玩家,故所有游戏服都为空闲状态,可以任意被删除:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready WaitToBeDeleted 0 0
|
||||
minecraft-1 Ready WaitToBeDeleted 0 0
|
||||
minecraft-2 Ready WaitToBeDeleted 0 0
|
||||
```
|
||||
|
||||
当有玩家进入游戏服minecraft-1,则游戏服的运维状态发生变化:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready WaitToBeDeleted 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready WaitToBeDeleted 0 0
|
||||
```
|
||||
|
||||
此时若发生缩容,游戏服minecraft-1将得到保护,避免优先删除。
|
||||
|
||||
### 游戏服状态异常设置维护中
|
||||
|
||||
部署一个带有自定义服务质量的GameServerSet:
|
||||
```shell
cat <<EOF | kubectl apply -f -
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: demo-gs
  namespace: default
spec:
  replicas: 3
  gameServerTemplate:
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:healthy
          name: minecraft
  updateStrategy:
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible
      maxUnavailable: 100%
  serviceQualities: # A service quality named "healthy" is defined.
    - name: healthy
      containerName: minecraft
      permanent: false
      # Similar to a native probe, this example runs a script to check whether the game server is healthy.
      exec:
        command: ["bash", "./healthy.sh"]
      serviceQualityAction:
        # The probe succeeds: mark the ops state of this game server as None.
        - state: true
          opsState: None
        # The probe fails: mark the ops state of this game server as Maintaining.
        - state: false
          opsState: Maintaining
EOF
```
|
||||
|
||||
After the deployment is completed, everything is normal, so the ops state of all game servers is None:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
demo-gs-0 Ready None 0 0
|
||||
demo-gs-1 Ready None 0 0
|
||||
demo-gs-2 Ready None 0 0
|
||||
```
|
||||
|
||||
Simulate a process crash on demo-gs-0; the game server switches to the Maintaining state:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
demo-gs-0 Ready Maintaining 0 0
|
||||
demo-gs-1 Ready None 0 0
|
||||
demo-gs-2 Ready None 0 0
|
||||
```
|
||||
|
||||
At this point, the gameserver controller emits a Warning event for GameServer demo-gs-0. Combined with the [kube-eventer project](https://github.com/AliyunContainerService/kube-eventer), this can be used to send alerts for such exceptions:
|
||||
|
||||

|
||||
|
||||
In addition, OKG will integrate automatic troubleshooting and recovery tools for game servers in the future, further enriching the automated O&M capabilities for game servers.
|
|
@ -0,0 +1,84 @@
|
|||
# Game Server Update Strategy

## Feature overview

OKG provides multiple update strategies, such as in-place update ([hot update](./hot-update.md)), batch update, and priority-based update.

You can set the update priority of a GameServer and combine it with the partition parameter to control the update scope, order, and pace in real production scenarios.

As shown in the following figure, if you raise the priority of game server 1 and set partition to 2, game server 1 is updated first; changing partition to 0 afterwards updates the remaining game servers. See the usage example for details.
|
||||
|
||||

|
||||
|
||||
## Usage example

In this example, a group of game servers is updated in two batches to simulate a canary update with step-by-step verification.

The GameServerSet currently has three game server replicas:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 0
|
||||
gs-demo-2 Ready None 0 0
|
||||
```
|
||||
|
||||
Set the update priority and increase the priority of game server 1:
|
||||
```shell
kubectl edit gs gs-demo-1

...
spec:
  deletionPriority: 0
  opsState: None
  updatePriority: 10 # Initially 0; increase it to 10.
...
```
|
||||
|
||||
Next, set the partition of the GameServerSet and the new image to be used for the update:
|
||||
```shell
kubectl edit gss gs-demo

...
        image: gameserver:latest # Update the image.
        name: gameserver
...
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 5
      partition: 2 # The number of game servers to keep on the old version. Only one game server is updated here, so the remaining 2 are kept.
      podUpdatePolicy: InPlaceIfPossible
...
```
|
||||
|
||||
At this point, only gs-demo-1 is updated:
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Updating None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
|
||||
|
||||
# After a while
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
```
|
||||
|
||||
After gs-demo-1 passes verification, update the remaining game servers:
|
||||
```shell
kubectl edit gss gs-demo

...
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 5
      partition: 0 # Set the number of game servers to keep on the old version to 0 to update all the remaining game servers.
      podUpdatePolicy: InPlaceIfPossible
...
```
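
For non-interactive pipelines, the same partition change can also be applied with a one-line patch instead of `kubectl edit`; a sketch, assuming the GameServerSet is named gs-demo:

```shell
kubectl patch gss gs-demo --type merge \
  -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
```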
|
||||
|
|
@ -27,7 +27,7 @@
|
|||
## Why is OpenKruiseGame a workload?
|
||||
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/workload.png').default} width="90%" />
|
||||
|
||||
|
||||
The key to the cloud-native transformation of game servers is to address two concerns: lifecycle management and O&M management of game servers. Kubernetes provides built-in general-purpose workloads such as Deployment, StatefulSet, and Job. However, game server state management demands finer granularity and greater determinism. For example, game servers need a hot update mechanism to keep game interruptions short and in-place updates to keep network-related metadata unchanged; auto scaling must guarantee that only game servers with zero players are taken offline; and the ability to manually perform O&M, diagnosis, and isolation on any single game server is required. These requirements cannot be met by the built-in Kubernetes workloads alone.
|
||||
|
@ -56,7 +56,7 @@ The service quality capability can be used to convert players' tasks of a game t
|
|||
|
||||
## Deployment architecture of OpenKruiseGame
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/arch.png').default} width="90%" />
|
||||
|
||||
The deployment model of OpenKruiseGame consists of three parts:
|
||||
|
||||
|
|
|
@ -18,38 +18,43 @@ $ helm repo update
|
|||
$ helm install kruise openkruise/kruise --version 1.4.0
|
||||
```
|
||||
|
||||
## Install Kruise-Game
|
||||
#### Install Kruise-Game
|
||||
|
||||
```shell
|
||||
$ helm install kruise-game openkruise/kruise-game --version 0.2.1
|
||||
$ helm install kruise-game openkruise/kruise-game --version 0.3.0
|
||||
```
|
||||
|
||||
## Optional: install/upgrade with customized configurations
|
||||
#### Optional: install/upgrade with customized configurations
|
||||
|
||||
The following table lists the configurable parameters of the kruise-game chart and their default values.
|
||||
|
||||
| Parameter | Description | Default |
|
||||
|----------------------------------|-------------------------------------------------------------------|-------------------------------------|
|
||||
| `installation.namespace` | Namespace for kruise-game operation installation | `kruise-game-system` |
|
||||
| `installation.createNamespace` | Whether to create the installation.namespace | `true` |
|
||||
| `kruiseGame.fullname` | Nick name for kruise-game deployment and other configurations | `kruise-game-controller-manager` |
|
||||
| `kruiseGame.healthBindPort` | Port for checking health of kruise-game container | `8082` |
|
||||
| `kruiseGame.webhook.port` | Port of webhook served by kruise-game container | `443` |
|
||||
| `kruiseGame.webhook.targetPort` | ObjectSelector for workloads in MutatingWebhookConfigurations | `9876` |
|
||||
| `replicaCount` | Replicas of kruise-game deployment | `1` |
|
||||
| `image.repository` | Repository for kruise-game image | `openkruise/kruise-game-manager` |
|
||||
| `image.tag` | Tag for kruise-game image | `v0.2.1` |
|
||||
| `image.pullPolicy` | ImagePullPolicy for kruise-game container | `Always` |
|
||||
| `serviceAccount.annotations` | The annotations for serviceAccount of kruise-game | ` ` |
|
||||
| `resources.limits.cpu` | CPU resource limit of kruise-game container | `500m` |
|
||||
| `resources.limits.memory` | Memory resource limit of kruise-game container | `1Gi` |
|
||||
| `resources.requests.cpu` | CPU resource request of kruise-game container | `10m` |
|
||||
| `resources.requests.memory` | Memory resource request of kruise-game container | `64Mi` |
|
||||
|
||||
| Parameter | Description | Default |
|
||||
|----------------------------------------|-------------------------------------------------------------------|-------------------------------------|
|
||||
| `installation.namespace` | Namespace for kruise-game operation installation | `kruise-game-system` |
|
||||
| `installation.createNamespace` | Whether to create the installation.namespace | `true` |
|
||||
| `kruiseGame.fullname` | Nick name for kruise-game deployment and other configurations | `kruise-game-controller-manager` |
|
||||
| `kruiseGame.healthBindPort` | Port for checking health of kruise-game container | `8082` |
|
||||
| `kruiseGame.webhook.port` | Port of webhook served by kruise-game container | `443` |
|
||||
| `kruiseGame.webhook.targetPort` | ObjectSelector for workloads in MutatingWebhookConfigurations | `9876` |
|
||||
| `replicaCount` | Replicas of kruise-game deployment | `1` |
|
||||
| `image.repository` | Repository for kruise-game image | `openkruise/kruise-game-manager` |
|
||||
| `image.tag` | Tag for kruise-game image | `v0.2.1` |
|
||||
| `image.pullPolicy` | ImagePullPolicy for kruise-game container | `Always` |
|
||||
| `serviceAccount.annotations` | The annotations for serviceAccount of kruise-game | ` ` |
|
||||
| `resources.limits.cpu` | CPU resource limit of kruise-game container | `500m` |
|
||||
| `resources.limits.memory` | Memory resource limit of kruise-game container | `1Gi` |
|
||||
| `resources.requests.cpu` | CPU resource request of kruise-game container | `10m` |
|
||||
| `resources.requests.memory` | Memory resource request of kruise-game container | `64Mi` |
|
||||
| `prometheus.enabled` | Whether to bind metric endpoint | `true` |
|
||||
| `prometheus.monitorService.port` | Port of the monitorservice bind to | `8080` |
|
||||
| `scale.service.port` | Port of the external scaler server binds to | `6000` |
|
||||
| `scale.service.targetPort` | TargetPort of the external scaler server binds to | `6000` |
|
||||
| `network.totalWaitTime` | Maximum time to wait for network ready, the unit is seconds | `60` |
|
||||
| `network.probeIntervalTime` | Time interval for detecting network status, the unit is seconds | `5` |
|
||||
|
||||
Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`. For example, to change the replica count and the image pull policy (parameter names as listed in the table above):
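
```shell
$ helm install kruise-game openkruise/kruise-game --version 0.3.0 \
    --set replicaCount=2 \
    --set image.pullPolicy=IfNotPresent
```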
|
||||
|
||||
### Optional: the local image for China
|
||||
#### Optional: the local image for China
|
||||
|
||||
If you are in China and have problems pulling images from the official DockerHub, you can use the registry hosted on Alibaba Cloud:
|
||||
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
### Overview
|
||||
OpenKruiseGame (OKG) is a multicloud-oriented, open source Kubernetes workload specialized for game servers. It is a sub-project of OpenKruise, the open source workload project of the Cloud Native Computing Foundation (CNCF), dedicated to the gaming field. OpenKruiseGame makes the cloud-native transformation of game servers easier, faster, and more stable.
|
||||
|
||||

|
||||
<img src={require('/static/img/kruisegame/intro.png').default} width="90%" />
|
||||
|
||||
## What is OpenKruiseGame?
|
||||
OpenKruiseGame is a custom Kubernetes workload designed specially for game server scenarios. It simplifies the cloud-native transformation of game servers. Compared with the built-in workloads of Kubernetes, such as Deployment and StatefulSet, OpenKruiseGame provides common game server management features, such as hot update, in-place update, and management of specified game servers.
|
||||
|
@ -23,6 +23,7 @@ The architecture of game servers has become increasingly complex. The player ser
|
|||
The preceding challenges make it difficult to implement cloud-native transformation of game servers. OpenKruiseGame is aimed to abstract the common requirements of the gaming industry, and use the semantic method to make the cloud-native transformation of various game servers simple, efficient, and secure.
|
||||
|
||||
## List of core features
|
||||
|
||||
OpenKruiseGame has the following core features:
|
||||
|
||||
* Hot update based on images and hot reload of configurations
|
||||
|
@ -33,6 +34,18 @@ OpenKruiseGame has the following core features:
|
|||
* Independent of cloud service providers
|
||||
* Complex game server orchestration
|
||||
|
||||
## Users of OpenKruiseGame(OKG)
|
||||
|
||||
<table>
|
||||
<tr style={{"border":0}}>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/bilibili-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/hypergryph-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/shangyou-logo.jpeg').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/guanying-logo.png').default} width="120" /></center></td>
|
||||
<td style={{"border":0}}><center><img src={require('/static/img/kruisegame/booming-logo.png').default} width="120" /></center></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## What's Next
|
||||
Here are some recommended next steps:
|
||||
* Install OpenKruiseGame. For more information, see [Install](./installation.md).
|
||||
|
|
|
@ -0,0 +1,100 @@
|
|||
# AutoScale
|
||||
## Feature overview
|
||||
|
||||
Compared to stateless service types, game servers have higher requirements for automatic scaling, especially in terms of scaling down.
|
||||
|
||||
The differences between game servers grow more and more pronounced over time, so scale-down has extremely high precision requirements. A coarse-grained scaling mechanism can easily cause problems such as player disconnections, resulting in huge losses for the business.
|
||||
|
||||
The horizontal scaling mechanism in native Kubernetes is shown in the following figure:
|
||||
|
||||

|
||||
|
||||
In the game scenario, its main problems are:
|
||||
|
||||
- At the pod level, it cannot perceive the business status of the game server and therefore cannot set the deletion priority based on that status.
- At the workload level, it cannot select scale-down targets based on the business status.
- At the autoscaler level, it cannot accurately calculate the appropriate number of replicas based on the business status of the game servers.
|
||||
|
||||
In this way, the automatic scaling mechanism based on native Kubernetes will cause two major problems in the game scenario:
|
||||
|
||||
- The scale-down count is inaccurate: it is easy to delete too many or too few game servers.
- The scale-down targets are inaccurate: game servers that are under heavy player load can easily be deleted.
|
||||
|
||||
|
||||
The automatic scaling mechanism of OKG is shown in the following figure:
|
||||
|
||||

|
||||
|
||||
- At the game server level, each game server can report its own status and expose whether it is in the WaitToBeDeleted state through custom service quality or external components.
|
||||
- At the workload level, the GameServerSet can determine the scaling-down object based on the business status reported by the game server. As described in [Game Server Horizontal Scaling](gameservers-scale.md), the game server in the WaitToBeDeleted state is the highest priority game server to be deleted during scaling down.
|
||||
- At the autoscaler level, the scaler accurately counts the game servers in the WaitToBeDeleted state and uses that number as the scale-down quantity, which avoids accidental deletion.
|
||||
|
||||
In this way, OKG's automatic scaler will only delete game servers in the WaitToBeDeleted state during the scaling-down window, achieving targeted and precise scaling down.
|
||||
|
||||
## Usage Example
|
||||
|
||||
_**Prerequisites: Install [KEDA](https://keda.sh/docs/2.10/deploy/) in the cluster.**_
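
If you install KEDA with Helm, the invocation documented by the KEDA project looks like this (chart repository and names per the KEDA docs):

```shell
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```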
|
||||
|
||||
Deploy the ScaledObject object to set the automatic scaling strategy. Refer to the [ScaledObject API](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledobject_types.go) for the specific field meanings.
|
||||
|
||||
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: minecraft # Fill in the name of the corresponding GameServerSet
spec:
  scaleTargetRef:
    name: minecraft # Fill in the name of the corresponding GameServerSet
    apiVersion: game.kruise.io/v1alpha1
    kind: GameServerSet
  pollingInterval: 30
  minReplicaCount: 0
  advanced:
    horizontalPodAutoscalerConfig:
      behavior: # Inherit from HPA behavior, refer to https://kubernetes.io/zh-cn/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior
        scaleDown:
          stabilizationWindowSeconds: 45 # Set the scaling-down stabilization window time to 45 seconds
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
  triggers:
    - type: external
      metricType: Value
      metadata:
        scalerAddress: kruise-game-external-scaler.kruise-game-system:6000
```
|
||||
|
||||
After deployment, change the opsState of the gs minecraft-0 to WaitToBeDeleted (see [Custom Service Quality](service-qualities.md) for automated setting of game server status).
|
||||
|
||||
```bash
kubectl edit gs minecraft-0

...
spec:
  deletionPriority: 0
  opsState: WaitToBeDeleted # Set to None initially, and change it to WaitToBeDeleted
  updatePriority: 0
...
```
|
||||
|
||||
After the scaling-down window period, the game server minecraft-0 is automatically deleted.
|
||||
|
||||
```bash
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Deleting WaitToBeDeleted 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
# After a while
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
```
|
|
@ -0,0 +1,277 @@
|
|||
# CRD Field Description
|
||||
## GameServerSet
|
||||
|
||||
### GameServerSetSpec
|
||||
|
||||
```
|
||||
type GameServerSetSpec struct {
|
||||
// The number of game servers. Must be specified, with a minimum value of 0.
|
||||
Replicas *int32 `json:"replicas"`
|
||||
|
||||
// Game server template. The new game server will be created with the parameters defined in GameServerTemplate.
|
||||
GameServerTemplate GameServerTemplate `json:"gameServerTemplate,omitempty"`
|
||||
|
||||
// Reserved game server IDs, optional. If specified, existing game servers with those IDs will be deleted,
|
||||
// and new game servers will not be created with those IDs.
|
||||
ReserveGameServerIds []int `json:"reserveGameServerIds,omitempty"`
|
||||
|
||||
// Custom service qualities for game servers.
|
||||
ServiceQualities []ServiceQuality `json:"serviceQualities,omitempty"`
|
||||
|
||||
// Batch update strategy for game servers.
|
||||
UpdateStrategy UpdateStrategy `json:"updateStrategy,omitempty"`
|
||||
|
||||
// Horizontal scaling strategy for game servers.
|
||||
ScaleStrategy ScaleStrategy `json:"scaleStrategy,omitempty"`
|
||||
|
||||
// Network settings for game server access layer.
|
||||
Network *Network `json:"network,omitempty"`
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
#### GameServerTemplate
|
||||
|
||||
```go
|
||||
type GameServerTemplate struct {
|
||||
// All fields inherited from PodTemplateSpec.
|
||||
corev1.PodTemplateSpec `json:",inline"`
|
||||
|
||||
// Requests and claims for persistent volumes.
|
||||
VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty"`
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
#### UpdateStrategy
|
||||
|
||||
```
|
||||
type UpdateStrategy struct {
|
||||
// Type indicates the type of the StatefulSetUpdateStrategy.
|
||||
// Default is RollingUpdate.
|
||||
// +optional
|
||||
Type apps.StatefulSetUpdateStrategyType `json:"type,omitempty"`
|
||||
|
||||
// RollingUpdate is used to communicate parameters when Type is RollingUpdateStatefulSetStrategyType.
|
||||
// +optional
|
||||
RollingUpdate *RollingUpdateStatefulSetStrategy `json:"rollingUpdate,omitempty"`
|
||||
}
|
||||
|
||||
type RollingUpdateStatefulSetStrategy struct {
|
||||
// Partition indicates the ordinal at which the StatefulSet should be partitioned by default.
|
||||
// But if unorderedUpdate has been set:
|
||||
// - Partition indicates the number of pods with non-updated revisions when rolling update.
|
||||
// - It means controller will update $(replicas - partition) number of pod.
|
||||
// Default value is 0.
|
||||
// +optional
|
||||
Partition *int32 `json:"partition,omitempty"`
|
||||
|
||||
// The maximum number of pods that can be unavailable during the update.
|
||||
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
|
||||
// Absolute number is calculated from percentage by rounding down.
|
||||
// Also, maxUnavailable can just be allowed to work with Parallel podManagementPolicy.
|
||||
// Defaults to 1.
|
||||
// +optional
|
||||
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
|
||||
|
||||
// PodUpdatePolicy indicates how pods should be updated
|
||||
// Default value is "ReCreate"
|
||||
// +optional
|
||||
PodUpdatePolicy kruiseV1beta1.PodUpdateStrategyType `json:"podUpdatePolicy,omitempty"`
|
||||
|
||||
// Paused indicates that the StatefulSet is paused.
|
||||
// Default value is false
|
||||
// +optional
|
||||
Paused bool `json:"paused,omitempty"`
|
||||
|
||||
// UnorderedUpdate contains strategies for non-ordered update.
|
||||
// If it is not nil, pods will be updated with non-ordered sequence.
|
||||
// Noted that UnorderedUpdate can only be allowed to work with Parallel podManagementPolicy
|
||||
// +optional
|
||||
// UnorderedUpdate *kruiseV1beta1.UnorderedUpdateStrategy `json:"unorderedUpdate,omitempty"`
|
||||
|
||||
// InPlaceUpdateStrategy contains strategies for in-place update.
|
||||
// +optional
|
||||
InPlaceUpdateStrategy *appspub.InPlaceUpdateStrategy `json:"inPlaceUpdateStrategy,omitempty"`
|
||||
|
||||
// MinReadySeconds indicates how long will the pod be considered ready after it's updated.
|
||||
// MinReadySeconds works with both OrderedReady and Parallel podManagementPolicy.
|
||||
// It affects the pod scale up speed when the podManagementPolicy is set to be OrderedReady.
|
||||
// Combined with MaxUnavailable, it affects the pod update speed regardless of podManagementPolicy.
|
||||
// Default value is 0, max is 300.
|
||||
// +optional
|
||||
MinReadySeconds *int32 `json:"minReadySeconds,omitempty"`
|
||||
}
|
||||
|
||||
type InPlaceUpdateStrategy struct {
|
||||
// GracePeriodSeconds is the timespan between set Pod status to not-ready and update images in Pod spec
|
||||
// when in-place update a Pod.
|
||||
GracePeriodSeconds int32 `json:"gracePeriodSeconds,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
#### ScaleStrategy
|
||||
```
|
||||
|
||||
type ScaleStrategy struct {
|
||||
// The maximum number of pods that can be unavailable during scaling.
|
||||
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
|
||||
// Absolute number is calculated from percentage by rounding down.
|
||||
// It can just be allowed to work with Parallel podManagementPolicy.
|
||||
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
|
||||
|
||||
// ScaleDownStrategyType indicates the scaling down strategy, include two types: General & ReserveIds
|
||||
// General will first consider the ReserveGameServerIds field when game server scaling down.
|
||||
// When the number of reserved game servers does not meet the scale down number, continue to
|
||||
// select and delete the game servers from the current game server list.
|
||||
// ReserveIds will backfill the sequence numbers into ReserveGameServerIds field when
|
||||
// GameServers scale down, whether set by ReserveGameServerIds field or the GameServerSet
|
||||
// controller chooses to remove it.
|
||||
// Default is General
|
||||
// +optional
|
||||
ScaleDownStrategyType ScaleDownStrategyType `json:"scaleDownStrategyType,omitempty"`
|
||||
}
|
||||
```
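
To make the relationship between `reserveGameServerIds` and `scaleDownStrategyType` concrete, a minimal sketch of the relevant GameServerSet fields is shown below; the IDs and replica count are arbitrary examples.

```yaml
spec:
  replicas: 5
  # Game servers with these IDs are removed if they exist and are not created again.
  reserveGameServerIds:
    - 1
    - 3
  scaleStrategy:
    # ReserveIds backfills the IDs of scaled-down game servers into reserveGameServerIds.
    scaleDownStrategyType: ReserveIds
```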
|
||||
|
||||
#### ServiceQualities
|
||||
|
||||
```
|
||||
type ServiceQuality struct {
|
||||
// Inherits all fields from corev1.Probe
|
||||
corev1.Probe `json:",inline"`
|
||||
|
||||
// Custom name for the service quality, distinguishes different service qualities that are defined.
|
||||
Name string `json:"name"`
|
||||
|
||||
// Name of the container to be probed.
|
||||
ContainerName string `json:"containerName,omitempty"`
|
||||
|
||||
// Whether to make GameServerSpec not change after the ServiceQualityAction is executed.
|
||||
// When Permanent is true, regardless of the detection results, ServiceQualityAction will only be executed once.
|
||||
// When Permanent is false, ServiceQualityAction can be executed again even though ServiceQualityAction has been executed.
|
||||
Permanent bool `json:"permanent"`
|
||||
|
||||
// Corresponding actions to be executed for the service quality.
|
||||
ServiceQualityAction []ServiceQualityAction `json:"serviceQualityAction,omitempty"`
|
||||
}
|
||||
|
||||
type ServiceQualityAction struct {
|
||||
// Defines to change the GameServerSpec field when the detection is true/false.
|
||||
State bool `json:"state"`
|
||||
GameServerSpec `json:",inline"`
|
||||
}
|
||||
```
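
Because ServiceQuality embeds corev1.Probe inline, the standard probe fields can be set next to the OKG-specific ones. The following is a hypothetical snippet; whether every probe field is honored depends on the controller version, so treat it as a sketch rather than a reference configuration.

```yaml
serviceQualities:
  - name: healthy
    containerName: minecraft
    permanent: false
    # Inherited from corev1.Probe.
    initialDelaySeconds: 10
    periodSeconds: 5
    exec:
      command: ["bash", "./healthy.sh"]
    serviceQualityAction:
      - state: true
        opsState: None
      - state: false
        opsState: Maintaining
```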
|
||||
|
||||
#### Network
|
||||
|
||||
```
|
||||
type Network struct {
|
||||
// Different network types correspond to different network plugins.
|
||||
NetworkType string `json:"networkType,omitempty"`
|
||||
|
||||
// Different network types need to fill in different network parameters.
|
||||
NetworkConf []NetworkConfParams `json:"networkConf,omitempty"`
|
||||
}
|
||||
|
||||
type NetworkConfParams KVParams
|
||||
|
||||
type KVParams struct {
|
||||
// Parameter name, the name is determined by the network plugin
|
||||
Name string `json:"name,omitempty"`
|
||||
|
||||
// Parameter value, the format is determined by the network plugin
|
||||
Value string `json:"value,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
### GameServerSetStatus
|
||||
|
||||
```go
|
||||
type GameServerSetStatus struct {
|
||||
// The iteration version of the GameServerSet observed by the controller.
|
||||
ObservedGeneration int64 `json:"observedGeneration,omitempty"`
|
||||
|
||||
// The number of game servers.
|
||||
Replicas int32 `json:"replicas"`
|
||||
|
||||
// The number of game servers that are ready.
|
||||
ReadyReplicas int32 `json:"readyReplicas"`
|
||||
|
||||
// The number of game servers that are available.
|
||||
AvailableReplicas int32 `json:"availableReplicas"`
|
||||
|
||||
// The current number of game servers.
|
||||
CurrentReplicas int32 `json:"currentReplicas"`
|
||||
|
||||
// The number of game servers that have been updated.
|
||||
UpdatedReplicas int32 `json:"updatedReplicas"`
|
||||
|
||||
// The number of game servers that have been updated and are ready.
|
||||
UpdatedReadyReplicas int32 `json:"updatedReadyReplicas,omitempty"`
|
||||
|
||||
// The number of game servers that are in Maintaining state.
|
||||
MaintainingReplicas *int32 `json:"maintainingReplicas,omitempty"`
|
||||
|
||||
// The number of game servers that are in WaitToBeDeleted state.
|
||||
WaitToBeDeletedReplicas *int32 `json:"waitToBeDeletedReplicas,omitempty"`
|
||||
|
||||
// The label selector used to query game servers that should match the replica count used by HPA.
|
||||
LabelSelector string `json:"labelSelector,omitempty"`
|
||||
}
|
||||
|
||||
```
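
A quick way to inspect these status fields on a running object, assuming a GameServerSet named minecraft:

```shell
# Print only the status section of the GameServerSet.
kubectl get gss minecraft -o jsonpath='{.status}'
```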
|
||||
|
||||
|
||||
## GameServer
|
||||
|
||||
### GameServerSpec
|
||||
|
||||
```
|
||||
type GameServerSpec struct {
|
||||
// The O&M state of the game server, not pod runtime state, more biased towards the state of the game itself.
|
||||
// Currently, the states that can be specified are: None / WaitToBeDeleted / Maintaining.
|
||||
// Default is None
|
||||
OpsState OpsState `json:"opsState,omitempty"`
|
||||
|
||||
// Update priority. If the priority is higher, it will be updated first.
|
||||
UpdatePriority *intstr.IntOrString `json:"updatePriority,omitempty"`
|
||||
|
||||
// Deletion priority. If the priority is higher, it will be deleted first.
|
||||
DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
|
||||
|
||||
// Whether to perform network isolation and cut off the access layer network
|
||||
// Default is false
|
||||
NetworkDisabled bool `json:"networkDisabled,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
### GameServerStatus
|
||||
|
||||
```
|
||||
type GameServerStatus struct {
|
||||
// Expected game server status
|
||||
DesiredState GameServerState `json:"desiredState,omitempty"`
|
||||
|
||||
// The actual status of the current game server
|
||||
CurrentState GameServerState `json:"currentState,omitempty"`
|
||||
|
||||
// network status information
|
||||
NetworkStatus NetworkStatus `json:"networkStatus,omitempty"`
|
||||
|
||||
// The game server corresponds to the pod status
|
||||
PodStatus corev1.PodStatus `json:"podStatus,omitempty"`
|
||||
|
||||
// Service quality status of game server
|
||||
ServiceQualitiesCondition []ServiceQualityCondition `json:"serviceQualitiesConditions,omitempty"`
|
||||
|
||||
// Current update priority
|
||||
UpdatePriority *intstr.IntOrString `json:"updatePriority,omitempty"`
|
||||
|
||||
// Current deletion priority
|
||||
DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
|
||||
|
||||
// Last change time
|
||||
LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty"`
|
||||
}
|
||||
```
|
||||
|
|
@ -0,0 +1,32 @@
|
|||
# GameServer Monitor
|
||||
## Metrics available
|
||||
|
||||
OKG by default exposes game server related Prometheus metrics, including:
|
||||
|
||||
| Name | Description | Type |
|
||||
| --- |------------------------------------------------|---------|
|
||||
| GameServersStateCount | Number of game servers in different states | gauge |
|
||||
| GameServersOpsStateCount | Number of game servers in different ops states | gauge |
|
||||
| GameServersTotal | Total number of game servers that have existed | counter |
|
||||
| GameServerSetsReplicasCount | Number of replicas for each GameServerSet | gauge |
|
||||
| GameServerDeletionPriority | Deletion priority for game servers | gauge |
|
||||
| GameServerUpdatePriority | Update priority for game servers | gauge |
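
To quickly verify that these metrics are being exposed, you can port-forward the controller and query its metrics endpoint. The deployment name and port below follow the Helm chart defaults and may differ in your installation:

```shell
# Forward the controller's metrics port locally (names/ports per the Helm defaults).
kubectl -n kruise-game-system port-forward deploy/kruise-game-controller-manager 8080:8080
# In another terminal, list one of the exposed metrics.
curl -s http://localhost:8080/metrics | grep GameServersOpsStateCount
```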
|
||||
|
||||
|
||||
## Monitoring Dashboard
|
||||
|
||||
### Dashboard Import
|
||||
1. Import [grafana.json](https://github.com/openkruise/kruise-game/blob/master/config/prometheus/grafana.json) to Grafana
|
||||
2. Choose data source
|
||||
3. Replace UID and complete the import
|
||||
|
||||
### Dashboard Introduction
|
||||
The imported dashboard is shown below:
|
||||
|
||||
<img src={require('/static/img/kruisegame/user-manuals/gra-dash.png').default} width="90%" />
|
||||
|
||||
From top to bottom, it includes:
|
||||
- First row: number of GameServers in each current state, and a pie chart showing the proportion of GameServers in each current state
|
||||
- Second row: line chart showing the number of GameServers in each state over time
|
||||
- Third row: line chart showing the changes in deletion and update priorities for GameServers (can be filtered by namespace and gsName in the top-left corner)
|
||||
- Fourth and fifth rows: line charts showing the number of GameServers in different states for each GameServerSet (can be filtered by namespace and gssName in the top-left corner)
|
|
@ -6,9 +6,51 @@ For a non-gateway architecture, game developers need to consider how to expose e
|
|||
Different network products are usually required for access in different scenarios, and the network products may be provided by different cloud service providers. This increases the complexity of the access network. Cloud Provider & Network Plugin of OpenKruiseGame is designed to resolve this issue.
|
||||
OpenKruiseGame integrates different network plugins of different cloud service providers. You can use GameServerSet to set network parameters for game servers. Moreover, you can view network status information in the generated GameServer. This significantly reduces the complexity of the access network of game servers.
|
||||
|
||||
## Example
|
||||
## Network plugins
|
||||
|
||||
OpenKruiseGame supports the following network plugins:
|
||||
- Kubernetes-HostPort
|
||||
- AlibabaCloud-NATGW
|
||||
- AlibabaCloud-SLB
|
||||
- AlibabaCloud-SLB-SharedPort
|
||||
|
||||
---
|
||||
### Kubernetes-HostPort
|
||||
#### Plugin name
|
||||
|
||||
`Kubernetes-HostPort`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
Kubernetes
|
||||
|
||||
#### Plugin description
|
||||
- HostPort enables game servers to be accessed from the Internet by forwarding Internet traffic to the game servers by using the external IP address and ports exposed by the host where the game servers are located. The exposed IP address of the host must be a public IP address so that the host can be accessed from the Internet.
|
||||
|
||||
- In the configuration file, you can specify a custom range of available host ports. The default port range is 8000 to 9000. This network plugin can help you allocate and manage host ports to prevent port conflicts.
|
||||
|
||||
- This network plugin does not support network isolation.
|
||||
|
||||
#### Network parameters
|
||||
|
||||
ContainerPorts
|
||||
|
||||
- Meaning: the name of the container that provides services, the ports to be exposed, and the protocols.
|
||||
- Value: in the format of containerName:port1/protocol1,port2/protocol2,... The protocol names must be in uppercase letters. Example: `game-server:25565/TCP`.
|
||||
- Configuration change supported or not: no. The value of this parameter is effective until the pod lifecycle ends.
|
||||
|
||||
#### Plugin configuration
|
||||
|
||||
```
|
||||
[kubernetes]
|
||||
enable = true
|
||||
[kubernetes.hostPort]
|
||||
# Specify the range of available ports of the host. Ports in this range can be used to forward Internet traffic to pods.
|
||||
max_port = 9000
|
||||
min_port = 8000
|
||||
```
|
||||
|
||||
#### Example
|
||||
|
||||
OpenKruiseGame allows game servers to use the HostPort network in native Kubernetes clusters. The host where game servers are located exposes its external IP address and ports by using which Internet traffic is forwarded to the internal ports of the game servers. The following example shows the details:
|
||||
|
||||
|
@ -66,109 +108,167 @@ Use the networkStatus field in the generated GameServer to view the network stat
|
|||
|
||||
Clients can access the game server by using 48.98.98.8:8211.
|
||||
|
||||
### AlibabaCloud-NATGW
|
||||
|
||||
OpenKruiseGame supports the NAT gateway model of Alibaba Cloud. A NAT gateway exposes its external IP addresses and ports by using which Internet traffic is forwarded to pods. The following example shows the details:
|
||||
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-natgw
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: AlibabaCloud-NATGW
|
||||
networkConf:
|
||||
- name: Ports
|
||||
# The ports to be exposed. The value is in the following format: {port1},{port2}...
|
||||
value: "80"
|
||||
- name: Protocol
|
||||
# The protocol. The value is TCP by default.
|
||||
value: "TCP"
|
||||
# - name: Fixed
|
||||
# Specify whether the mapping relationship is fixed. By default, the mapping relationship is not fixed, that is, a new external IP address and port are generated after the pod is deleted.
|
||||
# value: true
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
Use the networkStatus field in the generated GameServer to view the network status information of the game server.
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T11:21:34Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.97.227.137
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "512"
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.189
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "80"
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T11:21:34Z"
|
||||
networkType: AlibabaCloud-NATGW
|
||||
```
|
||||
|
||||
Clients can access the game server by using 47.97.227.137:512.
|
||||
|
||||
## Network plugins
|
||||
|
||||
OpenKruiseGame supports the following network plugins:
|
||||
- Kubernetes-HostPort
|
||||
- AlibabaCloud-NATGW
|
||||
- AlibabaCloud-SLB
|
||||
- AlibabaCloud-SLB-SharedPort
|
||||
|
||||
---
|
||||
### Kubernetes-HostPort
|
||||
### Kubernetes-Ingress
|
||||
#### Plugin name
|
||||
|
||||
`Kubernetes-HostPort`
|
||||
`Kubernetes-Ingress`
|
||||
|
||||
#### Cloud Provider
|
||||
|
||||
Kubernetes
|
||||
|
||||
#### Plugin description
|
||||
- HostPort enables game servers to be accessed from the Internet by forwarding Internet traffic to the game servers by using the external IP address and ports exposed by the host where the game servers are located. The exposed IP address of the host must be a public IP address so that the host can be accessed from the Internet.
|
||||
|
||||
- In the configuration file, you can specify a custom range of available host ports. The default port range is 8000 to 9000. This network plugin can help you allocate and manage host ports to prevent port conflicts.
|
||||
- OKG provides the Ingress network model for games such as H5 games that require the application layer network model. This plugin will automatically set the corresponding path for each game server, which is related to the game server ID and is unique for each game server.
|
||||
|
||||
- This network plugin does not support network isolation.
|
||||
|
||||
#### Network parameters
|
||||
|
||||
ContainerPorts
|
||||
Path
|
||||
|
||||
- Meaning: Access path. Each game server has its own access path based on its ID.
|
||||
- Value format: Add <id\> to any position in the original path (consistent with the Path field in HTTPIngressPath), and the plugin will generate the path corresponding to the game server ID. For example, when the path is set to /game<id\>, the path for game server 0 is /game0, the path for game server 1 is /game1, and so on.
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
|
||||
Port
|
||||
|
||||
- Meaning: Port value exposed by the game server.
|
||||
- Value format: port number
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
IngressClassName
|
||||
|
||||
- Meaning: Specify the name of the IngressClass. Same as the IngressClassName field in IngressSpec.
|
||||
- Value format: Same as the IngressClassName field in IngressSpec.
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
Host
|
||||
|
||||
- Meaning: Domain name. Each game server has its own domain based on its ID.
|
||||
- Value format: Add <id\> to any position in the domain, and the plugin will generate the domain corresponding to the game server ID. For example, when setting the domain to test.game<id\>.cn-hangzhou.ali.com, the domain for game server 0 is test.game0.cn-hangzhou.ali.com, the domain for game server 1 is test.game1.cn-hangzhou.ali.com, and so on.
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
TlsHosts
|
||||
|
||||
- Meaning: List of hosts containing TLS certificates. Similar to the Hosts field in IngressTLS.
|
||||
- Value format: host1,host2,... For example, xxx.xx1.com,xxx.xx2.com
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
TlsSecretName
|
||||
|
||||
- Meaning: Same as the SecretName field in IngressTLS.
|
||||
- Value format: Same as the SecretName field in IngressTLS.
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
Annotation
|
||||
|
||||
- Meaning: as an annotation of the Ingress object
|
||||
- Value format: key: value (note the space after the colon), for example: nginx.ingress.kubernetes.io/rewrite-target: /$2
|
||||
- Configuration change supported or not: yes.
|
||||
|
||||
_additional explanation_
|
||||
|
||||
- If you want to set multiple annotations, you can add multiple entries named Annotation to networkConf.
- Multiple paths are supported. Paths, path types, and ports correspond one-to-one in the order in which they are filled in. When there are more paths than path types (or ports), the extra paths use the path type (or port) that was filled in first.
|
||||
|
||||
- Meaning: the name of the container that provides services, the ports to be exposed, and the protocols.
|
||||
- Value: in the format of containerName:port1/protocol1,port2/protocol2,... The protocol names must be in uppercase letters. Example: `game-server:25565/TCP`.
|
||||
- Configuration change supported or not: no. The value of this parameter is effective until the pod lifecycle ends.
|
||||
|
||||
#### Plugin configuration
|
||||
|
||||
None
|
||||
|
||||
#### Example
|
||||
|
||||
Set GameServerSet.Spec.Network:
|
||||
|
||||
```yaml
network:
  networkConf:
    - name: IngressClassName
      value: nginx
    - name: Port
      value: "80"
    - name: Path
      value: /game<id>(/|$)(.*)
    - name: Path
      value: /test-<id>
    - name: Host
      value: test.xxx.cn-hangzhou.ali.com
    - name: PathType
      value: ImplementationSpecific
    - name: TlsHosts
      value: xxx.xx1.com,xxx.xx2.com
    - name: Annotation
      value: 'nginx.ingress.kubernetes.io/rewrite-target: /$2'
    - name: Annotation
      value: 'nginx.ingress.kubernetes.io/random: xxx'
  networkType: Kubernetes-Ingress
```
|
||||
[kubernetes]
|
||||
enable = true
|
||||
[kubernetes.hostPort]
|
||||
# Specify the range of available ports of the host. Ports in this range can be used to forward Internet traffic to pods.
|
||||
max_port = 9000
|
||||
min_port = 8000
|
||||
This will generate a service and an ingress object for each replica of GameServerSet. The configuration for the ingress of the 0th game server is shown below:
|
||||
|
||||
```yaml
|
||||
spec:
|
||||
ingressClassName: nginx
|
||||
rules:
|
||||
- host: test.xxx.cn-hangzhou.ali.com
|
||||
http:
|
||||
paths:
|
||||
- backend:
|
||||
service:
|
||||
name: ing-nginx-0
|
||||
port:
|
||||
number: 80
|
||||
path: /game0(/|$)(.*)
|
||||
pathType: ImplementationSpecific
|
||||
- backend:
|
||||
service:
|
||||
name: ing-nginx-0
|
||||
port:
|
||||
number: 80
|
||||
path: /test-0
|
||||
pathType: ImplementationSpecific
|
||||
tls:
|
||||
- hosts:
|
||||
- xxx.xx1.com
|
||||
- xxx.xx2.com
|
||||
status:
|
||||
loadBalancer:
|
||||
ingress:
|
||||
- ip: 47.xx.xxx.xxx
|
||||
```
|
||||
|
||||
The other GameServers only have different path fields and service names, while the other generated parameters are the same.
|
||||
|
||||
The network status of GameServer is as follows:
|
||||
|
||||
```yaml
|
||||
networkStatus:
|
||||
createTime: "2023-04-28T14:00:30Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.xx.xxx.xxx
|
||||
ports:
|
||||
- name: /game0(/|$)(.*)
|
||||
port: 80
|
||||
protocol: TCP
|
||||
- name: /test-0
|
||||
port: 80
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 10.xxx.x.xxx
|
||||
ports:
|
||||
- name: /game0(/|$)(.*)
|
||||
port: 80
|
||||
protocol: TCP
|
||||
- name: /test-0
|
||||
port: 80
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2023-04-28T14:00:30Z"
|
||||
networkType: Kubernetes-Ingress
|
||||
```
|
||||
|
||||
---
|
||||
|
@ -211,6 +311,67 @@ Fixed
|
|||
|
||||
None
|
||||
|
||||
#### Example
|
||||
|
||||
OpenKruiseGame supports the NAT gateway model of Alibaba Cloud. A NAT gateway exposes its external IP addresses and ports by using which Internet traffic is forwarded to pods. The following example shows the details:
|
||||
|
||||
```shell
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: game.kruise.io/v1alpha1
|
||||
kind: GameServerSet
|
||||
metadata:
|
||||
name: gs-natgw
|
||||
namespace: default
|
||||
spec:
|
||||
replicas: 1
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
podUpdatePolicy: InPlaceIfPossible
|
||||
network:
|
||||
networkType: AlibabaCloud-NATGW
|
||||
networkConf:
|
||||
- name: Ports
|
||||
# The ports to be exposed. The value is in the following format: {port1},{port2}...
|
||||
value: "80"
|
||||
- name: Protocol
|
||||
# The protocol. The value is TCP by default.
|
||||
value: "tcp"
|
||||
# - name: Fixed
|
||||
# Specify whether the mapping relationship is fixed. By default, the mapping relationship is not fixed, that is, a new external IP address and port are generated after the pod is deleted.
|
||||
# value: true
|
||||
gameServerTemplate:
|
||||
spec:
|
||||
containers:
|
||||
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
|
||||
name: gameserver
|
||||
EOF
|
||||
```
|
||||
|
||||
Use the networkStatus field in the generated GameServer to view the network status information of the game server.
|
||||
|
||||
```shell
|
||||
networkStatus:
|
||||
createTime: "2022-11-23T11:21:34Z"
|
||||
currentNetworkState: Ready
|
||||
desiredNetworkState: Ready
|
||||
externalAddresses:
|
||||
- ip: 47.97.227.137
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "512"
|
||||
protocol: TCP
|
||||
internalAddresses:
|
||||
- ip: 172.16.0.189
|
||||
ports:
|
||||
- name: "80"
|
||||
port: "80"
|
||||
protocol: TCP
|
||||
lastTransitionTime: "2022-11-23T11:21:34Z"
|
||||
networkType: AlibabaCloud-NATGW
|
||||
```
|
||||
|
||||
Clients can access the game server by using 47.97.227.137:512.
|
||||
|
||||
---
|
||||
### AlibabaCloud-SLB
|
||||
#### Plugin name
|
||||
|
|
|
@ -15,15 +15,15 @@ A group of game servers are updated in two batches to simulate a canary update.
|
|||
The GameServerSet consists of three game server replicas.
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 0
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 0
|
||||
minecraft-2 Ready None 0 0
|
||||
```
|
||||
|
||||
Set the updatePriority parameter to a greater value for the game server gs-demo-1.
|
||||
Set the updatePriority parameter to a greater value for the game server minecraft-1.
|
||||
```shell
|
||||
kubectl edit gs gs-demo-1
|
||||
kubectl edit gs minecraft-1
|
||||
|
||||
...
|
||||
spec:
|
||||
|
@ -38,7 +38,7 @@ Set the partition parameter and the latest image used to trigger an update opera
|
|||
kubectl edit gss gs-demo
|
||||
|
||||
...
|
||||
image: gameserver:latest # Set the latest image.
|
||||
image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2-new # Set the latest image.
|
||||
name: gameserver
|
||||
...
|
||||
updateStrategy:
|
||||
|
@ -50,28 +50,28 @@ kubectl edit gss gs-demo
|
|||
|
||||
```
|
||||
|
||||
In this case, only the game server gs-demo-1 is updated.
|
||||
In this case, only the game server minecraft-1 is updated.
|
||||
```shell
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Updating None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Updating None 0 10
|
||||
minecraft-2 Ready None 0 0
|
||||
|
||||
|
||||
# Wait for a period of time.
|
||||
...
|
||||
|
||||
kubectl get gs
|
||||
NAME STATE OPSSTATE DP UP
|
||||
gs-demo-0 Ready None 0 0
|
||||
gs-demo-1 Ready None 0 10
|
||||
gs-demo-2 Ready None 0 0
|
||||
NAME STATE OPSSTATE DP UP
|
||||
minecraft-0 Ready None 0 0
|
||||
minecraft-1 Ready None 0 10
|
||||
minecraft-2 Ready None 0 0
|
||||
```
|
||||
|
||||
After you verify that the game server gs-demo-1 is updated, update the remaining game servers.
|
||||
After you verify that the game server minecraft-1 is updated, update the remaining game servers.
|
||||
```shell
|
||||
kubectl edit gss gs-demo
|
||||
kubectl edit gss minecraft
|
||||
...
|
||||
updateStrategy:
|
||||
rollingUpdate:
|
||||
|
|
|
@ -33,6 +33,9 @@ module.exports = {
|
|||
'user-manuals/container-startup-sequence-control',
|
||||
'user-manuals/service-qualities',
|
||||
'user-manuals/network',
|
||||
'user-manuals/autoscale',
|
||||
'user-manuals/gameserver-monitor',
|
||||
'user-manuals/crd-field-description',
|
||||
],
|
||||
},
|
||||
{
|
||||
|
|