add bluegreen doc (#224)

Signed-off-by: yunbo <yunbo10124scut@gmail.com>
Co-authored-by: yunbo <yunbo10124scut@gmail.com>
myname4423 2025-01-14 20:31:44 +08:00 committed by GitHub
parent f12a1ed237
commit 951af52138
GPG Key ID: B5690EEEBB952194
9 changed files with 430 additions and 44 deletions


@ -3,7 +3,7 @@
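## What is Kruise Rollouts
Kruise Rollouts is a **Bypass** component that provides **advanced progressive delivery features**.
It helps you deploy changes to your applications more smoothly and in a more controlled way, supporting canary, multi-batch, and A/B testing delivery modes. It is also compatible with Gateway API and various Ingress implementations, which makes it easier to integrate into your existing infrastructure. Overall, Kruise Rollouts is a valuable tool for Kubernetes users who want to optimize their deployment process.
It helps you deploy changes to your applications more smoothly and in a more controlled way, supporting canary, blue-green, multi-batch, and A/B testing delivery modes. It is also compatible with Gateway API and various Ingress implementations, which makes it easier to integrate into your existing infrastructure. Overall, Kruise Rollouts is a valuable tool for Kubernetes users who want to optimize their deployment process.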
<center><img src={require('/static/img/rollouts/intro.png').default} width="90%" /></center>
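@ -12,6 +12,7 @@ Kruise Rollouts is a **Bypass** component that provides **advanced progressive
- **Rich release strategies**
- Multi-batch update strategy for Deployment, CloneSet, StatefulSet, Advanced StatefulSet, and Advanced DaemonSet.
- Canary update strategy for Deployment.
- Blue-Green update strategy for Deployment and CloneSet.
- **Rich traffic routing management strategies**
- Fine-grained, weighted traffic shifting when updating workloads.
@ -39,7 +40,7 @@ Kruise Rollouts vs. [Argo Rollout](https://argoproj.github.io/rollouts/) and [Fl
| Core Concept | Enhance your existing workloads | Replace your workloads | Manage your workloads |
| Architecture | Bypass | A new workload type | Bypass |
| Plug and Play, Hot-Swapping | Yes | No | No |
| Release Type | Multi-Batch, Canary, A/B Testing, End-to-End Canary | Multi-Batch, Canary, Blue-Green, A/B Testing | Canary, Blue-Green, A/B Testing |
| Release Type | Multi-Batch, Canary, A/B Testing, End-to-End Canary, Blue-Green | Multi-Batch, Canary, Blue-Green, A/B Testing | Canary, Blue-Green, A/B Testing |
| Workload Type | Deployment, StatefulSet, CloneSet, Advanced StatefulSet, Advanced DaemonSet | Argo-Rollout | Deployment, DaemonSet |
| Traffic Type | Ingress, GatewayAPI, CRD (requires Lua script) | Ingress, GatewayAPI, APISIX, Traefik, SMI, and more | Ingress, GatewayAPI, APISIX, Traefik, SMI, and more |
| Migration Costs | No need to migrate workloads or Pods | Must migrate workloads and Pods | Must migrate Pods |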


@ -291,6 +291,10 @@ spec:
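### Strategy API (Mandatory)
`canary` is used for canary and multi-batch releases, while `blueGreen` is used for blue-green releases. The two options are mutually exclusive: they cannot both be empty, and they cannot both be set. The `blueGreen` option was introduced after Kruise-Rollout v0.5.0 and is not supported in the v1alpha1 API.
#### canary
Describe your release strategy: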
<Tabs>
@ -389,17 +393,55 @@ spec:
- `enableExtraWorkloadForCanary` is available in the v1beta1 Rollout object. In the v1alpha1 Rollout object, a similar feature can be enabled with the special Rollout annotation `rollouts.kruise.io/rolling-type`: setting it to "canary" (the default) is equivalent to `enableExtraWorkloadForCanary=true`, and setting it to "partition" is equivalent to `enableExtraWorkloadForCanary=false`.
- `patchPodTemplateMetadata` takes effect only when `enableExtraWorkloadForCanary = true`.
#### blueGreen
Describe your release strategy:
```yaml
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  namespace: your-workload-ns
spec:
  strategy:
    blueGreen:
      steps:
      # the first step
      - replicas: 100%
        traffic: 0%
        pause:
          duration: {}
      # the second step
      - replicas: 100%
        matches:
        - headers:
          - type: Exact # or "RegularExpression"
            name: matched-header-name
            value: matched-header-value # value or reg-expression
      # the third step
      - replicas: 100%
        traffic: 100%
```
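Note:
- Except for the absence of the `patchPodTemplateMetadata` and `enableExtraWorkloadForCanary` fields, `blueGreen` has exactly the same configuration options as `canary`, and the same notes apply.
- For the differences between blue-green release and the other release strategies, see "Release Strategies" - "Blue-Green Release".
### Special Annotations of Workload (Optional)
The bound workload supports some special annotations that enable specific features.
| Annotation | Value | Default | Description |
|---------------------------------|-------|-----|-------------------------------------------------------|
| `rollouts.kruise.io/rollout-id` | any string | "" | Similar to a release sequence number. It lets users tell whether the Kruise Rollout controller has observed the current change to the workload. |
| Annotation | Value | Default | Description |
| ------------------------------- | ---------- | ------ | ------------------------------------------------------------ |
| `rollouts.kruise.io/rollout-id` | any string | "" | Similar to a release sequence number. It lets users tell whether the Kruise Rollout controller has observed the current change to the workload. |
### Rollout Status You Should Know
#### Canary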
```yaml
kind: Rollout
status:
@ -417,16 +459,39 @@ status:
stableRevision: b76b6f48f
```
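| Field | Type | Mode | Description |
|------------------------------------|-----|-------|-----------------------------------------------------------------------|
| `phase` | string | read-only | "Initial" means no workload is bound; "Healthy" means the bound workload has been promoted; "Progressing" means the rollout is in progress. |
| `observedGeneration` | integer | read-only | The observed generation of the rollout spec. |
| `canaryStatus` | *object | read-only | Information about the rollout progress. |
| `canaryStatus.canaryReplicas` | integer | read-only | Number of updated replicas of the workload. |
| `canaryStatus.canaryReadyReplicas` | integer | read-only | Number of updated, ready replicas of the workload. |
| `canaryStatus.podTemplateHash` | string | read-only | Hash of the workload's updated (new) revision. |
| `canaryStatus.canaryRevision` | string | read-only | Revision hash of the workload's updated (new) revision, calculated by the Kruise Rollout controller. |
| `canaryStatus.stableRevision` | string | read-only | Revision hash of the workload's stable (old) revision, recorded before progressing. |
| `canaryStatus.observedRolloutID` | string | read-only | Corresponds to the workload's `rollouts.kruise.io/rollout-id` annotation. If they are equal, the rollout controller has observed the workload change. |
| `canaryStatus.currentStepIndex` | integer | read-only | Current step index of the rollout. Starts from 1. |
| `canaryStatus.currentStepState` | string | read and write | Current step state of the rollout. Both "StepReady" and "Complete" mean the current step is ready. |
| Field | Type | Mode | Description |
| ---------------------------------- | ------ | ---- | ------------------------------------------------------------ |
| `phase` | string | read-only | "Initial" means no workload is bound; "Healthy" means the bound workload has been promoted; "Progressing" means the release is in progress. |
| `observedGeneration` | integer | read-only | The observed generation of the Rollout. |
| `canaryStatus` | *object | read-only | Information about the Rollout progress. |
| `canaryStatus.canaryReplicas` | integer | read-only | Number of updated replicas of the workload. |
| `canaryStatus.canaryReadyReplicas` | integer | read-only | Number of updated, ready replicas of the workload. |
| `canaryStatus.podTemplateHash` | string | read-only | Hash of the workload's updated (new) revision. |
| `canaryStatus.canaryRevision` | string | read-only | Revision hash of the workload's updated (new) revision, calculated by the Kruise Rollout controller. |
| `canaryStatus.stableRevision` | string | read-only | Revision hash of the workload's stable (old) revision, recorded before progressing. |
| `canaryStatus.observedRolloutID` | string | read-only | Corresponds to the workload's `rollouts.kruise.io/rollout-id` annotation. If they are equal, the Rollout controller has observed the workload change. |
| `canaryStatus.currentStepIndex` | integer | read-only | Current step index of the Rollout. Starts from 1. |
| `canaryStatus.nextStepIndex` | integer | writable | The next step to execute. If the current batch is the last one or the release has ended, the value is -1; otherwise it usually equals `canaryStatus.currentStepIndex` + 1. Users may change it to a reasonable positive number to execute batches out of order. |
| `canaryStatus.currentStepState` | string | writable | Current step state of the Rollout. Both "StepReady" and "Complete" mean the current step is ready. |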
#### Blue-Green
```yaml
kind: Rollout
status:
  blueGreenStatus:
    currentStepIndex: 1
    currentStepState: StepPaused
    lastUpdateTime: "2025-01-03T09:20:29Z"
    message: BatchRelease is at state Ready, rollout-id , step 1
    nextStepIndex: 2
    observedWorkloadGeneration: 4
    podTemplateHash: 64c6f99459
    rolloutHash: 7w8dxcdc49wv4w49c469b27c6c7xb4f4c4dvf8dwd4b6zb5z4zcc852c7w9vf5dv
    stableRevision: 65f957664d
    updatedReadyReplicas: 10
    updatedReplicas: 10
    updatedRevision: 64448b955c
```
Like `canaryStatus`, `blueGreenStatus` has **exactly the same** status fields, and their meanings are the same.


@ -0,0 +1,116 @@
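# Blue-Green Release
## Blue-Green Release Process
<center><img src={require('/static/img/rollouts/bluegreen.png').default} width="90%" /></center>
## Recommended Configuration
**Note: the blue-green strategy applies only to Deployment and CloneSet, supports only the v1beta1 API, and requires a Rollout version higher than v0.5.0 (excluding v0.5.0).**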
```YAML
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload-demo
  strategy:
    blueGreen:
      steps:
      - replicas: 100% # step 1
        traffic: 0%
      - replicas: 100% # step 2
        traffic: 10%
      - replicas: 100% # step 3
        traffic: 100%
      trafficRoutings:
      - service: service-demo
        ingress:
          classType: nginx
          name: ingress-demo
```
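## Behavior Explanation
When you apply a new revision to workload-demo:
- In the first batch, 100% of the new-version Pods are scaled up and the stable-version Pods are not scaled down, so 200% of the Pods exist at this point. No requests are routed to the new-version Pods. Manual confirmation is required to move to the next batch.
- In the second batch, 10% of the traffic is routed to the new version. Manual confirmation is required to move to the next batch.
- In the third batch, 100% of the traffic is routed to the new version. Manual confirmation is required to complete the release.
When you consider the new version verified and confirm the next step:
- The stable-version Pods are scaled down and the release ends.
## Differences from Canary/Multi-Batch Release
From the **API** perspective, `strategy.blueGreen` and `strategy.canary` differ only in the following ways:
- blueGreen has no `enableExtraWorkloadForCanary` field. canary uses this field to decide whether to create an extra Deployment for a Deployment, which corresponds to the canary release and the multi-batch release of a Deployment respectively.
- blueGreen has no `patchPodTemplateMetadata` field. Currently only canary release supports this field; multi-batch and blue-green releases do not.
Apart from these, the two strategies do not differ further in the API. You may notice that blue-green steps are used the same way as canary steps.
From the **release process** perspective, blue-green release differs from canary/multi-batch release as follows:
- Blue-green release does not create an extra workload, which is the same as multi-batch release, whereas canary release does create one.
- During a blue-green release, old-version instances are not scaled down, which is the same as canary release, whereas multi-batch release scales down old-version Pods while scaling up new-version Pods.
The difference between blue-green and multi-batch release should be easy to see. You may still wonder, however, whether blue-green and canary release differ at all beyond whether an extra workload is created underneath. The following diagram may help:
<center><img src={require('/static/img/rollouts/canary_vs_bluegreen.png').default} width="90%" /></center>
Note that after the last batch completes, canary release performs a rolling update on the original workload, whereas blue-green release only needs to scale down the old-version Pods.
This reflects a semantic difference between the two: in canary release, the created workload is essentially a "canary" used to verify the new version, and it is deleted once verification passes. In blue-green release, the old and new versions "coexist", which allows traffic to be switched between the two versions.
In practice, canary release is best configured with a small number of batches (for example, 1 batch), each with a relatively small `replicas` value. Setting `replicas` to 100% is allowed but usually unnecessary. For blue-green release, setting `replicas` to 100% is generally recommended.
From the **underlying implementation** perspective, blue-green release differs from canary/multi-batch release as follows:
- Canary: only Deployment supports canary release. During the release, Kruise-Rollout creates a Deployment named `[DeploymentName]-canary`.
- Multi-batch: CloneSet implements multi-batch release with its built-in `Partition` field, while Deployment relies on a customized Deployment controller.
- Blue-green: for both CloneSet and Deployment, it is implemented by setting the `MinReadySeconds`, `MaxSurge`, and `MaxUnavailable` fields. If you notice these fields changing during a blue-green release, there is no need to worry: this is normal behavior, and the fields are restored after the release completes.
So which release strategy should you choose? It depends on your business scenario:
- Blue-green release consumes double the resources, so you would probably use it only in scenarios with high stability requirements. Because old-version and new-version instances coexist, you can quickly switch traffic to either the new version or the stable version.
- In other scenarios, you may choose canary or multi-batch release.
## Rollback
### Global Rollback
As with canary/multi-batch release, you can roll back the application by directly rolling back the workload spec. See "Basic Usage Guide" - "How to Roll Back".
### Traffic Rollback
Rollout introduces a new feature that supports jumping between batches. For example, when the release has reached batch 3, the following command jumps to batch 2: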
```shell
kubectl patch rollout rollouts-demo --type merge --subresource status -p "{\"status\":{\"blueGreenStatus\":{\"nextStepIndex\": 2}}}"
```
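With this feature you can "roll back" the traffic of a blue-green release. In the recommended configuration example, jumping from batch 3 to batch 2 changes the traffic configuration from "100% of requests routed to the new version" to "10% of requests routed to the new version". Jumping further to batch 1 routes all requests to the stable version.
Note, however, that if the `replicas` of the target step is smaller than the `replicas` of the current step, no Pods are scaled down.
In fact, directly modifying `spec.strategy.blueGreen.steps[x].traffic` of the Rollout resource achieves a similar effect, for example changing `spec.strategy.blueGreen.steps[lastBatch].traffic` from 100% to 10%, and then from 10% to 0%. However, modifying `spec` directly may affect the next release, whereas modifying `status` avoids this problem.
Note: both batch jumping and modifying the traffic configuration of a batch are new features introduced together with blue-green release, and they require specific Rollout versions.
## Considerations
### HPA
During a blue-green release, if the workload is bound to an HPA (HorizontalPodAutoscaler), Kruise-Rollout disables that HPA. You will find that the HPA's `scaleTargetRef.name` has the suffix "-DisableByRollout" appended, so the workload is reported as Not Found. After the blue-green release ends, the suffix is removed and the HPA takes effect again.
### PDB
During a blue-green release, if the workload is bound to a PDB (PodDisruptionBudget), the PDB does not consider Pod versions when calculating "Allowed disruptions". As a result, configuring `maxUnavailable` leads to "smaller step sizes and over-protection", while configuring `minAvailable` leads to "larger step sizes and under-protection". Therefore, unless necessary, prefer `minAvailable` (better "over-protection" than "under-protection").
### Successive Release
Blue-green release does not currently support successive releases. If you want to release v3 while releasing v2 (v1->v2), we recommend manually rolling back to v1 first and then releasing v3.
If you have accidentally released v3, the controller will not work: the v3 Pods are not created, and the v2 and v1 Pods are not scaled down. If you check the Rollout resource at this point, you will see a message similar to the following: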
```shell
NAME STATUS CANARY_STEP CANARY_STATE MESSAGE AGE
rollouts-demo Progressing 1 StepPaused new version releasing detected in the progress of blue-green release, please rollback first 6d23h
```
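At this point we recommend performing a rollback, and you can roll back to v1. In particular, if you use the `kubectl rollout undo` command, make sure to specify the correct `--to-revision` to return to v1. If you do not specify `--to-revision`, you may go back to v2 (that is, the state before v3 was released):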
```shell
NAME STATUS CANARY_STEP CANARY_STATE MESSAGE AGE
rollouts-demo Progressing 1 StepPaused Rollout is in step(1/4), and you need manually confirm to enter the next step 7d
```
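## Known Issues
- Due to unresolved issues in the CloneSet controller code, when performing a blue-green release on a CloneSet, the `replicas` of every batch should be set to 100%; otherwise the CloneSet may fail to scale up. Deployment is not subject to this restriction.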


@ -2,7 +2,7 @@
## What is Kruise Rollouts?
Kruise Rollouts is a **Bypass** component that offers **Advanced Progressive Delivery Features**.
Its support for canary, multi-batch, and A/B testing delivery modes can be helpful in achieving smooth and controlled rollouts of changes to your application, while its compatibility with Gateway API and various Ingress implementations makes it easier to integrate with your existing infrastructure. Overall, Kruise Rollouts is a valuable tool for Kubernetes users looking to optimize their deployment processes!
Its support for canary, blue-green, multi-batch, and A/B testing delivery modes can be helpful in achieving smooth and controlled rollouts of changes to your application, while its compatibility with Gateway API and various Ingress implementations makes it easier to integrate with your existing infrastructure. Overall, Kruise Rollouts is a valuable tool for Kubernetes users looking to optimize their deployment processes!
![kruise-rollout-intro](../static/img/rollouts/intro.png)
@ -10,6 +10,7 @@ Its support for canary, multi-batch, and A/B testing delivery modes can be helpf
- **Rich release strategies**
- Multi-batch update strategy for Deployment, CloneSet, StatefulSet, Advanced StatefulSet, Advanced DaemonSet.
- Canary update strategy for Deployment.
- Blue-Green update strategy for Deployment, CloneSet.
- **Rich traffic routing management strategies**
- Fine-grained, weighted traffic shifting when updating workloads.
@ -30,16 +31,16 @@ There is a demo of multi-batch update strategy for Deployment.
Kruise Rollouts vs. [Argo Rollout](https://argoproj.github.io/rollouts/) and [Flux Flagger](https://fluxcd.io/flagger/).
| Component | **Kruise Rollouts** | Argo Rollouts | Flux Flagger |
|-----------------------------|-------------------------------------------------------------------------|----------------------------------------------------|----------------------------------------------------|
| Core Concept | Enhance your existing workloads | Replace your workloads | manage your workloads |
| Architecture | Bypass | A new workload type | Bypass |
| Plug and Play, Hot-Swapping | Yes | No | No |
| Release Type | Multi-Batch, Canary, A/B Testing, End-to-End Canary | Multi-Batch, Canary, Blue-Green, A/B Testing | Canary, Blue-Green, A/B Testing |
| Workload Type | Deployment,StatefulSet,CloneSet,Advanced StatefulSet,Advanced DaemonSet | Agro-Rollout | Deployment. DaemonSet |
| Traffic Type | Ingress, GatewayAPI, CRD (Need Lua Script) | Ingress, GatewayAPI, APISIX, Traefik, SMI and more | Ingress, GatewayAPI, APISIX, Traefik, SMI and more |
| Migration Costs | No need migrate your workloads and pods | Must migrate your workloads and pods | Must migrate your pods |
| HPA compatible | Yes | Yes | No |
| Component | **Kruise Rollouts** | Argo Rollouts | Flux Flagger |
| --------------------------- | ------------------------------------------------------------ | -------------------------------------------------- | -------------------------------------------------- |
| Core Concept | Enhance your existing workloads | Replace your workloads | Manage your workloads |
| Architecture | Bypass | A new workload type | Bypass |
| Plug and Play, Hot-Swapping | Yes | No | No |
| Release Type | Multi-Batch, Canary, A/B Testing, End-to-End Canary, Blue-Green | Multi-Batch, Canary, Blue-Green, A/B Testing | Canary, Blue-Green, A/B Testing |
| Workload Type | Deployment, StatefulSet, CloneSet, Advanced StatefulSet, Advanced DaemonSet | Argo-Rollout | Deployment, DaemonSet |
| Traffic Type | Ingress, GatewayAPI, CRD (Need Lua Script) | Ingress, GatewayAPI, APISIX, Traefik, SMI and more | Ingress, GatewayAPI, APISIX, Traefik, SMI and more |
| Migration Costs | No need to migrate your workloads and pods | Must migrate your workloads and pods | Must migrate your pods |
| HPA compatible | Yes | Yes | No |
## What's Next
Here are some recommended next steps:


@ -312,6 +312,11 @@ spec:
</Tabs>
### Strategy API (Mandatory)
`canary` is used for canary strategy and multi-batch strategy, while `blueGreen` is used for blue-green strategy. These two are mutually exclusive; they cannot both be empty or both be non-empty. The `blueGreen` strategy was introduced in Kruise-Rollout versions higher than v0.5.0 and is not supported in the v1alpha1 API.
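The mutual-exclusion rule above can be sketched as a small shell check. This is only an illustration of the rule, not the controller's validation code; the function name `exactly_one_strategy` and its two arguments (non-empty when the corresponding block is present in the spec) are hypothetical.

```shell
# Hypothetical helper: each argument is non-empty when that strategy block is set.
exactly_one_strategy() {
  local canary="$1" blue_green="$2"
  if [ -n "$canary" ] && [ -n "$blue_green" ]; then
    echo "invalid: canary and blueGreen are both set"
    return 1
  fi
  if [ -z "$canary" ] && [ -z "$blue_green" ]; then
    echo "invalid: neither canary nor blueGreen is set"
    return 1
  fi
  echo "ok"
}

# Only blueGreen is set, so the spec passes the check.
exactly_one_strategy "" "set"
```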
#### Canary
Describe your rollout strategy:
<Tabs>
@ -410,14 +415,56 @@ Note:
- `patchPodTemplateMetadata` can be set only if `enableExtraWorkloadForCanary=true`.
- `enableExtraWorkloadForCanary` is available in the v1beta1 Rollout resource. In the v1alpha1 Rollout resource, use the Rollout annotation `rollouts.kruise.io/rolling-type` instead: "canary" (the default) is equivalent to `enableExtraWorkloadForCanary=true`, and "partition" is equivalent to `enableExtraWorkloadForCanary=false`.
#### blueGreen
Describe your rollout strategy:
```yaml
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  namespace: your-workload-ns
spec:
  strategy:
    blueGreen:
      steps:
      # the first step
      - replicas: 100%
        traffic: 0%
        pause:
          duration: {}
      # the second step
      - replicas: 100%
        matches:
        - headers:
          - type: Exact # or "RegularExpression"
            name: matched-header-name
            value: matched-header-value # value or reg-expression
      # the third step
      - replicas: 100%
        traffic: 100%
```
Note:
- Except for the absence of the `patchPodTemplateMetadata` and `enableExtraWorkloadForCanary` fields, the configuration options of `blueGreen` are identical to those of `canary`, and the same notes apply.
- For the differences between the blue-green strategy and other strategies, please refer to "Release Strategies" - "Blue-Green Release."
### Special Annotations of Workload (Optional)
There are some special annotations on the bound workload that enable specific abilities.
| Annotations | Value | Defaults | Explanation |
|---------------------------------|------------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Annotations | Value | Defaults | Explanation |
| ------------------------------- | ---------- | -------- | ------------------------------------------------------------ |
| `rollouts.kruise.io/rollout-id` | any string | "" | Similar to a release sequence number. It lets users tell whether the Kruise Rollout controller has observed the current change to the workload. |
### Rollout Status You Should Know
#### Canary
```yaml
kind: Rollout
status:
@ -434,16 +481,39 @@ status:
podTemplateHash: 76fd76f75b
stableRevision: b76b6f48f
```
| Fields | Type | Mode | Explanation |
|------------------------------------|---------|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------|
| `phase` | string | ready-only | "Initial" means no bounded workload; "Healthy" means bounded workload is promoted; "Progressing" means rollout is working. |
| `observedGeneration` | integer | ready-only | Observed rollout spec generation. |
| `canaryStatus` | *object | ready-only | Information about rollout progressing. |
| `canaryStatus.canaryReplicas` | integer | ready-only | workload updated replicas |
| `canaryStatus.canaryReadyReplicas` | integer | ready-only | workload updated ready replicas. |
| `canaryStatus.podTemplateHash` | string | ready-only | workload update(new) revision hash. |
| `canaryStatus.canaryRevision` | string | ready-only | workload update(new) revision hash calculated by Kruise Rollout controller. |
| `canaryStatus.stableRevision` | string | ready-only | workload stable(old) revision hash recorded before progressing. |
| `canaryStatus.observedRolloutID` | string | ready-only | corresponding to workload `rollouts.kruise.io/rollout-id` annotations. if they are equal, it means rollout controller watched workload changes. |
| `canaryStatus.currentStepIndex` | integer | ready-only | rollout current step index. Start from 1. |
| `canaryStatus.currentStepState` | string | ready&write | rollout current step state. Both "StepReady" and "Complete" mean current step is ready. |
| Fields | Type | Mode | Explanation |
| ---------------------------------- | ------- | ---------- | ------------------------------------------------------------ |
| `phase` | string | read-only | "Initial" means no workload is bound; "Healthy" means the bound workload is promoted; "Progressing" means the rollout is in progress. |
| `observedGeneration` | integer | read-only | Observed generation of the rollout spec. |
| `canaryStatus` | *object | read-only | Information about the rollout progress. |
| `canaryStatus.canaryReplicas` | integer | read-only | Number of updated replicas of the workload. |
| `canaryStatus.canaryReadyReplicas` | integer | read-only | Number of updated, ready replicas of the workload. |
| `canaryStatus.podTemplateHash` | string | read-only | Hash of the workload's updated (new) revision. |
| `canaryStatus.canaryRevision` | string | read-only | Hash of the workload's updated (new) revision, calculated by the Kruise Rollout controller. |
| `canaryStatus.stableRevision` | string | read-only | Hash of the workload's stable (old) revision, recorded before progressing. |
| `canaryStatus.observedRolloutID` | string | read-only | Corresponds to the workload's `rollouts.kruise.io/rollout-id` annotation. If they are equal, the rollout controller has observed the workload change. |
| `canaryStatus.currentStepIndex` | integer | read-only | Current step index of the rollout. Starts from 1. |
| `canaryStatus.nextStepIndex` | integer | read&write | Indicates the next step to execute. If the current batch is the last batch or the release has ended, its value is set to -1. In other cases, it is typically equal to `canaryStatus.currentStepIndex` + 1. Users can modify it to a reasonable positive number to enable non-sequential step execution. |
| `canaryStatus.currentStepState` | string | read&write | Current step state of the rollout. Both "StepReady" and "Complete" mean the current step is ready. |
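The default value of `nextStepIndex` described in the table can be illustrated with a short shell sketch. It only restates the rule from the table (-1 on the last batch or after the release has ended, otherwise current + 1); `default_next_step_index` and its arguments are hypothetical names, not part of Kruise Rollout.

```shell
# Hypothetical helper: default_next_step_index CURRENT TOTAL [FINISHED]
# CURRENT is 1-based, TOTAL is the number of steps, FINISHED is "true" once the release has ended.
default_next_step_index() {
  local current="$1" total="$2" finished="${3:-false}"
  if [ "$finished" = "true" ] || [ "$current" -ge "$total" ]; then
    echo -1
  else
    echo $((current + 1))
  fi
}

default_next_step_index 1 3   # step 1 of 3
default_next_step_index 3 3   # last step
```

A user-supplied value (set through the status patch) overrides this default, which is what enables non-sequential step execution.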
#### Blue-Green
```yaml
kind: Rollout
status:
  blueGreenStatus:
    currentStepIndex: 1
    currentStepState: StepPaused
    lastUpdateTime: "2025-01-03T09:20:29Z"
    message: BatchRelease is at state Ready, rollout-id , step 1
    nextStepIndex: 2
    observedWorkloadGeneration: 4
    podTemplateHash: 64c6f99459
    rolloutHash: 7w8dxcdc49wv4w49c469b27c6c7xb4f4c4dvf8dwd4b6zb5z4zcc852c7w9vf5dv
    stableRevision: 65f957664d
    updatedReadyReplicas: 10
    updatedReplicas: 10
    updatedRevision: 64448b955c
```
Just like `canaryStatus`, `blueGreenStatus` has **exactly the same** status fields, and their meanings are identical.


@ -0,0 +1,132 @@
# Blue-Green Release
## Blue-Green Release Process
<center><img src={require('/static/img/rollouts/bluegreen.png').default} width="90%" /></center>
## Recommended Configuration
**Note: The blue-green strategy is only applicable to Deployment and CloneSet, supports only the v1beta1 API, and requires a Rollout version higher than v0.5.0 (excluding v0.5.0).**
```YAML
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workload-demo
  strategy:
    blueGreen:
      steps:
      - replicas: 100% # step 1
        traffic: 0%
      - replicas: 100% # step 2
        traffic: 10%
      - replicas: 100% # step 3
        traffic: 100%
      trafficRoutings:
      - service: service-demo
        ingress:
          classType: nginx
          name: ingress-demo
```
## Behavior Explanation
When you apply a new revision to `workload-demo`:
- **First Batch:** 100% of the new version Pods are scaled up, and the stable version Pods are not scaled down. At this point, there are 200% of the Pods, but no traffic is routed to the new version Pods. Manual confirmation is required to proceed to the next batch.
- **Second Batch:** 10% of the traffic is routed to the new version. Manual confirmation is required to proceed to the next batch.
- **Third Batch:** 100% of the traffic is routed to the new version. Manual confirmation is required to complete the release.
Once you have validated the new version and confirm to proceed:
- The stable version Pods will be scaled down, and the release will be completed.
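The three batches above can be walked through with a small shell sketch. It assumes a hypothetical workload with 10 stable replicas and simply applies the percentages from the recommended configuration; `describe_step` is an illustrative helper, not a Kruise Rollout command.

```shell
# Assumption for illustration: the stable version runs 10 replicas.
stable=10

# describe_step INDEX REPLICAS_PERCENT TRAFFIC_PERCENT
describe_step() {
  local new=$(( stable * $2 / 100 ))   # new-version Pods scaled up in this step
  echo "step $1: stable=$stable new=$new total=$(( stable + new )) traffic_to_new=$3%"
}

describe_step 1 100 0     # Pods doubled, no traffic to the new version
describe_step 2 100 10    # 10% of traffic to the new version
describe_step 3 100 100   # all traffic to the new version
```

The stable Pods stay at full scale in every step; they are scaled down only after the final confirmation.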
## Differences from Other Strategies
From the **API** perspective, the differences between `strategy.blueGreen` and `strategy.canary` are minimal:
- `strategy.blueGreen` does not have the `enableExtraWorkloadForCanary` field. `strategy.canary` uses this field to decide whether to create an additional Deployment, which corresponds to the canary strategy and the multi-batch strategy of a Deployment respectively.
- `strategy.blueGreen` does not have the `patchPodTemplateMetadata` field. Only the canary strategy supports this field; the multi-batch and blue-green strategies do not.
Apart from these, there are no further differences in the API. You may notice that the steps in the blue-green strategy are used in the same way as those in the canary strategy.
From the perspective of the **release process**, the differences between the blue-green strategy and other strategies are:
- The blue-green strategy does not create an additional workload, similar to the multi-batch strategy, whereas the canary strategy creates an additional workload for a Deployment.
- In the blue-green release process, Pods of the old version are not scaled down, which is the same as in the canary strategy. In contrast, the multi-batch strategy scales up new-version Pods while simultaneously scaling down old-version Pods.
The difference between the blue-green and multi-batch strategies should be easy to see. However, you might think that, aside from whether an additional workload is created underneath, there is little difference between blue-green and canary. The following diagram may clarify this:
<center><img src={require('/static/img/rollouts/canary_vs_bluegreen.png').default} width="90%" /></center>
Notice that after the final batch is completed, the canary strategy performs a rolling update on the original workload, whereas the blue-green strategy simply scales down the old-version Pods. This reflects the semantic difference between the two strategies: in the canary strategy, the created workload essentially serves as a "canary" to validate the new version and is deleted after validation. In the blue-green strategy, the old and new versions "coexist," allowing traffic to be switched between the two versions.
**In practice:**
- For canary releases, it is recommended to configure a small number of batches (e.g., 1 batch), each with a relatively small number of replicas. Although configuring replicas as 100% is allowed, it is usually unnecessary.
- For blue-green releases, it is generally recommended to configure replicas as 100%.
From the perspective of the **underlying implementation**, the differences between the blue-green strategy and other strategies are:
- **Canary:** Only Deployment supports the canary strategy. During the release, Kruise-Rollout creates a Deployment named "[DeploymentName]-canary".
- **Multi-Batch:** CloneSet uses its built-in `partition` field to implement the multi-batch strategy, while Deployment relies on a customized controller.
- **Blue-Green:** Both CloneSet and Deployment achieve blue-green release by setting the `MinReadySeconds`, `MaxSurge`, and `MaxUnavailable` fields. Therefore, if you notice these fields being changed while using the blue-green strategy, there is no need to worry. This is normal behavior, and these fields will be restored after the release is done.
### Which Release Method Should I Choose?
It depends on your scenarios:
- Blue-Green Release consumes double the resources. You might opt for blue-green strategy in scenarios with high stability requirements because the old and new version instances coexist, allowing you to quickly switch traffic to either the new or stable version.
- In other scenarios, you might choose canary or multi-batch strategies.
## Rollback
### Global Rollback
As with the canary/multi-batch strategies, you can roll back the application by directly rolling back the workload specification. For details, refer to the "Basic Usage Guide" → "How to Rollback".
### Traffic Rollback
Rollout has introduced a new feature that supports jumping between steps. For example, when deploying to the third batch, you can execute the following command to jump back to the second batch:
```shell
kubectl patch rollout rollouts-demo --type merge --subresource status -p "{\"status\":{\"blueGreenStatus\":{\"nextStepIndex\": 2}}}"
```
Using this feature, you can "rollback" the traffic of a blue-green release. In the recommended configuration example, jumping from the third step to the second step will change the traffic configuration from "100% of requests routed to the new version" to "10% of requests routed to the new version." If you further jump to the first step, all requests will be routed to the stable version.
However, it is important to note that if the `replicas` of the target step are less than those of the current step, no Pods will be scaled down.
In fact, directly modifying the `spec.strategy.blueGreen.steps[x].traffic` of the Rollout resource can achieve a similar effect. For example, changing `spec.strategy.blueGreen.steps[lastBatch].traffic` from 100% to 10%, and then from 10% to 0%. However, directly modifying the spec may affect the next release, whereas modifying the status can avoid this issue.
**Note:** Allowing step-jumping and modifying the traffic configuration of a specific step are new features introduced with blue-green release and require specific Rollout versions.
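The step jump shown earlier in this section can be parameterized with a small helper that builds the merge-patch body. This is a sketch: `build_step_patch` is a hypothetical name, and the `kubectl patch` invocation in the comment is the same command as above with the body generated instead of hand-escaped.

```shell
# Hypothetical helper: print the merge-patch body that sets blueGreenStatus.nextStepIndex.
build_step_patch() {
  printf '{"status":{"blueGreenStatus":{"nextStepIndex":%d}}}\n' "$1"
}

# Preview the patch for jumping to step 1 (all traffic back on the stable version).
build_step_patch 1

# Against a live cluster (not run here):
#   kubectl patch rollout rollouts-demo --type merge --subresource status -p "$(build_step_patch 1)"
```

Building the JSON with `printf` and single quotes avoids the backslash-escaped quotes needed in the inline form.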
## Considerations
### HPA
During the blue-green release process, if the workload is bound to an HPA (Horizontal Pod Autoscaler), Kruise-Rollout will disable the HPA. You will notice that the HPA's `scaleTargetRef.name` is appended with the suffix "-DisableByRollout", causing the workload to be reported as Not Found. After the blue-green release is done, the suffix will be removed, and the HPA will become active again.
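The renaming described above amounts to appending and later stripping a fixed suffix. The following sketch only mimics that string manipulation so you can recognize a disabled HPA; the helper names and the HPA name `hpa-demo` in the comment are hypothetical, and this is not the controller's code.

```shell
suffix="-DisableByRollout"

# Name the HPA points at while the blue-green release is in progress.
disabled_hpa_target() { echo "${1}${suffix}"; }

# Name the HPA points at again once the release has finished.
restored_hpa_target() { echo "${1%"$suffix"}"; }

disabled_hpa_target workload-demo
restored_hpa_target workload-demo-DisableByRollout

# To inspect a real HPA (hypothetical name hpa-demo, not run here):
#   kubectl get hpa hpa-demo -o jsonpath='{.spec.scaleTargetRef.name}'
```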
### PDB
During the blue-green release process, if the workload is bound to a PDB (Pod Disruption Budget), the PDB does not consider the version of the Pods when calculating "Allowed disruptions". As a result, configuring `maxUnavailable` will lead to "smaller step sizes and over-protection," whereas configuring `minAvailable` will result in "larger step sizes and under-protection." Therefore, unless necessary, it is recommended to use `minAvailable` (prefer "over-protection" rather than "under-protection").
### Successive Release
The blue-green strategy does not support successive releases. If you are in the process of releasing version v2 (upgrading from v1 to v2) and want to release version v3, we recommend that you first manually roll back to version v1 and then release version v3.
If you have accidentally released version v3, the controller will stop working: the Pods for version v3 will not be created, and the Pods for versions v2 and v1 will not be scaled down. If you check the Rollout resource at this point, you will notice a message similar to the following:
```shell
NAME STATUS CANARY_STEP CANARY_STATE MESSAGE AGE
rollouts-demo Progressing 1 StepPaused new version releasing detected in the progress of blue-green release, please rollback first 6d23h
```
It is recommended to perform a rollback operation at this time. You can roll back to version v1. Specifically, if you use the `kubectl rollout undo` command to perform the rollback, please ensure you specify the correct `--to-revision` to revert to version v1. If you do not specify `--to-revision`, you might revert back to version v2 (i.e., the state before deploying version v3):
```shell
NAME STATUS CANARY_STEP CANARY_STATE MESSAGE AGE
rollouts-demo Progressing 1 StepPaused Rollout is in step(1/4), and you need manually confirm to enter the next step 7d
```
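A rollback with an explicit revision might look like the sketch below. The helper only prints the command so it can be reviewed before running; the function name and the revision number `1` are assumptions for illustration, and the real revision of v1 should be looked up first with `kubectl rollout history deployment/workload-demo`.

```shell
# Hypothetical helper: print (do not run) the undo command for an explicit revision.
# undo_to_revision_cmd WORKLOAD REVISION
undo_to_revision_cmd() {
  echo "kubectl rollout undo deployment/$1 --to-revision=$2"
}

# Revision 1 is assumed to be v1 here; check the rollout history to confirm.
undo_to_revision_cmd workload-demo 1
```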
## Known Issues
- Due to unresolved issues in the CloneSet controller code, when performing blue-green strategy on CloneSets, all steps' replicas should be set to 100%. Otherwise, CloneSets may fail to scale up. Deployment is not subject to this restriction.
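Before applying a blue-green Rollout to a CloneSet, the restriction above can be checked with a tiny shell sketch. `all_replicas_full` is a hypothetical helper that takes the `replicas` values of all steps as arguments; extracting those values from the Rollout manifest is left out.

```shell
# Hypothetical helper: succeed only if every given replicas value is exactly "100%".
all_replicas_full() {
  local r
  for r in "$@"; do
    [ "$r" = "100%" ] || return 1
  done
  return 0
}

if all_replicas_full "100%" "100%" "100%"; then
  echo "steps are safe for a CloneSet blue-green release"
fi
```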


@ -31,6 +31,7 @@ module.exports = {
'Release Strategies': [
'user-manuals/strategy-canary-update',
'user-manuals/strategy-multi-batch-update',
'user-manuals/strategy-bluegreen-update',
'user-manuals/strategy-ab-testing',
'user-manuals/strategy-end2end-canary-update',
]

Binary file not shown (image, 597 KiB).

Binary file not shown (image, 282 KiB).