diff --git a/blog/2022-08-17-helm-rollout.md b/blog/2022-08-17-helm-rollout.md
new file mode 100644
index 00000000..9f7bc083
--- /dev/null
+++ b/blog/2022-08-17-helm-rollout.md
@@ -0,0 +1,246 @@
---
title: The Canary Rollout of the Helm Chart Application Is Coming!
author: Yike Wang
author_title: KubeVela Team
author_url: https://github.com/kubevela/kubevela
author_image_url: https://KubeVela.io/img/logo.svg
tags: [ KubeVela, "use-case", "helm chart", "Canary Rollout"]
description: ""
image: https://raw.githubusercontent.com/oam-dev/KubeVela.io/main/docs/resources/KubeVela-03.png
hide_table_of_contents: false
---

## Background

Helm is a client-side application packaging and deployment tool that is widely used in the cloud-native field. Its simple design and ease of use have won broad recognition from users and built up a rich ecosystem; to date, thousands of applications have been packaged as Helm Charts. Helm's design philosophy is concise and can be summarized in two aspects:

1. **Packaging and templating complex Kubernetes APIs**, abstracting and simplifying them into a small number of parameters.

2. **Providing an application lifecycle solution:** production, upload (hosting), versioning, distribution (discovery), and deployment.

These two design principles keep Helm flexible and simple enough to cover all Kubernetes APIs, which solves the problem of one-off cloud-native application delivery well. However, for enterprises operating at a certain scale, using Helm for continuous software delivery poses quite a challenge.

## Challenges of Continuous Delivery with Helm

Helm was designed for simplicity and ease of use rather than for complex component orchestration. Therefore, **Helm delivers all resources to the Kubernetes cluster in one shot during application deployment and expects Kubernetes' final-state-oriented, self-healing capabilities to resolve application dependency and orchestration problems automatically.** Such a design may be fine for the first deployment, but it is too idealistic for production environments of any significant scale.

On the one hand, updating all resources at once during an application upgrade can easily cause an overall service interruption when some services become briefly unavailable. On the other hand, if the software contains a bug, it cannot be rolled back in time, which makes the incident harder to contain. In more serious scenarios, configurations that were modified manually during O&M in the production environment are overwritten by Helm's one-off deployment; and because the previous Helm release may no longer match the actual state of the production environment, a rollback cannot restore it either. All of this enlarges the blast radius of a failure.

**Thus, at a certain scale, grayscale (canary) release and rollback capabilities in the production environment become extremely important, and Helm by itself cannot guarantee sufficient stability.**

## How Do We Enable Canary Rollout for Helm?

Typically, a rigorous software upgrade follows a similar three-stage process. The first stage upgrades a small number of pods (say 20%) and switches a small amount of traffic to the new version, then pauses the upgrade. After manual confirmation, the second stage upgrades a larger proportion of pods and traffic (say 90%) and pauses again for confirmation. In the final stage, the remaining pods are upgraded to the new version and verified, which completes the rollout. If any anomaly is found during the upgrade, including in business metrics, such as a rise in CPU or memory usage or an excessive number of 500 errors in the request logs, you can roll back quickly.
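To make these three stages concrete, such a plan can be expressed declaratively as an ordered list of steps. The snippet below is only an illustrative sketch: the field names are loosely modeled on the OpenKruise Rollouts canary API that backs the addon introduced later in this article, and they are not meant to be copied verbatim.

```yaml
# Illustrative only: a three-stage canary plan expressed as declarative steps.
# Field names loosely follow the OpenKruise Rollouts canary API; consult the
# kruise-rollout addon documentation for the authoritative schema.
canary:
  steps:
    - weight: 20     # stage 1: move ~20% of pods and traffic to the new version
      pause: {}      # pause and wait for manual confirmation
    - weight: 90     # stage 2: move ~90% of pods and traffic, then pause again
      pause: {}
  # after the last confirmation the rollout completes at 100%
  trafficRoutings:
    - type: nginx    # traffic shifting is handled by the ingress controller
```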
![image](/img/rollout-step.jpg)

The image above shows a typical canary rollout scenario. *So how do we implement this process for a Helm chart application?* There are two typical approaches in the industry:

1. **Modify the Helm chart to split the workload into two copies and expose separate Helm parameters for each.** During the rollout, the images, pod counts, and traffic ratio of the two workloads are adjusted step by step to implement the canary rollout.
2. **Modify the Helm chart to replace the original basic workload with a custom workload that offers the same features plus phased-rollout capabilities, and expose the corresponding Helm parameters.** During the canary rollout, it is these custom rollout CRDs that are manipulated.

Both solutions are complex and carry considerable modification costs, especially **when your Helm chart is a third-party component that you cannot modify, or when you are not in a position to maintain a Helm chart yourself.** Even after such modifications, stability risks remain compared with the original simple workload model, because **Helm is only a package management tool and was never designed for canary rollout or workload management.**

Through in-depth communication with a large number of users in the community, we found that most users' applications are not complicated; the majority use classic workload types such as Deployment and StatefulSet. Therefore, building on the powerful addon mechanism of [KubeVela](http://kubevela.net/) and on the [OpenKruise](https://openkruise.io/) community, we have built a canary rollout KubeVela addon for these common workload types. **This addon helps you easily complete the canary rollout of a Helm chart without any migration or modification.** If your Helm chart is more complicated, you can customize an addon for your own scenario to get the same experience.

Let's walk through a practical example (using a Deployment workload) to see the complete process.

## Use KubeVela for Canary Rollout

### Prepare the Environment

- Install KubeVela

```shell
$ curl -fsSl https://static.kubevela.net/script/install-velad.sh | bash
$ velad install
```

See [this document](https://kubevela.net/docs/install#1-install-velad) for more installation details.

- Enable the related addons

```shell
$ vela addon enable fluxcd
$ vela addon enable ingress-nginx
$ vela addon enable kruise-rollout
$ vela addon enable velaux
```

This step enables the following addons:
1. The fluxcd addon provides the capability to deliver Helm charts.
2. The ingress-nginx addon provides the traffic management capability used by the canary rollout.
3. The kruise-rollout addon provides the canary rollout capability.
4. The velaux addon provides the UI console and visualization.

- Forward the port of the NGINX ingress controller to your local machine

```shell
$ vela port-forward addon-ingress-nginx -n vela-system
```
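Before moving on to the first deployment, it can be worth confirming that the environment is ready. Here is a minimal check, assuming the standard vela CLI is on your PATH (the exact output columns vary between KubeVela versions):

```shell
# Verify that fluxcd, ingress-nginx, kruise-rollout, and velaux are all
# reported as enabled before continuing.
$ vela addon list
```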
### First Deployment

Run the following command to deploy the Helm application for the first time. In this step, the deployment is done through KubeVela's CLI tool; if you are more familiar with Kubernetes, you can also deploy it with `kubectl apply`. Both work the same way.

```shell
cat <