--- title: "Canary Deployments using Istio" overview: Using Istio to create autoscaled canary deployments published: true permalink: blog/canary-deployments-using-istio.html attribution: The Istio Team layout: post type: markdown --- One of the benefits of the [Istio]({{home}}) project is that it provides the control needed to deploy canary services. The idea behind canary deployment (or rollout) is to introduce a new version of a service by first testing it using a small percentage of user traffic, and then if all goes well, increase, possibly gradually in increments, the percentage while simultaneously phasing out the old version. If anything goes wrong along the way, we abort and rollback to the previous version. In its simplest form, the traffic sent to the canary version is a randomly selected percentage of requests, but in more sophisticated schemes it can be based on the region, user, or other properties of the request. Depending on your level of expertise in this area, you may wonder why Istio's support for canary deployment is even needed, given that platforms like Kubernetes already provide a way to do [version rollout](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#updating-a-deployment) and [canary deployment](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments). Problem solved, right? Well, not exactly. Although doing a rollout this way works in simple cases, it’s very limited, especially in large scale cloud environments receiving lots of (and especially varying amounts of) traffic, where autoscaling is needed. ## Canary deployment in Kubernetes As an example, let's say we have a deployed service, **helloworld** version **v1**, for which we would like to test (or simply rollout) a new version, **v2**. Using Kubernetes, you can rollout a new version of the **helloworld** service by simply updating the image in the service’s corresponding [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) and letting the [rollout](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#updating-a-deployment) happen automatically. If we take particular care to ensure that there are enough **v1** replicas running when we start and [pause](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#pausing-and-resuming-a-deployment) the rollout after only one or two **v2** replicas have been started, we can keep the canary’s effect on the system very small. We can then observe the effect before deciding to proceed or, if necessary, [rollback](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment). Best of all, we can even attach a [horizontal pod autoscaler](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#scaling-a-deployment) to the Deployment and it will keep the replica ratios consistent if, during the rollout process, it also needs to scale replicas up or down to handle traffic load. Although fine for what it does, this approach is only useful when we have a properly tested version that we want to deploy, i.e., more of a blue/green, a.k.a. red/black, kind of upgrade than a "dip your feet in the water" kind of canary deployment. 
Although fine for what it does, this approach is only useful when we have a properly tested version that we want to deploy, i.e., more of a blue/green, a.k.a. red/black, kind of upgrade than a "dip your feet in the water" kind of canary deployment. In fact, for the latter (for example, testing a canary version that may not even be ready or intended for wider exposure), the canary deployment in Kubernetes would be done using two Deployments with [common pod labels](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#using-labels-effectively). In this case, however, we can no longer rely on autoscaling to preserve the traffic split, because scaling is now done by two independent autoscalers, one for each Deployment, so the replica ratios (percentages) may vary from the desired ratio, depending purely on load.

Whether we use one deployment or two, canary management using the deployment features of container orchestration platforms like Docker, Mesos/Marathon, or Kubernetes has a fundamental problem: they use instance scaling to manage traffic; version traffic distribution and replica deployment are not independent in these systems. All replica pods, regardless of version, are treated the same in the kube-proxy round-robin pool, so the only way to manage the amount of traffic that a particular version receives is by controlling the replica ratio. Maintaining canary traffic at small percentages requires many replicas (e.g., 1% would require a minimum of 100 replicas). Even if we ignore this problem, the deployment approach is still very limited in that it only supports the simple (random percentage) canary approach. If, instead, we wanted to limit the visibility of the canary to requests based on some specific criteria, we would still need another solution.

## Enter Istio

With Istio, traffic routing and replica deployment are two completely independent functions. The number of pods implementing a service is free to scale up and down based on traffic load, completely orthogonal to the control of version traffic routing. This makes managing a canary version in the presence of autoscaling a much simpler problem. Autoscalers may, in fact, respond to load variations resulting from traffic routing changes, but they are nevertheless functioning independently and no differently than when loads change for other reasons.

Istio's [routing rules]({{home}}/docs/concepts/traffic-management/rules-configuration.html) also provide other important advantages; you can easily control fine-grained traffic percentages (e.g., route 1% of traffic without requiring 100 pods) and you can control traffic using other criteria (e.g., route traffic for specific users to the canary version). To illustrate, let's look at deploying the **helloworld** service and see how simple the problem becomes.

We begin by defining the **helloworld** Service, just like any other Kubernetes service, something like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  labels:
    app: helloworld
spec:
  selector:
    app: helloworld
  ...
```

We then add 2 Deployments, one for each version (**v1** and **v2**), both of which include the service selector's `app: helloworld` label:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: helloworld-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: helloworld
        version: v1
    spec:
      containers:
      - image: helloworld-v1
        ...
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: helloworld-v2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: helloworld
        version: v2
    spec:
      containers:
      - image: helloworld-v2
        ...
```
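As in the single-Deployment case, a horizontal pod autoscaler can be attached to each of these Deployments so that the replicas of each version scale with load. The following is a minimal sketch; the CPU target and replica bounds are illustrative assumptions:

```bash
# Illustrative autoscaler settings (50% CPU target, 1-10 replicas per version).
$ kubectl autoscale deployment helloworld-v1 --cpu-percent=50 --min=1 --max=10
$ kubectl autoscale deployment helloworld-v2 --cpu-percent=50 --min=1 --max=10
```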
Note that this is exactly the same way we would do a [canary deployment](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#canary-deployments) using plain Kubernetes, but in that case we would need to adjust the number of replicas of each Deployment to control the distribution of traffic. For example, to send 10% of the traffic to the canary version (**v2**), the replicas for **v1** and **v2** could be set to 9 and 1, respectively.

However, since we are going to deploy the service in an [Istio enabled]({{home}}/docs/setup/) cluster, all we need to do is set a routing rule to control the traffic distribution. For example, if we want to send 10% of the traffic to the canary, we could use the [istioctl]({{home}}/docs/reference/commands/istioctl.html) command to set a routing rule something like the following.
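A sketch of such a rule, assuming the `config.istio.io/v1alpha2` `RouteRule` resource format accepted by `istioctl` at the time of writing (the exact schema varies across Istio releases, so treat the field names as illustrative):

```bash
$ cat <<EOF | istioctl create -f -
# Illustrative weighted rule: send 90% of helloworld traffic to v1 and 10% to the v2 canary.
apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: helloworld-default
spec:
  destination:
    name: helloworld
  route:
  - labels:
      version: v1
    weight: 90
  - labels:
      version: v2
    weight: 10
EOF
```

The key point is that the desired 90/10 split is expressed directly in the rule, independent of how many replicas of each version happen to be running.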