Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)

ab-testing aws-appmesh canary contour gitops gloo istio kubernetes linkerd nginx progressive-delivery

Ciaran Moran ce6ae8d511 Use multi-stage build to slim image		2020-06-01 14:36:15 +01:00
.circleci	e2e: Update Kind, Istio and Linkerd	2020-05-02 09:39:55 +03:00
.github	Publish Helm chart from CircleCI	2019-08-05 17:08:33 +03:00
artifacts	Add source labels to analysis matching rules	2020-05-18 13:16:03 +03:00
charts	Add source labels to analysis matching rules	2020-05-18 13:16:03 +03:00
cmd	Add ingress class support for Contour	2020-05-14 12:17:03 +03:00
docs	Add --debug to helm command	2020-05-30 18:07:40 +02:00
hack	Add AppMesh v1beta2 clientset and RBAC	2020-05-04 22:22:51 +03:00
kustomize	Add source labels to analysis matching rules	2020-05-18 13:16:03 +03:00
pkg	Add allow origins field to CORS spec	2020-06-01 14:58:08 +03:00
test	Update Istio e2e to v1.6.0	2020-06-01 14:06:59 +03:00
.codecov.yml	build: post report only if coverage changes	2020-03-23 12:51:21 +02:00
.gitbook.yaml	docs: use metric providers in tutorials	2020-02-29 11:56:13 +02:00
.gitignore	clean up and update dependencies of flagger	2020-03-20 17:11:02 -07:00
.goreleaser.yml	CircleCI - fix deprecated goreleaser config	2019-06-21 16:15:51 +03:00
CHANGELOG.md	Release v1.0.0-rc.5	2020-05-14 14:00:23 +03:00
CONTRIBUTING.md	Fix broken link to Flagger Development Guide	2020-05-11 23:35:56 -04:00
Dockerfile	Use multi-stage build to slim image	2020-06-01 14:36:15 +01:00
Dockerfile.loadtester	loadtester: release v0.16.0	2020-03-31 12:48:16 +03:00
LICENSE	Copyright Weaveworks	2019-01-03 14:42:21 +02:00
MAINTAINERS	docs: add maintainer: @mathetake	2020-02-27 13:22:59 +02:00
Makefile	build: make release compatible with go mod	2020-02-28 18:46:26 +02:00
README.md	update README: custom metric instead of custom promql	2020-05-27 17:31:40 +09:00
code-of-conduct.md	Add contributing and code of conduct docs	2018-10-11 14:33:28 +03:00
go.mod	build: Update Kubernetes client-go to 1.18.2	2020-05-02 08:54:22 +03:00
go.sum	build: Update Kubernetes client-go to 1.18.2	2020-05-02 08:54:22 +03:00

README.md

flagger

Flagger is a progressive delivery tool that automates the release process for applications running on Kubernetes. It reduces the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics and running conformance tests.

Flagger implements several deployment strategies (Canary releases, A/B testing, Blue/Green mirroring) using a service mesh (App Mesh, Istio, Linkerd) or an ingress controller (Contour, Gloo, NGINX) for traffic routing. For release analysis, Flagger can query Prometheus, Datadog or CloudWatch, and for alerting it uses Slack, MS Teams, Discord and Rocket.

Documentation

Flagger documentation can be found at docs.flagger.app.

Install
- Flagger install on Kubernetes
Usage
Tutorials
- App Mesh
- Istio
- Linkerd
- Contour
- Gloo
- NGINX Ingress
- Kubernetes Blue/Green

Who is using Flagger

List of organizations using Flagger:

If you are using Flagger, please submit a PR to add your organization to the list!

Canary CRD

Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA), then creates a series of objects (Kubernetes deployments, ClusterIP services, service mesh or ingress routes). These objects expose the application on the mesh and drive the canary analysis and promotion.

Flagger keeps track of ConfigMaps and Secrets referenced by a Kubernetes Deployment and triggers a canary analysis if any of those objects change. When promoting a workload in production, both code (container images) and configuration (config maps and secrets) are synchronised.
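
To make the tracking described above concrete, here is a minimal sketch of a Deployment that references one ConfigMap and one Secret. The object names podinfo-config and podinfo-secret, the mount path and the image tag are assumptions made for illustration and are not taken from the Flagger docs. Going by the description above, editing either referenced object should trigger a new canary analysis, just as changing the container image would.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  namespace: test
spec:
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: podinfo
          # illustrative image tag (assumption)
          image: stefanprodan/podinfo:3.1.0
          ports:
            - name: http
              containerPort: 9898
          # configuration injected as environment variables
          envFrom:
            - configMapRef:
                name: podinfo-config    # hypothetical ConfigMap
          volumeMounts:
            - name: secrets
              mountPath: /etc/podinfo/secrets   # hypothetical path
              readOnly: true
      volumes:
        # credentials mounted as files
        - name: secrets
          secret:
            secretName: podinfo-secret  # hypothetical Secret
```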

For a deployment named podinfo, a canary promotion can be defined using Flagger's custom resource:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # service mesh provider (optional)
  # can be: kubernetes, istio, linkerd, appmesh, nginx, contour, gloo, supergloo
  provider: istio
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    # service name (defaults to targetRef.name)
    name: podinfo
    # ClusterIP port number
    port: 9898
    # container port name or number (optional)
    targetPort: 9898
    # port name can be http or grpc (default http)
    portName: http
    # add all the other container ports
    # to the ClusterIP services (default false)
    portDiscovery: true
    # HTTP match conditions (optional)
    match:
      - uri:
          prefix: /
    # HTTP rewrite (optional)
    rewrite:
      uri: /
    # request timeout (optional)
    timeout: 5s
  # promote the canary without analysing it (default false)
  skipAnalysis: false
  # define the canary analysis timing and KPIs
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 10
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # validation (optional)
    metrics:
    - name: request-success-rate
      # builtin Prometheus check
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      # builtin Prometheus check
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
    - name: "database connections"
      # custom metric check
      templateRef:
        name: db-connections
      thresholdRange:
        min: 2
        max: 100
      interval: 1m
    # testing (optional)
    webhooks:
      - name: "conformance test"
        type: pre-rollout
        url: http://flagger-helmtester.test/
        timeout: 5m
        metadata:
          type: "helmv3"
          cmd: "test run podinfo -n test"
      - name: "load test"
        type: rollout
        url: http://flagger-loadtester.test/
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://podinfo.test:9898/"
    # alerting (optional)
    alerts:
      - name: "dev team Slack"
        severity: error
        providerRef:
          name: dev-slack
          namespace: flagger
      - name: "qa team Discord"
        severity: warn
        providerRef:
          name: qa-discord
      - name: "on-call MS Teams"
        severity: info
        providerRef:
          name: on-call-msteams

For more details on how the canary analysis and promotion work, please read the docs.
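
The Canary above refers to a metric template (db-connections) and to alert providers (dev-slack, qa-discord, on-call-msteams) that are defined as separate objects. The following is a minimal sketch of what two of them could look like. The Prometheus address, the metric name db_connections_open, the Slack channel and the webhook URL are placeholders chosen for illustration. The template is placed in the test namespace on the assumption that a templateRef without a namespace resolves to the canary's own namespace.

```yaml
# custom metric referenced by templateRef.name: db-connections
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: db-connections
  namespace: test
spec:
  provider:
    type: prometheus
    # placeholder Prometheus address (assumption)
    address: http://prometheus.istio-system:9090
  # db_connections_open is a hypothetical application metric;
  # {{ namespace }} and {{ target }} are filled in by Flagger
  query: |
    sum(
      db_connections_open{
        namespace="{{ namespace }}",
        pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
      }
    )
---
# alert provider referenced by providerRef.name: dev-slack
apiVersion: flagger.app/v1beta1
kind: AlertProvider
metadata:
  name: dev-slack
  namespace: flagger
spec:
  type: slack
  channel: dev-alerts        # hypothetical channel
  username: flagger
  # placeholder webhook URL, replace with a real one
  address: https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK
```

The value returned by the query is compared against the metric's thresholdRange (min 2, max 100 in the Canary above) at every analysis interval.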

Features

Feature	Istio	Linkerd	App Mesh	NGINX	Gloo	Contour	CNI
Canary deployments (weighted traffic)	✔️	✔️	✔️	✔️	✔️	✔️	➖
A/B testing (headers and cookies routing)	✔️	➖	✔️	✔️	➖	✔️	➖
Blue/Green deployments (traffic switch)	✔️	✔️	✔️	✔️	✔️	✔️	✔️
Webhooks (acceptance/load testing)	✔️	✔️	✔️	✔️	✔️	✔️	✔️
Manual gating (approve/pause/resume)	✔️	✔️	✔️	✔️	✔️	✔️	✔️
Request success rate check (L7 metric)	✔️	✔️	✔️	✔️	✔️	✔️	➖
Request duration check (L7 metric)	✔️	✔️	✔️	✔️	✔️	✔️	➖
Custom metric checks	✔️	✔️	✔️	✔️	✔️	✔️	✔️
Traffic policy, CORS, retries and timeouts	✔️	➖	➖	➖	➖	✔️	➖
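
As an illustration of the A/B testing row, the analysis section of a Canary can route by HTTP headers or cookies instead of shifting weighted traffic. The fragment below is a sketch that assumes the Istio provider. The header name x-canary and the cookie value canary=always are examples only, and the iteration count is arbitrary.

```yaml
  analysis:
    # schedule interval
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 10
    # total number of checks to run before promotion
    # (replaces maxWeight/stepWeight)
    iterations: 10
    # requests matching any of these conditions go to the canary
    match:
      - headers:
          x-canary:
            regex: ".*insider.*"
      - headers:
          cookie:
            regex: "^(.*?;)?(canary=always)(;.*)?$"
```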

Roadmap

Add support for Kubernetes Ingress v2
Integrate with other service meshes like Consul Connect and ingress controllers like HAProxy, ALB
Integrate with other metrics providers like InfluxDB, Stackdriver, SignalFX
Add support for comparing the canary metrics to the primary ones and validating based on the deviation between the two

Contributing

Flagger is Apache 2.0 licensed and accepts contributions via GitHub pull requests. To start contributing please read the development guide.

When submitting bug reports, please include as much detail as possible:

which Flagger version
which Flagger CRD version
which Kubernetes version
what configuration (canary, ingress and workloads definitions)
what happened (Flagger and Proxy logs)

Getting Help

If you have any questions about Flagger and progressive delivery:

Read the Flagger docs.
Invite yourself to the Weave community Slack and join the #flagger channel.
Join the Weave User Group and get invited to online talks, hands-on training and meetups in your area.
File an issue.

Your feedback is always welcome!
