| id | title | sidebar_label |
| --- | --- | --- |
| chaos-engine | ChaosEngine | ChaosEngine |

The ChaosEngine CR is the main user-facing chaos custom resource. It is namespace-scoped and holds the information on how the chaos experiments are executed. It connects an application instance with one or more chaos experiments, while allowing users to specify run-level details (overriding experiment defaults, providing new environment variables and volumes, choosing whether to delete or retain experiment pods, etc.). This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.
Prerequisites
To understand the concepts of ChaosEngine better, make sure you are familiar with the Chaos Experiment Custom Resources.
ChaosEngine
State Specification
This section describes the fields in the ChaosEngine spec and the values that can be set for each of them.

| Field | `.spec.engineState` |
| --- | --- |
| Description | Flag to control the state of the chaosengine |
| Type | Mandatory |
| Range | `active`, `stop` |
| Default | `active` |
| Notes | The `engineState` in the spec is a user-defined flag to trigger chaos. Setting it to `active` triggers execution of the chaos. Patching it with `stop` aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called `engineStatus`, which is updated by the controller based on the actual state of the ChaosEngine. |

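As an illustration of the state flag, here is a minimal fragment of a ChaosEngine manifest. The resource name `engine-nginx` and the namespace `default` are placeholders chosen for this sketch, and the `apiVersion` shown is the one commonly used for Litmus CRs; verify both against your installation. The fragment is not a complete ChaosEngine, since the mandatory fields described in the later sections are omitted.

```yaml
# Sketch only: a partial ChaosEngine showing the state flag.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx      # hypothetical name
  namespace: default      # hypothetical namespace
spec:
  # "active" triggers chaos; patching this value to "stop" aborts
  # any ongoing experiments.
  engineState: "active"
```

To abort a run, the same field can be patched to `stop`, for example with a `kubectl patch` merge patch against the ChaosEngine resource.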
Application Specification

| Field | `.spec.appinfo.appns` |
| --- | --- |
| Description | Flag to specify the namespace of the application under test (AUT) |
| Type | Optional |
| Range | user-defined (type: string) |
| Default | n/a |
| Notes | The `appns` in the spec specifies the namespace of the AUT. Usually provided as a quoted string. It is optional for infra chaos. |


| Field | `.spec.appinfo.applabel` |
| --- | --- |
| Description | Flag to specify a unique label of the application under test |
| Type | Optional |
| Range | user-defined (type: string) (pattern: `"label_key=label_value"`) |
| Default | n/a |
| Notes | The `applabel` in the spec specifies a unique label of the AUT. Usually provided as a quoted string of the pattern `key=value`. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation `litmuschaos.io/chaos: "true"`. If, however, the `annotationCheck` is disabled, a random application (pod) sharing the specified label is selected for chaos. It is optional for infra chaos. |


| Field | `.spec.appinfo.appkind` |
| --- | --- |
| Description | Flag to specify the resource kind of the application under test |
| Type | Optional |
| Range | `deployment`, `statefulset`, `daemonset`, `deploymentconfig`, `rollout` |
| Default | n/a (depends on app type) |
| Notes | The `appkind` in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on the resource kinds listed under Range. For some experiments, the application health check routines depend on the resource type. It is optional for infra chaos. |


| Field | `.spec.auxiliaryAppInfo` |
| --- | --- |
| Description | Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to the primary application specified in `.spec.appinfo`. NOTE: If the auxiliary applications are deployed in namespaces other than that of the AUT, ensure that the `chaosServiceAccount` is bound to a cluster role and has adequate permissions to list pods in other namespaces. |
| Type | Optional |
| Range | user-defined (type: string) (pattern: `"namespace:label_key=label_value"`) |
| Default | n/a |
| Notes | The `auxiliaryAppInfo` in the spec specifies a comma-separated list of namespace-label pairs. For pod-level chaos experiments, these are the downstream (dependent) apps of the primary app specified in `.spec.appinfo`. For infra-level chaos experiments, this flag specifies the apps that may be directly impacted by chaos and on which health checks are necessary. |

Note: Irrespective of the nature of the chaos experiment, i.e., pod-level (single-app impact, smaller blast radius) or infra-level (multi-app impact, larger blast radius), the `.spec.appinfo` is a must-fill: the experiment must point to at least one primary app whose health is measured as an indicator of the resiliency / success of the chaos experiment.
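The application fields above can be combined as in the following fragment. The namespaces (`default`, `shop`, `payments`) and labels (`app=nginx`, `app=cart`, `app=billing`) are invented for this sketch; only the field names and value patterns come from the tables above.

```yaml
# Sketch only: application selection in a ChaosEngine spec.
spec:
  appinfo:
    appns: "default"          # namespace of the AUT (hypothetical)
    applabel: "app=nginx"     # pattern: label_key=label_value (hypothetical)
    appkind: "deployment"     # one of the kinds listed under Range
  # Comma-separated namespace:label pairs for apps whose health is
  # also monitored (hypothetical values).
  auxiliaryAppInfo: "shop:app=cart,payments:app=billing"
```

Because the auxiliary apps in this sketch live in namespaces other than that of the AUT, the service account used for chaos would need a cluster role that allows listing pods in those namespaces.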
RBAC Specification

| Field | `.spec.chaosServiceAccount` |
| --- | --- |
| Description | Flag to specify the serviceaccount used for the chaos experiment |
| Type | Mandatory |
| Range | user-defined (type: string) |
| Default | n/a |
| Notes | The `chaosServiceAccount` in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the `.spec.definition.permissions` field of the respective chaosexperiment CR. |

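A rough sketch of how such a service account could be wired up with standard Kubernetes RBAC objects is shown below. The name `pod-delete-sa`, the namespace `default`, and the rules in the Role are illustrative assumptions only. The actual rules should be copied from the `.spec.definition.permissions` field of the chaosexperiment CR you intend to run.

```yaml
# Sketch only: a namespaced service account bound to a Role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-delete-sa         # hypothetical name
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-delete-sa
  namespace: default
rules:
# Illustrative rules; replace with the permissions listed in the
# chaosexperiment CR (.spec.definition.permissions).
- apiGroups: [""]
  resources: ["pods", "events"]
  verbs: ["create", "get", "list", "patch", "update", "delete"]
- apiGroups: ["litmuschaos.io"]
  resources: ["chaosengines", "chaosexperiments", "chaosresults"]
  verbs: ["create", "get", "list", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-delete-sa
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-delete-sa
subjects:
- kind: ServiceAccount
  name: pod-delete-sa
  namespace: default
```

The ChaosEngine would then reference the account with `chaosServiceAccount: pod-delete-sa` under `spec`.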
Runtime Specification

| Field | `.spec.annotationCheck` |
| --- | --- |
| Description | Flag to control annotation checks on applications as prerequisites for chaos |
| Type | Optional |
| Range | `true`, `false` |
| Default | `true` |
| Notes | The `annotationCheck` in the spec controls whether or not the operator checks for the annotation `litmuschaos.io/chaos` to be set against the application under test (AUT). Setting it to `true` ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to `false` suppresses this check and proceeds with chaos injection. |


| Field | `.spec.terminationGracePeriodSeconds` |
| --- | --- |
| Description | Flag to control `terminationGracePeriodSeconds` for the chaos pods (abort case) |
| Type | Optional |
| Range | integer value |
| Default | 30 |
| Notes | The `terminationGracePeriodSeconds` in the spec controls the termination grace period for the chaos resources in the abort case. Chaos pods contain the steps that revert chaos upon abort, and they continuously watch for termination signals. The `terminationGracePeriodSeconds` should be set so that the chaos pods get enough time to complete the revert before they are terminated. |


| Field | `.spec.jobCleanUpPolicy` |
| --- | --- |
| Description | Flag to control cleanup of the chaos experiment job after execution of chaos |
| Type | Optional |
| Range | `delete`, `retain` |
| Default | `delete` |
| Notes | The `jobCleanUpPolicy` controls whether or not the experiment pods are removed once execution completes. Set it to `retain` for debugging purposes (in the absence of standard logging mechanisms). |

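The runtime fields can be set together as in this fragment. The values are examples chosen for a debugging scenario, not recommendations. Two assumptions to verify against your installed CRD: the cleanup key is written here as `jobCleanUpPolicy` (capital U), which is how it appears in ChaosEngine manifests as far as we know, and `annotationCheck` is passed as a quoted string.

```yaml
# Sketch only: runtime settings in a ChaosEngine spec.
spec:
  # Skip chaos unless the AUT carries the litmuschaos.io/chaos annotation.
  annotationCheck: "true"
  # Give chaos pods 60s (instead of the default 30s) to revert chaos
  # when the engine is aborted.
  terminationGracePeriodSeconds: 60
  # Keep experiment pods after the run so their logs can be inspected.
  jobCleanUpPolicy: "retain"
```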
Component Specification

| Field | `.spec.components.runner.image` |
| --- | --- |
| Description | Flag to specify the image of the ChaosRunner pod |
| Type | Optional |
| Range | user-defined (type: string) |
| Default | n/a (refer to Notes) |
| Notes | The `.components.runner.image` allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env `CHAOS_RUNNER_IMAGE`. |


| Field | `.spec.components.runner.imagePullPolicy` |
| --- | --- |
| Description | Flag to specify the imagePullPolicy for the ChaosRunner |
| Type | Optional |
| Range | `Always`, `IfNotPresent` |
| Default | `IfNotPresent` |
| Notes | The `.components.runner.imagePullPolicy` allows developers to specify the pull policy for chaos-runner. Set it to `Always` during debug/test. |


| Field | `.spec.components.runner.imagePullSecrets` |
| --- | --- |
| Description | Flag to specify imagePullSecrets for the ChaosRunner |
| Type | Optional |
| Range | user-defined (type: `[]corev1.LocalObjectReference`) |
| Default | n/a |
| Notes | The `.components.runner.imagePullSecrets` allows developers to specify the imagePullSecret name for the ChaosRunner. |


| Field | `.spec.components.runner.runnerAnnotations` |
| --- | --- |
| Description | Annotations to be added to the runner pod that is created |
| Type | Optional |
| Range | user-defined (type: `map[string]string`) |
| Default | n/a |
| Notes | The `.components.runner.runnerAnnotations` allows developers to specify custom annotations for the runner pod. |


| Field | `.spec.components.runner.args` |
| --- | --- |
| Description | Specify the args for the ChaosRunner pod |
| Type | Optional |
| Range | user-defined (type: `[]string`) |
| Default | n/a |
| Notes | The `.components.runner.args` allows developers to specify their own debug runner args. |


| Field | `.spec.components.runner.command` |
| --- | --- |
| Description | Specify the commands for the ChaosRunner pod |
| Type | Optional |
| Range | user-defined (type: `[]string`) |
| Default | n/a |
| Notes | The `.components.runner.command` allows developers to specify their own debug runner commands. |


| Field | `.spec.components.runner.configMaps` |
| --- | --- |
| Description | Configmaps passed to the chaos runner pod |
| Type | Optional |
| Range | user-defined (type: `{name: string, mountPath: string}`) |
| Default | n/a |
| Notes | The `.spec.components.runner.configMaps` provides a means to insert config information into the runner pod. |


| Field | `.spec.components.runner.secrets` |
| --- | --- |
| Description | Kubernetes secrets passed to the chaos runner pod |
| Type | Optional |
| Range | user-defined (type: `{name: string, mountPath: string}`) |
| Default | n/a |
| Notes | The `.spec.components.runner.secrets` provides a means to push secrets (typically project IDs, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments. |


| Field | `.spec.components.runner.nodeSelector` |
| --- | --- |
| Description | Node selectors for the runner pod |
| Type | Optional |
| Range | Labels in the form of label key=value |
| Default | n/a |
| Notes | The `.spec.components.runner.nodeSelector` contains the labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node-level chaos. |


| Field | `.spec.components.runner.resources` |
| --- | --- |
| Description | Specify the resource requirements for the ChaosRunner pod |
| Type | Optional |
| Range | user-defined (type: `corev1.ResourceRequirements`) |
| Default | n/a |
| Notes | The `.spec.components.runner.resources` contains the resource requirements for the ChaosRunner pod, where resource requests and limits for the pod can be provided. |


| Field | `.spec.components.runner.tolerations` |
| --- | --- |
| Description | Tolerations for the runner pod |
| Type | Optional |
| Range | user-defined (type: `[]corev1.Toleration`) |
| Default | n/a |
| Notes | The `.spec.components.runner.tolerations` provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos. |

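The runner settings above might be combined as follows. Every concrete value here (image tag, secret and configmap names, mount paths, node label, taint key) is a placeholder invented for this sketch. `nodeSelector`, `resources` and `tolerations` are written in the usual Kubernetes pod-spec shapes, which is what the listed types suggest.

```yaml
# Sketch only: runner customisation in a ChaosEngine spec.
spec:
  components:
    runner:
      image: "litmuschaos/chaos-runner:latest"   # hypothetical debug image
      imagePullPolicy: "Always"                  # useful during debug/test
      imagePullSecrets:
      - name: regcred                            # hypothetical secret name
      runnerAnnotations:
        example.com/owner: "sre-team"            # hypothetical annotation
      configMaps:
      - name: runner-config                      # hypothetical configmap
        mountPath: /etc/runner-config
      secrets:
      - name: cloud-credentials                  # hypothetical secret
        mountPath: /etc/cloud-credentials
      nodeSelector:
        kubernetes.io/hostname: "node-1"         # hypothetical node label
      resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
        limits:
          cpu: "250m"
          memory: "128Mi"
      tolerations:
      - key: "dedicated"                         # hypothetical taint key
        operator: "Equal"
        value: "chaos"
        effect: "NoSchedule"
```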
Experiment Specification

| Field | `.spec.experiments[].name` |
| --- | --- |
| Description | Name of the chaos experiment CR |
| Type | Mandatory |
| Range | user-defined (type: string) |
| Default | n/a |
| Notes | The `experiment[].name` specifies the chaos experiment to be executed by the ChaosOperator. |


| Field | `.spec.experiments[].spec.components.env` |
| --- | --- |
| Description | Environment variables passed to the chaos experiment |
| Type | Optional |
| Range | user-defined (type: `{name: string, value: string}`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.env` specifies the array of tunables passed to the experiment pods. Though the field is optional from a chaosengine definition viewpoint, it is almost always necessary to provide experiment tunables via this definition. Some of the env variables override the defaults in the experiment CR, while others are mandatory additions that fill in placeholders/empty values in the experiment CR. For a list of "mandatory" and "optional" env for an experiment, refer to the respective experiment documentation. |


| Field | `.spec.experiments[].spec.components.configMaps` |
| --- | --- |
| Description | Configmaps passed to the chaos experiment |
| Type | Optional |
| Range | user-defined (type: `{name: string, mountPath: string}`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.configMaps` provides a means to insert config information into the experiment. The configmaps definition is validated for correctness, and the configmaps specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods. |


| Field | `.spec.experiments[].spec.components.secrets` |
| --- | --- |
| Description | Kubernetes secrets passed to the chaos experiment |
| Type | Optional |
| Range | user-defined (type: `{name: string, mountPath: string}`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.secrets` provides a means to push secrets (typically project IDs, access credentials, etc.) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness, and the secrets specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods. |


| Field | `.spec.experiments[].spec.components.experimentImage` |
| --- | --- |
| Description | Override the image of the chaos experiment |
| Type | Optional |
| Range | string |
| Default | n/a |
| Notes | The `experiment[].spec.components.experimentImage` overrides the experiment image for the chaosexperiment. |


| Field | `.spec.experiments[].spec.components.experimentImagePullSecrets` |
| --- | --- |
| Description | Flag to specify imagePullSecrets for the ChaosExperiment |
| Type | Optional |
| Range | user-defined (type: `[]corev1.LocalObjectReference`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.experimentImagePullSecrets` allows developers to specify the imagePullSecret name for the ChaosExperiment. |


| Field | `.spec.experiments[].spec.components.nodeSelector` |
| --- | --- |
| Description | Provide the node selector for the experiment pod |
| Type | Optional |
| Range | Labels in the form of label key=value |
| Default | n/a |
| Notes | The `experiment[].spec.components.nodeSelector` contains the labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node-level chaos. |


| Field | `.spec.experiments[].spec.components.statusCheckTimeouts` |
| --- | --- |
| Description | Provides the timeout and retry values for the status checks. Defaults to 180s and 90 retries (2s per retry) |
| Type | Optional |
| Range | It contains values in the form `{delay: int, timeout: int}` |
| Default | delay: 2s and timeout: 180s |
| Notes | The `experiment[].spec.components.statusCheckTimeouts` overrides the status check timeouts inside chaosexperiments. It contains the timeout and delay in seconds. |


| Field | `.spec.experiments[].spec.components.resources` |
| --- | --- |
| Description | Specify the resource requirements for the ChaosExperiment pod |
| Type | Optional |
| Range | user-defined (type: `corev1.ResourceRequirements`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.resources` contains the resource requirements for the ChaosExperiment pod, where resource requests and limits for the pod can be provided. |


| Field | `.spec.experiments[].spec.components.experimentAnnotations` |
| --- | --- |
| Description | Annotations to be added to the experiment pod that is created |
| Type | Optional |
| Range | user-defined (type: label key=value) |
| Default | n/a |
| Notes | The `experiment[].spec.components.experimentAnnotations` allows developers to specify custom annotations for the experiment pod. |


| Field | `.spec.experiments[].spec.components.tolerations` |
| --- | --- |
| Description | Tolerations for the experiment pod |
| Type | Optional |
| Range | user-defined (type: `[]corev1.Toleration`) |
| Default | n/a |
| Notes | The `experiment[].spec.components.tolerations` provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos. |


| Field | `.spec.experiments[].spec.probe` |
| --- | --- |
| Description | Declarative way to define the chaos hypothesis |
| Type | Optional |
| Range | user-defined |
| Default | n/a |
| Notes | The `.probe` allows developers to specify the chaos hypothesis. It supports four types: `cmdProbe`, `k8sProbe`, `httpProbe`, `promProbe`. For more details refer |

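To tie the experiment fields together, here is a sketch of an `experiments` entry. The experiment name `pod-delete` and its `TOTAL_CHAOS_DURATION` / `CHAOS_INTERVAL` tunables are used as a familiar example; the image, secret name, mount path and node label are placeholders. The `probe` block is left out on purpose, because its schema depends on the probe type and the Litmus version in use.

```yaml
# Sketch only: one experiment entry in a ChaosEngine spec.
spec:
  experiments:
  - name: pod-delete                    # must match a ChaosExperiment CR
    spec:
      components:
        env:
        # Tunables that override or fill in values from the experiment CR.
        - name: TOTAL_CHAOS_DURATION
          value: "30"
        - name: CHAOS_INTERVAL
          value: "10"
        # Override the default status check behaviour (values in seconds).
        statusCheckTimeouts:
          delay: 2
          timeout: 180
        experimentImage: "litmuschaos/go-runner:latest"   # hypothetical image
        secrets:
        - name: cloud-credentials       # hypothetical secret
          mountPath: /etc/cloud-credentials
        nodeSelector:
          kubernetes.io/hostname: "node-1"   # hypothetical node label
        resources:
          requests:
            cpu: "100m"
            memory: "64Mi"
```

Note that env values are quoted, since each entry is a name/value pair of strings.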
Summary
The ChaosEngine CR is the user-facing CR that binds an application instance to a ChaosExperiment. It defines the run policies and also holds the status of your experiment. This CR lets you customize the experiment according to your needs, since it can override some of the default characteristics/tunables in your experiment CR.
This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.
Resources
Learn More