feat(docs): adding faq, troubleshooting and chaos resources details (#3120)
Signed-off-by: shubham chaudhary <shubham@chaosnative.com>
This commit is contained in:
parent
e621cdef8a
commit
2b34489a39
|
@ -1,3 +1,9 @@
|
|||
# Experiments
|
||||
|
||||
The experiment execution is triggered upon creation of the ChaosEngine resource (various examples of which are provided under the respective experiments). Typically, these ChaosEngines are embedded within the 'steps' of a Litmus Chaos Workflow (see [here](https://litmusdocs-beta.netlify.app/)). However, one may also create the ChaosEngines directly by hand; the chaos-operator reconciles this resource and triggers the experiment execution.
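
As a rough illustration of creating a ChaosEngine by hand, a minimal manifest might look like the sketch below. The engine name, namespace, label, service account, and experiment name are placeholders chosen for illustration and are not taken from this page.

```yaml
# Minimal ChaosEngine sketch; every name below is an illustrative placeholder.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos                  # hypothetical engine name
  namespace: default
spec:
  engineState: "active"              # an active engine is picked up by the chaos-operator
  appinfo:
    appns: "default"                 # namespace of the application under test
    applabel: "app=nginx"            # label_key=label_value of the application under test
    appkind: "deployment"
  chaosServiceAccount: pod-delete-sa # hypothetical service account with chaos permissions
  experiments:
    - name: pod-delete               # assumed to match an installed ChaosExperiment
```

Applying a manifest of this shape with `kubectl apply -f` creates the resource that the chaos-operator reconciles. With the default `annotationCheck` of `true`, the target application is also expected to carry the `litmuschaos.io/chaos: "true"` annotation.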
|
||||
|
||||
Provided below are tables with links to the individual experiment docs for easy navigation.
|
||||
|
||||
## Kubernetes Experiments
|
||||
|
||||
It contains chaos experiments that apply to resources running on the Kubernetes cluster. It comprises the <code>Generic</code>, <code>Kafka</code>, and <code>Cassandra</code> experiments.
|
||||
|
@ -32,7 +38,7 @@ Chaos actions that apply to generic Kubernetes resources are classified into thi
|
|||
</tr>
|
||||
<tr>
|
||||
<td>Pod CPU Hog Exec</td>
|
||||
<td>Consumes CPU resources on the application container</td>
|
||||
<td>Consumes CPU resources on the application container by invoking a utility within the app container base image</td>
|
||||
<td><a href="/litmus/experiments/categories/pods/pod-cpu-hog-exec">pod-cpu-hog-exec</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
|
@ -62,7 +68,7 @@ Chaos actions that apply to generic Kubernetes resources are classified into thi
|
|||
</tr>
|
||||
<tr>
|
||||
<td>Pod Memory Hog Exec</td>
|
||||
<td>Consumes Memory resources on the application container</td>
|
||||
<td>Consumes Memory resources on the application container by invoking a utility within the app container base image</td>
|
||||
<td><a href="/litmus/experiments/categories/pods/pod-memory-hog-exec">pod-memory-hog-exec</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
|
@ -166,7 +172,7 @@ While Chaos Experiments under the Generic category offer the ability to induce c
|
|||
|
||||
<hr/>
|
||||
|
||||
## Cloud Platforms
|
||||
## Cloud Infrastructure
|
||||
|
||||
Chaos experiments that inject chaos into the platform resources of Kubernetes are classified into this category. Because the management of platform resources varies significantly between platforms, Chaos Charts may be maintained separately for each platform (for example, AWS, GCP, Azure, etc.).
|
||||
|
||||
|
|
|
@ -10,7 +10,9 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
Disk pressure and CPU hogs are common and frequent scenarios in Kubernetes applications that can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as "Noisy Neighbour" problems.
|
||||
|
||||
By injecting a rogue process into a target container, we starve the main microservice process (typically pid 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic. In other cases, unrestrained use can cause the node to exhaust its resources, leading to eviction of all pods. This category of chaos experiment therefore helps to build the application's immunity to such stress scenarios.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
|
@ -10,7 +10,9 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
Disk pressure and CPU hogs are common and frequent scenarios in Kubernetes applications that can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as "Noisy Neighbour" problems.
|
||||
|
||||
By injecting a rogue process into a target container, we starve the main microservice process (typically pid 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic. In other cases, unrestrained use can cause the node to exhaust its resources, leading to eviction of all pods. This category of chaos experiment therefore helps to build the application's immunity to such stress scenarios.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
|
@ -8,7 +8,9 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
Disk pressure and CPU hogs are common and frequent scenarios in Kubernetes applications that can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as "Noisy Neighbour" problems.
|
||||
|
||||
Stressing the disk with continuous and heavy IO can, for example, degrade reads and writes issued by other microservices that use the same shared disk. This matters because modern storage solutions for Kubernetes use the concept of storage pools out of which virtual volumes/devices are carved out. Another issue is the amount of scratch space eaten up on a node, which leads to a lack of space for newer containers to get scheduled (Kubernetes too gives up by applying an "eviction" taint like "disk-pressure") and causes a wholesale movement of all pods to other nodes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
|
@ -10,8 +10,10 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
Memory usage within containers is subject to various constraints in Kubernetes. If limits are specified in the container spec, exceeding them can cause termination of the container (due to OOMKill of the primary process, often pid 1), followed by a restart of the container by the kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (BestEffort ones are the first to be targeted). This evaluation extends to all pods running on the node, thereby causing a bigger blast radius.
|
||||
|
||||
This experiment launches a stress process within the target container. This can either cause the primary process in the container to be resource constrained, in cases where limits are enforced, or eat up available system memory on the node, in cases where limits are not specified.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
??? info "Verify the prerequisites"
|
||||
|
|
|
@ -9,7 +9,9 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
Memory usage within containers is subject to various constraints in Kubernetes. If limits are specified in the container spec, exceeding them can cause termination of the container (due to OOMKill of the primary process, often pid 1), followed by a restart of the container by the kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (BestEffort ones are the first to be targeted). This evaluation extends to all pods running on the node, thereby causing a bigger blast radius.
|
||||
|
||||
This experiment launches a stress process within the target container. This can either cause the primary process in the container to be resource constrained, in cases where limits are enforced, or eat up available system memory on the node, in cases where limits are not specified.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
|
@ -9,7 +9,12 @@
|
|||
## Uses
|
||||
|
||||
??? info "View the uses of the experiment"
|
||||
coming soon
|
||||
The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network OR microservice communication across services in different availability zones/regions etc.
|
||||
|
||||
Mitigation (in this case, keeping the timeout, i.e., the access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify that such a degradation is highlighted via notifications/alerts etc., so the admin/SRE has the opportunity to investigate and fix things. Another use of the test is to see the extent of the impact on the end-user, OR on the last point in the app stack, on account of degradation in access to a downstream/dependent microservice, and whether that impact is acceptable OR breaks the system to an unacceptable degree. The experiment provides DESTINATION_IPS or DESTINATION_HOSTS so that you can control the chaos against specific services within or outside the cluster.
|
||||
|
||||
The applications may stall or get corrupted while they wait endlessly for a packet. The experiment limits the impact (blast radius) to only the traffic you want to test by specifying IP addresses or application information. This experiment will help to improve the resilience of your services over time.
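
To make the blast-radius control more concrete, the fragment below sketches how DESTINATION_IPS and DESTINATION_HOSTS might be passed as experiment environment variables in a ChaosEngine. The experiment name, the addresses, the hostname, and the comma-separated value format are assumptions made for illustration; check the experiment's tunables for the exact format it accepts.

```yaml
# Fragment of a ChaosEngine spec; all values are illustrative placeholders.
spec:
  experiments:
    - name: pod-network-latency        # assumed experiment name, not stated on this page
      spec:
        components:
          env:
            # Restrict chaos to traffic towards these IPs (assumed comma-separated)
            - name: DESTINATION_IPS
              value: "10.0.0.12,10.0.0.13"
            # Restrict chaos to traffic towards these hosts (assumed comma-separated)
            - name: DESTINATION_HOSTS
              value: "payments.example.com"
```

Traffic to destinations outside these lists would then be left untouched, which keeps the impact limited to the dependency under test.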
|
||||
|
||||
|
||||
## Prerequisites
|
||||
|
||||
|
|
|
@ -0,0 +1,850 @@
|
|||
# Chaos Engine Specifications
|
||||
|
||||
The ChaosEngine binds an instance of a given app with one or more chaos experiments, defines run characteristics, overrides chaos defaults, and defines the steady-state hypothesis. It is reconciled by the Litmus Chaos Operator.
|
||||
|
||||
This section describes the fields in the ChaosEngine spec and the possible values that can be set for them.
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field Name</th>
|
||||
<th>Description</th>
|
||||
<th>User Guide</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>State Specification</td>
|
||||
<td>It defines the state of the chaosengine</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/engine-state">State Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Application Specification</td>
|
||||
<td>It defines the details of AUT and auxiliary applications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/application-details">Application Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RBAC Specification</td>
|
||||
<td>It defines the chaos-service-account name</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/rbac-details">RBAC Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Runtime Specification</td>
|
||||
<td>It defines the runtime details of the chaosengine</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/runtime-details">Runtime Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Runner Specification</td>
|
||||
<td>It defines the runner pod specifications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/runner-components">Runner Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Experiment Specification</td>
|
||||
<td>It defines the experiment pod specifications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/experiment-components">Experiment Specifications</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
### State Specification
|
||||
|
||||
??? info "View the state specification schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.engineState</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control the state of the chaosengine</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>active</code>, <code>stop</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>active</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>engineState</code> in the spec is a user defined flag to trigger chaos. Setting it to <code>active</code> ensures successful execution of chaos. Patching it with <code>stop</code> aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called <code>engineStatus</code> which is updated by the controller based on actual state of the ChaosEngine.</td>
|
||||
</tr>
|
||||
</table>
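
As a small sketch of the flag described above, the fragment below shows the `engineState` field set to `stop`, which is the value the table says aborts ongoing experiments. Only the field name and its two values come from the table; the rest of the ChaosEngine is omitted.

```yaml
# Fragment of a ChaosEngine spec: abort a running experiment.
spec:
  engineState: "stop"   # switch back to "active" to allow chaos execution again
```

One way to apply this to an existing engine, assuming a hypothetical engine called `nginx-chaos`, is `kubectl patch chaosengine nginx-chaos --type merge --patch '{"spec":{"engineState":"stop"}}'`. The controller then reflects the actual state in the `engineStatus` field of the status.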
|
||||
|
||||
### Application Specification
|
||||
|
||||
??? info "View the application specification schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.appinfo.appns</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify namespace of application under test</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>appns</code> in the spec specifies the namespace of the AUT. Usually provided as a quoted string. It is optional for the infra chaos.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.appinfo.applabel</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify unique label of application under test</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)(pattern: "label_key=label_value")</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>applabel</code> in the spec specifies a unique label of the AUT. Usually provided as a quoted string of pattern key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation <code>litmuschaos.io/chaos: "true"</code>. If, however, the <code>annotationCheck</code> is disabled, then a random application (pod) sharing the specified label is selected for chaos. It is optional for the infra chaos.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.appinfo.appkind</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify resource kind of application under test</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>deployment</code>, <code>statefulset</code>, <code>daemonset</code>, <code>deploymentconfig</code>, <code>rollout</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i> (depends on app type)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>appkind</code> in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on deployments, statefulsets and daemonsets. Application health check routines are dependent on the resource types, in case of some experiments. It is optional for the infra chaos.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.auxiliaryAppInfo</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to the primary application specified in <code>.spec.appinfo</code>. <b>NOTE</b>: If the auxiliary applications are deployed in namespaces other than that of the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods in other namespaces.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)(pattern: "namespace:label_key=label_value").</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>auxiliaryAppInfo</code> in the spec specifies a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in <code>.spec.appinfo</code> in case of pod-level chaos experiments. In case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary.</td>
|
||||
</tr>
|
||||
</table>
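
The application fields above could be combined roughly as in the following fragment. The namespaces and labels are hypothetical; only the field names and the `label_key=label_value` and `namespace:label_key=label_value` patterns are taken from the tables.

```yaml
# Fragment of a ChaosEngine spec; namespaces and labels are placeholders.
spec:
  appinfo:
    appns: "default"            # namespace of the application under test (AUT)
    applabel: "app=nginx"       # pattern: label_key=label_value
    appkind: "deployment"       # one of the kinds listed in the Range row
  # Comma-separated list, each entry following namespace:label_key=label_value
  auxiliaryAppInfo: "default:app=frontend,payments:app=ledger"
```

Because the second auxiliary app in this sketch lives in a different namespace from the AUT, the note above about binding the chaosServiceAccount to a cluster role would apply.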
|
||||
|
||||
### RBAC Specification
|
||||
|
||||
??? info "View the RBAC specification schema"
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.chaosServiceAccount</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify serviceaccount used for chaos experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>chaosServiceAccount</code> in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment is provided in the <code>.spec.definition.permissions</code> field of the respective <b>chaosexperiment</b> CR.</td>
|
||||
</tr>
|
||||
</table>
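
The sketch below shows one possible way to wire a service account to a role and reference it from the ChaosEngine. All names are hypothetical, and the rules in the Role are placeholders only: the actual minimum permissions should be copied from the `.spec.definition.permissions` field of the relevant chaosexperiment CR, as noted above.

```yaml
# Illustrative RBAC objects; names and rules are placeholders, not the real minimum set.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-delete-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-delete-role
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]   # placeholder verbs for illustration
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-delete-rolebinding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-delete-role
subjects:
  - kind: ServiceAccount
    name: pod-delete-sa
    namespace: default
```

The ChaosEngine would then reference the account by name, for example `chaosServiceAccount: pod-delete-sa` under `spec`.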
|
||||
|
||||
### Runtime Specification
|
||||
|
||||
??? info "View the runtime specification schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.annotationCheck</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control annotationChecks on applications as prerequisites for chaos</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>true</code>, <code>false</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>true</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>annotationCheck</code> in the spec controls whether or not the operator checks for the annotation "litmuschaos.io/chaos" to be set against the application under test (AUT). Setting it to <code>true</code> ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to <code>false</code> suppresses this check and proceeds with chaos injection.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.terminationGracePeriodSeconds</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control terminationGracePeriodSeconds for the chaos pods (abort case)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>integer value</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>30</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>terminationGracePeriodSeconds</code> in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain steps that revert the chaos upon abort and continuously watch for termination signals. The terminationGracePeriodSeconds should be set so that the chaos pods get enough time to complete the revert before they are terminated.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.jobCleanUpPolicy</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control cleanup of chaos experiment job post execution of chaos</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>delete</code>, <code>retain</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>delete</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>jobCleanUpPolicy</code> controls whether or not the experiment pods are removed once execution completes. Set to <code>retain</code> for debug purposes (in the absence of standard logging mechanisms).</td>
|
||||
</tr>
|
||||
</table>
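
A hedged sketch of the three runtime fields together is shown below. The field names, ranges, and defaults follow the tables above; quoting `annotationCheck` as a string and the particular grace period chosen are assumptions for illustration.

```yaml
# Fragment of a ChaosEngine spec showing the runtime fields.
spec:
  annotationCheck: "false"            # skip the litmuschaos.io/chaos annotation check (assumed string-typed)
  terminationGracePeriodSeconds: 60   # illustrative value; the documented default is 30
  jobCleanUpPolicy: "retain"          # keep experiment pods for debugging; the default is delete
```

A longer grace period like this would only be worthwhile when the revert steps of an experiment are expected to need more time than the default allows.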
|
||||
|
||||
### Runner Specification
|
||||
|
||||
??? info "View the runner specification schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.image</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify image of ChaosRunner pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i> (refer <i>Notes</i>)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.image</code> allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env <b>CHAOS_RUNNER_IMAGE</b></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.imagePullPolicy</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify imagePullPolicy for the ChaosRunner</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>Always</code>, <code>IfNotPresent</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>IfNotPresent</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.imagePullPolicy</code> allows developers to specify the pull policy for chaos-runner. Set to <code>Always</code> during debug/test.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.imagePullSecrets</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify imagePullSecrets for the ChaosRunner</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []corev1.LocalObjectReference)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.imagePullSecrets</code> allows developers to specify the <code>imagePullSecret</code> name for ChaosRunner. </td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.runnerAnnotations</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Annotations that need to be set on the runner pod that is created</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <i>user-defined</i> (type: map[string]string) </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td> n/a </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.runnerAnnotations</code> allows developers to specify custom annotations for the runner pod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.args</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Specify the args for the ChaosRunner Pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.args</code> allows developers to specify their own debug runner args.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.command</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Specify the commands for the ChaosRunner Pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.components.runner.command</code> allows developers to specify their own debug runner commands.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.configMaps</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Configmaps passed to the chaos runner pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: {name: string, mountPath: string})</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.components.runner.configMaps</code> provides for a means to insert config information into the runner pod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.secrets</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Kubernetes secrets passed to the chaos runner pod.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: {name: string, mountPath: string})</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.components.runner.secrets</code> provides for a means to push secrets (typically project ids, access credentials etc.,) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments. </td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.nodeSelector</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Node selectors for the runner pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>Labels in the form of label key=value</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.components.runner.nodeSelector</code> contains the labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.resources</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Specify the resource requirements for the ChaosRunner pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: corev1.ResourceRequirements)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.components.runner.resources</code> contains the resource requirements for the ChaosRunner Pod, where we can provide resource requests and limits for the pod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.components.runner.tolerations</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Toleration for the runner pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []corev1.Toleration)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.components.runner.tolerations</code> provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.</td>
|
||||
</tr>
|
||||
</table>
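
The runner fields above might be combined as in the following sketch. The image, secret and configmap names, annotation, node label, taint, and resource figures are all hypothetical placeholders; `args` and `command` are omitted because their values depend on the debug image in use.

```yaml
# Fragment of a ChaosEngine spec; every value below is an illustrative placeholder.
spec:
  components:
    runner:
      image: "registry.example.com/chaos-runner:debug"   # hypothetical debug image
      imagePullPolicy: "Always"
      imagePullSecrets:
        - name: regcred                                   # hypothetical pull secret
      runnerAnnotations:
        example.com/owner: "sre-team"
      configMaps:
        - name: runner-config
          mountPath: /etc/runner-config
      secrets:
        - name: cloud-credentials
          mountPath: /etc/cloud
      nodeSelector:
        chaos-role: "runner"                              # written as a key/value map in YAML
      resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
        limits:
          cpu: "250m"
          memory: "128Mi"
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "chaos"
          effect: "NoSchedule"
```

The `imagePullSecrets`, `resources`, and `tolerations` entries follow the standard Kubernetes shapes that the Range rows refer to (`corev1.LocalObjectReference`, `corev1.ResourceRequirements`, `corev1.Toleration`).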
|
||||
|
||||
### Experiment Specification
|
||||
|
||||
??? info "View the experiment specification schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.configMaps</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Configmaps passed to the chaos experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: {name: string, mountPath: string})</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.configMaps</code> provides for a means to insert config information into the experiment. The configmaps definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.secrets</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Kubernetes secrets passed to the chaos experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: {name: string, mountPath: string})</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.secrets</code> provides for a means to push secrets (typically project ids, access credentials etc.,) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.experimentImage</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Override the image of the chaos experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i> string </i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.experimentImage</code> overrides the experiment image for the chaosexperiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.experimentImagePullSecrets</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify imagePullSecrets for the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []corev1.LocalObjectReference)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.experimentImagePullSecrets</code> allows developers to specify the <code>imagePullSecret</code> name for the ChaosExperiment.</td>
|
||||
</tr>
|
||||
</table>
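The two fields above are often used together when the experiment image lives in a private registry. The fragment below is a sketch under that assumption; the registry URL, image tag, experiment name, and the secret name `regcred` are hypothetical.

```yaml
spec:
  experiments:
  - name: pod-delete                  # assumed experiment name
    spec:
      components:
        # overrides the image defined in the ChaosExperiment
        experimentImage: "registry.example.com/chaos/go-runner:custom"
        # list of corev1.LocalObjectReference entries
        experimentImagePullSecrets:
        - name: regcred               # hypothetical image pull secret
```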
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.nodeSelector</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Provide the node selector for the experiment pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Labels in the form of label key=value</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.nodeSelector</code> contains the labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node level chaos.</td>
|
||||
</tr>
|
||||
</table>
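A short sketch of a node selector, assuming a node that carries the hypothetical label `context=chaos`. Note that in the manifest the label is written as a YAML `key: value` pair rather than `key=value`.

```yaml
spec:
  experiments:
  - name: node-cpu-hog                # assumed experiment name
    spec:
      components:
        nodeSelector:
          context: chaos              # hypothetical node label
```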
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.statusCheckTimeouts</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Provides the timeout and delay values for the status checks. Defaults to a 180s timeout with a 2s delay between retries (90 retries)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i> It contains values in the form {delay: int, timeout: int} </i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>delay: 2s and timeout: 180s</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.statusCheckTimeouts</code> overrides the status check timeouts defined inside the chaosexperiment. It contains the timeout and delay in seconds.</td>
|
||||
</tr>
|
||||
</table>
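The defaults above imply 180 / 2 = 90 retries. The sketch below assumes a slower cluster where a longer timeout is wanted; the values and the experiment name are illustrative only. By the same arithmetic, a 300s timeout with a 10s delay gives up to 30 retries.

```yaml
spec:
  experiments:
  - name: pod-delete                  # assumed experiment name
    spec:
      components:
        statusCheckTimeouts:
          delay: 10                   # seconds between two status checks
          timeout: 300                # total seconds before the check gives up
```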
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.resources</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Specify the resource requirements for the ChaosExperiment pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: corev1.ResourceRequirements)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.resources</code> contains the resource requirements for the ChaosExperiment Pod, where we can provide resource requests and limits for the pod.</td>
|
||||
</tr>
|
||||
</table>
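Since the field is a standard `corev1.ResourceRequirements`, it can be sketched like any Kubernetes container resource block. The experiment name and the request and limit values here are arbitrary assumptions, not recommendations.

```yaml
spec:
  experiments:
  - name: pod-delete                  # assumed experiment name
    spec:
      components:
        resources:
          requests:
            cpu: "250m"
            memory: "64Mi"
          limits:
            cpu: "500m"
            memory: "128Mi"
```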
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.experimentAnnotations</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Annotations to be added to the experiment pod that will be created</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <i>user-defined</i> (type: label key=value) </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td> n/a </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.experimentAnnotations</code> allows developers to specify custom annotations for the experiment pod.</td>
|
||||
</tr>
|
||||
</table>
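A minimal sketch of custom annotations on the experiment pod. The annotation key and value are hypothetical; a common use is to opt the pod out of a sidecar injector, but that is only an assumed scenario.

```yaml
spec:
  experiments:
  - name: pod-delete                  # assumed experiment name
    spec:
      components:
        experimentAnnotations:
          example.com/owner: chaos-team   # hypothetical annotation
```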
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiments[].spec.components.tolerations</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Toleration for the experiment pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: []corev1.Toleration)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>experiment[].spec.components.tolerations</code> provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.</td>
|
||||
</tr>
|
||||
</table>
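Because the field is a list of `corev1.Toleration`, a sketch looks the same as a toleration on any pod. The taint key, value, and experiment name below are assumptions for illustration.

```yaml
spec:
  experiments:
  - name: node-drain                  # assumed experiment name
    spec:
      components:
        tolerations:
        - key: "dedicated"            # hypothetical taint key
          operator: "Equal"
          value: "chaos"              # hypothetical taint value
          effect: "NoSchedule"
```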
|
|
@ -0,0 +1,426 @@
|
|||
# Chaos Experiment Specifications
|
||||
|
||||
The ChaosExperiment holds a granular definition of the chaos intent, specified via the image, library, necessary permissions, and low-level chaos parameters (default values).
|
||||
|
||||
This section describes the fields in the ChaosExperiment and the possible values that can be set against the same.
|
||||
|
||||
### Scope Specification
|
||||
|
||||
??? info "View the scope schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.scope</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the scope of the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>Namespaced</code>, <code>Cluster</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i> (depends on experiment type)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.scope</code> specifies the scope of the experiment. It can be <code>Namespaced</code> scope for pod level experiments and <code>Cluster</code> for the experiments having a cluster wide impact.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.permissions</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the minimum permission to run the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: list)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.permissions</code> specifies the minimum permissions required to run the ChaosExperiment. It also helps to estimate the blast radius for the ChaosExperiment.</td>
|
||||
</tr>
|
||||
</table>
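The scope and permissions fields might be combined as in the sketch below. It assumes that each permissions entry follows the Kubernetes RBAC policy rule shape (`apiGroups`, `resources`, `verbs`); the experiment name and the specific rules are illustrative, not the actual permission set of any shipped experiment.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: pod-delete                    # assumed experiment name
spec:
  definition:
    scope: Namespaced                 # pod-level experiment
    permissions:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "list", "delete"]
    - apiGroups: ["litmuschaos.io"]
      resources: ["chaosengines", "chaosexperiments", "chaosresults"]
      verbs: ["get", "list", "create", "patch", "update"]
```

Reading the rules top to bottom gives a quick estimate of the blast radius: here the experiment can delete pods but cannot touch nodes or other workloads.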
|
||||
|
||||
### Component Specification
|
||||
|
||||
??? info "View the component schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.image</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the image to run the ChaosExperiment </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i> (refer Notes)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.image</code> allows developers to specify their experiment images. It is typically set to the Litmus <code>go-runner</code> or the <code>ansible-runner</code>. This feature of the experiment enables BYOC (Bring Your Own Chaos), where developers can implement their own variants of a standard chaos experiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.imagePullPolicy</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag that helps the developers to specify imagePullPolicy for the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>IfNotPresent</code>, <code>Always</code> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>Always</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.imagePullPolicy</code> allows developers to specify the pull policy for the ChaosExperiment image. Set it to <code>Always</code> during debug/test.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.args</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the entrypoint for the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: list of string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.args</code> specifies the entrypoint for the ChaosExperiment. It depends on the language used in the experiment. For litmus-go, all experiments are built into a single binary, and the <code>-name</code> flag passed in <code>.spec.definition.args</code> selects the experiment to run (<code>-name (exp-name)</code>).</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.command</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the shell on which the ChaosExperiment will execute</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: list of string).</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>/bin/bash</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.command</code> specifies the shell used to run the experiment. <code>/bin/bash</code> is the most commonly used shell.</td>
|
||||
</tr>
|
||||
</table>
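Taken together, the four component fields above might look like the following sketch for a litmus-go based experiment. The image tag, the experiment name, and the path of the experiment binary are assumptions; only the `-name` flag convention comes from the notes above.

```yaml
spec:
  definition:
    image: "litmuschaos/go-runner:latest"   # assumed image and tag
    imagePullPolicy: Always
    command:
    - /bin/bash                             # shell that runs the experiment
    args:
    - -c
    - ./experiments -name pod-delete        # assumed binary path; -name selects the experiment
```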
|
||||
|
||||
### Experiment Tunables Specification
|
||||
|
||||
??? info "View the experiment tunables"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.env</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify env used for ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: {name: string, value: string})</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td> The <code>.spec.definition.env</code> specifies the array of tunables passed to the experiment pods as environment variables. It is used to manage the experiment execution. We can set the default values for all the variables (tunable) here which can be overridden by ChaosEngine from <code>.spec.experiments[].spec.components.env</code> if required. To know about the variables that need to be overridden check the list of "mandatory" & "optional" env for an experiment as provided within the respective experiment documentation.</td>
|
||||
</tr>
|
||||
</table>
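The default-and-override relationship described above can be sketched with two fragments. The tunable name `TOTAL_CHAOS_DURATION`, its values, and the resource names are used purely for illustration; consult the individual experiment docs for the real list of tunables.

```yaml
# ChaosExperiment: declares the default value of a tunable
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosExperiment
metadata:
  name: pod-delete                    # assumed experiment name
spec:
  definition:
    env:
    - name: TOTAL_CHAOS_DURATION      # illustrative tunable
      value: "15"
---
# ChaosEngine: overrides that default for one run
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx                  # hypothetical engine name
spec:
  experiments:
  - name: pod-delete
    spec:
      components:
        env:
        - name: TOTAL_CHAOS_DURATION
          value: "30"
```

Values are written as strings because they are passed to the experiment pod as environment variables.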
|
||||
|
||||
### Configuration Specification
|
||||
|
||||
??? info "View the configuration schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.securityContext.containerSecurityContext.privileged</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the security context for the ChaosExperiment pod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>true, false</i> (type:bool)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.securityContext.containerSecurityContext.privileged</code> specifies the privileged setting of the security context applied to the experiment container.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.labels</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the label for the ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: map[string]string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.labels</code> allows developers to specify the ChaosPod labels for an experiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.securityContext.podSecurityContext</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify security context for ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: corev1.PodSecurityContext)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td> The <code>.spec.definition.securityContext.podSecurityContext</code> allows the developers to specify the security context for the ChaosPod which applies to all containers inside the Pod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.configMaps</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the configmap for ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td> The <code>.spec.definition.configMaps</code> allows the developers to mount the ConfigMap volume into the experiment pod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.secrets</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the secrets for ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.secrets</code> specifies the secret data to be passed to the ChaosPod. Secrets typically contain confidential information such as credentials.</td>
|
||||
</tr>
|
||||
</table>
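The tables above do not spell out the shape of the configMaps and secrets entries. The sketch below assumes the same `{name, mountPath}` shape that the ChaosEngine uses for its components, which may not hold for every version; all names and paths are hypothetical.

```yaml
spec:
  definition:
    configMaps:
    - name: experiment-config         # hypothetical ConfigMap
      mountPath: /mnt/config          # assumed mount path in the ChaosPod
    secrets:
    - name: cloud-credentials         # hypothetical Secret
      mountPath: /mnt/secret          # assumed mount path in the ChaosPod
```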
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.experimentAnnotations</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the custom annotation to the ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: map[string]string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.definition.experimentAnnotations</code> allows the developer to specify custom annotations for the ChaosPod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.hostFileVolumes</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the host file volumes to the ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: map[string]string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td> The <code>.spec.definition.hostFileVolumes</code> allows the developer to specify the host file volumes to the ChaosPod.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.definition.hostPID</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the host PID for the ChaosPod</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>true, false</i> (type:bool)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td> The <code>.spec.definition.hostPID</code> allows the developer to specify the host PID for the ChaosPod. </td>
|
||||
</tr>
|
||||
</table>
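As a sketch of how several of the configuration fields above could be combined for a node-level experiment, assuming it needs host process visibility and a privileged container. The label, annotation, and user ID values are illustrative; `hostFileVolumes` is left out because its entry shape is not described on this page.

```yaml
spec:
  definition:
    labels:
      name: container-kill            # hypothetical ChaosPod label
    experimentAnnotations:
      example.com/owner: chaos-team   # hypothetical annotation
    securityContext:
      podSecurityContext:
        runAsUser: 0                  # applies to all containers in the ChaosPod
      containerSecurityContext:
        privileged: true              # applies to the experiment container only
    hostPID: true                     # share the host PID namespace
```

Privileged mode and `hostPID` widen the blast radius considerably, so they are best limited to experiments that genuinely need them.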
|
|
@ -1,75 +0,0 @@
|
|||
# Chaos Engine Tunables
|
||||
|
||||
The ChaosEngine is the main user-facing chaos custom resource with a namespace scope and is designed to hold information around how the chaos experiments are executed. It connects an application instance with one or more chaos experiments, while allowing the users to specify run-level details (override experiment defaults, provide new environment variables and volumes, options to delete or retain experiment pods, etc.). This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.
|
||||
|
||||
This section describes the fields in the ChaosEngine spec and the possible values that can be set against the same.
|
||||
|
||||
## ChaosEngine Spec Specification
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field Name</th>
|
||||
<th>Description</th>
|
||||
<th>User Guide</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>State Specification</td>
|
||||
<td>It defines the state of the chaosengine</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/engine-state">State Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Application Specification</td>
|
||||
<td>It defines the details of AUT and auxiliary applications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/application-details">Application Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>RBAC Specification</td>
|
||||
<td>It defines the chaos-service-account name</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/rbac-details">RBAC Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Runtime Specification</td>
|
||||
<td>It defines the runtime details of the chaosengine</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/runtime-details">Runtime Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Runner Specification</td>
|
||||
<td>It defines the runner pod specifications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/runner-components">Runner Specifications</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Experiment Specification</td>
|
||||
<td>It defines the experiment pod specifications</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/experiment-components">Experiment Specifications</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## Probes
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Probe Name</th>
|
||||
<th>Description</th>
|
||||
<th>User Guide</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Command Probe</td>
|
||||
<td>It defines the command probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/cmdProbe">Command Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>HTTP Probe</td>
|
||||
<td>It defines the http probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/httpProbe">HTTP Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>K8S Probe</td>
|
||||
<td>It defines the k8s probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/k8sProbe">K8S Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Prometheus Probe</td>
|
||||
<td>It defines the prometheus probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/promProbe">Prometheus Probe</a></td>
|
||||
</tr>
|
||||
</table>
|
|
@ -0,0 +1,332 @@
|
|||
# Chaos Result Specifications
|
||||
|
||||
The ChaosResult holds the engine reference, experiment state, verdict (on completion), salient application/result attributes, and sources for metrics collection.
|
||||
|
||||
This section describes the fields in the ChaosResult and the possible values that can be set against the same.
|
||||
|
||||
### Component Details
|
||||
|
||||
??? info "View the components schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.engine</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the ChaosEngine name for the experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.engine</code> holds the engine name for the current course of the experiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.experiment</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the ChaosExperiment name which induces chaos.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.spec.experiment</code> holds the ChaosExperiment name for the current course of the experiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
### Status Details
|
||||
|
||||
??? info "View the status schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.experimentStatus.failstep</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the failure step of the ChaosExperiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>n/a</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.experimentStatus.failstep</code> shows the step at which the experiment failed. It helps in faster debugging of failures in the experiment execution.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.experimentStatus.phase</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the current phase of the experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Awaited,Running,Completed,Aborted</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.experimentStatus.phase</code> shows the current phase of the experiment. It gets updated as the experiment proceeds. If the experiment is aborted, the phase will be <code>Aborted</code>.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.experimentStatus.probesuccesspercentage</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the probe success percentage</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>0 to 100</i> (type: int)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.experimentStatus.probesuccesspercentage</code> shows the probe success percentage, which is the ratio of successful probe checks to total probes, expressed as a percentage.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.experimentStatus.verdict</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the verdict of the experiment.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Awaited,Pass,Fail,Stopped</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.experimentStatus.verdict</code> shows the verdict of the experiment. It is <code>Awaited</code> while the experiment is in progress and ends with <code>Pass</code> or <code>Fail</code> according to the experiment result.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.history.passedRuns</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>It contains cumulative passed run count</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>any non-negative integer</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.history.passedRuns</code> contains cumulative passed run counts for a specific ChaosResult.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.history.failedRuns</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>It contains cumulative failed run count</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>any non-negative integer</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.history.failedRuns</code> contains cumulative failed run counts for a specific ChaosResult.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.history.stoppedRuns</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>It contains cumulative stopped run count</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>any non-negative integer</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.history.stoppedRuns</code> contains cumulative stopped run counts for a specific ChaosResult.</td>
|
||||
</tr>
|
||||
</table>
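To make the status fields above concrete, the fragment below sketches what a ChaosResult could look like after one successful run. It is a hypothetical illustration, not captured output: the resource names are invented, the values are made up, and the key spellings simply follow the tables above, so the exact casing should be verified against a real ChaosResult on your cluster.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosResult
metadata:
  name: engine-nginx-pod-delete       # hypothetical result name
spec:
  engine: engine-nginx                # hypothetical ChaosEngine name
  experiment: pod-delete              # hypothetical ChaosExperiment name
status:
  experimentStatus:
    failstep: N/A                     # no failure step, as the run passed
    phase: Completed
    probesuccesspercentage: 100       # e.g. 2 of 2 probes passed
    verdict: Pass
  history:
    passedRuns: 1
    failedRuns: 0
    stoppedRuns: 0
```

With the same arithmetic, a run in which 3 of 4 probes pass would report a probe success percentage of 75.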
|
||||
|
||||
### Probe Details
|
||||
|
||||
??? info "View the probe schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.probestatus.name</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the name of probe used in the experiment</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>n/a</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.probestatus.name</code> shows the name of the probe used in the experiment.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.probestatus.status.continuous</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the result of probe in continuous mode</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Awaited,Passed,Better Luck Next Time</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.probestatus.status.continuous</code> helps to get the result of the probe in the continuous mode. The httpProbe is better used in the Continuous mode.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.probestatus.status.postchaos</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the probe result post chaos</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Awaited,Passed,Better Luck Next Time</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.probestatus.status.postchaos</code> shows the result of probe setup in EOT mode executed at the End of Test as a post-chaos check. </td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.probestatus.status.prechaos</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the probe result pre chaos</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>Awaited,Passed,Better Luck Next Time</i> (type:string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.probestatus.status.prechaos</code> shows the result of probe setup in SOT mode executed at the Start of Test as a pre-chaos check.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.status.probestatus.type</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to show the type of probe used</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>HTTPProbe,K8sProbe,CmdProbe</i> (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.status.probestatus.type</code> shows the type of probe used.</td>
|
||||
</tr>
|
||||
</table>
|
|
@ -0,0 +1,268 @@
|
|||
# Chaos Scheduler Specifications
|
||||
|
||||
The ChaosScheduler holds the attributes for scheduled execution (run now, once at a given timestamp, or repeatedly at an interval between a start and an end timestamp). It embeds the ChaosEngine as a template.
|
||||
|
||||
This section describes the fields in the ChaosScheduler and the possible values that can be set against the same.
|
||||
|
||||
### Schedule NOW
|
||||
|
||||
??? info "View the schedule now schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.now</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control the type of scheduling</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>true</code>, <code>false</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>n/a</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>now</code> in the <code>spec.schedule</code> ensures immediate creation of chaosengine, i.e., injection of chaos.</td>
|
||||
</tr>
|
||||
</table>
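A minimal sketch of an immediate schedule follows. The `engineTemplateSpec` key is the assumed name of the field that embeds the ChaosEngine template; the resource names, service account, and experiment are likewise hypothetical.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx                # hypothetical schedule name
spec:
  schedule:
    now: true                         # create the chaosengine immediately
  engineTemplateSpec:                 # assumed field holding the ChaosEngine spec
    engineState: "active"
    chaosServiceAccount: litmus-admin # assumed service account
    experiments:
    - name: pod-delete                # assumed experiment name
```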
|
||||
|
||||
### Schedule Once
|
||||
|
||||
??? info "View the schedule once schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.once.executionTime</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify execution timestamp at which chaos is injected, when the policy is <code>once</code>. The chaosengine is created exactly at this timestamp.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: UTC Timeformat)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td><code>.spec.schedule.once</code> refers to a single-instance execution of chaos at a particular timestamp specified by <code>.spec.schedule.once.executionTime</code></td>
|
||||
</tr>
|
||||
</table>
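For a one-time schedule, the fragment below sketches the `once` policy. The timestamp is an arbitrary example written in UTC (the trailing `Z`), and the format shown is an assumption based on the "UTC Timeformat" range above.

```yaml
spec:
  schedule:
    once:
      executionTime: "2030-01-01T05:47:00Z"   # arbitrary UTC timestamp
```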
|
||||
|
||||
### Schedule Repeat
|
||||
|
||||
??? info "View the schedule repeat schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.repeat.timeRange.startTime</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify start timestamp of the range within which chaos is injected, when the policy is <code>repeat</code>. The chaosengine is not created before this timestamp.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: UTC Timeformat)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>When <code>startTime</code> is specified against the policy <code>repeat</code>, the ChaosEngine will not be created before this time, regardless of when the schedule itself was created.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.repeat.timeRange.endTime</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify end timestamp of the range within which chaos is injected, when the policy is <code>repeat</code>. The chaosengine is not created after this timestamp.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: UTC Timeformat)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>When <code>endTime</code> is specified against the policy <code>repeat</code>, ChaosEngine will not be formed after this time.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.repeat.properties.minChaosInterval</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the minimum interval between two chaosengines to be formed. </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)(pattern: "{number}m", "{number}h").</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>minChaosInterval</code> in the spec specifies the minimum duration that must elapse between the creation of two successive chaosengines while the schedule repeats.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.repeat.workDays.includedDays</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the days on which chaos is allowed to take place</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>user-defined</i> (type: string)(pattern: [{day_name},{day_name}...]).</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>includedDays</code> in the spec specifies a comma-separated list of the days of the week on which chaos is allowed to take place. Each {day_name} is given as the first three letters of the day's name, such as <code>Mon</code>, <code>Tue</code>, etc.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.schedule.repeat.workHours.includedHours</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to specify the hours during which chaos is allowed to take place</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>{hour_number} will range from 0 to 23</i> (type: string)(pattern: {hour_number}-{hour_number}).</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>includedHours</code> in the spec specifies the range of hours of the day during which chaos is allowed to take place. The 24-hour format is used.</td>
|
||||
</tr>
|
||||
</table>
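
The repeat fields described above combine into a single `repeat` block. The fragment below is a sketch with illustrative values only (timestamps, interval, days, and hours are arbitrary examples that follow the patterns given in the tables).

```yaml
spec:
  schedule:
    repeat:
      timeRange:
        startTime: "2020-05-12T05:47:00Z"   # no chaosengine is created before this time
        endTime: "2020-09-13T02:58:00Z"     # no chaosengine is created after this time
      properties:
        minChaosInterval: "2m"              # pattern "{number}m" or "{number}h"
      workDays:
        includedDays: "Mon,Tue,Wed"         # first three letters of each day name
      workHours:
        includedHours: "0-12"               # 24-hour format, {hour_number}-{hour_number}
```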
|
||||
|
||||
## Engine Specification
|
||||
|
||||
??? info "View the engine details"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.engineTemplateSpec</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the spec of the chaosengine to be created</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><i>n/a</i></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>engineTemplateSpec</code> is the ChaosEngineSpec of ChaosEngine that is to be formed.</td>
|
||||
</tr>
|
||||
</table>
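
Since `engineTemplateSpec` carries a ChaosEngine spec, a schedule might embed one roughly as shown below. This is a sketch under assumptions: the application labels, service account name, and experiment name are hypothetical placeholders, and the exact set of fields depends on the ChaosEngine being templated.

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx                  # hypothetical name
spec:
  schedule:
    now: true
  engineTemplateSpec:                   # same shape as a ChaosEngine .spec
    engineState: "active"
    appinfo:
      appns: "default"                  # hypothetical target namespace
      applabel: "app=nginx"             # hypothetical target label
      appkind: "deployment"
    chaosServiceAccount: pod-delete-sa  # hypothetical service account
    experiments:
      - name: pod-delete                # hypothetical experiment choice
```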
|
||||
|
||||
## State Specification
|
||||
|
||||
??? info "View the state schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.spec.scheduleState</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to control the chaosschedule state</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>active</code>, <code>halt</code>, <code>complete</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Default</th>
|
||||
<td><code>active</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>scheduleState</code> is the current state of the ChaosSchedule. If the schedule is running, its state is <code>active</code>; if the schedule is halted, its state is <code>halt</code>; and if the schedule is completed, its state is <code>complete</code>.</td>
|
||||
</tr>
|
||||
</table>
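
For instance, a schedule could be marked as halted with a fragment like the one below. This is a minimal sketch that assumes the field is edited directly on the resource; the other spec fields are omitted.

```yaml
spec:
  scheduleState: halt   # one of: active (default), halt, complete
```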
|
|
@ -0,0 +1,36 @@
|
|||
# Chaos Resources
|
||||
|
||||
At the heart of the Litmus Platform are the chaos custom resources. This section consists of the specification (details of each field within the .spec & .status of the resources) as well as standard examples for tuning the supported parameters.
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Chaos Resource Name</th>
|
||||
<th>Description</th>
|
||||
<th>User Guide</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ChaosEngine</td>
|
||||
<td>Contains the ChaosEngine specifications user-guide</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-engine/contents/">ChaosEngine</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ChaosExperiment</td>
|
||||
<td>Contains the ChaosExperiment specifications user-guide</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-experiment/contents/">ChaosExperiment</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ChaosResult</td>
|
||||
<td>Contains the ChaosResult specifications user-guide</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-result/contents/">ChaosResult</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>ChaosScheduler</td>
|
||||
<td>Contains the ChaosScheduler specifications user-guide</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/chaos-scheduler/contents/">ChaosScheduler</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Probes</td>
|
||||
<td>Contains the Probes specifications user-guide</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/contents/">Probes</a></td>
|
||||
</tr>
|
||||
</table>
|
|
@ -0,0 +1,863 @@
|
|||
# Probes Specifications
|
||||
|
||||
Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks based on the mode they are defined in & factor their success as necessary conditions in determining the verdict of the experiment (along with the standard “in-built” checks).
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Probe Name</th>
|
||||
<th>Description</th>
|
||||
<th>User Guide</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Command Probe</td>
|
||||
<td>It defines the command probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/cmdProbe">Command Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>HTTP Probe</td>
|
||||
<td>It defines the http probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/httpProbe">HTTP Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>K8S Probe</td>
|
||||
<td>It defines the k8s probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/k8sProbe">K8S Probe</a></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Prometheus Probe</td>
|
||||
<td>It defines the prometheus probes</td>
|
||||
<td><a href="/litmus/experiments/chaos-resources/probes/promProbe">Prometheus Probe</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
### Basic Details
|
||||
|
||||
??? info "View the basic schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.name</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the name of the probe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a (type: string)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.name</code> holds the name of the probe. It can be set based on the use case.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.type</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the type of the probe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>httpProbe</code>, <code>k8sProbe</code>, <code>cmdProbe</code>, <code>promProbe</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.type</code> supports four types of probes. It can be one of <code>httpProbe</code>, <code>k8sProbe</code>, <code>cmdProbe</code>, <code>promProbe</code>.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.mode</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the mode of the probe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>SOT</code>, <code>EOT</code>, <code>Edge</code>, <code>Continuous</code>, <code>OnChaos</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.mode</code> supports five modes of probes. It can be one of <code>SOT</code>, <code>EOT</code>, <code>Edge</code>, <code>Continuous</code>, <code>OnChaos</code>.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.data</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the data for the <code>create</code> operation of the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.data</code> contains the manifest/data for the resource that needs to be created. It is supported for the <code>create</code> operation of the k8sProbe only.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
### Command Probe
|
||||
|
||||
??? info "View the command probe schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.cmdProbe/inputs.command</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the command for the cmdProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.cmdProbe/inputs.command</code> contains the shell command, which should be run as part of cmdProbe</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.cmdProbe/inputs.source</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the source for the cmdProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>inline</code>, <code>any source docker image</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.cmdProbe/inputs.source</code> supports the <code>inline</code> value when the command can be run from within the experiment image. Otherwise, provide the source image, which is used to launch an external pod where the command execution is carried out.</td>
|
||||
</tr>
|
||||
</table>
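
Putting the command probe fields together, a probe entry might be sketched as below. The probe name, command, and expected value are hypothetical, the `comparator` block uses the fields from the Comparator section, and the `runProperties` values are arbitrary.

```yaml
probe:
  - name: "check-database-integrity"   # hypothetical probe name
    type: "cmdProbe"
    mode: "Edge"
    cmdProbe/inputs:
      command: "echo healthy"          # hypothetical shell command
      source: "inline"                 # run from within the experiment image
      comparator:
        type: "string"
        criteria: "contains"
        value: "healthy"               # hypothetical expected output
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```

To run the command in an external pod instead, `source` would name an image rather than `inline`.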
|
||||
|
||||
### HTTP Probe
|
||||
|
||||
??? info "View the http probe schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.url</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the URL for the httpProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.url</code> contains the URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.insecureSkipVerify</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to skip certificate checks for the httpProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>true</code>, <code>false</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.insecureSkipVerify</code> controls whether certificate checks are skipped.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.responseTimeout</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the response timeout for the httpProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.responseTimeout</code> contains the response timeout for the http Get/Post request.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.get.criteria</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the criteria for the http get request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>==</code>, <code>!=</code>, <code>oneOf</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.get.criteria</code> contains the criteria used to match the http get request's response code against the expected responseCode, which needs to be fulfilled as part of the httpProbe run.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.get.responseCode</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the expected response code for the get request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> HTTP_RESPONSE_CODE</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.get.responseCode</code> contains the expected response code for the http get request as part of httpProbe run</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.post.contentType</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the content type of the post request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.post.contentType</code> contains the content type of the http body data, which needs to be passed for the http post request.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.post.body</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the body of the http post request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.post.body</code> contains the http body, which is required for the http post request. It is used for the simple http body. If the http body is complex then use <code>.httpProbe/inputs.method.post.bodyPath</code> field.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.post.bodyPath</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the path of the http body, required for the http post request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.post.bodyPath</code> is used for complex POST requests in which the body spans multiple lines. It provides the path to a file containing the body. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.post.criteria</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the criteria for the http post request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>==</code>, <code>!=</code>, <code>oneOf</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.post.criteria</code> contains the criteria used to match the http post request's response code against the expected responseCode, which needs to be fulfilled as part of the httpProbe run.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.httpProbe/inputs.method.post.responseCode</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the expected response code for the post request</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> HTTP_RESPONSE_CODE</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.httpProbe/inputs.method.post.responseCode</code> contains the expected response code for the http post request as part of httpProbe run</td>
|
||||
</tr>
|
||||
</table>
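
The GET and POST variants of the HTTP probe can be sketched as two probe entries. Everything concrete below (probe names, URLs, body, timeout and run property values) is a hypothetical placeholder; only the field names come from the tables above.

```yaml
probe:
  - name: "check-frontend-get"              # hypothetical probe name
    type: "httpProbe"
    mode: "Continuous"
    httpProbe/inputs:
      url: "http://frontend.default.svc:80" # hypothetical URL
      insecureSkipVerify: false
      responseTimeout: 1000                 # illustrative value
      method:
        get:
          criteria: "=="
          responseCode: "200"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
      probePollingInterval: 2
  - name: "check-frontend-post"             # hypothetical probe name
    type: "httpProbe"
    mode: "EOT"
    httpProbe/inputs:
      url: "http://frontend.default.svc:80/api" # hypothetical URL
      method:
        post:
          contentType: "application/json"
          body: '{"name": "litmus"}'        # simple body; use bodyPath for multi-line bodies
          criteria: "=="
          responseCode: "200"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```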
|
||||
|
||||
### K8S Probe
|
||||
|
||||
??? info "View the k8s probe schema"
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.group</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the group of the kubernetes resource for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.group</code> contains group of the kubernetes resource on which k8sProbe performs the specified operation</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.version</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the apiVersion of the kubernetes resource for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.version</code> contains apiVersion of the kubernetes resource on which k8sProbe performs the specified operation</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.resource</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the kubernetes resource name for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.resource</code> contains the kubernetes resource name on which k8sProbe performs the specified operation</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.namespace</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the namespace of the kubernetes resource for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.namespace</code> contains namespace of the kubernetes resource on which k8sProbe performs the specified operation</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.fieldSelector</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the fieldSelectors of the kubernetes resource for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.fieldSelector</code> contains the fieldSelector used to derive the kubernetes resource on which the k8sProbe performs the specified operation.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.labelSelector</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the labelSelectors of the kubernetes resource for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.labelSelector</code> contains the labelSelector used to derive the kubernetes resource on which the k8sProbe performs the specified operation.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.k8sProbe/inputs.operation</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the operation type for the k8sProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>create</code>, <code>delete</code>, <code>present</code>, <code>absent</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.k8sProbe/inputs.operation</code> contains the operation to be applied on the kubernetes resource as part of the k8sProbe. It supports four types of operations. It can be one of <code>create</code>, <code>delete</code>, <code>present</code>, <code>absent</code>.</td>
|
||||
</tr>
|
||||
</table>
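
A k8sProbe that checks for the presence of a resource might be written roughly as follows. This is a sketch: the probe name, the selectors, and the choice of pods in the `default` namespace are hypothetical, and the empty `group` assumes a core-API resource.

```yaml
probe:
  - name: "check-app-pod-present"     # hypothetical probe name
    type: "k8sProbe"
    mode: "EOT"
    k8sProbe/inputs:
      group: ""                       # assumption: core API group for pods
      version: "v1"
      resource: "pods"
      namespace: "default"
      fieldSelector: "status.phase=Running"
      labelSelector: "app=nginx"      # hypothetical label
      operation: "present"            # one of: create, delete, present, absent
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```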
|
||||
|
||||
### Prometheus Probe
|
||||
|
||||
??? info "View the prometheus probe schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.promProbe/inputs.endpoint</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the prometheus endpoints for the promProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.promProbe/inputs.endpoint</code> contains the prometheus endpoints</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.promProbe/inputs.query</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the promql query for the promProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.promProbe/inputs.query</code> contains the promql query that is run against the given prometheus endpoint to extract the desired prometheus metrics.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.promProbe/inputs.queryPath</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the path of the promql query for the promProbe</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.promProbe/inputs.queryPath</code> is used for complex queries that span multiple lines. It provides the path to a file containing the query. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.</td>
|
||||
</tr>
|
||||
</table>
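
A Prometheus probe entry could look like the sketch below. The probe name, endpoint, query, and threshold are hypothetical, and the `comparator` block uses the fields from the Comparator section.

```yaml
probe:
  - name: "check-qps"                                # hypothetical probe name
    type: "promProbe"
    mode: "Edge"
    promProbe/inputs:
      endpoint: "http://prometheus.monitoring:9090"  # hypothetical endpoint
      query: "sum(rate(http_requests_total[1m]))"    # hypothetical promql query
      comparator:
        criteria: ">="
        value: "100"                                 # hypothetical threshold
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```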
|
||||
|
||||
### Runproperties
|
||||
|
||||
??? info "View the run-properties schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.probeTimeout</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the timeout for the probes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.probeTimeout</code> represents the time limit for the probe to execute the specified check and return the expected data</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.retry</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the retry count for the probes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.retry</code> contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.interval</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the interval for the probes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.interval</code> contains the interval for which the probe waits between subsequent retries.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.probePollingInterval</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the polling interval for the probes (applicable for <code>Continuous</code> mode only)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.probePollingInterval</code> contains the time interval for which a continuous probe should sleep after each iteration.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.initialDelaySeconds</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the initial delay interval for the probes</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>n/a {type: integer}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.initialDelaySeconds</code> represents the initial waiting time interval for the probes.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>.runProperties.stopOnFailure</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to stop or continue the experiment on probe failure</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Optional</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td><code>true</code>, <code>false</code> (default: <code>false</code>) {type: boolean}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>.runProperties.stopOnFailure</code> can be set to <code>true</code> to stop the experiment execution after a probe fails, or to <code>false</code> to let it continue.</td>
|
||||
</tr>
|
||||
</table>
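
Taken together, the run properties of a probe might be set as in the fragment below. All numeric values are arbitrary examples; the tables above do not state units, so none are assumed here.

```yaml
runProperties:
  probeTimeout: 5           # time limit for a single check
  retry: 1                  # re-runs allowed before the probe is marked failed
  interval: 5               # wait between subsequent retries
  probePollingInterval: 2   # Continuous mode only: sleep after each iteration
  initialDelaySeconds: 10   # initial wait before the probe starts
  stopOnFailure: false      # continue the experiment even if the probe fails
```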
|
||||
|
||||
### Comparator
|
||||
|
||||
??? info "View the comparator schema"
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>type</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the type of the data used for comparison</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> <code>string</code>, <code>int</code>, <code>float</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>type</code> contains the type of the data to be compared as part of the comparison operation.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>criteria</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the criteria for the comparison</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td>It supports {&gt;=, &lt;=, ==, &gt;, &lt;, !=, oneOf, between} for the int and float types, and {equal, notEqual, contains, matches, notMatches, oneOf} for the string type.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>criteria</code> contains the criteria of the comparison, which must be fulfilled as part of the comparison operation.</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<td><code>value</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Description</th>
|
||||
<td>Flag to hold the value for the comparison</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<td>Mandatory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Range</th>
|
||||
<td> n/a {type: string}</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Notes</th>
|
||||
<td>The <code>value</code> contains the value of the comparison, which should satisfy the given criteria as part of the comparison operation.</td>
|
||||
</tr>
|
||||
</table>
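To make the three comparator fields concrete, here is a minimal sketch that writes a command probe fragment to a file. The probe name and the command are hypothetical, and the fragment leaves out other probe settings (such as run properties), so treat it as an illustration of the comparator block rather than a complete probe.

```shell
# Write an illustrative cmdProbe fragment showing the comparator fields.
cat > cmd-probe-comparator.yaml <<'EOF'
probe:
  - name: "check-ready-replicas"      # hypothetical probe name
    type: "cmdProbe"
    mode: "Edge"
    cmdProbe/inputs:
      # hypothetical command; it should print the single value to compare
      command: "kubectl get deploy nginx -n default -o jsonpath='{.status.readyReplicas}'"
      comparator:
        type: "int"       # data type of the command output
        criteria: ">="    # comparison to apply
        value: "2"        # expected value, always given as a string
EOF
```

Here the probe would pass when the command output, read as an integer, is greater than or equal to 2.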
|
|
@ -0,0 +1,190 @@
|
|||
# The What, Why & How of Litmus
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?](#node-memory-hog-experiments-pod-oom-killed-even-before-the-kubelet-sees-the-memory-stress)
|
||||
|
||||
1. [Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?](#pod-network-corruption-and-pod-network-loss-both-experiments-force-network-packet-loss-is-it-worthwhile-trying-out-both-experiments-in-a-scheduled-chaos-test)
|
||||
|
||||
1. [How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?](#how-is-the-packet-loss-achieved-in-pod-network-loss-and-corruption-experiments-what-are-the-internals-of-it)
|
||||
|
||||
1. [What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?](#whats-the-difference-between-pod-memorycpu-hog-vs-pod-memorycpu-hog-exec)
|
||||
|
||||
1. [What are the typical probes used for pod-network related experiments?](#what-are-the-typical-probes-used-for-pod-network-related-experiments)
|
||||
|
||||
1. [Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?](#litmus-provides-multiple-libs-to-run-some-chaos-experiments-like-stress-chaos-and-network-chaos-so-which-library-should-be-preferred-to-use)
|
||||
|
||||
1. [How to run chaos experiment programatically using apis?](#how-to-run-chaos-experiment-programatically-using-apis)
|
||||
|
||||
1. [Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?](#kubernetes-by-default-has-built-in-features-like-replicasetdeployment-to-prevent-service-unavailability-continuous-curl-from-the-httpprobe-on-litmus-should-not-fail-in-case-of-container-kill-pod-delete-and-oom-due-to-pod-memory-hog-then-why-do-we-need-cpu-io-and-network-related-chaos-experiments)
|
||||
|
||||
1. [The experiment is not targeting all pods with the given label, it just selects only one pod by default](#the-experiment-is-not-targeting-all-pods-with-the-given-label-it-just-selects-only-one-pod-by-default)
|
||||
|
||||
1. [Do we have a way to see what pods are targeted when users use percentages?](#do-we-have-a-way-to-see-what-pods-are-targeted-when-users-use-percentages)
|
||||
|
||||
1. [What is the function of spec.definition.scope of a ChaosExperiment CR?](#what-is-the-function-of-specdefinitionscope-of-a-chaosexperiment-cr)
|
||||
|
||||
1. [In Pod network latency - I have pod A talking to Pod B over Service B. and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?](#in-pod-network-latency-i-have-pod-a-talking-to-pod-b-over-service-b-and-i-want-to-introduce-latency-between-pod-a-and-service-b-what-would-go-into-specappinfo-section-pod-a-namespace-label-selector-and-kind-what-will-go-into-destination_ip-and-destination_host-service-b-details-what-are-the-target_pods)
|
||||
|
||||
1. [How to check the NETWORK_INTERFACE and SOCKET_PATH variable?](#how-to-check-the-network_interface-and-socket_path-variable)
|
||||
|
||||
1. [What are the different ways to target the pods and nodes for chaos?](#what-are-the-different-ways-to-target-the-pods-and-nodes-for-chaos)
|
||||
|
||||
1. [Does the pod affected percentage select the random set of pods from the total pods under chaos?](#does-the-pod-affected-percentage-select-the-random-set-of-pods-from-the-total-pods-under-chaos)
|
||||
|
||||
1. [How to extract the chaos start time and end time?](#how-to-extract-the-chaos-start-time-and-end-time)
|
||||
|
||||
1. [How do we check the MTTR (Mean time to recovery) for an application post chaos?](#how-do-we-check-the-mttr-mean-time-to-recovery-for-an-application-post-chaos)
|
||||
|
||||
1. [What is the difference between Ramp Time and Chaos Interval?](#what-is-the-difference-between-ramp-time-and-chaos-interval)
|
||||
|
||||
1. [When I’m executing an experiment the experiment's pod failed with the exec format error](#when-im-executing-an-experiment-the-experiments-pod-failed-with-the-exec-format-error)
|
||||
|
||||
<hr>
|
||||
|
||||
### Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?
|
||||
|
||||
The experiment takes a percentage of the total memory capacity of the node. The helper pod runs on the target node to stress the resources of that node, so the experiment will not consume more memory than the total memory available on the node. In other words, there is always an upper limit on the amount of memory consumed, which is equal to the total available memory. Please refer to this [blog](https://dev.to/uditgaurav/litmuschaos-node-memory-hog-experiment-2nj6) for more details.
|
||||
|
||||
### Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?
|
||||
|
||||
Yes, ultimately these are different ways to simulate a degraded network. Both cases are expected to typically cause retransmissions (for TCP). The extent of degradation depends on the percentage of loss/corruption.
|
||||
|
||||
### How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?
|
||||
|
||||
The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container).
|
||||
The idea of this experiment is to simulate issues within your pod network or in microservice communication across services in different availability zones, regions, etc.
|
||||
Mitigation (in this case, keeping the timeout, i.e., the access latency, low) could be via some middleware that can switch traffic based on some SLOs/performance parameters. If such an arrangement is not available, the next best thing would be to verify that such a degradation is highlighted via notifications/alerts, so the admin/SRE has the opportunity to investigate and fix things.
|
||||
Another use of the test is to see the extent of the impact on the end user, or on the last point in the app stack, caused by degraded access to a downstream/dependent microservice, and whether it is acceptable or breaks the system to an unacceptable degree.
|
||||
|
||||
The arguments passed to the `tc netem` command, which is run against the target container, change depending on the type of network fault.
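To give a feel for how the netem arguments differ per fault type, the sketch below only prints candidate commands. The helper `netem_command` is hypothetical and the exact command line Litmus builds internally may differ; the `loss`, `corrupt`, `duplicate` and `delay` keywords are standard `tc netem` options.

```shell
# Print (do not run) a tc netem command for a given fault type.
# Usage: netem_command <interface> <loss|corrupt|duplicate|delay> <value>
netem_command() {
  iface="$1"; fault="$2"; value="$3"
  case "$fault" in
    loss|corrupt|duplicate) args="$fault ${value}%" ;;  # percentage of packets
    delay)                  args="delay ${value}ms" ;;  # latency in milliseconds
    *) echo "unsupported fault: $fault" >&2; return 1 ;;
  esac
  echo "tc qdisc replace dev $iface root netem $args"
}

netem_command eth0 loss 100      # tc qdisc replace dev eth0 root netem loss 100%
netem_command eth0 corrupt 100   # tc qdisc replace dev eth0 root netem corrupt 100%
```

The only thing that changes between a loss fault and a corruption fault is the netem keyword, which is why both experiments degrade the same interface in different ways.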
|
||||
|
||||
### What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?
|
||||
|
||||
Until version 1.13.7, the pod CPU and memory chaos experiments used an exec mode of execution: the experiment exec'd into the specified target container and launched processes such as `md5sum` and `dd` to consume CPU and memory respectively. The commands are provided through `CHAOS_INJECT_COMMAND` and `CHAOS_KILL_COMMAND` in the ChaosEngine CR. This method has some limitations:
|
||||
- The chaos inject and kill commands depend heavily on the base image of the target container. The defaults may work for some images; for others you may have to derive the commands manually.
|
||||
- For scratch images that don't expose a shell, the chaos could not be executed.
|
||||
|
||||
To overcome this, the stress-chaos experiments (CPU, memory and IO) were enhanced to use a non-exec mode of chaos execution. It uses the target container's cgroup for resource allocation and the container's PID namespace to make the `stress-ng` process visible in the target container. This `stress-ng` process consumes the resources of the target container without doing an exec. The enhanced experiments are available from Litmus version 1.13.8.
|
||||
|
||||
### What are the typical probes used for pod-network related experiments?
|
||||
|
||||
This is precisely the role of the experiment: to cause network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container).
|
||||
The idea of this experiment is to simulate issues within your pod network or in microservice communication across services in different availability zones, regions, etc.
|
||||
|
||||
Mitigation (in this case, keeping the timeout, i.e., the access latency, low) could be via some middleware that can switch traffic based on some SLOs/performance parameters. If such an arrangement is not available, the next best thing would be to verify that such a degradation is highlighted via notifications/alerts, so the admin/SRE has the opportunity to investigate and fix things.
|
||||
|
||||
Another use of the test is to see the extent of the impact on the end user, or on the last point in the app stack, caused by degraded access to a downstream/dependent microservice, and whether it is acceptable or breaks the system to an unacceptable degree.
|
||||
|
||||
### Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?
|
||||
|
||||
The optional libs (like Pumba) are more of an illustration of how you can use third-party tools with Litmus, an approach called BYOC (Bring Your Own Chaos). The preferred LIB is `litmus`.
|
||||
|
||||
### How to run chaos experiment programatically using apis?
|
||||
|
||||
To directly consume or manipulate the chaos resources (i.e., chaosexperiment, chaosengine or chaosresults) via API, you can use the Kubernetes API directly. The CRDs provide an API endpoint by default. You can use any generic client implementation (Go and Python are the most used ones) to access them. If you use Go, there is a clientset available as well: [go-client](https://github.com/litmuschaos/chaos-operator/tree/master/pkg/client/clientset/versioned/typed/litmuschaos/v1alpha1)
|
||||
|
||||
Here are some simple CRUD operations against chaos resources that you could construct with curl, just for illustration purposes. These examples use `kubectl proxy`; one could use an auth token instead.
|
||||
|
||||
#### Create ChaosEngine:
|
||||
|
||||
For example, assume this is the [engine spec](https://gist.github.com/ksatchit/7426f2c24a48e3aedbe79b5547d817b3)
|
||||
|
||||
```console
|
||||
curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines -XPOST -H 'Content-Type: application/json' -d@pod-delete-chaosengine-trigger.json
|
||||
```
|
||||
|
||||
#### Read ChaosEngine status:
|
||||
|
||||
```console
|
||||
curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos | jq '.status.engineStatus, .status.experiments[].verdict'
|
||||
```
|
||||
|
||||
#### Update ChaosEngine Spec:
|
||||
|
||||
(say, this is the patch: https://gist.github.com/ksatchit/be54955a1f4231314797f25361ac488d)
|
||||
|
||||
```console
|
||||
curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "replace", "path": "/spec/engineState", "value": "stop"}]' http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos
|
||||
```
|
||||
|
||||
#### Delete the ChaosEngine resource:
|
||||
|
||||
```console
|
||||
curl -X DELETE localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos \
|
||||
-d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
|
||||
-H "Content-Type: application/json"
|
||||
```
|
||||
|
||||
#### Similarly, to check the results/verdict of the experiment from ChaosResult, you could use:
|
||||
|
||||
```console
|
||||
curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/nginx-chaos-pod-delete | jq '.status.experimentStatus.verdict, .status.experimentStatus.probeSuccessPercentage'
|
||||
```
|
||||
|
||||
### Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?
|
||||
|
||||
There are some scenarios that can still occur despite whatever availability aids K8s provides. For example, take disk usage or CPU hogs -- problems you would generally refer to as "Noisy Neighbour" problems.
|
||||
Stressing the disk with continuous and heavy I/O, for example, can degrade the reads and writes performed by other microservices that use the same shared disk (modern storage solutions for Kubernetes use the concept of storage pools out of which virtual volumes/devices are carved out). Another issue is the amount of scratch space eaten up on a node, which leaves no space for newer containers to get scheduled (Kubernetes reacts by applying an "eviction" taint like "disk-pressure") and causes a wholesale movement of all pods to other nodes.
|
||||
Similarly with CPU chaos: by injecting a rogue process into a target container, we starve the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), causing slowness in app traffic. In other cases, unrestrained use can cause the node to exhaust its resources, leading to the eviction of all pods.
|
||||
|
||||
### The experiment is not targeting all pods with the given label, it just selects only one pod by default.
|
||||
|
||||
Yes. You can use either the `PODS_AFFECTED_PERC` or the `TARGET_PODS` env to select multiple pods. Refer: [experiment tunable envs](https://docs.litmuschaos.io/docs/pod-network-loss/#supported-experiment-tunables).
|
||||
|
||||
### Do we have a way to see what pods are targeted when users use percentages?
|
||||
|
||||
We can view the target pods from the experiment logs or inside chaos results.
|
||||
|
||||
### What is the function of spec.definition.scope of a ChaosExperiment CR?
|
||||
|
||||
The `.spec.definition.scope` and `.spec.definition.permissions` fields are mostly for indicative/illustration purposes (for external tools to identify and validate the permissions needed to run the experiment). By themselves, they don't influence how and where an experiment can be used. One could remove these fields if needed (along with the CRD validation, of course) and store these manifests if desired.
|
||||
|
||||
### In Pod network latency - I have pod A talking to Pod B over Service B. and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?
|
||||
|
||||
The experiment targets a number of random pods in the range `[1:total_replicas]` (based on `PODS_AFFECTED_PERC`) that match the labels (`appinfo.applabel`) and namespace (`appinfo.appns`). If you want to target specific pods, provide their names as a comma-separated list in `TARGET_PODS`.
|
||||
Yes, you can provide service B details inside `DESTINATION_IPS` or `DESTINATION_HOSTS`. The `NETWORK_INTERFACE` should be `eth0`.
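Putting the answer together, a ChaosEngine for this scenario could look roughly like the sketch below. The engine name, the `app=pod-a` label, the service account name, the Service B hostname and the latency/duration values are all placeholders assumed for the example.

```shell
# Write an illustrative ChaosEngine that adds latency from Pod A towards Service B.
cat > pod-a-latency-engine.yaml <<'EOF'
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: pod-a-latency                 # hypothetical engine name
  namespace: default
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:                            # details of Pod A, the source of the traffic
    appns: "default"
    applabel: "app=pod-a"             # hypothetical label on Pod A
    appkind: "deployment"
  chaosServiceAccount: pod-network-latency-sa   # hypothetical service account
  experiments:
    - name: pod-network-latency
      spec:
        components:
          env:
            - name: NETWORK_INTERFACE
              value: "eth0"
            - name: NETWORK_LATENCY   # latency in milliseconds
              value: "2000"
            - name: DESTINATION_HOSTS # Service B, the destination of the traffic
              value: "service-b.default.svc.cluster.local"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
EOF
```

To pin the chaos to particular replicas of Pod A instead of a random selection, a `TARGET_PODS` env with a comma-separated list of pod names could be added to the same `env` list.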
|
||||
|
||||
### How to check the NETWORK_INTERFACE and SOCKET_PATH variable?
|
||||
|
||||
The `NETWORK_INTERFACE` is the interface name inside the pod/container that needs to be targeted. You can find it by execing into the target pod and checking the available interfaces. You can try `ip link`, `iwconfig` or `ifconfig`; depending on the tools installed in the pod, any of these could work.
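One way to run that check without opening an interactive shell is sketched below. The helper name `list_pod_interfaces` is made up for this example, and the command assumes the container image ships `sh`; when `ip` is not installed it falls back to listing `/sys/class/net`.

```shell
# List the network interfaces visible inside a pod's container.
# Usage: list_pod_interfaces <pod-name> <namespace>
list_pod_interfaces() {
  pod="$1"; ns="$2"
  # `ip link show` needs iproute2 in the image; fall back to /sys if it is missing.
  kubectl exec "$pod" -n "$ns" -- sh -c 'ip link show 2>/dev/null || ls /sys/class/net'
}
```

For example, `list_pod_interfaces nginx-abc default` would print the interfaces of the (hypothetical) pod `nginx-abc`; the name that carries the pod's traffic, commonly `eth0`, is the value to use for `NETWORK_INTERFACE`.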
|
||||
|
||||
The `SOCKET_PATH` by default takes the docker socket path. If you are using another runtime such as containerd or crio, or have a non-default socket path, you can specify it. This is required to communicate with the container runtime of your cluster.
|
||||
In addition, if the container runtime is not Docker, provide its name in the `CONTAINER_RUNTIME` ENV. It supports the `docker`, `containerd`, and `crio` runtimes.
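As a starting point for choosing `SOCKET_PATH`, the sketch below maps each supported runtime name to a commonly used socket location. The helper `default_socket_path` is hypothetical and the paths are typical defaults only; the actual path depends on how the nodes are set up, so verify it on a node before relying on it.

```shell
# Print a commonly used socket path for a container runtime (typical defaults only).
# Usage: default_socket_path <docker|containerd|crio>
default_socket_path() {
  case "$1" in
    docker)     echo "/var/run/docker.sock" ;;
    containerd) echo "/run/containerd/containerd.sock" ;;
    crio)       echo "/var/run/crio/crio.sock" ;;
    *)          echo "unknown runtime: $1" >&2; return 1 ;;
  esac
}

default_socket_path containerd   # /run/containerd/containerd.sock
```

The runtime each node uses is shown in the CONTAINER-RUNTIME column of `kubectl get nodes -o wide`.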
|
||||
|
||||
### What are the different ways to target the pods and nodes for chaos?
|
||||
|
||||
The different ways are:
|
||||
|
||||
Pod Chaos:
|
||||
- `Appinfo`: Provide the target pod labels in the chaos engine appinfo section.
|
||||
- `TARGET_PODS`: You can provide the target pod names as a comma-separated list, for example `pod1,pod2`.
|
||||
|
||||
Node Chaos:
|
||||
- `TARGET_NODE` or `TARGET_NODES`: Provide the target node or nodes in these envs.
|
||||
- `NODE_LABEL`: Provide the label of the target nodes.
|
||||
|
||||
### Does the pod affected percentage select the random set of pods from the total pods under chaos?
|
||||
|
||||
Yes, it selects random pods based on the `PODS_AFFECTED_PERC` ENV. In the pod-delete experiment, it selects random pods for each iteration of chaos. For the rest of the experiments (if they support iterations), it selects random pods once and uses the same set of pods for the remaining iterations.
|
||||
|
||||
### How to extract the chaos start time and end time?
|
||||
|
||||
We can use the chaos exporter metrics for this. One can also view these events, along with their timestamps, in the ChaosEngine events.
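For the events route, a sketch along these lines lists the events recorded against a single ChaosEngine in time order. The helper name `engine_events` is made up for the example; the field selectors and the sort flag are standard `kubectl get events` options.

```shell
# List the events recorded against one ChaosEngine, oldest first.
# Usage: engine_events <chaosengine-name> <namespace>
engine_events() {
  engine="$1"; ns="$2"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.kind=ChaosEngine,involvedObject.name=$engine" \
    --sort-by=.lastTimestamp
}
```

For example, `engine_events nginx-chaos default` lists the events for the `nginx-chaos` engine; the timestamps of the events emitted around chaos injection and at the end of the experiment give approximate start and end times.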
|
||||
|
||||
### How do we check the MTTR (Mean time to recovery) for an application post chaos?
|
||||
|
||||
The MTTR can be validated by using the statusCheck timeout in the ChaosEngine. Its default value is 180 seconds, and it can be overridden in the ChaosEngine. For more details, refer to [this](https://litmuschaos.github.io/litmus/experiments/chaos-resources/experiment-components/#experiment-status-check-timeout).
|
||||
|
||||
### What is the difference between Ramp Time and Chaos Interval?
|
||||
|
||||
The ramp time is the duration (in seconds) to wait before and after the injection of chaos, while the chaos interval is the time interval (in seconds) between successive chaos iterations.
|
||||
|
||||
### When I’m executing an experiment the experiment's pod failed with the exec format error
|
||||
|
||||
??? info "View the error message"
|
||||
```
|
||||
standard_init_linux.go:211: exec user process caused "exec format error":
|
||||
```
|
||||
|
||||
There could be multiple reasons for this. The most common one is a mismatch between the binary and the platform on which it is running. Check that the image you're using supports the platform on which you’re trying to run the experiment.
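A quick way to see which platform the experiment has to support is to list the CPU architecture reported by each node, as sketched below. The helper name `node_architectures` is hypothetical; the `.status.nodeInfo.architecture` field is part of the standard Node status.

```shell
# Print each node's name and CPU architecture (for example amd64 or arm64).
node_architectures() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}'
}
```

If the nodes report an architecture that the experiment image was not built for, the exec format error above is the expected symptom.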
|
|
@ -0,0 +1,201 @@
|
|||
# Troubleshooting Litmus
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?](#the-litmus-chaosoperator-is-seen-to-be-in-crashloopbackoff-state-immediately-after-installation)
|
||||
|
||||
1. [Nothing happens (no pods created) when the chaosengine resource is created?](#nothing-happens-no-pods-created-when-the-chaosengine-resource-is-created)
|
||||
|
||||
1. [The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?](#the-chaos-runner-pod-enters-completed-state-seconds-after-getting-created-no-experiment-jobs-are-created)
|
||||
|
||||
1. [The experiment pod enters completed state w/o the desired chaos being injected?](#the-experiment-pod-enters-completed-state-wo-the-desired-chaos-being-injected)
|
||||
|
||||
1. [Scheduler not forming chaosengines for type=repeat?](#scheduler-not-forming-chaosengines-for-typerepeat)
|
||||
|
||||
1. [Litmus uninstallation is not successful and namespace is stuck in terminating state?](#litmus-uninstallation-is-not-successful-and-namespace-is-stuck-in-terminating-state)
|
||||
|
||||
1. [Observing experiment results using describe chaosresult is showing NotFound error?](#observing-experiment-results-using-describe-chaosresult-is-showing-notfound-error)
|
||||
|
||||
<hr>
|
||||
|
||||
### The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?
|
||||
|
||||
Verify if the ChaosEngine custom resource definition (CRD) has been installed in the cluster. This can be
|
||||
verified with the following commands:
|
||||
|
||||
```console
|
||||
kubectl get crds | grep chaos
|
||||
```
|
||||
```console
|
||||
kubectl api-resources | grep chaos
|
||||
```
|
||||
|
||||
If not created, install it from [here](https://github.com/litmuschaos/chaos-operator/blob/master/deploy/crds/chaosengine_crd.yaml)
|
||||
|
||||
### Nothing happens (no pods created) when the chaosengine resource is created?
|
||||
|
||||
If the ChaosEngine creation results in no action at all, perform the following checks:
|
||||
|
||||
- Check the Kubernetes events generated against the chaosengine resource.
|
||||
|
||||
|
||||
```
|
||||
kubectl describe chaosengine <chaosengine-name> -n <namespace>
|
||||
```
|
||||
|
||||
Specifically look for the event reason *ChaosResourcesOperationFailed*. Typically, these events consist of messages pointing to the
|
||||
problem. Some of the common messages include:
|
||||
|
||||
- *Unable to filter app by specified info*
|
||||
- *Unable to get chaos resources*
|
||||
- *Unable to update chaosengine*
|
||||
|
||||
- Check the logs of the chaos-operator pod using the following command to get more details (on failed creation of chaos resources).
|
||||
The below example uses litmus namespace, which is the default mode of installation. Please provide the namespace into which the
|
||||
operator has been deployed:
|
||||
|
||||
```console
|
||||
kubectl logs -f <chaos-operator-(hash)-(hash)> -n litmus
|
||||
```
|
||||
|
||||
Some of the possible reasons for these errors include:
|
||||
|
||||
- The annotationCheck is set to `true` in the ChaosEngine spec, but the application deployment (AUT) has not
|
||||
been annotated for chaos. If so, please add it using the following command:
|
||||
|
||||
```console
|
||||
kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos="true"
|
||||
```
|
||||
|
||||
- The annotationCheck is set to `true` in the ChaosEngine spec and there are multiple chaos candidates that
|
||||
share the same label (as provided in the `.spec.appinfo` of the ChaosEngine) and are also annotated for chaos.
|
||||
If so, please provide a unique label for the AUT, or remove annotations on other applications with the same label.
|
||||
Litmus, by default, doesn't allow selection of multiple applications. If this is a requirement, set the
|
||||
annotationCheck to `false`.
|
||||
|
||||
```console
|
||||
kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos-
|
||||
```
|
||||
- The ChaosEngine has the `.spec.engineState` set to `stop`, which causes the operator to refrain from creating chaos
|
||||
resources. While it is an unlikely scenario, it is possible to reuse a previously modified ChaosEngine manifest.
|
||||
|
||||
- Verify if the service account used by the Litmus ChaosOperator has enough permissions to launch pods/services
|
||||
(this is available by default if the manifests suggested by the docs have been used).
|
||||
|
||||
### The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?
|
||||
|
||||
If the chaos-runner enters completed state immediately post creation, i.e., the creation of experiment resources is
|
||||
unsuccessful, perform the following checks:
|
||||
|
||||
- Check the Kubernetes events generated against the chaosengine resource.
|
||||
|
||||
```
|
||||
kubectl describe chaosengine <chaosengine-name> -n <namespace>
|
||||
```
|
||||
|
||||
Look for one of these events: *ExperimentNotFound*, *ExperimentDependencyCheck*, *EnvParseError*
|
||||
|
||||
- Check the logs of the chaos-runner pod.
|
||||
|
||||
```console
|
||||
kubectl logs -f <chaosengine_name>-runner -n <namespace>
|
||||
```
|
||||
|
||||
Some of the possible reasons may include:
|
||||
|
||||
- The ChaosExperiment CR for the experiment (name) specified in the ChaosEngine .spec.experiments list is not installed.
|
||||
If so, please install the desired experiment from the [chaoshub](https://hub.litmuschaos.io)
|
||||
|
||||
- The dependent resources for the ChaosExperiment, such as ConfigMap & secret volumes (as specified in the ChaosExperiment CR
|
||||
or the ChaosEngine CR) may not be present in the cluster (or in the desired namespace). The runner pod doesn’t proceed
|
||||
with creation of experiment resources if the dependencies are unavailable.
|
||||
|
||||
- The values provided for the ENV variables in the ChaosExperiment or the ChaosEngines might be invalid
|
||||
|
||||
- The chaosServiceAccount specified in the ChaosEngine CR doesn’t have sufficient permissions to create the experiment
|
||||
resources (For existing experiments, appropriate rbac manifests are already provided in chaos-charts/docs).
|
||||
|
||||
### The experiment pod enters completed state w/o the desired chaos being injected?
|
||||
|
||||
If the experiment pod enters completed state immediately (or in a few seconds) after creation w/o injecting the desired chaos,
|
||||
perform the following checks:
|
||||
|
||||
- Check the Kubernetes events generated against the ChaosEngine resource
|
||||
|
||||
```
|
||||
kubectl describe chaosengine <chaosengine-name> -n <namespace>
|
||||
```
|
||||
|
||||
Look for the event with reason *Summary* with message *<experiment-name> experiment has been failed*
|
||||
|
||||
- Check the logs of the chaos-experiment pod.
|
||||
|
||||
```console
|
||||
kubectl logs -f <experiment_name_(hash)_(hash)> -n <namespace>
|
||||
```
|
||||
|
||||
Some of the possible reasons may include:
|
||||
|
||||
- The ChaosExperiment CR or the ChaosEngine CR doesn’t include mandatory ENVs (or consists of incorrect values/info)
|
||||
needed by the experiment. Note that each experiment (see docs) specifies a mandatory set of ENVs along with some
|
||||
optional ones, which are necessary for successful execution of the experiment.
|
||||
|
||||
- The chaosServiceAccount specified in the ChaosEngine CR doesn’t have sufficient permissions to create the experiment
|
||||
helper-resources (i.e., some experiments in turn create other K8s resources like Jobs/DaemonSets/Deployments, etc.
|
||||
For existing experiments, appropriate rbac manifests are already provided in chaos-charts/docs).
|
||||
|
||||
- The application's (AUT) unique label provided in the ChaosEngine is set only at the parent resource metadata but not
|
||||
propagated to the pod template spec. Note that the Operator uses this label to filter chaos candidates at the parent
|
||||
resource level (deployment/statefulset/daemonset) but the experiment pod uses this to pick application **pods** into
|
||||
which the chaos is injected.
|
||||
|
||||
- The experiment pre-chaos checks have failed on account of application (AUT) or auxiliary application unavailability
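For the label propagation case in the list above, the two label sets can be compared side by side. This is a minimal sketch for a Deployment; the helper name `show_labels` is made up for the example, and other kinds (statefulset/daemonset) would need the resource type changed accordingly.

```shell
# Show the labels on a Deployment and on its pod template.
# Usage: show_labels <deployment-name> <namespace>
show_labels() {
  deploy="$1"; ns="$2"
  echo "parent labels:"
  kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.metadata.labels}'; echo
  echo "pod template labels:"
  kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.template.metadata.labels}'; echo
}
```

If the label given in the ChaosEngine shows up only under "parent labels", the experiment pod will not find any application pods to inject chaos into.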
|
||||
|
||||
### Scheduler not forming chaosengines for type=repeat?
|
||||
|
||||
If the ChaosSchedule has been created successfully in the cluster but the ChaosEngine is not being formed, the most common problem is that either the start or
|
||||
end time has been wrongly specified. We should verify the times. We can identify if this is the problem or not by changing to `type=now`. If the ChaosEngine is
|
||||
formed successfully, then the problem is with the specified time ranges; if the ChaosEngine is still not formed, then the problem is with `engineSpec`.
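For reference, the time range of a repeat schedule is expressed roughly as in the fragment below. This sketch assumes the `spec.schedule.repeat` layout of the ChaosSchedule CR; the timestamps and the interval are placeholders, and the rest of the schedule (including the engine spec) is omitted.

```shell
# Write an illustrative fragment of a ChaosSchedule's schedule section.
cat > schedule-fragment.yaml <<'EOF'
spec:
  schedule:
    repeat:
      timeRange:
        # placeholder timestamps; use UTC times that bracket the intended window
        startTime: "2021-06-01T00:00:00Z"
        endTime: "2021-06-30T00:00:00Z"
      properties:
        minChaosInterval: "2m"
    # To rule out the time range, temporarily replace the `repeat` block with:
    # now: true
EOF
```

A start time in the past combined with an end time in the future is what lets the scheduler create engines right away; a window that has already closed or has not yet opened produces no ChaosEngine.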
|
||||
|
||||
|
||||
### Litmus uninstallation is not successful and namespace is stuck in terminating state?
|
||||
|
||||
Under typical operating conditions, the ChaosOperator makes use of finalizers to ensure that the ChaosEngine is deleted
|
||||
only after chaos resources (chaos-runner, experiment pod, any other helper pods) are removed.
|
||||
|
||||
When uninstalling Litmus via the operator manifest (which contains the namespace, operator, crd specifications in a single YAML)
|
||||
without deleting the existing chaosengine resources first, the ChaosOperator deployment may get deleted before the CRD removal
|
||||
is attempted. Since the stale chaosengines have the finalizer present on them, their deletion (triggered by the CRD delete) and
|
||||
by consequence, the deletion of the chaosengine CRD itself is "stuck".
|
||||
|
||||
In such cases, manually remove the finalizer entries on the stale chaosengines to facilitate their successful delete.
|
||||
To get the chaosengine, run:
|
||||
|
||||
`kubectl get chaosengine -n <namespace>`
|
||||
|
||||
followed by:
|
||||
|
||||
`kubectl edit chaosengine <chaosengine-name> -n <namespace>` and remove the finalizer entry `chaosengine.litmuschaos.io/finalizer`
|
||||
|
||||
Repeat this on all the stale chaosengine CRs to remove the CRDs successfully & complete uninstallation process.
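When there are many stale chaosengines, editing each one by hand gets tedious. The loop below is a sketch of the same cleanup using a merge patch; the helper name `clear_engine_finalizers` is hypothetical, and note that it clears all finalizers on each ChaosEngine in the namespace, not only the Litmus one.

```shell
# Remove the finalizers from every ChaosEngine in a namespace.
# Usage: clear_engine_finalizers <namespace>
clear_engine_finalizers() {
  ns="$1"
  # `-o name` prints one resource per line, e.g. chaosengine.litmuschaos.io/nginx-chaos
  for engine in $(kubectl get chaosengine -n "$ns" -o name); do
    kubectl patch "$engine" -n "$ns" --type merge -p '{"metadata":{"finalizers":null}}'
  done
}
```

For example, `clear_engine_finalizers litmus` patches every chaosengine in the `litmus` namespace, after which the pending deletions can proceed.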
|
||||
|
||||
If however, the `litmus` namespace deletion remains stuck despite the above actions, follow the procedure described
|
||||
[here](https://success.docker.com/article/kubernetes-namespace-stuck-in-terminating) to complete the uninstallation.
|
||||
|
||||
### Observing experiment results using `describe chaosresult` is showing `NotFound` error?
|
||||
|
||||
Upon observing the ChaosResults by executing the describe command given below, it may give a `NotFound` error.
|
||||
|
||||
```
|
||||
kubectl describe chaosresult <chaos-engine-name>-<chaos-experiment-name> -n <namespace>
|
||||
```
|
||||
|
||||
Alternatively, running the describe command without specifying the expected ChaosResult name might execute successfully, but may not show any output.
|
||||
|
||||
```
|
||||
kubectl describe chaosresult -n <namespace>
|
||||
```
|
||||
|
||||
This can sometimes occur due to the time taken to pull the image and start the experiment pod (note that the ChaosResult resource is generated by the experiment).
|
||||
For the above commands to execute successfully, simply wait for the experiment pod to be created. The waiting time depends on the resources available
|
||||
(network bandwidth, space availability on the node filesyste
|
|
@ -108,7 +108,7 @@ nav:
|
|||
- Kafka Broker Pod Failure: experiments/categories/kafka/kafka-broker-pod-failure.md
|
||||
- Cassandra:
|
||||
- Cassandra Pod Delete: experiments/categories/cassandra/cassandra-pod-delete.md
|
||||
- Cloud-Platform:
|
||||
- Cloud Infrastructure:
|
||||
- AWS:
|
||||
- EC2:
|
||||
- EC2 Terminate By ID: experiments/categories/aws/ec2-terminate-by-id.md
|
||||
|
@ -127,19 +127,37 @@ nav:
- GCP Instance Stop: experiments/categories/gcp/gcp-vm-instance-stop.md
- GCP Disk Loss: experiments/categories/gcp/gcp-vm-disk-loss.md
- Chaos Resources:
- Contents: experiments/chaos-resources/chaos-resources.md
- ChaosEngine Specifications:
- State Specifications: experiments/chaos-resources/engine-state.md
- Application Specifications: experiments/chaos-resources/application-details.md
- RBAC Specifications: experiments/chaos-resources/rbac-details.md
- Runtime Specifications: experiments/chaos-resources/runtime-details.md
- Runner Specifications: experiments/chaos-resources/runner-components.md
- Experiment Specifications: experiments/chaos-resources/experiment-components.md
- Contents: experiments/chaos-resources/contents.md
- ChaosEngine:
- Contents: experiments/chaos-resources/chaos-engine/contents.md
- State Specifications: experiments/chaos-resources/chaos-engine/engine-state.md
- Application Specifications: experiments/chaos-resources/chaos-engine/application-details.md
- RBAC Specifications: experiments/chaos-resources/chaos-engine/rbac-details.md
- Runtime Specifications: experiments/chaos-resources/chaos-engine/runtime-details.md
- Runner Specifications: experiments/chaos-resources/chaos-engine/runner-components.md
- Experiment Specifications: experiments/chaos-resources/chaos-engine/experiment-components.md
- ChaosExperiment:
- Contents: experiments/chaos-resources/chaos-experiment/contents.md
- ChaosResult:
- Contents: experiments/chaos-resources/chaos-result/contents.md
- ChaosScheduler:
- Contents: experiments/chaos-resources/chaos-scheduler/contents.md
- Probes:
- Contents: experiments/chaos-resources/probes/contents.md
- Command Probe: experiments/chaos-resources/probes/cmdProbe.md
- HTTP Probe: experiments/chaos-resources/probes/httpProbe.md
- K8S Probe: experiments/chaos-resources/probes/k8sProbe.md
- Prometheus Probe: experiments/chaos-resources/probes/promProbe.md
- Litmus FAQ:
- General:
- Experiments: experiments/faq/experiments.md
- Install: experiments/faq/install.md
- Portal: experiments/faq/portal.md
- Security: experiments/faq/security.md
- Troubleshooting:
- Experiments: experiments/troubleshooting/experiments.md
- Install: experiments/troubleshooting/install.md
- Portal: experiments/troubleshooting/portal.md
- Chaos Hub ⧉: https://hub.litmuschaos.io/
- Platform Docs ⧉: https://litmusdocs-beta.netlify.app/
- Releases ⧉: https://github.com/litmuschaos/litmus/releases
@ -51,7 +51,7 @@
</head>

<!-- Header Section -->
<div style="margin: 0%; padding-top: 100px;">
<div style="margin: 0%;">
<div style="display: flex;">
<div style="margin: auto;">
<h1 style="width:400px;font-size: 40px">
@ -62,20 +62,46 @@
</p>
<button
onclick="window.location.href = 'experiments/categories/getstarted/';"
style="width: 190px;
height: 55px;
style="width: 150px;
height: 40px;
background: linear-gradient(133.06deg, #7C6AC8 1.78%, #5B44BA 64.41%);
color: #FFFFFF; border-radius: 4px;
border:1px solid;
cursor: pointer;
font-size: 15px;">
Start Learning To Experiment
Start Learning
</button>
</div>
<div style="margin: auto;">
<img src="exp-docs-icons/cloud-provider.svg" style="height:400px;" alt="cloud provider" />
</div>
</div>

<!-- More Resource Section -->
<div style="background-color: #F9FAFC;">
<div style="padding: 40px; margin: auto;">
<div style="display: flex; justify-content: space-between;">
<div style="display: flex; margin: auto;">
<img src="exp-docs-icons/faqs.svg" alt="faqs"/>
<div style="margin-left: 20px;">
<a href="experiments/faq/experiments"><h1 style="color: blue; text-decoration: underline; cursor: pointer;">FAQs</h1></a>
<p style="width: 280px; color: #777288; font-size: 18px;">
All common Frequently Asked Questions curated in one place
</p>
</div>
</div>
<div style="display: flex; margin: auto;">
<img src="exp-docs-icons/troubleshooting.svg" alt="troubleshooting"/>
<div style="margin-left: 20px;">
<a href="experiments/troubleshooting/experiments"><h1 style="color: blue; text-decoration: underline; cursor: pointer;">Troubleshooting</h1></a>
<p style="width: 320px; color: #777288; font-size: 18px;">
Know more about troubleshooting of common issues
</p>
</div>
</div>
</div>
</div>
</div>
<div style="margin: 0%;">
<div style="display: flex; padding:20px 0px">
<div style="margin: auto;">
@ -110,15 +136,14 @@
</div>
</div>
</div>
<br><br><br>
<hr style="height: 1px; border: 2; border-top: 2px solid #E6E8F0;"/>
<br>
<div style="display: flex;padding:15px 0px">
<div style="display: flex;margin-left: 12%;">
<div style="display: flex;margin: auto">
<img src="exp-docs-icons/litmus.svg" alt="litmus"/>
<p style="font-size: 14px; margin-top: 15px; margin-left: 10px; color: #777288;"> © 2021 LitmusChaos Authors</p>
</div>
<div style="display: inline-flex;margin-left: auto; margin-right: 200px;">
<div style="display: inline-flex;margin: auto">
<p style="font-size: 14px;margin-top:0.5px; color: #777288;"> Follow us on:</p>
<a target="_" href="https://slack.litmuschaos.io/"><img src="exp-docs-icons/Slack.svg" alt="slack" style="margin-left: 10px;"/></a>
<a target="_" href="https://github.com/litmuschaos/litmus"><img src="exp-docs-icons/Github.svg" alt="github" style="margin-left: 10px;"/></a>