chore(docs): Added Docs for Probes (#12)

* Added Docs for Probes

Signed-off-by: Sayan Mondal <sayan@chaosnative.com>

* Removing an extra page break

Signed-off-by: Sayan Mondal <sayan@chaosnative.com>

* EmbedMD check

Signed-off-by: Sayan Mondal <sayan@chaosnative.com>

* Resolving Spellchecker issue

Signed-off-by: Sayan Mondal <sayan@chaosnative.com>

* Resolving new Spellchecker issue

Signed-off-by: Sayan Mondal <sayan@chaosnative.com>
This commit is contained in:
Sayan Mondal 2021-03-12 15:37:36 +05:30 committed by GitHub
parent 750c9637f4
commit 55b97acad0
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 907 additions and 18 deletions


@ -1,21 +1,653 @@
accomodate
accomodated
ada
addon
Adminstator
afx
airgapped
AKS
al
allowedCapabilities
allowedHostPaths
allowedProfileNames
allowPrivilegeEscalation
amazonaws
amd
analytics
annotationCheck
annotationChecks
ansible
Ansible
api
apiextensions
apiGroup
apiGroups
apis
APIs
apiserver
apiVersion
AppArmor
appGroup
appinfo
appInfo
appkind
applabel
appLabels
appns
appResource
appResourceName
appVersion
arg
argo
ArgoCD
argoproj
argowf
args
AUT
auth
AUTH
autoscaler
Autoscaler
autoscaling
auxiliaryAppInfo
avc
AVC
awk
aws
awsregion
ba
backOff
baremetal
Baremetal
bd
BDD
bfea
blockcount
blocksize
bodyPath
bool
BringYourOwnChaos
busybox
Busybox
BYOC
cassandra
cd
cde
ce
Centric
chaoexperiment
ChaosChart
ChaosCharts
chaosec
chaosengine
ChaosEngine
ChaosEngineCompleted
ChaosEngineInitialized
chaosengines
ChaosEngine's
ChaosEngines
ChaosEngineSpec
chaosexperiment
ChaosExperiment
chaosexperiments
ChaosExperiments
chaoshub
ChaosHub
ChaosInject
chaoslib
ChaosOperator
ChaosPod
ChaosResourcesOperationFailed
chaosresult
ChaosResult
chaosresults
ChaosResults
ChaosRunner
chaosschedule
ChaosSchedule
ChaosScheduler
chaosschedules
chaosServiceAccount
chaosServiceAccounts
chaosshedule
chaostoolkit
Chaostoolkit
ChaosToolKit
chaosUID
ChaosWorkflow
ChaosWorkflows
charthub
chmod
chroot
ci
CLI
clientset
ClusterIP
clusterrole
clusterRole
ClusterRole
clusterrolebinding
ClusterRoleBinding
ClusterRoles
cmd
cmdProbe
CmdProbe
cmdProbes
CNXNB
commmand
comparator
comparision
config
configmap
configMap
ConfigMap
configmaps
configMaps
configs
connectto
ContainerCreating
containerd
containerPort
containerSecurityContext
continer
coredns
coreDNS
CoreDns
CoreDNS
corev
cp
cpu
Cpu
cr
CrashLoopBackOff
crd
CRD
crds
CRDs
CRI
crictl
crio
cron
cronjob
CRs
cstor
cStor
cstorpools
cstorvolumereplicas
customresourcedefinition
Customresourcedefinition
da
daemonset
daemonsets
datasource
dbname
dbpassword
dbuser
debuggability
deletecollection
Deployable
deploymemt
deploymentconfig
deploymentconfigs
Deplying
deprovision
dettached
dev
devguide
devops
DevOps
dns
DNS
downtimes
downwardAPI
ds
DS
DSL
du
ebs
EBS
ec
efb
EKS
embedmd
emptyDir
enbaledor
endTime
engineSpec
engineState
engineStatus
engineTemplateSpec
entrypoint
env
EnvParseError
ENVs
EoT
EOT
errored
et
eth
ethernet
eventrouter
exampleFile
exe
execing
Execing
executionTime
experimentannotation
experimentannotations
ExperimentDependencyCheck
experimentImage
experimentImagePullSecrets
ExperimentNotFound
experimentstatus
Experimentstatus
experimet
expr
failedRuns
failstep
faq
Fdocker
fff
fieldPath
fieldRef
fieldSelector
filesystem
FILESYSTEM
Fillup
finalizer
finalizers
fluentd
FluxCD
FQDN
frontend
Frun
fsGroup
Fvar
gaiaadm
gaiadocker
GC
gcloud
GCloud
gcp
GCP
getstarted
gettingstarted
Gi
GigaBytes
github
Github
githubusercontent
Gitlab
GitOps
gitops
gke
GKE
gprasath
grafana
Grafana
GVR
hax
heptio
hostFileVolumes
hostIPC
hostname
hostNetwork
hostPath
hostPID
htop
http
httpProbe
HTTPProbe
https
highlevel
iam
iconColor
IfNotPresent
iks
imagePullPolicy
imagePullSecret
imagePullSecrets
includedDays
includedHours
inflight
initialDelaySeconds
insecureSkipVerify
instace
instanceCount
Integrations
io
ip
iproute
ips
IPS
iSCSI
istgt
jiva
Jiva
jobCleanupPolicy
jobCleanUpPolicy
json
JSON
jsonpath
jt
kafka
keiko
keygen
keyspace
KEYSPACE
kh
kiam
Kiam
KinD
Konvoy
KOPS
kube
Kube
kubeadm
Kubeadm
kubeconfig
KUBECONFIG
kubectl
kubelet
Kubelet
kubernetes
Kubernetes
kubespray
Kubevirt
Kudo
KUDO
labelSelector
labelselectors
latencies
legendFormat
libs
lifecycle
linux
litmusbooks
litmuschaos
LitmusChaos
litmuslib
LitmusLib
litmusresults
liveness
LIVENESS
livenss
lname
LoadBalancer
localhost
LocalObjectReference
localpv
logrus
lossy
lqbw
lZ
matchLabels
maya
mayadata
md
mebibytes
Mebibytes
MEBIBYTES
metricset
mgmt
Microservice
microservices
minChaosInterval
Minikube
minio
mountPath
msg
MustRunAs
mv
myhtop
mysql
namespace
namespaced
Namespaced
namespaces
ndm
netem
nfs
NFS
ng
nginx
nodename
nodeName
nodeport
NodePort
nodeselector
nodeSelector
NoExecute
notEqual
NotFound
notMatches
NotReady
ns
observability
oc
OnChaos
oneof
oneOf
openebs
Openebs
OpenEBS
OPENEBS
openshift
OpenShift
opensource
overlayed
params
passedRuns
pathPrefix
pdf
PERC
percona
persistentVolumeClaim
PersistentVolumeClaim
persistentvolumeclaims
persistentvolumes
persistentVolumes
pid
PID
pluggable
podSecurityContext
PodSecurityContext
podsecuritypolicies
PodSecurityPolicy
postchaos
powerfulseal
PowerfulSeal
poweroff
Poweroff
pre
prebuilt
prechaos
PreRequisites
privatekey
probeArtifact
ProbeArtifacts
probeName
probePollingInterval
probestatus
probesuccesspercentage
probeSuccessPercentage
probeTimeout
proc
prometheus
promProbe
promql
provisioner
Provisioner
PROVISIONER
ps
psp
PSP
ptr
pumba
Pumba
pushgateway
PV
pvc
PVs
py
PyYaml
queryPath
rbac
rdx
reaccessable
readOnlyRootFilesystem
redhat
refId
reframed
relook
repicas
replicaset
replicasets
replicationcontrollers
repo
resourceNames
ResourceRequirements
responseCode
restorecon
rlt
RMW
RoleBinding
roleRef
rollout
rollouts
rsa
RSA
RunAsAny
runAsUser
runcmd
runnerannotation
runnning
runProperties
runtime
runtimes
rw
RWM
sa
SBT
Scalability
scalable
schedulable
scheduleState
scontext
sdb
sDem
sdk
seccomp
securityContext
selinux
seLinux
Selinux
SeLinux
SELinux
semodule
ServerAliveCountMax
ServerAliveInterval
serviceaccount
ServiceAccount
serviceAccountName
serviceaccounts
sfeyq
showIn
sig
SIGKILL
sirupsen
slc
sLO
SLOs
SoT
specifed
specificed
specifictions
sProbe
SRE
SREs
srw
SSHing
SSL
startTime
stateful
Stateful
statefulset
statefulsets
statusCheckTimeouts
stdout
stoppedRuns
storageclasses
stringData
sts
sudo
sumitnagal
superset
supplementalGroups
svc
SVC
SYS
systemctl
systemd
tagKeys
tc
tclass
tcontext
tcp
TCP
te
terminationGracePeriodSeconds
testcase
testfile
textFormat
th
Timeformat
timeRange
TimeStamp
titleFormat
TKGi
TODO
tolerations
Tolerations
toolset
TSDB
TSDBs
tunable
tunables
Tunables
ubuntu
uc
ui
UI
UID
un
uncordon
uninstallation
Uninstallation
unix
unschedulable
uptime
upto
URIs
url
usecases
usr
utils
valueFrom
VMs
VMware
volumeMount
WithError
WebUI
wordpress
workDays
workHours
wp
yaml
yml
zk
Zokeeper
ZTN
html
XXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXX

website/docs/probes.md Normal file

@ -0,0 +1,257 @@
---
id: probes
title: Declarative Approach to Chaos Hypothesis using Litmus Probes
sidebar_label: Probes
---
---
## Litmus Probes
Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks according to the mode in which they are defined, and treat their success as a necessary condition when determining the verdict of the experiment (along with the standard built-in checks).
Litmus currently supports four types of probes:
* **httpProbe:** To query health/downstream URIs
* **cmdProbe:** To execute any user-desired health-check function implemented as a shell command
* **k8sProbe:** To perform CRUD operations against native & custom Kubernetes resources
* **promProbe:** To execute promql queries and match prometheus metrics for specific criteria
These probes can be used in isolation or in several combinations to achieve the desired checks. While the `httpProbe` & `k8sProbe` are fully declarative in the way they are conceived, the `cmdProbe` expects the user to provide a shell command to implement checks that are highly specific to the application use case. `promProbe` expects the user to provide a promql query along with Prometheus service endpoints to check for specific criteria.
The probes can be set up to run in different modes:
* **SoT:** Executed at the Start of Test as a pre-chaos check
* **EoT:** Executed at the End of Test as a post-chaos check
* **Edge:** Executed both before and after the chaos
* **Continuous:** The probe is executed continuously, with a specified polling interval, during the chaos injection
* **OnChaos:** The probe is executed continuously, with a specified polling interval, strictly for the duration of the chaos
All probes share some common attributes:
* **probeTimeout:** Represents the time limit for the probe to execute the check specified and return the expected data.
* **retry:** The number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.
* **interval:** The period between subsequent retries.
* **probePollingInterval:** The time for which a continuous probe sleeps after each iteration.
* **initialDelaySeconds:** The initial waiting time, in seconds, before the probe starts executing.
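
One way to read the `retry` and `interval` attributes is sketched below in Python. This is an illustrative model, not the Litmus implementation: the `run_with_retry` helper and its arguments are hypothetical, and it assumes that `retry` counts re-runs after the first attempt, so a check gets `1 + retry` attempts in total. Enforcement of `probeTimeout` is left out for brevity.

```python
import time
from typing import Callable


def run_with_retry(check: Callable[[], bool], retry: int, interval: float) -> bool:
    """Run `check` once, then re-run it up to `retry` more times on failure.

    `interval` is the pause (in seconds) between consecutive attempts.
    """
    attempts = 1 + retry  # assumption: the first attempt is not a "retry"
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(interval)  # wait before the next attempt
    return False
```

With `retry=1`, a check that fails once and then succeeds is reported as passed, while a check that keeps failing is declared failed after the second attempt.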
## Types of Litmus Probes
### **httpProbe**
The `httpProbe` allows developers to specify a URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria. The received status code is matched against an expected status. It supports the HTTP `Get` and `Post` methods.
With the `Get` method, it sends an HTTP `GET` request to the provided URL and matches the response code against the given criteria (`==`, `!=`, `oneOf`).
With the `Post` method, it sends an HTTP `POST` request to the provided URL. The HTTP body can be provided in the `body` field. For a complex POST request in which the body spans multiple lines, the `bodyPath` attribute can be used to provide the path to a file containing the body. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine or the ChaosExperiment CR.
It can be defined at `.spec.experiments[].spec.probe` inside the ChaosEngine.
> **NOTE:** `body` and `bodyPath` are mutually exclusive.
```yaml
probe:
  - name: "check-frontend-access-url"
    type: "httpProbe"
    httpProbe/inputs:
      url: "<url>"
      insecureSkipVerify: false
      method:
        get:
          criteria: == # supports == & != and oneof operations
          responseCode: "<response code>"
    mode: "Continuous"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
      probePollingInterval: 2
```
The `httpProbe` is best suited to the Continuous mode of operation, as a parallel liveness indicator of a target or downstream application. It uses the `probePollingInterval` property to specify the polling interval for the access checks.
> **NOTE:** `insecureSkipVerify` can be set to true to skip the certificate checks.
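
The response-code matching described above can be modelled with a few lines of Python. This is a sketch under stated assumptions rather than Litmus code: `response_code_matches` is a hypothetical helper, and the format of `responseCode` for `oneOf` (a comma-separated list, optionally wrapped in brackets) is assumed for illustration.

```python
def response_code_matches(actual: int, criteria: str, expected: str) -> bool:
    """Compare an HTTP status code with the probe's expected response code(s)."""
    # Assumption: for oneOf, `expected` looks like "200,201" or "[200, 201]".
    expected_codes = [int(part) for part in expected.strip("[] ").split(",")]
    if criteria == "==":
        return actual == expected_codes[0]
    if criteria == "!=":
        return actual != expected_codes[0]
    if criteria.lower() == "oneof":  # accepts both "oneOf" and "oneof"
        return actual in expected_codes
    raise ValueError(f"unsupported criteria: {criteria!r}")
```

For example, `response_code_matches(201, "oneOf", "[200, 201]")` evaluates to `True`, whereas `response_code_matches(503, "==", "200")` evaluates to `False`.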
<br />
### **cmdProbe**
The `cmdProbe` allows developers to run shell commands and match the resulting output as part of the entry/exit criteria. The intent behind this probe is to give users a non-standard, imperative way of expressing their hypothesis. For example, the `cmdProbe` enables you to check for specific data within a database, parse a value out of a JSON blob being dumped into a certain path, or check for the existence of a particular string in the service logs.
To enable this behaviour, the probe supports an inline mode, in which the command is run from within the experiment image, as well as a source mode, in which the command is executed from within a new pod whose image can be specified. While inline mode is preferred for simple shell commands, source mode can be used when application-specific binaries are required. The `cmdProbe` can be defined at the `.spec.experiments[].spec.probe` path inside the ChaosEngine.
```yaml
probe:
  - name: "check-database-integrity"
    type: "cmdProbe"
    cmdProbe/inputs:
      command: "<command>"
      comparator:
        type: "string" # supports: string, int, float
        criteria: "contains" #supports >=,<=,>,<,==,!= for int and contains,equal,notEqual,matches,notMatches for string values
        value: "<value-for-criteria-match>"
      source: "<repo>/<tag>" # it can be “inline” or any image
    mode: "Edge"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
      initialDelaySeconds: 5
```
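
To make the `comparator` block concrete, the following Python sketch evaluates a command's output against a `type`, `criteria` and `value`. It is an approximation, not the Litmus comparator: the `compare` function is hypothetical, `float` is assumed to accept the same operators as `int`, `matches`/`notMatches` are assumed to be regular-expression searches, and surrounding whitespace in the output is stripped before comparing.

```python
import operator
import re

NUMERIC_CRITERIA = {
    ">=": operator.ge, "<=": operator.le, ">": operator.gt,
    "<": operator.lt, "==": operator.eq, "!=": operator.ne,
}

STRING_CRITERIA = {
    "contains": lambda out, val: val in out,
    "equal": lambda out, val: out == val,
    "notEqual": lambda out, val: out != val,
    # Assumption: "matches" means a regular-expression search.
    "matches": lambda out, val: re.search(val, out) is not None,
    "notMatches": lambda out, val: re.search(val, out) is None,
}


def compare(output: str, type_: str, criteria: str, value: str) -> bool:
    """Return True when the command output satisfies the comparator."""
    output = output.strip()  # drop the trailing newline most commands emit
    if type_ == "string":
        return STRING_CRITERIA[criteria](output, value)
    cast = {"int": int, "float": float}[type_]
    return NUMERIC_CRITERIA[criteria](cast(output), cast(value))
```

A call such as `compare("42\n", "int", ">=", "40")` passes, while `compare("3.5", "float", "<", "3.25")` does not.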
<br />
### **k8sProbe**
With the proliferation of custom resources & operators, especially in the case of stateful applications, the steady state is manifested as status parameters/flags within Kubernetes resources. The `k8sProbe` verifies the desired resource state by allowing users to define the Kubernetes GVR (group-version-resource) with appropriate filters (field selectors/label selectors). The experiment makes use of the Kubernetes Dynamic Client to achieve this. The `k8sProbe` can be defined at the `.spec.experiments[].spec.probe` path inside the ChaosEngine.
It supports the following CRUD operations, which can be defined at `probe.operation`.
* **create:** Creates a Kubernetes resource based on the data provided in the `probe.data` field.
* **delete:** Deletes the matching Kubernetes resource, identified via the GVR and filters (field selectors/label selectors).
* **present:** Checks for the presence of a Kubernetes resource based on the GVR and filters (field selectors/label selectors).
* **absent:** Checks for the absence of a Kubernetes resource based on the GVR and filters (field selectors/label selectors).
```yaml
probe:
  - name: "check-app-cluster-cr-status"
    type: "k8sProbe"
    k8sProbe/inputs:
      command:
        group: "<appGroup>"
        version: "<appVersion>"
        resource: "<appResource>"
        namespace: "default"
        fieldSelector: "metadata.name=<appResourceName>,status.phase=Running"
        labelSelector: "<app-labels>"
      operation: "present" # it can be present, absent, create, delete
    mode: "EOT"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```
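
The `present` and `absent` operations can be pictured as a filter over a list of resources followed by an emptiness check. The Python sketch below works on an in-memory list instead of a cluster and is not how Litmus queries the API server: the helper names are hypothetical, each resource is reduced to a flat dictionary of labels, and only equality-based selectors such as `app=nginx,tier=web` are handled.

```python
from typing import Dict, List


def parse_selector(selector: str) -> Dict[str, str]:
    """Parse an equality-only selector such as "app=nginx,tier=web"."""
    pairs = (item.split("=", 1) for item in selector.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def select(resources: List[Dict[str, str]], selector: str) -> List[Dict[str, str]]:
    """Keep the resources whose labels satisfy every selector term."""
    wanted = parse_selector(selector)
    return [r for r in resources if all(r.get(k) == v for k, v in wanted.items())]


def k8s_probe_passes(resources: List[Dict[str, str]], selector: str, operation: str) -> bool:
    """Evaluate the read-only operations of the probe against `resources`."""
    found = bool(select(resources, selector))
    if operation == "present":
        return found
    if operation == "absent":
        return not found
    raise ValueError(f"operation not modelled in this sketch: {operation!r}")
```

Given resources labelled `{"app": "nginx"}` and `{"app": "redis"}`, a `present` check with the selector `app=nginx` passes and an `absent` check with the same selector fails.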
<br />
### **promProbe**
The `promProbe` allows users to run Prometheus queries and match the resulting output against specific conditions. The intent behind this probe is to allow users to define metrics-based SLOs in a declarative way and determine the experiment verdict based on its success. The probe runs the query on a Prometheus server defined by the `endpoint`, and checks whether the output satisfies the specified `criteria`.
The promql query can be provided in the `query` field. In the case of complex queries that span multiple lines, the `queryPath` attribute can be used to provide the link to a file consisting of the query. This file can be made available in the experiment pod via a ConfigMap resource, with the ConfigMap being passed in the ChaosEngine OR the ChaosExperiment CR.
> **NOTE:** `query` and `queryPath` are mutually exclusive.
```yaml
probe:
  - name: "check-probe-success"
    type: "promProbe"
    promProbe/inputs:
      endpoint: "<prometheus-endpoint>"
      query: "<promql-query>"
      comparator:
        criteria: "==" #supports >=,<=,>,<,==,!= comparison
        value: "<value-for-criteria-match>"
    mode: "Edge"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```
<br />
---
## **Probe Status & Deriving Inferences**
The Litmus chaos experiments run the probes defined in the ChaosEngine and update their stage-wise success in the ChaosResult custom resource, with details including the overall `probeSuccessPercentage` (the ratio of successful checks to total probes) and the failure step, where applicable. The success of a probe depends on whether the expected status/results are met and also on whether it is successful in all the experiment phases defined by the probe's execution mode. For example, probes that are executed in “Edge” mode need the checks to be successful during both the pre-chaos & post-chaos phases to be declared successful.
The pass criterion for an experiment is the logical conjunction of all probes defined in the ChaosEngine and the built-in entry/exit criteria. Failure of either indicates a failed hypothesis and is deemed an experiment failure.
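
The rules above can be expressed as a small Python model. It is a sketch for illustration only: `ProbeOutcome`, `probe_success_percentage` and `verdict` are hypothetical names, and the percentage is assumed to be the share of probes that passed in every phase they ran in.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProbeOutcome:
    name: str
    phases: Dict[str, bool]  # phase name -> whether the check passed

    @property
    def passed(self) -> bool:
        # A probe succeeds only if it passed in every phase it ran in.
        return all(self.phases.values())


def probe_success_percentage(outcomes: List[ProbeOutcome]) -> float:
    """Share of probes that passed, as a percentage."""
    if not outcomes:
        raise ValueError("no probes defined")
    passed = sum(1 for outcome in outcomes if outcome.passed)
    return 100.0 * passed / len(outcomes)


def verdict(outcomes: List[ProbeOutcome], builtin_checks_passed: bool) -> str:
    """Logical conjunction of the built-in checks and every probe."""
    ok = builtin_checks_passed and all(o.passed for o in outcomes)
    return "Pass" if ok else "Fail"
```

In this model, an Edge probe that passes pre-chaos but fails post-chaos counts as failed, which lowers the percentage and turns the verdict to `Fail` even when every other probe passed.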
Provided below is a ChaosResult snippet containing the probe status for a mixed-probe ChaosEngine.
```yaml
Name:         app-pod-delete
Namespace:    test
Labels:       name=app-pod-delete
Annotations:  <none>
API Version:  litmuschaos.io/v1alpha1
Kind:         ChaosResult
Metadata:
  Creation Timestamp:  2020-08-29T08:28:26Z
  Generation:          36
  Resource Version:    50239
  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/test/ChaosResults/app-pod-delete
  UID:                 b9e3638a-b7a4-4b93-bfea-bd143d91a5e8
Spec:
  Engine:      probe
  Experiment:  pod-delete
Status:
  Experimentstatus:
    Fail Step:                 N/A
    Phase:                     Completed
    Probe Success Percentage:  100
    Verdict:                   Pass
  Probe Status:
    Name:  check-frontend-access-url
    Status:
      Continuous:  Passed 👍
    Type:          HTTPProbe
    Name:          check-app-cluster-cr-status
    Status:
      Post Chaos:  Passed 👍 #EoT
    Type:          K8sProbe
    Name:          check-database-integrity
    Status:
      Post Chaos:  Passed 👍 #Edge
      Pre Chaos:   Passed 👍
    Type:          CmdProbe
Events:
  Type    Reason   Age  From                     Message
  ----    ------   ---- ----                     -------
  Normal  Summary  7s   pod-delete-0s2jt6-s4rdx  pod-delete experiment has been Passed
```
<br />
---
## **Probe Chaining**
Probe chaining enables reuse of a probe's result (represented by the template function `{{ .<probeName>.ProbeArtifacts.Register }}`) in subsequent "downstream" probes defined in the ChaosEngine. Note that the order of execution of probes in the experiment depends purely on the order in which they are defined in the ChaosEngine.
Probe chaining is currently supported only for `cmdProbes`.
```yaml
probe:
  - name: "probe1"
    type: "cmdProbe"
    cmdProbe/inputs:
      command: "<command>"
      comparator:
        type: "string"
        criteria: "equal"
        value: "<value-for-criteria-match>"
      source: "inline"
    mode: "SOT"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
  - name: "probe2"
    type: "cmdProbe"
    cmdProbe/inputs:
      ## probe1's result being used as one of the args in probe2
      command: "<command> {{ .probe1.ProbeArtifacts.Register }} <arg2>"
      comparator:
        type: "string"
        criteria: "equal"
        value: "<value-for-criteria-match>"
      source: "inline"
    mode: "SOT"
    runProperties:
      probeTimeout: 5
      interval: 5
      retry: 1
```
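
The substitution that chaining performs can be approximated in Python with a regular expression. This is only a stand-in for the template rendering done by the experiment, not the actual mechanism: `render_command` and the `registers` mapping (probe name to the result stored by an earlier probe) are hypothetical.

```python
import re
from typing import Dict

# Matches placeholders such as {{ .probe1.ProbeArtifacts.Register }}
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\.ProbeArtifacts\.Register\s*\}\}")


def render_command(command: str, registers: Dict[str, str]) -> str:
    """Replace each placeholder with the result of the named upstream probe."""

    def substitute(match: "re.Match") -> str:
        probe_name = match.group(1)
        if probe_name not in registers:
            # The upstream probe must be defined (and run) before this one.
            raise KeyError(f"no result registered for probe {probe_name!r}")
        return registers[probe_name]

    return PLACEHOLDER.sub(substitute, command)
```

Rendering `"echo {{ .probe1.ProbeArtifacts.Register }} done"` with `{"probe1": "42"}` yields `"echo 42 done"`. Referring to a probe that has not produced a result yet raises an error, which mirrors the dependence on definition order noted above.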


@@ -31,7 +31,7 @@ module.exports = {
   ],
   Concepts: [
     "workflow",
-    // "probes",
+    "probes",
     // "cross-cloud-control",
     // "litmusctl",
     // "crds",