| id | title | sidebar_label | original_id |
| --- | --- | --- | --- |
| cassandra-pod-delete | Cassandra Pod Delete Experiment Details | Cassandra Pod Delete | cassandra-pod-delete |

## Experiment Metadata
| Type | Description | Tested K8s Platform |
| --- | --- | --- |
| Cassandra | Fail the Cassandra statefulset pod | GKE, Konvoy(AWS), Packet(Kubeadm), Minikube, EKS |

## Prerequisites

- Ensure that the Litmus Chaos Operator is running by executing `kubectl get pods` in the operator namespace (typically, `litmus`). If not, install from here.
- Ensure that the `cassandra-pod-delete` experiment resource is available in the cluster by executing `kubectl get chaosexperiments` in the desired namespace. If not, install from here.

## Entry Criteria

- Cassandra pods are healthy before chaos injection.
- The load should be distributed across all replicas.

## Exit Criteria

- Cassandra pods are healthy post chaos injection.
- The load should be distributed across all replicas.

## Details

- Causes (forced/graceful) pod failure of specific/random replicas of a Cassandra statefulset.
- Tests Cassandra sanity (replica availability & uninterrupted service) and the recovery workflow of the Cassandra statefulset.
- Pod deletion with powerfulseal supports only a single pod failure (kill_count = 1).

## Integrations

- Pod failures can be effected using one of these chaos libraries: `litmus`, `powerfulseal`
- The desired chaos library can be selected by setting one of the above options as the value of the env variable `LIB`

## Steps to Execute the Chaos Experiment

- This Chaos Experiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to Getting Started.
- Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine & execute the experiment.

### Prepare chaosServiceAccount

- Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.
- The sample RBAC manifest differs between the two LIB values (`litmus`, `powerfulseal`). Use the manifest that matches the value of the `LIB` env variable.

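Picking the RBAC manifest that matches `LIB` can be sketched as a small shell helper. This is only an illustration: the function name `rbac_manifest_url` is hypothetical and not part of Litmus, and the two URLs are the ones embedded on this page. Both sample manifests set `namespace: default`, so they may need editing before use in another namespace.

```shell
#!/bin/sh
# Hypothetical helper: map a LIB value to the sample RBAC manifest URL
# referenced on this page. Not part of Litmus itself.
rbac_manifest_url() {
  base="https://raw.githubusercontent.com/litmuschaos/chaos-charts/master/charts/cassandra/cassandra-pod-delete"
  case "$1" in
    litmus)       echo "$base/rbac.yaml" ;;
    powerfulseal) echo "$base/powerfulseal_rbac.yaml" ;;
    *)            echo "unsupported LIB: $1" >&2; return 1 ;;
  esac
}

# Usage (requires access to a cluster):
#   kubectl apply -f "$(rbac_manifest_url "${LIB:-litmus}")"
```

Failing on an unknown `LIB` value, rather than falling back to a default manifest, avoids creating a service account with the wrong permissions.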
#### Sample Rbac Manifest for litmus LIB

[embedmd]:# (https://raw.githubusercontent.com/litmuschaos/chaos-charts/master/charts/cassandra/cassandra-pod-delete/rbac.yaml yaml)
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cassandra-pod-delete-sa
  namespace: default
  labels:
    name: cassandra-pod-delete-sa
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: cassandra-pod-delete-sa
  namespace: default
  labels:
    name: cassandra-pod-delete-sa
rules:
- apiGroups: ["","litmuschaos.io","batch","apps"]
  resources: ["pods","deployments","statefulsets","services","pods/log","pods/exec","events","jobs","chaosengines","chaosexperiments","chaosresults"]
  verbs: ["create","list","get","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: cassandra-pod-delete-sa
  namespace: default
  labels:
    name: cassandra-pod-delete-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cassandra-pod-delete-sa
subjects:
- kind: ServiceAccount
  name: cassandra-pod-delete-sa
  namespace: default
```

#### Sample Rbac Manifest for powerfulseal LIB

[embedmd]:# (https://raw.githubusercontent.com/litmuschaos/chaos-charts/master/charts/cassandra/cassandra-pod-delete/powerfulseal_rbac.yaml yaml)
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cassandra-pod-delete-sa
  namespace: default
  labels:
    name: cassandra-pod-delete-sa
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: cassandra-pod-delete-sa
  labels:
    name: cassandra-pod-delete-sa
rules:
- apiGroups: ["","litmuschaos.io","batch","apps"]
  resources: ["pods","deployments","statefulsets","pods/log","pods/exec","services","events","jobs","configmaps","chaosengines","chaosexperiments","chaosresults"]
  verbs: ["create","list","get","patch","update","delete"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get","list"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: cassandra-pod-delete-sa
  labels:
    name: cassandra-pod-delete-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cassandra-pod-delete-sa
subjects:
- kind: ServiceAccount
  name: cassandra-pod-delete-sa
  namespace: default
```

### Prepare ChaosEngine

- Provide the application info in `spec.appinfo`
- Override the experiment tunables if desired in `experiments.spec.components.env`
- To understand the values to provide in a ChaosEngine specification, refer to ChaosEngine Concepts

#### Supported Experiment Tunables

| Variables | Description | Specify In ChaosEngine | Notes |
| --- | --- | --- | --- |
| CASSANDRA_SVC_NAME | Cassandra service name | Mandatory | Default value: `cassandra` |
| KEYSPACE_REPLICATION_FACTOR | Replication factor for the Cassandra liveness deployment | Mandatory | Needed to create the keyspace while checking the liveness of Cassandra |
| CASSANDRA_PORT | Port of the Cassandra statefulset | Mandatory | Default value: `9042` |
| CASSANDRA_LIVENESS_CHECK | Enables a liveness check of the Cassandra statefulset | Optional | Can be `enabled` or `disabled` |
| CASSANDRA_LIVENESS_IMAGE | Image of the Cassandra liveness deployment | Optional | Default value: `litmuschaos/cassandra-client:latest` |
| TOTAL_CHAOS_DURATION | The time duration for chaos insertion (seconds) | Optional | Defaults to 15s |
| CHAOS_INTERVAL | Time interval between two successive pod failures (seconds) | Optional | Defaults to 5s |
| KILL_COUNT | Number of Cassandra pods to be deleted | Optional | Defaults to `1`. `KILL_COUNT` > 1 is supported only by the `litmus` lib, not by `powerfulseal` |
| LIB | The chaos lib used to inject the chaos | Optional | Defaults to `litmus`. Supported: `litmus`, `powerfulseal` |
| FORCE | Application pod failure type | Optional | Defaults to `true`, with `terminationGracePeriodSeconds=0` |
| RAMP_TIME | Period to wait before injection of chaos (seconds) | Optional | |
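
The `LIB` and `KILL_COUNT` constraints from the Notes column can be checked before a ChaosEngine is created. The following is a minimal sketch, not something Litmus ships: `validate_tunables` is a hypothetical helper that only encodes the two rules stated in the table (supported `LIB` values, and `KILL_COUNT` > 1 only with `litmus`).

```shell
#!/bin/sh
# Hypothetical pre-flight check for the LIB / KILL_COUNT combination.
# Arguments: $1 = LIB (default: litmus), $2 = KILL_COUNT (default: 1)
validate_tunables() {
  lib="${1:-litmus}"
  kill_count="${2:-1}"

  # Only the two libraries listed in the table are accepted.
  case "$lib" in
    litmus|powerfulseal) ;;
    *) echo "unsupported LIB: $lib" >&2; return 1 ;;
  esac

  # KILL_COUNT must be a positive integer.
  case "$kill_count" in
    ''|*[!0-9]*) echo "KILL_COUNT must be a positive integer" >&2; return 1 ;;
  esac
  if [ "$kill_count" -lt 1 ]; then
    echo "KILL_COUNT must be at least 1" >&2
    return 1
  fi

  # powerfulseal supports only a single pod failure.
  if [ "$lib" = "powerfulseal" ] && [ "$kill_count" -gt 1 ]; then
    echo "KILL_COUNT > 1 is not supported with LIB=powerfulseal" >&2
    return 1
  fi
  return 0
}
```

For example, `validate_tunables powerfulseal 2` returns a non-zero status, while `validate_tunables litmus 3` succeeds.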

#### Sample ChaosEngine Manifest

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: cassandra-chaos
  namespace: default
spec:
  appinfo:
    appns: 'default'
    applabel: 'app=cassandra'
    appkind: 'statefulset'
  # It can be true/false
  annotationCheck: 'true'
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=nginx
  auxiliaryAppInfo: ''
  chaosServiceAccount: cassandra-pod-delete-sa
  monitoring: false
  # It can be delete/retain
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: cassandra-pod-delete
      spec:
        components:
          env:
            # set chaos duration (in sec) as desired
            - name: TOTAL_CHAOS_DURATION
              value: '15'

            # set chaos interval (in sec) as desired
            - name: CHAOS_INTERVAL
              value: '15'

            # pod failures without '--force' & default terminationGracePeriodSeconds
            - name: FORCE
              value: 'false'

            # provide cassandra service name
            # default service: cassandra
            - name: CASSANDRA_SVC_NAME
              value: 'cassandra'

            # provide the keyspace replication factor
            - name: KEYSPACE_REPLICATION_FACTOR
              value: '3'

            # provide cassandra port
            # default port: 9042
            - name: CASSANDRA_PORT
              value: '9042'

            # set the CASSANDRA_LIVENESS_CHECK
            # it can be `enabled` or `disabled`
            - name: CASSANDRA_LIVENESS_CHECK
              value: ''
```

### Create the ChaosEngine Resource

- Create the ChaosEngine manifest prepared in the previous step to trigger the Chaos.

  `kubectl apply -f chaosengine.yml`

### Watch Chaos progress

- View pod terminations & recovery by setting up a watch on the pods in the application namespace

  `watch -n 1 kubectl get pods -n <application-namespace>`

### Check Chaos Experiment Result

- Once the experiment (job) is completed, check whether the Cassandra statefulset is resilient to the pod failure. The ChaosResult resource name is derived as `<ChaosEngine-Name>-<ChaosExperiment-Name>`.

  `kubectl describe chaosresult cassandra-chaos-cassandra-pod-delete -n <cassandra-namespace>`
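
The naming rule for the ChaosResult can be expressed as a one-line shell helper, which is handy when the engine or experiment name varies between runs. This is a sketch only: `chaosresult_name` is a hypothetical function, and the names used in the example are the ones from the sample manifest on this page.

```shell
#!/bin/sh
# Hypothetical helper: build the ChaosResult name from the
# ChaosEngine name ($1) and the ChaosExperiment name ($2).
chaosresult_name() {
  printf '%s-%s\n' "$1" "$2"
}

# Usage (requires access to a cluster):
#   kubectl describe chaosresult \
#     "$(chaosresult_name cassandra-chaos cassandra-pod-delete)" \
#     -n <cassandra-namespace>
```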

## Cassandra Pod Failure Demo

- It will be added soon.