Monitoring utils added with sample AUT demo for monitoring. (#2121)

* monitoring utils with demonstration.

Signed-off-by: ishangupta-ds <ishan.gupta@mayadata.io>
Ishan Gupta 2020-10-06 18:00:39 +05:30 committed by GitHub
parent 780dbd0a39
commit 0bcf7918fe
GPG Key ID: 4AEE18F83AFDEB23
87 changed files with 21280 additions and 8 deletions

@ -1,18 +1,239 @@
# Monitor Chaos
This directory contains chaos-interleaved grafana dashboards to get started with monitoring chaos experiments and workflows.
This directory contains chaos-interleaved grafana dashboards along with the utilities needed to get started with monitoring chaos experiments and workflows.
## Instructions
# Components
- Clone the litmus repo
- [Grafana Dashboards](https://github.com/litmuschaos/litmus/blob/master/monitoring/grafana-dashboards)
> Contains chaos-interleaved grafana dashboards for various native k8s and application metrics.
- [Utilities](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils)
> Contains utilities required to set up monitoring infrastructure on a kubernetes cluster.
# Demonstration
## Monitor Chaos on Sock-Shop
Run chaos experiments and workflows on the sock-shop application and monitor them with the grafana dashboards.
### Step-1: Setup Sock-Shop Microservices Application
- Create sock-shop namespace on the cluster
```
git clone https://github.com/litmuschaos/litmus.git
cd litmus/monitoring
kubectl create ns sock-shop
```
- Grafana Dashboards
- Apply the sock-shop microservices manifests
```
cd grafana-dashboards
```
```
kubectl apply -f utils/sample-application-under-test/sock-shop/
```
- Wait until all services are up. Verify via `kubectl get pods -n sock-shop`
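Instead of re-running `kubectl get pods` by hand, the wait can be scripted. The following is a minimal sketch, assuming a kubectl version that provides the `kubectl wait` subcommand. The function name `wait_for_pods`, the `KUBECTL` override variable and the default timeout are illustrative and not part of this repository.

```shell
# Hypothetical helper: block until every pod in a namespace reports Ready.
# KUBECTL can be overridden (for example for a dry run); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

wait_for_pods() {
  namespace="$1"
  timeout="${2:-300s}"   # assumed default, tune it for your cluster
  "$KUBECTL" wait --for=condition=Ready pods --all \
    -n "$namespace" --timeout="$timeout"
}

# Usage: wait_for_pods sock-shop 300s
```

The call returns a non-zero exit status if any pod is still not Ready when the timeout expires, so it can gate the next step in a script.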
### Step-2: Setup the LitmusChaos Infrastructure
- Install the litmus chaos operator and CRDs
```
kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.8.0.yaml
```
- Install the litmus-admin serviceaccount for centralized/admin-mode of chaos execution
```
kubectl apply -f https://litmuschaos.github.io/litmus/litmus-admin-rbac.yaml
```
- Install the chaos experiments in admin(litmus) namespace
```
kubectl apply -f "https://hub.litmuschaos.io/api/chaos/1.8.0?file=charts/generic/experiments.yaml" -n litmus
```
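Before moving on, it can help to confirm that the Litmus CRDs were registered. This is a rough sketch, assuming the operator manifest above installs the `chaosengines`, `chaosexperiments` and `chaosresults` CRDs in the `litmuschaos.io` group. The function name `check_litmus_crds` and the `KUBECTL` variable are hypothetical.

```shell
# Hypothetical helper: fail fast if any expected Litmus CRD is missing.
KUBECTL="${KUBECTL:-kubectl}"

check_litmus_crds() {
  for crd in chaosengines chaosexperiments chaosresults; do
    # kubectl exits non-zero when the CRD does not exist
    "$KUBECTL" get crd "${crd}.litmuschaos.io" || return 1
  done
}

# Usage: check_litmus_crds && kubectl get chaosexperiments -n litmus
```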
### Step-3: Setup the Monitoring Infrastructure
- Create monitoring namespace on the cluster
```
kubectl create ns monitoring
```
- Create the operator to instantiate all CRDs
```
kubectl -n monitoring apply -f utils/prometheus/prometheus-operator/
```
- Deploy monitoring components
```
kubectl -n monitoring apply -f utils/metrics-exporters-with-service-monitors/node-exporter/
kubectl -n monitoring apply -f utils/metrics-exporters-with-service-monitors/kube-state-metrics/
kubectl -n monitoring apply -f utils/alert-manager-with-service-monitor/
kubectl -n sock-shop apply -f utils/sample-application-service-monitors/
kubectl -n litmus apply -f utils/metrics-exporters-with-service-monitors/litmus-metrics/chaos-exporter/
kubectl -n litmus apply -f utils/metrics-exporters-with-service-monitors/litmus-metrics/litmus-event-router/
```
- Deploy prometheus instance and all the service monitors for targets
```
kubectl -n monitoring apply -f utils/prometheus/prometheus-configuration/
```
- Apply the grafana manifests after deploying prometheus for all metrics.
```
kubectl -n monitoring apply -f utils/grafana/
```
- Access the grafana dashboard via the LoadBalancer (or NodePort) service IP or via a port-forward operation on localhost
Note: To change the service type to NodePort, run `kubectl edit svc prometheus-k8s -n monitoring` and replace
`type: LoadBalancer` with `type: NodePort`
```
kubectl get svc -n monitoring
```
Default username/password credentials: `admin/admin`
- Add the prometheus datasource from monitoring namespace as DS_PROMETHEUS for Grafana via the Grafana Settings menu
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/data-source-config.png?raw=true)
- Import the grafana dashboards
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/import-dashboard.png?raw=true)
- Import the grafana dashboard "Sock-Shop Performance" provided [here](https://raw.githubusercontent.com/litmuschaos/litmus/master/monitoring/grafana-dashboards/sock-shop/Sock-Shop-Performance-Under-Chaos.json)
- Import the grafana dashboard "Node and Pod Chaos Demo" provided [here](https://raw.githubusercontent.com/litmuschaos/litmus/master/monitoring/grafana-dashboards/kubernetes/Node-and-pod-metrics-dashboard.json)
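The port-forward option mentioned above has no command of its own, so here is one possible sketch. It assumes the `grafana` Service in the `monitoring` namespace listens on port 3000, as in the manifests under `utils/grafana/`. The function name `forward_grafana` and the `KUBECTL` variable are illustrative only.

```shell
# Hypothetical helper: expose Grafana on localhost through a port-forward.
KUBECTL="${KUBECTL:-kubectl}"

forward_grafana() {
  local_port="${1:-3000}"   # local port to listen on (assumed default)
  "$KUBECTL" port-forward svc/grafana "${local_port}:3000" -n monitoring
}

# Usage: forward_grafana 3000   # then open http://localhost:3000
```

The command stays in the foreground until interrupted, so run it in a separate terminal while importing the dashboards.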
### Step-4: Execute the Chaos Experiments
- For the sake of illustration, let us execute node-level and pod-level CPU hog experiments on the `catalogue` microservice and memory hog experiments on the `orders` microservice in a staggered manner.
```
kubectl apply -f utils/sample-chaos-injectors/chaos-experiments/catalogue/catalogue-pod-cpu-hog.yaml
```
Wait for ~60s
```
kubectl apply -f utils/sample-chaos-injectors/chaos-experiments/orders/orders-pod-memory-hog.yaml
```
Wait for ~60s
```
kubectl apply -f utils/sample-chaos-injectors/chaos-experiments/catalogue/catalogue-node-cpu-hog.yaml
```
Wait for ~60s
```
kubectl apply -f utils/sample-chaos-injectors/chaos-experiments/orders/orders-node-memory-hog.yaml
```
- Verify execution of chaos experiments
```
kubectl describe chaosengine catalogue-pod-cpu-hog -n litmus
kubectl describe chaosengine orders-pod-memory-hog -n litmus
kubectl describe chaosengine catalogue-node-cpu-hog -n litmus
kubectl describe chaosengine orders-node-memory-hog -n litmus
```
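The apply-and-wait sequence above can also be driven by a small loop. This sketch assumes it is run from the `monitoring` directory so that the relative manifest paths resolve. The function name `run_staggered` and the `KUBECTL` and `SLEEP` override variables are hypothetical.

```shell
# Hypothetical helper: apply manifests one by one with a pause in between.
KUBECTL="${KUBECTL:-kubectl}"
SLEEP="${SLEEP:-sleep}"

run_staggered() {
  delay="$1"
  shift
  first=1
  for manifest in "$@"; do
    # Pause between experiments, but not before the first one.
    if [ "$first" -eq 0 ]; then
      "$SLEEP" "$delay"
    fi
    first=0
    "$KUBECTL" apply -f "$manifest"
  done
}

# Usage:
# run_staggered 60 \
#   utils/sample-chaos-injectors/chaos-experiments/catalogue/catalogue-pod-cpu-hog.yaml \
#   utils/sample-chaos-injectors/chaos-experiments/orders/orders-pod-memory-hog.yaml \
#   utils/sample-chaos-injectors/chaos-experiments/catalogue/catalogue-node-cpu-hog.yaml \
#   utils/sample-chaos-injectors/chaos-experiments/orders/orders-node-memory-hog.yaml
```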
### Step-5: Visualize Chaos Impact
- Observe the impact of chaos injection through increased latency and reduced QPS (queries per second) on the microservices under test.
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/Sock-Shop-Dashboard.png?raw=true)
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/Node-and-Pod-metrics-Dashboard.png?raw=true)
### Step-6 (optional): Inject continuous chaos using Argo CD.
- Install Chaos workflow infrastructure.
- Create argo namespace
```
kubectl create ns argo
```
- Create the CRDs, workflow controller deployment with associated RBAC.
```
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo/stable/manifests/install.yaml -n argo
```
- Install the argo CLI on the test harness machine (where the kubeconfig is available)
```bash
# Download the binary
curl -sLO https://github.com/argoproj/argo/releases/download/v2.11.0/argo-linux-amd64.gz
# Unzip
gunzip argo-linux-amd64.gz
# Make binary executable
chmod +x argo-linux-amd64
# Move binary to path
mv ./argo-linux-amd64 /usr/local/bin/argo
# Test installation
argo version
```
- Create the Argo Access ServiceAccount
```
kubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-workflows/master/Argo/argo-access.yaml -n litmus
```
- Run one or more of the litmuschaos experiments as Chaos workflows using argo CLI or kubectl.
> Node CPU hog
```bash
argo cron create utils/sample-chaos-injectors/chaos-workflows-with-argo-CD/catalogue/catalogue-node-cpu-hog-workflow.yaml -n litmus
```
> Node memory hog
```bash
argo cron create utils/sample-chaos-injectors/chaos-workflows-with-argo-CD/orders/orders-node-memory-hog-workflow.yaml -n litmus
```
> Pod CPU hog
```bash
kubectl apply -f utils/sample-chaos-injectors/chaos-workflows-with-argo-CD/catalogue/catalogue-pod-cpu-hog-workflow.yaml -n litmus
```
> Pod memory hog
```bash
kubectl apply -f utils/sample-chaos-injectors/chaos-workflows-with-argo-CD/orders/orders-pod-memory-hog-workflow.yaml -n litmus
```
- Visualize the chaos cron workflows through the argo UI by exposing it on a NodePort or a LoadBalancer IP.
```
kubectl patch svc argo-server -n argo -p '{"spec": {"type": "NodePort"}}'
```
OR
```
kubectl patch svc argo-server -n argo -p '{"spec": {"type": "LoadBalancer"}}'
```
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/chaos-workflow-representation.png?raw=true)
![image](https://github.com/litmuschaos/litmus/blob/master/monitoring/screenshots/chaos-cron-workflows.png?raw=true)
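Once the `argo-server` Service has been patched to `NodePort`, the assigned port can be read back with a JSONPath query. This is a sketch only: it assumes the UI is served on the first port of the Service, and the function name `argo_ui_nodeport` and the `KUBECTL` variable are illustrative.

```shell
# Hypothetical helper: print the NodePort assigned to the argo-server Service.
KUBECTL="${KUBECTL:-kubectl}"

argo_ui_nodeport() {
  "$KUBECTL" get svc argo-server -n argo \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage: echo "Argo UI port: $(argo_ui_nodeport)"
```

The UI is then reachable at the address of any cluster node on that port. Whether it is served over http or https depends on the Argo version installed.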

Binary file not shown (196 KiB).

Binary file not shown (241 KiB).

@ -0,0 +1,33 @@
# Utilities
This directory contains utilities required to set up a monitoring infrastructure for application or generic metrics with chaos exporter metrics and litmus event router updates.
## Setups
- [Alert manager with service monitor](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/alert-manager-with-service-monitor)
> Contains setup for alert manager with its corresponding service monitor.
- [Grafana](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/grafana)
> Contains setup for Grafana.
- [Metrics exporters with service monitors](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/metrics-exporters-with-service-monitors)
> Contains setup for different metrics exporters with their corresponding service monitors.
- [Prometheus](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/prometheus)
> Contains setup for prometheus.
- [Sample application service monitors](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/sample-application-service-monitors)
> Contains service monitors for sample application.
- [Sample application under test](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/sample-application-under-test)
> Contains sample AUT manifests.
- [Sample chaos injectors](https://github.com/litmuschaos/litmus/blob/master/monitoring/utils/sample-chaos-injectors)
> Contains chaos experiments and chaos workflows as sample chaos injectors.

@ -0,0 +1,44 @@
apiVersion: v1
data: {}
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    "global":
      "resolve_timeout": "5m"
    "inhibit_rules":
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "critical"
      "target_match_re":
        "severity": "warning|info"
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "warning"
      "target_match_re":
        "severity": "info"
    "receivers":
    - "name": "Default"
    - "name": "Watchdog"
    - "name": "Critical"
    "route":
      "group_by":
      - "namespace"
      "group_interval": "5m"
      "group_wait": "30s"
      "receiver": "Default"
      "repeat_interval": "12h"
      "routes":
      - "match":
          "alertname": "Watchdog"
        "receiver": "Watchdog"
      - "match":
          "severity": "critical"
        "receiver": "Critical"
type: Opaque

@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alertmanager-main
  namespace: monitoring

@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  ports:
  - name: web
    port: 9093
    targetPort: web
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP

@ -0,0 +1,14 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: alertmanager
  name: alertmanager
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: web
  selector:
    matchLabels:
      alertmanager: main

@ -0,0 +1,18 @@
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  image: quay.io/prometheus/alertmanager:v0.21.0
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.21.0

@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
      name: grafana
    spec:
      containers:
      - image: grafana/grafana:latest
        imagePullPolicy: Always
        name: grafana
        ports:
        - containerPort: 3000
          name: grafana
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-storage
      volumes:
      - emptyDir: {}
        name: grafana-storage

@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  annotations:
  name: grafana
  namespace: monitoring
spec:
  ports:
  - nodePort: 31687
    port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer

@ -0,0 +1,15 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: monitoring

@ -0,0 +1,117 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - daemonsets
  - deployments
  - replicasets
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch

@ -0,0 +1,56 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
  name: kube-state-metrics
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 1.9.5
    spec:
      containers:
      - args:
        - --host=127.0.0.1
        - --port=8081
        - --telemetry-host=127.0.0.1
        - --telemetry-port=8082
        image: quay.io/coreos/kube-state-metrics:v1.9.5
        name: kube-state-metrics
        securityContext:
          runAsUser: 65534
      - args:
        - --logtostderr
        - --secure-listen-address=:8443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:8081/
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
        name: kube-rbac-proxy-main
        ports:
        - containerPort: 8443
          name: https-main
        securityContext:
          runAsUser: 65534
      - args:
        - --logtostderr
        - --secure-listen-address=:9443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:8082/
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
        name: kube-rbac-proxy-self
        ports:
        - containerPort: 9443
          name: https-self
        securityContext:
          runAsUser: 65534
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics

@ -0,0 +1,8 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
  name: kube-state-metrics
  namespace: monitoring

@ -0,0 +1,32 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
    k8s-app: kube-state-metrics
  name: kube-state-metrics
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 30s
    port: https-main
    relabelings:
    - action: labeldrop
      regex: (pod|service|endpoint|namespace)
    scheme: https
    scrapeTimeout: 30s
    tlsConfig:
      insecureSkipVerify: true
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    port: https-self
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics

@ -0,0 +1,19 @@
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.5
  name: kube-state-metrics
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: https-main
    port: 8443
    targetPort: https-main
  - name: https-self
    port: 9443
    targetPort: https-self
  selector:
    app.kubernetes.io/name: kube-state-metrics

@ -0,0 +1,41 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: chaos-monitor
  name: chaos-monitor
  namespace: litmus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chaos-monitor
  template:
    metadata:
      labels:
        app: chaos-monitor
    spec:
      containers:
      - image: litmuschaos/chaos-exporter:ci
        imagePullPolicy: Always
        name: chaos-exporter
      serviceAccount: litmus
      serviceAccountName: litmus
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: chaos-monitor
  name: chaos-monitor
  namespace: litmus
spec:
  ports:
  - port: 8080
    name: tcp
    protocol: TCP
    targetPort: 8080
  selector:
    app: chaos-monitor
  type: ClusterIP

@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: chaos-monitor
  namespace: litmus

@ -0,0 +1,18 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chaos-exporter
  labels:
    k8s-app: chaos-exporter
  namespace: litmus
spec:
  jobLabel: app
  selector:
    matchLabels:
      app: chaos-monitor
  namespaceSelector:
    matchNames:
    - litmus
  endpoints:
  - port: tcp
    interval: 1s

@ -0,0 +1,13 @@
apiVersion: v1
data:
  config.json: |-
    {
      "sink": "http",
      "httpSinkUrl": "http://localhost:8080",
      "httpSinkBufferSize": 1500,
      "httpSinkDiscardMessages": true
    }
kind: ConfigMap
metadata:
  name: litmus-eventrouter-http-cm
  namespace: litmus

@ -0,0 +1,51 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: litmus-eventrouter
  name: litmus-eventrouter
  namespace: litmus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: litmus-eventrouter
  template:
    metadata:
      labels:
        app: litmus-eventrouter
    spec:
      containers:
      - image: containership/eventrouter
        imagePullPolicy: IfNotPresent
        name: litmus-eventrouter
        volumeMounts:
        - mountPath: /etc/eventrouter
          name: config-volume
      serviceAccount: litmus
      serviceAccountName: litmus
      volumes:
      - configMap:
          defaultMode: 420
          name: litmus-eventrouter-http-cm
        name: config-volume
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: litmus-eventrouter
  name: litmus-eventrouter
  namespace: litmus
spec:
  ports:
  - nodePort: 31399
    name: web
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: litmus-eventrouter
  sessionAffinity: None
  type: NodePort

@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: litmus-eventrouter
  namespace: litmus

@ -0,0 +1,18 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: litmus-event-router
  labels:
    k8s-app: litmus-event-router
  namespace: litmus
spec:
  jobLabel: app
  selector:
    matchLabels:
      app: litmus-eventrouter
  namespaceSelector:
    matchNames:
    - litmus
  endpoints:
  - port: web
    interval: 1s

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: node-exporter
subjects:
- kind: ServiceAccount
  name: node-exporter
  namespace: monitoring

@ -0,0 +1,17 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-exporter
rules:
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create

@ -0,0 +1,90 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v0.18.1
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: node-exporter
  template:
    metadata:
      labels:
        app.kubernetes.io/name: node-exporter
        app.kubernetes.io/version: v0.18.1
    spec:
      containers:
      - args:
        - --web.listen-address=127.0.0.1:9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --no-collector.wifi
        - --no-collector.hwmon
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
        image: quay.io/prometheus/node-exporter:v0.18.1
        name: node-exporter
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: false
        - mountPath: /host/sys
          name: sys
          readOnly: false
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      - args:
        - --logtostderr
        - --secure-listen-address=[$(IP)]:9100
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:9100/
        env:
        - name: IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
        name: kube-rbac-proxy
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: https
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: node-exporter
      tolerations:
      - operator: Exists
      volumes:
      - hostPath:
          path: /proc
        name: proc
      - hostPath:
          path: /sys
        name: sys
      - hostPath:
          path: /
        name: root

@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-exporter
  namespace: monitoring

@ -0,0 +1,28 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v0.18.1
    k8s-app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: https
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/name: node-exporter

@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/version: v0.18.1
  name: node-exporter
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: https
    port: 9100
    targetPort: https
  selector:
    app.kubernetes.io/name: node-exporter

@ -0,0 +1,74 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: apiserver
  name: kube-apiserver
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    metricRelabelings:
    - action: drop
      regex: kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)
      sourceLabels:
      - __name__
    - action: drop
      regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)
      sourceLabels:
      - __name__
    - action: drop
      regex: kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)
      sourceLabels:
      - __name__
    - action: drop
      regex: reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)
      sourceLabels:
      - __name__
    - action: drop
      regex: etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)
      sourceLabels:
      - __name__
    - action: drop
      regex: transformation_(transformation_latencies_microseconds|failures_total)
      sourceLabels:
      - __name__
    - action: drop
      regex: (admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)
      sourceLabels:
      - __name__
    - action: drop
      regex: etcd_(debugging|disk|server).*
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_admission_controller_admission_latencies_seconds_.*
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_admission_step_admission_latencies_seconds_.*
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
      sourceLabels:
      - __name__
      - le
    port: https
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      serverName: kubernetes
  jobLabel: component
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      component: apiserver
      provider: kubernetes

@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring

@ -0,0 +1,25 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - "*"
  resources:
  - "*"
  verbs:
  - "*"
- nonResourceURLs:
  - /metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch

@ -0,0 +1,77 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 30s
    metricRelabelings:
    - action: drop
      regex: kubelet_(pod_worker_latency_microseconds|pod_start_latency_microseconds|cgroup_manager_latency_microseconds|pod_worker_start_latency_microseconds|pleg_relist_latency_microseconds|pleg_relist_interval_microseconds|runtime_operations|runtime_operations_latency_microseconds|runtime_operations_errors|eviction_stats_age_microseconds|device_plugin_registration_count|device_plugin_alloc_latency_microseconds|network_plugin_operations_latency_microseconds)
      sourceLabels:
      - __name__
    - action: drop
      regex: scheduler_(e2e_scheduling_latency_microseconds|scheduling_algorithm_predicate_evaluation|scheduling_algorithm_priority_evaluation|scheduling_algorithm_preemption_evaluation|scheduling_algorithm_latency_microseconds|binding_latency_microseconds|scheduling_latency_seconds)
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_(request_count|request_latencies|request_latencies_summary|dropped_requests|storage_data_key_generation_latencies_microseconds|storage_transformation_failures_total|storage_transformation_latencies_microseconds|proxy_tunnel_sync_latency_secs)
      sourceLabels:
      - __name__
    - action: drop
      regex: kubelet_docker_(operations|operations_latency_microseconds|operations_errors|operations_timeout)
      sourceLabels:
      - __name__
    - action: drop
      regex: reflector_(items_per_list|items_per_watch|list_duration_seconds|lists_total|short_watches_total|watch_duration_seconds|watches_total)
      sourceLabels:
      - __name__
    - action: drop
      regex: etcd_(helper_cache_hit_count|helper_cache_miss_count|helper_cache_entry_count|request_cache_get_latencies_summary|request_cache_add_latencies_summary|request_latencies_summary)
      sourceLabels:
      - __name__
    - action: drop
      regex: transformation_(transformation_latencies_microseconds|failures_total)
      sourceLabels:
      - __name__
    - action: drop
      regex: (admission_quota_controller_adds|crd_autoregistration_controller_work_duration|APIServiceOpenAPIAggregationControllerQueue1_adds|AvailableConditionController_retries|crd_openapi_controller_unfinished_work_seconds|APIServiceRegistrationController_retries|admission_quota_controller_longest_running_processor_microseconds|crdEstablishing_longest_running_processor_microseconds|crdEstablishing_unfinished_work_seconds|crd_openapi_controller_adds|crd_autoregistration_controller_retries|crd_finalizer_queue_latency|AvailableConditionController_work_duration|non_structural_schema_condition_controller_depth|crd_autoregistration_controller_unfinished_work_seconds|AvailableConditionController_adds|DiscoveryController_longest_running_processor_microseconds|autoregister_queue_latency|crd_autoregistration_controller_adds|non_structural_schema_condition_controller_work_duration|APIServiceRegistrationController_adds|crd_finalizer_work_duration|crd_naming_condition_controller_unfinished_work_seconds|crd_openapi_controller_longest_running_processor_microseconds|DiscoveryController_adds|crd_autoregistration_controller_longest_running_processor_microseconds|autoregister_unfinished_work_seconds|crd_naming_condition_controller_queue_latency|crd_naming_condition_controller_retries|non_structural_schema_condition_controller_queue_latency|crd_naming_condition_controller_depth|AvailableConditionController_longest_running_processor_microseconds|crdEstablishing_depth|crd_finalizer_longest_running_processor_microseconds|crd_naming_condition_controller_adds|APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds|DiscoveryController_queue_latency|DiscoveryController_unfinished_work_seconds|crd_openapi_controller_depth|APIServiceOpenAPIAggregationControllerQueue1_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_unfinished_work_seconds|DiscoveryController_work_duration|autoregister_adds|crd_autoregistration_controller_queue_latency|crd_finalizer_retries|AvailableConditionController_unfinished_work_seconds|autoregister_longest_running_processor_microseconds|non_structural_schema_condition_controller_unfinished_work_seconds|APIServiceOpenAPIAggregationControllerQueue1_depth|AvailableConditionController_depth|DiscoveryController_retries|admission_quota_controller_depth|crdEstablishing_adds|APIServiceOpenAPIAggregationControllerQueue1_retries|crdEstablishing_queue_latency|non_structural_schema_condition_controller_longest_running_processor_microseconds|autoregister_work_duration|crd_openapi_controller_retries|APIServiceRegistrationController_work_duration|crdEstablishing_work_duration|crd_finalizer_adds|crd_finalizer_depth|crd_openapi_controller_queue_latency|APIServiceOpenAPIAggregationControllerQueue1_work_duration|APIServiceRegistrationController_queue_latency|crd_autoregistration_controller_depth|AvailableConditionController_queue_latency|admission_quota_controller_queue_latency|crd_naming_condition_controller_work_duration|crd_openapi_controller_work_duration|DiscoveryController_depth|crd_naming_condition_controller_longest_running_processor_microseconds|APIServiceRegistrationController_depth|APIServiceRegistrationController_longest_running_processor_microseconds|crd_finalizer_unfinished_work_seconds|crdEstablishing_retries|admission_quota_controller_unfinished_work_seconds|non_structural_schema_condition_controller_adds|APIServiceRegistrationController_unfinished_work_seconds|admission_quota_controller_work_duration|autoregister_depth|autoregister_retries|kubeproxy_sync_proxy_rules_latency_microseconds|rest_client_request_latency_seconds|non_structural_schema_condition_controller_retries)
      sourceLabels:
      - __name__
    port: https-metrics
    relabelings:
    - sourceLabels:
      - __metrics_path__
      targetLabel: metrics_path
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 30s
    metricRelabelings:
    - action: drop
      regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
      sourceLabels:
      - __name__
    path: /metrics/cadvisor
    port: https-metrics
    relabelings:
    - sourceLabels:
      - __metrics_path__
      targetLabel: metrics_path
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kubelet

@ -0,0 +1,21 @@
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: example-rule
spec:
  groups:
  - name: example-rule
    rules:
    - alert: example-alert
      annotations:
        description: Memory on node {{ $labels.instance }} currently at {{ $value }}%
          is under pressure
        summary: Memory usage is under pressure, system may become unstable.
      expr: |
        100 - ((node_memory_MemAvailable_bytes{job="node-exporter"} * 100) / node_memory_MemTotal_bytes{job="node-exporter"}) > 70
      for: 2m
      labels:
        severity: warning

File diff suppressed because it is too large

@ -0,0 +1,17 @@
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  ports:
  - name: web
    port: 9090
    targetPort: web
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: None
  type: LoadBalancer

@ -0,0 +1,58 @@
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  externalLabels:
    cluster: docker-desktop
  image: quay.io/prometheus/prometheus:v2.19.2
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 1
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchExpressions:
    - key: k8s-app
      operator: In
      values:
      - node-exporter
      - kube-state-metrics
      - apiserver
      - kubelet
      - carts
      - carts-db
      - shipping
      - rabbitmq
      - queue-master
      - catalogue-db
      - catalogue
      - front-end
      - orders-db
      - orders
      - payment
      - user-db
      - user
      - litmus-event-router
      - chaos-exporter
  version: v2.19.2
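
The `serviceMonitorSelector` in this resource only admits ServiceMonitor objects whose `k8s-app` label is one of the listed values, while the empty `serviceMonitorNamespaceSelector` lets them live in any namespace. A minimal sketch of a ServiceMonitor that this instance would pick up follows. It assumes a `chaos-exporter` Service in a `litmus` namespace that carries the label `app: chaos-exporter` and exposes a port named `tcp`; none of these three details is taken from this diff.

```yaml
# Illustrative only; namespace, Service label, and port name are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: chaos-exporter
  namespace: litmus              # assumed namespace
  labels:
    k8s-app: chaos-exporter      # must equal one of the values in serviceMonitorSelector
spec:
  jobLabel: app
  selector:
    matchLabels:
      app: chaos-exporter        # assumed label on the Service
  namespaceSelector:
    matchNames:
    - litmus
  endpoints:
  - port: tcp                    # assumed name of the Service port
    interval: 30s
```

A ServiceMonitor whose `k8s-app` value is not in the list is ignored by this Prometheus instance, so a new scrape target needs both the ServiceMonitor and a matching entry under `values`.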

@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring

@ -0,0 +1,16 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: prometheus-operator
    app.kubernetes.io/version: v0.40.0
  name: prometheus-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-operator
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: monitoring

@ -0,0 +1,81 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: prometheus-operator
    app.kubernetes.io/version: v0.40.0
  name: prometheus-operator
rules:
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - alertmanagers/finalizers
  - prometheuses
  - prometheuses/finalizers
  - thanosrulers
  - thanosrulers/finalizers
  - servicemonitors
  - podmonitors
  - prometheusrules
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - delete
- apiGroups:
  - ""
  resources:
  - services
  - services/finalizers
  - endpoints
  verbs:
  - get
  - create
  - update
  - delete
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create

File diff suppressed because it is too large

@ -0,0 +1,265 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: podmonitors.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: PodMonitor
listKind: PodMonitorList
plural: podmonitors
singular: podmonitor
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: PodMonitor defines monitoring for a set of pods.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Pod selection for target discovery
by Prometheus.
properties:
jobLabel:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podMetricsEndpoints:
description: A list of endpoints allowed as part of this PodMonitor.
items:
description: PodMetricsEndpoint defines a scrapeable endpoint of
a Kubernetes Pod serving Prometheus metrics.
properties:
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
params:
additionalProperties:
items:
type: string
type: array
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the pod port this endpoint refers to. Mutually
exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before scraping.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: integer
- type: string
description: 'Deprecated: Use ''port'' instead.'
x-kubernetes-int-or-string: true
type: object
type: array
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Pod objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
type: object
type: object
required:
- podMetricsEndpoints
- selector
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
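
The schema marks `podMetricsEndpoints` and `selector` as the only required `spec` fields. A minimal sketch of a PodMonitor that satisfies it is shown here; the object name, namespaces, pod label, and port name are all hypothetical.

```yaml
# Illustrative only; every name below is an assumption.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app              # hypothetical
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app           # hypothetical pod label
  namespaceSelector:
    matchNames:
    - default                    # namespace in which the pods are looked up
  podMetricsEndpoints:
  - port: metrics                # name of a container port on the pod (assumed)
    path: /metrics
    interval: 30s
```

The example uses `port` rather than `targetPort` because the schema describes `targetPort` as deprecated for PodMonitor endpoints.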

File diff suppressed because it is too large

@ -0,0 +1,94 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.2.4
  creationTimestamp: null
  name: prometheusrules.monitoring.coreos.com
spec:
  group: monitoring.coreos.com
  names:
    kind: PrometheusRule
    listKind: PrometheusRuleList
    plural: prometheusrules
    singular: prometheusrule
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: PrometheusRule defines alerting rules for a Prometheus instance
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: Specification of desired alerting rule definitions for Prometheus.
            properties:
              groups:
                description: Content of Prometheus rule file
                items:
                  description: 'RuleGroup is a list of sequentially evaluated recording
                    and alerting rules. Note: PartialResponseStrategy is only used
                    by ThanosRuler and will be ignored by Prometheus instances. Valid
                    values for this field are ''warn'' or ''abort''. More info: https://github.com/thanos-io/thanos/blob/master/docs/components/rule.md#partial-response'
                  properties:
                    interval:
                      type: string
                    name:
                      type: string
                    partial_response_strategy:
                      type: string
                    rules:
                      items:
                        description: Rule describes an alerting or recording rule.
                        properties:
                          alert:
                            type: string
                          annotations:
                            additionalProperties:
                              type: string
                            type: object
                          expr:
                            anyOf:
                            - type: integer
                            - type: string
                            x-kubernetes-int-or-string: true
                          for:
                            type: string
                          labels:
                            additionalProperties:
                              type: string
                            type: object
                          record:
                            type: string
                        required:
                        - expr
                        type: object
                      type: array
                  required:
                  - name
                  - rules
                  type: object
                type: array
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
status:
  acceptedNames:
    kind: ""
    plural: ""
  conditions: []
  storedVersions: []
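
Besides `alert`, the schema accepts `record` on a rule and `interval` on a group; only `expr`, `name`, and `rules` are required. A sketch of a recording rule follows, assuming node-exporter metrics are scraped under `job="node-exporter"`. The object name and the recorded series name are hypothetical, and the labels are chosen to match a `ruleSelector` of `prometheus: k8s` and `role: alert-rules`.

```yaml
# Illustrative only; object and series names are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-recording-rule   # hypothetical
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
  - name: example-recording
    interval: 1m                 # optional per-group evaluation interval
    rules:
    # Stores memory utilisation as a percentage under a new series name.
    - record: instance:node_memory_utilisation:percent
      expr: |
        100 - (node_memory_MemAvailable_bytes{job="node-exporter"} * 100 / node_memory_MemTotal_bytes{job="node-exporter"})
```

An alerting rule could then compare the recorded series against a threshold instead of repeating the full expression.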

@ -0,0 +1,465 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.4
creationTimestamp: null
name: servicemonitors.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
kind: ServiceMonitor
listKind: ServiceMonitorList
plural: servicemonitors
singular: servicemonitor
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: ServiceMonitor defines monitoring for a set of services.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Service selection for target discovery
by Prometheus.
properties:
endpoints:
description: A list of endpoints allowed as part of this ServiceMonitor.
items:
description: Endpoint defines a scrapeable endpoint serving Prometheus
metrics.
properties:
basicAuth:
description: 'BasicAuth allows an endpoint to authenticate over
basic authentication. More info: https://prometheus.io/docs/operating/configuration/#endpoints'
properties:
password:
description: The secret in the service monitor namespace
that contains the password for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
username:
description: The secret in the service monitor namespace
that contains the username for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
type: object
bearerTokenFile:
description: File to read bearer token for scraping targets.
type: string
bearerTokenSecret:
description: Secret to mount to read bearer token for scraping
targets. The secret needs to be in the same namespace as the
service monitor and accessible by the Prometheus Operator.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
params:
additionalProperties:
items:
type: string
type: array
description: Optional HTTP URL parameters
type: object
path:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the service port this endpoint refers to.
Mutually exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before scraping.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
type: string
type: object
type: array
scheme:
description: HTTP scheme to use for scraping.
type: string
scrapeTimeout:
description: Timeout after which the scrape is ended
type: string
targetPort:
anyOf:
- type: integer
- type: string
description: Name or number of the pod port this endpoint refers
to. Mutually exclusive with port.
x-kubernetes-int-or-string: true
tlsConfig:
description: TLS configuration to use when scraping the endpoint
properties:
ca:
description: Struct containing the CA cert to use for the
targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
type: boolean
required:
- key
type: object
secret:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
type: boolean
required:
- key
type: object
type: object
caFile:
description: Path to the CA cert in the Prometheus container
to use for the targets.
type: string
cert:
description: Struct containing the client cert file for
the targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
type: boolean
required:
- key
type: object
secret:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
type: boolean
required:
- key
type: object
type: object
certFile:
description: Path to the client cert file in the Prometheus
container for the targets.
type: string
insecureSkipVerify:
description: Disable target certificate validation.
type: boolean
keyFile:
description: Path to the client key file in the Prometheus
container for the targets.
type: string
keySecret:
description: Secret containing the client key file for the
targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
serverName:
description: Used to verify the hostname for the targets.
type: string
type: object
type: object
type: array
jobLabel:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
items:
type: string
type: array
type: object
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Endpoints objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
type: object
type: object
targetLabels:
description: TargetLabels transfers labels on the Kubernetes Service
onto the target.
items:
type: string
type: array
required:
- endpoints
- selector
type: object
required:
- spec
type: object
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []
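
The `tlsConfig` block in this schema allows certificate verification as an alternative to `insecureSkipVerify: true`. A sketch of an endpoint that trusts a CA stored in a Secret is shown here. The Secret `example-ca`, its key, the `serverName`, the Service label, and the port name are all hypothetical.

```yaml
# Illustrative only; Secret, labels, port name, and serverName are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-tls              # hypothetical
  namespace: monitoring
  labels:
    k8s-app: example-tls         # hypothetical; see note below
spec:
  selector:
    matchLabels:
      app: example-tls           # hypothetical Service label
  endpoints:
  - port: https-metrics          # assumed name of the Service port
    scheme: https
    interval: 30s
    tlsConfig:
      ca:
        secret:
          name: example-ca       # hypothetical Secret holding the CA certificate
          key: ca.crt
      serverName: example-tls.monitoring.svc   # hostname to verify on the target certificate
```

For a Prometheus instance that filters ServiceMonitors by `k8s-app` values, the label above would also have to be one of the values that instance selects.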

File diff suppressed because it is too large

@ -0,0 +1,59 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: prometheus-operator
    app.kubernetes.io/version: v0.40.0
  name: prometheus-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/name: prometheus-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/name: prometheus-operator
        app.kubernetes.io/version: v0.40.0
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=jimmidyson/configmap-reload:v0.3.0
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.40.0
        image: quay.io/coreos/prometheus-operator:v0.40.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        # resources:
        #   limits:
        #     cpu: 200m
        #     memory: 200Mi
        #   requests:
        #     cpu: 100m
        #     memory: 100Mi
        securityContext:
          allowPrivilegeEscalation: false
      - args:
        - --logtostderr
        - --secure-listen-address=:8443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:8080/
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
        name: kube-rbac-proxy
        ports:
        - containerPort: 8443
          name: https
        securityContext:
          runAsUser: 65534
      nodeSelector:
        beta.kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: prometheus-operator

@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: monitoring

@ -0,0 +1,8 @@
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.40.0
name: prometheus-operator

@ -0,0 +1,18 @@
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
namespace: monitoring
spec:
clusterIP: None
ports:
- name: https
port: 8443
targetPort: https
selector:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: carts-db
labels:
name: carts-db
k8s-app: carts-db
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: carts-db
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: mongo
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: carts
labels:
name: carts
k8s-app: carts
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: carts
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: catalogue-db
labels:
name: catalogue-db
k8s-app: catalogue-db
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: catalogue-db
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: mysql
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: catalogue
labels:
name: catalogue
k8s-app: catalogue
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: catalogue
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: front-end
labels:
name: front-end
k8s-app: front-end
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: front-end
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: orders-db
labels:
name: orders-db
k8s-app: orders-db
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: orders-db
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: mongo
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: orders
labels:
name: orders
k8s-app: orders
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: orders
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: payment
labels:
name: payment
k8s-app: payment
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: payment
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,20 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: queue-master
labels:
name: queue-master
k8s-app: queue-master
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: queue-master
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
path: /prometheus
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: rabbitmq
labels:
name: rabbitmq
k8s-app: rabbitmq
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: rabbitmq
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: exporter
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: shipping
labels:
name: shipping
k8s-app: shipping
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: shipping
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: user-db
labels:
name: user-db
k8s-app: user-db
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: user-db
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: mongo
interval: 1s

@ -0,0 +1,19 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: user
labels:
name: user
k8s-app: user
namespace: sock-shop
spec:
jobLabel: name
selector:
matchLabels:
name: user
namespaceSelector:
matchNames:
- sock-shop
endpoints:
- port: web
interval: 1s
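
Each ServiceMonitor above only produces scrape targets if three things line up with a Service: the namespace listed in `namespaceSelector.matchNames`, the Service's own labels named in `selector.matchLabels`, and a named Service port equal to `endpoints[].port`. A minimal Python sketch of that cross-check follows. The function name is hypothetical and the check is simplified (it ignores `matchExpressions` and the default namespace behaviour); the sample values are copied from the `carts` ServiceMonitor and Service in this commit.

```python
from typing import Any, Dict, List


def check_monitor_against_service(monitor: Dict[str, Any],
                                  service: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the pair lines up."""
    problems: List[str] = []
    spec = monitor["spec"]
    meta = service["metadata"]

    # namespaceSelector.matchNames restricts which namespaces are searched.
    allowed = spec.get("namespaceSelector", {}).get("matchNames", [])
    if allowed and meta.get("namespace") not in allowed:
        problems.append(f"namespace {meta.get('namespace')!r} not in {allowed}")

    # Every matchLabels pair must be present on the Service itself.
    labels = meta.get("labels", {})
    for key, value in spec.get("selector", {}).get("matchLabels", {}).items():
        if labels.get(key) != value:
            problems.append(f"label {key}={value} missing on service")

    # endpoints[].port refers to a named Service port, not a port number.
    port_names = {port.get("name") for port in service["spec"]["ports"]}
    for endpoint in spec.get("endpoints", []):
        if endpoint["port"] not in port_names:
            problems.append(f"no service port named {endpoint['port']!r}")
    return problems


carts_monitor = {
    "metadata": {"name": "carts", "namespace": "sock-shop"},
    "spec": {
        "selector": {"matchLabels": {"name": "carts"}},
        "namespaceSelector": {"matchNames": ["sock-shop"]},
        "endpoints": [{"port": "web", "interval": "1s"}],
    },
}
carts_service = {
    "metadata": {"name": "carts", "namespace": "sock-shop",
                 "labels": {"name": "carts"}},
    "spec": {"ports": [{"port": 80, "name": "web", "targetPort": 80}]},
}

print(check_monitor_against_service(carts_monitor, carts_service))
```

Running the same check over every ServiceMonitor and Service pair is a cheap way to catch copy-paste mistakes in labels or port names before applying the manifests.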

@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
name: sock-shop

@ -0,0 +1,68 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: carts-db
name: carts-db
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: carts-db
template:
metadata:
labels:
app: sock-shop
name: carts-db
spec:
containers:
- image: mongo
imagePullPolicy: Always
name: carts-db
ports:
- containerPort: 27017
name: mongo
protocol: TCP
resources:
limits:
ephemeral-storage: 2Gi
requests:
ephemeral-storage: 1Gi
securityContext:
capabilities:
add:
- CHOWN
- SETGID
- SETUID
drop:
- all
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /tmp
name: tmp-volume
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: carts-db
labels:
name: carts-db
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 27017
name: mongo
targetPort: 27017
selector:
name: carts-db

@ -0,0 +1,80 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: carts
name: carts
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: carts
template:
metadata:
labels:
app: sock-shop
name: carts
spec:
containers:
- env:
- name: ZIPKIN
value: zipkin.jaeger.svc.cluster.local
- name: JAVA_OPTS
value: -Xms64m -Xmx128m -XX:PermSize=32m -XX:MaxPermSize=64m -XX:+UseG1GC
-Djava.security.egd=file:/dev/urandom
image: weaveworksdemos/carts:0.4.8
imagePullPolicy: IfNotPresent
name: carts
ports:
- containerPort: 80
protocol: TCP
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
volumeMounts:
- mountPath: /tmp
name: tmp-volume
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: carts
labels:
name: carts
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: carts

@ -0,0 +1,52 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: catalogue-db
name: catalogue-db
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: catalogue-db
template:
metadata:
labels:
app: sock-shop
name: catalogue-db
spec:
containers:
- env:
- name: MYSQL_ROOT_PASSWORD
value: fake_password
- name: MYSQL_DATABASE
value: socksdb
image: weaveworksdemos/catalogue-db:0.3.0
imagePullPolicy: IfNotPresent
name: catalogue-db
ports:
- containerPort: 3306
name: mysql
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: catalogue-db
labels:
name: catalogue-db
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 3306
name: mysql
targetPort: 3306
selector:
name: catalogue-db

@ -0,0 +1,67 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: catalogue
name: catalogue
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: catalogue
template:
metadata:
labels:
app: sock-shop
name: catalogue
spec:
containers:
- image: weaveworksdemos/catalogue:0.3.5
imagePullPolicy: IfNotPresent
name: catalogue
ports:
- containerPort: 80
protocol: TCP
resources: {}
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
---
apiVersion: v1
kind: Service
metadata:
name: catalogue
labels:
name: catalogue
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: catalogue

@ -0,0 +1,70 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: front-end
name: front-end
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: front-end
template:
metadata:
labels:
app: sock-shop
name: front-end
spec:
containers:
- image: weaveworksdemos/front-end:0.3.12
imagePullPolicy: IfNotPresent
name: front-end
ports:
- containerPort: 8079
protocol: TCP
resources:
requests:
cpu: 100m
memory: 100Mi
livenessProbe:
httpGet:
path: /
port: 8079
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /
port: 8079
initialDelaySeconds: 30
periodSeconds: 3
securityContext:
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
---
apiVersion: v1
kind: Service
metadata:
name: front-end
labels:
name: front-end
namespace: sock-shop
spec:
type: LoadBalancer
ports:
- port: 80
name: web
targetPort: 8079
nodePort: 30001
selector:
name: front-end

@ -0,0 +1,64 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: orders-db
name: orders-db
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: orders-db
template:
metadata:
labels:
app: sock-shop
name: orders-db
spec:
containers:
- image: mongo
imagePullPolicy: Always
name: orders-db
ports:
- containerPort: 27017
name: mongo
protocol: TCP
resources: {}
securityContext:
capabilities:
add:
- CHOWN
- SETGID
- SETUID
drop:
- all
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /tmp
name: tmp-volume
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: orders-db
labels:
name: orders-db
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 27017
name: mongo
targetPort: 27017
selector:
name: orders-db

@ -0,0 +1,82 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: orders
name: orders
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: orders
template:
metadata:
labels:
app: sock-shop
name: orders
spec:
containers:
- env:
- name: ZIPKIN
value: zipkin.jaeger.svc.cluster.local
- name: JAVA_OPTS
value: -Xms64m -Xmx128m -XX:PermSize=32m -XX:MaxPermSize=64m -XX:+UseG1GC
-Djava.security.egd=file:/dev/urandom
image: weaveworksdemos/orders:0.4.7
imagePullPolicy: IfNotPresent
name: orders
ports:
- containerPort: 80
protocol: TCP
resources: {}
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
volumeMounts:
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: orders
labels:
name: orders
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: orders

@ -0,0 +1,67 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: payment
name: payment
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: payment
template:
metadata:
labels:
app: sock-shop
name: payment
spec:
containers:
- image: weaveworksdemos/payment:0.4.3
imagePullPolicy: IfNotPresent
name: payment
ports:
- containerPort: 80
protocol: TCP
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
---
apiVersion: v1
kind: Service
metadata:
name: payment
labels:
name: payment
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: payment

@ -0,0 +1,60 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: queue-master
name: queue-master
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: queue-master
template:
metadata:
labels:
app: sock-shop
name: queue-master
spec:
containers:
- image: weaveworksdemos/queue-master:0.3.1
imagePullPolicy: IfNotPresent
name: queue-master
ports:
- containerPort: 80
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
---
apiVersion: v1
kind: Service
metadata:
name: queue-master
labels:
name: queue-master
annotations:
prometheus.io/path: "/prometheus"
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: queue-master

@ -0,0 +1,61 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: rabbitmq
name: rabbitmq
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: rabbitmq
template:
metadata:
labels:
app: sock-shop
name: rabbitmq
spec:
containers:
- image: rabbitmq:3.6.8
imagePullPolicy: IfNotPresent
name: rabbitmq
ports:
- containerPort: 5672
protocol: TCP
resources: {}
securityContext:
capabilities:
add:
- CHOWN
- SETGID
- SETUID
- DAC_OVERRIDE
drop:
- all
readOnlyRootFilesystem: true
---
apiVersion: v1
kind: Service
metadata:
name: rabbitmq
labels:
name: rabbitmq
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 5672
name: rabbitmq
targetPort: 5672
- port: 9090
name: exporter
targetPort: exporter
protocol: TCP
selector:
name: rabbitmq

@ -0,0 +1,82 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: shipping
name: shipping
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: shipping
template:
metadata:
labels:
app: sock-shop
name: shipping
spec:
containers:
- env:
- name: ZIPKIN
value: zipkin.jaeger.svc.cluster.local
- name: JAVA_OPTS
value: -Xms64m -Xmx128m -XX:PermSize=32m -XX:MaxPermSize=64m -XX:+UseG1GC
-Djava.security.egd=file:/dev/urandom
image: weaveworksdemos/shipping:0.4.8
imagePullPolicy: IfNotPresent
name: shipping
ports:
- containerPort: 80
protocol: TCP
resources: {}
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
volumeMounts:
- mountPath: /tmp
name: tmp-volume
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: shipping
labels:
name: shipping
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: shipping

@ -0,0 +1,63 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: user-db
name: user-db
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: user-db
template:
metadata:
labels:
app: sock-shop
name: user-db
spec:
containers:
- image: weaveworksdemos/user-db:0.4.0
imagePullPolicy: IfNotPresent
name: user-db
ports:
- containerPort: 27017
name: mongo
protocol: TCP
resources: {}
securityContext:
capabilities:
add:
- CHOWN
- SETGID
- SETUID
drop:
- all
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /tmp
name: tmp-volume
volumes:
- emptyDir:
medium: Memory
name: tmp-volume
---
apiVersion: v1
kind: Service
metadata:
name: user-db
labels:
name: user-db
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 27017
name: mongo
targetPort: 27017
selector:
name: user-db

@ -0,0 +1,27 @@
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
name: user-load
name: user-load
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
name: user-load
template:
metadata:
creationTimestamp: null
labels:
name: user-load
spec:
containers:
- args:
- -h
- front-end:80
- -r
- "9999999"
image: weaveworksdemos/load-test
imagePullPolicy: Always
name: user-load

@ -0,0 +1,70 @@
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
litmuschaos.io/chaos: "true"
labels:
app: sock-shop
name: user
name: user
namespace: sock-shop
spec:
replicas: 1
selector:
matchLabels:
app: sock-shop
name: user
template:
metadata:
labels:
app: sock-shop
name: user
spec:
containers:
- env:
- name: MONGO_HOST
value: user-db:27017
image: weaveworksdemos/user:0.4.7
imagePullPolicy: IfNotPresent
name: user
ports:
- containerPort: 80
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 300
periodSeconds: 3
readinessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 180
periodSeconds: 3
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 10001
---
apiVersion: v1
kind: Service
metadata:
name: user
labels:
name: user
namespace: sock-shop
spec:
ports:
# the port that this service should serve on
- port: 80
name: web
targetPort: 80
selector:
name: user

@ -0,0 +1,26 @@
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: catalogue-node-cpu-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=catalogue'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: node-cpu-hog
spec:
components:
env:
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: NODE_CPU_CORE
value: '1'

@ -0,0 +1,33 @@
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: catalogue-pod-cpu-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=catalogue'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: pod-cpu-hog
spec:
components:
experimentImage: "litmuschaos/go-runner:ci"
env:
- name: TARGET_CONTAINER
value: 'catalogue'
- name: CPU_CORES
value: '1'
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: CHAOS_KILL_COMMAND
value: "kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print$1}' | tr '\n' ' ')"
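
The `CHAOS_KILL_COMMAND` above filters `ps afx` output with `grep "[md5sum] /dev/zero"`. The brackets form a character class, so the expression matches one of the letters `m`, `d`, `5`, `s`, `u` followed by ` /dev/zero`, and the `grep` process itself is not selected because its own command line has `]` before ` /dev/zero`. A minimal Python sketch of that behaviour follows. It assumes the stress process shows up in `ps` as `md5sum /dev/zero`; the command lines below are hypothetical samples, and Python's `re` is used only because it agrees with grep for this particular pattern.

```python
import re

# Same expression that the CHAOS_KILL_COMMAND passes to grep.
KILL_PATTERN = re.compile(r"[md5sum] /dev/zero")


def selected_for_kill(command_line: str) -> bool:
    """True if a `ps afx` command column would survive the grep filter."""
    return KILL_PATTERN.search(command_line) is not None


# Hypothetical command columns, as `ps afx` might print them in the target container.
stress_process = "md5sum /dev/zero"
grep_process = "grep [md5sum] /dev/zero"

print(selected_for_kill(stress_process))
print(selected_for_kill(grep_process))
```

Because the class matches a single character rather than the word `md5sum`, any other process whose command line contains one of those characters directly before ` /dev/zero` would be selected as well.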

@ -0,0 +1,26 @@
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: orders-node-memory-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=orders'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: node-memory-hog
spec:
components:
env:
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: MEMORY_PERCENTAGE
value: '90' # in percent

@ -0,0 +1,33 @@
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: orders-pod-memory-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=orders'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: pod-memory-hog
spec:
components:
experimentImage: "litmuschaos/go-runner:ci"
env:
- name: TARGET_CONTAINER
value: 'orders'
- name: MEMORY_CONSUMPTION
value: '500'
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: CHAOS_KILL_COMMAND
value: "kill -9 $(ps afx | grep \"[dd] if /dev/zero\" | awk '{print $1}' | tr '\n' ' ')"

@ -0,0 +1,54 @@
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
name: node-cpu-cron-wf
namespace: litmus
spec:
schedule: "0-14/15 * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: argowf-chaos
serviceAccountName: argo-chaos
templates:
- name: argowf-chaos
steps:
- - name: run-node-cpu-hog
template: run-node-cpu-hog
- name: run-node-cpu-hog
inputs:
artifacts:
- name: run-node-cpu-hog
path: /tmp/chaosengine-node-cpu-hog.yaml
raw:
data: |
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: catalogue-node-cpu-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=catalogue'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: node-cpu-hog
spec:
components:
env:
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: NODE_CPU_CORE
value: '1'
container:
image: lachlanevenson/k8s-kubectl
command: [sh, -c]
args:
["kubectl apply -f /tmp/chaosengine-node-cpu-hog.yaml -n litmus"]

@ -0,0 +1,59 @@
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
name: pod-cpu-cron-wf
namespace: litmus
spec:
schedule: "15-29/15 * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: argowf-chaos
serviceAccountName: argo-chaos
templates:
- name: argowf-chaos
steps:
- - name: run-pod-cpu-hog
template: run-pod-cpu-hog
- name: run-pod-cpu-hog
inputs:
artifacts:
- name: run-pod-cpu-hog
path: /tmp/chaosengine-pod-cpu-hog.yaml
raw:
data: |
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: catalogue-pod-cpu-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=catalogue'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: pod-cpu-hog
spec:
components:
experimentImage: "litmuschaos/go-runner:ci"
env:
- name: TARGET_CONTAINER
value: 'catalogue'
- name: CPU_CORES
value: '1'
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: CHAOS_KILL_COMMAND
value: "kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print$1}' | tr '\n' ' ')"
container:
image: lachlanevenson/k8s-kubectl
command: [sh, -c]
args: ["kubectl apply -f /tmp/chaosengine-pod-cpu-hog.yaml -n litmus"]

@ -0,0 +1,54 @@
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
name: node-memory-cron-wf
namespace: litmus
spec:
schedule: "30-44/15 * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: argowf-chaos
serviceAccountName: argo-chaos
templates:
- name: argowf-chaos
steps:
- - name: run-node-memory-hog
template: run-node-memory-hog
- name: run-node-memory-hog
inputs:
artifacts:
- name: run-node-memory-hog
path: /tmp/chaosengine-node-memory-hog.yaml
raw:
data: |
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: orders-node-memory-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=orders'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: node-memory-hog
spec:
components:
env:
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: MEMORY_PERCENTAGE
value: '90' # in percent
container:
image: lachlanevenson/k8s-kubectl
command: [sh, -c]
args:
["kubectl apply -f /tmp/chaosengine-node-memory-hog.yaml -n litmus"]

@ -0,0 +1,59 @@
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
name: pod-memory-cron-wf
namespace: litmus
spec:
schedule: "45-59/15 * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: argowf-chaos
serviceAccountName: argo-chaos
templates:
- name: argowf-chaos
steps:
- - name: run-pod-memory-hog
template: run-pod-memory-hog
- name: run-pod-memory-hog
inputs:
artifacts:
- name: run-pod-memory-hog
path: /tmp/chaosengine-pod-memory-hog.yaml
raw:
data: |
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: orders-pod-memory-hog
namespace: litmus
spec:
annotationCheck: 'false'
engineState: 'active'
auxiliaryAppInfo: ''
appinfo:
appns: 'sock-shop'
applabel: 'name=orders'
appkind: 'deployment'
chaosServiceAccount: litmus-admin
monitoring: true
jobCleanUpPolicy: 'retain'
experiments:
- name: pod-memory-hog
spec:
components:
experimentImage: "litmuschaos/go-runner:ci"
env:
- name: TARGET_CONTAINER
value: 'orders'
- name: MEMORY_CONSUMPTION
value: '500'
- name: TOTAL_CHAOS_DURATION
value: '240' # in seconds
- name: CHAOS_KILL_COMMAND
value: "kill -9 $(ps afx | grep \"[dd] if /dev/zero\" | awk '{print $1}' | tr '\n' ' ')"
container:
image: lachlanevenson/k8s-kubectl
command: [sh, -c]
args:
["kubectl apply -f /tmp/chaosengine-pod-memory-hog.yaml -n litmus"]
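
The four CronWorkflows use the minute fields `0-14/15`, `15-29/15`, `30-44/15` and `45-59/15`. Each is a range with a step of 15, so each range yields only its first minute and the workflows are staggered a quarter of an hour apart, which leaves room for the 240-second chaos duration. A minimal Python sketch of that expansion follows. The helper is hypothetical and handles only `*`, single values and ranges with an optional step; it assumes the scheduler follows the usual cron range/step rules.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expand a cron minute field such as '0-14/15' into sorted minutes."""
    minutes = set()
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            low, high = 0, 59
        elif "-" in rng:
            low_text, high_text = rng.split("-")
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(rng)
        # Start at the low end of the range and advance by the step.
        minutes.update(range(low, high + 1, step))
    return sorted(minutes)


schedules = {
    "node-cpu-cron-wf": "0-14/15 * * * *",
    "pod-cpu-cron-wf": "15-29/15 * * * *",
    "node-memory-cron-wf": "30-44/15 * * * *",
    "pod-memory-cron-wf": "45-59/15 * * * *",
}
for name, schedule in schedules.items():
    minute_field = schedule.split()[0]
    print(name, expand_minute_field(minute_field))
```

Under these assumptions the four workflows fire at minutes 0, 15, 30 and 45 of every hour respectively.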