# TensorFlow Model Serving on Kubernetes
## 🎯 Purpose / What You'll Learn
This example demonstrates how to deploy a TensorFlow model for inference using [TensorFlow Serving](https://www.tensorflow.org/tfx/serving) on Kubernetes. You'll learn how to:
- Set up TensorFlow Serving with a pre-trained model
- Use a PersistentVolume to mount your model directory
- Expose the inference endpoint using a Kubernetes `Service` and `Ingress`
- Send a sample prediction request to the model
---
## 📚 Table of Contents
- [Prerequisites](#prerequisites)
- [Quick Start / TL;DR](#quick-start--tldr)
- [Detailed Steps & Explanation](#detailed-steps--explanation)
- [Verification / Seeing it Work](#verification--seeing-it-work)
- [Configuration Customization](#configuration-customization)
- [Cleanup](#cleanup)
- [Further Reading / Next Steps](#further-reading--next-steps)
---
## ⚙️ Prerequisites
- Kubernetes cluster (tested with v1.29+)
- `kubectl` configured
- Optional: `ingress-nginx` for external access
- x86-based machine (for running the TensorFlow Serving image)
- Local hostPath support (for demo) or a cloud-based PVC
---
## ⚡ Quick Start / TL;DR
```bash
# Create demo model directory
mkdir -p /mnt/models/my_model/1
wget https://storage.googleapis.com/tf-serving-models/resnet_v2.tar.gz
tar -xzvf resnet_v2.tar.gz -C /mnt/models/my_model/1 --strip-components=1
# Apply manifests
kubectl apply -f pvc.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml # Optional
```
> ⚠️ Note: You must create a PersistentVolumeClaim named `my-model-pvc` that points to the directory where your SavedModel is stored (`/models/my_model/1/saved_model.pb`).

---
## 🧩 Detailed Steps & Explanation
### 1. PersistentVolume & PVC Setup
> ⚠️ Note: For local testing, `hostPath` is used to mount `/mnt/models/my_model`. In production, replace this with a cloud-native storage backend (e.g., AWS EBS, GCP PD, or NFS).
```yaml
# pvc.yaml (example snippet)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-model-pv
spec:
  hostPath:
    path: /mnt/models/my_model
  ...
```
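
The PV above needs a matching claim (the `my-model-pvc` name comes from the note after the Quick Start). A minimal sketch; the access mode, empty `storageClassName`, and 1Gi request are assumptions for a single-node hostPath demo and may need adjusting:

```yaml
# pvc.yaml (example snippet, continued)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-model-pvc
spec:
  accessModes:
    - ReadWriteOnce          # assumption: must match the PV's access modes
  storageClassName: ""       # bind to the pre-created PV, not a dynamic provisioner
  resources:
    requests:
      storage: 1Gi           # illustrative size
```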
Your model directory (mounted on the node at `/mnt/models/my_model/`) should look like this:
```
/mnt/models/my_model/
└── 1/
    ├── saved_model.pb
    └── variables/
```
✅ How to Load a Demo Model (if using local clusters)
```bash
mkdir -p /mnt/models/my_model/1
wget https://storage.googleapis.com/tf-serving-models/resnet_v2.tar.gz
tar -xzvf resnet_v2.tar.gz -C /mnt/models/my_model/1 --strip-components=1
```
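
Before pointing TensorFlow Serving at the directory, you can optionally sanity-check the SavedModel, assuming you have TensorFlow (which bundles `saved_model_cli`) installed locally:

```bash
# Print the serving signature of the extracted model; a valid SavedModel
# should list a serving_default signature with its input/output tensors.
saved_model_cli show --dir /mnt/models/my_model/1 \
  --tag_set serve --signature_def serving_default
```

---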
### 2. Deploy TensorFlow Serving
```yaml
# deployment.yaml (example snippet)
containers:
  - name: tf-serving
    image: tensorflow/serving:2.13.0
    args:
      - --model_name=my_model
      - --model_base_path=/models/my_model
    volumeMounts:
      - name: model-volume
        mountPath: /models/my_model
```
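
The `model-volume` mount has to be backed by the claim from step 1. A sketch of the matching `volumes` entry in the same pod spec (the `my-model-pvc` name is the one assumed earlier):

```yaml
# deployment.yaml (example snippet, continued)
volumes:
  - name: model-volume
    persistentVolumeClaim:
      claimName: my-model-pvc   # assumption: the PVC created in step 1
```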
---
### 3. Expose the Service
- A `ClusterIP` service exposes gRPC (8500) and REST (8501).
- An optional `Ingress` exposes `/tf/v1/models/my_model:predict` to external clients; this requires an Ingress Controller (e.g., NGINX) installed in the cluster.

Update the `host` value in `ingress.yaml` to match your domain, or use a `LoadBalancer` service for direct access instead of Ingress.
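
Minimal sketches of the two objects follow. The `app: tf-serving` selector, the `tf.example.com` host, and the NGINX rewrite annotation are assumptions for illustration; the rewrite strips the `/tf` prefix so TensorFlow Serving sees its native `/v1/...` paths.

```yaml
# service.yaml (example sketch)
apiVersion: v1
kind: Service
metadata:
  name: tf-serving
spec:
  type: ClusterIP
  selector:
    app: tf-serving        # assumption: must match the Deployment's pod labels
  ports:
    - name: grpc
      port: 8500
      targetPort: 8500
    - name: rest
      port: 8501
      targetPort: 8501
---
# ingress.yaml (example sketch, assumes ingress-nginx)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tf-serving
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2   # drop the /tf prefix
spec:
  ingressClassName: nginx
  rules:
    - host: tf.example.com   # replace with your domain
      http:
        paths:
          - path: /tf(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: tf-serving
                port:
                  number: 8501
```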
---
## ✅ Verification / Seeing it Work
If using ingress:
```bash
curl -X POST http://<ingress-host>/tf/v1/models/my_model:predict \
-H "Content-Type: application/json" \
-d '{ "instances": [[1.0, 2.0, 5.0]] }'
```
Expected output:
```json
{
"predictions": [...]
}
```
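
Without Ingress, a port-forward works just as well (the `tf-serving` Service name is the one assumed in the sketch above; note there is no `/tf` prefix when talking to the service directly):

```bash
# Forward the REST port to localhost, then send the same request.
kubectl port-forward svc/tf-serving 8501:8501 &
curl -X POST http://localhost:8501/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{ "instances": [[1.0, 2.0, 5.0]] }'
```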
To verify the pod is running:
```bash
kubectl get pods
kubectl logs <tf-serving-pod-name>
```
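
TensorFlow Serving's REST API also exposes a model status endpoint, useful when predictions fail and you want to confirm the model loaded at all (shown here via the port-forward from above):

```bash
# Lists each loaded model version and its state (e.g., "AVAILABLE").
curl http://localhost:8501/v1/models/my_model
```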
---
## 🛠️ Configuration Customization
- Update `model_name` and `model_base_path` in the deployment
- Replace `hostPath` with a `PersistentVolumeClaim` bound to cloud storage
- Modify resource requests/limits for the TensorFlow Serving container (see the sketch below)
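
For the last item, a sketch of a resources stanza on the `tf-serving` container; the numbers are placeholders and should be sized to your model's actual footprint:

```yaml
# deployment.yaml (example sketch) — illustrative values only
resources:
  requests:
    cpu: "500m"
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 4Gi
```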
---
## 🧹 Cleanup
```bash
kubectl delete -f ingress.yaml
kubectl delete -f service.yaml
kubectl delete -f deployment.yaml
kubectl delete -f pvc.yaml
```
---
## 📘 Further Reading / Next Steps
- [TensorFlow Serving](https://www.tensorflow.org/tfx/serving)
- [TF Serving REST API Reference](https://www.tensorflow.org/tfx/serving/api_rest)
- [Kubernetes Ingress Controller](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/)
- [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)