# TensorFlow Model Serving on Kubernetes
## 1. Purpose / What You'll Learn

This example demonstrates how to deploy a TensorFlow model for inference using TensorFlow Serving on Kubernetes. You'll learn how to:

- Set up TensorFlow Serving with a pre-trained model
- Use a PersistentVolume to mount your model directory
- Expose the inference endpoint using a Kubernetes `Service` and `Ingress`
- Send a sample prediction request to the model
## 📚 Table of Contents

- Prerequisites
- Quick Start / TL;DR
- Detailed Steps & Explanation
- Verification / Seeing it Work
- Configuration Customization
- Cleanup
- Further Reading / Next Steps
## ⚙️ Prerequisites

- Kubernetes cluster (tested with v1.29+)
- `kubectl` configured
- Optional: `ingress-nginx` for external access
- x86-based machine (for running the TensorFlow Serving image)
- Local `hostPath` support (for the demo) or a cloud-based PVC
## ⚡ Quick Start / TL;DR

```shell
# Apply manifests
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml # Optional
```
## 2. Detailed Steps & Explanation
### 1. PersistentVolume & PVC Setup
> ⚠️ Note: For local testing, `hostPath` is used to mount `/mnt/models/my_model`. In production, replace this with a cloud-native storage backend (e.g., AWS EBS, GCP PD, or NFS).
Model folder structure:

```text
/mnt/models/my_model/
└── 1/
    ├── saved_model.pb
    └── variables/
```
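For orientation, a `hostPath`-backed PV/PVC pair along these lines would match the note above. This is an illustrative sketch only (the names, size, and `storageClassName` here are assumptions); the actual manifests are in `pv.yaml` and `pvc.yaml`:

```yaml
# Illustrative sketch -- see pv.yaml / pvc.yaml for the real manifests.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tf-model-pv            # hypothetical name
spec:
  capacity:
    storage: 1Gi               # illustrative size
  accessModes:
    - ReadOnlyMany
  storageClassName: manual     # must match the PVC below so they bind
  hostPath:
    path: /mnt/models/my_model # local demo path from the note above
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tf-model-pvc           # hypothetical name
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: manual
  resources:
    requests:
      storage: 1Gi
```

In production the `hostPath` block would be replaced by a cloud volume source (EBS, PD, NFS), with the rest of the claim unchanged.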
### 2. Expose the Service
- A `ClusterIP` service exposes gRPC (8500) and REST (8501).
- An optional `Ingress` exposes `/tf/v1/models/my_model:predict` to external clients.

Update the `host` value in `ingress.yaml` to match your domain.
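As a rough sketch of how the `/tf` prefix can be routed to the REST port, an `ingress-nginx` rule with a rewrite might look like the following. The name, rewrite annotation, and service name here are assumptions for illustration; the actual manifest is `ingress.yaml`:

```yaml
# Illustrative sketch, assuming the ingress-nginx controller;
# see ingress.yaml for the actual manifest.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tf-serving-ingress                          # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2  # strips the /tf prefix
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /tf(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: tf-serving                     # hypothetical service name
                port:
                  number: 8501                       # REST port
```

With a rule like this, an external request to `/tf/v1/models/my_model:predict` reaches TensorFlow Serving as `/v1/models/my_model:predict`.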
## 3. Verification / Seeing it Work
If using ingress:

```shell
curl -X POST http://<ingress-host>/tf/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{ "instances": [[1.0, 2.0, 5.0]] }'
```
Expected output:

```json
{
  "predictions": [...]
}
```
To verify the pod is running:

```shell
kubectl get pods
kubectl wait --for=condition=Available deployment/tf-serving --timeout=300s
kubectl logs deployment/tf-serving
```
## 🛠️ Configuration Customization

- Update `model_name` and `model_base_path` in the deployment
- Replace `hostPath` with a `PersistentVolumeClaim` bound to cloud storage
- Modify resource requests/limits for the TensorFlow container
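For the last point, requests/limits are set on the serving container spec in `deployment.yaml`. A sketch with illustrative values (the container name is an assumption; the image tag matches this example):

```yaml
# Fragment of a Deployment pod spec -- values are illustrative,
# tune them to your model's size and expected traffic.
containers:
  - name: tf-serving                    # hypothetical container name
    image: tensorflow/serving:2.19.0
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
```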
## 🧹 Cleanup

```shell
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/ingress.yaml # Optional
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/service.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/deployment.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pvc.yaml
kubectl delete -f https://raw.githubusercontent.com/kubernetes/examples/refs/heads/master/ai/model-serving-tensorflow/pv.yaml
```