(deployment-kubeai)=

# KubeAI

KubeAI is a Kubernetes operator for deploying and managing AI models on Kubernetes. It provides a simple, scalable way to run vLLM in production. Features such as scale-from-zero, load-based autoscaling, and model caching are provided out of the box, with zero external dependencies.

Please see the KubeAI Installation Guides for environment-specific instructions.

Once you have KubeAI installed, you can configure text generation models using vLLM.
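
To make the configuration step concrete, here is a minimal sketch that builds a manifest for a KubeAI `Model` resource and prints it as JSON. It is an illustration under stated assumptions, not a reference: the `apiVersion`, the `spec` field names (`features`, `url`, `engine`, `args`, `resourceProfile`, `minReplicas`), and the resource profile name are assumed from KubeAI's Model custom resource and should be checked against the KubeAI documentation for your installed version. The model name and URL are hypothetical placeholders.

```python
import json


def build_model_manifest(name, url, resource_profile, min_replicas=0, vllm_args=None):
    """Return a dict shaped like a KubeAI Model resource.

    All field names below are assumptions about the KubeAI Model CRD;
    verify them against the KubeAI docs before applying the manifest.
    """
    return {
        "apiVersion": "kubeai.org/v1",  # assumed API group/version
        "kind": "Model",
        "metadata": {"name": name},
        "spec": {
            "features": ["TextGeneration"],
            "url": url,
            "engine": "VLLM",  # ask KubeAI to serve the model with vLLM
            "args": list(vllm_args or []),  # extra flags passed to vLLM
            "resourceProfile": resource_profile,
            "minReplicas": min_replicas,  # 0 relies on scale-from-zero
        },
    }


manifest = build_model_manifest(
    name="example-model",  # hypothetical model name
    url="hf://example-org/example-model",  # hypothetical model URL
    resource_profile="nvidia-gpu-l4:1",  # assumed profile name
    vllm_args=["--max-model-len=8192"],
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped straight to the cluster, for example `python build_model.py | kubectl apply -f -`, where `build_model.py` is a hypothetical file name for the script above. Setting `minReplicas` to 0 is intended to exercise the scale-from-zero behavior described earlier.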