docs: Add tutorial on deploying vLLM model with KServe (#2586)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Yuan Tang 2024-03-01 14:04:14 -05:00 committed by GitHub
parent 27ca23dc00
commit 49d849b3ab
2 changed files with 9 additions and 0 deletions

docs/source/index.rst

@@ -70,6 +70,7 @@ Documentation
   serving/distributed_serving
   serving/run_on_sky
+   serving/deploying_with_kserve
   serving/deploying_with_triton
   serving/deploying_with_docker
   serving/serving_with_langchain

docs/source/serving/deploying_with_kserve.rst

@@ -0,0 +1,8 @@
+.. _deploying_with_kserve:
+
+Deploying with KServe
+=====================
+
+vLLM can be deployed with `KServe <https://github.com/kserve/kserve>`_ on Kubernetes for highly scalable distributed model serving.
+
+Please see `this guide <https://kserve.github.io/website/latest/modelserving/v1beta1/llm/vllm/>`_ for more details on using vLLM with KServe.
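
For orientation, here is a minimal sketch of what a KServe ``InferenceService`` for vLLM can look like, using KServe's custom predictor container support. The image tag, model name, and resource values are illustrative assumptions, not the linked guide's exact configuration:

.. code-block:: yaml

    # Illustrative sketch only: a custom predictor container running vLLM's
    # OpenAI-compatible server. Image, model, and resources are assumptions.
    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: vllm-example                    # hypothetical service name
    spec:
      predictor:
        containers:
          - name: kserve-container          # conventional name for a custom predictor container
            image: vllm/vllm-openai:latest  # pin a specific version in practice
            args:
              - --model
              - mistralai/Mistral-7B-v0.1   # hypothetical model choice
            ports:
              - containerPort: 8000         # vLLM server's default port
            resources:
              limits:
                nvidia.com/gpu: "1"

Applied with ``kubectl apply -f``, KServe reconciles this into a serving deployment; once ``kubectl get inferenceservice vllm-example`` reports the service ready, the OpenAI-compatible API is reachable at the reported URL.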