docs: Add tutorial on deploying vLLM model with KServe (#2586)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Yuan Tang 2024-03-01 14:04:14 -05:00 committed by GitHub
parent 27ca23dc00
commit 49d849b3ab
2 changed files with 9 additions and 0 deletions

docs/source/index.rst

@@ -70,6 +70,7 @@ Documentation
   serving/distributed_serving
   serving/run_on_sky
+   serving/deploying_with_kserve
   serving/deploying_with_triton
   serving/deploying_with_docker
   serving/serving_with_langchain

docs/source/serving/deploying_with_kserve.rst

@@ -0,0 +1,8 @@
+.. _deploying_with_kserve:
+
+Deploying with KServe
+=====================
+
+vLLM can be deployed with `KServe <https://github.com/kserve/kserve>`_ on Kubernetes for highly scalable distributed model serving.
+
+Please see `this guide <https://kserve.github.io/website/latest/modelserving/v1beta1/llm/vllm/>`_ for more details on using vLLM with KServe.
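
For orientation, here is a minimal sketch of what a KServe ``InferenceService`` for vLLM can look like, using KServe's custom predictor container support. The image tag, model name, and resource values are illustrative assumptions, not the linked guide's exact configuration:

.. code-block:: yaml

    # Illustrative sketch only: a custom predictor container running vLLM's
    # OpenAI-compatible server. Image, model, and resources are assumptions.
    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: vllm-example                    # hypothetical service name
    spec:
      predictor:
        containers:
          - name: kserve-container          # conventional name for a custom predictor container
            image: vllm/vllm-openai:latest  # pin a specific version in practice
            args:
              - --model
              - mistralai/Mistral-7B-v0.1   # hypothetical model choice
            ports:
              - containerPort: 8000         # vLLM server's default port
            resources:
              limits:
                nvidia.com/gpu: "1"

Applied with ``kubectl apply -f``, KServe reconciles this into a serving deployment; once ``kubectl get inferenceservice vllm-example`` reports the service ready, the OpenAI-compatible API is reachable at the reported URL.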