vllm/docs/source/serving
Latest commit: Simon Mo (ef65dcfa6f), [Doc] Add docs about OpenAI compatible server (#3288), 2024-03-18 22:05:34 -07:00
File                           Last commit message                                              Date
deploying_with_bentoml.rst     docs: Add BentoML deployment doc (#3336)                         2024-03-12 10:34:30 -07:00
deploying_with_docker.rst      [Docker] Add cuda arch list as build option (#1950)              2023-12-08 09:53:47 -08:00
deploying_with_kserve.rst      docs: Add tutorial on deploying vLLM model with KServe (#2586)   2024-03-01 11:04:14 -08:00
deploying_with_triton.rst      Add documentation to Triton server tutorial (#983)               2023-09-20 10:32:40 -07:00
distributed_serving.rst        [Doc] Documentation for distributed inference (#261)             2023-06-26 11:34:23 -07:00
integrations.rst               [Doc] Add docs about OpenAI compatible server (#3288)            2024-03-18 22:05:34 -07:00
metrics.rst                    Add Production Metrics in Prometheus format (#1890)              2023-12-02 16:37:44 -08:00
openai_compatible_server.md    [Doc] Add docs about OpenAI compatible server (#3288)            2024-03-18 22:05:34 -07:00
run_on_sky.rst                 Update run_on_sky.rst (#2025)                                    2023-12-11 10:32:58 -08:00
serving_with_langchain.rst     docs: fix langchain (#2736)                                      2024-02-03 18:17:55 -08:00