mirror of https://github.com/vllm-project/vllm.git
Disaggregated Serving
This example contains scripts that demonstrate the disaggregated serving features of vLLM.
Files
- disagg_proxy_demo.py: Demonstrates XpYd (X prefill instances, Y decode instances).
- kv_events.sh: Demonstrates KV cache event publishing.
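To make the XpYd idea concrete, here is a minimal sketch of how a proxy might pair each request with one prefill instance and one decode instance. It is not taken from disagg_proxy_demo.py: the `XpYdRouter` class, the round-robin policy, and the addresses are all assumptions made for illustration.

```python
from itertools import cycle
from typing import Iterator, List, Tuple


class XpYdRouter:
    """Hypothetical round-robin router over X prefill and Y decode instances."""

    def __init__(self, prefill: List[str], decode: List[str]) -> None:
        if not prefill or not decode:
            raise ValueError("need at least one prefill and one decode instance")
        # cycle() walks each pool endlessly, wrapping back to the first entry.
        self._prefill: Iterator[str] = cycle(prefill)
        self._decode: Iterator[str] = cycle(decode)

    def route(self) -> Tuple[str, str]:
        """Pick the next (prefill, decode) pair for one incoming request."""
        return next(self._prefill), next(self._decode)


# 2P1D: two prefill instances and one decode instance (made-up addresses).
router = XpYdRouter(
    prefill=["localhost:8100", "localhost:8101"],
    decode=["localhost:8200"],
)
pairs = [router.route() for _ in range(3)]
for prefill_addr, decode_addr in pairs:
    print(f"prefill on {prefill_addr}, decode on {decode_addr}")
```

In this sketch the two pools rotate independently, so X and Y can be sized separately to match the prompt-processing and token-generation load.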