mirror of https://github.com/vllm-project/vllm.git
Disaggregated Prefill V1
This example contains scripts that demonstrate disaggregated prefill in the offline setting of vLLM.
Files
- run.sh
  - A helper script that runs prefill_example.py and decode_example.py sequentially.
  - Make sure you are in the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
- prefill_example.py
  - A script which performs prefill only, saving the KV state to the local_storage directory and the prompts to output.txt.
- decode_example.py
  - A script which performs decode only, loading the KV state from the local_storage directory and the prompts from output.txt.
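The two scripts communicate only through files on disk: the local_storage directory for the KV state and output.txt for the prompts. The sketch below illustrates that handoff for the prompts alone, using only the Python standard library. It does not call vLLM and does not produce or read any KV state. The helper names (write_prompts, read_prompts, run_handoff) and the one-prompt-per-line file layout are assumptions made for illustration, not taken from the example scripts.

```python
import os
import tempfile


def write_prompts(prompts, path):
    """Prefill-side stand-in: save one prompt per line (assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            f.write(prompt + "\n")


def read_prompts(path):
    """Decode-side stand-in: load the prompts back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def run_handoff(prompts, workdir):
    """Run both stand-in stages in order, the way run.sh sequences the scripts."""
    kv_dir = os.path.join(workdir, "local_storage")
    prompt_file = os.path.join(workdir, "output.txt")

    # Stage 1 (prefill stand-in): create the storage directory and save prompts.
    # The real script would also write KV state into kv_dir; that is omitted here.
    os.makedirs(kv_dir, exist_ok=True)
    write_prompts(prompts, prompt_file)

    # Stage 2 (decode stand-in): refuse to continue if stage 1 left nothing behind.
    if not os.path.isdir(kv_dir):
        raise FileNotFoundError(f"missing KV directory: {kv_dir}")
    return read_prompts(prompt_file)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for prompt in run_handoff(["Hello, my name is", "The capital of France is"], workdir):
            print(prompt)
```

Because the sketch stores one prompt per line, it only round-trips single-line prompts. The ordering matters for the same reason it does in run.sh: the decode stage has nothing to load until the prefill stage has written its files.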