mirror of https://github.com/vllm-project/vllm.git
Disaggregated Prefill V1
This example contains scripts that demonstrate disaggregated prefill in the offline setting of vLLM.
Files
- run.sh - A helper script that will run prefill_example.py and decode_example.py sequentially.
  - Make sure you are in the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
- prefill_example.py - A script which performs prefill only, saving the KV state to the local_storage directory and the prompts to output.txt.
- decode_example.py - A script which performs decode only, loading the KV state from the local_storage directory and the prompts from output.txt.
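
The sequential flow described above can be sketched in Python. This is only an illustration of the ordering and the working-directory requirement, not the contents of run.sh, which are not shown here. The helper names build_commands, missing_scripts, and run_all are hypothetical; only the two script names come from the file list.

```python
import subprocess
import sys
from pathlib import Path

# Script names taken from the file list above. Order matters: decode reads
# the KV state and prompts that prefill wrote.
SCRIPTS = ["prefill_example.py", "decode_example.py"]


def build_commands(python=sys.executable):
    """Return one command per script, in the order they must run."""
    return [[python, name] for name in SCRIPTS]


def missing_scripts(directory):
    """Return the scripts that are not present in `directory`."""
    return [name for name in SCRIPTS if not (Path(directory) / name).is_file()]


def run_all(directory="."):
    """Run prefill, then decode, stopping at the first failure.

    Hypothetical helper: assumes both scripts live in `directory`.
    """
    missing = missing_scripts(directory)
    if missing:
        raise FileNotFoundError(
            f"run this from the example directory; missing: {missing}"
        )
    for cmd in build_commands():
        # check=True raises CalledProcessError, so decode never runs
        # after a failed prefill.
        subprocess.run(cmd, cwd=directory, check=True)
```

Calling run_all() from inside examples/offline_inference/disaggregated-prefill-v1 would run both scripts in order; from any other directory it fails early instead of producing a confusing error from the decode step.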