Disaggregated Prefill V1

This example contains scripts that demonstrate disaggregated prefill in vLLM's offline inference setting: prefill and decode run as separate steps, with the KV state passed between them on disk.

Files

  • run.sh - A helper script that runs prefill_example.py and then decode_example.py, in that order.
    • Make sure you are in the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
  • prefill_example.py - A script that performs prefill only. It saves the KV state to the local_storage directory and the prompts to output.txt.
  • decode_example.py - A script that performs decode only. It loads the KV state from the local_storage directory and the prompts from output.txt.
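
The two scripts share the prompts through output.txt, presumably so that the decode step sees the same prompt text that the saved KV state was computed for. The following is a minimal sketch of that file handoff using only the Python standard library. The helper names save_prompts and load_prompts are hypothetical, and the one-prompt-per-line layout is an assumption; neither is taken from the example scripts, and the vLLM calls themselves are left out.

```python
import tempfile
from pathlib import Path


def save_prompts(prompts: list[str], path: Path) -> None:
    """Prefill side (hypothetical helper): write one prompt per line."""
    for prompt in prompts:
        # Assumed format: one prompt per line, so embedded newlines are rejected.
        if "\n" in prompt:
            raise ValueError("a prompt containing a newline cannot be stored one per line")
    path.write_text("".join(prompt + "\n" for prompt in prompts), encoding="utf-8")


def load_prompts(path: Path) -> list[str]:
    """Decode side (hypothetical helper): read prompts back in the same order."""
    lines = path.read_text(encoding="utf-8").splitlines()
    # Skip blank lines so a trailing newline does not produce an empty prompt.
    return [line for line in lines if line.strip()]


if __name__ == "__main__":
    # Demonstrate the round trip in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        handoff_file = Path(tmp) / "output.txt"
        save_prompts(["Hello, my name is", "The capital of France is"], handoff_file)
        print(load_prompts(handoff_file))
```

Keeping the order and exact text of the prompts stable across the two steps is the point of the sketch: the decode step can only reuse saved state for prompts it receives unchanged.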