vllm/tests/v1/engine
Isotr0py 1173804dca
[Bugfix] Fix TP inference for Flex attention backend (#19657)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-16 11:21:37 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | [V1] `AsyncLLM` Implementation (#9826) | 2024-11-11 23:05:38 +00:00 |
| conftest.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| test_async_llm.py | [Bugfix] Fix auto dtype casting for BatchFeature (#19316) | 2025-06-14 15:13:08 +00:00 |
| test_engine_args.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| test_engine_core.py | [Bugfix] Fix TP inference for Flex attention backend (#19657) | 2025-06-16 11:21:37 +00:00 |
| test_engine_core_client.py | [Bugfix] Fix auto dtype casting for BatchFeature (#19316) | 2025-06-14 15:13:08 +00:00 |
| test_fast_incdec_prefix_err.py | [BugFix] Work-around incremental detokenization edge case error (#19449) | 2025-06-12 06:43:20 +00:00 |
| test_llm_engine.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| test_output_processor.py | Allow AsyncLLMEngine.generate to target a specific DP rank (#19102) | 2025-06-04 08:26:47 -07:00 |
| utils.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |