vllm/tests/compile
Maximilien de Bayser 799397ee4f
Support embedding models in V1 (#16188)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-06-18 21:36:33 -07:00
piecewise [Feature][ROCm] Add full graph capture support for TritonAttentionBackend (#19158) 2025-06-17 17:03:06 -04:00
__init__.py [torch.compile] register allreduce operations as custom ops (#8526) 2024-09-16 22:57:57 -07:00
backend.py [torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) 2025-06-12 08:31:04 -07:00
test_async_tp.py [torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) 2025-06-12 08:31:04 -07:00
test_basic_correctness.py Support embedding models in V1 (#16188) 2025-06-18 21:36:33 -07:00
test_config.py [BugFix] Fix use_cudagraph=False (#19612) 2025-06-19 08:23:12 +08:00
test_full_graph.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_functionalization.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_fusion.py [torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) 2025-06-12 08:31:04 -07:00
test_fusion_attn.py [torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) 2025-06-12 08:31:04 -07:00
test_pass_manager.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_sequence_parallelism.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_silu_mul_quant_fusion.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_wrapper.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00