vllm/compile at 202c5df9357e7c52b51e19abc70e8444f3f85ada - vllm

Maximilien de Bayser 799397ee4f Support embedding models in V1 (#16188) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Signed-off-by: Max de Bayser <maxdebayser@gmail.com> Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-18 21:36:33 -07:00
piecewise	[Feature][ROCm] Add full graph capture support for TritonAttentionBackend (#19158)	2025-06-17 17:03:06 -04:00
__init__.py	[torch.compile] register allreduce operations as custom ops (#8526)	2024-09-16 22:57:57 -07:00
backend.py	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756)	2025-06-12 08:31:04 -07:00
test_async_tp.py	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756)	2025-06-12 08:31:04 -07:00
test_basic_correctness.py	Support embedding models in V1 (#16188)	2025-06-18 21:36:33 -07:00
test_config.py	[BugFix] Fix use_cudagraph=False (#19612)	2025-06-19 08:23:12 +08:00
test_full_graph.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_functionalization.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_fusion.py	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756)	2025-06-12 08:31:04 -07:00
test_fusion_attn.py	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756)	2025-06-12 08:31:04 -07:00
test_pass_manager.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_sequence_parallelism.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_silu_mul_quant_fusion.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_wrapper.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00