vllm/tests/kernels
TY-AMD 96453cfa83
[BugFix][V1][ROCm] Triton MLA uses V0 backend on V1 engine (#19067)
Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>
2025-07-01 16:12:19 +08:00
attention [BugFix][V1][ROCm] Triton MLA uses V0 backend on V1 engine (#19067) 2025-07-01 16:12:19 +08:00
core [CI] change spell checker from codespell to typos (#18711) 2025-06-11 19:57:10 -07:00
mamba [CI] change spell checker from codespell to typos (#18711) 2025-06-11 19:57:10 -07:00
moe [Bugfix] Fix deepep tests (#20288) 2025-07-01 15:29:08 +08:00
quantization Enable ZP Support for Machete (#20268) 2025-07-01 07:12:20 +00:00
__init__.py [CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425) 2024-05-13 23:50:09 +09:00
allclose_default.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
quant_utils.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_apply_repetition_penalties.py [KERNEL] Sampler. CUDA kernel for applying repetition penalty (#18437) 2025-06-03 21:13:01 -07:00
test_cutlass_mla_decode.py [NVIDIA] Add Cutlass MLA backend (#17625) 2025-06-03 21:40:26 -07:00
test_flex_attention.py Fixes IMA for TP w/ flex-attention (#19712) 2025-06-17 04:01:50 +00:00
test_fused_quant_activation.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
test_triton_flash_attention.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
utils.py [Kernels][Bugfix] Use torch op for all kernels in FusedMoE forward. Add additional testing for cudagraphs. (#19717) 2025-06-24 23:22:58 -07:00