vllm/tests/kernels
Wentao Ye ffb2cd6b54
[Perf] Optimize `moe_align_block_size` CUDA kernel (#19572)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-06-17 11:49:26 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| attention | [Kernel] Raise verbose error and consolidate `num_heads/num_kv_heads` divisibility check (#19339) | 2025-06-15 13:43:48 +08:00 |
| core | [CI] change spell checker from codespell to typos (#18711) | 2025-06-11 19:57:10 -07:00 |
| mamba | [CI] change spell checker from codespell to typos (#18711) | 2025-06-11 19:57:10 -07:00 |
| moe | [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) | 2025-06-17 11:49:26 -07:00 |
| quantization | [Perf] Vectorize static / dynamic INT8 quant kernels (#19233) | 2025-06-12 06:51:41 -07:00 |
| __init__.py | [CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425) | 2024-05-13 23:50:09 +09:00 |
| allclose_default.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| quant_utils.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| test_apply_repetition_penalties.py | [KERNEL] Sampler. CUDA kernel for applying repetition penalty (#18437) | 2025-06-03 21:13:01 -07:00 |
| test_cutlass_mla_decode.py | [NVIDIA] Add Cutlass MLA backend (#17625) | 2025-06-03 21:40:26 -07:00 |
| test_flex_attention.py | Fixes IMA for TP w/ flex-attention (#19712) | 2025-06-17 04:01:50 +00:00 |
| test_fused_quant_activation.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| test_triton_flash_attention.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |
| utils.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |