vllm/kernels at fix-precommit

Wentao Ye ffb2cd6b54 [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) Signed-off-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-06-17 11:49:26 -07:00
attention	[Kernel] Raise verbose error and consolidate `num_heads/num_kv_heads` divisibility check (#19339)	2025-06-15 13:43:48 +08:00
core	[CI] change spell checker from codespell to typos (#18711)	2025-06-11 19:57:10 -07:00
mamba	[CI] change spell checker from codespell to typos (#18711)	2025-06-11 19:57:10 -07:00
moe	[Perf] Optimize `moe_align_block_size` CUDA kernel (#19572)	2025-06-17 11:49:26 -07:00
quantization	[Perf] Vectorize static / dynamic INT8 quant kernels (#19233)	2025-06-12 06:51:41 -07:00
__init__.py	[CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425)	2024-05-13 23:50:09 +09:00
allclose_default.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
quant_utils.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_apply_repetition_penalties.py	[KERNEL] Sampler. CUDA kernel for applying repetition penalty (#18437)	2025-06-03 21:13:01 -07:00
test_cutlass_mla_decode.py	[NVIDIA] Add Cutlass MLA backend (#17625)	2025-06-03 21:40:26 -07:00
test_flex_attention.py	Fixes IMA for TP w/ flex-attention (#19712)	2025-06-17 04:01:50 +00:00
test_fused_quant_activation.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
test_triton_flash_attention.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00
utils.py	[Misc] Add SPDX-FileCopyrightText (#19100)	2025-06-03 11:20:17 -07:00