vllm/csrc/moe
Wentao Ye ffb2cd6b54
[Perf] Optimize `moe_align_block_size` CUDA kernel (#19572)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-06-17 11:49:26 -07:00
marlin_moe_wna16 [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
permute_unpermute_kernels [Refactor] Remove unused variables in `moe_permute_unpermute_kernel.inl` (#19573) 2025-06-13 06:12:15 -07:00
moe_align_sum_kernels.cu [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) 2025-06-17 11:49:26 -07:00
moe_ops.h [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) 2025-06-17 11:49:26 -07:00
moe_permute_unpermute_op.cu [CI] change spell checker from codespell to typos (#18711) 2025-06-11 19:57:10 -07:00
moe_wna16.cu [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801) 2025-04-17 22:13:29 -07:00
moe_wna16_utils.h `pre-commit autoupdate` (#17380) 2025-04-29 06:46:55 -07:00
topk_softmax_kernels.cu [CI] change spell checker from codespell to typos (#18711) 2025-06-11 19:57:10 -07:00
torch_bindings.cpp [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) 2025-06-17 11:49:26 -07:00