vllm/csrc/moe
周周周 9290de5667
remove unused variables in marlin_template.h (#20236)
2025-07-02 00:51:52 +00:00
marlin_moe_wna16 remove unused variables in marlin_template.h (#20236) 2025-07-02 00:51:52 +00:00
permute_unpermute_kernels [Refactor] Remove unused variables in `moe_permute_unpermute_kernel.inl` (#19573) 2025-06-13 06:12:15 -07:00
moe_align_sum_kernels.cu Fix `numel()` downcast in vllm/csrc/moe/moe_align_sum_kernels.cu +2 (#17082) 2025-07-01 06:48:10 +00:00
moe_ops.h [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) 2025-06-17 11:49:26 -07:00
moe_permute_unpermute_op.cu [CI] change spell checker from codespell to typos (#18711) 2025-06-11 19:57:10 -07:00
moe_wna16.cu [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801) 2025-04-17 22:13:29 -07:00
moe_wna16_utils.h `pre-commit autoupdate` (#17380) 2025-04-29 06:46:55 -07:00
topk_softmax_kernels.cu Fix `numel()` downcast in vllm/csrc/moe/moe_align_sum_kernels.cu +2 (#17082) 2025-07-01 06:48:10 +00:00
torch_bindings.cpp [Perf] Optimize `moe_align_block_size` CUDA kernel (#19572) 2025-06-17 11:49:26 -07:00