vllm/csrc/moe
Jinzhen Lin e73b7dfd69
[Bugfix] fix `an illegal memory access was encountered` of marlin kernel + act_order (#18245)
2025-05-16 16:02:44 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| marlin_moe_wna16 | [Bugfix] fix `an illegal memory access was encountered` of marlin kernel + act_order (#18245) | 2025-05-16 16:02:44 -07:00 |
| permute_unpermute_kernels | permute/unpermute kernel for moe optimization (#14568) | 2025-05-02 11:31:55 -07:00 |
| moe_align_sum_kernels.cu | Modularize fused experts and integrate PPLX kernels (#15956) | 2025-05-14 13:11:54 -07:00 |
| moe_ops.h | [ROCm][Bugfix] Ensure that the moe_wna16_gemm kernel is not built on ROCm platforms. (#14629) | 2025-03-12 08:00:28 -04:00 |
| moe_permute_unpermute_op.cu | permute/unpermute kernel for moe optimization (#14568) | 2025-05-02 11:31:55 -07:00 |
| moe_wna16.cu | [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801) | 2025-04-17 22:13:29 -07:00 |
| moe_wna16_utils.h | `pre-commit autoupdate` (#17380) | 2025-04-29 06:46:55 -07:00 |
| topk_softmax_kernels.cu | Modularize fused experts and integrate PPLX kernels (#15956) | 2025-05-14 13:11:54 -07:00 |
| torch_bindings.cpp | [Kernel] fp4 marlin kernel (#17687) | 2025-05-10 19:58:49 -07:00 |