vllm/csrc/moe
Varun Sundar Rabindranath fa98d77773
[Kernel] DeepEP dispatch-combine kernel integration (#18434)
Signed-off-by: Varun <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-06-03 12:30:02 -07:00
marlin_moe_wna16 [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
permute_unpermute_kernels [Build/CI] Fix CUDA 11.8 build (#17679) 2025-05-22 12:13:54 -07:00
moe_align_sum_kernels.cu Modularize fused experts and integrate PPLX kernels (#15956) 2025-05-14 13:11:54 -07:00
moe_ops.h [Build/CI] Fix CUDA 11.8 build (#17679) 2025-05-22 12:13:54 -07:00
moe_permute_unpermute_op.cu [Build/CI] Fix CUDA 11.8 build (#17679) 2025-05-22 12:13:54 -07:00
moe_wna16.cu [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801) 2025-04-17 22:13:29 -07:00
moe_wna16_utils.h `pre-commit autoupdate` (#17380) 2025-04-29 06:46:55 -07:00
topk_softmax_kernels.cu [Kernel] DeepEP dispatch-combine kernel integration (#18434) 2025-06-03 12:30:02 -07:00
torch_bindings.cpp [Build/CI] Fix CUDA 11.8 build (#17679) 2025-05-22 12:13:54 -07:00