vllm/csrc/quantization/gptq_marlin
周周周 9290de5667
remove unused variables in marlin_template.h (#20236)
2025-07-02 00:51:52 +00:00
.gitignore [Kernel] some optimizations for dense marlin and moe marlin (#16850) 2025-05-05 09:39:30 -07:00
awq_marlin_repack.cu Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160) 2025-03-25 15:36:45 +08:00
dequant.h [Kernel] fp4 marlin kernel (#17687) 2025-05-10 19:58:49 -07:00
generate_kernels.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
gptq_marlin.cu [Kernel] fp4 marlin kernel (#17687) 2025-05-10 19:58:49 -07:00
gptq_marlin_repack.cu Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160) 2025-03-25 15:36:45 +08:00
kernel.h [Kernel] fp4 marlin kernel (#17687) 2025-05-10 19:58:49 -07:00
marlin.cuh [Kernel] moe wna16 marlin kernel (#14447) 2025-04-14 20:05:22 -07:00
marlin_dtypes.cuh [Kernel] moe wna16 marlin kernel (#14447) 2025-04-14 20:05:22 -07:00
marlin_template.h remove unused variables in marlin_template.h (#20236) 2025-07-02 00:51:52 +00:00