vllm/sparse at 6d0df0ebebd4e347e1ebcdea4be010a4b54b901b - vllm

Lu Fang 051da7efe3 Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160) Signed-off-by: Lu Fang <lufang@fb.com> Co-authored-by: Richard Barnes <rbarnes@meta.com>		2025-03-25 15:36:45 +08:00
common	Update `pre-commit` hooks (#12475)	2025-01-27 17:23:08 -07:00
LICENSE	Add GPTQ Marlin 2:4 sparse structured support (#4790)	2024-05-16 12:56:15 -04:00
marlin_24_cuda_kernel.cu	Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160)	2025-03-25 15:36:45 +08:00