vllm/csrc/quantization
Tyler Michael Smith 6e588da0f4
[Build/CI] Fix CUDA 11.8 build (#17679)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-22 12:13:54 -07:00
aqlm [Kernel] fix types used in aqlm and ggml kernels to support dynamo (#7596) 2024-08-16 14:00:11 -07:00
awq [Kernel] Fix awq error when n is not divisible by 128 (#13227) 2025-02-13 20:07:05 -08:00
compressed_tensors [ROCm]: Fix build from source failure with gcc14 and ROCm 6.3 (#13779) 2025-05-12 20:36:33 -07:00
cutlass_w8a8 [Build/CI] Fix CUDA 11.8 build (#17679) 2025-05-22 12:13:54 -07:00
fp4 [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) 2025-05-09 16:24:41 -07:00
fp8 Removed unused marlin cuda code (#17684) 2025-05-06 17:59:47 -07:00
fused_kernels [ROCm]: Fix build from source failure with gcc14 and ROCm 6.3 (#13779) 2025-05-12 20:36:33 -07:00
gguf [Kernel] GGUF MoeVec kernel (#16780) 2025-05-06 23:07:23 -07:00
gptq Fix CUDA kernel index data type in vllm/csrc/quantization/fused_kernels/layernorm_utils.cuh +10 (#15159) 2025-03-21 10:01:11 +08:00
gptq_allspark [Easy] Eliminate c10::optional usage in vllm/csrc (#17819) 2025-05-08 03:05:10 -07:00
gptq_marlin [Bugfix] fix `an illegal memory access was encountered` of marlin kernel + act_order (#18245) 2025-05-16 16:02:44 -07:00
machete add cutlass support for blackwell fp8 gemm (#13798) 2025-03-04 07:55:07 -08:00
marlin `pre-commit autoupdate` (#17380) 2025-04-29 06:46:55 -07:00
activation_kernels.cu [AMD][torch.compile] Enable silu+fp8_quant fusion for rocm (#18082) 2025-05-13 22:13:56 -07:00
utils.cuh [Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050) 2025-03-31 04:42:18 -07:00
vectorization.cuh dynamic dispatch of fp8 kernels (#14245) 2025-03-11 10:54:56 -04:00