vllm/csrc/quantization/fp4
Pavani Majety 0c0fdae84f
[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362)
2025-05-09 16:24:41 -07:00
nvfp4_blockwise_moe_kernel.cu [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) 2025-05-09 16:24:41 -07:00
nvfp4_experts_quant.cu [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) 2025-05-09 16:24:41 -07:00
nvfp4_quant_entry.cu [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) 2025-05-09 16:24:41 -07:00
nvfp4_quant_kernels.cu [NVIDIA] Fix an issue to use current stream for the nvfp4 quant (#13632) 2025-02-20 22:01:48 -08:00
nvfp4_scaled_mm_entry.cu [Kernel] Add ModelOpt FP4 Checkpoint Support (#12520) 2025-03-12 05:13:11 +00:00
nvfp4_scaled_mm_kernels.cu [NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032) 2025-04-27 06:29:21 -07:00