vllm/csrc/cutlass_extensions

Latest commit: 376786fac1 by Shu Wang, 2025-05-08 15:09:55 -07:00
"Add cutlass support for blackwell fp8 blockwise gemm" (#14383)
Signed-off-by: Shu Wang <shuw@nvidia.com>
File                               Last commit message                                                                   Last commit date
epilogue                           [Kernel] CUTLASS grouped gemm fp8 MoE kernel (#13972)                                 2025-03-27 00:54:44 +00:00
gemm                               [BugFix] Illegal Memory Access in the blockwise cutlass fp8 GEMMs (#14396)            2025-03-06 21:56:06 -08:00
common.cpp                         [Kernel]: Cutlass 2:4 Sparsity + FP8/Int8 Quant Support (#10995)                      2024-12-18 09:57:16 -05:00
common.hpp                         Add cutlass support for blackwell fp8 blockwise gemm (#14383)                         2025-05-08 15:09:55 -07:00
cute_utils.cuh                     [Kernel] Initial Machete W4A8 support + Refactors (#9855)                             2024-11-18 12:59:29 -07:00
torch_utils.hpp                    [MISC] Replace c10::optional with std::optional (#11730)                              2025-01-05 10:20:34 +09:00
vllm_collective_builder.cuh        [Kernel] Update `cutlass_scaled_mm` to support 2d group (blockwise) scaling (#11868)  2025-01-30 18:33:00 -08:00
vllm_custom_types.cuh              [Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)       2024-08-20 07:09:33 -06:00
vllm_cutlass_library_extension.py  Update deprecated Python 3.8 typing (#13971)                                          2025-03-02 17:34:51 -08:00
vllm_numeric_conversion.cuh        [Kernel] Initial Machete W4A8 support + Refactors (#9855)                             2024-11-18 12:59:29 -07:00
vllm_type_utils.cuh                [Kernel] Initial Machete W4A8 support + Refactors (#9855)                             2024-11-18 12:59:29 -07:00