vllm/csrc/quantization/cutlass_w8a8
Ilya Markov e13945f9dd
[Perf] Further tunings for SM100 FP8 CUTLASS kernel (#19566)
2025-06-14 17:25:10 -07:00
c3x [Perf] Further tunings for SM100 FP8 CUTLASS kernel (#19566) 2025-06-14 17:25:10 -07:00
moe [Kernel] Integrate CUTLASS MoE kernel with PPLX (#18762) 2025-06-06 18:26:11 -07:00
Epilogues.md [CI/Build] Auto-fix Markdown files (#12941) 2025-02-08 04:25:15 -08:00
scaled_mm_c2x.cu [MISC] Replace c10::optional with std::optional (#11730) 2025-01-05 10:20:34 +09:00
scaled_mm_c2x.cuh [Kernel][Bugfix] Refactor and Fix CUTLASS 2:4 Sparse Kernels (#13198) 2025-02-14 00:01:14 +00:00
scaled_mm_c2x_sm75_dispatch.cuh [Kernel] Tuned int8 Cutlass Kernels for SM75 (T4) (#6996) 2024-07-31 14:40:32 -07:00
scaled_mm_c2x_sm80_dispatch.cuh [Kernel] Tuned FP8 Kernels for Ada Lovelace (#6677) 2024-07-29 09:42:35 -06:00
scaled_mm_c2x_sm89_fp8_dispatch.cuh [Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751) 2025-04-27 19:38:42 -07:00
scaled_mm_c2x_sm89_int8_dispatch.cuh [Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751) 2025-04-27 19:38:42 -07:00
scaled_mm_c3x_sm90.cu Add cutlass support for blackwell fp8 blockwise gemm (#14383) 2025-05-08 15:09:55 -07:00
scaled_mm_c3x_sm100.cu Add cutlass support for blackwell fp8 blockwise gemm (#14383) 2025-05-08 15:09:55 -07:00
scaled_mm_entry.cu [Kernel] Integrate CUTLASS MoE kernel with PPLX (#18762) 2025-06-06 18:26:11 -07:00