Author | Commit | Message | Date
Tyler Michael Smith | 3be8d312a2 | [Kernel][Bugfix] Fixup some warnings in nvfp4_blockwise_moe when CUDA < 12.8 (#20324) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> | 2025-07-01 18:05:47 -07:00
Tyler Michael Smith | e8c3bd2cd1 | [Bugfix] Fix some narrowing conversion warnings (#20141) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> | 2025-06-27 09:01:28 -07:00
jiahanc | 294fc1e2c9 | [Hardware][NVIDIA][kernel] Fp4 MOE quant kernel optimization (#19500) | 2025-06-14 09:34:28 -07:00
Pavani Majety | 0c0fdae84f | [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) | 2025-05-09 16:24:41 -07:00
Kaixi Hou | ed7a29d9f8 | [NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032) Signed-off-by: kaixih <kaixih@nvidia.com> | 2025-04-27 06:29:21 -07:00
Pavani Majety | debd6bbf09 | [Kernel] Add ModelOpt FP4 Checkpoint Support (#12520) Signed-off-by: Pavani Majety <pmajety@nvidia.com> | 2025-03-12 05:13:11 +00:00
Roger Wang | 82e0d601fc | [CI/Build] Fix pre-commit errors from #13571 (#13709) Signed-off-by: Roger Wang <ywang@roblox.com> | 2025-02-22 16:50:38 -08:00
Kaixi Hou | e109e598c7 | [NVIDIA] Support nvfp4 cutlass gemm (#13571) | 2025-02-22 05:24:05 -08:00
Kaixi Hou | 27a09dc52c | [NVIDIA] Fix an issue to use current stream for the nvfp4 quant (#13632) | 2025-02-20 22:01:48 -08:00
Kaixi Hou | 4fc5c23bb6 | [NVIDIA] Support nvfp4 quantization (#12784) | 2025-02-12 19:51:51 -08:00