| Author | Commit | Message | Date |
|---|---|---|---|
| Simon Mo | 02f0c7b220 | [Misc] Add SPDX-FileCopyrightText (#19100); Signed-off-by: simon-mo <simon.mo@hey.com> | 2025-06-03 11:20:17 -07:00 |
| Isotr0py | 1f1b1bc03b | [V1][Quantization] Add CUDA graph compatible v1 GGUF support (#18646); Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>; Signed-off-by: Isotr0py <2037008807@qq.com> | 2025-05-27 04:40:28 +00:00 |
| Michael Goin | 63934543a0 | Speed up the `kernels/quantization/` tests (#18669); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-05-25 05:02:59 +00:00 |
| bnellnm | f9c069c85e | Modularize fused experts and integrate PPLX kernels (#15956) | 2025-05-14 13:11:54 -07:00 |
| Charlie Fu | 7b2f28deba | [AMD][torch.compile] Enable silu+fp8_quant fusion for rocm (#18082); Signed-off-by: charlifu <charlifu@amd.com> | 2025-05-13 22:13:56 -07:00 |
| Jinzhen Lin | d74e5f37bc | [Kernel] fp4 marlin kernel (#17687); Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com> | 2025-05-10 19:58:49 -07:00 |
| Pavani Majety | 0c0fdae84f | [Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362) | 2025-05-09 16:24:41 -07:00 |
| Shu Wang | 376786fac1 | Add cutlass support for blackwell fp8 blockwise gemm (#14383); Signed-off-by: Shu Wang <shuw@nvidia.com> | 2025-05-08 15:09:55 -07:00 |
| Hashem Hashemi | 5a499e70d5 | [Kernel][Hardware][AMD] Bf16 mfma opt for ROCm skinny GEMMs (#17071); Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>; Signed-off-by: charlifu <charlifu@amd.com>; Co-authored-by: charlifu <charlifu@amd.com> | 2025-05-07 22:34:49 -07:00 |
| Szymon Ożóg | 1a45a61387 | [Kernel] GGUF MoeVec kernel (#16780); Signed-off-by: SzymonOzog <szymon.ozog@aleph-alpha.com>; Signed-off-by: SzymonOzog <szymon.ozog@gmail.com>; Signed-off-by: Isotr0py <2037008807@qq.com>; Co-authored-by: Isotr0py <2037008807@qq.com> | 2025-05-06 23:07:23 -07:00 |
| Lucas Wilkinson | 6eae34533a | [Misc] Fix ScalarType float4 naming (#17690); Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> | 2025-05-06 01:07:15 -07:00 |
| Jinzhen Lin | 1d0c9d6b2d | [Kernel] some optimizations for dense marlin and moe marlin (#16850); Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com> | 2025-05-05 09:39:30 -07:00 |
| Caleb_Du | 3e887d2e0c | permute/unpermute kernel for moe optimization (#14568); Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn> | 2025-05-02 11:31:55 -07:00 |
| Michael Goin | 82e43b2d7e | Add missing rocm_skinny_gemms kernel test to CI (#17060); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-04-24 07:49:37 -07:00 |
| Michael Goin | 6317a5174a | Categorize `tests/kernels/` based on kernel type (#16799); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-04-23 09:21:07 -04:00 |