deepgemm | [Bugfix] Fix triton import with local TritonPlaceholder (#17446) | 2025-05-06 17:53:09 +08:00 |
benchmark_aqlm.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
benchmark_bitblas.py | [Kernel] Support Microsoft Runtime Kernel Lib for our Low Precision Computation - BitBLAS (#6036) | 2025-04-22 09:01:36 +01:00 |
benchmark_grouped_gemm_cutlass.py | permute/unpermute kernel for moe optimization (#14568) | 2025-05-02 11:31:55 -07:00 |
benchmark_layernorm.py | [Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183) | 2025-03-07 00:42:49 +00:00 |
benchmark_lora.py | [Bugfix][Misc] Use TritonPlaceholderModule to defensively import triton (#15099) | 2025-04-24 22:51:02 -07:00 |
benchmark_machete.py | [Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183) | 2025-03-07 00:42:49 +00:00 |
benchmark_marlin.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
benchmark_moe.py | [Bugfix] Fix triton import with local TritonPlaceholder (#17446) | 2025-05-06 17:53:09 +08:00 |
benchmark_moe_permute_unpermute.py | permute/unpermute kernel for moe optimization (#14568) | 2025-05-02 11:31:55 -07:00 |
benchmark_paged_attention.py | [Misc] Warn about v0 in benchmark_paged_attn.py (#15495) | 2025-03-25 20:31:04 -07:00 |
benchmark_quant.py | [Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183) | 2025-03-07 00:42:49 +00:00 |
benchmark_rmsnorm.py | [Bugfix] Fix triton import with local TritonPlaceholder (#17446) | 2025-05-06 17:53:09 +08:00 |
benchmark_rope.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
benchmark_shapes.py | [Kernel] CUTLASS grouped gemm fp8 MoE kernel (#13972) | 2025-03-27 00:54:44 +00:00 |
benchmark_w8a8_block_fp8.py | [Misc] Add tuned R1 w8a8 and MoE configs for NVIDIA L20 (#15322) | 2025-03-23 01:10:10 -07:00 |
graph_machete_bench.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
requirements.txt | [Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin (#7701) | 2024-09-23 13:46:26 -04:00 |
utils.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
weight_shapes.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |