vllm/kernels at 6d0df0ebebd4e347e1ebcdea4be010a4b54b901b - vllm

Lei Wang 8d32dc603d [Kernel] Support Microsoft Runtime Kernel Lib for our Low Precision Computation - BitBLAS (#6036) Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com> Co-authored-by: xinyuxiao <xinyuxiao2024@gmail.com>		2025-04-22 09:01:36 +01:00
deepgemm	Add benchmark for DeepGEMM and vLLM Block FP8 Dense GEMM (#13917)	2025-03-05 17:08:51 -08:00
benchmark_aqlm.py	[Misc] Add SPDX-License-Identifier headers to python source files (#12628)	2025-02-02 11:58:18 -08:00
benchmark_bitblas.py	[Kernel] Support Microsoft Runtime Kernel Lib for our Low Precision Computation - BitBLAS (#6036)	2025-04-22 09:01:36 +01:00
benchmark_grouped_gemm_cutlass.py	[Kernel] CUTLASS grouped gemm fp8 MoE kernel (#13972)	2025-03-27 00:54:44 +00:00
benchmark_layernorm.py	[Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183)	2025-03-07 00:42:49 +00:00
benchmark_lora.py	[Kernels] LoRA - Retire SGMV and BGMV Kernels (#14685)	2025-03-18 09:47:53 +00:00
benchmark_machete.py	[Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183)	2025-03-07 00:42:49 +00:00
benchmark_marlin.py	Update deprecated Python 3.8 typing (#13971)	2025-03-02 17:34:51 -08:00
benchmark_moe.py	Upstream Llama4 Support to Main (#16113)	2025-04-07 08:06:27 -07:00
benchmark_paged_attention.py	[Misc] Warn about v0 in benchmark_paged_attn.py (#15495)	2025-03-25 20:31:04 -07:00
benchmark_quant.py	[Bugfix] Correctly call `cudaProfilerStop` in benchmarks script (#14183)	2025-03-07 00:42:49 +00:00
benchmark_rmsnorm.py	Correct capitalisation: `VLLM` -> `vLLM` (#14562)	2025-03-10 16:36:21 +00:00
benchmark_rope.py	Update deprecated Python 3.8 typing (#13971)	2025-03-02 17:34:51 -08:00
benchmark_shapes.py	[Kernel] CUTLASS grouped gemm fp8 MoE kernel (#13972)	2025-03-27 00:54:44 +00:00
benchmark_w8a8_block_fp8.py	[Misc] Add tuned R1 w8a8 and MoE configs for NVIDIA L20 (#15322)	2025-03-23 01:10:10 -07:00
graph_machete_bench.py	Update deprecated Python 3.8 typing (#13971)	2025-03-02 17:34:51 -08:00
requirements.txt	[Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin (#7701)	2024-09-23 13:46:26 -04:00
utils.py	Update deprecated Python 3.8 typing (#13971)	2025-03-02 17:34:51 -08:00
weight_shapes.py	[Misc] Add SPDX-License-Identifier headers to python source files (#12628)	2025-02-02 11:58:18 -08:00