vllm/benchmarks/kernels
afeldman-nm dfada85eee
[Frontend] Expose custom args in OpenAI APIs (#16862)
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-06-18 17:41:11 -07:00
deepgemm [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
bench_fp8_gemm.py [Benchmark] Refactor benchmark script for fp8 & int8 (#19627) 2025-06-15 15:15:37 +08:00
bench_int8_gemm.py [Benchmark] Refactor benchmark script for fp8 & int8 (#19627) 2025-06-15 15:15:37 +08:00
benchmark_aqlm.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_bitblas.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_cutlass_fp4_moe.py [Hardware][NVIDIA] FP4 MoE kernel optimization (#19110) 2025-06-05 09:48:26 -07:00
benchmark_grouped_gemm_cutlass.py [Kernel] Integrate CUTLASS MoE kernel with PPLX (#18762) 2025-06-06 18:26:11 -07:00
benchmark_layernorm.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_lora.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_machete.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_marlin.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_moe.py [Bugfix] Fix benchmark_moe.py (#19016) 2025-06-09 18:04:36 -07:00
benchmark_moe_align_block_size.py [Frontend] Expose custom args in OpenAI APIs (#16862) 2025-06-18 17:41:11 -07:00
benchmark_moe_permute_unpermute.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_paged_attention.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_quant.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_rmsnorm.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_rope.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_shapes.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
benchmark_w8a8_block_fp8.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
graph_machete_bench.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
requirements.txt [Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin (#7701) 2024-09-23 13:46:26 -04:00
utils.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00
weight_shapes.py [Misc] Add SPDX-FileCopyrightText (#19100) 2025-06-03 11:20:17 -07:00