vllm/benchmarks/kernels
Woosuk Kwon 27208be66e
[Kernel] Add back batch size 1536 and 3072 to MoE tuning (#5242)
2024-06-04 09:58:47 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| benchmark_aqlm.py | [Core]refactor aqlm quant ops (#4351) | 2024-04-25 15:03:56 -04:00 |
| benchmark_marlin.py | Marlin 24 prefill performance improvement (about 25% better on average) (#4983) | 2024-05-23 02:39:27 -04:00 |
| benchmark_moe.py | [Kernel] Add back batch size 1536 and 3072 to MoE tuning (#5242) | 2024-06-04 09:58:47 -07:00 |
| benchmark_paged_attention.py | [Model] Support MAP-NEO model (#5081) | 2024-05-30 19:24:41 -07:00 |
| benchmark_rope.py | [Model] Support MAP-NEO model (#5081) | 2024-05-30 19:24:41 -07:00 |
| benchmark_shapes.py | Add marlin unit tests and marlin benchmark script (#4815) | 2024-05-16 09:36:49 -04:00 |