vllm/kernels at 02cc3b51a7f2af012a8f17f0d836529d57012eee - vllm

Woosuk Kwon 27208be66e [Kernel] Add back batch size 1536 and 3072 to MoE tuning (#5242)	2024-06-04 09:58:47 -07:00
benchmark_aqlm.py	[Core]refactor aqlm quant ops (#4351)	2024-04-25 15:03:56 -04:00
benchmark_marlin.py	Marlin 24 prefill performance improvement (about 25% better on average) (#4983)	2024-05-23 02:39:27 -04:00
benchmark_moe.py	[Kernel] Add back batch size 1536 and 3072 to MoE tuning (#5242)	2024-06-04 09:58:47 -07:00
benchmark_paged_attention.py	[Model] Support MAP-NEO model (#5081)	2024-05-30 19:24:41 -07:00
benchmark_rope.py	[Model] Support MAP-NEO model (#5081)	2024-05-30 19:24:41 -07:00
benchmark_shapes.py	Add marlin unit tests and marlin benchmark script (#4815)	2024-05-16 09:36:49 -04:00