Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Updated 2025-08-26 01:48:08 +08:00
Updated 2025-08-18 21:26:29 +08:00
Updated 2025-08-01 20:19:25 +08:00
Updated 2025-07-23 16:35:22 +08:00
Community-maintained hardware plugin for vLLM on Spyre
Updated 2025-07-20 16:37:05 +08:00
vLLM’s reference system for Kubernetes-native cluster-wide deployment with community-driven performance optimization
Updated 2025-07-20 16:31:04 +08:00
Community-maintained hardware plugin for vLLM on Ascend
Updated 2025-07-20 16:30:47 +08:00