Updated 2025-09-13 05:20:17 +08:00
Updated 2025-09-12 03:36:04 +08:00
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Updated 2025-08-26 01:48:08 +08:00
Updated 2025-08-18 21:26:29 +08:00
Community-maintained hardware plugin for vLLM on Spyre
Updated 2025-07-20 16:37:05 +08:00
vLLM’s reference system for Kubernetes-native, cluster-wide deployment with community-driven performance optimization
Updated 2025-07-20 16:31:04 +08:00
Community-maintained hardware plugin for vLLM on Ascend
Updated 2025-07-20 16:30:47 +08:00