vLLM
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Updated 2025-08-26 01:48:08 +08:00
A high-throughput and memory-efficient inference and serving engine for LLMs
Updated 2025-08-01 01:35:07 +08:00
This repo hosts code for the vLLM CI & performance benchmark infrastructure.
Updated 2025-07-31 04:54:25 +08:00
Community maintained hardware plugin for vLLM on Spyre
Updated 2025-07-20 16:37:05 +08:00
Cost-efficient and pluggable infrastructure components for GenAI inference
Updated 2025-07-20 16:33:47 +08:00
vLLM's reference system for K8s-native cluster-wide deployment with community-driven performance optimization
Updated 2025-07-20 16:31:04 +08:00
Community maintained hardware plugin for vLLM on Ascend
Updated 2025-07-20 16:30:47 +08:00
vLLM Logo Assets
Updated 2024-12-12 09:11:44 +08:00
vLLM performance dashboard
Updated 2024-04-26 14:13:44 +08:00