Gitea: Git with a cup of tea

vLLM / vllm-spyre

Python 0 0

Community maintained hardware plugin for vLLM on Spyre

Updated 2025-07-20 16:37:05 +08:00

vLLM / production-stack

Python 0 0

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Updated 2025-07-20 16:31:04 +08:00

vLLM / vllm-ascend

Python 0 0

Community maintained hardware plugin for vLLM on Ascend

mlops llm inference model-serving transformer vllm llm-serving llmops ascend

Updated 2025-07-20 16:30:47 +08:00

vLLM / llm-compressor

Python 0 0

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

compression quantization sparsity

Updated 2025-07-20 13:59:02 +08:00

vLLM / vllm-openvino

Python 0 0

Updated 2025-07-20 13:58:44 +08:00

containers / podman-desktop-extension-ai-lab

TypeScript 0 0

Work with LLMs on a local environment using containers

containers podman ai inference-server llms local

Updated 2025-07-20 13:50:12 +08:00

containers / ramalama

Python 0 0

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

containers ai llm inference-server intel llamacpp podman vllm cuda hip

Updated 2025-07-19 18:35:54 +08:00

vLLM / vllm-gaudi

Python 0 0

Updated 2025-07-04 16:08:08 +08:00

vLLM / vllm

Python 0 0

A high-throughput and memory-efficient inference and serving engine for LLMs

Updated 2025-07-04 16:00:34 +08:00

vLLM / dashboard

Python 0 0

vLLM performance dashboard

Updated 2025-07-04 15:22:09 +08:00

vLLM / media-kit

SVG 0 0

vLLM Logo Assets

Updated 2025-07-04 15:22:06 +08:00

vLLM / vllm-project.github.io

HTML 0 0

Updated 2025-07-04 15:19:52 +08:00

vLLM / ci-infra

HCL 0 0

This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

Updated 2025-07-04 15:19:47 +08:00

kubeflow / katib

Python 0 0

Automated Machine Learning on Kubernetes

kubernetes machine-learning kubeflow ai tensorflow huggingface llm mlops jax pytorch hyperparameter-tuning neural-architecture-search automl scikit-learn

Updated 2025-06-26 22:13:16 +08:00

kubeflow / trainer

Python 0 0

Distributed ML Training and Fine-Tuning on Kubernetes

kubernetes huggingface ai llm gpu jax kubeflow distributed xgboost machine-learning mlops python pytorch tensorflow fine-tuning

Updated 2025-06-20 16:05:11 +08:00

containers / ai-lab-recipes

Python 0 0

Examples for building and running LLM services and applications locally with Podman

Updated 2025-06-19 16:22:59 +08:00

cncf / llm-in-action

Python 0 0

🤖 Discover how to apply your LLM app skills on Kubernetes!

cloudnative inference llm

Updated 2024-03-09 05:47:39 +08:00