Work with LLMs in a local environment using containers
Updated 2025-08-25 23:46:14 +08:00
An external provider for Llama Stack that allows RamaLama to be used for inference.
Updated 2025-08-21 17:46:16 +08:00
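As a rough illustration of how that combination is consumed, here is a minimal sketch using the llama-stack-client Python package against a locally running Llama Stack server; the base URL (Llama Stack's usual default port, 8321), the model ID, and the exact method names are assumptions and may vary across client releases.

    # Sketch only: query a Llama Stack server (assumed at localhost:8321)
    # whose inference is backed by the RamaLama provider.
    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url="http://localhost:8321")

    # model_id is a placeholder; use a model the server has registered.
    response = client.inference.chat_completion(
        model_id="llama3.2:3b",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(response)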
A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: llm, mlops, pytorch, cuda, inference, llama, llm-serving, llmops, model-serving, qwen, rocm, tpu, trainium, transformer, amd, xpu, deepseek, gpt, hpu, inferentia
Updated 2025-08-01 01:35:07 +08:00
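To give a feel for the engine's entry point, a minimal offline-inference sketch with vLLM's Python API follows; the model name is a placeholder for any checkpoint the engine supports.

    # Minimal offline inference with vLLM's LLM / SamplingParams API.
    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m")  # placeholder: any supported model
    params = SamplingParams(temperature=0.8, max_tokens=64)

    # generate() batches prompts and returns one RequestOutput per prompt.
    for out in llm.generate(["The capital of France is"], params):
        print(out.outputs[0].text)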
Cost-efficient and pluggable infrastructure components for GenAI inference
Updated 2025-07-20 16:33:47 +08:00
Community-maintained hardware plugin for vLLM on Ascend
Updated 2025-07-20 16:30:47 +08:00
RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.
Updated 2025-07-19 18:35:54 +08:00
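Serving tools in this vein typically expose an OpenAI-compatible HTTP endpoint once a model is running; the sketch below queries such an endpoint with plain HTTP, where the port, path, and model name are assumptions for illustration rather than guaranteed RamaLama defaults.

    # Sketch: call a locally served model over an OpenAI-compatible API
    # (e.g. after serving a model with RamaLama). URL and model name are
    # assumptions, not documented defaults.
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "model": "granite",  # placeholder model name
            "messages": [
                {"role": "user", "content": "Why run models in containers?"}
            ],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])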
🤖 Discover how to apply your LLM app skills on Kubernetes!
Updated 2024-03-09 05:47:39 +08:00