vLLM / vllm

Python · 0 stars · 0 forks

A high-throughput and memory-efficient inference and serving engine for LLMs

Updated 2025-08-01 01:35:07 +08:00

containers / ramalama

Python · 0 stars · 0 forks

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

containers ai llm inference-server intel llamacpp podman vllm cuda hip

Updated 2025-07-19 18:35:54 +08:00