vllm/anything-llm.md at fix-precommit

1.3 KiB

Raw Permalink Blame History

title
Anything LLM

{ #deployment-anything-llm }

Anything LLM is a full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting.

It allows you to deploy a large language model (LLM) server with vLLM as the backend, which exposes OpenAI-compatible endpoints.

Prerequisites

Setup vLLM environment

Deploy

Start the vLLM server with the supported chat completion model, e.g.

vllm serve Qwen/Qwen1.5-32B-Chat-AWQ --max-model-len 4096

Download and install Anything LLM desktop.
On the bottom left of open settings, AI Prooviders --> LLM:
- LLM Provider: Generic OpenAI
- Base URL: http://{vllm server host}:{vllm server port}/v1
- Chat Model Name: Qwen/Qwen1.5-32B-Chat-AWQ

Back to home page, New Workspace --> create vllm workspace, and start to chat:

Click the upload button:
- upload the doc
- select the doc and move to the workspace
- save and embed

Chat again:

1.3 KiB Raw Permalink Blame History

Prerequisites

Deploy

1.3 KiB

Raw Permalink Blame History