Commit Graph

9 Commits

Author SHA1 Message Date
Roman Rizzi e52045ebdc
DEV: Robust check for embeddings enabled (#1116) 2025-02-06 12:18:55 -03:00
Roman Rizzi 1572068735
DEV: Improve embedding configs validations (#1101)
Before this change, we let you set the embeddings selected model back to " " even with embeddings enabled. This will leave the site in a broken state.

Additionally, it adds a fail-safe for these scenarios to avoid errors on the topics page.
2025-01-30 14:16:56 -03:00
Roman Rizzi f5cf1019fb
FEATURE: configurable embeddings (#1049)
* Use AR model for embeddings features

* endpoints

* Embeddings CRUD UI

* Add presets. Hide a couple more settings

* system specs

* Seed embedding definition from old settings

* Generate search bit index on the fly. cleanup orphaned data

* support for seeded models

* Fix run test for new embedding

* fix selected model not set correctly
2025-01-21 12:23:19 -03:00
Roman Rizzi 534b0df391
REFACTOR: Separation of concerns for embedding generation. (#1027)
In a previous refactor, we moved the responsibility of querying and storing embeddings into the `Schema` class. Now, it's time for embedding generation.

The motivation behind these changes is to isolate vector characteristics in simple objects to later replace them with a DB-backed version, similar to what we did with LLM configs.
2024-12-16 09:55:39 -03:00
Roman Rizzi eae527f99d
REFACTOR: A Simpler way of interacting with embeddings tables. (#1023)
* REFACTOR: A Simpler way of interacting with embeddings' tables.

This change adds a new abstraction called `Schema`, which acts as a repository that supports the same DB features `VectorRepresentation::Base` has, with the exception that removes the need to have duplicated methods per embeddings table.

It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.
2024-12-13 10:15:21 -03:00
Sam 03eccbe392
FEATURE: Make tool support polymorphic (#798)
Polymorphic RAG means that we will be able to access RAG fragments both from AiPersona and AiCustomTool

In turn this gives us support for richer RAG implementations.
2024-09-16 08:17:17 +10:00
Sam f6ac5cd0a8
FEATURE: allow tuning of RAG generation (#565)
* FEATURE: allow tuning of RAG generation

- change chunking to be token based vs char based (which is more accurate)
- allow control over overlap / tokens per chunk and conversation snippets inserted
- UI to control new settings

* improve ui a bit

* fix various reindex issues

* reduce concurrency

* try ultra low queue ... concurrency 1 is too slow.
2024-04-12 10:32:46 -03:00
Roman Rizzi aa8918911d
UX: Display the indexing progress for RAG uploads (#557) 2024-04-09 11:03:07 -03:00
Roman Rizzi 1f1c94e5c6
FEATURE: AI Bot RAG support. (#537)
This PR lets you associate uploads to an AI persona, which we'll split and generate embeddings from. When building the system prompt to get a bot reply, we'll do a similarity search followed by a re-ranking (if available). This will let us find the most relevant fragments from the body of knowledge you associated with the persona, resulting in better, more informed responses.

For now, we'll only allow plain-text files, but this will change in the future.

Commits:

* FEATURE: RAG embeddings for the AI Bot

This first commit introduces a UI where admins can upload text files, which we'll store, split into fragments,
and generate embeddings of. In a next commit, we'll use those to give the bot additional information during
conversations.

* Basic asymmetric similarity search to provide guidance in system prompt

* Fix tests and lint

* Apply reranker to fragments

* Uploads filter, css adjustments and file validations

* Add placeholder for rag fragments

* Update annotations
2024-04-01 13:43:34 -03:00