This commit introduces file upload capabilities to the AI Bot conversations interface and improves the overall dedicated UX experience. It also changes the experimental setting to a more permanent one.
## Key changes:
- **File upload support**:
- Integrates UppyUpload for handling file uploads in conversations
- Adds UI for uploading, displaying, and managing attachments
- Supports drag & drop, clipboard paste, and manual file selection
- Shows upload progress indicators for in-progress uploads
- Appends uploaded file markdown to message content
- **Renamed setting**:
- Changed `ai_enable_experimental_bot_ux` to `ai_bot_enable_dedicated_ux`
- Updated setting description to be clearer
- Changed default value to `true` as this is now a stable feature
- Added migration to handle the setting name change in database
- **UI improvements**:
- Enhanced input area with better focus states
- Improved layout and styling for conversations page
- Added visual feedback for upload states
- Better error handling for uploads in progress
- **Code organization**:
- Refactored message submission logic to handle attachments
- Updated DOM element IDs for consistency
- Fixed focus management after submission
- **Added tests**:
- Tests for file upload functionality
- Tests for removing uploads before submission
- Updated existing tests to work with the renamed setting
---------
Co-authored-by: awesomerobot <kris.aubuchon@discourse.org>
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:
**1. LLM Model Association for RAG and Personas:**
- **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM. Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
- **Migration:** Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
- **Model Changes:** The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier. `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
- **UI Updates:** The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled). The RAG options component displays an LLM selector.
- **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.
**2. PDF and Image Support for RAG:**
- **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
- **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled. Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
- **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
- **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
- **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
- **UI Updates:** The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.
**3. Refactoring and Improvements:**
- **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
- **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
- **Bot and Persona Updates:** Several updates were made across the codebase, changing the string based association to a LLM to the new model based.
- **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.
**4. Testing:**
- The PR introduces a new eval system for LLMs, this allows us to test how functionality works across various LLM providers. This lives in `/evals`
* FIX: Llm selector / forced tools / search tool
This fixes a few issues:
1. When search was not finding any semantic results we would break the tool
2. Gemin / Anthropic models did not implement forced tools previously despite it being an API option
3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly.
4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it.
5. Updates anthropic model names to latest release
* linting
* fix a couple of tests I missed
* clean up conditional
* DRAFT: Create AI Bot users dynamically and support custom LlmModels
* Get user associated to llm_model
* Track enabled bots with attribute
* Don't store bot username. Minor touches to migrate default values in settings
* Handle scenario where vLLM uses a SRV record
* Made 3.5-turbo-16k the default version so we can remove hack
LLM selector control had no memory and was awkward to click.
Instead we now:
- Clearly display which llm you are talking to
- Allow you to change llm direct from composer
Previous to this change we relied on client side settings to
determine if an end user has access to the ai bot.
This meant that if a user was not aware they are a member of a
group (as it is with restricted visibility ones) they would not
see the bot button.
All checking has now moved to the server side, and tests were
added to cover.
Previously we were not using using HeaderPanel for drop down, which caused
it not to properly act like a header panel.
- Not styled right
- Not hidden when other buttons clicked
Etc...
Header is sadly full of legacy so this is somewhat hacky weaving widgets.