This commit introduces file upload capabilities to the AI Bot conversations interface and improves the overall dedicated UX experience. It also changes the experimental setting to a more permanent one.
## Key changes:
- **File upload support**:
- Integrates UppyUpload for handling file uploads in conversations
- Adds UI for uploading, displaying, and managing attachments
- Supports drag & drop, clipboard paste, and manual file selection
- Shows upload progress indicators for in-progress uploads
- Appends uploaded file markdown to message content
- **Renamed setting**:
- Changed `ai_enable_experimental_bot_ux` to `ai_bot_enable_dedicated_ux`
- Updated setting description to be clearer
- Changed default value to `true` as this is now a stable feature
- Added migration to handle the setting name change in database
- **UI improvements**:
- Enhanced input area with better focus states
- Improved layout and styling for conversations page
- Added visual feedback for upload states
- Better error handling for uploads in progress
- **Code organization**:
- Refactored message submission logic to handle attachments
- Updated DOM element IDs for consistency
- Fixed focus management after submission
- **Added tests**:
- Tests for file upload functionality
- Tests for removing uploads before submission
- Updated existing tests to work with the renamed setting
---------
Co-authored-by: awesomerobot <kris.aubuchon@discourse.org>
This commit enhances the AI image generation functionality by adding support for:
1. OpenAI's GPT-based image generation model (gpt-image-1)
2. Image editing capabilities through the OpenAI API
3. A new "Designer" persona specialized in image generation and editing
4. Two new AI tools: CreateImage and EditImage
Technical changes include:
- Renaming `ai_openai_dall_e_3_url` to `ai_openai_image_generation_url` with a migration
- Adding `ai_openai_image_edit_url` setting for the image edit API endpoint
- Refactoring image generation code to handle both DALL-E and the newer GPT models
- Supporting multipart/form-data for image editing requests
* wild guess but maybe quantization is breaking the test sometimes
this increases distance
* Update lib/personas/designer.rb
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* simplify and de-flake code
* fix, in chat we need enough context so we know exactly what uploads a user uploaded.
* Update lib/personas/tools/edit_image.rb
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* cleanup downloaded files right away
* fix implementation
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* FEATURE: display more places where AI is used
- Usage was not showing automation or image caption in llm list.
- Also: FIX - reasoning models would time out incorrectly after 60 seconds (raised to 10 minutes)
* correct enum not to enumerate non configured models
* FEATURE: implement chat streamer
This implements a basic chat streamer, it provides 2 things:
1. Gives feedback to the user when LLM is generating
2. Streams stuff much more efficiently to client (given it may take 100ms or so per call to update chat)
Overview
This PR introduces a Bot Homepage that was first introduced at https://ask.discourse.org/.
Key Features:
Add a bot homepage: /discourse-ai/ai-bot/conversations
Display a sidebar with previous bot conversations
Infinite scroll for large counts
Sidebar still visible when navigation mode is header_dropdown
Sidebar visible on homepage and bot PM show view
Add New Question button to the bottom of sidebar on bot PM show view
Add persona picker to homepage
In this feature update, we add the UI for the ability to easily configure persona backed AI-features. The feature will still be hidden until structured responses are complete.
* REFACTOR: Move personas into it's own module.
* WIP: Use personas for summarization
* Prioritize persona default LLM or fallback to newest one
* Simplify summarization strategy
* Keep ai_sumarization_model as a fallback
* FEATURE: Experimental search results from an AI Persona.
When a user searches discourse, we'll send the query to an AI Persona to provide additional context and enrich the results. The feature depends on the user being a member of a group to which the persona has access.
* Update assets/stylesheets/common/ai-blinking-animation.scss
Co-authored-by: Keegan George <kgeorge13@gmail.com>
---------
Co-authored-by: Keegan George <kgeorge13@gmail.com>
* FEATURE: Native PDF support
This amends it so we use PDF Reader gem to extract text from PDFs
* This means that our simple pdf eval passes at last
* fix spec
* skip test in CI
* test file support
* Update lib/utils/image_to_text.rb
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* address pr comments
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:
**1. LLM Model Association for RAG and Personas:**
- **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM. Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
- **Migration:** Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
- **Model Changes:** The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier. `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
- **UI Updates:** The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled). The RAG options component displays an LLM selector.
- **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.
**2. PDF and Image Support for RAG:**
- **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
- **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled. Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
- **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
- **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
- **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
- **UI Updates:** The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.
**3. Refactoring and Improvements:**
- **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
- **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
- **Bot and Persona Updates:** Several updates were made across the codebase, changing the string based association to a LLM to the new model based.
- **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.
**4. Testing:**
- The PR introduces a new eval system for LLMs, this allows us to test how functionality works across various LLM providers. This lives in `/evals`
* Use AR model for embeddings features
* endpoints
* Embeddings CRUD UI
* Add presets. Hide a couple more settings
* system specs
* Seed embedding definition from old settings
* Generate search bit index on the fly. cleanup orphaned data
* support for seeded models
* Fix run test for new embedding
* fix selected model not set correctly
This introduces a comprehensive spam detection system that uses LLM models
to automatically identify and flag potential spam posts. The system is
designed to be both powerful and configurable while preventing false positives.
Key Features:
* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post
Technical Implementation:
* New database tables:
- ai_spam_logs: Stores scan history and results
- ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
- Minimum 10-minute delay between rescans
- Only scans significant edits (>10 char difference)
- Maximum 3 scans per post
- 24-hour maximum age for scannable posts
* Admin UI features:
- Real-time testing capabilities
- 7-day statistics dashboard
- Configurable LLM model selection
- Custom instruction support
Security and Performance:
* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts
---------
Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
* FEATURE: Backfill posts sentiment.
It adds a scheduled job to backfill posts' sentiment, similar to our existing rake task, but with two settings to control the batch size and posts' max-age.
* Make sure model_name order is consistent.
* PERF: Preload only gists when including summaries in topic list
* Add unique index on summaries and dedup existing records
* Make hot topics batch size setting hidden
This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes:
1. AI Artifacts System:
- Adds a new `AiArtifact` model and database migration
- Allows creation of web artifacts with HTML, CSS, and JavaScript content
- Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution
- Implements artifact rendering in iframes with sandbox protection
- New `CreateArtifact` tool for AI to generate interactive content
2. Tool System Improvements:
- Adds support for partial tool calls, allowing incremental updates during generation
- Better handling of tool call states and progress tracking
- Improved XML tool processing with CDATA support
- Fixes for tool parameter handling and duplicate invocations
3. LLM Provider Updates:
- Updates for Anthropic Claude models with correct token limits
- Adds support for native/XML tool modes in Gemini integration
- Adds new model configurations including Llama 3.1 models
- Improvements to streaming response handling
4. UI Enhancements:
- New artifact viewer component with expand/collapse functionality
- Security controls for artifact execution (click-to-run in strict mode)
- Improved dialog and response handling
- Better error management for tool execution
5. Security Improvements:
- Sandbox controls for artifact execution
- Public/private artifact sharing controls
- Security settings to control artifact behavior
- CSP and frame-options handling for artifacts
6. Technical Improvements:
- Better post streaming implementation
- Improved error handling in completions
- Better memory management for partial tool calls
- Enhanced testing coverage
7. Configuration:
- New site settings for artifact security
- Extended LLM model configurations
- Additional tool configuration options
This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.
This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify.
We'll prioritize topics without summary over outdated ones.
* FIX: Llm selector / forced tools / search tool
This fixes a few issues:
1. When search was not finding any semantic results we would break the tool
2. Gemin / Anthropic models did not implement forced tools previously despite it being an API option
3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly.
4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it.
5. Updates anthropic model names to latest release
* linting
* fix a couple of tests I missed
* clean up conditional
* Display gists in the hot topics list
* Adjust hot topics gist strategy and add a job to generate gists
* Replace setting with a configurable batch size
* Avoid loading summaries for other topic lists
* Tweak gist prompt to focus on latest posts in the context of the OP
* Remove serializer hack and rely on core change from discourse/discourse#29291
* Update lib/summarization/strategies/hot_topic_gists.rb
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
---------
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
New `ai_pm_summarization_allowed_groups` can be used to allow
visibility of the summarization feature on PMs.
This can be useful on forums where a lot of communication happens
inside PMs.
* DEV: Remove old code now that features rely on LlmModels.
* Hide old settings and migrate persona llm overrides
* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
This allows summary to use the new LLM models and migrates of API key based model selection
Claude 3.5 etc... all work now.
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
* DRAFT: Create AI Bot users dynamically and support custom LlmModels
* Get user associated to llm_model
* Track enabled bots with attribute
* Don't store bot username. Minor touches to migrate default values in settings
* Handle scenario where vLLM uses a SRV record
* Made 3.5-turbo-16k the default version so we can remove hack
Native tools do not work well on Opus.
Chain of Thought prompting means it consumes enormous amounts of
tokens and has poor latency.
This commit introduce and XML stripper to remove various chain of
thought XML islands from anthropic prompts when tools are involved.
This mean Opus native tools is now functions (albeit slowly)
From local testing XML just works better now.
Also fixes enum support in Anthropic native tools
Previoulsy on GPT-4-vision was supported, change introduces support
for Google/Anthropic and new OpenAI models
Additionally this makes vision work properly in dev environments
cause we sent the encoded payload via prompt vs sending urls
- a post can be triaged a maximum of twice a minute
- system can run a total of 60 triages a minute
Low defaults were picked to safeguard against any possible loops
This can be amended if required via hidden site settings.
- Introduce new support for GPT4o (automation / bot / summary / helper)
- Properly account for token counts on OpenAI models
- Track feature that was used when generating AI completions
- Remove custom llm support for summarization as we need better interfaces to control registration and de-registration
Both endpoints provide OpenAI-compatible servers. The only difference is that Vllm doesn't support passing tools as a separate parameter. Even if the tool param is supported, it ultimately relies on the model's ability to handle native functions, which is not the case with the models we have today.
As a part of this change, we are dropping support for StableBeluga/Llama2 models. They don't have a chat_template, meaning the new API can translate them.
These changes let us remove some of our existing dialects and are a first step in our plan to support any LLM by defining them as data-driven concepts.
I rewrote the "translate" method to use a template method and extracted the tool support strategies into its classes to simplify the code.
Finally, these changes bring support for Ollama when running in dev mode. It only works with Mistral for now, but it will change soon..
- Adds support for sd3 and sd3 turbo models - this requires new endpoints
- Adds a hack to normalize arrays in the tool calls
- Removes some leftover code
- Adds support for aspect ratio as well so you can generate wide or tall images