Commit Graph

567 Commits

Author SHA1 Message Date
Sam cf86d274a0
FEATURE: improve o3-mini support (#1106)
* DEV: raise timeout for reasoning LLMs

* FIX: use id to identify llms, not model_name

model_name is not unique; in the case of reasoning models,
you may configure the same LLM multiple times using different
reasoning levels.
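
Concretely (an illustrative lookup; the records and ids are made up):

```ruby
# Two LLM records can share a model name while differing in reasoning
# effort, so only the row id is a safe unique key (illustrative data).
LlmModel.where(name: "o3-mini").pluck(:id) # => [12, 13] (low / high effort)
LlmModel.find(13)                          # unambiguous lookup by id
```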
2025-02-03 08:45:56 +11:00
Sam 381a2715c8
FEATURE: o3-mini support (#1105)
1. Adds o3-mini presets
2. Adds support for reasoning effort
3. Properly use "developer" messages for reasoning models
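
For reference, a reasoning-model request roughly takes this shape (a sketch of the OpenAI chat payload; the values are illustrative):

```ruby
payload = {
  model: "o3-mini",
  reasoning_effort: "high", # the new per-LLM reasoning-effort setting
  messages: [
    # reasoning models take "developer" messages in place of "system"
    { role: "developer", content: "You are a helpful forum assistant." },
    { role: "user", content: "Summarize this topic." },
  ],
}
```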
2025-02-01 14:08:34 +11:00
Rafael dos Santos Silva 8c22540e61
FEATURE: Use persona default LLM for Discord integration (#1104) 2025-01-31 18:11:20 -03:00
Keegan George 13f9a1908a
DEV: Only permit config of allowed seeded models (#1103)
This update resolves a regression introduced in https://github.com/discourse/discourse-ai/pull/1036/files. Previously, only seeded models that were allowed could be configured in model settings. However, in our attempt to stop unreachable-LLM errors from blocking settings from persisting, we also unknowingly allowed disallowed seeded models to be configured. This update resolves that issue while keeping the ability to set unreachable LLMs.
2025-01-31 13:01:15 -08:00
Roman Rizzi 1572068735
DEV: Improve embedding configs validations (#1101)
Before this change, we let you set the embeddings selected model back to blank even with embeddings enabled. This would leave the site in a broken state.

Additionally, it adds a fail-safe for these scenarios to avoid errors on the topics page.
2025-01-30 14:16:56 -03:00
Natalie Tay fe44d78156
DEV: Expose AI spam scanning metrics (#1077)
This should give us a better idea on how our scanner is faring across sites.

```
# HELP discourse_discourse_ai_spam_detection AI spam scanning statistics
# TYPE discourse_discourse_ai_spam_detection counter
discourse_discourse_ai_spam_detection{db="default",type="scanned"} 16
discourse_discourse_ai_spam_detection{db="default",type="is_spam"} 7
discourse_discourse_ai_spam_detection{db="default",type="false_positive"} 1
discourse_discourse_ai_spam_detection{db="default",type="false_negative"} 2
```
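
A minimal sketch of how a scan result might bump one of these counters (the `Metrics.counter` helper is hypothetical; the real wiring goes through the discourse-prometheus plugin):

```ruby
# Hypothetical helper standing in for the discourse-prometheus integration.
Metrics.counter(
  "discourse_ai_spam_detection",
  labels: { db: RailsMultisite::ConnectionManagement.current_db, type: "is_spam" },
).increment
```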
2025-01-27 11:57:01 +08:00
Rafael dos Santos Silva 67a1257b89
FEATURE: Gemini Tokenizer (#1088) 2025-01-23 18:20:35 -03:00
Roman Rizzi 5a97752117
FIX: Always raise the single exception/Open AI models migration (#1087) 2025-01-23 15:30:06 -03:00
Sam 8bf350206e
FEATURE: track duration of AI calls (#1082)
* FEATURE: track duration of AI calls

* annotate
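
The measurement itself is simple; a sketch (the audit-log column name is an assumption):

```ruby
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
result = llm.generate(prompt, user: user)
duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).to_i
audit_log.update!(duration_msecs: duration_ms) # column name assumed
```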
2025-01-23 11:32:12 +11:00
Roman Rizzi faa8e6e873
FIX: Embeddings backfill rake task was using old code (#1084) 2025-01-22 14:00:26 -03:00
Roman Rizzi 3b66fb3e87
FIX: Restore the accidentally deleted query prefix. (#1079)
Additionally, we add a prefix for embedding generation.
Both are stored in the definitions table.
2025-01-21 14:10:31 -03:00
Roman Rizzi f5cf1019fb
FEATURE: configurable embeddings (#1049)
* Use AR model for embeddings features

* endpoints

* Embeddings CRUD UI

* Add presets. Hide a couple more settings

* system specs

* Seed embedding definition from old settings

* Generate search bit index on the fly; clean up orphaned data

* support for seeded models

* Fix run test for new embedding

* fix selected model not set correctly
2025-01-21 12:23:19 -03:00
Sam 2c609e165b
FEATURE: Add user location info to spam scanner context (#1076)
This adds registration IP, last known IP, and email address to the scanning context.

This provides another hint to the spam scanner about possibly malicious users.

For example: registered in India but replying from Australia, or an
email address that is clearly a throwaway.
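
Roughly, the extra fields look like this (a sketch; the context keys are assumptions, the `User` attributes are standard Discourse columns):

```ruby
context[:registration_ip] = user.registration_ip_address
context[:last_known_ip]   = user.ip_address
context[:email]           = user.email
```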
2025-01-21 17:51:21 +11:00
Roman Rizzi 46fcdb6ba5
FIX: Make summaries backfill job more resilient. (#1071)
To quickly select backfill candidates without comparing SHAs, we compare the last summarized post to the topic's highest_post_number. However, hiding or deleting a post and then adding a small action will update this column, causing the job to stall and re-generate the same summary repeatedly until someone posts a regular reply. On top of that, the assumption doesn't always hold for summaries built from `best_replies`, since the last reply isn't necessarily included.

Since this is not evident at first glance and each summarization strategy picks its targets differently, I'm opting to simplify the backfill logic and how we track potential candidates.

The first step is dropping `content_range`, which serves no purpose; it's there because summary caching was supposed to work differently at the beginning. I'm replacing it with a column called `highest_target_number`, which tracks `highest_post_number` for topics and could track other things, like a channel's `message_count`, in the future.

Now that we have this column, when selecting potential backfill candidates we check whether the summary is truly outdated by comparing the SHAs; if it's not, we just update the column and move on.
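
In rough terms, the simplified backfill check becomes (illustrative, not the exact code; `current_sha` and `backfill` are assumed helpers):

```ruby
Topic
  .joins("LEFT JOIN ai_summaries s ON s.target_id = topics.id AND s.target_type = 'Topic'")
  .where("s.highest_target_number IS NULL OR s.highest_target_number < topics.highest_post_number")
  .find_each do |topic|
    summary = AiSummary.find_by(target_id: topic.id, target_type: "Topic")
    if summary && summary.original_content_sha == current_sha(topic)
      # not truly outdated: just record the new high-water mark and move on
      summary.update!(highest_target_number: topic.highest_post_number)
    else
      backfill(topic)
    end
  end
```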
2025-01-16 09:42:53 -03:00
Rafael dos Santos Silva f9aa2de413
FIX: AWS Bedrock non-streaming calls response log (#1072) 2025-01-15 18:51:25 -03:00
Sam 81b952d56e
FIX: only hide posts detected explicitly as spam (#1070)
When enabling the spam scanner there may be old, unscanned posts;
this creates a risky situation where the scanner operates
on legitimate posts and produces false positives.

To keep this a lot safer, we no longer try to hide old posts by
the spammers.
2025-01-15 16:50:41 +11:00
Natalie Tay c881f8b361
DEV: Add rake task to send topics or posts to spam scanner (#1059) 2025-01-15 11:48:57 +08:00
Roman Rizzi 65bbcd71fc
DEV: Embedding tables' model_id has to be a bigint (#1058)
* DEV: Embedding tables' model_id has to be a bigint

* Drop old search_bit indexes

* copy rag fragment embeddings created during deploy window
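
The core of the change is a plain column-type migration (sketch; table name illustrative):

```ruby
# Widen model_id so it can hold bigint-range ids.
change_column :ai_topic_embeddings, :model_id, :bigint
```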
2025-01-14 10:53:06 -03:00
Sam d07cf51653
FEATURE: llm quotas (#1047)
Adds a comprehensive quota management system for LLM models that allows:

- Setting per-group (applied per user in the group) token and usage limits with configurable durations
- Tracking and enforcing token/usage limits across user groups
- Quota reset periods (hourly, daily, weekly, or custom)
- Admin UI for managing quotas with real-time updates

This system provides granular control over LLM API usage by allowing admins
to define limits on both total tokens and number of requests per group.
Supports multiple concurrent quotas per model and automatically handles
quota resets.
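
A hedged sketch of the enforcement path (class, column, and error names are all assumptions based on the description above):

```ruby
quota = LlmQuota.find_by(group_id: group.id, llm_model_id: llm.id)
if quota
  usage = quota.usage_for(user) # tokens and requests in the current window
  if usage.tokens >= quota.max_tokens || usage.requests >= quota.max_usages
    raise LlmQuota::Exceeded # blocked until the quota period resets
  end
end
```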


Co-authored-by: Keegan George <kgeorge13@gmail.com>
2025-01-14 15:54:09 +11:00
Sam 20612fde52
FEATURE: add the ability to disable streaming on an Open AI LLM
Disabling streaming is required for models such as o1 that do not have
streaming enabled yet.

It is good to carry this feature around in case various APIs decide not to support streaming endpoints, so Discourse AI can continue to work just as it did before.
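
Conceptually, the completion path gains a one-shot branch (a sketch; the setting lookup and method names are assumptions):

```ruby
if llm_model.disable_streaming?
  response = request_completion(prompt) # single blocking request
  yield response                        # delivered as one chunk
else
  request_streaming_completion(prompt) { |partial| yield partial }
end
```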

Also fixes an issue where sharing artifacts would miss the viewport, leading to tiny artifacts on mobile.
2025-01-13 17:01:01 +11:00
Keegan George b24669c810
DEV: Add structure for errors in spam (#1054)
This update adds some structure for handling errors in the spam config while also handling a specific error related to the spam scanning user not being an admin account.
2025-01-09 09:17:06 -08:00
Sam 11d0f60f1e
FEATURE: smart date support for AI helper (#1044)
* FEATURE: smart date support for AI helper

This feature allows conversion of human-typed dates and times
into smart, timezone-friendly "Discourse" dates.
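
Illustratively (the exact markup the helper emits is an assumption, but Discourse's timezone-aware date BBCode has this shape):

```ruby
input  = "let's ship this tomorrow at 2pm"
output = %(let's ship this [date=2024-12-31 time=14:00:00 timezone="Australia/Sydney"])
```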

* fix specs and lint

* lint

* address feedback

* add specs
2024-12-31 08:04:25 +11:00
Sam f9f89adac5
FIX: keep track of silence reason when spam detection flags user (#1046)
Previously, the reason was blank when silencing a user
2024-12-27 17:47:16 +11:00
Keegan George b480f13a0f
FIX: Prevent LLM enumerator from erroring when spam enabled (#1045)
This PR fixes an issue where LLM enumerator would error out when `SiteSetting.ai_spam_detection = true` but there was no `AiModerationSetting.spam` present.

Typically, we add an `LlmDependencyValidator` for the setting itself; however, since Spam is unique in that its model is set in `AiModerationSetting` instead of a `SiteSetting`, we add a simple check here to prevent erroring out.
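
A sketch of the guard (shape assumed):

```ruby
# Only enumerate the spam LLM when a moderation setting actually exists.
if SiteSetting.ai_spam_detection && AiModerationSetting.spam&.llm_model
  values << AiModerationSetting.spam.llm_model.id
end
```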
2024-12-27 09:12:29 +11:00
Sam 47ecf86aa1
FIX: embedding validation (#1043) 2024-12-24 09:37:23 +11:00
Rafael dos Santos Silva 792df58fbc
FIX: AI Helper category / tag suggestion when user does not have categories muted (#1042) 2024-12-23 15:58:26 -03:00
Roman Rizzi ceac6e5efb
FIX: Embeddings validator test needs to use the new Vector class. (#1041) 2024-12-23 14:19:22 -03:00
Keegan George bdb8f1d5e0
FIX: Custom prefix causing allowed seeded LLMs not to be shown (#1039)
* FIX: Custom prefix causing allowed seeded LLMs not to be shown

* DEV: update spec

* Not `_map`, so it should be a string, not an array
2024-12-23 16:42:26 +11:00
Rafael dos Santos Silva 7607477ff9
FIX: Cloudflare Workers AI embeddings (#1037)
Regressed in 534b0df
2024-12-20 17:45:27 -03:00
Keegan George 059b3fabb8
DEV: Unreachable LLM error shouldn't prevent setting (#1036)
Previously, when you tried to set a model, we ran a test and returned an error if the test failed. The error then prevented you from saving the site setting.

This caused issues when we tried to automate things. This PR updates the behaviour so that the test still runs and quietly logs failures, but no longer prevents the setting from being saved. Instead, we rely on "run test" in the LLM config along with ProblemChecks to catch issues.
2024-12-20 11:52:11 -08:00
Sam 6a7a45fd4f
FIX: properly spin down unused streamer threads (#1035)
The previous version was prone to this bug:

https://github.com/ruby-concurrency/concurrent-ruby/issues/1075

This is particularly bad because we could have a DB connection
attached to the thread that we never clean up, so after N hours
this could start exhibiting weird connection issues.
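
The minimal shape of the fix (illustrative): make sure a worker thread releases its ActiveRecord connection on the way out instead of leaking it.

```ruby
Thread.new do
  while (job = queue.pop)
    job.call
  end
ensure
  # Without this, the parked thread keeps its DB connection forever.
  ActiveRecord::Base.connection_pool.release_connection
end
```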
2024-12-20 12:09:42 +11:00
Sam fae2d5ff2c
FEATURE: link correctly to filters to assist in debugging spam (#1031)
- Add spam_score_type to AiSpamSerializer for better integration with reviewables.
- Introduce a custom filter for detecting AI spam false negatives in moderation workflows.
- Refactor spam report generation to improve identification of false negatives.
- Add tests to verify the custom filter and its behavior.
- Introduce links for all spam counts in report
2024-12-17 11:02:18 +11:00
Mark VanLandingham 24b107881a
FEATURE: Unavailable state for semantic search when sort is not Relevant (#1030)
This commit adds an "unavailable" state for the AI semantic search toggle. Currently the AI toggle disappears when sorting by anything other than Relevance, which makes the UI confusing for users looking for AI results. This should help!
2024-12-16 14:30:11 -06:00
Roman Rizzi 534b0df391
REFACTOR: Separation of concerns for embedding generation. (#1027)
In a previous refactor, we moved the responsibility of querying and storing embeddings into the `Schema` class. Now, it's time for embedding generation.

The motivation behind these changes is to isolate vector characteristics in simple objects to later replace them with a DB-backed version, similar to what we did with LLM configs.
2024-12-16 09:55:39 -03:00
Roman Rizzi eae527f99d
REFACTOR: A Simpler way of interacting with embeddings tables. (#1023)
* REFACTOR: A Simpler way of interacting with embeddings' tables.

This change adds a new abstraction called `Schema`, which acts as a repository that supports the same DB features `VectorRepresentation::Base` has, except that it removes the need for duplicated methods per embeddings table.

It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.
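
A sketch of the block-based search described above (the API shape is assumed):

```ruby
schema = DiscourseAi::Embeddings::Schema.for(Topic)
schema.asymmetric_similarity_search(embedding, limit: 10, offset: 0) do |builder|
  builder.join("topics t ON t.id = topic_id")
  builder.where("t.archetype = ?", Archetype.default)
end
```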
2024-12-13 10:15:21 -03:00
Roman Rizzi 97ec2c5ff4
FEATURE: Show gists everywhere except suggested/related (#995) 2024-12-12 12:29:35 -03:00
Sam 47c1ea337e
FIX: allow scanning of trashed posts and deleted users for test (#1024)
When a post is trashed we should still be allowed to scan it;
post.topic will be nil for a trashed topic even if the post is trashed.
2024-12-12 10:26:05 +11:00
Sam 47f5da7e42
FEATURE: Add AI-powered spam detection for new user posts (#1004)
This introduces a comprehensive spam detection system that uses LLM models
to automatically identify and flag potential spam posts. The system is
designed to be both powerful and configurable while preventing false positives.

Key Features:
* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post

Technical Implementation:
* New database tables:
  - ai_spam_logs: Stores scan history and results
  - ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
  - Minimum 10-minute delay between rescans
  - Only scans significant edits (>10 char difference)
  - Maximum 3 scans per post
  - 24-hour maximum age for scannable posts
* Admin UI features:
  - Real-time testing capabilities
  - 7-day statistics dashboard
  - Configurable LLM model selection
  - Custom instruction support

Security and Performance:
* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts
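
A condensed sketch of the gating rules above (all names illustrative):

```ruby
def should_scan?(post)
  return false if post.user.trust_level > TrustLevel[1] # TL0/TL1 only
  return false if post.topic.private_message?           # PMs are skipped
  return false if post.user.post_count > 3              # user already established
  return false if post.created_at < 24.hours.ago        # too old to scan
  true
end
```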


---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2024-12-12 09:17:25 +11:00
Martin Brennan ae80494448
UX: Improve rough edges of AI usage page (#1014)
* UX: Improve rough edges of AI usage page

* Ensure all text uses I18n
* Change from <button> usage to <DButton>
* Use <AdminConfigAreaCard> in place of custom card styles
* Format numbers nicely using our number format helper,
  show full values on hover using title attr
* Ensure 0 is always shown for counters, instead of being blank

* FEATURE: Load usage data after page load

Use ConditionalLoadingSpinner to hide load of usage
data, this prevents us hanging on page load with a white
screen.

* UX: Split users table, and add empty placeholders and page subheader

* DEV: Test fix
2024-12-12 08:55:24 +11:00
Keegan George a4440c507b
UX: Make sentiment trends more readable (#1018)
Instead of a stacked chart showing separate series for positive and negative, this PR simplifies the overall sentiment dashboard by collapsing sentiment into a single series of the difference, `positive - negative`. This should make the data easier to scan for trends.
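
Illustratively, the transformation is just (names assumed):

```ruby
net_sentiment = dates.map { |d| positive_counts[d].to_i - negative_counts[d].to_i }
```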
2024-12-11 09:13:18 -08:00
Roman Rizzi 5fc7a730ef
FIX: Triage rule should append selected tags instead of replacing them (#1022) 2024-12-11 11:19:44 -03:00
Roman Rizzi 6da35d8e66
FIX: Gemini inference client was missing #instance (#1019) 2024-12-10 15:42:31 -03:00
Keegan George 700e9de073
Revert "UX: Make sentiment trends more readable in time series data (#1013)" (#1016)
This reverts commit 375dd702b2.
2024-12-10 08:15:27 -08:00
Keegan George 375dd702b2
UX: Make sentiment trends more readable in time series data (#1013)
Instead of a stacked chart showing separate series for positive and negative, this PR simplifies the overall sentiment dashboard by collapsing sentiment into a single series of the difference, `positive - negative`. This should make the data easier to scan for trends.
2024-12-10 07:22:41 -08:00
Sam 7ca21cc329
FEATURE: first class support for OpenRouter (#1011)
* FEATURE: first class support for OpenRouter

This new implementation supports picking a quantization and provider preference
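
The preferences ride along in the request body, roughly like this (field names follow OpenRouter's documented `provider` routing object; the plugin wiring is assumed):

```ruby
payload[:provider] = {
  order: ["Fireworks"],   # preferred providers, in order
  quantizations: ["fp8"], # acceptable quantizations
}
```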

Also:

- Improve logging for summary generation
- Improve error message when contacting LLMs fails

* Better support for full screen artifacts on iPad

Support back button to close full screen
2024-12-10 05:59:19 +11:00
Roman Rizzi 085dde7042
FEATURE: Select stop sequences from triage script (#1010) 2024-12-06 11:13:47 -03:00
Roman Rizzi 7ebbcd2de3
FIX: Make sure prompt uploads get included in the prompt when triaging (#1008) 2024-12-05 21:04:35 -03:00
Sam a55216773a
FEATURE: Amazon Nova support via bedrock (#997)
Refactor dialect selection and add Nova API support:

- Change dialect selection to use the llm_model object instead of just the provider name
- Add support for Amazon Bedrock's Nova API with native tools
- Implement Nova-specific message processing and formatting
- Update specs for Nova and AWS Bedrock endpoints
- Enhance AWS Bedrock support to handle Nova models
- Fix Gemini beta API detection logic
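
A sketch of the dialect-selection change (method names assumed):

```ruby
# Key the dialect off the llm_model record instead of the provider string.
dialect_klass = DiscourseAi::Completions::Dialects::Dialect.dialect_for(llm_model)
prompt = dialect_klass.new(messages, llm_model).translate
```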
2024-12-06 07:45:58 +11:00
Rafael dos Santos Silva 5ba274f7a5
FIX: Change how we filter date for emotion on /filter (#1006) 2024-12-05 14:10:12 -03:00
Roman Rizzi b32b1cf241
FIX: Add a digest check to avoid repeatedly generating embeddings (bulk) (#1001) 2024-12-04 17:47:28 -03:00
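
Illustratively, the backfill loop now skips unchanged content (a sketch; `prepare_text`, `embedding_digest`, and `generate_embedding` are assumed helpers):

```ruby
require "openssl"

topics.find_each do |topic|
  digest = OpenSSL::Digest::SHA1.hexdigest(prepare_text(topic))
  next if topic.embedding_digest == digest # unchanged: skip re-embedding
  generate_embedding(topic, digest)
end
```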