Commit Graph

5 Commits

Author SHA1 Message Date
Sam 2c609e165b
FEATURE: Add user location info to spam scanner context (#1076)
This adds registration and last known IP information and email to scanning context.

This provides another hint for spam scanner about possible malicious users.

For example registered in India, replying from Australia or
email is clearly a throwaway email address.
2025-01-21 17:51:21 +11:00
Sam 81b952d56e
FIX: only hide posts detected explicitly as spam (#1070)
When enabling spam scanner it there may be old unscanned posts
this can create a risky situation where spam scanner operates
on legit posts during false positives

To keep this a lot safer we no longer try to hide old stuff by
the spammers.
2025-01-15 16:50:41 +11:00
Natalie Tay c881f8b361
DEV: Add rake task to send topics or posts to spam scanner (#1059) 2025-01-15 11:48:57 +08:00
Sam f9f89adac5
FIX: keep track of silence reason when spam detection flags user (#1046)
Previously reason was blank for silencing user
2024-12-27 17:47:16 +11:00
Sam 47f5da7e42
FEATURE: Add AI-powered spam detection for new user posts (#1004)
This introduces a comprehensive spam detection system that uses LLM models
to automatically identify and flag potential spam posts. The system is
designed to be both powerful and configurable while preventing false positives.

Key Features:
* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post

Technical Implementation:
* New database tables:
  - ai_spam_logs: Stores scan history and results
  - ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
  - Minimum 10-minute delay between rescans
  - Only scans significant edits (>10 char difference)
  - Maximum 3 scans per post
  - 24-hour maximum age for scannable posts
* Admin UI features:
  - Real-time testing capabilities
  - 7-day statistics dashboard
  - Configurable LLM model selection
  - Custom instruction support

Security and Performance:
* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts


---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2024-12-12 09:17:25 +11:00