From 7db2589cc4ecdb6fc69059c0cc3d4b0c80f8e12a Mon Sep 17 00:00:00 2001
From: Sam
Date: Tue, 20 May 2025 13:01:35 +1000
Subject: [PATCH] DEV: prompt engineering to improve citations (#1351)

---
 lib/personas/forum_researcher.rb | 12 ++++++++----
 lib/personas/tools/researcher.rb |  9 ++++++++-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/lib/personas/forum_researcher.rb b/lib/personas/forum_researcher.rb
index eb0f82cb..a94d9e46 100644
--- a/lib/personas/forum_researcher.rb
+++ b/lib/personas/forum_researcher.rb
@@ -21,6 +21,8 @@ module DiscourseAi
           The description is: {site_description}
           The participants in this conversation are: {participants}
           The date now is: {time}, much has changed since you were trained.
+          Topic URLs are formatted as: /t/-/TOPIC_ID
+          Post URLs are formatted as: /t/-/TOPIC_ID/POST_NUMBER
 
           As a forum researcher, guide users through a structured research process:
           1. UNDERSTAND: First clarify the user's research goal - what insights are they seeking?
@@ -41,10 +43,12 @@ module DiscourseAi
 
           Research workflow best practices:
           1. Start with a dry_run to gauge the scope (set dry_run:true)
-          2. If results are too numerous (>1000), add more specific filters
-          3. If results are too few (<5), broaden your filters
-          4. For temporal analysis, specify explicit date ranges
-          5. For user behavior analysis, combine @username with categories or tags
+          2. For temporal analysis, specify explicit date ranges
+          3. For user behavior analysis, combine @username with categories or tags
+
+          - When formatting research results, format backing links clearly:
+            - When it is a good fit, link to the topic with descriptive text.
+            - When it is a good fit, link using markdown footnotes.
         PROMPT
       end
     end
diff --git a/lib/personas/tools/researcher.rb b/lib/personas/tools/researcher.rb
index 2e93da3d..0c0921e6 100644
--- a/lib/personas/tools/researcher.rb
+++ b/lib/personas/tools/researcher.rb
@@ -166,8 +166,15 @@ module DiscourseAi
 
         def goal_system_prompt(goals)
           <<~TEXT
-            You are a researcher tool designed to analyze and extract information from forum content.
+            You are a researcher tool designed to analyze and extract information from forum content on #{Discourse.base_url}.
+            The current date is #{::Time.zone.now.strftime("%a, %d %b %Y %H:%M %Z")}.
             Your task is to process the provided content and extract relevant information based on the specified goal.
+            When extracting content ALWAYS include the following:
+              - Multiple citations using Markdown
+                - Topic citations: Interesting fact [ref](/t/-/TOPIC_ID)
+                - Post citations: Interesting fact [ref](/t/-/TOPIC_ID/POST_NUMBER)
+              - Relevant quotes from the direct source content
+              - Relevant dates and times from the content
 
             Your goal is: #{goals}
           TEXT
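
Reviewer note (not part of the patch): the prompts above ask the model to cite topics as /t/-/TOPIC_ID and posts as /t/-/TOPIC_ID/POST_NUMBER using Markdown links. A minimal Ruby sketch of what a conforming citation looks like follows. The CitationLinks module and its method names are hypothetical, introduced only for illustration; nothing like them appears in this commit, and the plugin itself relies on the model to produce these links.

```ruby
# Hypothetical helper, for illustration only; not part of the plugin.
module CitationLinks
  module_function

  # Relative topic URL in the form the prompt describes: /t/-/TOPIC_ID
  def topic_path(topic_id)
    "/t/-/#{Integer(topic_id)}"
  end

  # Relative post URL in the form the prompt describes: /t/-/TOPIC_ID/POST_NUMBER
  def post_path(topic_id, post_number)
    "#{topic_path(topic_id)}/#{Integer(post_number)}"
  end

  # Appends a Markdown "[ref](...)" citation to a statement, pointing at a
  # post when post_number is given and at the topic otherwise.
  def cite(text, topic_id, post_number = nil)
    path = post_number ? post_path(topic_id, post_number) : topic_path(topic_id)
    "#{text} [ref](#{path})"
  end
end

puts CitationLinks.cite("Interesting fact", 123)
puts CitationLinks.cite("Interesting fact", 123, 4)
```

Integer() is used so that a non-numeric ID raises instead of producing a malformed link; that validation is a choice made for this sketch, not something the patch requires.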