This PR enhances the LLM triage automation with several important improvements:
- Add ability to use AI personas for automated replies instead of canned replies
- Add support for whisper responses
- Refactor LLM persona reply functionality into a reusable method
- Add new settings to configure response behavior in automations
- Improve error handling and logging
- Fix handling of personal messages in the triage flow
- Add comprehensive test coverage for new features
- Make personas configurable with more flexible requirements
This allows for more dynamic and context-aware responses in automated workflows, with better control over visibility and attribution.
adds support for "thinking tokens" - a feature that exposes the model's reasoning process before providing the final response. Key improvements include:
- Add a new Thinking class to handle thinking content from LLMs
- Modify endpoints (Claude, AWS Bedrock) to handle thinking output
- Update AI bot to display thinking in collapsible details section
- Fix SEARCH/REPLACE blocks to support empty replacement strings and general improvements to artifact editing
- Allow configurable temperature in triage and report automations
- Various bug fixes and improvements to diff parsing
A new feature_context json column was added to ai_api_audit_logs
This allows us to store rich json like context on any LLM request
made.
This new field now stores automation id and name.
Additionally allows llm_triage to specify maximum number of tokens
This means that you can limit the cost of llm triage by scanning only
first N tokens of a post.
* DEV: Remove old code now that features rely on LlmModels.
* Hide old settings and migrate persona llm overrides
* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
Introduce a Discourse Automation based periodical report. Depends on Discourse Automation.
Report works best with very large context language models such as GPT-4-Turbo and Claude 2.
- Introduces final_insts to generic llm format, for claude to work best it is better to guide the last assistant message (we should add this to other spots as well)
- Adds GPT-4 turbo support to generic llm interface
llm_triage supported claude 2 in triage, this implements it
OpenAI rate limits frequently, this introduces some exponential
backoff (3 attempts - 3 seconds, 9 and 27)
Also reduces temp of classifiers so they have consistent behavior
The new automation rule can be used to perform llm based classification and categorization of topics.
You specify a system prompt (which has %%POST%% as an input), if it returns a particular piece of text then we will apply rules such as tagging, hiding, replying or categorizing.
This can be used as a spam filter, a "oops you are in the wrong place" filter and so on.
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>