Fix: Improve iterative refinement performance and reduce context explosion #874


Draft
LearningCircuit wants to merge 2 commits into dev from fix/iterative-refinement-performance

Conversation

LearningCircuit (Owner) commented Sep 25, 2025

Summary

This PR fixes critical performance issues in the iterative refinement strategy that were causing:

  • Context explosion with unbounded LLM evaluation tokens
  • Query drift where refinements would diverge from the original topic
  • Excessive token usage and slow processing

Changes

  • Set max_evaluation_tokens default to 2000 (was None) to prevent unbounded context growth
  • Refactored _evaluate_with_llm to use a structured findings list instead of parsing formatted text
  • Modified the evaluation step to pass only recent refinement findings to the LLM instead of all accumulated context
  • Constrained refinement prompts to stay focused on the original query
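The changes above can be sketched as follows. This is an illustrative outline, not the actual strategy class from the repository: the `Finding` dataclass, the `recent_findings` window of 3, and the 4-chars-per-token heuristic are all assumptions standing in for the real implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    content: str
    iteration: int  # which refinement round produced this finding

@dataclass
class IterativeRefinementState:
    original_query: str
    findings: list = field(default_factory=list)
    max_evaluation_tokens: int = 2000  # was None (unbounded)

    def recent_findings(self, window: int = 3):
        """Return only findings from the last `window` refinement rounds,
        rather than all accumulated context."""
        if not self.findings:
            return []
        latest = max(f.iteration for f in self.findings)
        return [f for f in self.findings if f.iteration > latest - window]

    def build_evaluation_context(self) -> str:
        """Concatenate recent findings, truncated to the token budget.

        A crude 4-chars-per-token heuristic stands in for a real tokenizer.
        """
        budget_chars = self.max_evaluation_tokens * 4
        parts, used = [], 0
        for f in self.recent_findings():
            take = f.content[: max(0, budget_chars - used)]
            if not take:
                break
            parts.append(take)
            used += len(take)
        return "\n\n".join(parts)
```

The key design point is that the evaluation context is bounded in two independent ways: by recency (only the last few rounds) and by a hard token budget, so neither long runs nor verbose findings can blow up the LLM context.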

Testing

Tested with query "What to do against stress?" and observed:

  • Refinement queries now stay on topic (no more drift to unrelated company verifications)
  • Faster processing with reduced context
  • Confidence progression improved from erratic (63%→78%→60%) to more stable

Further Work Needed

While this PR addresses the immediate performance issues, additional improvements should be considered:

  • Search relevance: The search still returns some irrelevant sources that need better filtering
  • Confidence threshold tuning: Current threshold of 0.95 may be too high, causing unnecessary refinements
  • Duplicate detection: Add similarity checking to prevent redundant refinements
  • Completeness detection: Improve the evaluation logic to better detect when sufficient information has been gathered
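For the duplicate-detection item, one minimal sketch is to reject a candidate refinement query when it nearly matches one already issued. The Jaccard similarity over lowercase word sets and the 0.8 threshold are assumptions for illustration; an embedding-based similarity measure would also fit here.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity between two queries."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def is_redundant(candidate: str, previous_queries: list,
                 threshold: float = 0.8) -> bool:
    """True if the candidate query nearly duplicates an earlier refinement."""
    return any(jaccard(candidate, q) >= threshold for q in previous_queries)
```

A strategy could call `is_redundant` before issuing each refinement and skip (or rephrase) the query instead of burning another search round on it.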

Test Results

Before: Queries drifted from "stress reduction" to "verify 13 companies implementing mental health programs"
After: Queries stay focused - "provide evidence-based stress reduction techniques with mechanisms and actionable steps"

Fix: Improve iterative refinement performance and reduce context explosion

- Set max_evaluation_tokens default to 2000 (was None) to prevent unbounded context
- Use structured findings list instead of formatted text in _evaluate_with_llm
- Pass only recent refinement findings to LLM instead of all accumulated context
- Constrain refinement prompts to stay focused on original query

This fixes:

- Context explosion causing slow processing and high token costs
- Query drift where refinements diverge from original topic
- Excessive LLM context leading to degraded performance
Copilot AI review requested due to automatic review settings September 25, 2025 21:14
LearningCircuit (Owner, Author) commented:

@djpetti still draft PR (sorry)

Copilot AI (Contributor) left a comment:

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Repository owner deleted a comment from github-actions bot Sep 26, 2025
Repository owner deleted a comment from claude bot Sep 26, 2025
Base automatically changed from fix/subscriptions-css-classes to dev September 28, 2025 18:33
LearningCircuit added the dev-bugfix label (Bug fixes for issues found in dev branch) Oct 3, 2025

Reviewers

  • Copilot — left review comments
  • HashedViking — awaiting requested review (code owner)
  • djpetti — awaiting requested review (code owner)

At least 1 approving review is required to merge this pull request.

