[Paper Review] InfiAgent: An Infinite-Horizon Framework for General-Purpose Autonomous Agents
InfiAgent externalizes persistent task state into a file-centric workspace and reconstructs a bounded reasoning context at each step. With a 20B open-source model, this design is competitive with larger proprietary agents on research tasks and maintains high coverage on long-horizon tasks.
LLM agents can reason and use tools, but they often break down on long-horizon tasks due to unbounded context growth and accumulated errors. Common remedies such as context compression or retrieval-augmented prompting introduce trade-offs between information fidelity and reasoning stability. We present InfiAgent, a general-purpose framework that keeps the agent's reasoning context strictly bounded regardless of task duration by externalizing persistent state into a file-centric state abstraction. At each step, the agent reconstructs context from a workspace state snapshot plus a fixed window of recent actions. Experiments on DeepResearch and an 80-paper literature review task show that, without task-specific fine-tuning, InfiAgent with a 20B open-source model is competitive with larger proprietary systems and maintains substantially higher long-horizon coverage than context-centric baselines. These results support explicit state externalization as a practical foundation for stable long-horizon agents. GitHub repo: https://github.com/ChenglinPoly/infiAgent
Research Motivation and Objectives
- Motivate the need for stable long-horizon autonomy in LLM agents and identify how unbounded context harms reasoning over time.
- Propose a file-centric persistent state abstraction that keeps reasoning context strictly bounded.
- Describe a hierarchical multi-level agent architecture and external attention to manage large documents.
- Evaluate long-horizon stability and performance on research-oriented benchmarks without task-specific fine-tuning.
Proposed Method
- Formalize the separation between persistent task state and bounded reasoning context.
- Define the persistent state F_t as a file-system workspace that evolves via a state-transition operator T(F_t, a_t), where a_t is the action taken at step t.
- Construct bounded reasoning context c_t^{bounded} from a workspace snapshot and a fixed window of recent actions.
- Implement a hierarchical agent stack (Alpha, Domain, Atomic) with Agent-as-a-Tool calls to manage complexity and reduce tool-call chaos.
- Introduce an External Attention Pipeline to extract task-relevant information from large documents without inflating the main reasoning context.
- Evaluate on the DeepResearch benchmark and a long-horizon 80-paper literature review, comparing against larger models and ablations.
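
The separation between persistent state and bounded context described in the list above can be illustrated with a small sketch. This is not the authors' implementation: the `Workspace` class, the action dictionary format, the window size, and the JSON layout of the context are all assumptions made for illustration, and the update F_{t+1} = T(F_t, a_t) is read into the notation rather than quoted from the paper. The sketch only shows that the action history entering the context stays capped at a fixed window while all durable results live in files.

```python
import json
import tempfile
from collections import deque
from pathlib import Path


class Workspace:
    """Persistent task state F_t, stored as plain files under one directory (hypothetical)."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def apply(self, action: dict) -> None:
        """State transition T(F_t, a_t): in this sketch only 'write' actions change files."""
        if action["type"] == "write":
            path = self.root / action["path"]
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(action["content"], encoding="utf-8")

    def snapshot(self) -> list:
        """Compact view of the workspace: relative paths and sizes, never file contents."""
        files = sorted(p for p in self.root.rglob("*") if p.is_file())
        return [
            {"path": p.relative_to(self.root).as_posix(), "bytes": p.stat().st_size}
            for p in files
        ]


def build_bounded_context(goal: str, workspace: Workspace, recent_actions) -> str:
    """Rebuild the reasoning context from the snapshot plus the last k actions only."""
    return json.dumps(
        {
            "goal": goal,
            "workspace": workspace.snapshot(),
            "recent_actions": list(recent_actions),
        },
        indent=2,
    )


def run_demo(steps: int = 50, window: int = 3) -> list:
    """Run a fake task; return how many past actions were in context at each step."""
    sizes = []
    with tempfile.TemporaryDirectory() as tmp:
        ws = Workspace(Path(tmp) / "workspace")
        recent = deque(maxlen=window)  # fixed window: older actions fall off automatically
        for t in range(steps):
            context = build_bounded_context("summarize papers", ws, recent)
            sizes.append(len(json.loads(context)["recent_actions"]))
            # A real agent would send `context` to the LLM here; the action is faked.
            action = {"type": "write", "path": "notes/progress.md", "content": f"step {t}\n"}
            ws.apply(action)
            recent.append({"type": action["type"], "path": action["path"]})
    return sizes


if __name__ == "__main__":
    print(run_demo(steps=10, window=3))
```

Note that the snapshot in this sketch lists every file, so it still grows with the number of files in the workspace; how the snapshot itself is kept small for very large workspaces is not covered here.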

Experimental Results
Research Questions
- RQ1: Can explicit externalization of persistent state into a file-centric workspace stabilize long-horizon LLM agents without task-specific fine-tuning?
- RQ2: Does a hierarchical agent architecture with bounded context reconstruction improve reliability and coverage on long-running research tasks?
- RQ3: Is external attention effective for processing large documents while preserving bounded reasoning context?
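
To make the idea behind RQ3 concrete, the following sketch shows one way a large document could be read outside the main reasoning context, with only a short extract returned. It is an assumption-laden illustration, not the paper's External Attention Pipeline: `external_attention`, `keyword_reader`, the chunk size, and the character cap are hypothetical, and the keyword matcher stands in for what would presumably be a separate model call.

```python
from typing import Callable


def split_into_chunks(text: str, chunk_chars: int) -> list:
    """Cut the document into fixed-size pieces (may split a sentence at a boundary)."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def external_attention(
    document: str,
    query: str,
    reader: Callable[[str, str], str],
    chunk_chars: int = 2000,
    max_chars: int = 500,
) -> str:
    """Scan the document chunk by chunk and return a short, capped extract."""
    findings = []
    for chunk in split_into_chunks(document, chunk_chars):
        answer = reader(chunk, query)  # independent call per chunk, outside the main context
        if answer:
            findings.append(answer)
    extract = "\n".join(findings)
    return extract[:max_chars]  # hard cap on what re-enters the main context


def keyword_reader(chunk: str, query: str) -> str:
    """Stand-in for a model call: keep sentences that mention the query term."""
    hits = [s.strip() for s in chunk.split(".") if query.lower() in s.lower()]
    return ". ".join(hits)


if __name__ == "__main__":
    doc = "Intro text. " * 500 + "The method uses a file workspace. " + "Filler. " * 500
    print(external_attention(doc, "workspace", keyword_reader))
```

The design point is that the main agent only ever sees the capped extract, so the size of the source document does not affect the size of its reasoning context.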
Key Findings
| Setting | System | Model | Max | Min | Avg |
|---|---|---|---|---|---|
| Main results (with file-centric state; InfiAgent vs. baselines) | InfiAgent | | 80 | 15 | 67.1 |
| Main results (with file-centric state; InfiAgent vs. baselines) | | Gemini-3-Flash | 80 | 80 | 80.0 |
| Main results (with file-centric state; InfiAgent vs. baselines) | | Claude-4.5-Sonnet | 80 | 80 | 80.0 |
| Main results (with file-centric state; InfiAgent vs. baselines) | Claude Code | | 80 | 11 | 29.1 |
| Main results (with file-centric state; InfiAgent vs. baselines) | Cursor | Claude-4.5-Sonnet | 5 | 0 | 1.0 |
| Main results (with file-centric state; InfiAgent vs. baselines) | Cursor | Gemini-3-Flash | 1 | 0 | 0.1 |
| Ablation (remove file-centric state; compressed long-context prompts) | No File State (Compressed Context) | GPT-OSS-20B | 7 | 1 | 3.2 |
| Ablation (remove file-centric state; compressed long-context prompts) | No File State (Compressed Context) | Gemini-3-Flash | 25 | 20 | 21.1 |
| Ablation (remove file-centric state; compressed long-context prompts) | No File State (Compressed Context) | Claude-4.5-Sonnet | 77 | 11 | 27.7 |
- With a 20B open-source model, InfiAgent achieves competitive DeepResearch performance compared to larger proprietary systems.
- InfiAgent attains strong instruction-following and readability, contributing to stable long-horizon behavior.
- On the 80-paper literature review, InfiAgent achieves high coverage and maintains stability across hundreds of steps, outperforming baselines using compressed long-context prompts.
- Ablation removing the file-centric state substantially degrades coverage, supporting the importance of explicit persistent state externalization.
- In the long-horizon literature review, the strongest rows in the table reach an average coverage of 80.0 and the InfiAgent row averages 67.1, whereas the ablations without file-centric state average between 3.2 and 27.7.
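
The averages reported in the table can be put on a common scale with a few lines of arithmetic. The sketch below assumes that the Max, Min, and Avg columns count papers covered out of the 80 in the task, which the review does not state explicitly; the dictionary keys are shorthand labels introduced here, and the values are copied from the table.

```python
TOTAL_PAPERS = 80  # size of the literature-review task (assumed denominator)

# Average coverage values copied from the results table.
avg_coverage = {
    "InfiAgent": 67.1,
    "No File State / GPT-OSS-20B": 3.2,
    "No File State / Gemini-3-Flash": 21.1,
    "No File State / Claude-4.5-Sonnet": 27.7,
}


def coverage_fraction(avg: float, total: int = TOTAL_PAPERS) -> float:
    """Average coverage as a fraction of the papers in the task."""
    return avg / total


def coverage_report(table: dict) -> dict:
    """Map each label to its coverage as a percentage of the task."""
    return {name: 100.0 * coverage_fraction(avg) for name, avg in table.items()}


if __name__ == "__main__":
    for name, pct in coverage_report(avg_coverage).items():
        print(f"{name}: {pct:.1f}% of {TOTAL_PAPERS} papers")
```

Under that assumption, the InfiAgent row corresponds to roughly 84% of the papers on average, while the ablation rows stay at about 35% or below.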

This review was generated by AI and checked by a human editor.