[論文レビュー] Automated Design of Agentic Systems
論文は Automated Design of Agentic Systems (ADAS) と Meta Agent Search というメタエージェントアルゴリズムを紹介し、コード内で新しいエージェント的システムをプログラム・発見する。手で設計されたベースラインを上回り、ドメインとモデルを跨いで転移するエージェントを示す。
Researchers are investing substantial effort in developing powerful general-purpose agents, wherein Foundation Models are used as modules within agentic systems (e.g. Chain-of-Thought, Self-Reflection, Toolformer). However, the history of machine learning teaches us that hand-designed solutions are eventually replaced by learned solutions. We describe a newly forming research area, Automated Design of Agentic Systems (ADAS), which aims to automatically create powerful agentic system designs, including inventing novel building blocks and/or combining them in new ways. We further demonstrate that there is an unexplored yet promising approach within ADAS where agents can be defined in code and new agents can be automatically discovered by a meta agent programming ever better ones in code. Given that programming languages are Turing Complete, this approach theoretically enables the learning of any possible agentic system: including novel prompts, tool use, workflows, and combinations thereof. We present a simple yet effective algorithm named Meta Agent Search to demonstrate this idea, where a meta agent iteratively programs interesting new agents based on an ever-growing archive of previous discoveries. Through extensive experiments across multiple domains including coding, science, and math, we show that our algorithm can progressively invent agents with novel designs that greatly outperform state-of-the-art hand-designed agents. Importantly, we consistently observe the surprising result that agents invented by Meta Agent Search maintain superior performance even when transferred across domains and models, demonstrating their robustness and generality. Provided we develop it safely, our work illustrates the potential of an exciting new research direction toward automatically designing ever-more powerful agentic systems to benefit humanity.
研究の動機と目的
- Motivate ADAS as automating the design of powerful agentic systems that integrate Foundation Models as modules.
- Propose a code-space search approach where agents are programmed in code and discovered by a meta agent.
- Introduce Meta Agent Search to iteratively program, test, archive, and refine agent designs.
- Demonstrate robustness and transferability of discovered agents across domains and models.
提案手法
- Define a three-component ADAS framework: search space (code-defined agents), search algorithm (meta-agent-driven exploration), and evaluation function (task performance).
- Propose Meta Agent Search where a meta agent uses an archive of discovered agents to iteratively write new agent code.
- Use self-reflection steps within the meta agent to ensure novelty and refine designs.
- Evaluate discovered agents on ARC, reading comprehension, math, science, and multi-task benchmarks, plus cross-domain and cross-model transfers.
- Compare against state-of-the-art hand-designed baselines and prompt-only methods across multiple domains.
実験結果
リサーチクエスチョン
- RQ1Can ADAS automatically invent novel building blocks and combine them into effective agentic systems?
- RQ2Can a meta agent programming agents in code discover designs that outperform hand-designed baselines?
- RQ3Do agents discovered by Meta Agent Search transfer robustness across domains and models?
主な発見
- Discovered agents substantially outperform hand-designed baselines on ARC, reading, math, and multi-task benchmarks.
- On reading comprehension (DROP) and math (MGSM), improvements include 13.6/100 F1 and 14.4% accuracy respectively.
- Agents show strong transfer across math and non-math domains, improving GSM8K by 25.9% and GSM-Hard by 13.2% over baselines.
- Transferring across different FMs on ARC, the best agent achieves nearly 50% accuracy with Claude-Sonnet.
- Meta Agent Search consistently outperforms prompt-optimization baselines across domains.
より良い研究を、今すぐ始めましょう
論文設計から論文執筆まで、研究時間を劇的に削減しましょう。
クレジットカード登録不要
このレビューはAIが作成し、人間の編集者が確認しました。