QUICK REVIEW

[Paper Review] AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation

Siyu Wang, Renhong Lu | arXiv (Cornell University) | Feb 19, 2026

Multimodal Machine Learning Applications | Citations: 0

One-sentence summary

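AgentConductor is an RL-optimized multi-agent system built around an LLM orchestrator that dynamically generates task-adaptive, density-aware interaction topologies and refines them based on execution feedback.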

ABSTRACT

Large language model (LLM)-driven multi-agent systems (MAS) coordinate specialized agents through predefined interaction topologies and have shown promise for complex tasks such as competition-level code generation. Recent studies demonstrate that carefully designed multi-agent workflows and communication graphs can significantly improve code generation performance by leveraging collaborative reasoning. However, existing methods neither adapt topology density to task difficulty nor iteratively refine the topology within an instance using execution feedback, which leads to redundant communication and performance bottlenecks. To address these issues, we propose AgentConductor: a reinforcement learning-optimized MAS with an LLM-based orchestrator agent as its core, which enables end-to-end feedback-driven dynamic generation of interaction topologies. For each query, AgentConductor infers agent roles and task difficulty, then constructs a task-adapted, density-aware layered directed acyclic graph (DAG) topology, underpinned by two key innovations. First, we design a novel topological density function that captures communication-aware mathematical characterizations of multi-agent interactions. Second, we adopt difficulty interval partitioning to avoid excessive pruning for precise topological density upper bound measurement per difficulty level and finer-grained control. Empirically, across three competition-level and two foundational code datasets, AgentConductor achieves state-of-the-art accuracy, outperforming the strongest baseline by up to 14.6% in pass@1 accuracy, 13% in density reduction, and 68% in token cost reduction.

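Motivation and objectives

Motivate automated, task-specific topology generation that scales density with problem difficulty in competition-level code generation.
Enable end-to-end, feedback-driven refinement of the interaction topology within a single problem instance.
Reduce communication and computation overhead while maintaining high code accuracy.
Introduce a topology density evaluation function suited to layered DAGs that balances cost and performance under difficulty constraints.
Demonstrate state-of-the-art performance on competition-level and foundational code datasets.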

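Proposed method

Proposes a layered DAG topology with a human-readable, YAML-based representation that allows parallelism within a layer and communication across layers.
Uses a reinforcement learning framework (GRPO) to optimize an orchestrator agent that generates task-adaptive topologies over multiple turns with execution feedback.
Defines a graph density evaluation function that measures nodes, edges, and depth, normalizes them, and derives a composite density metric that guides the reward structure.
Pretrains the orchestrator with supervised fine-tuning to encode topology priors, then refines it with trajectory-based policy optimization.
Incorporates difficulty interval partitioning to obtain task-specific density upper bounds and finer-grained topology control.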
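
To make the layered DAG and the composite density idea concrete, here is a minimal Python sketch. It is not the paper's definition: the `LayeredDAG` class, the role names, the normalization constants `max_nodes` and `max_depth`, and the equal weighting of the node, edge, and depth terms are all assumptions chosen for illustration. The review only states that nodes, edges, and depth are measured and normalized.

```python
from dataclasses import dataclass


@dataclass
class LayeredDAG:
    """Hypothetical container for a layered interaction topology."""
    layers: list[list[str]]       # layers[i]: agents that run in parallel at step i
    edges: list[tuple[str, str]]  # (sender, receiver) message links

    def validate(self) -> None:
        # Map each agent to its layer index, then require every edge to go
        # from an earlier layer to a strictly later one (keeps the graph acyclic).
        layer_of = {a: i for i, layer in enumerate(self.layers) for a in layer}
        for src, dst in self.edges:
            if layer_of[src] >= layer_of[dst]:
                raise ValueError(f"edge {src}->{dst} does not move forward")


def density(dag: LayeredDAG, max_nodes: int, max_depth: int) -> float:
    """Assumed composite density: mean of normalized node, edge, and depth terms."""
    sizes = [len(layer) for layer in dag.layers]
    depth = len(sizes)
    # Largest number of forward edges this layering could carry.
    max_edges = sum(
        sizes[i] * sizes[j] for i in range(depth) for j in range(i + 1, depth)
    )
    node_term = sum(sizes) / max_nodes
    edge_term = len(dag.edges) / max_edges if max_edges else 0.0
    depth_term = depth / max_depth
    return (node_term + edge_term + depth_term) / 3  # equal weights are an assumption


# Illustrative topology; the role names are hypothetical.
example = LayeredDAG(
    layers=[["planner"], ["coder_a", "coder_b"], ["tester"]],
    edges=[
        ("planner", "coder_a"),
        ("planner", "coder_b"),
        ("coder_a", "tester"),
        ("coder_b", "tester"),
    ],
)
example.validate()
print(round(density(example, max_nodes=8, max_depth=4), 3))
```

With these assumed constants the example has 4 of 8 allowed nodes, 4 of 5 possible forward edges, and 3 of 4 allowed layers, so the script prints 0.683. A difficulty-specific upper bound, as described in the last item above, would then be compared against this value.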

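Experimental results

Research questions

RQ1: How can topology density be adapted to task difficulty in multi-agent code generation?
RQ2: Can the interaction topology be iteratively refined within a single problem instance using execution feedback?
RQ3: Does a YAML-represented layered DAG topology improve flexibility and efficiency over fixed or chain-topology baselines?
RQ4: How does a difficulty-aware density reward affect code accuracy and token cost?
RQ5: How well does AgentConductor transfer to new datasets and task types without additional optimization?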

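Key findings

AgentConductor achieves state-of-the-art accuracy on three competition-level code datasets and two foundational code datasets.
The approach reduces topology density by up to 13%.
The method reduces token cost by up to 68%.
It outperforms the strongest baseline by up to 14.6% in pass@1 accuracy.
With the new density function and difficulty awareness, topology density is adjusted to match task difficulty.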
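
For readers unfamiliar with the pass@1 metric cited above, the sketch below shows the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass all tests. Whether the paper uses this estimator or a single-sample evaluation is not stated in this review, so treat the function as a general illustration of the metric, not as the authors' evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c are correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# For k = 1 the estimator reduces to the fraction of correct samples, c / n.
print(pass_at_k(n=10, c=3, k=1))
```

For k = 1 the estimate is simply c / n, which is why pass@1 is often read as single-attempt accuracy.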

This review was generated by AI and checked by a human editor.