QUICK REVIEW

[Paper Review] The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search

Yutaro Yamada, Robert Tjarko Lange|ArXiv.org|Apr 10, 2025

Scientific Computing and Data Management | Citations: 14

One-Line Summary

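AI Scientist-v2 autonomously generates research ideas, designs and runs experiments within an agentic tree-search framework, writes the manuscript, and produced an AI-generated paper that passed peer review at a workshop. It removes the dependence on code templates and uses VLM feedback to refine figures and content.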

ABSTRACT

AI is increasingly playing a pivotal role in transforming how scientific discoveries are made. We introduce The AI Scientist-v2, an end-to-end agentic system capable of producing the first entirely AI generated peer-review-accepted workshop paper. This system iteratively formulates scientific hypotheses, designs and executes experiments, analyzes and visualizes data, and autonomously authors scientific manuscripts. Compared to its predecessor (v1, Lu et al., 2024 arXiv:2408.06292), The AI Scientist-v2 eliminates the reliance on human-authored code templates, generalizes effectively across diverse machine learning domains, and leverages a novel progressive agentic tree-search methodology managed by a dedicated experiment manager agent. Additionally, we enhance the AI reviewer component by integrating a Vision-Language Model (VLM) feedback loop for iterative refinement of content and aesthetics of the figures. We evaluated The AI Scientist-v2 by submitting three fully autonomous manuscripts to a peer-reviewed ICLR workshop. Notably, one manuscript achieved high enough scores to exceed the average human acceptance threshold, marking the first instance of a fully AI-generated paper successfully navigating a peer review. This accomplishment highlights the growing capability of AI in conducting all aspects of scientific research. We anticipate that further advancements in autonomous scientific discovery technologies will profoundly impact human knowledge generation, enabling unprecedented scalability in research productivity and significantly accelerating scientific breakthroughs, greatly benefiting society at large. We have open-sourced the code at https://github.com/SakanaAI/AI-Scientist-v2 to foster the future development of this transformative technology. We also discuss the role of AI in science, including AI safety.

Motivation and Objectives

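Demonstrate fully autonomous, end-to-end AI-driven scientific discovery, from hypothesis to manuscript.
Eliminate the reliance on human-authored code templates so that the system generalizes across domains.
Introduce an experiment progress manager and agentic tree search to explore hypotheses in greater depth.
Incorporate Vision-Language Models to give feedback on experiments and on manuscript figures and text.
Evaluate the system by submitting AI-generated manuscripts to an ICLR workshop, and analyze its limitations.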

Proposed Method

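Proposes a domain-general, tree-based search that generates and refines Python experiment code without human-written templates.
Implements an experiment progress manager that coordinates four stages: preliminary investigation, hyperparameter tuning, research agenda execution, and ablation studies.
Uses parallelized agentic tree search to generate, execute, and critique multiple nodes; classifying each node as buggy or non-buggy guides refinement.
Integrates Vision-Language Models to critique generated figures and captions, both during experiments and at the manuscript review stage.
Uses Hugging Face datasets for data loading and literature tools (e.g., Semantic Scholar) to ground the work in the literature.
Generates the manuscript in a single pass, followed by a reflection stage driven by a reasoning model and VLM-assisted refinement of figures and text.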
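
The staged tree search described above can be illustrated with a small toy sketch. This is not the authors' implementation: the stage names come from the list above, but `toy_run`, `expand`, `run_stage`, the 30% failure rate, and the `debug_prob` parameter are hypothetical stand-ins, and the loop runs sequentially rather than in parallel. A real system would replace `toy_run` with LLM-generated code that is actually executed.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

# Stage names follow the four stages listed above; everything else is hypothetical.
STAGES = [
    "preliminary investigation",
    "hyperparameter tuning",
    "research agenda execution",
    "ablation studies",
]


@dataclass
class Node:
    code: str                      # stand-in for a generated experiment script
    score: Optional[float] = None  # None means the run failed (buggy)
    children: List["Node"] = field(default_factory=list)

    @property
    def buggy(self) -> bool:
        return self.score is None


def toy_run(code: str, rng: random.Random) -> Optional[float]:
    """Hypothetical executor: fails about 30% of the time, else returns a score."""
    if rng.random() < 0.3:
        return None
    return rng.random()


def expand(parent: Node, rng: random.Random) -> Node:
    """Create one child: a debug attempt for a buggy parent, a refinement otherwise."""
    action = "debug" if parent.buggy else "refine"
    child = Node(code=f"{parent.code} -> {action}")
    child.score = toy_run(child.code, rng)
    parent.children.append(child)
    return child


def run_stage(root: Node, rng: random.Random, steps: int = 8,
              debug_prob: float = 0.3) -> Node:
    """Grow a tree from root; return the best non-buggy node, or root if none exists."""
    nodes = [root]
    for _ in range(steps):
        buggy = [n for n in nodes if n.buggy]
        good = [n for n in nodes if not n.buggy]
        # Sometimes retry a failed node; otherwise refine the best working one.
        if buggy and (not good or rng.random() < debug_prob):
            parent = rng.choice(buggy)
        else:
            parent = max(good, key=lambda n: n.score)
        nodes.append(expand(parent, rng))
    good = [n for n in nodes if not n.buggy]
    return max(good, key=lambda n: n.score) if good else root


def run_pipeline(seed: int = 0) -> List[Node]:
    """Run the four stages in order, seeding each stage with the previous winner."""
    rng = random.Random(seed)
    best = Node(code="initial idea")
    best.score = toy_run(best.code, rng)
    winners = []
    for stage in STAGES:
        root = Node(code=f"[{stage}] {best.code}", score=best.score)
        best = run_stage(root, rng)
        winners.append(best)
    return winners


if __name__ == "__main__":
    for stage, node in zip(STAGES, run_pipeline(seed=0)):
        print(stage, "buggy" if node.buggy else round(node.score, 3))
```

The point of the sketch is that the buggy/non-buggy label decides which action a node receives (debug versus refine), and that each stage hands its best node to the next stage as a new root.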

Experimental Results

Research Questions

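RQ1: Can a fully autonomous AI system generate research hypotheses and run experiments across machine learning domains without human-authored templates?
RQ2: Does agentic tree search enable deeper exploration of complex hypotheses than linear, template-based workflows?
RQ3: To what extent can AI-generated manuscripts pass peer review in a workshop setting, and what are the limitations?
RQ4: How does Vision-Language Model feedback improve the quality and clarity of figures and manuscript content?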

Key Findings

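Three autonomously generated manuscripts were submitted to an ICLR workshop; one received an average reviewer score of 6.33, placing it in roughly the top 45% of submissions.
The AI-generated workshop paper on compositional regularization received peer-review scores of 6, 7, and 6 and would have been accepted after meta-review.
The study shows that a fully AI-generated manuscript can reach workshop-level acceptance, marking a milestone for autonomous scientific discovery.
Internal evaluation identified limitations, including occasional hallucinated citations and rigor that falls short of main-conference standards.
The authors open-sourced the code and dataset for community exploration and safety discussion.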
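
As a quick arithmetic check, the reported average of 6.33 matches the three reviewer scores of 6, 7, and 6, assuming both findings refer to the same manuscript (the variable names below are illustrative only):

```python
from statistics import mean

scores = [6, 7, 6]                # reviewer scores quoted in the findings
average = round(mean(scores), 2)  # 19 / 3, rounded to two decimals
print(average)
```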


This review was generated by AI and checked by a human editor.