[Paper Review] ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
ResearchAgent uses LLMs augmented with a citation graph, an entity-centric knowledge store, and ReviewingAgents to automatically generate and iteratively refine novel research ideas (problem, method, experiment design) from scientific literature.
The pace of scientific research, vital for improving human life, is complex, slow, and needs specialized expertise. Meanwhile, novel, impactful research often stems from both a deep understanding of prior work, and a cross-pollination of ideas across domains and fields. To enhance the productivity of researchers, we propose ResearchAgent, which leverages the encyclopedic knowledge and linguistic reasoning capabilities of Large Language Models (LLMs) to assist them in their work. This system automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them based on the feedback from collaborative LLM-powered reviewing agents. Specifically, starting with a core scientific paper, ResearchAgent is augmented not only with relevant publications by connecting information over an academic graph but also entities retrieved from a knowledge store derived from shared underlying concepts mined across numerous papers. Then, mimicking a scientific approach to improving ideas with peer discussions, we leverage multiple LLM-based ReviewingAgents that provide reviews and feedback via iterative revision processes. These reviewing agents are instantiated with human preference-aligned LLMs whose criteria for evaluation are elicited from actual human judgments via LLM prompting. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines, showing its effectiveness in generating novel, clear, and valid ideas based on both human and model-based evaluation results. Our initial foray into AI-mediated scientific research has important implications for the development of future systems aimed at supporting researchers in their ideation and operationalization of novel work.
Motivation & Objective
- Model a three-stage pipeline for research idea generation: problem identification, method development, and experiment design.
- Augment LLM reasoning with (A) a citation-graph-based literature survey, (B) an entity-centric knowledge store, and (C) iterative peer review via ReviewingAgents.
- Align LLM evaluation criteria with human judgments to produce human-preference-aligned assessments.
- Demonstrate that knowledge-augmented and iteratively refined ideas outperform baselines across multiple disciplines.
Proposed method
- Define a research idea as o = [p, m, d], where p is the problem, m is the method, and d is the experiment design; the idea is generated as o = f(L) from the literature L.
- Use a citation-graph survey to select a core paper l0 and related papers {l1,...,ln} based on citation counts and abstract similarity to build a focused LLM input.
- Construct an entity-centric knowledge store K from entities extracted across papers, stored as a sparse matrix to capture co-occurrences and cross-domain connections.
- Augment LLM prompts with relevant external entities retrieved from K to expand context during idea generation: o = LLM(T({l0,...,ln}, Ret({l0,...,ln}; K))), where T is the prompt template and Ret is the entity retriever.
- Introduce ReviewingAgents that critique each idea (problem, method, experiment) along five human-aligned criteria, enabling iterative refinement of o.
- Calibrate model-based evaluation criteria by deriving prompts from human-annotated scores to better reflect human judgments.
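The knowledge store K and the retrieval step Ret described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names `build_store` and `retrieve`, the toy corpus, and the ranking rule (sum of raw co-occurrence counts) are all assumptions made for the example, and the sparse matrix is approximated with nested dictionaries.

```python
from collections import defaultdict
from itertools import combinations


def build_store(papers):
    """Sparse co-occurrence store: store[a][b] = number of papers mentioning both a and b."""
    store = defaultdict(lambda: defaultdict(int))
    for entities in papers:
        # Count each unordered pair once per paper, in both directions.
        for a, b in combinations(sorted(set(entities)), 2):
            store[a][b] += 1
            store[b][a] += 1
    return store


def retrieve(query_entities, store, k=3):
    """Return up to k entities outside the query, ranked by total co-occurrence with it."""
    query = set(query_entities)
    scores = defaultdict(int)
    for q in query:
        for other, count in store.get(q, {}).items():
            if other not in query:
                scores[other] += count
    # Highest score first; alphabetical order breaks ties deterministically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [entity for entity, _ in ranked[:k]]


# Toy corpus (made-up data): each inner list holds the entities found in one paper.
corpus = [
    ["transformer", "attention", "machine translation"],
    ["transformer", "protein folding"],
    ["attention", "protein folding", "drug discovery"],
    ["graph neural network", "drug discovery"],
]
K = build_store(corpus)
core_entities = ["transformer", "attention"]  # entities from the core and related papers
external = retrieve(core_entities, K, k=2)
print(external)
```

In this toy setup, the retrieved entities are the ones that would be appended to the prompt alongside the core and related papers. Any other scoring rule could be swapped in for the summed counts without changing the overall shape.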
Experimental results
Research questions
- RQ1: Can an LLM-powered ResearchAgent generate novel, clear, and valid research ideas (problem, method, experiment) from scientific literature?
- RQ2: Does augmenting LLMs with an entity-centric knowledge store and citation-graph literature improve idea quality over baselines?
- RQ3: Do iterative reviews by ReviewingAgents aligned with human judgments improve idea quality through refinement steps?
- RQ4: How do different knowledge sources (references vs. entities) contribute to idea quality across disciplines?
Key findings
- The full ResearchAgent outperforms ablated baselines on both human and model-based evaluations across problems, methods, and experiment designs.
- Augmenting with an entity-centric knowledge store yields higher originality and innovativeness in ideas.
- Iterative refinements via ReviewingAgents improve idea quality, with gains saturating after about three iterations.
- Both relevant references and entities contribute to performance, with references often providing the strongest benefit.
- Human-aligned evaluation criteria improve the alignment between model-based judgments and human judgments.
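The review-and-revise cycle behind the refinement finding above can be pictured as a simple loop. The sketch below is illustrative only: `refine`, `clarity_reviewer`, and `toy_revise` are hypothetical names, the reviewer and reviser are toy stand-ins for LLM calls, and the fixed budget of three rounds is an assumption chosen to mirror the reported saturation point.

```python
from typing import Callable, Dict, List, Tuple

Idea = Dict[str, str]        # keys: "problem", "method", "experiment"
Review = Tuple[float, str]   # (score on a 1-5 scale, feedback text)


def refine(
    idea: Idea,
    reviewers: List[Callable[[Idea], Review]],
    revise: Callable[[Idea, List[str]], Idea],
    max_rounds: int = 3,
) -> Tuple[Idea, List[float]]:
    """Alternate review and revision; return the final idea and the mean score per round."""
    mean_scores: List[float] = []
    for _ in range(max_rounds):
        reviews = [review(idea) for review in reviewers]
        mean_scores.append(sum(score for score, _ in reviews) / len(reviews))
        # Feed all feedback back into the next revision.
        idea = revise(idea, [feedback for _, feedback in reviews])
    return idea, mean_scores


# Toy stand-ins for LLM calls (hypothetical, for illustration only).
def clarity_reviewer(idea: Idea) -> Review:
    # Pretend each added step makes the method clearer, capped at 5.
    return min(5.0, 1.0 + idea["method"].count(";")), "add one more concrete step"


def toy_revise(idea: Idea, feedback: List[str]) -> Idea:
    # Append one placeholder step per round instead of calling an LLM.
    return {**idea, "method": idea["method"] + "; extra step"}


draft = {"problem": "P", "method": "baseline", "experiment": "E"}
final, scores = refine(draft, [clarity_reviewer], toy_revise)
print(scores)
```

The per-round scores returned by `refine` are what one would inspect to see where further iterations stop helping. The toy scores here rise by construction and say nothing about real gains.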
This review was created by AI and reviewed by human editors.