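[Paper Interpretation] System for systematic literature review using multiple AI agents: Concept and an empirical evaluation
This paper proposes a novel multi-AI-agent model, driven by LLMs, that aims to fully automate systematic literature reviews (SLRs). An evaluation with ten software engineering researchers showed high satisfaction and potential efficiency gains.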
Systematic literature review (SLR) is foundational to evidence-based research, enabling scholars to identify, classify, and synthesize existing studies to address specific research questions. Conducting an SLR is, however, largely a manual process. In recent years, researchers have made significant progress in automating portions of the SLR pipeline to reduce the effort and time required for high-quality reviews; nevertheless, there remains a lack of AI-agent-based systems that automate the entire SLR workflow. To this end, we introduce a novel multi-AI-agent system designed to fully automate SLRs. Leveraging large language models (LLMs), our system streamlines the review process to enhance efficiency and accuracy. Through a user-friendly interface, researchers specify a topic; the system then generates a search string to retrieve relevant academic papers. Next, an inclusion/exclusion filtering step is applied to titles relevant to the research area. The system subsequently summarizes paper abstracts and retains only those directly related to the field of study. In the final phase, it conducts a thorough analysis of the selected papers with respect to predefined research questions. This paper presents the system, describes its operational framework, and demonstrates how it substantially reduces the time and effort traditionally required for SLRs while maintaining comprehensiveness and precision. The code for this project is available at: https://github.com/GPT-Laboratory/SLR-automation .
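Research Motivation and Objectives
- Because SLRs are time-consuming and labor-intensive, demand for end-to-end SLR automation in software engineering is growing.
- Develop a multi-agent, LLM-based framework that automates the SLR steps from retrieval through analysis.
- Demonstrate the model's ability to generate search strings, screen the literature, summarize abstracts, and answer research questions.
- Assess the user experience and practicality of the approach through expert evaluation.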
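Proposed Method
- Create a planner agent that generates research questions, objectives, and a search string from user input.
- Use a literature identification agent to retrieve papers with the search string and apply title-based screening.
- Employ a data extraction agent that screens papers against inclusion/exclusion criteria and summarizes the content relevant to the research questions.
- Implement a data compilation agent that synthesizes findings, detects trends, and drafts the final report.
- Evaluate the system with ten software engineering researchers to assess efficiency, usability, and accuracy.
- Plan broader future evaluations and a public presentation (e.g., at SANER 2024), and provide the code on GitHub.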
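The four-agent flow listed above can be sketched as a simple chain of functions. This is a minimal illustration, not the authors' implementation: every name here (`Paper`, `Plan`, `planner_agent`, `literature_identification_agent`, `data_extraction_agent`, `data_compilation_agent`, `run_pipeline`) is hypothetical, and plain keyword matching stands in for the LLM calls and the academic search that the real system would perform.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str


@dataclass
class Plan:
    topic: str
    keywords: list[str]
    research_questions: list[str]
    search_string: str


def mentions(text: str, keywords: list[str]) -> bool:
    """Case-insensitive keyword match; a placeholder for an LLM relevance check."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def planner_agent(topic: str, keywords: list[str]) -> Plan:
    """Planner stand-in: derives a research question and a Boolean search string."""
    search_string = " OR ".join(f'"{k}"' for k in keywords)
    research_questions = [f"What is known about {topic}?"]
    return Plan(topic, keywords, research_questions, search_string)


def literature_identification_agent(plan: Plan, corpus: list[Paper]) -> list[Paper]:
    """Title-based screening over an in-memory corpus (assumed, not a real search API)."""
    return [p for p in corpus if mentions(p.title, plan.keywords)]


def data_extraction_agent(plan: Plan, papers: list[Paper]) -> list[tuple[Paper, str]]:
    """Keeps papers whose abstract is relevant and 'summarizes' with the first sentence."""
    extracted = []
    for p in papers:
        if mentions(p.abstract, plan.keywords):
            summary = p.abstract.split(".")[0].strip() + "."
            extracted.append((p, summary))
    return extracted


def data_compilation_agent(plan: Plan, extracted: list[tuple[Paper, str]]) -> str:
    """Assembles a plain-text report from the plan and the extracted summaries."""
    lines = [f"Topic: {plan.topic}", f"Search string: {plan.search_string}"]
    lines += [f"RQ: {rq}" for rq in plan.research_questions]
    lines += [f"- {paper.title}: {summary}" for paper, summary in extracted]
    return "\n".join(lines)


def run_pipeline(topic: str, keywords: list[str], corpus: list[Paper]) -> str:
    """Runs the four stages in order: plan, identify, extract, compile."""
    plan = planner_agent(topic, keywords)
    candidates = literature_identification_agent(plan, corpus)
    extracted = data_extraction_agent(plan, candidates)
    return data_compilation_agent(plan, extracted)
```

Calling `run_pipeline("LLM agents", ["LLM", "agent"], corpus)` on a small list of `Paper` objects returns a text report. Keeping each stage as a separate function mirrors the separation of agents described above, so a keyword check could be swapped for a model call without touching the other stages.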
Experimental Results
Research Questions
- RQ1. How does an LLM-based multi-agent system transform traditional methodologies to automate the systematic literature review process in SE?
- RQ2. How can the efficiency and accuracy of the proposed LLM-based multi-agent model be evaluated?
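Key Findings
- The multi-agent model automates search string generation, literature identification, data extraction, and data synthesis for SLRs.
- In the evaluation with ten experts, 80% agreed with the tool's functionality and 20% suggested improvements.
- The authors plan a broader evaluation with an additional 50 practitioners and researchers, and a presentation at SANER 2024.
- The tool reduces manual effort while maintaining or improving comprehensiveness and accuracy.
- The project provides a GitHub repository for access to the code.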
This interpretation was generated by AI and reviewed by a human editor.