QUICK REVIEW

[Paper Review] Can your paper evade the editors axe? Towards an AI assisted peer review system

Tirthankar Ghosal, Rajeev Verma|arXiv (Cornell University)|Feb 5, 2018

Advanced Text Analysis Techniques | Cited by 1

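ONE-SENTENCE SUMMARY

This paper proposes an AI-assisted peer review system that aims to automate desk rejection by identifying manuscripts rejected for being out of scope or of poor quality. The approach applies supervised machine learning to engineered features, including keyword analysis, citation patterns, author reputation, and similarity to accepted papers, and reports promising results across three journals.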

ABSTRACT

This work is an exploratory study of how we could progress a step towards an AI assisted peer-review system. The proposed approach is an ambitious attempt to automate the Desk-Rejection phenomenon prevalent in academic peer review. In this investigation we first attempt to decipher the possible reasons of rejection of a scientific manuscript from the editors desk. To seek a solution to those causes, we combine a flair of information extraction techniques, clustering, citation analysis to finally formulate a supervised solution to the identified problems. The projected approach integrates two important aspects of rejection: i) a paper being rejected because of out of scope and ii) a paper rejected due to poor quality. We extract several features to quantify the quality of a paper and the degree of in-scope exploring keyword search, citation analysis, reputations of authors and affiliations, similarity with respect to accepted papers. The features are then fed to standard machine learning based classifiers to develop an automated system. On a decent set of test data our generic approach yields promising results across 3 different journals. The study inherently exhibits the possibility of a redefined interest of the research community on the study of rejected papers and inculcates a drive towards an automated peer review system.

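RESEARCH MOTIVATION AND OBJECTIVES

Investigate the underlying reasons for desk rejection in academic peer review.
Develop an automated system that predicts whether a manuscript will be desk-rejected for being out of scope or of poor quality.
Combine multiple features (keyword relevance, citation analysis, author and institutional reputation, and similarity to accepted papers) into a unified predictive model.
Assess the feasibility of machine learning techniques, based on real journal data, for automating early-stage manuscript screening.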

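PROPOSED METHOD

Extract scope-related features using keyword-based semantic analysis and similarity to previously accepted papers.
Analyze citation patterns to assess the scholarly impact and relevance of the cited literature.
Introduce author and institutional reputation indicators to assess the credibility of a manuscript.
Cluster manuscripts by content similarity to identify anomalous or mismatched submissions.
Train supervised machine learning classifiers on labeled data to predict desk-rejection outcomes.
Combine the features into a unified scoring system that ranks manuscripts by likelihood of rejection.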
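
To make the scope-related steps above more concrete, here is a minimal sketch in Python. It is not the authors' implementation: this summary does not specify the paper's exact features or classifiers, so the function names (`scope_features`, `train_logistic`), the two features, the toy texts, and the choice of logistic regression are all assumptions made for illustration. It computes a similarity-to-accepted-papers feature and a keyword-coverage feature, then fits a small supervised classifier that outputs a desk-rejection probability.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase, keep alphabetic tokens longer than two letters."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return [t for t in cleaned.split() if len(t) > 2]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def scope_features(manuscript, accepted_papers, journal_keywords):
    """Two hypothetical in-scope features (not taken from the paper)."""
    bag = Counter(tokenize(manuscript))
    # Feature 1: highest similarity to any previously accepted paper.
    max_sim = max((cosine(bag, Counter(tokenize(p))) for p in accepted_papers), default=0.0)
    # Feature 2: fraction of journal keywords that occur in the manuscript.
    hits = sum(1 for k in journal_keywords if k in bag)
    coverage = hits / len(journal_keywords) if journal_keywords else 0.0
    return [max_sim, coverage]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Plain stochastic gradient descent for logistic regression (label 1 = desk-rejected)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def reject_probability(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data, invented for this sketch only.
accepted = [
    "neural network models for text classification and language understanding",
    "machine learning methods for document retrieval and text mining",
]
keywords = ["text", "learning", "neural", "retrieval", "classification"]
training = [
    ("deep neural network text classification with transfer learning", 0),
    ("text retrieval using machine learning ranking models", 0),
    ("soil erosion measurements in alpine river valleys", 1),
    ("clinical outcomes of knee surgery in elderly patients", 1),
]

X = [scope_features(text, accepted, keywords) for text, _ in training]
y = [label for _, label in training]
w, b = train_logistic(X, y)

p_in = reject_probability(
    w, b, scope_features("learning neural text representations for classification", accepted, keywords)
)
p_out = reject_probability(
    w, b, scope_features("migration patterns of arctic seabirds", accepted, keywords)
)
print(f"in-scope manuscript:     P(desk reject) = {p_in:.3f}")
print(f"out-of-scope manuscript: P(desk reject) = {p_out:.3f}")
```

Sorting manuscripts by this probability gives a ranking by likelihood of rejection, as in the last step above. A fuller system would add citation and reputation features to the same feature vector; those are left out here because this summary gives no detail on how they are computed.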

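EXPERIMENTAL RESULTS

Research questions

RQ1: What are the main reasons for desk rejection in academic peer review?
RQ2: To what extent can machine learning predict desk rejection from manuscript features?
RQ3: How do features such as keyword relevance, citation quality, and similarity to accepted papers relate to rejection decisions?
RQ4: Can an automated system reliably distinguish rejections for being out of scope from rejections for poor quality?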

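Key findings

The proposed system yielded promising results in predicting desk rejection across three different journals.
Combining features, including keyword analysis, citation patterns, and similarity to accepted papers, significantly improved prediction accuracy.
Using natural language processing and machine learning, the model identified manuscripts rejected for being out of scope or of poor quality.
The study demonstrates the feasibility of using AI to automate early-stage manuscript screening.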


This review was generated by AI and reviewed by human editors.