QUICK REVIEW

[Paper Review] REFORMS: Reporting Standards for Machine Learning Based Science

Sayash Kapoor, Emily M. Cantrell|arXiv (Cornell University)|Aug 15, 2023

Explainable Artificial Intelligence (XAI) | Cited by 19

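One-sentence summary

Introduces the REFORMS checklist, a 32-item, eight-module reporting standard with accompanying guidelines, intended to improve the validity, reproducibility, and generalizability of ML-based science.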

ABSTRACT

Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear reporting standards for ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist ($\textbf{Re}$porting Standards $\textbf{For}$ $\textbf{M}$achine Learning Based $\textbf{S}$cience). It consists of 32 questions and a paired set of guidelines. REFORMS was developed based on a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.

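Research motivation and goals

Establish a set of reporting standards for ML-based science, applicable across fields, to prevent common errors.
Clarify how ML performance supports scientific conclusions by defining requirements for the sample population, data, and methodology.
Provide a structured, reusable framework to improve computational reproducibility and independent verification.
Provide guidelines for each checklist item to help researchers, referees, and journals apply the standards.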

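Proposed method

Develop the REFORMS checklist through a consensus process among 19 researchers across multiple disciplines.
Ground the checklist in a literature review of best practices and common errors in ML-based science.
Organize the checklist into 8 modules corresponding to the stages of an ML-based science study, with implementation guidelines (Appendix B).
Draw on existing checklists from ML methods research and health prediction to guide coverage, while keeping REFORMS domain-agnostic.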

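Experimental results

Research questions

RQ1: What reporting standards are needed to reduce errors in ML-based science across disciplines?
RQ2: How can a standardized checklist improve the transparency, reproducibility, and verifiability of ML-based scientific claims?
RQ3: Which core modules and reporting expectations should practitioners and journals adopt?
RQ4: How can REFORMS be applied while respecting field-specific norms, without imposing mandatory requirements on every study?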

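Main findings

Proposes the 32-item, eight-module REFORMS checklist to address typical failures in ML-based science.
The checklist aims to establish clear scientific claims, ensure appropriate evaluation and uncertainty reporting, and enable independent verification.
Each item is accompanied by guidelines that clarify reporting expectations and make the checklist easier to use for authors, referees, and journals.
The checklist is intended to complement, not replace, discipline-specific requirements, and to promote transparency across disciplines.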

This review was generated by AI and reviewed by human editors.