QUICK REVIEW

[Paper Review] Reproducibility in Machine Learning-based Research: Overview, Barriers and Drivers

Harald Semmelrock, Tony Ross‐Hellauer | arXiv (Cornell University) | Jun 20, 2024

Artificial Intelligence in Healthcare and Education | Cited by 10

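One-line summary

A critical review that identifies barriers to ML reproducibility (description, code, data, experiment), discusses technology-based, procedural, and awareness-driven drivers, and presents a Drivers-Barriers-Matrix to guide improvement.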

ABSTRACT

Many research fields are currently reckoning with issues of poor levels of reproducibility. Some label it a "crisis", and research employing or building Machine Learning (ML) models is no exception. Issues including lack of transparency, data or code, poor adherence to standards, and the sensitivity of ML training conditions mean that many papers are not even reproducible in principle. Where they are, though, reproducibility experiments have found worryingly low degrees of similarity with original results. Despite previous appeals from ML researchers on this topic and various initiatives from conference reproducibility tracks to the ACM's new Emerging Interest Group on Reproducibility and Replicability, we contend that the general community continues to take this issue too lightly. Poor reproducibility threatens trust in and integrity of research results. Therefore, in this article, we lay out a new perspective on the key barriers and drivers (both procedural and technical) to increased reproducibility at various levels (methods, code, data, and experiments). We then map the drivers to the barriers to give concrete advice for strategies for researchers to mitigate reproducibility issues in their own work, to lay out key areas where further research is needed in specific areas, and to further ignite discussion on the threat presented by these urgent issues.

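Research motivation and objectives

Clarify and unify the definition of reproducibility in ML research across the description, code, data, and experiment types.
Identify and classify barriers to ML reproducibility in both computer science and biomedical fields.
Identify and classify drivers that could increase ML reproducibility, and map them to the barriers.
Propose a Drivers-Barriers-Matrix that visually supports decisions about adopting reproducibility solutions.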

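Proposed method

Review and synthesize the literature on ML reproducibility and on existing barriers and drivers.
Classify barriers into four reproducibility types (R1 Description, R2 Code, R3 Data, R4 Experiment).
Classify drivers into technology-based, procedural, and awareness/education categories.
Map drivers to barriers to assess the feasibility of improvements.
Introduce the Drivers-Barriers-Matrix to visualize and communicate these relationships.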
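
The driver-to-barrier mapping described above can be pictured as a simple lookup grid. The following is a minimal sketch, not the authors' implementation: the class `DriversBarriersMatrix` and its methods are hypothetical names, the barrier and driver labels are borrowed loosely from the key findings of this review, and the specific cells marked here are illustrative assumptions rather than the contents of the paper's Figure 2.

```python
from dataclasses import dataclass, field


@dataclass
class DriversBarriersMatrix:
    """Hypothetical grid: a (driver, barrier) pair is present if the driver applies."""
    barriers: list[str]
    drivers: list[str]
    cells: set[tuple[str, str]] = field(default_factory=set)

    def mark(self, driver: str, barrier: str) -> None:
        """Record that a driver is applicable to a barrier."""
        if driver not in self.drivers or barrier not in self.barriers:
            raise KeyError(f"unknown driver or barrier: {driver!r}, {barrier!r}")
        self.cells.add((driver, barrier))

    def drivers_for(self, barrier: str) -> list[str]:
        """Drivers marked as applicable to the given barrier, in declaration order."""
        return [d for d in self.drivers if (d, barrier) in self.cells]

    def uncovered_barriers(self) -> list[str]:
        """Barriers with no applicable driver marked yet."""
        return [b for b in self.barriers if not self.drivers_for(b)]


# Illustrative subset only: the paper maps 9 drivers to 9 barriers,
# and the assignments below are assumptions made for this example.
matrix = DriversBarriersMatrix(
    barriers=["limited access to code", "limited access to data", "inherent nondeterminism"],
    drivers=["hosting services", "privacy-preserving technologies", "virtualization"],
)
matrix.mark("hosting services", "limited access to code")
matrix.mark("hosting services", "limited access to data")
matrix.mark("privacy-preserving technologies", "limited access to data")

print(matrix.drivers_for("limited access to data"))
print(matrix.uncovered_barriers())
```

Reading the grid by barrier answers "which interventions could help here?", while `uncovered_barriers` highlights where no driver has been assigned, which is the kind of gap the matrix is meant to make visible.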

Figure 1: Types of reproducibility. Adapted from Gundersen [26].

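Results

Research questions

RQ1: What are the main reproducibility barriers in ML-driven research across the description, code, data, and experiment dimensions?
RQ2: What drivers support ML reproducibility, and how do they map to the identified barriers?
RQ3: How does the Drivers-Barriers-Matrix help researchers and institutions decide on reproducibility interventions?
RQ4: How do ML-specific challenges (e.g., nondeterminism, data leakage, AutoML) interact with general reproducibility concerns?
RQ5: Which guidelines, tools, and practices are expected to improve ML reproducibility in computer science and biomedical fields?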

Key findings

Nine barriers to ML reproducibility are identified, distributed across four reproducibility types: description, code, data, and experiment.
Limited access to code and data, plus documentation gaps, are major impediments to reproducibility.
Inherent nondeterminism, environmental differences, and resource constraints significantly affect experiment reproducibility.
Privacy-preserving technologies, hosting services, virtualization, and tooling/platforms can act as reproducibility enablers but have trade-offs.
Standardized datasets, evaluation methods, and guidelines/checklists (e.g., model cards, data cards) support reproducibility.
Awareness, education, and practical workflows are essential for sustained improvements in ML reproducibility.
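
To make the nondeterminism finding concrete, here is a minimal sketch using only Python's standard library. The function `toy_training_run` is a hypothetical stand-in for a training run, not anything from the paper: it shows that fixing the seed of a pseudo-random generator makes this toy process repeatable, while a different seed gives a different outcome. Seeding covers only this one software-level source of randomness; it does not address the environmental differences or resource constraints that are also listed above.

```python
import random


def toy_training_run(seed: int, steps: int = 100) -> float:
    """Hypothetical stand-in for training: a noisy walk driven by a seeded RNG."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    weight = rng.uniform(-1.0, 1.0)  # random initialisation
    for _ in range(steps):
        weight += 0.01 * rng.gauss(0.0, 1.0)  # noisy update step
    return weight


same_a = toy_training_run(seed=0)
same_b = toy_training_run(seed=0)
other = toy_training_run(seed=1)

print("same seed reproduces the result:", same_a == same_b)
print("different seed changes the result:", same_a != other)
```

Reporting the seed alongside the result is what lets someone else rerun this toy process and obtain the same number, which is why undocumented seeds count as a description-level gap as well as an experiment-level one.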

Figure 2: Drivers-Barriers-Matrix. We map the 9 drivers to the 9 barriers identified in this paper. The colored boxes show that a specific driver is applicable to a specific barrier.

This review was generated by AI and checked by a human editor.