QUICK REVIEW

[Paper Review] Online multiple hypothesis testing

D. S. Robertson, James Wason|arXiv (Cornell University)|Aug 24, 2022

Statistical Methods in Clinical Trials | Cited by 1

One-sentence summary

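This paper is a comprehensive review of online multiple hypothesis testing methods for controlling the false discovery rate (FDR) in sequential, large-scale settings where hypotheses arrive one at a time. It covers adaptive algorithms such as LORD++, SAFFRON, and ADDIS, which allocate the error budget dynamically using the p-values and past rejection decisions, and reports that they achieve FDR control and higher statistical power than traditional offline methods in A/B testing and genomics applications.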

ABSTRACT

Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one-by-one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it, and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
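
For contrast with the online setting, the offline Benjamini-Hochberg step-up procedure mentioned in the abstract can be sketched as follows. It needs the complete set of p-values before it can make any decision, which is exactly what is unavailable when hypotheses arrive as a stream. The function name and signature are illustrative and do not come from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Offline BH step-up procedure (illustrative sketch).

    Returns a list of booleans, one per hypothesis, in the input order.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k_max = rank

    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# All four p-values must be known up front before any decision is made.
decisions = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], alpha=0.05)
```

Because the threshold for each p-value depends on its rank among all m p-values, the procedure cannot be applied one hypothesis at a time without modification.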

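Motivation and objectives

Address the limitations of traditional offline multiple testing procedures in modern data-driven settings where hypotheses are tested sequentially.
Provide a unified framework for online FDR control that maintains type I error rate control on streaming data.
Evaluate and compare recent online testing algorithms in terms of FDR control, statistical power, and robustness to dependence.
Highlight practical applications in A/B testing, genomics, and clinical platform trials, and promote adoption through software implementations and guidance.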

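Methods

Online FDR control through dynamic allocation of alpha-wealth, in which the error budget is updated according to past rejections and p-values.
LORD++, presented as an extension of the LORD algorithm, uses a monotone allocation rule to increase statistical power while preserving FDR control.
SAFFRON and ADDIS set significance levels adaptively using the thresholds λ and η, which are used to account for null and non-null signals, respectively.
Anytime-valid p-values allow sequential testing without a fixed sample size or advance knowledge of the total number of tests.
Test-level formulas that depend on the cumulative number of rejections and on p-value ranks are applied to ensure FDR control under arbitrary dependence.
Performance is validated through extensive simulations across different proportions of non-null hypotheses and dependence structures.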
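
The alpha-wealth idea in the list above can be sketched in a few lines of Python. This is a minimal, illustrative version of the LORD++ test-level rule as published by Ramdas et al. (2017). It is not taken from the paper under review or from the onlineFDR package. The discount sequence `gamma` (here 6/(π² j²), chosen only because it sums to 1), the default initial wealth `w0 = alpha / 2`, and all function names are assumptions made for this sketch.

```python
import math


def gamma(j):
    """Illustrative discount sequence: 6 / (pi^2 * j^2), which sums to 1 over j >= 1."""
    return 6.0 / (math.pi ** 2 * j ** 2)


def lord_plus_plus(p_values, alpha=0.05, w0=None):
    """Sketch of LORD++ test levels and decisions for a stream of p-values.

    The level at time t is
        gamma(t) * w0
        + (alpha - w0) * gamma(t - tau_1)         (after the first rejection)
        + alpha * sum_j gamma(t - tau_j), j >= 2  (after later rejections)
    where tau_j is the time of the j-th rejection.
    """
    if w0 is None:
        w0 = alpha / 2  # assumed default; any 0 < w0 <= alpha is admissible
    rejection_times = []  # times of past rejections, all strictly before t
    levels, decisions = [], []

    for t, p in enumerate(p_values, start=1):
        level = gamma(t) * w0
        for k, tau in enumerate(rejection_times):
            # The first rejection pays out (alpha - w0); later ones pay out alpha.
            payout = (alpha - w0) if k == 0 else alpha
            level += payout * gamma(t - tau)

        reject = p <= level
        if reject:
            rejection_times.append(t)
        levels.append(level)
        decisions.append(reject)

    return levels, decisions


# Each decision uses only the current p-value and the past rejection times.
levels, decisions = lord_plus_plus([0.001, 0.5, 0.0001, 0.9], alpha=0.05)
```

Each test level depends only on information available at time t, so the total number of hypotheses never has to be known. Every rejection earns back wealth, which raises the levels of the tests that follow it.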

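Experimental results

Research questions

RQ1: How can the FDR be controlled in online multiple testing when hypotheses arrive sequentially and their total number is unknown?
RQ2: What is the trade-off between statistical power and FDR control in online versus offline testing frameworks?
RQ3: How do online methods such as LORD++, SAFFRON, and ADDIS perform when the p-values are dependent, compared with the independent setting?
RQ4: Can online methods match or exceed the statistical power of the Benjamini-Hochberg procedure in large-scale testing?
RQ5: What practical challenges arise when deploying online testing in real applications such as A/B testing and platform trials?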

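Key findings

LORD++, SAFFRON, and ADDIS keep the FDR at or below the nominal level α = 0.05 across the simulated scenarios, including those with dependent p-values.
The online methods are reported to have higher statistical power than the Benjamini-Hochberg procedure in large-scale testing, particularly when the proportion of non-null hypotheses (π1) is high.
The simulations show that uncorrected testing pushes the FDR well above α = 0.05, whereas all online methods stay within the acceptable range.
ADDIS and SAFFRON, which combine past rejection information with p-value thresholds (η and λ), are robust under dependence.
Monotone alpha-investing (AI) algorithms show strong power in sparse-signal scenarios while maintaining FDR control.
The onlineFDR R package provides reproducible implementations of these methods, supporting wider adoption in genomics, A/B testing, and clinical trials.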
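
The findings above are stated in terms of FDR and power, which in a simulation are estimated by averaging two per-run quantities: the false discovery proportion (FDP) and the fraction of non-null hypotheses that were rejected. A minimal sketch of that bookkeeping follows. The function name and the input format (parallel lists of decisions and ground-truth null indicators) are assumptions for illustration, not the paper's simulation code.

```python
def fdp_and_power(decisions, is_null):
    """Compute the false discovery proportion and empirical power for one run.

    decisions: list of booleans, True where the hypothesis was rejected.
    is_null:   list of booleans, True where the null hypothesis is actually true.
    """
    false_discoveries = sum(1 for d, n in zip(decisions, is_null) if d and n)
    true_discoveries = sum(1 for d, n in zip(decisions, is_null) if d and not n)
    total_rejections = false_discoveries + true_discoveries
    n_non_null = sum(1 for n in is_null if not n)

    # FDP is defined as 0 when nothing is rejected (denominator max(R, 1)).
    fdp = false_discoveries / max(total_rejections, 1)
    # Power is undefined when there are no non-null hypotheses.
    power = true_discoveries / n_non_null if n_non_null else float("nan")
    return fdp, power


# One simulated run: three rejections, one of which is a true null.
fdp, power = fdp_and_power(
    decisions=[True, True, False, True],
    is_null=[False, True, True, False],
)
```

Averaging `fdp` over many simulated runs estimates the FDR, which is the quantity compared against the nominal level α.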


This review was generated by AI and checked by a human editor.