QUICK REVIEW

[Paper Review] Optimal Stopping Rules for Sequential Hypothesis Testing

Akshay Balsubramani|arXiv (Cornell University)|May 12, 2014

Markov Chains and Monte Carlo Methods|References: 14|Cited by: 21

One-Sentence Summary

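Using a novel exponential-moment method that combines averaged supermartingales with stopping times, the paper establishes sharp finite-time concentration bounds for martingales that hold uniformly over time; this extends the law of the iterated logarithm (LIL) to the time-uniform setting, yields a non-asymptotic LIL bound with optimal constants that is confirmed by a matching anti-concentration inequality, and gives finite-sample guarantees for sequential hypothesis testing and sequential analysis that improve on the classical Hoeffding and Bernstein inequalities, serving as the finite-time counterpart of those classical concentration inequalities.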

ABSTRACT

Suppose that we are given sample access to an unknown distribution p over n elements and an explicit distribution q over the same n elements. We would like to reject the null hypothesis "p=q" after seeing as few samples as possible, when p =/= q, while we never want to reject the null, when p=q. Well-known results show that Theta(sqrt{n}/epsilon^2) samples are necessary and sufficient for distinguishing whether p equals q versus p is epsilon-far from q in total variation distance. However, this requires the distinguishing radius epsilon to be fixed prior to deciding how many samples to request. Our goal is instead to design sequential hypothesis testers, i.e. online algorithms that request i.i.d. samples from p and stop as soon as they can confidently reject the hypothesis p=q, without being given a lower bound on the distance between p and q, when p =/= q. In particular, we want to minimize the number of samples requested by our tests as a function of the distance between p and q, and if p=q we want the algorithm, with high probability, to never reject the null. Our work is motivated by and addresses the practical challenge of sequential A/B testing in Statistics. We show that, when n=2, any sequential hypothesis test must see Omega(1/{d_{tv}(p,q)^2} log log 1/{d_{tv}(p,q)}) samples, with high (constant) probability, before it rejects p=q, where d_{tv}(p,q) is the - unknown to the tester - total variation distance between p and q. We match the dependence of this lower bound on d_{tv}(p,q) by proposing a sequential tester that rejects p=q from at most O({\sqrt{n}}/{d_{tv}(p,q)^2}log log 1/{d_{tv}(p,q)}) samples with high (constant) probability. The Omega(sqrt{n}) dependence on the support size n is also known to be necessary. We similarly provide two-sample sequential hypothesis testers, when sample access is given to both p and q, and discuss applications to sequential A/B testing.

Motivation and Objectives

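Bridge the gap between the asymptotic nature of the law of the iterated logarithm (LIL) and finite-time concentration for sequential stochastic processes.
Develop time-uniform martingale concentration inequalities that are optimal up to small constants, suitable for sequential hypothesis testing and sequential analysis.
Unify and extend the classical inequalities (Hoeffding, Bernstein) into finite-time, time-uniform forms with a precise trade-off between time t and confidence level δ.
Establish a non-asymptotic version of the LIL that unifies the central-limit-theorem regime, O(√(t log(1/δ))), and the LIL regime, O(√(t log log t)), in a single sharp bound.
Prove the optimality of the bound through a matching anti-concentration inequality, showing that the trade-off between time t and failure probability δ is tight.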
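
As a rough worked illustration of how the two regimes can sit inside a single bound, consider a bound of the generic shape below. This is only a sketch: the absolute constant c is a placeholder assumption, not a constant from the paper, and t is assumed large enough (t ≥ 3) that log log t is positive. Since a sum of two non-negative terms is at most twice the larger one, the bound splits into a CLT-type case and an LIL-type case, with the crossover at log log t = log(1/δ), that is, at t = exp(1/δ).

```latex
|M_t| \le c\,\sqrt{t\left(\log\log t + \log\tfrac{1}{\delta}\right)}
\quad\Longrightarrow\quad
|M_t| \le
\begin{cases}
c\,\sqrt{2\,t\,\log(1/\delta)}, & t \le \exp(1/\delta) \quad \text{(CLT-type regime)},\\[4pt]
c\,\sqrt{2\,t\,\log\log t}, & t > \exp(1/\delta) \quad \text{(LIL-type regime)}.
\end{cases}
```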

Proposed Method

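Apply an exponential-moment method to supermartingales averaged over a carefully constructed continuous family of parameters λ, in order to control the martingale's moment generating function.
Use stopping times to localize the analysis and derive bounds that hold uniformly over all finite times t, where the initial time threshold τ₀ depends on δ.
Introduce a family of probability distributions P^v_λ, defined on λ ∈ (−1/exp_v(2), 1/exp_v(2)) ∖ {0}, to optimize the averaging process and reduce the leading constant of the √(t log log t) term.
Apply a refined lower-bound estimate to Gaussian-type integrals against P^v_λ, replacing crude bounds with an asymptotic expansion to achieve the optimal constant.
Prove an anti-concentration inequality via a reverse concentration argument, showing that the bound cannot be improved beyond a constant factor.
Derive uniform bounds for bounded-difference martingales (Hoeffding-type) and sub-exponential martingales (Bernstein-type), with a failure probability δ that holds uniformly over time.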
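
The averaging idea in the steps above can be sketched for the simplest case, a Rademacher random walk. For steps of ±1, E[exp(λ·step)] = cosh(λ) ≤ exp(λ²/2), so exp(λ·M_t − λ²·t/2) is a supermartingale with mean at most 1, and any weighted average over λ inherits that property. The grid `LAMS` and the uniform `WEIGHTS` below are hypothetical stand-ins chosen for illustration; they are not the paper's distribution P^v_λ, and the function names are invented for this sketch.

```python
import math

def exp_supermartingale(lam, m, t):
    """exp(lam * M_t - lam^2 * t / 2) for a walk at position m after t steps."""
    return math.exp(lam * m - 0.5 * lam * lam * t)

def mixture_value(lams, weights, m, t):
    """Weighted average of the exponential supermartingales over the lambda grid."""
    return sum(w * exp_supermartingale(l, m, t) for l, w in zip(lams, weights))

def exact_mean(lams, weights, t):
    """Exact expectation of the mixture at time t for a Rademacher walk.

    Uses E[exp(lam * M_t)] = cosh(lam)^t, so each term is
    (cosh(lam) * exp(-lam^2 / 2))^t, which is at most 1.
    """
    return sum(
        w * (math.cosh(l) * math.exp(-0.5 * l * l)) ** t
        for l, w in zip(lams, weights)
    )

# Hypothetical symmetric grid of lambda values that excludes 0 (an assumption,
# not the paper's P^v_lambda), with uniform weights summing to 1.
LAMS = [s * 2.0 ** (-k) for k in range(3, 9) for s in (-1.0, 1.0)]
WEIGHTS = [1.0 / len(LAMS)] * len(LAMS)

if __name__ == "__main__":
    for t in (0, 1, 10, 1000):
        print(t, exact_mean(LAMS, WEIGHTS, t))
```

Because the mixture is non-negative and starts at 1, a standard maximal inequality (Ville's inequality) bounds the probability that it ever exceeds 1/δ by δ; turning that statement into an explicit bound on |M_t| is where the choice of mixing distribution matters.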

Results

Research Questions

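RQ1: Can the law of the iterated logarithm (LIL) be extended to finite-time, time-uniform concentration bounds for martingales?
RQ2: What is the optimal trade-off between time t and confidence level δ in uniform martingale concentration, and can it be characterized precisely?
RQ3: Are the proposed finite-time martingale bounds optimal with respect to anti-concentration behavior?
RQ4: Can improved averaging techniques reduce the leading constant of the √(t log log t) term from √6 to the asymptotically optimal √2?
RQ5: How can the classical concentration inequalities (Hoeffding, Bernstein) be extended to forms that hold uniformly over all finite times with precise constants?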

Key Findings

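The paper establishes a sharp, time-uniform concentration bound for the Rademacher random walk: with probability ≥ 1−δ, for all t ≥ C log(4/δ), |M_t| ≤ √(3t (2 log log(5t/(2|M_t|)) + log(2/δ))), where C = 173.
The bound exhibits a finite-time trade-off: when t ≲ exp(1/δ), the dominant term is O(√(t log(1/δ))), resembling a central-limit-theorem-type bound.
When t is large and δ > 0 is fixed, the bound becomes O(√(t log log t)), matching the asymptotic LIL rate, which cannot be improved.
A matching anti-concentration inequality is proved, showing that the bound cannot be improved beyond a constant factor and confirming its optimality.
Using the refined family of distributions P^v_λ with v ≥ 2 reduces the leading constant of the √(t log log t) term from √6 to √2, achieving asymptotic optimality.
The method extends to broad classes of martingales, including bounded-difference and sub-exponential martingales, yielding uniform Hoeffding-type and Bernstein-type bounds with optimal constants.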
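
The Rademacher bound in the first finding can be checked numerically with a small simulation. The sketch below evaluates the right-hand side exactly as quoted in this review (with C = 173) and counts how many simulated walks ever violate it after the starting time C log(4/δ). The function names, the number of paths, the horizon, and the seed are all illustrative assumptions. A simulation over a finite horizon can only suggest that violations are rare; it does not verify the theorem.

```python
import math
import random

C = 173  # constant as quoted in this review

def bound_rhs(t, m, delta):
    """sqrt(3t * (2 log log(5t / (2|m|)) + log(2 / delta))), assuming 0 < |m| <= t."""
    inner = 2.0 * math.log(math.log(5.0 * t / (2.0 * abs(m)))) + math.log(2.0 / delta)
    return math.sqrt(3.0 * t * inner)

def bound_holds(t, m, delta):
    """True if a walk at position m after t steps satisfies the quoted bound."""
    if m == 0:
        return True  # |M_t| = 0 satisfies any non-negative bound
    return abs(m) <= bound_rhs(t, m, delta)

def count_violating_paths(n_paths, horizon, delta, seed=0):
    """Count simulated Rademacher walks that break the bound at some t >= C log(4/delta)."""
    rng = random.Random(seed)
    t_min = math.ceil(C * math.log(4.0 / delta))
    bad = 0
    for _ in range(n_paths):
        m = 0
        for t in range(1, horizon + 1):
            m += 1 if rng.random() < 0.5 else -1
            if t >= t_min and not bound_holds(t, m, delta):
                bad += 1
                break  # one violation is enough to mark this path
    return bad

if __name__ == "__main__":
    delta = 0.05
    n_paths = 200
    bad = count_violating_paths(n_paths, horizon=3000, delta=delta)
    print(f"violating paths: {bad} of {n_paths} (target fraction <= {delta})")
```

Note that the bound is implicit, since |M_t| appears on both sides, so the sketch checks the inequality directly at each step instead of solving for a closed-form threshold.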

This review was generated by AI and reviewed by a human editor.