QUICK REVIEW

[Paper Digest] Robust Hypothesis Testing Using Wasserstein Uncertainty Sets

Rui Gao, Liyan Xie|arXiv (Cornell University)|May 1, 2018

Adversarial Robustness in Machine Learning | Cited by 23

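One-Sentence Summary

The paper proposes a data-driven framework for robust hypothesis testing built on Wasserstein uncertainty sets centered at the empirical distributions, yielding a detector that needs no distributional assumptions and is computationally efficient. Using a convex approximation and a tractable reformulation whose complexity does not depend on the observation dimension, the method achieves nearly optimal performance and performs well on real human activity recognition data.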

ABSTRACT

We develop a novel computationally efficient and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions, which are sets centered around the empirical distribution defined via Wasserstein metric, thus our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that such approximation renders a nearly-optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of such approximation, with complexity independent of the dimension of observation space and can be nearly sample-size-independent in general. Real-data example using human activity data demonstrated the excellent performance of the new robust detector.

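Motivation and Objectives

Address robust hypothesis testing under model uncertainty without relying on parametric distributional assumptions.
Develop a computationally efficient detector that remains valid when the true distribution deviates from the nominal model.
Construct uncertainty sets around the empirical distributions using the Wasserstein metric, so that the approach is data-driven.
Solve the minimax formulation through a convex safe approximation to obtain nearly optimal detection performance.
Make the computation independent of the observation dimension and nearly independent of the sample size, improving scalability in high-dimensional settings.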

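Proposed Method

Define uncertainty sets around the empirical distributions using the Wasserstein metric, giving a robust neighborhood of candidate distributions under both the null and the alternative hypothesis.
Formulate robust testing as a minimax problem that maximizes worst-case power under distributional ambiguity.
Apply a convex safe approximation to the minimax formulation, which keeps the problem computationally tractable while remaining nearly optimal.
Exploit the structure of the least favorable distribution to derive a tractable reformulation whose complexity is independent of the dimension of the observation space.
Keep the computational complexity nearly independent of the sample size in general, so that the method can scale to large datasets.
Use duality and optimization techniques to cast the robust detection problem as a solvable convex program.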
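
As a rough illustration of the first two steps above, the uncertainty sets and the minimax problem can be written as follows. The notation is assumed for this sketch and is not taken from the digest: \hat{P}_k denotes the empirical distribution under hypothesis k, \theta_k the radius of its uncertainty set, W the Wasserstein distance, and \phi a randomized test. The exact risk functional used in the paper may differ.

```latex
\mathcal{P}_k = \{\, P : W(P, \hat{P}_k) \le \theta_k \,\}, \quad k = 1, 2,
\qquad
\inf_{\phi:\,\Omega \to [0,1]} \;
\sup_{P_1 \in \mathcal{P}_1,\; P_2 \in \mathcal{P}_2}
\Big( \mathbb{E}_{P_1}[\phi(\omega)] + \mathbb{E}_{P_2}[1 - \phi(\omega)] \Big)
```

Here \phi(\omega) is read as the probability of rejecting the null hypothesis given observation \omega, so the first term is the type-I error and the second is the type-II error. Minimizing the worst-case type-II error is the same as maximizing the worst-case power.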

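Experimental Results

Research Questions

RQ1: How can robust hypothesis testing be made computationally efficient while dropping parametric distributional assumptions?
RQ2: Can Wasserstein-based uncertainty sets improve the robustness of detection while keeping the problem computationally tractable?
RQ3: Under model misspecification, how large is the performance gap between the proposed robust detector and the optimal test?
RQ4: To what extent can the computational complexity be decoupled from the dimension of the observation space?
RQ5: How does the method perform on real-world high-dimensional data such as human activity recognition?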

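Key Findings

Even under worst-case distributional deviations, the proposed framework achieves nearly optimal detection performance among all possible tests.
The convex safe approximation of the minimax formulation is computationally efficient while retaining strong theoretical guarantees.
The tractable reformulation of the approximation has complexity independent of the dimension of the observation space.
The method can be nearly independent of the sample size, which allows it to scale to large datasets.
An empirical evaluation on human activity recognition data confirms that the robust detector performs well in a real-world setting.
Wasserstein uncertainty sets provide data-driven robustness without prior assumptions on the underlying distributions.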
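
To make the notion of a Wasserstein uncertainty set concrete, the sketch below checks whether a candidate sample lies within a given radius of an empirical sample. It is a simplified one-dimensional illustration, not the paper's procedure: the function names `wasserstein_1d` and `in_uncertainty_set`, the radius, and the toy data are all hypothetical. It relies on the fact that for two equal-size one-dimensional samples, the order-1 Wasserstein distance equals the mean absolute difference of the sorted values.

```python
def wasserstein_1d(xs, ys):
    """Order-1 Wasserstein distance between two equal-size 1-D samples.

    Each sample is treated as a uniform empirical distribution. In one
    dimension the optimal transport plan matches sorted values in order.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def in_uncertainty_set(candidate, empirical, radius):
    """True if `candidate` lies in the Wasserstein ball around `empirical`.

    `radius` is a hypothetical tuning parameter chosen by the user.
    """
    return wasserstein_1d(candidate, empirical) <= radius


# Toy usage: a slightly perturbed copy of the empirical sample.
empirical = [0.0, 1.0, 2.0, 3.0]
perturbed = [0.2, 1.1, 2.3, 3.0]
print(wasserstein_1d(perturbed, empirical))
print(in_uncertainty_set(perturbed, empirical, radius=0.5))
```

No parametric model appears anywhere in this check: the set is defined entirely by the observed sample and a radius, which is what makes the construction data-driven.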


This digest was generated by AI and reviewed by a human editor.