QUICK REVIEW

[Paper Summary] Out of Distribution Generalization in Machine Learning

Martín Arjovsky|arXiv (Cornell University)|Mar 3, 2021

Machine Learning and Algorithms|References: 68|Cited by: 50

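ONE-SENTENCE SUMMARY

A comprehensive survey of, and framework for, out-of-distribution (OOD) generalization in machine learning, focusing on robustness, invariance, and causal perspectives, and covering Invariant Risk Minimization (IRM) and domain-specific discussions.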

ABSTRACT

Machine learning has achieved tremendous success in a variety of domains in recent years. However, a lot of these success stories have been in places where the training and the testing distributions are extremely similar to each other. In everyday situations when models are tested in slightly different data than they were trained on, ML algorithms can fail spectacularly. This research attempts to formally define this problem, what sets of assumptions are reasonable to make in our data and what kind of guarantees we hope to obtain from them. Then, we focus on a certain class of out of distribution problems, their assumptions, and introduce simple algorithms that follow from these assumptions that are able to provide more reliable generalization. A central topic in the thesis is the strong link between discovering the causal structure of the data, finding features that are reliable (when using them to predict) regardless of their context, and out of distribution generalization.

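RESEARCH MOTIVATION AND GOALS

Clarify what out-of-distribution (OOD) generalization means across different problems and applications.
Propose a framework for quantifying and comparing OOD performance across multiple environments.
Survey existing methods (robust optimization, domain adaptation, invariance) and analyze their underlying assumptions.
Highlight the role of causal and invariant features in achieving generalization, and point out the limitations of current methods.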

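PROPOSED METHOD

Introduces a framework in which the OOD risk is the worst-case risk over a set of environments: R^OOD(f) = max_{e ∈ E_all} R^e(f), where R^e(f) is the risk of predictor f in environment e.
Discusses training environments and the need for the training set E_train to be related to the broader set of environments E_all.
Provides examples that illustrate spurious correlations and how they affect OOD performance.
Examines the connections among empirical risk minimization (ERM), robust optimization, and Wasserstein/KL robustness.
Develops and draws on Invariant Risk Minimization (IRM) and its nonlinear variants as a way to enforce invariance across environments.
Outlines domain-specific case studies and practical implications across robotics, computer vision, natural language processing, and fairness.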
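
The worst-case definition of OOD risk above can be made concrete with a short sketch. This is only an illustration, not code from the thesis: the function names `environment_risk` and `ood_risk`, the use of squared error as the loss, and the two toy environments are all assumptions made for this example.

```python
import numpy as np

def environment_risk(predict, X, y):
    """Risk R^e(f) in one environment, here taken to be mean squared error (an assumption)."""
    return float(np.mean((predict(X) - y) ** 2))

def ood_risk(predict, environments):
    """Worst-case risk max_e R^e(f) over a dict {name: (X, y)} of environments.

    Returns the worst risk, the name of the environment attaining it,
    and the full per-environment risk table.
    """
    risks = {name: environment_risk(predict, X, y)
             for name, (X, y) in environments.items()}
    worst = max(risks, key=risks.get)
    return risks[worst], worst, risks

if __name__ == "__main__":
    # Two hypothetical environments sharing inputs but not the input-output relation.
    X = np.array([[1.0], [2.0]])
    envs = {
        "A": (X, np.array([1.0, 2.0])),  # y = x
        "B": (X, np.array([2.0, 4.0])),  # y = 2x
    }
    identity = lambda X: X[:, 0]         # predictor f(x) = x
    value, worst, risks = ood_risk(identity, envs)
    print(f"per-environment risks: {risks}")
    print(f"OOD risk: {value} (attained in environment {worst})")
```

In practice only the training environments are observed, so a quantity like this can be computed over E_train only; how it relates to the maximum over E_all is exactly what the assumptions linking the two sets have to supply.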

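EXPERIMENTAL RESULTS

Research Questions

RQ1: How can out-of-distribution generalization be quantified and compared across different problems?
RQ2: Under what assumptions do the various OOD methods (robust optimization, domain adaptation, invariance) succeed or fail?
RQ3: What role do invariant or causal features play in achieving robust generalization across environments?
RQ4: How can training environments be related to unseen test environments to achieve better OOD performance?
RQ5: What are the limitations of, and future directions for, OOD generalization in real-world applications?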

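Key Findings

There is no universal OOD method; performance depends on problem-specific assumptions about the environments.
The framework based on worst-case risk across environments provides a principled way to measure OOD performance.
Invariance-based methods such as IRM aim to identify predictors that are stable across environments, which helps generalization when such invariances exist.
The empirical discussion includes experiments on synthetic data and Colored MNIST that illustrate spurious correlations and robustness challenges.
Domain-specific perspectives show that the considerations for OOD generalization differ across robotics, computer vision, NLP, and fairness settings.
The paper highlights the trade-off between how realistic the assumptions are and the practical gains in OOD generalization.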
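
To make the findings on spurious correlations and invariance more tangible, here is a minimal synthetic sketch. It is not the thesis's experiment: the data-generating process in `make_env`, the environment parameter `a`, the noise scales, and the sample sizes are all hypothetical choices for illustration. The function `irmv1_penalty` follows the commonly used IRMv1 form of the penalty (the squared gradient of the environment risk with respect to a scalar multiplier on the predictor, evaluated at 1), specialized here to squared loss.

```python
import numpy as np

def make_env(a, n, rng):
    """Hypothetical environment: x1 causes y; x2 is an effect of y scaled by `a`."""
    x1 = rng.normal(size=n)
    y = x1 + rng.normal(size=n)             # invariant mechanism, same in every environment
    x2 = a * y + 0.1 * rng.normal(size=n)   # spurious feature, changes with the environment
    return np.column_stack([x1, x2]), y

def risk(w, X, y):
    """Mean squared error of the linear predictor x -> x @ w."""
    return float(np.mean((X @ w - y) ** 2))

def irmv1_penalty(w, X, y):
    """Squared gradient of mean((s * p - y)^2) with respect to the scalar s, at s = 1."""
    p = X @ w
    grad = 2.0 * np.mean(p * (p - y))
    return float(grad ** 2)

rng = np.random.default_rng(0)
train = [make_env(a, 5000, rng) for a in (1.0, 0.8)]  # x2 agrees with y in training
test = make_env(-1.0, 5000, rng)                      # the relation flips at test time

# ERM: ordinary least squares on the pooled training environments.
X_pool = np.vstack([X for X, _ in train])
y_pool = np.concatenate([y for _, y in train])
w_erm, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)

# Predictor using only the causal feature x1 (known here only by construction).
w_inv = np.array([1.0, 0.0])

for name, w in [("ERM", w_erm), ("invariant", w_inv)]:
    worst_train = max(risk(w, X, y) for X, y in train)
    penalty = sum(irmv1_penalty(w, X, y) for X, y in train)
    print(f"{name:9s} worst train risk={worst_train:.3f} "
          f"test risk={risk(w, *test):.3f} IRMv1 penalty={penalty:.4f}")
```

Under these assumptions the pooled least-squares fit leans on `x2` and achieves a lower training risk, but its risk grows sharply once the sign of `a` flips, while the `x1`-only predictor keeps roughly the same risk everywhere. The penalty is close to zero for the `x1`-only predictor and clearly larger for the pooled fit, which is the signal an IRM-style objective would use to prefer the former.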

This summary was generated by AI and reviewed by a human editor.