QUICK REVIEW

[Paper Review] Joint Distribution Optimal Transportation for Domain Adaptation

Nicolas Courty, Rémi Flamary | arXiv (Cornell University) | May 24, 2017

Domain Adaptation and Few-Shot Learning | References: 32 | Cited by: 69

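One-Sentence Summary

JDOT learns a predictor f by using optimal transport to align the joint feature/label distributions of the source and target domains, which directly minimizes a bound on the target risk in unsupervised domain adaptation.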

ABSTRACT

This paper deals with the unsupervised domain adaptation problem, where one wants to estimate a prediction function $f$ in a given target domain without any labeled sample by exploiting the knowledge available from a source domain where labels are known. Our work makes the following assumption: there exists a non-linear transformation between the joint feature/label space distributions of the two domain $\mathcal{P}_s$ and $\mathcal{P}_t$. We propose a solution of this problem with optimal transport, that allows to recover an estimated target $\mathcal{P}^f_t=(X,f(X))$ by optimizing simultaneously the optimal coupling and $f$. We show that our method corresponds to the minimization of a bound on the target error, and provide an efficient algorithmic solution, for which convergence is proved. The versatility of our approach, both in terms of class of hypothesis or loss functions is demonstrated with real world classification and regression problems, for which we reach or surpass state-of-the-art results.

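Motivation and Objectives

Motivate unsupervised domain adaptation, where no target labels are available.
Propose a joint-distribution OT framework that aligns (X, Y) across the source and target domains.
Derive a bound on the target error and show that, under the alignment assumption, JDOT minimizes it.
Provide an algorithm that learns f and the transport plan, with a convergence guarantee.
Demonstrate the effectiveness of JDOT on real-world classification and regression tasks.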

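Proposed Method

Define the source joint distribution P_s and the proxy target joint distribution P_t^f = (X, f(X)), in which the unknown target labels are replaced by the predictions f(X).
Formulate JDOT as a joint minimization over the predictor f and the coupling gamma, with gamma constrained to the transport polytope and the ground cost D((x_s, y_s); (x_t, f(x_t))) = alpha d(x_s, x_t) + L(y_s, f(x_t)).
Use the 1-Wasserstein distance W_1 between the empirical joint distributions.
Solve the problem by block coordinate descent, alternating between updates of the OT plan gamma and the predictor f.
Regularize f to prevent overfitting, and discuss convergence guarantees.
Explain how special cases reduce to regression or classification with RKHS or neural network hypothesis classes.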
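
Written out, the problem described above can be read as the following optimization. This is a sketch reconstructed from the summary, not a quotation from the paper: the symbols N_s and N_t (sample sizes), \mathcal{H} (hypothesis class), \lambda and \Omega (regularization weight and regularizer), and the uniform weights on the empirical samples are notational assumptions introduced here.

```latex
\min_{f \in \mathcal{H},\; \gamma \in \Delta} \;
\sum_{i=1}^{N_s} \sum_{j=1}^{N_t} \gamma_{ij}
\Big( \alpha\, d(x_i^s, x_j^t) + L\big(y_i^s, f(x_j^t)\big) \Big)
+ \lambda\, \Omega(f),
\qquad
\Delta = \Big\{ \gamma \in \mathbb{R}_{+}^{N_s \times N_t} :
\gamma \mathbf{1}_{N_t} = \tfrac{1}{N_s} \mathbf{1}_{N_s},\;
\gamma^{\top} \mathbf{1}_{N_s} = \tfrac{1}{N_t} \mathbf{1}_{N_t} \Big\}
```

For a fixed f, the inner minimization over gamma is a standard discrete optimal transport problem with cost matrix D_{ij}; for a fixed gamma, what remains is a weighted, regularized learning problem in f. This split is what makes the block coordinate descent scheme natural.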

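Experimental Results

Research Questions

RQ1: Can joint distribution alignment via OT address both marginal and conditional distribution shift in unsupervised domain adaptation?
RQ2: How can the predictor f and the optimal transport plan be learned jointly so as to minimize the target risk?
RQ3: Does the JDOT framework provide theoretical guarantees that link the transport plan to the target error?
RQ4: Can JDOT be applied to both regression and classification with generic loss functions and hypothesis classes?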

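Key Findings

JDOT reaches or surpasses the baselines on several domain adaptation tasks, including Caltech-Office, Amazon Reviews, and Wifi localization.
The method uses the optimal transport plan to propagate source labels to target samples, which improves transfer performance.
The block coordinate optimization scheme comes with a convergence guarantee under standard conditions.
Empirically, JDOT achieves competitive or superior performance on regression and classification problems across different hypothesis classes (kernel methods, neural networks).
The authors provide an open-source implementation and demonstrate practical applicability on real-world datasets.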
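
To make the label propagation and the alternating scheme mentioned above concrete, here is a minimal sketch for regression with a squared loss and a linear ridge model. It is not the authors' implementation. Several choices are assumptions made for illustration: the function names (`sinkhorn_plan`, `fit_ridge`, `predict`, `jdot_linear`) are hypothetical, the predictor is initialized on the source data, d is the squared Euclidean distance, and the exact OT solver is replaced by an entropic (Sinkhorn) approximation so that the example needs only NumPy.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.05, n_iter=500):
    """Entropic approximation of the OT plan between two uniform empirical measures.

    Assumption: this replaces an exact OT solver to keep the sketch NumPy-only.
    """
    n_s, n_t = cost.shape
    a = np.full(n_s, 1.0 / n_s)  # uniform source weights
    b = np.full(n_t, 1.0 / n_t)  # uniform target weights
    scale = max(cost.max(), 1e-12)  # scale by the largest cost to limit underflow
    K = np.exp(-cost / (reg * scale))
    u = np.ones(n_s)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def fit_ridge(X, y, lam):
    """Ridge regression with a bias term appended as the last coefficient."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)


def predict(X, w):
    return X @ w[:-1] + w[-1]


def jdot_linear(X_s, y_s, X_t, alpha=1.0, lam=1e-3, n_outer=10):
    """Alternate between the transport plan gamma and the predictor weights w."""
    # Assumption: start from a predictor trained on the source data only.
    w = fit_ridge(X_s, y_s, lam)
    # Feature cost d(x_s, x_t): squared Euclidean distance (an illustrative choice).
    d = ((X_s[:, None, :] - X_t[None, :, :]) ** 2).sum(axis=-1)
    gamma = None
    for _ in range(n_outer):
        # Step 1: with f fixed, solve OT for the cost alpha * d + L(y_s, f(x_t)).
        loss = (y_s[:, None] - predict(X_t, w)[None, :]) ** 2
        gamma = sinkhorn_plan(alpha * d + loss)
        # Step 2: with gamma fixed, propagate source labels to each target point
        # as a gamma-weighted average, then refit f on these propagated labels.
        y_hat = (gamma.T @ y_s) / gamma.sum(axis=0)
        w = fit_ridge(X_t, y_hat, lam)
    return w, gamma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_s = rng.normal(size=(50, 2))
    y_s = X_s @ np.array([1.0, -2.0])
    X_t = rng.normal(size=(60, 2)) + 0.5  # shifted target inputs, labels unseen
    w, gamma = jdot_linear(X_s, y_s, X_t)
    print("weights and bias:", w)
    print("plan shape:", gamma.shape)
```

With a squared loss, refitting f on the propagated labels `y_hat` is equivalent to minimizing the gamma-weighted loss term when the column marginals of gamma are uniform; with the Sinkhorn approximation they are only approximately uniform, so the sketch normalizes by the actual column sums. The entropic smoothing also blurs the propagated labels, so results from this sketch should not be expected to match those of an exact solver.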

This review was generated by AI and reviewed by human editors.