QUICK REVIEW

[Paper Review] Joint Distribution Optimal Transportation for Domain Adaptation

Nicolas Courty, Rémi Flamary | arXiv (Cornell University) | 2017-05-24

Domain Adaptation and Few-Shot Learning | References: 32 | Citations: 69

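One-Line Summary

JDOT learns a predictor $f$ by aligning the joint source and target distributions with optimal transport, directly optimizing a bound on the target risk in unsupervised domain adaptation.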

ABSTRACT

This paper deals with the unsupervised domain adaptation problem, where one wants to estimate a prediction function $f$ in a given target domain without any labeled sample by exploiting the knowledge available from a source domain where labels are known. Our work makes the following assumption: there exists a non-linear transformation between the joint feature/label space distributions of the two domain $\mathcal{P}_s$ and $\mathcal{P}_t$. We propose a solution of this problem with optimal transport, that allows to recover an estimated target $\mathcal{P}^f_t=(X,f(X))$ by optimizing simultaneously the optimal coupling and $f$. We show that our method corresponds to the minimization of a bound on the target error, and provide an efficient algorithmic solution, for which convergence is proved. The versatility of our approach, both in terms of class of hypothesis or loss functions is demonstrated with real world classification and regression problems, for which we reach or surpass state-of-the-art results.

Research Motivation and Goals

Motivate unsupervised domain adaptation where target labels are unavailable.
Propose a joint distribution OT framework to align (X,Y) across source and target.
Derive a bound showing how JDOT minimizes target error under alignment.
Provide an algorithm with convergence guarantees for learning f and the transport plan.
Demonstrate JDOT's effectiveness on real classification and regression tasks.

Proposed Method

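Define $\mathcal{P}_s$ as the joint source distribution and $\mathcal{P}^f_t=(X,f(X))$ as the proxy joint target distribution, in which $f(X)$ stands in for the unknown target labels.
Formulate JDOT as $\min_{f,\,\gamma \in \Pi} \sum_{i,j} \gamma_{ij}\, \mathcal{D}\big((x_i^s, y_i^s); (x_j^t, f(x_j^t))\big)$, where $\Pi$ is the transport polytope (couplings whose marginals match the empirical source and target distributions) and $\mathcal{D} = \alpha\, d(x_i^s, x_j^t) + \mathcal{L}(y_i^s, f(x_j^t))$.
Use the 1-Wasserstein distance $W_1$ between the empirical joint distributions of $\mathcal{P}_s$ and $\mathcal{P}^f_t$.
Solve via block coordinate descent, alternating between the OT plan $\gamma$ (with $f$ fixed) and the predictor $f$ (with $\gamma$ fixed).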
Provide regularization on f to prevent overfitting and discuss convergence guarantees.
Explain how special cases reduce to regression/classification with RKHS or neural networks.
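
The alternating scheme listed above can be sketched for a toy regression case. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a squared loss and a linear ridge predictor, and it swaps in an entropy-regularized Sinkhorn solver for the exact OT step so that only NumPy is needed. The function names (`sinkhorn_plan`, `fit_ridge`, `jdot_regression`), the hyperparameter values, the feature-only initialization, and the synthetic data are all hypothetical. With a squared loss and uniform target weights, the $f$-update reduces to least squares on the transported labels $\hat{y}_j = n_t \sum_i \gamma_{ij} y_i^s$.

```python
import numpy as np

def logsumexp(M, axis):
    # Numerically stable log(sum(exp(M))) along one axis.
    m = M.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(M - m).sum(axis=axis, keepdims=True)), axis=axis)

def sinkhorn_plan(C, reg=0.01, n_iter=300):
    # Entropic OT between uniform marginals, in the log domain.
    # Assumption: this approximates the exact OT plan used in the gamma-step.
    n_s, n_t = C.shape
    log_a = np.full(n_s, -np.log(n_s))
    log_b = np.full(n_t, -np.log(n_t))
    u, v = np.zeros(n_s), np.zeros(n_t)  # dual potentials
    for _ in range(n_iter):
        u = reg * (log_a - logsumexp((v[None, :] - C) / reg, axis=1))
        v = reg * (log_b - logsumexp((u[:, None] - C) / reg, axis=0))
    return np.exp((u[:, None] + v[None, :] - C) / reg)

def fit_ridge(X, y, lam=1e-3):
    # Linear model with an intercept column; lam is a small ridge penalty.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

def predict(w, X):
    return X @ w[:-1] + w[-1]

def jdot_regression(Xs, ys, Xt, alpha=1.0, n_outer=10):
    # Pairwise squared Euclidean feature distances d(x_s, x_t).
    d = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)
    n_t = len(Xt)
    C = alpha * d  # one possible initialization: feature cost only
    for _ in range(n_outer):
        gamma = sinkhorn_plan(C)          # gamma-step with f fixed
        y_hat = n_t * gamma.T @ ys        # transported source labels
        w = fit_ridge(Xt, y_hat)          # f-step with gamma fixed
        # Joint cost: alpha * d(x_s, x_t) + L(y_s, f(x_t)) with squared loss.
        C = alpha * d + (ys[:, None] - predict(w, Xt)[None, :]) ** 2
    return w, gamma

# Synthetic data (made up): the target features are shifted by 3,
# and the labels follow the same shift.
rng = np.random.default_rng(0)
Xs = rng.uniform(0, 1, size=(40, 1))
ys = 2 * Xs[:, 0] + 1
Xt = rng.uniform(0, 1, size=(40, 1)) + 3.0
yt = 2 * (Xt[:, 0] - 3.0) + 1  # used for evaluation only

w_src = fit_ridge(Xs, ys)                     # source-only baseline
w_jdot, gamma = jdot_regression(Xs, ys, Xt)   # adapted predictor
mse_src = np.mean((predict(w_src, Xt) - yt) ** 2)
mse_jdot = np.mean((predict(w_jdot, Xt) - yt) ** 2)
print(f"source-only MSE: {mse_src:.3f}, JDOT-style MSE: {mse_jdot:.3f}")
```

The entropic solver blurs the coupling, so the transported labels are smoothed compared with an exact OT plan; an exact solver would be the closer match to the formulation above.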

Experimental Results

Research Questions

RQ1: Can joint distribution alignment via OT bridge both marginal and conditional shifts in unsupervised DA?
RQ2: How can a predictor f and an optimal transport plan be learned jointly to minimize target risk?
RQ3: Does the JDOT framework provide theoretical guarantees linking the transport plan to target error?
RQ4: Is JDOT applicable to both regression and classification with common loss functions and hypothesis spaces?

Key Results

JDOT matches or outperforms the baselines on several domain adaptation benchmarks, including Caltech-Office, Amazon reviews, and Wifi localization.
The method uses the optimal transport plan to propagate and fuse source labels to target instances, improving transfer performance.
JDOT provides convergence guarantees for its block-coordinate optimization scheme under standard conditions.
Empirical results show JDOT achieves competitive or superior performance across regression and classification problems with diverse hypothesis spaces (kernel methods, neural networks).
The authors provide an open-source implementation and demonstrate practical applicability across real datasets.
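
The label propagation mentioned in the results above can be illustrated with a small worked example. This is a sketch of one simple reading of the idea, not necessarily the exact procedure in the paper: each target sample receives a weighted mix of the source labels, with weights taken from its column of the coupling. The coupling `gamma` and the labels below are hypothetical values chosen by hand so that the marginals are uniform.

```python
import numpy as np

# Hypothetical coupling between 4 source samples (rows) and 2 target samples (columns).
# Each row sums to 1/4 and each column sums to 1/2 (uniform marginals).
gamma = np.array([
    [0.25, 0.00],
    [0.20, 0.05],
    [0.05, 0.20],
    [0.00, 0.25],
])
source_labels = np.array([0, 0, 1, 1])

# One-hot encode the source labels: shape (n_source, n_classes).
Y = np.eye(2)[source_labels]

# Fuse the source labels for each target sample. Scaling by n_target
# turns each column of gamma into weights that sum to 1.
n_target = gamma.shape[1]
scores = n_target * gamma.T @ Y   # approximately [[0.9, 0.1], [0.1, 0.9]]
pred = scores.argmax(axis=1)      # hard labels for the target samples: [0, 1]
print(scores, pred)
```

Each row of `scores` is a soft class distribution for one target sample, so a target point that receives most of its mass from class-0 source points is labeled 0.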

This review was generated by AI and reviewed by a human editor.