QUICK REVIEW

[Paper Review] DIVA: Domain Invariant Variational Autoencoders

Maximilian Ilse, Jakub M. Tomczak | arXiv (Cornell University) | May 24, 2019

Domain Adaptation and Few-Shot Learning | References: 43 | Citations: 66

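One-Line Summary

DIVA learns three independent latent subspaces within a VAE (domain, class, and residual), achieving domain generalization and using unlabeled data to improve performance further.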

ABSTRACT

We consider the problem of domain generalization, namely, how to learn representations given data from a set of domains that generalize to data from a previously unseen domain. We propose the Domain Invariant Variational Autoencoder (DIVA), a generative model that tackles this problem by learning three independent latent subspaces, one for the domain, one for the class, and one for any residual variations. We highlight that due to the generative nature of our model we can also incorporate unlabeled data from known or previously unseen domains. To the best of our knowledge this has not been done before in a domain generalization setting. This property is highly desirable in fields like medical imaging where labeled data is scarce. We experimentally evaluate our model on the rotated MNIST benchmark and a malaria cell images dataset where we show that (i) the learned subspaces are indeed complementary to each other, (ii) we improve upon recent works on this task and (iii) incorporating unlabelled data can boost the performance even further.

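Motivation and Objectives

Motivate domain generalization, where the training domains differ from an unseen test domain, including in medical imaging.
Propose a generative model (DIVA) that factorizes domain, class, and residual factors into separate latent subspaces.
Enable semi-supervised learning that exploits unlabeled data from known or unseen domains.
Demonstrate improved disentanglement and generalization on rotated MNIST and a malaria cell images dataset.
Show that unlabeled data improves performance, and discuss interpolation and extrapolation in the domain space.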

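Proposed Method

Introduce three independent latent variables, z_d (domain), z_y (class), and z_x (residual), with priors p(z_d|d), p(z_y|y), and p(z_x), respectively.
Use three separate encoders, q_phi_d, q_phi_y, and q_phi_x, to infer z_d, z_y, and z_x from x, together with a shared decoder p_theta(x|z_d,z_x,z_y).
Optimize a β-VAE-style lower bound consisting of a reconstruction term and a KL penalty for each latent: L_s = E[...] log p_theta(x|z_d,z_x,z_y) - beta[KL(q_phi_d(z_d|x)||p_theta_d(z_d|d)) + KL(q_phi_x(z_x|x)||p(z_x)) + KL(q_phi_y(z_y|x)||p_theta_y(z_y|y))].
To encourage disentanglement, add auxiliary objectives that predict the domain from z_d and the class from z_y: F_DIVA = L_s + alpha_d E[log q_omega_d(d|z_d)] + alpha_y E[log q_omega_y(y|z_y)].
Extend to semi-supervised DIVA by training jointly on labeled data (d,x,y) and unlabeled data (d,x): y is marginalized out, and the objective, as in Eq. (4), blends supervised and unsupervised terms together with the auxiliary classifier on z_y.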
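
The supervised objective F_DIVA described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes diagonal Gaussian posteriors and priors, a single-sample estimate in which the reconstruction log-likelihood and the auxiliary classifier logits have already been computed by some network, and placeholder values for beta, alpha_d, and alpha_y. The function names (kl_diag_gaussians, log_softmax_at, diva_objective) are hypothetical and chosen only for this example.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(N(mu_q, diag(sigma_q^2)) || N(mu_p, diag(sigma_p^2))), summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total


def log_softmax_at(logits, index):
    """Log-probability of class `index` under a softmax over `logits`."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[index] - log_norm


def diva_objective(recon_log_lik, q_d, p_d, q_x, p_x, q_y, p_y,
                   d_logits, d, y_logits, y,
                   beta=1.0, alpha_d=1.0, alpha_y=1.0):
    """Single-sample estimate of F_DIVA (to be maximized).

    Each q_* / p_* is a (mu, sigma) pair of equal-length lists:
      q_d, q_x, q_y: encoder posteriors for z_d, z_x, z_y given x
      p_d, p_y:      conditional priors p(z_d|d), p(z_y|y)
      p_x:           prior p(z_x)
    beta, alpha_d, alpha_y are placeholder defaults (assumption).
    """
    kl_total = (kl_diag_gaussians(*q_d, *p_d)
                + kl_diag_gaussians(*q_x, *p_x)
                + kl_diag_gaussians(*q_y, *p_y))
    l_s = recon_log_lik - beta * kl_total            # beta-weighted lower bound L_s
    aux_d = log_softmax_at(d_logits, d)              # log q_omega_d(d | z_d)
    aux_y = log_softmax_at(y_logits, y)              # log q_omega_y(y | z_y)
    return l_s + alpha_d * aux_d + alpha_y * aux_y


# Toy usage with made-up numbers: 1-D latents, 2 domains, 2 classes.
unit = ([0.0], [1.0])
value = diva_objective(
    recon_log_lik=-10.0,
    q_d=([1.0], [1.0]), p_d=unit,   # this KL term equals 0.5
    q_x=unit, p_x=unit,             # identical distributions, KL = 0
    q_y=unit, p_y=unit,
    d_logits=[0.0, 0.0], d=0,
    y_logits=[0.0, 0.0], y=1,
)
print(value)
```

In a real model the (mu, sigma) pairs would come from the three encoders and the conditional prior networks, and the negated value would be used as the training loss. The semi-supervised extension, which marginalizes y for unlabeled samples, is not covered by this sketch.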

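Experimental Results

Research Questions

RQ1: Can a VAE with three latent spaces (domain, class, residual) disentangle domain-specific from class-specific information so that generalization to unseen domains improves?
RQ2: Does incorporating unlabeled data from known or unseen domains improve DIVA's domain generalization performance?
RQ3: How does DIVA compare with domain adaptation methods and other domain generalization methods on benchmarks such as rotated MNIST and malaria cell images?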

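Key Findings

On the rotated MNIST test domains, DIVA achieves higher test accuracy than DA, LG, HEX, and ADV.
Incorporating unlabeled data (+1, +3, +5, +9) generally improves performance, but the gain shrinks when unlabeled data dominates the labeled data.
The latent subspaces are disentangled: z_d captures the domain, z_y the class, and z_x residual variation, which enables conditional reconstruction and sample generation.
On the malaria cell images, DIVA improves ROC AUC over the baselines and benefits from unlabeled data in the semi-supervised setting.
DIVA can learn from unlabeled data from a new domain, updating the y predictor and the domain encoder to improve generalization.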

This review was generated by AI and checked by a human editor.