QUICK REVIEW

[Paper Review] Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions

Sara Magliacane, Thijs van Ommen|arXiv (Cornell University)|Jul 20, 2017

Domain Adaptation and Few-Shot Learning | Cited by 91

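One-line summary

This paper proposes a causal domain adaptation method that uses Joint Causal Inference to select a separating set of features A, chosen so that the conditional distribution Y|A is invariant between the source and target domains, which makes transfer possible even when the causal graph is unknown.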

ABSTRACT

An important goal common to domain adaptation and causal inference is to make accurate predictions when the distributions for the source (or training) domain(s) and target (or test) domain(s) differ. In many cases, these different distributions can be modeled as different contexts of a single underlying system, in which each distribution corresponds to a different perturbation of the system, or in causal terms, an intervention. We focus on a class of such causal domain adaptation problems, where data for one or more source domains are given, and the task is to predict the distribution of a certain target variable from measurements of other variables in one or more target domains. We propose an approach for solving these problems that exploits causal inference and does not rely on prior knowledge of the causal graph, the type of interventions or the intervention targets. We demonstrate our approach by evaluating a possible implementation on simulated and real world data.

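Motivation and objectives

Motivate domain adaptation in settings where the source and target distributions differ because of interventions or changes in context.
Develop a method that does not require full knowledge of the causal graph or of the intervention targets.
Identify a separating set A and show that Y, conditioned on A, is invariant across domains.
Provide a proof-of-concept implementation that can handle latent confounders.
Evaluate the approach on synthetic data and on real-world mouse genetics data.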

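Proposed method

Model the data with a structural causal model that includes both system variables and context variables.
Formulate domain shift as interventions represented by context variables, following Joint Causal Inference (JCI).
Define A as a separating set: a set of features such that C1 (the domain indicator) is conditionally independent of Y given A in the causal graph.
Test candidate sets A with a causal inference/discovery approach that combines conditional independences with an automated theorem prover.
Rank feature subsets by source-domain risk and search for an A that satisfies C1 ⟂ Y | A.
If A is a separating set, predict Y from A using a method that handles covariate shift.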
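The ranking-and-search step above can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' implementation: it assumes linear-Gaussian data, replaces the paper's theorem-prover-based inference with a direct partial-correlation (Fisher z) test of C1 ⟂ Y | A, and assumes Y is observed in both domains so that the test can be run on pooled data, which does not hold in the paper's actual setting. The simulated graph X1 → Y → X2 ← C1 and all function names are hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist, fmean, pvariance


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(cols, a, b, cond):
    """Partial correlation of columns a and b given cond (recursive formula)."""
    if not cond:
        return pearson(cols[a], cols[b])
    k, rest = cond[0], cond[1:]
    r_ab = partial_corr(cols, a, b, rest)
    r_ak = partial_corr(cols, a, k, rest)
    r_bk = partial_corr(cols, b, k, rest)
    return (r_ab - r_ak * r_bk) / math.sqrt((1 - r_ak**2) * (1 - r_bk**2))


def ci_p_value(cols, a, b, cond):
    """Two-sided Fisher z-test p-value for a independent of b given cond."""
    n = len(cols[a])
    r = max(min(partial_corr(cols, a, b, cond), 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - len(cond) - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def source_risk(cols, target, subset):
    """In-sample MSE of the best linear predictor of target from subset.

    Uses var(Y) * prod_k (1 - r^2(Y, a_k | a_1..a_{k-1})).
    """
    risk = pvariance(cols[target])
    for k, name in enumerate(subset):
        r = partial_corr(cols, target, name, subset[:k])
        risk *= 1 - r * r
    return risk


def select_separating_set(pooled, source, features, alpha=0.01):
    """Return the lowest-source-risk subset passing the independence check."""
    subsets = [s for k in range(len(features) + 1)
               for s in itertools.combinations(features, k)]
    ranked = sorted(subsets, key=lambda s: source_risk(source, "Y", s))
    for subset in ranked:
        if ci_p_value(pooled, "C1", "Y", subset) > alpha:
            return subset, ranked
    return None, ranked  # abstain: no candidate passed the check


def simulate(n, c1, rng):
    """Hypothetical system X1 -> Y -> X2 <- C1 (C1 shifts the child X2)."""
    cols = {"C1": [], "X1": [], "X2": [], "Y": []}
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        y = x1 + rng.gauss(0, 1)
        x2 = y + 2.0 * c1 + rng.gauss(0, 0.5)
        for name, value in zip(("C1", "X1", "X2", "Y"), (c1, x1, x2, y)):
            cols[name].append(value)
    return cols


rng = random.Random(0)
source = simulate(2000, 0, rng)   # C1 = 0: source domain
target = simulate(2000, 1, rng)   # C1 = 1: target domain
pooled = {name: source[name] + target[name] for name in source}

selected, ranked = select_separating_set(pooled, source, ("X1", "X2"))
print("ranking by source risk:", ranked)
print("selected subset:", selected)
```

In this simulated setup, subsets containing X2 have the lowest source risk because X2 is a child of Y, but conditioning on X2 opens the path C1 → X2 ← Y, so those subsets should fail the independence check. A subset without X2 is the kind of set the search is meant to return.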

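Experimental results

Research questions

RQ1: Can a subset of variables A be identified such that C1 is independent of Y given A in the joint system, guaranteeing that Y|A is invariant across domains?
RQ2: How can Joint Causal Inference be used to discover separating sets without full knowledge of the causal graph?
RQ3: What are the effects of transfer bias and incomplete-information bias when a non-separating feature set is used?
RQ4: Can the proposed method handle domain adaptation with latent confounders and various intervention types?
RQ5: How does the method perform on synthetic data and on real-world biological data?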

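Key findings

Within the framework, a separating set A, when one exists, makes Y|A invariant between the source and target domains.
Using A reduces transfer bias and, when A satisfies the separation condition, provides an asymptotic guarantee on the target-domain risk.
A practical algorithm that combines feature ranking with a causal-inference verifier identifies separating sets.
When no separating set is found, the method abstains from prediction, which avoids arbitrarily bad transfer.
The approach was evaluated on synthetic data and on IMPC mouse genetics data and compared against standard feature-selection baselines.
The implementation is publicly available online for reproducibility (caus-am/dom_adapt).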
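The invariance finding can be written out in one line. The encoding of the source domain as C_1 = 0 and the target domain as C_1 = 1 is a notational assumption made here for illustration, and the equalities apply only to values a that have positive density in both domains.

```latex
C_1 \perp Y \mid A
\;\Longrightarrow\;
p(y \mid a, C_1 = 0) = p(y \mid a, C_1 = 1)
\;\Longrightarrow\;
\mathbb{E}[Y \mid A = a, C_1 = 1] = \mathbb{E}[Y \mid A = a, C_1 = 0]
```

Under these assumptions, a regression function of Y on A fitted in the source domain is also the correct regression function in the target domain, even though the distribution of A itself may differ between the two.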

This review was generated by AI and checked by a human editor.