QUICK REVIEW

[Paper Review] Towards a Definition of Disentangled Representations

Irina Higgins, David Amos|arXiv (Cornell University)|Dec 5, 2018

Generative Adversarial Networks and Image Synthesis | References: 5 | Citations: 294

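One-line summary

This paper uses group theory and symmetry to give a formal definition of disentangled representations. A representation counts as disentangled if it decomposes into subspaces, each of which is transformed by exactly one subgroup of the world's symmetry group and left unchanged by the others.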

ABSTRACT

How can intelligent agents solve a diverse set of tasks in a data-efficient manner? The disentangled representation learning approach posits that such an agent would benefit from separating out (disentangling) the underlying structure of the world into disjoint parts of its representation. However, there is no generally agreed-upon definition of disentangling, not least because it is unclear how to formalise the notion of world structure beyond toy datasets with a known ground truth generative process. Here we propose that a principled solution to characterising disentangled representations can be found by focusing on the transformation properties of the world. In particular, we suggest that those transformations that change only some properties of the underlying world state, while leaving all other properties invariant, are what gives exploitable structure to any kind of data. Similar ideas have already been successfully applied in physics, where the study of symmetry transformations has revolutionised the understanding of the world structure. By connecting symmetry transformations to vector representations using the formalism of group and representation theory we arrive at the first formal definition of disentangled representations. Our new definition is in agreement with many of the current intuitions about disentangling, while also providing principled resolutions to a number of previous points of contention. While this work focuses on formally defining disentangling - as opposed to solving the learning problem - we believe that the shift in perspective to studying data transformations can stimulate the development of better representation learning algorithms.

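Motivation and Objectives

Motivate a principled, formal definition of disentangled representations based on symmetry transformations.
Bridge concepts from physics (group theory and representation theory) to representations in machine learning.
Clarify what the generative factors of data are, and how they can be represented and manipulated.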

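Proposed Method

Introduce symmetry transformations as group actions that change only some aspects of the world while leaving all other aspects invariant.
Call a vector representation disentangled if it decomposes into independent subspaces, each affected by exactly one subgroup of the world's symmetry group.
Define equivariance: a map f: W → Z from world states to representations is G-equivariant if there is an action of G on Z such that f(g · w) = g · f(w) for every g in G and w in W. Equivariance is a necessary condition for disentanglement, not a sufficient one; the subspace decomposition must hold as well.
Formalise disentangled representations via a decomposition G = G1 × ... × Gn and a corresponding decomposition of Z, written Z = Z1 ⊕ ... ⊕ Zn or Z = Z1 × ... × Zn.
Discuss linear disentangled representations, in which each subgroup acts linearly on its own subspace.
Use a grid world as a worked example to illustrate the concepts.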
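The grid-world idea can be made concrete with a small numerical check. The following is a minimal sketch, not the authors' code: it assumes a cyclic N × N grid whose symmetry group is G = C_N × C_N (horizontal and vertical wrap-around translations), and a hand-written representation f that places each coordinate on its own circle in R^2. The names `rot`, `act_world`, `act_latent`, `is_equivariant`, and `subgroups_act_independently`, and the choice N = 5, are all illustrative assumptions.

```python
import numpy as np

N = 5  # assumed grid side length; the grid wraps around in both directions


def rot(k, n=N):
    """2x2 rotation by angle 2*pi*k/n, a linear representation of the cyclic group C_n."""
    t = 2 * np.pi * k / n
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])


def act_world(g, w):
    """Action of g = (dx, dy) on a world state w = (x, y): wrap-around translation."""
    return ((w[0] + g[0]) % N, (w[1] + g[1]) % N)


def f(w):
    """Hypothetical representation f: W -> Z = R^2 (+) R^2, one circle per coordinate."""
    ax, ay = 2 * np.pi * w[0] / N, 2 * np.pi * w[1] / N
    return np.array([np.cos(ax), np.sin(ax), np.cos(ay), np.sin(ay)])


def act_latent(g, z):
    """Block-diagonal linear action on Z: dx rotates Z1 = z[:2], dy rotates Z2 = z[2:]."""
    rho = np.zeros((4, 4))
    rho[:2, :2] = rot(g[0])
    rho[2:, 2:] = rot(g[1])
    return rho @ z


def is_equivariant():
    """Check f(g . w) == g . f(w) for every group element and every world state."""
    return all(
        np.allclose(f(act_world((dx, dy), (x, y))),
                    act_latent((dx, dy), f((x, y))))
        for dx in range(N) for dy in range(N)
        for x in range(N) for y in range(N)
    )


def subgroups_act_independently():
    """Check that horizontal moves leave Z2 fixed and vertical moves leave Z1 fixed."""
    for x in range(N):
        for y in range(N):
            z = f((x, y))
            for k in range(N):
                zx = act_latent((k, 0), z)  # horizontal subgroup only
                zy = act_latent((0, k), z)  # vertical subgroup only
                if not (np.allclose(zx[2:], z[2:]) and np.allclose(zy[:2], z[:2])):
                    return False
    return True


print(is_equivariant(), subgroups_act_independently())
```

Both checks pass for this hand-built f, which is what makes it a linear disentangled representation in the sense described above: the action on Z is block-diagonal, with one block per subgroup.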

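Results

Research Questions

RQ1: How can symmetry transformations be formalised so as to define disentangled representations?
RQ2: Given a decomposition of the world's symmetry group, what conditions guarantee that a representation separates the factors of variation into independent subspaces?
RQ3: How does equivariance relate to the ability to map world states into the representation space while preserving symmetry structure?
RQ4: What are the implications of choosing different subgroup decompositions for a given dataset?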
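To make the gap between equivariance and the subspace condition tangible, here is a small sketch under stated assumptions: a cyclic N × N grid with symmetry group C_N × C_N, a hand-written representation `f` with a block-diagonal action `rho`, and a hypothetical orthogonal mixing matrix `M` that blends one coordinate from each block. None of these names come from the paper. The mixed representation stays equivariant under the conjugated action, yet the fixed split z[:2] / z[2:] no longer isolates the two subgroups. Note that the mixed representation may still decompose along the rotated subspaces; the sketch only tests the coordinate-aligned split.

```python
import numpy as np

N = 5  # assumed grid side length (cyclic)


def rot(k):
    """2x2 rotation by 2*pi*k/N."""
    t = 2 * np.pi * k / N
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])


def rho(g):
    """Block-diagonal action of g = (dx, dy) on R^4."""
    out = np.zeros((4, 4))
    out[:2, :2] = rot(g[0])
    out[2:, 2:] = rot(g[1])
    return out


def f(w):
    """Hand-written representation: one circle per grid coordinate."""
    ax, ay = 2 * np.pi * w[0] / N, 2 * np.pi * w[1] / N
    return np.array([np.cos(ax), np.sin(ax), np.cos(ay), np.sin(ay)])


# Hypothetical mixing: rotate the (z0, z2) plane by 45 degrees (orthogonal matrix).
c = s = np.sqrt(0.5)
M = np.eye(4)
M[0, 0], M[0, 2], M[2, 0], M[2, 2] = c, -s, s, c


def f_mixed(w):
    """Same information as f, but the x and y factors share coordinates."""
    return M @ f(w)


def rho_mixed(g):
    """Conjugated action M rho(g) M^{-1}; M is orthogonal, so its inverse is M.T."""
    return M @ rho(g) @ M.T


def equivariant(rep, action):
    """Check rep(g . w) == action(g) @ rep(w) for all g and w."""
    return all(
        np.allclose(rep(((x + dx) % N, (y + dy) % N)), action((dx, dy)) @ rep((x, y)))
        for dx in range(N) for dy in range(N)
        for x in range(N) for y in range(N)
    )


def respects_coordinate_split(rep, action):
    """True if dx-moves leave z[2:] fixed and dy-moves leave z[:2] fixed."""
    for x in range(N):
        for y in range(N):
            z = rep((x, y))
            for k in range(1, N):
                if not np.allclose((action((k, 0)) @ z)[2:], z[2:]):
                    return False
                if not np.allclose((action((0, k)) @ z)[:2], z[:2]):
                    return False
    return True


print(equivariant(f, rho), respects_coordinate_split(f, rho))
print(equivariant(f_mixed, rho_mixed), respects_coordinate_split(f_mixed, rho_mixed))
```

The first line reports that `f` satisfies both checks; the second reports that `f_mixed` is still equivariant but fails the coordinate-split check. Equivariance alone therefore says nothing about which subspaces carry which factor.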

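Key Findings

Proposes the first principled, formal definition of disentangled representations, grounded in group theory and representation theory.
Shows that a disentangled representation corresponds to a decomposition of the representation space that is consistent with a decomposition of the world's symmetry group.
Argues that although multiple subgroup decompositions may exist, only natural decompositions that reflect the structure of the world yield useful disentangling.
Emphasises that disentangled representations make compositionality and linear transformations more attainable, which may help learning efficiency.
Clarifies how disentangling can be evaluated against a symmetry-based definition rather than purely empirical intuition.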
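For reference, the correspondence between the group decomposition and the representation-space decomposition can be written out compactly. This is a paraphrase in the notation of this review (W for world states, Z for the representation space, f for the map between them); the symbols \cdot_W, \cdot_Z and \rho_i are labels introduced here for readability and may differ from the paper's own notation.

```latex
\begin{align*}
&\text{Symmetry group: } G = G_1 \times \cdots \times G_n,
  \quad \text{acting on } W \text{ via } \cdot_W : G \times W \to W. \\
&\text{(1) Action on } Z\text{: } \cdot_Z : G \times Z \to Z. \\
&\text{(2) Equivariance: } f(g \cdot_W w) = g \cdot_Z f(w)
  \quad \text{for all } g \in G,\ w \in W. \\
&\text{(3) Decomposition: } Z = Z_1 \times \cdots \times Z_n
  \ (\text{or } Z_1 \oplus \cdots \oplus Z_n), \\
&\qquad \text{where each } Z_i \text{ is affected only by } G_i
  \text{ and left fixed by every } G_j,\ j \neq i. \\
&\text{Linear case: } (g_1, \dots, g_n) \cdot_Z (z_1, \dots, z_n)
  = \bigl(\rho_1(g_1)\, z_1, \dots, \rho_n(g_n)\, z_n\bigr),
  \quad \rho_i : G_i \to GL(Z_i).
\end{align*}
```

Conditions (1) to (3) together express disentanglement with respect to the chosen decomposition of G; the last line adds the requirement that each subgroup acts linearly on its own subspace.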


This review was generated by AI and checked by a human editor.