QUICK REVIEW

[Paper Review] Learning Plannable Representations with Causal InfoGAN

Thanard Kurutach, Aviv Tamar|arXiv (Cornell University)|Jul 24, 2018

Multimodal Machine Learning Applications | References: 33 | Cited by: 91

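One-Sentence Summary

Causal InfoGAN learns low-dimensional, plannable representations from high-dimensional observations and enables goal-directed visual planning by mapping observations to abstract states, planning in that latent space, and then decoding the plan back into a sequence of observations.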

ABSTRACT

In recent years, deep generative models have been shown to 'imagine' convincing high-dimensional observations such as images, audio, and even video, learning directly from raw data. In this work, we ask how to imagine goal-directed visual plans -- a plausible sequence of observations that transition a dynamical system from its current configuration to a desired goal state, which can later be used as a reference trajectory for control. We focus on systems with high-dimensional observations, such as images, and propose an approach that naturally combines representation learning and planning. Our framework learns a generative model of sequential observations, where the generative process is induced by a transition in a low-dimensional planning model, and an additional noise. By maximizing the mutual information between the generated observations and the transition in the planning model, we obtain a low-dimensional representation that best explains the causal nature of the data. We structure the planning model to be compatible with efficient planning algorithms, and we propose several such models based on either discrete or continuous states. Finally, to generate a visual plan, we project the current and goal observations onto their respective states in the planning model, plan a trajectory, and then use the generative model to transform the trajectory to a sequence of observations. We demonstrate our method on imagining plausible visual plans of rope manipulation.

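Motivation and Goals

Motivate and address the problem of imagining goal-directed visual plans from high-dimensional observations.
Learn a low-dimensional, planning-friendly representation that captures the causal structure of the data.
Combine representation learning with planning to generate a sequence of observations from start to goal.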

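Proposed Method

Train a Causal InfoGAN that factors observations into a structured latent planning system and a noise component.
Use a mutual information objective so that the abstract states s and s' capture the causal transitions that explain the data.
Support both discrete (one-hot or binary) and continuous latent planning systems, each paired with a compatible planning algorithm.
Encode observations into latent states via Q(s|o) or via latent-space optimization to handle high-dimensional observations.
Decode latent state trajectories into observation sequences with a conditional GAN generator, and select the best trajectory using a discriminator or a novelty detector.
Approximate the mutual information between latent transitions and generated observations with a variational lower bound, I_VLB, so that it can be optimized.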
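
The variational lower bound in the last item can be written out as a hedged sketch. It follows the standard InfoGAN-style bound, which holds for any auxiliary posterior Q. The symbols T (latent transition distribution), P(z) (noise prior), and the generator signature G(z, s, s') are notational assumptions made here for illustration; the paper's exact notation may differ.

```latex
I(s, s'; o, o')
  \;\ge\;
  \mathbb{E}_{\substack{s \sim P(s),\; s' \sim T(s' \mid s) \\ z \sim P(z),\; (o, o') \sim G(z, s, s')}}
    \bigl[ \log Q(s, s' \mid o, o') \bigr]
  \;+\; H(s, s')
  \;=:\; I_{\mathrm{VLB}}(G, Q)
```

The entropy term H(s, s') depends only on the planning model, so maximizing the bound with respect to Q amounts to training Q to recover the latent transition from a generated observation pair.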

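Experimental Results

Research Questions

RQ1 How can we learn representations of high-dimensional data that are both expressive and amenable to planning?
RQ2 Can a GAN-based model with a planning-oriented latent space generate plausible walkthroughs from a start observation to a goal observation?
RQ3 How should the latent planning system be designed (discrete or continuous) to be compatible with standard planning algorithms?
RQ4 In high-dimensional domains, which strategies improve the encoding of real observations into latent states?
RQ5 How effective are the learned representations and generated walkthroughs on tasks such as rope manipulation?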

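Key Findings

Causal InfoGAN learns abstract states that align with causal transitions and support planning in the latent space.
The framework supports discrete and continuous latent planning systems that are compatible with planning algorithms such as Dijkstra's algorithm and linear interpolation.
The method generates plausible visual walkthroughs from start to goal in a rope manipulation setting using real image data.
For mapping observations to states, encoding strategies designed for high-dimensional observations (search-based latent mapping or a learned Q) outperform a discriminator trained only on generated data.
A variational lower bound makes the mutual information objective trainable, enabling end-to-end optimization of the model.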
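
To make the two planning styles named above concrete, here is a minimal sketch: Dijkstra's algorithm over a discrete latent transition graph, and linear interpolation between two continuous latent vectors. This is not the authors' implementation. The function names, the toy graph, and the edge costs are all hypothetical, and a real system would derive the graph from the learned transition model and pass each planned state through the generator to obtain observations.

```python
import heapq


def dijkstra_plan(transitions, start, goal):
    """Return the lowest-cost state sequence from start to goal, or None.

    transitions: dict mapping state -> {next_state: cost}, with cost >= 0.
    """
    frontier = [(0.0, start)]  # min-heap of (cost so far, state)
    best_cost = {start: 0.0}
    parent = {start: None}
    while frontier:
        cost, state = heapq.heappop(frontier)
        if state == goal:
            # Walk parent pointers back to the start to recover the plan.
            plan = []
            while state is not None:
                plan.append(state)
                state = parent[state]
            return plan[::-1]
        if cost > best_cost[state]:
            continue  # stale heap entry, a cheaper path was already found
        for nxt, step_cost in transitions.get(state, {}).items():
            new_cost = cost + step_cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = state
                heapq.heappush(frontier, (new_cost, nxt))
    return None  # goal is unreachable from start


def interpolate_plan(start, goal, num_steps):
    """Linearly interpolate between two continuous latent vectors."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    plan = []
    for i in range(num_steps):
        t = i / (num_steps - 1)
        plan.append([(1 - t) * a + t * b for a, b in zip(start, goal)])
    return plan


# Hypothetical toy latent graph; the states and costs are made up.
toy_transitions = {
    "s0": {"s1": 1.0, "s2": 4.0},
    "s1": {"s2": 1.0, "s3": 5.0},
    "s2": {"s3": 1.0},
    "s3": {},
}
discrete_plan = dijkstra_plan(toy_transitions, "s0", "s3")
continuous_plan = interpolate_plan([0.0, 0.0], [1.0, 2.0], num_steps=3)
```

One plausible choice for edge costs, again an assumption rather than something stated in this review, is the negative log of the learned transition probability, so that the cheapest path corresponds to the most likely latent trajectory.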

This review was generated by AI and reviewed by human editors.