QUICK REVIEW

[Paper Review] Neural Relational Inference for Interacting Systems

Thomas Kipf, Ethan Fetaya|arXiv (Cornell University)|Feb 13, 2018

Human Pose and Action Recognition | References: 45 | Cited by: 255

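ONE-SENTENCE SUMMARY

An unsupervised variational auto-encoder that simultaneously infers a latent interaction graph and learns the dynamics with graph neural networks, yielding interpretable edge types and accurate future-state predictions on simulated and real-world interacting systems.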

ABSTRACT

Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics. The interplay of components can give rise to complex behavior, which can often be explained using a simple model of the system's constituent parts. In this work, we introduce the neural relational inference (NRI) model: an unsupervised model that learns to infer interactions while simultaneously learning the dynamics purely from observational data. Our model takes the form of a variational auto-encoder, in which the latent code represents the underlying interaction graph and the reconstruction is based on graph neural networks. In experiments on simulated physical systems, we show that our NRI model can accurately recover ground-truth interactions in an unsupervised manner. We further demonstrate that we can find an interpretable structure and predict complex dynamics in real motion capture and sports tracking data.

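Motivation and Objectives

Motivate learning the latent interactions of dynamical systems from trajectories that carry no human edge annotations.
Propose the neural relational inference (NRI) model, which jointly learns edge-type interactions and the system dynamics.
Represent edge-type interactions in an interpretable, discrete form through a probabilistic graph-based decoder.
Demonstrate unsupervised recovery of the true interactions and accurate long-term prediction on simulated physics, motion capture, and sports tracking data.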

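Proposed Method

Encode the observed trajectories with a GNN operating on the fully connected graph to infer the edge-type distribution q_phi(z_ij|x).
Represent the discrete latent graph (edge types) as one-hot vectors, and reparameterize sampling with a continuous relaxation (the concrete distribution).
Decode future trajectories with a GNN-based decoder conditioned on the inferred graph z, using a separate decoder function per edge type to model different interactions.
Train with the variational objective (ELBO), which is the reconstruction term E_{q_phi(z|x)}[log p_theta(x|z)] minus the KL term KL[q_phi(z|x)||p_theta(z)].
Mitigate decoder degeneracy by predicting multiple future steps and by using per-edge-type decoders, which force the predictions to depend on the edge types.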
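
The sampling and training steps listed above can be sketched in a few lines of plain Python. This is only an illustration, not the authors' implementation. The function names, the temperature tau=0.5, the uniform prior over edge types, and the fixed-variance Gaussian reconstruction term are all assumptions made for this example.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_concrete(logits, tau=0.5, rng=random):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    relaxed = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-10), 1.0 - 1e-10)
        gumbel = -math.log(-math.log(u))
        relaxed.append((logit + gumbel) / tau)
    return softmax(relaxed)


def kl_to_uniform(q):
    """KL[q || uniform] for one edge (assumption: uniform prior p(z_ij))."""
    k = len(q)
    return sum(p * math.log(p * k) for p in q if p > 0.0)


def negative_elbo(pred, target, edge_posteriors, sigma=1.0):
    """Negative ELBO = reconstruction error + summed per-edge KL.

    Assumption: Gaussian likelihood with fixed sigma, constants dropped.
    """
    recon = sum((p - t) ** 2 for p, t in zip(pred, target)) / (2.0 * sigma ** 2)
    kl = sum(kl_to_uniform(q) for q in edge_posteriors)
    return recon + kl


rng = random.Random(0)
edge_logits = [2.0, -1.0]             # hypothetical encoder output for one edge
z_ij = sample_concrete(edge_logits, tau=0.5, rng=rng)  # relaxed one-hot edge type
q_ij = softmax(edge_logits)           # posterior q(z_ij | x) for the KL term
loss = negative_elbo([0.9, 2.1], [1.0, 2.0], [q_ij])
```

Lowering tau pushes the sample closer to a true one-hot vector at the cost of noisier gradients. The sample stays a differentiable function of the logits, which is what makes the reparameterization usable for training.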

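Experimental Results

Research Questions

RQ1: Can the model infer the latent interaction graph from trajectories without supervision?
RQ2: How well do the inferred edge types correspond to the true interactions in physical simulations?
RQ3: Does the edge-type-aware decoder outperform fully connected or non-relational baselines in long-term prediction accuracy?
RQ4: Is the method robust on real-world data such as motion capture and sports tracking, and does it produce interpretable interaction structure?

Key Findings

In the unsupervised physical-system experiments, the NRI model accurately recovers the ground-truth interaction graph.
NRI learns a small number of edge types that enable accurate long-term prediction on motion capture and sports tracking data.
Dynamically re-evaluating the latent graph improves predictive performance on real motion capture data.
Edge-type-specific decoders and multi-step prediction mitigate decoder degeneracy and improve the learning of interactions.
In several simulated tasks, the graphs learned by NRI approach or match the supervised and gold-standard baselines in both interaction recovery and prediction.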
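
To see why edge-type-specific decoders and multi-step prediction tie the predictions to the inferred graph, consider the toy rollout below. It is a hedged sketch with scalar node states. The two hand-written edge functions are hypothetical stand-ins for learned per-type networks, and treating the first edge type as "no interaction" is an assumption of this example.

```python
def no_interaction(x_i, x_j):
    """Edge type 0 (assumed): the sender has no effect on the receiver."""
    return 0.0


def spring(x_i, x_j):
    """Edge type 1 (hypothetical): pulls receiver j toward sender i."""
    return 0.1 * (x_i - x_j)


def edge_message(z_ij, x_i, x_j, edge_fns):
    """Mix the per-type messages using the (relaxed) one-hot edge type z_ij."""
    return sum(z_k * f(x_i, x_j) for z_k, f in zip(z_ij, edge_fns))


def decoder_step(x, z, edge_fns):
    """One-step prediction: each node adds the sum of its incoming messages."""
    n = len(x)
    nxt = []
    for j in range(n):
        incoming = sum(edge_message(z[(i, j)], x[i], x[j], edge_fns)
                       for i in range(n) if i != j)
        nxt.append(x[j] + incoming)
    return nxt


def rollout(x0, z, edge_fns, steps):
    """Multi-step prediction: feed each prediction back in as the next input."""
    traj = [x0]
    for _ in range(steps):
        traj.append(decoder_step(traj[-1], z, edge_fns))
    return traj[1:]


edge_fns = [no_interaction, spring]
x0 = [0.0, 1.0, 5.0]
# z[(i, j)] is the edge type from sender i to receiver j.
z = {(i, j): [1.0, 0.0] for i in range(3) for j in range(3) if i != j}
z[(0, 1)] = [0.0, 1.0]  # nodes 0 and 1 are coupled by a spring
z[(1, 0)] = [0.0, 1.0]
preds = rollout(x0, z, edge_fns, steps=3)
```

In this sketch nodes 0 and 1 drift toward each other while node 2 stays put. An error in z therefore compounds over the rollout, so a loss on several future steps penalizes a decoder that ignores the edge types.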

This review was generated by AI and checked by a human editor.