QUICK REVIEW

[Paper Review] Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit

Belinda Tzen, Maxim Raginsky|arXiv (Cornell University)|May 23, 2019

Model Reduction and Neural Networks|References: 41|Cited by: 54

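One-Sentence Summary

This paper develops a variational inference framework for neural SDEs by treating the latent path as an Itô diffusion, driven by a Wiener process, whose drift and diffusion coefficients are parameterized by neural networks; this enables end-to-end learning with black-box SDE solvers.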

ABSTRACT

In deep latent Gaussian models, the latent variable is generated by a time-inhomogeneous Markov chain, where at each time step we pass the current state through a parametric nonlinear map, such as a feedforward neural net, and add a small independent Gaussian perturbation. This work considers the diffusion limit of such models, where the number of layers tends to infinity, while the step size and the noise variance tend to zero. The limiting latent object is an Itô diffusion process that solves a stochastic differential equation (SDE) whose drift and diffusion coefficient are implemented by neural nets. We develop a variational inference framework for these neural SDEs via stochastic automatic differentiation in Wiener space, where the variational approximations to the posterior are obtained by Girsanov (mean-shift) transformation of the standard Wiener process and the computation of gradients is based on the theory of stochastic flows. This permits the use of black-box SDE solvers and automatic differentiation for end-to-end inference. Experimental results with synthetic data are provided.

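Research Motivation and Objectives

Motivate and formalize the diffusion limit of deep latent Gaussian models as an Itô diffusion.
Develop a variational inference framework for neural SDEs based on the Gibbs variational principle in path space.
Use Girsanov reparameterization and stochastic flows to enable gradient-based optimization with black-box SDE solvers.
Provide a method for automatic differentiation in Wiener space that enables end-to-end learning.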
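
The diffusion limit described in the abstract can be written out as follows. This is a sketch in assumed notation: the symbols b (drift), σ (diffusion coefficient), θ (network parameters), h (step size) and Z_k (Gaussian perturbation) are chosen here for illustration and may differ from the paper's own.

```latex
% Deep latent Gaussian model with n layers and step size h = 1/n (notation assumed).
% Each layer applies a parametric map and adds a small Gaussian perturbation:
\begin{align*}
X_{k+1} &= X_k + h\, b(X_k, t_k; \theta) + \sqrt{h}\, \sigma(X_k, t_k; \theta)\, Z_k,
  & Z_k &\sim \mathcal{N}(0, I), \quad t_k = k h. \\
\intertext{As $n \to \infty$, the step size $h$ and the per-step noise variance
(of order $h$) both tend to zero, and the chain is replaced by the It\^o SDE}
\mathrm{d}X_t &= b(X_t, t; \theta)\,\mathrm{d}t + \sigma(X_t, t; \theta)\,\mathrm{d}W_t,
  & t &\in [0, 1].
\end{align*}
```

Read in the other direction, the first line is the Euler-Maruyama discretization of the SDE in the second line, which is why a finite-depth model can be seen as a numerical solver for its continuous-time limit.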

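Proposed Method

Model the latent process as an Itô SDE with neural-network drift and diffusion coefficients.
Use Wiener space as the latent space and apply the Gibbs variational principle in path space.
Apply Girsanov's theorem to relate the variational posterior to a mean shift of the Wiener process.
Use automatic differentiation in Wiener space to compute gradients through the SDE solver.
Discuss two gradient computation strategies: solve-then-differentiate (Euler backpropagation) and differentiate-then-solve (pathwise differentiation).
Describe how to approximate the Föllmer drift using neural networks and Monte Carlo estimation.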

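Experimental Results

Research Questions

RQ1: How can inference for neural SDEs be formulated in a variational framework analogous to that of deep latent Gaussian models?
RQ2: Can Girsanov reparameterization and stochastic flows enable tractable gradient-based learning with black-box SDE solvers?
RQ3: What are practical methods for computing gradients with respect to neural SDE parameters in continuous time?
RQ4: How does the diffusion-limit perspective affect the expressiveness of, and inference in, deep probabilistic models?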

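Key Findings

A variational bound on the marginal likelihood is established using the Gibbs principle in path space.
By Girsanov's theorem, the variational approximations to the posterior are obtained as mean shifts of the standard Wiener process.
Variational inference for neural SDEs can be carried out via automatic differentiation in Wiener space.
Two gradient computation strategies are proposed: solve-then-differentiate, via Euler backpropagation, and differentiate-then-solve, via stochastic flows.
The drift and diffusion of a neural SDE can be implemented by neural networks while remaining end-to-end differentiable.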
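
The variational bound obtained from the Gibbs principle and the Girsanov mean shift can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's experiment: the prior latent process is a plain Wiener process (zero drift, unit diffusion), the observation model is Gaussian, y | X_1 ~ N(X_1, s²), and the variational family is a constant drift u, whereas the paper uses a neural-network drift. Under these assumptions the exact marginal is y ~ N(0, 1 + s²), and the bound reads -log p(y) ≤ E[ ½∫u² dt - log p(y | X_1) ] with dX = u dt + dW.

```python
import math
import random


def neg_log_lik(y, x, s):
    """-log N(y; x, s^2): Gaussian observation model (an assumption of this sketch)."""
    return 0.5 * math.log(2.0 * math.pi * s * s) + (y - x) ** 2 / (2.0 * s * s)


def variational_bound(u, y, s, n_paths=5000, n_steps=20, seed=0):
    """Monte Carlo estimate of E[0.5 * int_0^1 u^2 dt - log p(y | X_1)],
    where dX = u dt + dW and X_0 = 0 (Wiener process shifted by constant drift u)."""
    rng = random.Random(seed)
    h = 1.0 / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, cost = 0.0, 0.0
        for _ in range(n_steps):
            cost += 0.5 * u * u * h  # running cost of the mean shift (KL term)
            x += u * h + math.sqrt(h) * rng.gauss(0.0, 1.0)  # Euler-Maruyama step
        total += cost + neg_log_lik(y, x, s)
    return total / n_paths


y, s = 1.0, 1.0
# Exact negative log marginal likelihood, available in closed form for this toy model.
exact = 0.5 * math.log(2.0 * math.pi * (1.0 + s * s)) + y * y / (2.0 * (1.0 + s * s))
bound = variational_bound(u=0.5, y=y, s=s)
print(f"exact -log p(y) = {exact:.4f}, bound with u = 0.5: {bound:.4f}")
```

A constant drift is a deliberately crude variational family, so the estimate stays strictly above the exact value; a state-dependent drift, such as the neural-network drift used for the variational posterior, is what allows the gap to shrink.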

This review was generated by AI and reviewed by human editors.