QUICK REVIEW

[Paper Review] Scalable Gradients for Stochastic Differential Equations

Xuechen Li, Ting‐Kam Leonard Wong|arXiv (Cornell University)|Jan 5, 2020

Model Reduction and Neural Networks | References: 77 | Cited by: 108

One-Sentence Summary

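This paper generalizes the adjoint sensitivity method to stochastic differential equations, enabling memory-efficient, scalable gradient computation for neural SDEs and latent SDE models with high-order adaptive solvers.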

ABSTRACT

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.

Motivation and Objectives

Provide scalable gradient computation for SDEs, going beyond existing adjoint methods for ODEs.
Extend the adjoint sensitivity framework to Stratonovich SDEs with backward dynamics.
Develop memory-efficient techniques to replay forward noise without storing trajectories.
Enable gradient-based inference and learning for latent SDE models with irregular observations.
Demonstrate competitive performance on high-dimensional stochastic dynamics tasks.

Proposed Method

Generalize the adjoint sensitivity method from ODEs to SDEs using backward Stratonovich SDEs to compute gradients.
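Derive a stochastic adjoint process that starts from the loss gradient with respect to the terminal state Z_T and is solved backward in time to yield the gradients with respect to the initial state and the parameters.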
Propose an algorithm to query forward-path noise with a single random seed, avoiding storage of all noise realizations.
Integrate the stochastic adjoint with gradient-based stochastic variational inference for latent SDEs.
Allow using high-order adaptive time-stepping SDE solvers within the backward gradient computation.
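
The first two items can be written out as a formal sketch. The notation here is an assumption chosen for illustration: drift b, diffusion columns \sigma_i, parameters \theta, and a loss \mathcal{L} that depends only on the terminal state Z_T. The equations use forward-time notation and are meant as an informal reading of the stochastic adjoint, not as the paper's exact statement, which works with time-reversed processes.

```latex
% Forward Stratonovich SDE
dZ_t = b(Z_t, t, \theta)\,dt
     + \sum_{i=1}^{m} \sigma_i(Z_t, t, \theta) \circ dW_t^{(i)},
\qquad Z_0 = z_0 .

% Adjoint A_t = \partial\mathcal{L} / \partial Z_t, solved from t = T down to t = 0
dA_t = -\Big(\frac{\partial b}{\partial z}\Big)^{\top} A_t \,dt
     - \sum_{i=1}^{m} \Big(\frac{\partial \sigma_i}{\partial z}\Big)^{\top} A_t \circ dW_t^{(i)},
\qquad A_T = \frac{\partial \mathcal{L}}{\partial Z_T} .

% Gradients read off from the adjoint
\frac{\partial \mathcal{L}}{\partial z_0} = A_0 ,
\qquad
\frac{\partial \mathcal{L}}{\partial \theta}
   = \int_0^T \Big(\frac{\partial b}{\partial \theta}\Big)^{\top} A_t \,dt
   + \sum_{i=1}^{m} \int_0^T \Big(\frac{\partial \sigma_i}{\partial \theta}\Big)^{\top} A_t \circ dW_t^{(i)} .
```

As a sanity check, for the scalar case b = \mu z, \sigma = s z with \mathcal{L} = Z_T, these formulas give A_t = Z_T / Z_t and \partial\mathcal{L}/\partial\mu = T Z_T, which agrees with differentiating the closed-form solution Z_T = z_0 \exp(\mu T + s W_T).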

Experimental Results

Research Questions

RQ1: Can the adjoint sensitivity method be extended to stochastic dynamics to compute gradients efficiently with constant memory?
RQ2: Which backward dynamics, written as a Stratonovich SDE, correctly recover the forward trajectory for gradient computation?
RQ3: How can one replay forward noise without storing full trajectories while maintaining exact gradient information?
RQ4: Can the proposed stochastic adjoint framework be combined with variational inference for latent SDEs?
RQ5: Is the approach scalable to neural SDEs with high dimensionality and irregularly sampled data?
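
RQ1 and RQ2 can be made concrete on a toy problem. The sketch below is not the paper's implementation: it assumes a scalar linear SDE dZ = MU*Z dt + SIGMA*Z o dW with hypothetical coefficients, a loss equal to Z_T, and a fixed-step Heun scheme. It integrates forward, then integrates the state and the adjoint backward with negated time and noise increments, and compares the result with the closed-form gradient exp(MU*T + SIGMA*W_T). For brevity the noise increments are kept in a list, which is exactly the storage the paper's seed-based replay avoids.

```python
import math
import random

MU, SIGMA = 0.5, 0.3  # hypothetical coefficients of dZ = MU*Z dt + SIGMA*Z o dW


def forward_fields(state):
    """Drift and diffusion of the forward SDE for state = (z,)."""
    (z,) = state
    return (MU * z,), (SIGMA * z,)


def backward_fields(state):
    """Drift and diffusion of the augmented system for state = (z, a).

    The adjoint a = dL/dz follows da = -(db/dz) a dt - (dsigma/dz) a o dW.
    For this linear SDE, db/dz = MU and dsigma/dz = SIGMA.
    """
    z, a = state
    return (MU * z, -MU * a), (SIGMA * z, -SIGMA * a)


def heun_step(fields, state, dt, dw):
    """One Heun step (a Stratonovich scheme); dt and dw may be negative."""
    f0, g0 = fields(state)
    pred = tuple(x + f * dt + g * dw for x, f, g in zip(state, f0, g0))
    f1, g1 = fields(pred)
    return tuple(
        x + 0.5 * (fa + fb) * dt + 0.5 * (ga + gb) * dw
        for x, fa, fb, ga, gb in zip(state, f0, f1, g0, g1)
    )


def run_demo(z0=1.0, T=1.0, n_steps=1000, seed=0):
    dt = T / n_steps
    rng = random.Random(seed)
    # Stored here only for brevity; a seed-based replay would avoid this list.
    dws = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]

    # Forward pass: only the terminal state is kept.
    state = (z0,)
    for dw in dws:
        state = heun_step(forward_fields, state, dt, dw)
    z_T = state[0]

    # Backward pass for the loss L = Z_T, so the adjoint starts at a_T = 1.
    state = (z_T, 1.0)
    for dw in reversed(dws):
        state = heun_step(backward_fields, state, -dt, -dw)
    z0_reconstructed, grad_adjoint = state

    # Closed-form dZ_T/dz0 for the Stratonovich linear SDE.
    grad_exact = math.exp(MU * T + SIGMA * sum(dws))
    return {
        "z_T": z_T,
        "z0_reconstructed": z0_reconstructed,
        "grad_adjoint": grad_adjoint,
        "grad_exact": grad_exact,
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.6f}")
```

Only the terminal state and the current (state, adjoint) pair are carried through the backward loop, which is where the constant-memory property would come from once the noise list is replaced by a replayable source.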

Key Findings

The stochastic adjoint method uses constant, O(1), memory and O(L log L) time, where L is the number of solver steps; by comparison, backpropagating through the solver's operations needs O(L) memory.
The backward Stratonovich SDE framework yields correct gradient reconstruction for stochastic dynamics.
An efficient noise-caching mechanism allows exact gradient computation with a single random seed.
The method supports high-order adaptive solvers for SDEs, enabling scalable training of neural SDEs.
Combined with gradient-based stochastic variational inference, the approach fits latent SDE models with competitive performance.
Demonstrations include 50-dimensional motion capture dynamics.
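
The single-seed noise replay mentioned in the findings can be illustrated with a small Brownian-bridge bisection. This is a sketch under stated assumptions, not the paper's code: the function names are hypothetical, and reseeding `random.Random` stands in for the splittable random number generator that such a scheme would normally rely on. Each query walks down a binary tree of midpoints, sampling the midpoint value from the Brownian bridge between the two known endpoints, so it needs O(1) memory and a number of steps logarithmic in (t1 - t0) / tol.

```python
import math
import random


def _split(seed, n):
    """Derive n child seeds deterministically from a parent seed (stand-in for a splittable PRNG)."""
    rng = random.Random(seed)
    return [rng.getrandbits(64) for _ in range(n)]


def _normal(seed):
    """Standard normal sample that depends only on the seed."""
    return random.Random(seed).gauss(0.0, 1.0)


def brownian_query(t, seed, t0=0.0, t1=1.0, tol=1e-9):
    """Return W(t) on [t0, t1] with W(t0) = 0, reconstructed from `seed` alone."""
    if not (t0 < t1 and t0 <= t <= t1):
        raise ValueError("need t0 < t1 and t0 <= t <= t1")
    seed_end, node = _split(seed, 2)
    w0 = 0.0
    w1 = math.sqrt(t1 - t0) * _normal(seed_end)
    while t1 - t0 > tol:
        seed_mid, seed_left, seed_right = _split(node, 3)
        tm = 0.5 * (t0 + t1)
        # Brownian bridge: the midpoint given both endpoints is Gaussian with
        # mean (w0 + w1) / 2 and variance (t1 - t0) / 4.
        wm = 0.5 * (w0 + w1) + 0.5 * math.sqrt(t1 - t0) * _normal(seed_mid)
        if t < tm:
            t1, w1, node = tm, wm, seed_left
        else:
            t0, w0, node = tm, wm, seed_right
    # Linear interpolation inside the final interval of width at most tol.
    return w0 + (w1 - w0) * (t - t0) / (t1 - t0)


# The same path can be queried in any order, for example forward and then backward.
times = [0.0, 0.25, 0.5, 1.0]
forward_values = [brownian_query(t, seed=42) for t in times]
backward_values = [brownian_query(t, seed=42) for t in reversed(times)]
```

Because every midpoint value depends only on the seed of its tree node, a backward solver can ask for the noise at whatever times its adaptive step size selects and still see the same path as the forward pass. The per-query logarithmic cost is one way to read the extra log factor in the O(L log L) time listed above.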


This review was generated by AI and reviewed by a human editor.