QUICK REVIEW

[Paper Review] Scalable Gradients for Stochastic Differential Equations

Xuechen Li, Ting‐Kam Leonard Wong|arXiv (Cornell University)|Jan 5, 2020
Model Reduction and Neural Networks | References: 77 | Cited by: 108
One-Sentence Summary

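This paper generalizes the adjoint sensitivity method to stochastic differential equations, enabling memory-efficient, scalable gradient computation for neural SDEs and latent SDE models with high-order adaptive solvers.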

ABSTRACT

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.

Research Motivation and Objectives

  • Provide scalable gradient computation for SDEs, going beyond existing adjoint methods for ODEs.
  • Extend the adjoint sensitivity framework to Stratonovich SDEs with backward dynamics.
  • Develop memory-efficient techniques to replay forward noise without storing trajectories.
  • Enable gradient-based inference and learning for latent SDE models with irregular observations.
  • Demonstrate competitive performance on high-dimensional stochastic dynamics tasks.

Proposed Method

  • Generalize the adjoint sensitivity method from ODEs to SDEs using backward Stratonovich SDEs to compute gradients.
  • Derive a stochastic adjoint process that is integrated backward in time, starting from the gradient of the loss with respect to the terminal state Z_T, and whose solution yields the desired gradient.
  • Propose an algorithm to query forward-path noise with a single random seed, avoiding storage of all noise realizations.
  • Integrate the stochastic adjoint with gradient-based stochastic variational inference for latent SDEs.
  • Allow using high-order adaptive time-stepping SDE solvers within the backward gradient computation.
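
The backward pass described in the bullets above can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a scalar geometric Brownian motion in Stratonovich form, dZ = a Z dt + s Z ∘ dW, a loss L = Z_T, and a fixed-step Heun scheme in place of an adaptive solver. The names `fields`, `heun_step`, and `stochastic_adjoint_demo` are hypothetical, and the Brownian increments are kept in a list for brevity rather than regenerated from a seed. The augmented backward system (state, adjoint, parameter gradient) is the Stratonovich analogue of the ODE adjoint equations, and the closed form Z_T = z0 exp(a T + s W_T) gives a reference to compare against.

```python
import math
import random

def fields(state, a, s):
    """Drift and diffusion of the augmented system (z, adjoint, dL/da).

    Assumed toy model: dZ = a*Z dt + s*Z o dW (Stratonovich).
    Adjoint:           dA = -a*A dt - s*A o dW.
    Parameter grad:    dG = -A*Z dt  (the diffusion does not depend on a).
    """
    z, adj, _ = state
    drift = (a * z, -a * adj, -adj * z)
    diffusion = (s * z, -s * adj, 0.0)
    return drift, diffusion

def heun_step(state, h, dw, a, s):
    """One Heun step; this scheme targets the Stratonovich solution."""
    f0, g0 = fields(state, a, s)
    pred = tuple(x + f * h + g * dw for x, f, g in zip(state, f0, g0))
    f1, g1 = fields(pred, a, s)
    return tuple(
        x + 0.5 * (fa + fb) * h + 0.5 * (ga + gb) * dw
        for x, fa, fb, ga, gb in zip(state, f0, f1, g0, g1)
    )

def stochastic_adjoint_demo(a=0.5, s=0.3, z0=1.0, T=1.0, n=1000, seed=0):
    rng = random.Random(seed)
    h = T / n
    # Stored here for simplicity; a seed-based noise query would avoid this list.
    dws = [rng.gauss(0.0, math.sqrt(h)) for _ in range(n)]

    # Forward pass: only the state is integrated (adjoint slots stay at zero).
    z = z0
    for dw in dws:
        z = heun_step((z, 0.0, 0.0), h, dw, a, s)[0]
    z_T = z

    # Backward pass: start from dL/dZ_T = 1 and replay the same noise in
    # reverse order with negated step size and negated increments.
    state = (z_T, 1.0, 0.0)
    for dw in reversed(dws):
        state = heun_step(state, -h, -dw, a, s)
    z0_reconstructed, grad_z0, grad_a = state
    return z_T, z0_reconstructed, grad_z0, grad_a, sum(dws)

z_T, z0_rec, grad_z0, grad_a, w_T = stochastic_adjoint_demo()
exact = math.exp(0.5 * 1.0 + 0.3 * w_T)  # closed-form dZ_T/dz0 for z0 = 1, T = 1
print(z0_rec, grad_z0, grad_a, exact)
```

In this toy setting the backward pass should recover the initial state z0 closely, and both gradients should agree with the closed-form values (dZ_T/dz0 = exp(a T + s W_T) and dZ_T/da = T Z_T) up to discretization error. No trajectory of Z is stored between the two passes.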

Experimental Results

Research Questions

  • RQ1: Can the adjoint sensitivity method be extended to stochastic dynamics to compute gradients efficiently with constant memory?
  • RQ2: What backward dynamics, written as a Stratonovich SDE, correctly recover forward trajectories for gradient computation?
  • RQ3: How can one replay forward noise without storing full trajectories while maintaining exact gradient information?
  • RQ4: Can the proposed stochastic adjoint framework be combined with variational inference for latent SDEs?
  • RQ5: Is the approach scalable to neural SDEs with high dimensionality and irregularly sampled data?
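
One way to picture the noise replay asked about in RQ3 is a Brownian-bridge bisection keyed by a single seed. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the function `brownian_query`, its tolerance `tol`, and the string-keyed `random.Random` generators (standing in for a splittable PRNG) are all hypothetical choices. Each node of an implicit binary tree over [t0, t1] gets its own deterministic generator, so the value W(t) can be recomputed on demand, in any order, without storing the path.

```python
import math
import random

def brownian_query(t, seed, t0=0.0, t1=1.0, tol=1e-6):
    """Return W(t) for a Brownian path determined entirely by `seed`.

    The path is refined by repeated bisection: the midpoint of an interval
    [s, u] is sampled from the Brownian bridge given W(s) and W(u), using a
    generator keyed by (seed, node index). The same key always reproduces
    the same draw, so repeated queries are consistent with each other.
    """
    w_left = 0.0  # W(t0) = 0 by convention
    w_right = random.Random(f"{seed}/root").gauss(0.0, math.sqrt(t1 - t0))
    s, u = t0, t1
    node = 1  # root index; children of node i are 2*i and 2*i + 1
    while u - s > tol:
        mid = 0.5 * (s + u)
        rng = random.Random(f"{seed}/{node}")
        # Brownian bridge at the midpoint: mean is the average of the two
        # endpoint values, variance is (u - s) / 4.
        w_mid = rng.gauss(0.5 * (w_left + w_right), math.sqrt((u - s) / 4.0))
        if t < mid:
            u, w_right, node = mid, w_mid, 2 * node
        else:
            s, w_left, node = mid, w_mid, 2 * node + 1
    # Linear interpolation inside the final interval of width <= tol.
    return w_left + (w_right - w_left) * (t - s) / (u - s)

# The backward pass can ask for the same values the forward pass used.
forward_value = brownian_query(0.3, seed=42)
replayed_value = brownian_query(0.3, seed=42)
print(forward_value == replayed_value)
```

Each query walks one root-to-leaf path, so its cost grows with the logarithm of (t1 - t0) / tol while the memory held between queries is just the seed. That trade of extra compute for constant memory is the design choice this sketch is meant to show.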

Key Findings

  • The stochastic adjoint method uses constant, O(1), memory and runs in O(L log L) time, where L is the number of solver steps; backpropagating through the solver's operations, by comparison, needs O(L) memory.
  • The backward Stratonovich SDE framework yields correct gradient reconstruction for stochastic dynamics.
  • A memory-efficient noise-caching mechanism reconstructs the forward noise from a single random seed, so the backward pass computes gradients against the same noise path without storing it.
  • The method supports high-order adaptive solvers for SDEs, enabling scalable training of neural SDEs.
  • The approach, when combined with gradient-based variational inference, yields competitive performance on latent SDEs across datasets.
  • Experiments include fitting stochastic dynamics on a 50-dimensional motion capture dataset.

This review was generated by AI and reviewed by a human editor.