QUICK REVIEW

[Paper Review] Accelerating Neural ODEs with Spectral Elements

Alessio Quaglino, Marco Gallieri|arXiv (Cornell University)|Jun 17, 2019

Model Reduction and Neural Networks | References: 24 | Cited by: 8

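One-Sentence Summary

This paper accelerates the training of neural ODEs by expressing their dynamics as a truncated series of Legendre polynomials, which enables a time-parallel optimization in which coordinate descent alternately updates the spectral coefficients and the network weights. The method reaches a comparable loss at least one order of magnitude (10x) faster than standard backpropagation through explicit solvers and the adjoint method, and its test mean squared error (MSE) is also one order of magnitude lower.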

ABSTRACT

This paper proposes the use of spectral element methods (Canuto et al., 1988) for fast and accurate training of Neural Ordinary Differential Equations (ODE-Nets; Chen et al., 2018) for system identification. This is achieved by expressing their dynamics as a truncated series of Legendre polynomials. The series coefficients, as well as the network weights, are computed by minimizing the weighted sum of the loss function and the violation of the ODE-Net dynamics. The problem is solved by coordinate descent that alternately minimizes, with respect to the coefficients and the weights, two unconstrained sub-problems using standard backpropagation and gradient methods. The resulting optimization scheme is fully time-parallel and results in a low memory footprint. Experimental comparison to standard methods, such as backpropagation through explicit solvers and the adjoint technique (Chen et al., 2018), on training surrogate models of small and medium-scale dynamical systems shows that it is at least one order of magnitude faster at reaching a comparable value of the loss function. The corresponding testing MSE is one order of magnitude smaller as well, suggesting generalization capabilities increase.
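
The objective described in the abstract can be written out as a rough sketch. The notation below is assumed for illustration and is not taken from the paper: c denotes the Legendre coefficients, \theta the network weights, f_\theta the ODE-Net right-hand side, \mathcal{L} the data loss, and \alpha > 0 the weight on the dynamics violation.

```latex
% Assumed notation, for illustration only.
% Truncated Legendre series for the state on a reference interval:
x_c(t) = \sum_{k=0}^{N} c_k \, P_k(t), \qquad t \in [-1, 1]

% Weighted sum of the loss and the violation of the ODE-Net dynamics:
J(c, \theta) = \int_{-1}^{1} \mathcal{L}\big(x_c(t)\big)\, dt
  + \alpha \int_{-1}^{1} \big\| \dot{x}_c(t) - f_\theta\big(t, x_c(t)\big) \big\|^2 \, dt

% Coordinate descent over the two unconstrained sub-problems:
c^{(j+1)} = \arg\min_{c} \; J\big(c, \theta^{(j)}\big), \qquad
\theta^{(j+1)} = \arg\min_{\theta} \; J\big(c^{(j+1)}, \theta\big)
```

Because the loss term does not depend on \theta in this sketch, the weight update reduces to minimizing the residual integral alone.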

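Motivation and Objectives

Accelerate neural ODE training for system identification of dynamical systems.
Reduce the memory footprint of neural ODE training.
Improve generalization by minimizing both the loss and the violation of the ODE dynamics.
Enable fully time-parallel optimization by reformulating the ODE dynamics in spectral space.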

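Proposed Method

The neural ODE dynamics are represented as a truncated series of Legendre polynomials.
The spectral coefficients and the network weights are optimized jointly by minimizing a weighted sum of the loss and the ODE residual.
Coordinate descent alternates between optimizing the coefficients (with gradient methods) and the weights (with backpropagation).
The method is fully time-parallel because the spectral coefficients can be computed independently across time intervals.
The ODE constraint is imposed through a weak form of the residual, which minimizes the deviation from the true dynamics.
Spectral element methods give the approach high accuracy with low memory use.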
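
The alternating scheme listed above can be illustrated on a toy problem. The sketch below is not the authors' implementation and rests on several assumptions: the "network" is a hypothetical one-parameter model f(x; w) = w * x, a single time interval [-1, 1] is used instead of multiple spectral elements, and both sub-problems are solved in closed form by least squares rather than by gradient steps and backpropagation. The function name `fit_toy_spectral_ode` and its parameters are made up for this example.

```python
import numpy as np
from numpy.polynomial import legendre as L


def fit_toy_spectral_ode(t, y, q, degree=8, alpha=1e-3, n_iters=50):
    """Alternate exact solves for Legendre coefficients c and a scalar weight w.

    Toy model (an assumption for illustration): dx/dt = f(x; w) = w * x.
    t, q: quadrature nodes and weights on [-1, 1]; y: observations at t.
    """
    V = L.legvander(t, degree)  # V[i, k] = P_k(t_i)
    # D[i, k] = dP_k/dt at t_i, from the derivative of each basis series
    D = np.column_stack([L.legval(t, L.legder(e)) for e in np.eye(degree + 1)])
    sq = np.sqrt(q)[:, None]  # square roots of quadrature weights, as a column
    w, history = 0.0, []
    for _ in range(n_iters):
        # Coefficient step: weighted least squares on the stacked system
        # [data misfit; sqrt(alpha) * ODE residual], with w held fixed.
        A = np.vstack([sq * V, np.sqrt(alpha) * sq * (D - w * V)])
        b = np.concatenate([sq[:, 0] * y, np.zeros(len(t))])
        c = np.linalg.lstsq(A, b, rcond=None)[0]
        # Weight step: closed-form minimizer of the residual, with c held fixed.
        x, dx = V @ c, D @ c
        w = np.sum(q * x * dx) / np.sum(q * x * x)
        # Objective: quadrature of the loss plus alpha times the squared residual.
        J = np.sum(q * (x - y) ** 2) + alpha * np.sum(q * (dx - w * x) ** 2)
        history.append(J)
    return c, w, history
```

A possible usage: with `t, q = L.leggauss(16)` and `y = np.exp(-t)`, calling `fit_toy_spectral_ode(t, y, q)` should drive `w` toward -1, the decay rate used to generate `y`. Every quantity is evaluated at all quadrature nodes at once, so no step-by-step time integration is involved.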

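Experimental Results

Research Questions

RQ1: Can spectral element methods improve the training speed and accuracy of neural ODEs for system identification?
RQ2: How does time-parallel optimization over spectral coefficients compare with standard backpropagation through a solver?
RQ3: To what extent does the method reduce memory consumption during training?
RQ4: Does enforcing the ODE dynamics through a spectral residual improve generalization?
RQ5: Does the method scale well to medium-scale dynamical systems?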

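Key Findings

The proposed method converges at least one order of magnitude faster than standard backpropagation through explicit solvers and the adjoint method.
It reaches a comparable loss value in significantly fewer training iterations, indicating faster optimization dynamics.
The test mean squared error (MSE) is one order of magnitude lower than that of the baselines, suggesting improved generalization.
The optimization scheme is fully time-parallel and can be evaluated efficiently across the time interval.
The structured, global representation of the dynamics through Legendre polynomials keeps the memory footprint low.
Jointly minimizing the loss and the ODE residual yields more accurate surrogate models of dynamical systems.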
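
To make the time-parallel point concrete, the sketch below contrasts a sequential explicit-Euler rollout with a spectral residual evaluated at every time point in one vectorized call. It is an illustration under assumptions, not the paper's code: `f` is a hypothetical scalar stand-in for the network, and the helper names are invented for this example.

```python
import numpy as np
from numpy.polynomial import legendre as L


def f(x, w):
    # Hypothetical stand-in for the ODE-Net right-hand side.
    return w * x


def euler_rollout(x0, w, t):
    # Each state depends on the previous one, so the loop is inherently
    # sequential, and every intermediate state is kept in memory.
    xs = [x0]
    for k in range(len(t) - 1):
        xs.append(xs[-1] + (t[k + 1] - t[k]) * f(xs[-1], w))
    return np.array(xs)


def spectral_residual(c, w, t):
    # The trajectory is a Legendre series with coefficients c, so x and dx/dt
    # are available at all time points at once, with no stepping.
    x = L.legval(t, c)
    dx = L.legval(t, L.legder(c))
    return dx - f(x, w)
```

In the rollout, time step k + 1 cannot start before step k finishes. In the spectral form, the residual at each time point depends only on the coefficients, so the evaluations are independent of one another and can be batched or distributed.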


This review was generated by AI and reviewed by human editors.