QUICK REVIEW

[Paper Review] Learning Long-Term Dependencies in Irregularly-Sampled Time Series

Mathias Lechner, Ramin Hasani|arXiv (Cornell University)|Jun 8, 2020

Model Reduction and Neural Networks|References: 49|Cited by: 56

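One-Sentence Summary

This paper proposes ODE-LSTMs, a continuous-time extension of LSTMs that decouples memory from the time-continuous state in order to learn long-term dependencies reliably in irregularly-sampled data, addressing the vanishing/exploding gradient problem of ODE-RNNs and achieving better performance on a variety of tasks.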

ABSTRACT

Recurrent neural networks (RNNs) with continuous-time hidden states are a natural fit for modeling irregularly-sampled time series. These models, however, face difficulties when the input data possess long-term dependencies. We prove that similar to standard RNNs, the underlying reason for this issue is the vanishing or exploding of the gradient during training. This phenomenon is expressed by the ordinary differential equation (ODE) representation of the hidden state, regardless of the ODE solver's choice. We provide a solution by designing a new algorithm based on the long short-term memory (LSTM) that separates its memory from its time-continuous state. This way, we encode a continuous-time dynamical flow within the RNN, allowing it to respond to inputs arriving at arbitrary time-lags while ensuring a constant error propagation through the memory path. We call these RNN models ODE-LSTMs. We experimentally show that ODE-LSTMs outperform advanced RNN-based counterparts on non-uniformly sampled data with long-term dependencies. All code and data is available at https://github.com/mlech26l/ode-lstms.

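Motivation and Objectives

Motivate the modeling of irregularly-sampled time series that contain long-term dependencies.
Identify why ODE-RNNs suffer from vanishing/exploding gradients during training.
Propose a memory-augmented continuous-time RNN that preserves gradient propagation.
Demonstrate the empirical advantage of the proposed model on synthetic and real-world tasks.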

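Proposed Method

Build the ODE-LSTM by keeping the LSTM memory cell and gates while solving the continuous-time output dynamics through an ODE-RNN pathway.
Prove theoretically that ODE-RNNs exhibit vanishing/exploding gradients under common discretization schemes and under adjoint-based training, regardless of the choice of ODE solver.
Integrate a time-continuous flow into the LSTM output computation so that the model can respond to inputs arriving at arbitrary time lags while keeping gradient propagation stable.
Compare the ODE-LSTM against a broad set of continuous-time RNN baselines (e.g., ODE-RNN, CT-RNN, GRU-ODE, CT-LSTM, GRU-D) on synthetic and real-world datasets.
Report empirical results on bit-stream XOR-style tasks, activity recognition, irregularly-sampled sequential MNIST, and Walker2d dynamics.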
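The method described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation (their code is at the repository linked in the abstract). The class name `ODELSTMCellSketch`, the dynamics function `f(h) = tanh(A h + d)`, the fixed-step explicit Euler solver, and the `n_substeps` parameter are all assumptions chosen for brevity. The point it shows is the decoupling: the memory cell `c` is updated by ordinary LSTM gating only, while the hidden state `h` is additionally evolved through an ODE for the time elapsed since the previous observation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ODELSTMCellSketch:
    """Toy ODE-LSTM-style cell (hypothetical names, illustrative only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_size)
        # Stacked LSTM weights for the input, forget, candidate and output gates.
        self.W = rng.uniform(-scale, scale,
                             (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)
        # Assumed ODE dynamics: dh/dt = tanh(A h + d).
        self.A = rng.uniform(-scale, scale, (hidden_size, hidden_size))
        self.d = np.zeros(hidden_size)

    def f(self, h):
        return np.tanh(self.A @ h + self.d)

    def step(self, x, h, c, elapsed, n_substeps=4):
        # 1) Standard LSTM update: the memory path never touches the ODE.
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        # 2) Evolve only the hidden state over the elapsed time
        #    (explicit Euler here; any ODE solver could be substituted).
        dt = elapsed / n_substeps
        for _ in range(n_substeps):
            h_new = h_new + dt * self.f(h_new)
        return h_new, c_new


# Usage: three observations separated by irregular time gaps.
cell = ODELSTMCellSketch(input_size=2, hidden_size=8)
h, c = np.zeros(8), np.zeros(8)
for x, gap in zip(np.ones((3, 2)), [0.1, 2.5, 0.7]):
    h, c = cell.step(x, h, c, gap)
```

In this sketch the time gap influences `h` directly but reaches `c` only through later gate computations, which is the separation of memory and time-continuous state that the summary describes.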

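Experimental Results

Research Questions

RQ1: Do ODE-RNNs suffer from vanishing/exploding gradients when learning long-term dependencies on irregularly-sampled data?
RQ2: Can a stable gradient flow be achieved by using an LSTM-like memory in a continuous-time RNN to decouple memory from the time-continuous state?
RQ3: Do ODE-LSTMs outperform existing continuous-time RNN variants on synthetic and real-world irregular time-series benchmarks?
RQ4: How does the proposed model perform on tasks that require learning long-term dependencies from non-uniformly sampled data?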

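Key Findings

ODE-RNNs and related continuous-time RNNs exhibit vanishing or exploding gradients, which hinders the learning of long-term dependencies.
ODE-LSTMs maintain a near-constant gradient flow through the memory path and can therefore learn long-term dependencies in irregularly-sampled data.
Across synthetic and real-world tasks, ODE-LSTMs consistently outperform advanced continuous-time RNN variants.
ODE-LSTMs achieve better performance on tasks such as bit-stream XOR, irregularly-sampled MNIST, and Walker2d dynamics.
The architecture retains its memory under irregular sampling, unlike several decay-based baselines whose memory fades over time.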
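The gradient findings above can be illustrated with a scalar toy calculation. This is not the paper's proof, only an assumed simplification: a linear ODE dh/dt = a*h discretized with explicit Euler, compared against a memory cell whose forget gate is held fixed. The function names and parameter values are hypothetical. Under Euler, each step multiplies the backpropagated error by (1 + a*dt), so the factor over many steps shrinks or grows geometrically unless that multiplier is exactly 1, whereas a memory path with forget gate 1 passes the error through unchanged.

```python
def euler_gradient_factor(a, dt, steps):
    """d h_T / d h_0 for dh/dt = a*h under explicit Euler (toy model)."""
    return (1.0 + a * dt) ** steps


def memory_gradient_factor(forget_gate, steps):
    """d c_T / d c_0 for c_{k+1} = forget_gate * c_k + input (toy model)."""
    return forget_gate ** steps


vanishing = euler_gradient_factor(a=-1.0, dt=0.1, steps=200)  # 0.9 ** 200
exploding = euler_gradient_factor(a=1.0, dt=0.1, steps=200)   # 1.1 ** 200
constant = memory_gradient_factor(forget_gate=1.0, steps=200)
```

Here `vanishing` is on the order of 1e-9 and `exploding` is on the order of 1e8, while `constant` stays at 1.0 over the same 200 steps.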


This review was generated by AI and checked by human editors.