QUICK REVIEW

[Paper Review] Learning Long-Term Dependencies in Irregularly-Sampled Time Series

Mathias Lechner, Ramin Hasani|arXiv (Cornell University)|Jun 8, 2020

Topic: Model Reduction and Neural Networks | References: 49 | Citations: 56

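One-Line Summary

This paper introduces the ODE-LSTM, a continuous-time extension of the LSTM that separates its memory from its time-continuous state. Because it can stably learn long-term dependencies in irregularly sampled data, it addresses the vanishing and exploding gradients of ODE-RNNs and performs well across a range of tasks.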

ABSTRACT

Recurrent neural networks (RNNs) with continuous-time hidden states are a natural fit for modeling irregularly-sampled time series. These models, however, face difficulties when the input data possess long-term dependencies. We prove that similar to standard RNNs, the underlying reason for this issue is the vanishing or exploding of the gradient during training. This phenomenon is expressed by the ordinary differential equation (ODE) representation of the hidden state, regardless of the ODE solver's choice. We provide a solution by designing a new algorithm based on the long short-term memory (LSTM) that separates its memory from its time-continuous state. This way, we encode a continuous-time dynamical flow within the RNN, allowing it to respond to inputs arriving at arbitrary time-lags while ensuring a constant error propagation through the memory path. We call these RNN models ODE-LSTMs. We experimentally show that ODE-LSTMs outperform advanced RNN-based counterparts on non-uniformly sampled data with long-term dependencies. All code and data is available at https://github.com/mlech26l/ode-lstms.

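Motivation and Objectives

Model irregularly sampled time series that contain long-term dependencies.
Identify why ODE-RNNs suffer from vanishing or exploding gradients during training.
Propose a memory-augmented continuous-time RNN that preserves gradient flow.
Demonstrate the proposed model's empirical advantage on synthetic and real-world tasks.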

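Proposed Method

Proposes the ODE-LSTM, which keeps the LSTM memory cell and gating while evolving the output state with continuous-time dynamics solved along an ODE-RNN path.
Proves theoretically that ODE-RNNs suffer from vanishing or exploding gradients, both under common discretizations and under adjoint-based training.
Embeds a time-continuous flow in the LSTM output computation, so the model responds to inputs arriving at arbitrary time lags while keeping gradient propagation stable.
Compares the ODE-LSTM with a broad set of continuous-time RNN baselines, including ODE-RNN, CT-RNN, GRU-ODE, CT-LSTM, and GRU-D, on synthetic and real datasets.
Reports empirical results on a bit-stream XOR task, activity recognition, irregularly sampled sequential MNIST, and Walker2d kinematics.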
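
The method described above can be sketched as a forward pass: an LSTM update at each observation, followed by an ODE solve of the output state over the elapsed time, with the memory cell left untouched by the solver. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation (see the linked repository for that). The class name `ODELSTMCellSketch`, the dynamics `f(h) = tanh(hA + d) - h`, the fixed-step explicit Euler solver, the weight initialization, and the start time of 0 are all illustrative choices that do not come from the review.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ODELSTMCellSketch:
    """Hypothetical ODE-LSTM forward pass (illustration only, no training)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_size)
        # Stacked LSTM weights for the input, forget, output and candidate gates.
        self.W = rng.uniform(-scale, scale,
                             (input_size + hidden_size, 4 * hidden_size))
        self.b = np.zeros(4 * hidden_size)
        # Weights of the assumed continuous-time dynamics of the output state.
        self.A = rng.uniform(-scale, scale, (hidden_size, hidden_size))
        self.d = np.zeros(hidden_size)
        self.hidden_size = hidden_size

    def lstm_step(self, x, h, c):
        """Standard LSTM update; the memory cell c is only changed here."""
        z = np.concatenate([x, h]) @ self.W + self.b
        i, f, o, g = np.split(z, 4)
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        return h_new, c_new

    def dynamics(self, h):
        """Assumed ODE right-hand side dh/dt = tanh(h A + d) - h."""
        return np.tanh(h @ self.A + self.d) - h

    def ode_solve(self, h, dt, n_steps=4):
        """Evolve h over an elapsed time dt with fixed-step explicit Euler."""
        step = dt / n_steps
        for _ in range(n_steps):
            h = h + step * self.dynamics(h)
        return h

    def forward(self, xs, ts):
        """xs: (N, input_size) observations; ts: (N,) increasing timestamps."""
        h = np.zeros(self.hidden_size)
        c = np.zeros(self.hidden_size)
        t_prev = 0.0  # assumption: the sequence clock starts at 0
        outputs = []
        for x, t in zip(xs, ts):
            h, c = self.lstm_step(x, h, c)      # discrete memory update
            h = self.ode_solve(h, t - t_prev)   # continuous flow of h only
            t_prev = t
            outputs.append(h)
        return np.stack(outputs), c


# Usage with irregular timestamps (random placeholder data).
xs = np.random.default_rng(1).normal(size=(5, 3))
ts = np.array([0.1, 0.15, 0.9, 1.0, 2.5])
hs, c_final = ODELSTMCellSketch(input_size=3, hidden_size=8).forward(xs, ts)
```

The point of the design is that the time gap `t - t_prev` only affects `h`; the cell state `c` is passed from one observation to the next without being integrated, which is what the review means by separating the memory from the time-continuous state.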

Figure 1 : Magnitude of the states’ error propagation in time-continuous recurrent neural networks gives rise to the vanishing or exploding of the gradient (first two models). ODE-LSTMs are a solution to keep a constant gradient flow to avoid these phenomena in modeling irregularly sampled data.

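Experimental Results

Research Questions

RQ1: Do ODE-RNNs suffer from vanishing or exploding gradients when learning long-term dependencies from irregularly sampled data?
RQ2: In a continuous-time RNN, can an LSTM-style memory that is separated from the time-continuous state enable stable gradient flow?
RQ3: Do ODE-LSTMs outperform existing continuous-time RNN variants on synthetic and real-world irregular time-series benchmarks?
RQ4: How does the proposed model perform on long-term dependency tasks that involve non-uniform sampling?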

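Key Findings

ODE-RNNs and related continuous-time RNNs can have vanishing or exploding gradients, which hinders learning long-term dependencies.
ODE-LSTMs maintain a nearly constant gradient flow through the memory path, which makes it possible to learn long-term dependencies in irregularly sampled data.
Across the synthetic and real-world tasks, ODE-LSTMs consistently outperform advanced continuous-time RNN variants.
ODE-LSTMs achieve strong results on tasks such as bit-stream XOR, irregular MNIST, and Walker2d dynamics.
The architecture handles irregular sampling effectively and, unlike some decay-based baselines, does not decay its memory.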
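
The first two findings can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a scalar linear ODE `dh/dt = lam * h` discretized with explicit Euler, so the gradient of the last state with respect to the first is the product of the per-step factors `1 + dt * lam`. It then compares this with the product of forget-gate values along an LSTM memory path, where the forget gates are assumed to sit near 1 and the gates' own dependence on the state is ignored. The function names, time gaps, and gate value are hypothetical.

```python
import numpy as np


def euler_gradient_norm(lam, dts):
    """|d h_N / d h_0| for dh/dt = lam * h under explicit Euler steps dts."""
    return float(np.abs(np.prod(1.0 + dts * lam)))


def memory_path_gradient(forget_gates):
    """|d c_N / d c_0| along c_k = f_k * c_{k-1} + (input term)."""
    return float(np.prod(forget_gates))


rng = np.random.default_rng(0)
dts = rng.uniform(0.05, 0.5, size=200)  # hypothetical irregular time gaps

decaying = euler_gradient_norm(-1.0, dts)  # contracting dynamics: vanishes
growing = euler_gradient_norm(+1.0, dts)   # expanding dynamics: explodes
carousel = memory_path_gradient(np.full(200, 0.999))  # stays of order 1

print(f"contracting ODE state: {decaying:.3e}")
print(f"expanding ODE state:   {growing:.3e}")
print(f"memory path:           {carousel:.3f}")
```

Over 200 steps the state-path gradient shrinks or grows by many orders of magnitude, because every factor is below or above 1, while the memory-path product stays close to 1 (about 0.82 for a gate value of 0.999). The near-1 forget gate is an assumption of this toy example; a trained model is not guaranteed to keep its gates there.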

Figure 2 : Left: Illustration of how vanishing gradients make the training process of RNNs difficult when the data express long-term dependencies. The prediction error can be thought of as a teaching signal indicating how the dynamics should be changed to minimize the loss. The vanishing gradient of

This review was generated by AI and checked by a human editor.