QUICK REVIEW

[Paper Review] Recurrent Neural Networks for Multivariate Time Series with Missing Values

Zhengping Che, Sanjay Purushotham|arXiv (Cornell University)|Jun 6, 2016

Time Series Analysis and Forecasting | References: 30 | Citations: 231

One-line summary

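GRU-D is a GRU-based model that explicitly models informative missingness in multivariate time series by building masking, time intervals, and trainable decay into the input and the hidden state, which improves prediction on clinical data tasks.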

ABSTRACT

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provides useful insights for better understanding and utilization of missing values in time series analysis.

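Motivation and objectives

Motivate the need to exploit informative missingness in multivariate time series, particularly in health care.
Develop a GRU-based model (GRU-D) that handles missing values in a unified way through masking and time intervals.
Show that GRU-D outperforms GRU baselines and non-RNN methods on real-world clinical datasets and on synthetic data.
Provide insight into how missing patterns help prediction, and present a framework for time series with missing data.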

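Proposed method

Introduce a masking vector m_t and a time interval δ_t, which represent whether each variable is observed and how recently it was last observed.
Propose GRU-D by adding trainable decay mechanisms γ_x and γ_h, which decay the input and the hidden state toward a mean/default value as the time since the last observation grows.
Incorporate the masking vector m_t and the decay terms directly into the GRU update equations (z_t, r_t, h_t), so that prediction and missing-value handling are learned jointly.
Define the decay as γ_t = exp(-max(0, W_γ δ_t + b_γ)), which keeps it in (0, 1] and allows a separate decay rate per variable (for the input decay, W_γ is diagonal).
Allow two decay paths: an input decay γ_x, which decays the last observed value of a feature toward its empirical mean, and a hidden-state decay γ_h, which acts on h_{t-1}.
Feed the masking vector m_t into the GRU gates so that the model knows which features are observed at each step.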
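The steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`time_intervals`, `decay`, `impute_inputs`, `gru_d_step`), the parameter dictionary `p`, the random weights, and the toy data are all hypothetical. The rule used to accumulate δ_t, the fallback to the mean before a feature's first observation, and the gate layout (a standard GRU with an extra mask term) are assumptions that should be checked against the paper.

```python
import numpy as np

def time_intervals(s, m):
    """delta[t, d]: time elapsed since feature d was last observed (assumed rule).
    s: (T,) timestamps; m: (T, D) mask with 1 = observed."""
    T, D = m.shape
    delta = np.zeros((T, D))
    for t in range(1, T):
        # The gap keeps accumulating while the feature was missing at t-1.
        delta[t] = (s[t] - s[t - 1]) + (1 - m[t - 1]) * delta[t - 1]
    return delta

def decay(delta, w, b):
    """gamma = exp(-max(0, w * delta + b)), elementwise (diagonal W)."""
    return np.exp(-np.maximum(0.0, w * delta + b))

def impute_inputs(x, m, delta, x_mean, w_x, b_x):
    """Fill missing entries with the last observation decayed toward x_mean."""
    x_hat = np.empty_like(x)
    x_last = x_mean.copy()  # assumption: use the mean before any observation
    for t in range(len(x)):
        gamma_x = decay(delta[t], w_x, b_x)
        filled = gamma_x * x_last + (1 - gamma_x) * x_mean
        x_hat[t] = np.where(m[t] == 1, x[t], filled)
        x_last = np.where(m[t] == 1, x[t], x_last)
    return x_hat

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_d_step(x_hat_t, m_t, delta_t, h_prev, p):
    """One recurrent step: decay h_{t-1}, then a GRU update that also sees m_t."""
    gamma_h = np.exp(-np.maximum(0.0, p["W_gh"] @ delta_t + p["b_gh"]))
    h_dec = gamma_h * h_prev
    z = sigmoid(p["W_z"] @ x_hat_t + p["U_z"] @ h_dec + p["V_z"] @ m_t + p["b_z"])
    r = sigmoid(p["W_r"] @ x_hat_t + p["U_r"] @ h_dec + p["V_r"] @ m_t + p["b_r"])
    h_new = np.tanh(p["W_h"] @ x_hat_t + p["U_h"] @ (r * h_dec)
                    + p["V_h"] @ m_t + p["b_h"])
    return (1 - z) * h_dec + z * h_new

# Toy series (hypothetical): 4 time steps, 2 features, NaN marks a missing value.
s = np.array([0.0, 1.0, 3.0, 4.0])
m = np.array([[1, 1], [0, 1], [0, 0], [1, 1]])
x = np.array([[1.0, 10.0], [np.nan, 12.0], [np.nan, np.nan], [5.0, 14.0]])
x_mean = np.array([2.0, 11.0])  # stands in for the empirical training mean

delta = time_intervals(s, m)
x_hat = impute_inputs(x, m, delta, x_mean, w_x=np.ones(2), b_x=np.zeros(2))

# Random, untrained weights, only to exercise the shapes.
rng = np.random.default_rng(0)
D, H = 2, 3
p = {"W_gh": rng.normal(scale=0.1, size=(H, D)), "b_gh": np.zeros(H)}
for g in ("z", "r", "h"):
    p[f"W_{g}"] = rng.normal(scale=0.1, size=(H, D))
    p[f"U_{g}"] = rng.normal(scale=0.1, size=(H, H))
    p[f"V_{g}"] = rng.normal(scale=0.1, size=(H, D))
    p[f"b_{g}"] = np.zeros(H)

h = np.zeros(H)
for t in range(len(s)):
    h = gru_d_step(x_hat[t], m[t], delta[t], h, p)
```

In this sketch the interval for a variable keeps growing while it stays missing and resets to the step gap after an observation, so a long-missing variable is pulled closer to its mean than a recently observed one. In a trained model the decay weights would be learned together with the GRU weights rather than fixed as they are here.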

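Experimental results

Research questions

RQ1: Can informative missing patterns improve time series classification on clinical data?
RQ2: Does GRU-D outperform GRU variants and non-RNN baselines on real-world multivariate clinical time series with missing values?
RQ3: How do the input and hidden-state decays contribute to prediction performance and to the interpretability of missing patterns?
RQ4: Can GRU-D make online/early predictions when only part of the time series is available?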

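Key findings

GRU-D achieves the highest average AUC on the mortality prediction and ICD-9 prediction tasks on MIMIC-III and PhysioNet, outperforming the GRU baselines and the non-RNN models.
On the synthetic gesture data, GRU-D outperforms the baselines as the missingness becomes more informative, showing that it models informative missing patterns effectively.
GRU-D improves early prediction: with only part of the series observed, it approaches or nearly matches non-RNN baselines that use later data, and it performs better in online prediction as more time steps are observed.
The learned input and hidden-state decays reveal the per-variable effect of missingness. The effect is more pronounced for variables with lower missing rates, which indicates that the model handles informative missingness in a meaningful way.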

This review was generated by AI and checked by a human editor.