QUICK REVIEW

[Paper Review] BRITS: Bidirectional Recurrent Imputation for Time Series

Wei Cao, Dong Wang|arXiv (Cornell University)|May 27, 2018

Time Series Analysis and Forecasting|References: 22|Cited by: 318

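One-sentence summary

BRITS introduces a bidirectional recurrent network that infers missing values in multivariate time series and jointly optimizes imputation and classification/regression within a single graph, outperforming state-of-the-art methods.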

ABSTRACT

Time series are widely used as signals in many classification/regression tasks. It is ubiquitous that time series contains many missing values. Given multiple correlated time series data, how to fill in missing values and to predict their class labels? Existing imputation methods often impose strong assumptions of the underlying data generating process, such as linear dynamics in the state space. In this paper, we propose BRITS, a novel method based on recurrent neural networks for missing value imputation in time series data. Our proposed method directly learns the missing values in a bidirectional recurrent dynamical system, without any specific assumption. The imputed values are treated as variables of RNN graph and can be effectively updated during the backpropagation. BRITS has three advantages: (a) it can handle multiple correlated missing values in time series; (b) it generalizes to time series with nonlinear dynamics underlying; (c) it provides a data-driven imputation procedure and applies to general settings with missing data. We evaluate our model on three real-world datasets, including an air quality dataset, a health-care data, and a localization data for human activity. Experiments show that our model outperforms the state-of-the-art methods in both imputation and classification/regression accuracies.

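Motivation and objectives

Advance robust missing-value imputation for multivariate, irregularly sampled time series without relying on strong assumptions about the data-generating process.
Propose a bidirectional RNN framework that treats missing values as variables of the graph, updated during backpropagation.
Learn imputation and downstream classification/regression jointly to reduce error propagation.
Demonstrate better imputation and prediction performance on three real-world datasets (air quality, health-care, and human activity localization).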

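Proposed method

Develop a bidirectional recurrent neural network for missing-value imputation in multivariate time series.
Treat missing entries as variables of the RNN graph that receive gradients during backpropagation, and enforce consistency between the forward-direction and backward-direction estimates.
Introduce a temporal decay factor to handle irregular sampling and gaps between observations.
Extend the model to correlated features by combining a history-based estimate with a feature-based estimate using learned weights.
Optimize the imputation loss and the downstream task loss (classification/regression) together in a single neural graph.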
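The pieces listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the scalars `w`, `b`, and `beta` stand in for parameters that the model would learn, and the exponential form of the decay and the side of the mix that `beta` weights follow the usual GRU-D-style convention rather than anything stated in this review.

```python
import math


def time_gaps(timestamps, mask):
    """Time since the last observation of one feature, per step.

    timestamps: increasing sample times; mask: 1 = observed, 0 = missing.
    """
    gaps = [0.0]
    for t in range(1, len(timestamps)):
        step = timestamps[t] - timestamps[t - 1]
        # If the previous step was missing, the gap keeps accumulating.
        gaps.append(step if mask[t - 1] == 1 else step + gaps[t - 1])
    return gaps


def temporal_decay(gap, w=0.5, b=0.0):
    """Decay factor in (0, 1]; w and b are placeholders for learned values."""
    return math.exp(-max(0.0, w * gap + b))


def combine(history_est, feature_est, beta):
    """Mix the two estimates; beta in [0, 1] is a placeholder for a learned weight."""
    return beta * feature_est + (1.0 - beta) * history_est


def complement(observed, estimate, m):
    """Keep the observed value where it exists (m = 1), else use the estimate."""
    return m * observed + (1 - m) * estimate


def bidirectional_estimate(forward_est, backward_est):
    """Average the two directions; their absolute gap can act as a consistency penalty."""
    merged = 0.5 * (forward_est + backward_est)
    penalty = abs(forward_est - backward_est)
    return merged, penalty


if __name__ == "__main__":
    gaps = time_gaps([0, 1, 2, 5], [1, 0, 0, 1])
    print(gaps)
    print([round(temporal_decay(g), 3) for g in gaps])
```

A longer gap yields a smaller decay factor, so the hidden state carries less weight when the last observation is far in the past. In a trained model the decay, the mixing weight, and both estimates would all come from learned layers; the scalars here only show how the quantities fit together.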

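Experimental results

Research questions

RQ1: How well does a bidirectional RNN impute missing values in multivariate time series without strong assumptions about the data-generating process?
RQ2: Does incorporating feature correlations and joint supervision improve imputation and downstream-task accuracy over existing methods?
RQ3: Can missing values be embedded in the RNN graph as trainable variables so that gradients propagate better during training?
RQ4: How do bidirectional dynamics affect convergence speed and imputation quality on irregularly sampled data?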

Main findings

Method	Air quality MAE	Air quality MRE%	Health-care MAE	Health-care MRE%	Human activity MAE	Human activity MRE%
Mean	55.51	77.97%	0.720	100.00%	0.767	96.43%
KNN	29.79	41.85%	0.732	101.66%	0.479	58.54%
MF	27.94	39.25%	0.622	87.68%	0.879	110.44%
MICE	27.42	38.52%	0.634	89.17%	0.477	57.94%
ImputeTS	19.58	27.51%	0.390	54.2%	0.363	45.65%
STMVL	12.12	17.40%	/	/	/	/
GRU-D	/	/	0.559	/	0.558	/
M-RNN	14.24	20.43%	0.451	62.65%	0.248	31.19%
RITS-I	12.73	18.32%	0.395	54.80%	0.240	30.10%
BRITS-I	11.58	16.66%	0.361	50.01%	0.220	27.61%
RITS	12.19	17.54%	0.300	41.89%	0.248	31.21%
BRITS	11.56	16.65%	0.281	39.14%	0.219	27.59%

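BRITS outperforms the non-RNN methods and the baseline RNN variants in imputation accuracy (MAE/MRE) on every dataset; on air quality, for example, its MAE is 11.56 versus 19.58 for ImputeTS and 14.24 for M-RNN.
Both the bidirectional dynamics and the feature correlations contribute to the improvement over the unidirectional and uncorrelated variants; on health-care, MAE falls from 0.395 (RITS-I) to 0.361 (BRITS-I) and 0.300 (RITS), and to 0.281 with both (BRITS).
Alongside the imputation gains, BRITS achieves state-of-the-art or competitive classification/regression performance (for example, higher AUC/accuracy on the health-care and activity tasks).
In the reported table, BRITS has the lowest imputation error of all tested methods on all three datasets (MAE 11.56, 0.281, and 0.219), ahead of RITS-I, RITS, BRITS-I, GRU-D, and M-RNN.
Imputation methods that rely on simple smoothing or unidirectional recurrence trail BRITS in both imputation and downstream prediction.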
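For readers unfamiliar with the two metrics in the table, the sketch below shows how MAE and MRE are commonly computed over the held-out entries. The review itself does not define them, so the formulas here are an assumption: MAE as the mean absolute error, and MRE as the total absolute error divided by the total absolute ground truth. The function names and sample values are made up for illustration.

```python
def mae(pred, truth):
    """Mean absolute error over the held-out (artificially removed) entries."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mre(pred, truth):
    """Mean relative error: total absolute error over total absolute ground truth."""
    total_error = sum(abs(p - t) for p, t in zip(pred, truth))
    return total_error / sum(abs(t) for t in truth)


if __name__ == "__main__":
    truth = [1.0, -2.0, 3.0]   # made-up held-out values
    pred = [1.5, -2.0, 2.0]    # made-up imputations
    print(mae(pred, truth), mre(pred, truth))
```

Under this definition, imputing zero for every entry gives an MRE of exactly 100%, which matches the Mean row on the health-care column if that dataset is standardized to zero mean.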


This review was generated by AI and checked by human editors.