QUICK REVIEW

[Paper Review] Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models

Vincent Le Guen, Nicolas Thome|arXiv (Cornell University)|Sep 19, 2019

Time Series Analysis and Forecasting | Cited by 105

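One-Sentence Summary

DILATE introduces a differentiable loss with separate terms for shape accuracy and temporal localization, aimed at multi-step forecasting of non-stationary time series; it outperforms MSE- and DTW-based losses on several datasets.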

ABSTRACT

This paper addresses the problem of time series forecasting for non-stationary signals and multiple future steps prediction. To handle this challenging task, we introduce DILATE (DIstortion Loss including shApe and TimE), a new objective function for training deep neural networks. DILATE aims at accurately predicting sudden changes, and explicitly incorporates two terms supporting precise shape and temporal change detection. We introduce a differentiable loss function suitable for training deep neural nets, and provide a custom back-prop implementation for speeding up optimization. We also introduce a variant of DILATE, which provides a smooth generalization of temporally-constrained Dynamic Time Warping (DTW). Experiments carried out on various non-stationary datasets reveal the very good behaviour of DILATE compared to models trained with the standard Mean Squared Error (MSE) loss function, and also to DTW and variants. DILATE is also agnostic to the choice of the model, and we highlight its benefit for training fully connected networks as well as specialized recurrent architectures, showing its capacity to improve over state-of-the-art trajectory forecasting approaches.

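Research Motivation and Objectives

Address multi-step forecasting of non-stationary time series with abrupt regime changes.
Provide a differentiable loss that separates shape error from temporal misalignment.
Enable efficient training of deep models (MLP and Seq2Seq) with DILATE.
Demonstrate better shape and temporal metrics than MSE and DTW variants on multiple datasets.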

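Proposed Method

Define a two-term differentiable loss: L_DILATE = alpha * L_shape + (1 - alpha) * L_temporal.
L_shape is a differentiable DTW-based loss that uses a softmin over warping paths (DTW_gamma).
L_temporal penalizes the temporal distortion of the optimal DTW path: it weights a smooth approximation A_gamma^* of that path by a temporal penalty matrix Omega.
Provide a custom backward implementation that reduces the computational cost of each forward/backward pass to O(k^2), where k is the forecast horizon.
Derive a tangled variant, L_DILATE^t, that folds the temporal constraint (via Omega) into DTW itself.
Show that DILATE works with both MLP and Seq2Seq architectures and across a range of datasets.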
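
The two terms listed above can be made concrete with a small NumPy sketch. It is illustrative only and is not the authors' implementation: it assumes univariate series of equal length k, a squared-difference cost matrix, the soft-DTW forward/backward recursions of Cuturi and Blondel to obtain DTW_gamma and its gradient (used here as A_gamma^*), and Omega[i, j] = (i - j)^2 / k^2. The function names and the default values of alpha and gamma are hypothetical choices made for this example.

```python
import numpy as np


def _softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    values = np.asarray(values, dtype=float)
    v_min = values.min()
    return v_min - gamma * np.log(np.sum(np.exp(-(values - v_min) / gamma)))


def soft_dtw(delta, gamma):
    """Return (DTW_gamma value, expected alignment matrix) for a cost matrix.

    The alignment matrix is the gradient of DTW_gamma with respect to delta,
    which plays the role of the smooth path A_gamma^* in this sketch.
    """
    n, m = delta.shape
    # Pad the cost matrix so cell (i, j) of the series lives at D[i, j], 1-based.
    D = np.zeros((n + 2, m + 2))
    D[1:n + 1, 1:m + 1] = delta
    R = np.full((n + 2, m + 2), np.inf)
    R[0, 0] = 0.0
    # Forward pass: O(n * m) dynamic program with a softmin instead of a min.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i, j] = D[i, j] + _softmin(
                [R[i - 1, j - 1], R[i - 1, j], R[i, j - 1]], gamma
            )
    value = R[n, m]
    # Backward pass: accumulate how much each cell contributes to the value.
    R[1:n + 1, m + 1] = -np.inf
    R[n + 1, 1:m + 1] = -np.inf
    R[n + 1, m + 1] = R[n, m]
    E = np.zeros((n + 2, m + 2))
    E[n + 1, m + 1] = 1.0
    for j in range(m, 0, -1):
        for i in range(n, 0, -1):
            a = np.exp((R[i + 1, j] - R[i, j] - D[i + 1, j]) / gamma)
            b = np.exp((R[i, j + 1] - R[i, j] - D[i, j + 1]) / gamma)
            c = np.exp((R[i + 1, j + 1] - R[i, j] - D[i + 1, j + 1]) / gamma)
            E[i, j] = E[i + 1, j] * a + E[i, j + 1] * b + E[i + 1, j + 1] * c
    return value, E[1:n + 1, 1:m + 1]


def dilate_loss(y_pred, y_true, alpha=0.5, gamma=0.01):
    """Return (total, shape, temporal) for two 1-D series of equal length."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    k = len(y_true)
    # Pairwise squared differences between predicted and true values.
    delta = (y_pred[:, None] - y_true[None, :]) ** 2
    shape, alignment = soft_dtw(delta, gamma)
    # Omega grows with the squared distance from the diagonal (assumed form).
    idx = np.arange(k)
    omega = (idx[:, None] - idx[None, :]) ** 2 / k ** 2
    temporal = float(np.sum(alignment * omega))
    total = alpha * shape + (1.0 - alpha) * temporal
    return total, shape, temporal


if __name__ == "__main__":
    # A step predicted two time steps late: the shape matches, the timing does not.
    y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
    y_late = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)
    print(dilate_loss(y_late, y_true))
```

Both passes visit each of the k x k cells once, which is where a quadratic cost per forward/backward pass comes from. For training, the same recursions would need to be written in an autodiff framework so that gradients reach the model parameters; that part is outside this sketch.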

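Experimental Results

Research Questions

RQ1: Does a differentiable two-term distortion loss improve multi-step forecasting of non-stationary time series relative to standard MSE?
RQ2: Does separating the shape and temporal components help capture sharp changes and their timing more accurately?
RQ3: Is the DILATE loss architecture-agnostic, remaining effective on both simple and sequence-based models?
RQ4: How does DILATE compare with DTW-based losses and other state-of-the-art forecasting models on real and synthetic data?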

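Key Findings

DILATE outperforms MSE on both shape (DTW) and temporal (TDI) metrics across the synthetic, ECG, and traffic datasets, with significant differences in several cases.
Compared with DTW_gamma, DILATE achieves better temporal accuracy (TDI) in all experiments and a better overall shape-time trade-off.
On the Traffic dataset, a Seq2Seq model trained with DILATE outperforms state-of-the-art models trained with MSE on both shape and temporal metrics.
When evaluated with MSE, DILATE remains competitive with, or better than, MSE training, which indicates robustness to the choice of evaluation metric.
The custom backward implementation reduces training time, and the speedup grows with the forecast horizon k.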
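
To see why the findings above report a shape metric and a timing metric alongside MSE, the following sketch scores a forecast three ways. It is a simplified illustration under stated assumptions, not the paper's evaluation code: the shape score is classic (hard) DTW on squared differences, and the timing score is the squared deviation of the optimal DTW path from the diagonal, normalized by k^2, used here as a stand-in for TDI (the paper's exact TDI definition is not reproduced). The function names are hypothetical.

```python
import numpy as np


def dtw_with_path(y_pred, y_true):
    """Classic DTW on squared differences; returns (cost, optimal path)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    n, m = len(y_pred), len(y_true)
    cost = (y_pred[:, None] - y_true[None, :]) ** 2
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first, preferring diagonal moves on ties.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return float(acc[n, m]), path[::-1]


def timing_distortion(path, k):
    """Squared off-diagonal deviation of a path, normalized by k^2 (TDI stand-in)."""
    return sum((i - j) ** 2 for i, j in path) / k ** 2


def evaluate(y_pred, y_true):
    """Return a dict with MSE, DTW cost, and the timing-distortion proxy."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    dtw_cost, path = dtw_with_path(y_pred, y_true)
    return {
        "mse": float(np.mean((y_pred - y_true) ** 2)),
        "dtw": dtw_cost,
        "timing": timing_distortion(path, len(y_true)),
    }


if __name__ == "__main__":
    y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
    y_late = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)
    print(evaluate(y_late, y_true))
```

In this toy case the late step has the correct shape, so the DTW cost is zero, while the timing proxy is positive and MSE is nonzero. MSE alone cannot tell whether its error comes from a wrong shape or a wrong time, which is the ambiguity the separate metrics are meant to resolve.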


This review was generated by AI and reviewed by a human editor.