QUICK REVIEW

[Paper Review] Explaining Time Series Predictions with Dynamic Masks

Jonathan Crabbé, Mihaela van der Schaar|arXiv (Cornell University)|Jun 9, 2021

Explainable Artificial Intelligence (XAI) | References: 44 | Cited by: 26

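One-Sentence Summary

Dynamask introduces instance-wise dynamic masks that explain multivariate time series predictions by perturbing the input in a time-aware way, and it uses an information-theoretic framework to pursue parsimony and legibility.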

ABSTRACT

How can we explain the predictions of a machine learning model? When the data is structured as a multivariate time series, this question induces additional difficulties such as the necessity for the explanation to embody the time dependency and the large number of inputs. To address these challenges, we propose dynamic masks (Dynamask). This method produces instance-wise importance scores for each feature at each time step by fitting a perturbation mask to the input sequence. In order to incorporate the time dependency of the data, Dynamask studies the effects of dynamic perturbation operators. In order to tackle the large number of inputs, we propose a scheme to make the feature selection parsimonious (to select no more feature than necessary) and legible (a notion that we detail by making a parallel with information theory). With synthetic and real-world data, we demonstrate that the dynamic underpinning of Dynamask, together with its parsimony, offer a neat improvement in the identification of feature importance over time. The modularity of Dynamask makes it ideal as a plug-in to increase the transparency of a wide range of machine learning models in areas such as medicine and finance, where time series are abundant.

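Motivation and Objectives

Motivate the need for time series explanations that preserve temporal context.
Define masks based on dynamic perturbations to identify salient features across time.
Improve parsimony and legibility by enforcing sparse, near-binary masks and quantifying their information content.
Provide a framework for comparing time series saliency methods using information-theoretic metrics.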

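Proposed Method

Define a mask M of shape T x d_X, where m_{t,i} denotes the importance of feature i at time t for the prediction f(X).
Perturb X with an operator Pi_M that depends on M: the perturbation is weaker where m_{t,i} is high, so important inputs are left largely intact while unimportant ones are altered. Dynamic perturbations draw on neighboring time steps (a window set by W1 and W2).
Propose several perturbation operators: pi^g (temporal Gaussian blur), pi^m (fade to moving average), and pi^p (fade to a moving average computed from past time steps only).
Optimize M by minimizing the prediction shift between f(X) and f(Pi_M(X)) while adding sparsity and temporal smoothness terms: L_e (prediction error), L_a (area-based sparsity, regularized through vecsort), and L_c (temporal continuity).
Define the extremal mask M_{a*} as the mask with the smallest area a* whose error stays below a threshold epsilon.
Introduce information-theoretic metrics for masks: the mask information I_M(A) = -Σ_{(t,i) ∈ A} ln(1 - m_{t,i}) and the mask entropy S_M(A) = -Σ_{(t,i) ∈ A} [ m_{t,i} ln m_{t,i} + (1 - m_{t,i}) ln(1 - m_{t,i}) ], which satisfy properties such as positivity, additivity, and monotonicity.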
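
The fade-to-moving-average operator and the three loss terms listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the symmetric `window` parameter (standing in for W1 and W2), the weights `lam_a` and `lam_c`, the squared-error form of L_e, and the exact forms chosen for L_a and L_c are all assumptions made here for illustration.

```python
import numpy as np

def moving_average_perturbation(X, M, window=2):
    """Fade-to-moving-average sketch: m = 1 keeps x, m = 0 replaces it with a local mean.

    `window` is a hypothetical symmetric stand-in for the W1/W2 window.
    """
    T = X.shape[0]
    out = np.empty_like(X, dtype=float)
    for t in range(T):
        lo, hi = max(0, t - window), min(T, t + window + 1)
        mu = X[lo:hi].mean(axis=0)  # average over neighboring time steps
        out[t] = M[t] * X[t] + (1.0 - M[t]) * mu
    return out

def mask_objective(f, X, M, area, lam_a=1.0, lam_c=1.0, window=2):
    """Return the total loss and its parts (L_e, L_a, L_c) for a candidate mask M."""
    y_ref = f(X)
    y_pert = f(moving_average_perturbation(X, M, window))
    # L_e: prediction shift caused by the perturbation (squared error assumed here).
    L_e = float(np.sum((y_pert - y_ref) ** 2))
    # L_a: distance between the sorted mask and a reference vector made of
    # zeros followed by a fraction `area` of ones (assumed form of the vecsort term).
    m_sorted = np.sort(M.ravel())
    n_ones = int(round(area * M.size))
    r_a = np.concatenate([np.zeros(M.size - n_ones), np.ones(n_ones)])
    L_a = float(np.sum((m_sorted - r_a) ** 2))
    # L_c: penalize jumps of the mask between consecutive time steps.
    L_c = float(np.sum(np.abs(np.diff(M, axis=0))))
    return L_e + lam_a * L_a + lam_c * L_c, (L_e, L_a, L_c)
```

In practice the mask would be fitted by gradient descent on this objective with an automatic differentiation library; the NumPy version only shows how each term is computed for a given M.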

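Experimental Results

Research Questions

RQ1: How can temporal context be incorporated into saliency explanations for time series models?
RQ2: Can parsimonious, legible masks be created that identify the minimal subset of inputs needed to explain a prediction?
RQ3: How can information-theoretic metrics quantify the quality and interpretability of time series saliency masks?
RQ4: How does Dynamask compare with existing saliency methods on synthetic and real-world time series data?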

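Key Findings

Dynamask outperforms the baselines (FO, FP, IG, SVS) on time series saliency tasks, achieving higher AUR and markedly higher I_M(A) in both white-box and black-box experiments, with reasonably competitive AUP.
Dynamic perturbations identify salient features over time better than static approaches.
The extremal mask approach achieves strong sparsity (low mask area a) while keeping the prediction shift within the specified tolerance.
Mask information increases when the salient regions are captured well; mask entropy decreases as masks become closer to binary and therefore more legible.
The framework offers a plug-in method with a modular perturbation design, suited to explaining time series in medicine and finance.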
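
To make the mask information and mask entropy behavior concrete, the two metrics can be computed directly from their definitions. The sketch below is a minimal illustration, not code from the paper: the function names, the optional boolean index `A`, and the `eps` clipping (added here only to avoid log(0)) are assumptions.

```python
import numpy as np

def mask_information(M, A=None, eps=1e-12):
    """I_M(A) = -sum over A of ln(1 - m). A is an optional boolean index into M."""
    m = M if A is None else M[A]
    # Clipping is an assumption for numerical safety; m = 1 would give infinity.
    return float(-np.sum(np.log(np.clip(1.0 - m, eps, 1.0))))

def mask_entropy(M, A=None, eps=1e-12):
    """S_M(A) = -sum over A of [m ln m + (1 - m) ln(1 - m)]."""
    m = np.clip(M if A is None else M[A], eps, 1.0 - eps)
    return float(-np.sum(m * np.log(m) + (1.0 - m) * np.log(1.0 - m)))

# A fuzzy mask (all 0.5) versus a near-binary mask of the same shape.
fuzzy = np.full((4, 3), 0.5)
near_binary = np.where(np.arange(12).reshape(4, 3) % 4 == 0, 0.99, 0.01)

print("entropy, fuzzy mask:      ", mask_entropy(fuzzy))
print("entropy, near-binary mask:", mask_entropy(near_binary))
print("information, near-binary: ", mask_information(near_binary))
```

The fuzzy mask reaches the maximum entropy of ln 2 per entry, while the near-binary mask has much lower entropy, which matches the reading of low entropy as high legibility.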


This review was generated by AI and checked by a human editor.