QUICK REVIEW

[Paper Review] RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks

Haowen Hou, Fuxun Yu | arXiv (Cornell University) | Jan 17, 2024

Time Series Analysis and Forecasting | Cited by 9

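One-Sentence Summary

RWKV-TS is an efficient RNN-based time series model with O(L) time complexity and memory usage; it performs competitively with state-of-the-art Transformer- and CNN-based models while reducing latency and memory use.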

ABSTRACT

Traditional Recurrent Neural Network (RNN) architectures, such as LSTM and GRU, have historically held prominence in time series tasks. However, they have recently seen a decline in their dominant position across various time series tasks. As a result, recent advancements in time series forecasting have seen a notable shift away from RNNs towards alternative architectures such as Transformers, MLPs, and CNNs. To go beyond the limitations of traditional RNNs, we design an efficient RNN-based model for time series tasks, named RWKV-TS, with three distinctive features: (i) A novel RNN architecture characterized by $O(L)$ time complexity and memory usage. (ii) An enhanced ability to capture long-term sequence information compared to traditional RNNs. (iii) High computational efficiency coupled with the capacity to scale up effectively. Through extensive experimentation, our proposed RWKV-TS model demonstrates competitive performance when compared to state-of-the-art Transformer-based or CNN-based models. Notably, RWKV-TS exhibits not only comparable performance but also demonstrates reduced latency and memory utilization. The success of RWKV-TS encourages further exploration and innovation in leveraging RNN-based approaches within the domain of Time Series. The combination of competitive performance, low latency, and efficient memory usage positions RWKV-TS as a promising avenue for future research in time series tasks. Code is available at: https://github.com/howard-hou/RWKV-TS

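Motivation and Objectives

Revisit the role of RNNs in time series tasks despite the dominance of Transformers.
Propose an RNN-based architecture (RWKV-TS) with linear time and space complexity for time series data.
Empirically validate RWKV-TS on forecasting, imputation, anomaly detection, classification, and few-shot learning tasks.
Show that RWKV-TS reaches accuracy competitive with SOTA models while reducing latency and memory usage.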

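Proposed Method

Apply instance normalization and patching to convert multivariate series into input tokens for RWKV-TS.
Use an RWKV backbone with time-mixing and channel-mixing sub-blocks, equipped with a multi-head WKV operator for linear-time, attention-like computation.
Provide both a parallel mode (the primary mode) and a recurrent mode, show that the two are formally equivalent, and use them for efficient training and inference.
Apply SiLU gating and layer normalization to the output, then use a flatten-and-project head trained with MSE loss for forecasting.
Analyze the O(L) time and space complexity and the encoder-only architecture, which avoids the error accumulation common in traditional RNNs.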
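The tokenization step and the linear-time recurrence listed above can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the function names (`instance_normalize`, `patchify`, `wkv_direct`, `wkv_recurrent`) are hypothetical, and the WKV formula is the single-channel form published for RWKV-4, with a scalar decay `w > 0` and a current-step bonus `u`. The multi-head operator and the numerically stabilized kernels used by RWKV-TS are not reproduced here and may differ.

```python
import math

def instance_normalize(series, eps=1e-5):
    """Zero-mean, unit-variance scaling computed from this one series only."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in series]

def patchify(series, patch_len, stride):
    """Cut a series into (possibly overlapping) windows; each window is one token."""
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]

def wkv_direct(k, v, w, u):
    """O(L^2) reference: evaluate the decayed weighted average from its definition."""
    out = []
    for t in range(len(k)):
        # The current step gets the bonus u instead of a decay factor.
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out

def wkv_recurrent(k, v, w, u):
    """O(L) form: carry two running sums instead of revisiting the past."""
    a = 0.0  # decayed sum of exp(k_i) * v_i over past steps
    b = 0.0  # decayed sum of exp(k_i) over past steps
    decay = math.exp(-w)
    out = []
    for k_t, v_t in zip(k, v):
        bonus = math.exp(u + k_t)
        out.append((a + bonus * v_t) / (b + bonus))
        a = decay * a + math.exp(k_t) * v_t
        b = decay * b + math.exp(k_t)
    return out

tokens = patchify([1, 2, 3, 4, 5, 6], patch_len=4, stride=2)
# tokens == [[1, 2, 3, 4], [3, 4, 5, 6]]

k = [0.1, -0.3, 0.5, 0.0]
v = [1.0, 2.0, -1.0, 0.5]
direct = wkv_direct(k, v, w=0.5, u=0.2)
recurrent = wkv_recurrent(k, v, w=0.5, u=0.2)
```

The two WKV functions return the same values up to floating-point error, which is the equivalence between the parallel-style definition and the recurrent mode in miniature. The recurrent version touches each time step once and keeps only two scalars of state, which is where the O(L) time and constant per-step memory come from. A practical implementation would also subtract a running maximum before exponentiating to avoid overflow; that step is omitted here for brevity.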

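Experimental Results

Research Questions

RQ1: Can RWKV-TS match or exceed state-of-the-art Transformer- or CNN-based models across diverse time series tasks?
RQ2: Does the linear-time RWKV-TS architecture offer practical advantages in training/inference latency and memory usage without sacrificing accuracy?
RQ3: How does RWKV-TS perform in long-term forecasting, short-term forecasting, imputation, anomaly detection, classification, and few-shot settings?
RQ4: When designed around time-mixing and channel-mixing mechanisms, can RNN-based methods remain competitive on time series with long-range dependencies?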

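Key Findings

RWKV-TS achieves performance comparable to state-of-the-art Transformer- and CNN-based models across multiple time series tasks.
In long-term forecasting, RWKV-TS reduces average MSE by 12.58% and average MAE by 4.38% relative to TimesNet.
In the efficiency analysis, RWKV-TS-768 (24M parameters) takes 0.067 s per batch for training and 0.018 s for inference and, with a comparable or smaller parameter count, is faster than several baselines.
RWKV-TS performs strongly in short-term forecasting (M4) and few-shot forecasting, often outperforming well-known baselines and showing a competitive edge in comparisons with TimesNet and N-BEATS.
In time series classification, RWKV-TS reaches an average accuracy of 73.10% on the UEA datasets, close to TimesNet and above most baselines.
RWKV-TS maintains solid anomaly detection performance, with an average F1 score on standard datasets close to the state of the art.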
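For reference, a relative reduction such as the 12.58% MSE figure is conventionally computed as shown below. This is an assumption about how the percentages were obtained: the summary does not state whether the averages are taken over datasets, forecast horizons, or both.

$$
\Delta_{\mathrm{MSE}} = \frac{\overline{\mathrm{MSE}}_{\mathrm{TimesNet}} - \overline{\mathrm{MSE}}_{\mathrm{RWKV\text{-}TS}}}{\overline{\mathrm{MSE}}_{\mathrm{TimesNet}}} \times 100\%
$$

The MAE figure would follow the same form with MAE in place of MSE.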


This review was generated by AI and checked by human editors.