QUICK REVIEW

[Paper Review] SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters

Shengsheng Lin, Weiwei Lin|arXiv (Cornell University)|May 2, 2024
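Stock Market Forecasting Methods | Cited by 8
One-Sentence Summary

SparseTSF is an ultra-lightweight long-term time series forecasting model (<1k parameters) that uses Cross-Period Sparse Forecasting to decouple periodicity from trend and achieve competitive accuracy.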

ABSTRACT

This paper introduces SparseTSF, a novel, extremely lightweight model for Long-term Time Series Forecasting (LTSF), designed to address the challenges of modeling complex temporal dependencies over extended horizons with minimal computational resources. At the heart of SparseTSF lies the Cross-Period Sparse Forecasting technique, which simplifies the forecasting task by decoupling the periodicity and trend in time series data. This technique involves downsampling the original sequences to focus on cross-period trend prediction, effectively extracting periodic features while minimizing the model's complexity and parameter count. Based on this technique, the SparseTSF model uses fewer than *1k* parameters to achieve competitive or superior performance compared to state-of-the-art models. Furthermore, SparseTSF showcases remarkable generalization capabilities, making it well-suited for scenarios with limited computational resources, small samples, or low-quality data. The code is publicly available at this repository: https://github.com/lss-1138/SparseTSF.

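Motivation and Objectives

  • Address the challenge of accurate long-term forecasting with minimal computational resources.
  • Exploit the inherent periodicity of the data to decouple periodicity from trend.
  • Develop a lightweight model that maintains competitive or even superior performance with very few parameters.
  • Demonstrate generalization ability and efficiency advantages in low-resource scenarios.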

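Proposed Method

  • Introduces Cross-Period Sparse Forecasting: the time series is downsampled into w subsequences (w being the period), and a single linear predictor with shared parameters is applied to each subsequence.
  • Applies a sliding aggregation (1D convolution) before the sparse forecasting step to mitigate information loss and sensitivity to outliers.
  • Normalizes the input by subtracting its mean and adds the mean back to the output to mitigate distribution shift.
  • Trains with a simple mean squared error (MSE) loss.
  • Provides a theoretical analysis of the parameter efficiency and effectiveness of the Sparse technique.
  • Evaluates against state-of-the-art LTSF models on standard datasets using the Channel Independent (CI) strategy.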
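
The steps listed above can be sketched for a single channel in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `sparse_forecast` and `mse`, the kernel length `2 * (w // 2) + 1`, the zero padding, and the residual connection around the convolution are all assumptions made for the example. `np.convolve` is used in place of a learned 1D convolution layer.

```python
import numpy as np


def sparse_forecast(x, weight, kernel, w):
    """Forecast one univariate series with cross-period sparse forecasting.

    x      : (L,) look-back window, with L divisible by w
    weight : (L // w, H // w) linear map shared by all w subsequences
    kernel : (2 * (w // 2) + 1,) sliding-aggregation kernel (assumed shape)
    w      : period length
    """
    n, m = weight.shape
    assert x.shape == (n * w,), "look-back length must equal n * w"

    # Normalize: subtract the window mean (added back at the end).
    mean = x.mean()
    z = x - mean

    # Sliding aggregation: zero-padded 1D convolution plus an assumed residual.
    z = z + np.convolve(z, kernel, mode="same")

    # Downsample: row p of `sub` holds phase p of every period, shape (w, n).
    sub = z.reshape(n, w).T

    # Sparse forecasting: the same (n, m) weights act on every subsequence.
    pred = sub @ weight  # shape (w, m)

    # Upsample: interleave the w phase forecasts into a length-H series.
    return pred.T.reshape(m * w) + mean


def mse(pred, target):
    """Mean squared error, the training loss named in the list above."""
    return float(np.mean((pred - target) ** 2))


# Toy check: a perfectly periodic series with a constant offset of 10.
w, n, m = 4, 3, 2
pattern = np.array([1.0, 3.0, 2.0, 0.0])
x = np.tile(pattern, n) + 10.0
weight = np.full((n, m), 1.0 / n)    # average each phase across periods
kernel = np.zeros(2 * (w // 2) + 1)  # aggregation switched off
y = sparse_forecast(x, weight, kernel, w)
target = np.tile(pattern, m) + 10.0
loss = mse(y, target)
```

With averaging weights and the aggregation kernel set to zero, the forecast simply repeats the periodic pattern, so `loss` is numerically zero. In a trained model, `weight` and `kernel` would instead be learned by minimizing `mse` over many windows.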

Experimental Results

Research Questions

  • RQ1: Can Cross-Period Sparse Forecasting decouple periodicity from trend to enable accurate long-horizon forecasts with extremely few parameters?
  • RQ2: How does SparseTSF perform relative to state-of-the-art LTSF models on mainstream benchmarks while using sub-1k parameters?
  • RQ3: What are the efficiency gains (parameters, MACs, memory, runtime) and generalization capabilities of SparseTSF?
  • RQ4: How sensitive is performance to the chosen period w and how well does SparseTSF generalize across domains with the same periodicity?

Key Findings

Model             | Parameters | MACs    | Max Memory (MB) | Training Time (s)
Informer (2021)   | 12.53 M    | 3.97 G  | 969.7           | 70.1
Autoformer (2021) | 12.22 M    | 4.41 G  | 2631.2          | 107.7
FEDformer (2022b) | 17.98 M    | 4.41 G  | 1102.5          | 238.7
FiLM (2022a)      | 12.22 M    | 4.41 G  | 1773.9          | 78.3
PatchTST (2023)   | 6.31 M     | 11.21 G | 10882.3         | 290.3
DLinear (2023)    | 485.3 K    | 156.0 M | 123.8           | 25.4
FITS (2024)       | 10.5 K     | 79.9 M  | 496.7           | 35.0
SparseTSF (Ours)  | 0.92 K     | 12.71 M | 125.2           | 31.3

  • SparseTSF achieves competitive or superior MSE performance compared to strong baselines on multiple LTSF datasets with under 1k parameters.
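  • The Sparse technique cuts the parameter count by one to four orders of magnitude relative to the baselines in the efficiency table (about 11x fewer than FITS, over 10,000x fewer than Informer) while maintaining robustness (low standard deviation across runs).
  • Efficiency metrics show SparseTSF uses ~0.92k parameters and ~12.7M MACs. Its peak memory (125.2 MB) and training time (31.3 s) are lower than those of every baseline except DLinear, which is comparable (123.8 MB, 25.4 s).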
  • Ablation studies confirm the Sparse technique substantially improves Linear, Transformer, and GRU baselines, indicating broad applicability of the approach.
  • Cross-domain generalization experiments show SparseTSF outperforms several baselines when transferring between datasets with the same daily periodicity.
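
The sub-1k parameter figure can be sanity-checked with a short calculation. The sketch below rests on assumptions that this review does not state: a look-back length and horizon of 720, a period of w = 24, a bias-free shared linear layer of size (L // w) x (H // w), and a single bias-free aggregation kernel of length 2 * (w // 2) + 1. The helper name `sparse_tsf_param_count` is hypothetical. The baseline figures are copied from the efficiency table in this section.

```python
def sparse_tsf_param_count(lookback, horizon, period):
    """Parameter count under the assumptions stated above (no bias terms)."""
    linear = (lookback // period) * (horizon // period)  # shared linear layer
    conv = 2 * (period // 2) + 1                         # one aggregation kernel
    return linear + conv


# Assumed setting (not given in this review): L = H = 720, w = 24.
count = sparse_tsf_param_count(720, 720, 24)  # 30 * 30 + 25

# Baseline parameter counts as reported in the efficiency table.
baseline_params = {
    "Informer": 12.53e6,
    "Autoformer": 12.22e6,
    "FEDformer": 17.98e6,
    "FiLM": 12.22e6,
    "PatchTST": 6.31e6,
    "DLinear": 485.3e3,
    "FITS": 10.5e3,
}
sparse_params = 0.92e3  # SparseTSF, as reported in the table

# How many times more parameters each baseline uses than SparseTSF.
reduction = {name: p / sparse_params for name, p in baseline_params.items()}
for name, factor in sorted(reduction.items(), key=lambda kv: kv[1]):
    print(f"{name:<11} {factor:>10.1f}x")
```

Under these assumptions the count comes to 925, which is consistent with the 0.92 K reported in the table. The reduction factors range from roughly 11x (FITS) to well over 10,000x (the larger Transformer-based models).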

This review was generated by AI and reviewed by human editors.