QUICK REVIEW

[Paper Review] SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters

Shengsheng Lin, Weiwei Lin|arXiv (Cornell University)|May 2, 2024

Stock Market Forecasting Methods | Cited by 8

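One-Sentence Summary

SparseTSF is an ultra-lightweight long-term time series forecasting model (<1k parameters) that uses Cross-Period Sparse Forecasting to decouple periodicity from trend and achieve competitive accuracy.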

ABSTRACT

This paper introduces SparseTSF, a novel, extremely lightweight model for Long-term Time Series Forecasting (LTSF), designed to address the challenges of modeling complex temporal dependencies over extended horizons with minimal computational resources. At the heart of SparseTSF lies the Cross-Period Sparse Forecasting technique, which simplifies the forecasting task by decoupling the periodicity and trend in time series data. This technique involves downsampling the original sequences to focus on cross-period trend prediction, effectively extracting periodic features while minimizing the model's complexity and parameter count. Based on this technique, the SparseTSF model uses fewer than *1k* parameters to achieve competitive or superior performance compared to state-of-the-art models. Furthermore, SparseTSF showcases remarkable generalization capabilities, making it well-suited for scenarios with limited computational resources, small samples, or low-quality data. The code is publicly available at this repository: https://github.com/lss-1138/SparseTSF.

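Research Motivation and Goals

Address the challenge of accurate long-term forecasting with minimal computational resources.
Exploit the inherent periodicity of the data to decouple periodicity from trend.
Develop a lightweight model that maintains competitive or even superior performance with very few parameters.
Demonstrate generalization ability and efficiency advantages in low-resource scenarios.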

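Proposed Method

Introduces Cross-Period Sparse Forecasting: the time series is downsampled with period w into w subsequences, and a single linear predictor with shared parameters is applied to each subsequence.
Applies sliding aggregation (a 1D convolution) before sparse forecasting to mitigate information loss and sensitivity to outliers.
Normalizes the input by subtracting its mean and adds the mean back to the output to mitigate distribution shift.
Trains with a simple mean squared error (MSE) loss.
Provides a theoretical analysis of the parameter efficiency and effectiveness of the Sparse technique.
Evaluates against state-of-the-art LTSF models on standard datasets using the Channel Independent (CI) strategy.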
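
The steps listed above can be sketched as a single forward pass for one channel. This is a minimal NumPy illustration, not the authors' implementation: the function name `sparse_tsf_forward`, the zero-padded "same" convolution, the residual addition of the aggregated signal, and the bias-free weight matrix are all assumptions made for the example. The weights are fixed by hand here rather than learned with the MSE loss.

```python
import numpy as np

def sparse_tsf_forward(x, conv_kernel, weight, w):
    """Illustrative single-channel forward pass (hypothetical helper).

    x:           look-back window, shape (L,), with L divisible by w
    conv_kernel: sliding-aggregation kernel, shape (2 * (w // 2) + 1,)
    weight:      shared linear predictor, shape (L // w, H // w)
    w:           assumed period of the series
    """
    L = x.shape[0]
    assert L % w == 0, "this sketch assumes the look-back is a multiple of w"
    n = L // w

    # Mean normalization: subtract the window mean, add it back at the end.
    mean = x.mean()
    x = x - mean

    # Sliding aggregation: 1D convolution with zero padding, added to the
    # input as a residual (the residual form is an assumption of this sketch).
    x = x + np.convolve(x, conv_kernel, mode="same")

    # Downsample into w subsequences: row j holds x[j], x[j + w], x[j + 2w], ...
    sub = x.reshape(n, w).T            # shape (w, n)

    # One linear predictor shared by all w subsequences.
    pred = sub @ weight                # shape (w, H // w)

    # Upsample: interleave the w predicted subsequences into one sequence.
    y = pred.T.reshape(-1)             # shape (H,)
    return y + mean

# Toy check: a perfectly periodic input with hand-set weights.
w, n, m = 4, 6, 3                      # period, look-back periods, forecast periods
pattern = np.array([1.0, 3.0, 2.0, 0.0])
x = np.tile(pattern, n)                # look-back of length 24
kernel = np.zeros(2 * (w // 2) + 1)    # zero kernel: aggregation leaves x unchanged
weight = np.full((n, m), 1.0 / n)      # each forecast step averages its subsequence
forecast = sparse_tsf_forward(x, kernel, weight, w)
print(forecast.shape)                  # (12,)
```

With these hand-set weights each subsequence is constant, so the forecast simply repeats the input pattern for three more periods. In a trained model the kernel and the weight matrix would be learned.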

Experimental Results

Research Questions

RQ1: Can Cross-Period Sparse Forecasting decouple periodicity from trend to enable accurate long-horizon forecasts with extremely few parameters?
RQ2: How does SparseTSF perform relative to state-of-the-art LTSF models on mainstream benchmarks while using sub-1k parameters?
RQ3: What are the efficiency gains (parameters, MACs, memory, runtime) and generalization capabilities of SparseTSF?
RQ4: How sensitive is performance to the chosen period w, and how well does SparseTSF generalize across domains with the same periodicity?

Key Findings

Model	Parameters	MACs	Max memory (MB)	Training time (s)
Informer (2021)	12.53 M	3.97 G	969.7	70.1
Autoformer (2021)	12.22 M	4.41 G	2631.2	107.7
FEDformer (2022b)	17.98 M	4.41 G	1102.5	238.7
FiLM (2022a)	12.22 M	4.41 G	1773.9	78.3
PatchTST (2023)	6.31 M	11.21 G	10882.3	290.3
DLinear (2023)	485.3 K	156.0 M	123.8	25.4
FITS (2024)	10.5 K	79.9 M	496.7	35.0
SparseTSF (Ours)	0.92 K	12.71 M	125.2	31.3

SparseTSF achieves competitive or superior MSE performance compared to strong baselines on multiple LTSF datasets with under 1k parameters.
The Sparse technique enables order-of-magnitude parameter reductions (vs. mainstream models) while maintaining robustness (low standard deviation across runs).
Efficiency metrics show SparseTSF uses ~0.92k parameters and ~12.7M MACs, with markedly lower memory and training time than many baselines.
Ablation studies confirm the Sparse technique substantially improves Linear, Transformer, and GRU baselines, indicating broad applicability of the approach.
Cross-domain generalization experiments show SparseTSF outperforms several baselines when transferring between datasets with the same daily periodicity.
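
To make the parameter figures concrete, the short calculation below counts the weights of a sparse predictor and of a single dense linear layer. It is a sketch under stated assumptions: the look-back length of 720, the horizon of 720, and the period w = 24 are not given in this summary, and the bias-free layers and the kernel size 2 * (w // 2) + 1 are likewise assumed for illustration. Both function names are hypothetical.

```python
def sparse_tsf_param_count(lookback, horizon, period):
    """Count weights for the sparse predictor (assumes bias-free layers)."""
    # Shared linear layer maps L // w inputs to H // w outputs.
    linear = (lookback // period) * (horizon // period)
    # Sliding-aggregation kernel; the size 2 * (w // 2) + 1 is an assumption.
    conv = 2 * (period // 2) + 1
    return linear + conv

def dense_linear_param_count(lookback, horizon):
    """Count weights for one dense linear layer with bias, for comparison."""
    return lookback * horizon + horizon

# Assumed setting: look-back 720, horizon 720, period 24.
sparse = sparse_tsf_param_count(720, 720, 24)
dense = dense_linear_param_count(720, 720)
print(sparse)  # 925
print(dense)   # 519120
```

Under these assumptions the sparse count comes to 925, which is in line with the roughly 0.92 K reported in the table. A dense layer of the same input and output size needs several hundred times more weights.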


This review was generated by AI and reviewed by human editors.