[Paper Review] Benchmarking State Space Models, Transformers, and Recurrent Networks for US Grid Forecasting
A comprehensive benchmark comparing five deep learning architectures (two state space models, two Transformers, and an LSTM) for hourly US grid load forecasting across six ISOs, with and without weather covariates, plus generalization to solar, wind, and prices. No single winner; performance depends on data and task.
Selecting the right deep learning model for power grid forecasting is challenging, as performance heavily depends on the data available to the operator. This paper presents a comprehensive benchmark of five modern neural architectures: two state space models (PowerMamba, S-Mamba), two Transformers (iTransformer, PatchTST), and a traditional LSTM. We evaluate these models on hourly electricity demand across six diverse US power grids for forecast windows between 24 and 168 hours. To ensure a fair comparison, we adapt each model with specialized temporal processing and a modular layer that cleanly integrates weather covariates. Our results reveal that there is no single best model for all situations. When forecasting using only historical load, PatchTST and the state space models provide the highest accuracy. However, when explicit weather data is added to the inputs, the rankings reverse: iTransformer improves its accuracy three times more efficiently than PatchTST. By controlling for model size, we confirm that this advantage stems from the architecture's inherent ability to mix information across different variables. Extending our evaluation to solar generation, wind power, and wholesale prices further demonstrates that model rankings depend on the forecast task: PatchTST excels on highly rhythmic signals like solar, while state space models are better suited for the chaotic fluctuations of wind and price. Ultimately, this benchmark provides grid operators with actionable guidelines for selecting the optimal forecasting architecture based on their specific data environments.
Research Motivation and Objectives
- Assess the predictive accuracy of five neural architectures (PowerMamba, S-Mamba, iTransformer, PatchTST, LSTM) on hourly system-level load across six US ISOs under consistent preprocessing and training protocols.
- Evaluate how weather covariates, aligned with thermal lag, affect forecast accuracy across architectures.
- Examine architecture differences in handling multivariate inputs and different grid-related forecasting tasks (load, solar, wind, prices).
- Provide actionable guidelines for grid operators on selecting forecasting models based on data characteristics and available covariates.
Proposed Method
- Benchmark five models spanning three families: two state space models (PowerMamba, S-Mamba), two Transformers (iTransformer, PatchTST), and an LSTM.
- Adapt each model with temporal embeddings (hour, day) and bidirectional encoding for the fixed-lookback window.
- Introduce architecture-matched weather fusion layers that can be toggled on/off without retraining to compare weather integration fairly.
- Evaluate using identical preprocessing and a walk-forward backtest for weather experiments, across forecast horizons W = 24, 48, 72, 96, 168 hours.
- Measure performance with MSE (%), MAPE (%), and forecast-error tails (P0.5, P99.5) to characterize central tendency and tails.
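The evaluation protocol above can be illustrated with a minimal sketch, not the paper's implementation. It assumes that MAPE is the mean absolute percentage error, that the tail metrics P0.5 and P99.5 are percentiles of the signed percentage error (forecast minus actual), and that a walk-forward backtest slides a fixed lookback window forward in time. All function names, the lookback length, and the step size are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    total = sum(abs((f - a) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 <= q <= 100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def error_tails(actual: Sequence[float], forecast: Sequence[float]) -> Tuple[float, float]:
    """P0.5 and P99.5 of the signed percentage error (assumed definition)."""
    errors = [100.0 * (f - a) / a for a, f in zip(actual, forecast)]
    return percentile(errors, 0.5), percentile(errors, 99.5)


def walk_forward_windows(
    n_hours: int, lookback: int, horizon: int, step: int
) -> Iterator[Tuple[range, range]]:
    """Yield (input_hours, target_hours) index ranges that only ever move forward in time."""
    start = 0
    while start + lookback + horizon <= n_hours:
        split = start + lookback
        yield range(start, split), range(split, split + horizon)
        start += step


# Hypothetical usage: one week of lookback, a 24-hour horizon, advancing one day at a time.
windows: List[Tuple[range, range]] = list(
    walk_forward_windows(n_hours=24 * 30, lookback=168, horizon=24, step=24)
)
print(len(windows), "backtest windows")
```

Each target range starts exactly where its input range ends, so no future observations leak into the inputs. Reporting the two tail percentiles alongside MAPE shows whether a model with a good average error still produces occasional large over- or under-forecasts.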

Experimental Results
Research Questions
- RQ1: Which architecture yields the best load forecast accuracy across diverse US ISOs under load-only conditions?
- RQ2: How does the inclusion of weather covariates impact forecast accuracy for each architecture, and which fusion strategy is most effective?
- RQ3: Are architectural advantages consistent across different grid-related forecasting tasks (load, solar, wind, wholesale prices) or task-dependent?
- RQ4: Does controlling for model capacity (parameter count) change the relative performance of architectures when weather covariates are included?
- RQ5: What practical guidelines can grid operators derive for selecting forecasting models based on data environment and operational horizon?
Key Findings
- PatchTST most often achieves the best load-only MAPE across grids, with strong performance on several horizons.
- State space models (PowerMamba, S-Mamba) are competitive and offer favorable inference complexity, linear (O(n)) in sequence length, for operational deployment.
- iTransformer underperforms with a single load variate but gains when weather covariates are incorporated, highlighting the importance of cross-variate attention.
- Weather covariates generally improve all architectures, with iTransformer benefiting the most on weather-sensitive grids; PatchTST shows smaller gains in some cases due to its strong baseline.
- Capacity-controlled comparisons that match hidden dimensions attribute iTransformer's weather-driven gains to its cross-variate mixing rather than to model size, confirming that the benefit of multivariate integration is architecture-dependent.
- Generalization to solar, wind, and prices indicates PatchTST excels for highly rhythmic signals (solar), while state space models better capture chaotic fluctuations (wind, prices).
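The contrast between cross-variate attention and a strong single-variate baseline comes down to how each Transformer turns a lookback window into tokens. The sketch below is a simplified illustration of the two published tokenization schemes, not the benchmark's code: it omits the learned embeddings, the end-padding used in the original PatchTST, and the weather fusion layers the paper adds. The function names, the toy values, and the patch length and stride are hypothetical.

```python
from typing import List

Window = List[List[float]]  # indexed as window[time_step][variate]


def variate_tokens(window: Window) -> List[List[float]]:
    """iTransformer-style: one token per variate, holding that variate's whole lookback.

    Attention then runs across variates, so load and weather can exchange information.
    """
    return [list(column) for column in zip(*window)]


def patch_tokens(window: Window, patch_len: int, stride: int) -> List[List[List[float]]]:
    """PatchTST-style: each variate is cut into temporal patches independently.

    Attention then runs across the patches of a single variate (channel independence).
    """
    tokens = []
    for column in zip(*window):
        last_start = len(column) - patch_len
        patches = [list(column[i:i + patch_len]) for i in range(0, last_start + 1, stride)]
        tokens.append(patches)
    return tokens


# Toy window: 6 hourly steps of [load in MW, temperature in deg C] (made-up values).
window = [
    [100.0, 20.0],
    [110.0, 21.0],
    [120.0, 22.0],
    [130.0, 23.0],
    [125.0, 22.5],
    [115.0, 21.5],
]

by_variate = variate_tokens(window)
by_patch = patch_tokens(window, patch_len=4, stride=2)
print(len(by_variate), "variate tokens of length", len(by_variate[0]))
print(len(by_patch[0]), "patches per variate, each of length", len(by_patch[0][0]))
```

With load as the only variate, the variate-token view collapses to a single token and leaves attention nothing to mix, which is consistent with iTransformer's weaker load-only results. Adding weather columns gives it more tokens to attend over, while the patch view keeps each column separate.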

This review was generated by AI and checked by a human editor.