QUICK REVIEW

[Paper Review] Deep Time-Series Models Meet Volatility: Multi-Horizon Electricity Price Forecasting in the Australian National Electricity Market

Mohammed Osman Gani, Zhipeng He | arXiv (Cornell University) | Feb 1, 2026

Energy Load and Power Forecasting | Citations: 0

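One-line summary

This paper evaluates state-of-the-art deep time-series models for day-ahead and two-day-ahead electricity price forecasting under the volatility of the Australian NEM, compares them with standard DL baselines, and analyses their intraday and extreme-price performance.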

ABSTRACT

Accurate electricity price forecasting (EPF) is increasingly difficult in markets characterised by extreme volatility, frequent price spikes, and rapid structural shifts. Deep learning (DL) has been increasingly adopted in EPF due to its ability to achieve high forecasting accuracy. Recently, state-of-the-art (SOTA) deep time-series models have demonstrated promising performance across general forecasting tasks. Yet, their effectiveness in highly volatile electricity markets remains underexplored. Moreover, existing EPF studies rarely assess how model accuracy varies across intraday periods, leaving model sensitivity to market conditions unexplored. To address these gaps, this paper proposes an EPF framework that systematically evaluates SOTA deep time-series models using a direct multi-horizon forecasting approach across day-ahead and two-day-ahead settings. We conduct a comprehensive empirical study across all five regions of the Australian National Electricity Market using contemporary, high-volatility data. The results reveal a clear gap between time-series benchmark expectations and observed performance under real-world price volatility: recent deep time-series models often fail to surpass standard DL baselines. All models experience substantial degradation under extreme and negative prices, yet DL baselines often remain competitive. Intraday performance analysis further reveals that all evaluated models are consistently vulnerable to prevailing market conditions, where absolute errors peak during evening ramps, relative errors escalate during midday negative-price periods, and directional accuracy deteriorates sharply during abrupt shifts in price direction. These findings emphasise the need for volatility-aware modelling strategies and richer feature representations to advance EPF.

Motivation and objectives

Assess the effectiveness of state-of-the-art deep time-series models under contemporary high-volatility electricity market conditions in the Australian NEM.
Examine how forecasting accuracy varies across intraday intervals to reveal time-of-day and market-condition sensitivities.
Provide a rigorous comparison between SOTA time-series architectures and standard DL baselines in day-ahead and two-day-ahead horizons.
Enable intraday diagnostics to guide volatility-aware modeling strategies for EPF.

Proposed method

Use a direct multi-horizon forecasting approach to predict 48 half-hour steps (day-ahead) and 96 steps (two-day-ahead) via 7-day and 14-day lookback windows.
Evaluate a set of models: standard DL baselines (LSTM, CNN-LSTM, Transformer) and several SOTA time-series architectures (TimeXer, TimeMixer, TimesNet, iTransformer, Mamba, DLinear, etc.).
Construct inputs from five NEM regions using 2023-2025 data at 5-minute resolution aggregated to 30-minute intervals.
Apply a comprehensive hyperparameter grid search per horizon and region; train with Adam, early stopping, and ReduceLROnPlateau; normalize features with Min–Max scaling.
Assess performance with MAE, RMSE, sMAPE, rMAE, and MDA; analyze tail (extreme prices) and intraday (48 half-hour intervals) performance.
Use chronological train/validation/test splits (2023-2024 training/validation, 2025 testing) and publicly share code and data artifacts.
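The direct multi-horizon setup described above can be illustrated with a minimal windowing sketch. It is not the authors' code: the function name `make_windows`, the one-day stride between samples, and the toy price series are assumptions made for illustration. Only the 48 half-hour steps per day, the 48/96-step horizons, and the multi-day lookback come from the review.

```python
from typing import List, Sequence, Tuple

STEPS_PER_DAY = 48  # half-hourly data: 24 hours * 2 intervals per hour


def make_windows(
    series: Sequence[float],
    lookback_days: int,
    horizon_days: int,
    stride: int = STEPS_PER_DAY,  # assumption: one sample per day
) -> List[Tuple[Sequence[float], Sequence[float]]]:
    """Cut a price series into (input window, target window) pairs.

    In the direct approach, the model maps one input window to the whole
    target window in a single forward pass, with no recursive feedback.
    """
    lookback = lookback_days * STEPS_PER_DAY
    horizon = horizon_days * STEPS_PER_DAY
    samples = []
    last_start = len(series) - lookback - horizon
    for start in range(0, last_start + 1, stride):
        x = series[start:start + lookback]
        y = series[start + lookback:start + lookback + horizon]
        samples.append((x, y))
    return samples


# Toy series: 10 days of half-hourly values (hypothetical data).
prices = [float(i) for i in range(10 * STEPS_PER_DAY)]

day_ahead = make_windows(prices, lookback_days=7, horizon_days=1)
two_day_ahead = make_windows(prices, lookback_days=7, horizon_days=2)

print(len(day_ahead), len(day_ahead[0][0]), len(day_ahead[0][1]))
print(len(two_day_ahead), len(two_day_ahead[0][1]))
```

Because the windows are built in time order, a chronological split only requires that every test window starts after the last training target ends, which avoids leaking future prices into training.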

Figure 1: An overview of the proposed EPF evaluation framework, encompassing stages from data collection to performance evaluation.
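The evaluation metrics named in the method (MAE, RMSE, sMAPE, rMAE, MDA) can be sketched as follows. The definitions below are common textbook forms and are assumptions, since the review does not state the exact formulas used in the paper. In particular, the naive benchmark for rMAE and the use of the previous actual price as the reference for MDA are illustrative choices.

```python
import math
from typing import Sequence


def mae(y: Sequence[float], yhat: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)


def rmse(y: Sequence[float], yhat: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))


def smape(y: Sequence[float], yhat: Sequence[float]) -> float:
    """Symmetric MAPE in percent; a 0/0 term is counted as zero error."""
    total = 0.0
    for a, b in zip(y, yhat):
        denom = (abs(a) + abs(b)) / 2
        total += 0.0 if denom == 0 else abs(a - b) / denom
    return 100 * total / len(y)


def rmae(y: Sequence[float], yhat: Sequence[float],
         ynaive: Sequence[float]) -> float:
    """MAE of the model relative to the MAE of a naive benchmark forecast."""
    return mae(y, yhat) / mae(y, ynaive)


def _sign(x: float) -> int:
    return (x > 0) - (x < 0)


def mda(y: Sequence[float], yhat: Sequence[float]) -> float:
    """Mean directional accuracy: share of steps where the predicted move
    from the previous actual price has the same sign as the actual move."""
    hits = 0
    for t in range(1, len(y)):
        hits += _sign(y[t] - y[t - 1]) == _sign(yhat[t] - y[t - 1])
    return hits / (len(y) - 1)


# Hypothetical prices, including one negative-price interval.
actual = [50.0, 80.0, -20.0, 100.0]
forecast = [60.0, 70.0, 10.0, 90.0]
naive = [40.0, 50.0, 80.0, -20.0]  # assumed benchmark: previous actual value

print(mae(actual, forecast), rmse(actual, forecast))
print(smape(actual, forecast), rmae(actual, forecast, naive))
print(mda(actual, forecast))
```

In this toy example the negative-price interval contributes by far the largest sMAPE term, because the denominator is small when prices sit near zero. This is one reason relative errors can escalate during negative-price periods even when absolute errors are moderate.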

Experimental results

Research questions

RQ1: Do state-of-the-art deep time-series models outperform standard DL baselines in the volatile, real-world NEM price series?
RQ2: How does model accuracy vary across intraday intervals and under extreme or negative price conditions?
RQ3: Are advances in deep time-series forecasting architectures translating to robust EPF performance across all five NEM regions?
RQ4: What are the relative strengths of different model families (LSTM, CNN-LSTM, Transformer, SOTA time-series architectures) in day-ahead versus two-day-ahead horizons?

Key findings

SOTA deep time-series models often do not surpass standard DL baselines in the highly volatile NEM context.
All models degrade substantially under extreme and negative prices, yet DL baselines often remain competitive in these regimes.
Intraday analysis shows that absolute errors peak during evening ramps, relative errors escalate during midday negative-price periods, and directional accuracy deteriorates sharply during abrupt shifts in price direction.
CNN-LSTM and LSTM frequently rank among the top performers across regions and horizons, while several SOTA architectures offer region-specific advantages (e.g., TimeXer in VIC, Mamba in QLD/SA).
DLinear and TimeMixer generally underperform the standard DL baselines, while TimeXer is competitive with the baselines in some horizons.
Negative-price periods expose model limitations across all architectures; no single SOTA model consistently outperforms baselines under negative prices.
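The intraday diagnostics behind these findings amount to grouping forecast errors by time of day. A minimal sketch follows, assuming the evaluated series starts at the first half-hour interval of a day and contains whole days. The function name `intraday_mae` and the toy data are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def intraday_mae(
    y: Sequence[float],
    yhat: Sequence[float],
    steps_per_day: int = 48,
) -> List[float]:
    """Average absolute error separately for each half-hour slot of the day.

    Slot k collects every observation whose position in the series is
    congruent to k modulo steps_per_day.
    """
    sums = [0.0] * steps_per_day
    counts = [0] * steps_per_day
    for i, (a, b) in enumerate(zip(y, yhat)):
        slot = i % steps_per_day
        sums[slot] += abs(a - b)
        counts[slot] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Toy data: two days with a constant error of 1, except a large miss
# in slot 36 (the interval starting at 18:00) on both days.
days = 2
actual = [100.0] * (48 * days)
forecast = [99.0] * (48 * days)
for d in range(days):
    forecast[d * 48 + 36] = 50.0  # hypothetical evening-ramp miss

profile = intraday_mae(actual, forecast)
peak_slot = max(range(48), key=lambda k: profile[k])
peak_minutes = peak_slot * 30
print(peak_slot, f"{peak_minutes // 60:02d}:{peak_minutes % 60:02d}", profile[peak_slot])
```

The same grouping can be applied to any per-step metric, so a relative-error or directional-accuracy profile would use the identical slot logic with a different per-observation term.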

(a) MAE of all models in the QLD region, evaluated at 30-minute intervals.


This review was generated by AI and checked by a human editor.