QUICK REVIEW

[Paper Review] Zero-shot Forecasting by Simulation Alone

Boris N. Oreshkin, Mayank Jauhari | arXiv (Cornell University) | 2026-01-02

Forecasting Techniques and Applications | Citations: 0

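One-Line Summary

This paper introduces SarSim0, a fast SARIMA-based time series simulator. Neural forecasters pretrained purely on its synthetic data can forecast zero-shot and generalize strongly on the M-Series and GiftEval benchmarks. The results show that synthetic data can match real-data pretraining and, in the zero-shot setting, outperform certain baselines.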

ABSTRACT

Zero-shot time-series forecasting holds great promise, but is still in its infancy, hindered by limited and biased data corpora, leakage-prone evaluation, and privacy and licensing constraints. Motivated by these challenges, we propose the first practical univariate time series simulation pipeline which is simultaneously fast enough for on-the-fly data generation and enables notable zero-shot forecasting performance on M-Series and GiftEval benchmarks that capture trend/seasonality/intermittency patterns, typical of industrial forecasting applications across a variety of domains. Our simulator, which we call SarSim0 (SARIMA Simulator for Zero-Shot Forecasting), is based off of a seasonal autoregressive integrated moving average (SARIMA) model as its core data source. Due to instability in the autoregressive component, naive SARIMA simulation often leads to unusable paths. Instead, we follow a three-step procedure: (1) we sample well-behaved trajectories from its characteristic polynomial stability region; (2) we introduce a superposition scheme that combines multiple paths into rich multi-seasonality traces; and (3) we add rate-based heavy-tailed noise models to capture burstiness and intermittency alongside seasonalities and trends. SarSim0 is orders of magnitude faster than kernel-based generators, and it enables training on circa 1B unique purely simulated series, generated on the fly; after which well-established neural network backbones exhibit strong zero-shot generalization, surpassing strong statistical forecasters and recent foundation baselines, while operating under strict zero-shot protocol. Notably, on GiftEval we observe a "student-beats-teacher" effect: models trained on our simulations exceed the forecasting accuracy of the AutoARIMA generating processes.

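Motivation and Objectives

Motivate zero-shot forecasting for industrial settings where real data is scarce, biased, or prone to leakage.
Develop a fast, leakage-free synthetic data generator for pretraining forecasting models at scale.
Base the simulator on stable SARIMA dynamics, with extensions for multiple seasonalities and heavy-tailed noise.
Demonstrate that models pretrained on synthetic data generalize to heterogeneous benchmarks without fine-tuning on target data.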

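Proposed Method

Model basis: use SARIMA as the core data-generating process for the synthetic time series.
Stabilize simulation by sampling poles inside the unit circle and deriving the coefficients from the pole representation.
Introduce SARIMA-2 to capture dual seasonality through additive or multiplicative interaction between a base process and an envelope process.
Attach a Noiser module that injects heavy-tailed, level-dependent disturbances (Poisson, generalized gamma, log-normal).
Vectorize generation across many trajectories, enabling on-the-fly synthesis of circa 1B unique series.
Train foundation-model backbones (NBEATS, PatchTST, Chronos-Small T5, etc.) exclusively on SarSim0-generated data and evaluate their zero-shot performance on the benchmarks.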
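The pole-sampling and vectorization steps can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the radius cap `max_radius`, the uniform-over-disc sampling law, and the use of a single coefficient vector shared by all paths are assumptions made for illustration, and only the plain (non-seasonal) AR part is covered.

```python
import numpy as np

def sample_stable_ar_coeffs(order, rng, max_radius=0.98):
    """Sample AR coefficients whose poles all lie inside the unit circle.

    Hypothetical helper: the sampling distribution is an assumption,
    not the one used by SarSim0.
    """
    poles = []
    # Complex poles are added in conjugate pairs so the coefficients are real.
    for _ in range(order // 2):
        radius = max_radius * np.sqrt(rng.uniform())  # uniform over the disc area
        angle = rng.uniform(0.0, np.pi)
        pole = radius * np.exp(1j * angle)
        poles += [pole, np.conj(pole)]
    if order % 2 == 1:
        poles.append(rng.uniform(-max_radius, max_radius))  # one real pole
    poles = np.array(poles, dtype=complex)
    # np.poly expands prod_i (z - pole_i) into [1, c_1, ..., c_p].
    char_poly = np.real(np.poly(poles))
    phi = -char_poly[1:]  # x_t = sum_k phi_k * x_{t-k} + eps_t
    return phi, poles

def simulate_ar(phi, n_paths, n_steps, rng, burn_in=200):
    """Simulate n_paths AR trajectories at once (vectorized over paths)."""
    p = len(phi)
    total = n_steps + burn_in
    eps = rng.standard_normal((n_paths, total))
    x = np.zeros((n_paths, total + p))  # p leading zeros act as the initial state
    for t in range(total):
        # Columns t..t+p-1 hold times t-p..t-1; reverse to get lags 1..p.
        lags = x[:, t:t + p][:, ::-1]
        x[:, t + p] = lags @ phi + eps[:, t]
    return x[:, p + burn_in:]  # drop the initial state and the burn-in

rng = np.random.default_rng(0)
phi, poles = sample_stable_ar_coeffs(order=10, rng=rng)
paths = simulate_ar(phi, n_paths=4, n_steps=256, rng=rng)
```

Because every pole has modulus below `max_radius`, the simulated paths stay bounded in distribution, whereas coefficients drawn directly at random would often place poles outside the unit circle and produce exploding trajectories.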

Figure 1: SarSim0 simulator pipeline. Top: two base components are generated by SARIMA with AR (and seasonal) roots sampled via the characteristic polynomial inside the stability region, yielding well-behaved paths at seasonalities $s\!=\!24$ and $s\!=\!7$ . Middle: a SARIMA-2 superposition/modulati
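As a rough illustration of the superposition and noise stages in this pipeline, the sketch below combines two toy seasonal components (periods 24 and 7) additively or multiplicatively and then draws Poisson counts whose rate follows the signal level. Everything here is a simplified stand-in: `seasonal_component` is a seasonal AR(1) rather than a full SARIMA model, and `combine`, `poisson_noiser`, and the `scale` parameter are hypothetical names, not the paper's SARIMA-2 or Noiser definitions.

```python
import numpy as np

def seasonal_component(period, n_steps, rng, ar=0.6, noise_std=0.1):
    """Toy seasonal AR(1): x_t = ar * x_{t-period} + eps_t (stand-in for SARIMA)."""
    x = np.zeros(n_steps + period)
    x[:period] = rng.standard_normal(period)  # random initial seasonal profile
    eps = noise_std * rng.standard_normal(n_steps)
    for t in range(n_steps):
        x[t + period] = ar * x[t] + eps[t]
    return x[period:]

def combine(base, envelope, mode="additive"):
    """Combine a base path with an envelope path (assumed interaction forms)."""
    if mode == "additive":
        return base + envelope
    if mode == "multiplicative":
        return base * envelope
    raise ValueError(f"unknown mode: {mode}")

def poisson_noiser(signal, rng, scale=20.0):
    """Level-dependent noise: rescale the signal to a non-negative rate,
    then draw Poisson counts with that rate."""
    span = signal.max() - signal.min()
    rate = scale * (signal - signal.min()) / (span + 1e-12)
    return rng.poisson(rate)

rng = np.random.default_rng(1)
n_steps = 24 * 14  # two weeks of hourly observations
daily = seasonal_component(24, n_steps, rng)
weekly = seasonal_component(7, n_steps, rng)
trace = combine(daily, weekly, mode="additive")
counts = poisson_noiser(trace, rng)
```

Driving the Poisson rate by the signal level makes the noise variance grow with the level, and low-rate stretches produce many zeros, which is one simple way to obtain intermittent, bursty series.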

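Experimental Results

Research Questions

RQ1: Can a SARIMA-based synthetic data generator produce realistic time series patterns suitable for training forecasters?
RQ2: Does pretraining on purely simulated data enable strong zero-shot generalization across diverse real-world benchmarks?
RQ3: How do architectures with different inductive biases (NBEATS, PatchTST, Chronos, etc.) perform when pretrained on synthetic data?
RQ4: How much does each simulator component (SARIMA, SARIMA-2, Noisers) contribute to zero-shot forecasting performance?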

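Key Findings

Models trained on SarSim0 achieve strong zero-shot generalization across heterogeneous benchmarks and outperform some baselines pretrained on real data.
Models pretrained on SarSim0 often narrow the gap to models pretrained on large-scale real data, and in some cases outperform synthetic baselines such as KernelSynth and ForecastPFN.
Architectures with different inductive biases (dense, attention-based, patching), trained on the same synthetic data, achieve competitive performance, suggesting robustness to the choice of model.
On GiftEval, models pretrained on SarSim0 show a "student-beats-teacher" effect, exceeding the forecasting accuracy of the AutoARIMA generating processes.
The ablation analysis shows that SARIMA-2 and the Noisers contribute meaningfully to generalization, with SARIMA-2 being especially important for accuracy across backbones.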

Figure 2: Sampling of SARIMA poles by SarSim0 . The SARIMA order-10 AR process poles are shown along with the unit circle on the left. The resulting generated processes with these poles are shown on the right. The top pane shows poles sampled according to the proposed procedure, resulting in a reali
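The link between poles and coefficients can be made concrete with a worked order-2 case. This is the standard AR stability argument, not a formula quoted from the paper; the values $r = 0.9$ and $\theta = 2\pi/24$ are chosen here purely for illustration.

```latex
% AR(p) process and its characteristic polynomial, factored over the poles r_i
x_t = \sum_{k=1}^{p} \phi_k \, x_{t-k} + \varepsilon_t ,
\qquad
z^{p} - \sum_{k=1}^{p} \phi_k \, z^{p-k} = \prod_{i=1}^{p} (z - r_i) ,
\qquad
\text{stable} \iff |r_i| < 1 \ \text{for all } i .

% Order-2 example with a conjugate pole pair r e^{\pm i\theta}
(z - r e^{i\theta})(z - r e^{-i\theta}) = z^{2} - 2 r \cos\theta \, z + r^{2}
\;\Longrightarrow\;
\phi_1 = 2 r \cos\theta , \quad \phi_2 = -r^{2} .

% Illustrative values: r = 0.9, \theta = 2\pi/24
\phi_1 = 2 (0.9) \cos(\pi/12) \approx 1.739 , \qquad \phi_2 = -0.81 .
```

A pole pair at angle $2\pi/24$ yields a damped oscillation with a period of 24 steps, and the radius $r < 1$ controls how quickly it decays. Choosing $r$ and $\theta$ directly therefore guarantees a well-behaved path.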


This review was generated by AI and reviewed by a human editor.