QUICK REVIEW

[Paper Review] Spatial-Temporal Large Language Model for Traffic Prediction

Chenxi Liu, Sun Yang | arXiv (Cornell University) | 2024-01-18

Traffic Prediction and Management Techniques | Citations: 10

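One-Line Summary

ST-LLM defines the timesteps at each location as tokens, applies a spatial-temporal embedding, and predicts traffic with a partially frozen attention LLM, achieving strong performance in standard, few-shot, and zero-shot scenarios.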

ABSTRACT

Traffic prediction, an essential component for intelligent transportation systems, endeavours to use historical data to foresee future traffic features at specific locations. Although existing traffic prediction models often emphasize developing complex neural network structures, their accuracy has not improved. Recently, large language models have shown outstanding capabilities in time series analysis. Differing from existing models, LLMs progress mainly through parameter expansion and extensive pretraining while maintaining their fundamental structures. Motivated by these developments, we propose a Spatial-Temporal Large Language Model (ST-LLM) for traffic prediction. In the ST-LLM, we define timesteps at each location as tokens and design a spatial-temporal embedding to learn the spatial location and global temporal patterns of these tokens. Additionally, we integrate these embeddings by a fusion convolution to each token for a unified spatial-temporal representation. Furthermore, we innovate a partially frozen attention strategy to adapt the LLM to capture global spatial-temporal dependencies for traffic prediction. Comprehensive experiments on real traffic datasets offer evidence that ST-LLM is a powerful spatial-temporal learner that outperforms state-of-the-art models. Notably, the ST-LLM also exhibits robust performance in both few-shot and zero-shot prediction scenarios. The code is publicly available at https://github.com/ChenxiLiu-HNU/ST-LLM.

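Research Motivation and Objectives

Aims to improve traffic prediction by using a large language model to capture global spatial-temporal dependencies.
Introduces a spatial-temporal embedding and a tokenization scheme that recasts the timesteps at each location as tokens.
Develops a partially frozen attention (PFA) strategy that adapts the LLM to traffic data while retaining its pretrained knowledge.
Demonstrates that ST-LLM is more accurate than state-of-the-art models and remains robust in few-shot and zero-shot scenarios.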

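Proposed Method

Traffic data is defined as a tensor X ∈ R^{T x N x C} (T timesteps, N locations, C features).
The P historical timesteps are encoded into token embeddings by a spatial-temporal embedding layer (a PConv-based token embedding, hour-of-day and day-of-week temporal encodings, and an adaptive spatial embedding).
A fusion convolution merges these embeddings into E_F ∈ R^{N x 3D}.
The fused embeddings pass through a partially frozen attention LLM, in which the first F layers stay frozen and only the multi-head attention in the last U layers is unfrozen, producing H^{F+U} ∈ R^{N x 3D}.
A regression convolution predicts the next S timesteps: Ŷ_S = RConv(H^{F+U}).
The model is trained with the loss L = ||Ŷ_S - Y_S|| + λ L_reg, where L_reg is a regularization term weighted by λ.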
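The steps above can be made concrete with a small, dependency-free Python sketch. It is not the authors' implementation: it only tracks tensor shapes through the pipeline and builds a freeze plan for the partially frozen attention idea. The names `trace_shapes` and `pfa_trainable_plan`, the sub-module labels `attn` and `mlp`, and the forecast shape (S, N, C) are assumptions made for illustration. The sketch also assumes that everything outside the attention of the last U layers stays frozen, which may differ from the released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shapes:
    history: tuple   # X_P: (P, N, C), the P historical timesteps
    fused: tuple     # E_F: (N, 3D), output of the fusion convolution
    hidden: tuple    # H^{F+U}: (N, 3D), output of the last LLM layer
    forecast: tuple  # Y_hat_S: assumed to be (S, N, C)


def trace_shapes(P, N, C, D, S):
    """Follow tensor shapes through the stages listed in the review."""
    return Shapes(
        history=(P, N, C),
        fused=(N, 3 * D),     # token, temporal, spatial embeddings fused
        hidden=(N, 3 * D),    # the LLM keeps one 3D-wide vector per location
        forecast=(S, N, C),   # regression convolution maps back to traffic features
    )


def pfa_trainable_plan(num_frozen, num_unfrozen, submodules=("attn", "mlp")):
    """Return {'layer{i}.{sub}': trainable?} for a stack of F + U layers.

    Only the attention sub-module of the last `num_unfrozen` layers is
    marked trainable; every other entry stays frozen (an assumption).
    """
    plan = {}
    for i in range(num_frozen + num_unfrozen):
        in_last_u = i >= num_frozen
        for name in submodules:
            plan[f"layer{i}.{name}"] = in_last_u and name == "attn"
    return plan


# Illustrative values only; they are not taken from the paper.
shapes = trace_shapes(P=12, N=4, C=1, D=8, S=12)
plan = pfa_trainable_plan(num_frozen=4, num_unfrozen=2)
trainable = sorted(name for name, flag in plan.items() if flag)
```

In a real framework, a plan like this would typically be applied by switching gradient tracking on or off per parameter; the dictionary here only records which sub-modules that would affect.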

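Experimental Results

Research Questions

RQ1: Can ST-LLM effectively model the spatial-temporal dependencies in traffic data by treating the timesteps at each location as tokens?
RQ2: Does the partially frozen attention layer improve the LLM's adaptation to traffic prediction compared with fully frozen or fully tunable settings?
RQ3: How does ST-LLM perform in few-shot and zero-shot transfer across different traffic datasets?
RQ4: How do the spatial-temporal embeddings and their fusion affect prediction accuracy?
RQ5: How does ST-LLM compare with state-of-the-art GNN- and attention-based models on real-world datasets?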

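Key Findings

ST-LLM outperforms state-of-the-art models on the NYC Taxi and CHBike datasets across multiple traffic prediction scenarios.
ST-LLM achieves lower MAE, MAPE, RMSE, and WAPE than baselines including DCRNN, STGCN, ASTGCN, GMAN, GATGPT, GCNGPT, and LLAMA2.
Partially frozen attention (PFA) outperforms the fully frozen and fully tuned variants as well as the other baselines.
ST-LLM is robust in few-shot and zero-shot prediction and transfers well across domains.
In the ablation study, removing either the LLM or the spatial-temporal embedding causes a large drop in performance, underscoring the importance of both.
The inference-time analysis suggests that ST-LLM offers a better trade-off between speed and accuracy than several LLM baselines.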
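For readers less familiar with the four error metrics named above, the following Python sketch shows their standard textbook definitions on a toy example. The function names and the sample values are illustrative, and skipping zero-valued targets in MAPE is an assumption of this sketch; the paper's exact masking rules may differ.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, skipping zero targets (assumption)."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred) if t != 0]
    return sum(ratios) / len(ratios)


def wape(y_true, y_pred):
    """Weighted absolute percentage error: total |error| over total |target|."""
    total_error = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    return total_error / sum(abs(t) for t in y_true)


# Toy values, not results from the paper.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 36]
scores = {
    "MAE": mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "WAPE": wape(y_true, y_pred),
}
```

MAPE averages per-sample ratios, so small targets dominate it, while WAPE divides total error by total demand and is less sensitive to near-zero counts. That difference is one common reason traffic papers report both.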

This review was generated by AI and reviewed by a human editor.