QUICK REVIEW

[Paper Review] LB-SimTSC: An Efficient Similarity-Aware Graph Neural Network for Semi-Supervised Time Series Classification

Wenjie Xi, Arnav Jain|arXiv (Cornell University)|2023. 01. 12.

Time Series Analysis and Forecasting | Citations: 8

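One-Line Summary

LB-SimTSC replaces the DTW-based graph construction in SimTSC with a graph built from the linear-time LB_Keogh lower bound, which makes semi-supervised time series classification feasible on large datasets without a significant loss in accuracy.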

ABSTRACT

Time series classification is an important data mining task that has received a lot of interest in the past two decades. Due to the label scarcity in practice, semi-supervised time series classification with only a few labeled samples has become popular. Recently, Similarity-aware Time Series Classification (SimTSC) is proposed to address this problem by using a graph neural network classification model on the graph generated from pairwise Dynamic Time Warping (DTW) distance of batch data. It shows excellent accuracy and outperforms state-of-the-art deep learning models in several few-label settings. However, since SimTSC relies on pairwise DTW distances, the quadratic complexity of DTW limits its usability to only reasonably sized datasets. To address this challenge, we propose a new efficient semi-supervised time series classification technique, LB-SimTSC, with a new graph construction module. Instead of using DTW, we propose to utilize a lower bound of DTW, LB_Keogh, to approximate the dissimilarity between instances in linear time, while retaining the relative proximity relationships one would have obtained via computing DTW. We construct the pairwise distance matrix using LB_Keogh and build a graph for the graph neural network. We apply this approach to the ten largest datasets from the well-known UCR time series classification archive. The results demonstrate that this approach can be up to 104x faster than SimTSC when constructing the graph on large datasets without significantly decreasing classification accuracy.

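Motivation and Objectives

Address label scarcity in time series classification through semi-supervised learning.
Replace quadratic-time DTW with a linear-time lower bound to make the similarity-aware GCN approach more scalable.
Enable efficient graph construction for large datasets while retaining warping awareness.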
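
For reference, the lower bound mentioned above can be written out as follows. This is the standard textbook form of LB_Keogh, stated here as background rather than quoted from the paper; the symbols $r$ (warping window radius), $u_i$ and $\ell_i$ (upper and lower envelope of the query $Q$), and $n$ (series length) are notation assumed for this sketch.

```latex
u_i = \max_{|j - i| \le r} q_j, \qquad
\ell_i = \min_{|j - i| \le r} q_j

\mathrm{LB\_Keogh}(Q, C) =
\sqrt{\sum_{i=1}^{n}
\begin{cases}
(c_i - u_i)^2    & \text{if } c_i > u_i \\
(c_i - \ell_i)^2 & \text{if } c_i < \ell_i \\
0                & \text{otherwise}
\end{cases}}
\;\le\; \mathrm{DTW}_r(Q, C)
```

Once the envelope of $Q$ is available, the sum takes one pass over $C$, which is linear in the series length, whereas filling the DTW cost matrix is quadratic in it.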

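Proposed Method

Compose each batch from equal numbers of labeled and unlabeled time series samples.
Build an LB-graph that uses LB_Keogh, a lower bound of DTW, to approximate the dissimilarity of each pair in O(L) time.
Convert the LB_Keogh distances into affinities and sparsify them to form the batch graph G^B, keeping the computation efficient and GPU-compatible.
Embed the batch with a ResNet-based backbone network and propagate information over G^B with a graph convolutional network.
Train end to end with a softmax output so that label information flows from the labeled data to the unlabeled instances within each batch.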
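
The graph-construction steps above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the warping radius `r`, the affinity `exp(-alpha * D)`, the top-`k` sparsification, the max-symmetrization of the asymmetric LB_Keogh matrix, and the default values of `alpha` and `k` are all assumptions made for the example.

```python
import numpy as np


def envelope(x, r):
    """Upper/lower envelope of series x for a warping window of radius r.

    This simple loop costs O(L * r); it runs once per series, not once per pair.
    """
    n = len(x)
    upper = np.empty(n)
    lower = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        upper[i] = x[lo:hi].max()
        lower[i] = x[lo:hi].min()
    return upper, lower


def lb_keogh(upper, lower, C):
    """LB_Keogh of candidate(s) C against a precomputed query envelope.

    C has shape (L,) or (N, L); the cost is O(L) per candidate.
    """
    above = np.clip(C - upper, 0.0, None)  # amount by which C exceeds the envelope
    below = np.clip(lower - C, 0.0, None)  # amount by which C falls under it
    return np.sqrt((above ** 2 + below ** 2).sum(axis=-1))


def lb_distance_matrix(X, r):
    """Pairwise LB_Keogh distances for a batch X of shape (N, L)."""
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        upper, lower = envelope(X[i], r)
        D[i] = lb_keogh(upper, lower, X)  # row i: every series vs. envelope of X[i]
    # LB_Keogh is asymmetric. Assumption: symmetrize with the element-wise max,
    # which is still a lower bound of the (symmetric) band-constrained DTW.
    return np.maximum(D, D.T)


def batch_graph(D, alpha=0.3, k=3):
    """Assumed affinity: exp(-alpha * D), top-k per row, then row-normalized."""
    A = np.exp(-alpha * D)
    keep = np.argsort(-A, axis=1)[:, :k]  # indices of the k largest affinities
    mask = np.zeros_like(A, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=1)
    A = np.where(mask, A, 0.0)
    return A / A.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.normal(size=(8, 50))  # 8 synthetic series of length 50
    G = batch_graph(lb_distance_matrix(batch, r=5))
    print(G.shape)
```

Because each envelope is computed once per series, every pairwise comparison afterwards is a single vectorized pass over the series, which is what makes this kind of construction cheap to run on a GPU-style array backend.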

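Experimental Results

Research Questions

RQ1: Can LB_Keogh-based graph construction provide a warping-aware signal comparable to that of DTW, so that it works similarly for semi-supervised TSC?
RQ2: Does the LB-SimTSC framework achieve a substantial speedup in graph construction on large time series datasets without losing accuracy?
RQ3: How does LB-SimTSC perform relative to SimTSC in few-label settings?
RQ4: With limited labeled data, can batch-wise graph construction effectively propagate label information to unlabeled instances?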

Key Results

Dataset	1NN-DTW (5 labels)	LB-SimTSC (5 labels)	1NN-DTW (10 labels)	LB-SimTSC (10 labels)	1NN-DTW (15 labels)	LB-SimTSC (15 labels)
FordA	0.540	0.793	0.531	0.826	0.545	0.816
FordB	0.627	0.812	0.603	0.806	0.628	0.807
NIFECGT1	0.665	0.662	0.704	0.743	0.710	0.830
NIFECGT2	0.757	0.689	0.788	0.798	0.817	0.856
UWGLAll	0.779	0.447	0.854	0.549	0.865	0.622
Phoneme	0.185	0.253	0.204	0.330	0.209	0.346
Mallat	0.944	0.908	0.960	0.963	0.971	0.960
MSST	0.784	0.871	0.796	0.883	0.834	0.908
MSRT	0.790	0.775	0.819	0.891	0.843	0.916
SLC	0.559	0.920	0.683	0.939	0.763	0.946
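
To make the table easier to read at a glance, the short script below tallies how often LB-SimTSC scores higher than 1NN-DTW in each setting. The accuracies are copied from the table above; the variable names are illustrative, and the script assumes the numbers in parentheses in the column headers denote the label settings.

```python
# (1NN-DTW, LB-SimTSC) accuracy pairs at 5, 10 and 15 labels, from the table.
results = {
    "FordA":    [(0.540, 0.793), (0.531, 0.826), (0.545, 0.816)],
    "FordB":    [(0.627, 0.812), (0.603, 0.806), (0.628, 0.807)],
    "NIFECGT1": [(0.665, 0.662), (0.704, 0.743), (0.710, 0.830)],
    "NIFECGT2": [(0.757, 0.689), (0.788, 0.798), (0.817, 0.856)],
    "UWGLAll":  [(0.779, 0.447), (0.854, 0.549), (0.865, 0.622)],
    "Phoneme":  [(0.185, 0.253), (0.204, 0.330), (0.209, 0.346)],
    "Mallat":   [(0.944, 0.908), (0.960, 0.963), (0.971, 0.960)],
    "MSST":     [(0.784, 0.871), (0.796, 0.883), (0.834, 0.908)],
    "MSRT":     [(0.790, 0.775), (0.819, 0.891), (0.843, 0.916)],
    "SLC":      [(0.559, 0.920), (0.683, 0.939), (0.763, 0.946)],
}
label_settings = (5, 10, 15)

# Count the datasets on which LB-SimTSC has the higher accuracy.
wins = {
    n: sum(rows[i][1] > rows[i][0] for rows in results.values())
    for i, n in enumerate(label_settings)
}

for n in label_settings:
    print(f"{n} labels: LB-SimTSC higher on {wins[n]} of {len(results)} datasets")
# 5 labels: LB-SimTSC higher on 5 of 10 datasets
# 10 labels: LB-SimTSC higher on 9 of 10 datasets
# 15 labels: LB-SimTSC higher on 8 of 10 datasets
```

UWGLAll is the one dataset where 1NN-DTW stays ahead in all three settings.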

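On several large UCR datasets, LB-SimTSC constructs the graph up to 104x faster than SimTSC (the figure reported in the abstract), substantially reducing total graph construction time.
LB-SimTSC remains competitive with SimTSC in accuracy across several label settings, and the accuracy differences are not statistically significant (two-sided tests give p-values above 0.05 in many cases).
In settings with 5 or more labels, LB-SimTSC often outperforms 1NN-DTW, showing strong performance even with few labels, although not on every dataset (UWGLAll is a consistent exception).
The LB_Keogh-based graph captures warping-aware similarity effectively and is GPU-friendly to compute, which allows the method to scale to larger datasets.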


This review was generated by AI and reviewed by a human editor.