QUICK REVIEW

[Paper Review] RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks

Jingkun Gao, Xiaomin Song | arXiv (Cornell University) | 2020-02-21

Anomaly Detection Techniques and Applications | References: 49 | Citations: 86

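One-line summary

RobustTAD combines robust time series decomposition with a U-Net-based encoder-decoder network to detect point-wise anomalies and, with the help of data augmentation and a weight-adjusted loss, achieves state-of-the-art performance on the Yahoo benchmark.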

ABSTRACT

The monitoring and management of numerous and diverse time series data at Alibaba Group calls for an effective and scalable time series anomaly detection service. In this paper, we propose RobustTAD, a Robust Time series Anomaly Detection framework by integrating robust seasonal-trend decomposition and convolutional neural network for time series data. The seasonal-trend decomposition can effectively handle complicated patterns in time series, and meanwhile significantly simplifies the architecture of the neural network, which is an encoder-decoder architecture with skip connections. This architecture can effectively capture the multi-scale information from time series, which is very useful in anomaly detection. Due to the limited labeled data in time series anomaly detection, we systematically investigate data augmentation methods in both time and frequency domains. We also introduce label-based weight and value-based weight in the loss function by utilizing the unbalanced nature of the time series anomaly detection problem. Compared with the widely used forecasting-based anomaly detection algorithms, decomposition-based algorithms, traditional statistical algorithms, as well as recent neural network based algorithms, RobustTAD performs significantly better on public benchmark datasets. It is deployed as a public online service and widely adopted in different business scenarios at Alibaba Group.
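The decomposition idea in the abstract can be illustrated with a deliberately naive stand-in. The sketch below is not RobustSTL or RobustTrend; it assumes a known period, uses a rolling median as the trend and per-phase medians as the seasonal component, and only shows why the residual is an easier input for a detector than the raw series. The function name `decompose`, its parameters, and the toy data are hypothetical.

```python
from statistics import median

def decompose(series, period, trend_window):
    """Split a series into trend, seasonal, and residual parts (naive stand-in)."""
    n = len(series)
    half = trend_window // 2
    # Trend: centered rolling median, which reacts less to isolated spikes than a mean.
    trend = [median(series[max(0, i - half): i + half + 1]) for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonal: median of the detrended values at each phase of the period.
    phase_median = [median(detrended[p::period]) for p in range(period)]
    seasonal = [phase_median[i % period] for i in range(n)]
    # Residual: whatever trend and seasonality do not explain.
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual

# Hypothetical toy data: a period-4 pattern around a constant level, plus one spike.
pattern = [0.0, 1.0, 0.0, -1.0]
series = [10.0 + pattern[i % 4] for i in range(40)]
series[21] += 8.0  # injected anomaly

trend, seasonal, residual = decompose(series, period=4, trend_window=9)
spike_index = max(range(len(residual)), key=lambda i: abs(residual[i]))
print(spike_index, residual[spike_index])
```

In this toy case the seasonal swing is removed and the injected spike is the only large value left in the residual, which is the signal the neural network is then asked to label point by point.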

Research Motivation and Goals

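Provide scalable, real-time anomaly detection for diverse time series in large-scale industrial environments.
Use robust decomposition to separate anomalies from trend and seasonality and to simplify the neural network design.
Investigate data augmentation in both the time and frequency domains to compensate for scarce labels.
Introduce label-based and value-based weights in the loss to handle class imbalance.
Demonstrate practical online deployment and better performance than the baselines on public datasets.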
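As a rough illustration of the augmentation goal above, the sketch below shows one time-domain and one frequency-domain transform. This summary does not list the specific transforms used in the paper, so both functions (`flip`, `perturb_amplitude`), their parameters, and the sample window are assumptions chosen for illustration. Neither transform moves samples in time, so point-wise labels can be carried over unchanged.

```python
import cmath
import random

def flip(values):
    """Time-domain example: mirror the window around its mean."""
    mean = sum(values) / len(values)
    return [2 * mean - v for v in values]

def perturb_amplitude(values, bins, scale_sigma, rng):
    """Frequency-domain example: rescale the magnitude of selected DFT bins.

    `bins` should hold indices between 1 and len(values) // 2.
    """
    n = len(values)
    # Plain O(n^2) discrete Fourier transform, adequate for short windows.
    spectrum = [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in bins:
        spectrum[k] *= 1.0 + rng.gauss(0.0, scale_sigma)
        # Mirror the change onto the conjugate bin so the result stays real-valued.
        spectrum[(n - k) % n] = spectrum[k].conjugate()
    # Inverse transform back to the time domain.
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

# Hypothetical window of observations.
window = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 2.0]
rng = random.Random(0)
flipped = flip(window)
augmented = perturb_amplitude(window, bins=[1, 2], scale_sigma=0.1, rng=rng)
```

Because bin 0 is left untouched, the augmented window keeps the same mean as the original; only the strength of the selected oscillations changes.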

Proposed Method

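Decompose the time series into trend, seasonal, and residual components using RobustPeriod together with RobustSTL or RobustTrend, depending on periodicity.
Predict a dense anomaly map from the residual component with a U-Net-style encoder-decoder network with skip connections.
Apply a weight-adjusted loss to handle class imbalance and to exploit value-based weights.
Develop time-domain and frequency-domain data augmentation techniques to expand the training data.
Implement online inference for streaming data through online decomposition and fast neural network prediction.
Provide an end-to-end system architecture covering data ingestion, offline training, online service, and visualization.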
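The weight-adjusted loss in the list above can be sketched as a point-wise binary cross-entropy with two multiplicative weights. The exact form used in the paper is not given in this summary, so everything here is an assumption: the function name `weighted_bce`, the constant `anomaly_weight` for the rare class, and a value-based weight that grows linearly with the magnitude of the input value.

```python
import math

def weighted_bce(probs, labels, values, anomaly_weight, value_scale):
    """Point-wise binary cross-entropy with label-based and value-based weights (assumed form)."""
    eps = 1e-12
    total, weight_sum = 0.0, 0.0
    for p, y, v in zip(probs, labels, values):
        label_w = anomaly_weight if y == 1 else 1.0  # label-based: up-weight the rare anomaly class
        value_w = 1.0 + value_scale * abs(v)         # value-based: assumed to grow with |value|
        w = label_w * value_w
        p = min(max(p, eps), 1.0 - eps)              # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum                         # weighted mean over all points

# Hypothetical predictions for four points; the last one is a missed anomaly.
probs = [0.1, 0.1, 0.1, 0.2]
labels = [0, 0, 0, 1]
values = [0.1, -0.2, 0.0, 3.0]

plain = weighted_bce(probs, labels, values, anomaly_weight=1.0, value_scale=0.0)
weighted = weighted_bce(probs, labels, values, anomaly_weight=10.0, value_scale=0.5)
print(plain, weighted)
```

With both weights switched off the function reduces to ordinary mean cross-entropy. Turning them on makes the missed anomaly dominate the loss, which is the intended effect when anomalies are a small fraction of all points.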

Experimental Results

Research Questions

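RQ1: Does combining robust time series decomposition with a CNN improve anomaly detection performance across diverse time series?
RQ2: How do data augmentation and a weighted loss affect training under label scarcity and class imbalance?
RQ3: Can online decomposition and inference meet real-time requirements in a real production environment?
RQ4: What is the effect of feeding the CNN-based detector the decomposed residual instead of the raw time series?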

Key Results

Method	Precision	Recall	F1 score	Relaxed F1 score
ARIMA	0.513	0.144	0.225	0.533
SHESD	0.501	0.488	0.494	0.557
Donut	0.015	0.829	0.029	0.030
U-Net-Raw	0.473	0.351	0.403	0.533
U-Net-De	0.651	0.594	0.621	0.710
U-Net-DeW	0.793	0.569	0.662	0.795
U-Net-DeWA	0.859	0.581	0.693	0.812
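As a quick consistency check on the table, the F1 column can be recomputed from the precision and recall columns as their harmonic mean, F1 = 2PR / (P + R). The snippet below copies the reported values and measures the largest gap between the recomputed and reported F1; small gaps are expected because the inputs are rounded to three decimals. The relaxed F1 column is not checked, since its definition is not given in this summary.

```python
# (precision, recall, reported F1) copied from the table above.
rows = {
    "ARIMA": (0.513, 0.144, 0.225),
    "SHESD": (0.501, 0.488, 0.494),
    "Donut": (0.015, 0.829, 0.029),
    "U-Net-Raw": (0.473, 0.351, 0.403),
    "U-Net-De": (0.651, 0.594, 0.621),
    "U-Net-DeW": (0.793, 0.569, 0.662),
    "U-Net-DeWA": (0.859, 0.581, 0.693),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

recomputed = {name: f1_score(p, r) for name, (p, r, _) in rows.items()}
max_gap = max(abs(recomputed[name] - rows[name][2]) for name in rows)

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.3f}, reported {rows[name][2]:.3f}")
```

The recomputed values agree with the reported F1 scores to within rounding error. Donut's row shows why F1 is used here: a recall of 0.829 paired with a precision of 0.015 still yields an F1 near 0.03.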

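The proposed RobustTAD framework achieves higher F1 scores on the Yahoo benchmark than the forecasting-based, decomposition-based, and raw-input CNN baselines.
Combining decomposition and U-Net with the weight-adjusted loss and data augmentation improves performance substantially, reaching F1 = 0.693 and relaxed F1 = 0.812 (U-Net-DeWA).
A naive U-Net on raw data performs poorly (F1 = 0.403), whereas adding decomposition, loss weighting, and augmentation raises F1 by about 0.22 to 0.29.
Online inference is efficient: on streaming batches the cost of labeling anomalies is low (under 10 ms per prediction, and about 100 ms for online decomposition).
The approach is deployed as a public online service at Alibaba, is widely used in production, and outperforms several competing methods on public benchmarks.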


This review was generated by AI and reviewed by a human editor.