QUICK REVIEW

[Paper Review] NOTE: Robust Continual Test-time Adaptation Against Temporal Correlation

Taesik Gong, Jongheon Jeong|arXiv (Cornell University)|2022. 08. 10.

Domain Adaptation and Few-Shot Learning | Citations: 30

One-Line Summary

NOTE introduces Instance-Aware Batch Normalization (IABN) and Prediction-balanced Reservoir Sampling (PBRS) to achieve robust test-time adaptation on non-i.i.d., temporally correlated data streams, and it outperforms the baselines most clearly in the non-i.i.d. setting.

ABSTRACT

Test-time adaptation (TTA) is an emerging paradigm that addresses distributional shifts between training and testing phases without additional data acquisition or labeling cost; only unlabeled test data streams are used for continual model adaptation. Previous TTA schemes assume that the test samples are independent and identically distributed (i.i.d.), even though they are often temporally correlated (non-i.i.d.) in application scenarios, e.g., autonomous driving. We discover that most existing TTA methods fail dramatically under such scenarios. Motivated by this, we present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams. Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN) that corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS) that simulates i.i.d. data stream from non-i.i.d. stream in a class-balanced manner. Our evaluation with various datasets, including real-world non-i.i.d. streams, demonstrates that the proposed robust TTA not only outperforms state-of-the-art TTA algorithms in the non-i.i.d. setting, but also achieves comparable performance to those algorithms under the i.i.d. assumption. Code is available at https://github.com/TaesikGong/NOTE.
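
The abstract describes PBRS only at a high level, so the following is a minimal sketch of one way a prediction-balanced reservoir could work, not the authors' implementation. The class name `PBRSMemory`, its method names, and the exact replacement rule (evict from a majority class when the incoming predicted class is under-represented, otherwise apply per-class reservoir sampling) are assumptions made for illustration.

```python
import random
from collections import Counter


class PBRSMemory:
    """Hypothetical prediction-balanced reservoir (illustrative, capacity >= 1)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []          # list of (sample, predicted_label)
        self.seen = Counter()    # samples seen so far, per predicted class
        self.rng = random.Random(seed)

    def _replace_random(self, victim_class, sample, label):
        # Overwrite one randomly chosen stored item of `victim_class`.
        candidates = [i for i, (_, l) in enumerate(self.items) if l == victim_class]
        self.items[self.rng.choice(candidates)] = (sample, label)

    def add(self, sample, predicted_label):
        self.seen[predicted_label] += 1
        if len(self.items) < self.capacity:
            self.items.append((sample, predicted_label))
            return
        counts = Counter(label for _, label in self.items)
        top = max(counts.values())
        majority = [c for c, n in counts.items() if n == top]
        if predicted_label not in majority:
            # Assumed rule: make room by evicting from a majority class.
            self._replace_random(self.rng.choice(majority), sample, predicted_label)
        else:
            # Assumed rule: reservoir sampling within the incoming class.
            m_c = counts[predicted_label]
            n_c = self.seen[predicted_label]
            if self.rng.random() <= m_c / n_c:
                self._replace_random(predicted_label, sample, predicted_label)


# Usage: a class-sorted (temporally correlated) stream of predicted labels.
memory = PBRSMemory(capacity=6, seed=0)
for label in [0] * 100 + [1] * 100 + [2] * 100:
    memory.add(sample=None, predicted_label=label)
stored = Counter(label for _, label in memory.items)
```

In this toy run the memory ends up with two items per class even though the stream delivers one class at a time, which is the class-balanced behaviour the abstract attributes to PBRS.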

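Motivation and Objectives

Address the distribution shift between training and test time by adapting the model on the unlabeled test stream, with no labels required.
Handle non-i.i.d., temporally correlated test data, which is common in real-world scenarios such as autonomous driving and human activity recognition (HAR).
Develop normalization and data-management techniques that support adaptation to the target domain while preventing overfitting to temporal patterns.
Show that the proposed method stays competitive in the i.i.d. setting and delivers strong gains under non-i.i.d. conditions.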

Experimental Results

Research Questions

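RQ1: How can test-time adaptation be made robust to non-i.i.d., temporally correlated test streams?
RQ2: Do IABN and PBRS improve performance on non-i.i.d. test data compared with state-of-the-art TTA methods?
RQ3: How do the degree of temporal correlation and the batch size affect TTA performance?
RQ4: Can NOTE remain competitive under i.i.d. conditions while excelling under non-i.i.d. conditions?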
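
To make "temporally correlated" concrete, here is a small sketch that contrasts a shuffled (i.i.d.-like) label stream with a class-sorted one and counts how many distinct classes each batch contains. The class-sorted stream is an extreme case chosen for illustration; the review does not describe how the paper constructs its streams, and the sizes below (10 classes, 64 samples per class, batches of 64) are assumptions.

```python
import random


def distinct_classes_per_batch(labels, batch_size):
    """Number of distinct labels in each consecutive batch of the stream."""
    return [
        len(set(labels[i:i + batch_size]))
        for i in range(0, len(labels), batch_size)
    ]


num_classes, per_class, batch_size = 10, 64, 64

# Class-sorted stream: consecutive samples always share a label.
sorted_stream = [c for c in range(num_classes) for _ in range(per_class)]

# Shuffled copy: an i.i.d.-like ordering of the same labels.
iid_stream = sorted_stream[:]
random.Random(0).shuffle(iid_stream)

sorted_diversity = distinct_classes_per_batch(sorted_stream, batch_size)
iid_diversity = distinct_classes_per_batch(iid_stream, batch_size)
```

Every batch of the sorted stream holds a single class, so statistics estimated from one batch describe only that class. That is the kind of bias the research questions are probing.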

Key Results

Classification error (%) on non-i.i.d. test streams; lower is better.
Method	CIFAR10-C	CIFAR100-C	ImageNet-C	Avg
Source	42.3 ± 1.1	66.6 ± 0.1	86.1 ± 0.0	65.0
BN Stats [29]	73.4 ± 1.3	65.0 ± 0.3	96.9 ± 0.0	78.5
ONDA [27]	63.6 ± 1.0	49.6 ± 0.3	89.0 ± 0.0	67.4
PL [22]	75.4 ± 1.8	66.4 ± 0.4	98.9 ± 0.0	80.2
TENT [41]	76.4 ± 2.7	66.9 ± 0.6	96.9 ± 0.0	80.1
LAME [4]	36.2 ± 1.3	63.3 ± 0.3	82.7 ± 0.0	60.7
CoTTA [44]	75.5 ± 0.7	64.2 ± 0.2	97.0 ± 0.0	78.9
NOTE	21.1 ± 0.6	47.0 ± 0.1	80.6 ± 0.1	49.6
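
The comparison in the table can be checked with a few lines of arithmetic. The sketch below copies the mean values from the table (standard deviations omitted), recomputes each row average, and finds the strongest baseline on CIFAR10-C. Averages recomputed from the rounded entries can differ from the Avg column in the last digit (BN Stats gives 78.4 here versus the listed 78.5), presumably because the column was averaged before rounding.

```python
# Mean values copied from the table: (CIFAR10-C, CIFAR100-C, ImageNet-C).
errors = {
    "Source": (42.3, 66.6, 86.1),
    "BN Stats": (73.4, 65.0, 96.9),
    "ONDA": (63.6, 49.6, 89.0),
    "PL": (75.4, 66.4, 98.9),
    "TENT": (76.4, 66.9, 96.9),
    "LAME": (36.2, 63.3, 82.7),
    "CoTTA": (75.5, 64.2, 97.0),
    "NOTE": (21.1, 47.0, 80.6),
}

# Row averages recomputed from the rounded table entries.
averages = {m: round(sum(v) / len(v), 1) for m, v in errors.items()}

# Strongest baseline on CIFAR10-C (lowest error, excluding NOTE itself).
baselines = {m: v for m, v in errors.items() if m != "NOTE"}
best_baseline = min(baselines, key=lambda m: baselines[m][0])

# Absolute gap in percentage points between that baseline and NOTE.
gap_cifar10 = round(baselines[best_baseline][0] - errors["NOTE"][0], 1)
```

On these numbers the best CIFAR10-C baseline is LAME at 36.2%, so NOTE's 21.1% corresponds to a gap of 15.1 percentage points.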

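On non-i.i.d. test streams, NOTE substantially outperforms the baselines (e.g., CIFAR10-C: 21.1% error vs. 36.2% for the best baseline, LAME).
Under i.i.d. conditions, NOTE achieves competitive performance (e.g., CIFAR10-C: 17.6% error vs. 17.8% for the best baseline).
Inference is a single forward pass per sample, with no batching required; adaptation uses the memory to update the BN statistics every N samples (N=64).
IABN alone substantially reduces error, PBRS improves the estimate of the normalization statistics, and combining the two gives the best results.
On real-world streams (KITTI, HARTH, ExtraSensory), NOTE consistently reduces post-adaptation error relative to the baselines.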
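
The review does not spell out how IABN "corrects normalization for out-of-distribution samples", so the following is a hedged, single-channel sketch of one plausible formulation. It keeps the learned batch statistics when the instance statistics are close to them and shifts toward the instance statistics only when they deviate beyond a threshold. The function names, the soft-shrinkage rule, the threshold formulas, and the default `k=4.0` are assumptions for illustration and are not taken from this review.

```python
import math


def soft_shrink(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def iabn_stats(values, mu_bar, var_bar, k=4.0):
    """Corrected (mean, variance) for one channel of one instance.

    values  : activations of the channel (needs at least 2 values)
    mu_bar  : learned batch mean (assumed given)
    var_bar : learned batch variance (assumed given)
    k       : hypothetical tolerance multiplier
    """
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / (n - 1)
    # Assumed tolerances: spread of the sample mean and sample variance
    # expected if the instance really followed the learned statistics.
    s_mu = math.sqrt(var_bar / n)
    s_var = var_bar * math.sqrt(2.0 / (n - 1))
    mu_out = mu_bar + soft_shrink(mu - mu_bar, k * s_mu)
    var_out = var_bar + soft_shrink(var - var_bar, k * s_var)
    return mu_out, var_out


def iabn_normalize(values, mu_bar, var_bar, k=4.0, eps=1e-5):
    """Normalize one channel with the corrected statistics."""
    mu, var = iabn_stats(values, mu_bar, var_bar, k)
    return [(v - mu) / math.sqrt(var + eps) for v in values]


# In-distribution instance: the learned statistics are kept unchanged.
in_dist = iabn_stats([-1.0, -0.5, 0.0, 0.5, 1.0], mu_bar=0.0, var_bar=1.0)

# Shifted instance: the mean moves most of the way toward the instance mean.
shifted_mu, shifted_var = iabn_stats(
    [10.0, 11.0, 12.0, 13.0, 14.0], mu_bar=0.0, var_bar=1.0
)
```

Under these assumptions, a sample that matches the learned statistics is normalized exactly as in standard batch normalization, while a strongly shifted sample has its mean corrected, which fits the claim that IABN alone already reduces error.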


This review was generated by AI and reviewed by a human editor.