QUICK REVIEW

[Paper Review] TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation

Hyesu Lim, Byeonggeun Kim | arXiv (Cornell University) | February 10, 2023

Citations: 21

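One-Line Summary

TTN is a domain-shift-aware BN layer that interpolates between conventional BN and test-time BN using channel-wise weights learned in a post-training phase. It improves robustness across test batch sizes and evaluation scenarios, and it can augment existing TTA methods at no additional inference cost.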

ABSTRACT

This paper proposes a novel batch normalization strategy for test-time adaptation. Recent test-time adaptation methods heavily rely on the modified batch normalization, i.e., transductive batch normalization (TBN), which calculates the mean and the variance from the current test batch rather than using the running mean and variance obtained from the source data, i.e., conventional batch normalization (CBN). Adopting TBN that employs test batch statistics mitigates the performance degradation caused by the domain shift. However, re-estimating normalization statistics using test data depends on impractical assumptions that a test batch should be large enough and be drawn from i.i.d. stream, and we observed that the previous methods with TBN show critical performance drop without the assumptions. In this paper, we identify that CBN and TBN are in a trade-off relationship and present a new test-time normalization (TTN) method that interpolates the statistics by adjusting the importance between CBN and TBN according to the domain-shift sensitivity of each BN layer. Our proposed TTN improves model robustness to shifted domains across a wide range of batch sizes and in various realistic evaluation scenarios. TTN is widely applicable to other test-time adaptation methods that rely on updating model parameters via backpropagation. We demonstrate that adopting TTN further improves their performance and achieves state-of-the-art performance in various standard benchmarks.

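Research Motivation and Goals

Motivate test-time adaptation (TTA) that stays robust under realistic domain shifts and varying batch sizes.
Address the trade-off between conventional BN (CBN) and transductive BN (TBN) in TTA.
Develop a post-training phase that learns channel-wise interpolation weights without modifying the pre-trained weights.
Show that TTN is compatible with existing TTA methods across classification and segmentation benchmarks.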

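Proposed Method

Define TTN as a channel-wise interpolation between source (CBN) and test (TBN) batch statistics, using a per-channel weight $\alpha$.
Derive the interpolated statistics as $\tilde{\mu} = \alpha \mu + (1-\alpha) \mu_s$ and $\tilde{\sigma}^2 = \alpha \sigma^2 + (1-\alpha) \sigma_s^2 + \alpha(1-\alpha)(\mu-\mu_s)^2$, where $(\mu, \sigma^2)$ are the test-batch statistics and $(\mu_s, \sigma_s^2)$ are the source statistics.
Introduce a post-training phase that first estimates a per-layer, per-channel prior for $\alpha$ from domain-shift sensitivity, then optimizes $\alpha$ with CE and MSE losses while the base weights stay frozen.
Compute a domain-shift distance score $d^{(l,c)}$ from the gradients of the BN affine parameters under clean and augmented inputs, and derive from it the prior $A$ used to initialize $\alpha$.
Replace BN with TTN during post-training and keep $\alpha$ fixed at test time, so the model adapts to the target domain while retaining source knowledge.
Apply TTN on top of existing normalization-based and optimization-based TTA methods to demonstrate compatibility with both.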
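The interpolation step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ttn_normalize`, the `(N, C)` feature layout, and the `eps` value are hypothetical choices made for this example, and the learnable affine parameters and the post-training of $\alpha$ are omitted.

```python
import numpy as np


def ttn_normalize(x, mu_s, var_s, alpha, eps=1e-5):
    """Normalize x of shape (N, C) with interpolated statistics.

    mu_s, var_s: frozen source (CBN) statistics, shape (C,)
    alpha: per-channel interpolation weight in [0, 1], shape (C,)
    All names here are illustrative, not taken from the paper's code.
    """
    x = np.asarray(x, dtype=np.float64)
    alpha = np.asarray(alpha, dtype=np.float64)

    # Test-batch (TBN) statistics, computed per channel over the batch axis.
    mu = x.mean(axis=0)
    var = x.var(axis=0)  # biased variance (ddof=0)

    # Interpolated statistics, following the formulas in the review.
    mu_t = alpha * mu + (1.0 - alpha) * mu_s
    var_t = (alpha * var + (1.0 - alpha) * var_s
             + alpha * (1.0 - alpha) * (mu - mu_s) ** 2)

    y = (x - mu_t) / np.sqrt(var_t + eps)
    return y, mu_t, var_t


# Toy usage: two samples, two channels, source statistics of a unit Gaussian.
x = np.array([[1.0, 2.0], [3.0, 6.0]])
mu_s = np.zeros(2)
var_s = np.ones(2)
_, mu_half, var_half = ttn_normalize(x, mu_s, var_s, alpha=np.array([0.5, 0.5]))
```

Setting `alpha` to 0 for a channel recovers the source statistics (CBN), and setting it to 1 recovers the test-batch statistics (TBN). The cross term $\alpha(1-\alpha)(\mu-\mu_s)^2$ is what makes $\tilde{\sigma}^2$ equal to the variance of a two-component mixture with weights $\alpha$ and $1-\alpha$, rather than a plain weighted average of the two variances.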

Figure 1: Trade-off between CBN & TBN. In conceptual illustrations (a), (b), and (c), the depicted standardization only considers making the feature distribution have a zero mean, disregarding making it have unit variance. When the source and test distributions are different, and the test batch size

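Experimental Results

Research Questions

RQ1: Can channel-wise interpolation between source and test batch statistics improve the robustness of TTA to domain shift across a range of batch sizes?
RQ2: Does a post-training phase that learns the interpolation weights before test time provide consistent gains across image classification and semantic segmentation benchmarks?
RQ3: How does TTN perform with normalization-based and optimization-based TTA methods under stationary, continuously changing, and mixed-domain adaptation?
RQ4: Can TTN preserve source-domain knowledge while adapting to an unseen target distribution?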

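Key Results

TTN outperforms existing normalization-based approaches across test batch sizes (a wide range from 200 down to 1) and across diverse adaptation scenarios.
Adopting TTN in existing TTA methods yields additional gains, especially with small batch sizes and in continual or mixed-domain adaptation.
TTN preserves source-domain knowledge better than methods that rely purely on test-batch statistics, which reduces the performance drop when domains differ.
The learned channel-wise interpolation weights indicate that shallow layers tend to benefit more from TBN while deeper layers rely more on CBN, consistent with the characteristics of domain shift.
The fixed TTN mixing ratios learned in post-training are effective across evaluation settings without changing the pre-trained weights.
TTN also improves semantic segmentation generalization on domain-shift benchmarks and is complementary to methods such as TENT and SWR.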
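A toy example can show why a batch size of 1 is a hard case for pure test-batch statistics. The sketch below is an illustration under assumptions, not a reproduction of the paper's experiments: it uses fully connected features of shape `(N, C)`, hypothetical helper names, and an arbitrary `alpha` of 0.3. In convolutional layers, statistics are also pooled over spatial positions, so the degenerate case is less extreme there.

```python
import numpy as np


def normalize(x, mean, var, eps=1e-5):
    """Standardize x with the given per-channel mean and variance."""
    return (x - mean) / np.sqrt(var + eps)


def interpolated_stats(x, mu_s, var_s, alpha):
    """Mix test-batch and source statistics with weight alpha (illustrative helper)."""
    mu, var = x.mean(axis=0), x.var(axis=0)
    mu_t = alpha * mu + (1.0 - alpha) * mu_s
    var_t = alpha * var + (1.0 - alpha) * var_s + alpha * (1.0 - alpha) * (mu - mu_s) ** 2
    return mu_t, var_t


# A "batch" containing a single sample with three channels.
x_single = np.array([[2.0, -1.0, 0.5]])
mu_s = np.zeros(3)   # assumed source running mean
var_s = np.ones(3)   # assumed source running variance

# Pure test-batch statistics: the batch mean equals the sample itself,
# so every normalized feature collapses to zero.
tbn_out = normalize(x_single, x_single.mean(axis=0), x_single.var(axis=0))

# Interpolated statistics keep the source statistics in play,
# so the sample's information survives normalization.
mu_t, var_t = interpolated_stats(x_single, mu_s, var_s, alpha=0.3)
ttn_out = normalize(x_single, mu_t, var_t)
```

With a single sample, `tbn_out` carries no information about the input, while `ttn_out` keeps the sign and relative scale of each feature. This is one intuition for why mixing in source statistics can help at small batch sizes.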

Figure 2: Method overview. (a) We introduce an additional training phase between pre-train and test time called (a-1) post-training phase. (b) Our proposed TTN layer combines per-batch statistics and frozen source statistics with interpolating weight $\alpha$ , which is (b-1) optimized in post-train


This review was generated by AI and reviewed by a human editor.