QUICK REVIEW

[Paper Review] Switching Temporary Teachers for Semi-Supervised Semantic Segmentation

Jaemin Na, Jung-Woo Ha | arXiv (Cornell University) | 2023. 10. 28.

Advanced Neural Network Applications | Citations: 16

One-Line Summary

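This paper proposes Dual Teacher, which alternates between two temporary EMA teachers to guide a single student in semi-supervised semantic segmentation, reducing teacher-student coupling and improving efficiency across benchmarks.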

ABSTRACT

The teacher-student framework, prevalent in semi-supervised semantic segmentation, mainly employs the exponential moving average (EMA) to update a single teacher's weights based on the student's. However, EMA updates raise a problem in that the weights of the teacher and student are getting coupled, causing a potential performance bottleneck. Furthermore, this problem may become more severe when training with more complicated labels such as segmentation masks but with few annotated data. This paper introduces Dual Teacher, a simple yet effective approach that employs dual temporary teachers aiming to alleviate the coupling problem for the student. The temporary teachers work in shifts and are progressively improved, so consistently prevent the teacher and student from becoming excessively close. Specifically, the temporary teachers periodically take turns generating pseudo-labels to train a student model and maintain the distinct characteristics of the student model for each epoch. Consequently, Dual Teacher achieves competitive performance on the PASCAL VOC, Cityscapes, and ADE20K benchmarks with remarkably shorter training times than state-of-the-art methods. Moreover, we demonstrate that our approach is model-agnostic and compatible with both CNN- and Transformer-based models. Code is available at https://github.com/naver-ai/dual-teacher.

Motivation and Objectives

Address the teacher–student coupling problem in semi-supervised segmentation, where EMA updates tie the teacher's weights too closely to the student's.
Propose Dual Teacher with two temporary EMA teachers that switch each epoch to diversify supervision.
Leverage strong augmentations for the student and weak augmentations for teachers to generate pseudo-labels.
Introduce implicit consistency learning via sub-models to form an implicit ensemble and improve robustness.

Proposed Method

Introduce Dual Teacher: two temporary EMA teachers that alternately generate pseudo-labels for a single student.
Change the student’s augmentation pool per epoch to maintain distinct student characteristics and induce teacher diversity.
Update each temporary teacher via EMA of the student weights, ensuring they reflect the evolving student but remain distinct.
Apply implicit consistency learning by enforcing consistency between the predictions of student sub-models, obtained with stochastic depth, and those of the full teacher models.
Optimize with supervised loss on labeled data and unsupervised loss on unlabeled data using pseudo-labels from the teachers.
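
The alternating-teacher update described in the list above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that model weights are plain dictionaries of floats, that only the teacher on duty in a given epoch receives the EMA update, and that a constant increment stands in for the student's gradient step. The names ema_update, active_teacher_index, and train, as well as the alpha and lr values, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Weights = Dict[str, float]


def ema_update(teacher: Weights, student: Weights, alpha: float = 0.99) -> None:
    """In-place EMA: teacher <- alpha * teacher + (1 - alpha) * student."""
    for name, value in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * value


def active_teacher_index(epoch: int, num_teachers: int = 2) -> int:
    """Teachers take turns: teacher 0 on even epochs, teacher 1 on odd epochs."""
    return epoch % num_teachers


def train(
    student: Weights,
    teachers: List[Weights],
    num_epochs: int,
    steps_per_epoch: int,
    alpha: float = 0.99,
    lr: float = 0.1,
) -> List[int]:
    """Run the switching schedule and return which teacher was active per epoch."""
    schedule = []
    for epoch in range(num_epochs):
        idx = active_teacher_index(epoch, len(teachers))
        schedule.append(idx)
        for _ in range(steps_per_epoch):
            # Placeholder for the real student step (supervised loss plus
            # unsupervised loss on pseudo-labels from teachers[idx]).
            for name in student:
                student[name] += lr
            # Assumption: only the teacher currently on duty is updated.
            ema_update(teachers[idx], student, alpha)
    return schedule


student = {"w": 0.0}
teachers = [{"w": 0.0}, {"w": 0.0}]
schedule = train(student, teachers, num_epochs=4, steps_per_epoch=1, alpha=0.5, lr=1.0)
print(schedule, teachers)
```

Because each teacher is refreshed only during its own epochs in this sketch, the two teachers average different stretches of the student's trajectory and end up with different weights, which is the intuition behind the reduced coupling.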

Experimental Results

Research Questions

RQ1: Can switching dual temporary teachers mitigate teacher–student coupling in SSL for semantic segmentation?
RQ2: Does alternating teacher switching with diverse augmentations improve segmentation accuracy on standard benchmarks?
RQ3: Is the approach model-agnostic across CNN and Transformer-based backbones?
RQ4: What is the impact of implicit consistency learning in this framework?

Key Results

Dual Teacher achieves competitive mIoU on PASCAL VOC, Cityscapes, and ADE20K with shorter training times and fewer parameters.
Prediction distance analyses show single-EMA teachers remain highly coupled to the student, whereas Dual Teacher maintains diverse supervision.
Two temporary teachers provide distinct category-wise supervision, leading to complementary guidance for the student.
Increasing the number of augmentations or teachers yields gains up to a point, with dual teachers plus augmentations offering the best performance.
Implicit consistency learning with uniform decay improves stability and final accuracy.
On ADE20K with SegFormer, Dual Teacher outperforms supervised baselines across all partitions and maintains efficiency.

This review was generated by AI and reviewed by a human editor.