QUICK REVIEW

[Paper Review] TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

Vladimir Iglovikov, Alexey A. Shvets|arXiv (Cornell University)|2018. 01. 17.

Advanced Neural Network Applications|References: 11|Citations: 576

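One-Line Summary

TernausNet replaces the U-Net encoder with a VGG11 encoder pre-trained on ImageNet and compares three weight initialization schemes, showing that pre-training yields faster convergence and a higher validation IoU.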

ABSTRACT

Pixel-wise image segmentation is demanding task in computer vision. Classical U-Net architectures composed of encoders and decoders are very popular for segmentation of medical images, satellite images etc. Typically, neural network initialized with weights from a network pre-trained on a large data set like ImageNet shows better performance than those trained from scratch on a small dataset. In some practical applications, particularly in medicine and traffic safety, the accuracy of the models is of utmost importance. In this paper, we demonstrate how the U-Net type architecture can be improved by the use of the pre-trained encoder. Our code and corresponding pre-trained weights are publicly available at https://github.com/ternaus/TernausNet. We compare three weight initialization schemes: LeCun uniform, the encoder with weights from VGG11 and full network trained on the Carvana dataset. This network architecture was a part of the winning solution (1st out of 735) in the Kaggle: Carvana Image Masking Challenge.

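Motivation and Objectives

Motivate and demonstrate that pre-training the encoder benefits U-Net on pixel-wise segmentation tasks.
Evaluate three weight initialization schemes and measure their effect on segmentation performance.
Show improved convergence speed and final IoU on a real aerial imagery dataset.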

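Proposed Method

Replace the U-Net encoder with a VGG11 encoder pre-trained on ImageNet (fully connected layers removed).
Compare three initialization schemes: LeCun uniform, a VGG11 encoder pre-trained on ImageNet, and a full network pre-trained on the Carvana dataset.
Train on the Inria Aerial Image Labeling Dataset for 100 epochs with the Adam optimizer, using IoU as the evaluation metric.
Threshold the predicted probability map at 0.3 to obtain the binary pixel mask.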
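
As a minimal sketch of two steps in the list above, the following Python snippet shows the sampling bound of a LeCun uniform initializer, and the thresholding of a probability map at 0.3 followed by an IoU computation. It assumes the common definition of LeCun uniform (samples from U(-limit, limit) with limit = sqrt(3 / fan_in)), a strict `>` comparison at the threshold, and an IoU of 1.0 when both masks are empty. None of these details are stated in this review, and all function names and input values are hypothetical.

```python
import math
import random
from typing import List

Mask = List[List[int]]


def lecun_uniform_limit(fan_in: int) -> float:
    """Bound of U(-limit, limit); assumes limit = sqrt(3 / fan_in)."""
    return math.sqrt(3.0 / fan_in)


def lecun_uniform_sample(fan_in: int, n: int, seed: int = 0) -> List[float]:
    """Draw n weights for a layer with the given fan-in."""
    rng = random.Random(seed)
    limit = lecun_uniform_limit(fan_in)
    return [rng.uniform(-limit, limit) for _ in range(n)]


def binarize(prob_map: List[List[float]], threshold: float = 0.3) -> Mask:
    """Turn per-pixel probabilities into a 0/1 mask (strict '>' is an assumption)."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t
            union += p | t
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# A 3x3 convolution over 3 input channels has fan_in = 3 * 3 * 3 = 27.
first_conv_limit = lecun_uniform_limit(3 * 3 * 3)

# Toy probability map and ground truth (made-up values for illustration).
probabilities = [[0.9, 0.4, 0.1], [0.2, 0.35, 0.05]]
ground_truth = [[1, 1, 0], [0, 0, 0]]
predicted_mask = binarize(probabilities)
score = iou(predicted_mask, ground_truth)
```

On the toy input, three pixels exceed the threshold and two of them overlap the ground truth, so the IoU is 2/3.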

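Experimental Results

Research Questions

RQ1: Does initializing the U-Net encoder with ImageNet pre-trained VGG11 weights improve segmentation performance over random initialization?
RQ2: How does pre-training the full network on Carvana compare with an ImageNet pre-trained encoder in terms of IoU and convergence speed?
RQ3: Can pre-training reduce training time and raise the final validation IoU for building segmentation in aerial imagery?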

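Key Results

Pre-training the encoder on ImageNet raises IoU from 0.593 (random initialization) to 0.686.
Pre-training the full network on Carvana yields an IoU of 0.687, on par with the ImageNet pre-trained encoder.
Pre-trained models converge faster than the randomly initialized model and plateau at a higher IoU.
The method is demonstrated on the Inria Aerial Image Labeling Dataset, using 150 images for training and 30 for validation.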
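
To put the reported numbers in proportion, the short sketch below computes the relative IoU gain of each pre-trained variant over the random-initialization baseline. The IoU values are the ones listed above; the dictionary keys and the `relative_gain` helper are hypothetical labels introduced only for this illustration.

```python
# IoU values as reported in this review.
reported_iou = {
    "lecun_uniform": 0.593,
    "vgg11_imagenet_encoder": 0.686,
    "carvana_full_pretrain": 0.687,
}


def relative_gain(baseline: float, improved: float) -> float:
    """Fractional improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


baseline = reported_iou["lecun_uniform"]
gains = {
    name: relative_gain(baseline, value)
    for name, value in reported_iou.items()
    if name != "lecun_uniform"
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1%} relative to random initialization")
```

Both pre-trained variants come out at a relative gain of roughly 16 percent, and the gap between them (0.001 IoU) is small by comparison.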

This review was generated by AI and checked by a human editor.