QUICK REVIEW

[Paper Review] ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time

Rudra P. K. Poudel, Ujwal Bonde | arXiv (Cornell University) | May 11, 2018

Advanced Neural Network Applications | References: 18 | Citations: 189

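One-line summary

ContextNet combines a deep, low-resolution context branch with a shallow, high-resolution detail branch to deliver real-time semantic segmentation with low memory use; it reaches 66.1% mIoU at 18.3 fps on full-resolution Cityscapes images.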
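For readers unfamiliar with the metric, mIoU (mean intersection over union) averages, over classes, the ratio of correctly labelled pixels to the union of predicted and ground-truth pixels for that class. The sketch below computes it from a confusion matrix. The function name `mean_iou` and the toy matrix are illustrative assumptions, and the official Cityscapes evaluation may differ in details such as ignored labels.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] = number of pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                              # true positives
        fn = sum(confusion[c]) - tp                       # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp  # pixels wrongly labelled c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (made-up counts, not data from the paper).
toy = [[3, 1],
       [1, 5]]
print(mean_iou(toy))  # average of 3/5 and 5/7
```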

ABSTRACT

Modern deep learning architectures produce highly accurate results on many challenging semantic segmentation datasets. State-of-the-art methods are, however, not directly transferable to real-time applications or embedded devices, since naive adaptation of such systems to reduce computational cost (speed, memory and energy) causes a significant drop in accuracy. We propose ContextNet, a new deep neural network architecture which builds on factorized convolution, network compression and pyramid representation to produce competitive semantic segmentation in real-time with low memory requirement. ContextNet combines a deep network branch at low resolution that captures global context information efficiently with a shallow branch that focuses on high-resolution segmentation details. We analyse our network in a thorough ablation study and present results on the Cityscapes dataset, achieving 66.1% accuracy at 18.3 frames per second at full (1024x2048) resolution (41.9 fps with pipelined computations for streamed data).

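Motivation and objectives

Motivate real-time semantic segmentation with a low memory footprint for autonomous driving and embedded devices.
Propose the ContextNet architecture, which fuses global context from a downsampled branch with high-resolution local detail.
Evaluate the approach on the Cityscapes dataset with a detailed ablation study.
Show that depthwise separable convolutions and pruning enable efficient yet accurate performance.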

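Proposed method

A two-branch architecture: a deep low-resolution branch for global context and a shallow high-resolution branch for detail refinement.
Depthwise separable convolutions and bottleneck residual blocks reduce the number of parameters and operations.
A fusion unit combines the two branches by adding their features, and a 1x1 convolution produces the final prediction.
An auxiliary loss on the low-resolution branch encourages meaningful global context features.
Training uses standard data augmentation and the RMSprop optimizer; batch normalization and ReLU6 are used for robustness in low-precision settings.
After training, the network is pruned to explore smaller and faster variants (lottery-ticket-style pruning).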
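Two of the ideas above lend themselves to a small worked example: why a depthwise separable convolution needs fewer weights than a standard one, and what "fusing by addition" means when the two branches have different resolutions. The following is a minimal sketch under stated assumptions, not the authors' implementation. All function names are hypothetical, the channel sizes are arbitrary, nearest-neighbour upsampling is an assumption, and applying ReLU6 after the addition is an illustrative choice.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv plus a 1x1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1x1 conv that mixes channels
    return depthwise + pointwise


def relu6(x):
    """ReLU capped at 6."""
    return min(max(x, 0.0), 6.0)


def upsample_nearest(feature, factor):
    """Nearest-neighbour upsampling of a 2D feature map stored as nested lists."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_by_addition(detail, context, factor):
    """Upsample the low-resolution context map, then add it to the detail map.

    Assumption: a single channel and ReLU6 after the sum, for illustration only.
    """
    up = upsample_nearest(context, factor)
    return [[relu6(d + c) for d, c in zip(d_row, c_row)]
            for d_row, c_row in zip(detail, up)]


# Arbitrary example sizes: 3x3 kernel, 32 input channels, 64 output channels.
print(standard_conv_params(3, 32, 64))        # 18432
print(depthwise_separable_params(3, 32, 64))  # 2336

# A 1x1 context map fused with a 2x2 detail map (upsampling factor 2).
print(fuse_by_addition([[-3.0, 2.0], [4.0, 7.0]], [[1.0]], 2))
# [[0.0, 3.0], [5.0, 6.0]]
```

For these example sizes the separable form uses roughly one eighth of the weights, which is the kind of saving that makes a deep context branch affordable.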

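Experimental results

Research questions

RQ1: Can a two-branch network that captures global context at low resolution and local detail at full resolution achieve real-time semantic segmentation without losing accuracy?
RQ2: How do depthwise convolutions and bottleneck blocks affect accuracy, speed, and memory on Cityscapes-scale data?
RQ3: How does network pruning affect ContextNet's mIoU and its runtime in embedded and real-time settings?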

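Key results

After pruning, ContextNet achieves 66.1% mIoU on the Cityscapes test set.
Without pruning, ContextNet achieves 64.2% mIoU on 1024×2048 images at 18.3 fps (measured on a Titan X with a single CPU thread).
The two-branch design, with a low-resolution context branch and a shallow full-resolution detail branch, balances accuracy and real-time performance.
Pruning improves mIoU on the Cityscapes test set from 64.2% to 66.1%.
ContextNet runs at 18.3 fps at full resolution and can reach 41.9 fps with pipelined computation on streamed data.
Compared with several real-time competitors, ContextNet offers competitive accuracy and a lower memory footprint (0.85M parameters in the base variant).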
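To put the frame rates above in more familiar terms, the short sketch below converts them into time per frame. This is plain arithmetic on the reported numbers, not an additional measurement, and the helper name `ms_per_frame` is hypothetical. For the pipelined figure, the result is the interval between finished frames (throughput), which is not necessarily the end-to-end latency of a single frame.

```python
def ms_per_frame(fps):
    """Average milliseconds between completed frames at a given frame rate."""
    return 1000.0 / fps


sequential_fps = 18.3  # reported full-resolution rate
pipelined_fps = 41.9   # reported rate with pipelined computation on streamed data

print(round(ms_per_frame(sequential_fps), 1))      # about 54.6 ms per frame
print(round(ms_per_frame(pipelined_fps), 1))       # about 23.9 ms between frames
print(round(pipelined_fps / sequential_fps, 2))    # throughput gain, about 2.29x
```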

This review was generated by AI and checked by a human editor.