QUICK REVIEW

[Paper Review] A Review on Deep Learning Techniques Applied to Semantic Segmentation

Alberto García-García, Sergio Orts‐Escolano | arXiv (Cornell University) | 2017-04-22

Advanced Neural Network Applications | References: 96 | Citations: 1,035

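One-Line Summary

This paper surveys deep learning methods for semantic segmentation, reviews the relevant datasets and challenges, and discusses performance and future directions in the field.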

ABSTRACT

Image semantic segmentation is more and more being of interest for computer vision and machine learning researchers. Many applications on the rise need accurate and efficient segmentation mechanisms: autonomous driving, indoor navigation, and even virtual or augmented reality systems to name a few. This demand coincides with the rise of deep learning approaches in almost every field or application target related to computer vision, including semantic segmentation or scene understanding. This paper provides a review on deep learning methods for semantic segmentation applied to various application areas. Firstly, we describe the terminology of this field as well as mandatory background concepts. Next, the main datasets and challenges are exposed to help researchers decide which are the ones that best suit their needs and their targets. Then, existing methods are reviewed, highlighting their contributions and their significance in the field. Finally, quantitative results are given for the described methods and the datasets in which they were evaluated, following up with a discussion of the results. At last, we point out a set of promising future works and draw our own conclusions about the state of the art of semantic segmentation using deep learning techniques.

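Research Motivation and Objectives

Provide a broad survey of datasets useful for segmentation with deep learning techniques.
Provide a systematic review of the most significant deep learning methods for semantic segmentation and their contributions.
Summarize performance metrics such as accuracy, speed, and memory, and compare methods across datasets.
Discuss open challenges in semantic segmentation and suggest future research directions using deep learning.
Present context and the state of the art for researchers entering or leading the field.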
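
To make the accuracy objective above concrete, the following is a minimal sketch of two metrics commonly reported for segmentation: pixel accuracy and mean intersection over union (mean IoU). This summary does not name the specific metrics the paper uses, so the choice of these two, the function names, and the toy label maps are all assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical toy data: two flattened 2x3 label maps with three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

Mean IoU penalizes both missed pixels and false alarms per class, so it is less dominated by large classes such as background than plain pixel accuracy is.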

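Approach

Describes the semantic segmentation problem and its per-pixel labeling formulation.
Reviews common deep network architectures frequently used as building blocks (e.g., AlexNet, VGG, GoogLeNet, ResNet, ReNet).
Discusses transfer learning and fine-tuning strategies for semantic segmentation networks.
Explains data preprocessing and augmentation techniques for improving generalization and training efficiency.
Presents and categorizes the main 2D, 2.5D (RGB-D), and 3D datasets and benchmarks, including their properties and splits.
Provides qualitative and quantitative performance evaluations of the methods on the cited datasets.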
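
The per-pixel labeling formulation mentioned above can be illustrated with a small sketch: a model is assumed to produce one score per class for every pixel, and the label map takes the highest-scoring class at each location. The class names, the scores, and the `label_map` helper are hypothetical and are not taken from the paper.

```python
from typing import List

# scores[row][col][class]: one score per class for every pixel.
Scores = List[List[List[float]]]

# Hypothetical class list, for illustration only.
CLASSES = ["background", "road", "car"]


def label_map(scores: Scores) -> List[List[int]]:
    """Assign each pixel the index of its highest-scoring class."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Hypothetical scores for a 2x2 image.
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]],
]
labels = label_map(scores)
named = [[CLASSES[c] for c in row] for row in labels]
print(labels)
print(named)
```

The output has the same height and width as the input, which is what separates segmentation from image classification, where a single label covers the whole image.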

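Experimental Results

Research Questions

RQ1: Which datasets and benchmarks are most representative for evaluating deep learning-based semantic segmentation methods?
RQ2: Which architectures and training strategies have proven most effective for segmentation tasks?
RQ3: How do deep learning approaches compare with traditional methods in accuracy, speed, and memory usage across datasets?
RQ4: What are the main challenges and future directions in semantic segmentation?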

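Key Findings

The review consolidates a wide range of datasets across 2D, 2.5D, and 3D, clarifying their purpose, classes, format, and splits.
Deep learning-based semantic segmentation methods generally outperform traditional approaches, and they rely on transfer learning and pretrained networks to cope with limited annotated data.
Transfer learning and fine-tuning are common strategies because datasets with per-pixel labels for segmentation are smaller than those available for classification.
Data augmentation and preprocessing are essential for improving generalization, particularly on small datasets.
The paper offers a performance-centered discussion and proposes future research directions for advancing semantic segmentation with deep learning.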
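
As an illustration of the data augmentation point above, here is a minimal sketch of one detail specific to segmentation: any geometric transformation must be applied identically to the image and to its label mask, or the per-pixel labels stop lining up. The summary does not say which transformations the paper covers, so the random flip and crop here, along with all function names, are assumptions.

```python
import random
from typing import List, Tuple

Grid = List[List[int]]


def hflip(grid: Grid) -> Grid:
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def crop(grid: Grid, top: int, left: int, height: int, width: int) -> Grid:
    """Cut a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in grid[top:top + height]]


def augment_pair(image: Grid, mask: Grid, crop_size: Tuple[int, int],
                 rng: random.Random) -> Tuple[Grid, Grid]:
    """Apply the same random flip and crop to an image and its mask."""
    h, w = len(image), len(image[0])
    ch, cw = crop_size
    top = rng.randint(0, h - ch)    # randint is inclusive on both ends
    left = rng.randint(0, w - cw)
    if rng.random() < 0.5:          # flip both or neither
        image, mask = hflip(image), hflip(mask)
    return crop(image, top, left, ch, cw), crop(mask, top, left, ch, cw)


# Hypothetical 4x4 image; the toy mask marks odd-valued pixels as class 1.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [[value % 2 for value in row] for row in image]

rng = random.Random(0)
aug_image, aug_mask = augment_pair(image, mask, crop_size=(2, 3), rng=rng)
print(aug_image)
print(aug_mask)
```

Drawing the random parameters once and reusing them for both grids is the design choice that keeps every label attached to the pixel it describes.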

This review was generated by AI and reviewed by a human editor.