QUICK REVIEW

[Paper Review] MobileNetV2: Inverted Residuals and Linear Bottlenecks

Mark Sandler, Andrew Howard | arXiv (Cornell University) | January 13, 2018

Advanced Neural Network Applications | References: 41 | Citations: 2,267

One-line summary

The paper introduces MobileNetV2, a memory-efficient mobile CNN built on inverted residuals and linear bottlenecks that reduces computation while improving accuracy. It also proposes SSDLite for efficient object detection and Mobile DeepLabv3 for mobile semantic segmentation.

ABSTRACT

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. The MobileNetV2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers opposite to traditional residual models which use expanded representations in the input. MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on Imagenet classification, COCO object detection, VOC image segmentation. We evaluate the trade-offs between accuracy, and number of operations measured by multiply-adds (MAdd), as well as the number of parameters.

Motivation and objectives

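Design a mobile-friendly neural network architecture with high accuracy and low computational cost.
Introduce inverted residuals with linear bottlenecks to reduce memory use while preserving information.
Demonstrate applicability to mobile object detection and semantic segmentation through lightweight frameworks.
Provide a memory-efficient inference strategy suited to embedded hardware.
Compare performance against MobileNetV1 and other mobile models on the ImageNet, COCO, and VOC benchmarks.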

Proposed method

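Proposes a bottleneck depthwise separable convolution: a 1x1 expansion step followed by a depthwise convolution and a linear 1x1 projection.
Uses residual (shortcut) connections between the bottleneck layers (inverted residuals) to improve gradient flow and memory efficiency.
Keeps the bottleneck (projection) layer linear, with no non-linearity, to preserve information in the low-dimensional space.
Adopts the ReLU6 non-linearity for robustness under low-precision computation.
Evaluates the architecture across a range of widths and input resolutions with a fixed expansion factor (typically 6).
Introduces SSDLite for mobile object detection by replacing the regular convolutions in the SSD prediction layers with depthwise separable convolutions.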
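
To make the cost of the block described above concrete, here is a minimal sketch that counts multiply-adds for one stride-1 bottleneck block (1x1 expansion, k x k depthwise convolution, 1x1 linear projection). The total matches the expression h * w * d' * t * (d' + k^2 + d'') given in the paper. The function names and the example sizes (a 56x56 feature map with 24 channels) are illustrative assumptions, not values taken from this review.

```python
def bottleneck_madds(h, w, d_in, d_out, t=6, k=3):
    """Multiply-adds for one inverted residual block (stride 1).

    Layers: 1x1 expansion (d_in -> t*d_in), k x k depthwise conv on
    t*d_in channels, 1x1 linear projection (t*d_in -> d_out).
    """
    expanded = t * d_in
    expand = h * w * d_in * expanded        # 1x1 pointwise expansion
    depthwise = h * w * expanded * k * k    # one k x k filter per channel
    project = h * w * expanded * d_out      # 1x1 linear projection, no ReLU
    return expand + depthwise + project


def standard_conv_madds(h, w, d_in, d_out, k=3):
    """Multiply-adds for a regular k x k convolution, for comparison."""
    return h * w * d_in * d_out * k * k


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    h = w = 56
    d = 24
    block = bottleneck_madds(h, w, d, d)
    wide = standard_conv_madds(h, w, 6 * d, 6 * d)
    print(f"bottleneck block:           {block:,} MAdds")
    print(f"3x3 conv at expanded width: {wide:,} MAdds")
```

The comparison against a regular convolution at the expanded width is only meant to suggest why filtering in the wide intermediate layer stays affordable when it is done depthwise; it is not a measurement of any real network.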

Experimental results

Research questions

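RQ1: Can inverted residuals and linear bottlenecks improve accuracy on mobile vision tasks under a low computation budget?
RQ2: To what extent does decoupling the input/output domains (capacity) from the expressiveness of the transformation affect performance and memory use?
RQ3: What are the trade-offs among accuracy, multiply-adds (MAdds), latency, and parameter count for MobileNetV2 at different scales?
RQ4: How can the mobile-optimized architecture be extended to object detection (SSDLite) and segmentation (Mobile DeepLabv3) with minimal overhead?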
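
One way to get a feel for RQ3 is to see how the multiply-add count of a single bottleneck block changes with a width multiplier and the input resolution. The sketch below is an illustration under stated assumptions, not the paper's evaluation code: the helper names, the base sizes (a 56x56 map with 32 channels), and the plain rounding of channel counts are all hypothetical, and real implementations typically round channels to a hardware-friendly multiple.

```python
def block_madds(size, channels, t=6, k=3):
    """MAdds of one stride-1 bottleneck block on a size x size map."""
    return size * size * channels * t * (channels + k * k + channels)


def scaled_madds(base_size, base_channels, width_mult, resolution_scale):
    """Apply a width multiplier to channels and a scale to resolution."""
    size = round(base_size * resolution_scale)
    # Plain rounding is an assumption made for this sketch.
    channels = max(1, round(base_channels * width_mult))
    return block_madds(size, channels)


if __name__ == "__main__":
    base = scaled_madds(56, 32, 1.0, 1.0)
    for width_mult in (0.5, 0.75, 1.0, 1.4):
        for resolution_scale in (0.5, 1.0):
            cost = scaled_madds(56, 32, width_mult, resolution_scale)
            print(f"width={width_mult} res={resolution_scale} "
                  f"relative MAdds={cost / base:.2f}")
```

Under these assumptions, halving the resolution divides the block's multiply-adds by four, while halving the width reduces them by somewhat less than four because the depthwise term grows only linearly with the channel count.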

Key findings

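MobileNetV2 achieves competitive ImageNet Top-1 accuracy with substantially fewer parameters and multiply-adds than many baselines.
Inverted residuals and linear bottlenecks provide memory-efficient feature transformations and improved gradient flow.
Non-linearities in narrow bottlenecks degrade performance; linear bottlenecks help preserve information and improve accuracy.
SSDLite substantially reduces parameters and computation on COCO object detection while maintaining accuracy relative to larger detectors.
MobileNetV2 + SSDLite outperforms YOLOv2 on COCO in efficiency and model size.
Equipping MobileNetV2 with a DeepLabv3-based head gives a favorable accuracy/computation trade-off for mobile semantic segmentation.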
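
The saving behind SSDLite can be illustrated by counting the weights in a single prediction layer. This is a rough sketch under stated assumptions: the function names and channel sizes are hypothetical, biases and normalization parameters are ignored, and it covers one layer rather than a full detector.

```python
def regular_conv_params(c_in, c_out, k=3):
    """Weights in a regular k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1x1 conv mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical prediction layer: 512 input channels,
    # 6 anchors x 4 box offsets as outputs.
    c_in, c_out = 512, 6 * 4
    regular = regular_conv_params(c_in, c_out)
    separable = separable_conv_params(c_in, c_out)
    print(f"regular:   {regular:,} weights")
    print(f"separable: {separable:,} weights")
    print(f"ratio:     {regular / separable:.1f}x")
```

The ratio depends on the channel counts: the separable form saves the most when the output has many channels, since the k x k cost is then paid only once per input channel.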

This review was generated by AI and reviewed by a human editor.