QUICK REVIEW

[Paper Review] MobileNetV2: Inverted Residuals and Linear Bottlenecks

Mark Sandler, Andrew Howard | arXiv (Cornell University) | Jan 13, 2018

Advanced Neural Network Applications | References: 41 | Citations: 2,267

One-line summary

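The paper introduces MobileNetV2, a memory-efficient mobile CNN built on inverted residuals and linear bottlenecks that improves accuracy while reducing computation. It also proposes SSDLite for efficient object detection and Mobile DeepLabv3 for mobile semantic segmentation.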

ABSTRACT

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. The MobileNetV2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers opposite to traditional residual models which use expanded representations in the input. MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet classification, COCO object detection, VOC image segmentation. We evaluate the trade-offs between accuracy, and number of operations measured by multiply-adds (MAdd), as well as the number of parameters.

Motivation and objectives

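Design a mobile-friendly neural network architecture that achieves high accuracy at low computational cost.
Introduce inverted residuals with linear bottlenecks to reduce memory usage while preserving information.
Demonstrate applicability to mobile object detection and semantic segmentation through lightweight frameworks.
Provide a memory-efficient inference strategy suited to embedded hardware.
Compare performance against MobileNetV1 and other mobile models on the ImageNet, COCO, and VOC benchmarks.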

Proposed method

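Proposes a bottleneck block in which an expansion stage is followed by a depthwise separable convolution, that is, a depthwise convolution combined with a linear projection.
Places residual connections between the bottleneck layers (inverted residuals) to improve gradient flow and memory efficiency.
Keeps the bottleneck (projection) layer linear, with no non-linearity, so that information is preserved in the low-dimensional space.
Adopts the ReLU6 non-linearity for robustness under low-precision computation.
Uses a fixed expansion factor (typically 6) and evaluates the architecture across a range of widths and input resolutions.
Introduces SSDLite for mobile object detection by replacing the convolutions in the SSD prediction layers with depthwise separable convolutions.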
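
The block structure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: batch normalization, strides other than 1, and training are omitted, feature maps are nested lists indexed as x[row][col][channel], and every function name here is hypothetical.

```python
def relu6(v):
    """ReLU6 clamps a value to the range [0, 6]."""
    return min(max(v, 0.0), 6.0)


def apply_relu6(x):
    """Apply ReLU6 to every element of a feature map."""
    return [[[relu6(v) for v in pixel] for pixel in row] for row in x]


def pointwise_conv(x, weights):
    """1x1 convolution: weights[c_in][c_out] mixes channels at each pixel."""
    c_out = len(weights[0])
    return [
        [
            [sum(pixel[i] * weights[i][o] for i in range(len(pixel)))
             for o in range(c_out)]
            for pixel in row
        ]
        for row in x
    ]


def depthwise_conv3x3(x, kernels):
    """3x3 depthwise convolution (stride 1, zero padding), one kernel per channel."""
    height, width, channels = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for ch in range(channels):
                acc = 0.0
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < height and 0 <= cc < width:
                            acc += x[rr][cc][ch] * kernels[ch][dr + 1][dc + 1]
                out[r][c][ch] = acc
    return out


def inverted_residual(x, w_expand, k_depthwise, w_project):
    """Expand, filter per channel, project linearly, then add the shortcut."""
    h = apply_relu6(pointwise_conv(x, w_expand))        # narrow -> wide, ReLU6
    h = apply_relu6(depthwise_conv3x3(h, k_depthwise))  # per-channel 3x3, ReLU6
    y = pointwise_conv(h, w_project)                    # wide -> narrow, no activation
    if len(y[0][0]) == len(x[0][0]):                    # shortcut only if channels match
        y = [
            [[a + b for a, b in zip(py, px)] for py, px in zip(row_y, row_x)]
            for row_y, row_x in zip(y, x)
        ]
    return y
```

With an expansion factor of 6, a 2-channel input would use a `w_expand` of shape 2 x 12, twelve 3x3 depthwise kernels, and a `w_project` of shape 12 x 2. Because the last step has no activation, the narrow output can carry negative values, which is the point of the linear bottleneck.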

Experimental results

Research questions

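RQ1: Do inverted residuals with linear bottlenecks improve accuracy on mobile vision tasks under low compute budgets?
RQ2: How does decoupling the input/output domains (capacity) from the expressiveness of the transformation affect performance and memory usage?
RQ3: What are the trade-offs among accuracy, multiply-adds (MAdds), latency, and parameter count for MobileNetV2 at different scales?
RQ4: How can a mobile-optimized architecture be extended to object detection (SSDLite) and segmentation (Mobile DeepLabv3) with minimal overhead?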

Key findings

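MobileNetV2 achieves competitive ImageNet top-1 accuracy with far fewer parameters and multiply-adds than many other baselines.
Inverted residuals with linear bottlenecks provide memory-efficient feature transformations and improved gradient flow.
Non-linearities in narrow bottlenecks degrade performance, whereas linear bottlenecks preserve information and improve accuracy.
SSDLite substantially reduces the parameters and computation needed for COCO object detection while remaining competitive in accuracy with larger detectors.
In the reported configuration, MobileNetV2 + SSDLite outperforms YOLOv2 on COCO in terms of efficiency and model size.
MobileNetV2 combined with a DeepLabv3-based head offers a good accuracy/computation trade-off for mobile semantic segmentation.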
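
The savings in multiply-adds and parameters mentioned above follow from simple counting. The sketch below uses the standard cost expressions for a k x k convolution, its depthwise separable replacement, and a stride-1 bottleneck block with expansion factor t. The function names and the example layer size (14 x 14, 64 channels) are illustrative assumptions rather than figures from the paper, and biases are ignored.

```python
def conv_madds(h, w, c_in, c_out, k=3):
    """Multiply-adds of a standard k x k convolution on an h x w feature map."""
    return h * w * c_in * c_out * k * k


def separable_madds(h, w, c_in, c_out, k=3):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    return h * w * c_in * (k * k + c_out)


def bottleneck_madds(h, w, c_in, c_out, t=6, k=3):
    """Stride-1 bottleneck block: 1x1 expand, k x k depthwise, 1x1 linear project."""
    return h * w * (t * c_in) * (c_in + k * k + c_out)


def conv_params(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_params(c_in, c_out, k=3):
    """Weights of a depthwise k x k plus 1x1 pointwise convolution (biases ignored)."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Hypothetical layer: 14 x 14 feature map, 64 input and 64 output channels.
    h, w, c = 14, 14, 64
    ratio = conv_madds(h, w, c, c) / separable_madds(h, w, c, c)
    print(f"standard conv MAdds:  {conv_madds(h, w, c, c):,}")
    print(f"separable conv MAdds: {separable_madds(h, w, c, c):,}")
    print(f"reduction factor:     {ratio:.2f}")
    print(f"bottleneck MAdds:     {bottleneck_madds(h, w, c, c):,}")
```

For 3 x 3 kernels the reduction factor approaches 9 as the output channel count grows, which is why swapping the regular convolutions in the SSD prediction layers for separable ones shrinks the detector so sharply.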

This review was generated by AI and checked by a human editor.