QUICK REVIEW

[Paper Review] ResizeMix: Mixing Data with Preserved Object Information and True Labels

Jie Qin, Jiemin Fang|arXiv (Cornell University)|Dec 21, 2020

Advanced Neural Network Applications|References: 54|Citations: 39

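One-Sentence Summary

ResizeMix resizes the entire source image into a small patch and pastes it onto a random region of the target image, preserving object information and true labels at no additional computation cost; it outperforms CutMix and saliency-guided augmentation on classification and improves generalization in object detection.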

ABSTRACT

Data augmentation is a powerful technique to increase the diversity of data, which can effectively improve the generalization ability of neural networks in image recognition tasks. Recent data mixing based augmentation strategies have achieved great success. Especially, CutMix uses a simple but effective method to improve the classifiers by randomly cropping a patch from one image and pasting it on another image. To further promote the performance of CutMix, a series of works explore to use the saliency information of the image to guide the mixing. We systematically study the importance of the saliency information for mixing data, and find that the saliency information is not so necessary for promoting the augmentation performance. Furthermore, we find that the cutting based data mixing methods carry two problems of label misallocation and object information missing, which cannot be resolved simultaneously. We propose a more effective but very easily implemented method, namely ResizeMix. We mix the data by directly resizing the source image to a small patch and paste it on another image. The obtained patch preserves more substantial object information compared with conventional cut-based methods. ResizeMix shows evident advantages over CutMix and the saliency-guided methods on both image classification and object detection tasks without additional computation cost, which even outperforms most costly search-based automatic augmentation methods.

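Motivation and Objectives

Evaluate the role of image saliency in mixing-based augmentation, and identify the drawbacks of cropping patches: label misallocation and loss of object information.
Develop a data augmentation method that preserves object information and true labels at no additional cost.
Demonstrate the effectiveness of ResizeMix on image classification (CIFAR-10/100 and ImageNet) and on object detection (MS-COCO and Pascal VOC).
Compare ResizeMix with CutMix and saliency-guided methods, and analyze ablations to understand the design choices.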

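Proposed Method

Systematically evaluate saliency-based mixing by comparing where the patch is pasted (non-salient, salient, or random region) and where the patch is taken from (salient, non-salient, or random region).
Propose ResizeMix: resize the entire source image by a random scale τ and paste the resized patch onto a random region of the target image; compute the mixed label as l_m = λ l_s + (1 − λ) l_t, where l_s and l_t are the source and target labels and λ = τ^2 (the patch-to-image area ratio when each side of the patch is τ times the corresponding side of the image).
Avoid saliency modules and search-based augmentation so that no computation cost is added beyond standard mixing.
Run extensive experiments across CIFAR-10, CIFAR-100, ImageNet, and object detection benchmarks (MS-COCO, Pascal VOC), comparing against CutMix and saliency-guided methods.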
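
The ResizeMix step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the helper `resize_nearest`, the function name `resizemix`, the one-hot labels, and the use of nearest-neighbour resizing are all assumptions made for the example (a real pipeline would likely use a proper interpolation routine). It assumes source and target have the same spatial size and that 0 < tau <= 1.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C array (hypothetical helper)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row index for each output row
    cols = np.arange(out_w) * w // out_w  # source column index for each output column
    return img[rows[:, None], cols[None, :]]


def resizemix(source, target, label_s, label_t, tau, rng):
    """Shrink the whole source image by tau and paste it at a random spot in target."""
    h, w = target.shape[:2]
    ph = max(1, int(round(tau * h)))  # patch height
    pw = max(1, int(round(tau * w)))  # patch width
    patch = resize_nearest(source, ph, pw)

    # Random top-left corner such that the patch fits inside the target.
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))

    mixed = target.copy()
    mixed[top:top + ph, left:left + pw] = patch

    # Label mixing as stated in the review: lambda = tau^2.
    lam = tau ** 2
    mixed_label = lam * label_s + (1.0 - lam) * label_t
    return mixed, mixed_label, lam


# Toy usage: an all-ones "source" pasted onto an all-zeros "target".
rng = np.random.default_rng(0)
source = np.ones((32, 32, 3), dtype=np.float32)
target = np.zeros((32, 32, 3), dtype=np.float32)
label_s = np.eye(10)[3]  # one-hot label, class 3 (illustrative)
label_t = np.eye(10)[7]  # one-hot label, class 7 (illustrative)
mixed, mixed_label, lam = resizemix(source, target, label_s, label_t, 0.5, rng)
```

With tau = 0.5 the patch covers a quarter of the target, so the source label receives weight 0.25 and the target label 0.75. Because the patch is the whole source image at reduced size, the object it depicts is always present in the mixed image, which is the property the review contrasts with cropping.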

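Experimental Results

Research Questions

RQ1: Is saliency information important for the effectiveness of mixing-based augmentation, particularly for the paste location and the way the patch is obtained?
RQ2: Can a non-cropped, information-preserving patch, specifically the resized full image, resolve label misallocation and loss of object information in data mixing?
RQ3: How does ResizeMix perform compared with CutMix and saliency-guided augmentation on image classification and object detection tasks?
RQ4: How do ablations such as the resize scale and the placement of RandAugment affect ResizeMix's performance?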

Key Findings

Method	Cost (GPU hours)	CIFAR-10_WRS28-10	CIFAR-10_SS-2x96d	CIFAR-100_WRS28-10	CIFAR-100_SS-2x96d	ImageNet_Res50	ImageNet_Res101
Baseline	0	96.13	97.14	81.20	82.95	76.31	-
AutoAugment (AA)	5000	97.32	98.00	82.91	85.72	77.63	-
FastAA	3.5	97.30	98.00	82.70	85.40	77.60	-
PBA	5	97.42	97.97	83.27	84.69	-	-
OHL-AA	83.4	97.39	-	82.91	-	78.93	-
RandAugment (RA)	0	97.30	98.00	83.30	-	77.60	-
Faster AA	0.23	97.40	98.00	82.20	84.40	76.50	-
DADA	0.1	97.30	98.00	82.50	84.70	-	-
Cutout	0	96.90	97.14	81.59	84.0	-	-
CutMix	0	97.10	97.62	83.40	85.0	78.60	79.83
FMix	6†	96.38	-	82.03	-	-	-
SaliencyMix	6†	97.24	-	83.44	-	-	-
ResizeMix	0	97.60	97.93	84.31	85.26	79.00	80.54
ResizeMix+	6	98.10	98.47	85.23	85.60	-	-
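
To make the comparison with CutMix easier to read, the accuracy gaps can be computed directly from the CutMix and ResizeMix rows of the table above. The snippet below is only a convenience calculation on those reported numbers; the dictionary names are illustrative.

```python
# Accuracies (%) copied from the CutMix and ResizeMix rows of the table.
cutmix = {
    "CIFAR-10_WRS28-10": 97.10,
    "CIFAR-10_SS-2x96d": 97.62,
    "CIFAR-100_WRS28-10": 83.40,
    "CIFAR-100_SS-2x96d": 85.0,
    "ImageNet_Res50": 78.60,
    "ImageNet_Res101": 79.83,
}
resizemix_acc = {
    "CIFAR-10_WRS28-10": 97.60,
    "CIFAR-10_SS-2x96d": 97.93,
    "CIFAR-100_WRS28-10": 84.31,
    "CIFAR-100_SS-2x96d": 85.26,
    "ImageNet_Res50": 79.00,
    "ImageNet_Res101": 80.54,
}

# Gain of ResizeMix over CutMix in percentage points, per setting.
gains = {name: round(resizemix_acc[name] - cutmix[name], 2) for name in cutmix}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f}")
```

In every column where both methods are reported, the gap is positive, ranging from 0.26 points (CIFAR-100, SS-2x96d) to 0.91 points (CIFAR-100, WRS28-10).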

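Saliency guidance for the paste location offers some benefit, but random pasting increases data diversity and often performs better.
Cropping-based patches can cause label misallocation and loss of object information, whereas resizing the whole image preserves object information and avoids misallocation.
ResizeMix consistently outperforms CutMix and saliency-guided methods on CIFAR-10/100 and ImageNet at no additional computation cost. ResizeMix+, which adds RandAugment, improves the results further.
For object detection, backbones pre-trained with ResizeMix achieve higher mAP than those pre-trained with CutMix in SSD and Faster R-CNN settings on MS-COCO and Pascal VOC.
Ablations show that resizing outperforms cropping in half-resolution training, that RandAugment should be applied after mixing for the best effect, and that setting the resize scale range alpha/beta to roughly 0.1 to 0.8 is effective.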
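
The effect of the scale range on label mixing can be illustrated with a short sketch. It assumes, as a labeled assumption rather than something stated in this review, that the scale tau is drawn uniformly between alpha and beta; the function names are hypothetical. With lambda = tau^2, the range alpha = 0.1, beta = 0.8 gives lambda between 0.01 and 0.64.

```python
import numpy as np


def sample_scale(rng, alpha=0.1, beta=0.8):
    """Draw tau uniformly from [alpha, beta) and return (tau, lambda = tau^2).

    The uniform distribution is an assumption made for this sketch.
    """
    tau = float(rng.uniform(alpha, beta))
    return tau, tau ** 2


def expected_lambda(alpha=0.1, beta=0.8):
    """Mean of tau^2 for tau ~ U(alpha, beta): (alpha^2 + alpha*beta + beta^2) / 3."""
    return (alpha ** 2 + alpha * beta + beta ** 2) / 3.0


rng = np.random.default_rng(0)
samples = [sample_scale(rng) for _ in range(20000)]
mean_lambda = sum(lam for _, lam in samples) / len(samples)

print(f"analytic mean lambda: {expected_lambda():.4f}")
print(f"empirical mean lambda: {mean_lambda:.4f}")
```

Under this assumption the source label carries about 0.24 of the weight on average (0.73 / 3), so the target image and its label remain dominant in most mixed samples.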

This review was generated by AI and checked by a human editor.