QUICK REVIEW

[Paper Review] A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets

Patryk Chrabąszcz, Ilya Loshchilov|arXiv (Cornell University)|Jul 27, 2017

Advanced Neural Network Applications | References: 9 | Citations: 371

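One-line summary

This paper proposes downsampled ImageNet variants (ImageNet16x16, ImageNet32x32, ImageNet64x64) that keep the original class count so that experiments run faster, shows that the well-performing hyperparameter regions are similar across sizes, and demonstrates strong performance with Wide ResNets.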

ABSTRACT

The original ImageNet dataset is a popular large-scale benchmark for training Deep Neural Networks. Since the cost of performing experiments (e.g., algorithm design, architecture search, and hyperparameter tuning) on the original dataset might be prohibitive, we propose to consider a downsampled version of ImageNet. In contrast to the CIFAR datasets and earlier downsampled versions of ImageNet, our proposed ImageNet32$\times$32 (and its variants ImageNet64$\times$64 and ImageNet16$\times$16) contains exactly the same number of classes and images as ImageNet, with the only difference that the images are downsampled to 32$\times$32 pixels per image (64$\times$64 and 16$\times$16 pixels for the variants, respectively). Experiments on these downsampled variants are dramatically faster than on the original ImageNet and the characteristics of the downsampled datasets with respect to optimal hyperparameters appear to remain similar. The proposed datasets and scripts to reproduce our results are available at http://image-net.org/download-images and https://github.com/PatrykChrabaszcz/Imagenet32_Scripts

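Motivation and objectives

Provide a cheaper, more scalable benchmark by downsampling ImageNet while keeping the original number of classes and images.
Assess whether downsampling preserves the key training dynamics and hyperparameter sensitivity.
Assess how network width and learning rate interact with the downsampling resolution, to guide inexpensive experiments.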

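Proposed method

Create ImageNet32x32, ImageNet64x64, and ImageNet16x16 by downsampling the original ImageNet images while keeping the class labels and the number of images.
Train Wide Residual Networks (WRN-N-k) with a standard CIFAR-style setup adapted to the downsampled images.
Compare six downsampling methods (bicubic, bilinear, box, hamming, lanczos, nearest) and identify nearest neighbor as the inferior one.
Use data augmentation (horizontal flips, random shifts), standard SGD with momentum, and scheduled learning-rate drops.
Evaluate performance across several network widths and downsampling resolutions, and assess transferability to larger models.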
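
The comparison of downsampling methods can be illustrated with a minimal sketch. This is not the authors' pipeline: it is pure Python on a grayscale grid, the names `downsample_box`, `downsample_nearest`, and `checkerboard` are hypothetical, and the nearest-neighbor variant simply keeps the top-left pixel of each block, which simplifies what image libraries actually do. On an image made only of fine detail, the box filter averages each block, while nearest neighbor discards three of every four pixels. That is one plausible reason it retains less class information.

```python
# Sketch only: box vs. nearest-neighbor downsampling on a grayscale grid.
# All helper names are hypothetical; real pipelines would use an image library.


def downsample_box(image, factor):
    """Average each non-overlapping factor x factor block (box filter)."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def downsample_nearest(image, factor):
    """Keep one pixel per block (assumed: the top-left one), drop the rest."""
    return [row[::factor] for row in image[::factor]]


def checkerboard(size):
    """Hypothetical test image: alternating 0/255 pixels, i.e. pure fine detail."""
    return [[255 * ((x + y) % 2) for x in range(size)] for y in range(size)]


if __name__ == "__main__":
    image = checkerboard(8)
    print(downsample_box(image, 2)[0])      # each block averages to mid-gray
    print(downsample_nearest(image, 2)[0])  # each sampled pixel is black
```

The checkerboard is a deliberately extreme case: the box filter reports the true average brightness, whereas nearest neighbor returns a uniformly black image.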

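Experimental results

Research questions

RQ1: Does downsampling ImageNet to 32x32 (and to the 16x16 and 64x64 variants) preserve the relative performance of different architectures and hyperparameters?
RQ2: How does network width (k) interact with the downsampling resolution to affect accuracy and training time?
RQ3: Can results on downsampled ImageNet predict results on full ImageNet, enabling inexpensive architecture and hyperparameter search?
RQ4: Which downsampling method most effectively preserves the classification-relevant information in downsampled ImageNet?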

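Key findings

The downsampling methods produce similar results, except for nearest neighbor, which performs worse in every experiment.
Wide ResNets perform well on ImageNet32x32, approaching AlexNet's results on the original ImageNet despite far fewer pixels per image.
Increasing network width improves performance at every downsampling size: larger k gives better results.
The optimal learning-rate region is similar across ImageNet16x16, ImageNet32x32, and ImageNet64x64, and across different widths.
The trade-off between performance and training time suggests combining downsampling with network size to obtain the best anytime performance.
These findings probably carry over to more expensive setups, enabling a cheap proxy for architecture and hyperparameter search.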
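
To give a sense of scale for the two knobs behind these findings, resolution and width, the arithmetic can be sketched as follows. Several values here are assumptions rather than figures from the review: the 224x224 full-resolution side is a commonly used crop size, the base stage widths (16, 32, 64) follow the usual CIFAR-style Wide ResNet layout, and the k values are illustrative. The function names are hypothetical.

```python
# Sketch only: resolution and width arithmetic.
# Values marked "assumed" are not taken from the review.

ASSUMED_FULL_SIDE = 224  # assumed: a common crop size for full-resolution ImageNet


def pixels_per_image(side):
    """Number of pixel positions in a square side x side image."""
    return side * side


def wrn_stage_widths(k, base=(16, 32, 64)):
    """Channels per residual stage for width multiplier k.

    Assumed: the CIFAR-style WRN layout with base widths 16, 32, 64.
    """
    return [width * k for width in base]


if __name__ == "__main__":
    full = pixels_per_image(ASSUMED_FULL_SIDE)
    for side in (16, 32, 64):
        pixels = pixels_per_image(side)
        ratio = full / pixels
        print(f"ImageNet{side}x{side}: {pixels} pixels per channel, "
              f"about {ratio:.0f}x fewer than {ASSUMED_FULL_SIDE}x{ASSUMED_FULL_SIDE}")
    for k in (1, 2, 5, 10):  # illustrative width multipliers
        print(f"k={k}: stage widths {wrn_stage_widths(k)}")
```

Convolutional cost per layer grows roughly with the pixel count and with the product of input and output channels, so halving the image side cuts the pixel count by a factor of 4, while doubling k roughly quadruples the cost of interior layers. This is the trade-off between resolution and network size that the anytime-performance finding refers to.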

This review was generated by AI and checked by a human editor.