QUICK REVIEW

[Paper Review] Wavelet Convolutional Neural Networks for Texture Classification

Shin Fujieda, Kohei Takayama | arXiv (Cornell University) | Jul 24, 2017

Image Retrieval and Classification Techniques | References: 27 | Citations: 100

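One-Sentence Summary

Introduces wavelet CNNs, which integrate multiresolution spectral analysis into CNNs to improve texture classification while reducing the number of parameters.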

ABSTRACT

Texture classification is an important and challenging problem in many image processing applications. While convolutional neural networks (CNNs) achieved significant successes for image classification, texture classification remains a difficult problem since textures usually do not contain enough information regarding the shape of object. In image processing, texture classification has been traditionally studied well with spectral analyses which exploit repeated structures in many textures. Since CNNs process images as-is in the spatial domain whereas spectral analyses process images in the frequency domain, these models have different characteristics in terms of performance. We propose a novel CNN architecture, wavelet CNNs, which integrates a spectral analysis into CNNs. Our insight is that the pooling layer and the convolution layer can be viewed as a limited form of a spectral analysis. Based on this insight, we generalize both layers to perform a spectral analysis with wavelet transform. Wavelet CNNs allow us to utilize spectral information which is lost in conventional CNNs but useful in texture classification. The experiments demonstrate that our model achieves better accuracy in texture classification than existing models. We also show that our model has significantly fewer parameters than CNNs, making our model easier to train with less memory.

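Motivation and Objectives

Motivate improving texture classification by incorporating spectral analysis into CNNs.
Generalize pooling and convolution so that they perform wavelet-based multiresolution analysis.
Demonstrate accuracy and parameter efficiency on standard texture datasets.
Compare against AlexNet, T-CNN, and spectral methods to show the advantages of the approach.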
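
The idea that pooling is a limited form of spectral analysis can be illustrated with a small 1D sketch. This is not the authors' code: the helper names are hypothetical, and the filters use a 0.5 scaling (an assumption, chosen so the low-pass output matches average pooling exactly; Haar filters are often normalized by 1/sqrt(2) instead).

```python
import numpy as np

def average_pool_1d(x, p=2):
    """Average pooling with window p and stride p (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x) // p * p  # drop a trailing partial window, if any
    return x[:n].reshape(-1, p).mean(axis=1)

def filter_then_downsample(x, kernel, p=2):
    """Correlate x with kernel in 'valid' mode, then keep every p-th sample."""
    x = np.asarray(x, dtype=float)
    k = np.asarray(kernel, dtype=float)
    # np.convolve flips the kernel, so flip it first to obtain correlation.
    y = np.convolve(x, k[::-1], mode="valid")
    return y[::p]

x = np.array([4.0, 2.0, 6.0, 8.0, 1.0, 3.0])

# Averaging (low-pass) filter followed by downsampling by 2.
low = filter_then_downsample(x, [0.5, 0.5])
# Differencing (high-pass) filter followed by downsampling by 2.
high = filter_then_downsample(x, [0.5, -0.5])

# Average pooling keeps only the low-pass half of this decomposition.
pooled = average_pool_1d(x)

# With both halves, the input can be recovered exactly.
even = low + high  # samples at even indices
odd = low - high   # samples at odd indices
```

Here `low` equals `pooled` ([3, 7, 2]), while `high` ([1, -1, -1]) carries the detail that average pooling discards. Keeping that second output is what makes the input recoverable.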

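Proposed Method

Reformulate convolution and pooling as generalized filtering and downsampling.
Incorporate Haar wavelet multiresolution analysis into the network, using both the low-frequency and the high-frequency components.
Use a VGG-19-style architecture with 3x3 convolutions, 1x1 padding, and stride-based downsampling.
Insert an energy layer before the fully connected layers to strengthen texture features.
Train both from scratch and with ImageNet pre-training, and compare performance.
Implement in Caffe and train on 224x224 inputs with data augmentation and batch normalization.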
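
The multiresolution step can be sketched in NumPy as a repeated 2D Haar decomposition of the low-frequency band. This is a minimal illustration under stated assumptions, not the paper's Caffe implementation: the function names are hypothetical, the bands use an averaging (divide by 4) normalization, the band labels follow one of several naming conventions, and `band_energy` is only a simple stand-in for an energy-style feature rather than the paper's energy layer.

```python
import numpy as np

def haar_level(img):
    """One level of a 2D Haar decomposition with averaging normalization.

    Returns the low-frequency band and a tuple of three high-frequency
    bands, each half the height and width of the input.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0  # block average, same as 2x2 average pooling
    lh = (a - b + c - d) / 4.0  # left column minus right column
    hl = (a + b - c - d) / 4.0  # top row minus bottom row
    hh = (a - b - c + d) / 4.0  # diagonal difference
    return ll, (lh, hl, hh)

def haar_multires(img, levels):
    """Decompose the low-frequency band repeatedly, keeping every detail band."""
    details = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, high = haar_level(ll)
        details.append(high)
    return ll, details

def band_energy(band):
    """Mean absolute response of a band (assumed stand-in for an energy feature)."""
    return float(np.mean(np.abs(band)))

rng = np.random.default_rng(0)
image = rng.random((224, 224))  # same spatial size as the network input

ll, details = haar_multires(image, levels=4)
shapes = [high[0].shape for high in details]
energies = [[band_energy(band) for band in high] for high in details]
```

For a 224x224 input and four levels, the detail bands have sizes 112, 56, 28, and 14 per side, and the final low-frequency band is 14x14. A conventional CNN with pooling would propagate only the low-frequency path; the sketch keeps the high-frequency bands at every level as well.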

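Experimental Results

Research Questions

RQ1: Does wavelet-based multiresolution analysis inside a CNN improve texture classification accuracy compared with conventional CNNs?
RQ2: Does using high-frequency components alongside low-frequency components preserve information and improve robustness to texture variation?
RQ3: How do wavelet CNNs compare with existing spectral methods and CNN-based texture methods in terms of accuracy and parameter efficiency?
RQ4: How does the decomposition level affect texture classification performance?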

Key Findings

Main results table (from the source paper). The 1-level to 5-level columns are wavelet CNNs with that number of decomposition levels.
Dataset	AlexNet	T-CNN	1-level	2-level	3-level	4-level	5-level
KTH-TIPS2-b	48.3 ± 1.4	49.6 ± 0.6	57.5 ± 3.0	57.0 ± 2.3	57.8 ± 2.5	60.5 ± 2.1	59.6 ± 2.5
DTD	22.7 ± 1.3	27.8 ± 1.2	29.0 ± 1.4	30.3 ± 0.9	31.6 ± 1.0	32.2 ± 0.8	32.2 ± 0.7

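When trained from scratch, wavelet CNNs outperform AlexNet and T-CNN on both texture datasets at every decomposition level tested.
Four-level decomposition generally gives the best balance between accuracy and parameter count.
With ImageNet pre-training, the wavelet CNN achieves the best performance on KTH-TIPS2-b and competitive results on DTD, with far fewer parameters than FV-CNN.
Wavelet CNNs use far fewer trainable parameters than competing models (e.g., under 90 MB in some configurations).
Four-level decomposition delivers strong performance, while five levels give diminishing returns because of the added parameters.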
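
The comparison behind these findings can be checked with a short script. The numbers are copied from the results table in this review; the dictionary layout and the `summarize` helper are hypothetical, and the script compares mean values only, ignoring the reported ± spread.

```python
# (mean, spread) pairs copied from the results table in this review.
results = {
    "KTH-TIPS2-b": {
        "AlexNet": (48.3, 1.4), "T-CNN": (49.6, 0.6),
        "1-level": (57.5, 3.0), "2-level": (57.0, 2.3), "3-level": (57.8, 2.5),
        "4-level": (60.5, 2.1), "5-level": (59.6, 2.5),
    },
    "DTD": {
        "AlexNet": (22.7, 1.3), "T-CNN": (27.8, 1.2),
        "1-level": (29.0, 1.4), "2-level": (30.3, 0.9), "3-level": (31.6, 1.0),
        "4-level": (32.2, 0.8), "5-level": (32.2, 0.7),
    },
}
BASELINES = ("AlexNet", "T-CNN")

def summarize(scores):
    """Compare the wavelet CNN columns with the strongest baseline (means only)."""
    wavelet = {k: v[0] for k, v in scores.items() if k not in BASELINES}
    best_baseline = max(scores[b][0] for b in BASELINES)
    # On ties, max() returns the first key in insertion order (the shallower level).
    best_level = max(wavelet, key=wavelet.get)
    return {
        "best_level": best_level,
        "gain_over_best_baseline": round(wavelet[best_level] - best_baseline, 1),
        "all_levels_beat_baselines": all(v > best_baseline for v in wavelet.values()),
    }

summary = {name: summarize(scores) for name, scores in results.items()}
for name, s in summary.items():
    print(name, s)
```

On these numbers, the 4-level model is the best (or tied for best) on both datasets, ahead of the stronger baseline by 10.9 points on KTH-TIPS2-b and 4.4 points on DTD. Going from four to five levels does not raise the mean on either dataset.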

This review was generated by AI and checked by a human editor.