QUICK REVIEW

[Paper Review] A New Fuzzy Stacked Generalization Technique and Analysis of its Performance

Mete Özay, Fatoş T. Yarman Vural|arXiv (Cornell University)|Apr 1, 2012

Face and Expression Recognition | References: 59 | Citations: 32

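One-Sentence Summary

This paper proposes Fuzzy Stacked Generalization (FSG), a new ensemble learning technique that improves the Nearest Neighbor classifier by combining multiple fuzzy k-NN classifiers, each operating on a different feature set extracted from the same samples. By fusing the fuzzy membership values with a meta-layer classifier, FSG aims to reduce the gap between the N-sample and large-sample error rates and, on multi-feature real-world datasets, is reported to outperform state-of-the-art methods such as AdaBoost, Random Subspace, and Rotation Forest.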

ABSTRACT

In this study, a new Stacked Generalization technique called Fuzzy Stacked Generalization (FSG) is proposed to minimize the difference between N-sample and large-sample classification error of the Nearest Neighbor classifier. The proposed FSG employs a new hierarchical distance learning strategy to minimize the error difference. For this purpose, we first construct an ensemble of base-layer fuzzy k-Nearest Neighbor (k-NN) classifiers, each of which receives a different feature set extracted from the same sample set. The fuzzy membership values computed at the decision space of each fuzzy k-NN classifier are concatenated to form the feature vectors of a fusion space. Finally, the feature vectors are fed to a meta-layer classifier to learn the degree of accuracy of the decisions of the base-layer classifiers for meta-layer classification. Rather than the power of the individual base-layer classifiers, diversity and cooperation of the classifiers become an important issue to improve the overall performance of the proposed FSG. A weak base-layer classifier may boost the overall performance more than a strong classifier, if it is capable of recognizing the samples, which are not recognized by the rest of the classifiers, in its own feature space. The experiments explore the type of the collaboration among the individual classifiers required for an improved performance of the suggested architecture. Experiments on multiple feature real-world datasets show that the proposed FSG performs better than the state-of-the-art ensemble learning algorithms such as Adaboost, Random Subspace and Rotation Forest. On the other hand, compatible performances are observed in the experiments on single feature multi-attribute datasets.

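Motivation and Objectives

Close the performance gap between the N-sample (small-sample) and large-sample error rates of the Nearest Neighbor classifier.
Improve classification accuracy by exploiting the diversity and cooperation of weak base classifiers rather than relying on strong individual models.
Build a hierarchical distance learning strategy that minimizes the difference between the N-sample and large-sample error.
Design a meta-learning framework that assesses the reliability of base-classifier decisions and thereby improves the generalization performance of the ensemble.
Demonstrate better performance than existing ensemble methods on multi-feature real-world datasets.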

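Proposed Method

Build an ensemble of base-layer fuzzy k-NN classifiers, each trained on a different feature set extracted from the same sample set.
In the decision space of each base classifier, compute the fuzzy membership values of every sample to express how confidently it is assigned to each class.
Concatenate the fuzzy membership vectors from all base classifiers to form the feature vectors of a single fusion space.
Train a meta-layer classifier on the fusion space so that it learns how accurate the decisions of each base classifier are and uses that to improve the final prediction.
Apply a hierarchical distance learning strategy across the two layers to minimize the difference between the N-sample and large-sample error.
Prioritize diversity and cooperation among the classifiers over individual strength, so that weak but complementary classifiers can raise overall performance.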
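
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fuzzy_knn_memberships`, `fusion_vector`, `fsg_predict`), the inverse-distance membership weighting with crisp training labels, the leave-one-out construction of the training fusion vectors, the use of a fuzzy k-NN as the meta-layer classifier, and the toy two-view data are all choices made here for illustration.

```python
import math
import random


def fuzzy_knn_memberships(train_X, train_y, x, n_classes, k=3, m=2.0, skip=None):
    """Class-membership vector of x computed from its k nearest training samples.

    Assumes crisp training labels and inverse-distance weights with fuzzifier m.
    `skip` excludes one training index (used for leave-one-out).
    """
    nearest = sorted(
        (math.dist(x, xi), yi)
        for i, (xi, yi) in enumerate(zip(train_X, train_y))
        if i != skip
    )[:k]
    weights = [0.0] * n_classes
    for d, label in nearest:
        # Small constant avoids division by zero for duplicate points.
        weights[label] += 1.0 / (d ** (2.0 / (m - 1.0)) + 1e-12)
    total = sum(weights)
    return [w / total for w in weights]


def fusion_vector(feature_sets, train_y, views_of_x, n_classes, k=3, skip=None):
    """Concatenate the membership vectors produced by every base-layer classifier."""
    vec = []
    for train_X, x in zip(feature_sets, views_of_x):
        vec.extend(fuzzy_knn_memberships(train_X, train_y, x, n_classes, k, skip=skip))
    return vec


def fsg_predict(feature_sets, train_y, views_of_x, n_classes, k=3):
    """Two-layer prediction: base-layer memberships, then a meta-layer fuzzy k-NN."""
    # Training fusion vectors are built leave-one-out (an assumption of this
    # sketch) so that no sample votes for itself.
    meta_train = [
        fusion_vector(feature_sets, train_y, [fs[i] for fs in feature_sets],
                      n_classes, k, skip=i)
        for i in range(len(train_y))
    ]
    meta_x = fusion_vector(feature_sets, train_y, views_of_x, n_classes, k)
    memberships = fuzzy_knn_memberships(meta_train, train_y, meta_x, n_classes, k)
    return max(range(n_classes), key=memberships.__getitem__)


# Hypothetical toy data: two feature sets ("views") describing the same 20 samples.
rng = random.Random(0)
train_y = [0] * 10 + [1] * 10
view_a = [[3.0 * y + rng.gauss(0, 0.5), 3.0 * y + rng.gauss(0, 0.5)] for y in train_y]
view_b = [[-2.0 * y + rng.gauss(0, 0.5)] for y in train_y]

pred0 = fsg_predict([view_a, view_b], train_y, [[0.1, -0.2], [0.0]], n_classes=2)
pred1 = fsg_predict([view_a, view_b], train_y, [[2.9, 3.1], [-2.1]], n_classes=2)
```

Each query is passed as one feature vector per view, in the same order as `feature_sets`. With two views and two classes, every fusion-space vector has four entries, two memberships from each base classifier.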

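Experimental Results

Research Questions

RQ1: Can a stacked generalization framework built on fuzzy k-NN classifiers reduce the difference between the N-sample and large-sample error rates of nearest neighbor classification?
RQ2: What kind of cooperation among the base classifiers yields the best performance in the proposed FSG architecture?
RQ3: On multi-feature real-world datasets, how much better does FSG perform than existing ensemble methods such as AdaBoost, Random Subspace, and Rotation Forest?
RQ4: Within the FSG framework, how much does classifier diversity contribute to generalization performance compared with the strength of the individual classifiers?
RQ5: Can the meta-learner effectively assess and exploit the reliability of base-classifier decisions to improve final classification accuracy?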

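Key Findings

Through hierarchical distance learning and fuzzy fusion, FSG is reported to reduce the difference between the N-sample and large-sample error rates of the Nearest Neighbor classifier.
On multiple multi-feature real-world datasets, the proposed method outperforms state-of-the-art ensemble algorithms including AdaBoost, Random Subspace, and Rotation Forest.
On single-feature multi-attribute datasets, FSG achieves performance comparable to existing methods, which suggests it remains competitive outside the multi-feature setting.
A weak base classifier can contribute more to overall performance than a strong one when it recognizes samples that the other classifiers miss, which underlines the importance of diversity.
Because the meta-learner can assess the reliability of base-layer decisions, generalization performance improves, particularly when the base classifiers are diverse and complementary.
The experiments indicate that cooperation and diversity among the base classifiers matter more for performance gains than the strength of any individual model.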
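
The point about weak but complementary classifiers can be made concrete with a small arithmetic sketch. The correctness flags below are hypothetical values invented for illustration, not results from the paper, and `coverage` (the share of samples that at least one classifier gets right) is a simple upper bound on what an ideal combiner could reach, not the analysis the authors perform.

```python
def accuracy(correct):
    """Fraction of samples a single classifier labels correctly."""
    return sum(correct) / len(correct)


def coverage(*classifiers):
    """Fraction of samples that at least one of the classifiers gets right."""
    return sum(any(flags) for flags in zip(*classifiers)) / len(classifiers[0])


# Hypothetical per-sample correctness flags (1 = correct) on 10 samples.
strong_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]  # accurate on 8 of 10
strong_b = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]  # accurate on 7 of 10, fails where strong_a fails
weak_c = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]    # accurate on 4 of 10, right where strong_a fails

cov_strong_pair = coverage(strong_a, strong_b)  # two strong but overlapping classifiers
cov_mixed_pair = coverage(strong_a, weak_c)     # one strong plus one weak, complementary
```

In this toy setting the strong pair covers 8 of the 10 samples while the mixed pair covers all 10, even though `weak_c` is the least accurate classifier on its own.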

This review was generated by AI and checked by a human editor.