QUICK REVIEW

[Paper Review] Empirical Analysis of Predictive Algorithms for Collaborative Filtering

John S. Breese, David Heckerman|arXiv (Cornell University)|Jan 30, 2013

Data Management and Algorithms | References: 11 | Citations: 4,510

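One-Line Summary

The paper compares prediction algorithms for collaborative filtering, including correlation-based, vector-similarity, and Bayesian methods, and evaluates them across multiple domains and evaluation metrics.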

ABSTRACT

Collaborative filtering or recommender systems use a database about user preferences to predict additional topics or products a new user might like. In this paper we describe several algorithms designed for this task, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods. We compare the predictive accuracy of the various methods in a set of representative problem domains. We use two basic classes of evaluation metrics. The first characterizes accuracy over a set of individual predictions in terms of average absolute deviation. The second estimates the utility of a ranked list of suggested items. This metric uses an estimate of the probability that a user will see a recommendation in an ordered list. Experiments were run for datasets associated with 3 application areas, 4 experimental protocols, and the 2 evaluation metrics for the various algorithms. Results indicate that for a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods. Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, nature of the application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions. Other considerations include the size of database, speed of predictions, and learning time.
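
The two metric classes named in the abstract can be illustrated with a minimal Python sketch. The first function is the plain average absolute deviation between predicted and observed votes. The second is an assumption: the abstract only says the ranked-list metric uses an estimate of the probability that a user will see a recommendation, so the exponential half-life decay used here, along with the names `ranked_utility`, `neutral_vote`, and `half_life`, is an illustrative choice and not necessarily the paper's exact definition.

```python
from typing import Sequence


def mean_absolute_deviation(predicted: Sequence[float],
                            actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed votes."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def ranked_utility(votes_in_rank_order: Sequence[float],
                   neutral_vote: float,
                   half_life: float) -> float:
    """Expected utility of a ranked list (assumed half-life form).

    votes_in_rank_order: the user's true votes for the items, listed in the
        order the recommender ranked them (best first).
    neutral_vote: votes at or below this value contribute no utility.
    half_life: list position that has a 50% chance of being viewed.
    """
    if half_life <= 1:
        raise ValueError("half_life must be greater than 1")
    total = 0.0
    for position, vote in enumerate(votes_in_rank_order, start=1):
        # Assumption: viewing probability halves every (half_life - 1) positions.
        view_probability = 0.5 ** ((position - 1) / (half_life - 1))
        total += max(vote - neutral_vote, 0.0) * view_probability
    return total
```

For example, `ranked_utility([5, 3, 4], neutral_vote=3, half_life=2)` weights the first item fully, the second by one half, and the third by one quarter, so a good item pushed far down the list contributes little.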

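Motivation and Objectives

Evaluate the predictive accuracy of different collaborative filtering algorithms.
Compare correlation-based, vector-similarity, and Bayesian methods.
Assess performance across multiple datasets, protocols, and evaluation metrics.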

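Method

Implement and compare variants of correlation-based, vector-similarity, and Bayesian methods for collaborative filtering.
Use two evaluation metrics: average absolute deviation and the utility of a ranked list.
Run experiments across 3 application domains, 4 experimental protocols, and the 2 metrics.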
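
To make the correlation-based and vector-similarity families concrete, here is a minimal Python sketch of a memory-based predictor. It is an illustration under stated assumptions, not the authors' implementation: the `Votes` type, the function names, and the choice to normalize by the sum of absolute weights are all hypothetical, and the Bayesian network and Bayesian clustering models are not covered.

```python
import math
from typing import Callable, Dict

Votes = Dict[str, float]  # hypothetical layout: item id -> one user's vote


def mean_vote(votes: Votes) -> float:
    """Mean of all votes a user has cast."""
    return sum(votes.values()) / len(votes)


def pearson_weight(a: Votes, b: Votes) -> float:
    """Pearson correlation computed over items both users voted on."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    mean_a, mean_b = mean_vote(a), mean_vote(b)
    num = sum((a[j] - mean_a) * (b[j] - mean_b) for j in common)
    den = math.sqrt(sum((a[j] - mean_a) ** 2 for j in common)
                    * sum((b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0


def cosine_weight(a: Votes, b: Votes) -> float:
    """Cosine of the two vote vectors, treating missing votes as zero."""
    common = a.keys() & b.keys()
    num = sum(a[j] * b[j] for j in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0


def predict(active: Votes, others: Dict[str, Votes], item: str,
            weight: Callable[[Votes, Votes], float]) -> float:
    """Active user's mean vote plus a weighted average of how far the
    other users' votes on `item` deviate from their own means."""
    pairs = [(weight(active, votes), votes[item] - mean_vote(votes))
             for votes in others.values() if item in votes]
    norm = sum(abs(w) for w, _ in pairs)
    if norm == 0:
        return mean_vote(active)  # no usable neighbors: fall back to the mean
    return mean_vote(active) + sum(w * d for w, d in pairs) / norm
```

Passing `pearson_weight` or `cosine_weight` as the `weight` argument switches between the two similarity measures while leaving the prediction rule unchanged, which is what makes a side-by-side comparison of the two families straightforward.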

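Experimental Results

Research Questions

RQ1: Which prediction algorithm (correlation-based, vector-similarity, or Bayesian) achieves higher accuracy on prediction tasks across collaborative filtering datasets?
RQ2: How do Bayesian networks with decision trees compare with Bayesian-clustering and vector-similarity methods under different evaluation metrics and application settings?
RQ3: Which factors, such as the nature of the dataset, ranked versus one-by-one presentation, and the availability of votes, influence which prediction method is preferable?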

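Key Findings

Across a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods.
Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, the type of application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions.
Other considerations when choosing a method include database size, prediction speed, and learning time.
Results vary by problem domain and experimental protocol.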

This review was generated by AI and checked by a human editor.