QUICK REVIEW

[Paper Review] PU Learning for Matrix Completion

Cho‐Jui Hsieh, Nagarajan Natarajan|arXiv (Cornell University)|Nov 22, 2014

Sparse and Compressive Sensing Techniques | References: 28 | Citations: 112

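One-Line Summary

This paper introduces PU learning for matrix completion under a bounded nuclear norm assumption and proposes two methods: shifted matrix completion, which recovers the matrix in the probabilistic observation model, and biased matrix completion, which recovers the thresholded binary matrix. It establishes a strong Frobenius error bound of $ O(1/((1-\rho)n)) $, shows a sample complexity of $ O(n\log n) $ observed ones for dense matrices, and extends the framework to inductive matrix completion with feature-based modeling.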

ABSTRACT

In this paper, we consider the matrix completion problem when the observations are one-bit measurements of some underlying matrix M, and in particular the observed samples consist only of ones and no zeros. This problem is motivated by modern applications such as recommender systems and social networks where only "likes" or "friendships" are observed. The problem of learning from only positive and unlabeled examples, called PU (positive-unlabeled) learning, has been studied in the context of binary classification. We consider the PU matrix completion problem, where an underlying real-valued matrix M is first quantized to generate one-bit observations and then a subset of positive entries is revealed. Under the assumption that M has bounded nuclear norm, we provide recovery guarantees for two different observation models: 1) M parameterizes a distribution that generates a binary matrix, 2) M is thresholded to obtain a binary matrix. For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix. Both methods yield strong error bounds --- if M is n by n, the Frobenius error is bounded as O(1/((1-rho)n)), where 1-rho denotes the fraction of ones observed. This implies a sample complexity of O(n\log n) ones to achieve a small error, when M is dense and n is large. We extend our methods and guarantees to the inductive matrix completion problem, where rows and columns of M have associated features. We provide efficient and scalable optimization procedures for both the methods and demonstrate the effectiveness of the proposed methods for link prediction (on real-world networks consisting of over 2 million nodes and 90 million links) and semi-supervised clustering tasks.
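
The two observation models described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only: the function name `sample_pu_observations`, the default threshold of 0.5, and the choice to reveal each true one independently with probability 1-rho are assumptions made here, not details taken from the paper.

```python
import numpy as np

def sample_pu_observations(M, rho, model="stochastic", threshold=0.5, seed=0):
    """Generate a hidden binary matrix Y from M, then reveal only some of its ones.

    Hypothetical helper for illustration. Returns (Y, A), where A is the
    positive-unlabeled observation: A[i, j] == 1 only if Y[i, j] == 1.
    """
    rng = np.random.default_rng(seed)
    M = np.asarray(M, dtype=float)
    if model == "stochastic":
        # Model 1: M[i, j] is read as P(Y[i, j] = 1), so M should lie in [0, 1].
        Y = (rng.random(M.shape) < M).astype(int)
    elif model == "threshold":
        # Model 2: Y is a deterministic thresholding of M (threshold value assumed).
        Y = (M > threshold).astype(int)
    else:
        raise ValueError("model must be 'stochastic' or 'threshold'")
    # Assumption: each true one is kept independently with probability 1 - rho.
    # Zeros are never turned into observed ones, so A contains no false positives.
    keep = rng.random(M.shape) < (1.0 - rho)
    A = Y * keep
    return Y, A
```

With `rho=0` every one in `Y` is revealed and `A` equals `Y`; as `rho` approaches 1, almost all positives are hidden among the unlabeled entries.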

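Motivation and Objectives

Fill a gap in matrix completion theory for real-world applications such as social networks and recommender systems, where only positive (one-bit) observations are available.
Formulate and analyze the PU matrix completion problem under two settings: stochastic generation of binary observations, and deterministic thresholding of a real-valued matrix.
Provide theoretical recovery guarantees for both settings, ensuring low-error reconstruction even when only positive examples are observed.
Extend the proposed methods to inductive matrix completion when row and column features are available, enabling scalable and accurate prediction on large networks.
Demonstrate the effectiveness of the proposed methods on real-world datasets with over 2 million nodes and 90 million links, for link prediction and semi-supervised clustering.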

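Proposed Method

Propose "shifted matrix completion," which minimizes an unbiased estimator of the squared loss computed from the observed positives, and reformulate the problem to avoid degenerate solutions.
Introduce "biased matrix completion," which applies different penalties to observed positives and unobserved entries, enabling recovery of the thresholded binary matrix under deterministic thresholding.
Use nuclear norm regularization to enforce low-rank structure and stable recovery, assuming $ \|M\|_* \leq \text{const} $.
Design scalable optimization procedures based on coordinate descent and low-rank approximation, making the methods applicable to large datasets.
Extend both methods to inductive matrix completion by modeling matrix entries as a bilinear function of row and column features, while preserving the theoretical guarantees.
Use efficient SVD-based approximations and relaxations (e.g., ShiftMC-relax) to handle large-scale data while maintaining performance.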
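
The two loss constructions above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `shifted_loss` is one standard way to build an unbiased estimate of the squared loss when each true one is hidden with probability `rho`, and `biased_mc` swaps the nuclear norm constraint for a rank-`k` factorization with Frobenius regularization, fitted by alternating weighted least squares instead of the coordinate descent solver the paper describes. All function names and the parameters `alpha`, `lam`, and `sweeps` are hypothetical.

```python
import numpy as np

def shifted_loss(x, a, rho):
    """Unbiased estimate of (x - y)^2 from the PU observation a (assumed form)."""
    x = np.asarray(x, dtype=float)
    observed = ((x - 1.0) ** 2 - rho * x ** 2) / (1.0 - rho)
    # Unobserved entries are scored as if the true label were 0.
    return np.where(np.asarray(a) == 1, observed, x ** 2)

def biased_objective(A, W, H, alpha, lam):
    """Weighted squared loss: alpha on observed ones, 1 - alpha elsewhere."""
    weights = np.where(A == 1, alpha, 1.0 - alpha)
    resid = W @ H.T - A
    return float(np.sum(weights * resid ** 2) + lam * (np.sum(W ** 2) + np.sum(H ** 2)))

def _update_rows(A, W, H, alpha, lam):
    """Exactly minimize the objective over each row of W with H held fixed."""
    k = H.shape[1]
    for i in range(A.shape[0]):
        w = np.where(A[i] == 1, alpha, 1.0 - alpha)
        HW = H * w[:, None]                      # rows of H scaled by their weights
        G = HW.T @ H + lam * np.eye(k)           # weighted ridge normal equations
        W[i] = np.linalg.solve(G, HW.T @ A[i])

def biased_mc(A, rank=2, alpha=0.8, lam=0.1, sweeps=10, seed=0):
    """Fit X = W H^T to the PU matrix A; returns (X, objective value per sweep)."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    W = 0.1 * rng.standard_normal((A.shape[0], rank))
    H = 0.1 * rng.standard_normal((A.shape[1], rank))
    losses = [biased_objective(A, W, H, alpha, lam)]
    for _ in range(sweeps):
        _update_rows(A, W, H, alpha, lam)        # update W given H
        _update_rows(A.T, H, W, alpha, lam)      # update H given W
        losses.append(biased_objective(A, W, H, alpha, lam))
    return W @ H.T, losses
```

Calling `X, losses = biased_mc(A, rank=2)` returns a real-valued score matrix that can be thresholded to predict the hidden binary entries. Because each half-sweep solves its subproblem exactly, the recorded objective does not increase from one sweep to the next, which makes the sketch easy to sanity-check on small inputs.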

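Experimental Results

Research Questions

RQ1: Can a low-rank matrix be recovered from positive one-bit observations alone when the underlying matrix has been quantized or thresholded?
RQ2: How can theoretical error bounds for matrix recovery be established under PU learning in the context of one-bit matrix completion?
RQ3: When only positive examples are observed, how does the sample complexity change as the matrix size grows?
RQ4: Can the proposed methods be extended to inductive matrix completion when feature information is available, while preserving the recovery guarantees?
RQ5: How do the proposed methods perform on real-world link prediction and clustering tasks compared with existing heuristics (e.g., treating missing entries as zeros)?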

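Key Findings

For an $ n \times n $ matrix, the Frobenius error is bounded by $ O\left(\frac{1}{(1-\rho)n}\right) $, where $ 1-\rho $ is the fraction of ones that are observed.
The sample complexity for achieving a small error is $ O(n\log n) $ observed ones, which holds when the matrix is dense and $ n $ is large.
BiasMC achieves lower false positive rates (FPR) and false negative rates (FNR) than the other methods on datasets with up to 90 million links, showing strong link prediction performance.
BiasMC is efficient: it processes the MySpace dataset (2 million nodes, 90 million links) in 516 seconds with 10 coordinate descent sweeps, compared with 2408 seconds for a standard SVD computation.
BiasMC-inductive achieves a clustering error below 10% on the Mushroom and Segment datasets using only 100 labeled positive relationships, substantially outperforming MC-inductive and spectral clustering.
The theoretical guarantees extend to inductive matrix completion, showing that biased matrix completion can recover the underlying matrix structure even when only positive examples are observed.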
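
The error bound and the sample complexity reported above can be connected by a short back-of-the-envelope calculation. This is a sketch under assumptions made here for illustration: "dense" is read as $ M $ having on the order of $ n^2 $ ones, $ s $ denotes the number of observed ones, and constants are ignored. Neither $ s $ nor this simplified form is notation from the paper.

$$
s \approx (1-\rho)\,n^2
\;\Longrightarrow\;
(1-\rho)\,n \approx \frac{s}{n}
\;\Longrightarrow\;
O\!\left(\frac{1}{(1-\rho)n}\right) = O\!\left(\frac{n}{s}\right),
\qquad
s = n\log n \;\Longrightarrow\; O\!\left(\frac{1}{\log n}\right).
$$

Under these assumptions, observing on the order of $ n\log n $ ones is enough for the bound to shrink toward zero as $ n $ grows.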

This review was generated by AI and checked by a human editor.