QUICK REVIEW

[Paper Review] Robust Bayesian Tensor Factorization for Incomplete Multiway Data

Qibin Zhao, Guoxu Zhou | arXiv (Cornell University) | 2014-10-09

Tensor decomposition and applications | References: 40 | Citations: 8

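One-Line Summary

This paper proposes a robust Bayesian tensor factorization method that uses hierarchical priors and variational inference to jointly model the low-rank and sparse components of incomplete multiway data. The method determines the rank and detects outliers automatically, without hyperparameter tuning, and achieves strong tensor recovery and robustness on synthetic and real-world datasets.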

ABSTRACT

We propose a generative model for robust tensor factorization in the presence of both missing data and outliers. The objective is to explicitly infer the underlying low-CP-rank tensor capturing the global information and a sparse tensor capturing the local information (also considered as outliers), thus providing the robust predictive distribution over missing entries. The low-CP-rank tensor is modeled by multilinear interactions between multiple latent factors on which the column sparsity is enforced by a hierarchical prior, while the sparse tensor is modeled by a hierarchical view of Student-t distribution that associates an individual hyperparameter with each element independently. For model learning, we develop an efficient closed-form variational inference under a fully Bayesian treatment, which can effectively prevent the overfitting problem and scales linearly with data size. In contrast to existing related works, our method can perform model selection automatically and implicitly without need of tuning parameters. More specifically, it can discover the groundtruth of CP rank and automatically adapt the sparsity inducing priors to various types of outliers. In addition, the tradeoff between the low-rank approximation and the sparse representation can be optimized in the sense of maximum model evidence. The extensive experiments and comparisons with many state-of-the-art algorithms on both synthetic and real-world datasets demonstrate the superiorities of our method from several perspectives.

Index Terms: Tensor factorization, tensor completion, robust factorization, rank determination, variational Bayesian inference, video background modeling

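Research Motivation and Objectives

To solve the robust tensor factorization problem in the presence of missing data and outliers.
To explicitly separate the global low-rank structure of tensor data from local sparse outliers.
To enable automatic model selection for the CP rank and the sparsity-inducing hyperparameters without manual tuning.
To provide a fully Bayesian framework that prevents overfitting and scales linearly with data size.
To optimize the tradeoff between the low-rank approximation and the sparse representation by maximizing the model evidence.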

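Proposed Method

Models the low-CP-rank tensor through multilinear interactions between latent factors, with column sparsity enforced by a hierarchical prior.
Represents the sparse tensor with a hierarchical Student-t distribution that assigns an individual hyperparameter to each element, giving robust outlier modeling.
Uses closed-form variational inference under a fully generative model for efficient, scalable Bayesian inference.
Avoids manual hyperparameter tuning by discovering the rank automatically through maximization of the model's marginal likelihood.
Improves generalization, particularly with missing data, through the regularization that the fully Bayesian treatment provides.
Optimizes the balance between the low-rank and sparse components via maximum model evidence.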
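
The generative structure described in this section can be sketched in NumPy: observed entries are the sum of a low-CP-rank tensor, a heavy-tailed sparse tensor, and dense noise. This is a minimal illustration, not the authors' implementation. The function name `sample_robust_cp`, the Gamma hyperparameter values, the degrees of freedom `nu`, the noise level, and the observation ratio are all assumptions chosen for readability.

```python
import numpy as np

def sample_robust_cp(dims, rank, obs_ratio=0.8, nu=3.0, seed=0):
    """Draw one synthetic tensor from a CP + sparse + noise model (illustrative)."""
    rng = np.random.default_rng(seed)

    # Column-wise precisions shared across all modes: a large lam[r] shrinks
    # component r in every factor matrix at once (column sparsity).
    lam = rng.gamma(shape=2.0, scale=1.0, size=rank)  # assumed hyperparameters

    # Factor matrices A^(n) of size I_n x R, columns scaled by 1/sqrt(lam).
    factors = [rng.standard_normal((dim, rank)) / np.sqrt(lam) for dim in dims]

    # Low-CP-rank tensor: sum over r of the outer product of the r-th columns.
    low_rank = np.zeros(dims)
    for r in range(rank):
        component = factors[0][:, r]
        for factor in factors[1:]:
            component = np.multiply.outer(component, factor[:, r])
        low_rank += component

    # Sparse tensor: Gaussian entries, each with its own Gamma-distributed
    # precision. Marginally this is a Student-t with nu degrees of freedom.
    precision = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=dims)
    sparse = rng.standard_normal(dims) / np.sqrt(precision)

    # Dense Gaussian noise (assumed level) and a random observation mask.
    noise = 0.01 * rng.standard_normal(dims)
    mask = rng.random(dims) < obs_ratio

    # Missing entries are marked with NaN.
    observed = np.where(mask, low_rank + sparse + noise, np.nan)
    return observed, mask, low_rank, sparse, factors

observed, mask, low_rank, sparse, factors = sample_robust_cp((4, 5, 6), rank=3)
```

Inference in the paper runs in the opposite direction: given only `observed` and `mask`, it estimates the factors, the sparse tensor, and the precisions. The sampler only makes the assumed structure concrete.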

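Experimental Results

Research Questions

RQ1: Can the Bayesian tensor factorization model automatically determine the true CP rank without prior knowledge?
RQ2: How effectively can it separate global low-rank structure from local outliers in incomplete multiway data?
RQ3: How does its tensor recovery performance compare with state-of-the-art methods under missing data and corruption?
RQ4: Can it adapt to different types of outliers through automatic sparsity induction?
RQ5: Does it scale efficiently with data size while remaining robust?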

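Key Findings

In synthetic experiments, the method automatically recovers the true CP rank without manual tuning.
On synthetic and real-world datasets with missing entries, it achieves higher tensor recovery accuracy than state-of-the-art methods.
It identifies and separates outliers effectively, improving predictive performance even under severe corruption.
The variational inference framework scales linearly with data size, enabling efficient learning on large tensors.
Optimizing the balance between low-rank and sparse components through the model evidence yields better generalization.
Adaptive outlier detection through the hierarchical Student-t prior outperforms fixed-penalty and Gaussian-based methods.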
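
One simple way to read an inferred rank off factor matrices whose redundant columns have been shrunk toward zero is to count the components that keep non-negligible energy. The sketch below illustrates that idea only. The helper name `effective_rank` and the relative threshold `tol` are hypothetical, and the paper's own pruning rule may differ.

```python
import numpy as np

def effective_rank(factors, tol=1e-6):
    """Count CP components whose column energy is non-negligible (illustrative)."""
    # Energy of component r: squared norms of the r-th columns, summed over modes.
    energy = sum((factor ** 2).sum(axis=0) for factor in factors)
    # Keep components whose energy exceeds a small fraction of the largest one.
    keep = energy > tol * energy.max()
    pruned = [factor[:, keep] for factor in factors]
    return int(keep.sum()), pruned

# Toy input: five candidate components, two of which are exactly zero,
# mimicking columns switched off by a sparsity-inducing prior.
rng = np.random.default_rng(0)
toy_factors = [rng.standard_normal((dim, 5)) for dim in (4, 5, 6)]
for factor in toy_factors:
    factor[:, 3:] = 0.0

rank, pruned = effective_rank(toy_factors)
print(rank)  # prints 3
```

Starting from a deliberately generous initial rank and pruning afterwards is what lets the rank be found without a manual search, under the assumption that the prior has actually driven the unused columns close to zero.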

This review was generated by AI and reviewed by a human editor.