QUICK REVIEW

[Paper Review] Bayesian Sparse Tucker Models for Dimension Reduction and Tensor Completion

Qibin Zhao, Liqing Zhang | arXiv (Cornell University) | May 10, 2015

Tensor decomposition and applications | References: 31 | Cited by: 48

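One-Sentence Summary

This paper proposes Bayesian sparse Tucker models that use hierarchical Laplace and Student-t priors to induce structural sparsity, enabling automatic multilinear rank determination and tensor completion. The method uses variational Bayesian inference to jointly estimate model parameters and hyperparameters, and on synthetic, chemometrics, and MRI data it recovers low-rank structure and missing entries better than the compared methods, with minimal user intervention.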

ABSTRACT

Tucker decomposition is the cornerstone of modern machine learning on tensorial data analysis, which has attracted considerable attention for multiway feature extraction, compressive sensing, and tensor completion. The most challenging problem is related to determination of model complexity (i.e., multilinear rank), especially when noise and missing data are present. In addition, existing methods cannot take into account uncertainty information of latent factors, resulting in low generalization performance. To address these issues, we present a class of probabilistic generative Tucker models for tensor decomposition and completion with structural sparsity over multilinear latent space. To exploit structural sparse modeling, we introduce two group sparsity inducing priors by hierarchical representation of Laplace and Student-t distributions, which facilitates full posterior inference. For model learning, we derived variational Bayesian inferences over all model (hyper)parameters, and developed efficient and scalable algorithms based on multilinear operations. Our methods can automatically adapt model complexity and infer an optimal multilinear rank by the principle of maximum lower bound of model evidence. Experimental results and comparisons on synthetic, chemometrics and neuroimaging data demonstrate remarkable performance of our models for recovering ground-truth of multilinear rank and missing entries.
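
For readers who want the objects named in the abstract written out, the block below states the generic Tucker model and the two standard Gaussian scale-mixture identities that give Student-t and Laplace marginals. The symbols (G, U^(n), lambda, tau, a, b, gamma) are notation assumed here for illustration. The paper's exact parameterization, and how the scale parameters are shared across groups of core-tensor slices and factor columns, is not given in this review.

```latex
% Tucker model for an N-way tensor with multilinear rank (R_1, ..., R_N)
\mathcal{Y} = \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)} + \mathcal{E},
\qquad \mathcal{G} \in \mathbb{R}^{R_1 \times \cdots \times R_N},\;
U^{(n)} \in \mathbb{R}^{I_n \times R_n}

% Student-t as a Gaussian scale mixture (Gamma with shape a, rate b)
x \mid \lambda \sim \mathcal{N}(0, \lambda^{-1}), \quad
\lambda \sim \mathrm{Gamma}(a, b)
\;\Longrightarrow\;
x \sim \text{Student-}t \text{ with } 2a \text{ degrees of freedom and scale } \sqrt{b/a}

% Laplace as a Gaussian scale mixture (exponential mixing density on the variance)
x \mid \tau \sim \mathcal{N}(0, \tau), \quad
p(\tau) = \tfrac{\gamma}{2} e^{-\gamma \tau / 2}
\;\Longrightarrow\;
p(x) = \tfrac{\sqrt{\gamma}}{2} e^{-\sqrt{\gamma}\,|x|}
```

Writing a heavy-tailed prior as a Gaussian with a random scale is what keeps the conditional distributions tractable, which is presumably why the abstract speaks of a hierarchical representation.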

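Research Motivation and Objectives

Address the challenge of manually selecting the multilinear rank for Tucker decomposition when data are noisy and incomplete.
Adapt model complexity automatically through Bayesian inference with structured sparsity priors.
Improve the generalization of tensor completion by inferring the optimal rank from the data evidence.
Develop scalable, efficient algorithms based on multilinear operations for large-scale tensor applications.
Provide uncertainty-aware inference over latent factors to make low-rank recovery more robust.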

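Proposed Method

Introduce hierarchical Laplace and Student-t priors that induce group sparsity over the multilinear latent space.
Use variational Bayesian inference to jointly estimate the latent factors, core tensor, and hyperparameters.
Exploit multilinear operations and derived theorems to improve computational efficiency and scalability.
Determine the optimal multilinear rank automatically by maximizing the lower bound of the model evidence.
Handle the Laplace prior, which is non-conjugate, through a variational approximation.
Support both decomposition of fully observed tensors and completion of partially observed tensors in a unified framework.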
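
To make the link between group sparsity and rank determination concrete, here is a minimal NumPy sketch. It is not the authors' algorithm and does no variational inference. It only shows that a Tucker tensor is rebuilt from a core and factor matrices by mode products, and that when whole core slices are zero the matching latent components can be dropped without changing the reconstruction. The helper names (`mode_product`, `tucker_reconstruct`, `prune_rank`) and the tolerance `tol` are hypothetical, chosen for illustration.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `tensor` by `matrix` along axis `mode`."""
    # Contract the matrix columns with the chosen tensor axis; the new axis comes out first.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    # Move the new axis back to position `mode`.
    return np.moveaxis(out, 0, mode)

def tucker_reconstruct(core, factors):
    """Rebuild the full tensor from a core tensor and one factor matrix per mode."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out

def prune_rank(core, factors, tol=1e-8):
    """Hypothetical helper: drop components whose core slice has (near) zero energy."""
    pruned_factors = []
    for mode, factor in enumerate(factors):
        other_axes = tuple(a for a in range(core.ndim) if a != mode)
        # Frobenius norm of each slice of the core along this mode.
        energy = np.sqrt((core ** 2).sum(axis=other_axes))
        keep = energy > tol
        core = np.compress(keep, core, axis=mode)
        pruned_factors.append(factor[:, keep])
    return core, pruned_factors

# Toy example: start from an over-sized rank (5, 5, 5) whose core is
# nonzero only in a (2, 3, 2) block, as a group-sparse solution would be.
rng = np.random.default_rng(0)
true_rank, init_rank = (2, 3, 2), (5, 5, 5)
core = np.zeros(init_rank)
core[:2, :3, :2] = rng.standard_normal(true_rank)
factors = [rng.standard_normal((10, r)) for r in init_rank]

full = tucker_reconstruct(core, factors)
small_core, small_factors = prune_rank(core, factors)
print(small_core.shape)  # multilinear rank after pruning
```

In the models described above, the sparsity pattern would be inferred from the data through the priors. Here it is planted by hand so the pruning step can be checked in isolation.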

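Experimental Results

Research Questions

RQ1: Can the Bayesian sparse Tucker models determine the optimal multilinear rank automatically, without manual specification?
RQ2: Does the structural sparsity induced by hierarchical priors improve tensor completion in the presence of missing data and noise?
RQ3: Does variational Bayesian inference with non-conjugate priors significantly improve robustness and generalization in tensor decomposition?
RQ4: How does the proposed method compare with state-of-the-art methods such as HaLRTC and iHOOI on MRI and chemometrics data?
RQ5: Does the model maintain high performance under high missing rates and non-uniformly distributed missing data?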

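Key Findings

On MRI data with 50% of entries missing, BTC-T and BTC-L achieve the highest PSNR (27.32 ± 0.11) and the lowest RRSE (0.12), significantly outperforming iHOOI and HaLRTC.
At an 80% missing rate, BTC-T achieves a PSNR of 20.14 ± 0.25 and an RRSE of 0.25, significantly outperforming HaLRTC (PSNR 17.37 ± 0.34) and iHOOI (PSNR 18.65 ± 0.30).
In global tensor completion, BTC-T reaches a PSNR of 22.33 at an 80% missing rate, ahead of WTucker (19.42) and iHOOI (19.77).
On synthetic data, the method recovers the true multilinear rank with high accuracy even under high noise and high missing rates.
The visualizations in Figure 6 confirm that BTC-T yields markedly better reconstruction quality on MRI data, especially at high missing rates.
The BTC methods are robust across data types, including neuroimaging and chemometrics, with consistently better performance.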
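
The findings above are reported in PSNR and RRSE. As a reading aid, the sketch below computes both under their common definitions: RRSE as the Frobenius norm of the error divided by the Frobenius norm of the ground truth, and PSNR as 10·log10(peak² / MSE). The review does not state the paper's exact conventions, so the choice of `peak` (defaulting to the maximum of the ground truth) and evaluation over all entries are assumptions.

```python
import numpy as np

def rrse(truth, estimate):
    """Relative root squared error: ||estimate - truth||_F / ||truth||_F."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)

def psnr(truth, estimate, peak=None):
    """Peak signal-to-noise ratio in dB.

    `peak` defaults to the maximum of `truth`; this default is an assumption,
    not a convention taken from the paper.
    """
    if peak is None:
        peak = truth.max()
    mse = np.mean((estimate - truth) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: a constant tensor of 2s with a uniform error of 0.2.
truth = np.full((4, 4, 4), 2.0)
estimate = truth + 0.2
print(rrse(truth, estimate))  # close to 0.1
print(psnr(truth, estimate))  # close to 20 dB
```

Lower RRSE and higher PSNR both indicate a reconstruction closer to the ground truth, which is the sense in which the comparisons above should be read.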

This review was generated by AI and reviewed by human editors.