QUICK REVIEW

[Paper Review] A Riemannian Network for SPD Matrix Learning

Zhiwu Huang, Luc Van Gool|arXiv (Cornell University)|Aug 15, 2016

Face and Expression Recognition|Cited by 120

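One-Sentence Summary

Key takeaway: The paper proposes SPDNet, a deep network that learns non-linear representations of SPD matrices with BiMap, ReEig, and LogEig layers, and is trained with Riemannian SGD on Stiefel manifolds so that the SPD structure is preserved.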

ABSTRACT

Symmetric Positive Definite (SPD) matrix learning methods have become popular in many image and video processing tasks, thanks to their ability to learn appropriate statistical representations while respecting Riemannian geometry of underlying SPD manifolds. In this paper we build a Riemannian network architecture to open up a new direction of SPD matrix non-linear learning in a deep model. In particular, we devise bilinear mapping layers to transform input SPD matrices to more desirable SPD matrices, exploit eigenvalue rectification layers to apply a non-linear activation function to the new SPD matrices, and design an eigenvalue logarithm layer to perform Riemannian computing on the resulting SPD matrices for regular output layers. For training the proposed deep network, we exploit a new backpropagation with a variant of stochastic gradient descent on Stiefel manifolds to update the structured connection weights and the involved SPD matrix data. We show through experiments that the proposed SPD matrix network can be simply trained and outperform existing SPD matrix learning and state-of-the-art methods in three typical visual classification tasks.

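Motivation and Objectives

Learn directly on symmetric positive definite (SPD) matrices while respecting their Riemannian geometry.
Propose a deep architecture (SPDNet) whose layers operate on SPD matrices.
Develop backpropagation and optimization on Stiefel manifolds for the SPD transformation weights.
Demonstrate gains over shallow SPD methods on emotion recognition, action recognition, and face verification tasks.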

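Proposed Method

BiMap layer: X_k = W_k X_{k-1} W_k^T, with W_k constrained to a Stiefel manifold (semi-orthogonal, W_k W_k^T = I) so that the output stays SPD.
ReEig layer: applies a ReLU-like non-linearity to the eigenvalues, X_k = U max(εI, Σ) U^T, where X_{k-1} = U Σ U^T is the eigendecomposition of the input.
LogEig layer: takes the logarithm of the eigenvalues, mapping the SPD matrix into a Euclidean space so that standard fully connected and softmax layers can follow.
Riemannian backpropagation: the BiMap weights are updated by SGD on the Stiefel manifold, with a retraction step that maps each update back onto the manifold.
Matrix backpropagation: gradients through the ReEig and LogEig layers are derived with an extension of the matrix chain rule applied to the eigenvalue decomposition (EIG).
Training details: four configurations (0 to 3 BiRe blocks, where a BiRe block is a BiMap layer followed by a ReEig layer), learning rate 1e-2, random semi-orthogonal weight initialization, ε = 1e-4.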
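
The layer operations listed above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the QR-based retraction, the toy loss -trace(W X W^T), and the flattening of the LogEig output are all choices made here for the example. The tangent-space projection removes the part of the gradient that lies in the row space of W, which is one common way to obtain a Riemannian gradient for a row-orthonormal weight.

```python
import numpy as np

def random_semi_orthogonal(d_out, d_in, rng):
    """Random d_out x d_in matrix with orthonormal rows (W @ W.T = I)."""
    q, _ = np.linalg.qr(rng.standard_normal((d_in, d_out)))  # q has orthonormal columns
    return q.T

def bimap(x, w):
    """BiMap: X_k = W_k X_{k-1} W_k^T."""
    return w @ x @ w.T

def reeig(x, eps=1e-4):
    """ReEig: clamp the eigenvalues of a symmetric matrix from below at eps."""
    vals, vecs = np.linalg.eigh(x)
    return (vecs * np.maximum(vals, eps)) @ vecs.T

def logeig(x):
    """LogEig: matrix logarithm computed through the eigendecomposition."""
    vals, vecs = np.linalg.eigh(x)
    return (vecs * np.log(vals)) @ vecs.T

def stiefel_sgd_step(w, euclid_grad, lr=1e-2):
    """One SGD step that keeps the rows of W orthonormal (illustrative)."""
    # Drop the gradient component lying in the row space of W.
    riem_grad = euclid_grad - euclid_grad @ w.T @ w
    # Retract onto the manifold with a QR factorization (assumed retraction).
    q, r = np.linalg.qr((w - lr * riem_grad).T)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return (q * signs).T

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 20))
x0 = a @ a.T / 20 + 1e-3 * np.eye(8)      # an 8x8 SPD input (hypothetical data)
w = random_semi_orthogonal(4, 8, rng)

x1 = reeig(bimap(x0, w))                  # one BiRe block: 8x8 -> 4x4, still SPD
features = logeig(x1).ravel()             # Euclidean features for a dense layer

toy_grad = -2.0 * w @ x0                  # gradient of the toy loss -trace(W X W^T)
w_new = stiefel_sgd_step(w, toy_grad)     # updated weight, rows still orthonormal
```

A full network would stack several BiMap/ReEig pairs and obtain the Euclidean gradient by backpropagating the classification loss; the step function here only shows how a weight remains semi-orthogonal after an update.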

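Experimental Results

Research Questions

RQ1: Can deep non-linear learning be performed directly on the SPD manifold while preserving SPD structure?
RQ2: Do the BiMap and ReEig layers provide meaningful non-linearity for SPD matrices beyond the LogEig transformation?
RQ3: How does SPDNet with Riemannian backpropagation compare with existing shallow SPD learning methods on standard vision tasks?

Key Findings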

Method	AFEW	HDM05	PaSC1	PaSC2
STM-ExpLet	31.73%	–	–	–
RSR-SPDML	30.12%	48.01% ±3.38	–	–
DeepO2P	28.54%	–	68.76%	60.14%
CDL	31.81%	41.74% ±1.92	78.29%	70.41%
LEML	25.13%	46.87% ±2.19	66.53%	58.34%
SPDML-AIM	26.72%	47.25% ±2.78	65.47%	59.03%
SPDML-Stein	24.55%	46.21% ±2.65	61.63%	56.67%
RSR	27.49%	41.12% ±2.53	–	–
SPDNet-0BiRe	26.32%	48.12% ±3.15	68.52%	63.92%
SPDNet-1BiRe	29.12%	55.26% ±2.37	71.75%	65.81%
SPDNet-2BiRe	31.54%	59.13% ±1.78	76.23%	69.64%
SPDNet-3BiRe	34.23%	61.45% ±1.12	80.12%	72.83%

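SPDNet-3BiRe reaches 34.23% on AFEW, 61.45% on HDM05, and 80.12% on PaSC1 (72.83% on PaSC2), surpassing the shallow SPD methods listed in the table.
Accuracy rises with depth: each added BiRe block improves on the previous configuration, from SPDNet-0BiRe through SPDNet-3BiRe, on all four benchmarks.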
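The BiRe blocks account for the gain shown in the table: SPDNet-0BiRe, which has no BiMap or ReEig layers, reaches 26.32% on AFEW, against 34.23% for SPDNet-3BiRe. The review describes the LogEig layer as essential, but the table contains no configuration without it, so these numbers do not isolate its contribution.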
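SPDNet-3BiRe outperforms DeepO2P and the other SPD learning baselines on AFEW, HDM05, and PaSC (DeepO2P has no reported HDM05 result).
The experiments show that training converges and that the network benefits from non-linear eigenvalue rectification (study of the ε parameter).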

This review was generated by AI and checked by a human editor.