QUICK REVIEW

[Paper Review] Fast Evaluation and Approximation of the Gauss-Newton Hessian Matrix for the Multilayer Perceptron

Chao Chen, Severin Reiz|arXiv (Cornell University)|Jan 1, 2019

Neural Networks and Applications | Cited by 2

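One-Sentence Summary

This paper proposes a fast sampling algorithm that reduces the cost of evaluating an entry of the Gauss-Newton Hessian (GNH) matrix of a multilayer perceptron from $O(Nn)$ to $O(n + d/\epsilon^2)$, which makes an efficient hierarchical-matrix ($\mathcal{H}$-matrix) approximation practical. By exploiting low-rank structure, the approximation needs $\mathcal{O}(N r_o)$ memory and $\mathcal{O}(N r_o^2)$ work to factorize, substantially speeding up the linear-system solves and eigenvalue computations that arise in neural network training.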

ABSTRACT

We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the multilayer perceptron. The algorithm has a precomputation step and a sampling step. While it generally requires $O(Nn)$ work to compute an entry (and the entire column) in the GNH matrix for a neural network with $N$ parameters and $n$ data points, our fast sampling algorithm reduces the cost to $O(n+d/\epsilon^2)$ work, where $d$ is the output dimension of the network and $\epsilon$ is a prescribed accuracy (independent of $N$). One application of our algorithm is constructing the hierarchical-matrix ($\mathcal{H}$-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. While it generally requires $O(N^2)$ memory and $O(N^3)$ work to store and factorize the GNH matrix, respectively, the $\mathcal{H}$-matrix approximation requires only $\mathcal{O}(N r_o)$ memory footprint and $\mathcal{O}(N r_o^2)$ work to be factorized, where $r_o \ll N$ is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the $\mathcal{H}$-matrix approximation on classification and autoencoder neural networks.

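Motivation and Objectives

Reduce the cost of entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the multilayer perceptron.
Enable an efficient hierarchical-matrix ($\mathcal{H}$-matrix) approximation of the GNH matrix for large-scale linear algebra problems.
Achieve low memory and computational complexity when solving linear systems and eigenvalue problems that involve the GNH matrix.
Demonstrate practical performance on classification and autoencoder neural networks.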

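Proposed Method

A two-stage algorithm for evaluating GNH entries: a precomputation step followed by a sampling step.
Random sampling approximates each GNH entry to a prescribed accuracy $\epsilon$ that is independent of the number of parameters $N$.
The low-rank structure of the off-diagonal blocks of the GNH matrix is used to build an $\mathcal{H}$-matrix approximation.
The $\mathcal{H}$-matrix format reduces memory from $O(N^2)$ to $\mathcal{O}(N r_o)$, where $r_o \ll N$ is the maximum rank of the off-diagonal blocks.
The $\mathcal{H}$-matrix structure reduces the work to factorize the GNH matrix from $O(N^3)$ to $\mathcal{O}(N r_o^2)$.
The sampling cost $O(n + d/\epsilon^2)$ is controlled by the output dimension $d$ and the accuracy parameter $\epsilon$.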
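
To give a feel for why a sampling cost can scale like $1/\epsilon^2$, here is a minimal sketch of a generic Monte Carlo estimate of one entry of a product $J^T J$. It is not the paper's algorithm, whose precomputation and sampling steps are not described in this summary. The matrix `J`, the uniform row sampling, and the function names `exact_entry` and `sampled_entry` are all assumptions made for illustration. The point is only that the error of such an estimate shrinks roughly like one over the square root of the number of samples, so reaching accuracy $\epsilon$ takes on the order of $1/\epsilon^2$ samples regardless of how many rows exist.

```python
import random


def exact_entry(J, i, j):
    """Exact (i, j) entry of J^T J: one pass over every row of J."""
    return sum(row[i] * row[j] for row in J)


def sampled_entry(J, i, j, num_samples, rng):
    """Unbiased Monte Carlo estimate of the same entry.

    Draws rows uniformly with replacement and rescales by the row count,
    so the cost depends on num_samples rather than on len(J).
    """
    m = len(J)
    total = 0.0
    for _ in range(num_samples):
        row = J[rng.randrange(m)]
        total += row[i] * row[j]
    return m * total / num_samples


rng = random.Random(0)
# Hypothetical stand-in for a Jacobian: 20000 rows, 4 columns (an assumption).
J = [[rng.uniform(0.5, 1.5) for _ in range(4)] for _ in range(20000)]

exact = exact_entry(J, 1, 2)
approx = sampled_entry(J, 1, 2, num_samples=2000, rng=rng)
rel_err = abs(approx - exact) / abs(exact)
print(f"exact={exact:.1f} approx={approx:.1f} relative error={rel_err:.4f}")
```

In this toy setup the estimate touches 2000 rows instead of 20000. Quadrupling `num_samples` should roughly halve the typical error, which is the $1/\epsilon^2$ trade-off in miniature.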

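Experimental Results

Research Questions

RQ1: Can entry-wise evaluation of the GNH matrix be made independent of the number of parameters $N$?
RQ2: For a prescribed accuracy $\epsilon$, what is the minimum cost of sampling an entry of the GNH matrix?
RQ3: Can an $\mathcal{H}$-matrix approximation of the GNH matrix be built efficiently for large neural networks?
RQ4: How much memory and computation does the $\mathcal{H}$-matrix approximation save compared with storing and factorizing the full GNH matrix?
RQ5: How well does the proposed sampling method scale in practice on classification and autoencoder networks?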

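Key Findings

The proposed sampling algorithm reduces the cost of evaluating a GNH entry to $O(n + d/\epsilon^2)$, independent of $N$, which markedly improves scalability.
The $\mathcal{H}$-matrix approximation of the GNH matrix needs only $\mathcal{O}(N r_o)$ memory, where $r_o \ll N$ is the maximum rank of the off-diagonal blocks.
Factorizing the GNH matrix through the $\mathcal{H}$-matrix structure takes $\mathcal{O}(N r_o^2)$ work, far below the standard $O(N^3)$ cost.
The method makes it efficient to solve linear systems and eigenvalue problems involving the GNH matrix for large neural networks.
Experiments on classification and autoencoder networks show that combining fast sampling with the $\mathcal{H}$-matrix approximation improves performance.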
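
The memory saving behind the $\mathcal{O}(N r_o)$ figure comes from storing low-rank off-diagonal blocks as factors rather than as dense arrays. The sketch below illustrates that idea for a single block written as $U V^T$. It is an assumption-laden toy, not the paper's $\mathcal{H}$-matrix construction: the factors `U` and `V` and the helper names are hypothetical, and a real hierarchical format adds a recursive block partition on top of this.

```python
def lowrank_matvec(U, V, x):
    """Apply the block U V^T to x using only the factors."""
    r = len(U[0])
    # t = V^T x, a vector of length r
    t = [sum(V[k][p] * x[k] for k in range(len(V))) for p in range(r)]
    # y = U t
    return [sum(U_row[p] * t[p] for p in range(r)) for U_row in U]


def dense_from_factors(U, V):
    """Form the dense block U V^T explicitly (for checking only)."""
    r = len(U[0])
    return [[sum(U_row[p] * V_row[p] for p in range(r)) for V_row in V]
            for U_row in U]


def dense_matvec(A, x):
    """Ordinary dense matrix-vector product."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]


def storage_counts(rows, cols, rank):
    """Numbers stored: dense block versus the two rank-`rank` factors."""
    return rows * cols, rank * (rows + cols)


# Hypothetical rank-2 block of size 3 x 4.
U = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]]
V = [[1.0, 1.0], [0.0, 2.0], [4.0, 0.0], [1.0, 1.0]]
x = [1.0, 2.0, 3.0, 4.0]

y_fast = lowrank_matvec(U, V, x)
y_dense = dense_matvec(dense_from_factors(U, V), x)
print(y_fast, y_dense)
print(storage_counts(1000, 1000, 10))
```

For a 1000 by 1000 block of rank 10, the factors hold 20,000 numbers instead of 1,000,000, and the product with a vector costs work proportional to the factor sizes rather than to the full block.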

This review was generated by AI and reviewed by a human editor.