QUICK REVIEW

[Paper Review] Convolutional Rectifier Networks as Generalized Tensor Decompositions

Nadav Cohen, Amnon Shashua|arXiv (Cornell University)|Mar 1, 2016

Tensor decomposition and applications | References: 17 | Cited by: 21

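One-Sentence Summary

This paper proposes a generalized tensor decomposition framework that transforms convolutional arithmetic circuits into convolutional rectifier networks (ConvNets with ReLU activation and max or average pooling). Using tools from arithmetic circuit theory, it proves that ReLU networks with max pooling are universal but only partially depth efficient, whereas arithmetic circuits with product pooling are completely depth efficient. This suggests that convolutional arithmetic circuits, if they can be trained effectively, may offer superior expressive power.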

ABSTRACT

Convolutional rectifier networks, i.e. convolutional neural networks with rectified linear activation and max or average pooling, are the cornerstone of modern deep learning. However, despite their wide use and success, our theoretical understanding of the expressive properties that drive these networks is partial at best. On the other hand, we have a much firmer grasp of these issues in the world of arithmetic circuits. Specifically, it is known that convolutional arithmetic circuits possess the property of "complete depth efficiency", meaning that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be implemented (or even approximated) by a shallow network. In this paper we describe a construction based on generalized tensor decompositions, that transforms convolutional arithmetic circuits into convolutional rectifier networks. We then use mathematical tools available from the world of arithmetic circuits to prove new results. First, we show that convolutional rectifier networks are universal with max pooling but not with average pooling. Second, and more importantly, we show that depth efficiency is weaker with convolutional rectifier networks than it is with convolutional arithmetic circuits. This leads us to believe that developing effective methods for training convolutional arithmetic circuits, thereby fulfilling their expressive potential, may give rise to a deep learning architecture that is provably superior to convolutional rectifier networks but has so far been overlooked by practitioners.

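Research Motivation and Goals

Establish a theoretical link between convolutional rectifier networks and generalized tensor decompositions.
Use tools from arithmetic circuit theory to analyze the expressive power and depth efficiency of ReLU-based ConvNets.
Compare the depth efficiency of ReLU-based networks under max or average pooling with that of convolutional arithmetic circuits under product pooling.
Argue that convolutional arithmetic circuits, so far overlooked by practitioners, may yield a provably superior architecture if they can be trained effectively.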

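Proposed Method

The authors define generalized tensor decompositions to model the hierarchical compositional structure of convolutional rectifier networks.
Through noise perturbation and tensor approximation, they construct a mapping from convolutional arithmetic circuits (linear activation, product pooling) to ReLU networks (ReLU activation, max or average pooling).
By introducing small perturbations to the weights and activations, they show that ReLU networks can approximate the basis tensors formed in the decomposition hierarchy.
They use mathematical tools from arithmetic circuit complexity to analyze the universality and depth efficiency of the resulting ReLU networks.
The analysis relies on perturbing the weights so that coefficients are non-negative and the ReLU activation is monotonic, which preserves the tensor structure.
They prove that under small noise the resulting network computes the basic tensor decomposition, which supports the theoretical analysis of expressive power.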
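
The idea of a generalized tensor decomposition can be made concrete with a short sketch. One way to formalize it is to replace the multiplication inside an outer product with a two-argument operator `g` that combines activation and pooling. The three operator choices below (product for arithmetic circuits, `max(a, b, 0)` for ReLU with max pooling, `max(a, 0) + max(b, 0)` for ReLU with average pooling) and all function names are assumptions made for illustration; they are not taken from the text above and should be checked against the paper before being relied on.

```python
import numpy as np

def generalized_outer(a, b, g):
    """Generalized tensor product: the entry at (i..., j...) is g(a[i...], b[j...]).

    With g = multiplication this is the ordinary outer (tensor) product.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Append one singleton axis per axis of b so broadcasting pairs every
    # entry of a with every entry of b.
    a_expanded = a.reshape(a.shape + (1,) * b.ndim)
    return g(a_expanded, b)

# Hypothetical activation-pooling operators (assumptions, see lead-in).
def g_product(x, y):
    return x * y                               # linear activation, product pooling

def g_relu_max(x, y):
    return np.maximum(np.maximum(x, y), 0.0)   # ReLU activation, max pooling

def g_relu_avg(x, y):
    return np.maximum(x, 0.0) + np.maximum(y, 0.0)  # ReLU, average pooling (unscaled)

def generalized_cp(weights, factors, g):
    """Shallow (CP-style) decomposition: sum_z weights[z] * (v_1 (x)_g ... (x)_g v_N)."""
    total = None
    for w, vectors in zip(weights, factors):
        term = np.asarray(vectors[0], dtype=float)
        for v in vectors[1:]:
            term = generalized_outer(term, v, g)
        total = w * term if total is None else total + w * term
    return total

a, b = [1.0, -2.0], [3.0, -4.0]
product_tensor = generalized_outer(a, b, g_product)
relu_max_tensor = generalized_outer(a, b, g_relu_max)
relu_avg_tensor = generalized_outer(a, b, g_relu_avg)
print(product_tensor, relu_max_tensor, relu_avg_tensor, sep="\n")
```

Swapping `g` is the only change needed to move between the three architectures in this sketch, which is what lets results about one family be compared with the others.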

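Results

Research Questions

RQ1: Can convolutional rectifier networks be formally analyzed through the lens of generalized tensor decompositions?
RQ2: Within this framework, is universality fully retained by ReLU networks with max pooling or average pooling?
RQ3: To what extent do ReLU networks with max pooling exhibit depth efficiency compared with convolutional arithmetic circuits?
RQ4: Does using ReLU activation and max or average pooling limit expressive power relative to linear activation with product pooling?
RQ5: If suitable training methods are developed, can the theoretical advantages of convolutional arithmetic circuits be exploited in practice?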

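Main Findings

Convolutional rectifier networks with max pooling are universal, meaning they can approximate any continuous function on a compact set.
In contrast, ReLU networks with average pooling are not universal: they cannot express all functions even when their width is increased.
The depth efficiency of ReLU networks with max pooling is incomplete: there is a set of functions of positive measure that shallow networks can implement efficiently.
This contrasts with convolutional arithmetic circuits, which are completely depth efficient: apart from a negligible set, every function implemented by a deep network of polynomial size requires exponential size to be implemented, or even approximated, by a shallow network.
The results indicate that the theoretical expressive power of convolutional arithmetic circuits exceeds that of standard ReLU-based ConvNets.
The paper concludes that developing training methods for convolutional arithmetic circuits could lead to deep learning models that are provably superior to current ReLU-based architectures.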
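
The contrast between max and average pooling in the findings above can be illustrated numerically. The sketch below is not the paper's proof. It assumes a hypothetical two-input shallow model whose output tensor is a weighted sum of terms `g(a[i], b[j])`, with `g(x, y) = max(x, y, 0)` standing in for ReLU with max pooling and `g(x, y) = max(x, 0) + max(y, 0)` for ReLU with average pooling. Under the average-pooling operator every term is a sum of a row-only part and a column-only part, so the resulting matrix has rank at most 2 no matter how many terms are added. The max-pooling operator has no such constraint.

```python
import numpy as np

def shallow_grid_tensor(weights, a_vectors, b_vectors, g):
    """Two-mode tensor with entries sum_z weights[z] * g(a_z[i], b_z[j])."""
    total = 0.0
    for w, a, b in zip(weights, a_vectors, b_vectors):
        a_col = np.asarray(a, dtype=float)[:, None]  # varies along rows
        b_row = np.asarray(b, dtype=float)[None, :]  # varies along columns
        total = total + w * g(a_col, b_row)
    return total

# Hypothetical activation-pooling operators (assumptions, see lead-in).
def g_relu_max(x, y):
    return np.maximum(np.maximum(x, y), 0.0)

def g_relu_avg(x, y):
    return np.maximum(x, 0.0) + np.maximum(y, 0.0)

TOL = 1e-9  # explicit rank tolerance, so rounding noise is not counted as rank

# A single term already reaches full rank under the max operator.
ramp = [1.0, 2.0, 3.0, 4.0]
max_matrix = shallow_grid_tensor([1.0], [ramp], [ramp], g_relu_max)
avg_matrix = shallow_grid_tensor([1.0], [ramp], [ramp], g_relu_avg)
rank_max = np.linalg.matrix_rank(max_matrix, tol=TOL)
rank_avg = np.linalg.matrix_rank(avg_matrix, tol=TOL)

# Adding many random terms (more width) does not lift the average-pooling rank.
rng = np.random.default_rng(0)
many_terms = 20
avg_many = shallow_grid_tensor(
    rng.normal(size=many_terms),
    rng.normal(size=(many_terms, 4)),
    rng.normal(size=(many_terms, 4)),
    g_relu_avg,
)
rank_avg_many = np.linalg.matrix_rank(avg_many, tol=TOL)

print(rank_max, rank_avg, rank_avg_many)
```

The rank cap is a property of this toy operator only. It is meant to give intuition for why extra width cannot rescue average pooling, not to reproduce the paper's argument.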

This review was generated by AI and reviewed by human editors.