QUICK REVIEW

[Paper Review] Tensor Regression Networks

Jean Kossaifi, Zachary C. Lipton|arXiv (Cornell University)|Jul 26, 2017
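Tensor decomposition and applications | References: 49 | Cited by: 78
One-Sentence Summary

The paper proposes tensor contraction layers (TCLs) and tensor regression layers (TRLs) to preserve multilinear structure in neural networks, achieving competitive accuracy on ImageNet with substantial parameter compression and improving MRI-based trait prediction.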

ABSTRACT

Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end, and produce accurate nets with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach's ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.

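Motivation and Objectives

  • Preserve the multilinear structure of activation tensors throughout the CNN instead of flattening them before the fully connected layers.
  • Introduce TCLs, which compress activations through tensor contraction.
  • Introduce TRLs, which model the outputs through a low-rank multilinear mapping, with no flattening required.
  • Demonstrate the trade-off between parameter efficiency and accuracy on large-scale and medical imaging datasets.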

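Proposed Method

  • Defines and integrates tensor contraction layers (TCLs), which map an activation tensor $X$ to a compact core tensor via $X' = X \times_1 V^{(0)} \times_2 V^{(1)} \cdots \times_{N+1} V^{(N)}$.
  • Defines tensor regression layers (TRLs), which learn a weight tensor with low-rank Tucker structure, $W = [\![G;\, U^{(0)}, \ldots, U^{(N)}, U^{(N+1)}]\!]$, and compute $Y = \langle X, W \rangle_N + b$.
  • Derives gradient expressions for TCLs and TRLs to enable end-to-end backpropagation.
  • Shows, from a tensor-product perspective, that a TCL is equivalent to a fully connected layer with a structured (tensor-product) weight matrix, and highlights the reduction in parameter count (a sum over dimensions rather than their product).
  • Provides an efficient implementation by rewriting $Y$ in terms of a low-rank subspace, which minimizes high-dimensional computation.
  • Discusses regularization through the low-rank constraints and normalization of the factor matrices.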
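
The TCL and TRL definitions above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`mode_dot`, `tcl`, `tucker_to_tensor`, `trl`), the tensor sizes, and the ranks are all hypothetical and chosen only for the demo. The sketch also checks two claims from the list numerically: that a TCL matches a fully connected layer whose weight matrix is a Kronecker product of the factors, and that the TRL output can be computed in the low-rank subspace without forming the full weight tensor.

```python
import numpy as np

def mode_dot(x, m, mode):
    """Mode-`mode` product of tensor x with a matrix m of shape (new_dim, old_dim)."""
    out = np.tensordot(x, m, axes=([mode], [1]))  # contracted axis dropped, new axis last
    return np.moveaxis(out, -1, mode)             # move the new axis back into place

def tcl(x, factors):
    """Contract each non-batch mode of x (batch, I0, ..., IN) with V(k) of shape (Rk, Ik)."""
    for k, v in enumerate(factors):
        x = mode_dot(x, v, k + 1)                 # mode 0 is the batch and is left untouched
    return x

def tucker_to_tensor(core, factors):
    """Rebuild the full tensor from a Tucker core and factors U(k) of shape (dim_k, Rk)."""
    w = core
    for k, u in enumerate(factors):
        w = mode_dot(w, u, k)
    return w

def trl(x, core, factors, bias):
    """Low-rank regression Y = <X, W>_N + b, computed without forming W."""
    *in_factors, out_factor = factors
    z = tcl(x, [u.T for u in in_factors])         # projected activations: (batch, R0, ..., RN)
    n = z.ndim - 1
    y = np.tensordot(z, core, axes=(list(range(1, n + 1)), list(range(n))))  # (batch, R_out)
    return y @ out_factor.T + bias                # (batch, n_out)

# Hypothetical sizes for the demo (not taken from the paper).
rng = np.random.default_rng(0)
batch, in_shape, n_out = 4, (8, 5, 5), 10
x = rng.standard_normal((batch, *in_shape))

# TCL: shrink (8, 5, 5) activations to (3, 2, 2).
tcl_ranks = (3, 2, 2)
v = [rng.standard_normal((r, i)) for r, i in zip(tcl_ranks, in_shape)]
x_small = tcl(x, v)

# The same map written as a fully connected layer on flattened activations,
# with a weight matrix constrained to a Kronecker product of the factors.
w_kron = np.kron(np.kron(v[0], v[1]), v[2])       # shape (3*2*2, 8*5*5)
x_small_fc = x.reshape(batch, -1) @ w_kron.T

# TRL: Tucker-structured weights with assumed rank (4, 3, 3, 6).
trl_ranks = (4, 3, 3, 6)
core = rng.standard_normal(trl_ranks)
u = [rng.standard_normal((d, r)) for d, r in zip((*in_shape, n_out), trl_ranks)]
bias = rng.standard_normal(n_out)
y = trl(x, core, u, bias)

# Reference: form the full weight tensor and contract it with the activations.
w_full = tucker_to_tensor(core, u)                # shape (8, 5, 5, 10)
y_ref = np.tensordot(x, w_full, axes=3) + bias

print(x_small.shape, np.allclose(x_small.reshape(batch, -1), x_small_fc))
print(y.shape, np.allclose(y, y_ref))
```

In this sketch the TCL stores 3·8 + 2·5 + 2·5 = 44 weights, whereas the unconstrained fully connected layer with the same input and output sizes would store 12·200 = 2400, which is the sum-versus-product effect mentioned in the list.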

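Experimental Results

Research Questions

  • RQ1: Can preserving multi-mode tensor structure with TCLs and TRLs match or exceed fully connected layers on large-scale vision tasks?
  • RQ2: To what extent do TCLs and TRLs reduce the parameter count while maintaining ImageNet accuracy?
  • RQ3: Do TRLs offer an advantage over conventional flattening on structurally rich medical imaging data such as MRI?
  • RQ4: How does end-to-end training of tensorized architectures compare with conventional CNNs in terms of performance and regularization?
  • RQ5: What practical efficiency gains does implementing tensor contractions on modern hardware provide?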

Key Findings

Model            | Top-1 (%) | Top-5 (%) | Space savings (%)
Baseline         | 77.1      | 93.4      | 0
(300, 1, 1, 700) | 77.2      | 93.5      | 25.6
(200, 1, 1, 200) | 77.1      | 93.2      | 68.2
(120, 1, 1, 300) | 76.7      | 93.1      | 71.2
(150, 1, 1, 150) | 76.0      | 92.9      | 76.6
(100, 1, 1, 100) | 74.6      | 91.7      | 84.6
(50, 1, 1, 50)   | 73.6      | 91.0      | 92.4
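  • On ImageNet with ResNet-101, replacing the fully connected layer with a TRL yields Top-1/Top-5 accuracy comparable to or slightly above the baseline at space savings of up to 68.2%; larger savings (up to 92.4%) come with a gradual drop in accuracy (Top-1 falls from 77.1% to 73.6%).
  • Lower-rank TRL configurations achieve larger parameter reductions while largely maintaining accuracy (for example, roughly 65% space savings with minimal loss in accuracy).
  • Combining TRLs with TCLs preserves multilinear structure and reduces parameters by replacing flatten + FC with tensor-based mappings.
  • On the MRI-based UK Biobank tasks (age, sex, BMI), TRLs clearly outperform the baseline 3D-ResNet with a fully connected layer: age MAE drops from 2.96 to 2.70 years, sex classification error from 0.79% to 0.53%, and BMI MAE from 2.37 to 2.26.
  • The results suggest that tensor-structured networks can exploit the topological structure of the data, particularly in medical imaging, to improve predictive performance.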
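
To make the space-savings column more concrete, the sketch below counts the parameters of a Tucker-structured TRL and compares them with a plain fully connected classifier. The setup is an assumption, not something stated in this summary: a 2048 × 7 × 7 final activation (the usual ResNet-101 feature map for 224 × 224 inputs), 1000 output classes, a baseline layer acting on 2048 pooled features, and a bias counted on both sides. The function names are hypothetical. Under these assumptions the computed savings land close to the reported figures, within a few tenths of a percentage point, but the paper's exact counting convention is not given here, so the match should be read as indicative only.

```python
from math import prod

def fc_params(in_features, n_out):
    """Weights plus bias of a plain fully connected layer."""
    return in_features * n_out + n_out

def tucker_trl_params(in_shape, n_out, rank):
    """Core, factor matrices and bias of a Tucker-structured regression layer."""
    dims = (*in_shape, n_out)                   # one factor matrix per mode, output included
    core = prod(rank)
    factors = sum(d * r for d, r in zip(dims, rank))
    return core + factors + n_out

def space_savings(in_shape, n_out, rank, fc_in):
    """Percentage of parameters saved relative to the fully connected baseline."""
    trl = tucker_trl_params(in_shape, n_out, rank)
    return 100.0 * (1.0 - trl / fc_params(fc_in, n_out))

# Assumed setup: 2048 x 7 x 7 activations, 1000 classes, FC baseline on 2048 features.
IN_SHAPE, N_OUT, FC_IN = (2048, 7, 7), 1000, 2048

for rank in [(300, 1, 1, 700), (200, 1, 1, 200), (100, 1, 1, 100), (50, 1, 1, 50)]:
    print(rank, round(space_savings(IN_SHAPE, N_OUT, rank, FC_IN), 1))
```

The count shows where the compression comes from: the two spatial factors contribute only 7 weights each at rank 1, so the budget is dominated by the channel factor (2048 × R), the output factor (1000 × R), and the core.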


This review was generated by AI and reviewed by human editors.