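[Paper Digest] The Gaussian equivalence of generative models for learning with shallow neural networks
This paper analyzes a teacher-student setup in which data are generated by passing a latent Gaussian vector through a generative network, and studies the learning of shallow neural networks within a Gaussian equivalence framework.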
Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models. This is possible due to a Gaussian equivalence stating that the key metrics of interest, such as the training and test errors, can be fully captured by an appropriately chosen Gaussian model. We provide three strands of rigorous, analytical and numerical evidence corroborating this equivalence. First, we establish rigorous conditions for the Gaussian equivalence to hold in the case of single-layer generative models, as well as deterministic rates for convergence in distribution. Second, we leverage this equivalence to derive a closed set of equations describing the generalisation performance of two widely studied machine learning problems: two-layer neural networks trained using one-pass stochastic gradient descent, and full-batch pre-learned features or kernel methods. Finally, we perform experiments demonstrating how our theory applies to deep, pre-trained generative models. These results open a viable path to the theoretical study of machine learning models with realistic data.
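Research motivation and objectives
- Move beyond the i.i.d. input assumption by studying learning from generated data in a teacher-student setup.
- Model the data as x = G(c) with c ~ N(0, I_D), with labels produced by a two-layer teacher network.
- Derive closed-form learning equations for two-layer neural networks, or for single-layer networks trained after a fixed feature map.
- Determine the conditions under which a Gaussian approximation captures the learning dynamics on generated data.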
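Proposed method
- Formulate data generation as drawing c ~ N(0, I_D) and then setting x = G(c).
- Define the label y through the response of a two-layer teacher network to the latent vector c.
- Analyze the learning of a two-layer neural network in closed form.
- Analyze the learning of a single-layer network after projection through a fixed feature map.
- Use the Gaussian equivalence to connect the generative model to a standard Gaussian setting.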
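The data model in the list above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's code: the single-layer random generator G(c) = tanh(Ac/√D), the tanh teacher activation, the unit second-layer teacher weights, and the names `sample_generative_data`, `A` and `W_teacher` are all assumptions made for the example.

```python
import numpy as np

def sample_generative_data(n, D=20, N=50, M=2, seed=0):
    """Draw n samples (c, x, y) from a toy teacher-student generative setup.

    Assumptions (not from the paper): a random single-layer generator with
    tanh non-linearity, and a teacher with M tanh hidden units whose
    second-layer weights are all equal to one.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, D))          # hypothetical generator weights
    W_teacher = rng.standard_normal((M, D))  # hypothetical teacher first layer

    c = rng.standard_normal((n, D))          # latent vector c ~ N(0, I_D)
    x = np.tanh(c @ A.T / np.sqrt(D))        # input x = G(c), shape (n, N)
    # The teacher sees the latent c, not the input x.
    y = np.tanh(c @ W_teacher.T / np.sqrt(D)).sum(axis=1)
    return c, x, y

c, x, y = sample_generative_data(n=8)
print(c.shape, x.shape, y.shape)
```

The student network would be trained on the pairs (x, y) only; the latent c is returned here purely for inspection.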
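Experimental results
Research questions
- RQ1: Does the Gaussian equivalence hold for learning on data produced by deep generative networks?
- RQ2: How does projection through a fixed feature map affect the learnability of shallow networks?
- RQ3: What are the closed-form learning dynamics of two-layer and single-layer networks in this generative setting?
- RQ4: Under what conditions can generated data be treated as Gaussian for the purpose of training shallow networks?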
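To make the learning-dynamics question concrete, the following is a rough simulation of one-pass (online) SGD for a two-layer student, the kind of trajectory that closed-form equations would be compared against. It is a sketch under assumptions that do not come from the digest: a random single-layer sign generator, tanh activations, second-layer weights fixed to one for both teacher and student, and a hypothetical function name `one_pass_sgd` with arbitrary default hyperparameters.

```python
import numpy as np

def one_pass_sgd(D=50, N=100, K=2, M=2, steps=10000, lr=0.1, seed=0):
    """Online SGD on fresh samples; returns test errors at 11 checkpoints."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, D))          # assumed random generator weights
    W_teacher = rng.standard_normal((M, D))  # teacher first layer (acts on c)
    W = 0.1 * rng.standard_normal((K, N))    # student first layer (acts on x)

    def sample(n):
        c = rng.standard_normal((n, D))
        x = np.sign(c @ A.T / np.sqrt(D))    # x = G(c), single-layer generator
        y = np.tanh(c @ W_teacher.T / np.sqrt(D)).sum(axis=1)
        return x, y

    x_test, y_test = sample(5000)

    def test_error():
        pred = np.tanh(x_test @ W.T / np.sqrt(N)).sum(axis=1)
        return 0.5 * np.mean((pred - y_test) ** 2)

    errors = [test_error()]
    for t in range(1, steps + 1):
        x, y = sample(1)                     # each sample is used exactly once
        lam = W @ x[0] / np.sqrt(N)          # student pre-activations, shape (K,)
        delta = np.tanh(lam).sum() - y[0]    # prediction error on this sample
        grad = delta * (1.0 - np.tanh(lam) ** 2)[:, None] * x[0][None, :]
        W -= lr / np.sqrt(N) * grad          # gradient step on the squared loss
        if t % (steps // 10) == 0:
            errors.append(test_error())
    return errors

errors = one_pass_sgd()
print(errors[0], errors[-1])
```

Because every step draws a fresh sample, the number of steps divided by N plays the role of a continuous training time, which is the regime in which analytical descriptions of one-pass SGD are usually stated.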
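Key findings
- Establishes a Gaussian equivalence view of learning in shallow networks on data drawn from generative models.
- Derives closed-form expressions for the learning of two-layer networks in the teacher-student setup.
- Shows how projection through a fixed feature map affects the learning of single-layer networks.
- Provides insight into when Gaussian approximation tools can be used to analyze generated data.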
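A quick numerical sanity check of the Gaussian equivalence idea is sketched below. It compares an error-like statistic computed on generated data with the same statistic computed on a Gaussian surrogate whose mean and covariance are matched to the pre-activations of one student unit and one teacher unit. The random sign generator, the tanh activations, the random (untrained) weights and the name `gaussian_equivalence_check` are assumptions for illustration. Because the weights are drawn independently, the two fields are nearly uncorrelated, so the check mainly probes how Gaussian the student field is.

```python
import numpy as np

def gaussian_equivalence_check(D=100, N=200, n=5000, seed=0):
    """Return (error on generated data, error on a moment-matched Gaussian)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, D))      # assumed random single-layer generator
    w_student = rng.standard_normal(N)   # one student hidden unit (acts on x)
    w_teacher = rng.standard_normal(D)   # one teacher hidden unit (acts on c)

    c = rng.standard_normal((n, D))
    x = np.sign(c @ A.T / np.sqrt(D))    # generated inputs x = G(c)
    lam = x @ w_student / np.sqrt(N)     # student pre-activation
    nu = c @ w_teacher / np.sqrt(D)      # teacher pre-activation
    err_true = np.mean((np.tanh(lam) - np.tanh(nu)) ** 2)

    # Gaussian surrogate: same mean and covariance for the pair (lam, nu).
    fields = np.stack([lam, nu], axis=1)
    mean = fields.mean(axis=0)
    cov = np.cov(fields, rowvar=False)
    g = rng.multivariate_normal(mean, cov, size=n)
    err_gauss = np.mean((np.tanh(g[:, 0]) - np.tanh(g[:, 1])) ** 2)
    return err_true, err_gauss

err_true, err_gauss = gaussian_equivalence_check()
print(err_true, err_gauss)
```

If the two numbers agree up to Monte Carlo noise, the statistic depends on the data only through the first two moments of the pre-activations, which is the practical content of the equivalence in this toy setting.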
This digest was generated by AI and reviewed by human editors.