QUICK REVIEW

[Paper Review] Discovering Symbolic Models from Deep Learning with Inductive Biases

Miles Cranmer, Álvaro Sánchez-González | arXiv (Cornell University) | Jun 19, 2020

Model Reduction and Neural Networks | References: 61 | Cited by: 269

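One-Sentence Summary

By enforcing sparse latent representations in Graph Neural Networks and applying symbolic regression to their internal components, the paper distills trained GNNs into symbolic, human-interpretable equations, recovering known physical laws and discovering a new formula relevant to cosmology.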

ABSTRACT

We develop a general approach to distill symbolic representations of a learned deep model by introducing strong inductive biases. We focus on Graph Neural Networks (GNNs). The technique works as follows: we first encourage sparse latent representations when we train a GNN in a supervised setting, then we apply symbolic regression to components of the learned model to extract explicit physical relations. We find the correct known equations, including force laws and Hamiltonians, can be extracted from the neural network. We then apply our method to a non-trivial cosmology example, a detailed dark matter simulation, and discover a new analytic formula which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures. The symbolic expressions extracted from the GNN using our technique also generalized to out-of-distribution data better than the GNN itself. Our approach offers alternative directions for interpreting neural networks and discovering novel physical principles from the representations they learn.

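Research Motivation and Goals

Motivation: combine deep learning with symbolic regression to obtain interpretable, physics-inspired models.
Exploit the strong inductive biases of graph networks to learn compact internal representations.
Extract explicit analytic expressions by applying symbolic regression to the components of the learned GN.
Demonstrate the method on Newtonian dynamics, Hamiltonian dynamics, and cosmology (dark matter halos).
Show that the symbolic expressions can generalize to out-of-distribution data better than the original neural model.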

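Proposed Method

Use a graph network (edge, node, and global models) whose inductive biases suit systems of interacting particles.
Train end to end to predict the dynamics or the system energy, encouraging compact latent representations through regularization (L1 or KL) or a bottleneck.
Apply symbolic regression (Eureqa) to the GN components (phi^e, phi^v, phi^u) to fit analytic expressions.
Replace the GN's internal functions with the learned symbolic expressions and refit their parameters to obtain an analytic model.
In the Hamiltonian case, use a Flattened Hamiltonian Graph Network so that separate H_pair and H_self terms are available for symbolic extraction.
Demonstrate the extraction of force laws and potential energies, and validate the approach on a cosmological dataset of dark matter halos.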
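
The training setup listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the particles are fully connected, the MLPs behind phi^e and phi^v are replaced by hypothetical linear maps (`phi_e`, `phi_v`), the global model phi^u is omitted, and the penalty weight `alpha` is an arbitrary choice. It shows how per-edge messages are summed at each node and how an L1 term on those messages enters the loss.

```python
def linear(weights, inputs):
    """Apply a weight matrix (a list of rows) to an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def phi_e(x_i, x_j, w_e):
    # Hypothetical edge model: message sent from node j to node i.
    return linear(w_e, x_i + x_j)


def phi_v(x_i, aggregated, w_v):
    # Hypothetical node model: prediction for node i from its summed messages.
    return linear(w_v, x_i + aggregated)


def gn_forward(nodes, w_e, w_v):
    """Return per-node predictions and the mean absolute message component."""
    msg_dim = len(w_e)
    predictions, abs_sum, count = [], 0.0, 0
    for i, x_i in enumerate(nodes):
        aggregated = [0.0] * msg_dim
        for j, x_j in enumerate(nodes):
            if i == j:
                continue  # no self-interaction in this toy setup
            message = phi_e(x_i, x_j, w_e)
            aggregated = [a + m for a, m in zip(aggregated, message)]
            abs_sum += sum(abs(m) for m in message)
            count += len(message)
        predictions.append(phi_v(x_i, aggregated, w_v))
    return predictions, abs_sum / count


def loss(nodes, targets, w_e, w_v, alpha=1e-2):
    """Mean squared prediction error plus an L1 penalty on the messages."""
    predictions, l1 = gn_forward(nodes, w_e, w_v)
    errors = [(p - t) ** 2
              for pred, targ in zip(predictions, targets)
              for p, t in zip(pred, targ)]
    return sum(errors) / len(errors) + alpha * l1


# Toy data (made up): three particles, 2 features each, 3-dimensional messages.
nodes = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.5]]
targets = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.0]]
w_e = [[0.5, -0.5, 0.1, 0.2], [0.0, 0.3, -0.3, 0.0], [0.0, 0.0, 0.0, 0.0]]
w_v = [[1.0, 0.0, 0.2, 0.0, 0.0], [0.0, 1.0, 0.0, 0.2, 0.0]]

predictions, l1_term = gn_forward(nodes, w_e, w_v)
total = loss(nodes, targets, w_e, w_v)
```

In an actual training loop the weights would be learned by gradient descent. The all-zero third row of `w_e` mimics a message channel that the penalty has switched off, which is the kind of sparsity that makes the remaining channels easier to fit symbolically.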

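Experimental Results

Research Questions

RQ1: Do strong inductive biases in graph networks make it possible to extract explicit symbolic equations describing the learned dynamics?
RQ2: Do the symbolic expressions derived from GN components correspond to known physical laws (such as Newtonian forces and Hamiltonians), and do they generalize to out-of-distribution data?
RQ3: Beyond reproducing force laws, can symbolic regression recover new, interpretable formulas on a realistic dataset (cosmology)?
RQ4: Does regularization (L1 or KL) or a bottleneck improve interpretability and the recoverability of the underlying physics?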

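Key Findings

The framework recovers known force laws and Hamiltonians from trained graph networks.
L1 regularization and bottleneck constraints yield compact, more interpretable latent messages that correlate with the true forces.
Applying symbolic regression to internal GN components extracts explicit analytic expressions that match physical laws (for example, a 1/r^2 force and a spring potential).
On cosmological data, the method discovers a new analytic formula that predicts dark matter overdensity from the nearby mass distribution, with error competitive with hand-designed models.
In the cosmology example, the symbolic expression generalizes to out-of-distribution data better than the original GN, pointing to a potential generalization benefit of symbolic interpretation.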
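
The extraction step behind these findings can be illustrated with a toy example. The sketch below is neither Eureqa nor the paper's search procedure: it swaps in a brute-force comparison over a small, hand-picked library of candidate expressions (`CANDIDATES`, an assumption), each with a single fitted coefficient. The "message" data are synthetic, generated from an assumed inverse-square law with a made-up coefficient of -2.5 plus a little noise.

```python
import random

# Hypothetical candidate library; a real symbolic regression tool searches
# a far larger space of expressions.
CANDIDATES = {
    "r": lambda r: r,
    "r^2": lambda r: r * r,
    "1/r": lambda r: 1.0 / r,
    "1/r^2": lambda r: 1.0 / (r * r),
    "1/r^3": lambda r: 1.0 / (r ** 3),
}


def fit_coefficient(f, rs, ys):
    """Closed-form least squares for y ~ c * f(r); returns (c, mse)."""
    fs = [f(r) for r in rs]
    c = sum(v * y for v, y in zip(fs, ys)) / sum(v * v for v in fs)
    mse = sum((c * v - y) ** 2 for v, y in zip(fs, ys)) / len(rs)
    return c, mse


def best_expression(rs, ys):
    """Pick the candidate with the lowest mean squared error."""
    fits = {name: fit_coefficient(f, rs, ys) for name, f in CANDIDATES.items()}
    name = min(fits, key=lambda n: fits[n][1])
    return name, fits[name][0]


# Synthetic stand-in for one message component as a function of separation r.
rng = random.Random(0)
rs = [rng.uniform(0.5, 3.0) for _ in range(200)]
ys = [-2.5 / (r * r) + rng.gauss(0.0, 0.01) for r in rs]

name, coef = best_expression(rs, ys)

# Evaluate the recovered expression outside the sampled range of r.
r_far = 10.0
prediction_far = coef * CANDIDATES[name](r_far)
```

Because the selected expression is a closed form, it can be evaluated at separations never seen during fitting, such as `r_far`, which gives a small-scale picture of why a symbolic model may extrapolate where a fitted network has no guarantee.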

This review was generated by AI and reviewed by a human editor.