QUICK REVIEW

[Paper Review] Learning on Large-scale Text-attributed Graphs via Variational Inference

Jianan Zhao, Meng Qu | arXiv (Cornell University) | Oct 26, 2022

Topic Modeling | Cited by 25

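One-Sentence Summary

GLEM introduces a variational EM framework that alternately trains a language model (LM) and a graph neural network (GNN) for node classification on large text-attributed graphs, achieving scalable, state-of-the-art results.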

ABSTRACT

This paper studies learning on text-attributed graphs (TAGs), where each node is associated with a text description. An ideal solution for such a problem would be integrating both the text and graph structure information with large language models and graph neural networks (GNNs). However, the problem becomes very challenging when graphs are large due to the high computational complexity brought by training large language models and GNNs together. In this paper, we propose an efficient and effective solution to learning on large text-attributed graphs by fusing graph structure and language learning with a variational Expectation-Maximization (EM) framework, called GLEM. Instead of simultaneously training large language models and GNNs on big graphs, GLEM proposes to alternatively update the two modules in the E-step and M-step. Such a procedure allows training the two modules separately while simultaneously allowing the two modules to interact and mutually enhance each other. Extensive experiments on multiple data sets demonstrate the efficiency and effectiveness of the proposed approach.

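Motivation and Objectives

Advance scalable learning on text-attributed graphs (TAGs) by integrating text semantics with graph structure.
Propose GLEM, which trains the LM and the GNN alternately to improve scalability without sacrificing performance.
Show that the LM and GNN modules trained with GLEM achieve strong results on large TAG benchmark datasets.
Demonstrate scalability with large language models (e.g., DeBERTa-large) and effectiveness in the structure-free inductive setting.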

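Proposed Method

Adopt a pseudo-likelihood variational framework that maximizes the log-likelihood of the observed labels through its evidence lower bound (ELBO).
Parameterize the variational distribution $q(y_U \mid s_U)$ with a text-based LM and the conditional distribution $p(y_n \mid s_V, A, y_{V \setminus \{n\}})$ with a GNN, so that the model captures both local text and global structure.
Use the mean-field factorization $q(y_U \mid s_U) = \prod_{n \in U} q(y_n \mid s_n)$ to model each node's label distribution from its text.
In the E-step, fix the GNN and train the LM to mimic the pseudo-labels predicted by the GNN while also fitting the labeled nodes (a wake-sleep-style objective).
In the M-step, fix the LM and train the GNN on the LM-generated embeddings and pseudo-labels (a pseudo-likelihood objective with LM outputs as input).
Annotate unlabeled nodes with LM-predicted pseudo-labels, which makes GNN training on large TAGs feasible.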
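
The alternating schedule described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, only a toy under stated assumptions: the "LM" is a softmax regression over hypothetical text features `X_text`, the "GNN" is a single mean-aggregation step over a row-normalized adjacency `A_norm` followed by a softmax classifier, the LM's class probabilities stand in for LM embeddings, hard (argmax) pseudo-labels are used, and `alpha` and `beta` are assumed weights that balance pseudo-labeled against gold-labeled nodes. All function and parameter names are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_softmax(X, targets, weights, epochs=200, lr=0.5):
    """Weighted softmax regression, trained by gradient descent from zero weights."""
    W = np.zeros((X.shape[1], targets.shape[1]))
    for _ in range(epochs):
        residual = weights[:, None] * (softmax(X @ W) - targets)
        W -= lr * (X.T @ residual) / weights.sum()
    return W


def glem_style_em(X_text, A_norm, y, labeled, k, rounds=3, alpha=0.5, beta=0.5):
    """Toy alternating training of a text classifier ("LM") and a graph classifier ("GNN").

    X_text  : (n, d) text features (stand-in for LM inputs)
    A_norm  : (n, n) row-normalized adjacency with self-loops
    y       : (n,) integer labels; entries at unlabeled positions are ignored
    labeled : (n,) boolean mask of nodes with gold labels
    Returns the LM and GNN class probabilities, each of shape (n, k).
    """
    gold = np.eye(k)[y]
    # Assumed per-node loss weights: pseudo-labeled vs. gold-labeled nodes.
    w_lm = np.where(labeled, 1.0 - alpha, alpha)
    w_gnn = np.where(labeled, 1.0 - beta, beta)

    # Warm start: fit the LM on gold labels only.
    W_lm = fit_softmax(X_text[labeled], gold[labeled], np.ones(int(labeled.sum())))

    for _ in range(rounds):
        # M-step: LM fixed. The GNN consumes LM outputs and LM pseudo-labels.
        q = softmax(X_text @ W_lm)
        lm_pseudo = np.eye(k)[q.argmax(axis=1)]
        gnn_targets = np.where(labeled[:, None], gold, lm_pseudo)
        H = A_norm @ q  # one round of mean aggregation over neighbors
        W_gnn = fit_softmax(H, gnn_targets, w_gnn)

        # E-step: GNN fixed. The LM mimics GNN pseudo-labels and fits gold labels.
        p = softmax(H @ W_gnn)
        gnn_pseudo = np.eye(k)[p.argmax(axis=1)]
        lm_targets = np.where(labeled[:, None], gold, gnn_pseudo)
        W_lm = fit_softmax(X_text, lm_targets, w_lm)

    q = softmax(X_text @ W_lm)
    return q, p
```

Each step here retrains its model from scratch and touches only one model at a time, which is the property the summary credits for scalability: the two models exchange pseudo-labels and features rather than gradients. The order of the two steps within a round and the warm start on gold labels are choices made for this sketch.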

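Experimental Results

Research Questions

RQ1: Can a variational EM framework enable scalable fusion of LMs and GNNs on large text-attributed graphs?
RQ2: Do alternating LM and GNN updates improve node classification by exploiting both local text and global graph structure?
RQ3: How does GLEM compare with fixed LM/GNN baselines and other fusion strategies on large TAG benchmarks?
RQ4: Does GLEM scale to large LMs (e.g., DeBERTa-large) and remain effective in the structure-free inductive setting?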

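Key Findings

GLEM outperforms LM-only baselines and a range of GNN baselines on the TAG benchmarks ogbn-arxiv, ogbn-products, and ogbn-papers100M.
With message passing over LM-aware embeddings, GLEM-GNN sets a new state of the art on several OGB benchmark datasets.
The EM-based training paradigm improves scalability, making it practical to use large LMs such as DeBERTa-large while remaining competitive with methods of comparable parameter count.
In the structure-free inductive setting, GLEM-LM and GLEM-GNN perform robustly by exploiting text attributes and pseudo-labels.
Comparisons across training paradigms show that GLEM beats both static-LM and joint-training approaches in accuracy and efficiency.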

This review was generated by AI and reviewed by human editors.