QUICK REVIEW

[Paper Review] Graph Convolutional Networks for Text Classification

Liang Yao, Chengsheng Mao|arXiv (Cornell University)|Sep 15, 2018

Topic Modeling|References: 28|Cited by: 62

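One-Sentence Summary

This paper proposes Text GCN, which applies a two-layer graph convolutional network to a heterogeneous corpus-level graph of word and document nodes, classifies text without any external embeddings, and achieves strong results on several benchmark datasets.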

ABSTRACT

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus. Our Text GCN is initialized with one-hot representation for word and document, it then jointly learns the embeddings for both words and documents, as supervised by the known class labels for documents. Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification. On the other hand, Text GCN also learns predictive word and document embeddings. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods become more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to less training data in text classification.

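Motivation and Objectives

Frame text classification as a graph problem that exploits global word co-occurrence to improve performance.
Propose a corpus-level heterogeneous graph containing word nodes and document nodes so the model can be trained end to end.
Show that a two-layer Text GCN without external embeddings can outperform state-of-the-art baselines.
Show that Text GCN produces interpretable word and document embeddings and remains robust when labeled data is limited.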

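Proposed Method

Build one large heterogeneous graph for the whole corpus, with both word nodes and document nodes.
Set the input features to one-hot vectors for words and documents (the feature matrix is the identity matrix).
Weight document-word edges by TF-IDF and word-word edges by positive pointwise mutual information (PMI) computed from sliding-window co-occurrence.
Apply a two-layer graph convolutional network to propagate information and produce node embeddings.
Feed the second-layer embeddings to a softmax classifier to classify documents.
Train end to end with cross-entropy loss on the labeled documents, using L2 (Tikhonov) regularization and the Adam optimizer.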
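
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (build_text_graph, normalize_adj, gcn_forward), the whitespace tokenizer, the plain tf * log(N / df) weighting, the toy window size of 3, and the random weights are all choices made here for illustration. Training is omitted; in practice the cross-entropy loss over labeled document rows would be minimized with Adam in a deep learning framework.

```python
import math
from collections import Counter
from itertools import combinations

import numpy as np


def build_text_graph(docs, window_size=3):
    """Dense adjacency over document nodes (first) followed by word nodes."""
    tokenized = [doc.split() for doc in docs]  # assumption: whitespace tokens
    vocab = sorted({w for toks in tokenized for w in toks})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_docs = len(docs)
    adj = np.eye(n_docs + len(vocab))  # self-loops: A_ii = 1

    # Document-word edges: TF-IDF (assumed variant: tf * log(N / df)).
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    for d, toks in enumerate(tokenized):
        for w, tf in Counter(toks).items():
            j = n_docs + word_id[w]
            adj[d, j] = adj[j, d] = tf * math.log(n_docs / doc_freq[w])

    # Collect sliding windows; a short document forms a single window.
    windows = []
    for toks in tokenized:
        if len(toks) <= window_size:
            windows.append(toks)
        else:
            windows.extend(toks[i:i + window_size]
                           for i in range(len(toks) - window_size + 1))

    # Count windows containing each word and each unordered word pair.
    single, pair = Counter(), Counter()
    for win in windows:
        uniq = sorted(set(win))
        single.update(uniq)
        pair.update(combinations(uniq, 2))

    # Word-word edges: keep only positive PMI values.
    total = len(windows)
    for (a, b), n_ab in pair.items():
        pmi = math.log(n_ab * total / (single[a] * single[b]))
        if pmi > 0:
            i, j = n_docs + word_id[a], n_docs + word_id[b]
            adj[i, j] = adj[j, i] = pmi
    return adj, vocab


def normalize_adj(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))  # degrees >= 1 via self-loops
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(a_hat, w0, w1):
    """softmax(A_hat ReLU(A_hat X W0) W1) with X = I (one-hot features)."""
    hidden = np.maximum(a_hat @ w0, 0.0)  # X = I, so A_hat X W0 == A_hat W0
    logits = a_hat @ hidden @ w1
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "graph networks classify text",
    "convolution on graph nodes",
    "text classification with networks",
]
adj, vocab = build_text_graph(docs, window_size=3)
a_hat = normalize_adj(adj)

rng = np.random.default_rng(0)
n_nodes, hidden_dim, n_classes = adj.shape[0], 8, 2
w0 = rng.normal(scale=0.1, size=(n_nodes, hidden_dim))
w1 = rng.normal(scale=0.1, size=(hidden_dim, n_classes))
probs = gcn_forward(a_hat, w0, w1)  # one row of class probabilities per node
```

Because the input features are the identity matrix, the first weight matrix w0 acts as a learned embedding table with one row per node. The first len(docs) rows of probs correspond to documents, and only those rows would enter the training loss.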

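Experimental Results

Research Questions

RQ1: Without external word embeddings, can Text GCN reach high accuracy on standard text classification benchmarks?
RQ2: Does the model learn useful word and document embeddings during training?
RQ3: How does Text GCN perform relative to the baselines when labeled data is limited?
RQ4: How do graph construction choices (window size, PMI) affect performance?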
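
To make RQ4 concrete, the edge weights can be written out as a short worked definition. This restates the graph construction described under the proposed method; the symbols are introduced here for illustration and are not defined elsewhere in this review: #W is the total number of sliding windows in the corpus, #W(i) is the number of windows containing word i, and #W(i, j) is the number of windows containing both words i and j.

```latex
A_{ij} =
\begin{cases}
\mathrm{PMI}(i,j) & i, j \text{ are words and } \mathrm{PMI}(i,j) > 0 \\
\mathrm{TF\text{-}IDF}_{ij} & i \text{ is a document and } j \text{ is a word} \\
1 & i = j \\
0 & \text{otherwise}
\end{cases}
\qquad
\mathrm{PMI}(i,j) = \log \frac{p(i,j)}{p(i)\,p(j)}, \quad
p(i,j) = \frac{\#W(i,j)}{\#W}, \quad
p(i) = \frac{\#W(i)}{\#W}
```

The window size enters only through the window counts: a wider window lets more distant word pairs co-occur, so more pairs can receive a positive PMI and the graph gains word-word edges, while pairs with non-positive PMI get no edge at all.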

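Key Findings

Text GCN outperforms a range of baselines on the 20NG, R8, R52, and Ohsumed datasets (statistically significant, p < 0.05).
Text GCN achieves competitive results with relatively little training data and remains robust when labeled data is scarce.
The second-layer word embeddings learned by the model are interpretable and correlate with document classes.
Two GCN layers are sufficient; adding more layers did not improve results.
Without external embeddings, Text GCN can outperform several strong supervised methods, especially on datasets of longer texts; it is less favorable on MR (short texts), where the graph has fewer edges and the model does not capture word order.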

This review was generated by AI and reviewed by a human editor.