QUICK REVIEW

[Paper Review] TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency

Adji Bousso Dieng, Chong Wang|arXiv (Cornell University)|Nov 5, 2016

Topic Modeling|Cited by 129

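One-Line Summary

TopicRNN combines an RNN with latent topics to model local word order and global semantic context end-to-end, improving perplexity over contextual RNN baselines and yielding unsupervised document features for sentiment analysis.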

ABSTRACT

In this paper, we propose TopicRNN, a recurrent neural network (RNN)-based language model designed to directly capture the global semantic meaning relating words in a document via latent topics. Because of their sequential nature, RNNs are good at capturing the local structure of a word sequence - both semantic and syntactic - but might face difficulty remembering long-range dependencies. Intuitively, these long-range dependencies are of semantic nature. In contrast, latent topic models are able to capture the global underlying semantic structure of a document but do not account for word ordering. The proposed TopicRNN model integrates the merits of RNNs and latent topic models: it captures local (syntactic) dependencies using an RNN and global (semantic) dependencies using latent topics. Unlike previous work on contextual RNN language modeling, our model is learned end-to-end. Empirical results on word prediction show that TopicRNN outperforms existing contextual RNN baselines. In addition, TopicRNN can be used as an unsupervised feature extractor for documents. We do this for sentiment analysis on the IMDB movie review dataset and report an error rate of $6.28\%$. This is comparable to the state-of-the-art $5.91\%$ resulting from a semi-supervised approach. Finally, TopicRNN also yields sensible topics, making it a useful alternative to document models such as latent Dirichlet allocation.

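Motivation and Objectives

Motivate combining the local syntactic modeling of RNNs with the global semantic structure captured by topic models.
Propose TopicRNN, an end-to-end framework that jointly learns the RNN parameters and the latent topic representation.
Handle stop words explicitly to separate global semantic influence from local syntax.
Improve perplexity on PTB without pre-trained topics and show competitive results on IMDB sentiment analysis.
Show that TopicRNN produces coherent topics and can serve as an unsupervised feature extractor for downstream tasks.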

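Proposed Method

Define TopicRNN as a generative model in which a latent topic vector θ is drawn from a Gaussian prior.
At each step t, compute the RNN hidden state h_t from the previous word x_t and h_{t-1}.
Introduce a stop-word indicator l_t drawn from a Bernoulli distribution whose rate depends on h_t.
Model p(y_t|h_t, θ, l_t) with a local term v_i^T h_t plus, when l_t=0, a global topic bias b_i^T θ; when l_t=1, θ has no effect on the output.
Approximate the posterior over θ with a variational inference network q(θ|X_c, W_c), where X_c is the bag-of-words representation of the document with stop words excluded.
Optimize the ELBO with the reparameterization trick, training the inference network and the model jointly so that learning is end-to-end.
At prediction time, use a point estimate of θ (the mean of q) and marginalize over l_t; for efficiency, update θ over a sliding window.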
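
The steps listed above can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the prior is taken to be a standard normal N(0, I), q is taken to be a diagonal Gaussian with mean `mu` and log standard deviation `log_sigma`, and the Bernoulli rate for l_t is taken to be a sigmoid of `gamma @ h_t`. The names `V`, `B`, `gamma`, and all function names are hypothetical, and the RNN and inference network themselves are replaced by random placeholder values.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sample_theta(mu, log_sigma, rng):
    """Reparameterized draw: theta = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

def kl_to_standard_normal(mu, log_sigma):
    """KL(N(mu, diag(sigma^2)) || N(0, I)); assumes a standard normal prior."""
    return 0.5 * np.sum(np.exp(2 * log_sigma) + mu**2 - 1.0 - 2 * log_sigma)

def word_distribution(h_t, theta, V, B, gamma):
    """p(y_t | h_t, theta) with the stop-word indicator l_t marginalized out.

    V: (vocab, hidden) rows v_i, B: (vocab, topics) rows b_i,
    gamma: (hidden,) weights of the assumed sigmoid stop-word gate.
    """
    p_stop = 1.0 / (1.0 + np.exp(-(gamma @ h_t)))  # p(l_t = 1 | h_t)
    local = V @ h_t                                 # v_i^T h_t for every word i
    p_if_stop = softmax(local)                      # l_t = 1: theta is ignored
    p_if_content = softmax(local + B @ theta)       # l_t = 0: topic bias added
    return p_stop * p_if_stop + (1.0 - p_stop) * p_if_content

# Toy dimensions and random placeholders standing in for learned quantities.
rng = np.random.default_rng(0)
vocab_size, hidden_size, num_topics = 8, 5, 3
V = rng.standard_normal((vocab_size, hidden_size))
B = rng.standard_normal((vocab_size, num_topics))
gamma = rng.standard_normal(hidden_size)
h_t = rng.standard_normal(hidden_size)
mu = rng.standard_normal(num_topics)
log_sigma = np.full(num_topics, -1.0)

theta_train = sample_theta(mu, log_sigma, rng)     # stochastic draw, as in training
p_next = word_distribution(h_t, mu, V, B, gamma)   # point estimate (mean of q), as at prediction time
kl_term = kl_to_standard_normal(mu, log_sigma)     # regularizer that appears in the ELBO
```

With θ set to the zero vector the topic bias vanishes and the marginal reduces to the plain RNN softmax, which is a quick sanity check on the gating logic.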

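Experimental Results

Research Questions

RQ1: Can latent topics provide global semantic context to an RNN language model without pre-trained topic features or externally supplied topics?
RQ2: Does TopicRNN improve word-prediction perplexity on PTB relative to contextual RNN baselines?
RQ3: Does TopicRNN produce meaningful topics and work as an unsupervised feature extractor for IMDB sentiment analysis?
RQ4: How does separating global semantic influence (through θ) from local syntax affect model performance and training dynamics?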

Key Findings

Model	Validation perplexity	Test perplexity
rnn (no features)	239.2	225.0
rnn (LDA features)	197.3	187.4
TopicRNN	184.5	172.2
TopicLSTM	188.0	175.0
TopicGRU	178.3	166.7
rnn (no features) (100 Neurons)	150.1	142.1
rnn (LDA features) (100 Neurons)	132.3	126.4
TopicRNN (100 Neurons)	128.5	122.3
TopicLSTM (100 Neurons)	126.0	118.1
TopicGRU (100 Neurons)	118.3	112.4
rnn (no features) (300 Neurons)	-	124.7
rnn (LDA features) (300 Neurons)	-	113.7
TopicRNN (300 Neurons)	118.3	112.2
TopicLSTM (300 Neurons)	104.1	99.5
TopicGRU (300 Neurons)	99.6	97.3
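
As a reading aid for the table above: perplexity is the exponential of the average negative log-probability per token, so lower is better. The sketch below shows that definition and computes how much each 300-neuron model reduces test perplexity relative to the LDA-feature baseline. The helper names `perplexity` and `relative_reduction` are hypothetical; the numbers are copied from the table.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-(1/N) * sum of natural-log token probabilities)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def relative_reduction(baseline_ppl, model_ppl):
    """Fraction by which model_ppl is lower than baseline_ppl (negative if higher)."""
    return (baseline_ppl - model_ppl) / baseline_ppl

# Test perplexities copied from the 300-neuron rows of the table.
test_ppl_300 = {
    "rnn (no features)": 124.7,
    "rnn (LDA features)": 113.7,
    "TopicRNN": 112.2,
    "TopicLSTM": 99.5,
    "TopicGRU": 97.3,
}

baseline = test_ppl_300["rnn (LDA features)"]
for name, ppl in test_ppl_300.items():
    change = 100 * relative_reduction(baseline, ppl)
    print(f"{name:20s} {ppl:6.1f}  reduction vs LDA-feature baseline: {change:+.1f}%")
```

A model that assigns probability 0.25 to every token has perplexity 4 under this definition, which gives an intuition for the scale: a test perplexity near 100 is roughly as uncertain as a uniform choice among 100 words at each step.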

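On PTB, TopicRNN achieves lower perplexity than the contextual RNN baselines regardless of network size.
A model with 100 neurons and 50 topics reaches competitive perplexity without pre-trained topic features.
Features derived from TopicRNN enable competitive sentiment analysis on IMDB 100K, with an error rate of 6.28%, close to the state-of-the-art 5.91%.
TopicRNN produces sensible topics and coherent text samples.
TopicRNN's topics and features provide unsupervised representations useful for downstream tasks such as clustering and sentiment analysis.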

This review was generated by AI and checked by a human editor.