QUICK REVIEW

[Paper Review] Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

William J. Murdoch, Peter J. Liu|arXiv (Cornell University)|Jan 16, 2018

Topic Modeling|References: 14|Citations: 118

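One-Line Summary

Contextual Decomposition (CD) interprets individual LSTM predictions by decomposing them into phrase-specific contributions and context-dependent contributions, capturing interactions between words that go beyond word-level importance. The paper validates CD on sentiment analysis tasks, where it exposes negation and compositional effects.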

ABSTRACT

The driving force behind the recent success of LSTMs has been their ability to learn complex and non-linear relationships. Consequently, our inability to describe these relationships has led to LSTMs being characterized as black boxes. To this end, we introduce contextual decomposition (CD), an interpretation algorithm for analysing individual predictions made by standard LSTMs, without any changes to the underlying model. By decomposing the output of a LSTM, CD captures the contributions of combinations of words or variables to the final prediction of an LSTM. On the task of sentiment analysis with the Yelp and SST data sets, we show that CD is able to reliably identify words and phrases of contrasting sentiment, and how they are combined to yield the LSTM's final prediction. Using the phrase-level labels in SST, we also demonstrate that CD is able to successfully extract positive and negative negations from an LSTM, something which has not previously been done.

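Motivation and Objectives

Motivate the need to interpret LSTMs beyond unigram importance in NLP.
Propose Contextual Decomposition (CD), which decomposes an LSTM's output into phrase-specific contributions and context-driven contributions.
Show that CD reveals interactions and negation effects in sentiment analysis tasks.
Compare CD with existing interpretation baselines and show that it better captures compositional sentiment.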

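Proposed Method

Introduce Contextual Decomposition (CD), which decomposes h_t and c_t into a phrase-only contribution (beta) and a context-involving contribution (gamma) (Equations 8–9).
Linearize the gates (i_t, f_t, g_t) and activations so that cross terms can be identified as interactions between the phrase and its context (Equations 11–18).
Quantify the phrase's contribution to the final prediction by writing the softmax input as W beta_T + W gamma_T, where W beta_T is the part attributable to the phrase (Equation 10).
Provide a general recursion for updating beta_t and gamma_t across time steps inside and outside the phrase (Appendix 6.2).
Describe the linearizations L_sigma and L_tanh of the activation functions, obtained by averaging over orderings of the inputs (Section 3.2.2, Equations 25–28).
Compare CD with baselines (cell decomposition, integrated gradients, leave-one-out, gradient × input) on the SST and Yelp datasets.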
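
The ordering-averaged linearization of the activation functions mentioned in the list above can be illustrated with a small sketch. This is not the authors' code: the function name `linearize` and its arguments are invented here, and enumerating every permutation is an assumption that is only practical for a handful of summed terms. The idea is to credit each term with the change it causes in the activation when added to a running sum, averaged over all orders in which the terms could be added.

```python
import math
from itertools import permutations


def linearize(activation, terms):
    """Split activation(sum(terms)) into one contribution per term.

    Hypothetical helper for illustration. For every ordering of the
    terms, each term is credited with the change in `activation` that
    occurs when it is added to the running sum; the credits are then
    averaged over all orderings.
    """
    n = len(terms)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        running = 0.0
        for idx in order:
            before = activation(running)
            running += terms[idx]
            contributions[idx] += activation(running) - before
    return [c / len(orderings) for c in contributions]


# Illustrative inputs only; these are not values from the paper.
terms = [0.5, -1.2, 0.3]
parts = linearize(math.tanh, terms)

# Because tanh(0) == 0, the per-term contributions add up to tanh of
# the full sum. For an activation with a nonzero value at 0 (such as
# the sigmoid), they add up to activation(sum) - activation(0) instead.
total = sum(parts)
```

In this sketch the contributions are additive by construction, which is the property that lets the nonlinear activation be treated as a sum of per-term pieces.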

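Experimental Results

Research Questions

RQ1: Can CD produce reliable phrase-level and interaction-level contributions for LSTM predictions?
RQ2: Do CD-derived scores reveal compositional sentiment, including sub-phrase interactions and negation?
RQ3: How does CD compare with existing interpretation methods on word-level and phrase-level explanations?
RQ4: Can CD extract meaningful embeddings of phrases and interactions that are consistent with semantic similarity?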

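Key Findings

CD produces word-level scores that correlate strongly with logistic regression coefficients, outperforming several baselines on the SST and Yelp datasets.
CD identifies sub-phrases of opposing sentiment within a phrase, such as a negative sub-phrase inside a positive phrase, in cases where prior methods fail.
CD captures compositional sentiment across phrases and distinguishes the sentiment of large portions of a review better than other methods.
CD separates positive and negative negations on the SST data, revealing distinct negation interactions.
CD provides dense embeddings of phrases and interactions, given by beta_T, whose nearest neighbors agree with semantic intuitions about negation and modification.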
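
The word-level check in the first finding compares per-word interpretation scores against logistic regression coefficients. A minimal sketch of such a comparison follows, assuming a Pearson correlation (this review does not say which coefficient was used). The helper `pearson` and the toy numbers are made up for illustration and are not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers for illustration only: one score per word from an
# interpretation method, and the matching logistic regression
# coefficient for the same word.
word_scores = [1.4, -0.9, 0.2, -2.1]
logreg_coefs = [1.1, -0.7, 0.1, -1.8]

r = pearson(word_scores, logreg_coefs)
```

A value of `r` close to 1 would indicate that the interpretation method ranks and scales words much as the linear model does.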

This review was generated by AI and checked by a human editor.