QUICK REVIEW

[Paper Review] Neural Module Networks for Reasoning over Text

Nitish Gupta, Kevin Lin|arXiv (Cornell University)|Dec 10, 2019

Multimodal Machine Learning Applications | References: 25 | Citations: 51

One-line summary

This paper extends neural module networks to answer compositional questions over paragraphs of text by introducing differentiable text-and-symbol modules and auxiliary supervision, achieving strong results on a subset of the DROP dataset.

ABSTRACT

Answering compositional questions that require multiple steps of reasoning against text is challenging, especially when they involve discrete, symbolic operations. Neural module networks (NMNs) learn to parse such questions as executable programs composed of learnable modules, performing well on synthetic visual QA domains. However, we find that it is challenging to learn these models for non-synthetic questions on open-domain text, where a model needs to deal with the diversity of natural language and perform a broader range of reasoning. We extend NMNs by: (a) introducing modules that reason over a paragraph of text, performing symbolic reasoning (such as arithmetic, sorting, counting) over numbers and dates in a probabilistic and differentiable manner; and (b) proposing an unsupervised auxiliary loss to help extract arguments associated with the events in text. Additionally, we show that a limited amount of heuristically-obtained question program and intermediate module output supervision provides sufficient inductive bias for accurate learning. Our proposed model significantly outperforms state-of-the-art models on a subset of the DROP dataset that poses a variety of reasoning challenges that are covered by our modules.
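
To make "probabilistic and differentiable" symbolic reasoning concrete, here is a minimal sketch of comparing two numbers that are each represented as a probability distribution over the numbers found in a paragraph. The function names and the toy distributions are assumptions made for illustration, not the authors' code, and the paper's exact formulation may differ. The point is that the result is a smooth function of the input probabilities, so it would be differentiable inside an autodiff framework.

```python
def prob_less_than(values, dist_a, dist_b):
    """P(A < B) for independent A ~ dist_a and B ~ dist_b.

    `values` are the candidate numbers; both distributions are defined
    over the same candidates. The result is a sum of products of
    probabilities, so it is smooth in dist_a and dist_b.
    """
    total = 0.0
    for i, a in enumerate(values):
        for j, b in enumerate(values):
            if a < b:
                total += dist_a[i] * dist_b[j]
    return total


def expected_value(values, dist):
    """Expected number under a distribution over candidate numbers."""
    return sum(v * p for v, p in zip(values, dist))


# Hypothetical paragraph numbers and two soft "find-num"-style outputs.
values = [3.0, 7.0, 10.0]
dist_a = [0.8, 0.1, 0.1]   # mostly believes the first number is 3
dist_b = [0.1, 0.2, 0.7]   # mostly believes the second number is 10

p_lt = prob_less_than(values, dist_a, dist_b)
print(round(p_lt, 4))
```

Because the comparison returns a probability rather than a hard true/false decision, uncertainty about which number was extracted carries through to the final answer.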

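Motivation and objectives

Motivates compositional QA over open-domain text and highlights the difficulty of learning from end-to-end QA supervision alone.
Introduces differentiable modules over text, numbers, and dates that enable symbolic reasoning over a paragraph.
Proposes an unsupervised auxiliary loss that encourages local reasoning and information extraction.
Shows that a limited amount of supervision for question programs and module outputs aids learning.
Outperforms state-of-the-art baselines on a DROP subset while providing interpretable intermediate outputs.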

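Proposed method

Parses the question into an executable program composed of neural modules.
Grounds the modules in contextual token embeddings from question and paragraph representations (GRU or BERT).
Defines a typed set of differentiable modules (find, filter, relocate, find-num, find-date, count, compare-*, time-diff, find-max-num, span) operating over the data types Q, P, N, D, C, TD, and S (question and paragraph attentions; number, date, count, and time-delta distributions; answer spans).
Trains end-to-end by maximizing a differentiable marginal likelihood over the set of programs found by beam search.
Introduces an unsupervised auxiliary loss that encourages local argument extraction in find-num, find-date, and relocate.
Bootstraps learning with limited heuristic supervision of question programs and intermediate module outputs on a subset of the data.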
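
The marginal-likelihood objective over beam-searched programs listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy beam of three programs are hypothetical, and the real model would produce these log-probabilities from a parser and from executing the modules. The quantity computed is log Σ_z p(z | q) · p(y | z), where z ranges over the programs in the beam.

```python
import math


def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def marginal_log_likelihood(program_log_probs, answer_log_probs):
    """log sum_z p(z | q) * p(y | z) over the programs z in the beam.

    program_log_probs[k]: log-probability the parser assigns to program k.
    answer_log_probs[k]:  log-probability of the gold answer when
                          program k is executed.
    """
    joint = [lp + la for lp, la in zip(program_log_probs, answer_log_probs)]
    return log_sum_exp(joint)


# Hypothetical beam of three candidate programs for one question.
program_probs = [0.6, 0.3, 0.1]
answer_probs = [0.9, 0.2, 0.05]

mll = marginal_log_likelihood(
    [math.log(p) for p in program_probs],
    [math.log(p) for p in answer_probs],
)
loss = -mll  # training would minimize the negative marginal log-likelihood
print(round(math.exp(mll), 4))
```

Summing over the beam lets a program that the parser scores lower still contribute gradient if executing it explains the answer well, which is why the objective trains the parser and the modules jointly.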

Experimental results

Research questions

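RQ1: Can NMNs be applied to multi-step symbolic reasoning over natural-language text?
RQ2: Do differentiable, probabilistic modules enable robust reasoning over numbers, dates, and spans in a paragraph?
RQ3: Does auxiliary supervision improve joint learning of the question parser and the executable modules in open-domain text QA?
RQ4: How does the NMN-based approach compare with existing state-of-the-art models on the DROP-derived task?
RQ5: What are the limitations and future directions for extending NMNs to a broader range of DROP questions?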

Key findings

The GRU-based model achieves 73.1 F1 and 69.6 EM on the pruned DROP test set, outperforming NAQANet (62.1 F1, 57.9 EM).
With BERT representations, the model reaches 77.4 F1 and 74.0 EM, exceeding MTMSN's 76.5 F1.
The unsupervised auxiliary loss improves performance substantially (from 57.3 to 73.1 F1 for the GRU-based variant).
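Compact, heuristic supervision of programs and intermediate outputs yields further gains (5–10% supervision).
The approach highlights interpretability via intermediate module outputs and the potential for targeted error analysis and transfer learning.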
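
For readers unfamiliar with the EM and F1 figures quoted above, the following is a simplified sketch of how exact match and token-level F1 are commonly computed for QA answers. It is an assumption-laden illustration: the function names are hypothetical, and the official DROP evaluation applies additional normalization and handles numbers and multi-span answers differently.

```python
from collections import Counter


def exact_match(prediction, gold):
    """1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall (whitespace tokens)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# A partially correct prediction gets no EM credit but some F1 credit.
em = exact_match("Tom Brady", "Brady")
f1 = token_f1("Tom Brady", "Brady")
print(em, round(f1, 4))
```

This is why F1 scores in the findings are consistently higher than the corresponding EM scores: F1 rewards partial overlap, while EM requires the whole answer to match.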

This review was generated by AI and checked by a human editor.