QUICK REVIEW

[Paper Review] Neural Module Networks for Reasoning over Text

Nitish Gupta, Kevin Lin|arXiv (Cornell University)|Dec 10, 2019

Multimodal Machine Learning Applications | References: 25 | Cited by: 51

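ONE-SENTENCE SUMMARY

The paper extends neural module networks to compositional question answering over paragraphs of text by introducing differentiable textual and symbolic modules together with auxiliary supervision, and reports strong results on a subset of the DROP dataset.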

ABSTRACT

Answering compositional questions that require multiple steps of reasoning against text is challenging, especially when they involve discrete, symbolic operations. Neural module networks (NMNs) learn to parse such questions as executable programs composed of learnable modules, performing well on synthetic visual QA domains. However, we find that it is challenging to learn these models for non-synthetic questions on open-domain text, where a model needs to deal with the diversity of natural language and perform a broader range of reasoning. We extend NMNs by: (a) introducing modules that reason over a paragraph of text, performing symbolic reasoning (such as arithmetic, sorting, counting) over numbers and dates in a probabilistic and differentiable manner; and (b) proposing an unsupervised auxiliary loss to help extract arguments associated with the events in text. Additionally, we show that a limited amount of heuristically-obtained question program and intermediate module output supervision provides sufficient inductive bias for accurate learning. Our proposed model significantly outperforms state-of-the-art models on a subset of the DROP dataset that poses a variety of reasoning challenges that are covered by our modules.

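MOTIVATION AND OBJECTIVES

Motivate compositional question answering over open-domain text and highlight how hard it is to learn from end-to-end QA supervision alone.
Introduce differentiable modules for text, numbers, and dates that enable symbolic reasoning over a paragraph.
Propose an unsupervised auxiliary loss that guides intermediate reasoning and information extraction.
Show that a limited supervision signal (question programs and module outputs) helps learning.
Demonstrate improved performance over state-of-the-art baselines on a DROP subset, with interpretable intermediate outputs.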

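PROPOSED METHOD

Parse each question into an executable program composed of neural modules.
Ground the modules in contextualized token embeddings of the question and paragraph (from a GRU or BERT encoder).
Define a typed set of differentiable modules (find, filter, relocate, find-num, find-date, count, compare-*, time-diff, find-max-num, span) operating over the types Q, P, N, D, C, TD, S (question and paragraph attentions, number and date distributions, count, time delta, and span).
Train end to end by maximizing the marginal likelihood of the answer over a set of programs obtained with beam search.
Introduce an unsupervised auxiliary loss that encourages find-num, find-date, and relocate to extract their arguments locally, within a window around the attended tokens.
Provide limited heuristic supervision for question programs and intermediate module outputs on a subset of the data to bootstrap learning.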
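
The two ideas above that are easiest to make concrete are comparing numbers "in a probabilistic and differentiable manner" and training on a marginal likelihood over candidate programs. The following is a minimal sketch in plain Python, not the authors' implementation. The function names `prob_less_than` and `marginal_log_likelihood` are hypothetical, the two number distributions are assumed to be independent, and the inputs are made-up toy values.

```python
import math


def prob_less_than(dist_a, dist_b, values):
    """Illustrative compare-style module (hypothetical name).

    dist_a and dist_b are probability distributions over the same list of
    unique paragraph numbers `values`. Assuming the two are independent,
    P(a < b) is the total probability mass on pairs (va, vb) with va < vb.
    Every operation is a sum or product, so the result is differentiable
    with respect to the input probabilities.
    """
    return sum(
        pa * pb
        for va, pa in zip(values, dist_a)
        for vb, pb in zip(values, dist_b)
        if va < vb
    )


def marginal_log_likelihood(program_log_probs, answer_log_probs):
    """Illustrative objective (hypothetical name).

    Computes log sum_z p(z | q) * p(y | z) over a beam of programs z, where
    program_log_probs[i] is log p(z_i | q) from the parser and
    answer_log_probs[i] is the log-probability that executing z_i yields
    the gold answer y. Uses log-sum-exp for numerical stability.
    """
    joint = [lp + la for lp, la in zip(program_log_probs, answer_log_probs)]
    peak = max(joint)
    return peak + math.log(sum(math.exp(j - peak) for j in joint))


if __name__ == "__main__":
    # Toy example: three unique numbers found in a paragraph.
    values = [10, 20, 30]
    first = [0.8, 0.2, 0.0]   # mostly attends to 10
    second = [0.1, 0.1, 0.8]  # mostly attends to 30
    print("P(first < second):", prob_less_than(first, second, values))

    # Toy beam of two programs with equal parser probability.
    beam = [math.log(0.5), math.log(0.5)]
    answer = [math.log(0.9), math.log(0.1)]
    print("marginal log-likelihood:", marginal_log_likelihood(beam, answer))
```

In the toy example the comparison returns a probability of roughly 0.88 rather than a hard true/false decision, which is what allows gradients to flow back into the modules that produced the two distributions.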

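EXPERIMENTAL RESULTS

Research Questions

RQ1: Can NMNs be adapted to perform multi-step, symbolic reasoning over natural language text?
RQ2: Can differentiable, probabilistic modules reason robustly over numbers, dates, and spans in a paragraph?
RQ3: Does auxiliary supervision improve joint learning of the question parser and the executable modules for open-domain text QA?
RQ4: How does the NMN-based approach compare with existing state-of-the-art models on DROP-based tasks?
RQ5: What are the limitations of, and future directions for, extending NMNs to a broader range of DROP questions?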

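Key Findings

With GRU representations, the model reaches 73.1 F1 and 69.6 EM on the test set of the pruned DROP subset, outperforming NAQANet (62.1 F1, 57.9 EM).
With BERT representations, the model reaches 77.4 F1 and 74.0 EM, surpassing MTMSN (76.5 F1).
The unsupervised auxiliary loss substantially improves performance (F1 of the GRU-based variant rises from 57.3 to 73.1).
A small amount of heuristically obtained supervision for programs and intermediate module outputs yields additional gains (5–10% supervision).
The approach emphasizes interpretability through intermediate module outputs, which enables targeted error analysis and potential transfer learning.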
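
For readers unfamiliar with the two metrics reported above, the following is a simplified sketch of how exact match (EM) and token-level F1 are typically computed for a single predicted answer string. It is an illustration only: the function names are hypothetical, tokenization is plain whitespace splitting, and the additional answer normalization and multi-span handling performed by the official DROP evaluation are not reproduced here.

```python
from collections import Counter


def exact_match(prediction, gold):
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall (bag-of-words overlap)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A prediction with one extra token: no exact match, partial F1 credit.
    print(exact_match("the Denver Broncos", "Denver Broncos"))
    print(token_f1("the Denver Broncos", "Denver Broncos"))
```

The gap between F1 and EM in the results above reflects this difference: F1 gives partial credit for overlapping tokens, while EM requires the whole answer to match.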


This review was generated by AI and reviewed by human editors.