QUICK REVIEW

[Paper Review] Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

Yunhu Ye, Binyuan Hui|arXiv (Cornell University)|Jan 31, 2023

Topic Modeling | Citations: 11

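One-line summary

This paper introduces Dater, a framework that uses large language models to decompose huge tabular evidence and complex NL questions into sub-evidence and sub-questions, improving table-based reasoning through in-context learning.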

ABSTRACT

Table-based reasoning has shown remarkable progress in combining deep models with discrete reasoning, which requires reasoning over both free-form natural language (NL) questions and structured tabular data. However, previous table-based reasoning solutions usually suffer from significant performance degradation on huge evidence (tables). In addition, most existing methods struggle to reason over complex questions since the required information is scattered in different places. To alleviate the above challenges, we exploit large language models (LLMs) as decomposers for effective table-based reasoning, which (i) decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information for table reasoning; and (ii) decompose complex questions into simpler sub-questions for text reasoning. Specifically, we first use the LLMs to break down the evidence (tables) involved in the current question, retaining the relevant evidence and excluding the remaining irrelevant evidence from the huge table. In addition, we propose a "parsing-execution-filling" strategy to alleviate the hallucination dilemma of the chain of thought by decoupling logic and numerical computation in each step. Extensive experiments show that our method can effectively leverage decomposed evidence and questions and outperforms the strong baselines on TabFact, WikiTableQuestion, and FetaQA datasets. Notably, our model outperforms human performance for the first time on the TabFact dataset.

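Motivation and objectives

Addresses table-based reasoning over huge, noisy tables and complex questions, where earlier methods degrade.
Proposes evidence decomposition, which extracts the relevant sub-table to reduce interference from irrelevant data.
Introduces a reliable question decomposition method (parsing-execution-filling) that translates NL questions into executable, SQL-oriented steps.
Shows that LLMs with decomposition-based prompting match or exceed strong baselines on standard benchmarks.
Highlights the interpretability benefit of traceable sub-evidence and sub-questions.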

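Proposed method

Uses a powerful LLM to decompose the evidence: given a question, it predicts the row and column indices of a sub-table within the full table.
Adopts a parsing-execution-filling strategy: abstract logical sub-questions are converted into executable SQL, the SQL is run on the sub-evidence, and the concrete values are filled back in.
Generates intermediate SQL-based steps to construct reliable sub-questions that are grounded in the evidence.
Applies in-context learning, using prompts and a small set of examples, to guide both evidence decomposition and question decomposition.
Combines the decomposed sub-evidence and sub-questions and obtains the final answer through a separate in-context reasoning stage.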
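
The evidence decomposition and parsing-execution-filling steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy table, the function names (`select_sub_table`, `load_table`, `execute_and_fill`), the `{slot}` placeholder syntax, and the hard-coded "LLM outputs" are all assumptions made for illustration. In Dater the row/column indices and the SQL would come from few-shot prompting of an LLM; here SQLite from the Python standard library stands in as the executor.

```python
import re
import sqlite3

# Hypothetical toy table; a real run would use a benchmark table.
COLUMNS = ["team", "city", "wins", "losses"]
ROWS = [
    ("Lions", "Oslo", 12, 3),
    ("Bears", "Bergen", 9, 6),
    ("Hawks", "Oslo", 7, 8),
    ("Wolves", "Tromso", 4, 11),
]


def select_sub_table(columns, rows, row_idx, col_idx):
    """Evidence decomposition: keep only the predicted rows and columns."""
    sub_columns = [columns[c] for c in col_idx]
    sub_rows = [tuple(rows[r][c] for c in col_idx) for r in row_idx]
    return sub_columns, sub_rows


def load_table(conn, name, columns, rows):
    """Load the sub-table into SQLite so the parsed SQL can run against it."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{name}" ({col_list})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', rows)


def execute_and_fill(conn, template, sql_by_slot):
    """Execute each slot's SQL and fill the value back into the sub-question."""
    def fill(match):
        value = conn.execute(sql_by_slot[match.group(1)]).fetchone()[0]
        return str(value)
    return re.sub(r"\{(\w+)\}", fill, template)


# Stand-ins for LLM outputs (assumed, hard-coded for this sketch).
predicted_rows = [0, 2]   # rows whose city is Oslo
predicted_cols = [0, 2]   # the "team" and "wins" columns
sub_question = "The Oslo teams won {total} games in total, which is more than 15."
parsed_sql = {"total": "SELECT SUM(wins) FROM t"}

sub_columns, sub_rows = select_sub_table(COLUMNS, ROWS, predicted_rows, predicted_cols)
conn = sqlite3.connect(":memory:")
load_table(conn, "t", sub_columns, sub_rows)
filled = execute_and_fill(conn, sub_question, parsed_sql)
print(filled)  # The Oslo teams won 19 games in total, which is more than 15.
```

The point of the split is that the number 19 comes from the SQL engine rather than from the model, so the final reasoning stage only has to judge a sub-question whose values are already concrete.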

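Experimental results

Research questions

RQ1: Can LLM-based decomposition of huge tables into small, relevant sub-evidence improve table-based reasoning tasks?
RQ2: Can reliable, SQL-oriented decomposition of complex NL questions reduce hallucination and improve consistency with the evidence?
RQ3: Do evidence decomposition and question decomposition jointly improve performance on table-based fact verification and QA benchmarks?
RQ4: Is the approach interpretable through the generated sub-evidence and sub-questions?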

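Key findings

The proposed method outperforms strong baselines on multiple table-based reasoning benchmarks (TabFact, WikiTableQuestion, and FetaQA).
On TabFact, the method outperforms human performance, which the authors report as a first for that dataset.
Evidence decomposition focuses reasoning on the relevant table regions and reduces interference from irrelevant data.
Question decomposition with the parsing-execution-filling strategy yields reliable sub-questions grounded in executable SQL over the sub-evidence.
Combining evidence decomposition and question decomposition gives the best performance across datasets and ablations.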

This review was generated by AI and checked by a human editor.