QUICK REVIEW

[Paper Review] Improving Symbolic Translation of Language Models for Logical Reasoning

Ramya Thatikonda, Jiuzhou Han|arXiv (Cornell University)|Jan 14, 2026

Topic Modeling | Citations: 0

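One-line summary

The paper improves symbolic NL-to-FOL translation by small LMs for logical reasoning: it categorizes errors, fine-tunes on LLM-synthesized data, and introduces incremental inference and verification, improving accuracy and predicate coverage.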

ABSTRACT

The use of formal language for deductive logical reasoning aligns well with language models (LMs), where translating natural language (NL) into first-order logic (FOL) and employing an external solver results in a verifiable and therefore reliable reasoning system. However, smaller LMs often struggle with this translation task, frequently producing incorrect symbolic outputs due to formatting and translation errors. Existing approaches typically rely on self-iteration to correct these errors, but such methods depend heavily on the capabilities of the underlying model. To address this, we first categorize common errors and fine-tune smaller LMs using data synthesized by large language models. The evaluation is performed using the defined error categories. We introduce incremental inference, which divides inference into two stages, predicate generation and FOL translation, providing greater control over model behavior and enhancing generation quality as measured by predicate metrics. This decomposition framework also enables the use of a verification module that targets predicate-arity errors to further improve performance. Our study evaluates three families of models across four logical-reasoning datasets. The comprehensive fine-tuning, incremental inference, and verification modules reduce error rates, increase predicate coverage, and improve reasoning performance for smaller LMs, moving us closer to developing reliable and accessible symbolic-reasoning systems.

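Motivation and objectives

Enable reliable, verifiable deductive reasoning by translating natural language (NL) into first-order logic (FOL) and delegating inference to an external solver.
Address the translation errors of small LMs, which stem from formatting inconsistencies and mistranslations in the symbolic output.
Develop data augmentation and a methodological framework that improve translation accuracy and reasoning reliability.
Reduce symbolic translation errors through finer-grained control over inference and error handling.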

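Proposed method

Categorize common NL-to-FOL translation errors.
Fine-tune small LMs on data synthesized by large language models.
Introduce incremental inference, which splits inference into two stages: predicate generation and FOL translation.
Add a verification module that targets predicate-arity errors to improve generation quality.
Evaluate three model families on four logical-reasoning datasets using the proposed framework.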
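
To make the predicate-arity check concrete, here is a minimal sketch of what such a verification step might look like. It is not the authors' implementation: the function names (`parse_declarations`, `find_arity_errors`), the `Name(arg, ...)` declaration format, and the regex-based parsing are all assumptions made for illustration. The idea is that stage one (predicate generation) yields declarations, and stage two (FOL translation) is then checked against them.

```python
import re

# Assumption: predicates are written as `Name(arg1, arg2, ...)`, start with an
# uppercase letter, and their arguments contain no nested parentheses.
PREDICATE_CALL = re.compile(r"\b([A-Z][A-Za-z0-9_]*)\(([^()]*)\)")


def count_args(args):
    """Count the non-empty, comma-separated arguments in an argument string."""
    return len([a for a in args.split(",") if a.strip()])


def parse_declarations(declarations):
    """Map each declared predicate name to its arity (hypothetical stage-one output)."""
    arities = {}
    for decl in declarations:
        match = PREDICATE_CALL.fullmatch(decl.strip())
        if match is None:
            raise ValueError(f"cannot parse declaration: {decl!r}")
        name, args = match.groups()
        arities[name] = count_args(args)
    return arities


def find_arity_errors(declared, formulas):
    """Return (formula_index, predicate, expected_arity, found_arity) per mismatch.

    An undeclared predicate is reported with expected_arity set to None.
    """
    errors = []
    for index, formula in enumerate(formulas):
        for name, args in PREDICATE_CALL.findall(formula):
            found = count_args(args)
            expected = declared.get(name)
            if expected != found:
                errors.append((index, name, expected, found))
    return errors


declared = parse_declarations(["Student(x)", "Loves(x, y)"])
formulas = ["forall x (Student(x) -> Loves(x))", "Loves(alice, bob)"]
print(find_arity_errors(declared, formulas))
```

In a pipeline of this shape, a non-empty error list could be used to trigger regeneration or a repair step; how the paper's module acts on detected errors is not stated in this review.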

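Experimental results

Research questions

RQ1: Can small language models reliably translate natural language into first-order logic?
RQ2: How do error categorization and fine-tuning on LLM-synthesized data affect symbolic translation accuracy?
RQ3: Does incremental inference improve the quality of predicate generation and FOL translation compared with an end-to-end approach?
RQ4: Does the verification module reduce structural errors such as predicate-arity mismatches and thereby improve reasoning performance?
RQ5: How do different model families perform across multiple logical-reasoning datasets under the proposed framework?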

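Key findings

Fine-tuning on LLM-synthesized data reduces translation errors in small LMs.
Incremental inference gives more control over predicate generation and FOL translation and improves generation quality.
A verification module targeting predicate-arity errors improves performance further.
The framework increases predicate coverage and reasoning performance across multiple datasets.
Error rates drop for all three model families under the proposed method.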
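
This review does not say how predicate coverage is computed. As one plausible reading, and not necessarily the paper's definition, the sketch below treats it as the fraction of reference predicates, identified by name and arity, that also appear in the model's output. The function name `predicate_coverage` and the `(name, arity)` tuple representation are assumptions for illustration.

```python
def predicate_coverage(gold, predicted):
    """Fraction of gold (name, arity) pairs that the prediction also contains.

    Assumed definition for illustration only; the paper may define the metric differently.
    """
    gold_set = set(gold)
    if not gold_set:
        # No reference predicates to cover: treat coverage as complete.
        return 1.0
    return len(gold_set & set(predicted)) / len(gold_set)


gold = [("Student", 1), ("Loves", 2)]
predicted = [("Student", 1), ("Loves", 1)]  # wrong arity for Loves
print(predicate_coverage(gold, predicted))
```

Under this reading, a predicate generated with the wrong arity does not count as covered, which is why an arity-focused verification step would be expected to move the metric.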


This review was generated by AI and checked by a human editor.