QUICK REVIEW

[Paper Review] TabSieve: Explicit In-Table Evidence Selection for Tabular Prediction

Yongyao Wang, Ziqi Miao|arXiv (Cornell University)|Feb 12, 2026

Machine Learning and Data Classification|Citations: 0

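One-Line Summary

TabSieve introduces a select-then-predict framework that explicitly selects in-table evidence before making a prediction; trained on synthetic trajectories and reinforced with TAB-GRPO, it improves classification and regression in few-shot in-context settings.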

ABSTRACT

Tabular prediction can benefit from in-table rows as few-shot evidence, yet existing tabular models typically perform instance-wise inference and LLM-based prompting is often brittle. Models do not consistently leverage relevant rows, and noisy context can degrade performance. To address this challenge, we propose TabSieve, a select-then-predict framework that makes evidence usage explicit and auditable. Given a table and a query row, TabSieve first selects a small set of informative rows as evidence and then predicts the missing target conditioned on the selected evidence. To enable this capability, we construct TabSieve-SFT-40K by synthesizing high-quality reasoning trajectories from 331 real tables using a strong teacher model with strict filtering. Furthermore, we introduce TAB-GRPO, a reinforcement learning recipe that jointly optimizes evidence selection and prediction correctness with separate rewards, and stabilizes mixed regression and classification training via dynamic task-advantage balancing. Experiments on a held-out benchmark of 75 classification and 52 regression tables show that TabSieve consistently improves performance across shot budgets, with average gains of 2.92% on classification and 4.45% on regression over the second-best baseline. Further analysis indicates that TabSieve concentrates more attention on the selected evidence, which improves robustness to noisy context.
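
The select-then-predict flow described in the abstract can be illustrated with a minimal sketch. This is not TabSieve itself: the paper uses a trained LLM for both steps, whereas the sketch below swaps in a nearest-row selector and a majority vote purely to show the two-step shape. The names `select_evidence`, `predict_from_evidence`, and `select_then_predict` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Row = Dict[str, float]
LabeledRow = Tuple[Row, str]


def select_evidence(table: List[LabeledRow], query: Row, k: int = 3) -> List[int]:
    """Step 1 (stand-in selector): return indices of the k rows closest to the query.

    Assumption: squared Euclidean distance over the query's features replaces
    the learned evidence selection used in the paper.
    """
    def dist(row: Row) -> float:
        return sum((row[f] - query[f]) ** 2 for f in query)

    ranked = sorted(range(len(table)), key=lambda i: dist(table[i][0]))
    return ranked[:k]


def predict_from_evidence(table: List[LabeledRow], evidence: List[int]) -> str:
    """Step 2 (stand-in predictor): majority vote over the selected rows only."""
    votes = Counter(table[i][1] for i in evidence)
    return votes.most_common(1)[0][0]


def select_then_predict(table: List[LabeledRow], query: Row, k: int = 3):
    """Return both the evidence indices and the prediction, so the evidence is auditable."""
    evidence = select_evidence(table, query, k)
    return evidence, predict_from_evidence(table, evidence)
```

Returning the evidence indices alongside the prediction mirrors the "explicit and auditable" property: a reader can inspect exactly which rows the prediction was conditioned on.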

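Motivation and Objectives

Motivate robust tabular in-context learning by ensuring that evidence is explicitly identified and used.
Develop a two-stage training pipeline that combines supervised fine-tuning on synthetic evidence-selection trajectories with reinforcement learning that jointly optimizes evidence selection and prediction.
Address the early-stage optimization imbalance between classification and regression tasks through a task-advantage balancing mechanism.
Demonstrate robustness to noisy in-table context and stronger attention to the selected evidence across multiple shot budgets.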

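Proposed Method

Construct TabSieve-SFT-40K by synthesizing reasoning trajectories from 331 real tables with a strong teacher model and strict filtering.
Train the model in two stages: cold-start SFT on TabSieve-SFT-40K, followed by reinforcement learning with TAB-GRPO to jointly optimize evidence selection and prediction.
Use a task-advantage balancing mechanism to stabilize joint optimization over classification and regression tasks.
Design RL rewards for evidence-selection accuracy, prediction correctness, and output-format correctness.
Evaluate on a held-out benchmark of 75 classification tables and 52 regression tables in zero-shot and few-shot settings.
Analyze the shift of attention toward evidence rows and robustness to noisy context.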
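
The reward design and task-advantage balancing listed above can be sketched roughly as follows. The review does not give the formulas, so everything here is an assumption: the reward weights, the group-standardized advantage (the usual GRPO form), and the balancing rule that rescales each task's advantages to a common mean magnitude. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def total_reward(evidence_r: float, prediction_r: float, format_r: float,
                 weights: Tuple[float, float, float] = (1.0, 1.0, 0.5)) -> float:
    """Combine the three reward terms; the weights are illustrative assumptions."""
    w_e, w_p, w_f = weights
    return w_e * evidence_r + w_p * prediction_r + w_f * format_r


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Group-relative advantage: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def balance_by_task(advantages_by_task: Dict[str, List[float]],
                    eps: float = 1e-6) -> Dict[str, List[float]]:
    """Assumed balancing rule: rescale each task's advantages so that the mean
    absolute advantage is the same for classification and regression."""
    scales = {t: mean([abs(a) for a in advs])
              for t, advs in advantages_by_task.items()}
    target = mean(list(scales.values()))
    return {t: [a * target / (scales[t] + eps) for a in advs]
            for t, advs in advantages_by_task.items()}
```

The intent of the rescaling is that neither task type dominates the policy gradient simply because its advantages happen to be larger early in training; the paper's actual mechanism is described as dynamic and may differ.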

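Experimental Results

Research Questions

RQ1: Can an explicit evidence-selection path direct the model's attention to evidence rows in tabular in-context learning?
RQ2: Does reliance on noisy context actively mislead the model when the evidence-selection path is made explicit?
RQ3: Does explicit evidence selection improve the robustness and accuracy of classification and regression predictions in the few-shot regime?
RQ4: How does balancing task advantages between classification and regression affect joint RL optimization?
RQ5: Do synthetic reasoning trajectories provide an effective initialization for downstream reinforcement learning?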

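Key Findings

TabSieve consistently outperforms the second-best baseline across 75 classification tables, 52 regression tables, and a range of shot budgets.
The average gain over the second-best method is 2.92% on classification and 4.45% on regression.
Explicit evidence selection concentrates attention on evidence rows and mitigates the harmful effect of noisy context.
Removing the <select> step or the evidence reward degrades performance, confirming the value of evidence selection.
TAB-GRPO with task-advantage balancing stabilizes joint optimization, with particularly notable gains on classification.
TabSieve outperforms general-purpose and table-specialized LLMs in zero-shot and few-shot settings and remains robust as the shot budget increases.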
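
One way to quantify the finding that attention concentrates on selected evidence is the share of attention mass that falls on evidence rows. The sketch below is only an illustration of that idea: the review does not state which metric the paper uses, and it assumes attention has already been aggregated to one non-negative weight per context row. The function name `evidence_attention_share` is hypothetical.

```python
from typing import Iterable, List


def evidence_attention_share(row_attention: List[float],
                             evidence_rows: Iterable[int]) -> float:
    """Fraction of total attention mass assigned to the selected evidence rows.

    row_attention[i] is an assumed, pre-aggregated attention weight for
    context row i; evidence_rows holds the indices of the selected rows.
    """
    total = sum(row_attention)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    evidence = set(evidence_rows)
    on_evidence = sum(w for i, w in enumerate(row_attention) if i in evidence)
    return on_evidence / total
```

Comparing this share with and without the explicit selection step, on the same table and query, would give a simple before-and-after view of how much attention moves toward the evidence.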

This review was generated by AI and checked by a human editor.