QUICK REVIEW

[Paper Review] Human-LLM Collaborative Feature Engineering for Tabular Data

Zhuoyan Li, Aditya Bansal|arXiv (Cornell University)|Jan 28, 2026

Machine Learning and Data Classification | Citations: 0

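One-line summary

This paper proposes a human-LLM collaborative framework for tabular feature engineering that decouples operation proposal (done by LLMs) from operation selection (guided by a Bayesian surrogate model and optional human preferences). It reports improved predictive performance and reduced cognitive load across multiple datasets.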

ABSTRACT

Large language models (LLMs) are increasingly used to automate feature engineering in tabular learning. Given task-specific information, LLMs can propose diverse feature transformation operations to enhance downstream model performance. However, current approaches typically assign the LLM as a black-box optimizer, responsible for both proposing and selecting operations based solely on its internal heuristics, which often lack calibrated estimations of operation utility and consequently lead to repeated exploration of low-yield operations without a principled strategy for prioritizing promising directions. In this paper, we propose a human-LLM collaborative feature engineering framework for tabular learning. We begin by decoupling the transformation operation proposal and selection processes, where LLMs are used solely to generate operation candidates, while the selection is guided by explicitly modeling the utility and uncertainty of each proposed operation. Since accurate utility estimation can be difficult especially in the early rounds of feature engineering, we design a mechanism within the framework that selectively elicits and incorporates human expert preference feedback, comparing which operations are more promising, into the selection process to help identify more effective operations. Our evaluations on both the synthetic study and the real user study demonstrate that the proposed framework improves feature engineering performance across a variety of tabular datasets and reduces users' cognitive load during the feature engineering process.

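Motivation and objectives

Improve the efficiency of tabular feature engineering by decoupling operation proposal from operation selection.
Introduce a Bayesian surrogate model that estimates the utility and uncertainty of each proposed operation.
Incorporate selective human expert preference feedback to further refine selection.
Balance exploration and exploitation in operation selection with an Upper Confidence Bound (UCB) strategy.
Demonstrate performance gains and reduced cognitive load through a synthetic study and a real user study.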

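Proposed method

The LLM generates diverse candidate feature transformations from the history and dataset metadata (H_t, C, Meta).
A Bayesian neural network surrogate models the utility g(e) of each operation, using an embedding-based encoding phi(e) that combines semantic and column-usage features.
The UCB is computed from the utility estimate μ_t(e) and the uncertainty σ_t(e): UCB_t(e) = μ_t(e) + sqrt(β_t) * σ_t(e).
Human preference feedback is elicited as pairwise comparisons to further refine selection, using a probit likelihood and an updated posterior q'_t(θ).
Two decision conditions govern human elicitation: (C1) an overlap between UCB and LCB, which ensures a potential gain, and (C2) an uncertainty threshold that justifies the cognitive cost. The feedback is reflected in the final choice between e_t^a and e_t^b.
The algorithm repeats rounds up to a budget T, updating the history H_t and the surrogate model whether or not human input was provided.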
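
The UCB-based selection and the two elicitation conditions can be illustrated with a minimal Python sketch. This is not the authors' implementation. The candidate names, the numeric values, the fixed `beta`, the `sigma_threshold` parameter, and the exact reading of C1 (the runner-up's UCB exceeds the leader's LCB) and C2 (the larger of the two posterior standard deviations exceeds a threshold) are all assumptions made for illustration. The probit function shows the standard form of a pairwise-preference likelihood; its noise scale is likewise assumed.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str     # hypothetical label for an LLM-proposed operation
    mu: float     # surrogate posterior mean utility, mu_t(e)
    sigma: float  # surrogate posterior standard deviation, sigma_t(e)


def ucb(c: Candidate, beta: float) -> float:
    # UCB_t(e) = mu_t(e) + sqrt(beta_t) * sigma_t(e)
    return c.mu + math.sqrt(beta) * c.sigma


def lcb(c: Candidate, beta: float) -> float:
    # Lower confidence bound, mirroring the UCB with a minus sign.
    return c.mu - math.sqrt(beta) * c.sigma


def top_two(cands: list[Candidate], beta: float) -> tuple[Candidate, Candidate]:
    # Rank candidates by UCB and return the leader and the runner-up.
    ranked = sorted(cands, key=lambda c: ucb(c, beta), reverse=True)
    return ranked[0], ranked[1]


def should_ask_human(a: Candidate, b: Candidate, beta: float,
                     sigma_threshold: float) -> bool:
    # C1 (assumed reading): confidence intervals overlap, so the
    # runner-up b could plausibly be better than the leader a.
    overlap = ucb(b, beta) > lcb(a, beta)
    # C2 (assumed reading): uncertainty is large enough to justify
    # spending human attention on the comparison.
    uncertain = max(a.sigma, b.sigma) > sigma_threshold
    return overlap and uncertain


def probit_preference(u_a: float, u_b: float, noise: float = 1.0) -> float:
    # Standard probit model: P(a preferred over b) = Phi((u_a - u_b) / (sqrt(2) * noise)),
    # where Phi is the standard normal CDF. The noise scale is an assumption.
    z = (u_a - u_b) / (math.sqrt(2.0) * noise)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical candidates with made-up posterior estimates.
candidates = [
    Candidate("log_income", mu=0.5, sigma=0.1),
    Candidate("ratio_income_debt", mu=0.4, sigma=0.2),
    Candidate("age_times_tenure", mu=0.1, sigma=0.05),
]
e_a, e_b = top_two(candidates, beta=4.0)
ask = should_ask_human(e_a, e_b, beta=4.0, sigma_threshold=0.15)
```

In this toy setting the candidate with the lower mean but higher uncertainty leads on UCB, which is the exploration behavior the UCB rule is meant to produce. Because the two intervals overlap and the uncertainty exceeds the threshold, the sketch would request a pairwise comparison before committing to an operation.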

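Experimental results

Research questions

RQ1: Can decoupling operation proposal from operation selection improve the efficiency of feature engineering for tabular data?
RQ2: How does the Bayesian surrogate model estimate the utility and uncertainty of proposed feature operations in this setting?
RQ3: Does selective human preference feedback further improve feature engineering performance and cognitive load?
RQ4: What is the exploration-exploitation trade-off for LLM-proposed operations within this framework?
RQ5: How does the framework perform across multiple datasets and downstream models compared with AutoML and existing LLM-based methods?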

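Key findings

The proposed framework consistently outperforms AutoML and other LLM-based baselines on 13 classification datasets, across both MLP and XGBoost evaluators.
Even without human input, the method achieves a significant reduction in error rate, and human feedback yields further reductions across tasks.
LLM-based feature engineering methods generally outperform traditional non-LLM AutoML methods.
Selection that explicitly accounts for utility and uncertainty is more efficient than black-box LLM optimization.
Selective human preference feedback yields consistent performance gains and reduced human cognitive load in the feature engineering workflow.
On a custom transformed dataset, the framework achieved a higher AUROC than the OCTree baseline within the same iteration budget.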

This review was generated by AI and checked by a human editor.