QUICK REVIEW

[Paper Review] ReFuGe: Feature Generation for Prediction Tasks on Relational Databases with LLM Agents

Kyungho Kim, Do Geon Lee|arXiv (Cornell University)|Jan 25, 2026

Data Quality and Management|Cited by 0

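One-Sentence Summary

ReFuGe, an agentic framework, uses specialized LLM agents to identify the relevant parts of the schema, generate diverse relational features, and filter them through reasoning and validation, iterating until predictive performance converges; it outperforms baselines on multiple RDB prediction tasks.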

ABSTRACT

Relational databases (RDBs) play a crucial role in many real-world web applications, supporting data management across multiple interconnected tables. Beyond typical retrieval-oriented tasks, prediction tasks on RDBs have recently gained attention. In this work, we address this problem by generating informative relational features that enhance predictive performance. However, generating such features is challenging: it requires reasoning over complex schemas and exploring a combinatorially large feature space, all without explicit supervision. To address these challenges, we propose ReFuGe, an agentic framework that leverages specialized large language model agents: (1) a schema selection agent identifies the tables and columns relevant to the task, (2) a feature generation agent produces diverse candidate features from the selected schema, and (3) a feature filtering agent evaluates and retains promising features through reasoning-based and validation-based filtering. It operates within an iterative feedback loop until performance converges. Experiments on RDB benchmarks demonstrate that ReFuGe substantially improves performance on various RDB prediction tasks. Our code and datasets are available at https://github.com/K-Kyungho/REFUGE.

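Research Motivation and Objectives

Motivate the need for automated generation of informative relational features for prediction tasks on RDBs.
Propose an agentic framework (ReFuGe) with schema selection, feature generation, and feature filtering agents.
Demonstrate improvement and self-learning through iterative feedback, without ground-truth supervision.
Evaluate ReFuGe on multiple real-world RDB benchmarks to establish its effectiveness.
Provide guidance and insights on when and how relational features improve predictive performance.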

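Proposed Method

A schema selection agent identifies the relevant tables and columns from the RDB schema and the task description.
A feature generation agent uses multiple LLM instances to produce diverse candidate relational features.
Reasoning-based feature filtering screens for promising features through semantic reasoning, informed by previous iterations.
Validation-based feature filtering temporarily adds candidate features to the target table and trains a tabular model to assess their utility empirically (e.g., by AUROC).
An iterative feedback loop passes natural-language feedback from the previous iteration to the agents to guide subsequent feature generation and selection; the loop stops when no new features are produced.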
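The validation-based filtering and feedback loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_evaluate` is a hypothetical stand-in for "train a tabular model and score it on validation data" (it simply sums the feature columns row-wise), `fake_generator` stands in for the LLM feature generation agent, and the greedy accept-if-AUROC-improves rule and the `min_gain` threshold are assumptions chosen for brevity.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Column = List[float]
Features = Dict[str, Column]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_evaluate(features: Features, labels: Sequence[int]) -> float:
    """Hypothetical stand-in for a trained tabular model: row-wise sum of columns as the score."""
    if not features:
        return 0.5  # no features: chance-level AUROC
    scores = [sum(row) for row in zip(*features.values())]
    return auroc(labels, scores)


def validation_filter(
    kept: Features,
    candidates: Features,
    labels: Sequence[int],
    evaluate: Callable[[Features, Sequence[int]], float] = toy_evaluate,
    min_gain: float = 1e-9,
) -> Tuple[Features, float, List[str]]:
    """Temporarily add each candidate; keep it only if the validation score improves."""
    kept = dict(kept)
    best = evaluate(kept, labels)
    accepted: List[str] = []
    for name, column in candidates.items():
        if name in kept:
            continue
        trial = {**kept, name: column}
        score = evaluate(trial, labels)
        if score > best + min_gain:
            kept, best = trial, score
            accepted.append(name)
    return kept, best, accepted


def run_loop(
    generate: Callable[[str], Features],
    labels: Sequence[int],
    max_iters: int = 10,
) -> Tuple[Features, float]:
    """Iterate generation and filtering; stop once an iteration keeps no new feature."""
    kept: Features = {}
    score = toy_evaluate(kept, labels)
    feedback = "no features kept yet"
    for iteration in range(1, max_iters + 1):
        candidates = generate(feedback)
        kept, score, accepted = validation_filter(kept, candidates, labels)
        if not accepted:
            break
        # Natural-language feedback passed to the next round of generation.
        feedback = f"iteration {iteration}: kept {accepted}, validation AUROC {score:.3f}"
    return kept, score


# Toy data (made up for illustration, not from the paper).
labels = [0, 0, 1, 1]
rounds = iter([
    {"good": [0.1, 0.2, 0.8, 0.9], "noise": [0.9, 0.1, 0.8, 0.2]},
    {"constant": [1.0, 1.0, 1.0, 1.0]},
])


def fake_generator(feedback: str) -> Features:
    """Hypothetical generator: ignores the feedback and replays canned candidates."""
    return next(rounds, {})


kept, score = run_loop(fake_generator, labels)
print(sorted(kept), score)
```

In a real setting, `evaluate` would be replaced by training an actual tabular model on the augmented target table, and `generate` would prompt the LLM agents with the feedback string.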

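Experimental Results

Research Questions

RQ1: How does ReFuGe compare with state-of-the-art baselines on RDB prediction tasks across different datasets?
RQ2: How much does each key component (schema selection, feature filtering, iterative feedback) contribute to overall performance?
RQ3: Does the iterative, feedback-based loop consistently improve predictive performance?
RQ4: What are representative features that ReFuGe generates in practice (case study)?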

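Key Findings

ReFuGe outperforms all baselines on most tasks and achieves the best average performance and average rank across seven datasets.
The ablation study shows that every component contributes to performance: removing schema selection, filtering, or feedback degrades results, and feature filtering has a particularly strong effect.
Performance generally improves over iterations, with an average of 2.4 iterations completed before convergence.
A case study shows how features drawn from related tables (e.g., distinct ads clicked, geographic hierarchy) yield notable gains.
Increasing the number of LLM instances in the feature generation agent tends to improve performance across tasks.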
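To make the idea of a relational feature concrete, the sketch below computes a "distinct ads clicked" count for each row of a target table by aggregating over a related table. The table layout, the column names (`user_id`, `ad_id`), and the helper `distinct_count_feature` are hypothetical and chosen only for illustration; the digest does not specify the schema or the exact feature definitions used in the case study.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def distinct_count_feature(
    target_rows: Sequence[Dict[str, Hashable]],
    child_rows: Sequence[Dict[str, Hashable]],
    key: str,
    value: str,
) -> List[int]:
    """For each target row, count the distinct `value`s among child rows sharing its `key`."""
    seen = defaultdict(set)
    for row in child_rows:
        seen[row[key]].add(row[value])
    # Target rows with no matching child rows get a count of 0.
    return [len(seen.get(row[key], ())) for row in target_rows]


# Toy tables (made up for illustration).
users = [{"user_id": 1}, {"user_id": 2}, {"user_id": 3}]
clicks = [
    {"user_id": 1, "ad_id": "a"},
    {"user_id": 1, "ad_id": "b"},
    {"user_id": 1, "ad_id": "a"},  # repeat click on the same ad
    {"user_id": 2, "ad_id": "c"},
]

feature = distinct_count_feature(users, clicks, key="user_id", value="ad_id")
print(feature)
```

The resulting column can be appended to the target table as one extra feature, which is the form a validation-based filter would then assess.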


This review was generated by AI and reviewed by a human editor.