QUICK REVIEW

[Paper Review] Causal Pre-training Under the Fairness Lens: An Empirical Study of TabPFN

Qinyi Liu, Mohammad Khalil|arXiv (Cornell University)|Jan 25, 2026

Ethics and Social Impacts of AI | Cited by 0

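One-Sentence Summary

This paper evaluates the accuracy and fairness of TabPFN and FT-TabPFN across datasets and distribution shifts, finding strong predictive performance but only moderate and inconsistent fairness gains, particularly under missing-not-at-random (MNAR) shift.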

ABSTRACT

Foundation models for tabular data, such as the Tabular Prior-data Fitted Network (TabPFN), are pre-trained on a massive number of synthetic datasets generated by structural causal models (SCM). They leverage in-context learning to offer high predictive accuracy in real-world tasks. However, the fairness properties of these foundational models, which incorporate ideas from causal reasoning during pre-training, remain underexplored. In this work, we conduct a comprehensive empirical evaluation of TabPFN and its fine-tuned variants, assessing predictive performance, fairness, and robustness across varying dataset sizes and distributional shifts. Our results reveal that while TabPFN achieves stronger predictive accuracy compared to baselines and exhibits robustness to spurious correlations, improvements in fairness are moderate and inconsistent, particularly under missing-not-at-random (MNAR) covariate shifts. These findings suggest that the causal pre-training in TabPFN is helpful but insufficient for algorithmic fairness, highlighting implications for deploying TabPFN (and similar) models in practice and the need for further fairness interventions.

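Motivation and Objectives

Evaluate the predictive accuracy of TabPFN and FT-TabPFN across dataset sizes and benchmarks.
Assess the group fairness of TabPFN variants, measured by demographic parity (DP) and equal opportunity (EO).
Test the robustness of TabPFN to spurious correlations and MNAR covariate shift.
Compare TabPFN-based methods with traditional baselines on fairness and robustness criteria.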

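Proposed Method

Use TabPFN, pre-trained on millions of synthetic tabular tasks generated from causal structures, and fine-tune it to obtain FT-TabPFN.
Evaluate on four tabular fairness benchmarks (Heart, Bank, Law, Adult) with standardized preprocessing.
Compare zero-shot TabPFN and FT-TabPFN against classical baselines (LR, RF, MLP).
Measure accuracy and fairness (DP and EO) with Fairlearn at sample sizes from 500 to the full dataset.
Introduce a spurious-correlation perturbation (Z_spur) to test flip consistency under non-causal dependence.
Simulate MNAR covariate shift by removing biased subgroups, to study the robustness of fairness and accuracy.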
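
The DP and EO gaps named in the list above can be illustrated with a minimal sketch. It assumes binary labels and predictions, and it defines each gap as the largest between-group difference in positive-prediction rate (DP) or true-positive rate (EO). The function names and toy data are illustrative assumptions, not taken from the paper, which relies on Fairlearn rather than hand-written metrics.

```python
from collections import defaultdict


def _rate_by_group(flags, groups):
    """Mean of 0/1 flags within each group value."""
    totals, counts = defaultdict(int), defaultdict(int)
    for flag, group in zip(flags, groups):
        totals[group] += flag
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in counts}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = _rate_by_group(y_pred, groups)
    return max(rates.values()) - min(rates.values())


def equal_opportunity_difference(y_true, y_pred, groups):
    """Largest gap in true-positive rate between any two groups."""
    # Keep only rows whose true label is positive.
    positives = [(p, g) for t, p, g in zip(y_true, y_pred, groups) if t == 1]
    if not positives:
        raise ValueError("no positive labels; true-positive rate is undefined")
    preds, grps = zip(*positives)
    rates = _rate_by_group(preds, grps)
    return max(rates.values()) - min(rates.values())


# Toy data (illustrative only, not results from the paper).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp = demographic_parity_difference(y_pred, groups)
eo = equal_opportunity_difference(y_true, y_pred, groups)
print(f"DP difference: {dp:.2f}, EO difference: {eo:.2f}")
```

A value of 0 means the groups are treated identically on that criterion; larger values indicate a larger disparity.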

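Experimental Results

Research Questions

RQ1: How do TabPFN and FT-TabPFN compare with traditional models in accuracy and fairness across dataset sizes?
RQ2: How robust are TabPFN and FT-TabPFN at maintaining accuracy and fairness under covariate shift and spurious correlations?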

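Key Findings

FT-TabPFN and TabPFN achieve the highest or near-highest accuracy across sizes and datasets, with especially strong results on Bank and Heart.
TabPFN variants show better fairness in small-data settings, but EO and DP vary with sample size and dataset.
TabPFN models are robust to spurious correlations, showing high flip consistency under perturbation.
Under MNAR covariate shift, TabPFN and FT-TabPFN retain high accuracy (0.86–0.99), but fairness improvements are inconsistent, with DP and EO varying across datasets.
Fairness gains are not uniform across datasets: DP ranges from 0.04 to 0.42 and EO from 0.03 to 0.27, depending on dataset and condition.
The Law dataset is an outlier: under the extreme attribute imbalance associated with the spurious perturbation, both consistency and accuracy are lower.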
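
The two robustness probes behind these findings can be sketched as follows. The definitions are assumptions, since this review does not spell them out: flip consistency is taken to be the fraction of rows whose prediction is unchanged when a binary spurious feature is flipped, and the MNAR-style shift is simulated by dropping a fraction of rows from one (group, label) cell. All names (`flip_consistency`, `drop_subgroup`, `z_spur`) and the toy data and models are hypothetical.

```python
import random


def flip_consistency(predict, rows, spur_key="z_spur"):
    """Fraction of rows whose prediction survives flipping a binary spurious feature."""
    unchanged = 0
    for row in rows:
        flipped = dict(row)
        flipped[spur_key] = 1 - row[spur_key]
        unchanged += int(predict(row) == predict(flipped))
    return unchanged / len(rows)


def drop_subgroup(rows, group, label, drop_frac, seed=0):
    """Assumed MNAR-style shift: drop a fraction of rows in one (group, label) cell."""
    rng = random.Random(seed)
    kept = []
    for row in rows:
        in_cell = row["group"] == group and row["y"] == label
        if in_cell and rng.random() < drop_frac:
            continue  # this row goes missing, and not at random
        kept.append(row)
    return kept


# Toy rows (illustrative only): one real feature, one spurious feature.
rows = [
    {"x": i / 10, "z_spur": i % 2, "group": "a" if i < 5 else "b", "y": int(i % 3 == 0)}
    for i in range(10)
]


def robust_model(row):
    """Hypothetical model that ignores the spurious feature."""
    return int(row["x"] >= 0.5)


def shortcut_model(row):
    """Hypothetical model that relies only on the spurious feature."""
    return row["z_spur"]


robust_score = flip_consistency(robust_model, rows)
shortcut_score = flip_consistency(shortcut_model, rows)
shifted = drop_subgroup(rows, group="b", label=1, drop_frac=1.0)
print(robust_score, shortcut_score, len(shifted))
```

A model that ignores the spurious feature scores 1.0 and one that depends on it entirely scores 0.0, so high flip consistency is evidence that predictions do not hinge on the non-causal signal. Training on the output of `drop_subgroup` and evaluating on the original rows gives one way to compare accuracy and fairness before and after the shift.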

This review was generated by AI and reviewed by a human editor.