QUICK REVIEW

[Paper Review] Forward-Backward Selection with Early Dropping

Giorgos Borboudakis, Ioannis Tsamardinos | arXiv (Cornell University) | May 30, 2017

Bayesian Modeling and Causal Inference | References: 27 | Cited by: 25

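One-Sentence Summary

This paper proposes Forward-Backward Selection with Early Dropping (FBED), a heuristic that speeds up feature selection by temporarily discarding variables that are conditionally independent of the outcome given the currently selected variables. It cuts running time by up to two orders of magnitude and selects fewer variables, while still correctly identifying the Markov blanket in the sample limit when the faithfulness assumption holds.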

ABSTRACT

Forward-backward selection is one of the most basic and commonly-used feature selection algorithms available. It is also general and conceptually applicable to many different types of data. In this paper, we propose a heuristic that significantly improves its running time, while preserving predictive accuracy. The idea is to temporarily discard the variables that are conditionally independent with the outcome given the selected variable set. Depending on how those variables are reconsidered and reintroduced, this heuristic gives rise to a family of algorithms with increasingly stronger theoretical guarantees. In distributions that can be faithfully represented by Bayesian networks or maximal ancestral graphs, members of this algorithmic family are able to correctly identify the Markov blanket in the sample limit. In experiments we show that the proposed heuristic increases computational efficiency by about two orders of magnitude in high-dimensional problems, while selecting fewer variables and retaining predictive performance. Furthermore, we show that the proposed algorithm and feature selection with LASSO perform similarly when restricted to select the same number of variables, making the proposed algorithm an attractive alternative for problems where no (efficient) algorithm for LASSO exists.

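Motivation and Objectives

Address the high computational cost and the multiple-testing problem of forward-backward selection in high-dimensional feature selection.
Improve efficiency without sacrificing predictive performance or accuracy in identifying the relevant variables.
Develop a heuristic that converges faster while retaining theoretical guarantees for Bayesian networks and maximal ancestral graphs.
Provide a practical alternative for settings where no (efficient) LASSO algorithm exists.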

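Proposed Method

The algorithm performs forward selection, but temporarily drops every variable that is conditionally independent of the outcome given the currently selected set.
Once forward selection finishes, the algorithm applies backward elimination to the selected variables and performs at most K additional forward-backward rounds, in which previously dropped variables are reconsidered.
The method is parameterized by K; the settings K = 0, 1, and ∞ correspond to progressively stronger theoretical guarantees for Markov blanket recovery.
Conditional independence is assessed with statistical tests, and the heuristic exploits properties of Bayesian networks and maximal ancestral graphs.
The algorithm is implemented in the MXM R package, with support for mixed data types, nonlinear relationships, and robust tests.
The method was evaluated on 12 datasets of varying dimensionality and sparsity, comparing running time, accuracy, and the quality of the selected variables.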
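The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MXM implementation: the function names (`forward_run`, `backward_elimination`, `fbed`), the `alpha` threshold, and the `toy_p_value` oracle are all hypothetical. A real application would replace the oracle with a statistical conditional independence test that returns a p-value. The sketch also assumes that backward elimination runs once, after all forward runs have finished.

```python
from typing import Callable, List, Sequence

# Assumed interface: p_value(candidate, conditioning_set) returns the p-value
# for "candidate is independent of the outcome given conditioning_set".
PValueFn = Callable[[str, List[str]], float]


def forward_run(selected: List[str], variables: Sequence[str],
                p_value: PValueFn, alpha: float) -> List[str]:
    """One forward pass with early dropping; extends `selected` in place."""
    remaining = [v for v in variables if v not in selected]
    while remaining:
        pvals = {v: p_value(v, list(selected)) for v in remaining}
        # Early dropping: discard every variable that looks conditionally
        # independent of the outcome given the current selection.
        remaining = [v for v in remaining if pvals[v] <= alpha]
        if not remaining:
            break
        best = min(remaining, key=lambda v: pvals[v])  # most significant
        selected.append(best)
        remaining.remove(best)
    return selected


def backward_elimination(selected: List[str], p_value: PValueFn,
                         alpha: float) -> List[str]:
    """Repeatedly remove the least significant variable until all pass."""
    while selected:
        pvals = {v: p_value(v, [s for s in selected if s != v])
                 for v in selected}
        worst = max(selected, key=lambda v: pvals[v])
        if pvals[worst] <= alpha:
            break
        selected.remove(worst)
    return selected


def fbed(variables: Sequence[str], p_value: PValueFn,
         alpha: float = 0.05, k: float = 0) -> List[str]:
    """Initial forward run plus up to k extra runs, then backward elimination.

    Pass k=float("inf") to keep running until no new variable is selected.
    """
    selected: List[str] = []
    runs = 0
    while True:
        size_before = len(selected)
        forward_run(selected, variables, p_value, alpha)
        runs += 1
        if runs > k or len(selected) == size_before:
            break
    return backward_elimination(selected, p_value, alpha)


def toy_p_value(candidate: str, conditioning: List[str]) -> float:
    """Hypothetical oracle standing in for a real statistical test."""
    if candidate == "A":   # always associated with the outcome
        return 0.001
    if candidate == "B":   # informative only once A is in the model
        return 0.01 if "A" in conditioning else 0.5
    if candidate == "C":   # redundant once A is in the model
        return 0.8 if "A" in conditioning else 0.02
    return 0.9             # pure noise


VARIABLES = ["A", "B", "C", "noise"]
print(fbed(VARIABLES, toy_p_value, k=0))  # ['A']
print(fbed(VARIABLES, toy_p_value, k=1))  # ['A', 'B']
```

In this toy setting, variable B is dropped during the first run because it looks independent before A is selected, so K = 0 misses it. One additional run reconsiders the dropped variables and recovers B, which illustrates why larger K can carry stronger guarantees at the cost of more tests.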

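Experimental Results

Research Questions

RQ1: Can early dropping of conditionally independent variables substantially reduce the running time of forward-backward selection without degrading predictive performance?
RQ2: Does the proposed heuristic retain the theoretical guarantees for Markov blanket identification under the faithfulness assumption for Bayesian networks or maximal ancestral graphs?
RQ3: How do FBED and LASSO differ in variable selection quality and predictive accuracy when both are restricted to select the same number of features?
RQ4: How does the number of additional forward-backward rounds (K) affect the trade-off between computational efficiency and selection accuracy?
RQ5: In high-dimensional settings, does FBED outperform standard forward-backward selection in speed and in the sparsity of the selected variable set?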

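Key Findings

FBED reduces computation time by up to two orders of magnitude on high-dimensional datasets. The gisette figures quoted here show a smaller gain: FBED∞ ran in 4,191.7 s versus 6,759.6 s for standard forward-backward selection (FBS), roughly a 1.6× speedup.
On the gisette dataset, FBED∞ achieved the highest Markov blanket recovery rate (85.2%), ahead of FBS (81.7%) and FBED0 (43.7%).
FBED1 and FBED∞ matched LASSO-FS in predictive performance, with a median AUC difference below 0.05 across multiple datasets.
FBED0 selected fewer variables than FBS (for example, 23.1 vs. 26.1 on the musk dataset) while maintaining or improving accuracy.
The algorithm maintained high accuracy across several data types, including mixed continuous and categorical predictors, time-series data, and time-to-event (survival) outcomes.
FBED∞ offered the best trade-off between accuracy and efficiency: its median running time on the dorothea dataset was 22.3 s, versus 89.0 s for LASSO-FS with 1000 λ values.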
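To put the running times quoted above on a common scale, the short sketch below converts each pair into a speedup factor and into orders of magnitude (base-10 logarithm of the ratio). The helper name `speedup` is hypothetical; the numbers are the ones listed in the findings.

```python
import math


def speedup(baseline_seconds: float, method_seconds: float) -> float:
    """Ratio of baseline running time to the method's running time."""
    return baseline_seconds / method_seconds


# Running times in seconds, as quoted in the findings above.
gisette = speedup(6759.6, 4191.7)   # FBS vs. FBED∞ on gisette
dorothea = speedup(89.0, 22.3)      # LASSO-FS vs. FBED∞ on dorothea

for name, ratio in [("gisette", gisette), ("dorothea", dorothea)]:
    orders = math.log10(ratio)
    print(f"{name}: {ratio:.2f}x speedup, {orders:.2f} orders of magnitude")
# gisette: 1.61x speedup, 0.21 orders of magnitude
# dorothea: 3.99x speedup, 0.60 orders of magnitude
```

Both quoted pairs fall well short of a factor of 100, so the two-orders-of-magnitude figure should be read as an upper bound observed on other high-dimensional problems, not as a description of these two examples.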


This review was generated by AI and reviewed by a human editor.