QUICK REVIEW

[Paper Review] On Supervised Selection of Bayesian Networks

Petri Kontkanen, Petri Myllymäki|arXiv (Cornell University)|Jan 23, 2013

Bayesian Modeling and Causal Inference | References: 24 | Cited by: 38

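One-Sentence Summary

This paper studies supervised model selection for Bayesian networks and shows that the standard marginal likelihood score performs poorly when the goal is predictive accuracy in classification tasks. Dawid's prequential (predictive sequential) approach instead performs best across multiple benchmark data sets, and the authors advocate it for supervised learning settings in which focused predictive distributions take priority.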

ABSTRACT

Given a set of possible models (e.g., Bayesian network structures) and a data sample, in the unsupervised model selection problem the task is to choose the most accurate model with respect to the domain joint probability distribution. In contrast to this, in supervised model selection it is a priori known that the chosen model will be used in the future for prediction tasks involving more "focused" predictive distributions. Although focused predictive distributions can be produced from the joint probability distribution by marginalization, in practice the best model in the unsupervised sense does not necessarily perform well in supervised domains. In particular, the standard marginal likelihood score is a criterion for the unsupervised task, and, although frequently used for supervised model selection also, does not perform well in such tasks. In this paper we study the performance of the marginal likelihood score empirically in supervised Bayesian network selection tasks by using a large number of publicly available classification data sets, and compare the results to those obtained by alternative model selection criteria, including empirical crossvalidation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential (predictive sequential) principle. The results demonstrate that the marginal likelihood score does not perform well for supervised model selection, while the best results are obtained by using Dawid's prequential approach.

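Motivation and Objectives

Assess whether standard unsupervised model selection criteria, the marginal likelihood in particular, are suitable for supervised Bayesian network learning.
Identify model selection criteria that better match predictive performance in classification tasks, where focused predictive distributions are used.
Empirically compare the marginal likelihood, cross-validation, and supervised scoring methods on real-world classification data sets.
Advocate the prequential (predictive sequential) score as a better alternative to the marginal likelihood in supervised settings.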
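
The contrast between the unsupervised criterion and the supervised prequential criterion can be written out explicitly. The notation here is assumed for illustration and is not taken from the paper: each data item d_i = (u_i, c_i) consists of predictor values u_i and a class label c_i, D = (d_1, ..., d_N), and M is a candidate model. By the chain rule the marginal likelihood is a product of joint one-step predictions, while the supervised prequential score keeps only the focused prediction of the class at each step.

```latex
% Assumed notation: d_i = (u_i, c_i), D = (d_1, ..., d_N), M a candidate model.
\begin{align*}
\text{unsupervised (marginal likelihood):}\quad
  & P(D \mid M) = \prod_{i=1}^{N} P(d_i \mid d_1, \dots, d_{i-1}, M) \\
\text{supervised prequential:}\quad
  & \prod_{i=1}^{N} P(c_i \mid u_i, d_1, \dots, d_{i-1}, M) \\
\text{focused prediction by marginalization:}\quad
  & P(c_i \mid u_i, d_1, \dots, d_{i-1}, M)
    = \frac{P(c_i, u_i \mid d_1, \dots, d_{i-1}, M)}
           {\sum_{c} P(c, u_i \mid d_1, \dots, d_{i-1}, M)}
\end{align*}
```

A model can score well on the first product by describing the predictors u_i accurately while still predicting c_i poorly, which is one way to read the mismatch the abstract describes.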

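Proposed Method

The authors apply the marginal likelihood score, a standard unsupervised criterion, to supervised Bayesian network structure selection.
They compare it with empirical cross-validation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential approach.
The predictive accuracy of each model selection criterion is evaluated on a large number of publicly available classification data sets.
The prequential score is computed by processing the data points one at a time: each point is predicted from the points that precede it, and the model's score accumulates over this sequence of predictions.
The same experimental protocol is used across data sets so that the criteria are compared fairly.
Performance is measured by classification accuracy and aggregated across data sets to assess generalization.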
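
The sequential evaluation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate structures are restricted to naive Bayes networks over a chosen subset of discrete features, parameters are smoothed with a symmetric Dirichlet prior (`alpha`), and the score is the cumulative log probability of the true class. The function name `prequential_log_score` and the toy data are hypothetical.

```python
import math
from collections import Counter


def prequential_log_score(samples, features, n_classes=2, n_values=2, alpha=1.0):
    """Sum over i of log P(c_i | u_i, samples before i) for a naive Bayes structure.

    samples  : list of (u, c) pairs; u is a tuple of discrete feature values, c a class index
    features : indices of the features the candidate structure links to the class
    alpha    : symmetric Dirichlet pseudo-count (an assumption of this sketch)
    """
    class_counts = Counter()   # class -> number of samples seen so far
    feat_counts = Counter()    # (feature index, value, class) -> count so far
    n_seen = 0
    score = 0.0
    for u, c in samples:
        # Joint predictive P(k, u | past) for every class k, from smoothed counts.
        joint = []
        for k in range(n_classes):
            p = (class_counts[k] + alpha) / (n_seen + alpha * n_classes)
            for j in features:
                p *= (feat_counts[(j, u[j], k)] + alpha) / (class_counts[k] + alpha * n_values)
            joint.append(p)
        # Focus the joint prediction on the class variable by normalizing.
        score += math.log(joint[c] / sum(joint))
        # Only after predicting is the sample revealed and the counts updated.
        class_counts[c] += 1
        n_seen += 1
        for j in features:
            feat_counts[(j, u[j], c)] += 1
    return score


# Toy data (hypothetical): feature 0 copies the class, feature 1 is unrelated to it.
toy = [((i % 2, (i // 2) % 2), i % 2) for i in range(40)]
candidates = {"class -> feature 0": [0], "class -> feature 1": [1]}
scores = {name: prequential_log_score(toy, feats) for name, feats in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Supervised model selection then amounts to choosing the candidate with the highest score. On this toy data the structure that uses the informative feature is selected, because every prediction is scored before the corresponding label is added to the counts.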

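Experimental Results

Research Questions

RQ1: Does the marginal likelihood score, widely used in unsupervised settings, perform well in supervised Bayesian network selection?
RQ2: How do cross-validation and the prequential score compare with the marginal likelihood in terms of predictive accuracy on classification tasks?
RQ3: Is there a substantial performance gap between unsupervised and supervised model selection criteria on real-world classification data sets?
RQ4: Does a supervised adaptation of the marginal likelihood score outperform the standard version in predictive performance?
RQ5: Does Dawid's prequential approach consistently outperform the other criteria in supervised Bayesian network learning?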

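Key Findings

Despite its wide use in unsupervised contexts, the marginal likelihood score performs poorly in supervised model selection.
Empirical cross-validation methods improve on the marginal likelihood but still fall short of the prequential approach.
The prequential approach achieves the best predictive accuracy on the benchmark data sets tested.
The approximation of the supervised marginal likelihood measure performs moderately well but remains worse than the prequential approach.
The results show a clear gap between modeling the joint distribution well and predicting well in classification tasks.
The study provides strong empirical evidence that supervised model selection should prioritize predictive performance over maximizing the joint likelihood.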

This review was generated by AI and checked by human editors.