QUICK REVIEW

[Paper Review] Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology

Alexei Botchkarev|arXiv (Cornell University)|Sep 9, 2018

Forecasting Techniques and Applications|References: 28|Cited by: 386

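ONE-SENTENCE SUMMARY

This paper surveys performance metrics (error measures) used in machine learning regression, forecasting, and prognostics, and proposes a typology to support metric selection, consisting of a four-category framework and the key components that shape primary metrics.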

ABSTRACT

Performance metrics (error measures) are vital components of the evaluation frameworks in various fields. The intention of this study was to overview a variety of performance metrics and approaches to their classification. The main goal of the study was to develop a typology that will help to improve our knowledge and understanding of metrics and facilitate their selection in machine learning regression, forecasting and prognostics. Based on the analysis of the structure of numerous performance metrics, we propose a framework of metrics which includes four (4) categories: primary metrics, extended metrics, composite metrics, and hybrid sets of metrics. The paper identified three (3) key components (dimensions) that determine the structure and properties of primary metrics: method of determining point distance, method of normalization, method of aggregation of point distances over a data set.

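MOTIVATION AND OBJECTIVES

Establish the need for a comprehensive overview of error measures in regression, forecasting, and prognostics.
Develop a typology that improves the understanding and selection of metrics in these fields.
Analyze the structure of performance metrics to identify the dimensions and properties that govern them.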

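PROPOSED METHOD

Analyze existing performance metrics drawn from the literature and from practice.
Identify structural patterns among the metrics and ways of classifying them.
Propose a framework with four categories of metrics: primary metrics, extended metrics, composite metrics, and hybrid sets of metrics.
Specify the three key dimensions that determine the properties of primary metrics: the method of determining point distance, the method of normalization, and the method of aggregating point distances over a data set.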
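
The three dimensions can be read as interchangeable building blocks. The sketch below is only an illustration of that reading, not code from the paper: the helper names (`build_metric`, `abs_error`, `by_actual`, and so on) are hypothetical, and only one normalization option is shown. It assembles four familiar metrics (MAE, RMSE, MAPE, MdAE) from a point distance, a normalization, and an aggregation.

```python
import math
from statistics import mean, median

# Dimension 1: point distance between an actual value a and a prediction p.
def abs_error(a, p):
    return abs(a - p)

def squared_error(a, p):
    return (a - p) ** 2

# Dimension 2: normalization of a single point distance d.
def no_norm(d, a):
    return d

def by_actual(d, a):
    # Assumes a != 0; percentage-style metrics are undefined otherwise.
    return d / abs(a)

# Dimension 3: aggregation over the data set is passed in as a function.
def build_metric(distance, normalize, aggregate):
    """Compose a metric from the three components (hypothetical helper)."""
    def metric(actual, predicted):
        points = [normalize(distance(a, p), a)
                  for a, p in zip(actual, predicted)]
        return aggregate(points)
    return metric

mae = build_metric(abs_error, no_norm, mean)
rmse = build_metric(squared_error, no_norm, lambda xs: math.sqrt(mean(xs)))
mape = build_metric(abs_error, by_actual, lambda xs: 100 * mean(xs))
mdae = build_metric(abs_error, no_norm, median)

# Made-up numbers, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

for name, metric in [("MAE", mae), ("RMSE", rmse),
                     ("MAPE", mape), ("MdAE", mdae)]:
    print(name, round(metric(actual, predicted), 4))
```

Swapping a single component changes the metric's behavior: replacing `mean` with `median` makes the result less sensitive to outliers, while dividing by the actual value makes it scale-free but undefined when an actual value is zero.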

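RESULTS

Research Questions

RQ1: What are the main categories and structures of performance metrics used in regression, forecasting, and prognostics?
RQ2: How can metrics be classified to improve their selection and understanding in machine learning evaluation?
RQ3: Which basic dimensions determine the properties of primary error metrics?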

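Key Findings

The paper proposes four categories of metrics: primary metrics, extended metrics, composite metrics, and hybrid sets of metrics.
Three dimensions determine the structure of a primary metric: how the point distance is computed, how distances are normalized, and how distances are aggregated over the data set.
The framework is intended to support better understanding, interpretation, and selection of metrics in machine learning evaluation.
The study offers an organizing view for comparing and contrasting error measures across fields.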
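
As a worked illustration of the three dimensions, three standard metrics can be written so that the choice made in each dimension is visible. The notation here is an assumption, not taken from the paper: $a_i$ is the actual value, $p_i$ the predicted value, and $n$ the number of points.

```latex
% MAE: absolute distance, no normalization, mean aggregation
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert a_i - p_i \rvert

% RMSE: squared distance, no normalization, root of the mean
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (a_i - p_i)^2}

% MAPE: absolute distance, normalized by the actual value, mean aggregation
\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left\lvert \frac{a_i - p_i}{a_i} \right\rvert
```

MAE and RMSE differ only in the point distance and aggregation, while MAPE differs from MAE only in the normalization step.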

This review was generated by AI and reviewed by a human editor.