QUICK REVIEW

[Paper Review] Numerical Association Rule Mining: A Systematic Literature Review

Minakshi Kaushik, Rahul Sharma|arXiv (Cornell University)|Jul 2, 2023
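Data Mining Algorithms and Applications | Cited by 7
One-Sentence Summary

In brief: This paper presents a systematic literature review (SLR) of numerical association rule mining (NARM) covering 1996 to 2022, identifying 68 final studies from 1,140 articles and introducing a novel discretization measure.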

ABSTRACT

Numerical association rule mining is a widely used variant of the association rule mining technique, and it has been extensively used in discovering patterns and relationships in numerical data. Initially, researchers and scientists integrated numerical attributes in association rule mining using various discretization approaches; however, over time, a plethora of alternative methods have emerged in this field. Unfortunately, the increase of alternative methods has resulted in a significant knowledge gap in understanding the diverse techniques employed in numerical association rule mining. This paper attempts to bridge this knowledge gap by conducting a comprehensive systematic literature review. We provide an in-depth study of diverse methods, algorithms, metrics, and datasets derived from 1,140 scholarly articles published from the inception of numerical association rule mining in the year 1996 to 2022. In compliance with the inclusion, exclusion, and quality evaluation criteria, 68 papers were chosen to be extensively evaluated. To the best of our knowledge, this systematic literature review is the first of its kind to provide an exhaustive analysis of the current literature and previous surveys on numerical association rule mining. The paper discusses important research issues, the current status, and future possibilities of numerical association rule mining. On the basis of this systematic review, the article also presents a novel discretization measure that contributes by providing a partitioning of numerical data that aligns well with human perception of partitions.

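Motivation and Objectives

  • Define a rigorous SLR protocol to map the NARM literature from 1996 to 2022.
  • Catalog the methods, algorithms, metrics, and datasets used in NARM.
  • Identify the strengths, limitations, and gaps of existing NARM research.
  • Propose a novel discretization measure aligned with human perception.
  • Provide NARM researchers with future research directions and practical guidance.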

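Proposed Method

  • Follow the Kitchenham and Charters guidelines for planning, conducting, and reporting an SLR.
  • Search multiple databases (ACM DL, Scopus, SpringerLink, IEEE Xplore, ScienceDirect) and Google Scholar for English-language articles (1996–2022).
  • Apply inclusion/exclusion criteria (I1–I5) to screen primary studies and perform quality assessment (QQ1–QQ5).
  • Extract data from each article (title, authors, source, year, type, datasets, objectives, metrics).
  • Synthesize the results to classify NARM methods and identify a novel discretization measure.
  • Report the results and discuss threats to validity and future perspectives.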
Figure 1. Metrics Used to Evaluate NARM Algorithms.

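Results

Research Questions

  • RQ1: What methods exist for solving the NARM problem?
  • RQ2: Which algorithms are available for each existing NARM method?
  • RQ3: What are the advantages and limitations of the existing NARM methods?
  • RQ4: Which objectives do existing multi-objective optimization NARM algorithms consider?
  • RQ5: Which metrics are used to evaluate NARM algorithms?
  • RQ6: Which datasets are used in experiments with NARM methods?
  • RQ7: What are the potential future perspectives in the NARM field?
  • RQ8: How can numerical attributes in NARM be discretized automatically in a useful (natural) way?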
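
To make the evaluation question (RQ5) concrete, here is a minimal sketch of how a rule over numerical intervals can be scored. It assumes the two classic association rule measures, support and confidence; this summary does not list which metrics the review actually catalogs. The records, attribute names, and interval bounds are invented for illustration and do not come from the paper.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Condition = Callable[[Record], bool]

# Hypothetical toy data set (not from the paper).
RECORDS: List[Record] = [
    {"age": 23, "income": 28},
    {"age": 31, "income": 45},
    {"age": 35, "income": 52},
    {"age": 38, "income": 61},
    {"age": 44, "income": 58},
    {"age": 52, "income": 40},
]


def in_interval(attr: str, low: float, high: float) -> Condition:
    """Build a condition that holds when low <= record[attr] <= high."""
    return lambda record: low <= record[attr] <= high


def count_matches(records: List[Record], *conditions: Condition) -> int:
    """Count the records that satisfy every given condition."""
    return sum(1 for r in records if all(c(r) for c in conditions))


def support(records: List[Record], *conditions: Condition) -> float:
    """Fraction of all records that satisfy every condition."""
    if not records:
        return 0.0
    return count_matches(records, *conditions) / len(records)


def confidence(records: List[Record], antecedent: Condition,
               consequent: Condition) -> float:
    """Among records matching the antecedent, the fraction that also match the consequent."""
    covered = count_matches(records, antecedent)
    if covered == 0:
        return 0.0
    return count_matches(records, antecedent, consequent) / covered


# Rule: age in [30, 45]  =>  income in [50, 65]
antecedent = in_interval("age", 30, 45)
consequent = in_interval("income", 50, 65)

rule_support = support(RECORDS, antecedent, consequent)        # 3 of 6 records
rule_confidence = confidence(RECORDS, antecedent, consequent)  # 3 of the 4 covered records
print(rule_support, rule_confidence)  # 0.5 0.75
```

The interval bounds are the part a NARM method has to choose; shifting them changes both scores, which is why the choice of discretization matters for every downstream metric.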

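Key Findings

  • Four main NARM approaches are identified: discretization, clustering, fuzzy, and hybrid approaches.
  • Optimization methods (bio-inspired and physics-based) are prominent, with widespread use of evolutionary and swarm intelligence techniques.
  • Of the 1,140 articles screened (1996–2022), 68 met the quality criteria and entered the final review.
  • The study proposes an automated discretization measure intended to align partitions with human perception of numerical data.
  • Previous surveys were limited in scope; this SLR provides a comprehensive, systematic review of the NARM literature and its gaps.
  • The article outlines future research directions and potential improvements to NARM techniques.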
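
As a rough illustration of the discretization approach named in the findings, the sketch below partitions one numerical attribute with two standard baselines, equal-width and equal-frequency binning. These are textbook techniques chosen here for illustration; they are not the paper's proposed measure, which this summary does not specify. The sample values and function names are hypothetical.

```python
from typing import List


def equal_width_edges(values: List[float], k: int) -> List[float]:
    """Return k + 1 edges that split [min, max] into k intervals of equal width."""
    if k < 1 or not values:
        raise ValueError("need at least one value and k >= 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]


def assign_bin(value: float, edges: List[float]) -> int:
    """Index of the interval containing value; the last interval is closed on the right."""
    k = len(edges) - 1
    for i in range(k):
        if value < edges[i + 1]:
            return i
    return k - 1


def equal_frequency_bins(values: List[float], k: int) -> List[List[float]]:
    """Split the sorted values into k groups of (nearly) equal size."""
    if k < 1 or not values:
        raise ValueError("need at least one value and k >= 1")
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


# Hypothetical ages with two visible clusters and one outlier.
ages = [22, 25, 27, 30, 41, 43, 45, 70]

edges = equal_width_edges(ages, 3)
labels = [assign_bin(a, edges) for a in ages]
freq_bins = equal_frequency_bins(ages, 3)

print(edges)      # [22.0, 38.0, 54.0, 70.0]
print(labels)     # [0, 0, 0, 0, 1, 1, 1, 2]
print(freq_bins)  # [[22, 25], [27, 30, 41], [43, 45, 70]]
```

On this toy input, equal-width binning happens to keep the two clusters intact and isolates the outlier, while equal-frequency binning balances group sizes but places 30 with 41 and 45 with 70. Mismatches of this kind are plausibly what a perception-aligned measure aims to avoid, though the paper's own criterion may differ.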
Figure 2. Distribution of Metrics Used in NARM Methods.


This review was generated by AI and checked by a human editor.