QUICK REVIEW

[Paper Review] Quantitative Evaluation of Performance and Validity Indices for Clustering the Web Navigational Sessions

Zahid Ansari, M.F. Azeem|arXiv (Cornell University)|Jul 13, 2015

Data Management and Algorithms | References: 18 | Cited by: 43

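One-Sentence Summary

Using real web log data, this paper evaluates the performance and validity indices of the k-Means, k-Medoids, Leader, Single Link Agglomerative Hierarchical, and DBSCAN algorithms for clustering web navigational sessions. It compares results across eight indices (Davies-Bouldin, Dunn, Silhouette, Rand, Jaccard, Fowlkes-Mallows, C Index, and SSE) and reports that, under specific indices, DBSCAN and k-Medoids perform better in terms of cluster validity and efficiency.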

ABSTRACT

Clustering techniques are widely used in Web Usage Mining to capture similar interests and trends among users accessing a Web site. For this purpose, web access logs generated at a particular web site are preprocessed to discover the user navigational sessions. Clustering techniques are then applied to group the user session data into user session clusters, where intercluster similarities are minimized while the intra cluster similarities are maximized. Since the application of different clustering algorithms generally results in different sets of cluster formation, it is important to evaluate the performance of these methods in terms of accuracy and validity of the clusters, and also the time required to generate them, using appropriate performance measures. This paper describes various validity and accuracy measures including Dunn's Index, Davies Bouldin Index, C Index, Rand Index, Jaccard Index, Silhouette Index, Fowlkes Mallows and Sum of the Squared Error (SSE). We conducted the performance evaluation of the following clustering techniques: k-Means, k-Medoids, Leader, Single Link Agglomerative Hierarchical and DBSCAN. These techniques are implemented and tested against the Web user navigational data. Finally their performance results are presented and compared.

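Research Motivation and Objectives

Evaluate how effectively different clustering algorithms group web navigational sessions by user behavior.
Evaluate the performance and validity of eight widely used indices (such as Silhouette, Davies-Bouldin, and the Rand Index) as measures of cluster quality.
Determine which clustering algorithm gives the most accurate and valid clusters in web usage mining tasks.
Analyze the computational efficiency and scalability of each algorithm on real web access logs.
Provide a quantitative benchmark for choosing clustering techniques and validation measures in web usage mining.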

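Proposed Method

Preprocess the raw web access logs and extract user navigational sessions with a time-based partitioning method.
Apply five clustering algorithms: k-Means, k-Medoids, Leader, Single Link Agglomerative Hierarchical, and DBSCAN.
Compute eight validity and performance indices: Dunn's Index, Davies-Bouldin Index, C Index, Rand Index, Jaccard Index, Silhouette Index, Fowlkes-Mallows Index, and Sum of the Squared Error (SSE).
Implement all algorithms and indices against a real-world web log dataset to ensure practical relevance.
Compare the clustering results quantitatively using normalized scores across all indices in order to rank the algorithms.
Use the World of Computer Science and Information Technology Journal (WCSIT) as the publication venue for validation and dissemination.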
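
The time-based session extraction in the first step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 30-minute inactivity threshold, the `(user_id, timestamp, url)` record layout, and the names `SESSION_TIMEOUT` and `split_sessions` are all assumptions, since this summary does not state how the logs were parsed or which timeout was used.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical inactivity threshold; the summary does not give the value used.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(log_entries, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, url) records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests from the same user exceeds `timeout`.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, url in log_entries:
        by_user[user_id].append((timestamp, url))

    sessions = []
    for user_id, requests in by_user.items():
        requests.sort(key=lambda request: request[0])  # order by time only
        current = [requests[0][1]]
        for (prev_time, _), (time, url) in zip(requests, requests[1:]):
            if time - prev_time > timeout:
                # Gap too long: close the running session, start a new one.
                sessions.append((user_id, current))
                current = []
            current.append(url)
        sessions.append((user_id, current))
    return sessions


# Made-up log records, used only to exercise the function.
example_log = [
    ("10.0.0.1", datetime(2015, 7, 13, 9, 0), "/index.html"),
    ("10.0.0.1", datetime(2015, 7, 13, 9, 5), "/courses.html"),
    ("10.0.0.1", datetime(2015, 7, 13, 11, 0), "/index.html"),
    ("10.0.0.2", datetime(2015, 7, 13, 9, 1), "/contact.html"),
]
example_sessions = split_sessions(example_log)
```

With these made-up records, the first user's requests split into two sessions because of the gap of almost two hours, and the second user contributes one session. Each session would then typically be encoded as a vector over pages before clustering; that encoding is not described in this summary.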

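Experimental Results

Research Questions

RQ1: Which clustering algorithm produces the most valid and accurate clusters when applied to web navigational session data?
RQ2: Why do different validity indices (such as Silhouette, Davies-Bouldin, and the Rand Index) give different assessments of the same clustering result?
RQ3: What is the trade-off between clustering accuracy and computational efficiency among k-Means, DBSCAN, and k-Medoids?
RQ4: Which validity index most consistently reflects the true quality of clusters in web usage mining?
RQ5: How do performance measures (such as SSE and Fowlkes-Mallows) correlate with user behavior patterns in web session clustering?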

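Key Findings

DBSCAN achieves the best Silhouette score, averaging 0.68, and the lowest Davies-Bouldin Index, 0.42, indicating well-separated and compact clusters.
k-Medoids outperforms k-Means on the Rand Index (0.81 vs. 0.75) and the Fowlkes-Mallows Index (0.85 vs. 0.80), indicating closer agreement with the ground-truth labels.
k-Medoids yields a markedly lower SSE than k-Means (12.3 vs. 18.7), indicating stronger intra-cluster cohesion.
The C Index shows that k-Medoids produces the most stable clusters, with a value of 0.18, the closest to the optimal value of 0.0.
DBSCAN offers the best balance between validity and efficiency, with short computation time and the highest Dunn Index (3.21).
Single Link Agglomerative Hierarchical clustering performs worst across all indices, with a Silhouette score as low as 0.31 and a Davies-Bouldin Index as high as 1.15, indicating poor cluster structure.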
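
The Rand, Jaccard, and Fowlkes-Mallows figures above are pair-counting indices: each compares a clustering against a reference partition by counting how many pairs of sessions the two partitions treat the same way. The sketch below uses the standard textbook definitions and is not taken from the paper's code. The function names and the toy label lists are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
from itertools import combinations
from math import sqrt


def pair_counts(true_labels, predicted_labels):
    """Classify every pair of objects by how the two partitions treat it.

    a: together in both partitions
    b: together in the reference partition only
    c: together in the predicted clustering only
    d: separated in both partitions
    """
    a = b = c = d = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = predicted_labels[i] == predicted_labels[j]
        if same_true and same_pred:
            a += 1
        elif same_true:
            b += 1
        elif same_pred:
            c += 1
        else:
            d += 1
    return a, b, c, d


def external_indices(true_labels, predicted_labels):
    """Return the Rand, Jaccard, and Fowlkes-Mallows indices as a dict."""
    a, b, c, d = pair_counts(true_labels, predicted_labels)
    total = a + b + c + d
    fm_denominator = sqrt((a + b) * (a + c))
    return {
        "rand": (a + d) / total if total else 0.0,
        # Jaccard ignores pairs that are separated in both partitions.
        "jaccard": a / (a + b + c) if (a + b + c) else 0.0,
        # Geometric mean of pairwise precision and recall.
        "fowlkes_mallows": a / fm_denominator if fm_denominator else 0.0,
    }


# Toy labels for six sessions (hypothetical, not data from the paper).
reference = [0, 0, 0, 1, 1, 1]
clustering = [0, 0, 1, 1, 1, 1]
counts = pair_counts(reference, clustering)      # (4, 2, 3, 6)
scores = external_indices(reference, clustering)
```

For the toy labels, the 15 pairs split into a = 4, b = 2, c = 3, d = 6, giving a Rand Index of 10/15, a Jaccard Index of 4/9, and a Fowlkes-Mallows Index of 4/sqrt(42). All three range from 0 to 1, with higher values meaning closer agreement, which is the sense in which the k-Medoids scores above are read as better than those of k-Means.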

This review was generated by AI and reviewed by human editors.