QUICK REVIEW

[Paper Review] In the Service of Online Order: Tackling Cyber-Bullying with Machine Learning and Affect Analysis

Michał Ptaszyński, Paweł Dybała|arXiv (Cornell University)|Mar 4, 2022

Topic: Bullying, Victimization, and Aggression | References: 15 | Cited by: 50

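ONE-SENTENCE SUMMARY

This paper develops an automatic system that uses affect analysis and an SVM to detect cyber-bullying entries on unofficial Japanese school websites, reaching a balanced F-score of 88.2%.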

ABSTRACT

One of the burning problems lately in Japan has been cyber-bullying, or slandering and bullying people online. The problem has been especially noticed on unofficial Web sites of Japanese schools. Volunteers consisting of school personnel and PTA (Parent-Teacher Association) members have started Online Patrol to spot malicious content within Web forums and blogs. In practice, Online Patrol means reading through the whole Web content, which is a task difficult to perform manually. In this paper we introduce research intended to help PTA members perform Online Patrol more efficiently. We aim to develop a set of tools that can automatically detect malicious entries and report them to PTA members. First, we collected cyber-bullying data from unofficial school Web sites. Then we analysed this data in two ways. Firstly, we analysed the entries with a multifaceted affect analysis system in order to find distinctive features of cyber-bullying and apply them to a machine learning classifier. Secondly, we applied an SVM-based machine learning method to train a classifier for detection of cyber-bullying. The system was able to classify cyber-bullying entries with a balanced F-score of 88.2%.

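RESEARCH MOTIVATION AND GOALS

Develop tools that help PTA members carry out Online Patrol more efficiently.
Collect cyber-bullying data from unofficial school websites for analysis.
Explore affect features that distinguish cyber-bullying from benign content.
Evaluate a machine learning classifier for cyber-bullying detection.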

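PROPOSED METHOD

Assemble cyber-bullying data from unofficial school websites.
Apply a multifaceted affect analysis system to extract distinctive features.
Train a support vector machine (SVM) classifier on the affect features.
Evaluate classification performance with the balanced F-score.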
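
The steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two word lists, the `affect_features` function, the toy English entries, and all hyperparameters are hypothetical and do not come from the paper, whose affect analysis system works on Japanese text and is far richer. The classifier here is a plain linear SVM trained by stochastic subgradient descent on the hinge loss, which may differ from the SVM variant the authors used.

```python
import random

# Hypothetical toy lexicons (assumption, not taken from the paper).
NEGATIVE_WORDS = {"stupid", "ugly", "die", "hate", "loser"}
POSITIVE_WORDS = {"thanks", "great", "nice", "fun", "congrats"}


def affect_features(entry):
    """Map a text entry to [share of negative words, share of positive words]."""
    words = entry.lower().split()
    n = max(len(words), 1)
    neg = sum(w in NEGATIVE_WORDS for w in words) / n
    pos = sum(w in POSITIVE_WORDS for w in words) / n
    return [neg, pos]


def train_linear_svm(xs, ys, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Train a linear SVM by stochastic subgradient descent on the hinge loss.

    Labels in ys must be +1 (cyber-bullying) or -1 (benign).
    """
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, xs[i])) + b
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if ys[i] * score < 1:  # inside the margin or misclassified
                w = [wj + lr * ys[i] * xj for wj, xj in zip(w, xs[i])]
                b += lr * ys[i]
    return w, b


def predict(w, b, x):
    """Return +1 (cyber-bullying) or -1 (benign) for a feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy training data, invented for illustration only.
train = [
    ("you are a stupid ugly loser", 1),
    ("i hate you just die", 1),
    ("thanks for the great game", -1),
    ("nice work congrats", -1),
]
xs = [affect_features(text) for text, _ in train]
ys = [label for _, label in train]
w, b = train_linear_svm(xs, ys)
```

With the trained weights, `predict(w, b, affect_features("you stupid loser"))` flags the entry as cyber-bullying on this toy data. An entry containing no lexicon words maps to a zero vector, so its label depends only on the bias term; a real system would need much broader features.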

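EXPERIMENTAL RESULTS

Research Questions

RQ1: Can affect features distinguish cyber-bullying entries from other online content?
RQ2: How effective is an SVM classifier based on affect features at detecting cyber-bullying in school-related forums?
RQ3: What classification performance (balanced F-score) does the proposed system achieve?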

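Key Findings

The system uses multifaceted affect analysis to derive features for cyber-bullying detection.
An SVM classifier trained on these features detects cyber-bullying with a balanced F-score of 88.2%.
The approach is intended to support Online Patrol by automating the discovery of harmful content.
The data analysed was collected from unofficial school websites.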
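
For readers unfamiliar with the metric, the balanced F-score (F1) is the harmonic mean of precision and recall with equal weights. The sketch below shows the arithmetic; the function name `balanced_f_score` and the counts are hypothetical, chosen only for illustration, and are not the paper's confusion matrix.

```python
def balanced_f_score(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (assumption): 45 bullying entries correctly flagged,
# 5 benign entries wrongly flagged, 15 bullying entries missed.
precision, recall, f1 = balanced_f_score(tp=45, fp=5, fn=15)
```

Because the harmonic mean is pulled toward the lower of the two values, a high balanced F-score requires both precision and recall to be high.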


This review was generated by AI and checked by a human editor.