QUICK REVIEW

[Paper Review] SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

Marcos Zampieri, Shervin Malmasi | arXiv (Cornell University) | Mar 19, 2019

Hate Speech and Cyberbullying Detection | References: 73 | Cited by: 37

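One-Sentence Summary

This paper describes the OffensEval shared task (SemEval-2019 Task 6), which uses the OLID dataset to identify offensive English tweets, classify the type of offense, and identify its target; BERT-based and ensemble systems achieved the top results.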

ABSTRACT

We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. In sub-task B, the focus was on the type of offensive content in the post. Finally, in sub-task C, systems had to detect the target of the offensive posts. OffensEval attracted a large number of participants and it was one of the most popular tasks in SemEval-2019. In total, about 800 teams signed up to participate in the task, and 115 of them submitted results, which we present and analyze in this report.

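Motivation and Objectives

Advance the automatic detection of offensive language to reduce the burden of manual moderation.
Introduce OLID, a dataset with a three-level hierarchical annotation scheme that captures whether a post is offensive, the type of offense, and its target.
Define three sub-tasks (A: offensive vs. not offensive; B: offense type; C: offense target) so that each phenomenon can be studied separately.
Provide baselines and competitive results to establish a benchmark for offensive language identification in English tweets.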
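
The three-level scheme described above can be sketched as a small data structure. This is a minimal illustration, not the organizers' code: the tag names (OFF/NOT, TIN/UNT, IND/GRP/OTH) are the ones used in the OLID release and are not listed in this summary, and the class name `OlidLabel` is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Tag sets as used in the OLID release (not spelled out in this summary).
LEVEL_A = {"OFF", "NOT"}         # offensive / not offensive
LEVEL_B = {"TIN", "UNT"}         # targeted insult or threat / untargeted
LEVEL_C = {"IND", "GRP", "OTH"}  # individual / group / other


@dataclass(frozen=True)
class OlidLabel:
    """Hypothetical container for one tweet's hierarchical annotation."""
    a: str
    b: Optional[str] = None
    c: Optional[str] = None

    def __post_init__(self):
        # Level A is always present.
        if self.a not in LEVEL_A:
            raise ValueError(f"unknown level A tag: {self.a}")
        # Levels B and C only exist for offensive posts.
        if self.a == "NOT" and (self.b is not None or self.c is not None):
            raise ValueError("NOT posts carry no level B or C tag")
        if self.a == "OFF" and self.b not in LEVEL_B:
            raise ValueError("OFF posts need a level B tag")
        # Level C only exists for targeted offenses.
        if self.b == "UNT" and self.c is not None:
            raise ValueError("UNT posts carry no level C tag")
        if self.b == "TIN" and self.c not in LEVEL_C:
            raise ValueError("TIN posts need a level C tag")
```

Each level is only defined when the level above it takes a specific value, which is why sub-tasks B and C are evaluated on progressively smaller subsets of the data.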

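Proposed Method

Use the OLID dataset with its three-level hierarchical annotation scheme.
Evaluate all three sub-tasks with macro-averaged F1 as the official metric, chosen because of class imbalance.
Survey a wide range of participant models, from traditional machine learning (SVM) to deep learning (CNN, RNN, BiLSTM, transformers) and ensembles.
Incorporate external datasets and pretrained embeddings (FastText, GloVe, Twitter embeddings), and apply tweet-specific preprocessing (hashtags, tokens, emojis).
Report results and top systems, highlighting the prevalence of BERT-based models in sub-task A and of ensembles in sub-tasks B and C.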
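
To see why macro-averaged F1 suits imbalanced classes, the sketch below computes it from scratch and applies it to a made-up example. The function name `macro_f1` and the toy labels are illustrative assumptions, not part of the task's official scorer.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: 8 non-offensive and 2 offensive posts, and a classifier
# that always predicts the majority class.
gold = ["NOT"] * 8 + ["OFF"] * 2
pred = ["NOT"] * 10
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
score = macro_f1(gold, pred)
```

Here the majority-class predictor reaches 80% accuracy, but its F1 on the minority class is zero, so the macro average is pulled well below the accuracy. Every class counts equally regardless of how many examples it has.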

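Experimental Results

Research Questions

RQ1: Can the hierarchical annotation scheme effectively capture the presence, type, and target of offense in social media text?
RQ2: On OLID, which modeling approaches (e.g., BERT, ensembles) are most effective for each sub-task?
RQ3: How much does model performance differ across offensive vs. non-offensive identification, offense type classification, and offense target identification?
RQ4: To what extent do external data and preprocessing techniques improve performance on OffensEval?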

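Key Findings

About 800 teams signed up, and 115 of them submitted results across the sub-tasks.
Sub-task A (offensive language identification): the top score was 82.9% macro-F1 (NULI, using BERT-base-uncased).
Sub-task B: ensembles and BERT-based systems performed strongly, but the winning system reached 75.5% macro-F1 with a rule-based approach using keywords (jhan014).
Sub-task C: the top score was 66.0% macro-F1 (vradivchev_anikolov, using a BERT-based approach).
Deep learning and ensembles dominated, although traditional machine learning was also represented; pretrained embeddings and tweet-specific preprocessing were widely used.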
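
As a rough illustration of the tweet-specific preprocessing mentioned above, the sketch below normalizes mentions and links and splits camel-case hashtags using only the standard library. It is not any participant's pipeline: the `@USER` and `URL` placeholders and the function name `normalize_tweet` are assumptions of this sketch, and emoji handling is left out because it usually relies on a third-party library.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
# Zero-width position between a lowercase letter and an uppercase letter.
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")


def _split_hashtag(match):
    # "#MakeItStop" -> "Make It Stop" (drops the '#', splits camel case)
    return CAMEL_RE.sub(" ", match.group(1))


def normalize_tweet(text):
    """Replace links and mentions with placeholders and expand hashtags."""
    text = URL_RE.sub("URL", text)          # links first, so '@' or '#' inside them is gone
    text = MENTION_RE.sub("@USER", text)    # anonymize user mentions
    text = HASHTAG_RE.sub(_split_hashtag, text)
    return " ".join(text.split())           # collapse repeated whitespace


cleaned = normalize_tweet("@john_doe check https://t.co/abc #MakeItStop  now")
```

Splitting hashtags into words lets pretrained embeddings and subword tokenizers see vocabulary they already know, which is one plausible reason this step was common among submissions.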

This review was generated by AI and checked by a human editor.