QUICK REVIEW

[Paper Review] Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study

Maral Dadvar, Kai Eckert|arXiv (Cornell University)|Dec 19, 2018

Hate Speech and Cyberbullying Detection | References: 17 | Citations: 70

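One-sentence summary

This paper reproduces deep learning based cyberbullying detection results on Wikipedia, Twitter, and Formspring, extends the evaluation to a new YouTube dataset (~54k posts by ~4k users), and examines how well the models transfer across platforms.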

ABSTRACT

Cyberbullying is a disturbing online misbehaviour with troubling consequences. It appears in different forms, and in most of the social networks, it is in textual format. Automatic detection of such incidents requires intelligent systems. Most of the existing studies have approached this problem with conventional machine learning models and the majority of the developed models in these studies are adaptable to a single social network at a time. In recent studies, deep learning based models have found their way in the detection of cyberbullying incidents, claiming that they can overcome the limitations of the conventional models, and improve the detection performance. In this paper, we investigate the findings of a recent literature in this regard. We successfully reproduced the findings of this literature and validated their findings using the same datasets, namely Wikipedia, Twitter, and Formspring, used by the authors. Then we expanded our work by applying the developed methods on a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models in new social media platforms. We also transferred and evaluated the performance of the models trained on one platform to another platform. Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset. We believe that the deep learning based models can also benefit from integrating other sources of information and looking into the impact of profile information of the users in social networks.

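Research motivation and objectives

Reproduce the findings of recent literature on deep learning based cyberbullying detection, using the same Wikipedia, Twitter, and Formspring datasets.
Extend the evaluation to a new social platform (YouTube) and assess performance against existing approaches.
Explore cross-platform transfer: how well a model trained on one platform performs on another.
Suggest the potential benefit of incorporating additional information, such as user profile data, to improve detection performance.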

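Proposed method

Reproduce the deep learning based cyberbullying detection methods reported in the earlier literature, using the same Wikipedia, Twitter, and Formspring datasets.
Apply the reproduced methods to a new YouTube dataset (~54,000 posts by ~4,000 users) and evaluate their performance.
Compare the deep learning models against traditional machine learning baselines on the YouTube dataset.
Transfer models trained on one platform to another platform to investigate cross-platform transfer.
Discuss how integrating additional information, such as user profile data, could affect model performance.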
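
The cross-platform transfer step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the keyword-based `train_keyword_model` is a toy stand-in for a real classifier, and the platform names, example posts, and helper names (`f1_score`, `transfer_matrix`) are hypothetical. The sketch trains one model per platform and scores it with F1 on every other platform.

```python
from collections import Counter
from itertools import product


def f1_score(y_true, y_pred):
    """F1 for the positive (bullying = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_keyword_model(posts, labels):
    """Toy stand-in classifier (assumption, not a deep learning model).

    Flags words that occur in more bullying posts than non-bullying posts,
    then predicts 1 for any text containing a flagged word.
    """
    pos, neg = Counter(), Counter()
    for text, label in zip(posts, labels):
        (pos if label == 1 else neg).update(set(text.lower().split()))
    flagged = {word for word, count in pos.items() if count > neg[word]}
    return lambda text: int(any(w in flagged for w in text.lower().split()))


def transfer_matrix(datasets, train_fn=train_keyword_model):
    """Train on each source platform and evaluate on every other platform."""
    models = {name: train_fn(posts, labels)
              for name, (posts, labels) in datasets.items()}
    scores = {}
    for source, target in product(datasets, repeat=2):
        if source == target:
            continue  # only cross-platform pairs are of interest here
        posts, labels = datasets[target]
        predictions = [models[source](post) for post in posts]
        scores[(source, target)] = f1_score(labels, predictions)
    return scores


# Hypothetical toy data: (posts, labels) per platform, 1 = bullying.
toy = {
    "twitter": (["you are stupid", "nice photo", "stupid idiot", "see you soon"],
                [1, 0, 1, 0]),
    "youtube": (["what an idiot", "great video", "you loser", "thanks for sharing"],
                [1, 0, 1, 0]),
}

scores = transfer_matrix(toy)
for (source, target), score in sorted(scores.items()):
    print(f"train on {source}, test on {target}: F1 = {score:.3f}")
```

Passing a different `train_fn`, for example one that fits a neural model, leaves the evaluation loop unchanged, so the same source-to-target grid can be reused for any classifier.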

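Experimental results

Research questions

RQ1: On the YouTube dataset, do deep learning based models outperform traditional machine learning models, as they did on the earlier datasets?
RQ2: Can the findings on Wikipedia, Twitter, and Formspring be reproduced using the same datasets?
RQ3: How well do models trained on one social platform transfer to other platforms?
RQ4: How does including user profile information affect cyberbullying detection performance?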

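Key findings

When applied to the new YouTube dataset, the DL based models outperformed the traditional machine learning models previously applied to that dataset.
The authors successfully reproduced the literature's findings on Wikipedia, Twitter, and Formspring using the same datasets.
Models trained on one platform were also transferred to another platform and evaluated there.
Integrating additional information, such as user profile information, may benefit model performance.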


This review was generated by AI and checked by a human editor.