QUICK REVIEW

[Paper Review] Detecting Hate Speech and Offensive Language on Twitter using Machine Learning: An N-gram and TFIDF based Approach

Aditya Gaydhani, Vikrant Doma|arXiv (Cornell University)|Sep 23, 2018

Hate Speech and Cyberbullying Detection|References: 6|Citations: 95

One-line summary

This paper builds a three-class classifier (hateful, offensive, clean) for Twitter data, compares Logistic Regression, Naive Bayes, and SVM on n-gram TFIDF features, and reports that Logistic Regression achieves the highest test accuracy, 95.6%.

ABSTRACT

Toxic online content has become a major issue in today's world due to an exponential increase in the use of internet by people of different cultures and educational background. Differentiating hate speech and offensive language is a key challenge in automatic detection of toxic text content. In this paper, we propose an approach to automatically classify tweets on Twitter into three classes: hateful, offensive and clean. Using Twitter dataset, we perform experiments considering n-grams as features and passing their term frequency-inverse document frequency (TFIDF) values to multiple machine learning models. We perform comparative analysis of the models considering several values of n in n-grams and TFIDF normalization methods. After tuning the model giving the best results, we achieve 95.6% accuracy upon evaluating it on test data. We also create a module which serves as an intermediate between user and Twitter.

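Motivation and objectives

Advance automated detection of harmful language on Twitter by distinguishing hate speech from offensive language and from benign content.
Develop a pipeline that combines data from multiple public datasets and the Twitter API to train and evaluate classifiers.
Combine TFIDF normalization schemes and n-gram ranges with several classifiers to identify effective feature-classifier combinations.
Tune hyperparameters to maximize cross-validation performance, then report the final test results.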

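Proposed method

Extract unigram-to-trigram n-gram features from tweets and weight them with TFIDF.
Evaluate L1 and L2 TFIDF normalization with three classifiers (Naive Bayes, Logistic Regression, SVM).
Compare the models across feature parameter settings using 10-fold cross-validation.
Tune the Naive Bayes smoothing parameter alpha, and the Logistic Regression regularization parameter C and solver.
Select the best model based on cross-validation, report its test performance, and analyze misclassifications.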
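
The feature-extraction step (word n-grams of length 1 to 3, TFIDF weighting, then L1 or L2 normalization) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the whitespace tokenizer, the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1`, the function names `word_ngrams` and `tfidf_vectors`, and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=3):
    """Return word n-grams of length n_min..n_max from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, norm="l2"):
    """Map each document to a {ngram: weight} dict of normalized TFIDF values."""
    counts = [Counter(word_ngrams(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())
    vectors = []
    for c in counts:
        # Assumed IDF variant (smoothed): ln((1 + N) / (1 + df)) + 1.
        vec = {
            g: tf * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
            for g, tf in c.items()
        }
        if norm == "l1":
            scale = sum(abs(v) for v in vec.values())
        elif norm == "l2":
            scale = math.sqrt(sum(v * v for v in vec.values()))
        else:
            raise ValueError("norm must be 'l1' or 'l2'")
        if scale > 0:
            vec = {g: v / scale for g, v in vec.items()}
        vectors.append(vec)
    return vectors


# Toy, made-up texts; not taken from the paper's dataset.
docs = ["you are great", "you are not great at all"]
l2 = tfidf_vectors(docs, norm="l2")
l1 = tfidf_vectors(docs, norm="l1")
```

With L2 normalization each document vector has unit Euclidean length, while with L1 the weights sum to 1. In both cases an n-gram that appears in fewer documents (here `not`) receives a larger weight than one that appears in every document (here `you`). The resulting sparse vectors are what a classifier such as Logistic Regression or Naive Bayes would consume.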

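Experimental results

Research questions

RQ1: Can a three-class classifier using TFIDF-weighted n-gram features reliably distinguish hateful, offensive, and clean tweets?
RQ2: Which classifier (NB, LR, SVM) and which feature settings yield the highest cross-validation and test performance?
RQ3: How do TFIDF normalization and n-gram range affect detection performance for hate speech and offensive language?
RQ4: What are the common misclassification patterns, and what are potential ways to improve recall on the offensive class and precision on the hateful class?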

Key findings

Model	N-gram Range + TFIDF Norm	Cross-Validation Accuracy	Test/Final Accuracy
Naive Bayes	1-3 + L2	0.934	0.934
Logistic Regression	1-3 + L2	0.951	0.956
Support Vector Machines	1-3 + L2	0.901	-

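Logistic Regression with 1-3 n-grams and L2 TFIDF normalization achieves the highest cross-validation performance of the three models and outperforms the others after tuning.
Naive Bayes with alpha = 0.1 and L2 TFIDF reaches a cross-validation accuracy of 0.934, improving on its initial result and coming close to LR.
The final Logistic Regression model, evaluated on the test data with n-gram range 1-3 and L2 TFIDF normalization (C = 100, liblinear solver), reaches an accuracy of 0.956.
On the test set, per-class precision and recall for hateful, offensive, and clean mostly fall in the range 0.94–0.98, the exception being offensive recall at 0.93; the confusion matrix shows that 4.8% of offensive tweets are misclassified as hateful.
The error analysis suggests improving recall on the offensive class, reducing misclassification as hateful, and incorporating linguistic features.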
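
The per-class precision, recall, and misclassification rates discussed above are all derived from a confusion matrix. The sketch below shows that arithmetic with the standard library only. The counts in `toy` are hypothetical and chosen for readability; they are not the paper's confusion matrix, and the helper names `per_class_metrics` and `accuracy` are illustrative.

```python
def per_class_metrics(confusion, labels):
    """Compute (precision, recall) per class from a square confusion matrix.

    confusion[i][j] is the number of items with true class i predicted as class j.
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_pos = confusion[k][k]
        predicted_k = sum(row[k] for row in confusion)  # column sum
        actual_k = sum(confusion[k])                    # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        metrics[label] = (precision, recall)
    return metrics


def accuracy(confusion):
    """Fraction of all items that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


labels = ["hateful", "offensive", "clean"]
# Hypothetical counts for illustration only; NOT the paper's results.
toy = [
    [80, 15, 5],   # true hateful
    [10, 85, 5],   # true offensive
    [2, 8, 90],    # true clean
]
metrics = per_class_metrics(toy, labels)
# Share of truly offensive items that were predicted as hateful.
offensive_as_hateful = toy[1][0] / sum(toy[1])
```

In this toy matrix, 10 of the 100 offensive items are predicted as hateful. That lowers offensive recall (row-wise) and hateful precision (column-wise) at the same time, which is why the two quantities are paired in RQ4.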

This review was generated by AI and checked by a human editor.