QUICK REVIEW

[Paper Review] Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

João Leite, Diego Furtado Silva | arXiv (Cornell University) | Oct 9, 2020

Hate Speech and Cyberbullying Detection | 10 references | 43 citations

TL;DR

Introduces ToLD-Br, a large Brazilian Portuguese toxic language Twitter dataset with demographic-aware annotations, and analyzes monolingual vs multilingual BERT models for binary and multi-label toxic comment classification.

ABSTRACT

Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority in these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work in automatically detecting toxic comments focus mainly in English, with very few work in languages like Brazilian Portuguese. In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated as either toxic or non-toxic or in different types of toxicity. We present our dataset collection and annotation process, where we aimed to select candidates covering multiple demographic groups. State-of-the-art BERT models were able to achieve 76% macro-F1 score using monolingual data in the binary case. We also show that large-scale monolingual data is still needed to create more accurate models, despite recent advances in multilingual approaches. An error analysis and experiments with multi-label classification show the difficulty of classifying certain types of toxic comments that appear less frequently in our data and highlights the need to develop models that are aware of different categories of toxicity.

Motivation & Objective

Create a large-scale Brazilian Portuguese toxic language dataset (ToLD-Br) from Twitter with demographic-aware annotations.
Analyze the effectiveness of monolingual vs multilingual BERT models on binary toxic comment classification.
Investigate transfer learning and zero-shot learning in a multilingual setting for toxicity detection.
Explore data requirements and challenges of multi-label toxicity classification in this language.
Provide insights on annotation agreement, label diversity, and model error patterns to guide future research.

Proposed method

Collect over 10 million tweets using keyword/hashtag and influencer-based strategies; annotate 21k tweets with seven toxicity categories.
Compute Krippendorff’s alpha to assess inter-annotator agreement and analyze annotation divergence.
Train and evaluate a BoW+AutoML baseline and multiple BERT-based classifiers, including Brazilian Portuguese BERT (BR-BERT) and Multilingual BERT (M-BERT-BR).
Perform monolingual (Portuguese) fine-tuning; experiment with transfer learning and zero-shot learning using OLID English data in a multilingual setup.
Analyze the impact of training data size on binary toxicity performance and conduct initial multi-label classification experiments.
Provide error analyses by toxicity type and discuss data-imbalance effects and annotator agreement.
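The agreement step above can be illustrated with a minimal sketch of Krippendorff's alpha for nominal labels. This is not the authors' code: the function name `krippendorff_alpha_nominal`, the input layout (one list of annotator labels per tweet, with `None` for a missing annotation), and the toy labels are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the labels that different
    annotators gave to one item (None marks a missing annotation).
    Hypothetical helper, written for illustration only.
    """
    # Coincidence matrix: every ordered pair of labels from different
    # annotators of the same item contributes 1 / (m - 1), where m is
    # the number of labels the item received.
    coincidences = Counter()
    for labels in units:
        labels = [label for label in labels if label is not None]
        m = len(labels)
        if m < 2:
            continue  # an item with a single label cannot be paired
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (a common factor of 1/n cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is used")
    return 1.0 - observed / expected


# Toy data: four tweets, two annotators each (not taken from ToLD-Br).
toy_units = [
    ["toxic", "toxic"],
    ["toxic", "ok"],
    ["ok", "ok"],
    ["ok", "toxic"],
]
alpha = krippendorff_alpha_nominal(toy_units)
```

An alpha of 1 means perfect agreement and values near 0 mean agreement no better than chance. In the toy data the annotators agree on only half of the tweets, so alpha comes out low (0.125).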

Experimental results

Research questions

RQ1: What is the effectiveness of monolingual Brazilian Portuguese BERT models for binary toxic comment detection on ToLD-Br compared to multilingual models?
RQ2: Does incorporating English data via transfer learning or zero-shot learning improve toxicity detection in Brazilian Portuguese?
RQ3: How does training data size affect binary classification performance, especially for minority toxicity classes?
RQ4: What are the challenges of multi-label toxicity classification in ToLD-Br, and how do model performances vary across categories?
RQ5: How do annotator demographics and label agreement impact dataset quality and model training?

Key findings

Multilingual BERT fine-tuned only on Brazilian Portuguese data (M-BERT-BR) achieves the highest macro-F1 among tested approaches (about 76%), with fewer false negatives than the other models.
Monolingual BR-BERT performs comparably to M-BERT-BR and often slightly better on macro-F1, indicating language-specific data remains advantageous.
Transfer learning (M-BERT(transfer)) from English OLID data does not outperform monolingual models and yields more false negatives.
Zero-shot learning (M-BERT(zero-shot)) performs poorly, especially for the toxic (positive) class, with macro-F1 around 0.56.
BoW+AutoML provides a strong baseline with macro-F1 near 0.74, showing competitive performance without deep learning.
Increasing training data improves both precision and recall for the toxic class, with around 6k examples needed for more reliable results; minority classes remain challenging due to imbalance.
Multi-label classification proves substantially harder; labels with many examples (insult, obscene) fare better than scarce ones (racism, xenophobia, LGBTQ+phobia).
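The findings above are reported in macro-F1, which averages per-class F1 so that the minority toxic class counts as much as the majority class. The following is a minimal sketch of that metric and of per-label F1 for the multi-label case. The function names and the toy labels are assumptions for illustration; they are not the paper's evaluation code or data.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is 0 by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that occur."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def per_label_f1(true_sets, pred_sets, labels):
    """Multi-label case: score each label as its own binary problem."""
    scores = {}
    for label in labels:
        y_true = [label in s for s in true_sets]
        y_pred = [label in s for s in pred_sets]
        scores[label] = f1_for_class(y_true, y_pred, True)
    return scores


# Toy binary case: 2 toxic tweets out of 10, classifier always says non-toxic.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)

# Toy multi-label case: the frequent label is recovered, the scarce ones are missed.
true_sets = [{"insult"}, {"insult", "obscene"}, {"racism"}, set()]
pred_sets = [{"insult"}, {"insult"}, set(), set()]
label_scores = per_label_f1(true_sets, pred_sets, ["insult", "obscene", "racism"])
```

In the toy binary case, accuracy is 0.8 but macro-F1 is only about 0.44, because the toxic class scores zero. This is the same mechanism by which many false negatives on the toxic class pull macro-F1 down, and by which scarce labels can score poorly in the multi-label setting.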

This review was created by AI and reviewed by human editors.