QUICK REVIEW

[Paper Review] Thou shalt not hate: Countering Online Hate Speech

Binny Mathew, Hardik Tharad|arXiv (Cornell University)|2018. 01. 01.

Hate Speech and Cyberbullying Detection | References: 21 | Citations: 27

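One-Line Summary

This paper introduces the first large-scale, manually annotated dataset of counterspeech comments on YouTube, which makes a rigorous linguistic analysis of counterspeech possible. It proposes machine learning models that detect counterspeech with an F1-score of 0.71 and multilabel models that detect the type of counterspeech with an F1-score of 0.60, and it reports key insights into the dynamics and effectiveness of counterspeech and its psycholinguistic differences from non-counterspeech.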

ABSTRACT

Hate content in social media is ever-increasing. While Facebook, Twitter, Google have attempted to take several steps to tackle the hateful content, they have mostly been unsuccessful. Counterspeech is seen as an effective way of tackling the online hate without any harm to the freedom of speech. Thus, an alternative strategy for these platforms could be to promote counterspeech as a defense against hate content. However, in order to have a successful promotion of such counterspeech, one has to have a deep understanding of its dynamics in the online world. Lack of carefully curated data largely inhibits such understanding. In this paper, we create and release the first ever dataset for counterspeech using comments from YouTube. The data contains 13,924 manually annotated comments where the labels indicate whether a comment is a counterspeech or not. This data allows us to perform a rigorous measurement study characterizing the linguistic structure of counterspeech for the first time. This analysis results in various interesting insights such as: the counterspeech comments receive much more likes as compared to the non-counterspeech comments, for certain communities majority of the non-counterspeech comments tend to be hate speech, the different types of counterspeech are not all equally effective and the language choice of users posting counterspeech is largely different from those posting non-counterspeech as revealed by a detailed psycholinguistic analysis. Finally, we build a set of machine learning models that are able to automatically detect counterspeech in YouTube videos with an F1-score of 0.71. We also build multilabel models that can detect different types of counterspeech in a comment with an F1-score of 0.60.

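Research Motivation and Goals

To address the growing problem of online hate speech on social media platforms such as YouTube, Facebook, and Twitter.
To overcome the lack of carefully curated data for studying counterspeech, a non-censorship strategy that does not suppress freedom of speech.
To build and release the first large-scale, manually annotated dataset of YouTube comments labeled for the presence of counterspeech.
To carry out a comprehensive linguistic and psycholinguistic analysis of counterspeech in order to understand its structural and behavioral characteristics.
To develop machine learning models that automatically detect counterspeech and its types in real online comments.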

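Proposed Method

Curated and manually annotated 13,924 YouTube comments to create the first public counterspeech dataset.
Applied linguistic and psycholinguistic analysis to compare language use in counterspeech and non-counterspeech comments.
Trained supervised machine learning models on features derived from the lexical, syntactic, and affective properties of comments.
Developed both a binary classifier (counterspeech vs. non-counterspeech) and a multilabel classifier (different types of counterspeech).
Evaluated model performance on the manually annotated dataset with standard NLP metrics (e.g., F1-score).
Used statistical analysis to compare engagement (e.g., number of likes) and content patterns between counterspeech and non-counterspeech comments.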
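The evaluation step above relies on the F1-score for both the binary and the multilabel task. The following is a minimal sketch of how those scores can be computed, not the authors' evaluation code. The review does not say which averaging scheme was used for the multilabel score, so micro-averaging is an assumption here, and the function names, label names, and toy data are hypothetical.

```python
from typing import Sequence, Set


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score for the positive class (1 = counterspeech, 0 = non-counterspeech)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and/or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_f1_multilabel(
    y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]
) -> float:
    """Micro-averaged F1: pool label decisions over all comments, then score."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels correctly assigned
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # labels wrongly assigned
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data, not taken from the paper's dataset.
gold_binary = [1, 1, 0, 0, 1]
pred_binary = [1, 0, 0, 1, 1]

# Hypothetical type labels used only for illustration.
gold_types = [{"humor"}, {"facts", "hostile"}, set()]
pred_types = [{"humor"}, {"facts"}, {"hostile"}]

print("binary F1:", binary_f1(gold_binary, pred_binary))
print("micro F1 :", micro_f1_multilabel(gold_types, pred_types))
```

Micro-averaging weights every label decision equally, so frequent counterspeech types dominate the score; macro-averaging (the mean of per-label F1) would instead weight each type equally and can give a noticeably different number on imbalanced labels.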

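Experimental Results

Research Questions

RQ1: Which linguistic and psycholinguistic features distinguish counterspeech from non-counterspeech comments on YouTube?
RQ2: How does engagement with counterspeech comments (e.g., number of likes) differ from engagement with non-counterspeech comments?
RQ3: Which types of counterspeech are most effective at countering hate speech in online communities?
RQ4: How do the language patterns of users who post counterspeech differ from those of users who post non-counterspeech?
RQ5: How accurately can machine learning models detect counterspeech and its subtypes in real YouTube comment sections?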

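Key Findings

Counterspeech comments receive considerably more likes than non-counterspeech comments, which suggests higher user engagement and that viewers value them more.
For certain communities, the majority of non-counterspeech comments tend to be hate speech, which shows how widely harmful discourse circulates.
The different types of counterspeech are not all equally effective, which suggests that the choice of response strategy affects impact.
Psycholinguistic analysis reveals clear differences in language use between users who post counterspeech and those who post non-counterspeech, particularly in emotional tone and lexical complexity.
The binary machine learning model achieves an F1-score of 0.71 for counterspeech detection on the new dataset.
The multilabel model for classifying counterspeech types achieves an F1-score of 0.60, which indicates that fine-grained type detection is feasible but remains challenging.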
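The first finding above compares like counts between two groups of comments. The review does not state which statistical test the authors used, so the sketch below swaps in a one-sided permutation test on the difference in mean likes as an assumption. The function name and the like counts are hypothetical and are not data from the paper.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def permutation_test_mean_diff(
    a: Sequence[float],
    b: Sequence[float],
    n_permutations: int = 10_000,
    seed: int = 0,
) -> Tuple[float, float]:
    """One-sided permutation test for mean(a) > mean(b).

    Returns the observed difference in means and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Randomly reassign group membership while keeping group sizes fixed.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical like counts, for illustration only.
counterspeech_likes = [12, 7, 30, 5, 18, 9]
non_counterspeech_likes = [1, 0, 4, 2, 3, 0]

diff, p = permutation_test_mean_diff(counterspeech_likes, non_counterspeech_likes)
print(f"difference in mean likes: {diff:.2f}, estimated p-value: {p:.4f}")
```

A permutation test makes no normality assumption, which suits like counts because they are typically heavy-tailed; a rank-based test would be a reasonable alternative for the same reason.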


This review was generated by AI and checked by a human editor.