QUICK REVIEW

[Paper Review] Fine-Grained Analysis of Propaganda in News Articles

Giovanni Da San Martino, Seunghak Yu|arXiv (Cornell University)|2019. 10. 06.

Topic Modeling | References: 23 | Citations: 95

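One-Line Summary

This paper presents a fine-grained, fragment-level approach to propaganda detection. It builds a corpus of 451 news articles manually annotated with 18 propaganda techniques, proposes an evaluation measure, and presents a multi-granularity neural network that outperforms strong BERT-based baselines.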

ABSTRACT

Propaganda aims at influencing people's mindset with the purpose of advancing a specific agenda. Previous work has addressed propaganda detection at the document level, typically labelling all articles from a propagandistic news outlet as propaganda. Such noisy gold labels inevitably affect the quality of any learning system trained on them. A further issue with most existing systems is the lack of explainability. To overcome these limitations, we propose a novel task: performing fine-grained analysis of texts by detecting all fragments that contain propaganda techniques as well as their type. In particular, we create a corpus of news articles manually annotated at the fragment level with eighteen propaganda techniques and we propose a suitable evaluation measure. We further design a novel multi-granularity neural network, and we show that it outperforms several strong BERT-based baselines.

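Motivation and Goals

Argues for propaganda analysis at the fragment level rather than through document-level labeling.
Creates a high-quality corpus with expert annotations of 18 propaganda techniques at the fragment level.
Proposes an evaluation measure that accommodates partial overlaps and varying fragment lengths.
Develops a multi-granularity neural network that uses lower-granularity signals to improve higher-granularity predictions.
Demonstrates that the proposed model outperforms strong BERT-based baselines on both the fragment-level and sentence-level tasks.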

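Proposed Method

Defines 18 propaganda techniques used in journalism that are suitable for fragment-level annotation.
Builds a corpus of 451 news articles (350k tokens) annotated with technique labels at the fragment level.
Proposes a partial-overlap-aware evaluation measure inspired by ideas from plagiarism detection and NER.
Develops a multi-granularity network in which lower-granularity signals (sentence-level) complement higher-granularity predictions (fragment-level).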
Fine-tunes BERT-based baselines (BERT, BERT-Joint, BERT-Granu) and compares them with the proposed multi-granularity network.
Evaluates on two tasks, SLC (sentence-level classification) and FLC (fragment-level classification), using a tailored loss and a gating mechanism.
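
To make the partial-overlap idea concrete, here is a minimal sketch of how a span-level precision and recall with partial credit could be computed. It is an illustration under stated assumptions, not necessarily the paper's exact definition: spans are character offsets, a predicted span earns credit only against a gold span with the same technique label, and spans sharing a label do not overlap each other within the predictions or within the gold set. The `Span` type, the function names, and the label strings are hypothetical.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # technique name (hypothetical identifiers in this sketch)


def overlap_credit(s: Span, t: Span, norm: int) -> float:
    """Shared characters of s and t divided by norm; zero if the labels differ."""
    if s.label != t.label or norm <= 0:
        return 0.0
    shared = max(0, min(s.end, t.end) - max(s.start, t.start))
    return shared / norm


def precision(pred: List[Span], gold: List[Span]) -> float:
    """Average fraction of each predicted span covered by same-label gold spans."""
    if not pred:
        return 0.0
    total = sum(overlap_credit(s, t, s.end - s.start) for s in pred for t in gold)
    return total / len(pred)


def recall(pred: List[Span], gold: List[Span]) -> float:
    """Average fraction of each gold span covered by same-label predicted spans."""
    if not gold:
        return 0.0
    total = sum(overlap_credit(s, t, t.end - t.start) for s in pred for t in gold)
    return total / len(gold)


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


if __name__ == "__main__":
    gold = [Span(0, 10, "loaded_language")]
    pred = [Span(5, 15, "loaded_language"), Span(20, 25, "name_calling")]
    p, r = precision(pred, gold), recall(pred, gold)
    print(p, r, f1(p, r))
```

In this toy case the first prediction covers half of the gold span and the second has the wrong label, so a half-correct span earns partial credit instead of counting as a complete miss, and the length normalization keeps long and short fragments on the same scale.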

Experimental Results

Research Questions

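RQ1: Can propaganda fragments in news articles be reliably detected and labeled at a fine-grained level?
RQ2: Does a multi-granularity architecture that uses sentence-level signals improve fragment-level propaganda detection over standard BERT baselines?
RQ3: How effectively does the proposed evaluation measure reward partial overlaps and varying fragment lengths?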

Key Results

Model	Spans	Full Task - P	Full Task - R	Full Task - F1	Notes
BERT	39.57	21.48	21.39	21.39	Spans; Full-task results shown together in table
BERT-Joint	39.26	20.11	19.74	19.92	Joint training for SLC and FLC
Granu	43.08	23.85	20.14	21.80	Sentence-level info integrated into FLC
Multi-Granularity - ReLU	43.29	23.98	20.33	21.82	Gate-based fusion; aggressive filtering
Multi-Granularity - Sigmoid	44.12	24.42	21.05	22.58	Gate-based fusion; partial overlaps credited

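The corpus contains 7,485 propaganda technique instances across 21,230 sentences (35.2%).
The most frequent techniques are loaded language (2,547 instances) and name calling/labeling (1,294 instances).
The proposed multi-granularity network (MGN) outperforms the BERT-based baselines on fragment-level detection, particularly when a gating mechanism (Sigmoid or ReLU) is used.
On fragment-level detection, MGN with Sigmoid reaches P=24.42, R=21.05, and F1=22.58 on the full task, with higher precision than the baselines.
On sentence-level detection, MGN shows a substantial gain over the BERT baseline, raising recall by 8.42 percentage points and F1 by 3.24 points across all propaganda settings.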
The study shows that integrating lower-granularity signals can meaningfully improve higher-granularity tasks, and that gating reduces noisy negative samples.
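
The gating idea can be sketched in a few lines. This is a simplified illustration, not the paper's architecture, which this review does not spell out. It assumes a single sentence-level logit is turned into a gate value that scales every token's per-technique scores; `gate_token_scores` and its parameters are hypothetical names.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_token_scores(
    sentence_logit: float,
    token_scores: List[List[float]],
    activation: str = "sigmoid",
) -> List[List[float]]:
    """Scale each token's per-technique scores by a sentence-level gate.

    Assumption of this sketch: one scalar gate per sentence, computed from
    the sentence-level classifier's logit.
    """
    if activation == "sigmoid":
        gate = sigmoid(sentence_logit)      # soft gate in (0, 1)
    elif activation == "relu":
        gate = max(0.0, sentence_logit)     # hard zero for negative logits
    else:
        raise ValueError(f"unknown activation: {activation}")
    return [[gate * score for score in scores] for scores in token_scores]


if __name__ == "__main__":
    scores = [[1.0, 3.0], [2.0, 0.5]]       # 2 tokens x 2 techniques
    print(gate_token_scores(-2.0, scores, "relu"))
    print(gate_token_scores(0.0, scores, "sigmoid"))
```

Under these assumptions, a ReLU gate zeroes out every token score in a sentence whose logit is negative, which matches the "aggressive filtering" note in the table, while a Sigmoid gate only down-weights them.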

This review was generated by AI and reviewed by a human editor.