QUICK REVIEW

[Paper Review] The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race

Stefano Cresci, Roberto Di Pietro|Technical University of Denmark, DTU Orbit (Technical University of Denmark, DTU)|2017. 01. 11.

Spam and Phishing Detection · References: 47 · Citations: 227

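One-line summary

This paper provides empirical evidence of a new wave of social spambots on Twitter that evade detection by both the platform and humans, evaluates existing detection tools, analyzes crowdsourced human performance, and advocates group-behavior-based annotation and new detection approaches.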

ABSTRACT

Recent studies in social media spam and automation provide anecdotal argumentation of the rise of a new generation of spambots, so-called social spambots. Here, for the first time, we extensively study this novel phenomenon on Twitter and we provide quantitative evidence that a paradigm-shift exists in spambot design. First, we measure current Twitter's capabilities of detecting the new social spambots. Later, we assess the human performance in discriminating between genuine accounts, social spambots, and traditional spambots. Then, we benchmark several state-of-the-art techniques proposed by the academic literature. Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots. Our results call for new approaches capable of turning the tide in the fight against this raising phenomenon. We conclude by reviewing the latest literature on spambots detection and we highlight an emerging common research trend based on the analysis of collective behaviors. Insights derived from both our extensive experimental campaign and survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots. Finally, to foster research on this novel phenomenon, we make publicly available to the scientific community all the datasets used in this study.

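Research motivation and goals

Demonstrate the existence of a novel wave of social spambots and the detection challenges they pose.
Assess the current ability of Twitter and of humans to detect social spambots.
Review and critique traditional detection tools and features in comparison with newer, group-based approaches.
Provide datasets and guidelines to advance annotation and benchmarking in social spambot research.
Lay the groundwork for a sustained arms-race strategy against evolving social spambots.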

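Proposed method

Build and analyze multiple Twitter datasets covering genuine accounts, traditional spambots, and three groups of new social spambots.
Assess Twitter's ability to suspend malicious accounts by using API response codes to determine whether each account is still available.
Run a crowdsourced detection campaign in which trusted contributors classify accounts, and measure human performance.
Benchmark existing bot detection techniques (BotOrNot?, the Yang et al. classifier, and unsupervised/graph-based methods) on the new spambot datasets.
Propose and implement an alternative annotation method that focuses on similarities in group behavior, in order to improve ground-truth datasets.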
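
As a rough illustration of the API-response-code step above, the sketch below turns one lookup result per account into the share of accounts that are alive, deleted, or suspended. It assumes the Twitter REST API v1.1 error codes 50 ("User not found") and 63 ("User has been suspended"); the review does not state which codes the authors relied on. The function name `survival_summary` and the sample data are hypothetical, and no network call is made.

```python
from collections import Counter

# Assumed mapping of Twitter REST API v1.1 error codes to account status.
# A lookup that returned no error (None) is treated as an account still alive.
STATUS_BY_ERROR_CODE = {
    None: "alive",
    50: "deleted",    # "User not found"
    63: "suspended",  # "User has been suspended"
}


def survival_summary(error_codes):
    """Return the fraction of accounts per status, given one error code per account."""
    if not error_codes:
        raise ValueError("need at least one account")
    counts = Counter(
        STATUS_BY_ERROR_CODE.get(code, "unknown") for code in error_codes
    )
    total = len(error_codes)
    return {status: count / total for status, count in counts.items()}


# Hypothetical lookup results for ten accounts (not data from the paper).
observed = [None, None, 63, None, 50, None, None, 63, None, None]
summary = survival_summary(observed)
print(summary)
```

The "alive" share plays the role of the survival rate; any code outside the assumed mapping is counted as "unknown" rather than silently folded into another status.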

Figure 1: Survival rates for different types of accounts.

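Experimental results

Research questions

RQ1: To what extent can Twitter detect and remove social spambots?
RQ2: Do humans succeed in detecting social spambots in the wild?
RQ3: Can humans tell apart traditional spambots, social spambots, and genuine accounts?
RQ4: Can state-of-the-art detection tools detect social spambots?
RQ5: Which new methodological directions can effectively counter social spambots?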

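Key results

Genuine accounts show a high survival rate on Twitter (96.5%), whereas fake followers and some traditional spambots are largely detected or show high suspension rates.
Social spambots show survival rates similar to those of genuine accounts (95.2%–99.6%), which suggests that they evade platform detection more often than traditional spambots do.
Crowd workers achieve high accuracy on traditional spambots (≈0.91–0.92) and genuine accounts (≈0.92) but perform poorly on social spambots (≈0.24), and inter-rater agreement on social spambots is low (Kappa ≈ 0.186).
Established tools have limited success against social spambots; BotOrNot? and the Yang et al. classifier perform poorly on them, particularly in recall.
An unsupervised graph-clustering approach (fastgreedy) on an account-feature graph achieves strong detection, with MCC ≈ 0.886 on test set #1 and MCC ≈ 0.847 on test set #2, outperforming several supervised and text-based methods.
The paper advocates group-behavior-based annotation and provides public annotated datasets that support such approaches.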
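
For readers less familiar with the metrics quoted above, here is a minimal sketch of how recall and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix, with spambots as the positive class. The helper names and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def recall(tp, fn):
    """Share of actual spambots that the detector flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC in [-1, 1]; 0 means no better than chance, 1 is perfect."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical detector that misses most spambots but rarely flags genuine accounts.
tp, fn, tn, fp = 10, 90, 95, 5
print("recall:", recall(tp, fn))
print("MCC:", matthews_corrcoef(tp, fp, tn, fn))
```

Because MCC uses all four cells of the confusion matrix, a detector like the one in the example cannot hide a low recall behind a high count of correctly labelled genuine accounts.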

Figure 2: Dataset composition for the crowdsourcing experiment.
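
The inter-rater agreement reported for the crowdsourcing experiment can be illustrated with a short sketch. The review only says "Kappa", so this example assumes a multi-rater statistic, Fleiss' kappa; the function name, the three-rater setup, and the sample ratings are hypothetical.

```python
def fleiss_kappa(rating_counts):
    """Fleiss' kappa; rating_counts[i][j] = raters who put item i in category j."""
    if not rating_counts:
        raise ValueError("need at least one rated item")
    n_items = len(rating_counts)
    n_raters = sum(rating_counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in rating_counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(rating_counts[0])

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in rating_counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in rating_counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in shares)

    if expected == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: 4 accounts, 3 raters each, with the categories
# (genuine, traditional spambot, social spambot).
ratings = [[3, 0, 0], [0, 3, 0], [1, 0, 2], [2, 0, 1]]
print("Fleiss' kappa:", fleiss_kappa(ratings))
```

Values near 0 indicate agreement close to what chance alone would produce, and values near 1 indicate near-unanimous raters.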

This review was generated by AI and reviewed by a human editor.