QUICK REVIEW

[Paper Review] Disturbed YouTube for Kids: Characterizing and Detecting Disturbing Content on YouTube

Kostantinos Papadamou, Antonis Papasavva|arXiv (Cornell University)|Feb 6, 2019

Hate Speech and Cyberbullying Detection | References: 18 | Citations: 11

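One-Line Summary

This paper proposes a classifier that detects toddler-oriented inappropriate content on YouTube with 82.8% accuracy. A large-scale analysis built on it reveals a widespread exposure risk. The study finds that YouTube's current countermeasures do not detect disturbing videos in a timely manner, and that toddlers may encounter disturbing videos even when they start from benign content.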

ABSTRACT

A considerable number of the most-subscribed YouTube channels feature content popular among children of very young age. Hundreds of toddler-oriented channels on YouTube offer inoffensive, well produced, and educational videos. Unfortunately, inappropriate (disturbing) content that targets this demographic is also common. YouTube’s algorithmic recommendation system regrettably suggests inappropriate content because some of it mimics or is derived from otherwise appropriate content. Considering the risk for early childhood development, and an increasing trend in toddler’s consumption of YouTube media, this is a worrying problem. While there are many anecdotal reports of the scale of the problem, there is no systematic quantitative measurement. Hence, in this work, we develop a classifier able to detect toddler-oriented inappropriate content on YouTube with 82.8% accuracy, and we leverage it to perform a first-of-its-kind, large-scale, quantitative characterization that reveals some of the risks of YouTube media consumption by young children. Our analysis indicates that YouTube’s currently deployed countermeasures are ineffective in terms of detecting disturbing videos in a timely manner. Finally, using our classifier, we assess how prominent the problem is on YouTube, finding that young children are likely to encounter disturbing videos when they randomly browse the platform starting from benign videos.

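Research Motivation and Objectives

Systematically measure how widespread toddler-oriented inappropriate content is on YouTube.
Address the lack of large-scale empirical data on inappropriate content in children's media consumption.
Evaluate how effectively YouTube's existing content moderation detects such content.
Develop and validate a machine learning classifier that identifies toddler-oriented inappropriate videos with high accuracy.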

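Proposed Method

A supervised machine learning classifier was trained to detect inappropriate content in toddler-oriented videos.
The classifier identifies inappropriate content using textual, visual, and audio features extracted from YouTube videos.
A large dataset of toddler-oriented videos was collected and annotated to train and evaluate the classifier.
The model was validated on a human-annotated test set and achieved 82.8% accuracy.
The classifier was then used in a large-scale analysis of YouTube's recommendation system to assess exposure risk.
The classifier was also used to evaluate the performance of YouTube's current content moderation mechanisms.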
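To make the reported metric concrete, here is a minimal sketch of how accuracy on a human-annotated test set is typically computed: the fraction of videos whose predicted label matches the annotation. The function name `accuracy`, the label strings, and the five-video toy test set are illustrative assumptions, not the paper's data or code.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], annotated: Sequence[str]) -> float:
    """Return the fraction of videos whose predicted label equals the human annotation."""
    if len(predicted) != len(annotated):
        raise ValueError("label sequences must have the same length")
    if not annotated:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, annotated))
    return correct / len(annotated)


# Hypothetical toy test set (NOT data from the paper).
annotated = ["appropriate", "disturbing", "appropriate", "disturbing", "appropriate"]
predicted = ["appropriate", "disturbing", "disturbing", "disturbing", "appropriate"]

print(f"accuracy = {accuracy(predicted, annotated):.1%}")  # prints "accuracy = 80.0%"
```

By the same definition, 82.8% accuracy means that roughly 83 of every 100 test videos received the same label from the classifier as from the human annotators.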

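Experimental Results

Research Questions

RQ1: How prevalent is inappropriate content among toddler-oriented YouTube videos, and how does it spread through recommendations?
RQ2: How effective are YouTube's current automated and manual content moderation systems at detecting such content?
RQ3: How likely is a toddler who starts from seemingly appropriate, benign content to be exposed to inappropriate videos?
RQ4: Which features distinguish inappropriate from appropriate content in the context of children's media?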

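Key Findings

The classifier detects toddler-oriented inappropriate content on YouTube with 82.8% accuracy.
YouTube's current countermeasures are ineffective at detecting inappropriate videos in a timely manner.
Toddlers may encounter inappropriate videos even when they start from benign, educational videos.
Inappropriate content often mimics or is derived from legitimate children's content, which lets it evade detection.
The recommendation system often surfaces inappropriate content, raising the exposure risk for young viewers.
Systematic, large-scale measurement of inappropriate content in children's media has been largely missing, and this study takes a first step toward filling that gap.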
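The random-browsing finding can be illustrated with a small Monte Carlo simulation: start at a benign video, follow a uniformly random recommendation at each hop, and count how often a walk reaches a video flagged as disturbing. This is a sketch under stated assumptions, not the paper's crawling procedure. The toy graph, the video IDs, and the helpers `walk_hits_disturbing` and `estimate_exposure` are all hypothetical.

```python
import random
from typing import Dict, List, Set


def walk_hits_disturbing(graph: Dict[str, List[str]], disturbing: Set[str],
                         start: str, hops: int, rng: random.Random) -> bool:
    """Follow up to `hops` random recommendations; True if a disturbing video is reached."""
    current = start
    for _ in range(hops):
        neighbors = graph.get(current, [])
        if not neighbors:  # dead end: no recommendations to follow
            return False
        current = rng.choice(neighbors)
        if current in disturbing:
            return True
    return False


def estimate_exposure(graph: Dict[str, List[str]], disturbing: Set[str],
                      start: str, hops: int, walks: int = 10_000, seed: int = 0) -> float:
    """Estimate the probability of reaching a disturbing video within `hops` hops."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = sum(walk_hits_disturbing(graph, disturbing, start, hops, rng)
               for _ in range(walks))
    return hits / walks


# Hypothetical recommendation graph: video ID -> recommended video IDs.
graph = {
    "benign_a": ["benign_b", "benign_c"],
    "benign_b": ["benign_a", "benign_c"],
    "benign_c": ["benign_a", "disturbing_x"],
    "disturbing_x": ["benign_a"],
}
# In a real analysis, this set would come from the classifier's predictions.
disturbing = {"disturbing_x"}

for hops in (1, 2, 5, 10):
    p = estimate_exposure(graph, disturbing, "benign_a", hops)
    print(f"within {hops:2d} hops: {p:.3f}")
```

In this toy graph no disturbing video is reachable in one hop, and the exact two-hop probability is 0.5 × 0.5 = 0.25, so the estimate should land close to that value. The numbers say nothing about YouTube itself.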

This review was generated by AI and checked by a human editor.