QUICK REVIEW

[Paper Review] The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework

Chandan K. Reddy, Ebrahim Beyrami|arXiv (Cornell University)|Jan 23, 2020

Speech and Audio Processing | References: 18 | Citations: 67

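One-Line Summary

The paper introduces the INTERSPEECH 2020 Deep Noise Suppression Challenge and provides open-source training data, a representative real-world test set, and an online subjective evaluation framework based on ITU-T P.808 for assessing perceptual speech quality.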

ABSTRACT

The INTERSPEECH 2020 Deep Noise Suppression Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by splitting the original dataset. Many publications report reasonable performance on the synthetic test set drawn from the same distribution as that of the training set. However, often the model performance degrades significantly on real recordings. Also, most of the conventional objective metrics do not correlate well with subjective tests and lab subjective tests are not scalable for a large test set. In this challenge, we open-source a large clean speech and noise corpus for training the noise suppression models and a representative test set to real-world scenarios consisting of both synthetic and real recordings. We also open source an online subjective test framework based on ITU-T P.808 for researchers to quickly test their developments. The winners of this challenge will be selected based on subjective evaluation on a representative test set using P.808 framework.

Motivation and Objectives

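Promote collaborative research in real-time single-channel speech enhancement.
Provide an open-source clean speech and noise corpus for training noise suppression models.
Provide a representative test set containing both synthetic and real recordings to enable robust evaluation.
Provide an online subjective evaluation framework that enables scalable assessment of perceptual quality.
Select the winners based on subjective evaluation using the P.808 framework.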

Proposed Approach

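Open-source a large clean speech and noise dataset for training noise suppression models.
Assemble a representative test set containing both synthetic and real recordings to reflect real-world scenarios.
Provide an online subjective test framework based on ITU-T P.808 for rapid evaluation.
Use subjective perceptual quality as the primary criterion for selecting the winners.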
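
As an illustration of how a clean speech corpus and a noise corpus can be combined into synthetic noisy clips, here is a minimal sketch that mixes one clean signal with one noise signal at a chosen signal-to-noise ratio (SNR). It assumes that synthetic clips are produced by scaling the noise and adding it to the clean speech; the function names, the toy sine-wave signals, and the 10 dB target are hypothetical and are not taken from the challenge's own scripts.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise level ratio is `snr_db`, then add it to `clean`."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent, so the SNR is undefined")
    # The scaled noise should have RMS = clean RMS / 10^(snr_db / 20).
    gain = rms(clean) / (noise_rms * 10 ** (snr_db / 20.0))
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy stand-ins for one second of 16 kHz audio (hypothetical data, not corpus clips).
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [0.3 * math.sin(2 * math.pi * 1234 * t / 16000 + 1.0) for t in range(16000)]
noisy = mix_at_snr(clean, noise, snr_db=10.0)
```

A real pipeline would also need to handle clips of different lengths, resampling, and clipping after the mix, none of which is covered here.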

Experimental Results

Research Questions

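RQ1: How do noise suppression methods perform in real-world single-channel conditions when evaluated with perceptual quality measures?
RQ2: Can open datasets and a scalable subjective evaluation framework improve the reliability and generalization of evaluation compared with benchmarks based only on synthetic data?
RQ3: What is the effect of using a P.808-based subjective framework to rank competing DNS approaches?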

Key Findings

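The open-source training data and representative test set enable a more realistic evaluation of DNS methods.
The P.808-based online subjective test framework facilitates scalable perceptual evaluation across submissions.
Challenge winners are determined by subjective perceptual scores on the representative test set.
This approach highlights a potential gap between objective metrics and perceptual quality on real recordings.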
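
To make the ranking step concrete, the sketch below aggregates crowd ratings into a mean opinion score (MOS) per system and sorts the systems by that score. It assumes ratings on a 1-to-5 scale and a normal-approximation 95% confidence interval; the system names and rating values are invented for illustration, and the challenge's actual P.808 toolkit may clean and aggregate votes differently.

```python
import math
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    score = mean(ratings)
    if len(ratings) < 2:
        return score, 0.0  # not enough votes to estimate spread
    half_width = z * stdev(ratings) / math.sqrt(len(ratings))
    return score, half_width


def rank_by_mos(ratings_by_system):
    """Sort systems from highest to lowest MOS."""
    summary = {name: mos_with_ci(r) for name, r in ratings_by_system.items()}
    return sorted(summary.items(), key=lambda item: item[1][0], reverse=True)


# Hypothetical 1-to-5 ratings collected for three systems on the same clips.
ratings_by_system = {
    "noisy_input": [2, 3, 2, 3, 3, 2],
    "model_a": [4, 4, 3, 5, 4, 4],
    "model_b": [3, 4, 3, 3, 4, 3],
}
ranking = rank_by_mos(ratings_by_system)
for name, (score, half_width) in ranking:
    print(f"{name}: MOS = {score:.2f} +/- {half_width:.2f}")
```

When the confidence intervals of two systems overlap, their relative order should be treated as uncertain, which is one reason a large number of ratings per clip matters for a subjective ranking.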


This review was generated by AI and checked by a human editor.