QUICK REVIEW

[Paper Review] The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework

Chandan K. Reddy, Ebrahim Beyrami | arXiv (Cornell University) | 2020. 01. 23.

Speech and Audio Processing | References: 18 | Citations: 67

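One-Line Summary

Introduces the INTERSPEECH 2020 Deep Noise Suppression Challenge and presents open-source training data, a representative real-world test set, and an online subjective test framework based on ITU-T P.808 for evaluating perceptual speech quality.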

ABSTRACT

The INTERSPEECH 2020 Deep Noise Suppression Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by splitting the original dataset. Many publications report reasonable performance on the synthetic test set drawn from the same distribution as that of the training set. However, often the model performance degrades significantly on real recordings. Also, most of the conventional objective metrics do not correlate well with subjective tests and lab subjective tests are not scalable for a large test set. In this challenge, we open-source a large clean speech and noise corpus for training the noise suppression models and a representative test set to real-world scenarios consisting of both synthetic and real recordings. We also open source an online subjective test framework based on ITU-T P.808 for researchers to quickly test their developments. The winners of this challenge will be selected based on subjective evaluation on a representative test set using P.808 framework.

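Research Motivation and Goals

Promote collaborative research in real-time single-channel speech enhancement.
Provide an open-source clean speech and noise corpus for training noise suppression models.
Provide a representative test set containing both synthetic and real recordings for robust evaluation.
Provide an online subjective test framework that enables scalable perceptual quality assessment.
Select the winners based on subjective evaluation using the P.808 framework.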

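Proposed Method

Release a large open-source clean speech and noise dataset for training noise suppression models.
Build a representative test set of both synthetic and real recordings that reflects real-world scenarios.
Release an online subjective test framework based on ITU-T P.808 for fast evaluation.
Use subjective, perceptual quality as the primary criterion for selecting the winners.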
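
To make the idea of synthetic noisy recordings concrete, the sketch below mixes a clean signal with a noise signal at a chosen signal-to-noise ratio. This is a generic illustration, not the challenge's own data-generation script: the function names `rms` and `mix_at_snr`, the use of plain Python lists as waveforms, and the RMS-based SNR definition are all assumptions made for this example.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise RMS ratio equals `snr_db`,
    then add it to `clean`. Returns (noisy, scaled_noise).

    Hypothetical helper for illustration; assumes equal-length inputs.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    # SNR in dB = 20 * log10(rms_clean / rms_noise), solved for the noise level.
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    scaled_noise = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled_noise)]
    return noisy, scaled_noise


if __name__ == "__main__":
    random.seed(0)
    # Stand-ins for real audio: a sine tone as "speech", uniform noise as "noise".
    clean = [math.sin(0.1 * i) for i in range(1000)]
    noise = [random.uniform(-1.0, 1.0) for _ in range(1000)]
    noisy, scaled = mix_at_snr(clean, noise, snr_db=10.0)
    print(len(noisy), round(20 * math.log10(rms(clean) / rms(scaled)), 3))
```

A real pipeline would load audio files and typically vary the SNR and noise type across clips; those details are outside what this summary states.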

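Experimental Results

Research Questions

RQ1: How do noise suppression methods perform in real-world single-channel settings when evaluated with perceptual quality measures?
RQ2: Can open datasets and a scalable subjective test framework improve the reliability and generalizability of evaluation compared with synthetic-only benchmarks?
RQ3: How does using a P.808-based subjective framework affect the ranking of competing DNS approaches?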

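Key Findings

The open-source training data and representative test set enable a more realistic evaluation of DNS methods.
The P.808-based online subjective test framework makes perceptual evaluation scalable across submissions.
The challenge winners are determined by perceptual scores on the representative test set.
This approach highlights the potential gap between objective metrics and perceptual quality on real recordings.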
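
As a rough illustration of how perceptual scores could decide a ranking, the sketch below aggregates listener ratings into a mean opinion score (MOS) per system and sorts systems by it. It assumes ratings on a 1 to 5 scale and a normal-approximation 95% confidence interval; the function names `mos_with_ci` and `rank_systems`, the system names, and the rating values are hypothetical and are not taken from the paper or from the released P.808 framework.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumption: a normal approximation with z = 1.96 is adequate.
    """
    mean = statistics.mean(ratings)
    if len(ratings) < 2:
        return mean, 0.0
    standard_error = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, z * standard_error


def rank_systems(ratings_by_system):
    """Sort systems from highest to lowest MOS.

    Returns a list of (system_name, (mos, ci_half_width)) tuples.
    """
    scored = {name: mos_with_ci(r) for name, r in ratings_by_system.items()}
    return sorted(scored.items(), key=lambda item: item[1][0], reverse=True)


if __name__ == "__main__":
    # Made-up ratings for two hypothetical submissions.
    example = {
        "model_a": [4, 5, 4, 4],
        "model_b": [3, 3, 2, 4],
    }
    for name, (mos, ci) in rank_systems(example):
        print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

When two confidence intervals overlap, the difference in MOS may not be meaningful, so a real evaluation would likely need more ratings per clip or a significance test before declaring a winner.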

This review was generated by AI and reviewed by a human editor.