QUICK REVIEW

[Paper Review] Distribution Distillation Loss: Generic Approach for Improving Face Recognition from Hard Samples

Yuge Huang, Pengcheng Shen|arXiv (Cornell University)|2020. 02. 10.

Face recognition and analysis | References: 35 | Citations: 3

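One-Line Summary

This paper proposes the Distribution Distillation Loss, a generic method that improves face recognition on hard samples by distilling the similarity distribution of easy (teacher) samples into that of hard (student) samples. The loss constrains the student distribution to match the teacher distribution, which reduces the overlap between positive and negative pairs and improves performance under hard variations such as pose, race, and resolution. The authors report that it outperforms ArcFace and CosFace on large-scale benchmarks.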

ABSTRACT

Large facial variations are the main challenge in face recognition. To this end, previous variation-specific methods make full use of task-related prior to design special network losses, which are typically not general among different tasks and scenarios. In contrast, the existing generic methods focus on improving the feature discriminability to minimize the intra-class distance while maximizing the interclass distance, which perform well on easy samples but fail on hard samples. To improve the performance on those hard samples for general tasks, we propose a novel Distribution Distillation Loss to narrow the performance gap between easy and hard samples, which is a simple, effective and generic for various types of facial variations. Specifically, we first adopt state-of-the-art classifiers such as ArcFace to construct two similarity distributions: teacher distribution from easy samples and student distribution from hard samples. Then, we propose a novel distribution-driven loss to constrain the student distribution to approximate the teacher distribution, which thus leads to smaller overlap between the positive and negative pairs in the student distribution. We have conducted extensive experiments on both generic large-scale face benchmarks and benchmarks with diverse variations on race, resolution and pose. The quantitative results demonstrate the superiority of our method over strong baselines, e.g., Arcface and Cosface.

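Motivation and Objectives

Close the performance gap between easy and hard samples under large facial variations.
Overcome the limitation of task-specific losses, which generalize poorly across variation types.
Improve feature discriminability on hard samples without relying on task-specific prior knowledge.
Develop a generic loss function applicable to diverse face recognition scenarios and variations.
Use distribution-level knowledge distillation to minimize intra-class variance and reduce inter-class overlap on hard samples.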

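Proposed Method

Use a state-of-the-art classifier such as ArcFace to build two similarity distributions: a teacher distribution from easy samples and a student distribution from hard samples.
Define a distribution-driven loss that constrains the student distribution to approximate the teacher distribution.
Apply this distillation loss to reduce the overlap between positive and negative pairs in the student distribution.
Train the student network end to end with the proposed Distribution Distillation Loss alongside the base classification loss.
Use the knowledge in the teacher's well-separated distribution to guide the student toward more robust representations of hard samples.
Keep the method generic by avoiding task-specific designs and variation-specific priors.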
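The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`pair_similarities`, `soft_histogram`, `kl_divergence`, `distillation_sketch`), the Gaussian-kernel histogram, the bin count, the kernel width, and the choice of KL divergence as the distance between distributions are all assumptions made here for clarity. The review only states that the student distribution is constrained to approximate the teacher distribution. The sketch computes the forward value only; actual training would need a differentiable framework and the base classification loss.

```python
import numpy as np


def pair_similarities(embeddings, labels):
    """Split pairwise cosine similarities into positive and negative pairs."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    rows, cols = np.triu_indices(len(labels), k=1)  # each unordered pair once
    same = labels[rows] == labels[cols]
    pair_sims = sims[rows, cols]
    return pair_sims[same], pair_sims[~same]


def soft_histogram(sims, bins=50, sigma=0.05):
    """Gaussian-kernel histogram over [-1, 1], normalized to sum to 1.

    bins and sigma are illustrative values, not taken from the paper.
    """
    centers = np.linspace(-1.0, 1.0, bins)
    weights = np.exp(-0.5 * ((sims[:, None] - centers[None, :]) / sigma) ** 2)
    hist = weights.sum(axis=0)
    return hist / hist.sum()


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) with a small epsilon to avoid log(0)."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def distillation_sketch(easy_emb, easy_labels, hard_emb, hard_labels):
    """Distance between teacher (easy) and student (hard) similarity distributions."""
    t_pos, t_neg = pair_similarities(easy_emb, easy_labels)
    s_pos, s_neg = pair_similarities(hard_emb, hard_labels)
    pos_term = kl_divergence(soft_histogram(t_pos), soft_histogram(s_pos))
    neg_term = kl_divergence(soft_histogram(t_neg), soft_histogram(s_neg))
    return pos_term + neg_term


# Synthetic data: 8 identities, 4 samples each; hard samples are noisier.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(8), 4)
class_centers = rng.normal(size=(8, 16))
easy = class_centers[labels] + 0.1 * rng.normal(size=(32, 16))
hard = class_centers[labels] + 1.0 * rng.normal(size=(32, 16))

loss = distillation_sketch(easy, labels, hard, labels)
print(f"distillation sketch value: {loss:.4f}")
```

In this toy setup the value is zero when the student embeddings equal the teacher embeddings and grows as the hard-sample similarities spread out, which is the behavior a loss of this kind would penalize.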

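Experimental Results

Research Questions

RQ1: Can a generic loss function, without task-specific design, effectively improve face recognition on hard samples?
RQ2: How does distilling the similarity distributions of easy and hard samples affect feature discriminability?
RQ3: How far can distribution distillation narrow the performance gap between easy and hard samples across different facial variations?
RQ4: Does the proposed method generalize to benchmarks with variations in race, resolution, and pose?
RQ5: How does the Distribution Distillation Loss perform on hard samples compared with existing top-performing losses such as ArcFace and CosFace?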

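Key Results

The proposed Distribution Distillation Loss improves face recognition accuracy on hard samples across a range of facial variations.
It outperforms strong baselines such as ArcFace and CosFace on large-scale face recognition benchmarks.
The performance gap between easy and hard samples narrows noticeably, especially under challenging conditions such as extreme poses and low resolution.
It achieves consistent gains on multiple benchmarks that include race and resolution variations.
It retains strong generality across network architectures and data distributions without task-specific adaptation.
The ablation analysis confirms that the key driver of the improvement is the distribution distillation mechanism, not the use of ArcFace as the teacher in itself.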

This review was generated by AI and reviewed by a human editor.