QUICK REVIEW

[Paper Review] Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

Kaiyang Zhou, Yu Qiao | arXiv (Cornell University) | December 29, 2017

Video Analysis and Summarization | Citations: 78

One-line summary

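This paper introduces an end-to-end deep summarization network (DSN) trained by reinforcement learning with a newly designed unsupervised diversity-representativeness (DR) reward, enabling unsupervised video summarization that is comparable to supervised methods.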

ABSTRACT

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos. In this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (DSN) to summarize videos. DSN predicts for each video frame a probability, which indicates how likely a frame is selected, and then takes actions based on the probability distributions to select frames, forming video summaries. To train our DSN, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for diversity and representativeness of generated summaries and does not rely on labels or user interactions at all. During training, the reward function judges how diverse and representative the generated summaries are, while DSN strives for earning higher rewards by learning to produce more diverse and more representative summaries. Since labels are not required, our method can be fully unsupervised. Extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but also is comparable to or even superior than most of published supervised approaches.

Research motivation and goals

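Ground-truth summaries are subjective, which motivates unsupervised video summarization.
Formulate video summarization as a sequential decision-making problem for selecting key frames.
Develop a deep summarization network (DSN) that outputs a selection probability for each frame.
Design a label-free DR reward that combines diversity and representativeness.
Extend the framework to a supervised version for cases where annotations are available.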

Proposed method

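Encode frames with a CNN (GoogLeNet) to extract features.
Decode with a bidirectional LSTM to produce frame-selection probabilities.
Sample a binary action (select or skip) for each frame from the predicted probabilities.
Train with policy gradient (REINFORCE) to maximize the DR reward (Rdiv + Rrep).
During optimization, apply a regularizer on the percentage of frames selected and a weight regularizer.
Optionally add a supervised objective that maximizes the log-probability of annotated key frames.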
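
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation. The review does not give the reward formulas, so the forms used here are assumptions: Rdiv is taken as the mean pairwise cosine dissimilarity among selected frames, and Rrep as the exponential of the negative mean distance from each frame to its nearest selected frame. All function names are hypothetical, frame features are assumed to be non-zero vectors, and the CNN encoder, the bidirectional LSTM, and both regularizers are left out. The gradient is computed with respect to per-frame logits of a Bernoulli policy rather than network weights.

```python
import numpy as np


def diversity_reward(feats, picks):
    """Assumed R_div: mean pairwise cosine dissimilarity of selected frames."""
    n = len(picks)
    if n < 2:
        return 0.0  # diversity is undefined for fewer than two frames
    x = feats[picks]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # assumes non-zero rows
    dissim = 1.0 - x @ x.T  # (n, n); diagonal entries are 0
    return float(dissim.sum() / (n * (n - 1)))


def representativeness_reward(feats, picks):
    """Assumed R_rep: exp(-mean distance to the nearest selected frame)."""
    if len(picks) == 0:
        return 0.0  # an empty summary represents nothing
    diff = feats[:, None, :] - feats[picks][None, :, :]  # (T, n, D)
    dists = np.linalg.norm(diff, axis=2)                 # (T, n)
    return float(np.exp(-dists.min(axis=1).mean()))


def dr_reward(feats, actions):
    """DR reward = R_div + R_rep for one binary action vector of length T."""
    picks = np.flatnonzero(actions)
    return diversity_reward(feats, picks) + representativeness_reward(feats, picks)


def sample_actions(probs, rng):
    """Draw one Bernoulli action per frame from the selection probabilities."""
    return (rng.random(probs.shape) < probs).astype(int)


def reinforce_logit_grad(probs, episodes, rewards, baseline):
    """REINFORCE estimate of dE[R]/dlogits for a Bernoulli policy.

    With p = sigmoid(z), d log pi(a) / dz = a - p, so each episode
    contributes (R - baseline) * (a - p); episodes are averaged.
    """
    advantages = np.asarray(rewards, dtype=float) - baseline  # (N,)
    score = np.asarray(episodes, dtype=float) - probs         # (N, T)
    return (advantages[:, None] * score).mean(axis=0)         # (T,)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(20, 8))   # stand-in for CNN frame features
    probs = np.full(20, 0.5)           # stand-in for BiLSTM outputs
    episodes = np.stack([sample_actions(probs, rng) for _ in range(5)])
    rewards = [dr_reward(feats, a) for a in episodes]
    grad = reinforce_logit_grad(probs, episodes, rewards, np.mean(rewards))
    print(grad.shape)
```

In a full model the same advantage-weighted score would be backpropagated through the LSTM, and the baseline (here simply the mean reward of the sampled episodes) serves only to reduce the variance of the estimate.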

Experimental results

Research questions

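RQ1: Can reinforcement learning with a diversity-representativeness reward enable fully unsupervised video summarization?
RQ2: How do the diversity and representativeness components interact to produce high-quality summaries?
RQ3: How does unsupervised DR-DSN compare with supervised methods on SumMe and TVSum?
RQ4: Does the supervised extension further improve performance?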

Key results

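DR-DSN outperforms other unsupervised methods on SumMe and TVSum.
Unsupervised DR-DSN is comparable to, or better than, many supervised methods on the tested datasets.
Using Rdiv and Rrep jointly yields better summaries than either reward alone.
The supervised extension, DR-DSN_sup, further improves on the unsupervised version in several settings.
The approach shows strong qualitative agreement with the frames that humans judge to be important.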

This review was generated by AI and checked by a human editor.