QUICK REVIEW

[Paper Review] Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning

Ruozi Huang, Hu Huang|arXiv (Cornell University)|2020. 06. 11.

Human Pose and Action Recognition | References: 62 | Citations: 52

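One-line summary

This paper presents a seq2seq architecture for long-term, music-conditioned dance generation and introduces curriculum learning to reduce autoregressive error accumulation, outperforming existing methods on automatic metrics and in human evaluation.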

ABSTRACT

Dancing to music is one of human's innate abilities since ancient times. In machine learning research, however, synthesizing dance movements from music is a challenging problem. Recently, researchers synthesize human motion sequences through autoregressive models like recurrent neural network (RNN). Such an approach often generates short sequences due to an accumulation of prediction errors that are fed back into the neural network. This problem becomes even more severe in the long motion sequence generation. Besides, the consistency between dance and music in terms of style, rhythm and beat is yet to be taken into account during modeling. In this paper, we formalize the music-conditioned dance generation as a sequence-to-sequence learning problem and devise a novel seq2seq architecture to efficiently process long sequences of music features and capture the fine-grained correspondence between music and dance. Furthermore, we propose a novel curriculum learning strategy to alleviate error accumulation of autoregressive models in long motion sequence generation, which gently changes the training process from a fully guided teacher-forcing scheme using the previous ground-truth movements, towards a less guided autoregressive scheme mostly using the generated movements instead. Extensive experiments show that our approach significantly outperforms the existing state-of-the-arts on automatic metrics and human evaluation. We also make a demo video to demonstrate the superior performance of our proposed approach at https://www.youtube.com/watch?v=lmE20MEheZ8.

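Research motivation and objectives

Lays out why generating long dance sequences from music matters and what makes it hard.
Develops a seq2seq model that can handle long, complex sequences of music features.
Addresses error accumulation in autoregressive dance generation.
Introduces curriculum learning to shift training from teacher forcing to autoregressive generation.
Demonstrates improved performance over state-of-the-art methods.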
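
The error accumulation problem named above can be illustrated with a toy example. The sketch below is not the paper's model: `rollout_errors`, the linear "motion" signal, and the constant `bias` are all hypothetical stand-ins, chosen only to show why feeding predictions back into a predictor makes small per-step errors compound.

```python
from typing import List


def rollout_errors(num_steps: int, step: float = 0.1, bias: float = 0.01,
                   teacher_forced: bool = True) -> List[float]:
    """Return the absolute error at each step of a toy next-value predictor.

    The true signal grows by `step` per frame. The toy predictor adds
    `step + bias` to whatever previous value it is given, so it is slightly
    wrong at every step (hypothetical setup, for illustration only).
    """
    truth = [i * step for i in range(num_steps + 1)]
    errors = []
    prev_pred = truth[0]  # the first frame is given as a seed
    for t in range(1, num_steps + 1):
        # Teacher forcing conditions on the ground truth; free running
        # conditions on the model's own previous prediction.
        prev = truth[t - 1] if teacher_forced else prev_pred
        pred = prev + step + bias
        errors.append(abs(pred - truth[t]))
        prev_pred = pred
    return errors


if __name__ == "__main__":
    guided = rollout_errors(100, teacher_forced=True)
    free = rollout_errors(100, teacher_forced=False)
    print(f"teacher-forced error at step 100: {guided[-1]:.3f}")
    print(f"free-running error at step 100:   {free[-1]:.3f}")
```

In this toy setting the teacher-forced error stays near the per-step bias, while the free-running error grows roughly linearly with sequence length. A model trained only with teacher forcing never sees its own drifting inputs, which is the gap the curriculum is meant to close.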

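Proposed method

Formalizes music-conditioned dance generation as a sequence-to-sequence learning problem.
Proposes a novel seq2seq architecture that efficiently processes long sequences of music features and captures the fine-grained correspondence between music and dance.
Introduces a curriculum learning strategy that gradually moves training from teacher forcing, which feeds the previous ground-truth movements, to autoregressive generation, which mostly feeds the generated movements.
Relies on this gradual change in the training process to alleviate error accumulation in long motion sequence generation.
Evaluates the approach with automatic metrics and human evaluation to verify its performance gains.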
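
The shift from teacher forcing to autoregressive generation can be sketched as a per-epoch schedule. This is a minimal sketch under stated assumptions, not the authors' implementation: the linear growth rule, the alternating guided/free segments, and the names `free_run_length`, `teacher_forcing_mask`, `decode`, `growth`, `cap`, and `guided_len` are all hypothetical. The source only states that training moves gently from fully guided to mostly self-fed inputs.

```python
from typing import Callable, List


def free_run_length(epoch: int, growth: float = 1.0, cap: int = 50) -> int:
    """Hypothetical linear schedule: frames per segment fed from model output."""
    return min(cap, int(growth * epoch))


def teacher_forcing_mask(seq_len: int, epoch: int,
                         guided_len: int = 10) -> List[bool]:
    """Build a per-frame mask: True = use ground truth, False = use own output.

    Guided segments of fixed length alternate with free-running segments
    whose length grows with the epoch (an assumed pattern).
    """
    free_len = free_run_length(epoch)
    pattern = [True] * guided_len + [False] * free_len
    repeats = seq_len // len(pattern) + 1
    return (pattern * repeats)[:seq_len]


def decode(ground_truth: List[float], step_fn: Callable[[float], float],
           mask: List[bool]) -> List[float]:
    """Run a one-step predictor over a sequence, choosing its input per frame."""
    outputs = [ground_truth[0]]  # the first frame is given as a seed
    for t in range(1, len(ground_truth)):
        prev = ground_truth[t - 1] if mask[t] else outputs[-1]
        outputs.append(step_fn(prev))
    return outputs


if __name__ == "__main__":
    for epoch in (0, 5, 100):
        mask = teacher_forcing_mask(60, epoch)
        share = mask.count(False) / len(mask)
        print(f"epoch {epoch:3d}: {share:.0%} of frames use generated input")
```

At epoch 0 every frame is teacher-forced, and the share of self-fed frames rises as training proceeds. In a real training loop, `step_fn` would be the decoder step and the loss would still be computed against the ground-truth motion at every frame.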

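Experimental results

Research questions

RQ1: How can long sequences of music features be processed efficiently to generate the corresponding dance sequences?
RQ2: How can the fine-grained correspondence between music and dance be captured?
RQ3: Compared with fully guided teacher-forcing training, does curriculum learning alleviate autoregressive error accumulation in long motion generation?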

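Key findings

The proposed method outperforms existing state-of-the-art methods on automatic metrics and in human evaluation.
The curriculum learning strategy alleviates error accumulation in long motion generation.
The architecture effectively models long-term music-dance correspondence and style consistency.
The experiments validate the approach on both objective metrics and human judgment.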

This review was generated by AI and reviewed by a human editor.