QUICK REVIEW

[Paper Review] Counterpoint by Convolution

Cheng-Zhi Anna Huang, Tim Cooijmans | arXiv (Cornell University) | 2017-10-23
Music and Audio Processing | References: 10 | Citations: 80
One-line summary

Coconet trains a deep convolutional model with orderless NADE training to complete partial polyphonic scores, uses blocked Gibbs sampling to improve sample quality, and shows on Bach chorales that Gibbs-based methods outperform ancestral sampling.

ABSTRACT

Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end. On the contrary, human composers write music in a nonlinear fashion, scribbling motifs here and there, often revisiting choices previously made. In order to better approximate this process, we train a convolutional neural network to complete partial musical scores, and explore the use of blocked Gibbs sampling as an analogue to rewriting. Neither the model nor the generative procedure are tied to a particular causal direction of composition. Our model is an instance of orderless NADE (Uria et al., 2014), which allows more direct ancestral sampling. However, we find that Gibbs sampling greatly improves sample quality, which we demonstrate to be due to some conditional distributions being poorly modeled. Moreover, we show that even the cheap approximate blocked Gibbs procedure from Yao et al. (2014) yields better samples than ancestral sampling, based on both log-likelihood and human evaluation.

Research motivation and goals

  • Introduce a convolutional generative model for musical counterpoint that can complete partial scores.
  • Leverage orderless NADE training to enable conditioning on arbitrary contexts.
  • Evaluate sampling strategies and demonstrate improved sample quality with blocked Gibbs sampling.
  • Compare performance against sequence-based models on Bach chorales at multiple temporal resolutions.

Proposed method

  • Represent music as piano rolls of shape I x T x P (instruments x time steps x pitches) and model p_theta(x) with a deep CNN.
  • Train with orderless NADE to learn p_theta(x_i | x_C) for all contexts C.
  • Reveal a random subset C of the score as context, mask the rest, and reconstruct the masked part using a softmax over pitches.
  • Use frame-based log-likelihood evaluation that conditions on model predictions without ground-truth frames.
  • Compare chronological versus random orderings and evaluate sampling methods including ancestral sampling and blocked Gibbs sampling with annealing.
  • Provide public code and samples for replication.
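
The masking step in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy sizes, the helper names (`sample_context`, `hide`, `masked_nll`), and the uniform stand-in for the CNN are all assumptions made for the example. Each cell stores a pitch index, which is equivalent to a one-hot vector along the P axis of the piano roll.

```python
import math
import random

I, T, P = 4, 8, 12  # toy sizes (assumed): instruments, time steps, pitches

def sample_context(rng):
    """Pick a random context C: first its size, then which cells it contains."""
    cells = [(i, t) for i in range(I) for t in range(T)]
    k = rng.randrange(len(cells))  # 0 .. I*T-1, so at least one cell stays hidden
    return set(rng.sample(cells, k))

def hide(score, context):
    """Replace every cell outside the context with None (the model's masked input)."""
    return [[score[i][t] if (i, t) in context else None for t in range(T)]
            for i in range(I)]

def uniform_model(masked_score):
    """Hypothetical stand-in for the CNN: a uniform softmax over pitches per cell."""
    return [[[1.0 / P] * P for _ in range(T)] for _ in range(I)]

def masked_nll(score, context, model=uniform_model):
    """Average negative log-likelihood of the true pitches at the hidden cells."""
    probs = model(hide(score, context))
    hidden = [(i, t) for i in range(I) for t in range(T) if (i, t) not in context]
    return sum(-math.log(probs[i][t][score[i][t]]) for i, t in hidden) / len(hidden)

rng = random.Random(0)
score = [[rng.randrange(P) for _ in range(T)] for _ in range(I)]
context = sample_context(rng)
loss = masked_nll(score, context)
```

With the uniform stand-in, `loss` is log(P) regardless of the mask; a trained network would replace `uniform_model` and be optimized to drive this quantity down across many random contexts.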

Experimental results

Research questions

  • RQ1: Can a convolutional model trained under orderless NADE effectively complete partial polyphonic scores?
  • RQ2: Does blocked Gibbs sampling improve sample quality over ancestral sampling in an orderless NADE setting?
  • RQ3: How do sampling schemes (ancestral vs blocked Gibbs, independent vs ancestral) affect log-likelihood and human judgments of Bach chorales?
  • RQ4: What is the impact of temporal resolution on model likelihood and evaluation metrics in polyphonic music generation?

Key results

Model                          | Quarter note NLL | Eighth note NLL | Sixteenth note NLL
NADE [4]                       | 7.19             |                 |
RNN-RBM [4]                    | 6.27             |                 |
RNN-NADE [4]                   | 5.56             |                 |
RNN-NADE (our implementation)  | 5.03             | 3.78            | 2.05
Coconet (chronological)        | 7.79±0.09        | 4.21±0.05       | 2.22±0.03
Coconet (random)               | 5.03±0.06        | 1.84±0.02       | 0.57±0.01

  • Blocked Gibbs sampling significantly improves sample quality over ancestral sampling.
  • Independent blocked Gibbs sampling yields better samples and faster generation than ancestral sampling.
  • Random orderings provide better log-likelihoods than strictly chronological orderings on Bach chorales.
  • Temporal resolution affects reported log-likelihoods: at finer resolutions, chord changes are sparse relative to the number of time steps, which influences the evaluation.
  • Samples from independent Gibbs are competitive with or superior to those from naive NADE ancestral sampling, as shown by both quantitative and human evaluations.
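
For readers who want a concrete picture of the sampling procedure compared above, here is a rough sketch of independent blocked Gibbs sampling with an annealed block size. It is an illustration under stated assumptions, not the paper's code: the linear schedule and its constants (`a_max`, `a_min`, `eta`), the toy sizes, and the uniform `toy_conditionals` function standing in for the trained network are all hypothetical.

```python
import random

I, T, P = 4, 8, 12  # toy sizes (assumed): instruments, time steps, pitches

def annealed_prob(n, num_steps, a_max=0.9, a_min=0.1, eta=0.75):
    """Shrink the per-cell resampling probability linearly, then hold it at a_min."""
    return max(a_min, a_max - n * (a_max - a_min) / (eta * num_steps))

def toy_conditionals(context, block):
    """Hypothetical stand-in for the trained network: uniform over pitches."""
    return {cell: [1.0 / P] * P for cell in block}

def independent_blocked_gibbs(num_steps, rng, conditionals=toy_conditionals):
    # Start from a random score and repeatedly rewrite parts of it.
    score = [[rng.randrange(P) for _ in range(T)] for _ in range(I)]
    for n in range(num_steps):
        alpha = annealed_prob(n, num_steps)
        # Choose the block: each cell is erased independently with probability alpha.
        block = {(i, t) for i in range(I) for t in range(T) if rng.random() < alpha}
        # The model only sees the cells that were kept.
        context = [[None if (i, t) in block else score[i][t] for t in range(T)]
                   for i in range(I)]
        probs = conditionals(context, block)
        # "Independent": every erased cell is resampled from its own conditional,
        # without conditioning on the other cells resampled in the same step.
        for (i, t) in sorted(block):
            score[i][t] = rng.choices(range(P), weights=probs[(i, t)])[0]
    return score

sample = independent_blocked_gibbs(num_steps=20, rng=random.Random(0))
```

Early steps rewrite most of the score at once, while later steps touch only a few cells, which is what makes the procedure resemble a composer revising earlier choices. Because all cells in a block are resampled in one model call, each step is cheap compared with ancestral sampling one variable at a time.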

This review was generated by AI and reviewed by a human editor.