QUICK REVIEW

[Paper Review] Computational bioacoustics with deep learning: a review and roadmap

Dan Stowell | arXiv (Cornell University) | 2021-12-13
Animal Vocal Communication and Behavior | Citations: 35
One-line summary

A comprehensive review of how deep learning is applied to computational bioacoustics, outlining current practices, architectures, representations, and a roadmap for future research.

ABSTRACT

Animal vocalisations and natural soundscapes are fascinating objects of study, and contain valuable evidence about animal behaviours, populations and ecosystems. They are studied in bioacoustics and ecoacoustics, with signal processing and analysis an important component. Computational bioacoustics has accelerated in recent decades due to the growth of affordable digital sound recording devices, and to huge progress in informatics such as big data, signal processing and machine learning. Methods are inherited from the wider field of deep learning, including speech and image processing. However, the tasks, demands and data characteristics are often different from those addressed in speech or music analysis. There remain unsolved problems, and tasks for which evidence is surely present in many acoustic signals, but not yet realised. In this paper I perform a review of the state of the art in deep learning for computational bioacoustics, aiming to clarify key concepts and identify and analyse knowledge gaps. Based on this, I offer a subjective but principled roadmap for computational bioacoustics with deep learning: topics that the community should aim to address, in order to make the most of future developments in AI and informatics, and to use audio data in answering zoological and ecological questions.

Research motivation and goals

  • Clarify how deep learning is currently used in computational bioacoustics and summarize standard practices across taxa and tasks.
  • Identify knowledge gaps and under-explored topics to guide future research in AI-enabled bioacoustics.
  • Provide a principled roadmap integrating deep learning advances with ecological and zoological questions.

Proposed method

  • Survey existing literature published from 2016 onward on deep learning for bioacoustics using keyword searches in Google Scholar and Web of Science.
  • Summarise standard DL pipelines for bioacoustic classification, detection, and segmentation, including data preparation, model architectures, and evaluation metrics.
  • Discuss input representations (spectrograms, waveforms, etc.), data augmentation, and training practices in the context of bioacoustic data.
  • Review neural network architectures (CNNs, CRNNs, TCNs, attention/transformers), and their applicability to bioacoustic tasks.
  • Highlight taxonomic coverage (birds, cetaceans, bats, mammals, anurans, insects, fish) and data challenges (e.g., data deluge, unbalanced datasets).
  • Propose a roadmap highlighting topics from DL and bioacoustics that the community should address to advance the field.
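
To make the spectrogram input representation mentioned above concrete, here is a minimal sketch of a magnitude spectrogram computed with NumPy alone. The function name `magnitude_spectrogram`, the window length, the hop size, and the synthetic 2 kHz test tone are all illustrative assumptions; the review does not prescribe specific values, and real pipelines typically add a mel or CQT frequency mapping on top of this step.

```python
import numpy as np

def magnitude_spectrogram(signal, n_fft=512, hop=128):
    """Return a (frames, n_fft // 2 + 1) magnitude spectrogram.

    n_fft and hop are illustrative defaults, not values from the review.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Real-input FFT per frame; keep only the magnitude.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic stand-in for a recording: one second of a 2 kHz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 2000 * t)

spec = magnitude_spectrogram(tone)
log_spec = np.log1p(spec)  # simple log compression before feeding a network

# The strongest frequency bin should sit at the tone frequency.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sr / 512
```

The resulting 2-D array (time by frequency) is what a CNN would treat as an image-like input; feeding the raw waveform instead skips this step and lets the network learn its own filters.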

Experimental results

Research questions

  • RQ1: What is the current state of deep learning methods used for computational bioacoustics across taxa and tasks?
  • RQ2: Which neural network architectures and input representations are most effective for bioacoustic classification and detection?
  • RQ3: What are the major knowledge gaps and opportunities guiding future research in computational bioacoustics with DL?

Key findings

  • CNN-based architectures dominate bioacoustic DL workflows for classification and detection.
  • Spectrogram-based inputs (often mel or CQT) are the standard, and PCEN (per-channel energy normalisation) is a useful normalisation; combining multiple spectrogram representations and using raw-waveform input are both being explored for potential benefits.
  • CRNNs and newer architectures (including attention/transformers and temporal CNNs) are investigated, with mixed gains depending on task; training complexity varies.
  • Two-step workflows (detect then classify) are common for sparsely occurring sounds, though end-to-end detection/classification is also explored.
  • Taxonomic focus is broad, with birds and marine mammals well represented, alongside bats, primates, insects, fishes, and other taxa; data challenges and large datasets (e.g., BirdCLEF) drive progress.
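
The PCEN normalisation mentioned in the findings above can be sketched in a few lines. This follows the commonly published PCEN formulation: a per-band exponential moving average `M` of the energy `E`, then `(E / (eps + M)^alpha + delta)^r - delta^r`. The parameter values and the toy input are assumptions chosen for illustration and are not taken from the review.

```python
import numpy as np

def pcen(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalisation of a (frames, bands) array.

    All parameter defaults are illustrative assumptions.
    """
    energy = np.asarray(energy, dtype=float)
    smoothed = np.empty_like(energy)
    smoothed[0] = energy[0]
    # First-order smoother along time, applied independently to each band.
    for t in range(1, len(energy)):
        smoothed[t] = (1 - s) * smoothed[t - 1] + s * energy[t]
    # Adaptive gain control followed by root compression.
    return (energy / (eps + smoothed) ** alpha + delta) ** r - delta ** r

# Toy input: steady background energy with one short transient in band 0.
background = np.ones((100, 4))
background[50, 0] = 10.0
out = pcen(background)

# The same steady background recorded 100x louder gives a similar output,
# which is the gain-normalising behaviour PCEN is used for.
quiet = pcen(np.ones((100, 4)))
loud = pcen(100.0 * np.ones((100, 4)))
```

In this toy case the transient at frame 50 stands out clearly against the normalised background, while a large change in overall recording level shifts the output only slightly.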

This review was generated by AI and reviewed by a human editor.