QUICK REVIEW

[Paper Review] A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks

Saidul Islam, Hanae Elmekki|arXiv (Cornell University)|2023. 06. 11.

Advanced Neural Network Applications | Citations: 26

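One-Line Summary

This paper surveys transformer-based models published from 2017 to 2022 across five major application domains (NLP, computer vision, multi-modality, audio/speech, and signal processing), proposes a taxonomy that organizes the models by application task, and analyzes the most influential models.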

ABSTRACT

Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data. Unlike conventional neural networks or updated versions of Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM), transformer models excel in handling long dependencies between input sequence elements and enable parallel processing. As a result, transformer-based models have attracted substantial interest among researchers in the field of artificial intelligence. This can be attributed to their immense potential and remarkable achievements, not only in Natural Language Processing (NLP) tasks but also in a wide range of domains, including computer vision, audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published highlighting the transformer's contributions in specific fields, architectural differences, or performance evaluations, there is still a significant absence of a comprehensive survey paper encompassing its major applications across various domains. Therefore, we undertook the task of filling this gap by conducting an extensive survey of proposed transformer models from 2017 to 2022. Our survey encompasses the identification of the top five application domains for transformer-based models, namely: NLP, Computer Vision, Multi-Modality, Audio and Speech Processing, and Signal Processing. We analyze the impact of highly influential transformer-based models in these domains and subsequently classify them based on their respective tasks using a proposed taxonomy. Our aim is to shed light on the existing potential and future possibilities of transformers for enthusiastic researchers, thus contributing to the broader understanding of this groundbreaking technology.
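To make the self-attention mechanism mentioned in the abstract concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not the formulation of any particular model in the survey: queries, keys, and values are all taken to be the input vectors themselves (no learned projection matrices, a single head), and the names `softmax` and `self_attention` are chosen here for illustration.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption: Q = K = V = x (identity projections). Real transformers
    apply learned linear projections and usually several heads.
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all token vectors.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(tokens)
```

Every output vector depends on all input positions at once, which is why the computation can run in parallel over the sequence rather than step by step as in an RNN.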

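Research Motivation and Objectives

Identify the top application domains for transformer-based models and summarize the influential models within each domain.
Propose a taxonomy of transformer models based on application tasks and analyze their performance on those tasks.
Highlight the challenges and future opportunities of applying transformers across different fields.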

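Proposed Method

A systematic literature review of survey papers and transformer-based models published from 2017 to 2022.
A taxonomy that classifies models by application domain and task.
Models are selected for inclusion based on novelty, innovation in the attention mechanism, impact, and practical applicability.
An analysis of the datasets, architectures, and working principles of the important models in each domain.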

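Experimental Results

Research Questions

RQ1: Which major application domains have transformer models influenced the most?
RQ2: Which transformer models and task formulations have driven progress in NLP, vision, multi-modality, audio/speech, and signal processing?
RQ3: Which taxonomy best captures the variations of transformers in terms of architecture, pre-training, and application?
RQ4: What are the main challenges and future directions in applying transformers to diverse deep learning tasks?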

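Key Findings

The authors identify NLP, computer vision, multi-modality, audio/speech, and signal processing as the top five application domains for transformers.
They review more than 600 transformer models and select representative models for the taxonomy and the cross-domain discussion.
The paper proposes a high-level taxonomy of transformer models based on application domain and task.
It highlights gaps in multi-modal and signal processing applications and compares existing surveys.
The survey discusses future prospects and open challenges in transformer research.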

This review was generated by AI and reviewed by a human editor.