QUICK REVIEW

[Paper Review] Efficient Processing of Deep Neural Networks: A Tutorial and Survey

Vivienne Sze, Yu-Hsin Chen | arXiv (Cornell University) | March 27, 2017

Advanced Neural Network Applications | References: 99 | Citations: 50

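One-Line Summary

This paper surveys techniques, hardware platforms, and design trade-offs for efficient deep neural network (DNN) processing, with emphasis on inference acceleration, near-data processing, and algorithm-hardware co-design.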

ABSTRACT

Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI) applications including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Accordingly, techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost are critical to the wide deployment of DNNs in AI systems. This article aims to provide a comprehensive tutorial and survey about the recent advances towards the goal of enabling efficient processing of DNNs. Specifically, it will provide an overview of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of DNNs either solely via hardware design changes or via joint hardware design and DNN algorithm changes. It will also summarize various development resources that enable researchers and practitioners to quickly get started in this field, and highlight important benchmarking metrics and design considerations that should be used for evaluating the rapidly growing number of DNN hardware designs, optionally including algorithmic co-designs, being proposed in academia and industry. The reader will take away the following concepts from this article: understand the key design considerations for DNNs; be able to evaluate different DNN hardware implementations with benchmarks and comparison metrics; understand the trade-offs between various hardware architectures and platforms; be able to evaluate the utility of various DNN design techniques for efficient processing; and understand recent implementation trends and opportunities.

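Research Motivation and Objectives

Provide an overview of deep neural networks (DNNs) and their importance in AI applications.
Survey hardware platforms and architectures that support DNN inference and improve efficiency.
Highlight techniques that reduce computation and energy without sacrificing accuracy.
Discuss resources, benchmarking metrics, and design considerations for evaluating DNN hardware.
Describe the potential benefits of applying algorithm and hardware optimizations jointly, and identify trends and opportunities.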

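Proposed Approach

Presents background on DNNs, their role in AI, and deployed applications.
Describes DNN components, models, and the core computations of CNN and FC layers.
Surveys hardware platforms, memory technologies, and near-data processing approaches for DNNs.
Discusses mixed-signal and memory-centric strategies for reducing data movement cost.
Outlines joint algorithm-hardware optimization approaches and their impact on throughput and energy efficiency.
Proposes benchmarking metrics and evaluation considerations for DNN hardware designs.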
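The core CNN and FC computations mentioned above both reduce to multiply-accumulate (MAC) operations. The following is a minimal sketch, assuming stride 1, no padding, and a batch size of 1. The function names and the shape letters (C, M, H, W, R, S, E, F) are illustrative choices for this review, not code taken from the paper.

```python
# Naive convolutional-layer loop nest (assumed: stride 1, no padding, batch 1).
# Shape letters are illustrative: C input channels, M filters, H x W input,
# R x S filter, E x F output.

def conv_layer(ifmap, filters):
    """ifmap: [C][H][W], filters: [M][C][R][S] -> (ofmap [M][E][F], MAC count)."""
    C, H, W = len(ifmap), len(ifmap[0]), len(ifmap[0][0])
    M, R, S = len(filters), len(filters[0][0]), len(filters[0][0][0])
    E, F = H - R + 1, W - S + 1
    ofmap = [[[0.0] * F for _ in range(E)] for _ in range(M)]
    macs = 0
    for m in range(M):                      # each output channel (filter)
        for e in range(E):                  # each output row
            for f in range(F):              # each output column
                acc = 0.0
                for c in range(C):          # accumulate over input channels
                    for r in range(R):      # and over the filter window
                        for s in range(S):
                            acc += ifmap[c][e + r][f + s] * filters[m][c][r][s]
                            macs += 1
                ofmap[m][e][f] = acc
    return ofmap, macs


def fc_layer(inputs, weights):
    """FC layer as a matrix-vector product: weights is [M][K], inputs is [K]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]
```

Under these assumptions the MAC count equals M * E * F * C * R * S, which is why even a single convolutional layer can dominate the compute and data movement budget.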

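Experimental Results

Research Questions

RQ1: What are the key design considerations for efficient DNN hardware implementations?
RQ2: How can DNN hardware be evaluated and benchmarked in terms of throughput, energy efficiency, and accuracy preservation?
RQ3: What are the trade-offs among the various hardware architectures and platforms for DNN inference?
RQ4: What roles do algorithmic techniques (e.g., pruning, quantization) and hardware design play in achieving efficiency?
RQ5: What new opportunities exist in near-data processing and memory technologies for DNNs?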
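To make the algorithmic techniques named in RQ4 concrete, here is a minimal sketch of magnitude-based pruning and symmetric uniform quantization on a flat list of weights. These are generic textbook variants chosen for illustration; the helper names and the specific schemes are assumptions, not the particular methods evaluated in the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by ascending magnitude; the first k are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed integers of width `bits`."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [v * scale for v in q]
```

Pruning reduces the number of nonzero weights that must be stored and multiplied, while quantization reduces the bit width of each value. Whether either one translates into real throughput or energy gains depends on hardware support, which is the co-design question the survey raises.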

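Key Findings

DNNs achieve high accuracy but incur high computation and data movement costs, which motivates specialized acceleration.
Convolutional, fully connected, pooling, and normalization layers form the core building blocks of modern DNNs, with batch normalization (BN) becoming standard practice.
Various hardware platforms and optimizations can improve throughput and energy efficiency without degrading accuracy.
Near-data processing and mixed-signal/memory technologies are highlighted as directions for addressing the data movement bottleneck.
Joint algorithm-hardware optimization can yield throughput and energy benefits while keeping accuracy loss under control.
A set of benchmarking metrics and design considerations is proposed for evaluating the growing landscape of DNN accelerators.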
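Since BN is called out as standard practice, a short sketch of what it computes at inference time may help. It assumes per-channel statistics (mean, var) and learned parameters (gamma, beta) that are fixed after training. The function names are hypothetical, and the folding step is a commonly used simplification rather than something this review attributes to the paper.

```python
import math


def batch_norm_inference(x, mean, var, gamma, beta, eps=1e-5):
    """Normalize each value with fixed statistics, then scale and shift."""
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]


def fold_bn_into_scale_bias(mean, var, gamma, beta, eps=1e-5):
    """Collapse BN into a single multiply-add: y = scale * x + bias."""
    scale = gamma / math.sqrt(var + eps)
    bias = beta - mean * scale
    return scale, bias
```

Because the statistics are constants at inference time, BN costs only one multiply and one add per value, so it adds little work compared with the MAC-heavy convolutional and FC layers.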

This review was generated by AI and reviewed by a human editor.