QUICK REVIEW

[Paper Review] Handwritten Bangla Digit Recognition Using Deep Learning

Md Zahangir Alom, Paheding Sidike|arXiv (Cornell University)|2017. 05. 07.

Handwritten Text Recognition Techniques|References: 1|Citations: 80

One-Line Summary

This paper evaluates several deep learning models for Handwritten Bangla Digit Recognition on CMATERdb 3.1.1 and finds that a CNN with Gabor features and dropout reaches the highest accuracy, 98.78%.

ABSTRACT

In spite of the advances in pattern recognition technology, Handwritten Bangla Character Recognition (HBCR) (such as alpha-numeric and special characters) remains largely unsolved due to the presence of many perplexing characters and excessive cursive in Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications. To improve the performance of Handwritten Bangla Digit Recognition (HBDR), we herein present a new approach based on deep neural networks which have recently shown excellent performance in many pattern recognition and machine learning applications, but have not been thoroughly attempted for HBDR. We introduce Bangla digit recognition techniques based on Deep Belief Network (DBN), Convolutional Neural Networks (CNN), CNN with dropout, CNN with dropout and Gaussian filters, and CNN with dropout and Gabor filters. These networks have the advantage of extracting and using feature information, improving the recognition of two dimensional shapes with a high degree of invariance to translation, scaling and other pattern distortions. We systematically evaluated the performance of our method on publicly available Bangla numeral image database named CMATERdb 3.1.1. From experiments, we achieved 98.78% recognition rate using the proposed method: CNN with Gabor features and dropout, which outperforms the state-of-the-art algorithms for HBDR.

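Motivation and Objectives

Improve recognition of handwritten Bangla digits, which is made difficult by the wide variability of handwriting.
Compare deep learning approaches (DBN and CNN variants) without hand-crafted feature engineering.
Evaluate how dropout and Gabor or Gaussian filter features affect recognition performance.
Establish a strong baseline on CMATERdb 3.1.1 by benchmarking against state-of-the-art methods.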

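Proposed Method

Evaluates Deep Belief Networks (DBN) and Convolutional Neural Networks (CNN) on CMATERdb 3.1.1.
Explores CNN variants that add dropout, Gaussian filters, and Gabor filters.
Describes the CNN architecture: two convolutional layers, two subsampling layers, and one fully connected classification layer.
Uses RBM-based pre-training with contrastive divergence learning for the DBN.
Trains and tests over multiple iterations and compares performance against SVM and other methods.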
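The layer stack and the Gabor filters described above can be illustrated with a small sketch. It is not the authors' implementation: the 32x32 input size, 5x5 kernels, 2x2 subsampling, unpadded stride-1 convolution, and every Gabor parameter (`sigma`, `lambd`, `gamma`, `psi`, four orientations) are assumptions chosen for illustration, and the helper names are hypothetical. The code builds a small bank of real-valued Gabor kernels, of the kind that could seed first-layer filters, and traces how the feature-map size shrinks through two convolution and two subsampling stages.

```python
import math


def gabor_kernel(size, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    All default parameter values are illustrative assumptions,
    not values taken from the reviewed paper.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinate frame by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def conv_output_size(n, k):
    """Side length after an unpadded, stride-1 convolution with a k x k kernel."""
    return n - k + 1


def subsample_output_size(n, p):
    """Side length after non-overlapping p x p subsampling."""
    return n // p


def trace_shapes(input_size=32, kernel=5, pool=2):
    """Trace feature-map side lengths through conv -> subsample, twice."""
    size = input_size
    shapes = [("input", size)]
    for i in (1, 2):
        size = conv_output_size(size, kernel)
        shapes.append((f"conv{i}", size))
        size = subsample_output_size(size, pool)
        shapes.append((f"subsample{i}", size))
    return shapes


# Four orientations (0, 45, 90, 135 degrees): an assumed, minimal filter bank.
bank = [gabor_kernel(5, k * math.pi / 4) for k in range(4)]

if __name__ == "__main__":
    for name, side in trace_shapes():
        print(f"{name}: {side}x{side}")
```

Under these assumed sizes, the final subsampling stage yields small feature maps that would be flattened and passed to the single fully connected classification layer with ten outputs, one per digit.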

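Experimental Results

Research Questions

RQ1: How do different deep learning architectures (DBN vs. CNN) perform on HBDR for Bangla digits?
RQ2: Do dropout and Gabor or Gaussian filters improve CNN-based HBDR performance?
RQ3: How does the best deep learning approach compare with state-of-the-art methods on CMATERdb 3.1.1?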

Key Results

Method	Accuracy
SVM	95.50%
DBN	97.20%
CNN + Gaussian	97.70%
CNN + Gabor	98.30%
CNN + Gaussian + Dropout	98.64%
CNN + Gabor + Dropout	98.78%

CNN with Gabor features and dropout reaches the highest reported accuracy, 98.78%.
CNN with random Gaussian filters reaches 97.70% accuracy, and CNN with Gabor filters reaches 98.30%.
CNN with dropout and Gaussian filters achieves 98.64% accuracy, outperforming the standard CNN.
DBN achieves 97.20% accuracy, surpassing the 95.50% of SVM.
Among the evaluated methods, CNN with Gabor + Dropout outperforms the state-of-the-art methods on the same dataset.
CNN-based approaches outperform earlier non-deep-learning methods for HBDR on CMATERdb 3.1.1.
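Because all accuracies are above 95%, the differences are easier to judge as error rates. The short sketch below uses only the accuracies from the table above; the function names are hypothetical, and the derived figures are simple arithmetic on the reported numbers rather than results stated in the paper.

```python
# Reported accuracies (%) copied from the results table in this review.
accuracy = {
    "SVM": 95.50,
    "DBN": 97.20,
    "CNN + Gaussian": 97.70,
    "CNN + Gabor": 98.30,
    "CNN + Gaussian + Dropout": 98.64,
    "CNN + Gabor + Dropout": 98.78,
}


def error_rate(method):
    """Error rate in percent, rounded to two decimals to avoid float noise."""
    return round(100.0 - accuracy[method], 2)


def relative_error_reduction(baseline, improved):
    """Fraction of the baseline's errors removed by the improved method."""
    base = error_rate(baseline)
    return (base - error_rate(improved)) / base


def dropout_gain(filter_name):
    """Accuracy gain, in percentage points, from adding dropout to a CNN variant."""
    with_dropout = accuracy[f"CNN + {filter_name} + Dropout"]
    without_dropout = accuracy[f"CNN + {filter_name}"]
    return round(with_dropout - without_dropout, 2)


if __name__ == "__main__":
    best = "CNN + Gabor + Dropout"
    print(f"SVM error: {error_rate('SVM')}%, best error: {error_rate(best)}%")
    print(f"Relative error reduction vs. SVM: {relative_error_reduction('SVM', best):.1%}")
    for name in ("Gaussian", "Gabor"):
        print(f"Dropout gain with {name} filters: {dropout_gain(name)} points")
```

Read this way, the best model cuts the SVM error rate from 4.50% to 1.22%, and dropout adds 0.94 percentage points to the Gaussian-filter CNN versus 0.48 to the Gabor-filter CNN.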


This review was generated by AI and reviewed by a human editor.